Quadratic improvement on accuracy of approximating pure quantum states and unitary gates by probabilistic implementation
Abstract
Pure quantum states are often approximately encoded as classical bit strings such as those representing probability amplitudes and those describing circuits that generate the quantum states. The crucial quantity is the minimum length of classical bit strings from which the original pure states are approximately reconstructible. We derive asymptotically tight bounds on the minimum bit length required for probabilistic encodings with which one can approximately reconstruct the original pure state as an ensemble of the quantum states encoded in classical strings. We also show that such a probabilistic encoding asymptotically halves the bit length required for “deterministic” ones. This is based on the fact that the accuracy of approximating pure states by using a given subset of pure states can be increased quadratically if we use ensembles of pure states in the subset. Moreover, we show that a similar fact holds when we consider the approximation of unitary gates by using a given subset of unitary gates. This improves the reduction rate of the circuit size by using probabilistic circuit synthesis compared to previous results. This also demonstrates that the reduction is possible even for low-accuracy circuit synthesis, which might improve the accuracy of various NISQ algorithms.
I Introduction
Pure quantum states are often approximately encoded in classical states in various quantum information processing tasks, such as classical bit strings storing probability amplitudes of pure states in the classical simulation of quantum circuits; classical data obtained by measurement in state tomography or state estimation; classical descriptions of quantum circuits that generate target pure states in quantum circuit synthesis [1, 2]. Recently, compact classical encodings from which one can predict probability distributions on outcomes obtained by measurements in certain classes have been developed [3, 4, 5].
The key issue is the minimum encoding. Here, we investigate it in terms of the minimum length of classical bit strings, with which one can approximately achieve any information processing task achievable with the original pure state on a -dimensional system. That is, from the classical strings, one can construct a quantum state that is indistinguishable from the original state within a certain accuracy by any measurements. In addition to deterministic encodings, which deterministically associate with a classical bit string, we consider probabilistic encodings, which associate with one of multiple classical strings according to some distribution. That is, in probabilistic encodings, is associated with an ensemble of classical strings from which is constructed as an ensemble of the quantum states encoded in the classical strings.
Besides providing fundamental limits for information processing tasks using classical encodings, the minimum length is a fundamental quantity in various theoretical subjects including communication complexity, computational complexity, and asymptotic geometric analysis. Indeed, in deterministic encodings, classical strings do nothing but encode elements in an -covering (sometimes called an -net) of the set of pure states. Thus, the minimum bit length is the logarithm of the minimum size of -coverings (called the covering number). Due to its prominent role in algorithm design and asymptotic geometric analysis, the covering number has been well-studied [6, 7, 8], and it is known that bits are enough to encode an -covering. On the other hand, a particular task in communication complexity called distributed quantum sampling, which aims to classically transmit a pure state so as to approximately sample outcomes of arbitrary quantum measurement, provides a lower bound on the minimum length required for probabilistic encodings as [9]. Taking the two known facts into account, it seems that the minimum probabilistic encoding can be realized by a deterministic one. This intuition is supported by the fact that an ensemble of deterministic classical states (called a probabilistic classical state) represents our lack of knowledge about a classical system while a pure state represents our maximum knowledge about a quantum system.
Contrary to this intuition, we show that the minimum length required for probabilistic encodings is exactly half of the one required for deterministic encodings in the asymptotic limits of the dimension or accuracy. Thus, the minimum encoding must associate some pure states with ensembles of classical strings describing distinct quantum states, which may be counter-intuitive, considering that pure states themselves are not probabilistic mixtures of distinct quantum states. The excessive bits required for deterministic encodings can be interpreted as a consequence of their excessive predictive capability such that they can not only reconstruct quantum states within a certain accuracy but also deterministically compute the probability distribution of any measurements within the same accuracy. Such a deterministic computation is impossible by using either the minimum (probabilistic) encoding or the original quantum states.
The bit length reduction by using probabilistic encodings follows from our refined estimation of the covering number and the following fact we prove: for any finite set of pure states, it is possible to quadratically increase the accuracy of approximating arbitrary pure states by using ensembles of pure states in . Moreover, we show that a similar fact holds when we consider the approximation of unitary gates by using a finite set of unitary gates. Recently, it has been found that when we approximately implement arbitrary unitary gates by using a gate sequence over a finite universal gate set (called circuit synthesis), the length of the gate sequence or the number of gates can be reduced by using ensembles of gate sequences for high-accuracy circuit synthesis [10, 11]. Our result improves the reduction rate of the previous results and shows that the reduction is possible even for low-accuracy circuit synthesis, which might improve the accuracy of NISQ algorithms.
II Preliminaries
In this section, we summarize basic notations used throughout the paper. Note that we consider only finite-dimensional Hilbert spaces. In particular, two-dimensional Hilbert space is called a qubit. and represent the set of linear operators and positive semidefinite operators on Hilbert space , respectively. and represent the set of quantum states and that of pure states, respectively. Pure state is sometimes alternatively represented by complex unit vector satisfying . Any physical transformation of the quantum state can be represented by a completely positive and trace preserving (CPTP) linear mapping .
The trace distance of two quantum states is defined as for . It represents the maximum total variation distance between probability distributions obtained by measurements performed on two quantum states. A similar notion measuring the distinguishability of and is the fidelity function, defined by , where is a purification of , i.e., , and the maximization is taken over all the purifications. Fuchs-van de Graaf inequalities [12] provide relationships between the two measures with respect to the distinguishability as follows:
(1) |
holds for any states , where the equality of the right inequality holds when and are pure.
III Classical encoding of pure states

The existence of a probabilistic encoding is equivalent to the existence of physical transformation that can approximately reconstruct arbitrary pure state from probabilistic classical state as shown in Fig. 1. Physical transformation can be regarded as a decoder of the classical states to quantum states, which outputs mixed state when label is inputted. Formally, is represented by a classical-quantum channel [13], which is defined as , where is an orthonormal basis. We require that with the output state of , one can approximately sample any measurement outcomes performed on within total variation distance , i.e., for given . Thus, we say that a probabilistic encoding of with accuracy exists if and only if there exists set of quantum states satisfying
(2) |
where the minimum is taken over probability distribution over , i.e, and . Note that since any mixed state is a probabilistic mixture of pure states and trace distance is convex, Eq. (2) also guarantees that an arbitrary mixed state is also approximately reconstructible within the same accuracy as pure states.
III.1 Minimum deterministic encoding
First, we consider deterministic encodings, which can be defined as particular probabilistic encodings. Concretely, every pure state is encoded into a single label , which implies the output state of is . Thus, we say that a deterministic encoding of with accuracy exists if and only if there exists set of quantum states satisfying
(3) |
which is called an external -covering of . A set of pure states satisfying Eq. (3) is called an internal -covering of , which corresponds to particular deterministic encodings such as one storing probability amplitudes approximately representing . The minimum size of internal (or external) -coverings is called the internal (or external) covering number and denoted by (or .) Note that by definition and the minimum bit length required for deterministic encodings equals to .
Since the condition of the -coverings in Eq. (3) is equivalent to that for the set of -balls to cover , a detailed analysis of the volume of the -ball provides a good estimation of the covering numbers. As shown in Appendix A, the volume can be calculated as with respect to the unitarily invariant probability measure for any . This directly provides a lower bound on and also its upper bound by applying the method of constructing an internal -covering developed in [8]. We obtain the following estimation of , which is tighter than previous estimations [6, 7] in large dimensions. For completeness, we provide a construction of an internal -covering and an estimation of the parameters appearing in the construction in Appendix B.
Lemma 1.
For any and a positive integer specified below, the internal covering number of internal -coverings of is bounded as follows: For any , there exists such that
(4) |
where the left inequality holds for any , and the right inequality holds for any . For example, if , we can set .
To obtain a lower bound on , we use the following upper bounds on the volume of the -ball as shown in Appendix C:
(5) |
This bound and imply that the volume of the -ball can be maximized by setting its center as a pure state if , which is contrary to what happens in a qubit (), where corresponds to the intersection of the Bloch sphere and a ball centered at and the intersection is maximized not by a ball centered at a point on the Bloch sphere but by a ball centered at a point inside the Bloch ball. The qubit case also implies that the condition for the second inequality cannot be fully relaxed. if with the maximally mixed state implies another condition is also not fully removable. By using Eq. (5), we easily obtain the following lower bound on .
Lemma 2.
For any and a positive integer specified below, the external covering number of external -coverings of is bounded as follows:
(6) |
Using the two lemmas by setting , we obtain the following theorem straightforwardly.
Theorem 1.
For any and an integer specified below, the minimum size of label set used over all deterministic encodings of with accuracy is bounded by
(7) |
where . Moreover, if , the lower bound can be strengthened as .
Using Theorem 1 and , we obtain the asymptotic bit rate per dimension of the minimum deterministic encoding for fixed . We can also obtain the asymptotic bit rate per accuracy of the minimum deterministic encoding for fixed .
III.2 Minimum probabilistic encoding
We prove the existence of a probabilistic encoding that achieves exactly half the asymptotic bit length required for the minimum deterministic encoding, and its optimality. The main tool for the proof is the following minimax relationship between the fidelity and the trace distance.
Lemma 3.
For any CPTP linear mapping , it holds that
(8) |
Proof.
We use the minimax theorem as follows:
(9) |
Note that the minimax theorem, used in the second equation, is applicable since is affine with respect to each variable and the domain of and are compact and convex. The last equality holds since the maximum is achieved if . ∎
As a special case of Lemma 3, we obtain the following lemma about the relationship of the accuracy of approximating pure states by a finite number of pure states and that by their ensembles:
Corollary 1.
Let be a finite set of pure states. Then, it holds that
(10) |
where the first minimum is taken over probability distribution over .
Proof.
Suppose in Lemma 3 is a classical-quantum channel such that , which corresponds to a particular decoder for a classical encoding that outputs pure state when label is inputted. Then, L.H.S. of Eq. (3) is equal to that of Eq. (10) by interchanging and . and R.H.S. of Eq. (3) is equal to
(11) |
which is equal to R.H.S. of Eq. (10) since the equality of the second inequlity in Eq. (1) holds when and are pure. ∎
This corollary implies that an internal -covering is sufficient to approximate arbitrary pure state within accuracy by using its probabilistic mixture. This can be intuitively understood by the curvature of the sphere as illustrated in Fig. 2. Indeed, Corollary 1 (for ) and the Bloch representation imply that for any compact and convex set whose extreme points reside on sphere with radius , holds, where and are the distance between and the farthest point on from and that between and the farthest point on from , respectively. This can also be derived from elementary geometric observations.

By using Lemma 3 and Corollary 1, we can derive following asymptotically tight bounds on the minimum bit length required for probabilistic encodings:
Theorem 2.
For any and an integer , the minimum size of label set used over all probabilistic encodings of with accuracy is bounded by
(12) |
where .
Proof.
When is an internal -covering, Corollary 1 implies satisfies Eq. (2). Thus, there exists a probabilistic encoding of with accuracy and label set , whose size is upper bounded by using Lemma 1 with setting .
Next, we show the lower bound. Let satisfy Eq. (2). By using Lemma 3 and setting a classical-quantum channel such that as in the proof of Corollary 1, we obtain
(13) | |||||
By letting , we obtain that for any , there exists and such that
(14) | |||||
Thus, is an internal -covering of . Hence, the lower bound can be obtained by applying Lemma 1 as . ∎
Using Theorem 2 and , we obtain the asymptotic bit rate per dimension of the minimum probabilistic encoding for fixed , and one per accuracy of the minimum probabilistic encoding for fixed , which are exactly half of those of the minimum deterministic encoding.
IV Probabilistic circuit synthesis
In the context of circuit synthesis using a finite universal gate set, it has recently been found that the length of the gate sequence or the number of gates can be reduced by using ensembles of gate sequences [10, 11]. This is based on the so-called mixing lemma, which shows that if finite set of unitary transformations approximates arbitrary unitary transformations within sufficient high accuracy, it is possible to increase the accuracy by using particular ensembles .
Based on Corollary 1, where we derive the accuracy achieved by the optimal ensembles of pure states in approximating arbitrary pure states, we can derive bounds on the accuracy achieved by the optimal ensembles of unitary transformations in approximating arbitrary unitary transformations in the following theorem. Note that similarly to [10, 11], we measure the accuracy of the approximation by using the diamond norm (sometimes called the completely bounded trace norm) of hermitian preserving linear mappings, defined as
(15) |
where , is the identity mapping on and .
Theorem 3.
For an integer specified below, let be a finite set of unitary transformations on . Then, it holds that
(16) | |||
(17) |
where the first minimum is taken over probability distribution over . Note that if , the equalities hold.
This theorem resembles Corollary 1 (especially when ), which can be regarded as a consequence of the similarity between pure states and unitary transformations via the Choi-Jamiołkowski representation. Although a proof uses the minimax theorem as in the proof of Corollary 1, it requires additional work to some extent. We give a complete proof in Appendix D with the sharp lower bound.
This theorem implies that quantum circuit synthesis using probabilistically generated circuits formed from finite universal gate set can reduce the circuit size. To see this, first, let be the set of unitary transformations representing the unitary circuit realized by gate sequences of length at most . Next, let be the smallest length of gate sequences to approximate arbitrary unitary transformations within accuracy , i.e., . Theorem 3 implies that by using the probabilistic implementation, we can implement arbitrary unitary transformation within accuracy only with an -size circuit.
The accuracy in approximating unitary transformations is often measured by using the operator norm . The celebrated Solovay-Kitaev theorem shows that for any finite universal gate set , set of unitary circuits realized by gate sequences of length () is sufficient for approximating arbitrary unitary operators, i.e., , where we denote a unitary circuit and the unitary operator representing the circuit by the same symbol . On the other hand, a relationship between the operator norm and the diamond norm shown in Appendix E implies that if , where and . Combining with Theorem 3, we can verify that arbitrary unitary transformations can be approximated by ensembles of such that
(18) |
if . This bound is tighter than the previous bound obtained in [10, Theorem 1], . Moreover, our bound holds if while the previous bound was shown to hold for . In addition to the improved estimation, our lower bound reveals the limitation of the probabilistic implemetation.
As suggested by the Solovay-Kitaev theorem, if we assume in the high-accuracy regime (), the probabilistic implementation reduces the length of gate sequences by about since from a volume consideration. Even in the low-accuracy regime, our bound guarantees the reduction. For example, we show how the gate length can be reduced by using the probabilistic implementation to synthesize single-qubit unitary transformations with gate set in Appendix F.
V Conclusion
In this paper, we have considered the minimum probabilistic encoding so as to approximately reconstruct an arbitrary pure state from an -bit string within accuracy with respect to the trace distance. We then demonstrated that it cannot be realized by simply storing an element of the minimum -covering. More precisely, we proved that the bit rate required for probabilistic encodings is exactly half of that of the minimum length of bits necessary to store elements of an -covering of in asymptotic limit or in limit when . In limit when , the same result holds if we consider only internal -coverings; however, in general, whether the same result holds or not is an open problem. Several numerical calculations suggest a positive answer.
Moreover, we show that similarly to the state encoding, for any finite set of unitary transformations, we can at least quadratically increase the accuracy of approximating arbitrary unitary transformations by using ensembles of . Particularly, we obtain bounds on the accuracy when one uses the optimal ensembles to reveal the possibility and the limitation of the probabilistic implementation of unitary transformations.
Our result could provide a new quantitative guiding principle to explore further capabilities and limitations of manipulating a quantum system as well as the foundations of quantum theory, including the following two related topics:
-
1.
The results demonstrate an information-theoretical separation of the memory size to store a pure quantum state between strong simulations and weak ones, which are two types of classical simulation of a quantum computer [14, 15, 16] (the former approximately computes the probability distribution over the outcomes, whereas the latter only approximately samples the outcomes.)
-
2.
The complex projective space representation of pure states can be regarded as a classical encoding of pure states in deterministic classical states describing operators in . The fact that any distinct pure states are encoded indistinguishable classical states inclines us to think that the indistinguishability of non-orthogonal pure states results from our limited ability to measure them. To interpret the indistinguishability as an intrinsic feature of pure states, classical encodings of pure states in probabilistic classical states have been constructed in -epistemic models [17, 18], in which the encodings use indistinguishable and probabilistic classical states to encode some distinct pure states. Our results show that indistinguishable classical states encoding distinct elements in an -covering are not only helpful for such an interpretation but also necessary for the minimum probabilistic encoding.
Acknowledgements.
We thank Yuki Takeuchi, Yasunari Suzuki, Yasuhiro Takahashi, and Adel Sohbi for helpful discussions. This work was partially supported by JST Moonshot R&D MILLENNIA Program (Grant No.JPMJMS2061). SA was partially supported by JST, PRESTO Grant No.JPMJPR2111 and JPMXS0120319794. GK was supported in part by the Grant-in-Aid for Scientific Research (C) No.20K03779, (C) No.21K03388, and (S) No.18H05237 of JSPS, CREST (Japan Science and Technology Agency) Grant No.JPMJCR1671. ST was partially supported by the Grant-in-Aid for Transformative Research Areas No.JP20H05966 of JSPS.References
- Shende et al. [2006] V. Shende, S. Bullock, and I. Markov, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 25, 1000 (2006).
- Brandão et al. [2021] F. G. Brandão, W. Chemissany, N. Hunter-Jones, R. Kueng, and J. Preskill, PRX Quantum 2, 030316 (2021).
- Huang et al. [2020] H.-Y. Huang, R. Kueng, and J. Preskill, Nature Physics 16, 1050 (2020).
- Raz [1999] R. Raz, in Proceedings of the 31st Annual ACM Symposium on Theory of Computing, STOC ’99 (Association for Computing Machinery, New York, NY, USA, 1999) pp. 358–367.
- Gosset and Smolin [2019] D. Gosset and J. Smolin, in Proceedings of the 14th Conference on the Theory of Quantum Computation, Communication and Cryptography (2019) pp. 8:1–8:9.
- Hayden et al. [2006] P. Hayden, D. W. Leung, and A. Winter, Communications in Mathematical Physics 265, 95 (2006), arXiv:0407049 [quant-ph] .
- Gavinsky and Ito [2013] D. Gavinsky and T. Ito, Quantum Info. Comput. 13, 583 (2013).
- Guralnick and Sudakov [2017] R. Guralnick and B. Sudakov, Alice and Bob Meet Banach (American Mathematical Society, 2017).
- Montanaro [2019] A. Montanaro, Quantum 3, 154 (2019).
- Campbell [2017] E. Campbell, Phys. Rev. A 95, 042306 (2017).
- Hastings [2017] M. B. Hastings, Quantum Info. Comput. 17, 488 (2017).
- Fuchs and van de Graaf [1999] C. Fuchs and J. van de Graaf, IEEE Transactions on Information Theory 45, 1216 (1999).
- Horodecki et al. [2003] M. Horodecki, P. W. Shor, and M. B. Ruskai, Reviews in Mathematical Physics 15, 629 (2003), https://doi.org/10.1142/S0129055X03001709 .
- Van Den Nes [2010] M. Van Den Nes, Quantum Info. Comput. 10, 258 (2010).
- Bravyi et al. [2019] S. Bravyi, D. Browne, P. Calpin, E. Campbell, D. Gosset, and M. Howard, Quantum 3, 181 (2019).
- Hillmich et al. [2020] S. Hillmich, I. L. Markov, and R. Wille, in 2020 57th ACM/IEEE Design Automation Conference (DAC) (2020) pp. 1–6.
- Lewis et al. [2012] P. G. Lewis, D. Jennings, J. Barrett, and T. Rudolph, Phys. Rev. Lett. 109, 150404 (2012).
- Aaronson et al. [2013] S. Aaronson, A. Bouland, L. Chua, and G. Lowther, Phys. Rev. A 88, 032111 (2013).
- Gutoski and Watrous [2007] G. Gutoski and J. Watrous, in Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’07 (Association for Computing Machinery, New York, NY, USA, 2007) pp. 565–574.
- Chiribella et al. [2009] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Phys. Rev. A 80, 022339 (2009).
Appendix A Volume of -ball in
To construct an -covering, we first derive the volume of -ball in as follows:
(19) |
where is the unitarily invariant probability measure on the Borel sets of .
When , Eq. (19) holds. By assuming , we proceed as follows:
(20) |
where the first equality uses fixed pure state and the unitary invariance of and the trace distance, the second equality uses Eq. (1), and the third equality uses the relationship between and the uniform spherical probability measure . Using a spherical coordinate system, we can proceed as follows:
(21) | |||||
(22) | |||||
and the domain of the integration is given by . Since the domain and that of the integrand have reflection symmetries about two lines and , it is sufficient to perform the integration in domain . By changing the variables as , we obtain
(23) | |||||
for . This completes the calculation.
Appendix B Upper bound for internal covering number
We construct an internal -covering () following the proof in [8, Corollary 5.5]. The construction is based on the fact that sufficiently many pure states randomly sampled form an -covering. However, since the probability of a new random pure state residing in the uncovered region decreases when many random -balls are sampled, it is better to stop sampling a pure state and change the strategy of the construction.
In the proof, we represent some parameters explicitly, which are tailored to the -covering with respect to the trace distance. Assume and let . Let be a set of finite randomly sampled pure states with respect to product measure . The expected volume of the region not covered by () can be calculated as follows:
(24) |
where we use Fubini’s theorem and Eq. (19) in the second equation and the third equation, respectively. Note that is the indicator function, i.e., iff is true.
Thus, there exists such that . Pick as much as possible such that are disjoint and contained in . When , we can verify that is an -covering and its size is upper bounded as
(25) |
By setting , and with , we obtain the following upper bound:
(26) |
where . Since , we obtain that for any there exists such that
(27) |
For example, if , we can set . This completes the proof.
Appendix C Lower bound for external covering number
We can derive a lower bound for the external covering number as a direct consequence of the following upper bounds on the volume of the maximum intersection of -ball in and , which will be proven in this section: For any and any ,
(28) | |||||
(29) |
Combined with Eq. (19), Eq. (29) implies that the maximum intersection is achieved when is pure if two conditions and are satisfied. These two conditions are not tight but cannot be fully relaxed since for any and , where is the maximally mixed state .
C.1 Proof of Eq. (28)
By defining , we obtain . For for any pure state . This completes the proof since for any .
C.2 Proof of Eq. (29)
Let , where is a set of eigenvectors of and eigenvalues are arranged in decreasing order, i.e., . Since depends not on the eigenvectors but on the eigenvalues of , it is sufficient to consider only diagonal with respect to a fixed basis. However, it is difficult to exactly calculate due to a complicated relationship between and the largest eigenvalue of , resulting from the condition .
We derive lower bound of and use the relationship to show Eq. (29), where is a measurable function. Since simple bound is too loose to show Eq. (29), we derive a tighter lower bound as follows: Let and be the Hermitian projectors on two-dimensional subspace and its orthogonal complement, respectively. We then obtain
(30) | |||||
where we use the monotonicity of the trace distance under a CPTP mapping in the first inequality. Define as the value in Eq. (30), which can be explicitly written as
(31) |
where and is an orthonormal basis of . The explicit formula implies is uniquely defined (although neither nor is uniquely defined if ) and continuous, thus measurable. Since , satisfying Eq. (29), if , we consider the case . By further assuming , we obtain
(32) |
where the this condition is not trivial, i.e., . Assuming , we calculate an upper bound on as follows: By defining as the set of unitary operators on and as , we can show that for any unitary operator ,
(33) |
where we use the unitarily invariance of in the first equality, , and is the indicator function, i.e., iff is true. By integrating Eq. (33) with respect to the Haar measure on and using Fubini’s theorem, we obtain
(34) |
where is identified with a pure state on acting on subspace . Using Fubini’s theorem again and Eq. (19), we can proceed with the calculation:
(35) |
where , we use the convexity of and the unitary invariance of in the last inequality, and we use the probability density of derived by Eq. (19) in the last equality. To confirm the calculation, we plot a comparison between and its upper bound for a particular as shown in Fig. 3, where we use the following explicit expression of :
(36) |
where and and . Note that .

It is sufficient to show that under the two conditions and ,
(37) |
since . Since the integrand of and its partial derivative with respect to are continuous, we can interchange the partial differential and integral operators:
(38) | |||||
where , and . Since and are non-negative in the entire considered region , if is non-negative for all . However, can be negative for some if and only if . Taking account of considered region , it is sufficient to show for all , where can be negative.
For fixed , let satisfy . Since is monotonically decreasing in , is uniquely defined, if and if . Thus, showing
(39) |
and
(40) |
is sufficient for . For
(41) | |||||
holds for any
Appendix D Proof for Thoerem 3 and its sharp lower bound
In this section, we prove Theorem 3 with a tighter lower bound as the following theorem. After the proof, we show that this lower bound is sharp by constructing set of unitary transformations such that is arbitrarily close to its lower bound while the upper bound given in the theorem is (actually, each in the set can be perfectly distinguished from the identity transformation,) where and are defined in the following theorem.
Theorem 4.
For an integer specified below, let be a finite set of unitary transformations on . Then, it holds that
(43) | |||
(44) |
where the first minimum is taken over probability distribution over . Note that if , the equalities hold.
The lower bound of Theorem 3 can be derived from this theorem as follows:
(45) |
In the proof, we use the following fact about the diamond norm: For two CPTP linear mappings and from to , we can verify
(46) |
where is the Choi-Jamiołkowski operator of linear mapping and is the set of measuring strategies [19] or that of quantum testers [20].
Proof.
Let and be unitary transformations from to , defined as and , respectively.
First, we show
(47) |
By using Eq. (46), we obtain
(48) |
where we use the minimax theorem in the second equation since is affine with respect to each variable and the domain of and are compact and convex. Since it is known that the set of mappings associated to quantum testers is equivalent to that of mappings associated to pure states and hermitian projectors [20, Theorem 10] for sufficiently large dimensional Hilbert space (to be self-contained, we provide a proof for the equivalence between the two mappings in Appendix G) we can proceed as follows:
(49) |
where is the set of hermitian projectors on . Let , and maximize Eq. (49). Since arbitrary unitary transformations cannot be represented by ensembles of finite set of unitary transformations, the first term cannot be , thus . Let be the pure state such that . Then, we can verify that Eq. (49) is still maximized even if we replace by . Thus, in Eq. (49) can be restricted as a pure state, i.e., , and we proceed as follows:
(50) |
Before proceeding to the next step, we show the set of mappings associated to pure states and is equivalent to that of mappings associated to linear operator such that , where is the Schatten -norm of . By using decompositions and with respect to orthonormal bases, we can verify that with equals to and . On the other hand, By using the singular value decomposition , where implies with some , we can verify that with and ( is an orthonormal basis) equals to .
By using the equivalent between two sets of mappings and , we proceed as follows:
(51) |
where we use the polar decomposition in the second equation and is a unitary operator. On the other hand, since , it holds that
(52) |
Since for any if the maximum and the minimum exist, we obtain Ineq. (47).
Next, we show
(53) |
where . Due to Eq. (52), we can verify that . First, observe that for any unitary operator on ,
(54) |
holds, where is the -th eigenvalue of , and in the inequality, we use the following two facts: (i) the minimization is achieved only if satisfies due to a geometric obseravation, and (ii) for such probability distribution and complex numbers , . By using this, we obtain
(55) |
This and Eq. (51) implies
(56) |
This completes the proof.
∎
D.1 Sharpness of the lower bound
We show the equality in Ineq. (53) holds when we approximate arbitrary unitary transformations by using restricted set of unitary transformations, where
(57) |
More precisely, since we assume is a finite set, we show that Ineq. (53) is getting tighter when and is an -covering of , i.e., , where . This can be shown by the following two observations:
First, for any , there exists pure state such that since is contained in the convex hull of ’s eigenvalues. Then, we obtain
(58) |
This implies that in Ineq. (53) satisfies .
Second, by using Eq. (51), we obtain
(59) |
where in the inequality, we use that for any , there exists such that , where and is a purification of . Since can be arbitrarily small positive number, showing
(60) |
is sufficient to prove the sharpness of Ineq. (53).
For any unitary operator , there exists such that are simultaneously diagonalizable and the -th eigenvalues of and are the same for all . By letting , where complex number satisfies and is an orthonormal basis, we can proceed as follows: for any unitary operator ,
(61) |
This completes the proof.
Appendix E The operator norm and the diamond norm of unitary transformations
Suppose . In this section, we show that if , where is a unitary transformation on .
By using the unitarily invariance of the diamond norm and the operator norm, it is sufficient to show if , where . Let be the -th eigenvalue of . Then, we obtain
(62) |
On the other hand, by using the similar observation used for deriving Eq. (52),
(63) |
By using an elementary geometric observation shown in Fig.4, we obtain . This completes the proof.

Appendix F Circuit synthesis of single qubit unitary transformations by using gate set
Recall that is the smallest length of gate sequences formed from gate set to approximate arbitrary single qubit unitary transformations within accuracy , i.e., , where is the set of unitary transformations representing the unitary circuit realized by the gate sequences of length at most . In this section, we perform a numerical calculation of and show that the probabilistic implementation reduces the gate length owing to Theorem 3.
First, we generate . Next, we randomly sample single-qubit unitary transformations with respect to the Haar measure on the unitary group on . Third, for each randomly sampled unitary transformation , we calculate the half of the minimum diamond norm between and . Then, we compute the maximum value of over all the sampled . Since another numerical experiment indicates that the set of randomly sampled single-qubit unitary transformations is -covering of that of single-qubit unitary transformations with high probability, we assume that gate sequences of length at most approximate arbitrary single-qubit unitary transformations within accuracy . (The true accuracy satisfies .)
In Fig.5, we plot and , which correspond to the minimum length of gate sequences to synthesize single-qubit unitary transformations within accuracy by using the deterministic implementation and the probabilistic one, respectively. This graph shows that the probabilistic implementation can reduce the circuit size in a wide range of accuracy .

Appendix G Equivalence between quantum testers and quantum networks
Recall that the Choi-Jamiołkowski operator of linear mapping is defined as , and the set of quantum testers is defined as . In this section, we show that the set of mappings associated to quantum testers is equivalent to that of mappings associated to pure states and hermitian projectors for sufficiently large dimensional Hilbert space . Note a proof for more general quantum testers is given in [20, Theorem 10].
First, we show that for any and , there exists such that as follows: By letting , we obtain
(64) |
where and represent the partial transpose of and the partial trace, and the subscript of the operator denotes the system where the operator acts on. We can also verify that as follows: Let , where with the computational basis and . Then, we obtain that for any postive semidefinte operator ,
(65) |
which implies . By letting , we can also verify that
(66) |
where satisfies , and the last inequality can be verified by the fact that .
Next, we show that for any , there exist and such that as follows: Let , be a purification of , its singular value decomposition be () and be , where and is the complex conjugate of . Then we can verify that
(67) |
Since , is a positive operator-valued measure (POVM), which can be embedded in a larger Hilbert space as a projection-valued measure (PVM) owing to the Naimark’s extension. This completes the proof.