This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Probabilistic unitary synthesis with optimal accuracy

Seiseki Akibue [email protected] 0000-0001-9654-9361 NTT Communication Science Laboratories, NTT Corporation3–1, Morinosato-WakamiyaAtsugiKanagawaJapan243-0198 Go Kato Advanced ICT Research Institute, NICT4–2–1, Nukui-KitamachiKoganeiTokyoJapan184-8795  and  Seiichiro Tani Department of Mathematics, Waseda University1–6–1, Nishi-WasedaShinjukuTokyoJapan169-8050
(2023; 16 January 2023; 16 March 2009; 5 June 2009)
Abstract.

The purpose of unitary synthesis is to find a gate sequence that optimally approximates a target unitary transformation. A new synthesis approach, called probabilistic synthesis, has been introduced, and its superiority has been demonstrated over traditional deterministic approaches with respect to approximation error and gate length. However, the optimality of current probabilistic synthesis algorithms is unknown. We obtain the tight lower bound on the approximation error obtained by the optimal probabilistic synthesis, which guarantees the sub-optimality of current algorithms. We also show its tight upper bound, which improves and unifies current upper bounds depending on the class of target unitaries. These two bounds reveal the fundamental relationship of approximation error between probabilistic approximation and deterministic approximation of unitary transformations. From a computational point of view, we show that the optimal probability distribution can be computed by the semidefinite program (SDP) we construct. We also construct an efficient probabilistic synthesis algorithm for single-qubit unitaries, rigorously estimate its time complexity, and show that it reduces the approximation error quadratically compared with deterministic algorithms.

quantum gate synthesis, convex approximation, unitary gate decomposition
copyright: acmcopyrightjournalyear: 2023doi: XXXXXXX.XXXXXXXprice: 15.00isbn: 978-1-4503-XXXX-X/18/06ccs: Theory of computation Quantum complexity theoryccs: Theory of computation Quantum information theory

1. Introduction

In quantum simulation and quantum computation, a global unitary transformation on a many-body quantum system is obtained as a sequence of unitary transformations on a fixed-size system, e.g., those obtained by nearest-neighbor interactions. To guarantee and increase the accuracy of obtaining such transformations, rather than controlling their continuous parameters, each unitary transformation on the fixed-size system is realized as a sequence of gates chosen from a finite gate set {gi}i\{g_{i}\}_{i}, where each gig_{i} results in a fixed unitary transformation with negligible error thanks to the sophisticated calibration, quantum error correction (Terhal, 2015) or the nature of the system (Kitaev, 2003). If {gi}i\{g_{i}\}_{i} is universal, arbitrary unitary transformation can be approximated by a unitary transformation gingi2gi1g_{i_{n}}\circ\cdots\circ g_{i_{2}}\circ g_{i_{1}} obtained as a gate sequence for an appropriate choice of gate length nn depending on the approximate error one wants to achieve. For a given universal gate set such as the set of the Hadamard, controlled-NOT, and π/8\pi/8 gates (Nielsen and Chuang, 2000), an algorithm to find a gate sequence for a given unitary transformation and an approximation error bound is called a unitary synthesis algorithm.

To suppress the effect of decoherence or overhead caused by the fault-tolerant implementation of each gate (Knill et al., 1998; Aharonov and Ben-Or, 2008), various studies (Kitaev et al., 2002; Harrow et al., 2002; Kliuchnikov et al., 2016, 2013; Ross, 2015; Bocharov et al., 2015; Fowler, 2011; Bouland and Giurgica-Tiron, 2021) have proposed unitary synthesis algorithms for minimizing the length of the output gate sequence. Following the celebrated Solovay-Kitaev algorithm (Kitaev et al., 2002), many algorithms are used to find one of the shortest gate sequences that can approximate a target unitary transformation Υ\Upsilon within the desired approximation error. Obviously, the goal can be achieved by brute force search (Fowler, 2011). However, to guarantee their efficiency, many algorithms are designed for synthesizing restricted classes of unitary transformations by using particular gate sets or for finding a nearly shortest gate sequence.

While approximating an Υ\Upsilon by using a single sequence of gates is a natural approach, the advantage of another approach using a probabilistic mixture of unitaries has been demonstrated (Hastings, 2017; Campbell, 2017; Kliuchnikov et al., 2022). Suppose that a synthesis algorithm produces a gate sequence for implementing a unitary transformation in {Υi}={gingi2gi1}i\{\Upsilon_{\vec{i}}\}=\{g_{i_{n}}\circ\cdots\circ g_{i_{2}}\circ g_{i_{1}}\}_{\vec{i}} in accordance with the probability distribution p(i)p(\vec{i}) to approximate an Υ\Upsilon. If the algorithm independently samples i\vec{i} for each time the Υ\Upsilon is used in the entire circuit, the physical transformation governed by the randomly executed unitary transformation Υi\Upsilon_{\vec{i}} in accordance with the p(i)p(\vec{i}) is described by a probabilistic mixture ip(i)Υi\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}} of unitaries. In this case, the approximation error should be measured by the distance between the Υ\Upsilon and ip(i)Υi\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}.

Campbell (Campbell, 2017) and Vadym et al. (Kliuchnikov et al., 2022) constructed algorithms to compute a probability distribution {p(i)}i\{p(\vec{i})\}_{\vec{i}} for a given Υ\Upsilon and a set {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}} of unitary transformations implemented as a gate sequence such that the approximation error of ip(i)Υi\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}} against Υ\Upsilon is almost quadratically better than that of a single optimal unitary transformation in {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}}. More precisely, Υip(i)Υi=O(ϵ2)\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}=O(\epsilon^{2}) for the worst approximation error ϵ=maxΥmini12ΥΥi\epsilon=\max_{\Upsilon}\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond} caused by deterministic synthesis, where 𝒜\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond} is the diamond norm (Watrous, 2018; Kitaev et al., 2002). This also indicates that probabilistically executing Υi\Upsilon_{\vec{i}} in accordance with p(i)p(\vec{i}) can further reduce the length of the shortest gate sequence without increasing the approximation error (if one measures the error by using the above diamond norm) (Campbell, 2017). In general, probabilistic synthesis consists of two procedures; (i) computing a probability distribution {p(i)}i\{p(\vec{i})\}_{\vec{i}} and sampling a description i\vec{i} of a gate sequence with a classical computer, and (ii) implementing the sampled gate sequence on a quantum computer. In contrast to deterministic synthesis, procedure (ii) may require a quantum computer to rearrange a gate sequence each time it realizes a target unitary transformation. However, such rearrangeability is usually assumed as a standard functionality of a quantum computer.

However, procedure (i) should be meticulously designed to construct a practical synthesis algorithm. This is because the number of possible gate sequences grows exponentially with respect to the length nn of the sequence, resulting in a large degree of freedom in choosing {p(i)}i\{p(\vec{i})\}_{\vec{i}}. While a probabilistic synthesis algorithm with guaranteed time complexity exists for single-qubit unitary transformations that correspond to axial rotations (Kliuchnikov et al., 2022), no such algorithm was known even for general single-qubit unitary transformations. Furthermore, a fundamental question remained open regarding the optimality of existing synthesis algorithms in comparison to the minimum approximation error minpΥip(i)Υi\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}. Minimax optimization makes it difficult to investigate the minimum approximation error from an analytical perspective except for a few specific Υ\Upsilon and sets {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}} (Sacchi and Sacchi, 2017).

1.1. Our contribution

We obtain the tight lower bound on minpΥip(i)Υi\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}, which reveals the fundamental limitation of probabilistic synthesis and indicates the sub-optimality of current algorithms. To obtain the main result, we focus on the analytical relationship between minpΥip(i)Υi\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond} and miniΥΥi\min_{\vec{i}}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond}, which represent the minimum approximation error obtained by probabilistic synthesis and that by deterministic synthesis, respectively. To be mathematically comprehensive, we also obtain the tight upper bound on minpΥip(i)Υi\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}, which essentially unifies various upper bounds (Hastings, 2017; Campbell, 2017; Kliuchnikov et al., 2022) depending on the class of target unitary transformations. More precisely, the two bounds are given as the following theorem.

THEOREM 4.3. (simplified version) For an integer d2d\geq 2 specified below, let Υ\Upsilon and {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}} be a target unitary transformation and a finite set of unitary transformations on the dd-dimensional Hilbert space, respectively. It then holds that

(1) 4δd(1δd)maxΥminp12Υip(i)Υiϵ2with{δ=11ϵ2andϵ=maxΥmini12ΥΥi.\frac{4\delta}{d}\left(1-\frac{\delta}{d}\right)\leq\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}\leq\epsilon^{2}\ \ {\rm with}\ \left\{\begin{array}[]{l}\delta=1-\sqrt{1-\epsilon^{2}}\ \ \ {\rm and}\\ \epsilon=\max_{\Upsilon}\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond}.\end{array}\right.

This theorem provides bounds on the worst approximation error caused when one probabilistically synthesizes the target unitary transformation that is most difficult to approximate. As shown in Fig. 1, the gap between the upper and lower bounds exists if and only if d3d\geq 3. We can show that the gap is inevitable by constructing {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}} for achieving the upper bound and that for achieving the lower bound. That is, Ineq. (1) represents the fundamental relationship of the approximation error between the deterministic approximation of unitary transformations and their probabilistic approximation that depends only on the dimension dd of the system.

Refer to caption
Figure 1. Lower and upper bounds on worst approximation error maxΥminp12Υip(i)Υi\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond} caused by probabilistic synthesis with respect to maxΥmini12ΥΥi\max_{\Upsilon}\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond} caused by deterministic synthesis for two-qubit systems, i.e., d=4d=4. Both lower and upper bounds, represented with thick and thin curves, respectively, are achievable for certain {Υi}\{\Upsilon_{\vec{i}}\}.

From a computational point of view, we show that the optimal probability distribution for approximating an Υ\Upsilon can be computed by the semidefinite program (SDP) we construct when the set {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}} of unitary transformations implemented as a gate sequence is given. (This set is computable with certain synthesis algorithms.) In addition to its optimality, we can rigorously estimate the worst time complexity of our SDP due to established methods for numerically solving SDPs. As the second main result, we construct a probabilistic synthesis algorithm for single-qubit unitary transformations from the following theorem.

THEOREM 5.4. (informal version) For a given gate set, there exists a probabilistic synthesis algorithm for a single-qubit unitary transformation with

INPUT: a target single-qubit unitary transformation Υ\Upsilon and target approximation error ϵ(0,1)\epsilon\in\left(0,1\right)

OUTPUT: a gate sequence implementing a single-qubit unitary transformation Υi\Upsilon_{\vec{i}} sampled from a set {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}} in accordance with probability distribution p^(i)\hat{p}(\vec{i}).

such that the algorithm satisfies the following properties:

  • Efficiency: All steps of the algorithm take polylog(1ϵ)polylog\left(\frac{1}{\epsilon}\right)-time,

  • Quadratic improvement: The approximation error 12Υip^(i)Υi\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}\hat{p}(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond} obtained with this algorithm is upper bounded by ϵ2\epsilon^{2}, whereas the error mini12ΥΥi\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond} obtained by deterministic synthesis using the unitary transformations in {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}} is upper bounded by ϵ\epsilon.

The first property of the algorithm is desirable for fault-tolerant quantum computation (FTQC). The polylog(1ϵ)polylog\left(\frac{1}{\epsilon}\right)-time overhead due to the synthesis algorithm does not impair a quadratic speedup achieved with a quantum computer over a classical computer since the approximation error of each unitary transformation should satisfy 1ϵ=poly(n)\frac{1}{\epsilon}=poly\left(n\right) if a quantum circuit contains a polynomial number of single-qubit unitaries with respect to the problem size nn. Due to the second property of the algorithm, we can verify that it surpasses current algorithms (Hastings, 2017; Campbell, 2017) with respect to the approximation error. Our algorithm also surpasses a current algorithm (Kliuchnikov et al., 2022) in terms of applicability to a general single-qubit unitary transformation.

1.2. Technical outline

Previous studies searched for the mixing probability distribution {p(i)}i\{p(\vec{i})\}_{\vec{i}} by using the first-order approximation of unitary operators (Hastings, 2017; Campbell, 2017) and obtained the upper bound on the worst approximation error maxΥminp12Υip(i)Υi\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond} caused by probabilistic synthesis. In contrast, we use the strong duality of SDP, essentially equivalent to the minimax theorem, to obtain tight bounds Ineq. (1) obtained by the optimal mixing probability distribution. A similar technique can be found in the analyses of the optimal convex approximation of quantum states by using a restricted set of states (Sacchi, 2017) and that of unital mappings by using unitary transformations (Yu et al., 2012). While inventing tractable upper bounds on the approximation error of a general unital mapping is an open problem (Yu et al., 2012), we provide an upper bound by exploiting the property of a unitary transformation as a pure unital mapping.

To prove that our single-qubit unitary synthesis algorithm satisfies the expected properties, we show the fact that Υi\Upsilon_{\vec{i}} that is far from Υ\Upsilon is not necessary to be sampled to optimally approximate Υ\Upsilon for single-qubit unitary transformations by exploiting the magic basis (Bennett et al., 1996) representation of single-qubit unitary operators. The magic basis representation enables us to embed the metric space of single-qubit unitary transformations induced by the diamond norm into that of S3S^{3} with respect to the angle. While numerical simulations indicate the same fact holds for qudit unitary transformations, rigorous proof is a subject for future work.

1.3. Organization

This article is organized as follows. Section 2 is devoted to preliminaries, introducing basis notations in quantum information theory and semidefinite programming. In Section 3, we construct an SDP that computes the optimal probability distribution in probabilistic synthesis. The SDP is provided as a primal and dual problem whose solutions coincide due to the strong duality of the SDP. The coincidence plays a crucial role in the proof of the first main theorem about the fundamental limitation on the approximation error shown in Section 4. Section 5 provides an efficient probabilistic synthesis algorithm for single-qubit unitary transformations as the second main theorem. We also provide a simple geometric interpretation of the superiority of probabilistic synthesis by considering single-qubit unitary transformations corresponding to axial rotations in Section 5.2. We present our conclusions in Section 6.

2. Preliminaries

In this section, we summarize basic notations used throughout the paper. Note that we consider only finite-dimensional Hilbert spaces. In particular, two-dimensional Hilbert space 2\mathbb{C}^{2} is called a qubit. The 𝐋()\mathbf{L}\left(\mathcal{H}\right) and 𝐏𝐨𝐬()\mathbf{Pos}\left(\mathcal{H}\right) represent the set of linear operators and positive semidefinite operators on Hilbert space \mathcal{H}, respectively. 𝕀𝐏𝐨𝐬()\mathbb{I}\in\mathbf{Pos}\left(\mathcal{H}\right) represents the identity operator, and we sometimes use the subscript to specify the system where 𝕀\mathbb{I} acts as 𝕀\mathbb{I}_{\mathcal{H}}. For Hermitian operators AA and BB on \mathcal{H}, ABA\geq B represents AB𝐏𝐨𝐬()A-B\in\mathbf{Pos}\left(\mathcal{H}\right), and A>BA>B represents ABA-B is positive definite. The 𝐒():={ρ𝐏𝐨𝐬():tr[ρ]=1}\mathbf{S}\left(\mathcal{H}\right):=\left\{\rho\in\mathbf{Pos}\left(\mathcal{H}\right):\text{tr}\left[\rho\right]=1\right\} and 𝐏():={ρ𝐒():tr[ρ2]=1}\mathbf{P}\left(\mathcal{H}\right):=\left\{\rho\in\mathbf{S}\left(\mathcal{H}\right):\text{tr}\left[\rho^{2}\right]=1\right\} represent the set of quantum states and that of pure states, respectively. Pure state ϕ𝐏()\phi\in\mathbf{P}\left(\mathcal{H}\right) is sometimes alternatively represented by complex unit vector |ϕ|{\phi}\rangle\in\mathcal{H} satisfying ϕ=|ϕϕ|\phi=|{\phi}\rangle\langle{\phi}|. Any physical transformation of the quantum state can be represented by a completely positive and trace preserving (CPTP) linear mapping Γ:𝐋(1)𝐋(2)\Gamma:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right). There exists one-to-one correspondence between a linear mapping Ξ:𝐋(1)𝐋(2)\Xi:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right) and its Choi-Jamiołkowski operator J(Ξ):=i,j|ij|Ξ(|ij|)𝐋(12)J(\Xi):=\sum_{i,j}|{i}\rangle\langle{j}|\otimes\Xi(|{i}\rangle\langle{j}|)\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right).

The trace distance ρσtr\left\|\rho-\sigma\right\|_{\text{tr}} of two quantum states ρ,σ𝐒()\rho,\sigma\in\mathbf{S}\left(\mathcal{H}\right) is defined as Mtr:=12tr[MM]\left\|M\right\|_{\text{tr}}:=\frac{1}{2}\text{tr}\left[\sqrt{MM^{\dagger}}\right] for M𝐋()M\in\mathbf{L}\left(\mathcal{H}\right). It represents the maximum total variation distance between probability distributions obtained from measurements performed on two quantum states. A similar notion measuring the distinguishability of ρ\rho and σ\sigma is the fidelity function, defined by F(ρ,σ):=maxtr[ΦρΦσ]F\left(\rho,\sigma\right):=\max\text{tr}\left[\Phi^{\rho}\Phi^{\sigma}\right], where Φρ𝐏()\Phi^{\rho}\in\mathbf{P}\left(\mathcal{H}\otimes\mathcal{H}^{\prime}\right) is a purification of ρ\rho, i.e., ρ=tr[Φρ]\rho=\text{tr}_{\mathcal{H}^{\prime}}\left[\Phi^{\rho}\right], and the maximization is taken over all the purifications. Fuchs-van de Graaf inequalities (Fuchs and van de Graaf, 1999) provide relationships between the two measures with respect to the distinguishability as follows:

(2) 1F(ρ,σ)ρσtr1F(ρ,σ)1-\sqrt{F\left(\rho,\sigma\right)}\leq\left\|\rho-\sigma\right\|_{\text{tr}}\leq\sqrt{1-F\left(\rho,\sigma\right)}

holds for any state ρ,σ𝐒()\rho,\sigma\in\mathbf{S}\left(\mathcal{H}\right), where the equality of the right inequality holds when ρ\rho and σ\sigma are pure.

The distance measuring the distinguishability of two CPTP mappings 𝒜,:𝐋(1)𝐋(2)\mathcal{A},\mathcal{B}:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right) corresponding to the trace distance is the diamond norm 𝒜\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond} defined by 12𝒜:=maxΦ𝐏(13)((𝒜)id)(Φ)tr\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}:=\max_{\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)}\left\|((\mathcal{A}-\mathcal{B})\otimes id)(\Phi)\right\|_{\text{tr}}, where idid represents the identity mapping acting on 3\mathcal{H}_{3}.

Let Ξ:𝐋(1)𝐋(2)\Xi:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right) be a linear Hermitian-preserving mapping and AA and BB be Hermitian operators on 1\mathcal{H}_{1} and 2\mathcal{H}_{2}, respectively. SDP is an optimization problem formally defined with a triple (Ξ,A,B)(\Xi,A,B) as follows (Watrous, 2018):

(3)
Primal problem Dual problem
maximize: tr[AX]\text{tr}\left[AX\right] minimize: tr[BY]\text{tr}\left[BY\right]
subject to: X𝐏𝐨𝐬(1)X\in\mathbf{Pos}\left(\mathcal{H}_{1}\right), subject to: Y is a Hermitian operator on 2Y\text{\ is\ a\ Hermitian\ operator\ on\ }\mathcal{H}_{2},
Ξ(X)=B\Xi(X)=B Ξ(Y)A\Xi^{\dagger}(Y)\geq A,

where Ξ:𝐋(2)𝐋(1)\Xi^{\dagger}:\mathbf{L}\left(\mathcal{H}_{2}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{1}\right) is the adjoint of Ξ\Xi, defined as the linear mapping satisfying tr[YΞ(X)]=tr[(Ξ(Y))X]\text{tr}\left[Y^{\dagger}\Xi(X)\right]=\text{tr}\left[(\Xi^{\dagger}(Y))^{\dagger}X\right] for all X𝐋(1)X\in\mathbf{L}\left(\mathcal{H}_{1}\right) and Y𝐋(2)Y\in\mathbf{L}\left(\mathcal{H}_{2}\right). We can easily verify that the solution to the primal problem is smaller than or equal to that of the dual problem. The situation when the two solutions coincide is called a strong duality. Slater’s theorem states that the strong duality holds if either of the following conditions holds:

  1. (1)

    The solution to the primal problem is finite, and there exists a Hermitian operator YY on 2\mathcal{H}_{2} such that Ξ(Y)>A\Xi^{\dagger}(Y)>A.

  2. (2)

    The solution to the dual problem is finite, and there exists a positive definite operator XX on 1\mathcal{H}_{1} such that Ξ(X)=B\Xi(X)=B.

For a metric space (X,d)(X,d) and two subsets S,TXS,T\subseteq X, SS is called an ϵ\epsilon-covering of TT if suptTinfsSd(s,t)ϵ\sup_{t\in T}\inf_{s\in S}d(s,t)\leq\epsilon. In this article, we basically assume that XX is the set of CPTP mappings, the metric is defined as d(𝒜,)=12𝒜d(\mathcal{A},\mathcal{B})=\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}, SS is a finite set of unitary transformations and TT is a subset of unitary transformations such as a 2ϵ2\epsilon-ball {Υ:12ΥΥ2ϵ}\left\{\Upsilon^{\prime}:\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}\leq 2\epsilon\right\} around a unitary transformation Υ:𝐋()𝐋()\Upsilon:\mathbf{L}\left(\mathcal{H}\right)\rightarrow\mathbf{L}\left(\mathcal{H}\right).

3. Semidefinite programming for computing optimal mixing probability

In this section, we construct an SDP for computing the optimal probability distribution that minimizes the diamond norm between the target CPTP mapping 𝒜\mathcal{A} and a probabilistic mixture of CPTP mappings {x}x\{\mathcal{B}_{x}\}_{x}. We can compute the optimal probability distribution in probabilistic unitary synthesis by solving this SDP by restricting 𝒜\mathcal{A} and {x}x\{\mathcal{B}_{x}\}_{x} as unitary transformations. We also mention the relationship between our SDP and the algorithm proposed by Campbell (Campbell, 2017).

Proposition 3.1.

Let 𝒜\mathcal{A} and {x}xX\{\mathcal{B}_{x}\}_{x\in X} be a target CPTP mapping and a finite set of CPTP mappings from 𝐋(1)\mathbf{L}\left(\mathcal{H}_{1}\right) to 𝐋(2)\mathbf{L}\left(\mathcal{H}_{2}\right), respectively. Then, distance minp12𝒜xXp(x)x\min_{p}\frac{1}{2}\left\|\mathcal{A}-\sum_{x\in X}p(x)\mathcal{B}_{x}\right\|_{\diamond} and the optimal probability distribution {p(x)}xX\{p(x)\}_{x\in X}, which minimizes the distance, can be computed with the following SDP:

(4)
Primal problem Dual problem
maximize: tr[J(𝒜)T]t\text{tr}\left[J(\mathcal{A})T\right]-t minimize: rr\in\mathbb{R}
subject to: 0Tρ𝕀20\leq T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}, subject to: S0SJ(𝒜xXp(x)x)S\geq 0\wedge S\geq J\left(\mathcal{A}-\sum_{x\in X}p(x)\mathcal{B}_{x}\right),
ρ𝐒(1)\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right) r𝕀1tr2[S]r\mathbb{I}_{\mathcal{H}_{1}}\geq\text{tr}_{\mathcal{H}_{2}}\left[S\right],
xX,tr[J(x)T]t\forall x\in X,\text{tr}\left[J(\mathcal{B}_{x})T\right]\leq t. xX,p(x)0\forall x\in X,p(x)\geq 0,
xXp(x)1\sum_{x\in X}p(x)\leq 1.

Note that the strong duality holds in this SDP, i.e., the optimum primal and dual values are equal.

Proof.

Recall that for two CPTP mapping 𝒜\mathcal{A} and \mathcal{B} from 𝐋(1)\mathbf{L}\left(\mathcal{H}_{1}\right) to 𝐋(2)\mathbf{L}\left(\mathcal{H}_{2}\right), 12𝒜\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond} can be computed by the following SDP:

Primal problem Dual problem
maximize: tr[J(𝒜)T]\text{tr}\left[J(\mathcal{A}-\mathcal{B})T\right] minimize: rr\in\mathbb{R}
subject to: 0Tρ𝕀20\leq T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}, subject to: S0SJ(𝒜)S\geq 0\wedge S\geq J(\mathcal{A}-\mathcal{B}),
ρ𝐒(1)\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right). r𝕀1tr2[S]r\mathbb{I}_{\mathcal{H}_{1}}\geq\text{tr}_{\mathcal{H}_{2}}\left[S\right].

The primal problem can be obtained by observing

(5) 12𝒜\displaystyle\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond} =\displaystyle= maxΦ𝐏(13)Π𝐏𝐫𝐨𝐣(23)tr[((𝒜)id)(Φ)Π]\displaystyle\max_{\begin{subarray}{c}\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)\\ \Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3})\end{subarray}}\text{tr}\left[((\mathcal{A}-\mathcal{B})\otimes id)(\Phi)\Pi\right]
(6) =\displaystyle= maxT𝐓(1:2)tr[J(𝒜)T],\displaystyle\max_{T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})}\text{tr}\left[J(\mathcal{A}-\mathcal{B})T\right],

where Π\Pi is a Hermitian projector acting on 23\mathcal{H}_{2}\otimes\mathcal{H}_{3}, 𝐓(1:2):={T𝐏𝐨𝐬(12):ρ𝐒(1),Tρ𝕀}\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}\} is called the set of measuring strategies (Gutoski and Watrous, 2007) or that of quantum testers (Chiribella et al., 2009), and the last equality was shown by Chiribella et al. (Chiribella et al., 2009, Theorem 10). To be self-contained, we provide a proof for the equality in Appendix A, with which the equality can be verified by applying Eq. (60) with fixing Ξ=𝒜\Xi=\mathcal{A}-\mathcal{B}. A formal SDP and the verification of the strong duality are provided in Appendix B.

By extending the dual problem of this SDP to include the minimization of probability distribution {p(x)}xX\{p(x)\}_{x\in X}, we obtain Eq. (4). Note that the last condition xXp(x)1\sum_{x\in X}p(x)\leq 1 in the dual problem is different from the condition xXp(x)=1\sum_{x\in X}p(x)=1 of a probability distribution; however, the optimum dual value can be achieved under the latter condition. Again, a formal SDP and the verification of the strong duality are provided in Appendix B. ∎

For a given Υ:𝐋()𝐋()\Upsilon:\mathbf{L}\left(\mathcal{H}\right)\rightarrow\mathbf{L}\left(\mathcal{H}\right) and a given set {Υx:𝐋()𝐋()}xX\{\Upsilon_{x}:\mathbf{L}\left(\mathcal{H}\right)\rightarrow\mathbf{L}\left(\mathcal{H}\right)\}_{x\in X} of unitary transformations implemented as a gate sequence, which forms an ϵ\epsilon-covering the set of unitary transformations with sufficiently small ϵ\epsilon, “the convex hull finding algorithm” proposed by Campbell (Campbell, 2017) can find a probability distribution {p~(x)}xX~\{\tilde{p}(x)\}_{x\in\tilde{X}} such that xX~p~(x)Hx=0\sum_{x\in\tilde{X}}\tilde{p}(x)H_{x}=0, where Υx(ρ)=Υ(eiHxρeiHx)\Upsilon_{x}(\rho)=\Upsilon(e^{iH_{x}}\rho e^{-iH_{x}}) and Hx=O(ϵ)H_{x}=O(\epsilon) for all xX~Xx\in\tilde{X}\subseteq X. Note that M=O(ϵ)M=O(\epsilon) represents M=O(ϵ)\left\|M\right\|_{\infty}=O(\epsilon) as ϵ0\epsilon\rightarrow 0 for a linear operator M𝐋()M\in\mathbf{L}\left(\mathcal{H}\right) depending on ϵ\epsilon. By using the dual problem in Proposition 3.1, we can verify that the distance ϵ\epsilon, which is achievable by a deterministic unitary synthesis finding the closest Υx\Upsilon_{x} to approximate Υ\Upsilon, can be improved into O(ϵ2)O(\epsilon^{2}) by mixing unitaries in accordance with the probability distribution {p~(x)}xX~\{\tilde{p}(x)\}_{x\in\tilde{X}} as follows. First, by using the dual problem of the SDP to compute the diamond norm between two CPTP mappings, we obtain

(7) 12ΥxX~p~(x)Υx=12idxX~p~(x)Υ1Υxtr[S]\displaystyle\frac{1}{2}\left\|\Upsilon-\sum_{x\in\tilde{X}}\tilde{p}(x)\Upsilon_{x}\right\|_{\diamond}=\frac{1}{2}\left\|id-\sum_{x\in\tilde{X}}\tilde{p}(x)\Upsilon^{-1}\circ\Upsilon_{x}\right\|_{\diamond}\leq\left\|\text{tr}_{\mathcal{H}^{\prime}}\left[S\right]\right\|_{\infty}
(8) withS0SJ(id)xX~p~(x)J(Υ1Υx),\displaystyle{\rm with}\ S\geq 0\wedge S\geq J(id)-\sum_{x\in\tilde{X}}\tilde{p}(x)J(\Upsilon^{-1}\circ\Upsilon_{x}),

where \mathcal{H}^{\prime} represents the the output system of Υ\Upsilon, which is isomorphic to \mathcal{H}. Second, by using the Taylor expansions eiHx=𝕀+iHx+Rxe^{iH_{x}}=\mathbb{I}+iH_{x}+R_{x}, where Rx=O(ϵ2)R_{x}=O(\epsilon^{2}), we obtain

(10) J(id)xX~p~(x)J(Υ1Υx)\displaystyle J(id)-\sum_{x\in\tilde{X}}\tilde{p}(x)J(\Upsilon^{-1}\circ\Upsilon_{x}) =\displaystyle= xX~p~(x){(RxJ(id)+J(id)Rx)i(HxJ(id)RxRxJ(id)Hx)}P\displaystyle\sum_{x\in\tilde{X}}\tilde{p}(x)\left\{-(R_{x}J(id)+J(id)R_{x}^{\dagger})-i(H_{x}J(id)R_{x}^{\dagger}-R_{x}J(id)H_{x})\right\}-P
\displaystyle\leq xX~p~(x){(1RxRxJ(id)Rx+RxJ(id))\displaystyle\sum_{x\in\tilde{X}}\tilde{p}(x)\Big{\{}\left(\frac{1}{\left\|R_{x}\right\|_{\infty}}R_{x}J(id)R_{x}^{\dagger}+\left\|R_{x}\right\|_{\infty}J(id)\right)
+(RxHxHxJ(id)Hx+HxRxRxJ(id)Rx)}\displaystyle\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ +\left(\frac{\left\|R_{x}\right\|_{\infty}}{\left\|H_{x}\right\|_{\infty}}H_{x}J(id)H_{x}+\frac{\left\|H_{x}\right\|_{\infty}}{\left\|R_{x}\right\|_{\infty}}R_{x}J(id)R_{x}^{\dagger}\right)\Big{\}}

where P=xX~p~(x)(HxJ(id)Hx+RxJ(id)Rx)𝐏𝐨𝐬()P=\sum_{x\in\tilde{X}}\tilde{p}(x)(H_{x}J(id)H_{x}+R_{x}J(id)R_{x}^{\dagger})\in\mathbf{Pos}\left(\mathcal{H}\otimes\mathcal{H}^{\prime}\right), HxH_{x} and RxR_{x} acts on \mathcal{H}^{\prime} and we use the fact that |ϕ~ψ~|+|ψ~ϕ~|ϕ~+ψ~|{\tilde{\phi}}\rangle\langle{\tilde{\psi}}|+|{\tilde{\psi}}\rangle\langle{\tilde{\phi}}|\leq\tilde{\phi}+\tilde{\psi} with complex vectors (|ϕ~,|ψ~)=(Rx12j(𝕀Rx)|jj,Rx12j|jj)(|{\tilde{\phi}}\rangle,|{\tilde{\psi}}\rangle)=\left(\left\|R_{x}\right\|_{\infty}^{-\frac{1}{2}}\sum_{j}(\mathbb{I}_{\mathcal{H}}\otimes R_{x})|{jj}\rangle,\left\|R_{x}\right\|_{\infty}^{\frac{1}{2}}\sum_{j}|{jj}\rangle\right) and (|ϕ~,|ψ~)=(Rx12Hx12j(𝕀Hx)|jj,iRx12Hx12j(𝕀Rx)|jj)(|{\tilde{\phi}}\rangle,|{\tilde{\psi}}\rangle)=\left(\left\|R_{x}\right\|_{\infty}^{\frac{1}{2}}\left\|H_{x}\right\|_{\infty}^{-\frac{1}{2}}\sum_{j}(\mathbb{I}_{\mathcal{H}}\otimes H_{x})|{jj}\rangle,i\left\|R_{x}\right\|_{\infty}^{-\frac{1}{2}}\left\|H_{x}\right\|_{\infty}^{\frac{1}{2}}\sum_{j}(\mathbb{I}_{\mathcal{H}}\otimes R_{x})|{jj}\rangle\right) in the inequality. Third, by letting SS in Eq. (8) be R.H.S. of Eq. (10), we obtain

(11) tr[S]\displaystyle\left\|\text{tr}_{\mathcal{H}^{\prime}}\left[S\right]\right\|_{\infty} =\displaystyle= xX~p~(x)((RxRx)TRx+Rx𝕀+Rx(Hx2)THx+Hx(RxRx)TRx)=O(ϵ2).\displaystyle\left\|\sum_{x\in\tilde{X}}\tilde{p}(x)\left(\frac{(R_{x}^{\dagger}R_{x})^{T}}{\left\|R_{x}\right\|_{\infty}}+\left\|R_{x}\right\|_{\infty}\mathbb{I}_{\mathcal{H}}+\frac{\left\|R_{x}\right\|_{\infty}(H_{x}^{2})^{T}}{\left\|H_{x}\right\|_{\infty}}+\frac{\left\|H_{x}\right\|_{\infty}(R_{x}^{\dagger}R_{x})^{T}}{\left\|R_{x}\right\|_{\infty}}\right)\right\|_{\infty}=O(\epsilon^{2}).

Since the approximation error 12ΥxX~p~(x)Υx\frac{1}{2}\left\|\Upsilon-\sum_{x\in\tilde{X}}\tilde{p}(x)\Upsilon_{x}\right\|_{\diamond} is generally worse than the optimal one minp12ΥxXp(x)Υx\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}, we can obtain a better probability distribution and better estimation of the approximation error by numerically solving the SDP shown in Proposition 3.1. The ellipsoid method guarantees that {p(x)}xX\{p(x)\}_{x\in X} and rr in the dual problem such that the difference between rr and the optimum dual value is less than ϵ\epsilon can be computed in poly(|X|log(1ϵ))poly\left(|X|\log\left(\frac{1}{\epsilon}\right)\right)-time (Lovász, 2003). Note that we assume the dimension of the Hilbert space is constant since the unitary synthesis is usually executed for Υ\Upsilon on a fixed-size system.

4. Tight bounds on error of probabilistic approximation

This section investigates the relationship between the discrete approximation of unitary transformations and the probabilistic approximation for a general Υ\Upsilon and general set {Υx}x\{\Upsilon_{x}\}_{x}. Specifically, we show the tight relationship between minpΥxp(x)Υx\min_{p}\left\|\Upsilon-\sum_{x}p(x)\Upsilon_{x}\right\|_{\diamond} and minxΥΥx\min_{x}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}, where the former represents the minimum approximation error obtained by probabilistic synthesis and the latter represents that by deterministic synthesis when {Υx}x\{\Upsilon_{x}\}_{x} is a set of unitary transformations implemented as a gate sequence. The first lemma shows the fundamental limitation of probabilistic synthesis, and the second one shows its superiority over deterministic synthesis.

Lemma 4.1.

For an integer d2d\geq 2 specified below, let Υ:𝐋(d)𝐋(d)\Upsilon:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right) and {Υx:𝐋(d)𝐋(d)}xX\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X} be a target unitary transformation and finite set of unitary transformations, respectively. Then

(14) 2dϵ24δd(1δd)minp12ΥxXp(x)Υxwith{δ=11ϵ2andϵ=minxX12ΥΥx\displaystyle\frac{2}{d}\epsilon^{2}\leq\frac{4\delta}{d}\left(1-\frac{\delta}{d}\right)\leq\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}\ {\rm with}\ \left\{\begin{array}[]{l}\delta=1-\sqrt{1-\epsilon^{2}}\ \ \ {\rm and}\\ \epsilon=\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\end{array}\right.

holds, where the minimization of pp is taken over probability distributions over XX.

Proof.

The first inequality can be straightforwardly verified as follows:

(15) 2dϵ2=4δd(1δ2)4δd(1δd).\frac{2}{d}\epsilon^{2}=\frac{4\delta}{d}\left(1-\frac{\delta}{2}\right)\leq\frac{4\delta}{d}\left(1-\frac{\delta}{d}\right).

Thus, we prove the second inequality. First, by computing the diamond norm between Υ\Upsilon and Υx\Upsilon_{x}, we obtain

(16) 12ΥΥx\displaystyle\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond} =\displaystyle= maxΦ𝐏(dd)Υidd(Φ)Υxidd(Φ)tr\displaystyle\max_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}\left\|\Upsilon\otimes id_{\mathbb{C}^{d}}(\Phi)-\Upsilon_{x}\otimes id_{\mathbb{C}^{d}}(\Phi)\right\|_{\text{tr}}
(17) =\displaystyle= maxΦ𝐏(dd)1F(Υidd(Φ),Υxidd(Φ))\displaystyle\max_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}\sqrt{1-F\left(\Upsilon\otimes id_{\mathbb{C}^{d}}(\Phi),\Upsilon_{x}\otimes id_{\mathbb{C}^{d}}(\Phi)\right)}
(18) =\displaystyle= 1minΦ𝐏(dd)|Φ|UUx𝕀d|Φ|2\displaystyle\sqrt{1-\min_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}|\langle{\Phi}|U^{\dagger}U_{x}\otimes\mathbb{I}_{\mathbb{C}^{d}}|{\Phi}\rangle|^{2}}
(19) =\displaystyle= 1minρ𝐒(d)|tr[ρUUx]|2,\displaystyle\sqrt{1-\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}|\text{tr}\left[\rho U^{\dagger}U_{x}\right]|^{2}},

where Υ(ρ)=UρU\Upsilon(\rho)=U\rho U^{\dagger} and Υx(ρ)=UxρUx\Upsilon_{x}(\rho)=U_{x}\rho U_{x}^{\dagger}. This indicates

(20) 1δ=maxxXminρ𝐒(d)|tr[ρUUx]|.1-\delta=\max_{x\in X}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}|\text{tr}\left[\rho U^{\dagger}U_{x}\right]|.

Next, by using the primal problem in our SDP in Proposition 3.1, we obtain

(21) minp12ΥxXp(x)Υx\displaystyle\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond} =\displaystyle= maxT𝐓(d:d)(tr[J(Υ)T]maxxXtr[J(Υx)T])\displaystyle\max_{T\in\mathbf{T}(\mathbb{C}^{d}:\mathbb{C}^{d})}\left(\text{tr}\left[J(\Upsilon)T\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon_{x})T\right]\right)
(22) \displaystyle\geq 1d2(tr[J(Υ)J(Υ)]maxxXtr[J(Υx)J(Υ)])\displaystyle\frac{1}{d^{2}}\left(\text{tr}\left[J(\Upsilon)J(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon_{x})J(\Upsilon)\right]\right)
(23) =\displaystyle= 11d2maxxX|tr[UUx]|2,\displaystyle 1-\frac{1}{d^{2}}\max_{x\in X}\left|\text{tr}\left[U^{\dagger}U_{x}\right]\right|^{2},

where 𝐓(1:2):={T𝐏𝐨𝐬(12):ρ𝐒(1),Tρ𝕀2}\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}\}, and we set T=1d2J(Υ)(𝕀dd𝕀d)T=\frac{1}{d^{2}}J(\Upsilon)\left(\leq\frac{\mathbb{I}_{\mathbb{C}^{d}}}{d}\otimes\mathbb{I}_{\mathbb{C}^{d}}\right) to obtain the inequality.

In Eq. (20) and Eq. (23), the same unitary operator W=UUxW=U^{\dagger}U_{x} appears in the term minρ𝐒(d)|tr[ρW]|\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}|\text{tr}\left[\rho W\right]| and |tr[W]||\text{tr}\left[W\right]|, respectively. We can prove the second inequality in Ineq. (14) by establishing a relationship between the two terms as follows. For any unitary operator WW on d\mathbb{C}^{d} (d2)(d\geq 2),

(24) 1d|tr[W]|=1d|i=1dλi(W)|2dminp|i=1dp(i)λi(W)|+d2d=2dminρ𝐒(d)|tr[ρW]|+d2d\displaystyle\frac{1}{d}\left|\text{tr}\left[W\right]\right|=\frac{1}{d}\left|\sum_{i=1}^{d}\lambda_{i}(W)\right|\leq\frac{2}{d}\min_{p}\left|\sum_{i=1}^{d}p(i)\lambda_{i}(W)\right|+\frac{d-2}{d}=\frac{2}{d}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho W\right]\right|+\frac{d-2}{d}

holds, where λi(W)\lambda_{i}(W) is the ii-th eigenvalue of WW, and in the inequality, we use the following two facts: (i) the minimization is achieved only if pp satisfies i,p(i)12\forall i,p(i)\leq\frac{1}{2} due to a geometric observation, and (ii) for such pp and complex numbers λi{z:|z|=1}\lambda_{i}\in\{z\in\mathbb{C}:|z|=1\}, |ip(i)λi||i12λi||i(12p(i))λi|12|iλi|i(12p(i))=12|iλi|d22\left|\sum_{i}p(i)\lambda_{i}\right|\geq\left|\sum_{i}\frac{1}{2}\lambda_{i}\right|-\left|\sum_{i}\left(\frac{1}{2}-p(i)\right)\lambda_{i}\right|\geq\frac{1}{2}\left|\sum_{i}\lambda_{i}\right|-\sum_{i}\left(\frac{1}{2}-p(i)\right)=\frac{1}{2}\left|\sum_{i}\lambda_{i}\right|-\frac{d-2}{2}. ∎

To the best of our knowledge, the dependence of the approximation error obtained by probabilistic synthesis on the dimension of the Hilbert space shown in this theorem has never been found. This dependence is inevitable since we can also show the sharpness of this theorem in Appendix C. More precisely, we can show that for any real number ϵ(0,1]\epsilon\in(0,1], any integer d2d\geq 2 and any Υ\Upsilon, there exists {Υx}xX\{\Upsilon_{x}\}_{x\in X} achieving the lower bound in Ineq. (14).

In the following lemma, we show the tight upper bound showing that the worst approximation error caused by deterministic synthesis can be reduced by probabilistic synthesis at least quadratically. Our upper bound slightly improves the various existing upper bounds (Hastings, 2017; Campbell, 2017), which have been proven for several classes of target unitary transformations and ϵ\epsilon-coverings {Υx}xX\{\Upsilon_{x}\}_{x\in X} with small ϵ\epsilon. Using Proposition 5.5, shown in the next section, we can verify that our upper bound is still tight even if we consider the approximation of axial single-qubit unitary transformations.

Lemma 4.2.

For a non-negative real number ϵ0\epsilon\geq 0 and integer d2d\geq 2 specified below, if {Υx:𝐋(d)𝐋(d)}xX\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X} is a finite ϵ\epsilon-covering of the set of unitary transformations, i.e., maxΥminxX12ΥΥxϵ\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon, then

(25) minp12ΥxXp(x)Υxϵ2\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{2}

holds for any unitary transformation Υ\Upsilon, where the minimization of pp are taken over probability distributions over XX.

Proof.

First, by using the primal problem in our SDP in Proposition 3.1, we obtain

(26) (L.H.S.)\displaystyle(L.H.S.) =\displaystyle= maxT𝐓(d:d)(tr[J(Υ)T]maxxXtr[J(Υx)T])\displaystyle\max_{T\in\mathbf{T}(\mathbb{C}^{d}:\mathbb{C}^{d})}\left(\text{tr}\left[J(\Upsilon)T\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon_{x})T\right]\right)
(27) =\displaystyle= maxΦ𝐏(d)Π𝐏𝐫𝐨𝐣(d)(tr[(U𝕀)Φ(U𝕀)Π]maxxXtr[(Ux𝕀)Φ(Ux𝕀)Π]),\displaystyle\max_{\begin{subarray}{c}\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathcal{H}\right)\\ \Pi\in\mathbf{Proj}(\mathbb{C}^{d}\otimes\mathcal{H})\end{subarray}}\Big{(}\text{tr}\left[(U\otimes\mathbb{I}_{\mathcal{H}})\Phi(U\otimes\mathbb{I}_{\mathcal{H}})^{\dagger}\Pi\right]-\max_{x\in X}\text{tr}\left[(U_{x}\otimes\mathbb{I}_{\mathcal{H}})\Phi(U_{x}\otimes\mathbb{I}_{\mathcal{H}})^{\dagger}\Pi\right]\Big{)},

where 𝐓(1:2):={T𝐏𝐨𝐬(12):ρ𝐒(1),Tρ𝕀2}\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}\}, Υ(ρ)=UρU\Upsilon(\rho)=U\rho U^{\dagger}, Υx(ρ)=UxρUx\Upsilon_{x}(\rho)=U_{x}\rho U_{x}^{\dagger}, 𝐏𝐫𝐨𝐣()\mathbf{Proj}(\mathcal{H}) is the set of Hermitian projectors on \mathcal{H}, and we use Eq. (60) by taking Ξ{ΥΥx}xX\Xi\in\{\Upsilon-\Upsilon_{x}\}_{x\in X} to obtain the last equality.

Let Φ^\hat{\Phi} and Π^\hat{\Pi} maximize Eq. (27). We can verify that Π^U|Φ^=0\hat{\Pi}U|{\hat{\Phi}}\rangle=0 if and only if there exists xXx\in X such that Υx=Υ\Upsilon_{x}=\Upsilon. If Π^U|Φ^0\hat{\Pi}U|{\hat{\Phi}}\rangle\neq 0, let Ψ^\hat{\Psi} be the pure state such that |Ψ^Π^U|Φ^|{\hat{\Psi}}\rangle\propto\hat{\Pi}U|{\hat{\Phi}}\rangle. Then, we can verify that Eq. (27) is still maximized even if we replace Π^\hat{\Pi} with Ψ^\hat{\Psi}. If Π^U|Φ^=0\hat{\Pi}U|{\hat{\Phi}}\rangle=0, (xX,Υx=Υ)(\exists x\in X,\Upsilon_{x}=\Upsilon) indicates that Eq. (27) is still maximized even if we replace Π^\hat{\Pi} and Φ^\hat{\Phi} with an arbitrary pure state Ψ^\hat{\Psi} and (Υ1id)(Ψ^)(\Upsilon^{-1}\otimes id_{\mathcal{H}})(\hat{\Psi}), respectively. Thus, in both cases, Π\Pi in Eq. (27) can be restricted as a pure state, i.e., Π=Ψ𝐏(d)\Pi=\Psi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathcal{H}\right), and we proceed as follows:

(28) Eq.(27)=maxΦ,Ψ𝐏(d)(|Ψ|U𝕀|Φ|2maxxX|Ψ|Ux𝕀|Φ|2).\displaystyle{\rm Eq}.~{}\eqref{eq:R2}=\max_{\Phi,\Psi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathcal{H}\right)}\Big{(}|\langle{\Psi}|U\otimes\mathbb{I}_{\mathcal{H}}|{\Phi}\rangle|^{2}-\max_{x\in X}|\langle{\Psi}|U_{x}\otimes\mathbb{I}_{\mathcal{H}}|{\Phi}\rangle|^{2}\Big{)}.

Before proceeding to the next step, we show that the set of mappings fΦ,Ψ:U|Ψ|U𝕀|Φ|f_{\Phi,\Psi}:U\mapsto|\langle{\Psi}|U\otimes\mathbb{I}_{\mathcal{H}}|{\Phi}\rangle| associated with pure states Φ\Phi and Ψ\Psi is equivalent to that of mappings gA:U|tr[AU]|g_{A}:U\mapsto\left|\text{tr}\left[AU\right]\right| associated with linear operator A𝐋(d)A\in\mathbf{L}\left(\mathbb{C}^{d}\right) such that A11\left\|A\right\|_{1}\leq 1, where A1\left\|A\right\|_{1} is the Schatten 11-norm of AA. By using decompositions |Φ=i,jαij|i|j|{\Phi}\rangle=\sum_{i,j}\alpha_{ij}|{i}\rangle|{j}\rangle and |Ψ=i,jβij|i|j|{\Psi}\rangle=\sum_{i,j}\beta_{ij}|{i}\rangle|{j}\rangle with respect to orthonormal bases, we can verify that gAg_{A} with A=i,j,kαikβjk|ij|A=\sum_{i,j,k}\alpha_{ik}\beta^{*}_{jk}|{i}\rangle\langle{j}| is equal to fΦ,Ψf_{\Phi,\Psi} and A1=maxUgA(U)=maxUfΦ,Ψ(U)1\left\|A\right\|_{1}=\max_{U}g_{A}(U)=\max_{U}f_{\Phi,\Psi}(U)\leq 1. On the other hand, by using the singular value decomposition A=ipi|xiyi|A=\sum_{i}p_{i}|{x_{i}}\rangle\langle{y_{i}}|, where A11\left\|A\right\|_{1}\leq 1 indicates p+ipi=1p+\sum_{i}p_{i}=1 with some p0p\geq 0, we can verify that fΦ,Ψf_{\Phi,\Psi} with |Φ=p|0|+ipi|xi|i|{\Phi}\rangle=\sqrt{p}|{0}\rangle|{\bot}\rangle+\sum_{i}\sqrt{p_{i}}|{x_{i}}\rangle|{i}\rangle and |Ψ=p|0|+ipi|yi|i|{\Psi}\rangle=\sqrt{p}|{0}\rangle|{\bot^{\prime}}\rangle+\sum_{i}\sqrt{p_{i}}|{y_{i}}\rangle|{i}\rangle ({|i}i{|,|}\{|{i}\rangle\}_{i}\cup\{|{\bot}\rangle,|{\bot^{\prime}}\rangle\} is an orthonormal basis) is equal to gAg_{A}.

By using the equivalent between two sets of mappings, we proceed as follows:

(29) Eq.(28)\displaystyle{\rm Eq}.~{}\eqref{eq:R3} =\displaystyle= maxA:A11(|tr[AU]|2maxxX|tr[AUx]|2)=maxV,ρ𝐒(d)(|tr[ρVU]|2maxxX|tr[ρVUx]|2),\displaystyle\max_{A:\left\|A\right\|_{1}\leq 1}\left(\left|\text{tr}\left[AU\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[AU_{x}\right]\right|^{2}\right)=\max_{V,\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left(\left|\text{tr}\left[\rho V^{\dagger}U\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2}\right),

where we use the fact that the maximization is achieved when A1=1\left\|A\right\|_{1}=1 and use the polar decomposition A=ρVA=\rho V^{\dagger} with a unitary operator VV acting on d\mathbb{C}^{d}.

By using Eq. (29), we obtain

(30) maxΥminp12ΥxXp(x)Υx\displaystyle\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond} =\displaystyle= maxV,ρ𝐒(d)(maxU|tr[ρVU]|2maxxX|tr[ρVUx]|2)\displaystyle\max_{V,\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left(\max_{U}\left|\text{tr}\left[\rho V^{\dagger}U\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2}\right)
(31) =\displaystyle= 1minV,ρ𝐒(d)maxxX|tr[ρVUx]|2\displaystyle 1-\min_{V,\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\max_{x\in X}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2}
(32) \displaystyle\leq 1minVmaxxXminρ𝐒(d)|tr[ρVUx]|2,\displaystyle 1-\min_{V}\max_{x\in X}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2},

where the maximization of Υ\Upsilon is taken over unitary transformations, and we use the fact that maxxminyf(x,y)minymaxxf(x,y)\max_{x}\min_{y}f(x,y)\leq\min_{y}\max_{x}f(x,y) for any ff if the maximum and minimum exist in the inequality. Using Eq. (19) completes the proof. ∎

The combination of Lemmas 4.1 and 4.2 can be summarized as the following theorem.

Theorem 4.3.

For an integer d2d\geq 2 specified below, let Υ:𝐋(d)𝐋(d)\Upsilon:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right) and {Υx:𝐋(d)𝐋(d)}xX\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X} be a target unitary transformation and finite set of unitary transformations, respectively. Then,

(36) 4δΥd(1δΥd)minp12ΥxXp(x)Υxϵ2with{δΥ=11ϵΥ2ϵΥ=minxX12ΥΥxϵ=maxΥminxX12ΥΥx\displaystyle\frac{4\delta_{\Upsilon}}{d}\left(1-\frac{\delta_{\Upsilon}}{d}\right)\leq\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{2}\ {\rm with}\ \left\{\begin{array}[]{l}\delta_{\Upsilon}=1-\sqrt{1-\epsilon_{\Upsilon}^{2}}\\ \epsilon_{\Upsilon}=\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\\ \epsilon=\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\end{array}\right.

holds, where the maximization of Υ\Upsilon and minimization of pp are taken over unitary transformations on 𝐋(d)\mathbf{L}\left(\mathbb{C}^{d}\right) and probability distributions over XX, respectively.

By maximizing Υ\Upsilon over all the unitary transformations, we obtain Ineq. (1) as a simplified version of this theorem. As mentioned in the introduction, in Appendix C, we show that both the upper and lower bounds in Ineq. (1) are tight, i.e., for any real number ϵ(0,1]\epsilon\in(0,1] and any integer d2d\geq 2, the two bounds are achievable for some {Υi}i\{\Upsilon_{\vec{i}}\}_{\vec{i}}.

5. Probabilistic synthesis for single-qubit unitary transformation

In this section, we construct a simplified SDP that computes the optimal mixing probability for single-qubit-unitary synthesis. Before discussing that, we first show the special properties of the probabilistic mixture of single-qubit unitaries. In the first subsection, we prove Lemma 5.3, which is a crucial ingredient for constructing the SDP and has a direct application to constructing an efficient probabilistic synthesis algorithm. In the second subsection, we investigate the approximation of single-qubit unitary transformations corresponding to axial rotations to provide a geometric interpretation of the quadratic improvement owing to the probabilistic mixture and confirmation of Lemma 5.3.

We show the first special property of a single-qubit unitary operator in the following Lemma, which essentially shows the equivalence between the set of maximally entangled two-qubit states and a real subspace in the two qubits.

Lemma 5.1.

For any finite set {Φx𝐏(22)}xX\{\Phi_{x}\in\mathbf{P}\left(\mathbb{C}^{2}\otimes\mathbb{C}^{2}\right)\}_{x\in X} of maximally entangled states and any real numbers {rx}xX\{r_{x}\in\mathbb{R}\}_{x\in X}, the Hermitian operator H=xXrxΦxH=\sum_{x\in X}r_{x}\Phi_{x} is diagonalizable with respect to maximally entangled eigenstates.

Proof.

First, we show the equivalence between the set of two-qubit maximally entangled vectors and a real subspace in the two qubits. Define four vectors representing maximally entangled states:

|Ψ1=12(|00+|11),\displaystyle|{\Psi_{1}}\rangle=\frac{1}{\sqrt{2}}(|{00}\rangle+|{11}\rangle),\ \ \ |Ψ2=i2(|00|11),\displaystyle|{\Psi_{2}}\rangle=\frac{i}{\sqrt{2}}(|{00}\rangle-|{11}\rangle),
(37) |Ψ3=i2(|01+|10),\displaystyle|{\Psi_{3}}\rangle=\frac{i}{\sqrt{2}}(|{01}\rangle+|{10}\rangle),\ \ \ |Ψ4=12(|01|10).\displaystyle|{\Psi_{4}}\rangle=\frac{1}{\sqrt{2}}(|{01}\rangle-|{10}\rangle).

Any vector in the real subspace 𝒦MES\mathcal{K}_{MES} spanned by {|Ψi}i=14\{|{\Psi_{i}}\rangle\}_{i=1}^{4} can be represented by

(38) 12((u1+iu2)|00+(u4+iu3)|01(u4iu3)|10+(u1iu2)|11)\frac{1}{\sqrt{2}}\left((u_{1}+iu_{2})|{00}\rangle+(u_{4}+iu_{3})|{01}\rangle-(u_{4}-iu_{3})|{10}\rangle+(u_{1}-iu_{2})|{11}\rangle\right)

with real numbers {ui}i=14\{u_{i}\in\mathbb{R}\}_{i=1}^{4}. On the other hand, any maximally entangled state can be obtained by applying the single-qubit unitary operator represented by (eiϕ1cosθeiϕ2sinθeiϕ2sinθeiϕ1cosθ)\left(\begin{matrix}e^{i\phi_{1}}\cos\theta&&e^{i\phi_{2}}\sin\theta\\ -e^{-i\phi_{2}}\sin\theta&&e^{-i\phi_{1}}\cos\theta\end{matrix}\right) to |Ψ1|{\Psi_{1}}\rangle and can be represented by a vector

(39) 12(eiϕ1cosθ|00+eiϕ2sinθ|01eiϕ2sinθ|10+eiϕ1cosθ|11).\frac{1}{\sqrt{2}}\left(e^{i\phi_{1}}\cos\theta|{00}\rangle+e^{i\phi_{2}}\sin\theta|{01}\rangle-e^{-i\phi_{2}}\sin\theta|{10}\rangle+e^{-i\phi_{1}}\cos\theta|{11}\rangle\right).

By comparing Eqs. (38) and (39), we can verify that any two-qubit maximally entangled state can be represented as a unit vector in 𝒦MES\mathcal{K}_{MES} and any unit vector in 𝒦MES\mathcal{K}_{MES} represents a maximally entangled state. This equivalence has been indicated in a previous study (Bennett et al., 1996), and the basis defined in Eq. (5) is called the magic basis (Hill and Wootters, 1997).

Since H=xXrxΦxH=\sum_{x\in X}r_{x}\Phi_{x} is represented as a real symmetric matrix with respect to the basis {|Ψi}i=14\{|{\Psi_{i}}\rangle\}_{i=1}^{4}, HH is diagonalizable with respect to real eigenvectors, which represents maximally entangled states. ∎

Next, we show a special property of the diamond norm between probabilistic mixtures of single-qubit unitaries in the following Lemma, which essentially shows that the input state in the definition of the diamond norm can be maximally entangled.

Lemma 5.2.

For a subset {Υx:𝐋(2)𝐋(2)}xX\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right)\right\}_{x\in X} of single-qubit unitary transformations and probability distributions pp and qq over a finite set XX, it holds that

(40) xXp(x)ΥxxXq(x)Υx=xX(p(x)q(x))J(Υx)tr.\left\|\sum_{x\in X}p(x)\Upsilon_{x}-\sum_{x\in X}q(x)\Upsilon_{x}\right\|_{\diamond}=\left\|\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})\right\|_{\text{tr}}.
Proof.

For d(2)d(\geq 2)-dimensional CPTP maps {Υx}xX\{\Upsilon_{x}\}_{x\in X}, it holds that

(41) (L.H.S.)=maxΦ𝐏(dd)2xX(p(x)q(x))Υxidd(Φ)tr2dxX(p(x)q(x))J(Υx)tr.(L.H.S.)=\max_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}2\left\|\sum_{x\in X}(p(x)-q(x))\Upsilon_{x}\otimes id_{\mathbb{C}^{d}}(\Phi)\right\|_{\text{tr}}\geq\frac{2}{d}\left\|\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})\right\|_{\text{tr}}.

On the other hand, by using the dual problem of the SDP to compute the diamond norm used in the proof of Proposition 3.1, we obtain

(42) (L.H.S.)2tr2[S]with(S0)(SxX(p(x)q(x))J(Υx)),(L.H.S.)\leq 2\left\|\text{tr}_{2}\left[S\right]\right\|_{\infty}\ {\rm with}\ \left(S\geq 0\right)\wedge\left(S\geq\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})\right),

where tr2[]\text{tr}_{2}\left[\cdot\right] represents the partial trace of the second system of 22\mathbb{C}^{2}\otimes\mathbb{C}^{2}. By using Lemma 5.1, we can verify that xX(p(x)q(x))J(Υx)=i=14λiΦi\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})=\sum_{i=1}^{4}\lambda_{i}\Phi_{i} with real numbers λi\lambda_{i} and a set of orthogonal maximally entangled states {Φi}i=14\{\Phi_{i}\}_{i=1}^{4}. By setting S=i:λi>0λiΦiS=\sum_{i:\lambda_{i}>0}\lambda_{i}\Phi_{i}, we obtain

(43) 2tr2[S]=2i:λi>0λi𝕀2=i:λi>0λi=(R.H.S.).2\left\|\text{tr}_{2}\left[S\right]\right\|_{\infty}=2\left\|\sum_{i:\lambda_{i}>0}\lambda_{i}\frac{\mathbb{I}}{2}\right\|_{\infty}=\sum_{i:\lambda_{i}>0}\lambda_{i}=(R.H.S.).

This completes the proof. ∎

5.1. Support of optimal probability distribution

To achieve the quadratic improvement owing to the probabilistic approximation of Υ\Upsilon by using {Υx}xX\{\Upsilon_{x}\}_{x\in X}, we assume {Υx}xX\{\Upsilon_{x}\}_{x\in X} is an ϵ\epsilon-covering of the set of unitary transformations in Lemma 4.2. Since |X|=Ω(1ϵc)|X|=\Omega\left(\frac{1}{\epsilon^{c}}\right) from a volume consideration, the runtime poly(|X|log(1ϵ))poly\left(|X|\log\left(\frac{1}{\epsilon}\right)\right) of our SDP to compute the optimal probability distribution proposed in Proposition 3.1 increases as poly(1ϵ)poly\left(\frac{1}{\epsilon}\right) at best. However, by using the following lemma, we can construct a much more efficient SDP.

Lemma 5.3.

For a non-negative real number ϵ0\epsilon\geq 0, if Υ\Upsilon is a single-qubit unitary transformation and {Υx:𝐋(2)𝐋(2)}xX\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right)\right\}_{x\in X} is a finite ϵ\epsilon-covering of the set of single-qubit unitary transformations, i.e., maxΥminxX12ΥΥxϵ\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon, then

(44) minpΥxXp(x)Υx=minp^ΥxX^p^(x)Υx\min_{p}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}=\min_{\hat{p}}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond}

holds, where X^:={xX:12ΥΥx2ϵ}\hat{X}:=\{x\in X:\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq 2\epsilon\} and the minimization of pp and p^\hat{p} are taken over probability distributions over XX and those over X^\hat{X}, respectively.

Proof.

By using Lemma 5.2, we obtain

(45) (L.H.S.)=minpJ(Υ)xXp(x)J(Υx)tr=minpJ(Υ)xXp(x)J(Υx),(L.H.S.)=\min_{p}\left\|J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right\|_{\text{tr}}=\min_{p}\left\|J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right\|_{\infty},

where we use the dimension of the eigenspace of J(Υ)xXp(x)J(Υx)J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x}) with positive eigenvalues is at most 11 in the last equality. By using Lemma 5.1, we can proceed with the following two ways:

(46) Eq.(45)\displaystyle{\rm Eq}.~{}\eqref{eq:6_1} =\displaystyle= minpmaxρconv(MES)tr[ρ(J(Υ)xXp(x)J(Υx))]and\displaystyle\min_{p}\max_{\rho\in\text{conv}\left({\rm MES}\right)}\text{tr}\left[\rho\left(J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right)\right]\ {\rm and}
(47) Eq.(45)\displaystyle{\rm Eq}.~{}\eqref{eq:6_1} =\displaystyle= minpmaxMcone(MES)M𝕀tr[M(J(Υ)xXp(x)J(Υx))],\displaystyle\min_{p}\max_{\begin{subarray}{c}M\in{\rm cone}({\rm MES})\\ M\leq\mathbb{I}\end{subarray}}\text{tr}\left[M\left(J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right)\right],

where conv(MES)\text{conv}\left({\rm MES}\right) and cone(MES){\rm cone}({\rm MES}) are the convex hull of the set of maximally entangled states {Φ𝐏(22):tr2[Φ]=𝕀2}\left\{\Phi\in\mathbf{P}\left(\mathbb{C}^{2}\otimes\mathbb{C}^{2}\right):\text{tr}_{2}\left[\Phi\right]=\frac{\mathbb{I}}{2}\right\} and the convex cone generated by the set {Φ}\{\Phi\}, respectively. Note that the convex cone generated by a subset 𝐗\mathbf{X} in a vector space is defined as the set of finite linear combinations of 𝐗\mathbf{X} with non-negative coefficients.

Since the domains of pp, ρ\rho, and MM are compact and convex and f(p,H):=tr[H(J(Υ)xXp(x)J(Υx))]f(p,H):=\text{tr}\left[H(J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x}))\right] is affine with respect to each variable, we can apply the minimax theorem and obtain

(48) Eq.(45)=maxρconv(MES)(tr[ρJ(Υ)]maxxXtr[ρJ(Υx)])=maxMcone(MES)M𝕀(tr[MJ(Υ)]maxxXtr[MJ(Υx)]).{\rm Eq}.~{}\eqref{eq:6_1}=\max_{\rho\in\text{conv}\left({\rm MES}\right)}\left(\text{tr}\left[\rho J(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[\rho J(\Upsilon_{x})\right]\right)=\max_{\begin{subarray}{c}M\in{\rm cone}({\rm MES})\\ M\leq\mathbb{I}\end{subarray}}\left(\text{tr}\left[MJ(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[MJ(\Upsilon_{x})\right]\right).

When (L.H.S.)=0(L.H.S.)=0, the theorem holds since there exists xXx\in X such that Υx=Υ\Upsilon_{x}=\Upsilon. In the following, we assume (L.H.S.)>0(L.H.S.)>0. If ρ\rho with ρ<1\left\|\rho\right\|_{\infty}<1 maximizes the formula, we can show a contradiction by setting M=ρρM=\frac{\rho}{\left\|\rho\right\|_{\infty}}. Thus, ρ\rho that maximizes the formula satisfies ρ=1\left\|\rho\right\|_{\infty}=1, i.e., ρ\rho is a (pure) maximally entangled state. Therefore, we obtain

(49) Eq.(48)=maxΥ12(tr[J(Υ)J(Υ)]maxxXtr[J(Υ)J(Υx)])=maxUU(2)12(|tr[UU]|2maxxX|tr[UxU]|2),{\rm Eq}.~{}\eqref{eq:6_2}=\max_{\Upsilon^{\prime}}\frac{1}{2}\left(\text{tr}\left[J(\Upsilon^{\prime})J(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon^{\prime})J(\Upsilon_{x})\right]\right)=\max_{U^{\prime}\in U(2)}\frac{1}{2}\left(\left|\text{tr}\left[U^{\dagger}U^{\prime}\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[U_{x}^{\dagger}U^{\prime}\right]\right|^{2}\right),

where Υ(ρ)=UρU\Upsilon(\rho)=U\rho U^{\dagger}, Υx(ρ)=UxρUx\Upsilon_{x}(\rho)=U_{x}\rho U_{x}^{\dagger}, the maximization of Υ\Upsilon^{\prime} is taken over single-qubit unitary transformations, and U(2)U(2) represents the set of single-qubit unitary operators. By observing that the minimization in Eq. (19) is achieved by ρ=𝕀2\rho=\frac{\mathbb{I}}{2} for single-qubit unitary operators, we obtain

(50) Eq.(49)=maxΥ12(minxXΥΥx2ΥΥ2).{\rm Eq}.~{}\eqref{eq:6_3}=\max_{\Upsilon^{\prime}}\frac{1}{2}\left(\min_{x\in X}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}^{2}-\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}^{2}\right).

Since so far we did not use the assumption that {Υx}xX\{\Upsilon_{x}\}_{x\in X} is an ϵ\epsilon-covering, we obtain

(51) (R.H.S.)ofEq.(44)=maxΥ12(minxX^ΥΥx2ΥΥ2).(R.H.S.)\ {\rm of}\ {\rm Eq}.~{}\eqref{eq:support}=\max_{\Upsilon^{\prime}}\frac{1}{2}\left(\min_{x\in\hat{X}}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}^{2}-\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}^{2}\right).

Note that the maximization in Eq. (50) is achieved by Υ\Upsilon^{\prime} satisfying 12ΥΥϵ\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}\leq\epsilon since minxX12ΥΥxϵ\min_{x\in X}\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon due to the definition of the ϵ\epsilon-covering. If we can show that the maximization in Eq. (51) is also achieved by such Υ\Upsilon^{\prime}, we can prove the equivalence between Eqs. (50) and (51). For the minimization in Eq. (50) is achieved by xX^x\in\hat{X} owing to the triangle inequality. To complete the proof, we show the following statement: for all Υ\Upsilon^{\prime},

(52) 12ΥΥ>ϵminxX^ΥΥxΥΥ.\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}>\epsilon\Rightarrow\min_{x\in\hat{X}}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}\leq\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}.

We assume ϵ<1\epsilon<1; otherwise, the statement is trivial. By using the equivalence between the set of two-qubit maximally entangled vectors and a real subspace shown in the proof of Lemma 5.1, there exist unit real vectors u,u4\vec{u},\vec{u}^{\prime}\in\mathbb{R}^{4} such that i,j=14uiuj|ΨiΨj|=12J(Υ)\sum_{i,j=1}^{4}u_{i}u_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\Upsilon), i,j=14uiuj|ΨiΨj|=12J(Υ)\sum_{i,j=1}^{4}u^{\prime}_{i}u^{\prime}_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\Upsilon^{\prime}) and

(53) 0cosθ1:=uu<1ϵ2,0\leq\cos\theta_{1}:=\vec{u}\cdot\vec{u}^{\prime}<\sqrt{1-\epsilon^{2}},

where {|Ψi}\{|{\Psi_{i}}\rangle\} is defined in Eq. (5), θ1[0,π2]\theta_{1}\in[0,\frac{\pi}{2}], the first inequality can be satisfied by appropriately setting the sign of u\vec{u}, and the second (strict) inequality is derived from 12ΥΥ>ϵ\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}>\epsilon and Lemma 5.2. In the real subspace spanned by {u,u}\{\vec{u},\vec{u}^{\prime}\}, there exists a unique unit real vector v4\vec{v}\in\mathbb{R}^{4} such that

(54) cosθ2:=uv=1ϵ2uv=cos(θ1θ2),\cos\theta_{2}:=\vec{u}\cdot\vec{v}=\sqrt{1-\epsilon^{2}}\ \wedge\ \vec{u}^{\prime}\cdot\vec{v}=\cos(\theta_{1}-\theta_{2}),

where θ2[0,π2]\theta_{2}\in[0,\frac{\pi}{2}], as shown in Fig. 2. Note that the unitary transformation Υ^\hat{\Upsilon} corresponding to v\vec{v}, i.e., i,j=14vivj|ΨiΨj|=12J(Υ^)\sum_{i,j=1}^{4}v_{i}v_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\hat{\Upsilon}), satisfies 12ΥΥ^=ϵ\frac{1}{2}\left\|\Upsilon-\hat{\Upsilon}\right\|_{\diamond}=\epsilon due to Lemma 5.2. Since there exists xXx\in X such that 12ΥxΥ^ϵ\frac{1}{2}\left\|\Upsilon_{x}-\hat{\Upsilon}\right\|_{\diamond}\leq\epsilon and 12ΥxΥ12ΥxΥ^+12ΥΥ^2ϵ\frac{1}{2}\left\|\Upsilon_{x}-\Upsilon\right\|_{\diamond}\leq\frac{1}{2}\left\|\Upsilon_{x}-\hat{\Upsilon}\right\|_{\diamond}+\frac{1}{2}\left\|\Upsilon-\hat{\Upsilon}\right\|_{\diamond}\leq 2\epsilon, we can find a unit real vector w4\vec{w}\in\mathbb{R}^{4} corresponding to Υx\Upsilon_{x} with xX^x\in\hat{X}, i.e., i,j=14wiwj|ΨiΨj|=12J(Υx)\sum_{i,j=1}^{4}w_{i}w_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\Upsilon_{x}), and satisfying

(55) cosθ3:=wv1ϵ2,\cos\theta_{3}:=\vec{w}\cdot\vec{v}\geq\sqrt{1-\epsilon^{2}},

where θ3[0,π2]\theta_{3}\in[0,\frac{\pi}{2}], due to Lemma 5.2. By using Lemma 5.2 again, we obtain

(56) ΥΥxΥΥ|uu||uw|.\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}\leq\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}\Leftrightarrow|\vec{u}^{\prime}\cdot\vec{u}|\leq|\vec{u}^{\prime}\cdot\vec{w}|.

By letting cosθ4:=uw\cos\theta_{4}:=\vec{u}^{\prime}\cdot\vec{w} with θ4[0,π]\theta_{4}\in[0,\pi] and using the triangle inequality for angles in the three-dimensional subspace spanned by {u,u,w}\{\vec{u},\vec{u}^{\prime},\vec{w}\}, we obtain

(57) θ4(θ1θ2)+θ3θ1.\theta_{4}\leq(\theta_{1}-\theta_{2})+\theta_{3}\leq\theta_{1}.

This completes the proof.

Refer to caption
Figure 2. Three-dimensional subspace spanned by {u,u,w,v}\{\vec{u},\vec{u}^{\prime},\vec{w},\vec{v}\} in the proof of Lemma 5.3, where vspan({u,u})\vec{v}\in\text{span}\left(\{\vec{u},\vec{u}^{\prime}\}\right). We apply the triangle inequality for the angle θ4\theta_{4} between u\vec{u}^{\prime} and w\vec{w}, the angle θ3\theta_{3} between v\vec{v} and w\vec{w} and the angle (θ1θ2)(\theta_{1}-\theta_{2}) between v\vec{v} and u\vec{u}^{\prime}.

As an application of Lemma 5.3, we construct an efficient probabilistic synthesis algorithm in the proof of the following theorem.

Theorem 5.4.

For a given gate set, there exists a probabilistic synthesis algorithm for a single-qubit unitary transformation with

INPUT: a single-qubit unitary transformation Υ:𝐋(2)𝐋(2)\Upsilon:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right), an approximation error ϵ(0,1)\epsilon\in\left(0,1\right), and precision δ>0\delta>0 such that 1δ=(1ϵ)O(1)\frac{1}{\delta}=\left(\frac{1}{\epsilon}\right)^{O(1)}

OUTPUT: a gate sequence for implementing a single-qubit unitary transformation Υx\Upsilon_{x} sampled from a set {Υx:𝐋(2)𝐋(2)}xX^\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right)\right\}_{x\in\hat{X}} in accordance with probability distribution p^(x)\hat{p}(x)

such that the algorithm satisfies the following properties:

  • Efficiency: All steps of the algorithm take polylog(1ϵ)polylog\left(\frac{1}{\epsilon}\right)-time,

  • Quadratic improvement: The approximation error 12ΥxX^p^(x)Υx\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond} obtained by this algorithm is upper bounded by ϵ2+δ\epsilon^{2}+\delta, whereas the error minxX^12ΥΥx\min_{x\in\hat{X}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond} obtained by deterministic synthesis using the unitary transformations in {Υx}xX^\{\Upsilon_{x}\}_{x\in\hat{X}} is upper bounded by ϵ\epsilon,

Proof.

We assume that the algorithm calls an efficient deterministic synthesis algorithm such as the Solovay-Kitaev algorithm as a subroutine, i.e., the subroutine can find a gate sequence for implementing a unitary transformation Υ\Upsilon^{\prime} such that 12ΥΥϵ\frac{1}{2}\left\|\Upsilon-\Upsilon^{\prime}\right\|_{\diamond}\leq\epsilon within polylog(1ϵ)polylog\left(\frac{1}{\epsilon}\right)-time. In the following, we explicitly construct the algorithm:

Efficient probabilistic synthesis algorithm for single-qubit unitary transformation

  1. (1)

    Set free parameters c>0c>0 and c>0c^{\prime}>0 satisfying c+c1c+c^{\prime}\leq 1.

  2. (2)

    Generate a list {Υ^x}xX^\{\hat{\Upsilon}_{x}\}_{x\in\hat{X}} of single-qubit unitary transformations such that for any unitary transformation Υ^\hat{\Upsilon}, minxX^12Υ^Υ^xcϵ\min_{x\in\hat{X}}\frac{1}{2}\left\|\hat{\Upsilon}-\hat{\Upsilon}_{x}\right\|_{\diamond}\leq c\epsilon if 12ΥΥ^2ϵ\frac{1}{2}\left\|\Upsilon-\hat{\Upsilon}\right\|_{\diamond}\leq 2\epsilon. That is, {Υ^x}xX^\{\hat{\Upsilon}_{x}\}_{x\in\hat{X}} is a cϵc\epsilon-covering of the 2ϵ2\epsilon-ball around the target unitary transformation.

  3. (3)

    Call an efficient deterministic synthesis algorithm to find gate sequences for implementing unitary transformations {Υx}xX^\{\Upsilon_{x}\}_{x\in\hat{X}} such that 12ΥxΥ^xcϵ\frac{1}{2}\left\|\Upsilon_{x}-\hat{\Upsilon}_{x}\right\|_{\diamond}\leq c^{\prime}\epsilon for all xX^x\in\hat{X}.

  4. (4)

    Numerically solve our SDP shown in Proposition 3.1 by using {Υx}xX^\{\Upsilon_{x}\}_{x\in\hat{X}} as a set of CPTP mappings and obtain a probability distribution p^\hat{p}, which causes the approximation error δ\delta-close to minp12ΥxX^p(x)Υx\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}}p(x)\Upsilon_{x}\right\|_{\diamond}.

  5. (5)

    Sample gate sequences for implementing unitary transformations {Υx}xX^\{\Upsilon_{x}\}_{x\in\hat{X}} in accordance with p^\hat{p}.

The two properties can be verified as follows:

  • Efficiency: All steps of the algorithm take polylog(1ϵ)polylog\left(\frac{1}{\epsilon}\right)-time if the size X^\hat{X} of the list generated in the second step is upper bounded by a constant (independent to ϵ\epsilon.) We can generate such a constant-size list {Υ^x}xX^\{\hat{\Upsilon}_{x}\}_{x\in\hat{X}} by using the correspondence between a single-qubit unitary operator and unit vector in 4\mathbb{R}^{4} and Lemma 5.2.

  • Quadratic improvement: The approximation error 12ΥxX^p^(x)Υx\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond} obtained by this algorithm is at least ϵ2+δ\epsilon^{2}+\delta since {Υx}xX^\{\Upsilon_{x}\}_{x\in\hat{X}} is a subset of an ϵ\epsilon-covering {Υx}xX^{Υy}y\{\Upsilon_{x}\}_{x\in\hat{X}}\cup\{\Upsilon^{\prime}_{y}\}_{y} of the set of single-qubit unitary transformations, where {Υy}y\{\Upsilon^{\prime}_{y}\}_{y} is an ϵ\epsilon-covering of the complement of the 2ϵ2\epsilon-ball around Υ\Upsilon and 12ΥΥy>2ϵ\frac{1}{2}\left\|\Upsilon-\Upsilon^{\prime}_{y}\right\|_{\diamond}>2\epsilon for any yy, and we can apply Lemmas 4.2 and 5.3.

Note that the quadratic improvement on the approximation error achieved by this algorithm heavily relies on Lemma 5.3. In Appendix D, we perform numerical experiments to confirm that this lemma would hold for qudit unitary transformations, which implies that this synthesis algorithm is applicable to qudit unitary transformations.

5.2. Convex-hull approximation for axial rotations

At a glance, the reduction of the approximation error due to probabilistically mixing unitaries seems strange since a unitary transformation is not a probabilistic mixture of any distinct unitary transformations. A simple geometric interpretation of the reduction is given in the following theorem, considering single-qubit unitary transformations corresponding to axial rotations.

We investigate the convex-hull approximation of a single-qubit unitary transformation Υθ^\Upsilon_{\hat{\theta}} by using unitary transformations {Υθ}θ𝚯\{\Upsilon_{\theta}\}_{\theta\in\mathbf{\Theta}} that rotate Bloch vectors about the same axes as Υθ^\Upsilon_{\hat{\theta}}, where Υθ(ρ):=R(θ)ρR(θ)\Upsilon_{\theta}(\rho):=R(\theta)\rho R^{\dagger}(\theta), R(θ):=|00|+eiθ|11|R(\theta):=|{0}\rangle\langle{0}|+e^{i\theta}|{1}\rangle\langle{1}| with an orthonormal basis {|0,|1}\{|{0}\rangle,|{1}\rangle\}, and 𝚯\mathbf{\Theta} is a finite subset of [0,2π)[0,2\pi). In this case, every unitary transformation Υθ\Upsilon_{\theta} can be represented by a unit complex number eiθe^{i\theta} in the complex plane, as shown in Fig. 3. Furthermore, the following proposition shows that the metric space of probabilistic mixtures of Υθ\Upsilon_{\theta} induced by the diamond norm can be identified with a unit disc in the complex plane.

Proposition 5.5.

For a finite subset 𝚯\mathbf{\Theta} of [0,2π)[0,2\pi), let {Υθ}θ𝚯\{\Upsilon_{\theta}\}_{\theta\in\mathbf{\Theta}} be a set of single-qubit unitary transformations that rotate Bloch vectors about a fixed axis, i.e., Υθ(ρ):=R(θ)ρR(θ)\Upsilon_{\theta}(\rho):=R(\theta)\rho R^{\dagger}(\theta) with R(θ):=|00|+eiθ|11|R(\theta):=|{0}\rangle\langle{0}|+e^{i\theta}|{1}\rangle\langle{1}| and an orthonormal basis {|0,|1}\{|{0}\rangle,|{1}\rangle\}. For probability distributions pp and qq over 𝚯\mathbf{\Theta}, it holds that

(58) θ𝚯p(θ)Υθθ𝚯q(θ)Υθ=|θ𝚯p(θ)eiθθ𝚯q(θ)eiθ|.\left\|\sum_{\theta\in\mathbf{\Theta}}p(\theta)\Upsilon_{\theta}-\sum_{\theta\in\mathbf{\Theta}}q(\theta)\Upsilon_{\theta}\right\|_{\diamond}=\left|\sum_{\theta\in\mathbf{\Theta}}p(\theta)e^{i\theta}-\sum_{\theta\in\mathbf{\Theta}}q(\theta)e^{i\theta}\right|.
Proof.

By using Lemma 5.2, we obtain

(59) (L.H.S.)=θ𝚯(p(θ)q(θ))J(Υθ)tr=(R.H.S.),(L.H.S.)=\left\|\sum_{\theta\in\mathbf{\Theta}}(p(\theta)-q(\theta))J(\Upsilon_{\theta})\right\|_{\text{tr}}=(R.H.S.),

where we use the diagonalization of θ𝚯(p(θ)q(θ))J(Υθ)\sum_{\theta\in\mathbf{\Theta}}(p(\theta)-q(\theta))J(\Upsilon_{\theta}), which can be obtained via a straightforward calculation, in the last equality. ∎

By using this proposition, we can obtain 12Υθ^θ𝚯p(θ)Υθ=12|eiθ^θ𝚯p(θ)eiθ|\frac{1}{2}\left\|\Upsilon_{\hat{\theta}}-\sum_{\theta\in\mathbf{\Theta}}p(\theta)\Upsilon_{\theta}\right\|_{\diamond}=\frac{1}{2}\left|e^{i\hat{\theta}}-\sum_{\theta\in\mathbf{\Theta}}p(\theta)e^{i\theta}\right|, which indicates that the optimal probability distribution and approximation error in the convex-hull approximation of Υθ^\Upsilon_{\hat{\theta}} can be computed by finding the closest point in the convex hull of {eiθ}θ𝚯\{e^{i\theta}\}_{\theta\in\mathbf{\Theta}} to the target point eiθ^e^{i\hat{\theta}}. As represented in Fig. 3, the quadratic reduction in approximation error owing to convex-hull approximation over discrete-point approximation can be shown by an elementary geometric observation.

Refer to caption
Figure 3. Corresponding complex numbers to target unitary transformation Υπ2\Upsilon_{\frac{\pi}{2}} and unitary transformations {Υθ}θ𝚯\{\Upsilon_{\theta}\}_{\theta\in\mathbf{\Theta}} with 𝚯={0,π3,2π3,π,4π3,5π3}\mathbf{\Theta}=\left\{0,\frac{\pi}{3},\frac{2\pi}{3},\pi,\frac{4\pi}{3},\frac{5\pi}{3}\right\}. Convex hull of {Υθ}θ𝚯\{\Upsilon_{\theta}\}_{\theta\in\mathbf{\Theta}} corresponds to shaded region. If we let approximation error obtained by deterministic approximation be ϵ=minθ𝚯12Υπ2Υθ=minθ𝚯12|ieiθ|\epsilon=\min_{\theta\in\mathbf{\Theta}}\frac{1}{2}\left\|\Upsilon_{\frac{\pi}{2}}-\Upsilon_{\theta}\right\|_{\diamond}=\min_{\theta\in\mathbf{\Theta}}\frac{1}{2}\left|i-e^{i\theta}\right|, that obtained by probabilistic approximation is given by ϵ2=minp12Υπ2θ𝚯p(θ)Υθ=minp12|iθ𝚯p(θ)eiθ|\epsilon^{2}=\min_{p}\frac{1}{2}\left\|\Upsilon_{\frac{\pi}{2}}-\sum_{\theta\in\mathbf{\Theta}}p(\theta)\Upsilon_{\theta}\right\|_{\diamond}=\min_{p}\frac{1}{2}\left|i-\sum_{\theta\in\mathbf{\Theta}}p(\theta)e^{i\theta}\right|, which demonstrates quadratic reduction in error.

6. Conclusion

We considered the analytical relationship between minpΥxp(x)Υx\min_{p}\left\|\Upsilon-\sum_{x}p(x)\Upsilon_{x}\right\|_{\diamond} and minxΥΥx\min_{x}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}, which represent the minimum approximation error obtained by probabilistic synthesis and that by deterministic synthesis, respectively. As the main result, we obtained tight upper and lower bounds on minpΥxp(x)Υx\min_{p}\left\|\Upsilon-\sum_{x}p(x)\Upsilon_{x}\right\|_{\diamond}, which guarantees the sub-optimality of the current algorithms as well as suggests the existence of an improved synthesis algorithm. We showed that the optimal probability distribution in the approximation can be computed by an SDP. We also constructed an efficient probabilistic synthesis algorithm for single-qubit unitary transformations and showed that it quadratically reduces approximation error compared with deterministic synthesis and its optimality can be reduced into the choice of unitary transformations close to the target unitary one. While numerical simulations indicate the algorithm works well for qudit unitary transformations, a rigorous proof is a subject for future work.

When we run this algorithm for qudit unitary transformation, the time complexity of the SDP used in the algorithm becomes poly(d,|X^|)poly\left(d,|\hat{X}|\right) for a fixed desired approximation error ϵ\epsilon and precision δ\delta, where dd is the dimension of the unitary operators and |X^||\hat{X}| is the size of the list of synthesized unitary transformations. Since |X^||\hat{X}| grows exponentially with respect to d2d^{2} (to make the list an ϵ\epsilon-covering of the 2ϵ2\epsilon-ball around the target unitary transformation), the algorithm is not practical for higher dimensional unitary transformations.

There are two ways to make the algorithm more practical. First, restricting a class of target unitary transformations, such as axial rotations, would significantly reduce the size |X^||\hat{X}| but still achieve the guaranteed quadratic improvement. Indeed, Fig. 3 implies that the quadratic improvement can be achieved by mixing only two realizable unitary transformations. Alternatively, we can consider a modified algorithm that probabilistically mixes a randomly sampled small subset of X^\hat{X}. While this modified algorithm does not provide the guaranteed quadratic improvement as the original one, numerical experiments in Appendix D suggest that it still attains such improvement for randomly chosen target unitary transformations.

Similar to the probabilistic mixture of unitary transformations, that of general CPTP mappings implemented by a certain quantum device is relatively easy to implement by classically controlling the quantum device. Such a probabilistic mixture of implementable CPTP mappings is considered a free operation in many quantum resource theories (Horodecki and Oppenheim, 2013; Brandão and Gour, 2015; Chitambar and Gour, 2019). To quantify or simulate a target CPTP mapping using the probabilistic mixture (sometimes assisted by a resource state), a mathematical tool is required to analyze the optimal convex approximation of a general CPTP mapping. From the mathematical perspective as well as from the resource theoretical perspective, computing or bounding the approximation error of a unital CPTP mapping by using a probabilistic mixture of unitary transformations plays a crucial role in investigating the asymptotic quantum Birkhoff conjecture (Haagerup and Musat, 2011; Yu et al., 2012). Our SDP shown in Proposition 3.1 and our bounds (or possibly their extension to general CPTP mappings) could be numerical and analytical tools to investigate such problems.

Acknowledgements.
We thank Yoshihisa Yamamoto, Aram Harrow, Isaac Chuang, Sho Sugiura, Yuki Takeuchi, Yasunari Suzuki, Yasuhiro Takahashi, and Adel Sohbi for their helpful discussions. This work was partially supported by JST Moonshot R&D MILLENNIA Program (Grant No.JPMJMS2061). SA was partially supported by JST, PRESTO Grant No.JPMJPR2111 and JPMXS0120319794. GK was supported in part by the Grant-in-Aid for Scientific Research (C) No.20K03779, (C) No.21K03388, and (S) No.18H05237 of JSPS, CREST (Japan Science and Technology Agency) Grant No.JPMJCR1671. ST was partially supported by JSPS KAKENHI Grant Numbers JP20H05966 and JP22H00522.

References

  • (1)
  • Aharonov and Ben-Or (2008) Dorit Aharonov and Michael Ben-Or. 2008. Fault-Tolerant Quantum Computation with Constant Error Rate. SIAM J. Comput. 38, 4 (2008), 1207–1282. https://doi.org/10.1137/S0097539799359385 arXiv:https://doi.org/10.1137/S0097539799359385
  • Bennett et al. (1996) Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters. 1996. Mixed-state entanglement and quantum error correction. Phys. Rev. A 54 (Nov 1996), 3824–3851. Issue 5. https://doi.org/10.1103/PhysRevA.54.3824
  • Bocharov et al. (2015) Alex Bocharov, Martin Roetteler, and Krysta M. Svore. 2015. Efficient Synthesis of Universal Repeat-Until-Success Quantum Circuits. Phys. Rev. Lett. 114 (Feb 2015), 080502. Issue 8. https://doi.org/10.1103/PhysRevLett.114.080502
  • Bouland and Giurgica-Tiron (2021) Adam Bouland and Tudor Giurgica-Tiron. 2021. Efficient Universal Quantum Compilation: An Inverse-free Solovay-Kitaev Algorithm. arXiv:2112.02040 [quant-ph]
  • Brandão and Gour (2015) Fernando G. S. L. Brandão and Gilad Gour. 2015. Reversible Framework for Quantum Resource Theories. Phys. Rev. Lett. 115 (Aug 2015), 070503. Issue 7. https://doi.org/10.1103/PhysRevLett.115.070503
  • Campbell (2017) Earl Campbell. 2017. Shorter gate sequences for quantum computing by mixing unitaries. Phys. Rev. A 95 (Apr 2017), 042306. Issue 4. https://doi.org/10.1103/PhysRevA.95.042306
  • Chiribella et al. (2009) Giulio Chiribella, Giacomo Mauro D’Ariano, and Paolo Perinotti. 2009. Theoretical framework for quantum networks. Phys. Rev. A 80 (Aug 2009), 022339. Issue 2. https://doi.org/10.1103/PhysRevA.80.022339
  • Chitambar and Gour (2019) Eric Chitambar and Gilad Gour. 2019. Quantum resource theories. Rev. Mod. Phys.s 91, 2 (2019), 025001.
  • Fowler (2011) Austin G. Fowler. 2011. Constructing Arbitrary Steane Code Single Logical Qubit Fault-Tolerant Gates. Quantum Info. Comput. 11, 9–10 (sep 2011), 867–873.
  • Fuchs and van de Graaf (1999) C.A. Fuchs and J. van de Graaf. 1999. Cryptographic distinguishability measures for quantum-mechanical states. IEEE Trans. Inf. Theory. 45, 4 (1999), 1216–1227. https://doi.org/10.1109/18.761271
  • Gutoski and Watrous (2007) Gus Gutoski and John Watrous. 2007. Toward a General Theory of Quantum Games. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing (San Diego, California, USA) (STOC ’07). Association for Computing Machinery, New York, NY, USA, 565–574. https://doi.org/10.1145/1250790.1250873
  • Haagerup and Musat (2011) Uffe Haagerup and Magdalena Musat. 2011. Factorization and Dilation Problems for Completely Positive Maps on von Neumann Algebras. Commun. Math. Phys. 303, 2 (2011), 555–594. https://doi.org/10.1007/s00220-011-1216-y
  • Harrow et al. (2002) Aram W. Harrow, Benjamin Recht, and Isaac L. Chuang. 2002. Efficient discrete approximations of quantum gates. J. Math. Phys. 43, 9 (2002), 4445–4451. https://doi.org/10.1063/1.1495899 arXiv:https://doi.org/10.1063/1.1495899
  • Hastings (2017) Matthew B. Hastings. 2017. Turning Gate Synthesis Errors into Incoherent Errors. Quantum Info. Comput. 17, 5–6 (mar 2017), 488–494.
  • Hill and Wootters (1997) Sam A. Hill and William K. Wootters. 1997. Entanglement of a Pair of Quantum Bits. Phys. Rev. Lett. 78 (Jun 1997), 5022–5025. Issue 26. https://doi.org/10.1103/PhysRevLett.78.5022
  • Horodecki and Oppenheim (2013) Michal Horodecki and Jonathan Oppenheim. 2013. (QUANTUMNESS IN THE CONTEXT OF) RESOURCE THEORIES. Int. J. Mod. Phys. B 27, 01n03 (2013), 1345019. https://doi.org/10.1142/S0217979213450197 arXiv:https://doi.org/10.1142/S0217979213450197
  • Kitaev (2003) A.Yu. Kitaev. 2003. Fault-tolerant quantum computation by anyons. Ann. Phys. 303, 1 (2003), 2–30. https://doi.org/10.1016/S0003-4916(02)00018-0
  • Kitaev et al. (2002) A. Yu Kitaev, A. H. Shen, and M. N. Vyalyi. 2002. Classical and Quantum Computation. American Mathematical Society.
  • Kliuchnikov et al. (2022) Vadym Kliuchnikov, Kristin Lauter, Romy Minko, Adam Paetznick, and Christophe Petit. 2022. Shorter quantum circuits. arXiv:2203.10064 [quant-ph]
  • Kliuchnikov et al. (2013) Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca. 2013. Asymptotically Optimal Approximation of Single Qubit Unitaries by Clifford and TT Circuits Using a Constant Number of Ancillary Qubits. Phys. Rev. Lett. 110 (May 2013), 190502. Issue 19. https://doi.org/10.1103/PhysRevLett.110.190502
  • Kliuchnikov et al. (2016) Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca. 2016. Practical Approximation of Single-Qubit Unitaries by Single-Qubit Quantum Clifford and T Circuits. IEEE Trans. Comput. 65, 1 (2016), 161–172. https://doi.org/10.1109/TC.2015.2409842
  • Knill et al. (1998) Emanuel Knill, Raymond Laflamme, and Wojciech H. Zurek. 1998. Resilient quantum computation: error models and thresholds. Proc. R. Soc. Lond. A. 454 (1998), 365–384. https://doi.org/10.1098/rspa.1998.0166
  • Lovász (2003) L. Lovász. 2003. Semidefinite Programs and Combinatorial Optimization. Springer New York, New York, NY, 137–194. https://doi.org/10.1007/0-387-22444-0_6
  • Nielsen and Chuang (2000) Michael A. Nielsen and Isaac L. Chuang. 2000. Quantum Computation and Quantum Information. Cambridge University Press.
  • Ross (2015) Neil J. Ross. 2015. Optimal Ancilla-Free CLIFFORD+V Approximation of Z-Rotations. Quantum Info. Comput. 15, 11–12 (sep 2015), 932–950.
  • Sacchi (2017) Massimiliano F. Sacchi. 2017. Optimal convex approximations of quantum states. Phys. Rev. A 96 (Oct 2017), 042325. Issue 4. https://doi.org/10.1103/PhysRevA.96.042325
  • Sacchi and Sacchi (2017) Massimiliano F. Sacchi and Tito Sacchi. 2017. Convex approximations of quantum channels. Phys. Rev. A 96 (Sep 2017), 032311. Issue 3. https://doi.org/10.1103/PhysRevA.96.032311
  • Terhal (2015) Barbara M. Terhal. 2015. Quantum error correction for quantum memories. Rev. Mod. Phys. 87 (Apr 2015), 307–346. Issue 2. https://doi.org/10.1103/RevModPhys.87.307
  • Watrous (2018) John Watrous. 2018. The Theory of Quantum Information. Cambridge University Press. https://doi.org/10.1017/9781316848142
  • Yu et al. (2012) Nengkun Yu, Runyao Duan, and Quanhua Xu. 2012. Bounds on the distance between a unital quantum channel and the convex hull of unitary channels, with applications to the asymptotic quantum Birkhoff conjecture. arXiv preprint arXiv:1201.1172 (2012).

Appendix A Equivalence between quantum testers and quantum networks

Recall that the Choi-Jamiołkowski operator of linear mapping Ξ:𝐋(1)𝐋(2)\Xi:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right) is defined as J(Ξ):=i,j|ij|Ξ(|ij|)𝐋(12)J(\Xi):=\sum_{i,j}|{i}\rangle\langle{j}|\otimes\Xi(|{i}\rangle\langle{j}|)\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right), and the set of quantum testers is defined as 𝐓(1:2):={T𝐏𝐨𝐬(12):ρ𝐒(1),Tρ𝕀2}\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}\}. In this section, we show that the set of mappings fT:Ξtr[J(Ξ)T]f_{T}:\Xi\mapsto\text{tr}\left[J\left(\Xi\right)T\right] associated with quantum testers T𝐓(1:2)T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}) is equivalent to that of mappings gΦ,Π:Ξtr[Ξid3(Φ)Π]g_{\Phi,\Pi}:\Xi\mapsto\text{tr}\left[\Xi\otimes id_{\mathcal{H}_{3}}(\Phi)\Pi\right] associated with pure states Φ𝐏(13)\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right) and Hermitian projectors Π𝐏𝐫𝐨𝐣(23)\Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3}) for sufficiently large dimensional Hilbert space 3\mathcal{H}_{3}. This equivalence indicates

(60) maxT𝐓(1:2)minΞfT(Ξ)=maxΦ𝐏(13)Π𝐏𝐫𝐨𝐣(23)minΞgΦ,Π(Ξ),\max_{T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})}\min_{\Xi}f_{T}(\Xi)=\max_{\begin{subarray}{c}\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)\\ \Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3})\end{subarray}}\min_{\Xi}g_{\Phi,\Pi}(\Xi),

where the minimization of Ξ\Xi is taken over a compact subset of linear mappings specified in the proofs of Proposition 3.1 and Lemma 4.2. Note that a proof for more general quantum testers is given in (Chiribella et al., 2009, Theorem 10).

First, we show that for any Φ\Phi and Π\Pi, there exists T𝐓(1:2)T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}) such that fT=gΦ,Πf_{T}=g_{\Phi,\Pi} as follows. By letting T=tr3[(ΦT1𝕀2)(𝕀1Π)]T=\text{tr}_{3}\left[(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi)\right], we obtain

(61) gΦ,Π(Ξ)=tr[Ξid3(Φ)Π]\displaystyle g_{\Phi,\Pi}(\Xi)=\text{tr}\left[\Xi\otimes id_{\mathcal{H}_{3}}(\Phi)\Pi\right] =\displaystyle= tr[(J(Ξ)𝕀3)(ΦT1𝕀2)(𝕀1Π)]=tr[J(Ξ)T]=fT(Ξ),\displaystyle\text{tr}\left[(J(\Xi)\otimes\mathbb{I}_{3})(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi)\right]=\text{tr}\left[J(\Xi)T\right]=f_{T}(\Xi),

where ΦT1\Phi^{T_{1}} and tr3[]\text{tr}_{3}\left[\cdot\right] represent the partial transpose of Φ\Phi and the partial trace, respectively, and the subscript of the operator denotes the system on which the operator acts. We can also verify that T𝐓(1:2)T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}) as follows. Let X=ijαij|j3i|1X=\sum_{ij}\alpha_{ij}|{j}\rangle_{3}\langle{i}|_{1}, where |Φ=ijαij|i1|j3|{\Phi}\rangle=\sum_{ij}\alpha_{ij}|{i}\rangle_{1}|{j}\rangle_{3} with the computational basis {|i1𝐏(1)}i\{|{i}\rangle_{1}\in\mathbf{P}\left(\mathcal{H}_{1}\right)\}_{i} and {|j3𝐏(3)}j\{|{j}\rangle_{3}\in\mathbf{P}\left(\mathcal{H}_{3}\right)\}_{j}. We then obtain that for any positive semidefinite operator P𝐏𝐨𝐬(12)P\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right),

(62) tr[PT]=tr[(P𝕀3)(ΦT1𝕀2)(𝕀1Π)]=tr[(X𝕀2)P(X𝕀2)Π]0,\text{tr}\left[PT\right]=\text{tr}\left[(P\otimes\mathbb{I}_{3})(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi)\right]=\text{tr}\left[(X\otimes\mathbb{I}_{2})P(X\otimes\mathbb{I}_{2})^{\dagger}\Pi\right]\geq 0,

which indicates T0T\geq 0. By letting ρ=tr3[ΦT1]=tr3[Φ]T(𝐒(1))\rho=\text{tr}_{3}\left[\Phi^{T_{1}}\right]=\text{tr}_{3}\left[\Phi\right]^{T}(\in\mathbf{S}\left(\mathcal{H}_{1}\right)), we can also verify that

(63) ρ𝕀2T=tr3[(ΦT1𝕀2)(𝕀123𝕀1Π)]=tr3[(ΦT1𝕀2)(𝕀1Π)]0,\displaystyle\rho\otimes\mathbb{I}_{2}-T=\text{tr}_{3}\left[(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{123}-\mathbb{I}_{1}\otimes\Pi)\right]=\text{tr}_{3}\left[(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi_{\bot})\right]\geq 0,

where Π𝐏𝐫𝐨𝐣(23)\Pi_{\bot}\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3}) satisfies Π+Π=𝕀\Pi+\Pi_{\bot}=\mathbb{I}, and the last inequality can be verified by the fact that T0T\geq 0.

Next, we show that for any T𝐓(1:2)T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}), there exist Φ𝐏(13)\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right) and Π𝐏𝐫𝐨𝐣(23)\Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3}) such that fT=gΦ,Πf_{T}=g_{\Phi,\Pi} as follows. Let Tρ1𝕀2T\leq\rho_{1}\otimes\mathbb{I}_{2}, Φ^𝐏(11)\hat{\Phi}\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{1^{\prime}}\right) be a purification of ρ1T\rho_{1}^{T}, its singular value decomposition be |Φ^=ip(i)|xi1|yi1|{\hat{\Phi}}\rangle=\sum_{i}\sqrt{p(i)}|{x_{i}}\rangle_{1}|{y_{i}}\rangle_{1^{\prime}} (p(i)>0p(i)>0), and P𝐏𝐨𝐬(21)P\in\mathbf{Pos}\left(\mathcal{H}_{2}\otimes\mathcal{H}_{1^{\prime}}\right) be P=XTXP=XTX^{\dagger}, where X=i1p(i)|yi1xi|1X=\sum_{i}\frac{1}{\sqrt{p(i)}}|{y_{i}}\rangle_{1^{\prime}}\langle{x_{i}^{*}}|_{1} and |ϕ|{\phi^{*}}\rangle is the complex conjugate of |ϕ|{\phi}\rangle. We can then verify that

(64) fT(Ξ)=tr[J(Ξ)T]=tr[(J(Ξ)𝕀1)(Φ^T1𝕀2)(𝕀1P)]=tr[Ξid1(Φ^)P].f_{T}(\Xi)=\text{tr}\left[J(\Xi)T\right]=\text{tr}\left[(J(\Xi)\otimes\mathbb{I}_{1^{\prime}})(\hat{\Phi}^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes P)\right]=\text{tr}\left[\Xi\otimes id_{\mathcal{H}_{1^{\prime}}}(\hat{\Phi})P\right].

Since PX(ρ1𝕀2)X𝕀12P\leq X(\rho_{1}\otimes\mathbb{I}_{2})X^{\dagger}\leq\mathbb{I}_{1^{\prime}2}, {P,𝕀P}\{P,\mathbb{I}-P\} is a positive operator-valued measure (POVM). Owing to the Naimark’s extension, we can embed Φ^\hat{\Phi} and {P,𝕀P}\{P,\mathbb{I}-P\} in a larger Hilbert space as a pure state Φ\Phi and a projection-valued measure (PVM) {Π,Π}\{\Pi,\Pi_{\bot}\}, respectively, which completes the proof.

Appendix B Formal SDPs and their strong duality

A formal SDP to compute 12𝒜\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond} is defined with a triple (Ξ,A,B)(\Xi,A,B) such that

(65) A=(J(𝒜)00000000),B=(0001),Ξ((TTρ))=(T+Tρ𝕀200tr[ρ])\displaystyle A=\left(\begin{matrix}J(\mathcal{A}-\mathcal{B})&0&0\\ 0&0&0\\ 0&0&0\end{matrix}\right),\ \ B=\left(\begin{matrix}0&0\\ 0&1\end{matrix}\right),\ \ \Xi\left(\left(\begin{matrix}T&*&*\\ *&T^{\prime}&*\\ *&*&\rho\end{matrix}\right)\right)=\left(\begin{matrix}T+T^{\prime}-\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}&0\\ 0&\text{tr}\left[\rho\right]\end{matrix}\right)

holds for any linear operators T,T𝐋(12)T,T^{\prime}\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right) and ρ𝐋(1)\rho\in\mathbf{L}\left(\mathcal{H}_{1}\right), where the asterisks in the argument to Ξ\Xi represent arbitrary linear operators upon which Ξ\Xi does not depend, and we identify a linear operator and its matrix representation with respect to a fixed orthonormal basis. The dual problem is obtained by observing that the adjoint of Ξ\Xi satisfies

(66) Ξ((Sr))=(S000S000r𝕀1tr2[S])\Xi^{\dagger}\left(\left(\begin{matrix}S&*\\ *&r\\ \end{matrix}\right)\right)=\left(\begin{matrix}S&0&0\\ 0&S&0\\ 0&0&r\mathbb{I}_{\mathcal{H}_{1}}-\text{tr}_{\mathcal{H}_{2}}\left[S\right]\end{matrix}\right)

for any linear operator S𝐋(12)S\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right) and any complex number rr\in\mathbb{C}. We can verify the strong duality of this SDP by observing Ξ(𝕀1𝕀22dim1𝕀1𝕀22dim1𝕀1dim1)=B\Xi\left(\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}}{\dim\mathcal{H}_{1}}\right)=B and applying the Slater’s theorem.

A formal SDP shown in Proposition 3.1 is defined with a triple (Ξ,A,B)(\Xi,A,B) such that

(67) A\displaystyle A =\displaystyle= (J(𝒜)000000000000000000000001)\displaystyle\left(\begin{matrix}J(\mathcal{A})&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&-1\end{matrix}\right)
(68) B\displaystyle B =\displaystyle= (000010000)\displaystyle\left(\begin{matrix}0&0&0\\ 0&1&0\\ 0&0&0\end{matrix}\right)
(69) Ξ((SrP))\displaystyle\Xi^{\dagger}\left(\left(\begin{matrix}S&*&*\\ *&r&*\\ *&*&P\end{matrix}\right)\right) =\displaystyle= (S+xXP(x)J(x)00000S00000r𝕀1tr2[S]00000P00000tr[P])\displaystyle\left(\begin{matrix}S+\sum_{x\in X}P(x)J(\mathcal{B}_{x})&0&0&0&0\\ 0&S&0&0&0\\ 0&0&r\mathbb{I}_{\mathcal{H}_{1}}-\text{tr}_{\mathcal{H}_{2}}\left[S\right]&0&0\\ 0&0&0&P&0\\ 0&0&0&0&-\text{tr}\left[P\right]\end{matrix}\right)

holds for any linear operators S𝐋(12)S\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right), P𝐋(|X|)P\in\mathbf{L}\left(\mathbb{C}^{|X|}\right) and any complex number rr\in\mathbb{C}, where P(x)P(x) represents a diagonal element x|P|x\langle{x}|P|{x}\rangle. The primal problem is obtained by observing that the adjoint of Ξ\Xi^{\dagger} satisfies

(70) Ξ((TTρQt))=(T+Tρ𝕀2000tr[ρ]000xXtr[J(x)T]|xx|+Qt𝕀|X|)\Xi\left(\left(\begin{matrix}T&*&*&*&*\\ *&T^{\prime}&*&*&*\\ *&*&\rho&*&*\\ *&*&*&Q&*\\ *&*&*&*&t\end{matrix}\right)\right)=\left(\begin{matrix}T+T^{\prime}-\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}&0&0\\ 0&\text{tr}\left[\rho\right]&0\\ 0&0&\sum_{x\in X}\text{tr}\left[J(\mathcal{B}_{x})T\right]|{x}\rangle\langle{x}|+Q-t\mathbb{I}_{\mathbb{C}^{|X|}}\end{matrix}\right)

for any linear operators T,T𝐋(12)T,T^{\prime}\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right), ρ𝐋(1)\rho\in\mathbf{L}\left(\mathcal{H}_{1}\right), Q𝐋(|X|)Q\in\mathbf{L}\left(\mathbb{C}^{|X|}\right) and any complex number tt\in\mathbb{C}. We can verify the strong duality of this SDP by observing Ξ(𝕀1𝕀22dim1𝕀1𝕀22dim1𝕀1dim1𝕀|X|21)=B\Xi\left(\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}}{\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathbb{C}^{|X|}}}{2}\oplus 1\right)=B and applying the Slater’s theorem.

Appendix C Sharpness of approximation error bounds

In this section, we make the same assumption d2d\geq 2 as Lemma 4.1 and 4.2.

C.1. Lower bounds

To show the sharpness of the lower bounds in Ineqs. (14) and (1), we consider a set {Υx}xX:={Υ:W𝐖ϵ(d),Υ(ρ)=WρW}\{\Upsilon_{x}\}_{x\in X}:=\{\Upsilon:\exists W\in\mathbf{W}_{\epsilon}^{(d)},\Upsilon(\rho)=W\rho W^{\dagger}\} of unitary transformations, where

(71) 𝐖ϵ(d):={W:WU(d)minzconv(λ(W))|z|1ϵ2,}withϵ[0,1]andd2,\displaystyle\mathbf{W}_{\epsilon}^{(d)}:=\left\{W:W\in U(d)\wedge\min_{z\in\text{conv}\left(\lambda(W)\right)}|z|\leq\sqrt{1-\epsilon^{2}},\right\}\ \ {\rm with}\ \ \epsilon\in[0,1]\ {\rm and}\ d\geq 2,

where U(d)U(d) represents the set of unitary operators acting on d\mathbb{C}^{d}, λ(W)\lambda(W) represents the set of eigenvalues of WW, and conv(X)\text{conv}\left(X\right) represents the convex hull of a subset XX in a vector space. To be precise, the two lower bounds are not directly applicable to {Υx}xX\{\Upsilon_{x}\}_{x\in X} since the size |X||X| of the set is infinite. However, the compactness of the set of unitary transformations on a finite-dimensional Hilbert space enables us to extend Ineqs. (14) and (1) for |X|=|X|=\infty by replacing minxX12ΥΥx\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond} and minp12ΥxXp(x)Υx\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond} with infxX12ΥΥx\inf_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond} and infΛconv({Υx}xX)12ΥΛ\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|\Upsilon-\Lambda\right\|_{\diamond}, respectively.

We show that this example achieves the lower bounds in the extended inequalities with a target unitary transformation Υ=id\Upsilon=id in Ineq. (14). This also indicates that there exists a finite subset {Υx}xX~\{\Upsilon_{x}\}_{x\in\tilde{X}} of {Υx}xX\{\Upsilon_{x}\}_{x\in X} such that minp12idxX~p(x)Υx\min_{p}\frac{1}{2}\left\|id-\sum_{x\in\tilde{X}}p(x)\Upsilon_{x}\right\|_{\diamond} and maxΥminp12ΥxX~p(x)Υx\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in\tilde{X}}p(x)\Upsilon_{x}\right\|_{\diamond} are arbitrarily close to their each lower bound in Ineqs. (14) and (1), respectively. For letting {Υx}xX~\{\Upsilon_{x}\}_{x\in\tilde{X}} be an ϵ~\tilde{\epsilon}-covering of {Υx}xX\{\Upsilon_{x}\}_{x\in X} with sufficiently small ϵ~\tilde{\epsilon} is sufficient to show this. Thus, the sharpness of the lower bounds in the extended inequalities indicates that in the original inequalities. Note that we can show the sharpness of Ineq. (14) when an Υ\Upsilon is not the identity transformation by replacing {Υx}xX\{\Upsilon_{x}\}_{x\in X} with {ΥΥx}xX\{\Upsilon\circ\Upsilon_{x}\}_{x\in X}.

First, by using Eq. (19), we obtain

maxΥinfxX12ΥΥxinfxX12idΥx\displaystyle\max_{\Upsilon}\inf_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\geq\inf_{x\in X}\frac{1}{2}\left\|id-\Upsilon_{x}\right\|_{\diamond}
(72) =1supW𝐖ϵ(d)minρ𝐒(d)|tr[ρW]|2=1supW𝐖ϵ(d)minzconv(λ(W))|z|2=ϵ.\displaystyle=\sqrt{1-\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho W\right]\right|^{2}}=\sqrt{1-\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\min_{z\in\text{conv}\left(\lambda(W)\right)}|z|^{2}}=\epsilon.

Second, by using the extended version of Eq. (31), we obtain

(73) infΛconv({Υx}xX)12idΛmaxΥinfΛconv({Υx}xX)12ΥΛ=1minVU(d),ρ𝐒(d)supW𝐖ϵ(d)|tr[ρVW]|2.\displaystyle\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|id-\Lambda\right\|_{\diamond}\leq\max_{\Upsilon}\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|\Upsilon-\Lambda\right\|_{\diamond}=1-\min_{V\in U(d),\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\left|\text{tr}\left[\rho V^{\dagger}W\right]\right|^{2}.

In the following, we show that for any VU(d)V\in U(d) and ρ𝐒(d)\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right),

(74) supW𝐖ϵ(d)|tr[ρVW]|2(12δd)2withδ=11ϵ2,\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\left|\text{tr}\left[\rho V^{\dagger}W\right]\right|^{2}\geq\left(1-\frac{2\delta}{d}\right)^{2}\ {\rm with}\ \delta=1-\sqrt{1-\epsilon^{2}},

which is sufficient to verify that {Υx}xX\{\Upsilon_{x}\}_{x\in X} achieves lower bounds in the extended Ineqs. (14) and (1).

Let the diagonalization of VV be V=i=1dλi(V)|ii|V=\sum_{i=1}^{d}\lambda_{i}(V)|{i}\rangle\langle{i}|. Since supW𝐖ϵ(d)|tr[ρVW]|2=tr[ρVV]|2=1\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}|\text{tr}\left[\rho V^{\dagger}W\right]|^{2}=\text{tr}\left[\rho V^{\dagger}V\right]|^{2}=1 if minzconv(λ(V))|z|1ϵ2\min_{z\in\text{conv}\left(\lambda(V)\right)}|z|\leq\sqrt{1-\epsilon^{2}}, we assume ϵ>0\epsilon>0 and minzconv(λ(V))|z|>1ϵ2\min_{z\in\text{conv}\left(\lambda(V)\right)}|z|>\sqrt{1-\epsilon^{2}}. We can then define {W(ij)𝐖ϵ(d)}1i<jd\{W^{(ij)}\in\mathbf{W}_{\epsilon}^{(d)}\}_{1\leq i<j\leq d} as

(75) W(ij)\displaystyle W^{(ij)} :=\displaystyle:= k{i,j}λk(V)|kk|+λ+(ij)|ii|+λ(ij)|jj|,\displaystyle\sum_{k\notin\{i,j\}}\lambda_{k}(V)|{k}\rangle\langle{k}|+\lambda_{+}^{(ij)}|{i}\rangle\langle{i}|+\lambda_{-}^{(ij)}|{j}\rangle\langle{j}|,
(76) whereλ±(ij)\displaystyle{\rm where}\ \lambda_{\pm}^{(ij)} =\displaystyle= 1ϵ2λi(V)+λj(V)|λi(V)+λj(V)|±ϵλi(V)λj(V)|λi(V)λj(V)|ifλi(V)λj(V),\displaystyle\sqrt{1-\epsilon^{2}}\frac{\lambda_{i}(V)+\lambda_{j}(V)}{|\lambda_{i}(V)+\lambda_{j}(V)|}\pm\epsilon\frac{\lambda_{i}(V)-\lambda_{j}(V)}{|\lambda_{i}(V)-\lambda_{j}(V)|}\ {\rm if}\ \lambda_{i}(V)\neq\lambda_{j}(V),
(77) andλ±(ij)\displaystyle{\rm and}\ \lambda_{\pm}^{(ij)} =\displaystyle= 1ϵ2λi(V)±iϵλi(V)ifλi(V)=λj(V).\displaystyle\sqrt{1-\epsilon^{2}}\lambda_{i}(V)\pm i\epsilon\lambda_{i}(V)\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ {\rm if}\ \lambda_{i}(V)=\lambda_{j}(V).
Refer to caption
Figure 4. Geometric positions of eigenvalues λi(V)\lambda_{i}(V), λj(V)\lambda_{j}(V) and λ±(ij)\lambda_{\pm}^{(ij)} of unitary operators, which lie on unit circle in complex plane. Note that real and imaginary axes are rotated to horizontalize line equidistant from λi(V)\lambda_{i}(V) and λj(V)\lambda_{j}(V).

(See geometric positions of eigenvalues in the complex plane shown in Fig. 4.) Note that we can easily verify that |λ±(ij)|=1|\lambda_{\pm}^{(ij)}|=1 and |12(λ+(ij)+λ(ij))|=1ϵ2\left|\frac{1}{2}\left(\lambda_{+}^{(ij)}+\lambda_{-}^{(ij)}\right)\right|=\sqrt{1-\epsilon^{2}}, which guarantees W(ij)𝐖ϵ(d)W^{(ij)}\in\mathbf{W}_{\epsilon}^{(d)}. Moreover, we can verify that λ(VW(ij))={1,z(ij),z(ij)}\lambda\left(V^{\dagger}W^{(ij)}\right)=\{1,z^{(ij)},z^{(ij)*}\} with a unit complex number z(ij)z^{(ij)} satisfying Re[z(ij)]1ϵ2\text{Re}\left[z^{(ij)}\right]\geq\sqrt{1-\epsilon^{2}}. Then, for any VU(d)V\in U(d) and ρ𝐒(d)\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right), the left hand side of Ineq. (74) can be bounded as

(78) supW𝐖ϵ(d)|tr[ρVW]|2\displaystyle\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\left|\text{tr}\left[\rho V^{\dagger}W\right]\right|^{2} \displaystyle\geq max1i<jd|tr[ρVW(ij)]|2minpmax1i<jd|k{i,j}p(k)+p(i)z(ij)+p(j)z(ij)|2\displaystyle\max_{1\leq i<j\leq d}\left|\text{tr}\left[\rho V^{\dagger}W^{(ij)}\right]\right|^{2}\geq\min_{p}\max_{1\leq i<j\leq d}\left|\sum_{k\notin\{i,j\}}p(k)+p(i)z^{(ij)}+p(j)z^{(ij)*}\right|^{2}
(79) \displaystyle\geq minpmax1i<jd{k{i,j}p(k)+(p(i)+p(j))Re[z(ij)]}2\displaystyle\min_{p}\max_{1\leq i<j\leq d}\left\{\sum_{k\notin\{i,j\}}p(k)+(p(i)+p(j))\text{Re}\left[z^{(ij)}\right]\right\}^{2}
(80) \displaystyle\geq minpmax1i<jd{k{i,j}p(k)+(p(i)+p(j))1ϵ2}2\displaystyle\min_{p}\max_{1\leq i<j\leq d}\left\{\sum_{k\notin\{i,j\}}p(k)+(p(i)+p(j))\sqrt{1-\epsilon^{2}}\right\}^{2}
(81) \displaystyle\geq minpmax1i<jd{1δ(p(i)+p(j))}2(12δd)2.\displaystyle\min_{p}\max_{1\leq i<j\leq d}\left\{1-\delta(p(i)+p(j))\right\}^{2}\geq\left(1-\frac{2\delta}{d}\right)^{2}.

This completes the proof.

C.2. Upper bound

We show the sharpness of the upper bound in Ineq. (1), We consider a set {Υx}xX:={Υ:V𝐕ϵ(d),Υ(ρ)=VρV}\{\Upsilon_{x}\}_{x\in X}:=\{\Upsilon:\exists V\in\mathbf{V}_{\epsilon}^{(d)},\Upsilon(\rho)=V\rho V^{\dagger}\} of unitary transformations, where

(82) 𝐕ϵ(d)\displaystyle\mathbf{V}_{\epsilon}^{(d)} :=\displaystyle:= {(100V1)(W00𝕀d2)(100V2):V1,V2U(d1),W𝐑ϵ},\displaystyle\left\{\left(\begin{matrix}1&0\\ 0&V_{1}\end{matrix}\right)\left(\begin{matrix}W&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&V_{2}\end{matrix}\right):V_{1},V_{2}\in U(d-1),W\in\mathbf{R}_{\epsilon}\right\},
(83) 𝐑ϵ\displaystyle\mathbf{R}_{\epsilon} :=\displaystyle:= {(cosθsinθsinθcosθ):0θarccos(ϵ)}withϵ[0,1]andd2.\displaystyle\left\{\left(\begin{matrix}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{matrix}\right):0\leq\theta\leq\arccos(\epsilon)\right\}\ \ {\rm with}\ \ \epsilon\in[0,1]\ {\rm and}\ d\geq 2.

Here 𝕀d\mathbb{I}_{d} represents the d×dd\times d identity matrix, and we identify a unitary operator and its matrix representation with respect to a fixed orthonormal basis {|i}i=0d1\{|{i}\rangle\}_{i=0}^{d-1}. Since |X|=|X|=\infty, we show the sharpness of the upper bound in the extended Ineq. (1), which is defined in the proof of the sharpness of the lower bounds. Note that

(84) UU(d),α,V𝐕0(d),U=eiαV\forall U\in U(d),\exists\alpha\in\mathbb{R},\exists V\in\mathbf{V}_{0}^{(d)},U=e^{i\alpha}V

holds. This can be verified from the following three observations: First, by letting U|i=|eiU|{i}\rangle=|{e_{i}}\rangle, there exists V~1,V~2U(d1)\tilde{V}_{1},\tilde{V}_{2}\in U(d-1) and W~U(2)\tilde{W}\in U(2) such that (W~00𝕀d2)(100V~1)|e0=|0\left(\begin{matrix}\tilde{W}^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&\tilde{V}_{1}^{\dagger}\end{matrix}\right)|{e_{0}}\rangle=|{0}\rangle and (100V~2)(W~00𝕀d2)(100V~1)|ei=|i\left(\begin{matrix}1&0\\ 0&\tilde{V}_{2}^{\dagger}\end{matrix}\right)\left(\begin{matrix}\tilde{W}^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&\tilde{V}_{1}^{\dagger}\end{matrix}\right)|{e_{i}}\rangle=|{i}\rangle for all ii. Second, for any W~U(2)\tilde{W}\in U(2), there exists α,β,γ\alpha,\beta,\gamma\in\mathbb{R} and W𝐑0W\in\mathbf{R}_{0} such that W~=eiα(100eiβ)W(100eiγ)\tilde{W}=e^{i\alpha}\left(\begin{matrix}1&0\\ 0&e^{i\beta}\end{matrix}\right)W\left(\begin{matrix}1&0\\ 0&e^{i\gamma}\end{matrix}\right). Third, by letting V1=V~1(eiβ00eiα𝕀d2)V_{1}=\tilde{V}_{1}\left(\begin{matrix}e^{i\beta}&0\\ 0&e^{-i\alpha}\mathbb{I}_{d-2}\end{matrix}\right) and V2=(eiγ00𝕀d2)V~2V_{2}=\left(\begin{matrix}e^{i\gamma}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\tilde{V}_{2}, we can verify U=eiα(100V1)(W00𝕀d2)(100V2)U=e^{i\alpha}\left(\begin{matrix}1&0\\ 0&V_{1}\end{matrix}\right)\left(\begin{matrix}W&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&V_{2}\end{matrix}\right).

First, by using Eq. (19), we obtain

(85) maxΥinfxX12ΥΥx\displaystyle\max_{\Upsilon}\inf_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond} =\displaystyle= 1minUU(d)supV𝐕ϵ(d)minρ𝐒(d)|tr[ρUV]|2\displaystyle\sqrt{1-\min_{U\in U(d)}\sup_{V\in\mathbf{V}_{\epsilon}^{(d)}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho U^{\dagger}V\right]\right|^{2}}
(86) =\displaystyle= 1minW𝐑0supV𝐕ϵ(d)minρ𝐒(d)|tr[ρ(W00𝕀d2)V]|2\displaystyle\sqrt{1-\min_{W\in\mathbf{R}_{0}}\sup_{V\in\mathbf{V}_{\epsilon}^{(d)}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho\left(\begin{matrix}W^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)V\right]\right|^{2}}
(87) \displaystyle\leq 1minW𝐑0supW𝐑ϵminρ𝐒(d)|tr[ρ(WW00𝕀d2)]|2\displaystyle\sqrt{1-\min_{W\in\mathbf{R}_{0}}\sup_{W^{\prime}\in\mathbf{R}_{\epsilon}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho\left(\begin{matrix}W^{\dagger}W^{\prime}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\right]\right|^{2}}
(88) =\displaystyle= 1min0θπ2sup0θarccos(ϵ)minρ𝐒(d)|tr[ρ(cos(θθ)sin(θθ)0sin(θθ)cos(θθ)000𝕀d2)]|2\displaystyle\sqrt{1-\min_{0\leq\theta\leq\frac{\pi}{2}}\sup_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho\left(\begin{matrix}\cos(\theta^{\prime}-\theta)&-\sin(\theta^{\prime}-\theta)&0\\ \sin(\theta^{\prime}-\theta)&\cos(\theta^{\prime}-\theta)&0\\ 0&0&\mathbb{I}_{d-2}\end{matrix}\right)\right]\right|^{2}}
(89) =\displaystyle= 1min0θπ2sup0θarccos(ϵ)cos2(θθ)\displaystyle\sqrt{1-\min_{0\leq\theta\leq\frac{\pi}{2}}\sup_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}\cos^{2}(\theta^{\prime}-\theta)}
(90) =\displaystyle= max0θπ2inf0θarccos(ϵ)|sin(θθ)|=ϵ,\displaystyle\max_{0\leq\theta\leq\frac{\pi}{2}}\inf_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}|\sin(\theta^{\prime}-\theta)|=\epsilon,

where we use Eq. (84) in the second equality and use λ((cos(θθ)sin(θθ)0sin(θθ)cos(θθ)000𝕀d2))={1,e±i(θθ)}\lambda\left(\left(\begin{matrix}\cos(\theta^{\prime}-\theta)&-\sin(\theta^{\prime}-\theta)&0\\ \sin(\theta^{\prime}-\theta)&\cos(\theta^{\prime}-\theta)&0\\ 0&0&\mathbb{I}_{d-2}\end{matrix}\right)\right)=\{1,e^{\pm i(\theta^{\prime}-\theta)}\} in the fourth equality.

Second, by using the definition of the diamond norm, we obtain

(91) maxΥinfΛconv({Υx}xX)12ΥΛ\displaystyle\max_{\Upsilon}\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|\Upsilon-\Lambda\right\|_{\diamond} \displaystyle\geq maxΥinfΛconv({Υx}xX)Υ(|00|)Λ(|00|)tr\displaystyle\max_{\Upsilon}\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\left\|\Upsilon(|{0}\rangle\langle{0}|)-\Lambda(|{0}\rangle\langle{0}|)\right\|_{\text{tr}}
(92) \displaystyle\geq 1minΥsupΛconv({Υx}xX)F(Υ(|00|),Λ(|00|))\displaystyle 1-\min_{\Upsilon}\sup_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}F\left(\Upsilon(|{0}\rangle\langle{0}|),\Lambda(|{0}\rangle\langle{0}|)\right)
(93) =\displaystyle= 1minΥsupxXF(Υ(|00|),Υx(|00|))\displaystyle 1-\min_{\Upsilon}\sup_{x\in X}F\left(\Upsilon(|{0}\rangle\langle{0}|),\Upsilon_{x}(|{0}\rangle\langle{0}|)\right)
(94) =\displaystyle= 1minUU(d)supV𝐕ϵ(d)|0|UV|0|2\displaystyle 1-\min_{U\in U(d)}\sup_{V\in\mathbf{V}_{\epsilon}^{(d)}}\left|\langle{0}|U^{\dagger}V|{0}\rangle\right|^{2}
(95) =\displaystyle= 1minW𝐑0supW𝐑ϵVU(d1)|0|(W00𝕀d2)(100V)(W00𝕀d2)|0|2\displaystyle 1-\min_{W\in\mathbf{R}_{0}}\sup_{\begin{subarray}{c}W^{\prime}\in\mathbf{R}_{\epsilon}\\ V\in U(d-1)\end{subarray}}\left|\langle{0}|\left(\begin{matrix}W^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&V\end{matrix}\right)\left(\begin{matrix}W^{\prime}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)|{0}\rangle\right|^{2}
(96) =\displaystyle= 1min0θπ2sup0θarccos(ϵ)VU(d1)|cosθcosθ+sinθsinθ1|V|1|2\displaystyle 1-\min_{0\leq\theta\leq\frac{\pi}{2}}\sup_{\begin{subarray}{c}0\leq\theta^{\prime}\leq\arccos(\epsilon)\\ V\in U(d-1)\end{subarray}}\left|\cos\theta\cos\theta^{\prime}+\sin\theta\sin\theta^{\prime}\langle{1}|V|{1}\rangle\right|^{2}
(97) =\displaystyle= max0θπ2inf0θarccos(ϵ)sin2(θθ)=ϵ2,\displaystyle\max_{0\leq\theta\leq\frac{\pi}{2}}\inf_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}\sin^{2}(\theta^{\prime}-\theta)=\epsilon^{2},

where we use ϕρtr=maxΠ𝐏𝐫𝐨𝐣()tr[Π(ϕρ)]1tr[ϕρ]\left\|\phi-\rho\right\|_{\text{tr}}=\max_{\Pi\in\mathbf{Proj}\left(\mathcal{H}\right)}\text{tr}\left[\Pi(\phi-\rho)\right]\geq 1-\text{tr}\left[\phi\rho\right] in the second inequality and use Eq. (84) in the third equality. This and the extended upper bound in Ineq. (1) complete the proof.

Appendix D Numerical experiment for actual approximation errors

Recall that Theorem 4.3 has established the relationship between the actual approximation error ϵΥprob\epsilon_{\Upsilon}^{prob} caused by the probabilistic approximation, that ϵΥ\epsilon_{\Upsilon} caused by the deterministic approximation, and the worst approximation error ϵ\epsilon, where each approximation error is defined by

(98) ϵΥprob\displaystyle\epsilon_{\Upsilon}^{prob} :=\displaystyle:= minp12ΥxXp(x)Υx\displaystyle\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}
(99) ϵΥ\displaystyle\epsilon_{\Upsilon} :=\displaystyle:= minxX12ΥΥx\displaystyle\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}
(100) ϵ\displaystyle\epsilon :=\displaystyle:= maxΥminxX12ΥΥx.\displaystyle\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}.

By using these notations, we can rewrite the relationship established in Theorem 4.3 as follows:

(101) 2ϵΥ2d4δΥd(1δΥd)ϵΥprobϵ2,\frac{2\epsilon_{\Upsilon}^{2}}{d}\simeq\frac{4\delta_{\Upsilon}}{d}\left(1-\frac{\delta_{\Upsilon}}{d}\right)\leq\epsilon_{\Upsilon}^{prob}\leq\epsilon^{2},

where δΥ=11ϵΥ2\delta_{\Upsilon}=1-\sqrt{1-\epsilon_{\Upsilon}^{2}} and the approximation becomes tighter when ϵΥ0\epsilon_{\Upsilon}\rightarrow 0.

While these inequalities are tight, it is helpful to know how small ϵΥprob\epsilon_{\Upsilon}^{prob} can be realized compared to ϵΥ2\epsilon_{\Upsilon}^{2}. In this appendix, we perform numerical experiments to demonstrate that ϵΥprob\epsilon_{\Upsilon}^{prob} is comparable to ϵΥ2\epsilon_{\Upsilon}^{2} for randomly chosen target unitary transformations Υ\Upsilon. Moreover, we demonstrate that ϵΥprob\epsilon_{\Upsilon}^{prob} becomes smaller than ϵΥ2\epsilon_{\Upsilon}^{2} for high-dimensional unitary transformations, i.e., the probabilistic approximation reduces the approximation error more than quadratically.

Additionally, the experiments have another purpose: to provide pieces of evidence supporting that Lemma 5.3 holds for qudit unitary transformations, which is crucial in constructing our probabilistic synthesis algorithm. Recall that it states that

(102) minpΥxXp(x)Υx=minp^ΥxX^p^(x)Υx,\min_{p}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}=\min_{\hat{p}}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond},

where X^:={xX:12ΥΥx2ϵ}\hat{X}:=\{x\in X:\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq 2\epsilon\}. Our numerical experiments support this statement for randomly sampled target unitary transformations Υ:𝐋(d)𝐋(d)\Upsilon:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right), randomly constructed ϵ\epsilon-coverings {Υx:𝐋(d)𝐋(d)}xX\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X}, and d=3,4d=3,4.

Setting of numerical experiments

First, we construct ϵ\epsilon-coverings {Υx:𝐋(d)𝐋(d)}xX\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X} by randomly choosing |X|=105|X|=10^{5}, |X|=106|X|=10^{6} and |X|=107|X|=10^{7} unitary operators for d=2d=2, d=3d=3, and d=4d=4, respectively. Note that the random sampling of unitary operators means the sampling probability distribution is induced by the Haar measure on U(d)U(d). We compute a lower bound on ϵ\epsilon as maximinxX12ΥiΥx\max_{i}\min_{x\in X}\frac{1}{2}\left\|\Upsilon_{i}-\Upsilon_{x}\right\|_{\diamond} by using 3030 randomly chosen target unitary transformations {Υi}i=130\{\Upsilon_{i}\}_{i=1}^{30}. We interpret {Υx}xX\left\{\Upsilon_{x}\right\}_{x\in X} as the set of available unitary transformations in probabilistic and deterministic approximation.

Next, we randomly choose a target unitary transformation Υ\Upsilon and compute ϵΥ\epsilon_{\Upsilon}. We define the actual approximation error caused by probabilistically mixing restricted available unitary transformations as

(103) ϵΥprob(ϵ):=minp12ΥxX^(ϵ)p(x)Υx\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}):=\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}(\epsilon^{\prime})}p(x)\Upsilon_{x}\right\|_{\diamond}

where X^(ϵ):={xX:12ΥΥxϵ}\hat{X}(\epsilon^{\prime}):=\{x\in X:\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{\prime}\}. Note that ϵΥprob(ϵ)\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}) is a monotonically decreasing function. Moreover, if Lemma 5.3 holds for qudit unitary transformations, ϵΥprob(2ϵ)=ϵΥprob(1)=ϵΥprob\epsilon_{\Upsilon}^{prob}(2\epsilon)=\epsilon_{\Upsilon}^{prob}(1)=\epsilon_{\Upsilon}^{prob}.

Third, we compute the actual approximation error caused by probabilistically mixing more restricted available unitary transformations as

(104) ϵΥprob(N):=minp12Υx=1Np(x)Υx,\epsilon_{\Upsilon}^{prob}(N):=\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x=1}^{N}p(x)\Upsilon_{x}\right\|_{\diamond},

where {Υx}x=1N\{\Upsilon_{x}\}_{x=1}^{N} is a randomly sampled subset of X^(ϵ)\hat{X}(\epsilon^{\prime}) and ϵ\epsilon^{\prime} is chosen large enough for ϵΥprob(ϵ)\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}) to converge.

Results of numerical experiments

In Fig. 5, we draw the graphs of ϵΥprob(ϵ)\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}) for 1010 randomly chosen Υ\Upsilon by using different colors corresponding to Υ\Upsilon. We can observe that ϵΥprob(ϵ)\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}) is saturated when ϵ1.4ϵ\epsilon^{\prime}\geq 1.4\epsilon and ϵΥprob\epsilon_{\Upsilon}^{prob} is comparable or smaller than ϵΥ2\epsilon_{\Upsilon}^{2} since log(ϵΥprob(ϵ))log(ϵΥ)2ϵΥprob(ϵ)ϵΥ2\frac{\log(\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}))}{\log(\epsilon_{\Upsilon})}\geq 2\Leftrightarrow\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})\leq\epsilon_{\Upsilon}^{2} and ϵΥprobϵΥprob(ϵ)\epsilon_{\Upsilon}^{prob}\leq\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}) by definition.

In Fig. 6, we draw the graph of ϵΥprob(N)\epsilon_{\Upsilon}^{prob}(N) for a randomly chosen Υ:𝐋(4)𝐋(4)\Upsilon:\mathbf{L}\left(\mathbb{C}^{4}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{4}\right). Note that we plot its empirical variance and average since ϵΥprob(N)\epsilon_{\Upsilon}^{prob}(N) is a random variable depending on the choice of a subset of X^(ϵ)\hat{X}(\epsilon^{\prime}). We can observe that the approximation error ϵΥprob(N)\epsilon_{\Upsilon}^{prob}(N) rapidly converges to its minimum ϵΥprob(ϵ)\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}). Therefore, we can expect that our probabilistic synthesis algorithm proposed in Theorem 5.4 can be made more efficient while still attaining an approximation error that is nearly optimal.

Environment of numerical experiments

We performed numerical experiments using Mathematica 14.0.0.0 on a MacBook Pro equipped with a 2.4GHz Intel Core i9 processor and 64GB of memory.

Refer to caption
Figure 5. The comparison between the actual approximation error ϵΥprob\epsilon_{\Upsilon}^{prob} caused by the probabilistic approximation and that ϵΥ\epsilon_{\Upsilon} caused by the deterministic approximation for 1010 randomly sampled target unitary transformations Υ\Upsilon. For both approximations, we use the set of available unitary transformations induced by an ϵ\epsilon-covering of U(d)U(d). For each Υ\Upsilon, we compute the actual approximation errors ϵΥprob(ϵ)\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}) caused by probabilistically mixing the set of available unitary transformations Υx\Upsilon_{x} such that 12ΥΥxϵ\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{\prime}. The gray vertical lines represent lower bounds on ϵ\epsilon and 1.4ϵ1.4\epsilon.
Refer to caption
Figure 6. The approximation error caused by probabilistically mixing restricted subsets {Υx}x=1N\{\Upsilon_{x}\}_{x=1}^{N} of available unitary transformations. First, we randomly choose a target unitary transformation Υ\Upsilon. Next, we randomly choose 8 such subsets from the set {Υx:12ΥΥxϵ}\{\Upsilon_{x}:\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{\prime}\} of available unitary transformations that seem sufficient to attain the optimal approximation error caused by probabilistically mixing all the available unitary transformations. The blue graph represents the average of the approximation error and the gray region represents its variance.