Probabilistic unitary synthesis with optimal accuracy

Seiseki Akibue (ORCID 0000-0001-9654-9361), NTT Communication Science Laboratories, NTT Corporation, 3–1, Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198, Japan; Go Kato, Advanced ICT Research Institute, NICT, 4–2–1, Nukui-Kitamachi, Koganei, Tokyo 184-8795, Japan; and Seiichiro Tani, Department of Mathematics, Waseda University, 1–6–1, Nishi-Waseda, Shinjuku, Tokyo 169-8050, Japan

(2023; 16 January 2023; 16 March 2009; 5 June 2009)

Abstract.

The purpose of unitary synthesis is to find a gate sequence that optimally approximates a target unitary transformation. A new synthesis approach, called probabilistic synthesis, has been introduced, and its superiority has been demonstrated over traditional deterministic approaches with respect to approximation error and gate length. However, the optimality of current probabilistic synthesis algorithms is unknown. We obtain the tight lower bound on the approximation error obtained by the optimal probabilistic synthesis, which guarantees the sub-optimality of current algorithms. We also show its tight upper bound, which improves and unifies current upper bounds depending on the class of target unitaries. These two bounds reveal the fundamental relationship of approximation error between probabilistic approximation and deterministic approximation of unitary transformations. From a computational point of view, we show that the optimal probability distribution can be computed by the semidefinite program (SDP) we construct. We also construct an efficient probabilistic synthesis algorithm for single-qubit unitaries, rigorously estimate its time complexity, and show that it reduces the approximation error quadratically compared with deterministic algorithms.

quantum gate synthesis, convex approximation, unitary gate decomposition

CCS Concepts: Theory of computation → Quantum complexity theory; Theory of computation → Quantum information theory

1. Introduction

In quantum simulation and quantum computation, a global unitary transformation on a many-body quantum system is obtained as a sequence of unitary transformations on a fixed-size system, e.g., those obtained by nearest-neighbor interactions. To guarantee and increase the accuracy of obtaining such transformations, rather than controlling their continuous parameters, each unitary transformation on the fixed-size system is realized as a sequence of gates chosen from a finite gate set $\{g_{i}\}_{i}$ , where each $g_{i}$ results in a fixed unitary transformation with negligible error thanks to sophisticated calibration, quantum error correction (Terhal, 2015) or the nature of the system (Kitaev, 2003). If $\{g_{i}\}_{i}$ is universal, an arbitrary unitary transformation can be approximated by a unitary transformation $g_{i_{n}}\circ\cdots\circ g_{i_{2}}\circ g_{i_{1}}$ obtained as a gate sequence for an appropriate choice of gate length $n$ depending on the approximation error one wants to achieve. For a given universal gate set such as the set of the Hadamard, controlled-NOT, and $\pi/8$ gates (Nielsen and Chuang, 2000), an algorithm to find a gate sequence for a given unitary transformation and an approximation error bound is called a unitary synthesis algorithm.

To suppress the effect of decoherence or overhead caused by the fault-tolerant implementation of each gate (Knill et al., 1998; Aharonov and Ben-Or, 2008), various studies (Kitaev et al., 2002; Harrow et al., 2002; Kliuchnikov et al., 2016, 2013; Ross, 2015; Bocharov et al., 2015; Fowler, 2011; Bouland and Giurgica-Tiron, 2021) have proposed unitary synthesis algorithms for minimizing the length of the output gate sequence. Following the celebrated Solovay-Kitaev algorithm (Kitaev et al., 2002), many algorithms are used to find one of the shortest gate sequences that can approximate a target unitary transformation $\Upsilon$ within the desired approximation error. Obviously, the goal can be achieved by brute force search (Fowler, 2011). However, to guarantee their efficiency, many algorithms are designed for synthesizing restricted classes of unitary transformations by using particular gate sets or for finding a nearly shortest gate sequence.

While approximating an $\Upsilon$ by using a single sequence of gates is a natural approach, the advantage of another approach using a probabilistic mixture of unitaries has been demonstrated (Hastings, 2017; Campbell, 2017; Kliuchnikov et al., 2022). Suppose that a synthesis algorithm produces a gate sequence for implementing a unitary transformation in $\{\Upsilon_{\vec{i}}\}=\{g_{i_{n}}\circ\cdots\circ g_{i_{2}}\circ g_{i_{1}}\}_{\vec{i}}$ in accordance with the probability distribution $p(\vec{i})$ to approximate an $\Upsilon$ . If the algorithm independently samples $\vec{i}$ for each time the $\Upsilon$ is used in the entire circuit, the physical transformation governed by the randomly executed unitary transformation $\Upsilon_{\vec{i}}$ in accordance with the $p(\vec{i})$ is described by a probabilistic mixture $\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}$ of unitaries. In this case, the approximation error should be measured by the distance between the $\Upsilon$ and $\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}$ .

Campbell (Campbell, 2017) and Kliuchnikov et al. (Kliuchnikov et al., 2022) constructed algorithms to compute a probability distribution $\{p(\vec{i})\}_{\vec{i}}$ for a given $\Upsilon$ and a set $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ of unitary transformations implemented as a gate sequence such that the approximation error of $\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}$ against $\Upsilon$ is almost quadratically better than that of a single optimal unitary transformation in $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ . More precisely, $\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}=O(\epsilon^{2})$ for the worst approximation error $\epsilon=\max_{\Upsilon}\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond}$ caused by deterministic synthesis, where $\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}$ is the diamond norm (Watrous, 2018; Kitaev et al., 2002). This also indicates that probabilistically executing $\Upsilon_{\vec{i}}$ in accordance with $p(\vec{i})$ can further reduce the length of the shortest gate sequence without increasing the approximation error (if one measures the error by using the above diamond norm) (Campbell, 2017). In general, probabilistic synthesis consists of two procedures: (i) computing a probability distribution $\{p(\vec{i})\}_{\vec{i}}$ and sampling a description $\vec{i}$ of a gate sequence with a classical computer, and (ii) implementing the sampled gate sequence on a quantum computer. In contrast to deterministic synthesis, procedure (ii) may require a quantum computer to rearrange a gate sequence each time it realizes a target unitary transformation. However, such rearrangeability is usually assumed as a standard functionality of a quantum computer.
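The quadratic improvement can be seen in a small worked example. This is an illustration of ours under stated assumptions, not a construction taken from the cited algorithms: the target is the identity on a qubit, the only available unitaries are the two rotations $R_{z}(\pm\Delta)$ about the $Z$ axis with $0<\Delta<\pi$ , and $\mathcal{R}_{\pm\Delta}$ and $\mathcal{Z}$ denote the channels $\rho\mapsto R_{z}(\pm\Delta)\rho R_{z}(\pm\Delta)^{\dagger}$ and $\rho\mapsto Z\rho Z$ (all of these symbols are introduced here for illustration only).

```latex
\begin{align*}
R_{z}(\pm\Delta) &= \cos\tfrac{\Delta}{2}\,\mathbb{I}\mp i\sin\tfrac{\Delta}{2}\,Z,\\
\text{(deterministic)}\quad \tfrac{1}{2}\left\|id-\mathcal{R}_{\pm\Delta}\right\|_{\diamond} &= \sin\tfrac{\Delta}{2}=:\epsilon,\\
\tfrac{1}{2}\mathcal{R}_{+\Delta}(\rho)+\tfrac{1}{2}\mathcal{R}_{-\Delta}(\rho) &= \cos^{2}\tfrac{\Delta}{2}\,\rho+\sin^{2}\tfrac{\Delta}{2}\,Z\rho Z,\\
\text{(probabilistic)}\quad \tfrac{1}{2}\left\|id-\left(\tfrac{1}{2}\mathcal{R}_{+\Delta}+\tfrac{1}{2}\mathcal{R}_{-\Delta}\right)\right\|_{\diamond} &= \sin^{2}\tfrac{\Delta}{2}\cdot\tfrac{1}{2}\left\|id-\mathcal{Z}\right\|_{\diamond}=\sin^{2}\tfrac{\Delta}{2}=\epsilon^{2}.
\end{align*}
```

The cross terms proportional to $\sin\frac{\Delta}{2}\cos\frac{\Delta}{2}$ cancel in the equal mixture, which is the first-order cancellation behind the $O(\epsilon^{2})$ scaling; the last line uses $\frac{1}{2}\left\|id-\mathcal{Z}\right\|_{\diamond}=1$ .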

However, procedure (i) should be meticulously designed to construct a practical synthesis algorithm. This is because the number of possible gate sequences grows exponentially with respect to the length $n$ of the sequence, resulting in a large degree of freedom in choosing $\{p(\vec{i})\}_{\vec{i}}$ . While a probabilistic synthesis algorithm with guaranteed time complexity exists for single-qubit unitary transformations that correspond to axial rotations (Kliuchnikov et al., 2022), no such algorithm was known even for general single-qubit unitary transformations. Furthermore, a fundamental question remained open regarding the optimality of existing synthesis algorithms in comparison to the minimum approximation error $\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}$ . Minimax optimization makes it difficult to investigate the minimum approximation error from an analytical perspective except for a few specific $\Upsilon$ and sets $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ (Sacchi and Sacchi, 2017).

1.1. Our contribution

We obtain the tight lower bound on $\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}$ , which reveals the fundamental limitation of probabilistic synthesis and indicates the sub-optimality of current algorithms. To obtain the main result, we focus on the analytical relationship between $\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}$ and $\min_{\vec{i}}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond}$ , which represent the minimum approximation error obtained by probabilistic synthesis and that by deterministic synthesis, respectively. To be mathematically comprehensive, we also obtain the tight upper bound on $\min_{p}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}$ , which essentially unifies various upper bounds (Hastings, 2017; Campbell, 2017; Kliuchnikov et al., 2022) depending on the class of target unitary transformations. More precisely, the two bounds are given as the following theorem.

THEOREM 4.3. (simplified version) For an integer $d\geq 2$ specified below, let $\Upsilon$ and $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ be a target unitary transformation and a finite set of unitary transformations on the $d$ -dimensional Hilbert space, respectively. It then holds that

(1)

\frac{4\delta}{d}\left(1-\frac{\delta}{d}\right)\leq\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}\leq\epsilon^{2}\ \ {\rm with}\ \left\{\begin{array}[]{l}\delta=1-\sqrt{1-\epsilon^{2}}\ \ \ {\rm and}\\ \epsilon=\max_{\Upsilon}\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond}.\end{array}\right.

This theorem provides bounds on the worst approximation error caused when one probabilistically synthesizes the target unitary transformation that is most difficult to approximate. As shown in Fig. 1, the gap between the upper and lower bounds exists if and only if $d\geq 3$ . We can show that the gap is inevitable by constructing $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ for achieving the upper bound and that for achieving the lower bound. That is, Ineq. (1) represents the fundamental relationship of the approximation error between the deterministic approximation of unitary transformations and their probabilistic approximation that depends only on the dimension $d$ of the system.

Figure 1. Lower and upper bounds on worst approximation error $\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}$ caused by probabilistic synthesis with respect to $\max_{\Upsilon}\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond}$ caused by deterministic synthesis for two-qubit systems, i.e., $d=4$ . Both lower and upper bounds, represented with thick and thin curves, respectively, are achievable for certain $\{\Upsilon_{\vec{i}}\}$ .
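The two sides of Ineq. (1) are simple closed-form functions of $\epsilon$ and $d$ , so they can be tabulated directly. The following is a minimal sketch under our own naming (the function names `delta_from_eps`, `lower_bound`, and `upper_bound` are hypothetical and not from the paper); it only evaluates the stated formulas and does not solve any optimization problem.

```python
import math


def delta_from_eps(eps):
    """delta = 1 - sqrt(1 - eps^2), as defined alongside Ineq. (1)."""
    return 1.0 - math.sqrt(1.0 - eps * eps)


def lower_bound(eps, d):
    """Left-hand side of Ineq. (1): (4*delta/d) * (1 - delta/d)."""
    delta = delta_from_eps(eps)
    return (4.0 * delta / d) * (1.0 - delta / d)


def upper_bound(eps):
    """Right-hand side of Ineq. (1): eps^2."""
    return eps * eps


if __name__ == "__main__":
    # Tabulate both bounds for a few dimensions and deterministic errors.
    for d in (2, 3, 4):
        for eps in (0.01, 0.1, 0.5):
            lo, hi = lower_bound(eps, d), upper_bound(eps)
            print(f"d={d} eps={eps}: lower={lo:.6g} upper={hi:.6g} gap={hi - lo:.3g}")
```

For $d=2$ the two functions agree up to floating-point rounding, while for $d\geq 3$ the lower bound falls strictly below $\epsilon^{2}$ , consistent with the statement about the gap.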

From a computational point of view, we show that the optimal probability distribution for approximating an $\Upsilon$ can be computed by the semidefinite program (SDP) we construct when the set $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ of unitary transformations implemented as a gate sequence is given. (This set is computable with certain synthesis algorithms.) In addition to its optimality, we can rigorously estimate the worst time complexity of our SDP due to established methods for numerically solving SDPs. As the second main result, we construct a probabilistic synthesis algorithm for single-qubit unitary transformations from the following theorem.

THEOREM 5.4. (informal version) For a given gate set, there exists a probabilistic synthesis algorithm for a single-qubit unitary transformation with

INPUT: a target single-qubit unitary transformation $\Upsilon$ and target approximation error $\epsilon\in\left(0,1\right)$

OUTPUT: a gate sequence implementing a single-qubit unitary transformation $\Upsilon_{\vec{i}}$ sampled from a set $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ in accordance with probability distribution $\hat{p}(\vec{i})$ .

such that the algorithm satisfies the following properties:

• Efficiency: All steps of the algorithm take $polylog\left(\frac{1}{\epsilon}\right)$ -time,

• Quadratic improvement: The approximation error $\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}\hat{p}(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}$ obtained with this algorithm is upper bounded by $\epsilon^{2}$ , whereas the error $\min_{\vec{i}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{\vec{i}}\right\|_{\diamond}$ obtained by deterministic synthesis using the unitary transformations in $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ is upper bounded by $\epsilon$ .

The first property of the algorithm is desirable for fault-tolerant quantum computation (FTQC). The $polylog\left(\frac{1}{\epsilon}\right)$ -time overhead due to the synthesis algorithm does not impair a quadratic speedup achieved with a quantum computer over a classical computer since the approximation error of each unitary transformation should satisfy $\frac{1}{\epsilon}=poly\left(n\right)$ if a quantum circuit contains a polynomial number of single-qubit unitaries with respect to the problem size $n$ . Due to the second property of the algorithm, we can verify that it surpasses current algorithms (Hastings, 2017; Campbell, 2017) with respect to the approximation error. Our algorithm also surpasses a current algorithm (Kliuchnikov et al., 2022) in terms of applicability to a general single-qubit unitary transformation.

1.2. Technical outline

Previous studies searched for the mixing probability distribution $\{p(\vec{i})\}_{\vec{i}}$ by using the first-order approximation of unitary operators (Hastings, 2017; Campbell, 2017) and obtained the upper bound on the worst approximation error $\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{\vec{i}}p(\vec{i})\Upsilon_{\vec{i}}\right\|_{\diamond}$ caused by probabilistic synthesis. In contrast, we use the strong duality of SDP, essentially equivalent to the minimax theorem, to obtain the tight bounds in Ineq. (1), which hold for the optimal mixing probability distribution. A similar technique can be found in the analyses of the optimal convex approximation of quantum states by using a restricted set of states (Sacchi, 2017) and that of unital mappings by using unitary transformations (Yu et al., 2012). While inventing tractable upper bounds on the approximation error of a general unital mapping is an open problem (Yu et al., 2012), we provide an upper bound by exploiting the property of a unitary transformation as a pure unital mapping.

To prove that our single-qubit unitary synthesis algorithm satisfies the expected properties, we show that, for single-qubit unitary transformations, a $\Upsilon_{\vec{i}}$ that is far from $\Upsilon$ need not be sampled to optimally approximate $\Upsilon$ . The proof exploits the magic basis (Bennett et al., 1996) representation of single-qubit unitary operators. The magic basis representation enables us to embed the metric space of single-qubit unitary transformations induced by the diamond norm into that of $S^{3}$ with respect to the angle. While numerical simulations indicate that the same fact holds for qudit unitary transformations, a rigorous proof is a subject for future work.

1.3. Organization

This article is organized as follows. Section 2 is devoted to preliminaries, introducing basic notation in quantum information theory and semidefinite programming. In Section 3, we construct an SDP that computes the optimal probability distribution in probabilistic synthesis. The SDP is provided as a primal and dual problem whose solutions coincide due to the strong duality of the SDP. The coincidence plays a crucial role in the proof of the first main theorem about the fundamental limitation on the approximation error shown in Section 4. Section 5 provides an efficient probabilistic synthesis algorithm for single-qubit unitary transformations as the second main theorem. We also provide a simple geometric interpretation of the superiority of probabilistic synthesis by considering single-qubit unitary transformations corresponding to axial rotations in Section 5.2. We present our conclusions in Section 6.

2. Preliminaries

In this section, we summarize basic notations used throughout the paper. Note that we consider only finite-dimensional Hilbert spaces. In particular, two-dimensional Hilbert space $\mathbb{C}^{2}$ is called a qubit. The $\mathbf{L}\left(\mathcal{H}\right)$ and $\mathbf{Pos}\left(\mathcal{H}\right)$ represent the set of linear operators and positive semidefinite operators on Hilbert space $\mathcal{H}$ , respectively. $\mathbb{I}\in\mathbf{Pos}\left(\mathcal{H}\right)$ represents the identity operator, and we sometimes use the subscript to specify the system where $\mathbb{I}$ acts as $\mathbb{I}_{\mathcal{H}}$ . For Hermitian operators $A$ and $B$ on $\mathcal{H}$ , $A\geq B$ represents $A-B\in\mathbf{Pos}\left(\mathcal{H}\right)$ , and $A>B$ represents $A-B$ is positive definite. The $\mathbf{S}\left(\mathcal{H}\right):=\left\{\rho\in\mathbf{Pos}\left(\mathcal{H}\right):\text{tr}\left[\rho\right]=1\right\}$ and $\mathbf{P}\left(\mathcal{H}\right):=\left\{\rho\in\mathbf{S}\left(\mathcal{H}\right):\text{tr}\left[\rho^{2}\right]=1\right\}$ represent the set of quantum states and that of pure states, respectively. Pure state $\phi\in\mathbf{P}\left(\mathcal{H}\right)$ is sometimes alternatively represented by complex unit vector $|{\phi}\rangle\in\mathcal{H}$ satisfying $\phi=|{\phi}\rangle\langle{\phi}|$ . Any physical transformation of the quantum state can be represented by a completely positive and trace preserving (CPTP) linear mapping $\Gamma:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right)$ . There exists one-to-one correspondence between a linear mapping $\Xi:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right)$ and its Choi-Jamiołkowski operator $J(\Xi):=\sum_{i,j}|{i}\rangle\langle{j}|\otimes\Xi(|{i}\rangle\langle{j}|)\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right)$ .
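To make the Choi-Jamiołkowski operator concrete, the following sketch builds $J(\Xi)$ for a unitary channel $\Xi(\rho)=U\rho U^{\dagger}$ directly from the definition above, using plain nested lists. The function name `choi_of_unitary` and the index convention (input system as the first tensor factor, row index $i\cdot d+a$ ) are our own assumptions for illustration.

```python
def choi_of_unitary(u):
    """Return J(Xi) for Xi(rho) = u rho u^dagger as a (d*d) x (d*d) nested list.

    Implements J(Xi) = sum_{i,j} |i><j| (x) Xi(|i><j|), with the input
    system as the first tensor factor, matching the definition in the text.
    """
    d = len(u)
    dim = d * d
    J = [[0j] * dim for _ in range(dim)]
    for i in range(d):
        for j in range(d):
            # Xi(|i><j|) = u|i><j|u^dagger has entries u[a][i] * conj(u[b][j]).
            for a in range(d):
                for b in range(d):
                    J[i * d + a][j * d + b] = u[a][i] * u[b][j].conjugate()
    return J


if __name__ == "__main__":
    # For the identity channel, J(id) is the unnormalized maximally
    # entangled projector sum_{i,j} |ii><jj|.
    for row in choi_of_unitary([[1, 0], [0, 1]]):
        print(row)
```

For any unitary on $\mathbb{C}^{d}$ the resulting operator has trace $d$ , which is a quick sanity check on the index convention.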

The trace distance $\left\|\rho-\sigma\right\|_{\text{tr}}$ of two quantum states $\rho,\sigma\in\mathbf{S}\left(\mathcal{H}\right)$ is obtained by applying $\left\|M\right\|_{\text{tr}}:=\frac{1}{2}\text{tr}\left[\sqrt{MM^{\dagger}}\right]$ , defined for $M\in\mathbf{L}\left(\mathcal{H}\right)$ , to $M=\rho-\sigma$ . It represents the maximum total variation distance between probability distributions obtained from measurements performed on two quantum states. A similar notion measuring the distinguishability of $\rho$ and $\sigma$ is the fidelity function, defined by $F\left(\rho,\sigma\right):=\max\text{tr}\left[\Phi^{\rho}\Phi^{\sigma}\right]$ , where $\Phi^{\rho}\in\mathbf{P}\left(\mathcal{H}\otimes\mathcal{H}^{\prime}\right)$ is a purification of $\rho$ , i.e., $\rho=\text{tr}_{\mathcal{H}^{\prime}}\left[\Phi^{\rho}\right]$ , and the maximization is taken over all the purifications. Fuchs-van de Graaf inequalities (Fuchs and van de Graaf, 1999) provide relationships between the two measures with respect to the distinguishability as follows:

(2)

1-\sqrt{F\left(\rho,\sigma\right)}\leq\left\|\rho-\sigma\right\|_{\text{tr}}\leq\sqrt{1-F\left(\rho,\sigma\right)}

holds for any states $\rho,\sigma\in\mathbf{S}\left(\mathcal{H}\right)$ , where the equality of the right inequality holds when $\rho$ and $\sigma$ are pure.
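As a hedged numerical check of Ineq. (2) in the simplest setting, the sketch below evaluates both quantities for two pure qubit states, where the right inequality should hold with equality. The helper names and the Bloch-angle parametrization in `pure_state` are assumptions made for this example; the fidelity uses the squared-overlap convention of the text, and the trace distance includes the factor $\frac{1}{2}$ .

```python
import math


def pure_state(theta, phi=0.0):
    """Qubit state vector cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return [
        complex(math.cos(theta / 2)),
        complex(math.cos(phi), math.sin(phi)) * math.sin(theta / 2),
    ]


def fidelity_pure(a, b):
    """F = |<a|b>|^2 for pure states (squared-overlap convention)."""
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    return abs(inner) ** 2


def trace_distance_pure(a, b):
    """(1/2) tr|rho - sigma| for rho = |a><a| and sigma = |b><b| on a qubit.

    rho - sigma is traceless and Hermitian, so its eigenvalues are +lam and
    -lam with lam = sqrt(m00^2 + |m01|^2); the trace distance equals lam.
    """
    m00 = abs(a[0]) ** 2 - abs(b[0]) ** 2
    m01 = a[0] * a[1].conjugate() - b[0] * b[1].conjugate()
    return math.sqrt(m00 ** 2 + abs(m01) ** 2)


if __name__ == "__main__":
    a, b = pure_state(0.3, 0.2), pure_state(1.1, -0.7)
    F, D = fidelity_pure(a, b), trace_distance_pure(a, b)
    print("1 - sqrt(F) =", 1 - math.sqrt(F))
    print("trace dist  =", D)
    print("sqrt(1 - F) =", math.sqrt(1 - F))
```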

The distance measuring the distinguishability of two CPTP mappings $\mathcal{A},\mathcal{B}:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right)$ corresponding to the trace distance is the diamond norm $\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}$ defined by $\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}:=\max_{\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)}\left\|((\mathcal{A}-\mathcal{B})\otimes id)(\Phi)\right\|_{\text{tr}}$ , where $id$ represents the identity mapping acting on $\mathcal{H}_{3}$ .
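For two unitary channels the maximization in this definition has a closed form, which the paper derives for general $d$ in Eqs. (16)-(19): $\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}=\sqrt{1-\min_{\rho}|\text{tr}\left[\rho U^{\dagger}U_{x}\right]|^{2}}$ . The sketch below specializes this to a single qubit. The simplification $\min_{\rho}|\text{tr}\left[\rho W\right]|=|\text{tr}\left[W\right]|/2$ for a $2\times 2$ unitary $W$ is our own addition (the closest point to the origin on the chord joining the two eigenvalues of $W$ is its midpoint) and does not extend to $d\geq 3$ ; all function names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rz(theta):
    """Rotation about the Z axis: diag(e^{-i theta/2}, e^{+i theta/2})."""
    return [[cmath.exp(-0.5j * theta), 0j], [0j, cmath.exp(0.5j * theta)]]


def qubit_unitary_distance(u, v):
    """(1/2)||U - V||_diamond for the channels of 2x2 unitaries u and v.

    Qubit-only shortcut: sqrt(1 - (|tr(u^dagger v)| / 2)^2).
    """
    w = matmul(dagger(u), v)
    overlap = abs(w[0][0] + w[1][1]) / 2.0
    # Clamp to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - overlap ** 2))


if __name__ == "__main__":
    print(qubit_unitary_distance(rz(0.0), rz(0.2)), math.sin(0.1))
```

The distance is insensitive to a global phase of either unitary, as expected for a distance between channels rather than operators.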

Let $\Xi:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right)$ be a linear Hermitian-preserving mapping and $A$ and $B$ be Hermitian operators on $\mathcal{H}_{1}$ and $\mathcal{H}_{2}$ , respectively. SDP is an optimization problem formally defined with a triple $(\Xi,A,B)$ as follows (Watrous, 2018):

(3)

Primal problem		Dual problem
maximize:	$\text{tr}\left[AX\right]$	minimize:	$\text{tr}\left[BY\right]$
subject to:	$X\in\mathbf{Pos}\left(\mathcal{H}_{1}\right)$ ,	subject to:	$Y\text{\ is\ a\ Hermitian\ operator\ on\ }\mathcal{H}_{2}$ ,
	$\Xi(X)=B$		$\Xi^{\dagger}(Y)\geq A$ ,

where $\Xi^{\dagger}:\mathbf{L}\left(\mathcal{H}_{2}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{1}\right)$ is the adjoint of $\Xi$ , defined as the linear mapping satisfying $\text{tr}\left[Y^{\dagger}\Xi(X)\right]=\text{tr}\left[(\Xi^{\dagger}(Y))^{\dagger}X\right]$ for all $X\in\mathbf{L}\left(\mathcal{H}_{1}\right)$ and $Y\in\mathbf{L}\left(\mathcal{H}_{2}\right)$ . We can easily verify that the optimal value of the primal problem is smaller than or equal to that of the dual problem. The situation in which the two optimal values coincide is called strong duality. Slater’s theorem states that strong duality holds if either of the following conditions holds:

(1) The solution to the primal problem is finite, and there exists a Hermitian operator $Y$ on $\mathcal{H}_{2}$ such that $\Xi^{\dagger}(Y)>A$ .

(2) The solution to the dual problem is finite, and there exists a positive definite operator $X$ on $\mathcal{H}_{1}$ such that $\Xi(X)=B$ .
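The claim that the primal value never exceeds the dual value can be checked in one line. The chain below is a short derivation we add for completeness; it uses only the feasibility conditions in Eq. (3) and the definition of the adjoint, for any primal-feasible $X$ and dual-feasible $Y$ .

```latex
\begin{align*}
\text{tr}\left[AX\right]
 &\leq \text{tr}\left[\Xi^{\dagger}(Y)X\right]
   && \text{since } \Xi^{\dagger}(Y)-A\geq 0 \text{ and } X\geq 0,\\
 &= \text{tr}\left[Y\,\Xi(X)\right]
   && \text{by the definition of } \Xi^{\dagger} \text{ with } Y \text{ and } \Xi^{\dagger}(Y) \text{ Hermitian},\\
 &= \text{tr}\left[BY\right]
   && \text{since } \Xi(X)=B.
\end{align*}
```

Taking the maximum over $X$ and the minimum over $Y$ gives the stated inequality between the two optimal values; Slater's conditions are what close the remaining gap.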

For a metric space $(X,d)$ and two subsets $S,T\subseteq X$ , $S$ is called an $\epsilon$ -covering of $T$ if $\sup_{t\in T}\inf_{s\in S}d(s,t)\leq\epsilon$ . In this article, we basically assume that $X$ is the set of CPTP mappings, the metric is defined as $d(\mathcal{A},\mathcal{B})=\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}$ , $S$ is a finite set of unitary transformations and $T$ is a subset of unitary transformations such as a $2\epsilon$ -ball $\left\{\Upsilon^{\prime}:\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}\leq 2\epsilon\right\}$ around a unitary transformation $\Upsilon:\mathbf{L}\left(\mathcal{H}\right)\rightarrow\mathbf{L}\left(\mathcal{H}\right)$ .

3. Semidefinite programming for computing optimal mixing probability

In this section, we construct an SDP for computing the optimal probability distribution that minimizes the diamond norm between the target CPTP mapping $\mathcal{A}$ and a probabilistic mixture of CPTP mappings $\{\mathcal{B}_{x}\}_{x}$ . We can compute the optimal probability distribution in probabilistic unitary synthesis by solving this SDP with $\mathcal{A}$ and $\{\mathcal{B}_{x}\}_{x}$ restricted to unitary transformations. We also mention the relationship between our SDP and the algorithm proposed by Campbell (Campbell, 2017).

Proposition 3.1.

Let $\mathcal{A}$ and $\{\mathcal{B}_{x}\}_{x\in X}$ be a target CPTP mapping and a finite set of CPTP mappings from $\mathbf{L}\left(\mathcal{H}_{1}\right)$ to $\mathbf{L}\left(\mathcal{H}_{2}\right)$ , respectively. Then, distance $\min_{p}\frac{1}{2}\left\|\mathcal{A}-\sum_{x\in X}p(x)\mathcal{B}_{x}\right\|_{\diamond}$ and the optimal probability distribution $\{p(x)\}_{x\in X}$ , which minimizes the distance, can be computed with the following SDP:

(4)

Primal problem		Dual problem
maximize:	$\text{tr}\left[J(\mathcal{A})T\right]-t$	minimize:	$r\in\mathbb{R}$
subject to:	$0\leq T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}$ ,	subject to:	$S\geq 0\wedge S\geq J\left(\mathcal{A}-\sum_{x\in X}p(x)\mathcal{B}_{x}\right)$ ,
	$\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right)$		$r\mathbb{I}_{\mathcal{H}_{1}}\geq\text{tr}_{\mathcal{H}_{2}}\left[S\right]$ ,
	$\forall x\in X,\text{tr}\left[J(\mathcal{B}_{x})T\right]\leq t$ .		$\forall x\in X,p(x)\geq 0$ ,
			$\sum_{x\in X}p(x)\leq 1$ .

Note that the strong duality holds in this SDP, i.e., the optimum primal and dual values are equal.

Proof.

Recall that for two CPTP mappings $\mathcal{A}$ and $\mathcal{B}$ from $\mathbf{L}\left(\mathcal{H}_{1}\right)$ to $\mathbf{L}\left(\mathcal{H}_{2}\right)$ , $\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}$ can be computed by the following SDP:

Primal problem		Dual problem
maximize:	$\text{tr}\left[J(\mathcal{A}-\mathcal{B})T\right]$	minimize:	$r\in\mathbb{R}$
subject to:	$0\leq T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}$ ,	subject to:	$S\geq 0\wedge S\geq J(\mathcal{A}-\mathcal{B})$ ,
	$\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right)$ .		$r\mathbb{I}_{\mathcal{H}_{1}}\geq\text{tr}_{\mathcal{H}_{2}}\left[S\right]$ .

The primal problem can be obtained by observing

(5)		$\displaystyle\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}$	$\displaystyle=$	$\displaystyle\max_{\begin{subarray}{c}\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)\\ \Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3})\end{subarray}}\text{tr}\left[((\mathcal{A}-\mathcal{B})\otimes id)(\Phi)\Pi\right]$
(6)			$\displaystyle=$	$\displaystyle\max_{T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})}\text{tr}\left[J(\mathcal{A}-\mathcal{B})T\right],$

where $\Pi$ is a Hermitian projector acting on $\mathcal{H}_{2}\otimes\mathcal{H}_{3}$ , $\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}\}$ is called the set of measuring strategies (Gutoski and Watrous, 2007) or that of quantum testers (Chiribella et al., 2009), and the last equality was shown by Chiribella et al. (Chiribella et al., 2009, Theorem 10). To be self-contained, we provide a proof in Appendix A, from which the equality follows by applying Eq. (60) with $\Xi=\mathcal{A}-\mathcal{B}$ . A formal SDP and the verification of the strong duality are provided in Appendix B.

By extending the dual problem of this SDP to include the minimization over the probability distribution $\{p(x)\}_{x\in X}$ , we obtain Eq. (4). Note that the last condition $\sum_{x\in X}p(x)\leq 1$ in the dual problem is different from the condition $\sum_{x\in X}p(x)=1$ of a probability distribution; however, the optimum dual value can be achieved under the latter condition. Again, a formal SDP and the verification of the strong duality are provided in Appendix B. ∎

For a given $\Upsilon:\mathbf{L}\left(\mathcal{H}\right)\rightarrow\mathbf{L}\left(\mathcal{H}\right)$ and a given set $\{\Upsilon_{x}:\mathbf{L}\left(\mathcal{H}\right)\rightarrow\mathbf{L}\left(\mathcal{H}\right)\}_{x\in X}$ of unitary transformations implemented as a gate sequence, which forms an $\epsilon$ -covering of the set of unitary transformations with sufficiently small $\epsilon$ , “the convex hull finding algorithm” proposed by Campbell (Campbell, 2017) can find a probability distribution $\{\tilde{p}(x)\}_{x\in\tilde{X}}$ such that $\sum_{x\in\tilde{X}}\tilde{p}(x)H_{x}=0$ , where $\Upsilon_{x}(\rho)=\Upsilon(e^{iH_{x}}\rho e^{-iH_{x}})$ and $H_{x}=O(\epsilon)$ for all $x\in\tilde{X}\subseteq X$ . Note that $M=O(\epsilon)$ represents $\left\|M\right\|_{\infty}=O(\epsilon)$ as $\epsilon\rightarrow 0$ for a linear operator $M\in\mathbf{L}\left(\mathcal{H}\right)$ depending on $\epsilon$ . By using the dual problem in Proposition 3.1, we can verify that the distance $\epsilon$ , which is achievable by a deterministic unitary synthesis finding the closest $\Upsilon_{x}$ to approximate $\Upsilon$ , can be improved to $O(\epsilon^{2})$ by mixing unitaries in accordance with the probability distribution $\{\tilde{p}(x)\}_{x\in\tilde{X}}$ as follows. First, by using the dual problem of the SDP to compute the diamond norm between two CPTP mappings, we obtain

(7)		$\displaystyle\frac{1}{2}\left\|\Upsilon-\sum_{x\in\tilde{X}}\tilde{p}(x)\Upsilon_{x}\right\|_{\diamond}=\frac{1}{2}\left\|id-\sum_{x\in\tilde{X}}\tilde{p}(x)\Upsilon^{-1}\circ\Upsilon_{x}\right\|_{\diamond}\leq\left\|\text{tr}_{\mathcal{H}^{\prime}}\left[S\right]\right\|_{\infty}$
(8)		$\displaystyle{\rm with}\ S\geq 0\wedge S\geq J(id)-\sum_{x\in\tilde{X}}\tilde{p}(x)J(\Upsilon^{-1}\circ\Upsilon_{x}),$

where $\mathcal{H}^{\prime}$ represents the output system of $\Upsilon$ , which is isomorphic to $\mathcal{H}$ . Second, by using the Taylor expansions $e^{iH_{x}}=\mathbb{I}+iH_{x}+R_{x}$ , where $R_{x}=O(\epsilon^{2})$ , we obtain

(10)	$\displaystyle J(id)-\sum_{x\in\tilde{X}}\tilde{p}(x)J(\Upsilon^{-1}\circ\Upsilon_{x})$	$\displaystyle=$	$\displaystyle\sum_{x\in\tilde{X}}\tilde{p}(x)\left\{-(R_{x}J(id)+J(id)R_{x}^{\dagger})-i(H_{x}J(id)R_{x}^{\dagger}-R_{x}J(id)H_{x})\right\}-P$
		$\displaystyle\leq$	$\displaystyle\sum_{x\in\tilde{X}}\tilde{p}(x)\Big\{\left(\frac{1}{\left\|R_{x}\right\|_{\infty}}R_{x}J(id)R_{x}^{\dagger}+\left\|R_{x}\right\|_{\infty}J(id)\right)$
			$\displaystyle\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ +\left(\frac{\left\|R_{x}\right\|_{\infty}}{\left\|H_{x}\right\|_{\infty}}H_{x}J(id)H_{x}+\frac{\left\|H_{x}\right\|_{\infty}}{\left\|R_{x}\right\|_{\infty}}R_{x}J(id)R_{x}^{\dagger}\right)\Big\}$

where $P=\sum_{x\in\tilde{X}}\tilde{p}(x)(H_{x}J(id)H_{x}+R_{x}J(id)R_{x}^{\dagger})\in\mathbf{Pos}\left(\mathcal{H}\otimes\mathcal{H}^{\prime}\right)$ , $H_{x}$ and $R_{x}$ act on $\mathcal{H}^{\prime}$ and we use the fact that $|{\tilde{\phi}}\rangle\langle{\tilde{\psi}}|+|{\tilde{\psi}}\rangle\langle{\tilde{\phi}}|\leq\tilde{\phi}+\tilde{\psi}$ with complex vectors $(|{\tilde{\phi}}\rangle,|{\tilde{\psi}}\rangle)=\left(\left\|R_{x}\right\|_{\infty}^{-\frac{1}{2}}\sum_{j}(\mathbb{I}_{\mathcal{H}}\otimes R_{x})|{jj}\rangle,\left\|R_{x}\right\|_{\infty}^{\frac{1}{2}}\sum_{j}|{jj}\rangle\right)$ and $(|{\tilde{\phi}}\rangle,|{\tilde{\psi}}\rangle)=\left(\left\|R_{x}\right\|_{\infty}^{\frac{1}{2}}\left\|H_{x}\right\|_{\infty}^{-\frac{1}{2}}\sum_{j}(\mathbb{I}_{\mathcal{H}}\otimes H_{x})|{jj}\rangle,i\left\|R_{x}\right\|_{\infty}^{-\frac{1}{2}}\left\|H_{x}\right\|_{\infty}^{\frac{1}{2}}\sum_{j}(\mathbb{I}_{\mathcal{H}}\otimes R_{x})|{jj}\rangle\right)$ in the inequality. Third, by letting $S$ in Eq. (8) be the R.H.S. of Eq. (10), we obtain

(11)

\displaystyle\left\|\text{tr}_{\mathcal{H}^{\prime}}\left[S\right]\right\|_{\infty}=\left\|\sum_{x\in\tilde{X}}\tilde{p}(x)\left(\frac{(R_{x}^{\dagger}R_{x})^{T}}{\left\|R_{x}\right\|_{\infty}}+\left\|R_{x}\right\|_{\infty}\mathbb{I}_{\mathcal{H}}+\frac{\left\|R_{x}\right\|_{\infty}(H_{x}^{2})^{T}}{\left\|H_{x}\right\|_{\infty}}+\frac{\left\|H_{x}\right\|_{\infty}(R_{x}^{\dagger}R_{x})^{T}}{\left\|R_{x}\right\|_{\infty}}\right)\right\|_{\infty}=O(\epsilon^{2}).

Since the approximation error $\frac{1}{2}\left\|\Upsilon-\sum_{x\in\tilde{X}}\tilde{p}(x)\Upsilon_{x}\right\|_{\diamond}$ is generally worse than the optimal one $\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}$ , we can obtain a better probability distribution and better estimation of the approximation error by numerically solving the SDP shown in Proposition 3.1. The ellipsoid method guarantees that $\{p(x)\}_{x\in X}$ and $r$ in the dual problem such that the difference between $r$ and the optimum dual value is less than $\epsilon$ can be computed in $poly\left(|X|\log\left(\frac{1}{\epsilon}\right)\right)$ -time (Lovász, 2003). Note that we assume the dimension of the Hilbert space is constant since the unitary synthesis is usually executed for $\Upsilon$ on a fixed-size system.

4. Tight bounds on error of probabilistic approximation

This section investigates the relationship between the discrete approximation of unitary transformations and the probabilistic approximation for a general $\Upsilon$ and general set $\{\Upsilon_{x}\}_{x}$ . Specifically, we show the tight relationship between $\min_{p}\left\|\Upsilon-\sum_{x}p(x)\Upsilon_{x}\right\|_{\diamond}$ and $\min_{x}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}$ , where the former represents the minimum approximation error obtained by probabilistic synthesis and the latter represents that by deterministic synthesis when $\{\Upsilon_{x}\}_{x}$ is a set of unitary transformations implemented as a gate sequence. The first lemma shows the fundamental limitation of probabilistic synthesis, and the second one shows its superiority over deterministic synthesis.

Lemma 4.1.

For an integer $d\geq 2$ specified below, let $\Upsilon:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)$ and $\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X}$ be a target unitary transformation and a finite set of unitary transformations, respectively. Then

(14)

\displaystyle\frac{2}{d}\epsilon^{2}\leq\frac{4\delta}{d}\left(1-\frac{\delta}{d}\right)\leq\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}\ {\rm with}\ \left\{\begin{array}[]{l}\delta=1-\sqrt{1-\epsilon^{2}}\ \ \ {\rm and}\\ \epsilon=\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\end{array}\right.

holds, where the minimization of $p$ is taken over probability distributions over $X$ .

Proof.

The first inequality can be straightforwardly verified as follows:

(15)

\frac{2}{d}\epsilon^{2}=\frac{4\delta}{d}\left(1-\frac{\delta}{2}\right)\leq\frac{4\delta}{d}\left(1-\frac{\delta}{d}\right).
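The identity and inequality in Eq. (15) follow from $\epsilon^{2}=1-(1-\delta)^{2}=2\delta\left(1-\frac{\delta}{2}\right)$ and $d\geq 2$. The following is a minimal numerical sanity check in Python; the function names are hypothetical and chosen only for this illustration.

```python
import math


def delta_from_eps(eps: float) -> float:
    """delta = 1 - sqrt(1 - eps^2), as defined in Ineq. (14)."""
    return 1.0 - math.sqrt(1.0 - eps * eps)


def eq15_terms(eps: float, d: int):
    """Return the three quantities compared in Eq. (15)."""
    delta = delta_from_eps(eps)
    left = 2.0 * eps * eps / d
    middle = 4.0 * delta / d * (1.0 - delta / 2.0)
    right = 4.0 * delta / d * (1.0 - delta / d)
    return left, middle, right


# Spot-check a few dimensions and error values (tolerances absorb rounding).
for d in (2, 3, 8):
    for eps in (0.01, 0.3, 1.0):
        left, middle, right = eq15_terms(eps, d)
        assert abs(left - middle) < 1e-12   # the equality in Eq. (15)
        assert left <= right + 1e-12        # the inequality, using d >= 2
```

For $d=2$ the middle and right-hand terms coincide, so the first inequality in Ineq. (14) holds with equality in the single-qubit case.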

Thus, we prove the second inequality. First, by computing the diamond norm between $\Upsilon$ and $\Upsilon_{x}$ , we obtain

\begin{align}
\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}&=\max_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}\left\|\Upsilon\otimes id_{\mathbb{C}^{d}}(\Phi)-\Upsilon_{x}\otimes id_{\mathbb{C}^{d}}(\Phi)\right\|_{\text{tr}} \tag{16}\\
&=\max_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}\sqrt{1-F\left(\Upsilon\otimes id_{\mathbb{C}^{d}}(\Phi),\Upsilon_{x}\otimes id_{\mathbb{C}^{d}}(\Phi)\right)} \tag{17}\\
&=\sqrt{1-\min_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}\left|\langle{\Phi}|U^{\dagger}U_{x}\otimes\mathbb{I}_{\mathbb{C}^{d}}|{\Phi}\rangle\right|^{2}} \tag{18}\\
&=\sqrt{1-\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho U^{\dagger}U_{x}\right]\right|^{2}}, \tag{19}
\end{align}

where $\Upsilon(\rho)=U\rho U^{\dagger}$ and $\Upsilon_{x}(\rho)=U_{x}\rho U_{x}^{\dagger}$ . This indicates

(20)

1-\delta=\max_{x\in X}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}|\text{tr}\left[\rho U^{\dagger}U_{x}\right]|.

Next, by using the primal problem in our SDP in Proposition 3.1, we obtain

\begin{align}
\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}&=\max_{T\in\mathbf{T}(\mathbb{C}^{d}:\mathbb{C}^{d})}\left(\text{tr}\left[J(\Upsilon)T\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon_{x})T\right]\right) \tag{21}\\
&\geq\frac{1}{d^{2}}\left(\text{tr}\left[J(\Upsilon)J(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon_{x})J(\Upsilon)\right]\right) \tag{22}\\
&=1-\frac{1}{d^{2}}\max_{x\in X}\left|\text{tr}\left[U^{\dagger}U_{x}\right]\right|^{2}, \tag{23}
\end{align}

where $\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}\}$ , and we set $T=\frac{1}{d^{2}}J(\Upsilon)\left(\leq\frac{\mathbb{I}_{\mathbb{C}^{d}}}{d}\otimes\mathbb{I}_{\mathbb{C}^{d}}\right)$ to obtain the inequality.

In Eq. (20) and Eq. (23), the same unitary operator $W=U^{\dagger}U_{x}$ appears in the term $\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}|\text{tr}\left[\rho W\right]|$ and $|\text{tr}\left[W\right]|$ , respectively. We can prove the second inequality in Ineq. (14) by establishing a relationship between the two terms as follows. For any unitary operator $W$ on $\mathbb{C}^{d}$ $(d\geq 2)$ ,

(24)

\displaystyle\frac{1}{d}\left|\text{tr}\left[W\right]\right|=\frac{1}{d}\left|\sum_{i=1}^{d}\lambda_{i}(W)\right|\leq\frac{2}{d}\min_{p}\left|\sum_{i=1}^{d}p(i)\lambda_{i}(W)\right|+\frac{d-2}{d}=\frac{2}{d}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho W\right]\right|+\frac{d-2}{d}

holds, where $\lambda_{i}(W)$ is the $i$ -th eigenvalue of $W$ , and in the inequality, we use the following two facts: (i) the minimization is achieved only if $p$ satisfies $\forall i,p(i)\leq\frac{1}{2}$ due to a geometric observation, and (ii) for such $p$ and complex numbers $\lambda_{i}\in\{z\in\mathbb{C}:|z|=1\}$ , $\left|\sum_{i}p(i)\lambda_{i}\right|\geq\left|\sum_{i}\frac{1}{2}\lambda_{i}\right|-\left|\sum_{i}\left(\frac{1}{2}-p(i)\right)\lambda_{i}\right|\geq\frac{1}{2}\left|\sum_{i}\lambda_{i}\right|-\sum_{i}\left(\frac{1}{2}-p(i)\right)=\frac{1}{2}\left|\sum_{i}\lambda_{i}\right|-\frac{d-2}{2}$ . ∎
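Ineq. (24) can be spot-checked numerically. The sketch below works only with the eigenphases of $W$ and uses the fact that $\min_{p}\left|\sum_{i}p(i)\lambda_{i}\right|$ is the distance from the origin to the convex hull of the eigenvalues on the unit circle. The function names and the sampling scheme are assumptions made for this illustration.

```python
import cmath
import math
import random


def min_convex_modulus(angles):
    """min over probability vectors p of |sum_i p_i exp(i * angle_i)|.

    This is the distance from the origin to the convex hull of points on
    the unit circle: 0 if the points are not contained in an open half
    plane, and cos(spread / 2) otherwise, where spread < pi is the length
    of the arc they span.
    """
    a = sorted(x % (2 * math.pi) for x in angles)
    gaps = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    gaps.append(a[0] + 2 * math.pi - a[-1])  # wrap-around gap
    largest = max(gaps)
    if largest <= math.pi:
        return 0.0
    return math.cos((2 * math.pi - largest) / 2)


def eq24_sides(angles):
    """Left- and right-hand sides of Ineq. (24) for W with the given eigenphases."""
    d = len(angles)
    trace = sum(cmath.exp(1j * t) for t in angles)
    lhs = abs(trace) / d
    rhs = 2.0 / d * min_convex_modulus(angles) + (d - 2) / d
    return lhs, rhs


random.seed(0)
for _ in range(1000):
    d = random.randint(2, 6)
    width = random.uniform(0.0, 2 * math.pi)  # eigenphases drawn from a random window
    angles = [random.uniform(0.0, width) for _ in range(d)]
    lhs, rhs = eq24_sides(angles)
    assert lhs <= rhs + 1e-12
```

The inequality is saturated, for example, when all eigenvalues coincide, and for $d=2$ it holds with equality for every $W$.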

To the best of our knowledge, the dependence of the approximation error obtained by probabilistic synthesis on the dimension of the Hilbert space shown in this lemma has not been reported previously. This dependence is inevitable since the lemma is sharp, as we show in Appendix C. More precisely, we can show that for any real number $\epsilon\in(0,1]$ , any integer $d\geq 2$ and any $\Upsilon$ , there exists $\{\Upsilon_{x}\}_{x\in X}$ achieving the lower bound in Ineq. (14).

In the following lemma, we show the tight upper bound showing that the worst approximation error caused by deterministic synthesis can be reduced by probabilistic synthesis at least quadratically. Our upper bound slightly improves the various existing upper bounds (Hastings, 2017; Campbell, 2017), which have been proven for several classes of target unitary transformations and $\epsilon$ -coverings $\{\Upsilon_{x}\}_{x\in X}$ with small $\epsilon$ . Using Proposition 5.5, shown in the next section, we can verify that our upper bound is still tight even if we consider the approximation of axial single-qubit unitary transformations.

Lemma 4.2.

For a non-negative real number $\epsilon\geq 0$ and integer $d\geq 2$ specified below, if $\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X}$ is a finite $\epsilon$ -covering of the set of unitary transformations, i.e., $\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon$ , then

(25)

\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{2}

holds for any unitary transformation $\Upsilon$ , where the minimization of $p$ is taken over probability distributions over $X$ .

Proof.

First, by using the primal problem in our SDP in Proposition 3.1, we obtain

\begin{align}
(L.H.S.)&=\max_{T\in\mathbf{T}(\mathbb{C}^{d}:\mathbb{C}^{d})}\left(\text{tr}\left[J(\Upsilon)T\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon_{x})T\right]\right) \tag{26}\\
&=\max_{\begin{subarray}{c}\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathcal{H}\right)\\ \Pi\in\mathbf{Proj}(\mathbb{C}^{d}\otimes\mathcal{H})\end{subarray}}\Big(\text{tr}\left[(U\otimes\mathbb{I}_{\mathcal{H}})\Phi(U\otimes\mathbb{I}_{\mathcal{H}})^{\dagger}\Pi\right]-\max_{x\in X}\text{tr}\left[(U_{x}\otimes\mathbb{I}_{\mathcal{H}})\Phi(U_{x}\otimes\mathbb{I}_{\mathcal{H}})^{\dagger}\Pi\right]\Big), \tag{27}
\end{align}

where $\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}\}$ , $\Upsilon(\rho)=U\rho U^{\dagger}$ , $\Upsilon_{x}(\rho)=U_{x}\rho U_{x}^{\dagger}$ , $\mathbf{Proj}(\mathcal{H})$ is the set of Hermitian projectors on $\mathcal{H}$ , and we use Eq. (60) by taking $\Xi\in\{\Upsilon-\Upsilon_{x}\}_{x\in X}$ to obtain the last equality.

Let $\hat{\Phi}$ and $\hat{\Pi}$ maximize Eq. (27). We can verify that $\hat{\Pi}U|{\hat{\Phi}}\rangle=0$ if and only if there exists $x\in X$ such that $\Upsilon_{x}=\Upsilon$ . If $\hat{\Pi}U|{\hat{\Phi}}\rangle\neq 0$ , let $\hat{\Psi}$ be the pure state such that $|{\hat{\Psi}}\rangle\propto\hat{\Pi}U|{\hat{\Phi}}\rangle$ . Then, we can verify that Eq. (27) is still maximized even if we replace $\hat{\Pi}$ with $\hat{\Psi}$ . If $\hat{\Pi}U|{\hat{\Phi}}\rangle=0$ , $(\exists x\in X,\Upsilon_{x}=\Upsilon)$ indicates that Eq. (27) is still maximized even if we replace $\hat{\Pi}$ and $\hat{\Phi}$ with an arbitrary pure state $\hat{\Psi}$ and $(\Upsilon^{-1}\otimes id_{\mathcal{H}})(\hat{\Psi})$ , respectively. Thus, in both cases, $\Pi$ in Eq. (27) can be restricted as a pure state, i.e., $\Pi=\Psi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathcal{H}\right)$ , and we proceed as follows:

(28)

\displaystyle{\rm Eq.}~(27)=\max_{\Phi,\Psi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathcal{H}\right)}\Big(|\langle{\Psi}|U\otimes\mathbb{I}_{\mathcal{H}}|{\Phi}\rangle|^{2}-\max_{x\in X}|\langle{\Psi}|U_{x}\otimes\mathbb{I}_{\mathcal{H}}|{\Phi}\rangle|^{2}\Big).

Before proceeding to the next step, we show that the set of mappings $f_{\Phi,\Psi}:U\mapsto|\langle{\Psi}|U\otimes\mathbb{I}_{\mathcal{H}}|{\Phi}\rangle|$ associated with pure states $\Phi$ and $\Psi$ is equivalent to that of mappings $g_{A}:U\mapsto\left|\text{tr}\left[AU\right]\right|$ associated with linear operator $A\in\mathbf{L}\left(\mathbb{C}^{d}\right)$ such that $\left\|A\right\|_{1}\leq 1$ , where $\left\|A\right\|_{1}$ is the Schatten $1$ -norm of $A$ . By using decompositions $|{\Phi}\rangle=\sum_{i,j}\alpha_{ij}|{i}\rangle|{j}\rangle$ and $|{\Psi}\rangle=\sum_{i,j}\beta_{ij}|{i}\rangle|{j}\rangle$ with respect to orthonormal bases, we can verify that $g_{A}$ with $A=\sum_{i,j,k}\alpha_{ik}\beta^{*}_{jk}|{i}\rangle\langle{j}|$ is equal to $f_{\Phi,\Psi}$ and $\left\|A\right\|_{1}=\max_{U}g_{A}(U)=\max_{U}f_{\Phi,\Psi}(U)\leq 1$ . On the other hand, by using the singular value decomposition $A=\sum_{i}p_{i}|{x_{i}}\rangle\langle{y_{i}}|$ , where $\left\|A\right\|_{1}\leq 1$ indicates $p+\sum_{i}p_{i}=1$ with some $p\geq 0$ , we can verify that $f_{\Phi,\Psi}$ with $|{\Phi}\rangle=\sqrt{p}|{0}\rangle|{\bot}\rangle+\sum_{i}\sqrt{p_{i}}|{x_{i}}\rangle|{i}\rangle$ and $|{\Psi}\rangle=\sqrt{p}|{0}\rangle|{\bot^{\prime}}\rangle+\sum_{i}\sqrt{p_{i}}|{y_{i}}\rangle|{i}\rangle$ ( $\{|{i}\rangle\}_{i}\cup\{|{\bot}\rangle,|{\bot^{\prime}}\rangle\}$ is an orthonormal basis) is equal to $g_{A}$ .

By using the equivalence between the two sets of mappings, we proceed as follows:

(29)

\displaystyle{\rm Eq.}~(28)=\max_{A:\left\|A\right\|_{1}\leq 1}\left(\left|\text{tr}\left[AU\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[AU_{x}\right]\right|^{2}\right)=\max_{V,\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left(\left|\text{tr}\left[\rho V^{\dagger}U\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2}\right),

where we use the fact that the maximization is achieved when $\left\|A\right\|_{1}=1$ and use the polar decomposition $A=\rho V^{\dagger}$ with a unitary operator $V$ acting on $\mathbb{C}^{d}$ .

By using Eq. (29), we obtain

\begin{align}
\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}&=\max_{V,\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left(\max_{U}\left|\text{tr}\left[\rho V^{\dagger}U\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2}\right) \tag{30}\\
&=1-\min_{V,\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\max_{x\in X}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2} \tag{31}\\
&\leq 1-\min_{V}\max_{x\in X}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho V^{\dagger}U_{x}\right]\right|^{2}, \tag{32}
\end{align}

where the maximization of $\Upsilon$ is taken over unitary transformations, and we use the fact that $\max_{x}\min_{y}f(x,y)\leq\min_{y}\max_{x}f(x,y)$ for any $f$ if the maximum and minimum exist in the inequality. Using Eq. (19) completes the proof. ∎

The combination of Lemmas 4.1 and 4.2 can be summarized as the following theorem.

Theorem 4.3.

(36)

\displaystyle\frac{4\delta_{\Upsilon}}{d}\left(1-\frac{\delta_{\Upsilon}}{d}\right)\leq\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{2}\ {\rm with}\ \left\{\begin{array}[]{l}\delta_{\Upsilon}=1-\sqrt{1-\epsilon_{\Upsilon}^{2}}\\ \epsilon_{\Upsilon}=\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\\ \epsilon=\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\end{array}\right.

holds, where the maximization of $\Upsilon$ and minimization of $p$ are taken over unitary transformations on $\mathbf{L}\left(\mathbb{C}^{d}\right)$ and probability distributions over $X$ , respectively.

By taking the maximum over all unitary transformations $\Upsilon$ , we obtain Ineq. (1) as a simplified version of this theorem. As mentioned in the introduction, in Appendix C, we show that both the upper and lower bounds in Ineq. (1) are tight, i.e., for any real number $\epsilon\in(0,1]$ and any integer $d\geq 2$ , the two bounds are achievable for some $\{\Upsilon_{\vec{i}}\}_{\vec{i}}$ .
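To get a feeling for the window that Ineq. (36) leaves for the optimal probabilistic error, the two bounds can be evaluated directly. The following Python sketch does this; the function name and argument names are hypothetical, with `eps_target` standing for $\epsilon_{\Upsilon}$ and `eps_cover` for $\epsilon$.

```python
import math


def probabilistic_error_bounds(eps_target: float, eps_cover: float, d: int):
    """Lower and upper bounds of Ineq. (36) on the optimal probabilistic error.

    eps_target: deterministic error for the given target (epsilon_Upsilon).
    eps_cover:  worst-case deterministic error over all targets (epsilon).
    d:          dimension of the Hilbert space (d >= 2).
    """
    if d < 2 or not (0.0 <= eps_target <= eps_cover <= 1.0):
        raise ValueError("need d >= 2 and 0 <= eps_target <= eps_cover <= 1")
    delta = 1.0 - math.sqrt(1.0 - eps_target ** 2)
    lower = 4.0 * delta / d * (1.0 - delta / d)
    upper = eps_cover ** 2
    return lower, upper


# Single qubit, target at the worst-case distance: the two bounds coincide.
lower, upper = probabilistic_error_bounds(0.1, 0.1, 2)
assert math.isclose(lower, upper, rel_tol=1e-9)

# Larger dimension: the lower bound shrinks roughly like 1/d.
lower, upper = probabilistic_error_bounds(0.1, 0.1, 4)
assert lower < upper
```

For $d=2$ and $\epsilon_{\Upsilon}=\epsilon$, Eq. (15) gives $\frac{4\delta}{d}\left(1-\frac{\delta}{d}\right)=\epsilon^{2}$, which is why the first check passes.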

5. Probabilistic synthesis for single-qubit unitary transformation

In this section, we construct a simplified SDP that computes the optimal mixing probability for single-qubit-unitary synthesis. Before discussing that, we first show the special properties of the probabilistic mixture of single-qubit unitaries. In the first subsection, we prove Lemma 5.3, which is a crucial ingredient for constructing the SDP and has a direct application to constructing an efficient probabilistic synthesis algorithm. In the second subsection, we investigate the approximation of single-qubit unitary transformations corresponding to axial rotations to provide a geometric interpretation of the quadratic improvement owing to the probabilistic mixture and confirmation of Lemma 5.3.

We show the first special property of a single-qubit unitary operator in the following Lemma, which essentially shows the equivalence between the set of maximally entangled two-qubit states and a real subspace in the two qubits.

Lemma 5.1.

For any finite set $\{\Phi_{x}\in\mathbf{P}\left(\mathbb{C}^{2}\otimes\mathbb{C}^{2}\right)\}_{x\in X}$ of maximally entangled states and any real numbers $\{r_{x}\in\mathbb{R}\}_{x\in X}$ , the Hermitian operator $H=\sum_{x\in X}r_{x}\Phi_{x}$ is diagonalizable with respect to maximally entangled eigenstates.

Proof.

First, we show the equivalence between the set of two-qubit maximally entangled vectors and a real subspace in the two qubits. Define four vectors representing maximally entangled states:

\begin{align}
|{\Psi_{1}}\rangle&=\frac{1}{\sqrt{2}}(|{00}\rangle+|{11}\rangle), & |{\Psi_{2}}\rangle&=\frac{i}{\sqrt{2}}(|{00}\rangle-|{11}\rangle), \notag\\
|{\Psi_{3}}\rangle&=\frac{i}{\sqrt{2}}(|{01}\rangle+|{10}\rangle), & |{\Psi_{4}}\rangle&=\frac{1}{\sqrt{2}}(|{01}\rangle-|{10}\rangle). \tag{37}
\end{align}

Any vector in the real subspace $\mathcal{K}_{MES}$ spanned by $\{|{\Psi_{i}}\rangle\}_{i=1}^{4}$ can be represented by

(38)

\frac{1}{\sqrt{2}}\left((u_{1}+iu_{2})|{00}\rangle+(u_{4}+iu_{3})|{01}\rangle-(u_{4}-iu_{3})|{10}\rangle+(u_{1}-iu_{2})|{11}\rangle\right)

with real numbers $\{u_{i}\in\mathbb{R}\}_{i=1}^{4}$ . On the other hand, any maximally entangled state can be obtained by applying the single-qubit unitary operator represented by $\left(\begin{matrix}e^{i\phi_{1}}\cos\theta&e^{i\phi_{2}}\sin\theta\\ -e^{-i\phi_{2}}\sin\theta&e^{-i\phi_{1}}\cos\theta\end{matrix}\right)$ to $|{\Psi_{1}}\rangle$ and can be represented by a vector

(39)

\frac{1}{\sqrt{2}}\left(e^{i\phi_{1}}\cos\theta|{00}\rangle+e^{i\phi_{2}}\sin\theta|{01}\rangle-e^{-i\phi_{2}}\sin\theta|{10}\rangle+e^{-i\phi_{1}}\cos\theta|{11}\rangle\right).

By comparing Eqs. (38) and (39), we can verify that any two-qubit maximally entangled state can be represented as a unit vector in $\mathcal{K}_{MES}$ and any unit vector in $\mathcal{K}_{MES}$ represents a maximally entangled state. This equivalence has been indicated in a previous study (Bennett et al., 1996), and the basis defined in Eq. (37) is called the magic basis (Hill and Wootters, 1997).

Since $H=\sum_{x\in X}r_{x}\Phi_{x}$ is represented as a real symmetric matrix with respect to the basis $\{|{\Psi_{i}}\rangle\}_{i=1}^{4}$ , $H$ is diagonalizable with respect to real eigenvectors, which represent maximally entangled states. ∎
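The correspondence between real unit vectors in $\mathcal{K}_{MES}$ and maximally entangled states can be checked numerically: for a real unit vector $\vec{u}$, the state in Eq. (38) should have the reduced state $\frac{\mathbb{I}}{2}$. The following is a small sketch of this check in Python; all function names are hypothetical.

```python
import math
import random


def magic_vector(u):
    """Amplitudes (a00, a01, a10, a11) of sum_i u_i |Psi_i>, following Eq. (38)."""
    u1, u2, u3, u4 = u
    s = 1.0 / math.sqrt(2.0)
    return (
        s * complex(u1, u2),
        s * complex(u4, u3),
        -s * complex(u4, -u3),
        s * complex(u1, -u2),
    )


def reduced_state(amps):
    """2x2 reduced density matrix of the first qubit."""
    a = [[amps[0], amps[1]], [amps[2], amps[3]]]  # a[i][k] = <ik|psi>
    return [
        [sum(a[i][k] * a[j][k].conjugate() for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def random_unit_vector(n, rng):
    """Uniformly random real unit vector in R^n."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


rng = random.Random(1)
for _ in range(100):
    rho = reduced_state(magic_vector(random_unit_vector(4, rng)))
    # Maximally entangled: the reduced state is I/2.
    assert abs(rho[0][0] - 0.5) < 1e-12 and abs(rho[1][1] - 0.5) < 1e-12
    assert abs(rho[0][1]) < 1e-12 and abs(rho[1][0]) < 1e-12
```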

Next, we show a special property of the diamond norm between probabilistic mixtures of single-qubit unitaries in the following Lemma, which essentially shows that the input state in the definition of the diamond norm can be maximally entangled.

Lemma 5.2.

For a subset $\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right)\right\}_{x\in X}$ of single-qubit unitary transformations and probability distributions $p$ and $q$ over a finite set $X$ , it holds that

(40)

\left\|\sum_{x\in X}p(x)\Upsilon_{x}-\sum_{x\in X}q(x)\Upsilon_{x}\right\|_{\diamond}=\left\|\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})\right\|_{\text{tr}}.

Proof.

For $d(\geq 2)$ -dimensional CPTP maps $\{\Upsilon_{x}\}_{x\in X}$ , it holds that

(41)

(L.H.S.)=\max_{\Phi\in\mathbf{P}\left(\mathbb{C}^{d}\otimes\mathbb{C}^{d}\right)}2\left\|\sum_{x\in X}(p(x)-q(x))\Upsilon_{x}\otimes id_{\mathbb{C}^{d}}(\Phi)\right\|_{\text{tr}}\geq\frac{2}{d}\left\|\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})\right\|_{\text{tr}}.

On the other hand, by using the dual problem of the SDP to compute the diamond norm used in the proof of Proposition 3.1, we obtain

(42)

(L.H.S.)\leq 2\left\|\text{tr}_{2}\left[S\right]\right\|_{\infty}\ {\rm with}\ \left(S\geq 0\right)\wedge\left(S\geq\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})\right),

where $\text{tr}_{2}\left[\cdot\right]$ represents the partial trace over the second system of $\mathbb{C}^{2}\otimes\mathbb{C}^{2}$ . By using Lemma 5.1, we can verify that $\sum_{x\in X}(p(x)-q(x))J(\Upsilon_{x})=\sum_{i=1}^{4}\lambda_{i}\Phi_{i}$ with real numbers $\lambda_{i}$ and a set of orthogonal maximally entangled states $\{\Phi_{i}\}_{i=1}^{4}$ . By setting $S=\sum_{i:\lambda_{i}>0}\lambda_{i}\Phi_{i}$ , we obtain

(43)

2\left\|\text{tr}_{2}\left[S\right]\right\|_{\infty}=2\left\|\sum_{i:\lambda_{i}>0}\lambda_{i}\frac{\mathbb{I}}{2}\right\|_{\infty}=\sum_{i:\lambda_{i}>0}\lambda_{i}=(R.H.S.).

This completes the proof. ∎

5.1. Support of optimal probability distribution

To achieve the quadratic improvement owing to the probabilistic approximation of $\Upsilon$ by using $\{\Upsilon_{x}\}_{x\in X}$ , we assume, as in Lemma 4.2, that $\{\Upsilon_{x}\}_{x\in X}$ is an $\epsilon$ -covering of the set of unitary transformations. Since $|X|=\Omega\left(\frac{1}{\epsilon^{c}}\right)$ for some constant $c>0$ from a volume consideration, the runtime $poly\left(|X|\log\left(\frac{1}{\epsilon}\right)\right)$ of our SDP to compute the optimal probability distribution proposed in Proposition 3.1 increases as $poly\left(\frac{1}{\epsilon}\right)$ at best. However, by using the following lemma, we can construct a much more efficient SDP.

Lemma 5.3.

For a non-negative real number $\epsilon\geq 0$ , if $\Upsilon$ is a single-qubit unitary transformation and $\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right)\right\}_{x\in X}$ is a finite $\epsilon$ -covering of the set of single-qubit unitary transformations, i.e., $\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon$ , then

(44)

\min_{p}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}=\min_{\hat{p}}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond}

holds, where $\hat{X}:=\{x\in X:\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq 2\epsilon\}$ and the minimization of $p$ and $\hat{p}$ are taken over probability distributions over $X$ and those over $\hat{X}$ , respectively.

Proof.

By using Lemma 5.2, we obtain

(45)

(L.H.S.)=\min_{p}\left\|J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right\|_{\text{tr}}=\min_{p}\left\|J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right\|_{\infty},

where we use the fact that the dimension of the eigenspace of $J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})$ with positive eigenvalues is at most $1$ in the last equality. By using Lemma 5.1, we can proceed in the following two ways:

\begin{align}
{\rm Eq.}~(45)&=\min_{p}\max_{\rho\in\text{conv}\left({\rm MES}\right)}\text{tr}\left[\rho\left(J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right)\right]\ {\rm and} \tag{46}\\
{\rm Eq.}~(45)&=\min_{p}\max_{\begin{subarray}{c}M\in{\rm cone}({\rm MES})\\ M\leq\mathbb{I}\end{subarray}}\text{tr}\left[M\left(J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x})\right)\right], \tag{47}
\end{align}

where $\text{conv}\left({\rm MES}\right)$ and ${\rm cone}({\rm MES})$ are the convex hull of the set of maximally entangled states $\left\{\Phi\in\mathbf{P}\left(\mathbb{C}^{2}\otimes\mathbb{C}^{2}\right):\text{tr}_{2}\left[\Phi\right]=\frac{\mathbb{I}}{2}\right\}$ and the convex cone generated by the set $\{\Phi\}$ , respectively. Note that the convex cone generated by a subset $\mathbf{X}$ in a vector space is defined as the set of finite linear combinations of $\mathbf{X}$ with non-negative coefficients.

Since the domains of $p$ , $\rho$ , and $M$ are compact and convex and $f(p,H):=\text{tr}\left[H(J(\Upsilon)-\sum_{x\in X}p(x)J(\Upsilon_{x}))\right]$ is affine with respect to each variable, we can apply the minimax theorem and obtain

(48)

{\rm Eq.}~(45)=\max_{\rho\in\text{conv}\left({\rm MES}\right)}\left(\text{tr}\left[\rho J(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[\rho J(\Upsilon_{x})\right]\right)=\max_{\begin{subarray}{c}M\in{\rm cone}({\rm MES})\\ M\leq\mathbb{I}\end{subarray}}\left(\text{tr}\left[MJ(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[MJ(\Upsilon_{x})\right]\right).

When $(L.H.S.)=0$ , the lemma holds since there exists $x\in X$ such that $\Upsilon_{x}=\Upsilon$ . In the following, we assume $(L.H.S.)>0$ . If $\rho$ with $\left\|\rho\right\|_{\infty}<1$ maximizes the formula, we can show a contradiction by setting $M=\frac{\rho}{\left\|\rho\right\|_{\infty}}$ . Thus, $\rho$ that maximizes the formula satisfies $\left\|\rho\right\|_{\infty}=1$ , i.e., $\rho$ is a (pure) maximally entangled state. Therefore, we obtain

(49)

{\rm Eq.}~(48)=\max_{\Upsilon^{\prime}}\frac{1}{2}\left(\text{tr}\left[J(\Upsilon^{\prime})J(\Upsilon)\right]-\max_{x\in X}\text{tr}\left[J(\Upsilon^{\prime})J(\Upsilon_{x})\right]\right)=\max_{U^{\prime}\in U(2)}\frac{1}{2}\left(\left|\text{tr}\left[U^{\dagger}U^{\prime}\right]\right|^{2}-\max_{x\in X}\left|\text{tr}\left[U_{x}^{\dagger}U^{\prime}\right]\right|^{2}\right),

where $\Upsilon(\rho)=U\rho U^{\dagger}$ , $\Upsilon_{x}(\rho)=U_{x}\rho U_{x}^{\dagger}$ , the maximization of $\Upsilon^{\prime}$ is taken over single-qubit unitary transformations, and $U(2)$ represents the set of single-qubit unitary operators. By observing that the minimization in Eq. (19) is achieved by $\rho=\frac{\mathbb{I}}{2}$ for single-qubit unitary operators, we obtain

(50)

{\rm Eq.}~(49)=\max_{\Upsilon^{\prime}}\frac{1}{2}\left(\min_{x\in X}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}^{2}-\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}^{2}\right).

Since so far we did not use the assumption that $\{\Upsilon_{x}\}_{x\in X}$ is an $\epsilon$ -covering, we obtain

(51)

(R.H.S.)\ {\rm of}\ {\rm Eq.}~(44)=\max_{\Upsilon^{\prime}}\frac{1}{2}\left(\min_{x\in\hat{X}}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}^{2}-\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}^{2}\right).

Note that the maximization in Eq. (50) is achieved by $\Upsilon^{\prime}$ satisfying $\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}\leq\epsilon$ since $\min_{x\in X}\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon$ due to the definition of the $\epsilon$ -covering. For such $\Upsilon^{\prime}$ , the minimization in Eq. (50) is achieved by some $x\in\hat{X}$ owing to the triangle inequality. Hence, if we can show that the maximization in Eq. (51) is also achieved by such $\Upsilon^{\prime}$ , we can prove the equivalence between Eqs. (50) and (51). To complete the proof, we show the following statement: for all $\Upsilon^{\prime}$ ,

(52)

\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}>\epsilon\Rightarrow\min_{x\in\hat{X}}\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}\leq\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}.

We assume $\epsilon<1$ ; otherwise, the statement is trivial. By using the equivalence between the set of two-qubit maximally entangled vectors and a real subspace shown in the proof of Lemma 5.1, there exist unit real vectors $\vec{u},\vec{u}^{\prime}\in\mathbb{R}^{4}$ such that $\sum_{i,j=1}^{4}u_{i}u_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\Upsilon)$ , $\sum_{i,j=1}^{4}u^{\prime}_{i}u^{\prime}_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\Upsilon^{\prime})$ and

(53)

0\leq\cos\theta_{1}:=\vec{u}\cdot\vec{u}^{\prime}<\sqrt{1-\epsilon^{2}},

where $\{|{\Psi_{i}}\rangle\}$ is defined in Eq. (37), $\theta_{1}\in[0,\frac{\pi}{2}]$ , the first inequality can be satisfied by appropriately setting the sign of $\vec{u}$ , and the second (strict) inequality is derived from $\frac{1}{2}\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}>\epsilon$ and Lemma 5.2. In the real subspace spanned by $\{\vec{u},\vec{u}^{\prime}\}$ , there exists a unique unit real vector $\vec{v}\in\mathbb{R}^{4}$ such that

(54)

\cos\theta_{2}:=\vec{u}\cdot\vec{v}=\sqrt{1-\epsilon^{2}}\ \wedge\ \vec{u}^{\prime}\cdot\vec{v}=\cos(\theta_{1}-\theta_{2}),

where $\theta_{2}\in[0,\frac{\pi}{2}]$ , as shown in Fig. 2. Note that the unitary transformation $\hat{\Upsilon}$ corresponding to $\vec{v}$ , i.e., $\sum_{i,j=1}^{4}v_{i}v_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\hat{\Upsilon})$ , satisfies $\frac{1}{2}\left\|\Upsilon-\hat{\Upsilon}\right\|_{\diamond}=\epsilon$ due to Lemma 5.2. Since there exists $x\in X$ such that $\frac{1}{2}\left\|\Upsilon_{x}-\hat{\Upsilon}\right\|_{\diamond}\leq\epsilon$ and $\frac{1}{2}\left\|\Upsilon_{x}-\Upsilon\right\|_{\diamond}\leq\frac{1}{2}\left\|\Upsilon_{x}-\hat{\Upsilon}\right\|_{\diamond}+\frac{1}{2}\left\|\Upsilon-\hat{\Upsilon}\right\|_{\diamond}\leq 2\epsilon$ , we can find a unit real vector $\vec{w}\in\mathbb{R}^{4}$ corresponding to $\Upsilon_{x}$ with $x\in\hat{X}$ , i.e., $\sum_{i,j=1}^{4}w_{i}w_{j}|{\Psi_{i}}\rangle\langle{\Psi_{j}}|=\frac{1}{2}J(\Upsilon_{x})$ , and satisfying

(55)

\cos\theta_{3}:=\vec{w}\cdot\vec{v}\geq\sqrt{1-\epsilon^{2}},

where $\theta_{3}\in[0,\frac{\pi}{2}]$ , due to Lemma 5.2. By using Lemma 5.2 again, we obtain

(56)

\left\|\Upsilon^{\prime}-\Upsilon_{x}\right\|_{\diamond}\leq\left\|\Upsilon^{\prime}-\Upsilon\right\|_{\diamond}\Leftrightarrow|\vec{u}^{\prime}\cdot\vec{u}|\leq|\vec{u}^{\prime}\cdot\vec{w}|.

By letting $\cos\theta_{4}:=\vec{u}^{\prime}\cdot\vec{w}$ with $\theta_{4}\in[0,\pi]$ and using the triangle inequality for angles in the three-dimensional subspace spanned by $\{\vec{u},\vec{u}^{\prime},\vec{w}\}$ , we obtain

(57)

\theta_{4}\leq(\theta_{1}-\theta_{2})+\theta_{3}\leq\theta_{1}.

This completes the proof.

∎
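The geometric step in Eqs. (53)-(57) can be illustrated numerically. The sketch below samples unit vectors $\vec{u},\vec{u}^{\prime},\vec{v},\vec{w}\in\mathbb{R}^{4}$ satisfying Eqs. (53)-(55) and checks the conclusion $\vec{u}^{\prime}\cdot\vec{w}\geq\vec{u}^{\prime}\cdot\vec{u}$, which corresponds to the right-hand side of Eq. (56). The helper names and the way $\vec{w}$ is sampled are assumptions made for this illustration; the sketch does not construct an actual $\epsilon$-covering.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]


def orthogonal_part(a, basis):
    """Component of a orthogonal to the orthonormal vectors in basis."""
    for b in basis:
        c = dot(a, b)
        a = [x - c * y for x, y in zip(a, b)]
    return a


def sample_case(eps, rng):
    """Draw u, u', v, w as in Eqs. (53)-(55); return (u . u', u' . w)."""
    theta2 = math.asin(eps)                    # cos(theta2) = sqrt(1 - eps^2)
    theta1 = rng.uniform(theta2, math.pi / 2)  # u' is at distance >= eps from u
    theta3 = rng.uniform(0.0, theta2)          # w is at distance <= eps from v
    u = normalize([rng.gauss(0, 1) for _ in range(4)])
    e = normalize(orthogonal_part([rng.gauss(0, 1) for _ in range(4)], [u]))
    u_prime = [math.cos(theta1) * x + math.sin(theta1) * y for x, y in zip(u, e)]
    v = [math.cos(theta2) * x + math.sin(theta2) * y for x, y in zip(u, e)]
    r = normalize(orthogonal_part([rng.gauss(0, 1) for _ in range(4)], [v]))
    w = [math.cos(theta3) * x + math.sin(theta3) * y for x, y in zip(v, r)]
    return dot(u, u_prime), dot(u_prime, w)


rng = random.Random(7)
for _ in range(1000):
    overlap_u, overlap_w = sample_case(0.3, rng)
    # theta_4 <= theta_1, i.e. w is at least as close to u' as u is.
    assert overlap_w >= overlap_u - 1e-12
```

Here the overlap of two unit vectors plays the role of $\sqrt{1-\left(\frac{1}{2}\left\|\cdot\right\|_{\diamond}\right)^{2}}$ between the corresponding unitary transformations, so a larger overlap means a smaller diamond-norm distance.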

As an application of Lemma 5.3, we construct an efficient probabilistic synthesis algorithm in the proof of the following theorem.

Theorem 5.4.

For a given gate set, there exists a probabilistic synthesis algorithm for a single-qubit unitary transformation with

INPUT: a single-qubit unitary transformation $\Upsilon:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right)$ , an approximation error $\epsilon\in\left(0,1\right)$ , and precision $\delta>0$ such that $\frac{1}{\delta}=\left(\frac{1}{\epsilon}\right)^{O(1)}$

OUTPUT: a gate sequence for implementing a single-qubit unitary transformation $\Upsilon_{x}$ sampled from a set $\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{2}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{2}\right)\right\}_{x\in\hat{X}}$ in accordance with probability distribution $\hat{p}(x)$

such that the algorithm satisfies the following properties:

• Efficiency: All steps of the algorithm take $polylog\left(\frac{1}{\epsilon}\right)$ -time.

• Quadratic improvement: The approximation error $\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond}$ obtained by this algorithm is upper bounded by $\epsilon^{2}+\delta$ , whereas the error $\min_{x\in\hat{X}}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}$ obtained by deterministic synthesis using the unitary transformations in $\{\Upsilon_{x}\}_{x\in\hat{X}}$ is upper bounded by $\epsilon$ .

Proof.

We assume that the algorithm calls an efficient deterministic synthesis algorithm such as the Solovay-Kitaev algorithm as a subroutine, i.e., the subroutine can find a gate sequence for implementing a unitary transformation $\Upsilon^{\prime}$ such that $\frac{1}{2}\left\|\Upsilon-\Upsilon^{\prime}\right\|_{\diamond}\leq\epsilon$ within $polylog\left(\frac{1}{\epsilon}\right)$ -time. In the following, we explicitly construct the algorithm:

Efficient probabilistic synthesis algorithm for single-qubit unitary transformation

(1) Set free parameters $c>0$ and $c^{\prime}>0$ satisfying $c+c^{\prime}\leq 1$ .

(2) Generate a list $\{\hat{\Upsilon}_{x}\}_{x\in\hat{X}}$ of single-qubit unitary transformations such that for any unitary transformation $\hat{\Upsilon}$ , $\min_{x\in\hat{X}}\frac{1}{2}\left\|\hat{\Upsilon}-\hat{\Upsilon}_{x}\right\|_{\diamond}\leq c\epsilon$ if $\frac{1}{2}\left\|\Upsilon-\hat{\Upsilon}\right\|_{\diamond}\leq 2\epsilon$ . That is, $\{\hat{\Upsilon}_{x}\}_{x\in\hat{X}}$ is a $c\epsilon$ -covering of the $2\epsilon$ -ball around the target unitary transformation.

(3) Call an efficient deterministic synthesis algorithm to find gate sequences for implementing unitary transformations $\{\Upsilon_{x}\}_{x\in\hat{X}}$ such that $\frac{1}{2}\left\|\Upsilon_{x}-\hat{\Upsilon}_{x}\right\|_{\diamond}\leq c^{\prime}\epsilon$ for all $x\in\hat{X}$ .

(4) Numerically solve our SDP shown in Proposition 3.1 by using $\{\Upsilon_{x}\}_{x\in\hat{X}}$ as a set of CPTP mappings and obtain a probability distribution $\hat{p}$ whose approximation error is $\delta$ -close to $\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}}p(x)\Upsilon_{x}\right\|_{\diamond}$ .

(5) Sample gate sequences for implementing unitary transformations $\{\Upsilon_{x}\}_{x\in\hat{X}}$ in accordance with $\hat{p}$ .
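The control flow of the five steps above can be sketched as follows. This is only a skeleton under stated assumptions: the three callables `local_cover`, `deterministic_synthesis`, and `solve_mixing_sdp` are hypothetical placeholders for step (2), the deterministic subroutine of step (3) (e.g., a Solovay-Kitaev implementation), and an SDP solver for Proposition 3.1, respectively; none of them is specified here.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

U = TypeVar("U")  # hypothetical representation of a unitary transformation
G = TypeVar("G")  # hypothetical representation of a gate sequence


def probabilistic_synthesis(
    target: U,
    eps: float,
    delta: float,
    local_cover: Callable[[U, float, float], Sequence[U]],
    deterministic_synthesis: Callable[[U, float], G],
    solve_mixing_sdp: Callable[[U, Sequence[G], float], Sequence[float]],
    c: float = 0.5,
    c_prime: float = 0.5,
    rng: Optional[random.Random] = None,
) -> G:
    """Return one gate sequence sampled according to the mixing distribution."""
    # Step (1): free parameters with c + c' <= 1.
    if not (c > 0 and c_prime > 0 and c + c_prime <= 1):
        raise ValueError("need c > 0, c' > 0 and c + c' <= 1")
    rng = rng or random.Random()
    # Step (2): (c * eps)-covering of the (2 * eps)-ball around the target.
    centers = local_cover(target, 2 * eps, c * eps)
    # Step (3): approximate every covering point to within c' * eps.
    sequences = [deterministic_synthesis(center, c_prime * eps) for center in centers]
    # Step (4): mixing probabilities, optimal up to precision delta.
    probs = solve_mixing_sdp(target, sequences, delta)
    # Step (5): sample one gate sequence in accordance with the distribution.
    return rng.choices(sequences, weights=probs, k=1)[0]
```

In practice, steps (1) to (4) would be executed once per target and only the sampling in step (5) would be repeated for each use of the gate; the sketch keeps them in one function for brevity.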

The two properties can be verified as follows:

• Efficiency: All steps of the algorithm take $polylog\left(\frac{1}{\epsilon}\right)$ -time if the size $|\hat{X}|$ of the list generated in the second step is upper bounded by a constant (independent of $\epsilon$ ). We can generate such a constant-size list $\{\hat{\Upsilon}_{x}\}_{x\in\hat{X}}$ by using the correspondence between a single-qubit unitary operator and unit vector in $\mathbb{R}^{4}$ and Lemma 5.2.

• Quadratic improvement: The approximation error $\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond}$ obtained by this algorithm is at most $\epsilon^{2}+\delta$ since $\{\Upsilon_{x}\}_{x\in\hat{X}}$ is a subset of an $\epsilon$ -covering $\{\Upsilon_{x}\}_{x\in\hat{X}}\cup\{\Upsilon^{\prime}_{y}\}_{y}$ of the set of single-qubit unitary transformations, where $\{\Upsilon^{\prime}_{y}\}_{y}$ is an $\epsilon$ -covering of the complement of the $2\epsilon$ -ball around $\Upsilon$ and $\frac{1}{2}\left\|\Upsilon-\Upsilon^{\prime}_{y}\right\|_{\diamond}>2\epsilon$ for any $y$ , and we can apply Lemmas 4.2 and 5.3.

∎
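As a minimal illustration of the final sampling step, the following Python sketch draws one gate sequence per circuit execution in accordance with a given distribution $\hat{p}$ . The gate-sequence strings, the numerical values of $\hat{p}$ , and the function name `sample_gate_sequence` are hypothetical placeholders introduced only for this illustration; they are not outputs of the algorithm above.

```python
import numpy as np

def sample_gate_sequence(gate_sequences, p_hat, rng):
    """Return one gate sequence drawn according to the distribution p_hat."""
    p_hat = np.asarray(p_hat, dtype=float)
    if len(gate_sequences) != len(p_hat):
        raise ValueError("need one probability per gate sequence")
    if np.any(p_hat < 0) or not np.isclose(p_hat.sum(), 1.0):
        raise ValueError("p_hat must be a probability distribution")
    index = rng.choice(len(gate_sequences), p=p_hat)
    return gate_sequences[index]

# Hypothetical example data (placeholders, not synthesized sequences).
gate_sequences = ["HTHT", "THTH", "HTTH"]
p_hat = [0.5, 0.3, 0.2]
rng = np.random.default_rng(0)
draws = [sample_gate_sequence(gate_sequences, p_hat, rng) for _ in range(10000)]
frequencies = [draws.count(g) / len(draws) for g in gate_sequences]
```

Averaged over many executions, the empirical frequencies approach $\hat{p}$ , so the implemented operation approaches the mixture $\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}$ .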

Note that the quadratic improvement on the approximation error achieved by this algorithm heavily relies on Lemma 5.3. In Appendix D, we perform numerical experiments suggesting that this lemma also holds for qudit unitary transformations, which in turn suggests that this synthesis algorithm is applicable to qudit unitary transformations.

5.2. Convex-hull approximation for axial rotations

At first glance, the reduction of the approximation error due to probabilistically mixing unitaries seems strange since a unitary transformation is not a probabilistic mixture of any distinct unitary transformations. A simple geometric interpretation of the reduction is given in the following proposition, which considers single-qubit unitary transformations corresponding to axial rotations.

We investigate the convex-hull approximation of a single-qubit unitary transformation $\Upsilon_{\hat{\theta}}$ by using unitary transformations $\{\Upsilon_{\theta}\}_{\theta\in\mathbf{\Theta}}$ that rotate Bloch vectors about the same axis as $\Upsilon_{\hat{\theta}}$ , where $\Upsilon_{\theta}(\rho):=R(\theta)\rho R^{\dagger}(\theta)$ , $R(\theta):=|{0}\rangle\langle{0}|+e^{i\theta}|{1}\rangle\langle{1}|$ with an orthonormal basis $\{|{0}\rangle,|{1}\rangle\}$ , and $\mathbf{\Theta}$ is a finite subset of $[0,2\pi)$ . In this case, every unitary transformation $\Upsilon_{\theta}$ can be represented by a unit complex number $e^{i\theta}$ in the complex plane, as shown in Fig. 3. Furthermore, the following proposition shows that the metric space of probabilistic mixtures of $\Upsilon_{\theta}$ induced by the diamond norm can be identified with a unit disc in the complex plane.

Proposition 5.5.

For a finite subset $\mathbf{\Theta}$ of $[0,2\pi)$ , let $\{\Upsilon_{\theta}\}_{\theta\in\mathbf{\Theta}}$ be a set of single-qubit unitary transformations that rotate Bloch vectors about a fixed axis, i.e., $\Upsilon_{\theta}(\rho):=R(\theta)\rho R^{\dagger}(\theta)$ with $R(\theta):=|{0}\rangle\langle{0}|+e^{i\theta}|{1}\rangle\langle{1}|$ and an orthonormal basis $\{|{0}\rangle,|{1}\rangle\}$ . For probability distributions $p$ and $q$ over $\mathbf{\Theta}$ , it holds that

(58)

\left\|\sum_{\theta\in\mathbf{\Theta}}p(\theta)\Upsilon_{\theta}-\sum_{\theta\in\mathbf{\Theta}}q(\theta)\Upsilon_{\theta}\right\|_{\diamond}=\left|\sum_{\theta\in\mathbf{\Theta}}p(\theta)e^{i\theta}-\sum_{\theta\in\mathbf{\Theta}}q(\theta)e^{i\theta}\right|.

Proof.

By using Lemma 5.2, we obtain

(59)

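(L.H.S.)=\frac{1}{2}\left\|\sum_{\theta\in\mathbf{\Theta}}(p(\theta)-q(\theta))J(\Upsilon_{\theta})\right\|_{\text{tr}}=(R.H.S.),

where, in the last equality, we use the diagonalization of $\sum_{\theta\in\mathbf{\Theta}}(p(\theta)-q(\theta))J(\Upsilon_{\theta})$ , which can be obtained via a straightforward calculation: with the Choi-Jamiołkowski operator defined as in Appendix A, this operator equals $c^{*}|{00}\rangle\langle{11}|+c|{11}\rangle\langle{00}|$ with $c:=\sum_{\theta\in\mathbf{\Theta}}(p(\theta)-q(\theta))e^{i\theta}$ , so its nonzero eigenvalues are $\pm|c|$ and its trace norm is $2|c|$ . ∎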

By using this proposition, we can obtain $\frac{1}{2}\left\|\Upsilon_{\hat{\theta}}-\sum_{\theta\in\mathbf{\Theta}}p(\theta)\Upsilon_{\theta}\right\|_{\diamond}=\frac{1}{2}\left|e^{i\hat{\theta}}-\sum_{\theta\in\mathbf{\Theta}}p(\theta)e^{i\theta}\right|$ , which indicates that the optimal probability distribution and approximation error in the convex-hull approximation of $\Upsilon_{\hat{\theta}}$ can be computed by finding the closest point in the convex hull of $\{e^{i\theta}\}_{\theta\in\mathbf{\Theta}}$ to the target point $e^{i\hat{\theta}}$ . As represented in Fig. 3, the quadratic reduction in approximation error owing to convex-hull approximation over discrete-point approximation can be shown by an elementary geometric observation.
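The geometric picture can be illustrated numerically. The following Python sketch is not the SDP of Proposition 3.1; it assumes a grid $\mathbf{\Theta}$ with at least two distinct angles that is fine enough for the closest point of the convex hull to lie on the chord joining the two grid points that bracket $e^{i\hat{\theta}}$ . The function names and the uniform 64-point grid are hypothetical choices made for this illustration.

```python
import numpy as np

def deterministic_error(theta_hat, thetas):
    """Half the distance from e^{i theta_hat} to the nearest grid point."""
    return 0.5 * np.min(np.abs(np.exp(1j * theta_hat) - np.exp(1j * thetas)))

def two_point_mixture(theta_hat, thetas):
    """Mix the two grid angles bracketing theta_hat.

    Returns (error, (theta_lo, theta_hi), (weight_lo, weight_hi)).
    """
    thetas = np.sort(np.mod(thetas, 2 * np.pi))
    t = np.mod(theta_hat, 2 * np.pi)
    idx = np.searchsorted(thetas, t)
    lo = thetas[idx - 1]             # wraps to the last angle when idx == 0
    hi = thetas[idx % len(thetas)]   # wraps to the first angle at the end
    a, b, z = np.exp(1j * lo), np.exp(1j * hi), np.exp(1j * t)
    # Orthogonal projection of z onto the chord a + s (b - a), clipped to [0, 1].
    s = np.real((z - a) * np.conj(b - a)) / np.abs(b - a) ** 2
    s = float(np.clip(s, 0.0, 1.0))
    error = 0.5 * np.abs(z - ((1 - s) * a + s * b))
    return error, (lo, hi), (1 - s, s)

thetas = 2 * np.pi * np.arange(64) / 64   # hypothetical uniform grid
theta_hat = np.pi / 64                    # midpoint between two grid angles
eps_det = deterministic_error(theta_hat, thetas)
eps_mix, angles, weights = two_point_mixture(theta_hat, thetas)
```

For this midpoint target, the identity $1-\cos x=2\sin^{2}(x/2)$ implies that `eps_mix` equals the square of `eps_det`, which is the quadratic reduction in its simplest form.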

6. Conclusion

We considered the analytical relationship between $\min_{p}\left\|\Upsilon-\sum_{x}p(x)\Upsilon_{x}\right\|_{\diamond}$ and $\min_{x}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}$ , which represent the minimum approximation error obtained by probabilistic synthesis and that by deterministic synthesis, respectively. As the main result, we obtained tight upper and lower bounds on $\min_{p}\left\|\Upsilon-\sum_{x}p(x)\Upsilon_{x}\right\|_{\diamond}$ , which guarantee the sub-optimality of the current algorithms and suggest the existence of an improved synthesis algorithm. We showed that the optimal probability distribution in the approximation can be computed by an SDP. We also constructed an efficient probabilistic synthesis algorithm for single-qubit unitary transformations and showed that it quadratically reduces the approximation error compared with deterministic synthesis and that its optimality can be reduced to the choice of unitary transformations close to the target unitary transformation. While numerical simulations indicate that the algorithm works well for qudit unitary transformations, a rigorous proof is a subject for future work.

When we run this algorithm for qudit unitary transformations, the time complexity of the SDP used in the algorithm becomes $poly\left(d,|\hat{X}|\right)$ for a fixed desired approximation error $\epsilon$ and precision $\delta$ , where $d$ is the dimension of the unitary operators and $|\hat{X}|$ is the size of the list of synthesized unitary transformations. Since $|\hat{X}|$ grows exponentially with respect to $d^{2}$ (to make the list an $\epsilon$ -covering of the $2\epsilon$ -ball around the target unitary transformation), the algorithm is not practical for higher-dimensional unitary transformations.

There are two ways to make the algorithm more practical. First, restricting the class of target unitary transformations, for example to axial rotations, would significantly reduce the size $|\hat{X}|$ while still achieving the guaranteed quadratic improvement. Indeed, Fig. 3 implies that the quadratic improvement can be achieved by mixing only two realizable unitary transformations. Alternatively, we can consider a modified algorithm that probabilistically mixes a randomly sampled small subset of $\hat{X}$ . While this modified algorithm, unlike the original one, does not guarantee the quadratic improvement, numerical experiments in Appendix D suggest that it still attains such improvement for randomly chosen target unitary transformations.

Similar to the probabilistic mixture of unitary transformations, that of general CPTP mappings implemented by a certain quantum device is relatively easy to implement by classically controlling the quantum device. Such a probabilistic mixture of implementable CPTP mappings is considered a free operation in many quantum resource theories (Horodecki and Oppenheim, 2013; Brandão and Gour, 2015; Chitambar and Gour, 2019). To quantify or simulate a target CPTP mapping using the probabilistic mixture (sometimes assisted by a resource state), a mathematical tool is required to analyze the optimal convex approximation of a general CPTP mapping. From the mathematical perspective as well as from the resource theoretical perspective, computing or bounding the approximation error of a unital CPTP mapping by using a probabilistic mixture of unitary transformations plays a crucial role in investigating the asymptotic quantum Birkhoff conjecture (Haagerup and Musat, 2011; Yu et al., 2012). Our SDP shown in Proposition 3.1 and our bounds (or possibly their extension to general CPTP mappings) could be numerical and analytical tools to investigate such problems.

Acknowledgements.

We thank Yoshihisa Yamamoto, Aram Harrow, Isaac Chuang, Sho Sugiura, Yuki Takeuchi, Yasunari Suzuki, Yasuhiro Takahashi, and Adel Sohbi for their helpful discussions. This work was partially supported by JST Moonshot R&D MILLENNIA Program (Grant No.JPMJMS2061). SA was partially supported by JST, PRESTO Grant No.JPMJPR2111 and JPMXS0120319794. GK was supported in part by the Grant-in-Aid for Scientific Research (C) No.20K03779, (C) No.21K03388, and (S) No.18H05237 of JSPS, CREST (Japan Science and Technology Agency) Grant No.JPMJCR1671. ST was partially supported by JSPS KAKENHI Grant Numbers JP20H05966 and JP22H00522.

References

Aharonov and Ben-Or (2008) Dorit Aharonov and Michael Ben-Or. 2008. Fault-Tolerant Quantum Computation with Constant Error Rate. SIAM J. Comput. 38, 4 (2008), 1207–1282. https://doi.org/10.1137/S0097539799359385
Bennett et al. (1996) Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters. 1996. Mixed-state entanglement and quantum error correction. Phys. Rev. A 54 (Nov 1996), 3824–3851. Issue 5. https://doi.org/10.1103/PhysRevA.54.3824
Bocharov et al. (2015) Alex Bocharov, Martin Roetteler, and Krysta M. Svore. 2015. Efficient Synthesis of Universal Repeat-Until-Success Quantum Circuits. Phys. Rev. Lett. 114 (Feb 2015), 080502. Issue 8. https://doi.org/10.1103/PhysRevLett.114.080502
Bouland and Giurgica-Tiron (2021) Adam Bouland and Tudor Giurgica-Tiron. 2021. Efficient Universal Quantum Compilation: An Inverse-free Solovay-Kitaev Algorithm. arXiv:2112.02040 [quant-ph]
Brandão and Gour (2015) Fernando G. S. L. Brandão and Gilad Gour. 2015. Reversible Framework for Quantum Resource Theories. Phys. Rev. Lett. 115 (Aug 2015), 070503. Issue 7. https://doi.org/10.1103/PhysRevLett.115.070503
Campbell (2017) Earl Campbell. 2017. Shorter gate sequences for quantum computing by mixing unitaries. Phys. Rev. A 95 (Apr 2017), 042306. Issue 4. https://doi.org/10.1103/PhysRevA.95.042306
Chiribella et al. (2009) Giulio Chiribella, Giacomo Mauro D’Ariano, and Paolo Perinotti. 2009. Theoretical framework for quantum networks. Phys. Rev. A 80 (Aug 2009), 022339. Issue 2. https://doi.org/10.1103/PhysRevA.80.022339
Chitambar and Gour (2019) Eric Chitambar and Gilad Gour. 2019. Quantum resource theories. Rev. Mod. Phys. 91, 2 (2019), 025001.
Fowler (2011) Austin G. Fowler. 2011. Constructing Arbitrary Steane Code Single Logical Qubit Fault-Tolerant Gates. Quantum Info. Comput. 11, 9–10 (sep 2011), 867–873.
Fuchs and van de Graaf (1999) C.A. Fuchs and J. van de Graaf. 1999. Cryptographic distinguishability measures for quantum-mechanical states. IEEE Trans. Inf. Theory. 45, 4 (1999), 1216–1227. https://doi.org/10.1109/18.761271
Gutoski and Watrous (2007) Gus Gutoski and John Watrous. 2007. Toward a General Theory of Quantum Games. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing (San Diego, California, USA) (STOC ’07). Association for Computing Machinery, New York, NY, USA, 565–574. https://doi.org/10.1145/1250790.1250873
Haagerup and Musat (2011) Uffe Haagerup and Magdalena Musat. 2011. Factorization and Dilation Problems for Completely Positive Maps on von Neumann Algebras. Commun. Math. Phys. 303, 2 (2011), 555–594. https://doi.org/10.1007/s00220-011-1216-y
Harrow et al. (2002) Aram W. Harrow, Benjamin Recht, and Isaac L. Chuang. 2002. Efficient discrete approximations of quantum gates. J. Math. Phys. 43, 9 (2002), 4445–4451. https://doi.org/10.1063/1.1495899
Hastings (2017) Matthew B. Hastings. 2017. Turning Gate Synthesis Errors into Incoherent Errors. Quantum Info. Comput. 17, 5–6 (mar 2017), 488–494.
Hill and Wootters (1997) Sam A. Hill and William K. Wootters. 1997. Entanglement of a Pair of Quantum Bits. Phys. Rev. Lett. 78 (Jun 1997), 5022–5025. Issue 26. https://doi.org/10.1103/PhysRevLett.78.5022
Horodecki and Oppenheim (2013) Michal Horodecki and Jonathan Oppenheim. 2013. (QUANTUMNESS IN THE CONTEXT OF) RESOURCE THEORIES. Int. J. Mod. Phys. B 27, 01n03 (2013), 1345019. https://doi.org/10.1142/S0217979213450197
Kitaev (2003) A.Yu. Kitaev. 2003. Fault-tolerant quantum computation by anyons. Ann. Phys. 303, 1 (2003), 2–30. https://doi.org/10.1016/S0003-4916(02)00018-0
Kitaev et al. (2002) A. Yu Kitaev, A. H. Shen, and M. N. Vyalyi. 2002. Classical and Quantum Computation. American Mathematical Society.
Kliuchnikov et al. (2022) Vadym Kliuchnikov, Kristin Lauter, Romy Minko, Adam Paetznick, and Christophe Petit. 2022. Shorter quantum circuits. arXiv:2203.10064 [quant-ph]
Kliuchnikov et al. (2013) Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca. 2013. Asymptotically Optimal Approximation of Single Qubit Unitaries by Clifford and $T$ Circuits Using a Constant Number of Ancillary Qubits. Phys. Rev. Lett. 110 (May 2013), 190502. Issue 19. https://doi.org/10.1103/PhysRevLett.110.190502
Kliuchnikov et al. (2016) Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca. 2016. Practical Approximation of Single-Qubit Unitaries by Single-Qubit Quantum Clifford and T Circuits. IEEE Trans. Comput. 65, 1 (2016), 161–172. https://doi.org/10.1109/TC.2015.2409842
Knill et al. (1998) Emanuel Knill, Raymond Laflamme, and Wojciech H. Zurek. 1998. Resilient quantum computation: error models and thresholds. Proc. R. Soc. Lond. A. 454 (1998), 365–384. https://doi.org/10.1098/rspa.1998.0166
Lovász (2003) L. Lovász. 2003. Semidefinite Programs and Combinatorial Optimization. Springer New York, New York, NY, 137–194. https://doi.org/10.1007/0-387-22444-0_6
Nielsen and Chuang (2000) Michael A. Nielsen and Isaac L. Chuang. 2000. Quantum Computation and Quantum Information. Cambridge University Press.
Ross (2015) Neil J. Ross. 2015. Optimal Ancilla-Free CLIFFORD+V Approximation of Z-Rotations. Quantum Info. Comput. 15, 11–12 (sep 2015), 932–950.
Sacchi (2017) Massimiliano F. Sacchi. 2017. Optimal convex approximations of quantum states. Phys. Rev. A 96 (Oct 2017), 042325. Issue 4. https://doi.org/10.1103/PhysRevA.96.042325
Sacchi and Sacchi (2017) Massimiliano F. Sacchi and Tito Sacchi. 2017. Convex approximations of quantum channels. Phys. Rev. A 96 (Sep 2017), 032311. Issue 3. https://doi.org/10.1103/PhysRevA.96.032311
Terhal (2015) Barbara M. Terhal. 2015. Quantum error correction for quantum memories. Rev. Mod. Phys. 87 (Apr 2015), 307–346. Issue 2. https://doi.org/10.1103/RevModPhys.87.307
Watrous (2018) John Watrous. 2018. The Theory of Quantum Information. Cambridge University Press. https://doi.org/10.1017/9781316848142
Yu et al. (2012) Nengkun Yu, Runyao Duan, and Quanhua Xu. 2012. Bounds on the distance between a unital quantum channel and the convex hull of unitary channels, with applications to the asymptotic quantum Birkhoff conjecture. arXiv preprint arXiv:1201.1172 (2012).

Appendix A Equivalence between quantum testers and quantum networks

Recall that the Choi-Jamiołkowski operator of a linear mapping $\Xi:\mathbf{L}\left(\mathcal{H}_{1}\right)\rightarrow\mathbf{L}\left(\mathcal{H}_{2}\right)$ is defined as $J(\Xi):=\sum_{i,j}|{i}\rangle\langle{j}|\otimes\Xi(|{i}\rangle\langle{j}|)\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right)$ , and the set of quantum testers is defined as $\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2}):=\{T\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right):\exists\rho\in\mathbf{S}\left(\mathcal{H}_{1}\right),T\leq\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}\}$ . In this section, we show that the set of mappings $f_{T}:\Xi\mapsto\text{tr}\left[J\left(\Xi\right)T\right]$ associated with quantum testers $T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})$ is equivalent to that of mappings $g_{\Phi,\Pi}:\Xi\mapsto\text{tr}\left[\Xi\otimes id_{\mathcal{H}_{3}}(\Phi)\Pi\right]$ associated with pure states $\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)$ and Hermitian projectors $\Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3})$ for a sufficiently high-dimensional Hilbert space $\mathcal{H}_{3}$ . This equivalence indicates

(60)

\max_{T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})}\min_{\Xi}f_{T}(\Xi)=\max_{\begin{subarray}{c}\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)\\ \Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3})\end{subarray}}\min_{\Xi}g_{\Phi,\Pi}(\Xi),

where the minimization of $\Xi$ is taken over a compact subset of linear mappings specified in the proofs of Proposition 3.1 and Lemma 4.2. Note that a proof for more general quantum testers is given in (Chiribella et al., 2009, Theorem 10).

First, we show that for any $\Phi$ and $\Pi$ , there exists $T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})$ such that $f_{T}=g_{\Phi,\Pi}$ as follows. By letting $T=\text{tr}_{3}\left[(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi)\right]$ , we obtain

(61)

g_{\Phi,\Pi}(\Xi)=\text{tr}\left[\Xi\otimes id_{\mathcal{H}_{3}}(\Phi)\Pi\right]=\text{tr}\left[(J(\Xi)\otimes\mathbb{I}_{3})(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi)\right]=\text{tr}\left[J(\Xi)T\right]=f_{T}(\Xi),

where $\Phi^{T_{1}}$ and $\text{tr}_{3}\left[\cdot\right]$ represent the partial transpose of $\Phi$ and the partial trace, respectively, and the subscript of the operator denotes the system on which the operator acts. We can also verify that $T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})$ as follows. Let $X=\sum_{ij}\alpha_{ij}|{j}\rangle_{3}\langle{i}|_{1}$ , where $|{\Phi}\rangle=\sum_{ij}\alpha_{ij}|{i}\rangle_{1}|{j}\rangle_{3}$ with the computational basis $\{|{i}\rangle_{1}\in\mathbf{P}\left(\mathcal{H}_{1}\right)\}_{i}$ and $\{|{j}\rangle_{3}\in\mathbf{P}\left(\mathcal{H}_{3}\right)\}_{j}$ . We then obtain that for any positive semidefinite operator $P\in\mathbf{Pos}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right)$ ,

(62)

\text{tr}\left[PT\right]=\text{tr}\left[(P\otimes\mathbb{I}_{3})(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi)\right]=\text{tr}\left[(X\otimes\mathbb{I}_{2})P(X\otimes\mathbb{I}_{2})^{\dagger}\Pi\right]\geq 0,

which indicates $T\geq 0$ . By letting $\rho=\text{tr}_{3}\left[\Phi^{T_{1}}\right]=\text{tr}_{3}\left[\Phi\right]^{T}(\in\mathbf{S}\left(\mathcal{H}_{1}\right))$ , we can also verify that

(63)

\displaystyle\rho\otimes\mathbb{I}_{2}-T=\text{tr}_{3}\left[(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{123}-\mathbb{I}_{1}\otimes\Pi)\right]=\text{tr}_{3}\left[(\Phi^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes\Pi_{\bot})\right]\geq 0,

where $\Pi_{\bot}\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3})$ satisfies $\Pi+\Pi_{\bot}=\mathbb{I}$ , and the last inequality can be verified in the same way as $T\geq 0$ , with $\Pi_{\bot}$ in place of $\Pi$ .

Next, we show that for any $T\in\mathbf{T}(\mathcal{H}_{1}:\mathcal{H}_{2})$ , there exist $\Phi\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{3}\right)$ and $\Pi\in\mathbf{Proj}(\mathcal{H}_{2}\otimes\mathcal{H}_{3})$ such that $f_{T}=g_{\Phi,\Pi}$ as follows. Let $T\leq\rho_{1}\otimes\mathbb{I}_{2}$ , $\hat{\Phi}\in\mathbf{P}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{1^{\prime}}\right)$ be a purification of $\rho_{1}^{T}$ , its singular value decomposition be $|{\hat{\Phi}}\rangle=\sum_{i}\sqrt{p(i)}|{x_{i}}\rangle_{1}|{y_{i}}\rangle_{1^{\prime}}$ ( $p(i)>0$ ), and $P\in\mathbf{Pos}\left(\mathcal{H}_{2}\otimes\mathcal{H}_{1^{\prime}}\right)$ be $P=XTX^{\dagger}$ , where $X=\sum_{i}\frac{1}{\sqrt{p(i)}}|{y_{i}}\rangle_{1^{\prime}}\langle{x_{i}^{*}}|_{1}$ and $|{\phi^{*}}\rangle$ is the complex conjugate of $|{\phi}\rangle$ . We can then verify that

(64)

f_{T}(\Xi)=\text{tr}\left[J(\Xi)T\right]=\text{tr}\left[(J(\Xi)\otimes\mathbb{I}_{1^{\prime}})(\hat{\Phi}^{T_{1}}\otimes\mathbb{I}_{2})(\mathbb{I}_{1}\otimes P)\right]=\text{tr}\left[\Xi\otimes id_{\mathcal{H}_{1^{\prime}}}(\hat{\Phi})P\right].

Since $P\leq X(\rho_{1}\otimes\mathbb{I}_{2})X^{\dagger}\leq\mathbb{I}_{1^{\prime}2}$ , $\{P,\mathbb{I}-P\}$ is a positive operator-valued measure (POVM). By Naimark’s extension, we can embed $\hat{\Phi}$ and $\{P,\mathbb{I}-P\}$ in a larger Hilbert space as a pure state $\Phi$ and a projection-valued measure (PVM) $\{\Pi,\Pi_{\bot}\}$ , respectively, which completes the proof.

Appendix B Formal SDPs and their strong duality

A formal SDP to compute $\frac{1}{2}\left\|\mathcal{A}-\mathcal{B}\right\|_{\diamond}$ is defined with a triple $(\Xi,A,B)$ such that

(65)

\displaystyle A=\left(\begin{matrix}J(\mathcal{A}-\mathcal{B})&0&0\\ 0&0&0\\ 0&0&0\end{matrix}\right),\ \ B=\left(\begin{matrix}0&0\\ 0&1\end{matrix}\right),\ \ \Xi\left(\left(\begin{matrix}T&*&*\\ *&T^{\prime}&*\\ *&*&\rho\end{matrix}\right)\right)=\left(\begin{matrix}T+T^{\prime}-\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}&0\\ 0&\text{tr}\left[\rho\right]\end{matrix}\right)

holds for any linear operators $T,T^{\prime}\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right)$ and $\rho\in\mathbf{L}\left(\mathcal{H}_{1}\right)$ , where the asterisks in the argument to $\Xi$ represent arbitrary linear operators upon which $\Xi$ does not depend, and we identify a linear operator and its matrix representation with respect to a fixed orthonormal basis. The dual problem is obtained by observing that the adjoint of $\Xi$ satisfies

(66)

\Xi^{\dagger}\left(\left(\begin{matrix}S&*\\ *&r\\ \end{matrix}\right)\right)=\left(\begin{matrix}S&0&0\\ 0&S&0\\ 0&0&r\mathbb{I}_{\mathcal{H}_{1}}-\text{tr}_{\mathcal{H}_{2}}\left[S\right]\end{matrix}\right)

for any linear operator $S\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right)$ and any complex number $r\in\mathbb{C}$ . We can verify the strong duality of this SDP by observing $\Xi\left(\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}}{\dim\mathcal{H}_{1}}\right)=B$ and applying Slater’s theorem.
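The adjoint relation between Eqs. (65) and (66) and the feasible point used in the strong-duality argument can be checked numerically. The following Python sketch does so for random operators; the dimensions $\dim\mathcal{H}_{1}=2$ and $\dim\mathcal{H}_{2}=3$ and all function names are hypothetical choices, and only the diagonal blocks on which $\Xi$ and $\Xi^{\dagger}$ depend are represented.

```python
import numpy as np

d1, d2 = 2, 3                      # hypothetical dimensions of H_1 and H_2
rng = np.random.default_rng(1)

def rand_op(n):
    """Random complex n x n matrix."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def ptrace2(S):
    """Partial trace over H_2 of an operator on H_1 (x) H_2."""
    return np.einsum("ikjk->ij", S.reshape(d1, d2, d1, d2))

def Xi(T, Tp, rho):
    """Diagonal blocks of the right-hand side of Eq. (65)."""
    return T + Tp - np.kron(rho, np.eye(d2)), np.trace(rho)

def Xi_dag(S, r):
    """Diagonal blocks of the right-hand side of Eq. (66)."""
    return S, S, r * np.eye(d1) - ptrace2(S)

def hs(A, B):
    """Hilbert-Schmidt inner product tr[A^dagger B]."""
    return np.trace(A.conj().T @ B)

T, Tp, S = rand_op(d1 * d2), rand_op(d1 * d2), rand_op(d1 * d2)
rho, r = rand_op(d1), complex(rng.normal(), rng.normal())

X1, x2 = Xi(T, Tp, rho)
Y1, Y2, Y3 = Xi_dag(S, r)
lhs = hs(S, X1) + np.conj(r) * x2              # <Y, Xi(X)>
rhs = hs(Y1, T) + hs(Y2, Tp) + hs(Y3, rho)     # <Xi^dagger(Y), X>

# Positive definite point used in the strong-duality argument.
I12 = np.eye(d1 * d2)
block, scalar = Xi(I12 / (2 * d1), I12 / (2 * d1), np.eye(d1) / d1)
```

Here `lhs` and `rhs` agree up to rounding, `block` is the zero matrix, and `scalar` equals 1, so the image of the chosen point is $B$ .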

A formal SDP shown in Proposition 3.1 is defined with a triple $(\Xi,A,B)$ such that

(67)

A=\left(\begin{matrix}J(\mathcal{A})&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&-1\end{matrix}\right),

(68)

B=\left(\begin{matrix}0&0&0\\ 0&1&0\\ 0&0&0\end{matrix}\right),

(69)

\Xi^{\dagger}\left(\left(\begin{matrix}S&&\\ &r&\\ &&P\end{matrix}\right)\right)=\left(\begin{matrix}S+\sum_{x\in X}P(x)J(\mathcal{B}_{x})&0&0&0&0\\ 0&S&0&0&0\\ 0&0&r\mathbb{I}_{\mathcal{H}_{1}}-\text{tr}_{\mathcal{H}_{2}}\left[S\right]&0&0\\ 0&0&0&P&0\\ 0&0&0&0&-\text{tr}\left[P\right]\end{matrix}\right)

holds for any linear operators $S\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right)$ , $P\in\mathbf{L}\left(\mathbb{C}^{|X|}\right)$ and any complex number $r\in\mathbb{C}$ , where $P(x)$ represents a diagonal element $\langle{x}|P|{x}\rangle$ . The primal problem is obtained by observing that the adjoint of $\Xi^{\dagger}$ satisfies

(70)

\Xi\left(\left(\begin{matrix}T&*&*&*&*\\ *&T^{\prime}&*&*&*\\ *&*&\rho&*&*\\ *&*&*&Q&*\\ *&*&*&*&t\end{matrix}\right)\right)=\left(\begin{matrix}T+T^{\prime}-\rho\otimes\mathbb{I}_{\mathcal{H}_{2}}&0&0\\ 0&\text{tr}\left[\rho\right]&0\\ 0&0&\sum_{x\in X}\text{tr}\left[J(\mathcal{B}_{x})T\right]|{x}\rangle\langle{x}|+Q-t\mathbb{I}_{\mathbb{C}^{|X|}}\end{matrix}\right)

for any linear operators $T,T^{\prime}\in\mathbf{L}\left(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\right)$ , $\rho\in\mathbf{L}\left(\mathcal{H}_{1}\right)$ , $Q\in\mathbf{L}\left(\mathbb{C}^{|X|}\right)$ and any complex number $t\in\mathbb{C}$ . We can verify the strong duality of this SDP by observing $\Xi\left(\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}\otimes\mathbb{I}_{\mathcal{H}_{2}}}{2\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathcal{H}_{1}}}{\dim\mathcal{H}_{1}}\oplus\frac{\mathbb{I}_{\mathbb{C}^{|X|}}}{2}\oplus 1\right)=B$ and applying Slater’s theorem.

Appendix C Sharpness of approximation error bounds

In this section, we make the same assumption $d\geq 2$ as in Lemmas 4.1 and 4.2.

C.1. Lower bounds

To show the sharpness of the lower bounds in Ineqs. (14) and (1), we consider a set $\{\Upsilon_{x}\}_{x\in X}:=\{\Upsilon:\exists W\in\mathbf{W}_{\epsilon}^{(d)},\Upsilon(\rho)=W\rho W^{\dagger}\}$ of unitary transformations, where

(71)

\displaystyle\mathbf{W}_{\epsilon}^{(d)}:=\left\{W:W\in U(d)\wedge\min_{z\in\text{conv}\left(\lambda(W)\right)}|z|\leq\sqrt{1-\epsilon^{2}}\right\}\ \ {\rm with}\ \ \epsilon\in[0,1]\ {\rm and}\ d\geq 2,

where $U(d)$ represents the set of unitary operators acting on $\mathbb{C}^{d}$ , $\lambda(W)$ represents the set of eigenvalues of $W$ , and $\text{conv}\left(X\right)$ represents the convex hull of a subset $X$ in a vector space. To be precise, the two lower bounds are not directly applicable to $\{\Upsilon_{x}\}_{x\in X}$ since the size $|X|$ of the set is infinite. However, the compactness of the set of unitary transformations on a finite-dimensional Hilbert space enables us to extend Ineqs. (14) and (1) for $|X|=\infty$ by replacing $\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}$ and $\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}$ with $\inf_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}$ and $\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|\Upsilon-\Lambda\right\|_{\diamond}$ , respectively.

We show that this example achieves the lower bounds in the extended inequalities with a target unitary transformation $\Upsilon=id$ in Ineq. (14). This also indicates that there exists a finite subset $\{\Upsilon_{x}\}_{x\in\tilde{X}}$ of $\{\Upsilon_{x}\}_{x\in X}$ such that $\min_{p}\frac{1}{2}\left\|id-\sum_{x\in\tilde{X}}p(x)\Upsilon_{x}\right\|_{\diamond}$ and $\max_{\Upsilon}\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in\tilde{X}}p(x)\Upsilon_{x}\right\|_{\diamond}$ are arbitrarily close to their respective lower bounds in Ineqs. (14) and (1). To show this, it suffices to let $\{\Upsilon_{x}\}_{x\in\tilde{X}}$ be an $\tilde{\epsilon}$ -covering of $\{\Upsilon_{x}\}_{x\in X}$ with sufficiently small $\tilde{\epsilon}$ . Thus, the sharpness of the lower bounds in the extended inequalities implies that of the lower bounds in the original inequalities. Note that we can show the sharpness of Ineq. (14) when $\Upsilon$ is not the identity transformation by replacing $\{\Upsilon_{x}\}_{x\in X}$ with $\{\Upsilon\circ\Upsilon_{x}\}_{x\in X}$ .

First, by using Eq. (19), we obtain

(72)

\max_{\Upsilon}\inf_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\geq\inf_{x\in X}\frac{1}{2}\left\|id-\Upsilon_{x}\right\|_{\diamond}=\sqrt{1-\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho W\right]\right|^{2}}=\sqrt{1-\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\min_{z\in\text{conv}\left(\lambda(W)\right)}|z|^{2}}=\epsilon.

Second, by using the extended version of Eq. (31), we obtain

(73)

\displaystyle\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|id-\Lambda\right\|_{\diamond}\leq\max_{\Upsilon}\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|\Upsilon-\Lambda\right\|_{\diamond}=1-\min_{V\in U(d),\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\left|\text{tr}\left[\rho V^{\dagger}W\right]\right|^{2}.

In the following, we show that for any $V\in U(d)$ and $\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)$ ,

(74)

\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\left|\text{tr}\left[\rho V^{\dagger}W\right]\right|^{2}\geq\left(1-\frac{2\delta}{d}\right)^{2}\ {\rm with}\ \delta=1-\sqrt{1-\epsilon^{2}},

which is sufficient to verify that $\{\Upsilon_{x}\}_{x\in X}$ achieves lower bounds in the extended Ineqs. (14) and (1).

Let the diagonalization of $V$ be $V=\sum_{i=1}^{d}\lambda_{i}(V)|{i}\rangle\langle{i}|$ . Since $\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}|\text{tr}\left[\rho V^{\dagger}W\right]|^{2}=|\text{tr}\left[\rho V^{\dagger}V\right]|^{2}=1$ if $\min_{z\in\text{conv}\left(\lambda(V)\right)}|z|\leq\sqrt{1-\epsilon^{2}}$ , we assume $\epsilon>0$ and $\min_{z\in\text{conv}\left(\lambda(V)\right)}|z|>\sqrt{1-\epsilon^{2}}$ . We can then define $\{W^{(ij)}\in\mathbf{W}_{\epsilon}^{(d)}\}_{1\leq i<j\leq d}$ as

(75)

W^{(ij)}:=\sum_{k\notin\{i,j\}}\lambda_{k}(V)|{k}\rangle\langle{k}|+\lambda_{+}^{(ij)}|{i}\rangle\langle{i}|+\lambda_{-}^{(ij)}|{j}\rangle\langle{j}|,

(76)

{\rm where}\ \lambda_{\pm}^{(ij)}=\sqrt{1-\epsilon^{2}}\frac{\lambda_{i}(V)+\lambda_{j}(V)}{|\lambda_{i}(V)+\lambda_{j}(V)|}\pm\epsilon\frac{\lambda_{i}(V)-\lambda_{j}(V)}{|\lambda_{i}(V)-\lambda_{j}(V)|}\ \ {\rm if}\ \lambda_{i}(V)\neq\lambda_{j}(V),

(77)

{\rm and}\ \lambda_{\pm}^{(ij)}=\sqrt{1-\epsilon^{2}}\lambda_{i}(V)\pm i\epsilon\lambda_{i}(V)\ \ {\rm if}\ \lambda_{i}(V)=\lambda_{j}(V).

(See geometric positions of eigenvalues in the complex plane shown in Fig. 4.) Note that we can easily verify that $|\lambda_{\pm}^{(ij)}|=1$ and $\left|\frac{1}{2}\left(\lambda_{+}^{(ij)}+\lambda_{-}^{(ij)}\right)\right|=\sqrt{1-\epsilon^{2}}$ , which guarantees $W^{(ij)}\in\mathbf{W}_{\epsilon}^{(d)}$ . Moreover, we can verify that $\lambda\left(V^{\dagger}W^{(ij)}\right)=\{1,z^{(ij)},z^{(ij)*}\}$ with a unit complex number $z^{(ij)}$ satisfying $\text{Re}\left[z^{(ij)}\right]\geq\sqrt{1-\epsilon^{2}}$ . Then, for any $V\in U(d)$ and $\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)$ , the left hand side of Ineq. (74) can be bounded as

(78)

\sup_{W\in\mathbf{W}_{\epsilon}^{(d)}}\left|\text{tr}\left[\rho V^{\dagger}W\right]\right|^{2}\geq\max_{1\leq i<j\leq d}\left|\text{tr}\left[\rho V^{\dagger}W^{(ij)}\right]\right|^{2}\geq\min_{p}\max_{1\leq i<j\leq d}\left|\sum_{k\notin\{i,j\}}p(k)+p(i)z^{(ij)}+p(j)z^{(ij)*}\right|^{2}

(79)

\geq\min_{p}\max_{1\leq i<j\leq d}\left\{\sum_{k\notin\{i,j\}}p(k)+(p(i)+p(j))\text{Re}\left[z^{(ij)}\right]\right\}^{2}

(80)

\geq\min_{p}\max_{1\leq i<j\leq d}\left\{\sum_{k\notin\{i,j\}}p(k)+(p(i)+p(j))\sqrt{1-\epsilon^{2}}\right\}^{2}

(81)

\geq\min_{p}\max_{1\leq i<j\leq d}\left\{1-\delta(p(i)+p(j))\right\}^{2}\geq\left(1-\frac{2\delta}{d}\right)^{2}.

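The last inequality holds because the smallest pair sum $\min_{1\leq i<j\leq d}(p(i)+p(j))$ is at most the average pair sum $\frac{2}{d}$ . This completes the proof.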
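The properties of $\lambda_{\pm}^{(ij)}$ claimed above can be checked numerically for a single pair of eigenvalues. The following Python sketch implements Eqs. (76) and (77); the function name `lambda_pm` and the sample values $\epsilon=0.3$ , $\lambda_{i}(V)=e^{0.4i}$ and $\lambda_{j}(V)=e^{0.7i}$ are hypothetical and chosen so that the midpoint of the two eigenvalues has modulus larger than $\sqrt{1-\epsilon^{2}}$ , as assumed in the proof.

```python
import numpy as np

def lambda_pm(li, lj, eps):
    """Replacement eigenvalues of Eqs. (76)-(77) for unit complex li, lj."""
    c = np.sqrt(1.0 - eps ** 2)
    if np.isclose(li, lj):
        return c * li + 1j * eps * li, c * li - 1j * eps * li   # Eq. (77)
    s, d = li + lj, li - lj
    return (c * s / abs(s) + eps * d / abs(d),
            c * s / abs(s) - eps * d / abs(d))                   # Eq. (76)

eps = 0.3
c = np.sqrt(1.0 - eps ** 2)
li, lj = np.exp(0.4j), np.exp(0.7j)    # hypothetical eigenvalues of V
lam_p, lam_m = lambda_pm(li, lj, eps)

z = np.conj(li) * lam_p        # eigenvalue of V^dagger W^(ij) on |i>
z_star = np.conj(lj) * lam_m   # eigenvalue of V^dagger W^(ij) on |j>

# Degenerate case lambda_i(V) = lambda_j(V), Eq. (77).
deg_p, deg_m = lambda_pm(li, li, eps)
```

In this example `lam_p` and `lam_m` have unit modulus, their midpoint has modulus $\sqrt{1-\epsilon^{2}}$ , and `z_star` is the complex conjugate of `z` with real part no smaller than $\sqrt{1-\epsilon^{2}}$ .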

C.2. Upper bound

To show the sharpness of the upper bound in Ineq. (1), we consider a set $\{\Upsilon_{x}\}_{x\in X}:=\{\Upsilon:\exists V\in\mathbf{V}_{\epsilon}^{(d)},\Upsilon(\rho)=V\rho V^{\dagger}\}$ of unitary transformations, where

(82)

\mathbf{V}_{\epsilon}^{(d)}:=\left\{\left(\begin{matrix}1&0\\ 0&V_{1}\end{matrix}\right)\left(\begin{matrix}W&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&V_{2}\end{matrix}\right):V_{1},V_{2}\in U(d-1),W\in\mathbf{R}_{\epsilon}\right\},

(83)

\mathbf{R}_{\epsilon}:=\left\{\left(\begin{matrix}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{matrix}\right):0\leq\theta\leq\arccos(\epsilon)\right\}\ \ {\rm with}\ \ \epsilon\in[0,1]\ {\rm and}\ d\geq 2.

Here $\mathbb{I}_{d}$ represents the $d\times d$ identity matrix, and we identify a unitary operator and its matrix representation with respect to a fixed orthonormal basis $\{|{i}\rangle\}_{i=0}^{d-1}$ . Since $|X|=\infty$ , we show the sharpness of the upper bound in the extended Ineq. (1), which is defined in the proof of the sharpness of the lower bounds. Note that

(84)

\forall U\in U(d),\exists\alpha\in\mathbb{R},\exists V\in\mathbf{V}_{0}^{(d)},U=e^{i\alpha}V

holds. This can be verified from the following three observations: First, by letting $U|{i}\rangle=|{e_{i}}\rangle$ , there exist $\tilde{V}_{1},\tilde{V}_{2}\in U(d-1)$ and $\tilde{W}\in U(2)$ such that $\left(\begin{matrix}\tilde{W}^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&\tilde{V}_{1}^{\dagger}\end{matrix}\right)|{e_{0}}\rangle=|{0}\rangle$ and $\left(\begin{matrix}1&0\\ 0&\tilde{V}_{2}^{\dagger}\end{matrix}\right)\left(\begin{matrix}\tilde{W}^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&\tilde{V}_{1}^{\dagger}\end{matrix}\right)|{e_{i}}\rangle=|{i}\rangle$ for all $i$ . Second, for any $\tilde{W}\in U(2)$ , there exist $\alpha,\beta,\gamma\in\mathbb{R}$ and $W\in\mathbf{R}_{0}$ such that $\tilde{W}=e^{i\alpha}\left(\begin{matrix}1&0\\ 0&e^{i\beta}\end{matrix}\right)W\left(\begin{matrix}1&0\\ 0&e^{i\gamma}\end{matrix}\right)$ . Third, by letting $V_{1}=\tilde{V}_{1}\left(\begin{matrix}e^{i\beta}&0\\ 0&e^{-i\alpha}\mathbb{I}_{d-2}\end{matrix}\right)$ and $V_{2}=\left(\begin{matrix}e^{i\gamma}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\tilde{V}_{2}$ , we can verify $U=e^{i\alpha}\left(\begin{matrix}1&0\\ 0&V_{1}\end{matrix}\right)\left(\begin{matrix}W&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&V_{2}\end{matrix}\right)$ .

First, by using Eq. (19), we obtain

(85)	$\displaystyle\max_{\Upsilon}\inf_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}$	$\displaystyle=$	$\displaystyle\sqrt{1-\min_{U\in U(d)}\sup_{V\in\mathbf{V}_{\epsilon}^{(d)}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho U^{\dagger}V\right]\right|^{2}}$
(86)		$\displaystyle=$	$\displaystyle\sqrt{1-\min_{W\in\mathbf{R}_{0}}\sup_{V\in\mathbf{V}_{\epsilon}^{(d)}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho\left(\begin{matrix}W^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)V\right]\right|^{2}}$
(87)		$\displaystyle\leq$	$\displaystyle\sqrt{1-\min_{W\in\mathbf{R}_{0}}\sup_{W^{\prime}\in\mathbf{R}_{\epsilon}}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho\left(\begin{matrix}W^{\dagger}W^{\prime}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\right]\right|^{2}}$
(88)		$\displaystyle=$	$\displaystyle\sqrt{1-\min_{0\leq\theta\leq\frac{\pi}{2}}\sup_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}\min_{\rho\in\mathbf{S}\left(\mathbb{C}^{d}\right)}\left|\text{tr}\left[\rho\left(\begin{matrix}\cos(\theta^{\prime}-\theta)&-\sin(\theta^{\prime}-\theta)&0\\ \sin(\theta^{\prime}-\theta)&\cos(\theta^{\prime}-\theta)&0\\ 0&0&\mathbb{I}_{d-2}\end{matrix}\right)\right]\right|^{2}}$
(89)		$\displaystyle=$	$\displaystyle\sqrt{1-\min_{0\leq\theta\leq\frac{\pi}{2}}\sup_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}\cos^{2}(\theta^{\prime}-\theta)}$
(90)		$\displaystyle=$	$\displaystyle\max_{0\leq\theta\leq\frac{\pi}{2}}\inf_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}\left|\sin(\theta^{\prime}-\theta)\right|=\epsilon,$

where we use Eq. (84) in the second equality and use $\lambda\left(\left(\begin{matrix}\cos(\theta^{\prime}-\theta)&-\sin(\theta^{\prime}-\theta)&0\\ \sin(\theta^{\prime}-\theta)&\cos(\theta^{\prime}-\theta)&0\\ 0&0&\mathbb{I}_{d-2}\end{matrix}\right)\right)=\{1,e^{\pm i(\theta^{\prime}-\theta)}\}$ in the fourth equality.
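The closing minimax in Eqs. (89) and (90) can be checked numerically. The following Python sketch is only an illustration and is not part of the proof; the function name `minimax_sin` and the grid size `n` are assumptions introduced here. It evaluates $\max_{\theta}\inf_{\theta^{\prime}}|\sin(\theta^{\prime}-\theta)|$ on a grid that contains the endpoints $\theta=\pi/2$ and $\theta^{\prime}=\arccos(\epsilon)$ , where the optimum is attained.

```python
import math

def minimax_sin(eps, n=400):
    """Grid estimate of max over theta in [0, pi/2] of
    min over theta' in [0, arccos(eps)] of |sin(theta' - theta)|."""
    upper = math.acos(eps)
    # Both grids include their endpoints (k = 0 and k = n).
    thetas = [0.5 * math.pi * k / n for k in range(n + 1)]
    primes = [upper * k / n for k in range(n + 1)]
    return max(min(abs(math.sin(tp - t)) for tp in primes) for t in thetas)

for eps in (0.1, 0.3, 0.7):
    # The second printed value should agree with eps up to rounding.
    print(eps, minimax_sin(eps))
```

The worst target angle is $\theta=\pi/2$ and the best available angle is $\theta^{\prime}=\arccos(\epsilon)$ , which gives $\sin(\pi/2-\arccos(\epsilon))=\epsilon$ .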

Second, by using the definition of the diamond norm, we obtain

(91)	$\displaystyle\max_{\Upsilon}\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\frac{1}{2}\left\|\Upsilon-\Lambda\right\|_{\diamond}$	$\displaystyle\geq$	$\displaystyle\max_{\Upsilon}\inf_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}\left\|\Upsilon(|{0}\rangle\langle{0}|)-\Lambda(|{0}\rangle\langle{0}|)\right\|_{\text{tr}}$
(92)		$\displaystyle\geq$	$\displaystyle 1-\min_{\Upsilon}\sup_{\Lambda\in\text{conv}\left(\{\Upsilon_{x}\}_{x\in X}\right)}F\left(\Upsilon(|{0}\rangle\langle{0}|),\Lambda(|{0}\rangle\langle{0}|)\right)$
(93)		$\displaystyle=$	$\displaystyle 1-\min_{\Upsilon}\sup_{x\in X}F\left(\Upsilon(|{0}\rangle\langle{0}|),\Upsilon_{x}(|{0}\rangle\langle{0}|)\right)$
(94)		$\displaystyle=$	$\displaystyle 1-\min_{U\in U(d)}\sup_{V\in\mathbf{V}_{\epsilon}^{(d)}}\left|\langle{0}|U^{\dagger}V|{0}\rangle\right|^{2}$
(95)		$\displaystyle=$	$\displaystyle 1-\min_{W\in\mathbf{R}_{0}}\sup_{\begin{subarray}{c}W^{\prime}\in\mathbf{R}_{\epsilon}\\ V\in U(d-1)\end{subarray}}\left|\langle{0}|\left(\begin{matrix}W^{\dagger}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)\left(\begin{matrix}1&0\\ 0&V\end{matrix}\right)\left(\begin{matrix}W^{\prime}&0\\ 0&\mathbb{I}_{d-2}\end{matrix}\right)|{0}\rangle\right|^{2}$
(96)		$\displaystyle=$	$\displaystyle 1-\min_{0\leq\theta\leq\frac{\pi}{2}}\sup_{\begin{subarray}{c}0\leq\theta^{\prime}\leq\arccos(\epsilon)\\ V\in U(d-1)\end{subarray}}\left|\cos\theta\cos\theta^{\prime}+\sin\theta\sin\theta^{\prime}\langle{1}|V|{1}\rangle\right|^{2}$
(97)		$\displaystyle=$	$\displaystyle\max_{0\leq\theta\leq\frac{\pi}{2}}\inf_{0\leq\theta^{\prime}\leq\arccos(\epsilon)}\sin^{2}(\theta^{\prime}-\theta)=\epsilon^{2},$

where we use $\left\|\phi-\rho\right\|_{\text{tr}}=\max_{\Pi\in\mathbf{Proj}\left(\mathcal{H}\right)}\text{tr}\left[\Pi(\phi-\rho)\right]\geq 1-\text{tr}\left[\phi\rho\right]$ in the second inequality and use Eq. (84) in the third equality. This and the extended upper bound in Ineq. (1) complete the proof.

Appendix D Numerical experiment for actual approximation errors

Recall that Theorem 4.3 establishes the relationship among three quantities: the actual approximation error $\epsilon_{\Upsilon}^{prob}$ of the probabilistic approximation, the actual approximation error $\epsilon_{\Upsilon}$ of the deterministic approximation, and the worst-case approximation error $\epsilon$ , where each approximation error is defined by

(98)	$\displaystyle\epsilon_{\Upsilon}^{prob}$	$\displaystyle:=$	$\displaystyle\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}$
(99)	$\displaystyle\epsilon_{\Upsilon}$	$\displaystyle:=$	$\displaystyle\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}$
(100)	$\displaystyle\epsilon$	$\displaystyle:=$	$\displaystyle\max_{\Upsilon}\min_{x\in X}\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}.$

By using these notations, we can rewrite the relationship established in Theorem 4.3 as follows:

(101)

\frac{2\epsilon_{\Upsilon}^{2}}{d}\simeq\frac{4\delta_{\Upsilon}}{d}\left(1-\frac{\delta_{\Upsilon}}{d}\right)\leq\epsilon_{\Upsilon}^{prob}\leq\epsilon^{2},

where $\delta_{\Upsilon}=1-\sqrt{1-\epsilon_{\Upsilon}^{2}}$ and the approximation becomes tighter when $\epsilon_{\Upsilon}\rightarrow 0$ .
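To see how quickly the approximation $\frac{2\epsilon_{\Upsilon}^{2}}{d}\simeq\frac{4\delta_{\Upsilon}}{d}\left(1-\frac{\delta_{\Upsilon}}{d}\right)$ in Eq. (101) tightens, the two sides can be compared numerically. The following Python sketch is illustrative; the function name `lower_bound` and the sample values of $\epsilon_{\Upsilon}$ are assumptions chosen here, not values from the experiments.

```python
import math

def lower_bound(eps_det, d):
    """Lower bound (4*delta/d)*(1 - delta/d) of Eq. (101),
    with delta = 1 - sqrt(1 - eps_det**2)."""
    delta = 1.0 - math.sqrt(1.0 - eps_det ** 2)
    return 4.0 * delta / d * (1.0 - delta / d)

for d in (2, 3, 4):
    for eps_det in (0.3, 0.1, 0.01):  # hypothetical deterministic errors
        exact = lower_bound(eps_det, d)
        approx = 2.0 * eps_det ** 2 / d
        # The ratio exact/approx approaches 1 as eps_det decreases.
        print(d, eps_det, exact, approx, exact / approx)
```

For $d=2$ the two expressions coincide exactly, since $2\delta_{\Upsilon}-\delta_{\Upsilon}^{2}=1-(1-\delta_{\Upsilon})^{2}=\epsilon_{\Upsilon}^{2}$ .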

While these inequalities are tight, it is helpful to know how small $\epsilon_{\Upsilon}^{prob}$ is in practice compared to $\epsilon_{\Upsilon}^{2}$ . In this appendix, we perform numerical experiments to demonstrate that $\epsilon_{\Upsilon}^{prob}$ is comparable to $\epsilon_{\Upsilon}^{2}$ for randomly chosen target unitary transformations $\Upsilon$ . Moreover, we demonstrate that $\epsilon_{\Upsilon}^{prob}$ becomes smaller than $\epsilon_{\Upsilon}^{2}$ for higher-dimensional unitary transformations, i.e., the probabilistic approximation reduces the approximation error more than quadratically.
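A standard single-qubit example, which is not one of the experiments reported in this appendix, illustrates why $\epsilon_{\Upsilon}^{2}$ is the natural benchmark. Assume, for illustration only, that the target $\Upsilon$ is the identity channel $\mathrm{id}$ and that the only available unitaries are $R_{\pm}=\mathrm{diag}(e^{\mp i\phi/2},e^{\pm i\phi/2})$ with $0<\phi<\pi$ . The symbols $R_{\pm}$ , $\phi$ , $\Lambda$ and $\mathcal{Z}(\rho):=Z\rho Z$ (with $Z$ the Pauli-$Z$ operator) are introduced only for this sketch.

```latex
\epsilon_{\Upsilon}
  = \frac{1}{2}\left\|\mathrm{id}-\Upsilon_{R_{\pm}}\right\|_{\diamond}
  = \sqrt{1-\cos^{2}(\phi/2)} = \sin(\phi/2),
\qquad
\Lambda := \tfrac{1}{2}\Upsilon_{R_{+}}+\tfrac{1}{2}\Upsilon_{R_{-}}
  = \cos^{2}(\phi/2)\,\mathrm{id}+\sin^{2}(\phi/2)\,\mathcal{Z},
\qquad
\frac{1}{2}\left\|\mathrm{id}-\Lambda\right\|_{\diamond}
  = \sin^{2}(\phi/2) = \epsilon_{\Upsilon}^{2}.
```

The equal mixture turns the two coherent over- and under-rotations into dephasing with probability $\sin^{2}(\phi/2)$ , so $\epsilon_{\Upsilon}^{prob}\leq\epsilon_{\Upsilon}^{2}$ in this toy setting.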

The experiments have a second purpose: to provide evidence that Lemma 5.3, which is crucial in constructing our probabilistic synthesis algorithm, also holds for qudit unitary transformations. Recall that it states that

(102)

\min_{p}\left\|\Upsilon-\sum_{x\in X}p(x)\Upsilon_{x}\right\|_{\diamond}=\min_{\hat{p}}\left\|\Upsilon-\sum_{x\in\hat{X}}\hat{p}(x)\Upsilon_{x}\right\|_{\diamond},

where $\hat{X}:=\{x\in X:\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq 2\epsilon\}$ . Our numerical experiments support this statement for randomly sampled target unitary transformations $\Upsilon:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)$ , randomly constructed $\epsilon$ -coverings $\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X}$ , and $d=3,4$ .

Setting of numerical experiments

First, we construct $\epsilon$ -coverings $\left\{\Upsilon_{x}:\mathbf{L}\left(\mathbb{C}^{d}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{d}\right)\right\}_{x\in X}$ by randomly choosing $|X|=10^{5}$ , $|X|=10^{6}$ and $|X|=10^{7}$ unitary operators for $d=2$ , $d=3$ , and $d=4$ , respectively. Here, random sampling of unitary operators means sampling according to the Haar measure on $U(d)$ . We compute a lower bound on $\epsilon$ as $\max_{i}\min_{x\in X}\frac{1}{2}\left\|\Upsilon_{i}-\Upsilon_{x}\right\|_{\diamond}$ by using $30$ randomly chosen target unitary transformations $\{\Upsilon_{i}\}_{i=1}^{30}$ . We interpret $\left\{\Upsilon_{x}\right\}_{x\in X}$ as the set of available unitary transformations in probabilistic and deterministic approximation.
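The first step can be sketched in a few lines for the qubit case. The following Python code is a minimal illustration and not the Mathematica code used for the reported experiments; the function names, the seed, and the reduced size `num_gates=2000` are assumptions made here to keep the run short. It samples from the Haar measure on $SU(2)$ through normalized Gaussian 4-vectors, which induces the same distribution on unitary transformations as the Haar measure on $U(2)$ because the global phase is irrelevant. It uses the fact, following from the formula in Eq. (85), that $\frac{1}{2}\left\|\Upsilon_{U}-\Upsilon_{V}\right\|_{\diamond}=\sqrt{1-|\text{tr}[U^{\dagger}V]|^{2}/4}$ for $d=2$ .

```python
import math
import random

def haar_su2(rng):
    """Haar-random element of SU(2) from a normalized Gaussian 4-vector."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in q))
    a, b, c, d = (x / norm for x in q)
    return ((complex(a, b), complex(c, d)),
            (complex(-c, d), complex(a, -b)))

def half_diamond_distance(U, V):
    """(1/2) * diamond norm of the difference of two qubit unitary channels."""
    # tr(U^dagger V) = sum_ij conj(U_ij) * V_ij
    t = sum(U[i][j].conjugate() * V[i][j] for i in range(2) for j in range(2))
    return math.sqrt(max(0.0, 1.0 - abs(t) ** 2 / 4.0))

def epsilon_lower_bound(num_gates=2000, num_targets=30, seed=0):
    """max over sampled targets of the distance to the nearest available gate."""
    rng = random.Random(seed)
    gates = [haar_su2(rng) for _ in range(num_gates)]
    targets = [haar_su2(rng) for _ in range(num_targets)]
    return max(min(half_diamond_distance(T, G) for G in gates) for T in targets)

print(epsilon_lower_bound())
```

For $d\geq 3$ the closed form above no longer applies; one would instead minimize $|\text{tr}[\rho U^{\dagger}V]|$ over states $\rho$ , that is, compute the distance from the origin to the convex hull of the eigenvalues of $U^{\dagger}V$ .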

Next, we randomly choose a target unitary transformation $\Upsilon$ and compute $\epsilon_{\Upsilon}$ . We define the actual approximation error caused by probabilistically mixing restricted available unitary transformations as

(103)

\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}):=\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x\in\hat{X}(\epsilon^{\prime})}p(x)\Upsilon_{x}\right\|_{\diamond}

where $\hat{X}(\epsilon^{\prime}):=\{x\in X:\frac{1}{2}\left\|\Upsilon-\Upsilon_{x}\right\|_{\diamond}\leq\epsilon^{\prime}\}$ . Note that $\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})$ is a monotonically non-increasing function of $\epsilon^{\prime}$ , since enlarging $\hat{X}(\epsilon^{\prime})$ can only lower the minimum. Moreover, if Lemma 5.3 holds for qudit unitary transformations, then $\epsilon_{\Upsilon}^{prob}(2\epsilon)=\epsilon_{\Upsilon}^{prob}(1)=\epsilon_{\Upsilon}^{prob}$ .

Third, we compute the actual approximation error caused by probabilistically mixing more restricted available unitary transformations as

(104)

\epsilon_{\Upsilon}^{prob}(N):=\min_{p}\frac{1}{2}\left\|\Upsilon-\sum_{x=1}^{N}p(x)\Upsilon_{x}\right\|_{\diamond},

where $\{\Upsilon_{x}\}_{x=1}^{N}$ is a randomly sampled subset of $\hat{X}(\epsilon^{\prime})$ and $\epsilon^{\prime}$ is chosen large enough for $\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})$ to converge.

Results of numerical experiments

In Fig. 5, we draw the graphs of $\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})$ for $10$ randomly chosen $\Upsilon$ , using a different color for each $\Upsilon$ . We can observe that $\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})$ saturates when $\epsilon^{\prime}\geq 1.4\epsilon$ and that $\epsilon_{\Upsilon}^{prob}$ is comparable to or smaller than $\epsilon_{\Upsilon}^{2}$ . The latter holds because $\frac{\log(\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime}))}{\log(\epsilon_{\Upsilon})}\geq 2\Leftrightarrow\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})\leq\epsilon_{\Upsilon}^{2}$ (note that $\log(\epsilon_{\Upsilon})<0$ ) and $\epsilon_{\Upsilon}^{prob}\leq\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})$ by definition.
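The equivalence between the logarithmic ratio and the quadratic comparison reverses the inequality because $\log(\epsilon_{\Upsilon})$ is negative. The following Python sketch makes this concrete; the function name `error_exponent` and the numerical values are hypothetical and are not data from Fig. 5.

```python
import math

def error_exponent(eps_prob, eps_det):
    """Ratio log(eps_prob) / log(eps_det) for errors in (0, 1).
    A value >= 2 is equivalent to eps_prob <= eps_det**2."""
    return math.log(eps_prob) / math.log(eps_det)

eps_det = 0.1  # hypothetical deterministic error
# 0.005 <= eps_det**2 = 0.01, so the exponent exceeds 2.
print(error_exponent(0.005, eps_det))
# 0.02 > eps_det**2 = 0.01, so the exponent is below 2.
print(error_exponent(0.02, eps_det))
```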

In Fig. 6, we draw the graph of $\epsilon_{\Upsilon}^{prob}(N)$ for a randomly chosen $\Upsilon:\mathbf{L}\left(\mathbb{C}^{4}\right)\rightarrow\mathbf{L}\left(\mathbb{C}^{4}\right)$ . Note that we plot its empirical variance and average since $\epsilon_{\Upsilon}^{prob}(N)$ is a random variable depending on the choice of a subset of $\hat{X}(\epsilon^{\prime})$ . We can observe that the approximation error $\epsilon_{\Upsilon}^{prob}(N)$ rapidly converges to its minimum $\epsilon_{\Upsilon}^{prob}(\epsilon^{\prime})$ . Therefore, we can expect that our probabilistic synthesis algorithm proposed in Theorem 5.4 can be made more efficient while still attaining an approximation error that is nearly optimal.

Environment of numerical experiments

We performed numerical experiments using Mathematica 14.0.0.0 on a MacBook Pro equipped with a 2.4GHz Intel Core i9 processor and 64GB of memory.

	$\displaystyle|{\Psi_{1}}\rangle=\frac{1}{\sqrt{2}}(|{00}\rangle+|{11}\rangle),$		$\displaystyle|{\Psi_{2}}\rangle=\frac{i}{\sqrt{2}}(|{00}\rangle-|{11}\rangle),$
(37)		$\displaystyle|{\Psi_{3}}\rangle=\frac{i}{\sqrt{2}}(|{01}\rangle+|{10}\rangle),$		$\displaystyle|{\Psi_{4}}\rangle=\frac{1}{\sqrt{2}}(|{01}\rangle-|{10}\rangle).$