This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\floatsetup

[table]capposition=top \floatsetup[figure]capposition=bottom \newfloatcommandcapbtabboxtable[][\FBwidth]

A two-stage solution to quantum process tomography: error analysis and optimal design

Shuixin Xiao, Yuanlong Wang, Jun Zhang, Daoyi Dong, Gary J. Mooney, Ian R. Petersen, and Hidehiro Yonezawa This research was supported by the Australian Research Council Future Fellowship Funding Scheme under Project FT220100656, Discovery Project Funding Scheme under Project DP210101938, the Centres of Excellence under Grant CE170100012, and the National Natural Science Foundation of China (62173229, 12288201). Shuixin Xiao is with School of Engineering and Technology, University of New South Wales, Canberra ACT 2600, Australia, and University of Michigan–Shanghai Jiao Tong University Joint Institute, Shanghai Jiao Tong University, Shanghai 200240, China, and also with School of Engineering, The Australian National University, Canberra, ACT 2601, Australia (e-mail: [email protected]). Yuanlong Wang is with Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China (e-mail: [email protected]).Jun Zhang is with University of Michigan–Shanghai Jiao Tong University Joint Institute, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: [email protected]).Daoyi Dong is with CIICADA Lab, School of Engineering, Australian National University, ACT 2601, Australia, and School of Physics, The University of Melbourne, Parkville, Victoria 3010, Australia (e-mail: [email protected]).Gary J. Mooney is with School of Physics, University of Melbourne, Parkville, Victoria 3010, Australia (e-mail: [email protected]).Ian R. Petersen is with School of Engineering, Australian National University, ACT 2601, Australia (e-mail: [email protected]). Hidehiro Yonezawa is with RIKEN Center for Quantum Computing, Japan, and School of Engineering and Technology, University of New South Wales, Canberra ACT 2600, Australia, and also with Centre for Quantum Computation and Communication Technology, Australian Research Council, Canberra, ACT 2600, Australia (e-mail: [email protected]).
Abstract

Quantum process tomography is a critical task for characterizing the dynamics of quantum systems and achieving precise quantum control. In this paper, we propose a two-stage solution for both trace-preserving and non-trace-preserving quantum process tomography. Utilizing a tensor structure, our algorithm exhibits a computational complexity of O(MLd2)O(MLd^{2}) where dd is the dimension of the quantum system and MM, LL represent the numbers of different input states and measurement operators, respectively. We establish an analytical error upper bound and then design the optimal input states and the optimal measurement operators, which are both based on minimizing the error upper bound and maximizing the robustness characterized by the condition number. Numerical examples and testing on IBM quantum devices are presented to demonstrate the performance and efficiency of our algorithm.

Index Terms:
Quantum process tomography, quantum system identification, error analysis, quantum system.

I Introduction

In the past decades, quantum science and technologies have significantly advanced in many fields such as quantum computation [1, 2], quantum communication [3, 4], quantum sensing [5] and quantum control [6, 7, 8]. To fully develop and realize these technologies, a fundamental challenge to overcome is to characterize unknown quantum dynamics. A typical framework to formulate this problem is quantum process tomography (QPT), where the parameters determining the map from input quantum states to output states need to be estimated.

For a closed quantum system, QPT is reduced to Hamiltonian identification and many results have been obtained in this scenario [9, 10, 11, 12, 13, 14]. For an open quantum system characterized by time-independent parameters, the process is completely positive (CP) and attention is usually restricted to the trace-preserving (TP) case. Using direct and convex optimization methods, Zorzi et al. [15, 16] gave the minimum positive operator-valued measure (POVM) resources required to estimate a CPTP quantum process, and Knee et al. [17] and Surawy-Stepney et al. [18] proposed iterative projection algorithms to identify such a process. The efficiency of parameter estimation for quantum processes was studied in [19]. Zhang and Sarovar [20] utilized the time traces of observable measurements to identify the parameters in a master equation. The identifiability and identification for the passive quantum systems were investigated in [21]. When the quantum process is non-trace-preserving (non-TP), Huang et al. [22] proposed a convex optimization method and Bongioanni et al. [23] demonstrated non-TP QPT based on a maximum likelihood approach in an experiment. In practice, a quantum process may be affected by non-Markovian noise. White et al. [24] developed a framework to characterize non-Markovian dynamics and the work in [25] proposed process tensor tomography for non-Markovian QPT. Moreover, adaptive strategies for QPT were proposed in [26, 27, 28].

In this paper, we focus on Standard Quantum Process Tomography (SQPT) [2, 14, 29, 30, 31] where states with the same dimension as the process are inputted, and the corresponding outputs are measured to reconstruct the process. We extend the framework in [2, 13] to allow for general input states and non-TP processes, and also directly connect the process with measurement data in a unified system equation. Using a mathematical representation, we deduce the special structural properties of the system equation, allowing for more efficient analysis, solution and optimal design. We give an analytical two-stage solution (TSS) to a general QPT problem, as a closed-form estimation which is not common among the existing QPT solutions. For dd-dimensional quantum systems, our algorithm has computational complexity O(MLd2)O(MLd^{2}) and storage requirement O(ML)O(ML), where Md2M\geq d^{2} is the number of different informationally complete or overcomplete input states and Ld2L\geq d^{2} is the number of different measurement operators. We also establish an analytical error upper bound for our algorithm. Then we study the optimization of input states and measurement operators, based on this error upper bound and the condition number of the system equation, which reflects the robustness of our algorithm. We give theorems to characterize lower bounds on the error and condition number for both input states and measurement operators. We prove that SIC (symmetric informationally complete) states and MUB (mutually unbiased bases) states [32] are two examples achieving the lower bounds for our algorithm. We also prove MUB measurement achieves the lower bounds among measurement operators. Numerical examples and testing on IBM Quantum devices demonstrate the effectiveness of our algorithm and validate the theoretical error analysis. We compare our TSS algorithm with the convex optimization method in [14] and the results show that our algorithm is more efficient in both time and space costs during calculation. The main contributions of this paper are summarized as follows.

  1. (i)

    We propose an analytical two-stage solution (TSS) to a general QPT problem. Our TSS algorithm does not require specific prior knowledge about the unknown process and can be applied in both TP and non-TP quantum processes.

  2. (ii)

    Utilizing a tensor structure, our TSS algorithm has the computational complexity O(MLd2)O(MLd^{2}) and a clear storage requirement O(ML)O(ML), resulting from our closed-form estimation formula.

  3. (iii)

    Our TSS algorithm can give an analytical error upper bound which can be further utilized to optimize the input states and the measurement operators.

  4. (iv)

    Numerical examples and testing on IBM Quantum devices demonstrate the theoretical results and the effectiveness of our method. Compared to the convex optimization method in [14], our algorithm is more efficient in both time and space costs.

The organization of this paper is as follows. Section II introduces some preliminary knowledge and establishes our framework for QPT. Section III studies the tensor structure and presents our TSS algorithm for QPT. In Section IV, we analyze the computational complexity and the storage requirement. In Section V, we show the error analysis and in Section VI, we study the optimization of the input states and the measurement operators. Numerical examples are presented in Section VII and Section VIII concludes this paper.

Notation: The ii-th row and jj-th column of a matrix XX is XijX_{ij}. The jj-th column of XX is colj(X)\operatorname{col}_{j}(X). The transpose of XX is XTX^{T}. The conjugate ()(*) and transpose of XX is XX^{\dagger}. The sets of real and complex numbers are \mathbb{R} and \mathbb{C}, respectively. The sets of dd-dimension complex vectors and d×dd\times d complex matrices are d\mathbb{C}^{d} and d×d\mathbb{C}^{d\times d}, respectively. The identity matrix is II. i=1\rm i=\sqrt{-1}. The trace of XX is Tr(X)\text{Tr}(X). The Frobenius norm of a matrix XX is denoted as X||X|| and the 2-norm of a vector xx is x||x||. We use density matrix ρ\rho to represent a quantum state where ρ=ρ\rho=\rho^{\dagger}, ρ0\rho\geq 0 and Tr(ρ)=1\operatorname{Tr}(\rho)=1 and use a unit complex vector |ψ|\psi\rangle to represent a pure state. The estimate of XX is X^\hat{X}. The inner product of two matrices XX and YY is defined as X,YTr(XY)\langle X,Y\rangle\triangleq\text{Tr}(X^{\dagger}Y). The inner product of two vectors xx and yy is defined as x,yxy\langle x,y\rangle\triangleq x^{\dagger}y. The tensor product of AA and BB is denoted ABA\otimes B. Hilbert space is \mathbb{H}. The Kronecker delta function is δ\delta. The diag(a)\operatorname{diag}(a) denotes a diagonal matrix with the ii-th diagonal element being the ii-th element of the vector aa. For any positive semidefinite Xd×dX_{d\times d} with spectral decomposition X=UPUX=UPU^{\dagger}, we define X\sqrt{X} or X12X^{\frac{1}{2}} as Udiag(P11,P22,,Pdd)UU\operatorname{diag}\left(\sqrt{P_{11}},\sqrt{P_{22}},\cdots,\sqrt{P_{dd}}\right)U^{\dagger}.

II Preliminaries and quantum process tomography

II-A Vectorization function

We introduce the (column-)vectorization function vec:m×nmn\text{vec}:\mathbb{C}^{m\times n}\mapsto\mathbb{C}^{mn}. For a matrix Xm×nX_{m\times n},

vec(Xm×n)\displaystyle\text{vec}(X_{m\times n})\triangleq [X11,X21,,Xm1,X12,,Xm2,\displaystyle[X_{11},X_{21},\cdots,X_{m1},X_{12},\cdots,X_{m2}, (1)
,X1n,,Xmn]T.\displaystyle\cdots,X_{1n},\cdots,X_{mn}]^{T}.

Thus, vec()\text{vec}(\cdot) is a linear and basis dependent map. We also define the inverse vec1()\text{vec}^{-1}(\cdot) which maps a d2×1d^{2}\times 1 vector into a d×dd\times d square matrix. The common properties of vec()\text{vec}(\cdot) are listed as follows [33, 34]:

X,Y=vec(X),vec(Y),\langle X,Y\rangle=\langle\text{vec}(X),\text{vec}(Y)\rangle, (2)
vec(XYZ)=(ZTX)vec(Y),\text{vec}(XYZ)=(Z^{T}\otimes X)\text{vec}(Y), (3)
Tr1(vec(X)vec(Y))=XY,\text{Tr}_{1}(\text{vec}(X)\text{vec}(Y)^{\dagger})=XY^{\dagger}, (4)

where Tr1(X)\text{Tr}_{1}(X)denotes the partial trace on the space 1\mathbb{H}_{1} with XX belonging to the space 12\mathbb{H}_{1}\otimes\mathbb{H}_{2}.

II-B Quantum process tomography

According to the system architecture, QPT generally can be divided into three classes: Standard Quantum Process Tomography (SQPT) [2, 14, 29, 30, 31], Ancilla-Assisted Process Tomography (AAPT) [35, 36, 37, 38] and Direct Characterization of Quantum Dynamics (DCQD) [39, 40, 41]. In SQPT, one inputs quantum states with the same dimension as the process, and then we reconstruct the process from quantum state tomography (QST) on the output states. In AAPT and DCQD, an auxiliary system (ancilla) is attached to the principal system, and the input states and the measurement of the outputs are both carried out on the extended Hilbert space. AAPT transforms QPT to QST in a larger Hilbert space and the input state can be separable [42]. DCQD requires entangled input states and measurements such that the measured probability distributions are directly related to the elements of the process representation [42]. Since AAPT needs high dimensional states in the extended Hilbert space and DCQD usually requires entanglement which are both more technologically demanding, we focus on the SQPT scenario in this paper.

Here, we briefly introduce the SQPT framework as given in [2, 13] and extend it to a more general framework based on the following two aspects:

  1. (i)

    Input Fock states are extended to be arbitrary states.

  2. (ii)

    The unknown process is extended from the TP case to both the TP and non-TP cases.

For a dd-dimensional quantum system, its dynamics can be described by a completely-positive (CP) linear map \mathcal{E}. If we input a quantum state ρin\rho^{\text{in}} (input state), using the Kraus operator-sum representation [2], the output state ρout\rho^{\text{out}} is given by

ρout =(ρin )=i=1d2𝒜iρin 𝒜i,\rho^{\text{out }}=\mathcal{E}\left(\rho^{\text{in }}\right)=\sum_{i=1}^{d^{2}}\mathcal{A}_{i}\rho^{\text{in }}\mathcal{A}_{i}^{\dagger}, (5)

where 𝒜id×d\mathcal{A}_{i}\in\mathbb{C}^{d\times d} and satisfy

i=1d2𝒜i𝒜iId.\sum_{i=1}^{d^{2}}\mathcal{A}_{i}^{\dagger}\mathcal{A}_{i}\leq I_{d}. (6)

When the equality in (6) holds, the map \mathcal{E} is said to be trace-preserving (TP). Otherwise, it is non-trace-preserving (non-TP). Let {Ei}i=1d2\{E_{i}\}_{i=1}^{d^{2}} be a fixed basis of d×d\mathbb{C}^{d\times d}. From {𝒜i}i=1d2\{\mathcal{A}_{i}\}_{i=1}^{d^{2}}, we can obtain process matrix XX [2, 13] which is a d2×d2d^{2}\times d^{2} Hermitian and positive semidefinite matrix such that

(ρin)=j,k=1d2EjρinEkXjk.\mathcal{E}(\rho^{\text{in}})=\sum_{j,k=1}^{d^{2}}E_{j}\rho^{\text{in}}E_{k}^{\dagger}X_{jk}. (7)

In this paper, we consider both TP and non-TP quantum processes and (6) becomes

j,k=1d2XjkEkEjId.\sum_{j,k=1}^{d^{2}}X_{jk}E_{k}^{\dagger}E_{j}\leq I_{d}. (8)

Since the relationship between XX and \mathcal{E} is one-to-one [2, 13], the identification of \mathcal{E} is equivalent to identifying XX. The number of independent elements in XX is d4d2d^{4}-d^{2} for the TP case and d4d^{4} for the non-TP case.

Let {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} be a complete basis set of d×d\mathbb{C}^{d\times d}. Denote the set of all the input states as {ρmin}m=1M\{\rho_{m}^{\text{in}}\}_{m=1}^{M}. We can expand each output state uniquely in {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} as

ρmout=(ρmin)=n=1d2αmnσn.\rho_{m}^{\text{out}}=\mathcal{E}(\rho_{m}^{\text{in}})=\sum_{n=1}^{d^{2}}\alpha_{mn}\sigma_{n}. (9)

We also define βmnjk\beta_{mn}^{jk} such that

EjρminEk=n=1d2βmnjkσn.E_{j}\rho_{m}^{\text{in}}E_{k}^{\dagger}=\sum_{n=1}^{d^{2}}\beta_{mn}^{jk}\sigma_{n}. (10)

From the linear independence of {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}}, the relationship between XX and α\alpha is [2, 13]

j,k=1d2βmnjkXjk=αmn.\sum_{j,k=1}^{d^{2}}\beta_{mn}^{jk}X_{jk}=\alpha_{mn}. (11)

To guarantee that (11) has a unique solution, the maximal linear independent subset of {ρmin}m=1M\{\rho_{m}^{\text{in}}\}_{m=1}^{M} must have d2d^{2} elements. If this is achieved with M=d2M=d^{2}, we call the input state set informationally complete. If this is achieved with M>d2M>d^{2}, we call it informationally overcomplete. The work in [2, 13] is restricted to a special complete case, where {ρmin}m=1M={|jk|}j,k=1d\{\rho_{m}^{\text{in}}\}_{m=1}^{M}=\{|j\rangle\langle k|\}_{j,k=1}^{d} (after a linear combination of the practical experiment results). Here we extend the framework of [2, 13] to the general informationally complete/overcomplete case with Md2M\geq d^{2}. We define the matrix AA where AmnαmnA_{mn}\triangleq\alpha_{mn} and arrange the elements βmnjk\beta_{mn}^{jk} into a matrix BB:

B=[β1111β1121β1112β1122β11d2d2β2111β2121β2112β2122β21d2d2β1211β1221β1212β1222β12d2d2β2211β2221β2212β2222β22d2d2βMd211βMd221βMd212βMd222βMd2d2d2]B=\left[\begin{array}[]{*{7}{c}}\beta_{11}^{11}&\beta_{11}^{21}&\cdots&\beta_{11}^{12}&\beta_{11}^{22}&\cdots&\beta_{11}^{d^{2}d^{2}}\\ \beta_{21}^{11}&\beta_{21}^{21}&\cdots&\beta_{21}^{12}&\beta_{21}^{22}&\cdots&\beta_{21}^{d^{2}d^{2}}\\ \lx@intercol\dotfill\hfil\hfil\lx@intercol\\ \beta_{12}^{11}&\beta_{12}^{21}&\cdots&\beta_{12}^{12}&\beta_{12}^{22}&\cdots&\beta_{12}^{d^{2}d^{2}}\\ \beta_{22}^{11}&\beta_{22}^{21}&\cdots&\beta_{22}^{12}&\beta_{22}^{22}&\cdots&\beta_{22}^{d^{2}d^{2}}\\ \lx@intercol\dotfill\hfil\hfil\lx@intercol\\ \beta_{Md^{2}}^{11}&\beta_{Md^{2}}^{21}&\cdots&\beta_{Md^{2}}^{12}&\beta_{Md^{2}}^{22}&\cdots&\beta_{Md^{2}}^{d^{2}d^{2}}\\ \end{array}\right] (12)

which is an Md2×d4Md^{2}\times d^{4} matrix. We rewrite (11) into a compact form as

Bvec(X)=vec(A),B\text{vec}(X)=\text{vec}(A), (13)

where BB is determined once the bases {Ei}i=1d2\{E_{i}\}_{i=1}^{d^{2}} and {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} are chosen.

In an experiment, assume that the number of copies for each input state is NN. Assume Pi,jP_{i,j} is the ii-th POVM element (i.e., measurement operator) in the jj-th POVM set where 1jJ1\leq j\leq J and 1inj1\leq i\leq n_{j}. Therefore, the total number of different measurement operators is L=j=1JnjL=\sum_{j=1}^{J}n_{j} and we have

P1,j+P2,j++Pnj,j=Id,P_{1,j}+P_{2,j}+\cdots+P_{n_{j},j}=I_{d}, (14)

for 1jJ1\leq j\leq J, which is the completeness constraint for each POVM set. We then vectorize these POVM elements as {Pl}l=1L\{P_{l}\}_{l=1}^{L} where, e.g., Pi+q=1j1nqPi,jP_{i+\sum_{q=1}^{j-1}n_{q}}\leftarrow P_{i,j}. These are the measurement operators on the output states. To extract all of the information in the output state, these measurement operators should be informationally complete or overcomplete, and thus Ld2L\geq d^{2}. Some efficient methods for QST with fewer measurements are also discussed in [43, 44, 45]. Then, we can also uniquely expand PlP_{l} in {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} as

Pl=j=1d2μljσj,P_{l}=\sum_{j=1}^{d^{2}}\mu_{lj}\sigma_{j}, (15)

and the ideal measurement probability pmlp_{ml} of the mm-th output state and the ll-th measurement operator is

pml\displaystyle p_{ml} =Tr(ρmoutPl)\displaystyle=\operatorname{Tr}\left(\rho^{\text{out}}_{m}P_{l}\right)
=Tr(n=1d2αmnσnj=1d2μljσj)\displaystyle=\operatorname{Tr}\left(\sum_{n=1}^{d^{2}}\alpha_{mn}\sigma_{n}\sum_{j=1}^{d^{2}}\mu_{lj}\sigma_{j}\right)
=n,j=1d2αmnμljTr(σnσj).\displaystyle=\sum_{n,j=1}^{d^{2}}\alpha_{mn}\mu_{lj}\operatorname{Tr}\left(\sigma_{n}\sigma_{j}\right). (16)

Define the matrix 𝒫\mathcal{P} with 𝒫mlpml\mathcal{P}_{ml}\triangleq p_{ml} and let

C[Tr(σ1j=1d2μ1jσj)Tr(σd2j=1d2μ1jσj)Tr(σ1j=1d2μ2jσj)Tr(σd2j=1d2μ2jσj)Tr(σ1j=1d2μLjσj)Tr(σd2j=1d2μLjσj)],C\!\triangleq\!\!\left[\begin{array}[]{cccc}\operatorname{Tr}\left(\sigma_{1}\sum_{j=1}^{d^{2}}\mu_{1j}\sigma_{j}\right)&\cdots&\operatorname{Tr}\left(\sigma_{d^{2}}\sum_{j=1}^{d^{2}}\mu_{1j}\sigma_{j}\right)\\ \operatorname{Tr}\left(\sigma_{1}\sum_{j=1}^{d^{2}}\mu_{2j}\sigma_{j}\right)&\cdots&\operatorname{Tr}\left(\sigma_{d^{2}}\sum_{j=1}^{d^{2}}\mu_{2j}\sigma_{j}\right)\\ \vdots&\vdots&\vdots\\ \operatorname{Tr}\left(\sigma_{1}\sum_{j=1}^{d^{2}}\mu_{Lj}\sigma_{j}\right)&\cdots&\operatorname{Tr}\left(\sigma_{d^{2}}\sum_{j=1}^{d^{2}}\mu_{Lj}\sigma_{j}\right)\end{array}\right], (17)

which is determined by the measurement operators {Pl}l=1L\{P_{l}\}_{l=1}^{L}. Thus we have

(IMC)vec(AT)=vec(𝒫T).\left(I_{M}\otimes C\right)\operatorname{vec}(A^{T})=\operatorname{vec}(\mathcal{P}^{T}). (18)

Define KK such that Kvec(A)=vec(AT)K\operatorname{vec}(A)=\operatorname{vec}(A^{T}) and thus KK is a Md2×Md2Md^{2}\times Md^{2} commutation matrix. Using (13) and (18), we have

(IMC)KBvec(X)=vec(𝒫T),\left(I_{M}\otimes C\right)KB\operatorname{vec}(X)=\operatorname{vec}(\mathcal{P}^{T}), (19)

which directly connects the unknown quantum process XX with measurement data 𝒫\mathcal{P} in the experiment.

Assume that the practical measurement result is p^ml\hat{p}_{ml} and the measurement error is eml=p^mlpmle_{ml}=\hat{p}_{ml}-p_{ml}. According to the central limit theorem, emle_{ml} converges in distribution to a normal distribution with mean zero and variance (pmlpml2)/(N/J)\left(p_{ml}-p_{ml}^{2}\right)/(N/J) [46, 47]. We consider the general scenario where there is no prior knowledge about whether the process is TP or non-TP, and the QPT can be formulated as the following optimization problem:

Problem 1

Given the parameterization matrix BB for the input states, the parameterization matrix CC for the measurement operators, the permutation matrix KK and the experimental data 𝒫^\hat{\mathcal{P}}, find a Hermitian and positive semidefinite estimate X^\hat{X} minimizing (IMC)KBvec(X^)vec(𝒫^T)\left\|\left(I_{M}\otimes C\right)KB\operatorname{vec}(\hat{X})-\operatorname{vec}(\widehat{\mathcal{P}}^{T})\right\| and satisfying the constraint (8).

III Tensor structure and two-stage quantum process tomography solution

In this section, we discuss how to simplify the structure of BB under a suitably chosen representation, which can reduce the computational complexity and storage requirement. Then we propose a two-stage solution (TSS) to obtain analytical tomography formulas for Problem 1. When there is prior knowledge that the process is TP, most of the following analysis still applies, unless otherwise specified.

III-A Tensor structure of QPT

Firstly, we consider the basis sets {Ei}i=1d2\{E_{i}\}_{i=1}^{d^{2}} and {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}}, which, if properly chosen, can simplify the QPT problem and greatly benefit the time and space costs of our algorithm. We choose these basis sets as the natural basis {|jk|}1j,kd\{|j\rangle\langle k|\}_{1\leq j,k\leq d} where i=(j1)d+k,n=(k1)d+ji=(j-1)d+k,n=(k-1)d+j. The advantages of this choice can be demonstrated as follows.

Lemma 1

[13, 15] If {Ei}i=1d2\{E_{i}\}_{i=1}^{d^{2}} is chosen as the natural basis {|jk|}1j,kd\{|j\rangle\langle k|\}_{1\leq j,k\leq d} where i=(j1)d+ki=(j-1)d+k, then the constraint (8) becomes Tr1(X)Id\operatorname{Tr}_{1}(X)\leq I_{d}.

Lemma 1 fellows from the Choi-Jamiołkowski isomorphism and the proof for Tr1(X)=Id\operatorname{Tr}_{1}(X)=I_{d} in the TP case can be found in [13, 15], which can be straightforwardly extended to non-TP cases. Define FTr1(X)IdF\triangleq\operatorname{Tr}_{1}(X)\leq I_{d}, and FF represents the success probability of the quantum process [23]. Furthermore, we have the following benefit.

Proposition 1

Define the collection of all the vectorized input states as

V[vec(ρ1in),vec(ρ2in),,vec(ρMin)].V\triangleq\left[\operatorname{vec}\left(\rho_{1}^{\text{in}}\right),\operatorname{vec}\left(\rho_{2}^{\text{in}}\right),\cdots,\operatorname{vec}\left(\rho_{M}^{\text{in}}\right)\right]. (20)

If {Ei}i=1d2\{E_{i}\}_{i=1}^{d^{2}} and {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} are chosen as the natural basis {|jk|}1j,kd\{|j\rangle\langle k|\}_{1\leq j,k\leq d} where i=(j1)d+k,n=(k1)d+ji=(j-1)d+k,n=(k-1)d+j, then we have

(Id2VT)R=B,\left(I_{d^{2}}\otimes V^{T}\right)R=B, (21)

where RR is a d4×d4d^{4}\times d^{4} permutation matrix.

Proof:

Using (3), (10) becomes

(EkEj)vec(ρmin)=n=1d2βmnjkvec(σn).\left(E_{k}^{*}\otimes E_{j}\right)\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)=\sum_{n=1}^{d^{2}}\beta_{{mn}}^{{jk}}\operatorname{vec}\left(\sigma_{n}\right). (22)

Since {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} the is natural basis, using (20), we have

(EkEj)V=[β11jkβM1jkβ1d2jkβMd2jk],\left(E_{k}^{*}\otimes E_{j}\right)V=\left[\begin{array}[]{ccc}\beta_{1{1}}^{{jk}}&\cdots&\beta_{{M1}}^{{jk}}\\ \vdots&\vdots&\vdots\\ \beta_{1{d^{2}}}^{{jk}}&\cdots&\beta_{{Md^{2}}}^{{jk}}\end{array}\right], (23)

and vec([(EkEj)V]T)\operatorname{vec}\left(\left[\left(E_{k}^{*}\otimes E_{j}\right)V\right]^{T}\right) is the (j+(k1)d2)(j+(k-1)d^{2})-th column of BB. Let k=(u1)d+vk=(u-1)d+v, j=(x1)d+yj=(x-1)d+y. Then

EkEj=|uv||xy|,E_{k}^{*}\otimes E_{j}=|u\rangle\langle v|\otimes|x\rangle\langle y|, (24)

where only the element at ((u1)d+x,(v1)d+y)\left((u-1)d+x,(v-1)d+y\right) is non-zero. Then

(EkEj)vec(ρmin)\displaystyle\left(E_{k}^{*}\otimes E_{j}\right)\operatorname{vec}\left(\rho_{m}^{\text{in}}\right) =[0,,0,(ρmin)yv((u1)d+x)th,0,,0]T,\displaystyle=[0,\cdots,0,\underbrace{\left(\rho_{m}^{\text{in}}\right)_{yv}}_{((u-1)d+x)\text{th}},0,\cdots,0]^{T}, (25)

and therefore,

[(EkEj)V]T=[00(ρ1in)yv0000(ρMin)yv00],\left[\left(E_{k}^{*}\otimes E_{j}\right)V\right]^{T}=\left[\begin{array}[]{ccccccc}0&\cdots&0&\left(\rho_{1}^{\text{in}}\right)_{yv}&0&\cdots&0\\ \vdots&&\vdots&\vdots&\vdots&&\vdots\\ 0&\cdots&0&\left(\rho_{M}^{\text{in}}\right)_{yv}&0&\cdots&0\end{array}\right], (26)

where only the (u1)d+x(u-1)d+x-th column is non-zero. Thus the (j+(k1)d2)(j+(k-1)d^{2})-th column of BB is

[0,,0,(ρ1in)yv,,(ρMin)yv,0,,0]T.\left[0,\cdots,0,\left(\rho_{1}^{\text{in}}\right)_{yv},\cdots,\left(\rho_{M}^{\text{in}}\right)_{yv},0,\cdots,0\right]^{T}. (27)

The matrix Id2VT=diag([VT,,VT])I_{d^{2}}\otimes V^{T}=\operatorname{diag}\left(\left[V^{T},\cdots,V^{T}\right]\right) is block diagonal. Thus the (d2[(u1)d+x1]+(v1)d+y)(d^{2}\left[(u-1)d+x-1\right]+(v-1)d+y)-th column of Id2VTI_{d^{2}}\otimes V^{T} is the same as the (j+(k1)d2)(j+(k-1)d^{2})-th column of BB. Therefore,

(Id2VT)R=B,\left(I_{d^{2}}\otimes V^{T}\right)R=B, (28)

where RR is a d4×d4d^{4}\times d^{4} permutation matrix. ∎

Since we choose {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} as the natural basis, we have

C=[vec(P1),vec(P2),,vec(PL)]T.C=\left[\operatorname{vec}\left(P_{1}\right),\operatorname{vec}\left(P_{2}\right),\cdots,\operatorname{vec}\left(P_{L}\right)\right]^{T}. (29)

Using Proposition 1 and (19), we have

(IMC)K(Id2VT)Rvec(X)=vec(𝒫T).\left(I_{M}\otimes C\right)K\left(I_{d^{2}}\otimes V^{T}\right)R\operatorname{vec}(X)=\operatorname{vec}({\mathcal{P}^{T}}). (30)

Define Y(IMC)K(Id2VT)RY\triangleq\left(I_{M}\otimes C\right)K\left(I_{d^{2}}\otimes V^{T}\right)R. Since we assume the input states and measurement operators are both informationally complete or overcomplete, the unique least squares solution to minYvec(X^)vec(𝒫^T)\min||Y\text{vec}(\hat{X})-\text{vec}(\widehat{\mathcal{P}}^{T})|| is

vec(X^LS(1))=(YY)1Yvec(𝒫^T).\operatorname{vec}\left(\hat{X}_{\text{LS}}^{(1)}\right)=\left(Y^{\dagger}Y\right)^{-1}Y^{\dagger}\operatorname{vec}(\widehat{\mathcal{P}}^{T}). (31)

However, the computational complexity for this least squares solution is usually O(d12)O(d^{12}) [17] which is quite high. To reduce the computational complexity, we do not actually utilize (31). Noting the special structure property of (30), we firstly reconstruct A^\hat{A} and then obtain X^\hat{X}, which is more efficient in computation. To reconstruct A^\hat{A} from (18), we have

vec(A^)=KT(IM(CC)1C)vec(𝒫^T),\operatorname{vec}(\hat{A})=K^{T}\left(I_{M}\otimes\left(C^{\dagger}C\right)^{-1}C^{\dagger}\right)\operatorname{vec}(\widehat{\mathcal{P}}^{T}), (32)

and then the least squares solution vec(X^LS(2))\operatorname{vec}\left(\hat{X}_{\text{LS}}^{(2)}\right) is given by

vec(X^LS(2))\displaystyle\operatorname{vec}\left(\hat{X}_{\text{LS}}^{(2)}\right) =(BB)1Bvec(A^)\displaystyle=\left(B^{\dagger}B\right)^{-1}B^{\dagger}\operatorname{vec}(\hat{A}) (33)
=RT(Id2(VVT)1V)vec(A^).\displaystyle=R^{T}\left(I_{d^{2}}\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)\operatorname{vec}(\hat{A}).

Therefore, we can directly obtain the least squares solution vec(X^LS(2))\operatorname{vec}\left(\hat{X}_{\text{LS}}^{(2)}\right) as

vec(X^LS(2))=\displaystyle\operatorname{vec}\left(\hat{X}_{\text{LS}}^{(2)}\right)= RT(Id2(VVT)1V)KT(IM(CC)1C)vec(𝒫^T).\displaystyle R^{T}\left(I_{d^{2}}\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)K^{T}\left(I_{M}\otimes\left(C^{\dagger}C\right)^{-1}C^{\dagger}\right)\operatorname{vec}(\widehat{\mathcal{P}}^{T}). (34)
Remark 1

The procedure to obtain A^\hat{A} in (32) is usually known as QST for output states [2, 13] but our method is different from such a procedure. Here we do not consider any constraints, i.e., Hermitian, positive semidefinite and unit trace, on the estimate ρ^mout\hat{\rho}_{m}^{\text{out}} and we directly reconstruct the process from the data vector by linear inversion. Of course, it is possible to obtain an alternative physical estimate from ρ^mout\hat{\rho}_{m}^{\text{out}} as in [48] or via Maximum Likelihood Estimation (MLE). However, here A^\hat{A} is unnecessary to be a physical estimate because it is only an intermediate product, and our direct method can already guarantee the ultimate process estimation to be physical by later solving Problem 2.1.1 and Problem 2.2.2 in Sec. III-B.

Considering the positive semidefinite requirement X0X\geq 0 and the constraint Tr1(X)Id\operatorname{Tr}_{1}(X)\leq I_{d}, we convert QPT into an optimization problem as follows.

Problem 2

Given the parameterization matrix VV of all the input states, the permutation matrix RR and reconstructed A^\hat{A}, find a Hermitian and positive semidefinite estimate X^\hat{X} minimizing

X^vec1(RT(Id2(VVT)1V)vec(A^)),\left\|\hat{X}-\operatorname{vec}^{-1}\left(R^{T}\left(I_{d^{2}}\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)\operatorname{vec}(\hat{A})\right)\right\|,

such that Tr1(X^)Id\operatorname{Tr}_{1}(\hat{X})\leq I_{d}.

III-B Two-stage solution of QPT

The direct solution to Problem 2 is usually difficult and we split it into two optimization sub-problems (i.e., a two-stage solution):

Problem 2.1.1

Let

D^X^LS(2)=vec1(RT(Id2(VVT)1V)vec(A^))\hat{D}\triangleq\hat{X}_{\text{LS}}^{(2)}=\operatorname{vec}^{-1}\left(R^{T}\left(I_{d^{2}}\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)\operatorname{vec}(\hat{A})\right)

be a given matrix. Find a Hermitian and positive semidefinite d2×d2d^{2}\times d^{2} matrix G^\hat{G} minimizing G^D^||\hat{G}-\hat{D}||.

Problem 2.2.2

Let G^0\hat{G}\geq 0 be given. Find a Hermitian and positive semidefinite d2×d2d^{2}\times d^{2} matrix X^\hat{X} minimizing X^G^||\hat{X}-\hat{G}||, such that Tr1(X^)Id\operatorname{Tr}_{1}(\hat{X})\leq I_{d}.

For Problem 2.1.1, note that if S=S,W=WS=S^{\dagger},W=-W^{\dagger}, we have S+W2=S2+W2\|S+W\|^{2}=\|S\|^{2}+\|W\|^{2}. Therefore,

G^D^2=G^D^+D^22+D^D^22.\|\hat{G}-\hat{D}\|^{2}=\left\|\hat{G}-\frac{\hat{D}+\hat{D}^{\dagger}}{2}\right\|^{2}+\left\|\frac{\hat{D}-\hat{D}^{\dagger}}{2}\right\|^{2}. (35)

We perform the spectral decomposition as D^+D^2=UK^U\frac{\hat{D}+\hat{D}^{\dagger}}{2}=U\hat{K}U^{\dagger} where K^=diag(k1,,kd2)\hat{K}=\operatorname{diag}\left(k_{1},\cdots,k_{d^{2}}\right) is a diagonal matrix. We define Z^UG^U\hat{Z}\triangleq U^{\dagger}\hat{G}U. Since G^\hat{G} is positive semidefinite, we have Z^ii0\hat{Z}_{ii}\geq 0. Therefore,

G^D^+D^22\displaystyle\left\|\hat{G}-\frac{\hat{D}+\hat{D}^{\dagger}}{2}\right\|^{2} =K^Z^2\displaystyle=\|\hat{K}-\hat{Z}\|^{2}
=ij|Z^ij|2+i(kiZ^ii)2\displaystyle=\sum_{i\neq j}\left|\hat{Z}_{ij}\right|^{2}+\sum_{i}\left(k_{i}-\hat{Z}_{ii}\right)^{2}
ki<0(kiZ^ii)2\displaystyle\geq\sum_{k_{i}<0}\left(k_{i}-\hat{Z}_{ii}\right)^{2}
ki<0ki2.\displaystyle\geq\sum_{k_{i}<0}k_{i}^{2}. (36)

Therefore, the unique optimal solution is Z^=diag(b)\hat{Z}=\operatorname{diag}(b) where

bi={ki,ki0,0,ki<0,b_{i}=\begin{cases}k_{i},&k_{i}\geq 0,\\ 0,&k_{i}<0,\end{cases} (37)

and G^=Udiag(b)U\hat{G}=U\operatorname{diag}(b)U^{\dagger}.

After solving Problem 2.1.1, we rewrite the spectral decomposition of G^\hat{G} as

G^=i=1d2vec(S^i)vec(S^i),\displaystyle\hat{G}=\sum_{i=1}^{d^{2}}\operatorname{vec}\left(\hat{S}_{i}\right)\operatorname{vec}\left(\hat{S}_{i}\right)^{\dagger}, (38)

where each S^id×d\hat{S}_{i}\in\mathbb{C}^{d\times d}. Define F^i=1d2S^iS^i\hat{F}\triangleq\sum_{i=1}^{d^{2}}\hat{S}_{i}\hat{S}_{i}^{\dagger}. Using (4), we have

Tr1(G^)\displaystyle\operatorname{Tr}_{1}(\hat{G}) =Tr1(i=1d2vec(S^i)vec(S^i))\displaystyle=\operatorname{Tr}_{1}\left(\sum_{i=1}^{d^{2}}\operatorname{vec}\left(\hat{S}_{i}\right)\operatorname{vec}\left(\hat{S}_{i}\right)^{\dagger}\right) (39)
=i=1d2S^iS^i=F^.\displaystyle=\sum_{i=1}^{d^{2}}\hat{S}_{i}\hat{S}_{i}^{\dagger}=\hat{F}.

Then, we consider the partial trace constraint by correcting G^\hat{G}. Assume that the spectral decomposition of Tr1(X)=F\operatorname{Tr}_{1}(X)=F is

F=UFdiag(f1,,fd)UF,F=U_{F}\operatorname{diag}\left(f_{1},\cdots,f_{d}\right)U_{F}^{\dagger}, (40)

where 1f1fd01\geq f_{1}\geq\cdots f_{d}\geq 0 and the spectral decomposition of F^\hat{F} is

F^=UF^diag(f^1,,f^d)UF^,\hat{F}=U_{\hat{F}}\operatorname{diag}\left(\hat{f}_{1},\cdots,\hat{f}_{d}\right)U_{\hat{F}}^{\dagger}, (41)

where f^1f^c>0\hat{f}_{1}\geq\cdots\geq\hat{f}_{c}>0 and f^c+1=f^d=0\hat{f}_{c+1}=\cdots\hat{f}_{d}=0, i.e., Rank(F^)=c\operatorname{Rank}(\hat{F})=c. Then, we define

F¯UF^diag(f¯1,,f¯d)UF^,\bar{F}\triangleq U_{\hat{F}}\operatorname{diag}\left(\bar{f}_{1},\cdots,\bar{f}_{d}\right)U_{\hat{F}}^{\dagger}, (42)

where f¯i=f^i\bar{f}_{i}=\hat{f}_{i} for 1ic1\leq i\leq c, f¯i=f^cN\bar{f}_{i}=\frac{\hat{f}_{c}}{N} for c+1idc+1\leq i\leq d, and NN is the number of copies for each input states. Since F¯\bar{F} is invertible, F¯1/2\bar{F}^{-1/2} is well defined and we also define

F~UF^diag(f~1,,f~d)UF^,\tilde{F}\triangleq U_{\hat{F}}\operatorname{diag}\left(\tilde{f}_{1},\cdots,\tilde{f}_{d}\right)U_{\hat{F}}^{\dagger}, (43)

where f~i=min(f¯i,1)\tilde{f}_{i}=\min\left({\bar{f}}_{i},1\right) for 1id1\leq i\leq d. Thus, f~i1\tilde{f}_{i}\leq 1 for 1id1\leq i\leq d. To solve Problem 2.2.2 and satisfy the partial trace constraint, let Q^iF~1/2F¯1/2S^i\hat{Q}_{i}\triangleq\tilde{F}^{1/2}\bar{F}^{-1/2}\hat{S}_{i}. We then obtain X^\hat{X} as

X^=i=1d2vec(Q^i)vec(Q^i),\displaystyle\hat{X}=\sum_{i=1}^{d^{2}}\operatorname{vec}\left(\hat{Q}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}, (44)

where {vec(Q^i)}i=1d2\left\{\operatorname{vec}\left(\hat{Q}_{i}\right)\right\}_{i=1}^{d^{2}} might not be an orthogonal basis. From (44), X^\hat{X} is Hermitian, positive semidefinite and also satisfies

Tr1(X^)\displaystyle\operatorname{Tr}_{1}(\hat{X}) =Tr1(i=1d2vec(Q^i)vec(Q^i))\displaystyle=\operatorname{Tr}_{1}\left(\sum_{i=1}^{d^{2}}\operatorname{vec}\left(\hat{Q}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}\right)
=i=1d2Q^iQ^i=F~1/2F¯1/2i=1d2S^iS^iF¯1/2F~1/2\displaystyle=\sum_{i=1}^{d^{2}}\hat{Q}_{i}\hat{Q}_{i}^{\dagger}=\tilde{F}^{1/2}\bar{F}^{-1/2}\sum_{i=1}^{d^{2}}\hat{S}_{i}\hat{S}_{i}^{\dagger}\bar{F}^{-1/2}\tilde{F}^{1/2}
=F~1/2F¯1/2F^F¯1/2F~1/2\displaystyle=\tilde{F}^{1/2}\bar{F}^{-1/2}\hat{F}\bar{F}^{-1/2}\tilde{F}^{1/2}
=F~1/2UF^diag(1,,1,0,,0)UF^F~1/2\displaystyle=\tilde{F}^{1/2}U_{\hat{F}}\operatorname{diag}\left(1,\cdots,1,0,\cdots,0\right)U_{\hat{F}}^{\dagger}\tilde{F}^{1/2}
=UF^diag(f~1,,f~c,0,,0)UF^\displaystyle=U_{\hat{F}}\operatorname{diag}\left(\tilde{f}_{1},\cdots,\tilde{f}_{c},0,\cdots,0\right)U_{\hat{F}}^{\dagger}
Id.\displaystyle\leq I_{d}. (45)

The Stage 1 solution G^\hat{G} is a projection under the Frobenius norm satisfying the CP constraint, which was also discussed in [17, 18]. The TP constraint makes QPT a nontrivial extension of state tomography [17]. The work in [17, 18] introduced two iterative algorithms designed to ensure that the estimate satisfies the CPTP constraint. While these algorithms may offer good accuracy, our two-stage algorithm can achieve a closed-form solution. We can also efficiently optimize the input states and the measurement operators using the TSS algorithm. It is worth highlighting that our algorithm is versatile and can be applied to non-TP cases. Additionally, considering Hamiltonian identification in [13] where Rank(X^)=1\operatorname{Rank}(\hat{X})=1 and Tr1(X^)=Id\operatorname{Tr}_{1}(\hat{X})=I_{d}, our Stage 2 solution is optimal for Problem 2.2.2 which has been proved in [13]. For general QPT, the estimated process matrix X^\hat{X} is a sub-optimal solution.

Remark 2

If we have the prior knowledge that the quantum process is TP, we can assume that F^\hat{F} is non-singular (i.e., F^>0\hat{F}>0) and it is true when the number of copies is sufficiently large, because F^\hat{F} converges to IdI_{d} as NN tends to infinity. Thus, we correct Q^i\hat{Q}_{i} as Q^i=F^1/2S^i\hat{Q}_{i}=\hat{F}^{-1/2}\hat{S}_{i} and X^=i=1d2vec(Q^i)vec(Q^i)\hat{X}=\sum_{i=1}^{d^{2}}\operatorname{vec}\left(\hat{Q}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger} where Tr1(X^)=Id\operatorname{Tr}_{1}(\hat{X})=I_{d}.

IV Computational complexity and storage requirements

Refer to caption
Figure 1: Procedure for our TSS algorithm for QPT which has four steps. In Step 1, using (32), we reconstruct the parameterization matrix of all of the output states A^\hat{A} directly from measurement data which is different from the QST in [2, 13]. Here A^\hat{A} is not needed be a physical estimate because it is only an intermediate product. In Step 2, we utilize (33). Step 3 and Step 4 address the positive semidefinite constraint and partial trace constraint, respectively, by solving Problems 2.1.1 and 2.2.2. These two steps together constitute our two-stage solution.

We summarize our procedure for the QPT framework with the TSS algorithm in Fig. 1. In this paper, we do not consider the time spent on experiments. Here we discuss the computational complexity and storage requirement in each step in the red box of Fig. 1 and give the total time and space complexity in Table I.

Step 1. We reconstruct A^\hat{A} using (32). The computational complexity is O(Ld4)O(Ld^{4}) for (CC)1C\left(C^{\dagger}C\right)^{-1}C^{\dagger}, O(MLd2)O(MLd^{2}) for (IM(CC)1C)vec(𝒫^T)\left(I_{M}\otimes\left(C^{\dagger}C\right)^{-1}C^{\dagger}\right)\operatorname{vec}(\widehat{\mathcal{P}}^{T}) and O(Md2)O(Md^{2}) for KT(IM(CC)1C)vec(𝒫^T)K^{T}\left(I_{M}\otimes\left(C^{\dagger}C\right)^{-1}C^{\dagger}\right)\operatorname{vec}(\widehat{\mathcal{P}}^{T}) where Md2M\geq d^{2} is the number of different input states and Ld2L\geq d^{2} is the number of different measurement operators. Thus, the total computational complexity is O(MLd2)O(MLd^{2}). For storage requirement, the storage is O(Ld2)O(Ld^{2}) for the parameterization matrix CC of measurement operators, O(ML)O(ML) for measurement data 𝒫^\widehat{\mathcal{P}} and O(Md2)O(Md^{2}) for A^\hat{A}. Since KK is a permutation matrix, it can be stored with an expense O(Md2)O(Md^{2}).

Step 2. Using (33), the computational complexity is O(Md4)O(Md^{4}) for (VVT)1V\left(V^{*}V^{T}\right)^{-1}V^{*}, O(Md4)O(Md^{4}) for
(Id2(VVT)1V)vec(A^)\left(I_{d^{2}}\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)\operatorname{vec}(\hat{A}) and O(d4)O(d^{4}) for RT(Id2(VVT)1V)vec(A^)R^{T}\left(I_{d^{2}}\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)\operatorname{vec}(\hat{A}). Thus, the total computational complexity is O(Md4)O(Md^{4}). We need to store (VVT)1V\left(V^{*}V^{T}\right)^{-1}V^{*} and RTR^{T}. Since (VVT)1V\left(V^{*}V^{T}\right)^{-1}V^{*} is a d2×Md^{2}\times M matrix and RR is a permutation matrix, the storage requirements are O(Md2)O(Md^{2}) and O(d2)O(d^{2}), respectively.

Step 3. To solve Problem 2.1.1, the computational complexity is determined by the spectral decomposition of D^+D^2\frac{\hat{D}+\hat{D}^{\dagger}}{2}, which is O(d6)O(d^{6}). The storage requirements are O(d4)O(d^{4}) for D^\hat{D}, O(d4)O(d^{4}) for the spectral decomposition of D^+D^2\frac{\hat{D}+\hat{D}^{\dagger}}{2}, and O(d4)O(d^{4}) for G^\hat{G}.

Step 4. To solve Problem 2.2.2, the computational complexity for calculating F^\hat{F}, F¯1/2\bar{F}^{-1/2} and F~1/2\tilde{F}^{1/2} are O(d4)O(d^{4}), O(d3)O(d^{3}) and O(d3)O(d^{3}), respectively. Then the computational complexity for calculating X^\hat{X} is O(d6)O(d^{6}). The storage requirements for {S^i}i=1d2\{\hat{S}_{i}\}_{i=1}^{d^{2}} and {Q^i}i=1d2\{\hat{Q}_{i}\}_{i=1}^{d^{2}} are both O(d4)O(d^{4}). Then the storage for F^\hat{F}, F¯\bar{F} and F~\tilde{F} are all O(d2)O(d^{2}), and for X^\hat{X} is O(d4)O(d^{4}).

TABLE I: Computational complexity and storage requirement
Computational complexity Storage requirement
Step 1 O(MLd2)O(MLd^{2}) O(ML)O(ML)
Step 2 O(Md4)O(Md^{4}) O(Md2)O(Md^{2})
Step 3 O(d6)O(d^{6}) O(d4)O(d^{4})
Step 4 O(d6)O(d^{6}) O(d4)O(d^{4})
Total O(MLd2)O(MLd^{2}) O(ML)O(ML)

The total computational complexity and storage requirements for our TSS algorithm are presented in Table I. Since Md2M\geq d^{2} and Ld2L\geq d^{2}, the total computational complexity is O(MLd2)O(MLd^{2}). From [2, 42, 17], the computational complexity for a direct least squares solution with d2d^{2} different input states is O(d12)O(d^{12}), which is significantly higher than O(MLd2)O(MLd^{2}). The reason we obtain a lower computational complexity here is that we utilize the special structure of BB described in Proposition 1. As for the total storage requirement, our TSS algorithm is O(ML)O(ML). Without employing the tensor structure as (33), the storage space will be O(Md6)O(Md^{6}) because the storage requirement of BB will be O(Md6)O(Md^{6}).

V Error analysis

From Fig. 1, our TSS algorithm has four steps. In this section, we analyze the error upper bound and give the following theorem to characterize the error upper bound analytically.

Theorem 1

If {Ei}i=1d2\{E_{i}\}_{i=1}^{d^{2}} and {σn}n=1d2\{\sigma_{n}\}_{n=1}^{d^{2}} are chosen as the natural basis and the input quantum states are {ρmin}m=1M\{\rho_{m}^{\text{in}}\}_{m=1}^{M}, then the estimation error of the TSS algorithm 𝔼X^X\mathbb{E}||\hat{X}-X|| scales as

O(dTr(F)JTr((CC)1)MTr((VVT)1)N),O\!\!\left(\!\!\frac{\sqrt{d}\operatorname{Tr}({F})\sqrt{J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)}\!\sqrt{M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)}}{\sqrt{N}}\!\right),

where NN is the number of copies for each output state, JJ is the number of POVM sets, CC is defined as (29), VV is defined as (20), F=Tr1(X)F=\operatorname{Tr}_{1}(X) and 𝔼()\mathbb{E}(\cdot) denotes expectation with respect to all possible measurement results.

Proof:

We will first calculate the error for each of the four steps as shown in Fig. 1 and then present the final error bound. The main tools in this proof include the error analysis of QST in [46] and some matrix inequalities in Appendix A. For any quantity SS, we denote its estimation error as ΔSS^S\Delta_{S}\triangleq||\hat{S}-S||.

V-A Error in Step 1

The MSE of the mm-th estimated output state, i.e., colm(A^T)\operatorname{col}_{m}(\hat{A}^{T}) is asymptotically (see [46])

𝔼colm(A^T)colm(AT)2\displaystyle\mathbb{E}\left\|\operatorname{col}_{m}(\hat{A}^{T})-\operatorname{col}_{m}(A^{T})\right\|^{2} (46)
=\displaystyle= JNTr((CC)1CPC(CC)1),\displaystyle\frac{J}{N}\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}C^{\dagger}\mathrm{P}C\left(C^{\dagger}C\right)^{-1}\right),

where NN is the number of copies for each input state and

Pdiag(pm1pm12,,pmLpmL2).\mathrm{P}\triangleq\operatorname{diag}\left(p_{m1}-p_{m1}^{2},\cdots,p_{mL}-p_{mL}^{2}\right).

Therefore, the MSE is bounded by [46]

𝔼colm(A^T)colm(AT)2J4NTr((CC)1).\mathbb{E}\left\|\operatorname{col}_{m}(\hat{A}^{T})-\operatorname{col}_{m}(A^{T})\right\|^{2}\leq\frac{J}{4N}\operatorname{Tr}\left(\!\left(C^{\dagger}C\right)^{-1}\right). (47)

Then, for A^\hat{A}, we have

𝔼ΔA2=m=1M𝔼colm(A^T)colm(AT)2MJ4NTr((CC)1).\begin{array}[]{rl}\mathbb{E}\Delta_{A}^{2}&=\sum_{m=1}^{M}\mathbb{E}\left\|\operatorname{col}_{m}(\hat{A}^{T})-\operatorname{col}_{m}(A^{T})\right\|^{2}\\ &\leq\dfrac{MJ}{4N}\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right).\end{array} (48)

V-B Error in Step 2

Since

(VVT)1V2\displaystyle\left\|\left(V^{*}V^{T}\right)^{-1}V^{*}\right\|^{2} =Tr(VT(VVT)1(VVT)1V)\displaystyle=\operatorname{Tr}\left(V^{T}\left(V^{*}V^{T}\right)^{-1}\left(V^{*}V^{T}\right)^{-1}V^{*}\right) (49)
=Tr((VVT)1),\displaystyle=\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right),

we have

ΔD2=D^D2\displaystyle\quad\Delta_{D}^{2}=\|\hat{D}-D\|^{2}
=vec(D^)vec(D)2\displaystyle=\left\|\operatorname{vec}\left(\hat{D}\right)-\operatorname{vec}\left(D\right)\right\|^{2}
=RT(I(VVT)1V)(vec(A^)vec(A))2\displaystyle=\left\|R^{T}\left(I\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)(\operatorname{vec}(\hat{A})-\operatorname{vec}(A))\right\|^{2}
(VVT)1V2vec(A^)vec(A)2\displaystyle\leq\left\|\left(V^{*}V^{T}\right)^{-1}V^{*}\right\|^{2}\|\operatorname{vec}(\hat{A})-\operatorname{vec}(A)\|^{2}
=Tr((VVT)1)ΔA2,\displaystyle=\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)\Delta_{A}^{2}, (50)

where we use Proposition 3 for the inequality.

V-C Error in Step 3

Since G^\hat{G} minimizes G^D^||\hat{G}-\hat{D}||, we have G^D^D^D\|\hat{G}-\hat{D}\|\leq\|\hat{D}-D\| and thus

ΔG=\displaystyle\Delta_{G}= G^G=G^DG^D^+D^D2ΔD,\displaystyle\|\hat{G}-G\|=\|\hat{G}-D\|\leq\|\hat{G}-\hat{D}\|+\|\hat{D}-D\|\leq 2\Delta_{D}, (51)

because the true value G=DG=D.

V-D Error in Step 4

For F^\hat{F}, if 1f^1f^d01\geq\hat{f}_{1}\geq\cdots\geq\hat{f}_{d}\geq 0, i.e., F^I\hat{F}\leq I, we have F~1/2F¯1/2=I\tilde{F}^{1/2}\bar{F}^{-1/2}=I and thus X^=G^\hat{X}=\hat{G}. Therefore, using (48), (50) and (51), we obtain the final error bound as

𝔼X^X=𝔼G^GO(JTr((CC)1)MTr((VVT)1)N).\displaystyle\mathbb{E}\|\hat{X}-X\|=\mathbb{E}\|\hat{G}-G\|\sim O\left(\frac{\sqrt{J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)}\sqrt{M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)}}{\sqrt{N}}\right). (52)

Then we consider the case where at least one eigenvalue of F^\hat{F} is larger than 11. Assume f^i>1\hat{f}_{i}>1 for 1ia1\leq i\leq a and 0f^i10\leq\hat{f}_{i}\leq 1 for a+1ida+1\leq i\leq d in (41). From Lemma 2 and (39), we have

F^F\displaystyle\|\hat{F}-F\| =Tr1(G^)Tr1(G)\displaystyle=\left\|\operatorname{Tr}_{1}(\hat{G})-\operatorname{Tr}_{1}(G)\right\| (53)
dG^G\displaystyle\leq\sqrt{d}\|\hat{G}-G\|
=dΔG.\displaystyle=\sqrt{d}\Delta_{G}.

Using (40), (41), and (131) in Appendix A, we can obtain

maxi|f^ifi|F^FdΔG,\max_{i}\left|\hat{f}_{i}-f_{i}\right|\leq\|\hat{F}-F\|\leq\sqrt{d}\Delta_{G}, (54)

and therefore we have

Tr(F)ddΔGTr(F^)Tr(F)+ddΔG.\operatorname{Tr}({F})-d\sqrt{d}\Delta_{G}\leq\operatorname{Tr}(\hat{F})\leq\operatorname{Tr}({F})+d\sqrt{d}\Delta_{G}. (55)

Using (42) and (132) in Appendix A, we have

F~F^2\displaystyle\|\tilde{F}-\hat{F}\|^{2} =i=1d(f~if^i)2\displaystyle=\sum_{i=1}^{d}\left(\tilde{f}_{i}-\hat{f}_{i}\right)^{2}
=i=1a(1fi^)2+i=c+1d(f^cN)2\displaystyle=\sum_{i=1}^{a}\left(1-\hat{f_{i}}\right)^{2}+\sum_{i=c+1}^{d}\left(\frac{\hat{f}_{c}}{N}\right)^{2}
i=1d(fifi^)2+(dc)(f^cN)2\displaystyle\leq\sum_{i=1}^{d}\left(f_{i}-\hat{f_{i}}\right)^{2}+(d-c)\left(\frac{\hat{f}_{c}}{N}\right)^{2}
F^F2+O(1N2).\displaystyle\leq\|\hat{F}-{F}\|^{2}+O\left(\frac{1}{N^{2}}\right). (56)

Since limNΔG=0\lim_{N\rightarrow\infty}\Delta_{G}=0, we have Tr(G^)=Tr(F^)Tr(F)\operatorname{Tr}(\hat{G})=\operatorname{Tr}(\hat{F})\sim\operatorname{Tr}({F}), and thus (56) indicates

Tr(F^)Tr(F~)Tr(F).\operatorname{Tr}(\hat{F})\sim\operatorname{Tr}(\tilde{F})\sim\operatorname{Tr}({F}). (57)

Then we consider F~1/2F¯1/2\tilde{F}^{1/2}\bar{F}^{-1/2} as

F~1/2F¯1/2=UF^diag(f~1f¯1,f~df¯d)UF^.\tilde{F}^{1/2}\bar{F}^{-1/2}=U_{\hat{F}}\operatorname{diag}\left(\sqrt{\frac{\tilde{f}_{1}}{\bar{f}_{1}}},\cdots\sqrt{\frac{\tilde{f}_{d}}{\bar{f}_{d}}}\right)U_{\hat{F}}^{\dagger}. (58)

For a+1ida+1\leq i\leq d, f~i=f¯i\tilde{f}_{i}=\bar{f}_{i}. For 1ia1\leq i\leq a, f¯i=f^i\bar{f}_{i}=\hat{f}_{i}, and we have

f~if¯i1=11+(fi^1)1=1f^i2+o(fi^12).\sqrt{\frac{\tilde{f}_{i}}{\bar{f}_{i}}}-1=\sqrt{\frac{1}{1+\left(\hat{f_{i}}-1\right)}}-1=\frac{1-\hat{f}_{i}}{2}+o\left(\frac{\hat{f_{i}}-1}{2}\right). (59)

Thus, using (59) and (132) in Appendix A, we have

F~1/2F¯1/2Id2=i=1a(1f^i2+o(1f^i2))2\displaystyle\quad\left\|\tilde{F}^{1/2}\bar{F}^{-1/2}-I_{d}\right\|^{2}=\sum_{i=1}^{a}\left(\frac{1-\hat{f}_{i}}{2}+o\left(\frac{1-\hat{f}_{i}}{2}\right)\right)^{2} (60)
i=1a(f^i1)24i=1d(fi^fi)24F^F24.\displaystyle\sim\frac{\sum_{i=1}^{a}\left(\hat{f}_{i}-1\right)^{2}}{4}\leq\frac{\sum_{i=1}^{d}\left(\hat{f_{i}}-f_{i}\right)^{2}}{4}\leq\frac{\|\hat{F}-{F}\|^{2}}{4}.

Then using (45) and (57), we have

Tr(X^)\displaystyle\operatorname{Tr}(\hat{X}) =Tr(Tr1(X^))=i=1cf~i\displaystyle=\operatorname{Tr}\left(\operatorname{Tr}_{1}(\hat{X})\right)=\sum_{i=1}^{c}\tilde{f}_{i} (61)
=Tr(F~)(dc)f^cNTr(F).\displaystyle=\operatorname{Tr}(\tilde{F})-\frac{(d-c)\hat{f}_{c}}{N}\sim\operatorname{Tr}({F}).

Using (44), we thus have

i=1d2vec(Q^i)2\displaystyle\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)\right\|^{2} =i=1d2vec(Q^i)vec(Q^i)\displaystyle=\sum_{i=1}^{d^{2}}\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}\operatorname{vec}\left(\hat{Q}_{i}\right) (62)
=Tr(X^)Tr(F).\displaystyle=\operatorname{Tr}(\hat{X})\sim\operatorname{Tr}({F}).

Similarly, using (38) and (39), we have

i=1d2vec(S^i)2\displaystyle\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{S}_{i}\right)\right\|^{2} =i=1d2vec(S^i)vec(S^i)\displaystyle=\sum_{i=1}^{d^{2}}\operatorname{vec}\left(\hat{S}_{i}\right)^{\dagger}\operatorname{vec}\left(\hat{S}_{i}\right) (63)
=Tr(G^)=Tr(Tr1(G^))\displaystyle=\operatorname{Tr}(\hat{G})=\operatorname{Tr}\left(\operatorname{Tr}_{1}(\hat{G})\right)
=Tr(F^)Tr(F).\displaystyle=\operatorname{Tr}(\hat{F})\sim\operatorname{Tr}(F).

Thus, from (3) and the definition of Q^i\hat{Q}_{i}, we know vec(Q^i)=(IdF~1/2F¯1/2)vec(S^i)\operatorname{vec}\left(\hat{Q}_{i}\right)=\left(I_{d}\otimes\tilde{F}^{1/2}\bar{F}^{-1/2}\right)\operatorname{vec}\left(\hat{S}_{i}\right), and we have

i=1d2vec(Q^i)vec(S^i)2\displaystyle\quad\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)-\operatorname{vec}\left(\hat{S}_{i}\right)\right\|^{2}
=i=1d2(IdF~1/2F¯1/2IdId)vec(S^i)2\displaystyle=\sum_{i=1}^{d^{2}}\left\|\left(I_{d}\otimes\tilde{F}^{1/2}\bar{F}^{-1/2}-I_{d}\otimes I_{d}\right)\operatorname{vec}\left(\hat{S}_{i}\right)\right\|^{2}
F~1/2F¯1/2Id2i=1d2vec(S^i)2\displaystyle\leq\left\|\tilde{F}^{1/2}\bar{F}^{-1/2}-I_{d}\right\|^{2}\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{S}_{i}\right)\right\|^{2}
Tr(F)4F^F2dTr(F)4GG^2,\displaystyle\sim\frac{\operatorname{Tr}({F})}{4}\|\hat{F}-F\|^{2}\sim\frac{d\operatorname{Tr}({F})}{4}\|G-\hat{G}\|^{2}, (64)

where we use Proposition 3 in the inequality. Therefore, using (62)–(64), we have

X^G^2\displaystyle\quad\|\hat{X}-\hat{G}\|^{2}
=i=1d2(vec(Q^i)vec(Q^i)vec(S^i)vec(S^i))2\displaystyle=\left\|\sum_{i=1}^{d^{2}}\left(\operatorname{vec}\left(\hat{Q}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}-\operatorname{vec}\left(\hat{S}_{i}\right)\operatorname{vec}\left(\hat{S}_{i}\right)^{\dagger}\right)\right\|^{2}
(i=1d2vec(Q^i)vec(Q^i)vec(S^i)vec(S^i))2\displaystyle\leq\left(\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}-\operatorname{vec}\left(\hat{S}_{i}\right)\operatorname{vec}\left(\hat{S}_{i}\right)^{\dagger}\right\|\right)^{2}
=(i=1d2vec(Q^i)vec(Q^i)vec(S^i)vec(Q^i)+vec(S^i)vec(Q^i)vec(S^i)vec(S^i))2\displaystyle=\Bigg{(}\sum_{i=1}^{d^{2}}\Big{\|}\operatorname{vec}\left(\hat{Q}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}-\operatorname{vec}\left(\hat{S}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}+\operatorname{vec}\left(\hat{S}_{i}\right)\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}-\operatorname{vec}\left(\hat{S}_{i}\right)\operatorname{vec}\left(\hat{S}_{i}\right)^{\dagger}\Big{\|}\Bigg{)}^{2}
(i=1d2(vec(Q^i)+vec(S^i))vec(Q^i)vec(S^i))2\displaystyle\leq\Bigg{(}\sum_{i=1}^{d^{2}}\left(\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)^{\dagger}\right\|+\left\|\operatorname{vec}\left(\hat{S}_{i}\right)\right\|\right)\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)-\operatorname{vec}\left(\hat{S}_{i}\right)\right\|\Bigg{)}^{2}
(i=1d2(vec(Q^i)+vec(S^i))2)(i=1d2vec(Q^i)vec(S^i)2)\displaystyle\leq\left(\sum_{i=1}^{d^{2}}\left(\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)\right\|+\left\|\operatorname{vec}\left(\hat{S}_{i}\right)\right\|\right)^{2}\right)\cdot\left(\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)-\operatorname{vec}\left(\hat{S}_{i}\right)\right\|^{2}\right)
2(i=1d2vec(Q^i)2+i=1d2vec(S^i)2)(i=1d2vec(Q^i)vec(S^i)2)\displaystyle\leq 2\left(\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)\right\|^{2}+\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{S}_{i}\right)\right\|^{2}\right)\cdot\left(\sum_{i=1}^{d^{2}}\left\|\operatorname{vec}\left(\hat{Q}_{i}\right)-\operatorname{vec}\left(\hat{S}_{i}\right)\right\|^{2}\right)
d(Tr(F))2GG^2,\displaystyle\sim d\left(\operatorname{Tr}({F})\right)^{2}\|G-\hat{G}\|^{2}, (65)

where we use the Cauchy-Schwarz inequality in the third inequality and AM-QM inequality [49] in the fourth inequality. Thus, using (48), (50), (51) and (65), the final error bound is

𝔼X^X𝔼X^G^+𝔼G^G\displaystyle\quad\mathbb{E}\|\hat{X}-X\|\leq\mathbb{E}\|\hat{X}-\hat{G}\|+\mathbb{E}\|\hat{G}-G\| (66)
(dTr(F)+1)𝔼G^G\displaystyle\leq(\sqrt{d}\operatorname{Tr}({F})+1)\mathbb{E}\|\hat{G}-G\|
O(dTr(F)JTr((CC)1)MTr((VVT)1)N).\displaystyle\leq O\left(\!\!\frac{\sqrt{d}\operatorname{Tr}({F})\sqrt{J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)}\sqrt{M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)}}{\sqrt{N}}\right).

Comparing (52) and (66), we obtain the final error bound for any quantum process in the approximation sense as

𝔼X^XO(dTr(F)JTr((CC)1)MTr((VVT)1)N),\displaystyle\mathbb{E}\|\hat{X}-X\|\leq O\left(\frac{\sqrt{d}\operatorname{Tr}({F})\sqrt{J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)}\sqrt{M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)}}{\sqrt{N}}\right), (67)

where F=Tr1(X)F=\operatorname{Tr}_{1}(X) represents the success probability of the quantum process [23]. ∎

Note that when the quantum process is TP, Tr(F)=d\operatorname{Tr}(F)=d and the error analysis is similar to non-TP case. Then the final error bound is

𝔼X^XO(d3/2JTr((CC)1)MTr((VVT)1)N).\displaystyle\mathbb{E}\|\hat{X}-X\|\leq O\left(\frac{d^{3/2}\sqrt{J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)}\sqrt{M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)}}{\sqrt{N}}\right). (68)

VI Optimization of input states and measurement operators

To minimize the tomography error, the optimal input states {ρmin}m=1M\{\rho_{m}^{\text{in}}\}_{m=1}^{M} and optimal measurement operators {Pl}l=1L\{P_{l}\}_{l=1}^{L} are usually dependent on the specific process. One advantage of our TSS algorithm is that we can give an analytical error upper bound as (67) or (68) which depends on the input states and measurement operators instead of the unknown process. Therefore, we can optimize the input states and measurement operators based on the upper bound. In addition, we also consider how robust the estimated process X^\hat{X} is w.r.t. the measurement error, which is determined by (32) and (33). Therefore, we use the condition numbers of IMCI_{M}\otimes C and BB to describe the robustness, which reflects the sensitivity of the solution to the perturbations in data. Moreover, we consider the relationship between MM and the final estimation error with randomly generated input states.

VI-A Optimal input quantum states

Define the set with MM different types of input states as

𝒟(d,M)\displaystyle\mathcal{D}(d,M)\triangleq {{ρmin}m=1M1mM,ρmind×d,ρmin=(ρmin),ρmin0,Tr(ρmin)=1}.\displaystyle\Big{\{}\{\rho_{m}^{\text{in}}\}_{m=1}^{M}\mid\forall 1\leq m\leq M,\rho_{m}^{\text{in}}\in\mathbb{C}^{d\times d},\rho_{m}^{\text{in}}=\left(\rho_{m}^{\text{in}}\right)^{\dagger},\rho_{m}^{\text{in}}\geq 0,\operatorname{Tr}(\rho_{m}^{\text{in}})=1\Big{\}}. (69)

Similarly to the optimal probe states in quantum detector tomography [32, 50], we consider the optimal input states for QPT. The part of (67) or (68) dependent on the input states is MTr((VVT)1)M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)}, which will be our first cost function. Therefore, we define the set of all the optimal input states to minimize this bound as

OIS1(d,M)argmin{ρmin}m=1M𝒟(d,M)MTr((VVT)1).\mathrm{OIS}_{1}(d,M)\triangleq\underset{\left\{\rho_{m}^{\text{in}}\right\}_{m=1}^{M}\in\mathcal{D}(d,M)}{\arg\min}M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right). (70)

For general QPT, we have no prior knowledge of the unknown process. From (33) with the reconstructed A^\hat{A}, the following conditions are equivalent: (i) the process cannot be uniquely identified; (ii) the input states do not span the space of all dd-dimensional Hermitian matrices; (iii) VVTV^{*}V^{T} is singular; (iv) MTr((VVT)1)M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)} is infinite. From (i) and (iv), it is thus reasonable to take MTr((VVT)1)M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)} as a cost function.

In addition, maximizing the robustness of the reconstructed process with respect to measurement noise amounts to minimizing the condition number. Therefore, from (33), the second cost function is the condition number of BB; i.e., cond(B)\operatorname{cond}(B), defined as

cond(B)σmax(B)σmin(B)\operatorname{cond}(B)\triangleq\frac{\sigma_{\max}(B)}{\sigma_{\min}(B)}

and σmax/min(B)\sigma_{\max/\min}(B) is the maximum/minimum singular value of BB. Using (21), we have cond(B)=cond(I)cond(VT)=cond(V)\operatorname{cond}(B)=\operatorname{cond}(I)\operatorname{cond}(V^{T})=\operatorname{cond}(V) which also only depends on the input states. Similarly, we also define the set of all the optimal input states to minimize the condition number as

OIS2(d,M)argmin{ρmin}m=1M𝒟(d,M)cond(V).\mathrm{OIS}_{2}(d,M)\triangleq\underset{\left\{\rho_{m}^{\text{in}}\right\}_{m=1}^{M}\in\mathcal{D}(d,M)}{\arg\min}\operatorname{cond}(V). (71)

We then define two sets OIS1¯(d,M)\underline{\mathrm{OIS}_{1}}(d,M) and OIS2¯(d,M)\underline{\mathrm{OIS}_{2}}(d,M) to characterize two lower bounds of these two cost functions as

OIS1¯(d,M)\displaystyle\underline{\mathrm{OIS}_{1}}(d,M)\triangleq {{ρmin}m=1M𝒟(d,M)MTr((VVT)1)=d4+d3d2},\displaystyle\Big{\{}{\left\{\rho_{m}^{\text{in}}\right\}_{m=1}^{M}}\in\mathcal{D}(d,M)\mid M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)=d^{4}+d^{3}-d^{2}\Big{\}}, (72)
OIS2¯(d,M){{ρmin}m=1M𝒟(d,M)cond(V)=d+1},\underline{\mathrm{OIS}_{2}}(d,M)\triangleq\left\{{\left\{\rho_{m}^{\text{in}}\right\}_{m=1}^{M}}\in\mathcal{D}(d,M)\mid\operatorname{cond}(V)=\sqrt{d+1}\right\}, (73)

and the proof is as given in Theorem 2. In addition, we define the set of the optimal input states (OIS(d,M)\mathrm{OIS}(d,M)) and the set of the optimal input states achieving both of the lower bounds (OIS¯(d,M)\underline{\mathrm{OIS}}(d,M)) as

OIS(d,M)OIS1(d,M)OIS2(d,M),\displaystyle\mathrm{OIS}(d,M)\triangleq\mathrm{OIS}_{1}(d,M)\cap\mathrm{OIS}_{2}(d,M), (74)

and

OIS¯(d,M)OIS1¯(d,M)OIS2¯(d,M).\displaystyle\underline{\mathrm{OIS}}(d,M)\triangleq\underline{\mathrm{OIS}_{1}}(d,M)\cap\underline{\mathrm{OIS}_{2}}(d,M). (75)

Thus when OIS(d,M)\mathrm{OIS}(d,M) is non-empty, it should minimize MTr((VVT)1)M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)} and minimize cond(V)\operatorname{cond}(V) simultaneously. Additionally, when OIS¯(d,M)\underline{\mathrm{OIS}}(d,M) is non-empty, it can achieve the lower bounds of MTr((VVT)1)M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)} and cond(V)\operatorname{cond}(V) simultaneously. Then, we present the following theorem for OIS¯(d,M)\underline{\mathrm{OIS}}(d,M).

Theorem 2

For a dd-dimensional quantum process, with MM different types of input states and the parameterization matrix VV as in (20), we have MTr((VVT)1)d4+d3d2M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)}\geq d^{4}+d^{3}-d^{2} and cond(V)d+1\operatorname{cond}(V)\geq\sqrt{d+1}. These lower bounds are achieved simultaneously; i.e., {ρmin}m=1MOIS¯(d,M){\left\{\rho_{m}^{\text{in}}\right\}_{m=1}^{M}}\in\underline{\mathrm{OIS}}(d,M) if and only if the eigenvalues of VVTV^{*}V^{T} are τ1=Md\tau_{1}=\frac{M}{d} and τ2==τd2=Md(d+1)\tau_{2}=\dots=\tau_{d^{2}}=\frac{M}{d(d+1)}.

Proof:

Denote the eigenvalues of VVTV^{*}V^{T} as τ1τ2τd2>0\tau_{1}\geq\tau_{2}\geq\cdots\geq\tau_{d^{2}}>0. Since {ρmin}m=1M𝒟(d,M)\left\{\rho_{m}^{\text{in}}\right\}_{m=1}^{M}\in\mathcal{D}(d,M), we can obtain two constraints i=1d2τiM\sum_{i=1}^{d^{2}}\tau_{i}\leq M and τ1Md\tau_{1}\geq\frac{M}{d} coming from the purity requirement and unit trace of quantum states, respectively, which have been proved in [32]. To minimize MTr((VVT)1)M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)} and cond(V)\operatorname{cond}(V), we relax the optimization problems as follows:

min i=1d2Mτi\displaystyle\sum_{i=1}^{d^{2}}\frac{M}{\tau_{i}} (76)
s.t. i=1d2τiM,τ1Md,\displaystyle\sum_{i=1}^{d^{2}}\tau_{i}\leq M,\tau_{1}\geq\frac{M}{d},

and

min τ1τd2\displaystyle\sqrt{\frac{\tau_{1}}{\tau_{d^{2}}}} (77)
s.t. i=1d2τiM,τ1Md.\displaystyle\sum_{i=1}^{d^{2}}\tau_{i}\leq M,\tau_{1}\geq\frac{M}{d}.

Using Lagrange multiplier method, we have

MTr((VVT)1)d4+d3d2,M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)\geq d^{4}+d^{3}-d^{2}, (78)

and

cond(V)M/dM/(d(d+1))=d+1.\operatorname{cond}(V)\geq\sqrt{\frac{M/d}{M/\left(d(d+1)\right)}}=\sqrt{d+1}. (79)

These lower bounds hold if and only if τ1=Md\tau_{1}=\frac{M}{d} and τ2==τd2=Md(d+1)\tau_{2}=\dots=\tau_{d^{2}}=\frac{M}{d(d+1)}. ∎

Since 𝒟(d,M)\mathcal{D}(d,M) is a closed space, OIS1(d,M)\text{OIS}_{1}(d,M) and OIS2(d,M)\text{OIS}_{2}(d,M) are always non-empty for any Md2M\geq d^{2}. However, it is not clear whether this also holds for OIS(d,M)\text{OIS}(d,M). For the lower bounds of these two cost functions, if one of them is achieved, we have τ1=Md,τ2==τd2=Md(d+1)\tau_{1}=\frac{M}{d},\tau_{2}=\dots=\tau_{d^{2}}=\frac{M}{d(d+1)} and thus

OIS¯(d,M)=OIS1¯(d,M)=OIS2¯(d,M).\displaystyle\underline{\mathrm{OIS}}(d,M)=\underline{\mathrm{OIS}_{1}}(d,M)=\underline{\mathrm{OIS}_{2}}(d,M). (80)

If the two lower bounds cannot be achieved, the above three sets are empty. Therefore, (80) always holds for any Md2M\geq d^{2}. However, we still have not fully determined what values of MM make (80) non-empty. In addition, when OIS¯(d,M)\underline{\mathrm{OIS}}(d,M) is not empty, we have OIS¯(d,M)=OIS(d,M)\underline{\mathrm{OIS}}(d,M)={\mathrm{OIS}}(d,M). If {ρmin}m=1MOIS¯(d,M){\left\{\rho_{m}^{\text{in}}\right\}_{m=1}^{M}}\in\underline{\mathrm{OIS}}(d,M), all of them must be pure states from the purity requirement. This indicates that pure states may be better than mixed states in QPT for reducing the estimation error. We provide the following two examples in Appendix B which belong to OIS¯(d,M)\underline{\mathrm{OIS}}(d,M) and achieve the lower bounds: SIC (symmetric informationally complete) states with the smallest M=d2M=d^{2} and MUB (mutually unbiased bases) states for M=d(d+1)M=d(d+1). Moreover, we consider that the system is restricted to a multi-qubit system where d=2md=2^{m} and each input state is an mm-qubit tensor product state as ρj=ρj1(1)ρjm(m)\rho_{j}=\rho^{(1)}_{j_{1}}\otimes\cdots\otimes\rho^{(m)}_{j_{m}} where 1jiMi1\leq j_{i}\leq M_{i} and there are MiM_{i} different types of single qubits for the ii-th qubit of the product states. Thus, the total number of these mm-qubit tensor product states is thus M=i=1mMiM=\prod_{i=1}^{m}{M_{i}}. We have the following theorem.

Theorem 3

For mm-qubit product input states, we have MTr((VVT)1)20mM\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)}\geq 20^{m} and cond(V)3m\operatorname{cond}(V)\geq\sqrt{3^{m}}. These lower bounds are achieved simultaneously if and only if for each 1im1\leq i\leq m, {ρji(i)}i=1MiOIS¯(2,M)\left\{\rho^{(i)}_{j_{i}}\right\}_{i=1}^{M_{i}}\in\underline{\mathrm{OIS}}(2,M).

Proof:

Assuming that the parameterization matrix in (20) of all the ii-th single-qubit states {ρji(i)}ji=1Mi\left\{\rho_{j_{i}}^{(i)}\right\}_{j_{i}=1}^{M_{i}} is ViV_{i}. We thus have V=V1V2VmV=V_{1}\otimes V_{2}\cdots\otimes V_{m}. Therefore,

MTr((VVT)1)\displaystyle M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right) =Mi=1mTr((ViViT)1)Mi=1m20Mi=20m,\displaystyle=M\prod_{i=1}^{m}\operatorname{Tr}\left(\left(V_{i}^{*}V_{i}^{T}\right)^{-1}\right)\geq M\prod_{i=1}^{m}\frac{20}{M_{i}}=20^{m}, (81)

and

cond(V)=i=1mcond(Vi)i=1m3=3m.\operatorname{cond}(V)=\prod_{i=1}^{m}\operatorname{cond}\left(V_{i}\right)\geq\prod_{i=1}^{m}\sqrt{3}=\sqrt{3^{m}}. (82)

The above two equalities hold if and only if for each 1im1\leq i\leq m, {ρji(i)}i=1MiOIS¯(2,M)\left\{\rho^{(i)}_{j_{i}}\right\}_{i=1}^{M_{i}}\in\underline{\mathrm{OIS}}(2,M). ∎

A similar proof for quantum detector tomography can also be found in [32]. For one-qubit input states, they are in the Bloch sphere and [32] has proved that platonic solids–tetrahedron, cube, octahedron, dodecahedron and icosahedron construct five examples, which belong to OIS¯(2,M)\underline{\mathrm{OIS}}(2,M) with M=4,6,8,12,20M=4,6,8,12,20 and achieve one-qubit lower bounds. OIS¯(2,4)\underline{\mathrm{OIS}}(2,4) and OIS¯(2,6)\underline{\mathrm{OIS}}(2,6) are one-qubit SIC states and one-qubit MUB states, respectively. For two-qubit product input states, Cube states (product states of two one-qubit MUB states) can achieve the lower bounds MTr((VVT)1)=400M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)}=400 and cond(V)=3\operatorname{cond}(V)=3.

VI-B Optimal measurement operators

For our error bound (67) or (68), the part depending on the measurement operators is JTr((CC)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right) and we need to minimize it. Since cond(IMC)=cond(IM)cond(C)=cond(C)\operatorname{cond}(I_{M}\otimes C)=\operatorname{cond}(I_{M})\operatorname{cond}(C)=\operatorname{cond}(C), we also consider the minimum of cond(C)\operatorname{cond}(C).

Assume Pi,jP_{i,j} is the ii-th POVM element (i.e., a measurement operator) in the jj-th POVM set where 1jJ1\leq j\leq J and 1inj1\leq i\leq n_{j}. Define the set size 𝓃[n1,,nJ]\mathcal{n}\triangleq[n_{1},\cdots,n_{J}]. Therefore, the total number of measurement operators L=j=1JnjL=\sum_{j=1}^{J}n_{j}. Let {Ωi}i=1d2\left\{\Omega_{i}\right\}_{i=1}^{d^{2}} be a complete set of dd-dimensional traceless Hermitian matrices except Ω1=I/d\Omega_{1}=I/\sqrt{d}, and they satisfy Tr(ΩiΩj)=δij\operatorname{Tr}\left(\Omega_{i}^{\dagger}\Omega_{j}\right)=\delta_{ij} where δij\delta_{ij} is the Kronecker function. Then we can parameterize the POVM element as

Pi,j\displaystyle P_{i,j} =a=1d2ϕi,jaΩa,\displaystyle=\sum_{a=1}^{d^{2}}\phi_{i,j}^{a}\Omega_{a}, (83)

where ϕi,jaTr(Pi,jΩa)\phi_{i,j}^{a}\triangleq\operatorname{Tr}\left(P_{i,j}\Omega_{a}\right) is real. Thus we use ϕi,j[ϕi,j1,,ϕi,jd2]T\phi_{i,j}\triangleq\left[\phi_{i,j}^{1},\cdots,\phi_{i,j}^{d^{2}}\right]^{T} as the parameterization of Pi,jP_{i,j} and the completeness constraint (14) becomes

i=1njϕi,j=[d,0,,0]T,\sum_{i=1}^{n_{j}}\phi_{i,j}=[\sqrt{d},0,\cdots,0]^{T}, (84)

and thus

i=1njϕi,j1=d,j=1njϕi,j=d.\sum_{i=1}^{n_{j}}\phi_{i,j}^{1}=\sqrt{d},\quad\left\|\sum_{j=1}^{n_{j}}\phi_{i,j}\right\|=\sqrt{d}. (85)

Conversely, using (85), we can also obtain (84). Define

C~=[ϕ1,1,ϕn1,1,ϕ1,2,,ϕnJ,J]T,\tilde{C}=\left[\phi_{1,1},\cdots\phi_{n_{1},1},\phi_{1,2},\cdots,\phi_{n_{J},J}\right]^{T}, (86)

which is the parameterization matrix of all the measurement operators in the basis of {Ωi}i=1d2\left\{\Omega_{i}\right\}_{i=1}^{d^{2}} and is a real matrix. Since {Ωi}i=1d2\left\{\Omega_{i}\right\}_{i=1}^{d^{2}} and natural basis are both orthonormal basis, the relationship between CC and C~\tilde{C} is C~=CU\tilde{C}=CU where UU is a unitary matrix. Therefore, the eigenvalues of CCC^{\dagger}{C} and C~TC~\tilde{C}^{T}\tilde{C} are the same, and JTr((CC)1)=JTr((C~TC~)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)=J\operatorname{Tr}\left(\left(\tilde{C}^{T}\tilde{C}\right)^{-1}\right), cond(C~)=cond(C)\operatorname{cond}(\tilde{C})=\operatorname{cond}(C).

Similarly, we define the set of all dd-dimensional measurement operators with set number JJ and set size 𝓃\mathcal{n} as

(d,J,𝓃){{Pi,j}i,j=1nj,J1jJ,1inj,Pi,jd×d,Pi,j=Pi,j,Pi,j0,i=1njPi,j=Id,}.\displaystyle\mathcal{M}(d,J,\mathcal{n})\triangleq\Big{\{}\left\{P_{i,j}\right\}_{i,j=1}^{n_{j},J}\mid\forall 1\leq j\leq J,\forall 1\leq i\leq n_{j},P_{i,j}\in\mathbb{C}^{d\times d},P_{i,j}=P_{i,j}^{\dagger},P_{i,j}\geq 0,\sum_{i=1}^{n_{j}}P_{i,j}=I_{d},\Big{\}}. (87)

For general QPT, we have no prior knowledge of the output states. From (32), the following conditions are equivalent: (i) the output states cannot be uniquely identified; (ii) the measurement operators do not span the space of all dd-dimensional Hermitian matrices; (iii) CCC^{\dagger}C is singular; (iv) JTr((CC)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right) is infinite. From (i) and (iv), it is also reasonable to choose JTr((CC)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right) as the first cost function. Thus we define the set of all the optimal measurement operators to minimize JTr((CC)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right) (OMO1(d,J,𝓃)\mathrm{OMO}_{1}(d,J,\mathcal{n})) as

OMO1(d,J,𝓃)argmin{Pi,j}i,j=1nj,J(d,J,𝓃)JTr((CC)1).\mathrm{OMO}_{1}(d,J,\mathcal{n})\triangleq\underset{\left\{P_{i,j}\right\}_{i,j=1}^{n_{j},J}\in\mathcal{M}(d,J,\mathcal{n})}{\arg\min}J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right). (88)

Since maximizing the robustness of the reconstructed output states with respect to measurement noise is equivalent to minimizing the condition number, from (32), we also choose cond(C)\operatorname{cond}(C) as the second cost function. Thus, we define the set of all the optimal measurement operators to minimize cond(C)\operatorname{cond}(C) (OMO2(d,J,𝓃)\mathrm{OMO}_{2}(d,J,\mathcal{n})) as

OMO2(d,J,𝓃)argmin{Pi,j}i,j=1nj,J(d,J,𝓃)cond(C).\mathrm{OMO}_{2}(d,J,\mathcal{n})\triangleq\underset{\left\{P_{i,j}\right\}_{i,j=1}^{n_{j},J}\in\mathcal{M}(d,J,\mathcal{n})}{\arg\min}\operatorname{cond}(C). (89)

Then we define sj=1Jdnjs\triangleq\sum_{j=1}^{J}\frac{d}{n_{j}} and two sets OMO1¯(d,J,𝓃)\underline{\mathrm{OMO}_{1}}(d,J,\mathcal{n}) and OMO2¯(d,J,𝓃)\underline{\mathrm{OMO}_{2}}(d,J,\mathcal{n}) to characterize the lower bounds of these two cost functions as

OMO1¯(d,J,𝓃){{Pi,j}i,j=1nj,J(d,J,𝓃)JTr((CC)1)=J(1s+(d21)2Jds)},\displaystyle\underline{\mathrm{OMO}_{1}}(d,J,\mathcal{n})\triangleq\Big{\{}\left\{P_{i,j}\right\}_{i,j=1}^{n_{j},J}\in\mathcal{M}(d,J,\mathcal{n})\mid J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)=J\left(\frac{1}{s}+\frac{\left(d^{2}-1\right)^{2}}{Jd-s}\right)\Big{\}}, (90)
OMO2¯(d,J,𝓃)\displaystyle\underline{\mathrm{OMO}_{2}}(d,J,\mathcal{n})\triangleq {{Pi,j}i,j=1nj,J(d,J,𝓃)cond(C)=(d21)sJds},\displaystyle\Big{\{}\left\{P_{i,j}\right\}_{i,j=1}^{n_{j},J}\in\mathcal{M}(d,J,\mathcal{n})\mid\operatorname{cond}({C})=\sqrt{\frac{\left(d^{2}-1\right)s}{Jd-s}}\Big{\}}, (91)

where the proof is given as in Theorem 4. We also define the set of all the optimal measurement operators (OMO(d,J,𝓃)\mathrm{OMO}(d,J,\mathcal{n})) as

OMO(d,J,𝓃)OMO1(d,J,𝓃)OMO2(d,J,𝓃),\displaystyle\mathrm{OMO}(d,J,\mathcal{n})\triangleq\mathrm{OMO}_{1}(d,J,\mathcal{n})\cap\mathrm{OMO}_{2}(d,J,\mathcal{n}), (92)

and the set of all the optimal measurement operators achieving both of the lower bounds (OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n})) as

OMO¯(d,J,𝓃)OMO1¯(d,J,𝓃)OMO2¯(d,J,𝓃).\displaystyle\underline{\mathrm{OMO}}(d,J,\mathcal{n})\triangleq\underline{\mathrm{OMO}_{1}}(d,J,\mathcal{n})\cap\underline{\mathrm{OMO}_{2}}(d,J,\mathcal{n}). (93)

Thus, OMO(d,J,𝓃)\mathrm{OMO}(d,J,\mathcal{n}), when it is non-empty, should minimize JTr((CC)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right) and minimize cond(C)\operatorname{cond}(C) simultaneously. When OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n}) is non-empty, it can achieve the lower bounds of JTr((CC)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right) and cond(C)\operatorname{cond}(C) simultaneously. Then we present the following theorem for OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n}).

Theorem 4

For a dd-dimensional quantum process, let the total number of POVM sets be JJ and the number of POVM elements in the jj-th POVM set be njn_{j}. Then

JTr((CC)1)J(1s+(d21)2Jds)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)\geq J\left(\frac{1}{s}+\frac{\left(d^{2}-1\right)^{2}}{Jd-s}\right)

and

cond(C)(d21)sJds.\operatorname{cond}(C)\geq\sqrt{\frac{\left(d^{2}-1\right)s}{Jd-s}}.

These lower bounds are achieved simultaneously; i.e., {Pi,j}i,j=1nj,JOMO¯(d,J,𝓃)\left\{P_{i,j}\right\}_{i,j=1}^{n_{j},J}\in\underline{\mathrm{OMO}}(d,J,\mathcal{n}), if and only if the eigenvalues of CCC^{\dagger}C are v1=sv_{1}=s and v2==vd2=Jdsd21v_{2}=\dots=v_{d^{2}}=\frac{Jd-s}{d^{2}-1}.

Proof:

Denote the eigenvalues of CCC^{\dagger}{C} or C~TC~\tilde{C}^{T}\tilde{C} as v1v2vd2>0v_{1}\geq v_{2}\geq\cdots\geq v_{d^{2}}>0. Since all the measurement operators are positive semidefinite, we have

Tr((Pm,j)Pn,j)=(ϕm,j)Tϕn,j0.\operatorname{Tr}\left((P_{m,j})^{\dagger}P_{n,j}\right)=\left(\phi_{m,j}\right)^{T}\phi_{n,j}\geq 0. (94)

Using (84), we have

d\displaystyle d =i=1njϕi,j2=i=1njϕi,j2+mn(ϕm,j)Tϕn,j\displaystyle=\left\|\sum_{i=1}^{n_{j}}\phi_{i,j}\right\|^{2}=\sum_{i=1}^{n_{j}}\left\|\phi_{i,j}\right\|^{2}+\sum_{m\neq n}\left(\phi_{m,j}\right)^{T}\phi_{n,j} (95)
i=1njϕi,j2.\displaystyle\geq\sum_{i=1}^{n_{j}}\left\|\phi_{i,j}\right\|^{2}.

Therefore,

i=1d2vi=Tr(C~TC~)=j=1J(i=1njϕi,j2)Jd.\sum_{i=1}^{d^{2}}v_{i}=\operatorname{Tr}\left(\tilde{C}^{T}\tilde{C}\right)=\sum_{j=1}^{J}\left(\sum_{i=1}^{n_{j}}\left\|\phi_{i,j}\right\|^{2}\right)\leq Jd. (96)

Since

i=1nj(ϕi,j1)2(i=1njϕi,j1)2nj=dnj,\sum_{i=1}^{n_{j}}\left(\phi_{i,j}^{1}\right)^{2}\geq\frac{\left(\sum_{i=1}^{n_{j}}\phi_{i,j}^{1}\right)^{2}}{n_{j}}=\frac{d}{n_{j}}, (97)

the first diagonal element of C~TC~\tilde{C}^{T}\tilde{C} is

(C~TC~)11\displaystyle\left(\tilde{C}^{T}\tilde{C}\right)_{11} =j=1J(i=1nj(ϕi,j1)2)j=1Jdnj,\displaystyle=\sum_{j=1}^{J}\left(\sum_{i=1}^{n_{j}}\left(\phi_{i,j}^{1}\right)^{2}\right)\geq\sum_{j=1}^{J}\frac{d}{n_{j}}, (98)

and thus

v1(C~TC~)11j=1Jdnj.v_{1}\geq\left(\tilde{C}^{T}\tilde{C}\right)_{11}\geq\sum_{j=1}^{J}\frac{d}{n_{j}}. (99)

To minimize JTr((CC)1)J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right) and cond(C)\operatorname{cond}(C), the optimization problems can be relaxed as follows

min i=1d2Jvi\displaystyle\sum_{i=1}^{d^{2}}\frac{J}{v_{i}} (100)
s.t. i=1d2viJd,v1j=1Jdnj,\displaystyle\sum_{i=1}^{d^{2}}v_{i}\leq Jd,v_{1}\geq\sum_{j=1}^{J}\frac{d}{n_{j}},

and

min v1vd2\displaystyle\sqrt{\frac{v_{1}}{v_{d^{2}}}} (101)
s.t. i=1d2viJd,v1j=1Jdnj.\displaystyle\sum_{i=1}^{d^{2}}v_{i}\leq Jd,v_{1}\geq\sum_{j=1}^{J}\frac{d}{n_{j}}.

Using the Lagrange multiplier method, we have

JTr((CC)1)=JTr((C~TC~)1)J(1s+(d21)2Jds),\displaystyle J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)=J\operatorname{Tr}\left(\left(\tilde{C}^{T}\tilde{C}\right)^{-1}\right)\geq J\left(\frac{1}{s}+\frac{\left(d^{2}-1\right)^{2}}{Jd-s}\right), (102)

and

cond(C)s(Jds)/(d21)=(d21)sJds.\operatorname{cond}({C})\geq\sqrt{\frac{s}{(Jd-s)/(d^{2}-1)}}=\sqrt{\frac{\left(d^{2}-1\right)s}{Jd-s}}. (103)

These lower bounds are achieved simultaneously if and only if v1=j=1Jdnjv_{1}=\sum_{j=1}^{J}\frac{d}{n_{j}} and v2==vd2=Jdv1d21v_{2}=\dots=v_{d^{2}}=\frac{Jd-v_{1}}{d^{2}-1}. ∎

Since (d,J,𝓃)\mathcal{M}(d,J,\mathcal{n}) is a closed space, there always exist non-empty OMO1(d,J,𝓃)\mathrm{OMO}_{1}(d,J,\mathcal{n}) and OMO2(d,J,𝓃)\mathrm{OMO}_{2}(d,J,\mathcal{n}) for arbitrary set number JJ and set size 𝓃\mathcal{n}. However, it is not clear whether this also holds for OMO(d,J,𝓃)\mathrm{OMO}(d,J,\mathcal{n}). For the lower bounds of these two cost functions, if one of them is achieved, we have v1=j=1Jdnj,v2==vd2=Jdv1d21v_{1}=\sum_{j=1}^{J}\frac{d}{n_{j}},v_{2}=\dots=v_{d^{2}}=\frac{Jd-v_{1}}{d^{2}-1} and thus

OMO¯(d,J,𝓃)=OMO1¯(d,J,𝓃)=OMO2¯(d,J,𝓃).\displaystyle\underline{\mathrm{OMO}}(d,J,\mathcal{n})=\underline{\mathrm{OMO}_{1}}(d,J,\mathcal{n})=\underline{\mathrm{OMO}_{2}}(d,J,\mathcal{n}). (104)

If the two lower bounds can not be achieved, the above three sets are empty. Therefore, (104) always holds for any JJ and 𝓃\mathcal{n}. Our results show that if {Pi,j}i,j=1nj,JOMO¯(d,J,𝓃)\left\{P_{i,j}\right\}_{i,j=1}^{n_{j},J}\in\underline{\mathrm{OMO}}(d,J,\mathcal{n}), then in each POVM set, these POVM elements are orthogonal; i.e., Tr((Pm,j)Pn,j)=0\operatorname{Tr}\left((P_{m,j})^{\dagger}P_{n,j}\right)=0 (mn)(m\neq n) from the first constraint. To achieve these lower bounds, we must have JdJ\geq d because the maximum number of orthogonal POVM elements in one set is dd. It is an open problem as to whether OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n}) is non-empty for arbitrary set number JJ and set sizes njn_{j} (1jJ1\leq j\leq J). When it is non-empty, we have OMO¯(d,J,𝓃)=OMO(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n})={\mathrm{OMO}}(d,J,\mathcal{n}).

Here, we give one example of OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n}): MUB (mutually unbiased bases) measurement with J=d+1J=d+1 and nj=dn_{j}=d for 1jd+11\leq j\leq d+1, which can achieve these lower bounds. Two sets of orthogonal bases m={|ψim}i=1d\mathcal{H}^{m}=\{\left|\psi_{i}^{m}\right\rangle\}_{i=1}^{d} and n={|ψjn}j=1d\mathcal{H}^{n}=\{\left|\psi_{j}^{n}\right\rangle\}_{j=1}^{d} are called mutually unbiased if and only if [51]

|ψim|ψjn|2={1d for mn,δij for m=n.\left|\left\langle\psi_{i}^{m}|\psi_{j}^{n}\right\rangle\right|^{2}=\left\{\begin{array}[]{ll}\frac{1}{d}&\text{ for }m\neq n,\\ \delta_{ij}&\text{ for }m=n.\end{array}\right. (105)

For a prime pp and a positive integer kk, there exist maximally d+1d+1 sets of mutually unbiased bases in Hilbert spaces of prime-power dimension d=pkd=p^{k} [52]. For other values, it is still an open problem as to whether there exist d+1d+1 sets of mutually unbiased bases. The corresponding MUB measurements are {|ψimψim|}m,i=1d+1,d\left\{\left|\psi_{i}^{m}\right\rangle\left\langle\psi_{i}^{m}\right|\right\}_{m,i=1}^{d+1,d}. Then we give the following proposition with a similar proof to that of Proposition 2 in [32].

Proposition 2

dd-dimensional MUB measurements (when they exist) belong to OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n}) with J=d+1J=d+1 and nj=dn_{j}=d for 1jd+11\leq j\leq d+1.

Proof:

We assume that the standard singular value decomposition (SVD) of CC is C=UcΣVcC=U_{c}\Sigma V^{\dagger}_{c}. Thus, using (29) and (105), we have

CC=[Id(1d)d(1d)d(1d)dId(1d)d(1d)d(1d)dId]=UcΣΣTUcCC^{\dagger}=\left[\begin{array}[]{cccc}I_{d}&\left(\frac{1}{d}\right)_{d}&\cdots&\left(\frac{1}{d}\right)_{d}\\ \left(\frac{1}{d}\right)_{d}&I_{d}&\cdots&\left(\frac{1}{d}\right)_{d}\\ \vdots&\vdots&\vdots&\vdots\\ \left(\frac{1}{d}\right)_{d}&\left(\frac{1}{d}\right)_{d}&\cdots&I_{d}\end{array}\right]=U_{c}\Sigma\Sigma^{T}U^{\dagger}_{c} (106)

where (1d)d\left(\frac{1}{d}\right)_{d} denotes the d×dd\times d matrix where all the elements are 1d\frac{1}{d}. Therefore,

CC\displaystyle C^{\dagger}C =VcΣTUcTUcΣVc=Vcdiag(d+1,Id21)Vc.\displaystyle=V_{c}\Sigma^{T}U^{T}_{c}U_{c}\Sigma V^{\dagger}_{c}=V_{c}\operatorname{diag}\left(d+1,I_{d^{2}-1}\right)V^{\dagger}_{c}. (107)

Thus the eigenvalues of CCC^{\dagger}C are v1=d+1,v2==vd2=1v_{1}=d+1,v_{2}=\cdots=v_{d^{2}}=1 for MUB measurements and we can then calculate JTr((CC)1)=d3+d2dJ\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)=d^{3}+d^{2}-d and cond(C)=d+1\operatorname{cond}(C)=\sqrt{d+1}, which achieve the lower bounds. ∎

Remark 3

SIC-POVM are usually thought as optimal (in certain senses) measurements in quantum physics. The simplest mathematical definition of SIC-POVM is a set of d2d^{2} normalized vectors |ψk\left|\psi_{k}\right\rangle in d\mathbb{C}^{d} satisfying [53]

|ψj|ψk|2=1d+1,jk.\left|\left\langle\psi_{j}|\psi_{k}\right\rangle\right|^{2}=\frac{1}{d+1},j\neq k. (108)

The corresponding POVM elements are Pi=1d|ψiψi|P_{i}=\frac{1}{d}\left|\psi_{i}\right\rangle\left\langle\psi_{i}\right| for 1id21\leq i\leq d^{2} and i=1d2Pi=I\sum_{i=1}^{d^{2}}P_{i}=I with J=1J=1, n1=d2n_{1}=d^{2}. However, here SIC-POVM does not belong to OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n}) because these POVM elements are not orthogonal with each other. Therefore, the first constraint is not satisfied. We can calculate JTr((CC)1)=d3+d2dJ\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)=d^{3}+d^{2}-d and cond(C)=d+1\operatorname{cond}(C)=\sqrt{d+1} for SIC-POVM. These values are the same as MUB measurement which has J=d+1J=d+1 and nj=dn_{j}=d for 1jd+11\leq j\leq d+1. Thus we conjecture that SIC-POVM belongs to OMO(d,J,𝓃)\mathrm{OMO}(d,J,\mathcal{n}) for J=1J=1, n1=d2n_{1}=d^{2}.

Remark 4

Our analysis for optimal measurement operators can also be applied to QST. A similar problem for QST has been discussed in [46]. However, their work is restricted to optimal projective measurement, while our result considers general POVM. Miranowicz et al. [54] also considered the minimum condition number for the optimal measurement operators in QST and proposed generalized Pauli operators whose condition number is 11. However, generalized Pauli operators are realized from projective measurements and their condition number does not directly link to the experimental data.

With input states like SIC states or MUB states belonging to OIS¯(d,M)\underline{\mathrm{OIS}}(d,M) and measurement operators like MUB measurements belonging to OMO¯(d,J,𝓃)\underline{\mathrm{OMO}}(d,J,\mathcal{n}), the error upper bound of (67) becomes O(d4Tr(F)N)O\left(\frac{d^{4}\operatorname{Tr}(F)}{\sqrt{N}}\right). The limitation of our error upper bound is that it might be loose with respect to the system dimension dd and we leave it an open problem to characterize dd more accurately. We can obtain a tighter upper bound if we use the natural basis states {|jk|}1j,kd\{|j\rangle\langle k|\}_{1\leq j,k\leq d} and M=d2M=d^{2} with MUB measurements. Note that in experiments, all of the quantum states are Hermitian. Hence, we cannot actually use |jk||j\rangle\langle k| (jkj\neq k) as input quantum states. According to [2], since we have

|jk|=\displaystyle|j\rangle\langle k|= |++|+i||1+i2|jj|1+i2|kk|,\displaystyle|+\rangle\langle+|+\mathrm{i}|-\rangle\langle-|-\frac{1+\mathrm{i}}{2}|j\rangle\langle j|-\frac{1+\mathrm{i}}{2}|k\rangle\langle k|, (109)

where |±=(|j±|k)/2|\pm\rangle=(|j\rangle\pm|k\rangle)/\sqrt{2}, (|jk|)\mathcal{E}(|j\rangle\langle k|) can be obtained as

(|jk|)=(|++|)+i(||)1+i2(|jj|)1+i2(|kk|).\begin{array}[]{rl}\mathcal{E}\left(|j\rangle\langle k|\right)=&\mathcal{E}(|+\rangle\langle+|)+\mathrm{i}\mathcal{E}(|-\rangle\langle-|)-\dfrac{1+\mathrm{i}}{2}\mathcal{E}(|j\rangle\langle j|)-\dfrac{1+\mathrm{i}}{2}\mathcal{E}(|k\rangle\langle k|).\\ \end{array} (110)

Using natural basis states which we define as all the states needed by the RHS of (109), Wang et al. [13] have proved that BB is a permutation matrix. Thus, the error analysis in Step 2 becomes

D^D\displaystyle\|\hat{D}-D\| =vec1(Bvec(A^))vec1(Bvec(A))=B(vec(A^)vec(A))=A^A,\displaystyle=\left\|\operatorname{vec}^{-1}\left(B^{\dagger}\operatorname{vec}(\hat{A})\right)-\operatorname{vec}^{-1}\left(B^{\dagger}\operatorname{vec}(A)\right)\right\|=\left\|B^{\dagger}(\operatorname{vec}(\hat{A})-\operatorname{vec}(A))\right\|=\|\hat{A}-A\|, (111)

and the error analysis in other steps are the same as our analysis. Therefore, we obtain a tighter upper bound as O(d3Tr(F)N)O\left(\frac{d^{3}\operatorname{Tr}(F)}{\sqrt{N}}\right).

VI-C On the different types of input quantum states and measurement operators

In practice, the different types of input quantum states MM and measurement operators JJ may vary and here we discuss the MSE scaling with respect to MM and JJ. We first allow MM to change, with the dimension dd, the measurement operators and the copy number of each input state NN given. The different input states are randomly generated as [55, 56] or as [57] for truncated coherent states. Coherent states are more straightforward to be prepared than Fock states in quantum optics and have been used in QPT in [58, 59]. Each scenario with NN copies generates the input states ρmin\rho_{m}^{\text{in}} according to a certain probability distribution and the total copies are Nt=NMN_{t}=NM, based on which E()\mathrm{E}(\cdot) denotes the expectation, different from 𝔼()\mathbb{E}(\cdot) in Theorem 1. Define fmvec(ρmin)E(vec(ρmin))\mathrm{f}_{m}\triangleq\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)-\mathrm{E}\left(\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)\right), which is i.i.d. with respect to the subscript ii. Thus we have

E{Tr((VVT)1)}\displaystyle\quad\mathrm{E}\left\{\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)\right\}
=1MTr{ME[(m=1ME(vec(ρmin))+fm)(E(vec(ρmin)T)+fmT))1]}\displaystyle=\frac{1}{M}\operatorname{Tr}\Big{\{}M\mathrm{E}\Big{[}\Big{(}\sum_{m=1}^{M}\mathrm{E}\left(\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)^{*}\right)+\mathrm{f}_{m}^{*}\Big{)}\cdot\Big{(}\mathrm{E}\Big{(}\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)^{T}\Big{)}+\mathrm{f}_{m}^{T}\Big{)}\Big{)}^{-1}\Big{]}\Big{\}}
=1MTr{ME[(ME(vec(ρmin))E(vec(ρmin)T)+m=1MfmfmT)1]}\displaystyle=\frac{1}{M}\operatorname{Tr}\Big{\{}M\mathrm{E}\Big{[}\Big{(}M\mathrm{E}\left(\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)^{*}\right)\mathrm{E}\left(\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)^{T}\right)+\sum_{m=1}^{M}\mathrm{f}_{m}^{*}\mathrm{f}_{m}^{T}\Big{)}^{-1}\Big{]}\Big{\}}
=1MTr{[E(vec(ρmin))E(vec(ρmin)T)+E(1Mm=1MfmfmT)]1}.\displaystyle=\frac{1}{M}\operatorname{Tr}\Big{\{}\Big{[}\mathrm{E}\left(\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)^{*}\right)\mathrm{E}\Big{(}\operatorname{vec}\left(\rho_{m}^{\text{in}}\right)^{T}\Big{)}+\mathrm{E}\Big{(}\frac{1}{M}\sum_{m=1}^{M}\mathrm{f}_{m}^{*}\mathrm{f}_{m}^{T}\Big{)}\Big{]}^{-1}\Big{\}}. (112)

According to the central limit theorem, as MM tends to infinity, E(1Mm=1MfmfmT)\mathrm{E}(\frac{1}{M}\sum_{m=1}^{M}\mathrm{f}_{m}^{*}\mathrm{f}_{m}^{T}) converges to a fixed matrix [57]. Therefore the expectation of Tr((VVT)1)\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right) is

E{Tr((VVT)1)}=O(1M).\displaystyle\mathrm{E}\left\{\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)\right\}=O\left(\frac{1}{{M}}\right). (113)

In Step 1, since NN is given, using (46), we have

Ecolm(A^T)colm(AT)2=O(1).\displaystyle\mathrm{E}\left\|\operatorname{col}_{m}(\hat{A}^{T})-\operatorname{col}_{m}(A^{T})\right\|^{2}=O\left(1\right). (114)

In Step 2, assume that the SVD of (VVT)1V\left(V^{*}V^{T}\right)^{-1}V^{*} is

(VVT)1V=U[Σ1,0]W,\left(V^{*}V^{T}\right)^{-1}V^{*}=U\left[\Sigma_{1},0\right]W^{\dagger}, (115)

where Σ1\Sigma_{1} is a d2×d2d^{2}\times d^{2} diagonal matrix, and UU and WW are unitary. Using (49) and (113), EΣ12=O(1M)\mathrm{E}\left\|\Sigma_{1}\right\|^{2}=O\left(\frac{1}{{M}}\right). Let

hm=W(colm(A^)colm(A))=[hm(1)hm(2)],h_{m}=W^{\dagger}\left(\operatorname{col}_{m}(\hat{A})-\operatorname{col}_{m}(A)\right)=\left[\begin{array}[]{l}h_{m}^{(1)}\\ h_{m}^{(2)}\end{array}\right], (116)

where hm(1)h_{m}^{(1)} is a d2×1d^{2}\times 1 vector, hm(2)h_{m}^{(2)} is a (Md2)×1(M-d^{2})\times 1 vector and each element in hmh_{m} scales as O(1)O(1) from (114). Then for the error in Step 2, we have

ERT(Id2(VVT)1V)(vec(A^)vec(A))2\displaystyle\quad\mathrm{E}\left\|R^{T}\left(I_{d^{2}}\otimes\left(V^{*}V^{T}\right)^{-1}V^{*}\right)(\operatorname{vec}(\hat{A})-\operatorname{vec}(A))\right\|^{2}
=Em=1d2(VVT)1V(colm(A^)colm(A))2\displaystyle=\mathrm{E}\sum_{m=1}^{d^{2}}\left\|\left(V^{*}V^{T}\right)^{-1}V^{*}\left(\operatorname{col}_{m}(\hat{A})-\operatorname{col}_{m}(A)\right)\right\|^{2}
=Em=1d2U[Σ1,0][hm(1)hm(2)]2\displaystyle=\mathrm{E}\sum_{m=1}^{d^{2}}\left\|U\left[\Sigma_{1},0\right]\left[\begin{array}[]{l}h_{m}^{(1)}\\ h_{m}^{(2)}\end{array}\right]\right\|^{2}
=Em=1d2Σ1hm(1)2=O(1M).\displaystyle=\mathrm{E}\sum_{m=1}^{d^{2}}\left\|\Sigma_{1}h_{m}^{(1)}\right\|^{2}=O\left(\frac{1}{{M}}\right). (117)

Since Steps 3–4 are not related to the scaling on MM, the final error scaling on MM is

EX^X=O(1M).\mathrm{E}\|\hat{X}-X\|=O\left(\frac{1}{\sqrt{M}}\right). (118)

From our error upper bound in (67), using (113), we have

E{MTr((VVT)1)}=O(1),\displaystyle\mathrm{E}\left\{M\operatorname{Tr}\left(\left(V^{*}V^{T}\right)^{-1}\right)\right\}=O\left(1\right), (119)

and thus the scaling of (67) is O(1)O\left(1\right) with given NN and dd. But the accurate scaling on MM is EX^X=O(1M)\mathrm{E}\|\hat{X}-X\|=O\left(\frac{1}{\sqrt{M}}\right). Therefore, our error upper bound (67) is not always tight with respect to MM.

Then we consider the MSE scaling of our TSS on JJ. With given input states, NN and dd, we can also generate different measurement operators according to certain probability distributions. Similarly, we can prove

Ecolm(A^T)colm(AT)2=O(1),\mathrm{E}\left\|\operatorname{col}_{m}(\hat{A}^{T})-\operatorname{col}_{m}(A^{T})\right\|^{2}=O(1), (120)

and

E{JTr((CC)1)}=O(1).\displaystyle\mathrm{E}\left\{J\operatorname{Tr}\left(\left(C^{\dagger}C\right)^{-1}\right)\right\}=O\left(1\right). (121)

Therefore, our error upper bound (67) is tight with respect to JJ.

VII Numerical examples

To perform measurement on the output states, there are different measurement bases such as SIC-POVM [53], MUB measurement [52], and Cube bases [60]. In this section, we use Cube bases, as it is relatively easy to be implemented in experiments. For one-qubit systems, the Cube bases are {I±σx2,I±σy2,I±σz2}\left\{\frac{I\pm\sigma_{x}}{2},\frac{I\pm\sigma_{y}}{2},\frac{I\pm\sigma_{z}}{2}\right\} where σx\sigma_{x}, σy\sigma_{y}, σz\sigma_{z} are Pauli matrices. For multi-qubit systems, the Cube bases are the tensor products of one-qubit Cube bases.

VII-A Performance illustration

We consider a 44-dimensional quantum system and the TP process matrix XX is determined by {𝒜i}i=13\{\mathcal{A}_{i}\}_{i=1}^{3} where

𝒜1=U1diag(0.5,0.4,0,0),\displaystyle\mathcal{A}_{1}=U_{1}\operatorname{diag}(0.5,0.4,0,0), (122)
𝒜2=U2diag(0.1,0.2,0,0),\displaystyle\mathcal{A}_{2}=U_{2}\operatorname{diag}(0.1,0.2,0,0),
𝒜3=U3I𝒜1𝒜1𝒜2𝒜2,\displaystyle\mathcal{A}_{3}=U_{3}\sqrt{I-\mathcal{A}_{1}^{\dagger}\mathcal{A}_{1}-\mathcal{A}_{2}^{\dagger}\mathcal{A}_{2}},

where U1U_{1}, U2U_{2} and U3U_{3} are random unitary matrices [55, 56].

We use four different sets of probe states. The first set is SIC state which is given in (133) in Appendix B and M=16M=16. The second set is MUB states which is given in (135) in Appendix B and M=20M=20. The third one is random states where we randomly generate 2020 input states using the algorithm in [55, 56]. The fourth set is natural basis states in the RHS of (109). The result of MSE (mean squared error) versus the total number of copies Nt=MNN_{t}=MN is shown in Fig. 2(a), where MSE=𝔼X^X2\operatorname{MSE}=\mathbb{E}||\hat{X}-X||^{2} and we make all of the four sets have the same NtN_{t}. For each number of copies NtN_{t}, we repeat our algorithm 2020 times and obtain the average MSE and error bars. For all the four sets of input states, the scaling of MSE is basically O(1/Nt)O(1/N_{t}) which satisfies Theorem 1. Moreover, the MSEs of SIC states, MUB states, natural basis states are all smaller than random states. The MSEs of SIC states and MUB states are quite similar and both smaller than natural basis states. For non-TP processes, we consider the same 𝒜1,𝒜2\mathcal{A}_{1},\mathcal{A}_{2} and different 𝒜3\mathcal{A}_{3} as

𝒜3=U3U4diag(1,0.8,0.7,0.5)U4𝒜1𝒜1𝒜2𝒜2,\mathcal{A}_{3}=U_{3}\sqrt{U_{4}\operatorname{diag}(1,0.8,0.7,0.5)U_{4}^{\dagger}-\mathcal{A}_{1}^{\dagger}\mathcal{A}_{1}-\mathcal{A}_{2}^{\dagger}\mathcal{A}_{2}}, (123)

where U4U_{4} is a random unitary matrix. The result for non-TP processes is shown in Fig. 2(b) which is similar to Fig. 2(a). The MSE of non-TP processes is a little smaller than that of TP processes because the constraint for TP processes is stricter than that for non-TP processes.

Refer to caption
Refer to caption
Figure 2: Log-log plot of MSE versus the total number of copies NtN_{t} using SIC states, MUB states, random states and natural basis states. (a) TP process; (b) non-TP process.

We also test our algorithm on the IBM Quantum device [61]. Here we perform QPT on the CNOT gate defined as

CNOT=[1000000100100100],\text{CNOT}=\left[\begin{array}[]{llll}1&0&0&0\\ 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0\end{array}\right], (124)

in the IBM Quantum Composer. For the input states, we use states {I+σx2,I+σy2,I±σz2}2\left\{\frac{I+\sigma_{x}}{2},\frac{I+\sigma_{y}}{2},\frac{I\pm\sigma_{z}}{2}\right\}^{\otimes 2} which can be generated by applying Hadamard and SS^{\dagger} gates to qubits initialized in the |0|0\rangle state [62]. In addition, we use Cube measurements. Fig. 3 is an example of IBM Quantum Composer where the input state is Iσz2I+σx2\frac{I-\sigma_{z}}{2}\otimes\frac{I+\sigma_{x}}{2} and the measurement operators are I±σy2I±σx2\frac{I\pm\sigma_{y}}{2}\otimes\frac{I\pm\sigma_{x}}{2}.

Refer to caption
Figure 3: An example of the IBM Composer. The input state is Iσz2I+σx2\frac{I-\sigma_{z}}{2}\otimes\frac{I+\sigma_{x}}{2} and the measurement operators are I±σy2I±σx2\frac{I\pm\sigma_{y}}{2}\otimes\frac{I\pm\sigma_{x}}{2}.

If we have prior knowledge that the process is unitary, some references have proposed effective identification algorithms [63, 64]. Here we assume that we do not have prior knowledge that the unknown process is in essence unitary. We perform these numerical examples using the MATLAB on a classical computer and the ibmq_qasm_simulatoribmq\_qasm\_simulator. The total numbers of copies are Nt=1.08×105,5.40×105,2.70×106,1.35×107N_{t}=1.08\times 10^{5},5.40\times 10^{5},2.70\times 10^{6},1.35\times 10^{7} and the experiments are repeated 5 times. Then we apply our TSS algorithm and the results are shown in Fig. 4. Besides MSE, we also consider another common fidelity metric, defined as [23]

F(X^,X)[TrX^XX^]2/[Tr(X)Tr(X^)],F(\hat{X},X)\triangleq\left[\operatorname{Tr}\sqrt{\sqrt{\hat{X}}X\sqrt{\hat{X}}}\right]^{2}/\left[\operatorname{Tr}\left(X\right)\operatorname{Tr}\left(\hat{X}\right)\right], (125)

and the correspond infidelity is defined as 1F(X^,X)1-F(\hat{X},X). From Fig. 4, the scalings of the MSEs from ibmq_qasm_simulatoribmq\_qasm\_simulator and simulation are also both O(1/Nt)O(1/N_{t}) which satisfies Theorem 1. The scalings of the infidelities are both O(1/Nt)O(1/\sqrt{N_{t}}) because the CNOT process is rank-deficient. A similar infidelity scaling has also been studied in QST [51, 65]. In addition, the errors between simulation and ibmq_qasm_simulatoribmq\_qasm\_simulator are close. Then we apply our algorithm on the ibmq_quitoibmq\_quito 5-qubit system. The MSE and fidelity are presented in Fig. 5 where the total number of copies are Nt=4.32×105,5.76×105,7.20×105,8.64×105,10.08×105,11.52×105N_{t}=4.32\times 10^{5},5.76\times 10^{5},7.20\times 10^{5},8.64\times 10^{5},10.08\times 10^{5},11.52\times 10^{5}, respectively, and we repeat 33 times. These results are both worse than ibmq_qasm_simulatoribmq\_qasm\_simulator and simulation because there exist state preparation error and readout error in ibmq_quitoibmq\_quito [61]. The average CNOT error is 1.253×1021.253\times 10^{-2} and average readout error is 4.430×1024.430\times 10^{-2} for ibmq_quitoibmq\_quito. Thus, the approximate fidelity is (11.253×102)(14.430×102)=0.9437(1-1.253\times 10^{-2})(1-4.430\times 10^{-2})=0.9437.

Refer to caption
Figure 4: Log-log plot of MSE and infidelity versus the total number of copies NtN_{t} for the CNOT process using simulation (Sim) by MATLAB and ibmq_qasm_simulatoribmq\_qasm\_simulator (QASM).
Refer to caption
Figure 5: MSE and infidelity versus the total number of copies NtN_{t} for the CNOT process using ibmq_quitoibmq\_quito 5-qubit system.

VII-B Performance comparison

Here, we compare the performance between our TSS algorithm and the convex optimization method in [14]. Baldwin et al. [14] formulated the TP QPT problem as a convex optimization problem

min\displaystyle\min m,l|mlTr(𝒪mlX^)|2\displaystyle\sum_{m,l}\left|\mathcal{g}_{ml}-\operatorname{Tr}\left(\mathcal{O}_{ml}^{\dagger}\hat{X}\right)\right|^{2} (126)
s.t. j,kX^jkEkEj=I,\displaystyle\sum_{j,k}\hat{X}_{jk}E_{k}^{\dagger}E_{j}=I,
X^=X^,X^0,\displaystyle\hat{X}=\hat{X}^{\dagger},\quad\hat{X}\geq 0,

where ml\mathcal{g}_{ml} is measurement data, 𝒪ml\mathcal{O}_{ml} is a constant matrix which is the tensor product between the transpose of the mm-th input states and ll-th measurement bases where 1mM1\leq m\leq M, 1lL1\leq l\leq L, and X^\hat{X} is the estimated process matrix. The detailed description can be found in equation (13) in [14]. This problem can be solved as a convex semidefinite program, which is realized by CVX [66, 67].

Refer to caption
Figure 6: Using random quantum states, running time TT and the MSE versus qubit number NqN_{q} for the convex optimization method in [14] and our TSS algorithm. The corresponding storage requirement without OO is labeled near the running time.

We randomly generate input quantum states as [55, 56] and M=d(d+1)M=d(d+1). With given number of copies N=3×104N=3\times 10^{4} for each output state and randomly generated process as in (122), a comparison of results is shown in Fig. 6 where the horizontal axis is the number of qubits NqN_{q}, the left vertical axis is the MSE on a logarithmic scale and the right vertical axis is the running time TT (in seconds) on a logarithmic scale. We repeat the simulation 1010 times and obtain the average MSE, the average running time and error bars.

When the two algorithms have a similar MSE (we set the error of convex optimization a little larger than TSS by changing CVX precision), the running time of TSS algorithm is smaller than convex optimization by approximately three orders of magnitude for three-qubit systems, indicating that our TSS algorithm is more efficient than the convex optimization method. Since M=d(d+1)M=d(d+1) and we choose Cube bases where L=6NqL=6^{N_{q}}, the computational complexity for our algorithm is O(96Nq)O(96^{N_{q}}). For the simulated running time of our algorithm, since our computational complexity analysis is asymptotical on dd, we fit the rightmost three points and the slope is 2.08432.0843, which is close to the theoretical value log10(96)=1.9823\log_{10}(96)=1.9823. The slight difference might be attributed to fact that the qubit number is not large enough. The storage requirement is also labeled by the black numbers in Fig. 6. The storage requirement for our algorithm is O(ML)O(ML). Since we use Cube bases of measurement, L=6Nq,M=4Nq+2NqL=6^{N_{q}},M=4^{N_{q}}+2^{N_{q}} and thus ML=24Nq+12NqML=24^{N_{q}}+12^{N_{q}}. As NqN_{q} increases from 11 to 55, MLML are 3636, 720720, 1555215552, 352512352512 and 82114568211456, respectively. For the convex optimization method, the main storage requirement is the cost of storing 𝒪ml\mathcal{O}_{ml}, which is a d2×d2d^{2}\times d^{2} matrix and the total number of these matrices is MLML. Thus, the storage requirement is O(MLd4)O(MLd^{4}). As NqN_{q} increases from 11 to 33, MLd4=384Nq+192NqMLd^{4}=384^{N_{q}}+192^{Nq} are 576576, 184320184320 and 6370099263700992, respectively. Thus, our algorithm needs less storage than the convex optimization method.

Refer to caption
Figure 7: Using random quantum states, the running time TT and the MSE versus the different type of input states MM for the convex optimization method in [14] and our TSS algorithm.

We then compare the two algorithms by changing the types of input states MM with d=4d=4, where we randomly generate input states as in [55, 56] and the number of copies for each output state is N=32×104N=3^{2}\times 10^{4}. The comparison results are shown in Fig. 7 where the horizontal axis is the different types of input states MM, the left vertical axis is the MSE and the right vertical axis is the running time TT, all on a logarithmic scale. From Fig. 7, the scaling between MSE and MM is 1.1474-1.1474, which is close to 1-1 and consistent with (118). This scaling is different from the scaling on NtN_{t} in Fig. 2 because here NN is given and MM increases, while in Fig. 2, MM is given and NN increases. For the simulated running time of our algorithm, we fit the rightmost three points to obtain the slope 1.00901.0090, which is close to the theoretical value 11. For the convex optimization method, the slope of the fitted line of the rightmost three points is 1.91861.9186, which is larger than that of our TSS algorithm, indicating that our TSS algorithm is more efficient than the convex optimization method.

VIII Conclusion

In this paper, we have introduced an analytical two-stage solution (TSS) applicable to both trace-preserving and non-trace-preserving QPT. Leveraging the natural basis, we have utilized the tensor structure of the coefficient matrix and provided insights into the computational complexity and storage requirements for our algorithm. Our contributions include the establishment of an analytical upper bound for errors and the optimization of input quantum states and measurement operators based on this error upper bound and condition number. The effectiveness and theoretical soundness of our algorithm have been demonstrated through numerical examples and testing on an IBM Quantum device. Furthermore, we have benchmarked our TSS algorithm against the convex optimization method outlined in [14], and the results show that our algorithm is more efficient in terms of both time and space costs. Further work will focus on extending our algorithm to AAPT.

Acknowledgments

The authors would like to thank Prof. Lloyd C. L. Hollenberg for the helpful discussion. This research was supported by the University of Melbourne through the establishment of the IBM Quantum Network Hub at the University. Shuixin Xiao would like to thank the support from the IEEE Control Systems Society Graduate Collaboration Fellowship.

Appendix A Several propositions and lemmas

Here, we give some propositions and lemmas which are utilized in proving Theorem 1.

Proposition 3

For any An×kA\in\mathbb{C}^{n\times k} and bmkb\in\mathbb{C}^{mk}, we have

(ImA)bAb\left\|\left(I_{m}\otimes A\right)b\right\|\leq\|A\|\|b\| (127)
Proof:

Let

b=[b1T,b2T,,bmT]T,b=\left[b_{1}^{T},b_{2}^{T},\cdots,b_{m}^{T}\right]^{T}, (128)

where bib_{i} is a k×1k\times 1 vector. Therefore,

(ImA)b2=i=1mAbi2A2i=1mbi2=A2b2.\displaystyle\left\|\!\left(I_{m}\otimes A\right)\!b\right\|^{2}\!=\!\sum_{i=1}^{m}\left\|Ab_{i}\right\|^{2}\!\leq\!\|A\|^{2}\sum_{i=1}^{m}\left\|b_{i}\right\|^{2}\!=\!\|A\|^{2}\|b\|^{2}. (129)

Lemma 2

[68] Let A\mathbb{H}_{A} and B\mathbb{H}_{B} be finite-dimensional Hilbert spaces of dimensions dAd_{A} and dBd_{B}, respectively, and let XABX\in\mathbb{H}_{A}\otimes\mathbb{H}_{B}. Then for any unitarily invariant norm that is multiplicative over tensor products, the partial trace satisfies the norm inequality

TrA(X)dAIAX,\left\|\operatorname{Tr}_{A}(X)\right\|\leq\frac{d_{A}}{\left\|I_{A}\right\|}\|X\|, (130)

where IAI_{A} is the identity operator.

Lemma 3

([69] Theorem 8.1 and Theorem 28.3) Let XX, YY be Hermitian matrices with eigenvalues λ1(X)λn(X)\lambda_{1}(X)\geq\cdots\geq\lambda_{n}(X) and λ1(Y)λn(Y)\lambda_{1}(Y)\geq\cdots\geq\lambda_{n}(Y), respectively. Then

maxj|λj(X)λj(Y)|XY,\max_{j}|\lambda_{j}(X)-\lambda_{j}(Y)|\leq||X-Y||, (131)

and

j=1n(λj(X)λj(Y))2XY2.\sum_{j=1}^{n}\left(\lambda_{j}(X)-\lambda_{j}(Y)\right)^{2}\leq||X-Y||^{2}. (132)

Appendix B SIC states and MUB states

For input states achieving the lower bounds of MTr((VVT)1)M\operatorname{Tr}\big{(}(V^{*}V^{T})^{-1}\big{)} and cond(V)\operatorname{cond}(V), we give two examples, SIC states and MUB states, as the following propositions.

Proposition 4

dd-dimensional SIC states (when they exist) belong to OIS¯(d,M)\underline{\mathrm{OIS}}(d,M) with the smallest MM as M=d2M=d^{2}.

Proposition 5

dd-dimensional MUB states (when they exist) belong to OIS¯(d,M)\underline{\mathrm{OIS}}(d,M) for M=d(d+1)M=d(d+1).

The proofs are similar to the optimal probe states in quantum detector tomography and can be found in [32].

For SIC-POVM in d=4d=4, Bengtsson [70] gave one expression ignoring overall phases and normalization in the natural basis as

[xxxxiiiiiiiiiiii1111xxxxiiii111111111111xxxxiiii1111iiii1111xxxx],\left[\begin{array}[]{cccccccccccccccc}x&x&x&x&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-\mathrm{i}\\ 1&1&-1&-1&x&x&x&x&\mathrm{i}&-\mathrm{i}&\mathrm{i}&-\mathrm{i}&1&-1&1&-1\\ 1&-1&1&-1&1&-1&1&-1&x&x&x&x&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}\\ 1&-1&-1&1&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-1&1&1&-1&x&x&x&x\end{array}\right], (133)

where x=2+5x=\sqrt{2+\sqrt{5}}. Let |ψn(SIC)\left|\psi_{n}^{(\text{SIC})}\right\rangle be the nn-th column of (133). In this paper, we call the set of

ρn=|ψn(SIC)ψn(SIC)|Tr(|ψn(SIC)ψn(SIC)|)\rho_{n}=\frac{\left|\psi_{n}^{(\text{SIC})}\right\rangle\left\langle\psi_{n}^{(\text{SIC})}\right|}{\operatorname{Tr}\left(\left|\psi_{n}^{(\text{SIC})}\right\rangle\left\langle\psi_{n}^{(\text{SIC})}\right|\right)}

as SIC states and {Pn=1dρn}n=1d2\left\{P_{n}=\frac{1}{d}\rho_{n}\right\}_{n=1}^{d^{2}} as SIC-POVM.

For d=4d=4, five MUB measurement sets are

{|ψn(MUB)}={{|ψnA},{|ψnB},{|ψnC},{|ψnD},{|ψnE}},\left\{\!|\psi_{n}^{(\text{MUB})}\rangle\!\right\}\!=\!\left\{\!\left\{\left|\psi_{n}^{A}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{B}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{C}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{D}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{E}\right\rangle\!\right\}\!\right\}, (134)

and

{|ψnA}={|00,|01,|10,|11},\displaystyle\left\{\left|\psi_{n}^{A}\right\rangle\right\}=\{|00\rangle,|01\rangle,|10\rangle,|11\rangle\},
{|ψnB}={|R±,|L±},\displaystyle\left\{\left|\psi_{n}^{B}\right\rangle\right\}=\{|R\pm\rangle,|L\pm\rangle\},
{|ψnC}={|±R,|±L},\displaystyle\left\{\left|\psi_{n}^{C}\right\rangle\right\}=\{|\pm R\rangle,|\pm L\rangle\},
{|ψnD}={12(|R0±i|L1),12(|R1±i|L0)},\displaystyle\left\{\left|\psi_{n}^{D}\right\rangle\right\}=\left\{\frac{1}{\sqrt{2}}(|R0\rangle\pm\mathrm{i}|L1\rangle),\frac{1}{\sqrt{2}}(|R1\rangle\pm\mathrm{i}|L0\rangle)\right\},
{|ψnE}={12(|RR±i|LL),12(|RL±i|LR)},\displaystyle\left\{\left|\psi_{n}^{E}\right\rangle\right\}=\left\{\frac{1}{\sqrt{2}}(|RR\rangle\pm\mathrm{i}|LL\rangle),\frac{1}{\sqrt{2}}(|RL\rangle\pm\mathrm{i}|LR\rangle)\right\}, (135)

where |±=(|0±|1)/2|\pm\rangle=(|0\rangle\pm|1\rangle)/\sqrt{2}, |R=(|0i|1)/2|R\rangle=(|0\rangle-{\mathrm{i}}|1\rangle)/\sqrt{2}, and |L=(|0+i|1)/2|L\rangle=(|0\rangle+{\mathrm{i}}|1\rangle)/\sqrt{2} in the natural basis. Adamson et al. [51] utilized these Mutually Unbiased Bases in the QST. In this paper, we call ρn=|ψn(MUB)ψn(MUB)|\rho_{n}=|\psi_{n}^{(\text{MUB})}\rangle\langle\psi_{n}^{(\text{MUB})}| a MUB state and Pn=|ψn(MUB)ψn(MUB)|P_{n}=|\psi_{n}^{(\text{MUB})}\rangle\langle\psi_{n}^{(\text{MUB})}| a MUB measurement operator.

References

  • [1] D. P. DiVincenzo, “Quantum computation,” Science, vol. 270, no. 5234, pp. 255–261, 1995.
  • [2] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, 2010.
  • [3] N. Gisin and R. Thew, “Quantum communication,” Nature Photonics, vol. 1, no. 3, p. 165, 2007.
  • [4] M.-H. Hsieh and M. M. Wilde, “Trading classical communication, quantum communication, and entanglement in quantum Shannon theory,” IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4705–4730, 2010.
  • [5] C. L. Degen, F. Reinhard, and P. Cappellaro, “Quantum sensing,” Reviews of Modern Physics, vol. 89, p. 035002, 2017.
  • [6] D. Dong and I. R. Petersen, “Quantum control theory and applications: a survey,” IET Control Theory & Applications, vol. 4, no. 12, pp. 2651–2671, 2010.
  • [7] D. Dong and I. R. Petersen, “Quantum estimation, control and learning: Opportunities and challenges,” Annual Reviews in Control, vol. 54, pp. 243–251, 2022.
  • [8] D. Dong and I. R. Petersen, Learning and Robust Control in Quantum Technology. Springer Nature Switzerland AG, 2023.
  • [9] D. Burgarth and K. Yuasa, “Quantum system identification,” Physical Review Letters, vol. 108, p. 080502, 2012.
  • [10] A. Sone and P. Cappellaro, “Hamiltonian identifiability assisted by a single-probe measurement,” Physical Review A, vol. 95, p. 022335, 2017.
  • [11] Y. Wang, D. Dong, A. Sone, I. R. Petersen, H. Yonezawa, and P. Cappellaro, “Quantum Hamiltonian identifiability via a similarity transformation approach and beyond,” IEEE Transactions on Automatic Control, vol. 65, no. 11, pp. 4632–4647, 2020.
  • [12] J. Zhang and M. Sarovar, “Quantum Hamiltonian identification from measurement time traces,” Physical Review Letters, vol. 113, p. 080401, 2014.
  • [13] Y. Wang, D. Dong, B. Qi, J. Zhang, I. R. Petersen, and H. Yonezawa, “A quantum Hamiltonian identification algorithm: Computational complexity and error analysis,” IEEE Transactions on Automatic Control, vol. 63, no. 5, pp. 1388–1403, 2018.
  • [14] C. H. Baldwin, A. Kalev, and I. H. Deutsch, “Quantum process tomography of unitary and near-unitary maps,” Physical Review A, vol. 90, p. 012110, 2014.
  • [15] M. Zorzi, F. Ticozzi, and A. Ferrante, “Minimal resources identifiability and estimation of quantum channels,” Quantum Information Processing, vol. 13, no. 3, pp. 683–707, 2014.
  • [16] M. Zorzi, F. Ticozzi, and A. Ferrante, “Estimation of quantum channels: Identifiability and ML methods,” in 2012 51st IEEE Conference on Decision and Control (CDC), pp. 1674–1679, 2012.
  • [17] G. C. Knee, E. Bolduc, J. Leach, and E. M. Gauger, “Quantum process tomography via completely positive and trace-preserving projection,” Physical Review A, vol. 98, p. 062336, 2018.
  • [18] T. Surawy-Stepney, J. Kahn, R. Kueng, and M. Guţă, “Projected least-squares quantum process tomography,” Quantum, vol. 6, p. 844, 2022.
  • [19] Z. Ji, G. Wang, R. Duan, Y. Feng, and M. Ying, “Parameter estimation of quantum channels,” IEEE Transactions on Information Theory, vol. 54, no. 11, pp. 5172–5185, 2008.
  • [20] J. Zhang and M. Sarovar, “Identification of open quantum systems from observable time traces,” Physical Review A, vol. 91, no. 5, p. 052121, 2015.
  • [21] M. Guţă and N. Yamamoto, “System identification for passive linear quantum systems,” IEEE Transactions on Automatic Control, vol. 61, no. 4, pp. 921–936, 2016.
  • [22] X.-L. Huang, J. Gao, Z.-Q. Jiao, Z.-Q. Yan, Z.-Y. Zhang, D.-Y. Chen, X. Zhang, L. Ji, and X.-M. Jin, “Reconstruction of quantum channel via convex optimization,” Science Bulletin, vol. 65, no. 4, pp. 286–292, 2020.
  • [23] I. Bongioanni, L. Sansoni, F. Sciarrino, G. Vallone, and P. Mataloni, “Experimental quantum process tomography of non-trace-preserving maps,” Physical Review A, vol. 82, p. 042307, 2010.
  • [24] G. A. White, C. D. Hill, F. A. Pollock, L. C. Hollenberg, and K. Modi, “Demonstration of non-Markovian process characterisation and control on a quantum processor,” Nature Communications, vol. 11, p. 6301, 2020.
  • [25] G. A. White, F. A. Pollock, L. C. Hollenberg, K. Modi, and C. D. Hill, “Non-markovian quantum process tomography,” PRX Quantum, vol. 3, p. 020344, 2022.
  • [26] H. Wang, W. Zheng, N. Yu, K. Li, D. Lu, T. Xin, C. Li, Z. Ji, D. Kribs, B. Zeng, X. Peng, and J. Du, “Quantum state and process tomography via adaptive measurements,” Science China Physics, Mechanics & Astronomy, vol. 59, no. 10, p. 100313, 2016.
  • [27] Y. S. Teo, B.-G. Englert, J. Řeháček, and Z. Hradil, “Adaptive schemes for incomplete quantum process tomography,” Physical Review A, vol. 84, p. 062125, 2011.
  • [28] I. A. Pogorelov, G. I. Struchalin, S. S. Straupe, I. V. Radchenko, K. S. Kravtsov, and S. P. Kulik, “Experimental adaptive process tomography,” Physical Review A, vol. 95, p. 012302, 2017.
  • [29] I. L. Chuang and M. A. Nielsen, “Prescription for experimental determination of the dynamics of a quantum black box,” Journal of Modern Optics, vol. 44, no. 11-12, pp. 2455–2467, 1997.
  • [30] J. F. Poyatos, J. I. Cirac, and P. Zoller, “Complete characterization of a quantum process: The two-bit quantum gate,” Physical Review Letters, vol. 78, pp. 390–393, 1997.
  • [31] J. Fiurášek and Z. Hradil, “Maximum-likelihood estimation of quantum processes,” Physical Review A, vol. 63, p. 020101, 2001.
  • [32] S. Xiao, Y. Wang, D. Dong, and J. Zhang, “Optimal and two-step adaptive quantum detector tomography,” Automatica, vol. 141, p. 110296, 2022.
  • [33] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge University Press, 2012.
  • [34] J. Watrous, The Theory of Quantum Information. Cambridge University Press, 2018.
  • [35] D. W. Leung, “Choi’s proof as a recipe for quantum process tomography,” Journal of Mathematical Physics, vol. 44, no. 2, pp. 528–533, 2003.
  • [36] G. M. D’Ariano and P. Lo Presti, “Quantum tomography for measuring experimentally the matrix elements of an arbitrary quantum operation,” Physical Review Letters, vol. 86, pp. 4195–4198, 2001.
  • [37] J. B. Altepeter, D. Branning, E. Jeffrey, T. C. Wei, P. G. Kwiat, R. T. Thew, J. L. O’Brien, M. A. Nielsen, and A. G. White, “Ancilla-assisted quantum process tomography,” Physical Review Letters, vol. 90, p. 193601, 2003.
  • [38] G. M. D’Ariano and P. Lo Presti, “Imprinting complete information about a quantum channel on its output state,” Physical Review Letters, vol. 91, p. 047902, 2003.
  • [39] M. Mohseni and D. A. Lidar, “Direct characterization of quantum dynamics,” Physical Review Letters, vol. 97, p. 170501, 2006.
  • [40] M. Mohseni and D. A. Lidar, “Direct characterization of quantum dynamics: General theory,” Physical Review A, vol. 75, p. 062331, 2007.
  • [41] Z.-W. Wang, Y.-S. Zhang, Y.-F. Huang, X.-F. Ren, and G.-C. Guo, “Experimental realization of direct characterization of quantum dynamics,” Physical Review A, vol. 75, p. 044304, 2007.
  • [42] M. Mohseni, A. T. Rezakhani, and D. A. Lidar, “Quantum-process tomography: Resource analysis of different strategies,” Physical Review A, vol. 77, p. 032322, 2008.
  • [43] M. Zorzi, F. Ticozzi, and A. Ferrante, “Minimum relative entropy for quantum estimation: Feasibility and general solution,” IEEE Transactions on Information Theory, vol. 60, no. 1, pp. 357–367, 2014.
  • [44] J. Haah, A. W. Harrow, Z. Ji, X. Wu, and N. Yu, “Sample-optimal tomography of quantum states,” IEEE Transactions on Information Theory, vol. 63, no. 9, pp. 5628–5641, 2017.
  • [45] M. Berta, J. M. Renes, and M. M. Wilde, “Identifying the information gain of a quantum measurement,” IEEE Transactions on Information Theory, vol. 60, no. 12, pp. 7987–8006, 2014.
  • [46] B. Qi, Z. Hou, L. Li, D. Dong, G.-Y. Xiang, and G.-C. Guo, “Quantum state tomography via linear regression estimation,” Scientific Reports, vol. 3, p. 3496, 2013.
  • [47] B. Mu, H. Qi, I. R. Petersen, and G. Shi, “Quantum tomography by regularized linear regressions,” Automatica, vol. 114, p. 108837, 2020.
  • [48] J. A. Smolin, J. M. Gambetta, and G. Smith, “Efficient method for computing the maximum-likelihood quantum state from measurements with additive Gaussian noise,” Physical Review Letters, vol. 108, p. 070502, 2012.
  • [49] H. Sedrakyan and N. Sedrakyan, Algebraic Inequalities. Springer, 2018.
  • [50] S. Xiao, Y. Wang, D. Dong, and J. Zhang, “Optimal quantum detector tomography via linear regression estimation,” in 2021 60th IEEE Conference on Decision and Control (CDC), pp. 4140–4145, 2021.
  • [51] R. B. A. Adamson and A. M. Steinberg, “Improving quantum state estimation with mutually unbiased bases,” Physical Review Letters, vol. 105, p. 030406, 2010.
  • [52] T. Durt, B.-G. Englert, I. Bengtsson, and K. Życzkowski, “On mutually unbiased bases,” International Journal of Quantum Information, vol. 08, no. 04, pp. 535–640, 2010.
  • [53] J. M. Renes, R. Blume-Kohout, A. J. Scott, and C. M. Caves, “Symmetric informationally complete quantum measurements,” Journal of Mathematical Physics, vol. 45, no. 6, pp. 2171–2180, 2004.
  • [54] A. Miranowicz, K. Bartkiewicz, J. Peřina, M. Koashi, N. Imoto, and F. Nori, “Optimal two-qubit tomography based on local and global measurements: Maximal robustness against errors as described by condition numbers,” Physical Review A, vol. 90, p. 062123, 2014.
  • [55] J. A. Miszczak, “Generating and using truly random quantum states in Mathematica,” Computer Physics Communications, vol. 183, no. 1, pp. 118–124, 2012.
  • [56] N. Johnston, “QETLAB: A MATLAB toolbox for quantum entanglement, version 0.9,” Jan. 2016.
  • [57] Y. Wang, S. Yokoyama, D. Dong, I. R. Petersen, E. H. Huntington, and H. Yonezawa, “Two-stage estimation for quantum detector tomography: Error analysis, numerical and experimental results,” IEEE Transactions on Information Theory, vol. 67, no. 4, pp. 2293–2307, 2021.
  • [58] M. Lobino, D. Korystov, C. Kupchak, E. Figueroa, B. C. Sanders, and A. I. Lvovsky, “Complete characterization of quantum-optical processes,” Science, vol. 322, no. 5901, pp. 563–566, 2008.
  • [59] S. Rahimi-Keshari, A. Scherer, A. Mann, A. T. Rezakhani, A. I. Lvovsky, and B. C. Sanders, “Quantum process tomography with coherent states,” New Journal of Physics, vol. 13, no. 1, p. 013006, 2011.
  • [60] M. D. de Burgh, N. K. Langford, A. C. Doherty, and A. Gilchrist, “Choice of measurement sets in qubit tomography,” Physical Review A, vol. 78, p. 052122, 2008.
  • [61] “IBM Quantum.” https://quantum-computing.ibm.com, 2021.
  • [62] A. Dang, G. A. White, L. C. Hollenberg, and C. D. Hill, “Process tomography on a 7-qubit quantum processor via tensor network contraction path finding,” arXiv preprint arXiv:2112.06364, 2021.
  • [63] G. Gutoski and N. Johnston, “Process tomography for unitary quantum channels,” Journal of Mathematical Physics, vol. 55, no. 3, p. 032201, 2014.
  • [64] Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, “Quantum gate identification: Error analysis, numerical results and optical experiment,” Automatica, vol. 101, pp. 269–279, 2019.
  • [65] B. Qi, Z. Hou, Y. Wang, D. Dong, H.-S. Zhong, L. Li, G.-Y. Xiang, H. M. Wiseman, C.-F. Li, and G.-C. Guo, “Adaptive quantum state tomography via linear regression estimation: Theory and two-qubit experiment,” npj Quantum Information, vol. 3, p. 19, 2017.
  • [66] M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming, version 2.1.” http://cvxr.com/cvx, Mar. 2014.
  • [67] M. Grant and S. Boyd, “Graph implementations for nonsmooth convex programs,” in Recent Advances in Learning and Control (V. Blondel, S. Boyd, and H. Kimura, eds.), Lecture Notes in Control and Information Sciences, pp. 95–110, Springer-Verlag Limited, 2008. http://stanford.edu/~boyd/graph_dcp.html.
  • [68] D. A. Lidar, P. Zanardi, and K. Khodjasteh, “Distance bounds on quantum dynamics,” Physical Review A, vol. 78, p. 012308, 2008.
  • [69] R. Bhatia, Perturbation Bounds for Matrix Eigenvalues. Society for Industrial and Applied Mathematics, 2007.
  • [70] I. Bengtsson, “From SICs and MUBs to Eddington,” Journal of Physics: Conference Series, vol. 254, p. 012007, 2010.