A Framework For Estimating Amplitudes of Quantum State With Single-Qubit Measurement

Nhat A. Nghiem Department of Physics and Astronomy, State University of New York at Stony Brook, Stony Brook, NY 11794-3800, USA C. N. Yang Institute for Theoretical Physics, State University of New York at Stony Brook, Stony Brook, NY 11794-3840, USA

Abstract

We propose and analyze a simple framework for estimating the amplitudes of a given $n$ -qubit quantum state $\ket{\psi}=\sum_{i=0}^{2^{n}-1}a_{i}\ket{i}$ in computational basis, utilizing a single-qubit measurement only. Previously, it was a common procedure that one could measure all qubits in order to collect measurement outcomes, from which one can estimate amplitudes of given quantum state. Here, we show that if restricting to single-qubit measurement, and one can perform measurement on arbitrary basis, then the measurement outcomes can be used to assist the finding of amplitudes in the usual computational, or Z basis. More concretely, such outcomes are capable of constructing a system of nonlinear algebraic equations, and by classically solving them, we obtain $\tilde{a}_{i}$ , which is the approximation to the corresponding amplitudes $a_{i}$ , including both real and imaginary component. We then discuss our framework from a broader perspective. First, we show that estimating all (norms of) amplitudes to additive accuracy $\delta$ , i.e., $||\tilde{a}_{i}-|a_{i}||\leq\delta$ for all $i$ , $\mathcal{O}(4^{n}/\delta^{4})$ single-qubit measurements is sufficient. Second, we show that to achieve total variation $\sum_{i=0}^{2^{n}-1}||\tilde{a}_{i}|^{2}-|a_{i}|^{2}|\leq\delta$ , $\mathcal{O}(6^{n}/\delta^{4})$ a single bit measurement is required. Finally, in order to achieve an average $L_{1}$ norm error $\sum_{i=0}^{2^{n}-1}||\tilde{a}_{i}|-|a_{i}||/2^{n}\leq\delta$ , a single bit measurement $\mathcal{O}(2^{n}/\delta^{4})$ is needed.

I Introduction

Quantum computation has been undergoing rapid development since some early proposals feynman2018simulating ; deutsch1985quantum ; deutsch1992rapid ; lloyd1996universal ; shor1999polynomial ; grover1996fast . By harnessing intrinsic properties of quantum physics, such as entanglement and superposition, a quantum computer can store, handle, and process information in a very different way compared to classical devices. The principle of quantum computation can be explained simply by the linear algebraic language. Typically, in an usual model of quantum computation, one begins with a state, which is a vector in a Hilbert space, then applies a sequence of gates, which are unitary matrices, followed by measurement operation at the end. The measurement operation features another unique property of quantum physics, where the system, represented by some state vector, is projected, in a probabilistic manner, into some other state depending on the kind of measurement operation being used. As a quantum state is represented by a vector in a Hilbert space, one can choose a basis, and in theory, any state admits decomposition in such a basis. However, given a basis, it is challenging to characterize, i.e. extracting the coefficients of any given unknown state, as the dimension, or parameters, grows exponentially with respect to the number of constituents, e.g., qubits. Furthermore, these parameters cannot be directly accessed but can only be revealed by performing measurement. The results of the measurement are then used to estimate the desired parameters. As mentioned, the measurement operation projects the given state into another one, an occurrence referred to as wave-function collapse. Hence, to characterize a given state completely, we need to perform many measurements on multiple copies of the given state.

Several protocols have been proposed to address the challenge and related problem mentioned above. For example, references huang2020predicting and huang2021efficient introduce learning-based methods to estimate linear properties, such as Pauli observables, of a given density state $\rho$ . More relevant to our main objective in this work is quantum state tomography yu2020sample ; bagan2004collective ; keyl2006quantum ; guctua2008optimal ; haah2016sample ; o2016efficient ; guctua2020fast ; flammia2012quantum ; kueng2017low , where measurement, including single-qubit measurements and multi-qubit measurements on copies of a given state $\rho$ , or could possibly be $\rho^{\otimes k}$ for some $k$ , is carried out to obtain a classical description of given $\rho$ .

Here, we observe that single-qubit measurement essentially contains more insights that could be more informative and beneficial for probing a given quantum system. A common measurement that previous work yu2020sample ; bagan2004collective used is the Pauli measurement, corresponding to a certain basis of measurement. One can ask, given the vastness of the Hilbert space where a quantum state vector resides, that in principle there are countless basis of measurement, so what if we can utilize another basis for the purpose of characterizing a given state ? Such insight is indeed simple and informative enough to allow us to achieve the goal. The probabilistic mechanism behind measurement obeys Born’s rule, and given a basis of measurement, one can explicitly write out the probability of obtaining certain outcomes (corresponding with a certain state after measurement) as a function of given state’s amplitudes. The process of measuring (multiple copies of) a given state then amounts to learning discrete probability distributions, and one can statistically estimate the probabilities corresponding to each outcome. These estimations are then used to reveal the given state.

More concretely, we consider an $n$ -qubit system $\ket{\psi}=\sum_{i=0}^{2^{n}-1}a_{i}\ket{i}$ where $\ket{i}$ is a computational basis state (or commonly known as Pauli $Z$ -basis), and we are interested in knowing all the so-called amplitudes $\{a_{i}\}_{i=0}^{2^{n}-1}$ , which are generally complex numbers. It is known that if we measure all qubits on the basis of $Z$ , then the probability of obtaining $\ket{i}$ after measurement is $\operatorname{Tr}(\ket{\psi}\bra{\psi}\cdot\ket{i}\bra{i})=|a_{i}|^{2}$ . Then using Hoeffding inequality, we can measure $\ket{\psi}$ $\mathcal{O}(1/\epsilon^{2})$ times (using different copies) to estimate $|a_{i}|^{2}$ to accuracy $\epsilon$ . The whole process is then repeated to estimate all the amplitude squares to the desired accuracy. Our protocol introduced in this work aims to rectify this task and closely related task (to be elaborated later), using single-qubit measurement only. As mentioned, we utilize the vastness of Hilbert space and that different bases of measurement are possible. Once a measurement (with chosen basis) is placed on a qubit, then the probability of obtaining a certain outcome can be derived explicitly. As we show explicitly later, this probability with each outcome could be expressed as a function of all amplitudes $\{a_{i}\}_{i=0}^{2^{n}-1}$ . In addition, these probabilities could be estimated statistically via measurement, and all such probabilities then form a system of nonlinear algebraic equations, by which solving them, the desired amplitudes could be estimated. As mentioned, our method employs only single-qubit measurement. Thus, it shares similarity with previous work yu2020sample . The difference is in the basis of measurement, as we go beyond the Pauli measurement.

In the following section II.1 we begin with two examples for 2-qubit and 3-qubit case to illustrate the insight behind our method. From these examples, generalization to $n$ -qubit case is discussed subsequently II.2. In Section II.2.1, II.2.2, II.2.3, we discuss our frameworks from a broader perspective, where we explicitly show the complexity, or the scaling of measurements, with respect to different expectation of error. We conclude in section III and discuss the potential future direction stemming from this work.

II Main Framework

We first provide two illustrating examples in section II.1, followed by a general framework provided in section II.2.

II.1 Examples

We begin by discussing two examples: $\ket{\psi}_{1}\in\mathbb{C}^{4}$ being the quantum state of 2-qubits and $\ket{\psi}_{2}\in\mathbb{C}^{8}$ being the quantum state of 3-qubits.

Let $\ket{\psi}_{1}=a_{00}\ket{00}+a_{01}\ket{01}+a_{10}\ket{10}+a_{11}\ket{11}$ be the decomposition of $\ket{\psi}_{1}$ in the computational basis state. If we measure two qubits, then it is fundamental, that is, Born’s rule, that the probability of measuring string $(i,j)$ ( $i,j=0,1$ ) is $p_{ij}=\operatorname{Tr}(\ket{ij}\bra{ij}\cdot\ket{\psi}_{1}\bra{\psi}_{1})=|a_{ij}|^{2}$ . If we want to estimate all of this probability $p_{ij}$ , then we need to repeat the measurement multiple times and use the results to statistically estimate all probabilities $p_{ij}$ .

To see how the outcomes can be used for estimating all probabilities $p_{ij}$ from measuring $\ket{\psi}_{1}$ in computational basis, we can first consider the estimation of $p_{00}$ . Then the measurement outcomes form a Bernoulli distribution for $00$ , as we simply treat all the outcomes $\{01,10,11\}$ as a discrete variable with corresponding probability $1-p_{00}$ . It is known (via Chernoff bound) that the samples required to estimate $p_{00}$ to accurary $\epsilon$ with probability at least $1-\delta$ is $\mathcal{O}(\ln(1/\delta)/\epsilon^{2})$ . The same strategy can be applied to estimate $p_{01},p_{10},p_{11}$ by treating each of them and the remaining probabilities as Bernoulli distribution. In the end, the whole procedure takes $\mathcal{O}(4\ln(1/\delta)/\epsilon^{2})$ repetitions to estimate all probabilities to additive accuracy $\epsilon$ , with high probability.

The same result is applied for 3-qubits case. Let $\ket{\psi}_{2}=a_{000}\ket{000}+a_{001}\ket{001}+a_{010}\ket{010}+a_{011}\ket{011}+a_{100}\ket{100}+a_{101}\ket{101}+a_{110}\ket{110}+a_{111}\ket{111}$ . Then repeating the measurements in computational basis state allows us to estimate each probability $p_{ijk}=|a_{ijk}|^{2}$ (for $i,j,k=0,1$ ) to error at most $\epsilon$ with success probability at least $1-\delta$ , with $\mathcal{O}(8\ln(1/\delta)/\epsilon^{2})$ repetitions according to the above lemma. It is straightforward to see that even in $n$ -qubits case, the procedure is the same for estimating all amplitudes, hence the sample complexity is $\mathcal{O}(2^{n}\ln(1/\delta)/\epsilon^{2})$ to achieve the desired precision.

Now for $\ket{\psi}_{1}$ , instead of measuring all qubits in computational basis state (with four possible outcomes $00,01,10,11$ ), we measure the first qubit in computational basis state. Then there will be two possible outcomes, which are 0 and 1 (or more like $\ket{0}$ or $\ket{1}$ in the first qubit), with corresponding probability $p_{0}$ and $p_{1}$ :

	$\displaystyle p_{0}=\operatorname{Tr}(\ket{0}\bra{0}\otimes\mathbb{I}_{2}\cdot\ket{\psi}_{1}\bra{\psi}_{1})$		(1)
	$\displaystyle p_{1}=\operatorname{Tr}(\ket{1}\bra{1}\otimes\mathbb{I}_{2}\cdot\ket{\psi}_{1}\bra{\psi}_{1})$		(2)

where $\mathbb{I}_{2}$ refers to the identity matrix on the second qubit system. To avoid the confusion, as subsequently we also have $p_{0}$ and $p_{1}$ but those probabilities are for the measurement’s outcome on the second qubit, we use the superscript to denote the qubit explicitly: $p_{0}^{1(or2)}$ and $p_{1}^{1(or2)}$ refers to the probability of obtaining $0$ and $1$ on the first, or second qubit respectively. We remark the following crucial information, which is essentially a marginal probability property:

	$\displaystyle p_{00}+p_{01}=p_{0}^{1}$		(3)
	$\displaystyle p_{10}+p_{11}=p_{1}^{1}$		(4)

where $p_{ij}=|a_{ij}|^{2}$ is the usual squared amplitude of state $\ket{\psi}_{1}$ . Likewise, if we measure the second qubit, there would also be two possible outcomes 0 and 1, with probability $p_{0}^{2}$ and $p_{1}^{2}$ . Then we also have that, for second qubit:

	$\displaystyle p_{00}+p_{10}=p_{0}^{2}$		(5)
	$\displaystyle p_{01}+p_{11}=p_{1}^{2}$		(6)

It is straightforward to see that the above equation form a linear system:

\displaystyle\begin{pmatrix}1&1&0&0\\ 0&0&1&1\\ 1&0&1&0\\ 0&1&0&1\end{pmatrix}\begin{pmatrix}p_{00}\\ p_{01}\\ p_{10}\\ p_{11}\end{pmatrix}=\begin{pmatrix}p_{0}^{1}\\ p_{1}^{1}\\ p_{0}^{2}\\ p_{1}^{2}\end{pmatrix}

(7)

As we have discussed, $\mathcal{O}(\ln(1/\delta)/\epsilon^{2})$ repetition is required to estimate $p_{0}^{1},p_{0}^{2}$ and $p_{1}^{1},p_{1}^{2}$ to the precision $\epsilon$ . This means that the right-hand side of the above linear system is erroneous. Therefore, if we wish to solve the linear system above to obtain the probability $p_{00},p_{01},p_{10},p_{11}$ , then the solution appears to be erroneous, which we will discuss in more detail later.

Now we examine the same procedure as above for 3-qubits cases. We measure each qubit of the state $\ket{\psi}_{2}$ $\mathcal{O}(1/\epsilon^{2})$ times, and obtain the outcomes $p_{0}^{1},p_{1}^{1},p_{0}^{2},p_{1}^{2},p_{0}^{3},p_{1}^{3}$ . Recall that

\displaystyle\ket{\psi}_{2}

\displaystyle=a_{000}\ket{000}+a_{001}\ket{001}+a_{010}\ket{010}+a_{011}\ket{011}+a_{100}\ket{100}+a_{101}\ket{101}+a_{110}\ket{110}+a_{111}\ket{111}

(8)

where each amplitude squared $|a_{ijk}|^{2}=p_{ijk}$ is the probability of obtaining corresponding outcome $(i,j,k)$ . Then due to marginal probability property, we have that:

	$\displaystyle p_{000}+p_{001}+p_{010}+p_{011}=p_{0}^{1}$		(9)
	$\displaystyle p_{100}+p_{101}+p_{110}+p_{111}=p_{1}^{1}$		(10)
	$\displaystyle p_{000}+p_{001}+p_{100}+p_{101}=p_{0}^{2}$		(11)
	$\displaystyle p_{010}+p_{011}+p_{110}+p_{111}=p_{1}^{2}$		(12)
	$\displaystyle p_{000}+p_{010}+p_{100}+p_{110}=p_{0}^{3}$		(13)
	$\displaystyle p_{001}+p_{011}+p_{101}+p_{111}=p_{1}^{3}$		(14)

The above equations can be written as a linear system:

\displaystyle\begin{pmatrix}1&1&1&1&0&0&0&0\\ 0&0&0&0&1&1&1&1\\ 1&1&0&0&1&1&0&0\\ 0&0&1&1&0&0&1&1\\ 1&0&1&0&1&0&1&0\\ 0&1&0&1&0&1&0&1\end{pmatrix}\begin{pmatrix}p_{000}\\ p_{001}\\ p_{010}\\ p_{011}\\ p_{100}\\ p_{101}\\ p_{110}\\ p_{111}\end{pmatrix}=\begin{pmatrix}p_{0}^{1}\\ p_{1}^{1}\\ p_{0}^{2}\\ p_{1}^{2}\\ p_{0}^{3}\\ p_{1}^{3}\end{pmatrix}

(15)

The above system is rectangular, rather than square system comparing to Eqn. 7, which makes it a bit trickier to solve, as there are more variables than the number of equations. In the context of linear algebra, it is referred as underdetermined systems, and unique solution might not exists, rather, there can be infinitely many solutions. To make it square, we need to have more equations. However, the used procedure employs a single qubit measurement only, and we already measure all individual qubit. In order to collect more information as to obtain more equations, we need to have more measurement outcomes.

There are two possible ways to solve the aforementioned challenge. The first one is, we can perform joint measurement on $2$ qubits. Suppose we perform measurement on the first two qubits of $\ket{\psi}_{2}$ , then there are four possible outcomes: 00,01,10,11. Let $p_{ij}^{1,2}$ with $i,j=0,1$ denote the probability of measuring $(i,j)$ on the qubit 1 and 2 of $\ket{\psi}_{2}$ . Then because of the decomposition of $\ket{\psi}_{2}=a_{000}\ket{000}+a_{001}\ket{001}+a_{010}\ket{010}+a_{011}\ket{011}+a_{100}\ket{100}+a_{101}\ket{101}+a_{110}\ket{110}+a_{111}\ket{111}$ , and $|a_{ijk}|^{2}=p_{ijk}$ is the corresponding probability of measuring outcome $(i,j,k)$ . Using marginal probability property as in previous 2-qubit case, we have that:

	$\displaystyle p_{000}+p_{001}=p_{00}^{1,2}$		(16)
	$\displaystyle p_{010}+p_{011}=p_{01}^{1,2}$		(17)

Note that there are two other outcomes $(1,0)$ and $(1,1)$ , however, we don’t need to use them, as the above equations, once combined with equations Eqn.14, yields:

	$\displaystyle p_{000}+p_{001}+p_{010}+p_{011}=p_{0}^{1}$		(18)
	$\displaystyle p_{100}+p_{101}+p_{110}+p_{111}=p_{1}^{1}$		(19)
	$\displaystyle p_{000}+p_{001}+p_{100}+p_{101}=p_{0}^{2}$		(20)
	$\displaystyle p_{010}+p_{011}+p_{110}+p_{111}=p_{1}^{2}$		(21)
	$\displaystyle p_{000}+p_{010}+p_{100}+p_{110}=p_{0}^{3}$		(22)
	$\displaystyle p_{001}+p_{011}+p_{101}+p_{111}=p_{1}^{3}$		(23)
	$\displaystyle p_{000}+p_{001}=p_{00}^{1,2}$		(24)
	$\displaystyle p_{010}+p_{011}=p_{01}^{1,2}$		(25)

which is a square linear system:

\displaystyle\begin{pmatrix}1&1&1&1&0&0&0&0\\ 0&0&0&0&1&1&1&1\\ 1&1&0&0&1&1&0&0\\ 0&0&1&1&0&0&1&1\\ 1&0&1&0&1&0&1&0\\ 0&1&0&1&0&1&0&1\\ 1&1&0&0&0&0&0&0\\ 0&0&1&1&0&0&0&0\end{pmatrix}\begin{pmatrix}p_{000}\\ p_{001}\\ p_{010}\\ p_{011}\\ p_{100}\\ p_{101}\\ p_{110}\\ p_{111}\end{pmatrix}=\begin{pmatrix}p_{0}^{1}\\ p_{1}^{1}\\ p_{0}^{2}\\ p_{1}^{2}\\ p_{0}^{3}\\ p_{1}^{3}\\ p_{00}^{1,2}\\ p_{01}^{1,2}\end{pmatrix}

(26)

In principle, it is sufficient from the above square system to solve for desired probability from the two qubits measurement outcomes $(0,0)$ and $(0,1)$ , in addition to single qubit measurements.

While the above described joint qubit measurement seems efficient enough to do, however, the whole procedure gives us a linear system with probabilities, or amplitude square, as variables. We remark that, all amplitudes are complex number, so for this 3-qubit case, we have $2^{3}$ amplitudes, corresponding with $2^{3+1}$ complex variables. Therefore, we can only find the amplitude square, meanwhile, amplitude could be complex and the above linear system isn’t giving us enough information to find the amplitude, including its real and complex part. Performing more joint 2-qubit measurements, in principle, can give us more equations. However, to our surprise, solution based on only single qubit measurement exists. Remind that, in order to obtain the equation Eqn. 15, we performed single qubit measurement on $Z$ basis. In fact, we can also perform measurement on arbitrary basis, and it should be able to give us more information. To proceed, let $\theta$ denotes the angle for another basis measurement, and $\ket{\theta}_{0/1}$ be the eigenstate of corresponding measurement operator. Then the probability of obtaining $\ket{\theta}_{0}$ or $\ket{\theta}_{1}$ is:

\displaystyle p(\theta)_{0/1}=\operatorname{Tr}(\ket{\theta}_{0/1}\bra{\theta}_{0/1}\otimes\mathbb{I}_{n-1}\cdot\ket{\psi}_{2}\bra{\psi}_{2})

(27)

where $\mathbb{I}_{4}$ refers to identity matrix in the remaining $2$ qubits system. Let

	$\displaystyle\ket{\theta}_{0}=\cos(\theta)\ket{0}+\sin(\theta)\ket{1}$		(28)
	$\displaystyle\ket{\theta}_{1}=\sin(\theta)\ket{0}-\cos(\theta)\ket{1}$		(29)

be the decomposition of $\ket{\theta}_{0/1}$ in regular computational basis. From the above equation, it is easy to rewrite:

	$\displaystyle\ket{0}=(\cos(\theta)\ket{\theta}_{0}+\sin(\theta)\ket{\theta}_{1})$		(30)
	$\displaystyle\ket{1}=(\sin(\theta)\ket{\theta}_{0}-\cos(\theta)\ket{\theta}_{1})$		(31)

Recall that

$\displaystyle\ket{\psi}_{2}$	$\displaystyle=a_{000}\ket{000}+a_{001}\ket{001}+a_{010}\ket{010}+a_{011}\ket{011}+a_{100}\ket{100}+a_{101}\ket{101}+a_{110}\ket{110}+a_{111}\ket{111}$	(32)
	$\displaystyle=\ket{0}\big{(}a_{000}\ket{00}+a_{001}\ket{01}+a_{010}\ket{10}+a_{011}\ket{11}\big{)}+\ket{1}\big{(}a_{100}\ket{00}+a_{101}\ket{01}+a_{110}\ket{10}+a_{111}\ket{11}\big{)}$	(33)
	$\displaystyle=(\cos(\theta)\ket{\theta}_{0}+\sin(\theta)\ket{\theta}_{1})\big{(}a_{000}\ket{00}+a_{001}\ket{01}+a_{010}\ket{10}+a_{11}\ket{11}\big{)}+$	(34)
	$\displaystyle(\sin(\theta)\ket{\theta}_{0}-\cos(\theta)\ket{\theta}_{1})\big{(}a_{100}\ket{00}+a_{101}\ket{01}+a_{110}\ket{10}+a_{111}\ket{11}\big{)}$	(35)

Let $\ket{\phi}_{1}=a_{000}\ket{00}+a_{001}\ket{01}+a_{010}\ket{10}+a_{011}\ket{11}$ and $\ket{\phi}_{2}=a_{100}\ket{00}+a_{101}\ket{01}+a_{110}\ket{10}+a_{111}\ket{11}$ . Then we have:

\displaystyle\ket{\psi}_{2}

\displaystyle=\ket{\theta}_{0}\big{(}\cos(\theta)\ket{\phi}_{1}+\sin(\theta)\ket{\phi}_{2}\big{)}+\ket{\theta}_{1}\big{(}\cos(\theta)\ket{\phi}_{1}-\sin(\theta)\ket{\phi}_{2}\big{)}

(36)

If we measure the first qubit, then the probability of measuring $\ket{\theta}_{0},\ket{\theta}_{1}$ is:

	$\displaystyle p(\theta)_{0}=\|\cos(\theta)\ket{\phi}_{1}+\sin(\theta)\ket{\phi}_{2}\|^{2}$		(37)
	$\displaystyle p(\theta)_{1}=\|\cos(\theta)\ket{\phi}_{1}-\sin(\theta)\ket{\phi}_{2}\|^{2}$		(38)

We have that:

$\displaystyle\cos(\theta)\ket{\phi}_{1}+\sin(\theta)\ket{\phi}_{2}=$	$\displaystyle(\cos(\theta)a_{000}+\sin(\theta)a_{100})\ket{00}+$	(39)
	$\displaystyle(\cos(\theta)a_{001}+\sin(\theta)a_{101})\ket{01}+$	(40)
	$\displaystyle(\cos(\theta)a_{010}+\sin(\theta)a_{110})\ket{10}+$	(41)
	$\displaystyle(\cos(\theta)a_{011}+\sin(\theta)a_{111})\ket{11}$	(42)

Then:

$\displaystyle\|\cos(\theta)\ket{\phi}_{1}+\sin(\theta)\ket{\phi}_{2}\|^{2}$	$\displaystyle=\|\cos(\theta)a_{000}+\sin(\theta)a_{100}\|^{2}+$	(43)
	$\displaystyle\|\cos(\theta)a_{001}+\sin(\theta)a_{101}\|^{2}+$	(44)
	$\displaystyle\cos(\theta)a_{010}+\sin(\theta)a_{110}\|^{2}+$	(45)
	$\displaystyle\|\cos(\theta)a_{011}+\sin(\theta)a_{111}\|^{2}$	(46)

Likewise:

$\displaystyle\|\cos(\theta)\ket{\phi}_{1}-\sin(\theta)\ket{\phi}_{2}\|^{2}$	$\displaystyle=\|\cos(\theta)a_{000}-\sin(\theta)a_{100}\|^{2}+$	(48)
	$\displaystyle\|\cos(\theta)a_{001}-\sin(\theta)a_{101}\|^{2}+$	(49)
	$\displaystyle\|\cos(\theta)a_{010}-\sin(\theta)a_{110}\|^{2}+$	(50)
	$\displaystyle\|\cos(\theta)a_{011}-\sin(\theta)a_{111}\|^{2}$	(51)

Recall that from equation Eqn. 14:

	$\displaystyle p_{000}+p_{001}+p_{010}+p_{011}=p_{0}^{1}$		(53)
	$\displaystyle p_{100}+p_{101}+p_{110}+p_{111}=p_{1}^{1}$		(54)
	$\displaystyle p_{000}+p_{001}+p_{100}+p_{101}=p_{0}^{2}$		(55)
	$\displaystyle p_{010}+p_{011}+p_{110}+p_{111}=p_{1}^{2}$		(56)
	$\displaystyle p_{000}+p_{010}+p_{100}+p_{110}=p_{0}^{3}$		(57)
	$\displaystyle p_{001}+p_{011}+p_{101}+p_{111}=p_{1}^{3}$		(58)

if we replace $p_{ijk}=|a_{ijk}|^{2}$ for $i,j,k=0,1$ , we equivalently have:

	$\displaystyle\|a_{000}\|^{2}+\|a_{001}\|^{2}+\|a_{010}\|^{2}+\|a_{011}\|^{2}=p_{0}^{1}$		(59)
	$\displaystyle\|a_{100}\|^{2}+\|a_{101}\|^{2}+\|a_{110}\|^{2}+\|a_{111}\|^{2}=p_{1}^{1}$		(60)
	$\displaystyle\|a_{000}\|^{2}+\|a_{001}\|^{2}+\|a_{100}\|^{2}+\|a_{101}\|^{2}=p_{0}^{2}$		(61)
	$\displaystyle\|a_{010}\|^{2}+\|a_{011}\|^{2}+\|a_{110}\|^{2}+\|a_{111}\|^{2}=p_{1}^{2}$		(62)
	$\displaystyle\|a_{000}\|^{2}+\|a_{010}\|^{2}+\|a_{100}\|^{2}+\|a_{110}\|^{2}=p_{0}^{3}$		(63)
	$\displaystyle\|a_{001}\|^{2}+\|a_{011}\|^{2}+\|a_{101}\|^{2}+\|a_{111}\|^{2}=p_{1}^{3}$		(64)

From the above measurement on $\theta$ -angle basis, we have:

	$\displaystyle\|\cos(\theta)a_{000}+\sin(\theta)a_{100}\|^{2}+\|\cos(\theta)a_{001}+\sin(\theta)a_{101}\|^{2}+$		(65)
	$\displaystyle\|\cos(\theta)a_{010}+\sin(\theta)a_{110}\|^{2}+\|\cos(\theta)a_{011}+\sin(\theta)a_{111}\|^{2}=p(\theta)_{0}$		(66)
	$\displaystyle\|\cos(\theta)a_{000}-\sin(\theta)a_{100}\|^{2}+\|\cos(\theta)a_{001}-\sin(\theta)a_{101}\|^{2}+$		(67)
	$\displaystyle\|\cos(\theta)a_{010}-\sin(\theta)a_{110}\|^{2}+\|\cos(\theta)a_{011}-\sin(\theta)a_{111}\|^{2}=p(\theta)_{1}$		(68)

Putting everything together, we have the following set of equations:

\displaystyle\begin{cases}&|a_{000}|^{2}+|a_{001}|^{2}+|a_{010}|^{2}+|a_{011}|^{2}=p_{0}^{1}\\ &|a_{100}|^{2}+|a_{101}|^{2}+|a_{110}|^{2}+|a_{111}|^{2}=p_{1}^{1}\\ &|a_{000}|^{2}+|a_{001}|^{2}+|a_{100}|^{2}+|a_{101}|^{2}=p_{0}^{2}\\ &|a_{010}|^{2}+|a_{011}|^{2}+|a_{110}|^{2}+|a_{111}|^{2}=p_{1}^{2}\\ &|a_{000}|^{2}+|a_{010}|^{2}+|a_{100}|^{2}+|a_{110}|^{2}=p_{0}^{3}\\ &|a_{001}|^{2}+|a_{011}|^{2}+|a_{101}|^{2}+|a_{111}|^{2}=p_{1}^{3}\\ &|\cos(\theta)a_{000}+\sin(\theta)a_{100}|^{2}+|\cos(\theta)a_{001}+\sin(\theta)a_{101}|^{2}+\\ &|\cos(\theta)a_{010}+\sin(\theta)a_{110}|^{2}+|\cos(\theta)a_{011}+\sin(\theta)a_{111}|^{2}=p(\theta)_{0}\\ &|\cos(\theta)a_{000}-\sin(\theta)a_{100}|^{2}+|\cos(\theta)a_{001}-\sin(\theta)a_{101}|^{2}+\\ &|\cos(\theta)a_{010}-\sin(\theta)a_{110}|^{2}+|\cos(\theta)a_{011}-\sin(\theta)a_{111}|^{2}=p(\theta)_{1}\\ \end{cases}

(70)

We remind that all amplitudes $a_{ijk}$ are complex numbers, i.e., $a_{ijk}=\Re(a_{ijk})+i\Im(a_{ijk})$ . At this point, we still don’t have enough information to solve because as mentioned, all amplitudes are complex number, so this 3-qubit case, we have $2^{3}$ amplitudes, corresponding with $2^{3+1}$ complex variables. The above equations contain 8 equations, so we need 8 more. In fact, there are many ways to collect more equations. Previously we have showed that performing joint 2-qubit measurements can also work, however, if we restrict ourselves to single qubit, then measuring a single qubit can also work well. If we measure the same first qubit in another basis, for example, different angle $\theta^{\prime}$ , then we will have 2 more equations, corresponding with two possible outcomes. By choosing more angles, or we can opt to measure another qubit, then we can obtain more outcomes, thus forming further equations.

The above examples on two-qubits and three-qubits cases have illustrated the key idea behind our proposal, as instead of measuring all qubits, we only need to measure some subsets of qubits, and even a single qubit measurement is sufficient. The outcomes are then being processed classically, e.g., solving a nonlinear system, to obtain the desired amplitudes. Now we proceed to provide a formal description for our method, generalizing the above examples into $n$ -qubits state.

II.2 General Framework

From the above examples, it is straightforward to generalize the procedure to the $n$ -qubit state. As there are $2^{n}$ amplitudes and $2^{n+1}$ variables, therefore we need $2^{n+1}$ equations. Each qubit measurement yields two possible outcomes, corresponding to the equations $2$ (see the back equation 7, 14), then measuring each qubit (among $n$ qubits) on a given basis yields $2n$ equations. If, as a first step, we choose to measure each individual qubit in Z basis, then we have $2n$ equations. Then, we continue to measure only the first qubit, then if in total $M$ is a different basis to measure, then we require $2M\geq 2^{n+1}-2n$ . Given that $M$ is an integer, one can opt for $M=\lceil 2^{n}-n\rceil$ , which is the total of different bases, or different angles, which one needs to choose and measure.

Algorithm 1

Input a $n$ -qubit quantum state $\ket{\psi}\in\mathbb{C}^{2^{n}}$ . Choose $M=\lceil 2^{n}-n\rceil$ and denote by $\theta_{1},\theta_{2},...,\theta_{M}$ the angle indicating the direction of the measurement basis. Then we do the following:

•

Measure each qubit of $\ket{\psi}$ in computational basis (Z basis) $\mathcal{O}(1/\epsilon^{2})$ times.
•

Measure the first qubit of $\ket{\psi}$ in basis $\{\theta_{i}\}_{i=1}^{M}$ $\mathcal{O}(1/\epsilon^{2})$ times.

•

Form the following system of nonlinear algebraic equations (e.g., as in 70):

\displaystyle\begin{cases}\sum_{i_{1},...,i_{n-1}=\{0,1\}}|a_{0i_{1}i_{2}...i_{n-1}}|^{2}=p_{0}^{1}\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|a_{1i_{1}i_{2}...i_{n-1}}|^{2}=p_{1}^{1}\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|a_{i_{1}0i_{2}...,i_{n-1}}|^{2}=p_{0}^{2}\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|a_{i_{1}1i_{2}...,i_{n-1}}|^{2}=p_{1}^{2}\\ \vdots\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|a_{i_{1}i_{2}...,i_{n-1}0}|^{2}=p_{0}^{n}\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|a_{i_{1}i_{2}...,i_{n-1}1}|^{2}=p_{1}^{n}\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|\cos(\theta_{1})a_{0i_{1}...i_{n-1}}+\sin(\theta_{1})a_{1i_{1}...i_{n-1}}|^{2}=p(\theta_{1})_{0}\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|\cos(\theta_{1})a_{0i_{1}...i_{n-1}}-\sin(\theta_{1})a_{1i_{1}...i_{n-1}}|^{2}=p(\theta_{1})_{1}\\ \vdots\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|\cos(\theta_{M})a_{0i_{1}...i_{n-1}}+\sin(\theta_{M})a_{1i_{1}...i_{n-1}}|^{2}=p(\theta_{M})_{0}\\ \sum_{i_{1},...,i_{n-1}=\{0,1\}}|\cos(\theta_{M})a_{0i_{1}...i_{n-1}}-\sin(\theta_{M})a_{1i_{1}...i_{n-1}}|^{2}=p(\theta_{M})_{1}\end{cases}

(71)

•

Solve the above equations and obtain $\Re(a_{0i_{1}...i_{n-1}})$ , $\Im(a_{0i_{1}...i_{n-1}})$ , $\Re(a_{1i_{1}...i_{n-1}})$ , $\Im(a_{1i_{1}...i_{n-1}})$ for all $i_{1},i_{2},...,i_{n-1}=\{0,1\}$

The last and very important point we need to discuss is the measurement-induced error and how it affects the solution we solve from such erroneous equations. Recall that the right-hand side of Eqn. 71 are estimated by collecting measurement results from single-qubit measurement. From an earlier discussion, we knew that by obeying the Bernoulli distribution, $\mathcal{O}(1/\epsilon^{2})$ is necessary and sufficient to estimate, for example, $p_{0}^{1},p_{1}^{1}$ to accuracy $\epsilon$ if we measure the first qubit. Likewise, all remaining probabilities are estimated up to $\epsilon$ . As the right-hand side is erroneous, the solution to the system of nonlinear algebraic equations is also deviating from the real, ideal solution. The following theorem characterizes the error deviation of the erroneous system from the ideal system:

Theorem 1 (Erroneous Nonlinear Algerbraic Equations)

Let $x_{1},x_{2},...,x_{N}$ be variables and $f_{1},f_{2},...,f_{N}:\mathbb{R}^{N}\longrightarrow\mathbb{R}$ be multivariate functions. Define two systems of nonlinear algebraic equations as follows.

\displaystyle\begin{cases}f_{1}(x_{1},x_{2},...,x_{N})=b_{1}\\ f_{2}(x_{1},x_{2},...,x_{N})=b_{2}\\ \vdots\\ f_{N}(x_{1},x_{2},...,x_{N})=b_{N}\end{cases}

(72)

and

\displaystyle\begin{cases}f_{1}(x_{1},x_{2},...,x_{N})=\tilde{b}_{1}\\ f_{2}(x_{1},x_{2},...,x_{N})=\tilde{b}_{2}\\ \vdots\\ f_{N}(x_{1},x_{2},...,x_{N})=\tilde{b}_{N}\end{cases}

(73)

Let the Jacobian be:

\displaystyle J=\begin{pmatrix}\frac{\partial f_{1}}{\partial x_{1}}&\frac{\partial f_{1}}{\partial x_{2}}&\cdots&\frac{\partial f_{1}}{\partial x_{N}}\\ \frac{\partial f_{2}}{\partial x_{1}}&\frac{\partial f_{2}}{\partial x_{2}}&\cdots&\frac{\partial f_{2}}{\partial x_{N}}\\ \cdots&\cdots&\ddots&\cdots\\ \frac{\partial f_{N}}{\partial x_{1}}&\frac{\partial f_{N}}{\partial x_{2}}&\cdots&\frac{\partial f_{N}}{\partial x_{N}}\\ \end{pmatrix}

(74)

Let $\textbf{x}=(\textbf{x}_{1},\textbf{x}_{2},...,\textbf{x}_{N})$ denote the solution to 72, and $\tilde{\textbf{x}}=(\tilde{\textbf{x}}_{1},\tilde{\textbf{x}}_{2},...,\tilde{\textbf{x}}_{N})$ denotes the solution to 73; let $\tilde{\textbf{b}}=(\tilde{b}_{1},\tilde{b}_{2},...,\tilde{b}_{N})$ and $\textbf{b}=(b_{1},b_{2},...,b_{N})$ . Then we have that:

\displaystyle|\tilde{\textbf{x}}-\textbf{x}|\leq|J^{-1}||\tilde{\textbf{b}}-\textbf{b}|

(75)

where $|.|$ refers to the usual $l_{2}$ Euclidean norm for a vector and matrix norm for a matrix.

Proof: We use Taylor’s expansion around the solution, in the limit $\epsilon\longrightarrow 0$ . We have that, for $i=1,2,...,N$ :

	$\displaystyle f_{i}(\tilde{\textbf{x}}_{1},\tilde{\textbf{x}}_{2},...,\tilde{\textbf{x}}_{N})=f_{i}(\textbf{x}_{1},\textbf{x}_{2},...,\textbf{x}_{N})+\frac{\partial f_{i}}{\partial\textbf{x}_{1}}(\tilde{\textbf{x}}_{1}-\textbf{x}_{1})+\frac{\partial f_{i}}{\partial\textbf{x}_{2}}(\tilde{\textbf{x}}_{2}-\textbf{x}_{2})+...+\frac{\partial f_{i}}{\partial\textbf{x}_{N}}(\tilde{\textbf{x}}_{N}-\textbf{x}_{N})$		(76)
	$\displaystyle\rightarrow\tilde{b}_{i}=b_{i}+\frac{\partial f_{i}}{\partial\textbf{x}_{1}}(\tilde{\textbf{x}}_{1}-\textbf{x}_{1})+\frac{\partial f_{i}}{\partial\textbf{x}_{2}}(\tilde{\textbf{x}}_{2}-\textbf{x}_{2})+...+\frac{\partial f_{i}}{\partial\textbf{x}_{N}}(\tilde{\textbf{x}}_{N}-\textbf{x}_{N})$		(77)

Then we have that:

	$\displaystyle\tilde{b}_{1}-b_{1}=\frac{\partial f_{1}}{\partial\textbf{x}_{1}}(\tilde{\textbf{x}}_{1}-\textbf{x}_{1})+\frac{\partial f_{1}}{\partial\textbf{x}_{2}}(\tilde{\textbf{x}}_{2}-\textbf{x}_{2})+...+\frac{\partial f_{1}}{\partial\textbf{x}_{N}}(\tilde{\textbf{x}}_{N}-\textbf{x}_{N})$		(78)
	$\displaystyle\tilde{b}_{2}-b_{2}=\frac{\partial f_{2}}{\partial\textbf{x}_{1}}(\tilde{\textbf{x}}_{1}-\textbf{x}_{1})+\frac{\partial f_{2}}{\partial\textbf{x}_{2}}(\tilde{\textbf{x}}_{2}-\textbf{x}_{2})+...+\frac{\partial f_{2}}{\partial\textbf{x}_{N}}(\tilde{\textbf{x}}_{N}-\textbf{x}_{N})$		(79)
	$\displaystyle\vdots$		(80)
	$\displaystyle\tilde{b}_{N}-b_{N}=\frac{\partial f_{N}}{\partial\textbf{x}_{1}}(\tilde{\textbf{x}}_{1}-\textbf{x}_{1})+\frac{\partial f_{N}}{\partial\textbf{x}_{2}}(\tilde{\textbf{x}}_{2}-\textbf{x}_{2})+...+\frac{\partial f_{N}}{\partial\textbf{x}_{N}}(\tilde{\textbf{x}}_{N}-\textbf{x}_{N})$		(81)

which has equivalent matrix form:

\displaystyle\begin{pmatrix}\tilde{b}_{1}-b_{1}\\ \tilde{b}_{2}-b_{2}\\ \vdots\\ \tilde{b}_{N}-b_{N}\end{pmatrix}=\begin{pmatrix}\frac{\partial f_{1}}{\partial x_{1}}&\frac{\partial f_{1}}{\partial x_{2}}&\cdots&\frac{\partial f_{1}}{\partial x_{N}}\\ \frac{\partial f_{2}}{\partial x_{1}}&\frac{\partial f_{2}}{\partial x_{2}}&\cdots&\frac{\partial f_{2}}{\partial x_{N}}\\ \cdots&\cdots&\ddots&\cdots\\ \frac{\partial f_{N}}{\partial x_{1}}&\frac{\partial f_{N}}{\partial x_{2}}&\cdots&\frac{\partial f_{N}}{\partial x_{N}}\\ \end{pmatrix}\cdot\begin{pmatrix}\tilde{x}_{1}-x_{1}\\ \tilde{x}_{2}-x_{2}\\ \vdots\\ \tilde{x}_{N}-x_{N}\end{pmatrix}

(82)

which could also be written as:

\displaystyle\tilde{\textbf{b}}-\textbf{b}=J(\tilde{\textbf{x}}-\textbf{x})

(84)

Therefore, we have that:

	$\displaystyle(\tilde{\textbf{x}}-\textbf{x})=J^{-1}(\tilde{\textbf{b}}-\textbf{b})$		(85)
	$\displaystyle\rightarrow\|(\tilde{\textbf{x}}-\textbf{x})\|=\|J^{-1}(\tilde{\textbf{b}}-\textbf{b})\|$		(86)

where $|.|$ refers to the usual $l_{2}$ Euclidean norm. As $\tilde{\textbf{b}}-\textbf{b}=|\tilde{\textbf{b}}-\textbf{b}|\big{(}(\tilde{\textbf{b}}-\textbf{b})/|\tilde{\textbf{b}}-\textbf{b}|\big{)}$ , then we have that:

\displaystyle|J^{-1}(\tilde{\textbf{b}}-\textbf{b})|\leq|\tilde{\textbf{b}}-\textbf{b}||J^{-1}|

(87)

As by definition, $|J^{-1}|=\max_{y}|J^{-1}y|$ for $y$ being a unit vector. So we have that:

\displaystyle|\tilde{\textbf{x}}-\textbf{x}|\leq|J^{-1}||\tilde{\textbf{b}}-\textbf{b}|

(88)

The proof is then completed.

The application of the above theorem to our main equation Eqn. 71 is straightforward. Such system forms a nonlinear algebraic equations, with $\Re/\Im(a_{0i_{1}...i_{n-1}})$ , $\Re/\Im(a_{1i_{1}...i_{n-1}})$ ( $i_{1},...,i_{n-1}=\{0,1\}$ ) being variables as $(x_{1},x_{2},...,x_{N})$ in Theorem 1, for which in this case we have $N=2^{n+1}$ . Additionally, the probabilities on the right hand side of equation 71 corresponds exactly to $\textbf{b}=(b_{1},b_{2},...,b_{N})$ in theorem 1. As each probability on the right hand side of equation 71 is estimated up to accuracy $\epsilon$ , it translates directly to the fact that for all $i=1,2,...,N$ , $|\tilde{b}_{i}-b_{i}|\leq\epsilon$ , then we have that:

\displaystyle|\tilde{\textbf{b}}-\textbf{b}|=\sqrt{\sum_{i=1}^{N}(\tilde{b}_{i}-b_{i})^{2}}\leq\sqrt{N}\epsilon

(89)

Therefore, from the above, we have:

\displaystyle|\tilde{\textbf{x}}-\textbf{x}|\leq|J^{-1}|\sqrt{N}\epsilon

(90)

where we remind that, x is being used in an abusive way (following theorem 1) to denote the ideal solution to our main system of interest 71, and $\tilde{\textbf{x}}$ denotes the erroneous solution. The above equation suggests that the error deviation depends on the behavior of corresponding Jacobian $J$ . If $J$ has the lowest non-zero singular value being $\mathcal{O}(1/poly(N))$ then the norm $|J^{-1}|$ gonna be large, making error induced larger. Otherwise, if the lowest non-zero singular value being large enough, then $|J^{-1}|$ could be $\mathcal{O}(1)$ . Then $|\tilde{\textbf{x}}-\textbf{x}|\leq\mathcal{O}(2\sqrt{N}\epsilon)$ . Now we discuss a few specific scenarios to show how our framework can be applied in a broader context.

II.2.1 Estimating each amplitude and its norm with additive error $\delta$

Suppose our $n$ -qubit state is $\ket{\psi}=\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}a_{i_{0}i_{1}...i_{n-1}}\ket{i_{0}i_{1}...i_{n-1}}$ where each $a_{i_{0}i_{1}...i_{n-1}}$ is a complex number, and we are interested in estimating its norm $|a_{i_{0}i_{1}...i_{n-1}}|$ to additive error $\epsilon$ , for all amplitudes. As it is a complex number, we have that:

\displaystyle|a_{i_{0}i_{1}...i_{n-1}}|=\sqrt{\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}}

(91)

The framework we introduced return an approximation to real and imaginary component of all amplitudes $a_{i_{0}i_{1}...i_{n-1}}$ . As we expect all norm of amplitudes having additive accuracy $\delta$ , it is equivalent to:

\displaystyle\max_{i_{0},i_{1},...,i_{n-1}}\{\big{|}|\tilde{a}_{i_{0}i_{1}...i_{n-1}}|-|a_{i_{0}i_{1}...i_{n-1}}|\big{|}\}=\delta

(92)

where $\tilde{a}_{i_{0}i_{1}...i_{n-1}}$ refers to the approximation of corresponding amplitude $a_{i_{0}i_{1}...i_{n-1}}$ . Let $\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})$ and $\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})$ denotes the approximation of real and imaginary part of $a_{i_{0}i_{1}...i_{n-1}}$ . Let $\tilde{a}_{i_{0}i_{1}...i_{n-1}}=\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})+i\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})$ . We have that:

$\displaystyle\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}^{2}$	$\displaystyle=\big{\|}\sqrt{\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}}-\sqrt{\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}}\big{\|}^{2}$	(93)
	$\displaystyle\leq\big{\|}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-(\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(a_{i_{0}i_{1}...i_{n-1}})^{2})\big{\|}$	(94)
	$\displaystyle\leq\big{\|}\big{(}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}+\big{(}\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}\big{\|}$	(95)
	$\displaystyle\leq\big{\|}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|\big{\|}$	(96)

where the second line comes from the fact that for any $x,y$ , we have $(x-y)^{2}=(x-y)(x+y)\rightarrow(x-y)^{2}=|(x-y)^{2}|=|(x-y)(x-y)|\leq|(x-y)(x+y)|\leq|x^{2}-y^{2}|$ . Remind that we are seeking for:

\displaystyle\max_{i_{0},i_{1},...,i_{n-1}}\big{|}\big{(}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}+\big{(}\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}\big{|}\leq\epsilon^{2}

(97)

and remind further that all the values $\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}}),\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})$ , and $\Re(a_{i_{0}i_{1}...i_{n-1}}),\Im(a_{i_{0}i_{1}...i_{n-1}})$ (for all $i_{0},i_{1},...i_{n-1}$ ) are contained inside $\tilde{\textbf{x}}$ and x, as we denoted. We have that:

	$\displaystyle\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|$	$\displaystyle=\|(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))(\tilde{\Re}(a_{i_{0}i_{1}...i_{n-1}})+\Re(a_{i_{0}i_{1}...i_{n-1}}))\|$		(98)
		$\displaystyle\leq 2\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|$		(99)

where we have used the fact that $|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})|,|\Re(a_{i_{0}i_{1}...i_{n-1}})|\leq 1$ . Likewise:

\displaystyle|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}|\leq 2|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})|

(100)

So we have:

	$\displaystyle\big{\|}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|\big{\|}\leq$	$\displaystyle 2\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|+$		(101)
		$\displaystyle 2\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|$		(102)

We have the Cauchy-Schwarz inequality:

	$\displaystyle\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|$		(103)
	$\displaystyle\leq\sqrt{2\big{(}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|^{2}+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|^{2}\big{)}}$		(104)

Therefore, for all $i_{0},i_{1},...,i_{n-1}$ , we have:

	$\displaystyle\big{\|}\big{(}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}+\big{(}\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}\big{\|}$		(105)
	$\displaystyle\leq 2\sqrt{2\big{(}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|^{2}+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|^{2}\big{)}}$		(106)

Recall that from previous discussion that lead to equation 90, we have that:

\displaystyle|\tilde{\textbf{x}}-\textbf{x}|\leq|J^{-1}|\sqrt{N}\epsilon

(107)

where x contains the ideal solution $\Re(a_{i_{0}i_{1}...i_{n-1}}),\Im(a_{i_{0}i_{1}...i_{n-1}})$ and $\tilde{\textbf{x}}$ contains the approximated ones, for all $i_{0},i_{1},...,i_{n-1}=\{0,1\}$ . Hence, $\tilde{\textbf{x}}-\textbf{x}$ is a vector containing all $(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))$ and $\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})$ . It is apparent that, for any specific $i_{0}i_{1}...i_{n-1}$ , we have:

	$\displaystyle\big{(}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|^{2}+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|^{2}\big{)}$		(108)
	$\displaystyle\leq\sum_{i_{0},i_{1},...,i_{n-1}}\big{(}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|^{2}+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|^{2}\big{)}$		(109)
	$\displaystyle=\|\tilde{\textbf{x}}-\textbf{x}\|^{2}\leq\|J^{-1}\|^{2}N\epsilon^{2}$		(110)

Therefore:

	$\displaystyle\sqrt{2\big{(}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})\|^{2}+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|^{2}\big{)}}$		(111)
	$\displaystyle\leq\sqrt{2}\|J^{-1}\|\sqrt{N}\epsilon$		(112)

It implies that:

	$\displaystyle\max_{i_{0},i_{1},...,i_{n-1}}\big{\|}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}+\big{(}\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}\big{\|}$		(114)
	$\displaystyle\leq 2\sqrt{2\big{(}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|^{2}+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|^{2}\big{)}}$		(115)
	$\displaystyle\leq 2\sqrt{2}\|J^{-1}\|\sqrt{N}\epsilon$		(116)

Previously, we also showed that:

	$\displaystyle\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}^{2}$	$\displaystyle\leq\big{\|}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|\big{\|}$		(117)
	$\displaystyle\longrightarrow\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}$	$\displaystyle\leq\sqrt{\big{\|}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|\big{\|}}$		(118)

Hence:

	$\displaystyle\max_{i_{0},i_{1},...,i_{n-1}}\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}$	$\displaystyle\leq\max_{i_{0},i_{1},...,i_{n-1}}\sqrt{\big{\|}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|\big{\|}}$		(119)
		$\displaystyle\leq 8^{1/4}\sqrt{J^{-1}}\sqrt{\sqrt{N}}\sqrt{\epsilon}$		(120)

We expect $\max_{i_{0},i_{1},...,i_{n-1}}\big{|}|\tilde{a}_{i_{0}i_{1}...i_{n-1}}|-|a_{i_{0}i_{1}...i_{n-1}}|\big{|}\leq\delta$ so it is sufficient to set:

	$\displaystyle 22^{1/4}\sqrt{J^{-1}}\sqrt{\sqrt{N}}\sqrt{\epsilon}=\delta$		(121)
	$\displaystyle\longrightarrow\epsilon=\frac{\delta^{2}}{4\sqrt{2}\|J^{-1}\|\sqrt{N}}$		(122)

Per algorithm 1, we first make measurement on each qubit $\mathcal{O}(1/\epsilon^{2})$ times, following by measuring first qubit $\mathcal{O}(2^{n}-n)$ times. Given that $N=2^{n+1}$ , the total number of measurement (or equivalently, the number of sample of $n$ -qubit state $\ket{\psi}$ ) required to estimate all amplitudes to additive error $\delta$ is $\mathcal{O}\big{(}\frac{4^{n}}{\delta^{4}}\big{)}$ .

The previous discussion was devoted to estimating the norm with maximum precision $\delta$ . Now we discuss a quite similar goal, which is outputting amplitudes such that:

\displaystyle\max_{i_{0},i_{1},...,i_{n-1}}\{\big{|}\tilde{a}_{i_{0}i_{1}...i_{n-1}}-a_{i_{0}i_{1}...i_{n-1}}|\big{|}\}=\delta

(123)

We have that:

$\displaystyle\big{\|}\tilde{a}_{i_{0}i_{1}...i_{n-1}}-a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}$	$\displaystyle=\sqrt{(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))^{2}+(\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}}))^{2}}$	(124)
	$\displaystyle\leq\sqrt{\sum_{i_{0},i_{1},...,i_{n-1}}\big{(}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|^{2}+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|^{2}\big{)}}$	(125)
	$\displaystyle=\|\tilde{\textbf{x}}-\textbf{x}\|\leq\|J^{-1}\|\sqrt{2^{n}}\epsilon=\delta$	(126)

which directly implies that $\epsilon=\delta/(|J^{-1}|\sqrt{2^{n}})$ . So, the total number of measurements is $\mathcal{O}(2^{n}/\epsilon^{2})=\mathcal{O}(|J^{-1}|4^{n}/\delta^{4})$ .

Naively, as we have discussed, the common approach makes use of jointly all qubit measurement to estimate the amplitude square. More concretely, let $\ket{\psi}=\sum_{i_{0},i_{1},...,i_{n-1}=\{0,1\}}a_{i_{0}i_{1}...i_{n-1}}\ket{i_{0}i_{1}...i_{n}-1}$ . Suppose that we want to estimate $a_{i_{0}i_{1}...i_{n-1}}$ for a specific $i_{0},i_{1},...,i_{n-1}$ . Then we perform the measurement on all qubits, and the outcome forms a Bernoulli distribution for $i_{0}i_{1}...i_{n-1}$ and the remaining strings, with corresponding probability $|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ and $1-|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ . Then it takes $\mathcal{O}(1/\epsilon^{2})$ measurements to estimate this probability $|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ to error $\epsilon$ . The procedure is then repeated to estimate all amplitude squares $|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ to error $\epsilon$ , resulting in $\mathcal{O}(2^{n}/\epsilon^{2})$ measurements in total. However, we note that these estimations are for the amplitude square. If we want to find the amplitude, we need to take the square root, with further error propagation as follows. Let $|\tilde{a}_{i_{0}i_{1}...i_{n-1}}|^{2}$ denotes the amplitude denote the approximation to amplitude square of $|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ , i.e:

\displaystyle\big{|}|\tilde{a}_{i_{0}i_{1}...i_{n-1}}|^{2}-|a_{i_{0}i_{1}...i_{n-1}}|^{2}\big{|}\leq\epsilon

(127)

Then we have that:

	$\displaystyle\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}-\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}^{2}$	$\displaystyle\leq\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}-\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}+\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}$		(128)
		$\displaystyle=\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}-\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}\big{\|}\leq\epsilon$		(129)

Therefore, if we expect the error to be $\delta$ , we need to set $\epsilon=\delta^{2}$ :

	$\displaystyle\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}-\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}^{2}\leq\delta^{2}$		(130)
	$\displaystyle\rightarrow\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}-\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}\leq\delta$		(131)

and the total number of measurement needed is $\mathcal{O}(2^{n}/\delta^{4})$ , which is quadratically better with respect to $2^{n}$ comparing to $\mathcal{O}(4^{n}/\delta^{4})$ measurements required by the method introduced, which is quite expected because naive procedure employs more qubits measurement.

II.2.2 Estimating probability distribution with total variation $\delta$

Suppose our $n$ -qubit state is $\ket{\psi}=\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}a_{i_{0}i_{1}...i_{n-1}}\ket{i_{0}i_{1}...i_{n-1}}$ , if we measure all qubits in computational basis, then apparently the outcome $i$ with corresponding probability $p(i_{0}i_{1}...i_{n-1})=|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ forms a probability distribution, denoted as $p$ . If we wish to estimate such a distribution up to total variation $\delta$ , that is, obtain a distribution $\tilde{p}$ such that $|\tilde{p}-p|_{1}=\sum_{i_{0},i_{1},...i_{n-1}=0,1}|\tilde{p}(i_{0}i_{1}...i_{n-1})-p(i_{0}i_{1}...i_{n-1})|\leq\delta$ . Similarly to the previous discussion, let $\Re(a_{i})$ and $\Im(a_{i})$ denote the real and imaginary components of $a_{i}$ , and $\Re(\tilde{a}_{i})/\Im(\tilde{a}_{i})$ denotes the approximation to the corresponding real and imaginary components. We want that:

\displaystyle\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}|

\displaystyle\leq\delta

(132)

As discussed above, we have that:

	$\displaystyle\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|$		(133)
	$\displaystyle\leq\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|$		(134)
	$\displaystyle\leq 2\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|+2\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|$		(135)

Therefore:

	$\displaystyle\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|$		(137)
	$\displaystyle\leq 2\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}\big{(}(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))+(\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\big{)}$		(138)
	$\displaystyle\leq 2\sqrt{\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}2^{n+1}\big{(}(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))^{2}+(\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}}))^{2}\big{)}}$		(140)

where the second line comes from Cauchy-Schwarz inequality. Recall that we have:

	$\displaystyle\|\tilde{\textbf{x}}-\textbf{x}\|$	$\displaystyle=\sqrt{\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}\big{(}(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))^{2}+(\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}}))^{2}\big{)}}$		(141)
		$\displaystyle\leq\|J^{-1}\|\sqrt{N}\epsilon=\|J^{-1}\|\sqrt{2^{n}}\epsilon$		(142)

So we have that:

\displaystyle\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}|

\displaystyle\leq 2\sqrt{2^{n+1}}|J^{-1}|\sqrt{2^{n}}\epsilon=2\sqrt{2}2^{n}|J^{-1}|\epsilon

(143)

As desired, we expect the total variation to be $\delta$ , so we need to set $2\sqrt{2}2^{n}|J^{-1}|\epsilon=\delta\rightarrow\epsilon=\delta/(2\sqrt{2}2^{n}|J^{-1}|)$ . The total number of measurements is then $\mathcal{O}(2^{n}/\epsilon^{2})=\mathcal{O}(6^{n}|J^{-1}|/\delta^{4})$ .

In a naive approach, as described earlier, it takes in total of $\mathcal{O}(2^{n}/\epsilon^{2})$ measurement to estimate all amplitudes square $|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ , each to accuracy $\epsilon$ . Hence, the accumulated error is:

\displaystyle\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}||\tilde{a}_{i_{0}i_{1}...i_{n-1}}|^{2}-|a_{i_{0}i_{1}...i_{n-1}}|^{2}|\leq 2^{n}\epsilon

(144)

If we want the total variation to be $\delta$ , then we require $2^{n}\epsilon=\delta\rightarrow\epsilon=\delta/2^{n}$ . So the total measurement required is $\mathcal{O}(4^{n}/\epsilon^{2})$ . We remark that, given the decomposition of initial state $\ket{\psi}$ , its density matrix representation is:

\displaystyle\rho=\ket{\psi}\bra{\psi}=\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}a_{i_{0}i_{1}...i_{n-1}}\ket{i_{0}i_{1}...i_{n-1}}\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}a^{*}_{i_{0}i_{1}...i_{n-1}}\bra{i_{0}i_{1}...i_{n-1}}

(145)

And the fact that we expect to obtain approximations of all amplitudes with total variation

\displaystyle\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}||\tilde{a}_{i_{0}i_{1}...i_{n-1}}|^{2}-|a_{i_{0}i_{1}...i_{n-1}}|^{2}|\leq\delta

(146)

is essentially equivalent to obtaining a state $\sigma$ such that $|\sigma-\rho|_{1}=\operatorname{Tr}|\sigma-\rho|\leq\delta$ , e.g., defining

\sigma=\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}|\tilde{a}_{i_{0}i_{1}...i_{n-1}}|^{2}\ket{i_{0}i_{1}...i_{n-1}}\bra{i_{0}i_{1}...i_{n-1}}

This problem has been extensively studied in yu2020sample ; haah2016sample , where the authors proposed different measurement schemes to output the desired density matrix that approximates the given state. The work haah2016sample employs more qubits measurement, meanwhile yu2020sample employs single-qubit measurement. The complexity of yu2020sample is $\mathcal{O}(10^{n}/\delta^{2})$ measurement required to output the density state to trace distance $\delta$ , which is quadratically more efficient than our method in error $\delta$ , but less efficient by a factor of $4^{n}$ .

II.2.3 Estimating with average $L_{1}$ -norm accuracy

Again, let $\ket{\psi}=\sum_{i_{0},i_{1},...,i_{n-1}={0,1}}a_{i_{0}i_{1}...i_{n-1}}\ket{i_{0}i_{1}...i_{n-1}}$ and instead of individual error $\delta$ as in section II.2.1, we expect the average error to be $\delta$ :

\displaystyle\sum_{i_{0},i_{1},...,i_{n-1}}\frac{1}{2^{n}}||\tilde{a}_{i_{0}i_{1}...i_{n-1}}|-|a_{i_{0}i_{1}...i_{n-1}}||\leq\delta

(147)

Similar as before, let $a_{i_{0}i_{1}...i_{n-1}}=\Re(a_{i_{0}i_{1}...i_{n-1}})+i\Im(a_{i_{0}i_{1}...i_{n-1}})$ , and $\tilde{a}_{i_{0}i_{1}...i_{n-1}}=\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})+i\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})$ where $\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})$ and $\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})$ denotes the approximation to $\Re(a_{i_{0}i_{1}...i_{n-1}}),\Im(a_{i_{0}i_{1}...i_{n-1}})$ respectively. As showed previously, we have that:

$\displaystyle\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}^{2}$	$\displaystyle\leq\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}-\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}\big{\|}$	(148)
	$\displaystyle=\big{\|}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{\|}$	(149)
	$\displaystyle\leq 2\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|+2\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|$	(150)
	$\displaystyle\leq 2\sqrt{2}\sqrt{(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))^{2}+(\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}}))^{2}}$	(151)

We have:

	$\displaystyle\frac{1}{2^{n}}\sum_{i_{0},i_{1},...,i_{n-1}}\|\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\|\leq\frac{1}{2^{n}}\sqrt{2^{n}\sum_{i_{0},i_{1},...,i_{n-1}}\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}^{2}}$		(152)
	$\displaystyle\leq\frac{1}{2^{n}}\sqrt{2^{n}\sum_{i_{0},i_{1},...,i_{n-1}}\sqrt{2}\sqrt{(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))^{2}+(\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}}))^{2}}}$		(153)
	$\displaystyle\leq\frac{1}{2^{n}}\sqrt{2^{n}\sqrt{2}\sqrt{2^{n}\sum_{i_{0},i_{1},...,i_{n-1}}((\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))^{2}+(\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}}))^{2})}}$		(154)
	$\displaystyle\leq\frac{1}{2^{n}}\sqrt{2^{n}\sqrt{2}\sqrt{2^{n}\|\tilde{\textbf{x}}-\textbf{x}\|^{2}}}$		(155)
	$\displaystyle\leq\frac{1}{2^{n}}\sqrt{2^{n}\sqrt{2}\sqrt{2^{n}}\|J^{-1}\|\sqrt{2^{n}}\epsilon}=2^{1/4}\sqrt{\|J^{-1}\|\epsilon}$		(156)

where the last line comes from Eqn. 90, with $N=2^{n}$ . We expect that

\displaystyle\sum_{i_{0},i_{1},...,i_{n-1}}\frac{1}{2^{n}}||\tilde{a}_{i_{0}i_{1}...i_{n-1}}|-|a_{i_{0}i_{1}...i_{n-1}}||\leq\delta

(157)

so we need to set $2^{1/4}\sqrt{|J^{-1}|\epsilon}=\delta\rightarrow\epsilon=\delta^{2}/(\sqrt{2}|J^{-1}|)$ . Hence, the total number of measurements needed is $\mathcal{O}(2^{n}/\epsilon^{2})=\mathcal{O}(2^{n}|J^{-1}|^{2}/\delta^{4})$ . Naively, we have seen that it takes $\mathcal{O}(2^{n}/\epsilon^{2})$ number of measurements to estimate all amplitude square $|a_{i_{0}i_{1}...i_{n-1}}|^{2}$ to accuracy $\epsilon$ . We also had from previous discussion, in section II.2.1 that

	$\displaystyle\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}-\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}^{2}$	$\displaystyle\leq\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}-\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}\big{\|}\sqrt{\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}}+\sqrt{\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}}\big{\|}$		(158)
		$\displaystyle=\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|^{2}-\|a_{i_{0}i_{1}...i_{n-1}}\|^{2}\big{\|}\leq\epsilon$		(159)

Hence:

	$\displaystyle\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}\leq\sqrt{\epsilon}$		(160)
	$\displaystyle\longrightarrow\sum_{i_{0},i_{1},...,i_{n-1}}\frac{1}{2^{n}}\|\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\|\leq\sqrt{\epsilon}$		(161)

If we expect the average $L_{1}$ norm error to be $\delta$ , then we need to set $\sqrt{\epsilon}=\delta\rightarrow\epsilon=\delta^{2}$ . So using naively, jointly multi-qubits measurement approach, the total number of measurement required is $\mathcal{O}(2^{n}/\epsilon^{2})=\mathcal{O}(2^{n}/\delta^{4})$ . Hence, in comparison, our single-qubit measurement framework achieves assymtotically similar number of measurements, given that the norm $|J^{-1}|$ grows as $\mathcal{O}(1)$ .

III Conclusion

In this work, we have outlined a simple framework for estimating amplitudes of a given state, using single-qubit measurements only. Our method executes measurement on each qubit in multiple basis, then uses the outcomes to construct a system of nonlinear algebraic equations. By classically solving such equations, we are able to obtain the approximation to the desired amplitudes. Our work adds to many existing literature regarding quantum state tomography, a simple way to recover a quantum state based on single-qubit measurement. One may wonder if our protocol can perform better if, instead of single-qubit measurement, we use multiqubit measurements instead. We recall that the key point of our method is to construct a nonlinear algebraic system. Performing more multiqubit measurements, in principle, would just produce a different set of nonlinear algebraic equations. Therefore, in some sense, our protocol implies that multiple-qubit measurement is not asymptotically stronger than single-qubit measurement. The trickiest aspect of our protocol is the norm of the inverse of Jacobian of the underlying nonlinear algebraic system. We expect such a norm to grow as much as $\mathcal{O}(1)$ , however, we are not able to prove it, and we believe that in general the behavior of such Jacobian can only be revealed numerically.

References

(1) Richard P Feynman. Simulating physics with computers. In Feynman and computation, pages 133–153. CRC Press, 2018.
(2) David Deutsch. Quantum theory, the church–turing principle and the universal quantum computer. Proceedings of the Royal Society of London. A. Mathematical and Physical Sciences, 400(1818):97–117, 1985.
(3) David Deutsch and Richard Jozsa. Rapid solution of problems by quantum computation. Proceedings of the Royal Society of London. Series A: Mathematical and Physical Sciences, 439(1907):553–558, 1992.
(4) Seth Lloyd. Universal quantum simulators. Science, 273(5278):1073–1078, 1996.
(5) Peter W Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM review, 41(2):303–332, 1999.
(6) Lov K Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages 212–219, 1996.
(7) Hsin-Yuan Huang, Richard Kueng, and John Preskill. Predicting many properties of a quantum system from very few measurements. Nature Physics, 16(10):1050–1057, 2020.
(8) Hsin-Yuan Huang, Richard Kueng, and John Preskill. Efficient estimation of pauli observables by derandomization. Physical review letters, 127(3):030503, 2021.
(9) Nengkun Yu. Sample efficient tomography via pauli measurements. arXiv preprint arXiv:2009.04610, 2020.
(10) E Bagan, M Baig, Ramon Muñoz-Tapia, and A Rodriguez. Collective versus local measurements in a qubit mixed-state estimation. Physical Review A, 69(1):010304, 2004.
(11) Michael Keyl. Quantum state estimation and large deviations. Reviews in Mathematical Physics, 18(01):19–60, 2006.
(12) Mădălin Guţă, Bas Janssens, and Jonas Kahn. Optimal estimation of qubit states with continuous time measurements. Communications in Mathematical Physics, 277:127–160, 2008.
(13) Jeongwan Haah, Aram W Harrow, Zhengfeng Ji, Xiaodi Wu, and Nengkun Yu. Sample-optimal tomography of quantum states. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, pages 913–925, 2016.
(14) Ryan O’Donnell and John Wright. Efficient quantum tomography. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, pages 899–912, 2016.
(15) Madalin Guţă, Jonas Kahn, Richard Kueng, and Joel A Tropp. Fast state tomography with optimal error bounds. Journal of Physics A: Mathematical and Theoretical, 53(20):204001, 2020.
(16) Steven T Flammia, David Gross, Yi-Kai Liu, and Jens Eisert. Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators. New Journal of Physics, 14(9):095022, 2012.
(17) Richard Kueng, Holger Rauhut, and Ulrich Terstiege. Low rank matrix recovery from rank one measurements. Applied and Computational Harmonic Analysis, 42(1):88–116, 2017.

$\displaystyle\|\cos(\theta)\ket{\phi}_{1}+\sin(\theta)\ket{\phi}_{2}\|^{2}$	$\displaystyle=\|\cos(\theta)a_{000}+\sin(\theta)a_{100}\|^{2}+$	(43)
	$\displaystyle\|\cos(\theta)a_{001}+\sin(\theta)a_{101}\|^{2}+$	(44)
	$\displaystyle\cos(\theta)a_{010}+\sin(\theta)a_{110}\|^{2}+$	(45)
	$\displaystyle\|\cos(\theta)a_{011}+\sin(\theta)a_{111}\|^{2}$	(46)

$\displaystyle\|\cos(\theta)\ket{\phi}_{1}-\sin(\theta)\ket{\phi}_{2}\|^{2}$	$\displaystyle=\|\cos(\theta)a_{000}-\sin(\theta)a_{100}\|^{2}+$	(48)
	$\displaystyle\|\cos(\theta)a_{001}-\sin(\theta)a_{101}\|^{2}+$	(49)
	$\displaystyle\|\cos(\theta)a_{010}-\sin(\theta)a_{110}\|^{2}+$	(50)
	$\displaystyle\|\cos(\theta)a_{011}-\sin(\theta)a_{111}\|^{2}$	(51)

$\displaystyle\big{\|}\|\tilde{a}_{i_{0}i_{1}...i_{n-1}}\|-\|a_{i_{0}i_{1}...i_{n-1}}\|\big{\|}^{2}$	$\displaystyle=\big{\|}\sqrt{\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}}-\sqrt{\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}}\big{\|}^{2}$	(93)
	$\displaystyle\leq\big{\|}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-(\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}+\Im(a_{i_{0}i_{1}...i_{n-1}})^{2})\big{\|}$	(94)
	$\displaystyle\leq\big{\|}\big{(}\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}+\big{(}\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\big{)}\big{\|}$	(95)
	$\displaystyle\leq\big{\|}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|\big{\|}$	(96)

	$\displaystyle\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|$	$\displaystyle=\|(\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}}))(\tilde{\Re}(a_{i_{0}i_{1}...i_{n-1}})+\Re(a_{i_{0}i_{1}...i_{n-1}}))\|$		(98)
		$\displaystyle\leq 2\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|$		(99)

	$\displaystyle\big{\|}\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Re(a_{i_{0}i_{1}...i_{n-1}})^{2}\|+\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})^{2}-\Im(a_{i_{0}i_{1}...i_{n-1}})^{2}\|\big{\|}\leq$	$\displaystyle 2\|\Re(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Re(a_{i_{0}i_{1}...i_{n-1}})\|+$		(101)
		$\displaystyle 2\|\Im(\tilde{a}_{i_{0}i_{1}...i_{n-1}})-\Im(a_{i_{0}i_{1}...i_{n-1}})\|$		(102)

A Framework For Estimating Amplitudes of Quantum State With Single-Qubit Measurement

Abstract

I Introduction

II Main Framework

II.1 Examples

II.2 General Framework

Algorithm 1

Theorem 1 (Erroneous Nonlinear Algerbraic Equations)

II.2.1 Estimating each amplitude and its norm with additive error δ\delta

II.2.2 Estimating probability distribution with total variation δ\delta

II.2.3 Estimating with average L1L_{1}-norm accuracy

III Conclusion

References

II.2.1 Estimating each amplitude and its norm with additive error $\delta$

II.2.2 Estimating probability distribution with total variation $\delta$

II.2.3 Estimating with average $L_{1}$ -norm accuracy