\authorinfo

* : Send correspondence to Kwangmin Yu, E-mail : [email protected]

Practical numerical integration on NISQ devices

Kwangmin Yu Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, USA Hyunkyung Lim Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York 11794, USA Pooja Rao Department of Mathematics, Stony Brook University, Stony Brook, New York 11794, USA

Abstract

This paper addresses the practical aspects of quantum algorithms used in numerical integration, specifically their implementation on Noisy Intermediate-Scale Quantum (NISQ) devices. Quantum algorithms for numerical integration utilize Quantum Amplitude Estimation (QAE) (Brassard et al., 2002) in conjunction with Grover’s algorithm. However, QAE is daunting to implement on NISQ devices since it typically relies on Quantum Phase Estimation (QPE), which requires many ancilla qubits and controlled operations. To mitigate these challenges, a recently published QAE algorithm (Suzuki et al., 2020), which does not rely on QPE, requires a much smaller number of controlled operations and does not require ancilla qubits. We implement this new algorithm for numerical integration on IBM quantum devices using Qiskit and optimize the circuit on each target device. We discuss the application of this algorithm on two qubits and its scalability to more than two qubits on NISQ devices.

keywords:

Quantum Algorithm, Quantum Counting, Monte Carlo Integration

1 Introduction

Since quantum computing (QC) was first introduced in the 1980s [1], it has grown into an active and diverse field of research. In recent years, significant progress has been made in building quantum computers by companies, such as IBM, Google, and Rigetti. Recently, Google, IBM, and Intel have announced 72 qubits [2], 50 qubits [3] (53 qubits [4]), and 49 qubits [5] quantum devices, respectively. More notably, IBM has made available its cloud enabled quantum computing platform to the public. These currently available NISQ devices provide a tangible quantum programming environment and are serving as a stepping stone for the large-scale universal quantum computers of the future [6]. Despite the significant breakthrough achieved in qubit numbers in the NISQ era, the quantum systems are limited by (a) the short coherence times, the amount of useful operational time in a calculation before the information loss, (b) the circuit depth, the number of sequential quantum operations that can be performed on a quantum device, and (c) the lack of error correction [6]. They are so noisy that the number of quantum gates that can be implemented is significantly impacted [7, 8]. Furthermore, due to limited physical inter-connectivity between the qubits, the multi-qubit gates (operations), such as controlled-NOT and controlled-controlled-NOT (Toffoli) are hard to implement efficiently [7, 9].

The aforementioned device limitations restrict the implementation of the quantum algorithms developed in the 1990s such as Quantum Fourier Transform (QFT) [10] and Quantum Phase Estimation (QPE)[11]. In particular, QPE, which requires many controlled operations and ancilla qubits, plays a critical role in various quantum algorithms, including Shor’s factoring algorithm[10]. This has motivated several efforts to design a more efficient version of QPE [12, 13], but the number of controlled operations is still restrictive. Therefore, the Quantum Amplitude Estimation (QAE) of Brassard et al. (canonical QAE) [14], which relies heavily on QPE, is daunting to implement on NISQ devices.

Several variants, such as maximum likelihood quantum amplitude estimation (MLQAE), iterative quantum amplitude estimation (IQAE), etc., have recently been published [15, 16, 17], that do not need QPE. Of particular importance to this study is the MLQAE method by Suzuki et al. [16], that has many desirable properties. First and foremost, since MLQAE does not use QPE, it avoids controlled operations in QPE and instead uses maximum likelihood estimation as a post-processing method. Another advantage is that it does not need ancilla qubits, which are necessary to achieve the desired accuracy in canonical QAE (CQAE). Finally, it is parallelizable because the queries can be executed simultaneously before they are post-processed using maximum likelihood estimation.

In this paper, we study the numerical implementation of quantum Monte Carlo integration (see Sec 1.1), whose building blocks are QAE and Grover’s search algorithm, [18] and its generalization, quantum amplitude amplification (QAA) [14]. The focus of the study is the investigation of methods for QAE that are appropriate for NISQ devices. To this end, we compare and contrast the performance of MLQAE with CQAE from a practical standpoint. We implement these quantum algorithms in Qiskit [19], an open source software suite for near-term quantum computing, and apply the transpiler in Qiskit to optimize the implementation on each target quantum device. This is followed by a comparative analysis of execution of these algorithms on IBMQ devices.

1.1 Monte Carlo Method for Integration

Monte Carlo (MC) methods are a wide class of computational techniques that make use of random number sampling. In this section, we review hit-or-miss Monte Carlo Integration (MCI) to estimate a definite integral numerically.

Given a domain $\mathscr{D}\subset\mathbb{R}^{n}$ with two real valued functions, $f,g:\mathscr{D}\subset\mathbb{R}^{n}\rightarrow\mathbb{R}$ (typically $g\equiv 0$ ) and $f>g$ , consider the definite integral of $f-g$ . Hit-or-miss MCI starts by generating samples uniformly from $\mathscr{D}\times[a_{n+1},b_{n+1}]$ where $a_{n+1}\leq\textrm{min}(f-g)$ and $b_{n+1}\geq\textrm{max}(f-g)$ . A sample $X=(x_{1},x_{2},\cdots,x_{n},x_{n+1})\in\mathscr{D}\times[a_{n+1},b_{n+1}]$ is considered a hit if it satisfies $x_{n+1}<f(x_{1},x_{2},\cdots,x_{n})$ and $x_{n+1}>g(x_{1},x_{2},\cdots,x_{n})$ . This method can be formulated as follows:

\frac{\idotsint_{\mathscr{D}}(f-g)d\textbf{x}^{n}}{\textrm{Volume}(\mathscr{D}\times[a_{n+1},b_{n+1}]))}\approx\frac{\#\textrm{of hits}}{\#\textrm{of samples}}.

The main advantage of MCI is that the error is solely a function of the number of the samples taken and so the basic process does not depend on the dimension of the function and thus can be scaled up easily, both in terms of effort and complexity, to higher dimensions. Deterministic numerical integration methods, such as the trapezoidal rule and Simpson’s rule, have error bounds that increase exponentially as the dimension of the integral increases. All these factors make MCI an appealing method in higher dimension ( $\gg 3$ ). Figure 1 shows a pictorial representation of MCI of $\int_{0}^{1}\sin(\pi x)\,dx=\frac{2}{\pi}\approx 0.63662$ . As shown in Fig. 1, we have $0.57$ from $100$ samples (left figure) and $0.64$ from $1000$ samples.

1.2 Quantum Monte Carlo Integration

Equation (1.1) shows that MCI is evaluated by counting the number of hits when the volume of $\mathscr{D^{\prime}}=\mathscr{D}\times[a_{n+1},b_{n+1}]$ , is easy to compute. If $\mathscr{D^{\prime}}$ is complicated, then it can be divided into finite sub-domains that are simpler and the same procedure can be followed on each of the subdomains. MCI basically counts the number of good states ( $\{\textbf{x}~{}|~{}\chi(\textbf{x})=1\}$ ) of the boolean function, $\chi:\mathscr{D^{\prime}}\subset\mathbb{R}^{n+1}\rightarrow\{0,1\}$ , which depends on $f,g$ and domain $\mathscr{D^{\prime}}$ in Eq. (1.1).

One way of implementing MCI on a quantum computer is by counting the solutions of Grover’s search algorithm. In fact, CQAE implements MCI by using the generalized Grover’s algorithm (Quantum Amplitude Amplification) and QPE [14].

1.3 Notations

We follow Dirac’s Bra-Ket notation for qubit representation and arithmetic, such as the inner product and the tensor product. For a multiple qubit system, we use the consecutive binary number strings with the most significant qubit located on the left and the least significant qubit on the right. For example, we have $\ket{0}\otimes\ket{0}\otimes\ket{0}=\ket{0}\ket{0}\ket{0}=\ket{000}$ in the binary representation and $\ket{011}=\ket{3}$ and $\ket{110}=\ket{6}$ in the decimal representation of the computational basis. In the decimal representation, the number of qubits $n$ is denoted by a subscript as in $\ket{0}_{n}$ . The dimension of a square matrix is also denoted by a subscript. For example, $\mathbb{I}_{n}$ denotes the $n\times n$ identity matrix. In quantum circuit diagrams, the top and the bottom qubits represent the least and the most significant qubits, respectively. The three Pauli matrices (Pauli gates when they are used in quantum circuits) $X$ , $Y$ and $Z$ are as follows:

X=\begin{pmatrix}0&1\\ 1&0\end{pmatrix},~{}~{}Y=\begin{pmatrix}0&-i\\ i&0\end{pmatrix},~{}~{}Z=\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}.

The Hadamard matrix, H = $\frac{1}{\sqrt{2}}\begin{pmatrix}1&1\\ 1&-1\end{pmatrix}$ , is a unitary matrix that is also Hermitian, so it is its own inverse.

2 Quantum Amplitude Estimation Algorithms

First, we briefly review quantum amplitude amplification and amplitude estimation algorithms [14]. The amplitude amplification is a generalization of Grover’s search algorithm [18], without the loss of the quadratic quantum speedup over its classical counterpart.

Suppose we have a boolean function $f$ and a unitary operator $\mathcal{A}$ , where $f$ maps the good states to $1$ and the bad states to $0$ on domain $\mathscr{D}$ such that $|\mathscr{D}|=N=2^{n}$ , and $\mathcal{A}$ acts on $n+1$ qubits such that

\ket{\Psi}=\mathcal{A}\ket{0}_{n}\ket{0}=\sqrt{1-a}\ket{\psi_{0}}_{n}\ket{0}+\sqrt{a}\ket{\psi_{1}}_{n}\ket{1},

(1)

where the good state is $\ket{\psi_{1}}_{n}$ with $|\ket{\psi_{1}}_{n}|=k$ , the bad state is $\ket{\psi_{0}}_{n}$ , and $a=k/N\in[0,1]$ is unknown. The job of the quantum amplitude estimation algorithm is to find $a$ approximately. The algorithm complexity is measured by the number of quantum queries to the operator $\mathcal{A}$ .

To achieve the quantum speedup, instead of measuring the last qubit of $\ket{\Psi}=\mathcal{A}\ket{0}_{n}\ket{0}$ directly, $\ket{\Psi}$ is first amplified by the following unitary operator Q:

\textbf{Q}=\mathcal{A}\textbf{S}_{0}\mathcal{A}^{-1}\textbf{S}_{\chi},

(2)

where $\textbf{S}_{0}=\mathbb{I}_{n+1}-2\ket{0}_{n+1}\bra{0}_{n+1}$ and $\textbf{S}_{\chi}=(\bigotimes\limits^{n}\mathbb{I}_{2})\otimes Z$ . $\textbf{S}_{\chi}$ puts a negative sign to the good state $\ket{\psi_{1}}_{n}\ket{1}$ and does nothing to the bad state $\ket{\psi_{0}}_{n}\ket{0}$ . Let us define a parameter $\theta\in[0,\pi/2]$ so that $\sin^{2}\theta=a$ . With this, we can rewrite Eq. (1) as:

\ket{\Psi}=\mathcal{A}\ket{0}_{n}\ket{0}=\cos\theta\ket{\psi_{0}}_{n}\ket{0}+\sin\theta\ket{\psi_{1}}_{n}\ket{1}.

(3)

By applying Q (amplitude amplification operator) repeatedly $m$ times on $\ket{\Psi}$ , we get

\textbf{Q}^{m}\ket{\Psi}=\cos((2m+1)\theta)\ket{\psi_{0}}_{n}\ket{0}+\sin((2m+1)\theta)\ket{\psi_{1}}_{n}\ket{1}.

(4)

From Eqs. (1) and (3), it can be observed that the measurement after applying $\textbf{Q}^{m}$ on $\mathcal{A}\ket{0}_{n}\ket{0}$ shows a quadratically larger probability of obtaining the good state (provided $\theta$ is sufficiently small so that $(2m+1)\theta<\frac{\pi}{2}$ ) than measuring $\mathcal{A}\ket{0}_{n}\ket{0}$ directly [14].

CQAE [14] estimates $\theta$ in Eq. (3) by QPE which includes the inverse QFT. QPE is implemented by the controlled operation on the oracle queries. Hence, it needs a number of multi-controlled operations, which are further decomposed into many basis gates. When this algorithm is implemented on NISQ devices, the accuracy is strongly limited by the connectivity between qubits because all the ancilla registers need connectivity with the target register and a sufficiently large number of ancillae are needed to ensure desired accuracy of the estimation. On the other hand, MLQAE [16] is implemented by post-processing the result of the quantum computation (and measuring the process without QPE) using maximum likelihood estimation. Since different levels of amplification (depending on power of $Q$ ) can be executed independently, the algorithm is parallelizable. The implementation aspects of both CQAE and MLQAE will be discussed in the following sections.

3 Implementation

For our study, a two-qubit domain is used to implement CQAE and MLQAE algorithms, and the solution is fixed to be $\ket{01}$ out of the four possible values, $\ket{00}$ , $\ket{01}$ , $\ket{10}$ , and $\ket{11}$ . The algorithms are implemented in Qiskit [19], and optimized for IBM quantum devices using the transpiler in Qiskit.

3.1 Implementation

The quantum circuit implementation of Eqs. (1) and (2) with a two-qubit domain in Qiskit are described in Figs. 2 and 3, respectively. Figure 2 also includes the quantum circuit of $\mathcal{A}^{-1}$ as it is used in Eq. (2).

3.1.1 Canonical Quantum Amplitude Estimation

CQAE needs additional ancilla qubits to read out the amplitude. The number of ancilla qubits is represented by $m$ in this paper. Figures 4 and 11 show the quantum circuits with $m$ = $1$ and $3$ , respectively. Since the possible measured angle with $m$ ancillae are $2\pi i/2^{m}$ for $i=0$ to $2^{m}-1$ , having more ancilla qubits increases the accuracy of the readout. For example, the possible measurements for $m=1$ are $0$ and $\pi$ . The quantum circuit in Fig. 4 is a variation of the circuit in Fig. 3. The former adds control on the Q operator followed by the inverse QFT.

To run a circuit on a quantum device, each quantum gate in the circuit should be decomposed into the basis gates supported by the device. Figure 5 shows the decomposed and optimized circuit for the IBMQ VIGO device. For optimization, the transpiler in Qiskit is used with the optimization level set at $3$ , the highest possible value allowed in the package. In spite of applying the optimization, the circuit depth increases by more than double because of decomposition of the gates into the basis gates, and the addition of SWAP gates to the circuit to supplement disconnection between the qubits.

For $m=3$ , the possible measurements of $\theta$ are $0,\pi/4,\pi/2,\cdots,7\pi/4$ . The quantum circuit depth increases from $78$ (Fig. 4) to $514$ (Fig. 11) when $m$ is increased from $1$ to $3$ for more accuracy. In Fig. 11, $a_{0}$ , $a_{1}$ and $a_{2}$ represent the ancilla qubits and the circuit is composed of $\mathcal{A}$ , followed by $a_{0}$ controlled Q, followed by $a_{1}$ controlled $\textbf{Q}^{2}$ (two consecutive $a_{1}$ controlled Q), followed by $a_{2}$ controlled $\textbf{Q}^{4}$ (four consecutive $a_{2}$ controlled Q), and the inverse QFT at the end. The main reasons that lead to a sharp increase in the circuit depth are (a) exponential increase of controlled Q operations, which increase to $2^{m}-1$ if $m$ ancillae are used, and (b) controlled operations imposed on Toffoli operations in $\mathcal{A}$ , Q, and $\mathcal{A}^{-1},$ as shown in Fig. 3. The controlled Toffoli (ccc-NOT) are composed of a number of basic quantum gates as shown in Figs. 3 and 4. From Fig. 4 to Fig. 5 ( $m=1$ case), there is a twofold increases in circuit depth upon its decomposition into the basis gates for the quantum device. For $m=3$ , the circuit in Fig. 11, with a depth of $514$ , should be decomposed into the basis quantum gates. As a consequence, its depth will become more than $1000$ ! And for that reason, we do not show the decomposed circuit in this paper. The implementation of such a circuit is not feasible on current NISQ devices because of decoherence and device error.

3.1.2 Maximum Likelihood Quantum Amplitude Estimation

The two main benefits of MLQAE over CQAE are (a) MLQAE does not need ancilla qubits to read out the amplitude $\sqrt{a}$ of Eq. (1), and (b) parallel execution. The analysis of error bounds and the number of queries (application of $\mathcal{A}$ and $\mathcal{A}^{-1}$ ) are discussed by Suzuki et al. [16] and Aaronson et al. [15]. Both CQAE and MLQAE need $O\bigl{(}\frac{1}{\epsilon}\bigl{(}\sqrt{\frac{K}{N}}\bigr{)}\bigr{)}$ oracle queries to estimate $K$ good states in $N$ samples (domain) with error $\epsilon$ [15].

Figure 6 shows the quantum circuit implementation of MLQAE on a two-qubit domain. When only one circuit (only $\mathcal{A}$ ) is used, then there is no quantum speed up [16]. More circuits increase the accuracy of the estimation of the amplitude. Suzuki et al. [16] discuss two options for circuit sequencing in MLQAE, linearly incremental sequence (LIS) and exponential incremental sequence (EIS), and suggest EIS as the asymptotically optimal choice. Thus, we adapt the EIS which has exponential power of Q from the second circuit. For example, the $n$ -th circuit has $\textbf{Q}^{2^{n-2}}$ operator after $\mathcal{A}$ when we have $n>1$ . Figure 6 shows four circuits of the MLQAE implementation and the last circuit (Fig. 6 (d)) has $\textbf{Q}^{4}$ operator.

The number of Q queries of MLQAE of $n$ circuits is equivalent to that of CQAE of $n-1$ ancilla qubits. For example, when we have four circuits for MLQAE, the number of Q queries is equivalent to that of CQAE having three ancilla qubits ( $m=3$ case). Since both CQAE and MLQAE have $O\bigl{(}\frac{1}{\epsilon}\bigl{(}\sqrt{\frac{K}{N}}\bigr{)}\bigr{)}$ oracle queries, we can compare the two circuits of Figs. 11 and 6 with the same accuracy. In this comparison, CQAE has a circuit depth of $514$ and MLQAE has a circuit depth of $93$ in total (sum of all circuit depths in Fig. 6). The advantage offered by MLQAE with respect to decoherence due to its smaller circuit depth is even more pronounced because the circuits that make up the MLQAE run independently, and the measurements are post-processed. For MLQAE, the noise is dominated by the last (longest) circuit due to decoherence and gate noise present in that circuit. The circuits (a), (b), (c), and (d) in Fig. 6 are named MLQAE $\left[0\right]$ , MLQAE $\left[1\right]$ , MLQAE $\left[2\right]$ , MLQAE $\left[3\right]$ , respectively.

4 Results

In this section, we discuss the results obtained by running MLQAE on IBMQ simulator and IBMQ devices. The simulator does not have quantum decoherence and gate errors. Thus, we can regard the result from the simulator as perfect, meaning devoid of any quantum errors, although it does have errors and inaccuracies which arise from the algorithm, such as the discretization error. Therefore, we can observe quantum noise or error from the implementation of the algorithm by comparing the results between the simulator and the devices.

4.1 Maximum Likelihood Estimation

The key idea of MLQAE is the post-processing of the measurements from each of the quantum circuits. The likelihood functions and the resultant maximum likelihood function are defined as following in the domain $[0,\pi/2]$ for $\theta$ :

L_{k}(h_{k};\theta)=\{\sin^{2}((2m_{k}+1)\theta)\}^{h_{k}}\{\cos^{2}((2m_{k}+1)\theta)\}^{N-h_{k}},

(5)

L(\vec{\textbf{h}};\theta)=\prod\limits_{k=0}^{m}L_{k}(h_{k};\theta),

(6)

where $m_{k}$ , $h_{k}$ , and $N$ are the power of Q, hit count of $1$ , and number of shots for the $k$ -th circuit, respectively, and $\vec{\textbf{h}}=(h_{0},h_{1},\cdots,h_{m})$ . The maximum likelihood estimation estimates $\hat{\theta}$ , which maximizes $L(\vec{\textbf{h}};\hat{\theta})$ in the domain. But instead of $L(\vec{\textbf{h}};\theta)$ , $\log L(\vec{\textbf{h}};\theta)$ is used to estimate $\hat{\theta}$ since the $\log$ function is monotonically increasing. The Python code implementation is given in Appendix A.

4.2 Simulator

The results from the IBMQ simulator are shown in Fig. 7. Each histogram represents the result of its corresponding quantum circuit given in Fig. 6. Since $a$ is $1/4$ and $\theta$ is $\pi/6$ in Eqs. (1) and (3), in this setting, $3\theta$ , $5\theta$ , and $9\theta$ are $\pi/2$ , $5\pi/6$ , and $3\pi/2$ . Therefore, circuits MLQAE $\left[0\right]$ , MLQAE $\left[1\right]$ , MLQAE $\left[2\right]$ , and MLQAE $\left[3\right]$ will collapse to $\ket{1}$ with a probability of $0.25$ , $1.00$ , $0.25$ , $1.00$ , respectively, when they are measured in the states of Eq. (4). The results in Fig. 7 are consistent with the analytical computation.

To estimate $\theta$ from the measured probability of each circuit, MLQAE uses the maximum likelihood estimation method [16] (see Appendix A). In this case, the estimated $\theta$ is $0.524$ (see Eqs. (1) and (3)). Since the probability $a$ , of measuring $\ket{1}$ , is $\sin^{2}(\theta)$ , $a$ is $0.2504\simeq 0.25$ , as expected.

4.3 IBM Quantum Devices

The optimizer (transpiler) maps logical qubits to device qubits optimally by considering qubit connectivity, and adds a SWAP gate if there is no physical connection between qubits. Figure 8 shows the layout of how qubits, such as $q_{0}$ , $q_{1}$ , and $q_{2}$ in Fig. 6(d), are mapped to IBM quantum devices. For example, on IBMQX2, $q_{0}$ , $q_{1}$ , and $q_{2}$ are mapped to $0$ , $1$ , and $2$ as can be seen in Fig. 8.

The optimized circuits of the MLQAE implementation (see Fig. 6) for IBMQX2 and IBMQ VIGO are shown in Figs. 12 and 13, respectively. The main difference between the two devices is the qubit connectivity. The three qubits on IBMQX2 have inter-connectivity with one another while the qubits $1$ and $2$ on IBMQ VIGO are not connected. Thus, the optimized circuit for IBMQ VIGO, Fig 12 (b) in Appendix B, includes SWAP gates which are highlighted by red boxes. Since SWAP gates are decomposed into three CNOT gates, optimized circuits on IBMQ VIGO have longer circuit depth than the circuits for IBMQX2, as shown in Figs. 12 and 13 in Appendix B.

Figures 9 and 10 are histograms of measurements on quantum devices. The results from IBMQ VIGO are more accurate than those from IBMQX2. The result from the MLQAE[0] circuit on IBMQX2 (Fig. 9 (a)) shows a relative error of $83.8\%$ , whereas that value on IBMQ VIGO (Fig. 10 (a)) is only $7.2\%$ . This is because IBMQ VIGO has more accurate quantum device operations than IBMQX2, as depicted in Fig. 8. For the $\theta$ -estimation problem (see Appendix A), IBMQX2 and IBMQ VIGO predict $\theta=0.795$ ( $a=0.509$ ) and $\theta=0.780$ ( $a=0.494$ ), respectively. The former has $51.8\%$ and $103.6\%$ as the relative errors of $\theta$ and $a$ and those values for the latter are $49.0\%$ and $97.6\%$ . In contrast with the result from the MLQAE[0] circuit, there is not a significant difference between the overall accuracy of the two devices. We hypothesize that IBMQ VIGO loses its accuracy because of the long circuit depths in MLQAE[2] and MLQAE[3].

5 Conclusion

In this paper, we have implemented and discussed two quantum amplitude estimation algorithms on IBM quantum devices, in the broader context of Monte Carlo integration. Even though CQAE [14] is a monumental algorithm and a brilliant extension of Grover’s algorithm [18], it is extremely challenging and in some cases infeasible to implement on NISQ devices, because of the required number of controlled operations acting on Q operators in QPE. Recently developed quantum amplitude estimation algorithms do not use QPE. We have implemented one such algorithm, MLQAE, along with CQAE on IBM quantum devices and discussed their applicability on NISQ devices. As discussed in Sec. 4, MLQAE is practical even on NISQ devices and has several advantages, including smaller circuit depth and parallel execution. The advancements in QAE for quantum Monte Carlo integration rely on quantum amplitude amplification (QAA).

As shown in Figs. 3 and 6, one of the crucial components of QAA is the $\textbf{S}_{0}$ operator, and it needs a multi-controlled NOT operator. In general, when we have an $n$ -qubit problem domain ( $2^{n}$ ), we need $n$ controlled NOT gates for each Q operator. For example, when we have a four-qubit domain ( $16$ ), the $\textbf{S}_{0}$ gate needs the following quantum gate,

\Qcircuit@C=1.0em@R=0.2em@!R{\lstick{{q}_{0}}&\targ\qw\qw\\ \lstick{{q}_{1}}\ctrl{-1}\qw\qw\\ \lstick{{q}_{2}}\ctrl{-1}\qw\qw\\ \lstick{{q}_{3}}\ctrl{-1}\qw\qw\\ \lstick{{q}_{4}}\ctrl{-1}\qw\qw\\ }

The implementation of the ‘advanced’ mode of the above circuit in Qiskit is shown in Fig. 14. Even though the ‘basic’ mode has shorter circuit depth, it needs two additional ancilla qubits as shown in Fig. 15. The circuit decomposition of the ‘advanced’ mode for IBMQX2 and IBMQ VIGO are shown in Fig. 16 and 17, respectively. Since IBMQ VIGO has less connectivity than IBMQX2 (see Fig. 8), the circuit for IBMQ VIGO has much longer depth ( $181$ ) than that of IBMQX2 ( $142$ ). This example shows that the multi-controlled NOT gates have mainly two difficulties on NISQ devices. The first is that a multi-controlled NOT gate should be decomposed into many basis gates as shown in Figs. 16 and 17. The second difficulty is that all qubits which are involved in the gate should be connected. As shown in Fig. 17, the lack of connectivity between qubits are complemented by SWAP gates (decomposed into three CNOT gates), increasing decoherence through a longer circuit and increasing gate error. Therefore, it is necessary to develop more efficient circuits for $\textbf{S}_{0}$ , or better algorithms, to avoid the problems $\textbf{S}_{0}$ has in order to run scalable quantum Monte Carlo integration on NISQ devices.

References

[1] Feynman, R. P., “Simulating physics with computers,” International journal of theoretical physics 21(6), 467–488 (1982).
[2] Kelly, J., “A preview of bristlecone, google’s new quantum processor,” Google AI Blog (March 2018).
[3] Knight, W., “Ibm raises the bar with a 50-qubit quantum computer,” (November 2017).
[4] Shankland, S., “Ibm’s new 53-qubit quantum computer is its biggest yet,” CNET (September 2019).
[5] Hsu, J., “Ces 2018: Intel’s 49-qubit chip shoots for quantum supremacy,” IEEE Spectrum (January 2018).
[6] Preskill, J., “Quantum Computing in the NISQ era and beyond,” Quantum 2, 79 (Aug. 2018).
[7] Tannu, S. S. and Qureshi, M. K., “Not all qubits are created equal: a case for variability-aware policies for NISQ-era quantum computers,” in [Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems ], 987–999 (2019).
[8] Córcoles, A. D., Kandala, A., Javadi-Abhari, A., McClure, D. T., Cross, A. W., Temme, K., Nation, P. D., Steffen, M., and Gambetta, J., “Challenges and opportunities of near-term quantum computing systems,” arXiv preprint arXiv:1910.02894 (2019).
[9] Adbo, B. et al., “Ibm q 5 yorktown v1.x.x,” (January 2019).
[10] Shor, P. W., “Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer,” SIAM J. Comput. 26, 1484–1509 (Oct. 1997).
[11] Kitaev, A. Y., “Quantum measurements and the abelian stabilizer problem,” arXiv preprint quant-ph/9511026 (1995).
[12] Svore, K. M., Hastings, M. B., and Freedman, M., “Faster phase estimation,” Quantum Info. Comput. 14, 306–328 (Mar. 2014).
[13] van den Berg, E., “Practical sampling schemes for quantum phase estimation,” (2019).
[14] Brassard, G., Hoyer, P., Mosca, M., and Tapp, A., “Quantum amplitude amplification and estimation,” Contemporary Mathematics 305, 53–74 (2002).
[15] Aaronson, S. and Rall, P., “Quantum approximate counting, simplified,” Symposium on Simplicity in Algorithms , 24–32 (Jan 2020).
[16] Suzuki, Y., Uno, S., Raymond, R., Tanaka, T., Onodera, T., and Yamamoto, N., “Amplitude estimation without phase estimation,” Quantum Information Processing 19(2), 75 (2020).
[17] Grinko, D., Gacon, J., Zoufal, C., and Woerner, S., “Iterative quantum amplitude estimation,” arXiv preprint arXiv:1912.05559 (2019).
[18] Grover, L. K., “A fast quantum mechanical algorithm for database search,” Proceedings of the twenty-eighth annual ACM symposium on Theory of computing - STOC ’96 (1996).
[19] Aleksandrowicz, G., Alexander, T., Barkoutsos, P., Bello, L., Ben-Haim, Y., Bucher, D., Cabrera-Hernández, F. J., Carballo-Franquis, J., Chen, A., Chen, C.-F., and et al., “”qiskit: An open-source framework for quantum computing”,” (Jan 2019).