This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Distributed Shor’s algorithm

Ligang Xiao1 Daowen Qiu1,3, Le Luo2,3 Paulo Mateus4 1Institute of Quantum Computing and Computer Theory, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
2School of Physics and Astronomy, Sun Yat-sen University, Zhuhai 519082, China
3QUDOOR Technologies Inc., Guangzhou, China
4Instituto de Telecomunicações, Departamento de Matemática, Instituto Superior Técnico, Av. Rovisco Pais 1049-001 Lisbon, Portugal
Abstract

Shor’s algorithm is one of the most important quantum algorithm proposed by Peter Shor [Proceedings of the 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124–134]. Shor’s algorithm can factor a large integer with certain probability and costs polynomial time in the length of the input integer. The key step of Shor’s algorithm is the order-finding algorithm. Specifically, given an LL-bit integer NN, we first randomly pick an integer aa with gcd(a,N)=1gcd(a,N)=1, the order of aa modulo NN is the smallest positive integer rr such that ar1(modN)a^{r}\equiv 1(\bmod N). The order-finding algorithm in Shor’s algorithm first uses quantum operations to obtain an estimation of sr\dfrac{s}{r} for some s{0,1,,r1}s\in\{0,1,\cdots,r-1\}, then rr is obtained by means of classical algorithms. In this paper, we propose a distributed Shor’s algorithm. The difference between our distributed algorithm and the traditional order-finding algorithm is that we use two quantum computers separately to estimate partial bits of sr\dfrac{s}{r} for some s{0,1,,r1}s\in\{0,1,\cdots,r-1\}. To ensure their measuring results correspond to the same sr\dfrac{s}{r}, we need employ quantum teleportation. We integrate the measuring results via classical post-processing. After that, we get an estimation of sr\dfrac{s}{r} with high precision. Compared with the traditional Shor’s algorithm that uses multiple controlling qubits, our algorithm reduces nearly L2\dfrac{L}{2} qubits and reduces the circuit depth of each computer.

keywords:
Shor’s algorithm , Distributed Shor’s algorithm , Quantum teleportation , Circuit depth  

1 Introduction

Quantum computing has shown great potential in some fields or problems, such as chemical molecular simulation [1], portfolio optimization [15], large number decomposition [16], unordered database search [9] and linear equation solving [10] et al. At present, there have been many useful algorithms in quantum computing [12], but to realize these algorithms requires the power of large-scale general quantum computers. However, it is still very difficult to develop a large-scale general quantum computer, because there are important physical problems in quantum computer that have not been solved. Therefore it is necessary to consider reducing the number of qubits and other computing resources required for quantum algorithms.

Distributed quantum computing is a computing method that solves problems collaboratively through multiple computing nodes. In distributed quantum computing, we can use multiple medium-scale quantum computers to complete a task that was originally completed by a single large-scale quantum computer. Distributed quantum computing not only reduces the number of qubits required, but also sometimes reduces the circuit depth of each computer. This is also important since noise is increased with circuit being deepened . Therefore, distributed quantum computing has been studied significantly (for example, [2, 3, 11, 18]).

Shor’s algorithm proposed by Peter Shor in 1994 [16] is an epoch-making discovery. It can factor a large integer with certain probability and costs time polynomial in the length of the input integer, whereas the time complexity of the best known classical algorithm for factoring large numbers is exponential. Shor’s algorithm can be applied in cracking various cryptosystems, such as RSA cryptography and elliptic curve cryptography. For this reason, Shor’s algorithm has received extensive attention from the community. However, recently some researchers have pointed out that using Shor’s algorithm to crack the commonly used 2048-bit RSA integer requires physical qubits of millions [6]. So it is vital to consider reducing the qubits required in Shor’s algorithm. Many researchers have been working on reducing the number of qubits required for Shor’s algorithm [4, 8, 14], and these results have shown that Shor’s algorithm can be implemented using only one controlling qubit to factor a LL-bit integer together with 2L+c2L+c qubits and circuit depth O(L3)O(L^{3}), where cc is a constant.

In 2004, Yimsiriwattana et al [18] proposed a distributed Shor’s algorithm. In this distributed algorithm, it directly divides the qubits into several parts reasonably, so each part has fewer qubits than the original one. Since all unitary operators can be decomposed into single qubit quantum gates and CNOT gates [13], they only need to consider how to implement CNOT gates acting on different parts, while a CNOT gate acting on different parts can be implemented by means of pre-sharing EPR pairs, local operations and classical communication. Their distributed algorithm needs to communicate O(L2)O(L^{2}) classical bits.

In this paper, we propose a new distributed Shor’s algorithm. In our distributed algorithm, two computers execute sequentially. Each computer estimates several bits of some key intermediate quantity. In order to guarantee the correlation between the two computers’ measuring results to some extent, we employ quantum communication. Furthermore, to obtain high accuracy, we can adjust the measuring result of the first computer in terms of the measuring result of the second computer through classical post-processing. Compared with the traditional Shor’s algorithm that uses multiple controlling qubits, our algorithm reduces the cost of qubits (reduces nearly L2\dfrac{L}{2} qubits) and the circuit depth of each computer. Although each computer in our distributed algorithm requires more qubits than the Shor’s algorithm mentioned above that uses only one controlling qubit, our method of using quantum communication to distribute the phase estimation of Shor’s algorithm may be applicable to other quantum algorithms.

The remainder of the paper is organized as follows. In Section 2, we review quantum teleportation and some quantum algorithms related to Shor’s algorithm. In Section 3, we present a distributed Shor’s algorithm (more specifically, a distributed order-finding algorithm), and prove the correctness of our algorithm. In Section 4, we analyze the performance of our algorithm, including space complexity, time complexity, circuit depth and communication complexity. Finally in Section 5, we conclude with a summary.

2 Preliminaries

In this section, we review the quantum Fourier transform, phase estimation algorithm, order-finding algorithm and others we will use. We assume that the readers are familiar with the liner algebra and basic notations in quantum computing (for the details we can refer to [13]).

2.1 Quantum Fourier transform

Quantum Fourier transform is a unitary operator with the following action on the standard basis states:

QFT|j=12nk=02n1e2πijk/2n|k,QFT|j\rangle=\frac{1}{\sqrt{2^{n}}}\sum_{k=0}^{2^{n}-1}e^{2\pi ijk/2^{n}}|k\rangle\text{,} (1)

for j=0,1,,2n1j=0,1,\cdots,2^{n}-1. Hence the inverse quantum Fourier transform is acted as follows:

QFT112nk=02n1e2πijk/2n|k=|j,QFT^{-1}\frac{1}{\sqrt{2^{n}}}\sum_{k=0}^{2^{n}-1}e^{2\pi ijk/2^{n}}|k\rangle=|j\rangle\text{,} (2)

for j=0,1,,2n1j=0,1,\cdots,2^{n}-1. Quantum Fourier transform and the inverse quantum Fourier transform can be implemented using O(n2)O(n^{2}) single elementary gates (i.e. O(n2)O(n^{2}) single qubit gates and two-qubit gates) [13, 16].

2.2 Phase estimation algorithm

Phase estimation algorithm is an application of the quantum Fourier transform. Let |u|u\rangle be a quantum state and let UU be a unitary operator that satisfies U|u=e2πiωU|u\rangle=e^{2\pi i\omega} for some real number ω[0,1)\omega\in[0,1). Suppose we can create the quantum state |u|u\rangle and implement controlled operation Cm(U)C_{m}(U) such that

Cm(U)|j|u=|jUj|uC_{m}(U)|j\rangle|u\rangle=|j\rangle U^{j}|u\rangle (3)

for any positive integer mm and mm-bit string jj, where the first register is control qubits. Figure 1 shows the implementation of Cm(U)C_{m}(U). Then we can apply phase estimation algorithm to estimate ω\omega. For the sake of convenience, we first define the following notations. In this paper, we treat bit strings and their corresponding binary integers as the same.

Refer to caption
Figure 1: Implementation for Cm(U)C_{m}(U)
Definition 1.

For any real number ω=a1a2al.b1b2\omega=a_{1}a_{2}\cdots a_{l}.b_{1}b_{2}\cdots, where ak1{0,1},k1=1,2,,la_{k_{1}}\in\{0,1\},k_{1}=1,2,\cdots,l and bk2{0,1},k2=1,2,b_{k_{2}}\in\{0,1\},k_{2}=1,2,\cdots, denote |ψt,ω,FBits(ω,i,j),IBits(ω,i,j)|\psi_{t,\omega}\rangle,FBits(\omega,i,j),IBits(\omega,i,j) and dt(x,y)d_{t}(x,y) respectively as follows:

  • 1.

    |ψt,ω|\psi_{t,\omega}\rangle: for any positive integer tt, |ψt,ω=QFT112tj=02t1e2πijω|j|\psi_{t,\omega}\rangle=QFT^{-1}\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}e^{2\pi ij\omega}|j\rangle .

  • 2.

    FBits(ω,i,j)FBits(\omega,i,j): for any integer i,ji,j with 1ij1\leq i\leq j, FBits(ω,i,j)=bibi+1bjFBits(\omega,i,j)=b_{i}b_{i+1}\cdots b_{j}.

  • 3.

    IBits(ω,i,j)IBits(\omega,i,j): for any integer i,ji,j with 1ijl1\leq i\leq j\leq l, IBits(ω,i,j)=aiai+1ajIBits(\omega,i,j)=a_{i}a_{i+1}\cdots a_{j}.

  • 4.

    dt(x,y)d_{t}(x,y): for any two tt-bit strings (or tt-bit binary integers) x,yx,y, define dt(x,y)=min(|xy|,2t|xy|)d_{t}(x,y)=\min(|x-y|,2^{t}-|x-y|).

dt(,)d_{t}(\cdot,\cdot) is a useful distance to estimate the error of the algorithms in our paper and it has the following properties. We specify amodN=(kN+a)modNa\bmod N=(kN+a)\bmod N for any negative integer aa and positive integer NN, where kk is an integer and satisfies kN+a0kN+a\geq 0.

Lemma 1.

Let tt be a positive integer and let x,yx,y be any two tt-bit strings. It holds that:
(I) Let B={b{(2t1),,2t1}:(x+b)mod2t=y}B=\{b\in\{-(2^{t}-1),\cdots,2^{t}-1\}:(x+b)\bmod 2^{t}=y\}. Then dt(x,y)=minbB|b|d_{t}(x,y)=\min_{b\in B}|b|.
(II) dt(,)d_{t}(\cdot,\cdot) is a distance on {0,1}t\{0,1\}^{t}.
(III) Let t0<tt_{0}<t be an positive integer. If dt(x,y)<2tt0d_{t}(x,y)<2^{t-t_{0}}, then

dt0(IBits(x,1,t0),IBits(y,1,t0))1.d_{t_{0}}(IBits(x,1,t_{0}),IBits(y,1,t_{0}))\leq 1. (4)
Proof.

First we prove (I). It is clear for the case of x=yx=y. Without loss of generality, assume x>yx>y. Since xyx\not=y, we have BB contains only 22 elements. Note that

x+(yx)mod2t=y,\displaystyle x+(y-x)\bmod 2^{t}=y, (5)
x+(2t(xy))mod2t=y,\displaystyle x+(2^{t}-(x-y))\bmod 2^{t}=y, (6)
|yx|2t1,\displaystyle|y-x|\leq 2^{t}-1, (7)
|2t(xy)|2t1\displaystyle|2^{t}-(x-y)|\leq 2^{t}-1 (8)

and yx2t(xy)y-x\not=2^{t}-(x-y), we get that yxy-x and 2t(xy)2^{t}-(x-y) are exactly two elements of BB. Hence minbB|b|=min(|xy|,2t|xy|)=dt(x,y)\min_{b\in B}|b|=\min(|x-y|,2^{t}-|x-y|)=d_{t}(x,y). Thus (I) holds.

Then we prove (II)\rm(II). We just need to show that dt(,)d_{t}(\cdot,\cdot) satisfies the triangle inequality, that is, dt(x,y)dt(x,z)+dt(z,y)d_{t}(x,y)\leq d_{t}(x,z)+d_{t}(z,y) holds for any tt-bit string zz. By (I), we know that there exists b1,b2{(2t1),,2t1}b_{1},b_{2}\in\{-(2^{t}-1),\cdots,2^{t}-1\} such that

|b1|=dt(x,z),|b2|=dt(z,y),|b_{1}|=d_{t}(x,z),|b_{2}|=d_{t}(z,y), (9)

and

(x+b1)mod2t=z,(z+b2)mod2t=y.(x+b_{1})\bmod 2^{t}=z,(z+b_{2})\bmod 2^{t}=y. (10)

Hence (x+b1+b2)mod2t=y(x+b_{1}+b_{2})\bmod 2^{t}=y. Then by (I) again, we have

dt(x,y)|b1+b2||b1|+|b2|=dt(x,z)+dt(z,y).d_{t}(x,y)\leq|b_{1}+b_{2}|\leq|b_{1}|+|b_{2}|=d_{t}(x,z)+d_{t}(z,y). (11)

Thus, (II) holds.

Finally we prove (III)\rm(III). By (I) and dt(x,y)<2tt0d_{t}(x,y)<2^{t-t_{0}}, we know that there exists an integer bb with |b|<2tt0|b|<2^{t-t_{0}} such that

((2tt0IBits(x,1,t0)+IBits(x,t0+1,t)+b)mod2t=2tt0IBits(y,1,t0)+IBits(y,t0+1,t).((2^{t-t0}IBits(x,1,t_{0})+IBits(x,t_{0}+1,t)+b)\bmod 2^{t}=2^{t-t0}IBits(y,1,t_{0})+IBits(y,t_{0}+1,t). (12)

Then by (I) again we have

dt(2tt0IBits(x,1,t0),2tt0IBits(y,1,t0))|b+IBits(x,t0+1,t)IBits(y,t0+1,t)|<22tt0.d_{t}(2^{t-t0}IBits(x,1,t_{0}),2^{t-t0}IBits(y,1,t_{0}))\leq|b+IBits(x,t_{0}+1,t)-IBits(y,t_{0}+1,t)|<2\cdot 2^{t-t0}. (13)

Hence

dt0(IBits(x,1,t0),IBits(y,1,t0))<2.d_{t_{0}}(IBits(x,1,t_{0}),IBits(y,1,t_{0}))<2. (14)

Therefore Equation (4) holds. ∎

We can understand dt(,)d_{t}(\cdot,\cdot) in a more intuitive way. We place numbers 0 to 2t2^{t} evenly on a circumference where 0 and 2t2^{t} coincide. Suppose the distance of two adjacent points on the circumference is 11. Then dt(x,y)d_{t}(x,y) can be regarded as the length of the shortest path on the circumference from xx to yy. Next we review the phase estimation algorithm (see Algorithm 1) and its associated results.

Algorithm 1 Phase estimation algorithm

Procedure:

1:  Create initialize state |0t|u|0\rangle^{\otimes t}|u\rangle.
2:  Apply HtH^{\otimes t} to the first register: Ht|0t|u=12tj=02t1|j|uH^{\otimes t}|0\rangle^{\otimes t}|u\rangle=\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}|j\rangle|u\rangle.
3:  Apply Ct(U)C_{t}(U): Ct(U)12tj=02t1|j|u=12tj=02t1|jUj|u=12tj=02t1|je2πijω|uC_{t}(U)\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}|j\rangle|u\rangle=\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}|j\rangle U^{j}|u\rangle=\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}|j\rangle e^{2\pi ij\omega}|u\rangle.
4:  Apply QFT1QFT^{-1}: QFT112tj=02t1e2πijω|j|u=|ψt,ω|uQFT^{-1}\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}e^{2\pi ij\omega}|j\rangle|u\rangle=|\psi_{t,\omega}\rangle|u\rangle.
5:  Measure the first register: obtain a tt-bit string ω~\widetilde{\omega}.

If the fractional part of ω\omega does not exceed tt bits (i.e. 2tω2^{t}\omega is an integer), by observing Equation (2) and the step 4 in Algorithm 1, we can see that ω~\widetilde{\omega} is a perfect estimate of ω\omega (i.e. ω~2t=ω\dfrac{\widetilde{\omega}}{2^{t}}=\omega). However, sometimes ω\omega is not approximated by ω~2t\dfrac{\widetilde{\omega}}{2^{t}} but is approximated by 1ω~2t1-\dfrac{\widetilde{\omega}}{2^{t}}. For example, if the binary representation of ω\omega is ω=0.111\omega=0.11\cdots 1 (sufficiently many 11s), we will obtain the measuring result 00000\cdots 0 with high probability. The purpose of phase estimation algorithm is to find a ω~\widetilde{\omega} such that ω~2t\dfrac{\widetilde{\omega}}{2^{t}} is close to ω\omega or ω1\omega-1. We have the following results.

Proposition 1 (See [13]).

In Algorithm 1, for any ϵ>0\epsilon>0 and any positive integer nn, if t=n+log2(2+12ϵ)t=n+\lceil\log_{2}(2+\dfrac{1}{2\epsilon})\rceil, then the probability of dt(ω~,FBits(ω,1,t))<2tnd_{t}(\widetilde{\omega},FBits(\omega,1,t))<2^{t-n} is at least 1ϵ1-\epsilon.

Lemma 2.

For any tt-bit string ω~\widetilde{\omega} and real number ω[0,1)\omega\in[0,1). If dt(ω~,FBits(ω,1,t))<2tnd_{t}(\widetilde{\omega},FBits(\omega,1,t))<2^{t-n}, then we have |ω~2tω|2n|\dfrac{\widetilde{\omega}}{2^{t}}-\omega|\leq 2^{-n} or 1|ω~2tω|2n1-|\dfrac{\widetilde{\omega}}{2^{t}}-\omega|\leq 2^{-n}, where n<tn<t.

Proof.

Since |2tωFBits(ω,1,t)|<1|2^{t}\omega-FBits(\omega,1,t)|<1, if dt(ω~,FBits(ω,1,t))=|ω~FBits(ω,1,t)|d_{t}(\widetilde{\omega},FBits(\omega,1,t))=|\widetilde{\omega}-FBits(\omega,1,t)|, we have

|ω~2tω||ω~FBits(ω,1,t)|+|FBits(ω,1,t)2tω|2tn,|\widetilde{\omega}-2^{t}\omega|\leq|\widetilde{\omega}-FBits(\omega,1,t)|+|FBits(\omega,1,t)-2^{t}\omega|\leq 2^{t-n}, (15)

and thus |ω~2tω|2n|\dfrac{\widetilde{\omega}}{2^{t}}-\omega|\leq 2^{-n}; if dt(ω~,FBits(ω,1,t))=2t|ω~FBits(ω,1,t)|d_{t}(\widetilde{\omega},FBits(\omega,1,t))=2^{t}-|\widetilde{\omega}-FBits(\omega,1,t)|, we have

2t|ω~2tω|2t(|ω~FBits(ω,1,t)||FBits(ω,1,t)2tω|)2tn,2^{t}-|\widetilde{\omega}-2^{t}\omega|\leq 2^{t}-(|\widetilde{\omega}-FBits(\omega,1,t)|-|FBits(\omega,1,t)-2^{t}\omega|)\leq 2^{t-n}, (16)

and therefore, we have 1|ω~2tω|2n1-|\dfrac{\widetilde{\omega}}{2^{t}}-\omega|\leq 2^{-n}. ∎

That is to say, if ω~2t\dfrac{\widetilde{\omega}}{2^{t}} is an estimate of FBits(ω,1,t)FBits(\omega,1,t) with error less than 2n2^{-n}, then ω~2t\dfrac{\widetilde{\omega}}{2^{t}} is an estimate of ω\omega with error no larger than 2n2^{-n}.

2.3 Order-finding algorithm

Phase estimation algorithm is a key subroutine in order-finding algorithm. Given an LL-bit integer NN and a positive integer aa with gcd(a,N)=1gcd(a,N)=1, the purpose of order-finding algorithm is to find the order rr of aa modulo NN, that is, the least integer rr that satisfies ar1(modN)a^{r}\equiv 1(\bmod\ N). An important unitary operator MaM_{a} in order-finding algorithm is defined as

Ma|x=|axmodN.M_{a}|x\rangle=|ax\ \bmod\ N\rangle\text{.} (17)

Denote

|us=1rk=0r1e2πisrk|akmodN.|u_{s}\rangle=\dfrac{1}{\sqrt{r}}\sum\limits_{k=0}^{r-1}e^{-2\pi i\frac{s}{r}k}|a^{k}\bmod\ N\rangle\text{.} (18)

We have

Ma|us=e2πisr|us,\displaystyle M_{a}|u_{s}\rangle=e^{2\pi i\frac{s}{r}}|u_{s}\rangle, (19)
1rs=0r1|us=|1,\displaystyle\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|u_{s}\rangle=|1\rangle, (20)

and

us|us=δs,s={0if ss,1if s=s.\langle u_{s}|u_{s^{\prime}}\rangle=\delta_{s,s^{\prime}}=\begin{cases}0&\text{if $s\not=s^{\prime}$},\\ 1&\text{if $s=s^{\prime}$}.\end{cases} (21)

So if we expect to apply phase estimation algorithm in finding order, the key is to construct Cm(Ma)C_{m}(M_{a}), that is, for any mm-bit string jj,

Cm(Ma)|j|x=|j|ajxmodN.C_{m}(M_{a})|j\rangle|x\rangle=|j\rangle|a^{j}x\bmod\ N\rangle\text{.} (22)

Algorithm 2 [13] and Figure 2 show the precedure of order-finding algorithm.

Algorithm 2 Order-finding algorithm

Input: Positive integers NN and aa with gcd(N,a)=1gcd(N,a)=1.
Output: The order rr of aa modulo NN.
Procedure:

1:  Create initial state |0t|1|0\rangle^{\otimes t}|1\rangle: t=2L+1+log2(2+12ϵ)t=2L+1+\lceil\log_{2}(2+\dfrac{1}{2\epsilon})\rceil and the second register has LL qubits.
2:  Apply HtH^{\otimes t} to the first register: Ht|0t|1=12tj=02t1|j|1H^{\otimes t}|0\rangle^{\otimes t}|1\rangle=\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}|j\rangle|1\rangle.
3:  Apply Ct(Ma)C_{t}(M_{a}): Ct(Ma)12tj=02t1|j|1=12tj=02t1|jMj(1rs=0r1|us)=1r2ts=0r1j=02t1|je2πijsr|usC_{t}(M_{a})\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}|j\rangle|1\rangle=\dfrac{1}{\sqrt{2^{t}}}\sum\limits_{j=0}^{2^{t}-1}|j\rangle M^{j}(\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|u_{s}\rangle)=\dfrac{1}{\sqrt{r2^{t}}}\sum\limits_{s=0}^{r-1}\sum\limits_{j=0}^{2^{t}-1}|j\rangle e^{2\pi ij\frac{s}{r}}|u_{s}\rangle.
4:  Apply QFT1QFT^{-1}: QFT11r2ts=0r1j=02t1|je2πijsr|us=1rs=0r1|ψt,s/r|usQFT^{-1}\dfrac{1}{\sqrt{r2^{t}}}\sum\limits_{s=0}^{r-1}\sum\limits_{j=0}^{2^{t}-1}|j\rangle e^{2\pi ij\frac{s}{r}}|u_{s}\rangle=\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|\psi_{t,s/r}\rangle|u_{s}\rangle
5:  Measure the first register: obtain a tt-bit string mm that is an estimation of sr\dfrac{s}{r} for some ss.
6:  Apply continued fractions algorithm: obtain rr.
Refer to caption
Figure 2: Circuit for order-finding algorithm

The purpose of steps 1 to 5 in Algorithm 2 is to get a measuring result mm such that mm is an estimation of sr\dfrac{s}{r} for some s{0,1,,r1}s\in\{0,1,\cdots,r-1\} (i.e. |m2tsr|2(2L+1)|\dfrac{m}{2^{t}}-\dfrac{s}{r}|\leq 2^{-(2L+1)}). Let {Pi}\{P_{i}\} be any projective measurement on 2t\mathbb{C}^{2^{t}} and let |ϕs|\phi_{s}\rangle be any tt-qubit quantum state for s=0,1,,r1s=0,1,\cdots,r-1. By Equation (21), we have

(PjI)s=0r1|ϕs|us2=s=0r1(Pj|ϕs)|us2\|(P_{j}\otimes I)\sum_{s=0}^{r-1}|\phi_{s}\rangle|u_{s}\rangle\|^{2}=\sum_{s=0}^{r-1}\|(P_{j}|\phi_{s}\rangle)|u_{s}\rangle\|^{2} (23)

for Pj{Pi}P_{j}\in\{P_{i}\}. Hence by Propositon 1 and Equation (23), we can obtain the following proposition immediately.

Proposition 2 (See [13]).

In Algorithm 2, the probability of dt(m,Bits(sr,t))<2t(2L+1)d_{t}(m,Bits(\dfrac{s}{r},t))<2^{t-(2L+1)} for any fixed s{0,1,,r1}s\in\{0,1,\cdots,r-1\} is at least 1ϵr\dfrac{1-\epsilon}{r}. Thus the probability that there exists an s{0,1,,r1}s\in\{0,1,\cdots,r-1\} such that

dt(m,FBits(sr,1,t))<2t(2L+1)d_{t}(m,FBits(\dfrac{s}{r},1,t))<2^{t-(2L+1)} (24)

is at least 1ϵ1-\epsilon.

Although it is an important part to discuss the probability of obtaining rr correctly from the measuring result by applying continued fractions algorithm, the details are omitted here and we focus on considering whether the measuring result is an estimation of sr\dfrac{s}{r} for some ss, since this is exactly the goal of the quantum part in the order-finding algorithm.

2.4 Quantum teleportation

Quantum teleportation is an important means to realize quantum communication [5, 7]. Quantum teleportation is effectively equivalent to physically teleporting qubits, but in fact, the realization of quantum teleportation only requires classical communication and both parties to share an EPR pair in advance. The following result is useful.

Theorem 1 ([5]).

When Alice and Bob share LL pairs of EPR pairs, they can simulate transmitting LL qubits by communicating 2L2L classical bits.

3 Distributed order-finding algorithm

In [11], a distributed phase estimation algorithm was proposed, but the method in [11] can not guarantee the precision of the result. However, their ideas deserve further consideration. In this section, by combining with quantum teleportation, we proposed a distributed order-finding algorithm and prove the correctness of our algorithm.

Without loss of generality, assume that L=log2(N)L=\lceil\log_{2}(N)\rceil is even. The idea of our distributed order-finding algorithm is as follows. We need two quantum computers (named AA and BB). We first apply order-finding algorithm in computer AA and obtain an estimation of the first L2+1\dfrac{L}{2}+1 bits of sr\dfrac{s}{r} for some s{0,1,,r1}s\in\{0,1,\cdots,r-1\}, and similarly obtain an estimation of the (L2+2)(\dfrac{L}{2}+2)th bit to (2L+1)(2L+1)th bit of sr\dfrac{s}{r} in computer BB. We can realize this by using Ct(Ma2l)C_{t}(M_{a}^{2^{l}}), since Ma2l|us=e2πi(2lsr)|usM_{a}^{2^{l}}|u_{s}\rangle=e^{2\pi i(2^{l}\frac{s}{r})}|u_{s}\rangle and the fractional part of 2lsr2^{l}\dfrac{s}{r} starts at the (l+1)(l+1)th bit of he fractional part of sr\dfrac{s}{r}. Moreover, since Ma2l=Ma2lmodNM_{a}^{2^{l}}=M_{a^{2^{l}}{\rm mod}\ N} and we can calculate a2lmodNa^{2^{l}}\bmod\ N classically with time complexity O(l)O(l), we can construct Ct(Ma2l)C_{t}(M_{a}^{2^{l}}) with the same way as Ct(Ma)C_{t}(M_{a}). In addition, to guarantee the measuring results of AA and BB corresponding to the same sr\dfrac{s}{r}, we need quantum teleportation.

However, in order to maintain high precision, computer BB actually estimates the L2\dfrac{L}{2}th bit to (2L+1)(2L+1)th bit, where the estimation of the L2\dfrac{L}{2}th bit and the (L2+1)(\dfrac{L}{2}+1)th bit is used to “correct” the measuring result of AA. This “correction” operation is handed over to a classical subroutine named CorrectResultsCorrectResults. Our distributed order-finding algorithm is shown in Algorithm 3 and Figure 3, and the subroutine CorrectResultsCorrectResults is shown in Algorithm 4

Refer to caption
Figure 3: Circuit for distributed order finding algorithm
Algorithm 3 Distributed order-finding algorithm

Input: Positive integers NN and aa with gcd(N,a)=1gcd(N,a)=1.
Output: The order rr of aa modulo NN.
Procedure:

1:  Computer AA creates initial state |0A|1C|0\rangle_{A}|1\rangle_{C}. Computer BB creates initial state |0B|0\rangle_{B}:  Here registers AA, BB and CC are t1t_{1}-qubit, t2t_{2}-qubit and LL-qubit, respectively. We take t1=L2+1+pt_{1}=\dfrac{L}{2}+1+p and t2=3L2+2+pt_{2}=\dfrac{3L}{2}+2+p, where p=log2(2+12ϵp=\lceil\log_{2}(2+\dfrac{1}{2\epsilon^{\prime}}\rceil and ϵ=ϵ2\epsilon^{\prime}=\dfrac{\epsilon}{2}.Computer AA:
2:  Apply Ht1H^{\otimes t_{1}} to register AA:  1rs=0r1(Ht1|0|us)|0\rightarrow\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}(H^{\otimes t_{1}}|0\rangle|u_{s}\rangle)|0\rangle.
3:  Apply Ct1(Ma)C_{t_{1}}(M_{a}) to registers AA and CC:  1rs=0r1(12t1j=02t11e2πijsr|j|us)|0\rightarrow\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}(\dfrac{1}{\sqrt{2^{t_{1}}}}\sum\limits_{j=0}^{2^{t_{1}}-1}e^{2\pi ij\frac{s}{r}}|j\rangle|u_{s}\rangle)|0\rangle.
4:  Apply QFT1QFT^{-1}to register AA:  1rs=0r1|ψt1,s/r|us|0\rightarrow\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|\psi_{t_{1},s/r}\rangle|u_{s}\rangle|0\rangle
5:  Teleport the qubits of register CC to computer BB:  1rs=0r1|ψt1,s/r|0|us\rightarrow\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|\psi_{t_{1},s/r}\rangle|0\rangle|u_{s}\rangle Computer BB:
6:  Apply Ht2H^{\otimes t_{2}} to register BB:  1rs=0r1|ψt1,s/rHt2|0|us\rightarrow\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|\psi_{t_{1},s/r}\rangle H^{\otimes t_{2}}|0\rangle|u_{s}\rangle
7:  Apply Ct2(Ma2L21)C_{t_{2}}(M_{a}^{2^{\frac{L}{2}-1}}) to registers BB and CC:  1rs=0r1|ψt1,s/r(12t2j=02t21e2πij(2L21sr)|j)|us\rightarrow\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|\psi_{t_{1},s/r}\rangle(\dfrac{1}{\sqrt{2^{t_{2}}}}\sum\limits_{j=0}^{2^{t_{2}}-1}e^{2\pi ij(2^{\frac{L}{2}-1}\frac{s}{r})}|j\rangle)|u_{s}\rangle
8:  Apply QFT1QFT^{-1}to register BB:  |ϕfinal=1rs=0r1|ψt1,s/r|ψt2,2L21s/r|us\rightarrow|\phi_{final}\rangle=\dfrac{1}{\sqrt{r}}\sum\limits_{s=0}^{r-1}|\psi_{t_{1},s/r}\rangle|\psi_{t_{2},2^{\frac{L}{2}-1}s/r}\rangle|u_{s}\rangle
9:  Computer AA measures register AA and computer BB measures register BB: AA obtains a t1t_{1}-bit string m1m_{1} and BB obtains a t2t_{2}-bit string m2m_{2}.
10:  mCorrectResults(m1,m2)m\leftarrow CorrectResults(m_{1},m_{2}): mm is a (2L+1+p)(2L+1+p)-bit string.
11:  Apply continued fractions algorithm: obtain rr.
Algorithm 4 CorrectResults subroutine

Input: Two measuring results: t1t_{1}-bit string m1m_{1} and t2t_{2}-bit string m2m_{2}.
Output: An estimation mm such that |m2(2L+1+p)sr|2(2L+1)|\dfrac{m}{2^{(2L+1+p)}}-\dfrac{s}{r}|\leq 2^{-(2L+1)} for some s{0,1,,r1}s\in\{0,1,\cdots,r-1\}.
Procedure:

1:  Choose CorrectionBit{1,0,1}CorrectionBit\in\{-1,0,1\} such that (IBits(m1,L2,L2+1)+CorrectionBit)mod22=IBits(m2,1,2)(IBits(m_{1},\dfrac{L}{2},\dfrac{L}{2}+1)+CorrectionBit)\bmod 2^{2}=IBits(m_{2},1,2).
2:  mprefix(IBits(m1,1,L2+1)+CorrectionBit)mod2L2+1m_{prefix}\leftarrow(IBits(m_{1},1,\dfrac{L}{2}+1)+CorrectionBit)\bmod 2^{\frac{L}{2}+1}
3:  mmprefixIBits(m2,3,t2)m\leftarrow m_{prefix}\circ IBits(m_{2},3,t_{2}) (“\circ” represents catenation)
4:  return mm
Remark 1.

Although Algorithm 3 is a serial algorithm, the two computers can also execute in parallel to some extent. For example, execute the algorithm in the following order: 1, (2, 6), 3, 5, 7, (4, 8), 9, 10, 11, where ii represents the iith step in Algorithm 3, and (i,j)(i,j) means that the iith and jjth steps are executed in parallel.

Next we prove the correctness of our algorithm, that is, we can obtain the output mm such that |m2(2L+1+p)sr|2(2L+1)|\dfrac{m}{2^{(2L+1+p)}}-\dfrac{s}{r}|\leq 2^{-(2L+1)} holds for some s{0,1,,r1}s\in\{0,1,\cdots,r-1\} with high probability. Let r,L,t1,t2,p,m1,m2,mprefix,m,ϵ,|ϕfinalr,L,t_{1},t_{2},p,m_{1},m_{2},m_{prefix},m,\epsilon^{\prime},|\phi_{final}\rangle be the same as those in Algorithm 3 and Algorithm 4. We first prove that if m1m_{1} and m2m_{2} are both estimations of some bits of s0r\dfrac{s_{0}}{r} with s0r=0.a1a2aL2+1\dfrac{s_{0}}{r}=0.a_{1}a_{2}\cdots a_{\frac{L}{2}+1}, then the output mm is perfect (i.e. m=a1a2aL2aL2+100m=a_{1}a_{2}\cdots a_{\frac{L}{2}}a_{\frac{L}{2}+1}0\cdots 0), and the probability of this case is not less than 1r\dfrac{1}{r}.

Proposition 3.

Let s0{0,1,,r1}s_{0}\in\{0,1,\cdots,r-1\} satisfy that 2L2+1s0r2^{\frac{L}{2}+1}\cdot\dfrac{s_{0}}{r} is an integer, that is, s0r=0.a1a2aL2+1\dfrac{s_{0}}{r}=0.a_{1}a_{2}\cdots a_{\frac{L}{2}+1} where ai{0,1}a_{i}\in\{0,1\}, i=1,2,,L2+1i=1,2,\cdots,\dfrac{L}{2}+1. Then in Algorithm 3, it holds that

Prob(m=a1a2aL2+100)1r.{\rm Prob}(m=a_{1}a_{2}\cdots a_{\frac{L}{2}+1}0\cdots 0)\geq\dfrac{1}{r}. (25)
Proof.

Since the fractional part of s0r\dfrac{s_{0}}{r} is at most (L2+1)(\dfrac{L}{2}+1)-bit, in Algorithm 3, we have

|ψt1,s0/r=|a1a2aL2+100|\psi_{t_{1},s_{0}/r}\rangle=|a_{1}a_{2}\cdots a_{\frac{L}{2}+1}0\cdots 0\rangle (26)

and

|ψt2,2L21s0/r=|aL2aL2+100.|\psi_{t_{2},2^{\frac{L}{2}-1}s_{0}/r}\rangle=|a_{\frac{L}{2}}a_{\frac{L}{2}+1}0\cdots 0\rangle. (27)

Let x=a1a2aL2aL2+100x=a_{1}a_{2}\cdots a_{\frac{L}{2}}a_{\frac{L}{2}+1}0\cdots 0 and y=aL2aL2+100y=a_{\frac{L}{2}}a_{\frac{L}{2}+1}0\cdots 0. By Equation (23), we have

Prob(m1=xandm2=y)\displaystyle{\rm Prob}(m_{1}=x\ \text{and}\ m_{2}=y) =|xx||yy|I|ϕfinal2\displaystyle=\||x\rangle\langle x|\otimes|y\rangle\langle y|\otimes I\ |\phi_{final}\rangle\|^{2} (28)
|xx||yy|I1r|ψt1,s0/r|ψt2,2L21s0/r|us02\displaystyle\geq\||x\rangle\langle x|\otimes|y\rangle\langle y|\otimes I\ \dfrac{1}{\sqrt{r}}|\psi_{t_{1},s_{0}/r}\rangle|\psi_{t_{2},2^{\frac{L}{2}-1}s_{0}/r}\rangle|u_{s_{0}}\rangle\|^{2} (29)
=1r.\displaystyle=\dfrac{1}{r}. (30)

Since CorrectResults(x,y)=a1a2aL2aL2+100CorrectResults(x,y)=a_{1}a_{2}\cdots a_{\frac{L}{2}}a_{\frac{L}{2}+1}0\cdots 0, the lemma holds. ∎

Then we prove that if m2m_{2} is an estimation of the L2\dfrac{L}{2}th to (2L+1)(2L+1)th bit of s0r\dfrac{s_{0}}{r}, we can get IBits(m2,1,2)=FBits(s0r,L2,L2+1)IBits(m_{2},1,2)=FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},\dfrac{L}{2}+1).

Lemma 3.

Let s0{0,1,,r1}s_{0}\in\{0,1,\cdots,r-1\} satisfy that 2L2+1s0r2^{\frac{L}{2}+1}\cdot\dfrac{s_{0}}{r} is not an integer and let m2m_{2} satisfy

dt2(m2,FBits(s0r,L2,2L+1+p))<2p.d_{t_{2}}(m_{2},FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},2L+1+p))<2^{p}. (31)

Then IBits(m2,1,2)=FBits(s0r,L2,L2+1)IBits(m_{2},1,2)=FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},\dfrac{L}{2}+1).

Proof.

Since 2L2+1s0r2^{\frac{L}{2}+1}\cdot\dfrac{s_{0}}{r} is not an integer, we have

2L<1r2L2+1s0modrrr1r<12L.2^{-L}<\dfrac{1}{r}\leq\dfrac{2^{\frac{L}{2}+1}s_{0}\bmod r}{r}\leq\dfrac{r-1}{r}<1-2^{-L}. (32)

So we get FBits(s0r,L2+2,3L2+1)FBits(\dfrac{s_{0}}{r},\dfrac{L}{2}+2,\dfrac{3L}{2}+1) is not 00000\cdots 0 or 11111\cdots 1. Hence, FBits(s0r,L2+2,2L+1)FBits(\dfrac{s_{0}}{r},\dfrac{L}{2}+2,2L+1) is not 00000\cdots 0 or 11111\cdots 1. That is to say, if we add or subtract 1 to FBits(s0r,L2,2L+1)FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},2L+1), its first two bits are not changed. Thus by Equation (31), we have

IBits(m2,1,2)=FBits(s0r,L2,L2+1).IBits(m_{2},1,2)=FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},\dfrac{L}{2}+1). (33)

Therefore the lemma holds. ∎

If IBits(m2,1,2)=FBits(s0r,L2,L2+1)IBits(m_{2},1,2)=FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},\dfrac{L}{2}+1), that is, the first two bits of m2m_{2} are correct, then we can use these two bits of m2m_{2} to “correct” m1m_{1}. The following lemma can be used to show the correctness of Algorithm 4.

Lemma 4.

Let t>2t>2 be a positive integer and let x,yx,y be two tt-bit strings with dt(x,y)1d_{t}(x,y)\leq 1. Then there only exists one element b0b_{0} in {1,0,1}\{-1,0,1\} such that (x+b0)mod2t=y(x+b_{0})\bmod 2^{t}=y, and for any b{1,0,1}b\in\{-1,0,1\}, (x+b)mod2t=y(x+b)\bmod 2^{t}=y if and only if (IBits(x,t1,t)+b)mod22=IBits(y,t1,t)(IBits(x,t-1,t)+b)\bmod 2^{2}=IBits(y,t-1,t).

Proof.

By Lemma 1, we know that there exists such a b0b_{0}. It is clear that such a b0b_{0} is unique. Next we prove that for any b{1,0,1}b\in\{-1,0,1\}, (x+b)mod2t=y(x+b)\bmod 2^{t}=y if and only if (IBits(x,t1,t)+b)mod22=IBits(y,t1,t)(IBits(x,t-1,t)+b)\bmod 2^{2}=IBits(y,t-1,t). For any b{1,0,1}b\in\{-1,0,1\}, suppose (x+b)mod2t=y(x+b)\bmod 2^{t}=y, then we have

(x+b)mod22=ymod22.(x+b)\bmod 2^{2}=y\bmod 2^{2}. (34)

That is,

(IBits(x,t1,t)+b)mod22=IBits(y,t1,t).(IBits(x,t-1,t)+b)\bmod 2^{2}=IBits(y,t-1,t). (35)

On the other hand, for any b{1,0,1}b\in\{-1,0,1\}, suppose (IBits(x,t1,t)+b)mod22=IBits(y,t1,t)(IBits(x,t-1,t)+b)\bmod 2^{2}=IBits(y,t-1,t). Since there only exists one elements b1b_{1} in {1,0,1}\{-1,0,1\} such that (IBits(x,t1,t)+b1)mod22=IBits(y,t1,t)(IBits(x,t-1,t)+b_{1})\bmod 2^{2}=IBits(y,t-1,t), bb is equal to b0b_{0}, that is, bb satisfies (x+b)mod2t=y(x+b)\bmod 2^{t}=y. Consequently, the lemma holds. ∎

We can inspect Lemma 4 from another aspect. If dL2+1(m1,FBits(s0r,1,L2+1))1d_{\frac{L}{2}+1}(m_{1},FBits(\dfrac{s_{0}}{r},1,\dfrac{L}{2}+1))\leq 1 and IBits(m2,1,2)=FBits(s0r,L2,L2+1)IBits(m_{2},1,2)=FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},\dfrac{L}{2}+1) hold for some s0s_{0}, then the CorrectionBitCorrectionBit in Algorithm 4 exists, and mprefix=FBits(s0r,1,L2+1)m_{prefix}=FBits(\dfrac{s_{0}}{r},1,\dfrac{L}{2}+1) holds as well.

Proposition 4.

Let m2m_{2} satisfy dt2(m2,FBits(s0r,L2,2L+1+p))<2pd_{t_{2}}(m_{2},FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},2L+1+p))<2^{p} for some s0{0,1,,r1}s_{0}\in\{0,1,\cdots,r-1\} with 2L2+1s0r2^{\frac{L}{2}+1}\cdot\dfrac{s_{0}}{r} being not an integer. Suppose dt1(m1,FBits(s0r,1,t1))<2pd_{t_{1}}(m_{1},FBits(\dfrac{s_{0}}{r},1,t_{1}))<2^{p}. Then |m22L+1+ps0r|2(2L+1)|\dfrac{m}{2^{2L+1+p}}-\dfrac{s_{0}}{r}|\leq 2^{-(2L+1)}.

Proof.

Since dt2(m2,FBits(s0r,L2,2L+1+p))<2pd_{t_{2}}(m_{2},FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},2L+1+p))<2^{p} and 2L2+1s0r2^{\frac{L}{2}+1}\cdot\dfrac{s_{0}}{r} is not an integer, by Lemma 3, we have

IBits(m2,1,2)=FBits(s0r,L2,L2+1).IBits(m_{2},1,2)=FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},\dfrac{L}{2}+1). (36)

Since dt1(m1,FBits(s0r,1,t1))<2pd_{t_{1}}(m_{1},FBits(\dfrac{s_{0}}{r},1,t_{1}))<2^{p}, by Lemma 1, we have

dL2+1(IBits(m1,1,L2+1),FBits(s0r,1,L2+1))1.d_{\frac{L}{2}+1}(IBits(m_{1},1,\dfrac{L}{2}+1),FBits(\dfrac{s_{0}}{r},1,\dfrac{L}{2}+1))\leq 1. (37)

As a result, in Algorithm 4, the CorrectionBitCorrectionBit exists. By Equation (36), Lemma 4, and the steps 1 to 2 in Algorithm 4, we get

IBits(m,1,L2+1)=mprefix=FBits(s0r,1,L2+1).IBits(m,1,\dfrac{L}{2}+1)=m_{prefix}=FBits(\dfrac{s_{0}}{r},1,\dfrac{L}{2}+1). (38)

Since m=mprefixIBits(m2,3,t2)m=m_{prefix}\circ IBits(m_{2},3,t_{2}), by Equation (36) and Equation (38), we get

d2L+1+p(m,FBits(s0r,1,2L+1+p))=d3L2+2+p(m2,FBits(s0r,L2,2L+1+p)<2p.d_{2L+1+p}(m,FBits(\dfrac{s_{0}}{r},1,2L+1+p))=d_{\frac{3L}{2}+2+p}(m_{2},FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},2L+1+p)<2^{p}. (39)

Since s0r\dfrac{s_{0}}{r} is not an integer, similar to Equation (32), we know that FBits(s0r,1,2L+1)FBits(\dfrac{s_{0}}{r},1,2L+1) is not 00000\cdots 0 or 11111\cdots 1. Then by Equation (39), we get d2L+1+p(m,FBits(s0r,1,2L+1+p))=|mFBits(s0r,1,2L+1+p)|d_{2L+1+p}(m,FBits(\dfrac{s_{0}}{r},1,2L+1+p))=|m-FBits(\dfrac{s_{0}}{r},1,2L+1+p)|. Therefore, by Equation (39) and Lemma 2, we obtain

|m22L+1+ps0r|2(2L+1).|\dfrac{m}{2^{2L+1+p}}-\dfrac{s_{0}}{r}|\leq 2^{-(2L+1)}. (40)

Theorem 2.

In Algorithm 3, for any fixed s0{0,1,,r1}s_{0}\in\{0,1,\cdots,r-1\}, the probability of |m22L+1+ps0r|2(2L+1)|\dfrac{m}{2^{2L+1+p}}-\dfrac{s_{0}}{r}|\leq 2^{-(2L+1)} is at least 1ϵr\dfrac{1-\epsilon}{r}. The probability that there exists an s{0,1,,r1}s\in\{0,1,\cdots,r-1\} such that |m22L+1+ps0r|2(2L+1)|\dfrac{m}{2^{2L+1+p}}-\dfrac{s_{0}}{r}|\leq 2^{-(2L+1)} is at least 1ϵ1-\epsilon.

Proof.

By Proposition 3, for any fixed s0{0,1,,r1}s_{0}\in\{0,1,\cdots,r-1\} with 2L2+1s0r2^{\frac{L}{2}+1}\cdot\dfrac{s_{0}}{r} being an integer, we have

Prob((m22L+1+p=s0r)1r.{\rm Prob}((\dfrac{m}{2^{2L+1+p}}=\dfrac{s_{0}}{r})\geq\dfrac{1}{r}. (41)

For any fixed s0{0,1,,r1}s_{0}\in\{0,1,\cdots,r-1\} with 2L2+1s0r2^{\frac{L}{2}+1}\cdot\dfrac{s_{0}}{r} being not an integer, by Proposition 1 and Equation (23), we get that the probabilty of dt2(m2,FBits(s0r,L2,2L+1+p))<2pd_{t_{2}}(m_{2},FBits(\dfrac{s_{0}}{r},\dfrac{L}{2},2L+1+p))<2^{p} and dt1(m1,FBits(s0r,1,t1))<2pd_{t_{1}}(m_{1},FBits(\dfrac{s_{0}}{r},1,t_{1}))<2^{p} is at least 1r(1ϵ)2=1r(1ϵ2)2>1ϵr\dfrac{1}{r}(1-\epsilon^{\prime})^{2}=\dfrac{1}{r}(1-\dfrac{\epsilon}{2})^{2}>\dfrac{1-\epsilon}{r}. Consequently, by Proposition 4, we obtain

Prob(|m22L+1+ps0r|2(2L+1))>1ϵr.{\rm Prob}(|\dfrac{m}{2^{2L+1+p}}-\dfrac{s_{0}}{r}|\leq 2^{-(2L+1)})>\dfrac{1-\epsilon}{r}. (42)

Finally, the theorem has been proved. ∎

4 Complexity analysis

The complexity of the circuit of (distributed) order-finding algorithm depends on the construction of Ct(Ma)C_{t}(M_{a}). There are two kinds of implementation of Ct(Ma)C_{t}(M_{a}) proposed by Shor [17]. The first method (denoted as method (I)) needs time complexity O(L3)O(L^{3}) and space complexity O(L)O(L), and the second method (denoted as method (II) ) needs time complexity O(L2logLloglogL)O(L^{2}\log L\log\log L) and space complexity O(LlogLloglogL)O(L\log L\log\log L). In this section, we compare our distributed order-finding algorithm with the traditional order-finding algorithm. For a more concrete comparison, we consider that Ct(Ma)C_{t}(M_{a}) is implemented by method (I). There is a concrete implementation of order-finding algorithm by using method (I) in [18]. However, the advantages of our distributed order-finding algorithm in space and circuit depth are independent of whether method (I) or method (II) is used.

Space complexity The implementation of the operator Ct(Ma)C_{t}(M_{a}) in method (I) needs t+Lt+L qubits plus bb auxiliary qubits for any positive integer aa, where bb is O(L)O(L). By Theorem 1, to teleport LL qubits, computers AA and BB need to share LL pairs of EPR states and communicate with 2L2L classical bits. As a result, AA needs 5L2+1+log2(2+1ϵ)+b\dfrac{5L}{2}+1+\lceil\log_{2}(2+\dfrac{1}{\epsilon}\rceil)+b qubits and BB needs 5L2+2+log2(2+1ϵ)+b\dfrac{5L}{2}+2+\lceil\log_{2}(2+\dfrac{1}{\epsilon})\rceil+b qubits. As a comparison, order-finding algorithm needs 3L+1+log2(2+12ϵ)+b3L+1+\lceil\log_{2}(2+\dfrac{1}{2\epsilon})\rceil+b qubits. So, our distributed order-finding algorithm can reduce nearly L2\dfrac{L}{2} qubits.

Time complexity. The operator Ct(Ma)C_{t}(M_{a}) can be implemented by means of O(tL2)O(tL^{2}) elementary gates in method (I). Hence the gate complexity (or time complexity) in both our distributed order-finding algorithm and order-finding algorithm is O(L3)O(L^{3}).

Circuit depth. By Figure 1, we know that the circuit depth of Ct(Ma)C_{t}(M_{a}) depends on the circuit depth of controlled-Ma2x(x=0,1,,t1)M_{a}^{2^{x}}(x=0,1,\cdots,t-1) and tt. The circuit depth of controlled-Ma2xM_{a}^{2^{x}} is O(L2)O(L^{2}) in method (I). By observing the value “tt” in order-finding algorithm and our distributed order-finding algorithm, we clearly get that the circuit depth of each computer in our distributed order-finding algorithm is less than the traditional order-finding algorithm, even though both are O(L3)O(L^{3}).

Communication complexity. In our distributed Shor’s algorithm, we need to teleport LL qubits. Therefore, the communication complexity of our distributed Shor’s algorithm is O(L)O(L). As a comparison, the communication complexity of the distributed order-finding algorithm proposed in [18] is O(L2)O(L^{2}).

5 Conclusions

In this paper, we have proposed a new distributed Shor’s algorithm. More specifically, we have proposed a new distributed order-finding algorithm. In this distributed quantum algorithm, two computers work sequentially via quantum teleportation. Each of them can obtain an estimation of partial bits of sr\dfrac{s}{r} for some s{0,1,,r1}s\in\{0,1,\cdots,r-1\} with high probability. It is worth mentioning that they can also be executed in parallel to some extent. We have shown that our distributed algorithm has advantages over the traditional order-finding algorithm in space and circuit depth. Our distributed order-finding algorithm can reduce nearly L2\dfrac{L}{2} qubits and reduce the circuit depth to some extent for each computer. However, unlike parallel execution, the way of serial execution that has been used in our algorithm leads to noise in both computers.

We have proved the correctness of this distributed algorithm on two computers, a natural problem is whether or not this method can be generalized to multiple computers or to other quantum algorithms. We would further consider the problem in subsequent study.

6 Acknowledgements

This work is partly supported by the National Natural Science Foundation of China (Nos. 61572532, 61876195) and the Natural Science Foundation of Guangdong Province of China (No. 2017B030311011).

References

  • [1] A. Aspuru-Guzik, A.D. Dutoi, P. J. Love, M. Head-Gordon, Simulated quantum computation of molecular energies, Science, 309 (5741) (2005) 1704–1707.
  • [2] J. Avron, O. Casper, I. Rozen, Quantum advantage and noise reduction in distributed quantum computing, Physical Review A, 104 (5) (2021) 052404.
  • [3] R. Beals, S. Brierley, O. Gray, A. W. Harrow, S. Kutin, N. Linden, D. Shepherd, M. Stather, Efficient distributed quantum computing, Proceedings of the Royal Society A Mathematical Physical and Engineering Science, 469 (2153) (2013) 20120686.
  • [4] S. Beauregard, Circuit for Shor’s algorithm using 2n+32n+3 qubits, Quantum Information and Computation, 3 (2) (2003) 175–185.
  • [5] C. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, and W. K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels, Physiscal Review Letters, 70 (13) (1993) 1895–1899.
  • [6] C. Gidney, M. Ekera, How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits, Quantum, 5 (2021) 433.
  • [7] N. Gisin, R. Thew, Quantum communication, Nature Photonics, 1 (3) (2007) 165–171.
  • [8] T. Haner, M. Roetteler, K. M. Svore, Factoring using 2n+22n+2 qubits with Toffoli based modular multiplication, Quantum Information and Computation, 17 (7-8) (2017) 673–684.
  • [9] L. K. Grover, A fast quantum mechanical algorithm for database search, in: Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, 1996, pp. 212–219.
  • [10] A. W. Harrow, A. Hassidim, S. Lloyd, Quantum algorithm for linear systems of equations, Physical Review Letters, 103 (15) (2009) 150502.
  • [11] K. Li, D. Qiu, L. Li, S. Zheng, Z. Rong, Application of distributed semi-quantum computing model in phase estimation, Information Processing Letters 120 (2017) 23–29.
  • [12] A. Montanaro, Quantum algorithms: an overview, npj Quantum Information, 2 (2016) 15023.
  • [13] M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, 2000.
  • [14] S. Parker, M. B. Plenio, Efficient factorization with a single pure qubit and logN\log N mixed qubits, Physical Review Letters, 85 (14) (2000) 3049–3052.
  • [15] G. Rosenberg, P. Haghnegahdar, P. Goddard, P. Carr, K. Wu, M. L. De Prado, Solving the optimal trading trajectory problem using a quantum annealer, IEEE Journal of Selected Topics in Signal Processing, 10 (6) (2016) 1053–1060.
  • [16] P.W. Shor, Algorithms for quantum computation: discrete logarithms and factoring, in: Proceedings of the 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124–134.
  • [17] P.W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, Siam Review 41 (2) (1999) 303–332.
  • [18] A. Yimsiriwattana, S.J. Lomonaco, Distributed quantum computing: a distributed Shor algorithm, Quantum Information and Computation II, 5436 (2004) 360–372.