Distributed Shor’s algorithm
Abstract
Shor’s algorithm is one of the most important quantum algorithm proposed by Peter Shor [Proceedings of the 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124–134]. Shor’s algorithm can factor a large integer with certain probability and costs polynomial time in the length of the input integer. The key step of Shor’s algorithm is the order-finding algorithm. Specifically, given an -bit integer , we first randomly pick an integer with , the order of modulo is the smallest positive integer such that . The order-finding algorithm in Shor’s algorithm first uses quantum operations to obtain an estimation of for some , then is obtained by means of classical algorithms. In this paper, we propose a distributed Shor’s algorithm. The difference between our distributed algorithm and the traditional order-finding algorithm is that we use two quantum computers separately to estimate partial bits of for some . To ensure their measuring results correspond to the same , we need employ quantum teleportation. We integrate the measuring results via classical post-processing. After that, we get an estimation of with high precision. Compared with the traditional Shor’s algorithm that uses multiple controlling qubits, our algorithm reduces nearly qubits and reduces the circuit depth of each computer.
keywords:
Shor’s algorithm , Distributed Shor’s algorithm , Quantum teleportation , Circuit depth1 Introduction
Quantum computing has shown great potential in some fields or problems, such as chemical molecular simulation [1], portfolio optimization [15], large number decomposition [16], unordered database search [9] and linear equation solving [10] et al. At present, there have been many useful algorithms in quantum computing [12], but to realize these algorithms requires the power of large-scale general quantum computers. However, it is still very difficult to develop a large-scale general quantum computer, because there are important physical problems in quantum computer that have not been solved. Therefore it is necessary to consider reducing the number of qubits and other computing resources required for quantum algorithms.
Distributed quantum computing is a computing method that solves problems collaboratively through multiple computing nodes. In distributed quantum computing, we can use multiple medium-scale quantum computers to complete a task that was originally completed by a single large-scale quantum computer. Distributed quantum computing not only reduces the number of qubits required, but also sometimes reduces the circuit depth of each computer. This is also important since noise is increased with circuit being deepened . Therefore, distributed quantum computing has been studied significantly (for example, [2, 3, 11, 18]).
Shor’s algorithm proposed by Peter Shor in 1994 [16] is an epoch-making discovery. It can factor a large integer with certain probability and costs time polynomial in the length of the input integer, whereas the time complexity of the best known classical algorithm for factoring large numbers is exponential. Shor’s algorithm can be applied in cracking various cryptosystems, such as RSA cryptography and elliptic curve cryptography. For this reason, Shor’s algorithm has received extensive attention from the community. However, recently some researchers have pointed out that using Shor’s algorithm to crack the commonly used 2048-bit RSA integer requires physical qubits of millions [6]. So it is vital to consider reducing the qubits required in Shor’s algorithm. Many researchers have been working on reducing the number of qubits required for Shor’s algorithm [4, 8, 14], and these results have shown that Shor’s algorithm can be implemented using only one controlling qubit to factor a -bit integer together with qubits and circuit depth , where is a constant.
In 2004, Yimsiriwattana et al [18] proposed a distributed Shor’s algorithm. In this distributed algorithm, it directly divides the qubits into several parts reasonably, so each part has fewer qubits than the original one. Since all unitary operators can be decomposed into single qubit quantum gates and CNOT gates [13], they only need to consider how to implement CNOT gates acting on different parts, while a CNOT gate acting on different parts can be implemented by means of pre-sharing EPR pairs, local operations and classical communication. Their distributed algorithm needs to communicate classical bits.
In this paper, we propose a new distributed Shor’s algorithm. In our distributed algorithm, two computers execute sequentially. Each computer estimates several bits of some key intermediate quantity. In order to guarantee the correlation between the two computers’ measuring results to some extent, we employ quantum communication. Furthermore, to obtain high accuracy, we can adjust the measuring result of the first computer in terms of the measuring result of the second computer through classical post-processing. Compared with the traditional Shor’s algorithm that uses multiple controlling qubits, our algorithm reduces the cost of qubits (reduces nearly qubits) and the circuit depth of each computer. Although each computer in our distributed algorithm requires more qubits than the Shor’s algorithm mentioned above that uses only one controlling qubit, our method of using quantum communication to distribute the phase estimation of Shor’s algorithm may be applicable to other quantum algorithms.
The remainder of the paper is organized as follows. In Section 2, we review quantum teleportation and some quantum algorithms related to Shor’s algorithm. In Section 3, we present a distributed Shor’s algorithm (more specifically, a distributed order-finding algorithm), and prove the correctness of our algorithm. In Section 4, we analyze the performance of our algorithm, including space complexity, time complexity, circuit depth and communication complexity. Finally in Section 5, we conclude with a summary.
2 Preliminaries
In this section, we review the quantum Fourier transform, phase estimation algorithm, order-finding algorithm and others we will use. We assume that the readers are familiar with the liner algebra and basic notations in quantum computing (for the details we can refer to [13]).
2.1 Quantum Fourier transform
Quantum Fourier transform is a unitary operator with the following action on the standard basis states:
(1) |
for . Hence the inverse quantum Fourier transform is acted as follows:
(2) |
for . Quantum Fourier transform and the inverse quantum Fourier transform can be implemented using single elementary gates (i.e. single qubit gates and two-qubit gates) [13, 16].
2.2 Phase estimation algorithm
Phase estimation algorithm is an application of the quantum Fourier transform. Let be a quantum state and let be a unitary operator that satisfies for some real number . Suppose we can create the quantum state and implement controlled operation such that
(3) |
for any positive integer and -bit string , where the first register is control qubits. Figure 1 shows the implementation of . Then we can apply phase estimation algorithm to estimate . For the sake of convenience, we first define the following notations. In this paper, we treat bit strings and their corresponding binary integers as the same.

Definition 1.
For any real number , where and , denote and respectively as follows:
-
1.
: for any positive integer , .
-
2.
: for any integer with , .
-
3.
: for any integer with , .
-
4.
: for any two -bit strings (or -bit binary integers) , define .
is a useful distance to estimate the error of the algorithms in our paper and it has the following properties. We specify for any negative integer and positive integer , where is an integer and satisfies .
Lemma 1.
Let be a positive integer and let be any two -bit strings. It holds that:
(I) Let . Then .
(II) is a distance on .
(III) Let be an positive integer. If , then
(4) |
Proof.
First we prove (I). It is clear for the case of . Without loss of generality, assume . Since , we have contains only elements. Note that
(5) | |||
(6) | |||
(7) | |||
(8) |
and , we get that and are exactly two elements of . Hence . Thus (I) holds.
Then we prove . We just need to show that satisfies the triangle inequality, that is, holds for any -bit string . By (I), we know that there exists such that
(9) |
and
(10) |
Hence . Then by (I) again, we have
(11) |
Thus, (II) holds.
Finally we prove . By (I) and , we know that there exists an integer with such that
(12) |
Then by (I) again we have
(13) |
Hence
(14) |
Therefore Equation (4) holds. ∎
We can understand in a more intuitive way. We place numbers 0 to evenly on a circumference where and coincide. Suppose the distance of two adjacent points on the circumference is . Then can be regarded as the length of the shortest path on the circumference from to . Next we review the phase estimation algorithm (see Algorithm 1) and its associated results.
Procedure:
If the fractional part of does not exceed bits (i.e. is an integer), by observing Equation (2) and the step 4 in Algorithm 1, we can see that is a perfect estimate of (i.e. ). However, sometimes is not approximated by but is approximated by . For example, if the binary representation of is (sufficiently many s), we will obtain the measuring result with high probability. The purpose of phase estimation algorithm is to find a such that is close to or . We have the following results.
Proposition 1 (See [13]).
In Algorithm 1, for any and any positive integer , if , then the probability of is at least .
Lemma 2.
For any -bit string and real number . If , then we have or , where .
Proof.
Since , if , we have
(15) |
and thus ; if , we have
(16) |
and therefore, we have . ∎
That is to say, if is an estimate of with error less than , then is an estimate of with error no larger than .
2.3 Order-finding algorithm
Phase estimation algorithm is a key subroutine in order-finding algorithm. Given an -bit integer and a positive integer with , the purpose of order-finding algorithm is to find the order of modulo , that is, the least integer that satisfies . An important unitary operator in order-finding algorithm is defined as
(17) |
Denote
(18) |
We have
(19) | |||
(20) |
and
(21) |
So if we expect to apply phase estimation algorithm in finding order, the key is to construct , that is, for any -bit string ,
(22) |
Algorithm 2 [13] and Figure 2 show the precedure of order-finding algorithm.
Input: Positive integers and with .
Output: The order of modulo .
Procedure:

The purpose of steps 1 to 5 in Algorithm 2 is to get a measuring result such that is an estimation of for some (i.e. ). Let be any projective measurement on and let be any -qubit quantum state for . By Equation (21), we have
(23) |
for . Hence by Propositon 1 and Equation (23), we can obtain the following proposition immediately.
Proposition 2 (See [13]).
In Algorithm 2, the probability of for any fixed is at least . Thus the probability that there exists an such that
(24) |
is at least .
Although it is an important part to discuss the probability of obtaining correctly from the measuring result by applying continued fractions algorithm, the details are omitted here and we focus on considering whether the measuring result is an estimation of for some , since this is exactly the goal of the quantum part in the order-finding algorithm.
2.4 Quantum teleportation
Quantum teleportation is an important means to realize quantum communication [5, 7]. Quantum teleportation is effectively equivalent to physically teleporting qubits, but in fact, the realization of quantum teleportation only requires classical communication and both parties to share an EPR pair in advance. The following result is useful.
Theorem 1 ([5]).
When Alice and Bob share pairs of EPR pairs, they can simulate transmitting qubits by communicating classical bits.
3 Distributed order-finding algorithm
In [11], a distributed phase estimation algorithm was proposed, but the method in [11] can not guarantee the precision of the result. However, their ideas deserve further consideration. In this section, by combining with quantum teleportation, we proposed a distributed order-finding algorithm and prove the correctness of our algorithm.
Without loss of generality, assume that is even. The idea of our distributed order-finding algorithm is as follows. We need two quantum computers (named and ). We first apply order-finding algorithm in computer and obtain an estimation of the first bits of for some , and similarly obtain an estimation of the th bit to th bit of in computer . We can realize this by using , since and the fractional part of starts at the th bit of he fractional part of . Moreover, since and we can calculate classically with time complexity , we can construct with the same way as . In addition, to guarantee the measuring results of and corresponding to the same , we need quantum teleportation.
However, in order to maintain high precision, computer actually estimates the th bit to th bit, where the estimation of the th bit and the th bit is used to “correct” the measuring result of . This “correction” operation is handed over to a classical subroutine named . Our distributed order-finding algorithm is shown in Algorithm 3 and Figure 3, and the subroutine is shown in Algorithm 4

Input: Positive integers and with .
Output: The order of modulo .
Procedure:
Input: Two measuring results: -bit string and -bit string .
Output: An estimation such that for some .
Procedure:
Remark 1.
Although Algorithm 3 is a serial algorithm, the two computers can also execute in parallel to some extent. For example, execute the algorithm in the following order: 1, (2, 6), 3, 5, 7, (4, 8), 9, 10, 11, where represents the th step in Algorithm 3, and means that the th and th steps are executed in parallel.
Next we prove the correctness of our algorithm, that is, we can obtain the output such that holds for some with high probability. Let be the same as those in Algorithm 3 and Algorithm 4. We first prove that if and are both estimations of some bits of with , then the output is perfect (i.e. ), and the probability of this case is not less than .
Proposition 3.
Let satisfy that is an integer, that is, where , . Then in Algorithm 3, it holds that
(25) |
Proof.
Then we prove that if is an estimation of the th to th bit of , we can get .
Lemma 3.
Let satisfy that is not an integer and let satisfy
(31) |
Then .
Proof.
Since is not an integer, we have
(32) |
So we get is not or . Hence, is not or . That is to say, if we add or subtract 1 to , its first two bits are not changed. Thus by Equation (31), we have
(33) |
Therefore the lemma holds. ∎
If , that is, the first two bits of are correct, then we can use these two bits of to “correct” . The following lemma can be used to show the correctness of Algorithm 4.
Lemma 4.
Let be a positive integer and let be two -bit strings with . Then there only exists one element in such that , and for any , if and only if .
Proof.
By Lemma 1, we know that there exists such a . It is clear that such a is unique. Next we prove that for any , if and only if . For any , suppose , then we have
(34) |
That is,
(35) |
On the other hand, for any , suppose . Since there only exists one elements in such that , is equal to , that is, satisfies . Consequently, the lemma holds. ∎
We can inspect Lemma 4 from another aspect. If and hold for some , then the in Algorithm 4 exists, and holds as well.
Proposition 4.
Let satisfy for some with being not an integer. Suppose . Then .
Proof.
Since and is not an integer, by Lemma 3, we have
(36) |
Since , by Lemma 1, we have
(37) |
As a result, in Algorithm 4, the exists. By Equation (36), Lemma 4, and the steps 1 to 2 in Algorithm 4, we get
(38) |
Since , by Equation (36) and Equation (38), we get
(39) |
Since is not an integer, similar to Equation (32), we know that is not or . Then by Equation (39), we get . Therefore, by Equation (39) and Lemma 2, we obtain
(40) |
∎
Theorem 2.
In Algorithm 3, for any fixed , the probability of is at least . The probability that there exists an such that is at least .
4 Complexity analysis
The complexity of the circuit of (distributed) order-finding algorithm depends on the construction of . There are two kinds of implementation of proposed by Shor [17]. The first method (denoted as method (I)) needs time complexity and space complexity , and the second method (denoted as method (II) ) needs time complexity and space complexity . In this section, we compare our distributed order-finding algorithm with the traditional order-finding algorithm. For a more concrete comparison, we consider that is implemented by method (I). There is a concrete implementation of order-finding algorithm by using method (I) in [18]. However, the advantages of our distributed order-finding algorithm in space and circuit depth are independent of whether method (I) or method (II) is used.
Space complexity The implementation of the operator in method (I) needs qubits plus auxiliary qubits for any positive integer , where is . By Theorem 1, to teleport qubits, computers and need to share pairs of EPR states and communicate with classical bits. As a result, needs qubits and needs qubits. As a comparison, order-finding algorithm needs qubits. So, our distributed order-finding algorithm can reduce nearly qubits.
Time complexity. The operator can be implemented by means of elementary gates in method (I). Hence the gate complexity (or time complexity) in both our distributed order-finding algorithm and order-finding algorithm is .
Circuit depth. By Figure 1, we know that the circuit depth of depends on the circuit depth of controlled- and . The circuit depth of controlled- is in method (I). By observing the value “” in order-finding algorithm and our distributed order-finding algorithm, we clearly get that the circuit depth of each computer in our distributed order-finding algorithm is less than the traditional order-finding algorithm, even though both are .
Communication complexity. In our distributed Shor’s algorithm, we need to teleport qubits. Therefore, the communication complexity of our distributed Shor’s algorithm is . As a comparison, the communication complexity of the distributed order-finding algorithm proposed in [18] is .
5 Conclusions
In this paper, we have proposed a new distributed Shor’s algorithm. More specifically, we have proposed a new distributed order-finding algorithm. In this distributed quantum algorithm, two computers work sequentially via quantum teleportation. Each of them can obtain an estimation of partial bits of for some with high probability. It is worth mentioning that they can also be executed in parallel to some extent. We have shown that our distributed algorithm has advantages over the traditional order-finding algorithm in space and circuit depth. Our distributed order-finding algorithm can reduce nearly qubits and reduce the circuit depth to some extent for each computer. However, unlike parallel execution, the way of serial execution that has been used in our algorithm leads to noise in both computers.
We have proved the correctness of this distributed algorithm on two computers, a natural problem is whether or not this method can be generalized to multiple computers or to other quantum algorithms. We would further consider the problem in subsequent study.
6 Acknowledgements
This work is partly supported by the National Natural Science Foundation of China (Nos. 61572532, 61876195) and the Natural Science Foundation of Guangdong Province of China (No. 2017B030311011).
References
- [1] A. Aspuru-Guzik, A.D. Dutoi, P. J. Love, M. Head-Gordon, Simulated quantum computation of molecular energies, Science, 309 (5741) (2005) 1704–1707.
- [2] J. Avron, O. Casper, I. Rozen, Quantum advantage and noise reduction in distributed quantum computing, Physical Review A, 104 (5) (2021) 052404.
- [3] R. Beals, S. Brierley, O. Gray, A. W. Harrow, S. Kutin, N. Linden, D. Shepherd, M. Stather, Efficient distributed quantum computing, Proceedings of the Royal Society A Mathematical Physical and Engineering Science, 469 (2153) (2013) 20120686.
- [4] S. Beauregard, Circuit for Shor’s algorithm using qubits, Quantum Information and Computation, 3 (2) (2003) 175–185.
- [5] C. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, and W. K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels, Physiscal Review Letters, 70 (13) (1993) 1895–1899.
- [6] C. Gidney, M. Ekera, How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits, Quantum, 5 (2021) 433.
- [7] N. Gisin, R. Thew, Quantum communication, Nature Photonics, 1 (3) (2007) 165–171.
- [8] T. Haner, M. Roetteler, K. M. Svore, Factoring using qubits with Toffoli based modular multiplication, Quantum Information and Computation, 17 (7-8) (2017) 673–684.
- [9] L. K. Grover, A fast quantum mechanical algorithm for database search, in: Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, 1996, pp. 212–219.
- [10] A. W. Harrow, A. Hassidim, S. Lloyd, Quantum algorithm for linear systems of equations, Physical Review Letters, 103 (15) (2009) 150502.
- [11] K. Li, D. Qiu, L. Li, S. Zheng, Z. Rong, Application of distributed semi-quantum computing model in phase estimation, Information Processing Letters 120 (2017) 23–29.
- [12] A. Montanaro, Quantum algorithms: an overview, npj Quantum Information, 2 (2016) 15023.
- [13] M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, 2000.
- [14] S. Parker, M. B. Plenio, Efficient factorization with a single pure qubit and mixed qubits, Physical Review Letters, 85 (14) (2000) 3049–3052.
- [15] G. Rosenberg, P. Haghnegahdar, P. Goddard, P. Carr, K. Wu, M. L. De Prado, Solving the optimal trading trajectory problem using a quantum annealer, IEEE Journal of Selected Topics in Signal Processing, 10 (6) (2016) 1053–1060.
- [16] P.W. Shor, Algorithms for quantum computation: discrete logarithms and factoring, in: Proceedings of the 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124–134.
- [17] P.W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, Siam Review 41 (2) (1999) 303–332.
- [18] A. Yimsiriwattana, S.J. Lomonaco, Distributed quantum computing: a distributed Shor algorithm, Quantum Information and Computation II, 5436 (2004) 360–372.