Quantum Approximate Counting for Markov Chains and Application to Collision Counting
Abstract
In this paper we show how to generalize the quantum approximate counting technique developed by Brassard, Høyer and Tapp [ICALP 1998] to a more general setting: estimating the number of marked states of a Markov chain (a Markov chain can be seen as a random walk over a graph with weighted edges). This makes it possible to construct quantum approximate counting algorithms from quantum search algorithms based on the powerful “quantum walk based search” framework established by Magniez, Nayak, Roland and Santha [SIAM Journal on Computing 2011]. As an application, we apply this approach to the quantum element distinctness algorithm by Ambainis [SIAM Journal on Computing 2007]: for two injective functions over a set of elements, we obtain a quantum algorithm that estimates the number of collisions of the two functions within relative error by making queries, which gives an improvement over the -query classical algorithm based on random sampling when .
1 Introduction
Quantum search.
Grover’s search [9] is the most basic version of quantum search. This quantum algorithm considers the following problem called unstructured search: for a function given as a black-box, find one element such that (we call such an element a solution and assume that there exists at least one). Let denote the number of solutions. Grover’s algorithm outputs with high probability a solution using calls to the function . This is a quadratic improvement over the classical (i.e., non-quantum) strategy, which is based on random sampling and requires calls to to find a solution with high probability.
Another fundamental, but more sophisticated, quantum search technique is quantum walk based search [2, 16, 21]. The problem considered here is a generalization of unstructured search. As for unstructured search, this problem accesses the data via a black-box. It can be defined in an abstract way for any Markov chain (Markov chains are generalizations of the concept of random walks over a graphs — see Section 2.2 for details). Some states of the Markov chains corresponding to “solutions” are called the marked states. The goal of the problem is to find one marked state (here we assume again that there is at least one marked state). Let be a lower bound on the fraction of marked states. Magniez, Nayak, Roland and Santha [16] have shown that this problem can be solved with high probability by a quantum algorithm that makes
(1) |
queries to the black-box, were denotes the spectral gap of the Markov chain and , and represent the number of queries needed to implement the three basic quantum operations corresponding to search: the setup operation, the update operation and the checking operation. Unstructured search simply corresponds to the Markov chain representing a random walk on a complete graph of vertices, which has spectral gap , with , , and . The complexity becomes , as for Grover’s search.333Grover’s search can also be derived in another (more direct) way, with costs , and . A striking illustration of the power of quantum walk based search is the problem called element distinctness: for two functions accessible as black boxes, find a pair satisfying if such a pair exists (such a pair is called a collision). Ambainis [2] showed how to solve this problem with queries using quantum walk based search.444While the version of element distinctness considered in [2] uses only one function, the version we present here is essentially equivalent (see [13] for details). Besides element distinctness, quantum walk based search has been used to design quantum algorithms for many other problems, such as matrix multiplication [6], triangle finding [10, 12, 15] or associativity testing [14].
Quantum approximate counting.
Quantum approximate counting is another very useful quantum primitive designed by Brassard, Høyer and Tapp [5]. It solves the (approximate) counting version of unstructured search: for any given and a function given as a black-box, output an -approximation of the number of solution , i.e., output an integer such that . The quantum approximate counting algorithm from [5] solves this problem using calls to . This is again a quadratic improvement over the classical strategy, which is based on random sampling and requires calls to [18]. Very recently, several variants of quantum approximate counting have been proposed [1, 20, 23, 24]. These variants have the same asymptotic complexity as the original algorithm but may be easier to implement in practice, especially on near-future quantum computers.
Since quantum walk based search is a generalization of Grover search, a natural question is whether a version of quantum approximate counting can be obtained for quantum walk based search as well, i.e., whether there exists an efficient quantum algorithm to estimate the number of marked states of a Markov chain. To our knowledge, this problem has never been studied in the literature. This problem is especially motivated by the task of approximate collision counting: for two functions accessible as black boxes, output an approximation of the number of collisions. While not stated explicitly, this task is at the heart of most algorithms for estimating the edit distance between two non-repetitive strings [3, 13, 17], and is also used for estimating the -distance between two probability distributions [4, 22].
Description of our results.
Our main technical result is the following general approximate counting version for reversible Markov chains.
Theorem 1.1.
Let be a reversible ergodic Markov chain with uniform stationary distribution, be the spectral gap of and be a known lower bound on the fraction of marked states of . Let , and be the number of queries needed to implement, respectively, the setup operation, the update operation and the checking operation. For any , there exists a quantum algorithm which outputs with high probability an estimate such that , and uses
queries to the black box.555In this paper the notation removes the polylogarithmic factors in all parameters.
As an application of Theorem 1.1, we show how to construct a quantum algorithm to approximate the number of collisions. More precisely, the version of approximate collision counting we consider is as follows: for and for two injective functions , where , output an -approximation of the number of collisions , i.e., an estimate such that . (We restrict our attention to the case of injective functions since this simplifies the exposition of our result. Note that this version with injective functions is still interesting since it is precisely the version needed for estimating the edit distance between two non-repetitive strings [3, 13, 17].)
In order to derive worst-case complexity upper bounds that explicitly depend on the number of collisions, we assume that we know a lower bound , and state the complexities using . There is an easy randomized algorithm based on sampling (which has been, for instance, used implicitly in [3, 17]) that outputs with high probability an -approximation of using evaluations of and , which is essentially tight (see Appendix A for details). Based on the technique of Theorem 1.1, we design a fast quantum algorithm for this problem:
Theorem 1.2.
Given a lower bound of the number of collision , for any there is a quantum algorithm that outputs with high probability an estimate such that using
queries.
Overview of our techniques.
Let us start by giving an overview of the quantum approximate counting by Brassard, Høyer and Tapp [5]. Roughly speaking, the quantum approximate counting algorithm from [5] applies the technique called quantum phase estimation [7, 11], which is based on the quantum Fourier transform, in order to estimate the eigenvalues of the unitary operator used as the main subroutine of Grover’s algorithm. This operator, when restricted to the appropriate subspace, is a rotation whose angle is closely related to the number of solutions. The eigenvalues of this operator are thus closely related to the number of solutions as well. A good approximation of the eigenvalues, which can be obtained using quantum phase estimation, then gives a good approximation of the number of solutions.
The main insight is that the main operator of quantum walk based search by Magniez, Nayak, Roland and Santha [16] approximately corresponds to a rotation (see Corollary 2.3 in Section 2.2). To prove Theorem 1.1, we observe that the eigenvalues of this rotation are closely related to the fraction of marked states with respect to the stationary distribution of the Markov chain, and then shows that quantum phase estimation applied on (instead of ) with good enough precision gives a good estimation of these eigenvalues and thus a good approximation of the number of marked states if the stationary distribution is uniform (Section 3). To prove Theorem 1.2, we additionally establish the relation between the number of collisions and the number of marked states in the Markov chain corresponding to a random walk over the Johnson graph (which is the same Markov chain as the one used in Ambainis’ element distinctness algorithm [2]), and show that a good approximation for the latter gives a good enough approximation of the former (Section 4).
2 Preliminaries
We assume that the reader is familiar with the basics of quantum computation (we refer to, e.g., [19] for a good reference) and describe below phase estimation and quantum walk based search.
2.1 Phase estimation
We first describe phase estimation [7, 11] (see also [19]). Given a unitary matrix and an eigenvector of corresponding to an eigenvalue , where , phase estimation is a quantum algorithm that outputs a good approximation of . The phase estimation algorithm is also given a positive integer as input, which controls the precision of the estimation.
In order to implement this algorithm, the unitary needs to be accessible as follows: we have access to a black box that performs a controlled- operation. Note that when an implementation of as a concrete quantum circuit is available, a circuit implementation of the controlled- operation can be simply obtained by replacing each elementary gate with its controlled version.
The phase estimation is described as Algorithm 1 below, and a concrete implementation as a quantum circuit which we denote is given in Figure 1. The total number of controlled- operations is . The following theorem shows that when setting appropriately, the algorithm outputs a good approximation with high probability.
Theorem 2.1 ([19]).
For any integer and any , when setting the phase estimation algorithm Est outputs with probability at least an approximation such that .
2.2 Search via quantum walk
In this subsection we describe the quantum walk based search approach (also called search via quantum walk) by Magniez, Nayak, Roland and Santha [16].
Classical Markov chains.
A Markov chain with a finite state space corresponds to a random walk over a weighted graph. In this paper, as in [16], we write and identify the Markov chain with its transition matrix , where is the probability of transition from to (the matrix is a stochastic matrix). The Markov chain is ergodic if its reachability graph is connected and non-bipartite. An ergodic Markov chain has a unique stationary distribution, which we denote . The Markov chain is reversible if the equality holds for all (for example, any Markov chain induced from an undirected weighted graph is reversible).
All the Markov chains considered in this paper are ergodic and reversible. Let be a non-empty subset of . The elements of are called the marked states of . Define
the fraction of marked states in the stationary distribution.
Quantum implementation of Markov chains.
Due to reversibility issues in the quantum setting, the approach by Magniez, Nayak, Roland and Santha [16] for implementing Markov chains considers pairs of states (i.e., the space ) instead of working with states (i.e., the space ). Consider the vector space . Let and be the subspaces of defined as and , where
The unitary operation defined on by is called the quantum walk based on the classical chain .
Define the following two quantum states:
and write . Let denote the state orthonormal to in the subspace of . Then
where , (remember that denotes the fraction of marked items of ). Let us write the unitary operation defined on such that and for any orthogonal to . Define the unitary operation on as follows:
and let . In the subspace the unitary operation is a rotation of angle .
One of the main technical contributions of [16] is the construction of a quantum circuit which efficiently implements an operator that is close to the reflection .
Theorem 2.2 ([16, Section 3.2]).
Let be a reversible ergodic Markov chain with a state space of size . Let denote the spectral gap of . Then for any integer there exists a quantum circuit that acts on qubits, where , and satisfies the following properties:
-
1.
The circuit uses Hadamard gates, controlled phase rotations, and makes at most calls to the controlled quantum walk c- and its inverse c-.
-
2.
For any , .
The following immediate corollary of Theorem 2.2 gives an upper bound for the difference between and the rotation .
Corollary 2.3.
For any , .
Quantum walk based search.
When considering quantum walk based search the input is given as a black box, as for Grover’s search. The set of marked states is defined using the input in such a way that finding a marked state corresponds to finding a solution to the search problem. Additionally, some data from the input is stored in a data structure . Mathematically, this corresponds to working with and instead of and . For convenience we simply write instead of , for any . The query complexity of the quantum walk based search can then be expressed in terms of three costs (see [16, Section 1.3] for details):
-
•
Setup cost : The cost of constructing the state from .
-
•
Update cost : The cost of realizing any of the unitary transformations
and their inverses.
-
•
Checking cost : The cost of realizing the following conditional phase flip
A single application of can be implemented using queries, from Theorem 2.2. A single application of can thus be implemented using queries. Magniez, Nayak, Roland and Santha [16] showed that by first preparing the initial state using queries, and then applying times the operation , a marked state (and thus a solution to the search problem) is obtained with high probability. The overall complexity of this strategy is
queries. The factor can be removed from the complexity using more work (see [16, Section 4]), which gives the complexity stated in Eq. (1) in the introduction.
3 Quantum Counting for Markov Chains
In this section we prove Theorem 1.1: for a reversible ergodic Markov chain which has the uniform stationary distribution and a finite set of size and a marked subset , the goal is to find an estimation of such that . The main idea is to apply phase estimation on the operator introduced in Section 2.2. Before giving the proof of Theorem 1.1 we present a high-level description of our strategy and prove a technical result (Lemma 3.1).
Consider first the operator introduced in Section 2.2. Since is a rotation of angle on , in this subspace it has orthonormal eigenvectors
corresponding to the eigenvalues . Applying phase estimation (Algorithm 1) with on would give an approximation of with high probability,666Observe that we have a factor here since the version of phase estimation presented in Section 2.1 assumes that the eigenvalues are of the form with . from which we would be able to obtain a good approximation of since , where if the stationary distribution of is uniform. There are nevertheless two issues with this strategy: we do not know the eigenvectors and we cannot apply directly.
The first issue is dealt with in a fairly standard way: we will instead apply phase estimation using the state . This state, which we know how to construct (at cost queries), can be expressed as
(2) |
and thus applying Algorithm 1 with on would output with high probability an approximation of either or , which would be enough to obtain an approximation of .
To solve the second problem, we will simply apply phase estimation on instead of . As in Section 2.1, let us use to denote the circuit of phase estimation applied on with the parameter . We can upper bound the difference between and operating on with as follows.
Lemma 3.1.
For any , .
Proof.
We are now ready to give a proof of Theorem 1.1 by putting all these ingredients together, adjusting all the parameters and making the analysis of the approximation errors.
Proof of Theorem 1.1..
We divide the proof into several components.
Analysis of phase estimation on the ideal rotation .
From Theorem 2.1 we know that when choosing parameters , , , the algorithm Est(, , ) would output a value such that with probability at least .
Let us now interpret these probabilities in a more algebraic way. Let and be the projection over the subspaces and , respectively, of the vector space . We thus have:
Let us now analyze the output of Est(, , ). For conciseness, we write . Remember the decomposition of Eq. (2) and observe that . Thus we have:
(3) |
which means that Est(, , ) outputs a value such that with probability at least and with probability at least .
Description of the algorithm.
We apply phase estimation on instead of , with the input state , where is the operator introduced in Section 2.2. Remember that is defined using a parameter (see the statement of Theorem 2.2). We set .
The algorithm for Theorem 1.1 is as follows.
Error Analysis.
We are going to show that , where and , by analyzing the quantity .
Let us write . For any measurement operator , we have
where we used Lemma 3.1 to derive the last inequality (remember that we are taking ). Combined with (3), this implies that with probability at least the output of Est(, , ) satisfies the condition , and with probability at least it satisfies the condition . It follows from this condition that holds with probability at least , since . Assume that this inequality holds. Then . Notice that for any , by the mean value theorem. We thus have
and it follows that
Complexity.
Under the framework presented in Section 2.2, the quantum state can be created using queries, and the operator can be implemented using queries. The operator is applied for times by the phase estimation algorithm. The overall complexity of Algorithm 2 is thus
queries, as claimed. ∎
From the argument above, we then obtain the following immediate corollary for any reversible ergodic Markov chain whose stationary distribution is not necessarily uniform.
Corollary 3.2.
Let be a reversible ergodic Markov chain, be the spectral gap of and be a known lower bound on the fraction of marked states of with respective to its stationary distribution. For any , there exists a quantum algorithm which outputs with high probability an estimate such that , and uses
queries to the black box.
4 Application: Approximate Collision Counting
Consider two injective functions , where , given as quantum oracles . More precisely, and act on qubits and are defined as follows: for any and any ,
where we are interpreting integers as binary strings (via some fixed encoding), and denotes the bit-wise parity. Let be the number of collisions, i.e., . In this section we show how to use Theorem 1.1 to approximate .
Define to be the disjoint union of the domain of and the domain of . Note that . For convenience, we will later identify with the set . Similarly to Ambainis’s algorithm for element distinctness [2], we consider a Johnson graph determined by a parameter , which is defined by and . Then . We say a vertex in is marked if it contains a collision, i.e., there exist such that . Denote the Markov chain corresponding to the random walk on by . We identify the states of with the vertices of .
By the principle of inclusion and exclusion, the number of marked vertices in is
Define , and . Then
Observe that for .
The following lemma characterizes the relation between an approximation of and an approximation of .
Lemma 4.1.
Let and be such that holds. Define and assume that , where is an upper bound of . Then for any value that satisfies the condition , the inequality
holds.
Proof of Lemma 4.1..
From the definitions of and , we have for and . Then
Hence , and
as claimed. ∎
We are now ready to give the proof of Theorem 1.2.
Proof of Theorem 1.2..
We first compute a constant-factor approximation of the number of collision . Concretely, we can use Algorithm 4 in Appendix B with the lower bound to obtain a value such that , i.e., such that or, equivalently, . Denote and by and , respectively.
Let be a positive integer satisfying the conditions and (the choice of will be discussed later). Take
Since is a regular graph, the stationary distribution of is uniform. It follows that applying Theorem 1.1 on with and outputs with high probability a value such that using
queries. Then we output . Lemma 4.1 guarantees that when holds.
Complexity.
In the main algorithm, the setup cost is , the update cost , and the checking cost . Since and we chose as
the query complexity for the main algorithm is
We finally explain how to choose . Below we assume that . This assumption can be done without loss of generality since smaller values of do not lead to better approximation of (since is an integer). For convenience we also assume that is such that : for larger values of we can simply use the classical algorithm from Appendix A.
Take such that , , and . Such an exists since
and
(The third condition is trivially satisfied.) For this choice of the complexity of the main algorithm becomes .
The overall query complexity (including the complexity of computing the constant-factor approximation) is
as claimed. ∎
Acknowledgments
FLG was supported by JSPS KAKENHI grants Nos. JP19H04066, JP20H05966, JP20H00579, JP20H04139, JP21H04879 and MEXT Quantum Leap Flagship Program (MEXT Q-LEAP) grants No. JPMXS0118067394 and JPMXS0120319794.
References
- [1] Scott Aaronson and Patrick Rall “Quantum Approximate Counting, Simplified” In Proceedings of the 3rd Symposium on Simplicity in Algorithms (SOSA 2020), 2020, pp. 24–32 DOI: 10.1137/1.9781611976014.5
- [2] Andris Ambainis “Quantum Walk Algorithm for Element Distinctness” In SIAM Journal on Computing 37.1, 2007, pp. 210–239 DOI: 10.1137/S0097539705447311
- [3] Alexandr Andoni and Huy L. Nguyen “Near-Optimal Sublinear Time Algorithms for Ulam Distance” In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2010), 2010, pp. 76–86 DOI: 10.1137/1.9781611973075.8
- [4] Tugkan Batu et al. “Testing Closeness of Discrete Distributions” In Journal of the ACM 60.1, 2013, pp. 4:1–4:25 DOI: 10.1145/2432622.2432626
- [5] Gilles Brassard, Peter Høyer and Alain Tapp “Quantum Counting” In Proceedings of the 25th International Colloquium on Automata, Languages and Programming 1443, Lecture Notes in Computer Science, 1998, pp. 820–831 DOI: 10.1007/BFb0055105
- [6] Harry Buhrman and Robert Špalek “Quantum verification of matrix products” In Proceedings of 17th Annual ACM-SIAM Symposium on Discrete Algorithms, 2006, pp. 880–889 DOI: 10.1145/1109557.1109654
- [7] Richard Cleve, Artur Ekert, Chiara Macchiavello and Michele Mosca “Quantum algorithms revisited” In Proceedings of The Royal Society of London A 454, 1998, pp. 339–354 DOI: 10.1098/rspa.1998.0164
- [8] Devdatt Dubhashi and Alessandro Panconesi “Concentration of Measure for the Analysis of Randomized Algorithms” Cambridge University Press, 2009
- [9] Lov K. Grover “A Fast Quantum Mechanical Algorithm for Database Search” In Proceedings of the 28th Annual ACM Symposium on the Theory of Computing, 1996, pp. 212–219 DOI: 10.1145/237814.237866
- [10] Stacey Jeffery, Robin Kothari and Frédéric Magniez “Nested Quantum Walks with Quantum Data Structures” In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms, 2013, pp. 1474–1485 DOI: 10.1137/1.9781611973105.106
- [11] Alexey Yu. Kitaev “Quantum measurements and the Abelian Stabilizer Problem” ArXiv:quant-ph/9511026, 1995
- [12] François Le Gall “Improved Quantum Algorithm for Triangle Finding via Combinatorial Arguments” In Proceedings of the 55th IEEE Annual Symposium on Foundations of Computer Science, 2014, pp. 216–225 DOI: 10.1109/FOCS.2014.31
- [13] François Le Gall and Saeed Seddighin “Quantum Meets Fine-grained Complexity: Sublinear Time Quantum Algorithms for String Problems” In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022) 215, LIPIcs, 2022, pp. 97:1–97:23 DOI: 10.4230/LIPIcs.ITCS.2022.97
- [14] Troy Lee, Frédéric Magniez and Miklos Santha “Improved Quantum Query Algorithms for Triangle Detection and Associativity Testing” In Algorithmica 77.2, 2017, pp. 459–486 DOI: 10.1007/s00453-015-0084-9
- [15] Frédéric Magniez, Miklos Santha and Mario Szegedy “Quantum Algorithms for the Triangle Problem” In SIAM Journal on Computing 37.2, 2007, pp. 413–424 DOI: 10.1137/050643684
- [16] Frédéric Magniez, Ashwin Nayak, Jérémie Roland and Miklos Santha “Search via Quantum Walk” In SIAM Journal on Computing 40.1, 2011, pp. 142–164 DOI: 10.1137/090745854
- [17] Timothy Naumovitz, Michael E. Saks and C. Seshadhri “Accurate and Nearly Optimal Sublinear Approximations to Ulam Distance” In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2017), 2017, pp. 2012–2031 DOI: 10.1137/1.9781611974782.131
- [18] Ashwin Nayak and Felix Wu “The Quantum Query Complexity of Approximating the Median and Related Statistics” In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing (STOC 1999), 1999, pp. 384–393 DOI: 10.1145/301250.301349
- [19] Michael A. Nielsen and Isaac L. Chuang “Quantum Computation and Quantum Information (10th Anniversary edition)” Cambridge University Press, 2016
- [20] Yohichi Suzuki et al. “Amplitude estimation without phase estimation” In Quantum Information Processing 19, 2020, pp. article 75 DOI: 10.1007/s11128-019-2565-2
- [21] Mario Szegedy “Quantum Speed-Up of Markov Chain Based Algorithms” In Proceedings of the 45th Symposium on Foundations of Computer Science (FOCS 2004), 2004, pp. 32–41 DOI: 10.1109/FOCS.2004.53
- [22] Paul Valiant “Testing Symmetric Properties of Distributions” In SIAM Journal on Computing 40.6, 2011, pp. 1927–1968 DOI: 10.1137/080734066
- [23] Ramgopal Venkateswaran and Ryan O’Donnell “Quantum Approximate Counting with Nonadaptive Grover Iterations” In Proceedings of the 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021) 187, LIPIcs, 2021, pp. 59:1–59:12 DOI: 10.4230/LIPIcs.STACS.2021.59
- [24] Chu-Ryang Wie “Simpler quantum counting” In Quantum Information & Computation 19.11&12, 2019, pp. 0967–0983 DOI: 10.26421/QIC19.11-12-5
Appendix A Classical Algorithm for Approximate Collision Counting
In this appendix we describe the “folklore” classical algorithm based on random sampling for approximating the number of collisions of two injective functions , which has been used implicitly in [3, 17]. More precisely, we show that given a lower bound , there exists a randomized algorithm which outputs a value such that holds with high probability, using
evaluations of and .
Let us write . The main procedure, which is parametrized by a real number , is given as Algorithm 3 below.
The complexity of Algorithm 3 is queries. Since the expected value of the size of these sets is , the expected query complexity of Algorithm 3 is . We now analyze its correctness.
Lemma A.1.
If for some , then with probability at least the output of Algorithm 3 is such that .
Proof.
Since each element is included into or with probability , the probability for a collision pair to be included is . Then the expected value of the number of collisions in is . With , the condition is established if and only if holds. By the use of Chernoff-Hoeffding bound [8] we have
If for some , then , which implies that the probability of success is at least if . ∎
We now describe the whole algorithm. Let be a small constant. First, we apply Algorithm 3 with
for a constant . It outputs such that with probability at least . Then we apply Algorithm 3 with
It outputs such that with probability at least . The overall expected query complexity is
and the probability of success is at least . This upper bound on the expected query complexity can be converted into an upper bound on the worst-case query complexity using a tail bound.
Note that the dependence on and in the above complexity is optimal. Indeed, by an argument similar to the argument used in the proof of Lemma A.1, it is easy to show that queries are necessary to hit a collision.
Appendix B Quantum Constant-Factor Approximate Collision Counting
In this appendix we describe a simple algorithm that uses Ambainis’ element distinctness algorithm [2] instead of the classical strategy of Steps 6-10 of Algorithm 3, and computes a constant-factor approximation of the number of collisions using queries. Here is the formal statement of our result (again we assume that a lower bound is known and state the complexity in term of ).
Theorem B.1.
There exists a quantum algorithm that uses queries and outputs with probability at least an estimator such that .
The quantum algorithm is described as Algorithm 4 below, where again we write .
At Steps 9-10, observe that we actually only need to check whether holds. We thus simply apply Ambainis’ element distinctness algorithm [2] consecutively times, removing the collisions as soon as they are discovered, and check if we obtain strictly less than collisions, in which case we decide that . The complexity of Steps 9-10 is thus queries. Since Steps 9-10 are repeated times, and all other steps can be implemented without any queries, the overall query complexity of Algorithm 4 is .
We now analyze the correctness of the algorithm. First, observe that since , the two conditions of Step 8 hold with probability at least , from Chernoff bound. Since the success probability of Ambainis’ element distinctness algorithm can also be made larger than (e.g., by repeating Ambainis’ algorithm times), the implementation of test of Step 10 described in the previous paragraph is correct with probability at least . Below, we assume that the two conditions of Step 8 hold and that the test of Step 10 is implemented correctly, and show that Algorithm 4 outputs with high probability a good approximation of under these assumptions.
Lemma A.1 guarantees that as soon as the inequality
(4) |
becomes true, the number of collisions at Step 10 satisfies with probability at least . Inequality (4) necessary becomes true during the while loop of Algorithm 4 and once it becomes true, it remains true for all the subsequent values of . Let denote the first value taken by during the while loop such the Inequality (4) holds. The correctness of Algorithm 4 then follows from the following lemma.
Lemma B.2.
For the value , we have
Proof.
Notice that . The expected value of is thus
and the claim follows from Chernoff bound. ∎