Instance Independence of Single Layer Quantum Approximate Optimization Algorithm on Mixed-Spin Models at Infinite Size
Abstract
This paper studies the application of the Quantum Approximate Optimization Algorithm (qaoa) to spin-glass models with random multi-body couplings in the limit of a large number of spins. We show that for such mixed-spin models the performance of depth qaoa is independent of the specific instance in the limit of infinite sized systems and we give an explicit formula for the expected performance. We also give explicit expressions for the higher moments of the expected energy, thereby proving that the expected performance of qaoa concentrates.
1 Summary
The Quantum Approximate Optimization Algorithm (qaoa) is a variational quantum algorithm designed to give approximate solutions to unconstrained binary optimization problems [1]. While qaoa can be proven to give the optimal answer in the limit where the number of qaoa layers goes to infinity, rigorous results on the performance of qaoa with finite are difficult to obtain. In a recent paper, Farhi et al. [2] studied the application of the qaoa to the Sherrington-Kirkpatrick (SK) model, a spin-glass model with random all-to-all two-body couplings, in the limit of a large number of spins. Their paper demonstrated that for fixed , the performance of the qaoa is independent of the specific instance of the SK model and can be predicted by explicit formulas. The paper also showed that the approximation ratio of the qaoa at outperforms a large class of classical optimization algorithms (although not the best classical algorithm [3]). In the current paper, we generalize the result of Farhi et al. to mixed-spin SK models, which extends the two-body couplings of standard SK to random all-to-all -body couplings. We demonstrate that for , the performance of the qaoa is again independent of the specific instance, and we provide an explicit formula for the expected performance. Our work provides a potential avenue to demonstrating the advantage of qaoa over classical algorithms, as the best known classical algorithms for mixed-spin SK models have an approximation ratio that is bounded away from [4, 5].
2 Preliminaries and Notation
The Quantum Approximate Optimization Algorithm (qaoa) [1] is a heuristic quantum algorithm for binary optimization. Given a cost function of binary variables (spins) , qaoa seeks to produce a string close to the minimum of . Commonly we view as a Hamiltonian operator that is diagonal in the -basis. A depth- qaoa circuit then consists of repetitions of alternatively applying the Hamiltonian and the mixing Hamiltonian to a uniform superposition as initial state, which is the product of single particle eigenstates. Explicitly, the depth- qaoa state is given by
(1) |
A depth- qaoa circuit is parameterized by the angles . For a given problem, these angles should be optimized so that measuring in the -basis gives strings that make as small as possible. In practice, this is typically done by minimizing the expectation value of the energy
(2) |
For some problems, this minimization may be done analytically on a classical computer [1, 6, 7, 8]. Otherwise, the minimization can be performed by running the qaoa on a quantum computer repeatedly for a fixed set of angles, estimating the expectation value, and updating the angles according to classical minimization algorithms [1, 9, 10]. We note that minimizing the energy expectation value is only one possible definition of “best” angles; in general minimizing the expectation value and maximizing the probability of finding the optimal (or maximizing the probability of falling below a certain threshold) do not coincide.
It was recently demonstrated in [11] that in local optimization problems with cost functions drawn from from realistic random distributions (e.g. MaxCut on random 3-regular graphs), the expectation value per spin is instance independent as . That is, for fixed angles and is the same for all problem instances. This implies that the angles and do not need to be optimized for every problem instance, but can be optimized once and reused for every problem drawn from the same distribution. The methods of [11] can also be used to derive concentration of measure results for local optimization problems. (While [11] did not explicitly address concentration of measure, it can be easily derived from their methodology.) That is, in the limit as , the variance in the energy per spin goes to zero:
(3) |
Concentration of measure means that for large , every measurement of a qaoa state in the -basis gives strings with the same energy per spin. In total, instance independence implies the qaoa angles do not need to be optimized from instance to instance in order to minimize the expectation value, and concentration of measure implies that expectation value is the correct measure of the “best" angles.
While instance independence (and concentration of measure) were initially derived for local cost functions, similar results have also been derived for the Sherrington-Kirkpatrick (SK) model [12], a physics-inspired optimization problem with cost function
(4) |
with the are independently drawn from a Gaussian distribution with mean . The SK model is the “most nonlocal" two-body cost function, and serves as a model for realistic nonlocal two-body optimization problems. In the limit , the ground state energy per spin is known to be independent of the instance and can be computed explicitly [13, 14]. Recently, Montanari derived a classical algorithm that produces strings with energy within times the optimum; this method has a complexity of , where is a polynomial in [3].
Ref. [2] proved both instance independence and concentration of measure for depth- qaoa applied to the SK model. In addition, [2] provided an explicit formula for in the limit for , and provided a computer algorithm to generate for any fixed depth . Therefore, the qaoa angles for the SK model can be chosen on a classical computer, and there are fixed performance guarantees in the limit . Ref. [2] demonstrated that at , qaoa outperforms semidefinite programming for the SK model, but could not show that the qaoa matches the performance of the Montanari algorithm.
In this work, we study a generalization of the SK model, the mixed-spin SK model, that allows for polynomials of degree in the binary variables instead of only degree- terms [15, 16]. This can serve as a model for nonlocal optimization problems with higher-order terms. The mixed-spin SK model is also known to have a ground state energy per spin that is independent of the instance and can be computed explicitly [15, 17, 16, 18]. For a mixed-spin SK model with degree , the generalization of the Montanari algorithm [4] approaches a fixed approximation ratio of times the optimum value, rather than the optimal value [5]. Thus, the mixed-spin SK model is a potential avenue for establishing the advantage of the qaoa over classical algorithms.
In this paper, we generalize the work of [2] to prove instance independence and concentration of measure for qaoa applied to mixed-spin SK models. As part of our work, we derive an explicit formula for in the limit , implying that the qaoa angles for the mixed-spin SK models can be chosen on a classical computer. Our work can likely be generalized to depth using the same methods as [2].
3 Current Results and Prior Work
3.1 Our Results on Mixed-Spin Sherrington-Kirkpatrick Model
The mixed-spin SK model (often called the mixed -spin model, although we will not use this terminology to avoid confusion with the of qaoa) is given by [15, 16]
(5) |
where in the last line we denote the product by . We assume each is sampled independently from a Gaussian distribution that only depends on , and we will let be the standard deviation of the coupling constants with , for .
The ground state of the mixed-spin SK Hamiltonian is known to have a fixed energy per spin in the limit [18]. In fact we can also allow arbitrarily high orders in the mixed-spin SK model (), provided the variances decrease quickly enough to make finite, and the ground state model will still have a fixed energy per spin as [18]. However, for simplicity we consider some finite bound on the degree.
Our main result is as follows. Define the -spin model by Eq. 5 with . Then, using depth qaoa with angles , the expectation of the energy per spin equals in the large limit is given by
(6) |
If we define the variables and the polynomial , as is common in discussions of the mixed-spin model (see Section 3.2), we can also write this as
(7) |
Furthermore, the second moment of the energy per spin equals
(8) |
Our second result, Eq. 8, allows us to prove that qaoa applied to the mixed-spin SK model has both concentration of measure and instance independence. To see this, we note that we may write
(9) |
When applying qaoa to mixed-spin SK models, the measured value of varies for two reason. First, it varies because the bonds vary from instance to instance (the expectation). Second, it varies because the qaoa state is not an eigenstate of the -operators, so that the measurement outcomes have randomness even for fixed (the expectation). The left hand side of Eq. 9 represents the total variance in due to both sources of randomness. The right hand side demonstrates that the total variance can be decomposed into two terms. The first term is the average over of the variance due only to the measurement randomness. The second term is the variance in the expected value due to the randomness in the bonds . Since both of these terms are non-negative, they must both tend to zero as as well.
The fact that the first variance approaches zero gives us concentration of measure: it says that for typical couplings the variance in the measurement outcomes vanishes, and thus we always measure a string with energy per spin equal to the expectation value (note that the term inside the is always positive, so that for the average over to go to zero the magnitude of the typical value must also go to zero). The second term approaching zero clearly gives instance independence of the expectation value, since it shows that the variance in the expectation value due to different couplings vanishes.
Finally, we note that the methods we use also suffice to derive a formula for all higher moments of the energy per spin
(10) |
although unlike the result, the result does not have any obvious implication for the performance of the qaoa.
3.2 Numerical Results and Relation to Prior Results
As an example of using our Eqs. 6 and 7, we can compute the optimal angles for pure -spin models given by . In this case, our central equation reduces to
(11) |
Examples of this energy landscape for small are plotted in Fig. 1, and numerically optimized angles and corresponding energy per spin are plotted in Fig. 2.



As another example application of our formula, we explore the instance independence at finite to demonstrate convergence as . In Fig. 3a-e, we plot the expected energy per spin for five randomly generated mixed-spin optimization problems generated from a distribution with (and all higher-order terms zero). We see that as we increase , the energy landscape of any given problem instance quickly approaches the landscape (Fig. 3f), as predicted.
To put Eq. 6 in perspective we explicitly calculate what it implies for two basic cases and and compare it with the literature on mixed-spin models.
The standard Sherrington-Kirkpatrick model has , , and for all other . In this case our Eq. 6 tells us that
(12) |
which agrees with previous work in [2] and [8]. This expectation is minimized by the parameters and for which . This is the same result we found numerically for in Fig. 2.
For the more general case we will briefly review the notation used by the articles [5, 4, 18, 17]. In the current paper the mixed-spin model is described by the Hamiltonian
(13) |
with and for a fixed the summation has terms. In [5] and elsewhere a different notation is used where the Hamiltonian is defined by
(14) |
with . Crucially, now the indices can have repeated values and permutations are treated distinctly, hence for each the summation involves terms.
In the limit of large one can show that the -plets with repeated values can be ignored as their relative contribution decreases as a function . For distinct indices there are permutations to consider in , hence we have a sum of standard normal distributions: , which is identical to a single normal distribution with variance . As a result we re-express as
(15) |
with . Thus when we have that and describe the same model. It is common to capture the different coefficients by the mixture function , such that the standard SK model has (i.e. and hence ).
For the -body model , [4] analyzes the mixture , such that . It was shown that the expected ground state energy-per-spin of this model equals In the notation of the current article this model is equivalent to and with this our Eq. 6 gives the expectation
(16) |
This expectation is minimized to by the angles , which are the solutions to
(17) | ||||
(18) | ||||
(19) |
This is the same result we found numerically for in Fig. 2. We thus see how depth qaoa can approximate the ground state energy by a factor of . This approximation factor had been reported earlier by Zhou et al.[19]
4 Derivation of Main Result
The proof in this section follows to large extent the framework of the earlier result by Farhi et al. [2], which relied on manipulating the moment generating function to extract expressions for the first and second moments of . We use their method of simplifying the moment generating function, and their reorganization of the sum over -strings into a sum over sketches (see Section 4.2). We extend their proof technique by generalizing their form of the moment generating function to higher-spin models and demonstrating that it can still be written as a sum over sketches, developing careful power-counting methods to allow us to extract the relevant terms in the limit, and deriving identities that allow us to explicitly evaluate the relevant sums.
In our proof, we will use the following conventions:
-
•
A -basis state is specified by a string of s. We will use the shorthand for this.
-
•
The XOR of two bitstrings and is given by the componentwise product .
-
•
For a set , we denote the product of the bits in as ; with this convention we thus have .
-
•
The uniform superposition over all strings is denoted by
(20)
This is in contrast to the usual convention in quantum information, in which a -basis state is specified by a string of s and s and the XOR of two strings is given by componentwise addition modulo 2. We choose our notation to be consistent with [2] and to simplify certain expressions in our derivation.
4.1 Moment Generating Function
Following [2], to evaluate and we use the moment-generating function . From the moment-generating function, we can find the moments via
(21) |
We can simplify the expectation inside the moment-generating function by inserting three complete sets of states:
(22) |
where in the last step we used that and made the replacements and .
4.2 Expectation when Couplings are Normal Distributed Variables
We will now treat the couplings as a random variable and consider the expectation of the energy. We assume that the distribution is symmetric with respect to , such that we can replace by to get
(23) |
When we further assume that the variables are independent between different we can continue by
(24) |
Next we assume that the are normally distributed with a standard deviation that is the same for all sets of the same size, that is . We note that taking the expectation value of a Gaussian random variable with standard deviation gives
(25) |
so that our overall expression becomes
(26) |
To do the sum over , we claim that the summand does not depend on all spin values of and . Instead, it is only a function of the four integer values , where is defined to be the number of positions with . Note that only three of these variables are actually independent, as we always have . As these numbers summarize the crucial information of the strings, we will refer to as the sketch of . Writing the summand in terms of the sketch rather than was introduced in [2]; here we establish that we can still write the summand in terms of the sketch for the mixed-spin SK model. To start, it is straightforward to verify that
(27) |
We can also write explicit combinatorial formulas for the sums in the exponential:
(28) | ||||
(29) |
Therefore, the summand indeed depends only on and the number of ways to assign the positions into four groups of these sizes is the multinomial . To condense our notation we will use to denote the set of sketches , allowing us to use the shorthand
(30) |
Note that this summation has terms. We thus have
(31) |
Eq. 31 is the form of the moment-generating function we will use to evaluate and .
4.3 General Form of Moments
Using Eq. 21 combined with the form of the moment-generating function given in Eq. 31, we can write the first moment as:
(32) |
and the second moment as:
(33) |
The explicit expression for (Eq. 28) shows that is a degree- polynomial in the variables , , and , so we can expand as
(34) |
where the are constants independent of . In terms of this expansion, we have
(35) |
and
(36) |
Ref. [2] could explicitly evaluate these terms for the small values of and relevant for the two-body SK model, using concise expressions for and . However, to get explicit formulas beyond requires carefully counting powers of to establish which terms survive in the limit, and using the general expressions for and (Eqs. 28 and 29) to derive explicit forms of the leading-order terms. To tame this sum our derivation thus uses techniques that go beyond a simple generalization of [2].
4.4 Evaluating Sums Over the Sketches
We see that evaluating both moments reduces to repeatedly evaluating terms of the form
(37) |
with playing the role of in Eq. 35. For a fixed , note that only depends on (Eq. 29) and with this symmetry we define the single variable function and use with . The remaining two degrees of freedom are further reorganized by summing over using , and over using . We thus obtain
(38) |
where we have defined and to be analyzed next in the limit of large .
4.5 Large Limit of , , and
As we plan to take the limit , we will use big notation in to keep track of the significant terms in and for constant , , and . The summations for and can be expressed exactly via the following identity (which is a generalization of the identities given in [2] for the case of and which can be proven by induction on ):
(39) |
For , we thus have
(40) |
Using the operator equality we see that expanding gives a sum of differential operators with . Because (Eq. 27) we see that only the operators with give nonzero contributions to . For we therefore have . In the particular case of , the only terms in that give a nonzero result are , hence
(41) |
For it will be enough to observe that the order of is independent of . Summarizing, we have
(42) |
For , we use
(43) |
The highest-order terms in come from the terms in and thus
(44) |
4.6 Large properties of
To complete the evaluation of the moments, we to determine the large dependency of , its coefficients (Eq. 34) and their role in the first and second moment expressions of Eqs. 35 and 36. We see from Eq. 49 that the only coefficients that matter in the limit are those with , i.e. those with . We can evaluate these from the explicit formula for (Eq. 28) by keeping only terms of total degree in the terms , , and . This gives
(50) |
Note that there are no terms with and in . Comparing Eq. 50 with the definition of in Eq. 34, we see for after some algebra that
(51) |
Having established these properties of and , we now have sufficient information to evaluate our moments.
4.7 Evaluating the First Moment
4.8 Evaluating the Second Moment
4.9 Evaluating Higher Moments
The proof technique used above also applies to higher moments. When computing the th moment by taking derivatives of the moment-generating function according to Eq. 21, the only terms that survive in the limit are the terms where all derivatives hit the term in the exponential, so that the expression for the moment becomes
(54) |
Noting from Eq. 49 that
(55) |
we can write the th moment as
(56) |
Thus, we find that for all ,
(57) |
4.10 Alternative expression for
Here, we present an alternative expression for that makes contact with the notation used in [5] and elsewhere (see Section 3.2). The current paper’s central result (Eq. 6) reads:
(58) |
If we use the substitution from Section 3.2, , this becomes
(59) |
Using the identity
(60) |
we can rewrite this as
(61) |
Finally, using the definition and hence , we get
(62) |
which is precisely what we claimed in Eq. 7. For a pure -spin model with only the above further simplifies to
(63) |
5 Discussion and Conclusion
In this work, we have derived explicit formulas to quantify the performance of qaoa on mixed-spin models in the limit. We demonstrated both concentration of measure and instance independence for arbitrary mixed-spin models, which imply that the expectation value of the energy per spin is independent of the specific model specification and that measurements of the qaoa state are guaranteed to give energies close to the expectation value. Our explicit formula for the expectation value of the energy for arbitrary mixed-spin models allows us to find the optimal angles on a classical computer.
There are two obvious open questions raised by this work. First, the approach of this paper can probably be combined with the methods of [2] to generalize our work to depth qaoa. Such a result would allow one to prove instance independence and concentration of measure, and derive a computer algorithm to generate formulas for the expectation value per spin, at arbitrary depth . This is a particularly interesting route of research, since it is known that in the cubic case with , Montanari’s classical algorithm does not approach the optimal solution [5], so that at sufficient depth the qaoa has a chance of outperforming the best known classical algorithm. Higher-spin models with may even provide a more direct route towards finding a setting where qaoa at depth outperforms classical optimization such as Montanari’s algorithm [4, 5]. While the generalization to higher is likely possible, it is a nontrivial extension of this paper, and we leave it for future work.
Second, it remains an open question what to what extent results on the random models can be used to find optimal angles for realistic binary optimization problems. One hypothetical approach to finding qaoa angles for a single instance of an -spin optimization problem would be the procedure:
-
1.
For all , compute the standard deviations of the -spin couplings in the problem.
-
2.
Use Eq. 6 for the mixed-spin model as an estimate for the expectation value of the energy per spin.
-
3.
Run the qaoa at the optimal angles for the corresponding mixed-spin model (or use these angles as starting points for a numerical optimization of the angles)
This procedure assumes that the means of the couplings are zero: for all . If this method can be shown to work, it would greatly simplify finding qaoa angles for general optimization problems.
References
- Farhi et al. [2014] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028, 2014. URL https://arxiv.org/abs/1411.4028.
- Farhi et al. [2019] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Leo Zhou. The quantum approximate optimization algorithm and the Sherrington-Kirkpatrick model at infinite size. arXiv preprint arXiv:1910.08187, 2019. URL https://arxiv.org/abs/1910.08187.
- Montanari [2021] Andrea Montanari. Optimization of the Sherrington–Kirkpatrick hamiltonian. SIAM Journal on Computing, 0(0):FOCS19–1, 2021. doi: 10.1137/20M132016X.
- Alaoui et al. [2020] Ahmed El Alaoui, Andrea Montanari, and Mark Sellke. Optimization of mean-field spin glasses. arXiv preprint arXiv:2001.00904, 2020. URL https://arxiv.org/abs/2001.00904.
- Alaoui and Montanari [2020] Ahmed El Alaoui and Andrea Montanari. Algorithmic thresholds in mean field spin glasses. arXiv preprint arXiv:2009.11481, 2020. URL https://arxiv.org/abs/2009.11481.
- Jiang et al. [2017] Zhang Jiang, Eleanor G Rieffel, and Zhihui Wang. Near-optimal quantum circuit for Grover’s unstructured search using a transverse field. Physical Review A, 95(6):062317, 2017. doi: 10.1103/PhysRevA.95.062317.
- Wang et al. [2018] Zhihui Wang, Stuart Hadfield, Zhang Jiang, and Eleanor G Rieffel. Quantum approximate optimization algorithm for maxcut: A fermionic view. Physical Review A, 97(2):022304, 2018. doi: 10.1103/PhysRevA.97.022304.
- Ozaeta et al. [2020] Asier Ozaeta, Wim van Dam, and Peter L. McMahon. Expectation values from the single-layer quantum approximate optimization algorithm on Ising problems. arXiv preprint arXiv:2012.03421, 2020. URL https://arxiv.org/abs/2012.03421.
- Crooks [2018] Gavin E Crooks. Performance of the quantum approximate optimization algorithm on the maximum cut problem. arXiv preprint arXiv:1811.08419, 2018. URL https://arxiv.org/abs/1811.08419.
- Zhou et al. [2020] Leo Zhou, Sheng-Tao Wang, Soonwon Choi, Hannes Pichler, and Mikhail D Lukin. Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices. Physical Review X, 10(2):021067, 2020. doi: 10.1103/PhysRevX.10.021067.
- Brandao et al. [2018] Fernando GSL Brandao, Michael Broughton, Edward Farhi, Sam Gutmann, and Hartmut Neven. For fixed control parameters the quantum approximate optimization algorithm’s objective function value concentrates for typical instances. arXiv preprint arXiv:1812.04170, 2018. URL https://arxiv.org/abs/1812.04170.
- Sherrington and Kirkpatrick [1975] David Sherrington and Scott Kirkpatrick. Solvable model of a spin-glass. Physical Review Letters, 35(26):1792, 1975. doi: 10.1103/PhysRevLett.35.1792.
- Parisi [1980] Giorgio Parisi. A sequence of approximated solutions to the SK model for spin glasses. Journal of Physics A: Mathematical and General, 13(4):L115, 1980. doi: 10.1088/0305-4470/13/4/009.
- Talagrand [2006] Michel Talagrand. The Parisi formula. Annals of mathematics, pages 221–263, 2006. doi: 10.4007/annals.2006.163.221.
- Panchenko [2012] Dmitry Panchenko. The Sherrington-Kirkpatrick model: an overview. Journal of Statistical Physics, 149(2):362–383, 2012. doi: 10.1007/s10955-012-0586-7.
- Panchenko [2013] Dmitry Panchenko. The Sherrington-Kirkpatrick model. Springer Science & Business Media, 2013. doi: 10.1007/978-1-4614-6289-7.
- Auffinger et al. [2017] Antonio Auffinger, Wei-Kuo Chen, et al. Parisi formula for the ground state energy in the mixed -spin model. Annals of Probability, 45(6B):4617–4631, 2017. doi: 10.1214/16-AOP1173.
- Panchenko et al. [2014] Dmitry Panchenko et al. The Parisi formula for mixed -spin models. Annals of Probability, 42(3):946–958, 2014. doi: 10.1214/12-AOP800.
- Zhou et al. [2021] Leo Zhou, Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. The quantum approximate optimization algorithm and the Sherrington-Kirkpatrick model at infinite size. Talk presented at QIP, 2021. URL https://youtu.be/UP-Zuke7IUg.