This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Instance Independence of Single Layer Quantum Approximate Optimization Algorithm on Mixed-Spin Models at Infinite Size

Jahan Claes QC Ware Corporation, Palo Alto, CA USA Department of Physics and Institute for Condensed Matter Theory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA [email protected]    Wim van Dam QC Ware Corporation, Palo Alto, CA USA Department of Computer Science, University of California, Santa Barbara, CA USA Department of Physics, University of California, Santa Barbara, CA USA
Abstract

This paper studies the application of the Quantum Approximate Optimization Algorithm (qaoa) to spin-glass models with random multi-body couplings in the limit of a large number of spins. We show that for such mixed-spin models the performance of depth 11 qaoa is independent of the specific instance in the limit of infinite sized systems and we give an explicit formula for the expected performance. We also give explicit expressions for the higher moments of the expected energy, thereby proving that the expected performance of qaoa concentrates.

1 Summary

The Quantum Approximate Optimization Algorithm (qaoa) is a variational quantum algorithm designed to give approximate solutions to unconstrained binary optimization problems [1]. While qaoa can be proven to give the optimal answer in the limit where the number of qaoa layers pp goes to infinity, rigorous results on the performance of qaoa with finite pp are difficult to obtain. In a recent paper, Farhi et al. [2] studied the application of the qaoa to the Sherrington-Kirkpatrick (SK) model, a spin-glass model with random all-to-all two-body couplings, in the limit of a large number of spins. Their paper demonstrated that for fixed pp, the performance of the qaoa is independent of the specific instance of the SK model and can be predicted by explicit formulas. The paper also showed that the approximation ratio of the qaoa at p=11p=11 outperforms a large class of classical optimization algorithms (although not the best classical algorithm [3]). In the current paper, we generalize the result of Farhi et al. to mixed-spin SK models, which extends the two-body couplings of standard SK to random all-to-all qq-body couplings. We demonstrate that for p=1p=1, the performance of the qaoa is again independent of the specific instance, and we provide an explicit formula for the expected performance. Our work provides a potential avenue to demonstrating the advantage of qaoa over classical algorithms, as the best known classical algorithms for mixed-spin SK models have an approximation ratio that is bounded away from 11 [4, 5].

2 Preliminaries and Notation

The Quantum Approximate Optimization Algorithm (qaoa) [1] is a heuristic quantum algorithm for binary optimization. Given a cost function of nn binary variables (spins) H(z1,,zn)H(z_{1},\dots,z_{n}), qaoa seeks to produce a string z:=(z1,,zn)z:=(z_{1},\dots,z_{n}) close to the minimum of HH. Commonly we view HH as a Hamiltonian operator that is diagonal in the ZZ-basis. A depth-pp qaoa circuit then consists of pp repetitions of alternatively applying the Hamiltonian HH and the mixing Hamiltonian B=X1++XnB=X_{1}+\cdots+X_{n} to a uniform superposition as initial state, which is the product of +X+X single particle eigenstates. Explicitly, the depth-pp qaoa state is given by

|β1,,βp;γ1,,γp:=eiβpBeiγpHeiβ1Beiγ1H12nz{±1}n|z.|\beta_{1},\dots,\beta_{p};\gamma_{1},\dots,\gamma_{p}\rangle:=e^{-i\beta_{p}B}e^{-i\gamma_{p}H}\cdots e^{-i\beta_{1}B}e^{-i\gamma_{1}H}\cdot\frac{1}{\sqrt{2^{n}}}\sum_{z\in\{\pm 1\}^{n}}|z\rangle. (1)

A depth-pp qaoa circuit is parameterized by the 2p2p angles {γi},{βi}\{\gamma_{i}\},\{\beta_{i}\}. For a given problem, these angles should be optimized so that measuring in the ZZ-basis gives strings that make HH as small as possible. In practice, this is typically done by minimizing the expectation value of the energy

H:=β1,,βp;γ1,,γp|H|β1,,βp;γ1,,γp.\langle H\rangle:=\langle\beta_{1},\dots,\beta_{p};\gamma_{1},\dots,\gamma_{p}|H|\beta_{1},\dots,\beta_{p};\gamma_{1},\dots,\gamma_{p}\rangle. (2)

For some problems, this minimization may be done analytically on a classical computer [1, 6, 7, 8]. Otherwise, the minimization can be performed by running the qaoa on a quantum computer repeatedly for a fixed set of angles, estimating the expectation value, and updating the angles according to classical minimization algorithms [1, 9, 10]. We note that minimizing the energy expectation value is only one possible definition of “best” angles; in general minimizing the expectation value and maximizing the probability of finding the optimal zz (or maximizing the probability of H(z1,,zn)H(z_{1},\dots,z_{n}) falling below a certain threshold) do not coincide.

It was recently demonstrated in [11] that in local optimization problems with cost functions drawn from from realistic random distributions (e.g. MaxCut on random 3-regular graphs), the expectation value per spin is instance independent as nn\rightarrow\infty. That is, for fixed angles {γi}\{\gamma_{i}\} and {βi}\{\beta_{i}\} H/n\langle H/n\rangle is the same for all problem instances. This implies that the angles {γi}\{\gamma_{i}\} and {βi}\{\beta_{i}\} do not need to be optimized for every problem instance, but can be optimized once and reused for every problem drawn from the same distribution. The methods of [11] can also be used to derive concentration of measure results for local optimization problems. (While [11] did not explicitly address concentration of measure, it can be easily derived from their methodology.) That is, in the limit as nn\rightarrow\infty, the variance in the energy per spin goes to zero:

(H/n)2(H/n)20\langle(H/n)^{2}\rangle-\langle(H/n)\rangle^{2}\rightarrow 0 (3)

Concentration of measure means that for large nn, every measurement of a qaoa state in the ZZ-basis gives strings with the same energy per spin. In total, instance independence implies the qaoa angles do not need to be optimized from instance to instance in order to minimize the expectation value, and concentration of measure implies that expectation value is the correct measure of the “best" angles.

While instance independence (and concentration of measure) were initially derived for local cost functions, similar results have also been derived for the Sherrington-Kirkpatrick (SK) model [12], a physics-inspired optimization problem with cost function

H=1j<knJjkZjZkH=\sum_{1\leq j<k\leq n}J_{jk}Z_{j}Z_{k} (4)

with the JjkJ_{jk} are independently drawn from a Gaussian distribution with mean 0. The SK model is the “most nonlocal" two-body cost function, and serves as a model for realistic nonlocal two-body optimization problems. In the limit nn\rightarrow\infty, the ground state energy per spin is known to be independent of the instance and can be computed explicitly [13, 14]. Recently, Montanari derived a classical algorithm that produces strings zz with energy within (1ϵ)(1-\epsilon) times the optimum; this method has a complexity of C(ϵ)n2C(\epsilon)n^{2}, where C(ϵ)C(\epsilon) is a polynomial in 1/ϵ1/\epsilon [3].

Ref. [2] proved both instance independence and concentration of measure for depth-pp qaoa applied to the SK model. In addition, [2] provided an explicit formula for H/n\langle H/n\rangle in the limit nn\rightarrow\infty for p=1p=1, and provided a computer algorithm to generate H/n\langle H/n\rangle for any fixed depth p>1p>1. Therefore, the qaoa angles for the SK model can be chosen on a classical computer, and there are fixed performance guarantees in the limit nn\rightarrow\infty. Ref. [2] demonstrated that at p=11p=11, qaoa outperforms semidefinite programming for the SK model, but could not show that the qaoa matches the performance of the Montanari algorithm.

In this work, we study a generalization of the SK model, the mixed-spin SK model, that allows for polynomials of degree dd in the binary variables instead of only degree-22 terms [15, 16]. This can serve as a model for nonlocal optimization problems with higher-order terms. The mixed-spin SK model is also known to have a ground state energy per spin that is independent of the instance and can be computed explicitly [15, 17, 16, 18]. For a mixed-spin SK model with degree d=3d=3, the generalization of the Montanari algorithm [4] approaches a fixed approximation ratio of (0.9843±0.0003)\sim(0.9843\pm 0.0003) times the optimum value, rather than the optimal value [5]. Thus, the mixed-spin SK model is a potential avenue for establishing the advantage of the qaoa over classical algorithms.

In this paper, we generalize the work of [2] to prove instance independence and concentration of measure for p=1p=1 qaoa applied to mixed-spin SK models. As part of our work, we derive an explicit formula for H/n\langle H/n\rangle in the limit nn\rightarrow\infty, implying that the qaoa angles for the mixed-spin SK models can be chosen on a classical computer. Our work can likely be generalized to depth p1p\geq 1 using the same methods as [2].

3 Current Results and Prior Work

3.1 Our Results on Mixed-Spin Sherrington-Kirkpatrick Model

The mixed-spin SK model (often called the mixed pp-spin model, although we will not use this terminology to avoid confusion with the pp of qaoa) is given by [15, 16]

H=j=1nJjZj+1n1j<knJjkZjZk+1n1j<k<nJjkZjZkZ++1nd121j1<jdnJj1jdZj1Zjd=q=1dn1q2S{1,,n}|S|=qJSZS\displaystyle\begin{split}H&=\sum_{j=1}^{n}J_{j}Z_{j}+\frac{1}{\sqrt{n}}\sum_{1\leq j<k\leq n}J_{jk}Z_{j}Z_{k}+\frac{1}{n}\sum_{1\leq j<k<\ell\leq n}J_{jk\ell}Z_{j}Z_{k}Z_{\ell}+\cdots+\frac{1}{n^{\frac{d-1}{2}}}\sum_{1\leq j_{1}<\cdots j_{d}\leq n}J_{j_{1}\cdots j_{d}}Z_{j_{1}}\cdots Z_{j_{d}}\\ &=\sum_{q=1}^{d}n^{\frac{1-q}{2}}\sum_{\begin{subarray}{c}S\subseteq\{1,\dots,n\}\\ |S|=q\end{subarray}}J_{S}Z_{S}\end{split} (5)

where in the last line we denote the product iSZi\prod_{i\in S}Z_{i} by ZSZ_{S}. We assume each JSJ_{S} is sampled independently from a Gaussian distribution N(0,σ|S|2)N(0,\sigma^{2}_{|S|}) that only depends on |S||S|, and we will let σq\sigma_{q} be the standard deviation of the coupling constants JSJ_{S} with |S|=q|S|=q, for q=1,,dq=1,\dots,d.

The ground state of the mixed-spin SK Hamiltonian is known to have a fixed energy per spin in the limit nn\rightarrow\infty [18]. In fact we can also allow arbitrarily high orders in the mixed-spin SK model (d=d=\infty), provided the variances decrease quickly enough to make q2qσq2\sum_{q}2^{q}\sigma_{q}^{2} finite, and the ground state model will still have a fixed energy per spin as nn\rightarrow\infty [18]. However, for simplicity we consider some finite bound dd on the degree.

Our main result is as follows. Define the nn-spin model by Eq. 5 with JSN(0,σ|S|2)J_{S}\sim N(0,\sigma^{2}_{|S|}). Then, using depth 11 qaoa with angles β,γ\beta,\gamma, the expectation of the energy per spin equals in the large nn limit is given by

limn𝔼JN[H/n]\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right] =2γq=1dσq2a oddaq(1)a12a!(qa)!exp(2γ2aq=1dσq2(q1)!)sina(2β)cosqa(2β),.\displaystyle=2\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq q\end{subarray}}\frac{(-1)^{\frac{a-1}{2}}}{a!(q-a)!}\exp\left(-2\gamma^{2}a\sum_{q^{\prime}=1}^{d}\frac{\sigma_{q^{\prime}}^{2}}{(q^{\prime}-1)!}\right)\sin^{a}(2\beta)\cos^{q-a}(2\beta),. (6)

If we define the variables cq:=σq/q!c_{q}:=\sigma_{q}/\sqrt{q!} and the polynomial ξ(x):=q=1dcq2xq\xi(x):=\sum_{q=1}^{d}c_{q}^{2}x^{q}, as is common in discussions of the mixed-spin model (see Section 3.2), we can also write this as

limn𝔼JN[H/n]=2γ[ξ(cos(2β)+isin(2β)exp(2γ2ξ(1)))].\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]=2\gamma\Im\left[\xi\Big{(}\cos(2\beta)+i\sin(2\beta)\exp\left({-2\gamma^{2}\xi^{\prime}(1)}\right)\Big{)}\right]. (7)

Furthermore, the second moment of the energy per spin equals

limn𝔼JN[(H/n)2]\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{2}\rangle\right] =limn𝔼JN[H/n]2.\displaystyle=\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]^{2}. (8)

Our second result, Eq. 8, allows us to prove that p=1p=1 qaoa applied to the mixed-spin SK model has both concentration of measure and instance independence. To see this, we note that we may write

𝔼JN[(H/n)2]𝔼JN[H/n]2\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{2}\rangle\right]-\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]^{2} =(𝔼JN[(H/n)2H/n2])+(𝔼JN[H/n2]𝔼JN[H/n]2)\displaystyle=\Big{(}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{2}\rangle-\langle H/n\rangle^{2}\right]\Big{)}+\Big{(}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle^{2}\right]-\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]^{2}\Big{)} (9)

When applying qaoa to mixed-spin SK models, the measured value of (H/n)(H/n) varies for two reason. First, it varies because the bonds JSJ_{S} vary from instance to instance (the 𝔼J[]\operatorname*{\mathbb{E}}_{J}\left[\cdot\right] expectation). Second, it varies because the qaoa state |γ,β|\gamma,\beta\rangle is not an eigenstate of the ZZ-operators, so that the measurement outcomes have randomness even for fixed JSJ_{S} (the \langle\cdot\rangle expectation). The left hand side of Eq. 9 represents the total variance in (H/n)(H/n) due to both sources of randomness. The right hand side demonstrates that the total variance can be decomposed into two terms. The first term is the average over JSJ_{S} of the variance due only to the measurement randomness. The second term is the variance in the expected value H/n\langle H/n\rangle due to the randomness in the bonds JSJ_{S}. Since both of these terms are non-negative, they must both tend to zero as nn\rightarrow\infty as well.

The fact that the first variance approaches zero gives us concentration of measure: it says that for typical couplings JSJ_{S} the variance in the measurement outcomes vanishes, and thus we always measure a string with energy per spin equal to the expectation value (note that the term inside the 𝔼[]\operatorname*{\mathbb{E}}\left[\cdot\right] is always positive, so that for the average over JSJ_{S} to go to zero the magnitude of the typical value must also go to zero). The second term approaching zero clearly gives instance independence of the expectation value, since it shows that the variance in the expectation value due to different couplings vanishes.

Finally, we note that the methods we use also suffice to derive a formula for all higher moments of the energy per spin

limn𝔼JN[(H/n)m]\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{m}\rangle\right] =limn𝔼JN[H/n]m,\displaystyle=\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]^{m}, (10)

although unlike the m=2m=2 result, the m>2m>2 result does not have any obvious implication for the performance of the qaoa.

3.2 Numerical Results and Relation to Prior Results

As an example of using our Eqs. 6 and 7, we can compute the optimal angles for pure dd-spin models given by σq=δd,qd!/2\sigma_{q}=\delta_{d,q}\sqrt{d!/2}. In this case, our central equation reduces to

limn𝔼JN[H/n]=γd!a oddad(1)a12a!(da)!exp(2γ2ad)sina(2β)cosda(2β)=γ[(cos(2β)+isin(2β)exp(2γ2d))d].\displaystyle\begin{split}\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]&=\gamma d!\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq d\end{subarray}}\frac{(-1)^{\frac{a-1}{2}}}{a!(d-a)!}\exp\left(-2\gamma^{2}ad\right)\sin^{a}(2\beta)\cos^{d-a}(2\beta)\\ &=\gamma\Im\left[\Big{(}\cos(2\beta)+i\sin(2\beta)\exp\left({-2\gamma^{2}d}\right)\Big{)}^{d}\right].\end{split} (11)

Examples of this energy landscape for small dd are plotted in Fig. 1, and numerically optimized angles and corresponding energy per spin are plotted in Fig. 2.

Refer to caption
Figure 1: The energy landscapes for the pure dd-spin models with σq=δd,qd!/2\sigma_{q}=\delta_{d,q}\sqrt{d!/2}, in the limit as nn\rightarrow\infty.
Refer to caption
Figure 2: The optimal angles (β,γ)(\beta_{*},\gamma_{*}) for the pure dd-spin models with σq=δd,qd!/2\sigma_{q}=\delta_{d,q}\sqrt{d!/2} that minimize the energy per spin 𝔼[H/n]\mathbb{E}\left[\langle H/n\rangle\right] in the limit nn\rightarrow\infty.
Refer to caption
Figure 3: (a-e) The energy landscape of random instances of a mixed-spin optimization problem with (σ1,σ2,σ3)=(1/3,1/2,1)(\sigma_{1},\sigma_{2},\sigma_{3})=(1/3,1/2,1), at varying problem sizes nn. (f) The predicted energy landscape in the limit nn\rightarrow\infty. We see the energy landscape for random instances rapidly converges to the nn\rightarrow\infty landscape, demonstrating instance independence in this limit.

As another example application of our formula, we explore the instance independence at finite nn to demonstrate convergence as nn\rightarrow\infty. In Fig. 3a-e, we plot the expected energy per spin H/n\langle H/n\rangle for five randomly generated mixed-spin optimization problems generated from a distribution with (σ1,σ2,σ3)=(1/3,1/2,1)(\sigma_{1},\sigma_{2},\sigma_{3})=(1/3,1/2,1) (and all higher-order terms zero). We see that as we increase nn, the energy landscape of any given problem instance quickly approaches the nn\rightarrow\infty landscape (Fig. 3f), as predicted.

To put Eq. 6 in perspective we explicitly calculate what it implies for two basic cases q=2q=2 and q=3q=3 and compare it with the literature on mixed-spin models.

The standard Sherrington-Kirkpatrick model has q=2q=2, σ2=1\sigma_{2}=1, and σq=0\sigma_{q}=0 for all other q2q\neq 2. In this case our Eq. 6 tells us that

𝔼JN[HSK/n]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H_{SK}/n\rangle\right] =2γe2γ2sin(2β)cos(2β)=γe2γ2sin(4β),\displaystyle=2\gamma e^{-2\gamma^{2}}\sin(2\beta)\cos(2\beta)=\gamma e^{-2\gamma^{2}}\sin(4\beta), (12)

which agrees with previous work in [2] and [8]. This expectation is minimized by the parameters β=±π/8±0.392699\beta_{*}=\pm\pi/8\approx\pm 0.392699 and γ=0.5\gamma_{*}=\mp 0.5 for which 𝔼[HSK/n]=1/4e0.303265\operatorname*{\mathbb{E}}\left[\langle H_{SK}/n\rangle\right]=-1/\sqrt{4e}\approx-0.303265. This is the same result we found numerically for d=2d=2 in Fig. 2.

For the more general case q2q\geq 2 we will briefly review the notation used by the articles [5, 4, 18, 17]. In the current paper the mixed-spin model is described by the Hamiltonian

H\displaystyle H =q=1d1nq11j1<<jqnJj1jqZj1Zjq,\displaystyle=\sum_{q=1}^{d}\frac{1}{\sqrt{n^{q-1}}}\sum_{1\leq j_{1}<\cdots<j_{q}\leq n}{J_{j_{1}\cdots j_{q}}Z_{j_{1}}\cdots Z_{j_{q}}}, (13)

with Jj1jqN(0,σq2)J_{j_{1}\cdots j_{q}}\sim N(0,\sigma_{q}^{2}) and for a fixed qq the summation has (nq)\binom{n}{q} terms. In [5] and elsewhere a different notation is used where the Hamiltonian is defined by

H\displaystyle H^{\prime} =q=1dcqnq11j1,,jqnJj1jqZj1Zjq,\displaystyle=\sum_{q=1}^{d}\frac{c_{q}}{\sqrt{n^{q-1}}}\sum_{1\leq j_{1},\dots,j_{q}\leq n}{J_{j_{1}\cdots j_{q}}Z_{j_{1}}\cdots Z_{j_{q}}}, (14)

with Jj1jqN(0,1)J_{j_{1}\cdots j_{q}}\sim N(0,1). Crucially, now the indices j1,,jqj_{1},\dots,j_{q} can have repeated values and permutations are treated distinctly, hence for each qq the summation involves nqn^{q} terms.

In the limit of large nn one can show that the qq-plets (j1,jq)(j_{1},\dots j_{q}) with repeated values can be ignored as their relative contribution decreases as a function nn. For distinct indices there are q!q! permutations to consider in HH^{\prime}, hence we have a sum of q!q! standard normal distributions: Jj1jq++Jjqj1J^{\prime}_{j_{1}\dots j_{q}}+\cdots+J^{\prime}_{j_{q}\cdots j_{1}}, which is identical to a single normal distribution with variance q!q!. As a result we re-express HH^{\prime} as

H\displaystyle H^{\prime} =q=1dcqq!nq11j1<<jqnJj1jqZj1Zjq,\displaystyle=\sum_{q=1}^{d}\frac{c_{q}\sqrt{q!}}{\sqrt{n^{q-1}}}\sum_{1\leq j_{1}<\dots<j_{q}\leq n}{J_{j_{1}\cdots j_{q}}Z_{j_{1}}\cdots Z_{j_{q}}}, (15)

with JN(0,1)J\sim N(0,1). Thus when σq=cqq!\sigma_{q}=c_{q}\sqrt{q!} we have that HH and HH^{\prime} describe the same model. It is common to capture the different cqc_{q} coefficients by the mixture function ξ(x)=qcq2xq\xi(x)=\sum_{q}c_{q}^{2}x^{q}, such that the standard SK model has ξ(x)=x2/2\xi(x)=x^{2}/2 (i.e. c2=1/2c_{2}=1/\sqrt{2} and hence σ2=1\sigma_{2}=1).

For the 33-body model H3H_{3}, [4] analyzes the mixture ξ(x)=c32x3=x3/2\xi(x)=c_{3}^{2}x^{3}=x^{3}/2, such that c3=1/2c_{3}=1/\sqrt{2}. It was shown that the expected ground state energy-per-spin of this model equals 0.8132±0.0001-0.8132\pm 0.0001 In the notation of the current article this model is equivalent to σ3=3\sigma_{3}=\sqrt{3} and with this our Eq. 6 gives the expectation

𝔼JN[H3/n]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H_{3}/n\rangle\right] =3γe3γ2sin(2β)cos2(2β)γe9γ2sin3(2β).\displaystyle=3\gamma e^{-3\gamma^{2}}\sin(2\beta)\cos^{2}(2\beta)-\gamma e^{-9\gamma^{2}}\sin^{3}(2\beta). (16)

This expectation is minimized to 𝔼J[H3/n]0.270638\operatorname*{\mathbb{E}}_{J}[\langle H_{3}/n\rangle]\approx-0.270638 by the angles (β,γ)(±0.290003,0.430091)(\beta_{*},\gamma_{*})\approx(\pm 0.290003,\mp 0.430091), which are the solutions to

β\displaystyle\beta_{*} =14arccos(119γ2)\displaystyle=-\frac{1}{4}\arccos\left(1-\frac{1}{9\gamma_{*}^{2}}\right) (17)
exp(6γ2)\displaystyle\exp(-6\gamma_{*}^{2}) =18γ23\displaystyle=18\gamma_{*}^{2}-3 (18)
𝔼J[H3/n]\displaystyle\operatorname*{\mathbb{E}}_{J}[\langle H_{3}/n\rangle] =4γ223.\displaystyle=\sqrt{4\gamma_{*}^{2}-\frac{2}{3}}. (19)

This is the same result we found numerically for d=3d=3 in Fig. 2. We thus see how depth p=1p=1 qaoa can approximate the ground state energy by a factor of 0.3328060.332806. This 0.330.33 approximation factor had been reported earlier by Zhou et al.[19]

4 Derivation of Main Result

The proof in this section follows to large extent the framework of the earlier result by Farhi et al. [2], which relied on manipulating the moment generating function 𝔼J[eiλH/n]\operatorname*{\mathbb{E}}_{J}\left[\langle e^{i\lambda H/n}\rangle\right] to extract expressions for the first and second moments of (H/n)(H/n). We use their method of simplifying the moment generating function, and their reorganization of the sum over zz-strings into a sum over sketches (see Section 4.2). We extend their proof technique by generalizing their form of the moment generating function to higher-spin models and demonstrating that it can still be written as a sum over sketches, developing careful power-counting methods to allow us to extract the relevant terms in the nn\rightarrow\infty limit, and deriving identities that allow us to explicitly evaluate the relevant sums.

In our proof, we will use the following conventions:

  • A ZZ-basis state |z|z\rangle is specified by a string z=(z1,z2,,zn)z=(z_{1},z_{2},\dots,z_{n}) of ±1\pm 1s. We will use the shorthand z{±1}nz\in\{\pm 1\}^{n} for this.

  • The XOR of two bitstrings zz and zz^{\prime} is given by the componentwise product zz:=(z1z1,z2z2,,znzn){±1}nzz^{\prime}:=(z_{1}z^{\prime}_{1},z_{2}z^{\prime}_{2},\dots,z_{n}z_{n}^{\prime})\in\{\pm 1\}^{n}.

  • For a set S{1,,n}S\subseteq\{1,\dots,n\}, we denote the product of the bits in SS as iSzi=:zS\prod_{i\in S}z_{i}=:z_{S}; with this convention we thus have ZS|z=zS|zZ_{S}|z\rangle=z_{S}|z\rangle.

  • The uniform superposition over all strings zz is denoted by

    |Σ\displaystyle|\Sigma\rangle :=12nz{±1}n|z.\displaystyle:=\frac{1}{\sqrt{2^{n}}}\sum_{z\in\{\pm 1\}^{n}}|z\rangle. (20)

This is in contrast to the usual convention in quantum information, in which a ZZ-basis state is specified by a string zz of 0s and 11s and the XOR of two strings is given by componentwise addition modulo 2. We choose our notation to be consistent with [2] and to simplify certain expressions in our derivation.

4.1 Moment Generating Function

Following [2], to evaluate 𝔼J[H/n]\operatorname*{\mathbb{E}}_{J}[\langle H/n\rangle] and 𝔼J[(H/n)2]\operatorname*{\mathbb{E}}_{J}[\langle(H/n)^{2}\rangle] we use the moment-generating function 𝔼J[eiλH/n]\operatorname*{\mathbb{E}}_{J}[\langle e^{i\lambda H/n}\rangle]. From the moment-generating function, we can find the moments via

𝔼JN[(H/n)m]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{m}\rangle\right] =(iλ)m𝔼JN[eiλH/n]|λ=0\displaystyle=\left.(-i\partial_{\lambda})^{m}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle e^{i\lambda H/n}\rangle\right]\right|_{\lambda=0} (21)

We can simplify the expectation inside the moment-generating function by inserting three complete sets of states:

eiλH/n=Σ|eiγHeiβBeiλH/neiβBeiγH|Σ=z,z,z′′{±1}nΣ|zz|eiγHeiβB|z′′z′′|eiλH/neiβBeiγH|zz|Σ=12nz,z,z′′{±1}nexp(iγ[H(z)H(z)]+iλH(z′′)n)z|eiβB|z′′z′′|eiβB|z=12nz,z,z′′{±1}nexp(iq=1dn1q2|S|=qJS[γ(zSzS)+λzS′′n])z|eiβB|z′′z′′|eiβB|z=12nz,z,z′′{±1}nexp(iq=1dn1q2|S|=qzS′′JS[γ(zSzS)+λn])z|eiβB|(+1)n(+1)n|eiβB|z,\displaystyle\begin{split}\langle e^{i\lambda H/n}\rangle&=\langle\Sigma|e^{i\gamma H}e^{i\beta B}e^{i\lambda H/n}e^{-i\beta B}e^{-i\gamma H}|\Sigma\rangle\\ &=\sum_{z,z^{\prime},z^{\prime\prime}\in\{\pm 1\}^{n}}\langle\Sigma|z\rangle\langle z|e^{i\gamma H}e^{i\beta B}|z^{\prime\prime}\rangle\langle z^{\prime\prime}|e^{i\lambda H/n}e^{-i\beta B}e^{-i\gamma H}|z^{\prime}\rangle\langle z^{\prime}|\Sigma\rangle\\ &=\frac{1}{2^{n}}\sum_{z,z^{\prime},z^{\prime\prime}\in\{\pm 1\}^{n}}\exp{\left(i\gamma\left[H(z)-H(z^{\prime})\right]+i\frac{\lambda H(z^{\prime\prime})}{n}\right)}\langle z|e^{i\beta B}|z^{\prime\prime}\rangle\langle z^{\prime\prime}|e^{-i\beta B}|z^{\prime}\rangle\\ &=\frac{1}{2^{n}}\sum_{z,z^{\prime},z^{\prime\prime}\in\{\pm 1\}^{n}}\exp{\left(i\sum_{q=1}^{d}n^{\frac{1-q}{2}}\sum_{|S|=q}J_{S}\left[\gamma(z_{S}-z^{\prime}_{S})+\frac{\lambda z^{\prime\prime}_{S}}{n}\right]\right)}\langle z|e^{i\beta B}|z^{\prime\prime}\rangle\langle z^{\prime\prime}|e^{-i\beta B}|z^{\prime}\rangle\\ &=\frac{1}{2^{n}}\sum_{z,z^{\prime},z^{\prime\prime}\in\{\pm 1\}^{n}}\exp{\left(i\sum_{q=1}^{d}n^{\frac{1-q}{2}}\sum_{|S|=q}z^{\prime\prime}_{S}J_{S}\left[\gamma(z_{S}-z^{\prime}_{S})+\frac{\lambda}{n}\right]\right)}\langle z|e^{i\beta B}|(+1)^{n}\rangle\langle(+1)^{n}|e^{-i\beta B}|z^{\prime}\rangle,\end{split} (22)

where in the last step we used that z|eiβB|z=zz|eiβB|(+1)n\langle z|e^{i\beta B}|z^{\prime}\rangle=\langle zz^{\prime}|e^{i\beta B}|(+1)^{n}\rangle and made the replacements zzz′′z\mapsto zz^{\prime\prime} and zzz′′z^{\prime}\mapsto z^{\prime}z^{\prime\prime}.

4.2 Expectation when Couplings are Normal Distributed Variables

We will now treat the JSJ_{S} couplings as a random variable and consider the expectation 𝔼J\operatorname*{\mathbb{E}}_{J} of the energy. We assume that the distribution is symmetric with respect to JJJ\leftrightarrow-J, such that we can replace zS′′JSz^{\prime\prime}_{S}J_{S} by JSJ_{S} to get

𝔼J[eiλH/n]=12nz,z,z′′{±1}n𝔼JN[exp(iq=1dn1q2|S|=qzS′′JS[γ(zSzS)+λn])]z|eiβB(+1)n(+1)n|eiβB|z=z,z{±1}n𝔼JN[exp(iq=1dn1q2|S|=qJS[γ(zSzS)+λn])]z|eiβB(+1)n(+1)n|eiβB|z.\displaystyle\begin{split}\operatorname*{\mathbb{E}}_{J}\left[\langle e^{i\lambda H/n}\rangle\right]&=\frac{1}{2^{n}}\sum_{z,z^{\prime},z^{\prime\prime}\in\{\pm 1\}^{n}}\operatorname*{\mathbb{E}}_{J\sim N}\left[\exp{\left(i\sum_{q=1}^{d}n^{\frac{1-q}{2}}\sum_{|S|=q}z^{\prime\prime}_{S}J_{S}\left[\gamma(z_{S}-z^{\prime}_{S})+\frac{\lambda}{n}\right]\right)}\right]\langle z|e^{i\beta B}|(+1)^{n}\rangle\langle(+1)^{n}|e^{-i\beta B}|z^{\prime}\rangle\\ &=\sum_{z,z^{\prime}\in\{\pm 1\}^{n}}\operatorname*{\mathbb{E}}_{J\sim N}\left[\exp{\left(i\sum_{q=1}^{d}n^{\frac{1-q}{2}}\sum_{|S|=q}J_{S}\left[\gamma(z_{S}-z^{\prime}_{S})+\frac{\lambda}{n}\right]\right)}\right]\langle z|e^{i\beta B}|(+1)^{n}\rangle\langle(+1)^{n}|e^{-i\beta B}|z^{\prime}\rangle.\end{split} (23)

When we further assume that the JSJ_{S} variables are independent between different SS we can continue by

𝔼J[eiλH/n]\displaystyle\operatorname*{\mathbb{E}}_{J}\left[\langle e^{i\lambda H/n}\rangle\right] =z,z{±1}nq=1d|S|=q𝔼JS[exp(in1q2JS[γ(zSzS)+λn])]z|eiβB(+1)n(+1)n|eiβB|z.\displaystyle=\sum_{z,z^{\prime}\in\{\pm 1\}^{n}}\prod_{q=1}^{d}\prod_{|S|=q}\operatorname*{\mathbb{E}}_{J_{S}}\left[\exp{\left(i\cdot n^{\frac{1-q}{2}}J_{S}\left[\gamma(z_{S}-z^{\prime}_{S})+\frac{\lambda}{n}\right]\right)}\right]\langle z|e^{i\beta B}|(+1)^{n}\rangle\langle(+1)^{n}|e^{-i\beta B}|z^{\prime}\rangle. (24)

Next we assume that the JSJ_{S} are normally distributed with a standard deviation that is the same for all sets SS of the same size, that is JSN(0,σ|S|2)J_{S}\sim N(0,\sigma^{2}_{|S|}). We note that taking the expectation value of a Gaussian random variable JJ with standard deviation σ\sigma gives

𝔼JN(0,σ2)[eicJ]=1σ2πeJ2/2σ2+icJ=ec2σ2/2\displaystyle\operatorname*{\mathbb{E}}_{J\sim N(0,\sigma^{2})}\left[e^{icJ}\right]=\frac{1}{\sigma\sqrt{2\pi}}\int e^{-J^{2}/2\sigma^{2}+icJ}=e^{-c^{2}\sigma^{2}/2} (25)

so that our overall expression becomes

𝔼JN[eiλH/n]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle e^{i\lambda H/n}\rangle\right] =z,z{±1}nexp(q=1dσq22nq1|S|=q[γ2(zSzS)2+2γλ(zSzS)n+λ2n2])z|eiβB|(+1)n(+1)n|eiβB|z.\displaystyle=\smashoperator[r]{\sum_{z,z^{\prime}\in\{\pm 1\}^{n}}^{}}\exp{\left(-\sum_{q=1}^{d}\frac{\sigma_{q}^{2}}{2n^{q-1}}\sum_{|S|=q}\left[\gamma^{2}(z_{S}-z^{\prime}_{S})^{2}+\frac{2\gamma\lambda(z_{S}-z^{\prime}_{S})}{n}+\frac{\lambda^{2}}{n^{2}}\right]\right)}\langle z|e^{i\beta B}|(+1)^{n}\rangle\langle(+1)^{n}|e^{-i\beta B}|z^{\prime}\rangle. (26)

To do the sum over z,zz,z^{\prime}, we claim that the summand does not depend on all 2n2n spin values of zz and zz^{\prime}. Instead, it is only a function of the four integer values (n++,n+,n+,n)(n_{++},n_{+-},n_{-+},n_{--}), where nssn_{ss^{\prime}} is defined to be the number of positions k{1,,n}k\in\{1,\dots,n\} with (zk,zk)=(s,s)(z_{k},z^{\prime}_{k})=(s,s^{\prime}). Note that only three of these variables are actually independent, as we always have n+++n++n++n=nn_{++}+n_{+-}+n_{-+}+n_{--}=n. As these numbers summarize the crucial information of the strings, we will refer to (n++,n+,n+,n)(n_{++},n_{+-},n_{-+},n_{--}) as the sketch of (z,z)(z,z^{\prime}). Writing the summand in terms of the sketch rather than (z,z)(z,z^{\prime}) was introduced in [2]; here we establish that we can still write the summand in terms of the sketch for the mixed-spin SK model. To start, it is straightforward to verify that

z|eiβB|(+1)n(+1)n|eiβB|z\displaystyle\langle z|e^{i\beta B}|(+1)^{n}\rangle\langle(+1)^{n}|e^{-i\beta B}|z^{\prime}\rangle =ss{±}2Qssnss, with {Q++:=cos2(β)Q:=sin2(β)Q+:=isin(β)cos(β)Q+:=isin(β)cos(β).\displaystyle=\smashoperator[r]{\prod_{ss^{\prime}\in\{\pm\}^{2}}^{}}Q_{ss^{\prime}}^{n_{ss^{\prime}}},\text{\leavevmode\nobreak\ with\leavevmode\nobreak\ }\begin{cases}Q_{++}:=\cos^{2}(\beta)\\ Q_{--}:=\sin^{2}(\beta)\\ Q_{-+}:=i\sin(\beta)\cos(\beta)\\ Q_{+-}:=-i\sin(\beta)\cos(\beta).\end{cases} (27)

We can also write explicit combinatorial formulas for the sums in the exponential:

S{1,,n}|S|=q(zSzS)=k=0q(1)qk[(n+++n+k)(n++nqk)(n+++n+k)(n++nqk)]=k=0q(1)qk[(n+(n+n+)+(n++n)2k)(n(n+n+)(n++n)2qk)(n(n+n+)+(n++n)2k)(n+(n+n+)(n++n)2qk)]=:fq(n++,n+,n+,n)\displaystyle\begin{split}\sum_{\begin{subarray}{c}S\subseteq\{1,\dots,n\}\\ |S|=q\end{subarray}}(z_{S}-z_{S}^{\prime})&=\sum_{k=0}^{q}(-1)^{q-k}\left[\binom{n_{++}+n_{+-}}{k}\binom{n_{-+}+n_{--}}{q-k}-\binom{n_{++}+n_{-+}}{k}\binom{n_{+-}+n_{--}}{q-k}\right]\\ &=\sum_{k=0}^{q}(-1)^{q-k}\left[\binom{\frac{n+(n_{+-}-n_{-+})+(n_{++}-n_{--})}{2}}{k}\binom{\frac{n-(n_{+-}-n_{-+})-(n_{++}-n_{--})}{2}}{q-k}\right.\\ &\qquad\qquad\qquad\qquad-\left.\binom{\frac{n-(n_{+-}-n_{-+})+(n_{++}-n_{--})}{2}}{k}\binom{\frac{n+(n_{+-}-n_{-+})-(n_{++}-n_{--})}{2}}{q-k}\right]\\ &=:f_{q}(n_{++},n_{+-},n_{-+},n_{--})\end{split} (28)
S{1,,n}|S|=q(zSzS)2=4k odd(n++n+k)(n+++nqk)=4k odd(n++n+k)(n(n++n+)qk)=:gq(n++,n+,n+,n).\displaystyle\begin{split}\sum_{\begin{subarray}{c}S\subseteq\{1,\dots,n\}\\ |S|=q\end{subarray}}(z_{S}-z_{S}^{\prime})^{2}&=4\sum_{k\text{ odd}}\binom{n_{+-}+n_{-+}}{k}\binom{n_{++}+n_{--}}{q-k}\\ &=4\sum_{k\text{ odd}}\binom{n_{+-}+n_{-+}}{k}\binom{n-(n_{+-}+n_{-+})}{q-k}\\ &=:g_{q}(n_{++},n_{+-},n_{-+},n_{--}).\end{split} (29)

Therefore, the summand indeed depends only on (n++,n+,n+,n)(n_{++},n_{+-},n_{-+},n_{--}) and the number of ways to assign the nn positions into four groups of these sizes is the multinomial n!/(n++!n+!n+!n!)n!/(n_{++}!n_{+-}!n_{-+}!n_{--}!). To condense our notation we will use {n}\{n_{*}\} to denote the set of sketches n=(n++,n+,n+,n)n_{*}=(n_{++},n_{+-},n_{-+},n_{--}), allowing us to use the shorthand

{n}(nn)F(n)\displaystyle\sum_{\{n_{*}\}}\binom{n}{n_{*}}F(n_{*}) :=(n++,n+,n+,n)4n+++n++n++n=n(nn++,n+,n+,n)F(n++,n+,n+,n).\displaystyle:=\sum_{\begin{subarray}{c}(n_{++},n_{+-},n_{-+},n_{--})\in\mathbb{N}^{4}\\ n_{++}+n_{+-}+n_{-+}+n_{--}=n\end{subarray}}\binom{n}{n_{++},n_{+-},n_{-+},n_{--}}F(n_{++},n_{+-},n_{-+},n_{--}). (30)

Note that this summation has (n+33)\binom{n+3}{3} terms. We thus have

𝔼JN[eiλH/n]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle e^{i\lambda H/n}\rangle\right] ={n}(nn)exp(q=1d[γ2gq(n)+2γλfq(n)n+λ2(nq)n2]σq22nq1)ss{±}2Qssnss.\displaystyle=\sum_{\{n_{*}\}}\binom{n}{n_{*}}\exp\left({-\sum_{q=1}^{d}\left[\gamma^{2}g_{q}(n_{*})+2\gamma\lambda\frac{f_{q}(n_{*})}{n}+\lambda^{2}\frac{\left(\begin{smallmatrix}n\\ q\end{smallmatrix}\right)}{n^{2}}\right]\frac{\sigma_{q}^{2}}{2n^{q-1}}}\right)\smashoperator[r]{\prod_{ss^{\prime}\in\{\pm\}^{2}}^{}}Q_{ss^{\prime}}^{n_{ss^{\prime}}}. (31)

Eq. 31 is the form of the moment-generating function we will use to evaluate 𝔼J[H/n]\operatorname*{\mathbb{E}}_{J}[\langle H/n\rangle] and 𝔼J[(H/n)2]\operatorname*{\mathbb{E}}_{J}[\langle(H/n)^{2}\rangle].

4.3 General Form of Moments

Using Eq. 21 combined with the form of the moment-generating function given in Eq. 31, we can write the first moment as:

𝔼JN[H/n]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right] =iγq=1dσq2nq{n}(nn)exp(q=1dγ2gq(n)σq22nq1)fq(n)ss{±}2Qssnss,\displaystyle=i\gamma\sum_{q=1}^{d}\frac{\sigma_{q}^{2}}{n^{q}}\sum_{\{n_{*}\}}\binom{n}{n_{*}}\exp\left({-\sum_{q^{\prime}=1}^{d}\gamma^{2}g_{q^{\prime}}(n_{*})\frac{\sigma_{q^{\prime}}^{2}}{2n^{q^{\prime}-1}}}\right)f_{q}(n_{*})\smashoperator[r]{\prod_{ss^{\prime}\in\{\pm\}^{2}}^{}}Q_{ss^{\prime}}^{n_{ss^{\prime}}}, (32)

and the second moment as:

𝔼JN[(H/n)2]=\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{2}\rangle\right]= q=1d(nq)σq2nq+1γ2q,q=1dσq2σq2nq+q{n}(nn)exp(q′′=1dγ2gq′′(n)σq′′22nq′′1)fq(n)fq(n)ss{±}2Qssnss.\displaystyle\sum_{q=1}^{d}\binom{n}{q}\frac{\sigma_{q}^{2}}{n^{{q}+1}}-\gamma^{2}\sum_{q,q^{\prime}=1}^{d}\frac{\sigma_{q}^{2}\sigma_{q^{\prime}}^{2}}{n^{q+q^{\prime}}}\sum_{\{n_{*}\}}\binom{n}{n_{*}}\exp\left({-\sum_{q^{\prime\prime}=1}^{d}\gamma^{2}g_{q^{\prime\prime}}(n_{*})\frac{\sigma_{q^{\prime\prime}}^{2}}{2n^{q^{\prime\prime}-1}}}\right)f_{q}(n_{*})f_{q^{\prime}}(n_{*})\smashoperator[r]{\prod_{ss^{\prime}\in\{\pm\}^{2}}^{}}Q_{ss^{\prime}}^{n_{ss^{\prime}}}. (33)

The explicit expression for fqf_{q} (Eq. 28) shows that fqf_{q} is a degree-qq polynomial in the variables (n+n+)(n_{+-}-n_{-+}), (n++n)(n_{++}-n_{--}), and nn, so we can expand fqf_{q} as

fq(n)=a+b+cqfqabc(n+n+)a(n++n)bncf_{q}(n_{*})=\sum_{a+b+c\leq q}f_{q}^{abc}(n_{+-}-n_{-+})^{a}(n_{++}-n_{--})^{b}n^{c} (34)

where the fqabcf^{abc}_{q} are constants independent of nn. In terms of this expansion, we have

𝔼JN[H/n]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right] =iγq=1dσq2a+b+cqfqabcnqc{n}(nn)exp(q=1dgq(n)γ2σq22nq1)(n+n+)a(n++n)bss{±}2Qssnss\displaystyle=i\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{a+b+c\leq q}\frac{f_{q}^{abc}}{n^{q-c}}\sum_{\{n_{*}\}}\binom{n}{n_{*}}\exp\left(-\sum_{q^{\prime}=1}^{d}g_{q^{\prime}}(n_{*})\frac{\gamma^{2}\sigma_{q^{\prime}}^{2}}{2n^{q^{\prime}-1}}\right)(n_{+-}-n_{-+})^{a}(n_{++}-n_{--})^{b}\smashoperator[r]{\prod_{ss^{\prime}\in\{\pm\}^{2}}^{}}Q_{ss^{\prime}}^{n_{ss^{\prime}}} (35)

and

𝔼JN[(H/n)2]=q=1d(nq)σq2nq+1γ2q,q=1dσq2σq2a+b+cqa+b+cqfqabcnqcfqabcnqc{n}(nn)exp(q′′=1dgq′′(n)γ2σq′′22nq′′1)×(n+n+)a+a(n++n)b+bss{±}2Qssnss.\displaystyle\begin{split}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{2}\rangle\right]&=\sum_{q=1}^{d}\binom{n}{q}\frac{\sigma_{q}^{2}}{n^{q+1}}-\gamma^{2}\sum_{q,q^{\prime}=1}^{d}\sigma_{q}^{2}\sigma_{q^{\prime}}^{2}\sum_{\begin{subarray}{c}a+b+c\leq q\\ a^{\prime}+b^{\prime}+c^{\prime}\leq q^{\prime}\end{subarray}}\frac{f^{abc}_{q}}{n^{q-c}}\frac{f^{a^{\prime}b^{\prime}c^{\prime}}_{q^{\prime}}}{n^{q^{\prime}-c^{\prime}}}\sum_{\{n_{*}\}}\binom{n}{n_{*}}\exp\left(-\sum_{q^{\prime\prime}=1}^{d}g_{q^{\prime\prime}}(n_{*})\frac{\gamma^{2}\sigma_{q^{\prime\prime}}^{2}}{2n^{q^{\prime\prime}-1}}\right)\\ &\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\times(n_{+-}-n_{-+})^{a+a^{\prime}}(n_{++}-n_{--})^{b+b^{\prime}}\smashoperator[r]{\prod_{ss^{\prime}\in\{\pm\}^{2}}^{}}Q_{ss^{\prime}}^{n_{ss^{\prime}}}.\end{split} (36)

Ref. [2] could explicitly evaluate these terms for the small values of aa and bb relevant for the two-body SK model, using concise expressions for f2f_{2} and g2g_{2}. However, to get explicit formulas beyond q=2q=2 requires carefully counting powers of nn to establish which terms survive in the nn\rightarrow\infty limit, and using the general expressions for fqf_{q} and gqg_{q} (Eqs. 28 and 29) to derive explicit forms of the leading-order terms. To tame this sum our derivation thus uses techniques that go beyond a simple generalization of [2].

4.4 Evaluating Sums Over the Sketches (n++,n+,n+,n)(n_{++},n_{+-},n_{-+},n_{--})

We see that evaluating both moments reduces to repeatedly evaluating terms of the form

Tξab\displaystyle T^{ab}_{\xi} :=1nξ{n}(nn)exp(q=1dγ2gq(n)σq22nq1)(n+n+)a(n++n)bss{±}2Qssnss\displaystyle:=\frac{1}{n^{\xi}}\sum_{\{n_{*}\}}\binom{n}{n_{*}}\exp\left(-\sum_{q^{\prime}=1}^{d}\gamma^{2}g_{q^{\prime}}(n_{*})\frac{\sigma_{q^{\prime}}^{2}}{2n^{q^{\prime}-1}}\right)(n_{+-}-n_{-+})^{a}(n_{++}-n_{--})^{b}\smashoperator[r]{\prod_{ss^{\prime}\in\{\pm\}^{2}}^{}}Q_{ss^{\prime}}^{n_{ss^{\prime}}} (37)

with ξa+b\xi\geq a+b playing the role of qcq-c in Eq. 35. For a fixed nn, note that gq(n)g_{q^{\prime}}(n_{*}) only depends on n++n+n_{+-}+n_{-+} (Eq. 29) and with this symmetry we define the single variable function g¯q(n++n+):=gq(n++,n+,n+,n)\bar{g}_{q^{\prime}}(n_{+-}+n_{-+}):=g_{q^{\prime}}(n_{++},n_{+-},n_{-+},n_{--}) and use t\sum_{t} with t=n++n+t=n_{+-}+n_{-+}. The remaining two degrees of freedom are further reorganized by summing over n+n_{+-} using n++n+=t\sum_{n_{+-}+n_{-+}=t}, and over n++n_{++} using n+++n=nt\sum_{n_{++}+n_{--}=n-t}. We thus obtain

Tξab=1nξt=0n(nt)exp(q=1dγ2g¯q(t)σq22nq1)[n++n+=t(tn+)(n+n+)aQ+n+Q+n+]Ata×[n+++n=nt(ntn++)(n++n)bQ++n++Qn]Btb,\displaystyle\begin{split}T^{ab}_{\xi}=\frac{1}{n^{\xi}}\sum_{t=0}^{n}\binom{n}{t}\exp\left(-\sum_{q^{\prime}=1}^{d}\gamma^{2}\bar{g}_{q^{\prime}}(t)\frac{\sigma_{q^{\prime}}^{2}}{2n^{q^{\prime}-1}}\right)&\underbrace{\left[\sum_{n_{+-}+n_{-+}=t}\binom{t}{n_{+-}}(n_{+-}-n_{-+})^{a}Q_{+-}^{n_{+-}}Q_{-+}^{n_{-+}}\right]}_{A^{a}_{t}}\\ &\qquad\times\underbrace{\left[\sum_{n_{++}+n_{--}=n-t}\binom{n-t}{n_{++}}(n_{++}-n_{--})^{b}Q_{++}^{n_{++}}Q_{--}^{n_{--}}\right]}_{B^{b}_{t}},\end{split} (38)

where we have defined AtaA^{a}_{t} and BtbB^{b}_{t} to be analyzed next in the limit of large nn.

4.5 Large nn Limit of AtaA^{a}_{t}, BtbB^{b}_{t}, and TξabT^{ab}_{\xi}

As we plan to take the limit nn\rightarrow\infty, we will use big O()O(\cdot) notation in nn to keep track of the significant terms in AtaA^{a}_{t} and BtbB^{b}_{t} for constant aa, bb, and tt. The summations for AtaA^{a}_{t} and BtbB^{b}_{t} can be expressed exactly via the following identity (which is a generalization of the identities given in [2] for the case of a=1,2a=1,2 and which can be proven by induction on aa):

n1+n2=u(un1)(n1n2)aQ1n1Q2n2\displaystyle\sum_{n_{1}+n_{2}=u}\binom{u}{n_{1}}(n_{1}-n_{2})^{a}Q_{1}^{n_{1}}Q_{2}^{n_{2}} =(xxyy)a(x+y)u|x=Q1y=Q2\displaystyle=(x\partial_{x}-y\partial_{y})^{a}(x+y)^{u}\left|{}_{\begin{subarray}{c}x=Q_{1}\\ y=Q_{2}\end{subarray}}\right. (39)

For AtaA^{a}_{t}, we thus have

Ata\displaystyle A^{a}_{t} =(xxyy)a(x+y)t|x=Q+y=Q+\displaystyle=(x\partial_{x}-y\partial_{y})^{a}(x+y)^{t}\left|{}_{\begin{subarray}{c}x=Q_{+-}\\ y=Q_{-+}\end{subarray}}\right. (40)

Using the operator equality xx=1+xx\partial_{x}\cdot x=1+x\partial_{x} we see that expanding (xxyy)a(x\partial_{x}-y\partial_{y})^{a} gives a sum of differential operators xkyxkyx^{k}y^{\ell}\partial_{x}^{k}\partial_{y}^{\ell} with k+ak+\ell\leq a. Because (Q++Q+)=0(Q_{+-}+Q_{-+})=0 (Eq. 27) we see that only the operators xky\partial_{x}^{k}\partial_{y}^{\ell} with k+=tk+\ell=t give nonzero contributions to AtaA^{a}_{t}. For t>at>a we therefore have Ata=0A^{a}_{t}=0. In the particular case of t=at=a, the only terms in (xxyy)a(x\partial_{x}-y\partial_{y})^{a} that give a nonzero result are k+=a(ak)xk(y)xky\sum_{k+\ell=a}\left(\begin{smallmatrix}a\\ k\end{smallmatrix}\right)x^{k}(-y)^{\ell}\partial_{x}^{k}\partial_{y}^{\ell}, hence

Aa,a=a!k+=a(ak)Q+k(Q+)=a!(Q+Q+)a=a!(i)asina(2β).\displaystyle A_{a,a}=a!\sum_{k+\ell=a}\binom{a}{k}Q_{+-}^{k}(-Q_{-+})^{\ell}=a!(Q_{+-}-Q_{-+})^{a}=a!(-i)^{a}\sin^{a}(2\beta). (41)

For t<at<a it will be enough to observe that the order of AtaA^{a}_{t} is independent of nn. Summarizing, we have

Ata={0if t>a,a!(i)asina(2β)if t=a,O(1)if t<a.\displaystyle A^{a}_{t}=\begin{cases}0&\text{if $t>a$,}\\ a!(-i)^{a}\sin^{a}(2\beta)&\text{if $t=a$,}\\ O(1)&\text{if $t<a$}.\end{cases} (42)

For BtbB^{b}_{t}, we use

Btb\displaystyle B^{b}_{t} =(xxyy)b(x+y)nt|.x=Q++y=Q\displaystyle=(x\partial_{x}-y\partial_{y})^{b}(x+y)^{n-t}\left|{}_{\begin{subarray}{c}x=Q_{++}\\ y=Q_{--}\end{subarray}}\right.. (43)

The highest-order terms in nn come from the terms k+=b(bk)xk(y)xky\sum_{k+\ell=b}\left(\begin{smallmatrix}b\\ k\end{smallmatrix}\right)x^{k}(-y)^{\ell}\partial_{x}^{k}\partial_{y}^{\ell} in (xxyy)b(x\partial_{x}-y\partial_{y})^{b} and thus

Btb=(nt)!(ntb)!k+=b(bk)Q++k(Q)(Q+++Q)ntb+O(nb1)=nb(Q++Q)b(Q+++Q)ntb+O(nb1)=nbcosb(2β)+O(nb1).\displaystyle\begin{split}B^{b}_{t}&=\frac{(n-t)!}{(n-t-b)!}\sum_{k+\ell=b}\binom{b}{k}Q_{++}^{k}(-Q_{--})^{\ell}(Q_{++}+Q_{--})^{n-t-b}+O(n^{b-1})\\ &=n^{b}(Q_{++}-Q_{--})^{b}(Q_{++}+Q_{--})^{n-t-b}+O(n^{b-1})\\ &=n^{b}\cos^{b}(2\beta)+O(n^{b-1}).\end{split} (44)

Plugging our Eqs. 42 and 44 for AtaA^{a}_{t} and BtbB^{b}_{t} into Eq. 38 for TξabT_{\xi}^{ab} hence gives

Tξab=1nξ(na)exp(q=1dγ2g¯q(a)σq22nq1)a!(i)asina(2β)[nbcosb(2β)+O(nb1)]+1nξt=0a1(nt)exp(q=1dγ2g¯q(t)σq22nq1)O(1)[nbcosb(2β)+O(nb1)].\displaystyle\begin{split}T^{ab}_{\xi}&=\frac{1}{n^{\xi}}\binom{n}{a}\exp\left(-\sum_{q^{\prime}=1}^{d}\gamma^{2}\bar{g}_{q^{\prime}}(a)\frac{\sigma_{q^{\prime}}^{2}}{2n^{q^{\prime}-1}}\right)\cdot a!(-i)^{a}\sin^{a}(2\beta)\cdot\left[n^{b}\cos^{b}(2\beta)+O(n^{b-1})\right]\\ &\quad+\frac{1}{n^{\xi}}\sum_{t=0}^{a-1}\binom{n}{t}\exp\left(-\sum_{q^{\prime}=1}^{d}\gamma^{2}\bar{g}_{q^{\prime}}(t)\frac{\sigma_{q^{\prime}}^{2}}{2n^{q^{\prime}-1}}\right)\cdot O(1)\cdot\left[n^{b}\cos^{b}(2\beta)+O(n^{b-1})\right].\end{split} (45)

As (na)=na/a!+O(na1)\binom{n}{a}=n^{a}/a!+O(n^{a-1}) and similarly (nt)=O(nt)\binom{n}{t}=O(n^{t}) this simplifies to

Tξab\displaystyle T^{ab}_{\xi} =na+bξa!exp(q=1dγ2g¯q(a)σq22nq1)a!(i)asina(2β)cosb(2β)+O(na+bξ1).\displaystyle=\frac{n^{a+b-\xi}}{a!}\exp\left(-\sum_{q^{\prime}=1}^{d}\gamma^{2}\bar{g}_{q^{\prime}}(a)\frac{\sigma_{q^{\prime}}^{2}}{2n^{q^{\prime}-1}}\right)\cdot a!(-i)^{a}\sin^{a}(2\beta)\cdot\cos^{b}(2\beta)+O\left({n^{a+b-\xi-1}}\right). (46)

After reminding ourselves that a+bξa+b\leq\xi we see that, in the nn\rightarrow\infty limit, TξabT_{\xi}^{ab} further condenses to

Tξab\displaystyle T^{ab}_{\xi} {(i)aexp(q=1dγ2σq2limn[g¯q(a)2nq1])sina(2β)cosb(2β)if a+b=ξ0if a+b<ξ.\displaystyle\rightarrow\begin{cases}(-i)^{a}\exp\left(-\sum_{q^{\prime}=1}^{d}\gamma^{2}\sigma_{q^{\prime}}^{2}\lim_{n\rightarrow\infty}\left[\frac{\bar{g}_{q^{\prime}}(a)}{2n^{q^{\prime}-1}}\right]\right)\sin^{a}(2\beta)\cos^{b}(2\beta)&\text{if\leavevmode\nobreak\ }a+b=\xi\\ 0&\text{if\leavevmode\nobreak\ }a+b<\xi.\end{cases} (47)

As g¯q(a)\bar{g}_{q^{\prime}}(a) has an implicit nn dependency we have to determine its relevant terms as nn\rightarrow\infty. Eq. 47 shows that the only relevant terms in g¯q\bar{g}_{q^{\prime}} are those of degree at least (q1)(q^{\prime}-1) in nn. From the definition of gqg_{q} (Eq. 29), elementary algebra gives

g¯q(n++n+)\displaystyle\bar{g}_{q^{\prime}}(n_{+-}+n_{-+}) =4(n++n+)nq1(q1)!+O(nq2).\displaystyle=\frac{4(n_{+-}+n_{-+})n^{q^{\prime}-1}}{(q^{\prime}-1)!}+O(n^{q^{\prime}-2}). (48)

Therefore, we have the large nn limit

Tξab\displaystyle T^{ab}_{\xi} {(i)aexp(2aγ2q=1dσq2(q1)!)sina(2β)cosb(2β)if a+b=ξ0if a+b<ξ.\displaystyle\rightarrow\begin{cases}(-i)^{a}\exp\left(-2a\gamma^{2}\sum_{q^{\prime}=1}^{d}\frac{\sigma_{q^{\prime}}^{2}}{(q^{\prime}-1)!}\right)\sin^{a}(2\beta)\cos^{b}(2\beta)&\text{if\leavevmode\nobreak\ }a+b=\xi\\ 0&\text{if\leavevmode\nobreak\ }a+b<\xi.\end{cases} (49)

4.6 Large nn properties of fqf_{q}

To complete the evaluation of the moments, we to determine the large nn dependency of fqf_{q}, its fqabcf_{q}^{abc} coefficients (Eq. 34) and their role in the first and second moment expressions of Eqs. 35 and 36. We see from Eq. 49 that the only coefficients that matter in the nn\rightarrow\infty limit are those with a+b=ξa+b=\xi, i.e. those with a+b+c=qa+b+c=q. We can evaluate these fqabcf_{q}^{abc} from the explicit formula for fqf_{q} (Eq. 28) by keeping only terms of total degree qq in the terms (n+n+)(n_{+-}-n_{-+}), (n++n)(n_{++}-n_{--}), and nn. This gives

fq(n++,n+,n+,n)\displaystyle f_{q}(n_{++},n_{+-},n_{-+},n_{--}) =a oddaq2a!(qa)!(n+n+)a(n++n)qa+(lower degree terms)\displaystyle=\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq q\end{subarray}}\frac{2}{a!(q-a)!}(n_{+-}-n_{-+})^{a}(n_{++}-n_{--})^{q-a}+(\text{lower degree terms}) (50)

Note that there are no terms (n+n+)a(n++n)bnc(n_{+-}-n_{-+})^{a}(n_{++}-n_{--})^{b}n^{c} with c0c\neq 0 and (a+b+c)=q(a+b+c)=q in fqf_{q}. Comparing Eq. 50 with the definition of fqabcf_{q}^{abc} in Eq. 34, we see for nn\rightarrow\infty after some algebra that

fqabc\displaystyle f_{q}^{abc} ={2a!b!if a+b=qc=0, and a odd0if a+b+c=q with a even or c>0.\displaystyle=\begin{cases}\frac{2}{a!b!}&\text{if $a+b=q$, $c=0$, and $a$ odd}\\ 0&\text{if $a+b+c=q$ with $a$ even or $c>0$.}\end{cases} (51)

Having established these properties of TqcabT^{ab}_{q-c} and fqf_{q}, we now have sufficient information to evaluate our moments.

4.7 Evaluating the First Moment

To evaluate the first moment, we simplify Eq. 35 using the definition of TξabT_{\xi}^{ab} (Eq. 37) and the results of Eqs. 49 and 51 to get

𝔼JN[H/n]=iγq=1dσq2a+b+cqfqabcTqcabiγq=1dσq2a oddaq2a!(qa)!(i)aexp(2γ2aq=1dσq2(q1)!)sina(2β)cosqa(2β)=2γq=1dσq2a oddaq(1)a12a!(qa)!exp(2γ2aq=1dσq2(q1)!)sina(2β)cosqa(2β),\displaystyle\begin{split}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]&=i\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{a+b+c\leq q}f_{q}^{abc}T_{q-c}^{ab}\\ &\rightarrow i\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq q\end{subarray}}\frac{2}{a!(q-a)!}(-i)^{a}\exp\left(-2\gamma^{2}a\sum_{q^{\prime}=1}^{d}\frac{\sigma_{q^{\prime}}^{2}}{(q^{\prime}-1)!}\right)\sin^{a}(2\beta)\cos^{q-a}(2\beta)\\ &=2\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq q\end{subarray}}\frac{(-1)^{\frac{a-1}{2}}}{a!(q-a)!}\exp\left(-2\gamma^{2}a\sum_{q^{\prime}=1}^{d}\frac{\sigma_{q^{\prime}}^{2}}{(q^{\prime}-1)!}\right)\sin^{a}(2\beta)\cos^{q-a}(2\beta),\end{split} (52)

which is precisely what we claimed in Eq. 6.

4.8 Evaluating the Second Moment

To evaluate the second moment, we simplify Eq. 36 using the definition of TξabT_{\xi}^{ab} (Eq. 37) and the results of Eqs. 49 and 51 to get

𝔼JN[(H/n)2]=q=1d(nq)σq2nq+1γ2q,q=1dσq2σq2a+b+cqa+b+cqfqabcfqabcTqc+qc(a+a)(b+b)γ2q,q=1dσq2σq2a,a oddaqaq2a!(qa)!2a!(qa)!(i)a+a×exp(2γ2(a+a)q′′=1dσq′′2(q′′1)!)sina+a(2β)cosqa+qa(2β)=[iγq=1dσq2a oddaq2a!(qa)!(i)aexp(2γ2aq′′=1dσq′′2(q′′1)!)sina(2β)cosqa(2β)]2=limn𝔼JN[H/n]2,\displaystyle\begin{split}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{2}\rangle\right]&=\sum_{q=1}^{d}\binom{n}{q}\frac{\sigma_{q}^{2}}{n^{q+1}}-\gamma^{2}\sum_{q,q^{\prime}=1}^{d}\sigma_{q}^{2}\sigma_{q^{\prime}}^{2}\sum_{\begin{subarray}{c}a+b+c\leq q\\ a^{\prime}+b^{\prime}+c^{\prime}\leq q^{\prime}\end{subarray}}f^{abc}_{q}f^{a^{\prime}b^{\prime}c^{\prime}}_{q^{\prime}}T^{(a+a^{\prime})(b+b^{\prime})}_{q-c+q^{\prime}-c^{\prime}}\\ &\rightarrow-\gamma^{2}\sum_{q,q^{\prime}=1}^{d}\sigma_{q}^{2}\sigma_{q^{\prime}}^{2}\sum_{\begin{subarray}{c}a,a^{\prime}\text{ odd}\\ a\leq q\\ a^{\prime}\leq q^{\prime}\end{subarray}}\frac{2}{a!(q-a)!}\frac{2}{a^{\prime}!(q^{\prime}-a^{\prime})!}(-i)^{a+a^{\prime}}\\ &\qquad\qquad\qquad\qquad\qquad\qquad\times\exp\left(-2\gamma^{2}(a+a^{\prime})\sum_{q^{\prime\prime}=1}^{d}\frac{\sigma_{q^{\prime\prime}}^{2}}{(q^{\prime\prime}-1)!}\right)\sin^{a+a^{\prime}}(2\beta)\cos^{q-a+q^{\prime}-a^{\prime}}(2\beta)\\ &=\left[i\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq q\end{subarray}}\frac{2}{a!(q-a)!}(-i)^{a}\exp\left(-2\gamma^{2}a\sum_{q^{\prime\prime}=1}^{d}\frac{\sigma_{q^{\prime\prime}}^{2}}{(q^{\prime\prime}-1)!}\right)\sin^{a}(2\beta)\cos^{q-a}(2\beta)\right]^{2}\\ &=\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]^{2},\end{split} (53)

which is precisely what we claimed in Eq. 8.

4.9 Evaluating Higher Moments

The proof technique used above also applies to higher moments. When computing the mmth moment by taking derivatives of the moment-generating function according to Eq. 21, the only terms that survive in the limit nn\rightarrow\infty are the terms where all derivatives hit the 2γλfq/n2\gamma\lambda f_{q}/n term in the exponential, so that the expression for the moment becomes

𝔼JN[(H/n)m]\displaystyle\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{m}\rangle\right] =(iγ)mq1,,qm=1dσq12σqm2a1+b1+c1q1am+bm+cmqmfq1a1b1c1fqmambmcmT(q1c1)++(qmcm)(a1++am)(b1++bm)+O(1n).\displaystyle=(i\gamma)^{m}\sum_{q_{1},\dots,q_{m}=1}^{d}\sigma_{q_{1}}^{2}\cdots\sigma_{q_{m}}^{2}\sum_{\begin{subarray}{c}a_{1}+b_{1}+c_{1}\leq q_{1}\\ \dots\\ a_{m}+b_{m}+c_{m}\leq q_{m}\end{subarray}}f^{a_{1}b_{1}c_{1}}_{q_{1}}\cdots f^{a_{m}b_{m}c_{m}}_{q_{m}}T^{(a_{1}+\cdots+a_{m})(b_{1}+\cdots+b_{m})}_{(q_{1}-c_{1})+\cdots+(q_{m}-c_{m})}+O\left(\frac{1}{n}\right). (54)

Noting from Eq. 49 that

T(q1c1)++(qmcm)(a1++am)(b1++bm)=Tq1c1a1b1Tqmcmambm+O(1n),T^{(a_{1}+\cdots+a_{m})(b_{1}+\cdots+b_{m})}_{(q_{1}-c_{1})+\cdots+(q_{m}-c_{m})}=T^{a_{1}b_{1}}_{q_{1}-c_{1}}\cdots T^{a_{m}b_{m}}_{q_{m}-c_{m}}+O\left(\frac{1}{n}\right), (55)

we can write the mmth moment as

𝔼JN[(H/n)m]=(iγ)mq1,,qm=1dσq12σqm2a1+b1+c1q1am+bm+cmqmfq1a1b1c1fqmambmcmTq1c1a1b1Tqmcmambm+O(1n)=[iγq=1dσq2a+b+cqfqabcTqca]m+O(1n)=𝔼JN[H/n]m+O(1n).\displaystyle\begin{split}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{m}\rangle\right]&=(i\gamma)^{m}\sum_{q_{1},\dots,q_{m}=1}^{d}\sigma_{q_{1}}^{2}\cdots\sigma_{q_{m}}^{2}\sum_{\begin{subarray}{c}a_{1}+b_{1}+c_{1}\leq q_{1}\\ \cdots\\ a_{m}+b_{m}+c_{m}\leq q_{m}\end{subarray}}f^{a_{1}b_{1}c_{1}}_{q_{1}}\cdots f^{a_{m}b_{m}c_{m}}_{q_{m}}T^{a_{1}b_{1}}_{q_{1}-c_{1}}\dots T^{a_{m}b_{m}}_{q_{m}-c_{m}}+O\left(\frac{1}{n}\right)\\ &=\left[i\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{a+b+c\leq q}f^{abc}_{q}T^{a}_{q-c}\right]^{m}+O\left(\frac{1}{n}\right)\\ &=\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]^{m}+O\left(\frac{1}{n}\right).\end{split} (56)

Thus, we find that for all mm,

limn𝔼JN[(H/n)m]\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle(H/n)^{m}\rangle\right] =limn𝔼JN[H/n]m.\displaystyle=\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]^{m}. (57)

4.10 Alternative expression for 𝔼[H/n]\mathbb{E}\left[\langle H/n\rangle\right]

Here, we present an alternative expression for 𝔼[H/n]\mathbb{E}\left[\langle H/n\rangle\right] that makes contact with the notation used in [5] and elsewhere (see Section 3.2). The current paper’s central result (Eq. 6) reads:

limn𝔼JN[H/n]\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right] =2γq=1dσq2a oddaq(1)a12a!(qa)!exp(2γ2aq=1dσq2(q1)!)sina(2β)cosqa(2β).\displaystyle=2\gamma\sum_{q=1}^{d}\sigma_{q}^{2}\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq q\end{subarray}}\frac{(-1)^{\frac{a-1}{2}}}{a!(q-a)!}\exp\left(-2\gamma^{2}a\sum_{q^{\prime}=1}^{d}\frac{\sigma_{q^{\prime}}^{2}}{(q^{\prime}-1)!}\right)\sin^{a}(2\beta)\cos^{q-a}(2\beta). (58)

If we use the substitution from Section 3.2, σq2=cq2q!\sigma^{2}_{q}=c^{2}_{q}{q!}, this becomes

limn𝔼JN[H/n]=2γq=1dcq2q!a oddaq(1)a12a!(qa)!exp(2γ2aq=1dqcq2)sina(2β)cosqa(2β)=2iγq=1dcq2a odd0aq(qa)(isin(2β)exp(2γ2q=1dqcq2))a(cos(2β))qa.\displaystyle\begin{split}\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]&=2\gamma\sum_{q=1}^{d}c_{q}^{2}q!\sum_{\begin{subarray}{c}a\text{ odd}\\ a\leq q\end{subarray}}\frac{(-1)^{\frac{a-1}{2}}}{a!(q-a)!}\exp\left(-2\gamma^{2}a\sum_{q^{\prime}=1}^{d}q^{\prime}c_{q^{\prime}}^{2}\right)\sin^{a}(2\beta)\cos^{q-a}(2\beta)\\ &=2i\gamma\sum_{q=1}^{d}c^{2}_{q}\sum_{\begin{subarray}{c}a\text{ odd}\\ 0\leq a\leq q\end{subarray}}\binom{q}{a}\left(-i\sin(2\beta)\exp\left({-2\gamma^{2}\sum_{q^{\prime}=1}^{d}q^{\prime}c^{2}_{q^{\prime}}}\right)\right)^{a}\big{(}\cos(2\beta)\big{)}^{q-a}.\end{split} (59)

Using the identity

2ia odd(qa)(iA)aBqa\displaystyle 2i\sum_{a\text{ odd}}\binom{q}{a}(-iA)^{a}B^{q-a} =i((iA+B)q(iA+B)q)=2[(iA+B)q],\displaystyle=-i((iA+B)^{q}-(-iA+B)^{q})=2\Im\left[(iA+B)^{q}\right], (60)

we can rewrite this as

limn𝔼JN[H/n]\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right] =2γq=1dcq2[(isin(2β)exp(2γ2q=1dqcq2)+cos(2β))q].\displaystyle=2\gamma\sum_{q=1}^{d}c^{2}_{q}\Im\left[\left(i\sin(2\beta)\exp\left({-2\gamma^{2}\sum_{q^{\prime}=1}^{d}q^{\prime}c^{2}_{q^{\prime}}}\right)+\cos(2\beta)\right)^{q}\right]. (61)

Finally, using the definition ξ(x):=qcq2xq\xi(x):=\sum_{q}c_{q}^{2}x^{q} and hence ξ(1)=qqcq2\xi^{\prime}(1)=\sum_{q}qc_{q}^{2}, we get

limn𝔼JN[H/n]=2γq=1dcq2[(cos(2β)+isin(2β)exp(2γ2ξ(1)))q]=2γ[ξ(cos(2β)+isin(2β)exp(2γ2ξ(1)))],\displaystyle\begin{split}\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right]&=2\gamma\sum_{q=1}^{d}c^{2}_{q}\Im\left[\Big{(}\cos(2\beta)+i\sin(2\beta)\exp\left({-2\gamma^{2}\xi^{\prime}(1)}\right)\Big{)}^{q}\right]\\ &=2\gamma\Im\left[\xi\Big{(}\cos(2\beta)+i\sin(2\beta)\exp\left({-2\gamma^{2}\xi^{\prime}(1)}\right)\Big{)}\right],\end{split} (62)

which is precisely what we claimed in Eq. 7. For a pure dd-spin model with only cd0c_{d}\neq 0 the above further simplifies to

limn𝔼JN[H/n]\displaystyle\lim_{n\rightarrow\infty}\operatorname*{\mathbb{E}}_{J\sim N}\left[\langle H/n\rangle\right] =2γcd2[(cos(2β)+isin(2β)exp(2dcd2γ2))d].\displaystyle=2\gamma c_{d}^{2}\cdot\Im\left[\Big{(}\cos(2\beta)+i\sin(2\beta)\exp\left({-2dc_{d}^{2}\gamma^{2}}\right)\Big{)}^{d}\right]. (63)

5 Discussion and Conclusion

In this work, we have derived explicit formulas to quantify the performance of p=1p=1 qaoa on mixed-spin models in the nn\rightarrow\infty limit. We demonstrated both concentration of measure and instance independence for arbitrary mixed-spin models, which imply that the expectation value of the energy per spin is independent of the specific model specification and that measurements of the qaoa state are guaranteed to give energies close to the expectation value. Our explicit formula for the expectation value of the energy for arbitrary mixed-spin models allows us to find the optimal angles on a classical computer.

There are two obvious open questions raised by this work. First, the approach of this paper can probably be combined with the methods of [2] to generalize our work to depth p>1p>1 qaoa. Such a result would allow one to prove instance independence and concentration of measure, and derive a computer algorithm to generate formulas for the expectation value per spin, at arbitrary depth pp. This is a particularly interesting route of research, since it is known that in the cubic case with σqδq,3\sigma_{q}\propto\delta_{q,3}, Montanari’s classical algorithm does not approach the optimal solution [5], so that at sufficient depth pp the qaoa has a chance of outperforming the best known classical algorithm. Higher-spin models with q>3q>3 may even provide a more direct route towards finding a setting where qaoa at depth p=1p=1 outperforms classical optimization such as Montanari’s algorithm [4, 5]. While the generalization to higher pp is likely possible, it is a nontrivial extension of this paper, and we leave it for future work.

Second, it remains an open question what to what extent results on the random models can be used to find optimal angles for realistic binary optimization problems. One hypothetical approach to finding qaoa angles for a single instance of an nn-spin optimization problem would be the procedure:

  1. 1.

    For all q=1,,dq=1,\dots,d, compute the standard deviations σq\sigma_{q} of the qq-spin couplings in the problem.

  2. 2.

    Use Eq. 6 for the mixed-spin model as an estimate for the expectation value of the energy per spin.

  3. 3.

    Run the qaoa at the optimal angles for the corresponding mixed-spin model (or use these angles as starting points for a numerical optimization of the angles)

This procedure assumes that the means of the couplings are zero: μq=0\mu_{q}=0 for all q=1,,dq=1,\dots,d. If this method can be shown to work, it would greatly simplify finding qaoa angles for general optimization problems.

References

  • Farhi et al. [2014] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028, 2014. URL https://arxiv.org/abs/1411.4028.
  • Farhi et al. [2019] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Leo Zhou. The quantum approximate optimization algorithm and the Sherrington-Kirkpatrick model at infinite size. arXiv preprint arXiv:1910.08187, 2019. URL https://arxiv.org/abs/1910.08187.
  • Montanari [2021] Andrea Montanari. Optimization of the Sherrington–Kirkpatrick hamiltonian. SIAM Journal on Computing, 0(0):FOCS19–1, 2021. doi: 10.1137/20M132016X.
  • Alaoui et al. [2020] Ahmed El Alaoui, Andrea Montanari, and Mark Sellke. Optimization of mean-field spin glasses. arXiv preprint arXiv:2001.00904, 2020. URL https://arxiv.org/abs/2001.00904.
  • Alaoui and Montanari [2020] Ahmed El Alaoui and Andrea Montanari. Algorithmic thresholds in mean field spin glasses. arXiv preprint arXiv:2009.11481, 2020. URL https://arxiv.org/abs/2009.11481.
  • Jiang et al. [2017] Zhang Jiang, Eleanor G Rieffel, and Zhihui Wang. Near-optimal quantum circuit for Grover’s unstructured search using a transverse field. Physical Review A, 95(6):062317, 2017. doi: 10.1103/PhysRevA.95.062317.
  • Wang et al. [2018] Zhihui Wang, Stuart Hadfield, Zhang Jiang, and Eleanor G Rieffel. Quantum approximate optimization algorithm for maxcut: A fermionic view. Physical Review A, 97(2):022304, 2018. doi: 10.1103/PhysRevA.97.022304.
  • Ozaeta et al. [2020] Asier Ozaeta, Wim van Dam, and Peter L. McMahon. Expectation values from the single-layer quantum approximate optimization algorithm on Ising problems. arXiv preprint arXiv:2012.03421, 2020. URL https://arxiv.org/abs/2012.03421.
  • Crooks [2018] Gavin E Crooks. Performance of the quantum approximate optimization algorithm on the maximum cut problem. arXiv preprint arXiv:1811.08419, 2018. URL https://arxiv.org/abs/1811.08419.
  • Zhou et al. [2020] Leo Zhou, Sheng-Tao Wang, Soonwon Choi, Hannes Pichler, and Mikhail D Lukin. Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices. Physical Review X, 10(2):021067, 2020. doi: 10.1103/PhysRevX.10.021067.
  • Brandao et al. [2018] Fernando GSL Brandao, Michael Broughton, Edward Farhi, Sam Gutmann, and Hartmut Neven. For fixed control parameters the quantum approximate optimization algorithm’s objective function value concentrates for typical instances. arXiv preprint arXiv:1812.04170, 2018. URL https://arxiv.org/abs/1812.04170.
  • Sherrington and Kirkpatrick [1975] David Sherrington and Scott Kirkpatrick. Solvable model of a spin-glass. Physical Review Letters, 35(26):1792, 1975. doi: 10.1103/PhysRevLett.35.1792.
  • Parisi [1980] Giorgio Parisi. A sequence of approximated solutions to the SK model for spin glasses. Journal of Physics A: Mathematical and General, 13(4):L115, 1980. doi: 10.1088/0305-4470/13/4/009.
  • Talagrand [2006] Michel Talagrand. The Parisi formula. Annals of mathematics, pages 221–263, 2006. doi: 10.4007/annals.2006.163.221.
  • Panchenko [2012] Dmitry Panchenko. The Sherrington-Kirkpatrick model: an overview. Journal of Statistical Physics, 149(2):362–383, 2012. doi: 10.1007/s10955-012-0586-7.
  • Panchenko [2013] Dmitry Panchenko. The Sherrington-Kirkpatrick model. Springer Science & Business Media, 2013. doi: 10.1007/978-1-4614-6289-7.
  • Auffinger et al. [2017] Antonio Auffinger, Wei-Kuo Chen, et al. Parisi formula for the ground state energy in the mixed pp-spin model. Annals of Probability, 45(6B):4617–4631, 2017. doi: 10.1214/16-AOP1173.
  • Panchenko et al. [2014] Dmitry Panchenko et al. The Parisi formula for mixed pp-spin models. Annals of Probability, 42(3):946–958, 2014. doi: 10.1214/12-AOP800.
  • Zhou et al. [2021] Leo Zhou, Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. The quantum approximate optimization algorithm and the Sherrington-Kirkpatrick model at infinite size. Talk presented at QIP, 2021. URL https://youtu.be/UP-Zuke7IUg.