
Quantum Speedups for Zero-Sum Games
via Improved Dynamic Gibbs Sampling

Adam Bouland     Yosheb Getachew     Yujia Jin     Aaron Sidford     Kevin Tian*
{abouland,yoshebg,yujiajin,sidford}@stanford.edu, tiankevin@microsoft.com
*Work partly completed while at Stanford.
Abstract

We give a quantum algorithm for computing an $\epsilon$-approximate Nash equilibrium of a zero-sum game given by an $m\times n$ payoff matrix with bounded entries. Given a standard quantum oracle for accessing the payoff matrix, our algorithm runs in time $\widetilde{O}(\sqrt{m+n}\cdot\epsilon^{-2.5}+\epsilon^{-3})$ and outputs a classical representation of the $\epsilon$-approximate Nash equilibrium. This improves upon the best prior quantum runtime of $\widetilde{O}(\sqrt{m+n}\cdot\epsilon^{-3})$ obtained by [vAG19], and upon the classical $\widetilde{O}((m+n)\cdot\epsilon^{-2})$ runtime due to [GK95] whenever $\epsilon=\Omega((m+n)^{-1})$. We obtain this result by designing new quantum data structures for efficiently sampling from a slowly-changing Gibbs distribution.

1 Introduction

There is now a broad family of quantum algorithms for machine learning and fast numerical linear algebra [BWP+17], built on many quantum algorithmic primitives, e.g. [BHMT02, HHL09, GSLW19]. More specifically, for a wide range of problems it has been shown how quantum algorithms can (in certain parameter regimes) yield faster runtimes. (Note that quantifying the end-to-end speedups obtained by these methods can be subtle due to I/O overheads, different access models [Aar15], and classical de-quantization algorithms [Tan19, CGL+20, GLG22].) These quantum algorithms obtain runtimes which improve upon the dimension dependence of classical algorithms, but often at the cost of a worse dependence on the error tolerance and/or implicit access to the solution (e.g. query or sampling access to solution entries). Consequently, this paper is motivated by the following question.

To what degree is there an inherent accuracy versus dimension-dependence tradeoff for quantum optimization algorithms? What algorithmic techniques improve this tradeoff?

In this paper we consider this question for the fundamental optimization problem of computing an $\epsilon$-approximate Nash equilibrium in zero-sum games. Our main result is an improved dependence on $\epsilon$ for quantum algorithms solving zero-sum games, which is very close to that of the classical counterpart. Further, we show that for our algorithms, a classical representation of the solution can be obtained at no additional asymptotic cost. Our work builds upon [vAG19, LCW19], which already took a large and important step towards answering the above question by designing quantum data structures for efficiently implementing algorithms for solving zero-sum games.

Interestingly, to obtain our result we provide improved quantum algorithms for solving a dynamic data structure problem of sampling from a slowly-changing Gibbs distribution. Such dynamic sampling problems arise as a natural component of stochastic gradient methods for solving zero-sum games. We obtain our speedups by improving a Gibbs sampling subroutine developed in [vAG19]. We design a new dynamic quantum data structure which performs the necessary Gibbs sampling in time $\widetilde{O}(\epsilon^{-\frac{1}{2}})$, which is faster than the corresponding $\widetilde{O}(\epsilon^{-1})$ runtime achieved by [vAG19]. Beyond the intrinsic utility of solving this problem, we hope our improved Gibbs sampler showcases potential algorithmic insights that can be gleaned by seeking improved error dependencies for quantum optimization algorithms. Moreover, we hope this work encourages the study and design of quantum data structures for efficient optimization.

1.1 Zero-sum games

For a matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, its associated zero-sum game is the pair of equivalent optimization problems

\min_{u\in\Delta^{m}}\max_{v\in\Delta^{n}}u^{\top}\mathbf{A}v=\max_{v\in\Delta^{n}}\min_{u\in\Delta^{m}}u^{\top}\mathbf{A}v,\text{ where }\Delta^{k}:=\{x\in\mathbb{R}^{k}_{\geq 0}:\sum_{i\in[k]}x_{i}=1\}.

In such a game, we refer to $\mathbf{A}$ as the payoff matrix and view the $m$- and $n$-dimensional simplices, i.e. $\Delta^{m}$ and $\Delta^{n}$, as the spaces of distributions over $[m]$ and $[n]$ respectively. From this perspective, $u^{\top}\mathbf{A}v$, known as the payoff (or utility) of $(u,v)$, is the expected value of $\mathbf{A}_{ij}$ when sampling $i\in[m]$ and $j\in[n]$ independently from the distributions corresponding to $u$ and $v$. Thus, a zero-sum game models a two-player game where a minimization player seeks to minimize the payoff while, simultaneously, a maximization player seeks to maximize it.

In this paper, we consider the canonical problem of computing an approximate Nash equilibrium of a zero-sum game. Given the payoff matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, we call a pair $(u,v)\in\Delta^{m}\times\Delta^{n}$ an $\epsilon$-approximate Nash equilibrium (NE) for $\epsilon\in\mathbb{R}_{>0}$ if

\left(\max_{v^{\prime}\in\Delta^{n}}u^{\top}\mathbf{A}v^{\prime}\right)-\left(\min_{u^{\prime}\in\Delta^{m}}(u^{\prime})^{\top}\mathbf{A}v\right)\leq\epsilon.

We assume that the payoff matrix $\mathbf{A}$ and the error tolerance $\epsilon$ are given as input to an algorithm, and that, for simplicity, $\lVert\mathbf{A}\rVert_{\max}\leq 1$, i.e. the largest entry of $\mathbf{A}$ has magnitude at most $1$ (this is without loss of generality by rescaling $\mathbf{A}\leftarrow\lVert\mathbf{A}\rVert_{\max}^{-1}\mathbf{A}$ and $\epsilon\leftarrow\lVert\mathbf{A}\rVert_{\max}^{-1}\epsilon$). The main goal of this paper is to design improved zero-sum game solvers, i.e. algorithms that compute $\epsilon$-approximate NEs.
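To make the approximation criterion concrete, the following minimal numpy sketch (our illustration, not from the paper) checks the duality gap of a candidate pair, using the fact that a linear function over the simplex is extremized at a vertex:

```python
import numpy as np

def duality_gap(A, u, v):
    """Left-hand side of the eps-approximate NE condition.

    max over v' in the simplex of u^T A v' equals max_j (A^T u)_j, and
    min over u' of u'^T A v equals min_i (A v)_i, since linear
    functions on the simplex are extremized at vertices."""
    return np.max(A.T @ u) - np.min(A @ v)

# Toy example (matching pennies): uniform strategies form an exact NE.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
u = v = np.array([0.5, 0.5])
assert duality_gap(A, u, v) <= 1e-12
```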

Zero-sum games are foundational to theoretical computer science, optimization, and economics. The problem of approximately solving zero-sum games is a natural formulation of approximate linear programming (LP), and correspondingly, this problem is a prominent testbed for new optimization techniques. Over the past decades there have been numerous advances in the computational complexity of solving zero-sum games under various assumptions on problem parameters (see Section 1.3 for a survey). Recent advancements in interior point methods (IPMs) for linear programming, e.g. [vdBLL+21] and references therein (discussed in more detail in Section 1.3), solve zero-sum games in time $\widetilde{O}(mn+\min(m,n)^{2.5})$. (We use the $\widetilde{O}$ notation to hide polylogarithmic dependences on problem parameters when convenient for exposition; see Section 2 for a more detailed statement of hidden parameters. In informal theorem statements, we use "with high probability" to indicate a polylogarithmic dependence on the failure probability.) Further, the linear programming algorithm of [vdB20] shows that zero-sum games can be solved deterministically in $\widetilde{O}((m+n)^{\omega})$ time, where $\omega<2.373$ is the current matrix multiplication constant [AW21], or $\widetilde{O}((m+n)^{3})$ without fast matrix multiplication. In this paper, we primarily focus on sublinear-time algorithms for approximating NEs.

A well-known algorithm by [GK95] achieves a runtime of $\widetilde{O}((m+n)\cdot\epsilon^{-2})$, which is the state-of-the-art sublinear runtime amongst classical algorithms without further problem assumptions. Recently it has been shown that quantum algorithms can yield striking runtime improvements for solving zero-sum games and their generalizations [LCW19, vAG19, LWCW21]. In particular, in 2019 Li, Chakrabarti and Wu [LCW19] gave a quantum algorithm for zero-sum games running in time $\widetilde{O}(\sqrt{m+n}\cdot\epsilon^{-4})$, and simultaneously van Apeldoorn and Gilyén [vAG19] gave an algorithm running in time $\widetilde{O}(\sqrt{m+n}\cdot\epsilon^{-3})$. These algorithms yield a quadratic improvement in the dimension dependence of the best classical algorithm, at the cost of a higher error dependence.

The algorithms of [LCW19, vAG19, LWCW21] operate using a standard quantum oracle for $\mathbf{A}$ (formally stated in Section 2), in which one can query the entries of $\mathbf{A}$ in superposition. We build upon the algorithm of [vAG19] in the rest of this paper, as our improvement concerns the error dependence. The [vAG19] algorithm generalizes the classical algorithm of Grigoriadis and Khachiyan [GK95], and obtains a runtime improvement by speeding up a key dynamic Gibbs sampling subroutine required by the [GK95] method. As we discuss in greater detail in Section 3, van Apeldoorn and Gilyén give a quantum data structure to efficiently perform this sampling in time quadratically faster in the dimension, which lies at the core of their algorithmic speedup.

Our result.

We give a new quantum algorithm for solving zero-sum games which improves upon the runtime of the prior state-of-the-art quantum algorithm, due to [vAG19].

Theorem 1 (informal, see Theorem 4).

Let $\mathbf{A}\in\mathbb{R}^{m\times n}$ with $\lVert\mathbf{A}\rVert_{\max}\leq 1$, and let $\epsilon\in(0,1)$. Given a quantum oracle for $\mathbf{A}$ (defined in Section 2), there is an $\widetilde{O}(\sqrt{m+n}\cdot\epsilon^{-2.5}+\epsilon^{-3})$ time algorithm which yields a classical output $(u,v)\in\Delta^{m}\times\Delta^{n}$ that is an $\epsilon$-approximate NE with high probability.

Our new algorithm simultaneously improves the best known quantum [vAG19] and classical [GK95] algorithms in the parameter regime where IPMs do not dominate sublinear algorithms. In particular, it is faster than the classical $\widetilde{O}((m+n)\cdot\epsilon^{-2})$ runtime of [GK95] whenever $\epsilon^{-1}=\widetilde{O}(m+n)$, which includes the regime where [GK95] offers advantages over the $\widetilde{O}((m+n)^{\omega})$ runtime of the [vdB20] IPM, as $\omega<3$. This is in contrast to the prior quantum rate of [vAG19], which does not improve upon [GK95] in the full parameter range where sublinear algorithms are currently preferable to IPMs. For example, when $m\approx n$ and (up to logarithmic factors) $\epsilon\in[n^{-c},n^{-\frac{1}{2}}]$ where $c=\frac{1}{2}(\omega-1)$, the rate of [GK95] is favorable to that of [vAG19] and state-of-the-art IPMs [vdB20, CLS21]. (There is evidence that $\omega=2$ cannot be achieved with current techniques, e.g. [Alm21].)

1.2 Dynamic Gibbs sampling

We obtain the improved error dependence in our zero-sum game solver by producing a new, faster quantum data structure to perform the Gibbs sampling used in the algorithm of [vAG19], which may be of independent interest. Gibbs sampling is a fundamental algorithmic primitive: given a vector $v\in\mathbb{R}^{n}$, sample from the probability distribution proportional to $\exp(v)$. Gibbs sampling is used as a subroutine in many quantum and classical optimization algorithms, e.g. [BS17] and follow-up works. In general, quantum algorithms can perform this task more efficiently using amplitude estimation, which can boost the acceptance probability of rejection sampling schemes. This strategy was implemented in [vAG19], which approximates the maximum entry $v_{\max}$ of $v$ using quantum maximum finding [DH96], uniformly samples $i\in[n]$, and accepts the sample with probability $\exp(v_{i}-v_{\max})\leq 1$ using quantum rejection sampling. We give a more detailed overview of the [vAG19] Gibbs sampler and its complexity analysis in Section 3.2.

We give a data structure which quadratically improves the error dependence of the [vAG19] Gibbs sampling subroutine, from $\widetilde{O}(\sqrt{m+n}\cdot\epsilon^{-1})$ per sample to an amortized $\widetilde{O}(\sqrt{m+n}\cdot\epsilon^{-\frac{1}{2}})$ per sample. A key fact which enables this improvement is that the Gibbs distributions one samples from in the zero-sum game solver of [GK95] change slowly over time: the base vector $v$ receives bounded sparse updates in each iteration. By storing partial information about the Gibbs distribution, namely an efficiently-computable overestimate of its entries which remains valid across many consecutive iterations, we obtain an improved dynamic Gibbs sampler, of which we also provide a detailed overview in Section 3.2.

We now define our notion of an approximate Gibbs sampler, and then state the dynamic sampling problem we consider, which arises naturally in zero-sum game algorithms with sublinear runtimes.

Definition 1 (Approximate Gibbs oracle).

For $v\in\mathbb{R}^{n}$, its associated Gibbs distribution is $p_{v}\in\Delta^{n}$ such that for all $i\in[n]$, $[p_{v}]_{i}\propto\exp(v_{i})$. We say $\mathcal{O}^{\textup{gibbs}}_{v}$ is a $\delta$-approximate Gibbs oracle if it samples from $\tilde{p}\in\Delta^{n}$ with $\lVert\tilde{p}-p_{v}\rVert_{1}\leq\delta$.
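For intuition, an exact ($\delta=0$) Gibbs oracle is straightforward classically when $v$ is explicit; a minimal sketch (ours, for illustration):

```python
import numpy as np

def gibbs_oracle(v, rng):
    """Exact (delta = 0) Gibbs oracle: sample i with probability
    proportional to exp(v_i). Subtracting max(v) prevents overflow
    and leaves the distribution unchanged."""
    p = np.exp(v - np.max(v))
    p /= p.sum()
    return rng.choice(len(v), p=p)

rng = np.random.default_rng(0)
i = gibbs_oracle(np.array([0.0, 1.0, 2.0]), rng)  # returns 2 w.p. ~0.665
```

The difficulty in the problem below is that the vector defining the distribution is never explicitly available within the $\sqrt{n}$-type budgets we target.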

Problem 1 (Sampling maintenance).

Let $\eta>0$, $\delta\in(0,1)$, and suppose we have a quantum oracle for $\mathbf{A}\in\mathbb{R}^{m\times n}$. Consider a sequence of $T$ $\mathsf{Update}$ operations to a dynamic vector $x\in\mathbb{R}^{m}_{\geq 0}$, each of the form $x_{i}\leftarrow x_{i}+\eta$ for some $i\in[m]$. In the sampling maintenance problem, in amortized $\mathcal{T}_{\textup{update}}$ time per $\mathsf{Update}$ we must maintain a $\delta$-approximate Gibbs oracle, $\mathcal{O}_{\textup{samp}}$, for $\mathbf{A}^{\top}x$ which is queryable in worst-case time $\mathcal{T}_{\textup{samp}}$.

Our result.

We provide a quantum algorithm for solving Problem 1, which improves upon the runtime implied by the corresponding component in the algorithm of [vAG19].

Theorem 2 (informal, see Theorem 3).

There is a quantum algorithm which solves Problem 1 with high probability with $\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})=\widetilde{O}(\sqrt{n}\cdot T\eta^{1.5})$ and an initialization cost of $\widetilde{O}(\eta^{3}T^{3})$.

Theorem 2 improves upon the solution to the sampling maintenance problem (Problem 1) implied by [vAG19] by an $\eta^{-\frac{1}{2}}$ factor; in the setting of the [GK95] solver, where $T=\widetilde{O}(\epsilon^{-2})$ and $\eta=\Theta(\epsilon)$, this is an $\epsilon^{-\frac{1}{2}}$-factor improvement. At a high level, our improvement is obtained by storing a hint consisting of a vector which overestimates the true Gibbs distribution, together with an approximate normalization factor, both of which are infrequently updated. Our maintained hint satisfies two desirable properties: $(i)$ it remains valid for a batch of consecutive iterations, and $(ii)$ the degree of overestimation is bounded. The former property ensures a fast amortized update time, and the latter ensures a fast sample time by lower bounding the acceptance probability of our quantum rejection sampler. Our high-level strategy for maintaining improved hints is to repeatedly call our sampling access to accurately estimate large entries of the Gibbs distribution, and to exploit the stability of the distribution in the setting of Problem 1. We discuss our dynamic Gibbs sampler in more detail and compare it with previous methods for solving Problem 1 in Section 3.2.

The initialization cost of Theorem 2 is due to the current state-of-the-art in numerically stable implementations of the quantum singular value transformation (SVT) framework of [GSLW19]. This cost is also the cause of the additive $\widetilde{O}(\epsilon^{-3})$ term in Theorem 1. We discuss this cost in Appendix D; improvements to numerically stable implementations of [GSLW19] would be reflected in the runtimes of Theorems 1 and 2.

1.3 Related work

Table 1: Algorithms for computing $\epsilon$-approximate Nash equilibria of zero-sum games. Runtimes hide polylogarithmic factors and assume $\mathbf{A}\in\mathbb{R}^{m\times n}$ with $\lVert\mathbf{A}\rVert_{\max}\leq 1$.

Method | Query model | Total runtime
interior point method [CLS21] | classical | $\max(m,n)^{\omega}$
interior point method [vdBLL+21] | classical | $mn+\min(m,n)^{2.5}$
extragradient [Nem04, Nes07] | classical | $mn\cdot\epsilon^{-1}$
stochastic mirror descent (SMD) [GK95] | classical | $(m+n)\cdot\epsilon^{-2}$
variance-reduced SMD [CJST19] | classical | $mn+\sqrt{mn(m+n)}\cdot\epsilon^{-1}$
[vAG19] | quantum | $\sqrt{m+n}\cdot\epsilon^{-3}$
Theorem 1 (our work) | quantum | $\sqrt{m+n}\cdot\epsilon^{-2.5}+\epsilon^{-3}$
Table 2: Solutions to Problem 1 with $T=\epsilon^{-2}$ and $\eta=\epsilon$. Hides polylogarithmic factors.

Method | Query model | $\mathcal{T}_{\textup{samp}}$ | $\mathcal{T}_{\textup{update}}$
explicit updates [GK95] | classical | $1$ | $m+n$
max-based rejection sampling [vAG19] | quantum | $\sqrt{m+n}\cdot\epsilon^{-1}$ | $\sqrt{m+n}\cdot\epsilon^{-1}$
Theorem 2 (our work) | quantum | $\sqrt{m+n}\cdot\epsilon^{-\frac{1}{2}}$ | $\sqrt{m+n}\cdot\epsilon^{-\frac{1}{2}}$

Quantum optimization and machine learning.

There are a wide array of quantum algorithms for optimization and machine learning which make use of fundamental algorithmic primitives such as amplitude amplification [BHMT02], the HHL algorithm [HHL09], and the quantum singular value transformation [GSLW19]. For example, a number of works gave HHL-based algorithms for a variety of machine learning tasks such as PCA [LMR14], SVMs [RML14], and recommendation systems [KP16]. For more details see the survey article of [BWP+17].

Most relevant to our current work are quantum algorithms for optimization problems. For example, Brandao and Svore [BS17] gave a quantum algorithm for SDP solving based on the Arora-Kale algorithm [AK07], which was later improved by [VAGGdW20b]. There have also been quantum IPM-based methods for LPs and SDPs [KP20]. Additionally a series of works have considered quantum algorithms for general convex optimization [CCLW20, vAGGdW20a], which make use of Jordan’s algorithm for fast gradient estimation [Jor05, GAW19].

In the area of zero-sum games, in addition to the previously mentioned works [vAG19, LCW19] on $\ell_{1}$-$\ell_{1}$ games (where both players are $\ell_{1}$-constrained), several works have considered different variants of zero-sum games. For example, Li, Chakrabarti and Wu [LCW19] gave quantum algorithms for $\ell_{2}$-$\ell_{1}$ games with a quadratic improvement in the dimension dependence. Later, Li, Wang, Chakrabarti and Wu [LWCW21] extended this algorithm to more general $\ell_{q}$-$\ell_{1}$ games with $q\in(1,2]$.

Zero-sum games.

Zero-sum games are a canonical modeling tool in optimization, economics and machine learning [Neu28]. The classic extragradient (mirror prox) method [Nem04, Nes07] computes an $\epsilon$-approximate NE in $\widetilde{O}(mn\cdot\epsilon^{-1})$ time; as discussed previously, the stochastic mirror descent method of [GK95] obtains the same accuracy in time $\widetilde{O}((m+n)\cdot\epsilon^{-2})$. An intermediate runtime was recently obtained by [CJST19] using variance reduction, described in Table 1.

Improved runtimes are available under more fine-grained characterizations of the matrix $\mathbf{A}$, such as sparsity (e.g. number of nonzero entries per row or column) or numerical sparsity (e.g. rows and columns with bounded $\ell_{1}$-to-$\ell_{2}$ norm ratios) [CJST20]. Notably, the [GK95] algorithm also offers runtime improvements under a sparsity assumption, as does the algorithm of [vAG19] in certain sparsity-to-accuracy ratio regimes. In this paper, we focus on NE algorithms in the general setting (without further sparsity or numerical sparsity assumptions).

In parallel, a long line of research improving IPMs for solving linear programming [Kar84, Ren88, LS14, LS19, vdBLSS20, JSWZ21] has led to a number of different zero-sum game solvers with polylogarithmic runtime dependence on the problem accuracy $\epsilon$. The current state-of-the-art IPMs are [CLS21] and [vdBLL+21], which achieve runtimes of $\widetilde{O}(\max(m,n)^{\omega})$ and $\widetilde{O}(mn+\min(m,n)^{2.5})$ respectively. We refer readers to Table 1 for detailed comparisons. Finally, for strongly polynomial runtimes (i.e. with no dependence on $\epsilon$), which are outside the scope of this paper, we refer readers to [DNV20] and references therein.

1.4 Future work

Theorem 1's $\epsilon$ dependence is within an $\epsilon^{-\frac{1}{2}}$ factor of matching its classical counterparts. To the best of our knowledge, removing this $\epsilon^{-\frac{1}{2}}$ overhead would represent the first quantum algorithm for a natural optimization problem which improves upon classical counterparts across all parameters.

Both our work and [vAG19] solve Problem 1 by leveraging a powerful polynomial approximation-based technique developed in [GSLW19], known as the quantum singular value transform (QSVT). In both cases, QSVT is used with a polynomial of degree $\widetilde{O}(\epsilon^{-1})$. We note that in closely-related classical settings (discussed in [SV14]), Chebyshev polynomial-based approximations yield a quadratically smaller degree. However, a boundedness requirement (due to the spectra of quantum gates) prevents straightforwardly applying these constructions within QSVT. Sidestepping this barrier is a natural avenue towards improving our work, which we leave as an open problem.

More generally, establishing optimal oracle query complexities for dynamic Gibbs sampling (e.g. Problem 1) and for solving zero-sum games are key problems left open by our work. These questions are potentially more approachable than establishing tight time complexity characterizations. For example, could $\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})$ be improved to $\widetilde{O}(\sqrt{n})$ in the context of Theorem 1, or can we rule out such an improvement in the query model?

1.5 Organization

In Section 2 we state the notation used throughout the paper, as well as the (classical and quantum) computational models we assume. In Section 3, we give a brief technical overview of the core components of our algorithm used to prove Theorem 1: the stochastic gradient method our method is built on, and an efficient quantum implementation of a key subroutine using a new dynamic Gibbs sampler. Finally in Section 4 we give our new quantum sampler, and prove Theorem 2.

We aim to give a self-contained, but simplified, description of our algorithm in Section 3 to improve the readability of the paper for readers with an optimization background unfamiliar with quantum computing, and vice versa. In particular, we abstract away the core optimization machinery (stochastic mirror descent) and quantum machinery (quantum SVT) developed in prior work into the statements of Propositions 1 and 2, and focus on how we use these statements black-box to build a faster algorithm. The proofs of these statements can be found in Appendices A and B.

2 Preliminaries

General notation.

$\widetilde{O}$ hides logarithmic factors in problem dimensions (denoted $m$ and $n$), target accuracies (denoted $\epsilon$), and failure probabilities (denoted $\alpha$). When discussing runtimes for Problem 1, we additionally use $\widetilde{O}$ to hide logarithmic factors in the parameters $\eta,T$. For $i\in[n]$ we let $e_{i}\in\mathbb{R}^{n}$ denote the $i^{\text{th}}$ standard basis vector when $n$ is clear from context. $\lVert\cdot\rVert_{p}$ denotes the $\ell_{p}$ norm of a vector. For $\mathbf{A}\in\mathbb{R}^{m\times n}$, its $i^{\text{th}}$ row and $j^{\text{th}}$ column are denoted $\mathbf{A}_{i:}$ and $\mathbf{A}_{:j}$ respectively. For $v\in\mathbb{R}^{n}$, $\textbf{diag}(v)$ is the diagonal $n\times n$ matrix with $v$ as its diagonal. The conjugate transpose of $\mathbf{A}$ is denoted $\mathbf{A}^{*}$; when the matrix is real we use $\mathbf{A}^{\top}$. The all-ones and all-zeros vectors of dimension $n$ are $\mathbf{1}_{n}$ and $\mathbf{0}_{n}$. Finally, throughout $a:=\lceil\log_{2}m\rceil$ and $b:=\lceil\log_{2}n\rceil$, so $[m]\subseteq[2^{a}]$ and $[n]\subseteq[2^{b}]$.

Computation models.

We assume entries of $\mathbf{A}$ are $w$-bit reals for $w=O(\log(mn))$, and work in the word RAM model where $w$-bit arithmetic operations take $O(1)$ time; for simplicity, we assume mathematical operations such as trigonometric functions and radicals can also be implemented exactly for $w$-bit words in $O(1)$ time. Throughout, "quantum states" mean unit vectors, and "quantum gates" or "oracles" $\mathcal{O}$ mean unitary matrices. We follow standard notation and identify a standard basis vector $e_{i}$ for $i\in[n]$ with $|i\rangle$, a $\lceil\log_{2}n\rceil$-qubit state in which $i$ is represented in binary (more formally, $|i\rangle=|\textup{bin}(i)\rangle$, and bin is omitted for brevity). We consider the standard model of quantum access to oracles, in which the oracle $\mathcal{O}$, defined by its operation on $|s\rangle$ for all $\{0,1\}^{*}$-valued $s$ (where the length is clear from context), can be queried in superposition. If $\mathcal{O}$ is queried on $|v\rangle:=\sum_{s}\alpha_{s}|s\rangle$, the result is $\mathcal{O}|v\rangle=\sum_{s}\alpha_{s}(\mathcal{O}|s\rangle)$. We use $|g\rangle$, $|g^{\prime}\rangle$, etc. (when clear from context) to denote arbitrary sub-unit vectors, which represent garbage states (unused in computations). The tensor product of states $|u\rangle$ and $|v\rangle$ on $a$ and $b$ qubits is denoted $|u\rangle|v\rangle$, an $(a+b)$-qubit state. The runtime of a quantum circuit is its maximum depth (in arithmetic gates on $w$-bit words).

Access model.

Throughout the paper, we assume a standard quantum oracle for accessing $\mathbf{A}$ (recall $\lVert\mathbf{A}\rVert_{\max}\leq 1$). In particular, by a quantum oracle for $\mathbf{A}$ we mean an oracle $\mathcal{O}_{\mathbf{A}}$ which, when queried with $|i\rangle|j\rangle|s\rangle$ for $i\in[m]$, $j\in[n]$, $s\in\{0,1\}^{w}$, reversibly writes $\mathbf{A}_{ij}$ (in binary) to the third register in $O(1)$ time, i.e. $\mathcal{O}_{\mathbf{A}}|i\rangle|j\rangle|s\rangle=|i\rangle|j\rangle|s\oplus\mathbf{A}_{ij}\rangle$, where $\oplus$ is bitwise mod-$2$ addition.

Given a quantum oracle for $\mathbf{A}$, with two queries one can construct, by standard techniques, an oracle which places the value in the amplitude of the state rather than in a register. (This follows e.g. by calling the oracle to obtain the value of $\mathbf{A}_{ij}$ in binary, interpreted as a signed number between $0$ and $1$, adding an ancilla qubit, performing arithmetic to compute the needed rotation angle, applying a tower of controlled rotation gates to the ancilla qubit using that rotation angle expressed in binary, and then calling the standard oracle a second time to uncompute the binary value of $\mathbf{A}_{ij}$; see e.g. [GR02] for details.) More formally, one can construct an $\mathcal{O}_{\mathbf{A}}^{\prime}$ which operates as:

\mathcal{O}_{\mathbf{A}}^{\prime}|0\rangle|i\rangle|j\rangle=\sqrt{\mathbf{A}_{ij}}|0\rangle|i\rangle|j\rangle+\sqrt{1-|\mathbf{A}_{ij}|}|1\rangle|g\rangle,\text{ for }(i,j)\in[m]\times[n].

It is standard in the literature to construct such an $\mathcal{O}_{\mathbf{A}}^{\prime}$ from $\mathcal{O}_{\mathbf{A}}$ (using ancilla qubits to store the output register where $\mathbf{A}_{ij}$ is written) under our classical model of computation; see e.g. [GR02]. For simplicity, we omit discussion of ancilla qubits in the remainder of the paper and assume direct access to $\mathcal{O}_{\mathbf{A}}^{\prime}$. We also note that there is ambiguity in the implementation of $\mathcal{O}_{\mathbf{A}}^{\prime}$ in that the square root is not unique, and we have control over the signing used in this implementation. We use this flexibility crucially later in the paper, specifically in Corollary 6.
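As an illustration of the amplitude-loading step, the following sketch (ours; it assumes for simplicity a nonnegative entry $a=\mathbf{A}_{ij}\in[0,1]$, with signed entries handled via the square-root sign flexibility just mentioned) gives the single-qubit rotation applied to the ancilla:

```python
import numpy as np

def entry_rotation(a):
    """Rotation sending |0> to sqrt(a)|0> + sqrt(1-a)|1>, the per-entry
    action of O'_A on its first qubit (assumes a in [0, 1])."""
    theta = 2.0 * np.arccos(np.sqrt(a))
    return np.array([[np.cos(theta / 2.0), -np.sin(theta / 2.0)],
                     [np.sin(theta / 2.0),  np.cos(theta / 2.0)]])

state = entry_rotation(0.3) @ np.array([1.0, 0.0])  # apply to |0>
assert np.allclose(state, [np.sqrt(0.3), np.sqrt(0.7)])
```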

3 Overview of approach

In this section, we give an overview of the approach we take to prove our main results: an improved quantum runtime for solving zero-sum games (Theorem 4) and an improved quantum data structure for dynamic Gibbs sampling (Theorem 3). We organize this section as follows.

In Section 3.1, we state Algorithm 1, the optimization method framework we use to solve zero-sum games. This framework is a generalization of the classical algorithm of [GK95]. We state its guarantees in Proposition 1 and defer the proof to Appendix A. Algorithm 1 assumes access to an approximate Gibbs oracle (Definition 1) for sampling from dynamic distributions as stated in Problem 1. The bulk of our work is devoted to obtaining an efficient quantum implementation of such an oracle (Theorem 3); using this result, we prove Theorem 4 at the end of Section 3.1.

In Section 3.2, we overview the main technical innovation of this paper, an improved solution to Problem 1. Whereas prior work by [vAG19] solves Problem 1 at an amortized $\approx\sqrt{m+n}\cdot\epsilon^{-1}$ cost per iteration, we show how to solve the problem at an amortized $\approx\sqrt{m+n}\cdot\epsilon^{-\frac{1}{2}}$ cost. We remark that the only quantum components of our algorithm (quantum SVT and amplitude amplification) are abstracted away by Proposition 2, which is proven in Appendix B.

3.1 Solving matrix games with a Gibbs sampling oracle

Our proof of Theorem 4 uses an efficient implementation of the algorithmic framework stated in Algorithm 1, based on stochastic mirror descent. In specifying Algorithm 1, we recall our earlier Definition 1, which captures the approximate sampling access we require for Algorithm 1’s execution.

Algorithm 1: MatrixGameSolver($\delta,\eta,T$)

Input: $\mathbf{A}\in\mathbb{R}^{m\times n}$, desired accuracy $\epsilon\in(0,1)$, $\delta$-approximate Gibbs oracles for the (dynamic) vectors $-\mathbf{A}^{\top}x_{t}$ and $\mathbf{A}y_{t}$
Parameters: Gibbs sampler parameter $\delta\in(0,1)$, step size $\eta>0$, iteration count $T$

1. Initialize $\hat{u}\leftarrow\mathbf{0}_{m}$, $\hat{v}\leftarrow\mathbf{0}_{n}$, $x_{0}\leftarrow\mathbf{0}_{m}$, and $y_{0}\leftarrow\mathbf{0}_{n}$
2. For $t=0$ to $T-1$:
   (a) Independently sample $j_{t},j^{\prime}_{t}\in[n]$ using $\mathcal{O}^{\textup{gibbs}}_{-\mathbf{A}^{\top}x_{t}}$ and $i_{t},i^{\prime}_{t}\in[m]$ using $\mathcal{O}^{\textup{gibbs}}_{\mathbf{A}y_{t}}$
   (b) Update $y_{t+1}\leftarrow y_{t}+\eta e_{j_{t}}$ and $x_{t+1}\leftarrow x_{t}+\eta e_{i_{t}}$   // update iterates
   (c) Update $\hat{u}\leftarrow\hat{u}+\frac{1}{T}e_{i^{\prime}_{t}}$ and $\hat{v}\leftarrow\hat{v}+\frac{1}{T}e_{j^{\prime}_{t}}$   // update output
3. Return $(\hat{u},\hat{v})$
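For readers preferring code, the following classical reference sketch of Algorithm 1 (ours) uses exact Gibbs oracles in place of the $\delta$-approximate quantum ones; recomputing $\mathbf{A}^{\top}x$ and $\mathbf{A}y$ each iteration is exactly the work the sublinear implementations avoid:

```python
import numpy as np

def gibbs(v, rng):
    p = np.exp(v - v.max())
    return rng.choice(len(v), p=p / p.sum())

def matrix_game_solver(A, eps, rng):
    """Classical reference for Algorithm 1 with exact Gibbs oracles.

    Parameters follow Proposition 1: eta = eps/60, T = Theta(eps^-2 log(mn));
    the quantum algorithm keeps this skeleton and swaps in faster samplers."""
    m, n = A.shape
    eta = eps / 60
    T = int(np.ceil(eps ** -2 * np.log(m * n)))
    x, y = np.zeros(m), np.zeros(n)
    u_hat, v_hat = np.zeros(m), np.zeros(n)
    for _ in range(T):
        neg_ATx, Ay = -A.T @ x, A @ y  # O(mn) here; avoided by Theorem 3
        j, jp = gibbs(neg_ATx, rng), gibbs(neg_ATx, rng)
        i, ip = gibbs(Ay, rng), gibbs(Ay, rng)
        y[j] += eta; x[i] += eta                 # update iterates
        u_hat[ip] += 1 / T; v_hat[jp] += 1 / T   # update (subsampled) output
    return u_hat, v_hat
```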

The main skeleton of Algorithm 1 (the sampling and iterate updates in Step 2) using exact oracles is identical to the method of [GK95]. However, our framework builds upon [GK95] in the following three ways.

1. We tolerate total variation error in the sampling procedure via $\delta$-approximate Gibbs oracles.

2. We provide a high-probability guarantee on the duality gap using martingale arguments.

3. We subsample the output to obtain a sparse solution yielding a comparable duality gap.

We remark that several of these improvements have appeared previously, either explicitly or implicitly, in the stochastic gradient method literature. For example, an approximation-tolerant stochastic gradient method was given in [CJST20], and our proofs of the high-probability guarantees are based on arguments in [AL17, CDST19]. For completeness we give a self-contained proof of the following guarantee on Algorithm 1 in Appendix A.

Proposition 1.

Let $\mathbf{A}\in\mathbb{R}^{m\times n}$ satisfy $\lVert\mathbf{A}\rVert_{\max}\leq 1$ and let $\epsilon,\alpha\in(0,1)$. Let $\delta\leq\frac{\epsilon}{20}$, $\eta=\frac{\epsilon}{60}$, and $T=\Theta(\epsilon^{-2}\log\frac{mn}{\alpha})$ for an appropriate constant. With probability $\geq 1-\alpha$, Algorithm 1 outputs an $\epsilon$-approximate NE for $\mathbf{A}$.

Given Proposition 1, to obtain our faster zero-sum game solver we simply need to efficiently implement the Gibbs sampling in Algorithm 1. As introduced in Section 1, Problem 1 describes a dynamic approximate Gibbs sampling problem sufficient for this task. Indeed, solving two appropriate parameterizations of Problem 1 provides the oracles needed by Algorithm 1. By combining Proposition 1 with the following Theorem 3 (our solution to Problem 1, discussed in greater detail in Section 3.2), we prove our main result, Theorem 4.

Theorem 3.

Let $\alpha\in(0,1)$ and $\delta\leq\eta$. Given a quantum oracle for $\mathbf{A}\in\mathbb{R}^{m\times n}$ (defined in Section 2) with $\lVert\mathbf{A}\rVert_{\max}\leq 1$, we can solve Problem 1 with probability $\geq 1-\alpha$ with

\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})=O\left(1+\sqrt{n}\cdot T\eta\log^{4}\left(\frac{mn}{\delta}\right)\cdot\left(\sqrt{\eta\log\left(\frac{n\eta T}{\alpha}\right)}+\eta\log\left(\frac{n\eta T}{\alpha}\right)\right)\right),

and an additive initialization cost of

O\left(\eta^{3}T^{3}\log^{4}\left(\frac{n\eta T}{\delta}\right)+\log^{7}\left(\frac{n\eta T}{\delta}\right)\right).
Theorem 4.

Let $\mathbf{A}\in\mathbb{R}^{m\times n}$ satisfy $\lVert\mathbf{A}\rVert_{\max}\leq 1$, and let $\epsilon,\alpha\in(0,1)$. Given a quantum oracle for $\mathbf{A}$ (defined in Section 2), there is a quantum algorithm which yields a classical output $(u,v)\in\Delta^{m}\times\Delta^{n}$ that is an $\epsilon$-approximate NE for $\mathbf{A}$ with probability $\geq 1-\alpha$, in time

O\left(\frac{\sqrt{m+n}}{\epsilon^{2.5}}\log^{4}\left(\frac{mn}{\epsilon}\right)\log^{2.5}\left(\frac{mn}{\alpha\epsilon}\right)+\frac{\sqrt{m+n}}{\epsilon^{2}}\log^{4}\left(\frac{mn}{\epsilon}\right)\log^{3}\left(\frac{mn}{\alpha\epsilon}\right)+\frac{1}{\epsilon^{3}}\log^{7}\left(\frac{mn}{\epsilon}\right)\right).
Proof.

We apply two instances of Theorem 3 to implement the $\delta$-approximate Gibbs oracles for the dynamic vectors $-\mathbf{A}^{\top}x_{t}$ and $\mathbf{A}y_{t}$, implementing each iteration of Algorithm 1 in amortized $O(1+\mathcal{T}_{\textup{samp}}+\mathcal{T}_{\textup{update}})$ time. Using the settings of the parameters $T,\eta$ from Proposition 1 and setting $\delta=\Theta(\epsilon)$, which suffices for Algorithm 1 and Theorem 3, we have

\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})=O\left(\frac{\sqrt{m+n}}{\epsilon}\log^{4}\left(\frac{mn}{\epsilon}\right)\log\left(\frac{mn}{\alpha\epsilon}\right)\left(\epsilon\log\left(\frac{mn}{\alpha\epsilon}\right)+\sqrt{\epsilon\log\left(\frac{mn}{\alpha\epsilon}\right)}\right)\right).

The conclusion follows since, by inspection, Algorithm 1 costs $O(T\cdot(1+\mathcal{T}_{\textup{samp}}+\mathcal{T}_{\textup{update}}))$. As remarked in the introduction, the additive term in the runtime comes from the cost of stably implementing a quantum circuit, required in the use of Theorem 3, representing a polynomial transformation in finite precision; we discuss this in greater detail in Appendix D. ∎

3.2 Dynamic sampling maintenance via dynamic hint maintenance

In this section, we overview our proof of Theorem 3, which proceeds in two steps.

1. We reduce sampling maintenance (Problem 1) to a problem which we call hint maintenance. This latter problem is a specialization of the sampling maintenance problem where suitable advice, which we call the hint throughout, is provided.

2. We show how to solve the hint maintenance problem required by Proposition 2 in Theorem 3, by recursively calling Proposition 2 in phases, allowing us to maintain hints of suitable quality.

Reducing sampling maintenance to hint maintenance.

First, we introduce the following data structure for maintaining the $x$ variable in Problem 1, which was used crucially in [vAG19] for dynamic Gibbs sampling. This data structure allows efficient queries to subsets of the coordinates of $x$, and we use it in our Gibbs sampler as well.

Lemma 1 (Sampler tree).

Let $\eta\in\mathbb{R}_{\geq 0}$ and $m\in\mathbb{N}$. There is a classical data structure, $\mathsf{SamplerTree}$, supporting a tree on $O(m)$ nodes whose leaves correspond to $[m]$, with the following operations.

• $\mathsf{Init}(m,\eta_{\textup{fixed}})$: initialize $x\leftarrow\mathbf{0}_{m}$ and $\eta\leftarrow\eta_{\textup{fixed}}$

• $\mathsf{Update}(i)$: $x_{i}\leftarrow x_{i}+\eta$

• $\mathsf{SubtreeSum}(v)$: return the sum of all $x_{i}$ such that $i$ is in the subtree of $v$

The total runtime of $T$ calls to $\mathsf{Update}$ is $O(T\log m)$, and each call to $\mathsf{SubtreeSum}$ costs $O(1)$.
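For concreteness, a minimal sketch of one standard implementation (ours; the array-based heap indexing below is a choice of this sketch, not part of Lemma 1):

```python
class SamplerTree:
    """Complete binary tree over m leaves storing subtree sums of x.

    Node 1 is the root, node v has children 2v and 2v + 1, and leaf
    i in [m] (0-indexed here) lives at node size + i."""

    def __init__(self, m, eta_fixed):  # Init(m, eta_fixed)
        self.eta = eta_fixed
        self.size = 1
        while self.size < m:
            self.size *= 2
        self.sums = [0.0] * (2 * self.size)

    def update(self, i):
        """x_i <- x_i + eta: add eta along the leaf-to-root path, O(log m)."""
        v = self.size + i
        while v >= 1:
            self.sums[v] += self.eta
            v //= 2

    def subtree_sum(self, v):
        """Sum of x_i over the subtree rooted at node v, in O(1)."""
        return self.sums[v]
```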

An implementation of $\mathsf{SamplerTree}$ based on propagating subtree sums upon updates, as sketched above, is a standard classical data structure, and we omit further details for brevity. Next, we state our first building block towards solving Problem 1, a result which can be thought of as quantum sampling with a hint. We defer its proof to Appendix B, as it is primarily based on generalizing dynamic block-encoding strategies with bounded-degree polynomial approximations, as pioneered by [GSLW19, vAG19].

Proposition 2.

Let $x\in\mathbb{R}^{m}_{\geq 0}$ correspond to an instance of $\mathsf{SamplerTree}$, and let $\beta\geq\lVert x\rVert_{1}$. Let $p$ be the Gibbs distribution associated with $\mathbf{A}^{\top}x$, let $Z:=\sum_{j\in[n]}\exp([\mathbf{A}^{\top}x]_{j})$, and let $\widetilde{Z}\in[Z,CZ]$ for some $C\geq 1$. Finally, let $q\in\mathbb{R}^{n}$ have entries classically queriable in $O(1)$ time, satisfying $q\geq p$ entrywise, $q_{j}\in[\frac{\delta}{n},1]$ for all $j\in[n]$, and $\lVert q\rVert_{1}=\rho$. Suppose $\widetilde{Z}$, $C$, $\rho$, and $\beta$ are explicitly known. Given a quantum oracle for $\mathbf{A}\in\mathbb{R}^{m\times n}$ (defined in Section 2) with $\lVert\mathbf{A}\rVert_{\max}\leq 1$, we can implement a $\delta$-approximate Gibbs oracle with query cost $O(\sqrt{\rho C}\cdot\beta\log^{4}(\frac{Cmn}{\delta}))$. The total additional cost incurred if $x$ undergoes $T$ $\mathsf{Update}$ calls which preserve the invariants on $\widetilde{Z},C,\rho,\beta$ is $O(T\log m)$.

Proposition 2 makes use of an overestimating hint vector $q$ and an approximate normalization constant $\widetilde{Z}$, which we collectively call the hint. The acceptance probability of our rejection sampling is governed by two primary parameters: $\rho=\lVert q\rVert_{1}$, which reflects the degree of overestimation (and can be thought of as the hint quality), and $C\geq 1$, which reflects our inability to accept with probability $\frac{p_{j}}{q_{j}}$ when $p$ is implicit (and can be thought of as the normalization quality; exactly computing $Z$ may require time $\Omega(n)$ in standard implementations, an obstacle to runtimes $\propto\sqrt{n}$). In particular, the rejection sampling scheme used in Proposition 2 instead accepts with probability $\frac{p_{j}}{Cq_{j}}$.

Here we elaborate briefly on the implementation of Proposition 2 (for more details, see Section 4). We follow the notation of Proposition 2, and let $w:=\mathbf{A}^{\top}x$, so that the unnormalized Gibbs distribution is $\exp(w)$ and $p=\frac{\exp(w)}{Z}$. Proposition 2 is a rejection sampler which first loads the hint $q$ into superposition, and then applies a filter. Overall, our scheme has the form

\text{sample }j\sim\frac{q}{\rho},\text{ then accept with probability }\frac{\exp(w_{j})}{CZ\cdot q_{j}}=\frac{p_{j}}{Cq_{j}}, \quad (1)

which results in an accepted sample with probability $\approx\frac{1}{\rho C}$, and hence requires $\approx\sqrt{\rho C}$ trials to succeed after applying quantum amplitude amplification, a generalization of Grover search [BHMT02]. (The $\beta$ in Proposition 2 arises from loading $\exp(w_{j})$ into a quantum oracle via polynomials of degree $\approx\beta$.) The latter filtering step is implemented using appropriate block-encoding technology.

The above discussion suggests that the hint and normalization qualities, parameterized by $\rho$ and $C$, are crucial in controlling the acceptance probability of our scheme. More concretely, in our applications of Proposition 2, $\beta=\eta T=\widetilde{O}(\frac{1}{\epsilon})$, which is the bound on the $\ell_{1}$ norm of the iterates $x_{t}$ and $y_{t}$ in Algorithm 1 under the parameter settings of Proposition 1. Overall, the cost of implementing an approximate Gibbs oracle is then (up to logarithmic factors) $\sqrt{\rho C}\cdot\frac{1}{\epsilon}$. Proposition 2 hence reduces Problem 1 to the problem of maintaining the hint, consisting of a vector $q$ and a normalization estimate $\widetilde{Z}$. We mention that Proposition 2 is a strict generalization of a corresponding building block in [vAG19], which only used $q$ set to the all-ones vector.
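A classical analogue of scheme (1) may help clarify the roles of $\rho$ and $C$ (a sketch under the invariants of Proposition 2; the quantum version replaces the retry loop with amplitude amplification, reducing $\approx\rho C$ expected trials to $\approx\sqrt{\rho C}$):

```python
import numpy as np

def hinted_rejection_sampler(w, q, Z_tilde, C, rng, max_trials=100_000):
    """Scheme (1): propose j ~ q/rho, accept w.p. exp(w_j)/(C*Z_tilde*q_j).

    Assuming q >= p entrywise and Z <= Z_tilde <= C*Z, the acceptance
    probability is at most 1, and accepted samples are distributed
    exactly as p, where p_j = exp(w_j)/Z."""
    rho = q.sum()
    proposal = q / rho
    for _ in range(max_trials):
        j = rng.choice(len(q), p=proposal)
        if rng.random() < np.exp(w[j]) / (C * Z_tilde * q[j]):
            return j
    raise RuntimeError("no acceptance; check the invariants on q and Z_tilde")
```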

Approaches for Problem 1.

We now overview our improved solution to Problem 1 via efficient use of Proposition 2. To motivate our solution, we outline three solutions to Problem 1 offering different tradeoffs in the overall quality $\rho C$. The first only uses classical information and does not use Proposition 2 at all, the second uses Proposition 2 but maintains no history across iterates, and the third (building upon the first two) is our approach.

Solution 1: [GK95]. A standard way to solve Problem 1 is to explicitly update $w=\mathbf{A}^{\top}x$ and $\exp(w)$, and exactly maintain the normalizing constant $Z$. This allows us to sample from $p$ in $\widetilde{O}(1)$ time. Since $w$ changes by one row of $\mathbf{A}$ under a $1$-sparse $\mathsf{Update}$ operation to $x$, this is implementable in $O(n)$ time per iteration. We can view this as an instance of the scheme (1) with $q=p$, $C=1$, and $\rho=1$. It yields the (unbalanced) tradeoff for Problem 1 of $\mathcal{T}_{\textup{samp}}=\widetilde{O}(1)$ and $\mathcal{T}_{\textup{update}}=O(n)$.
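A sketch of this classical baseline (ours; for brevity the sampling step below is $O(n)$, whereas maintaining a sampler tree over the weights $\exp(w)$ makes it $\widetilde{O}(1)$ as in [GK95]):

```python
import numpy as np

class ExactGibbsMaintainer:
    """Solution 1: maintain w = A^T x explicitly under sparse updates,
    i.e. the scheme (1) with q = p, C = 1, and rho = 1."""

    def __init__(self, A):
        self.A = A
        self.w = np.zeros(A.shape[1])

    def update(self, i, eta):
        # An Update to x_i shifts w by eta times row i of A: O(n) work.
        self.w += eta * self.A[i]

    def sample(self, rng):
        p = np.exp(self.w - self.w.max())
        return rng.choice(len(self.w), p=p / p.sum())
```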

Solution 2: [vAG19]. A recent work [vAG19] introduced a quantum implementation of the scheme (1) with an improved tradeoff. The [vAG19] scheme first samples uniformly, which in the language of (1) means $q=\mathbf{1}_{n}$ and $\rho=n$. It then applies quantum maximum finding [DH96] to obtain an approximate maximum entry of $w$, which they show takes time $\widetilde{O}(\beta\cdot\sqrt{n})$; for simplicity here, we assume this exactly yields $w_{\max}:=\max_{j\in[n]}w_{j}$. Finally, the acceptance probability $\frac{p_{j}}{Cq_{j}}$ is set to $\exp(w_{j}-w_{\max})$. For $q=\mathbf{1}_{n}$, this translates to

p_{j}\cdot\exp(w_{\max}-w_{j})=\frac{\exp(w_{\max})}{Z}\leq 1,

implying $C=1$ suffices. We note this bound on $C$ can be tight when $w$ is very non-uniform. Overall, the [vAG19] scheme's update time requires maximum finding, and its sampling time (via Proposition 2) requires time $\widetilde{O}(\beta\cdot\sqrt{\rho C})=\widetilde{O}(\beta\cdot\sqrt{n})$. For $\beta=\widetilde{O}(\frac{1}{\epsilon})$ as in Algorithm 1, this yields the balanced tradeoff $\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})=\widetilde{O}(\sqrt{n}\cdot\epsilon^{-1})$. As discussed earlier, our key insight is to improve upon this specific choice of hint in [vAG19], for their implicit use of Proposition 2.

Solution 3: this work. We design better hints for Proposition 2 by executing our algorithm in phases corresponding to batches of $\approx\frac{1}{\eta}$ iterations. At the start of each phase, we use the Gibbs access afforded by Proposition 2 to produce a suitable hint for efficiently implementing the next phase. Our execution of this strategy, parameterized by an integer $k\in[n]$, relies on the following observations.

1. During the $\lceil\frac{1}{\eta}\rceil$ iterations $t\in\{\tau+s\}_{s\in[\lceil\frac{1}{\eta}\rceil]}$ (where $\tau$ begins the phase), the dynamic Gibbs distribution $p_{t}$ (where $t$ is the iteration index) changes by $O(1)$ multiplicatively, since $w$ changes entrywise by $O(1)$ additively. Thus, the quality of a hint vector deteriorates by at most a constant factor during the phase, so it suffices to give a good hint $q_{\tau}\geq p_{\tau}$ at the phase start.

2. By using access to Proposition 2 at the end of the previous phase, we can efficiently estimate large entries of $p_{\tau}$. More precisely, we sample $\widetilde{O}(k)$ times from $p_{\tau}$, and let the empirical distribution of these samples be $\tilde{q}$. Chernoff bounds show that any large entry $[p_{\tau}]_{j}=\Omega(\frac{1}{k})$ will be accurately reflected in the empirical sample. Hence, we set the hint to

q_{j}=\begin{cases}\tilde{q}_{j}\cdot O(1)&\tilde{q}_{j}=\Omega(\frac{1}{k})\\ \frac{1}{k}\cdot O(1)&\tilde{q}_{j}=O(\frac{1}{k})\end{cases},

for appropriate constants. This yields an improved hint quality of $\rho\approx\frac{n}{k}$, since large entries of the hint sum to at most $O(1)$ (as $\tilde{q}_{j}\approx p_{j}$), and small entries sum to $O(\frac{n}{k})$.

3. We show that a similar strategy of using empirical concentration, combined with a testing variant of Proposition 2, accurately estimates the normalizing factor $Z$, yielding $C=O(1)$.

This strategy yields $\mathcal{T}_{\textup{samp}}=\widetilde{O}(\beta\cdot\sqrt{n/k})$ and $\mathcal{T}_{\textup{update}}=\widetilde{O}(\mathcal{T}_{\textup{samp}}\cdot k\eta)$ (since we amortize $\mathcal{T}_{\textup{update}}$ over the $\approx\frac{1}{\eta}$ iterations of a phase). For the parameter settings of Algorithm 1, optimizing $k$ yields

\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})=\widetilde{O}\left(\sqrt{n}\cdot\epsilon^{-\frac{1}{2}}\right).
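As a sanity check on this balancing step, a short calculation under the stated parameter choices: setting $\mathcal{T}_{\textup{update}}\approx\mathcal{T}_{\textup{samp}}\cdot k\eta$ equal to $\mathcal{T}_{\textup{samp}}$ forces $k=\Theta(\eta^{-1})$, whence

\mathcal{T}_{\textup{samp}}=\widetilde{O}\left(\beta\sqrt{\frac{n}{k}}\right)=\widetilde{O}\left(\beta\sqrt{n\eta}\right)=\widetilde{O}\left(\sqrt{n}\cdot\epsilon^{-\frac{1}{2}}\right),\text{ for }\beta=\widetilde{O}(\epsilon^{-1}),\ \eta=\Theta(\epsilon).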

We prove Theorem 3, our improved solution to Problem 1, in Section 4. Ignoring logarithmic factors and assuming $\eta\ll 1$ (as in our setting), Theorem 3 shows we can maintain $\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})=\widetilde{O}(\sqrt{n}\cdot T\eta^{1.5})$. For the parameter settings $T=\widetilde{O}(\epsilon^{-2})$, $\eta=\Theta(\epsilon)$ stated in Proposition 1, this indeed equates to $\max(\mathcal{T}_{\textup{samp}},\mathcal{T}_{\textup{update}})=\widetilde{O}(\sqrt{n}\cdot\epsilon^{-\frac{1}{2}})$.

4 Gibbs sampling oracle implementation

In this section, we prove Theorem 3, which gives our solution to Problem 1. To do so, we follow the outline given in Section 3.2, wherein we solve Problem 1 in batches of $\lceil\frac{1}{\eta}\rceil$ iterations, each of which we call a "phase." In Sections 4.1 and 4.2, we only discuss a single phase of Problem 1, consisting of the iterations $\tau+s$ for $s\in[\lceil\frac{1}{\eta}\rceil]$ and some initial iteration $\tau$, assuming certain invariants (stated below) hold at the start of the phase. We give a complete solution to Problem 1 in Section 4.3.

Invariant 1 (Approximate normalization access).

We explicitly have $\widetilde{Z}_{\textup{prev}}$ with $\widetilde{Z}_{\textup{prev}}\in[Z_{\tau},CZ_{\tau}]$ for some $C=O(1)$.

Invariant 2 (Initial sampling maintenance).

We have $\mathcal{O}_{\tau}$ solving Problem 1 in iteration $\tau$.

The remainder of this section is then organized as follows.

• Section 4.1: We show that, assuming Invariants 1 and 2 hold at the start of a phase, we can perform the preprocessing used to construct our hint, consisting of the estimated normalization $\widetilde{Z}$ and vector $q$, in an application of Proposition 2. This gives the cost of $\mathcal{T}_{\textup{samp}}$ in Problem 1.

• Section 4.2: We show that at the conclusion of each phase we can maintain Invariants 1 and 2 for use in the next phase. This gives the cost of $\mathcal{T}_{\textup{update}}$ in Problem 1.

• Section 4.3: We recursively call the subroutine of Sections 4.1 and 4.2 (which solves Problem 1 for all the iterations $\tau+s$ where $s\in[\lceil\frac{1}{\eta}\rceil]$, for some $\tau$) $\approx\eta T$ times to prove Theorem 3.

4.1 Preprocessing and approximate Gibbs oracle implementation

In this section, we show how to construct the "hint" $q$ which will be used throughout a phase (starting in iteration $\tau$) given access to $\mathcal{O}_{\tau}$, and we bound $\rho=\lVert q\rVert_{1}$, which quantifies the quality of our hint, under the assumption that Invariants 1 and 2 hold in the phase. We first show a multiplicative stability property of the relevant Gibbs distributions within a phase.

Lemma 2.

For all $s\in[\lceil\frac{1}{\eta}\rceil]$, we have

Z_{\tau+s}\in\left[\frac{1}{3}Z_{\tau},3Z_{\tau}\right],\text{ and }p_{\tau+s}\in\left[\frac{1}{9}p_{\tau},9p_{\tau}\right]\text{ entrywise.}
Proof.

Let $\nu_{t}:=\exp(\mathbf{A}^{\top}x_{t})$ for all $t$, so that $p_{t}=\frac{\nu_{t}}{Z_{t}}$. For any $j\in[n]$,

\frac{[\nu_{\tau+s}]_{j}}{[\nu_{\tau}]_{j}}=\exp\left(\left[\mathbf{A}^{\top}\left(x_{\tau+s}-x_{\tau}\right)\right]_{j}\right)\in\left[\exp\left(-\left\lVert\mathbf{A}\right\rVert_{\max}\left\lVert x_{\tau+s}-x_{\tau}\right\rVert_{1}\right),\exp\left(\left\lVert\mathbf{A}\right\rVert_{\max}\left\lVert x_{\tau+s}-x_{\tau}\right\rVert_{1}\right)\right]\subseteq\left[\exp(-\eta s),\exp(\eta s)\right]\subseteq\left[\frac{1}{3},3\right].

Similarly, $Z_{\tau+s}\in[\frac{1}{3}Z_{\tau},3Z_{\tau}]$, and combining yields the conclusion. ∎

Next, our computation of the overestimating vector $q$ is parameterized by an integer $k\in[n]$, which is fixed throughout this section and Section 4.2. We simply set $q$ to be an upscaled variant of an empirical distribution of roughly $k$ draws from $\mathcal{O}_{\tau}$.

Lemma 3.

Let $k\in[n]$, $\alpha\in(0,1)$, and suppose $\delta\leq\frac{1}{16k}$. Draw $N=\Theta(k\log\frac{n\eta T}{\alpha})$ samples from $\mathcal{O}_{\tau}$ (for an appropriately large constant), and let $\tilde{q}\in\Delta^{n}$ be the empirical distribution over these $N$ samples. Define $\mathcal{B}:=\{i\in[n]\mid\tilde{q}_{i}\geq\frac{1}{2k}\}$. Then for

q_{j}=\begin{cases}18\tilde{q}_{j}&j\in\mathcal{B}\\ \frac{18}{k}&j\not\in\mathcal{B}\end{cases},

with probability $\geq 1-\frac{\alpha}{2\lceil\eta T\rceil}$, we have $\lVert q\rVert_{1}=O(\frac{n}{k})$ and $q\geq p_{\tau+s}$ entrywise, for all $s\leq\frac{1}{\eta}$.

Proof.

The first conclusion, $\lVert q\rVert_{1}=O(\frac{n}{k})$, is immediate from the definition of $q$, since $\lVert q\rVert_{1}\leq 18\lVert\tilde{q}\rVert_{1}+\frac{18n}{k}$. In light of Lemma 2 (which holds deterministically), to show the second conclusion it suffices to show that, with the desired success probability, we have both

2\tilde{q}_{j}\geq[p_{\tau}]_{j}\text{ for all }j\in\mathcal{B} \quad (2)

and

\frac{2}{k}\geq[p_{\tau}]_{j}\text{ for all }j\not\in\mathcal{B}. \quad (3)

Denote $\alpha^{\prime}:=\frac{\alpha}{2\lceil\eta T\rceil}$ for notational convenience, let $\tilde{p}$ denote the distribution of samples from $\mathcal{O}_{\tau}$, and recall that $\lVert\tilde{p}-p_{\tau}\rVert_{1}\leq\frac{1}{16k}$. Because we take $\Theta(k\log\frac{n}{\alpha^{\prime}})$ samples from $\tilde{p}$, a standard Chernoff bound (union bounding over all coordinates $j\in[n]$) gives that, with probability at least $1-\alpha^{\prime}$, both of the following hold.

1. For all $j\in[n]$ such that $\tilde{p}_{j}\geq\frac{1}{4k}$, we have $\tilde{q}_{j}\geq\frac{2\tilde{p}_{j}}{3}$.

2. For all $j\in[n]$ such that $\tilde{p}_{j}\leq\frac{1}{4k}$, we have $\tilde{q}_{j}\leq\frac{1}{2k}$.

We condition on these events for the remainder of the proof; we now show (2), (3) in turn.

Proof of (2). To see (2), the second event above implies that if $\tilde{p}_{j}\leq\frac{1}{4k}$, then $j\not\in\mathcal{B}$. Hence, for all $j\in\mathcal{B}$, we have $\tilde{q}_{j}\geq\frac{2\tilde{p}_{j}}{3}\geq\frac{[p_{\tau}]_{j}}{2}$, since $\lVert\tilde{p}-p_{\tau}\rVert_{\infty}\leq\frac{1}{16k}\leq\frac{1}{4}\tilde{p}_{j}$ for all $j\in\mathcal{B}$.

Proof of (3). To see (3), suppose for contradiction that $j\not\in\mathcal{B}$ and $[p_{\tau}]_{j}>\frac{2}{k}$. This implies that $\tilde{p}_{j}>\frac{1}{k}$, and hence by the first event above, $\tilde{q}_{j}\geq\frac{1}{2k}$, contradicting $j\not\in\mathcal{B}$. ∎
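The construction of Lemma 3 is short in code (a sketch; the constant inside $N$ is omitted, and `sampler` stands for the oracle $\mathcal{O}_{\tau}$):

```python
import numpy as np

def build_hint(sampler, n, k, T, eta, alpha, rng):
    """Lemma 3: hint q from the empirical distribution of N samples.

    Empirical entries >= 1/(2k) (the set B) are upscaled by 18; the
    rest are set to 18/k, so ||q||_1 <= 18 + 18n/k = O(n/k)."""
    N = int(np.ceil(k * np.log(n * eta * T / alpha)))  # Theta(.), constant omitted
    counts = np.bincount([sampler(rng) for _ in range(N)], minlength=n)
    q_emp = counts / N
    return np.where(q_emp >= 1 / (2 * k), 18 * q_emp, 18 / k)
```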

Corollary 1.

Assume that Invariants 1 and 2 hold for the phase consisting of iterations $\tau+s$, $s\in[\lceil\frac{1}{\eta}\rceil]$. We can solve Problem 1 for the phase with probability $\geq 1-\frac{\alpha}{2\lceil\eta T\rceil}$, and with

\mathcal{T}_{\textup{samp}}:=O\left(\sqrt{\frac{n}{k}}\cdot T\eta\log^{4}\left(\frac{mn}{\delta}\right)\right).
Proof.

We run the algorithm described in the proof of Lemma 3, and condition on its success, which gives the failure probability. It then suffices to apply Proposition 2 with the $q$ defined in Lemma 3. For this $q$, we parameterize Proposition 2 with $C=O(1)$ (see Invariant 1), $\rho=O(\frac{n}{k})$ (see Lemma 3), and $\beta=T\eta$. It is clear that the lower bound on entries of $q$ in Proposition 2 holds. ∎

4.2 Maintaining invariants

We now show how to maintain Invariant 1 at iteration $\tau^{\prime}:=\tau+\lceil\frac{1}{\eta}\rceil$, for use in the next phase, and we bound the cost of doing so. We note that Invariant 2 follows immediately from our construction in Corollary 1. First, by combining Lemma 2 with Invariant 1,

Z_{\tau^{\prime}}\in\left[\frac{\widetilde{Z}_{\textup{prev}}}{3C},3\widetilde{Z}_{\textup{prev}}\right]. \quad (4)

This suggests that we may use $\widetilde{Z}=3\widetilde{Z}_{\textup{prev}}$ for the next phase; however, doing so repeatedly would lead to an exponential blowup in the multiplicative range $C$. To sidestep this, we develop a tester for a hidden parameter governing a success probability, which will be used to give a refined estimate $\widetilde{Z}$. We require the following corollary of Proposition 2, whose proof we defer to Appendix B.

Corollary 2.

Following the notation of Proposition 2, let $R:=\frac{\widetilde{Z}}{Z}$. There is a quantum oracle $\mathcal{O}_{\textup{test}}$ which can be implemented under $T$ $\mathsf{Update}$ calls to $x$ in $O(T\log m)$ time, and which has query cost

O\left(\sqrt{\rho C}\cdot\beta\log^{4}\left(\frac{Cmn}{\delta}\right)\right).

Furthermore, for explicitly known constants $C_{\ell}$ and $C_{u}$, $\mathcal{O}_{\textup{test}}$ returns "success" with probability $p$ satisfying

\frac{C_{\ell}}{\sqrt{R\rho}}\leq p\leq\frac{C_{u}}{\sqrt{R\rho}}.

Corollary 2 differs from Proposition 2 in that it returns a Boolean-valued answer (as opposed to a sample from an approximate Gibbs distribution), and has a success probability parameterized by explicit constants. We now show how to use Corollary 2 to maintain Invariant 1.

Lemma 4.

Assume Invariants 12 hold for the phase consisting of iterations τ+s\tau+s, s[1η]s\in[\lceil\frac{1}{\eta}\rceil], and suppose C4Cu2C2C\geq\frac{4C_{u}^{2}}{C_{\ell}^{2}} for C=O(1)C=O(1), where CuC_{u} and CC_{\ell} are the constants from Corollary 2. Further, suppose we have obtained qq satisfying the conclusion of Lemma 3 (i.e. that the algorithm in Lemma 3 succeeded). We can determine Z~\widetilde{Z} such that Z~[Zτ,CZτ]\widetilde{Z}\in[Z_{\tau^{\prime}},CZ_{\tau^{\prime}}] with probability 1α2ηT\geq 1-\frac{\alpha}{2\lceil\eta T\rceil}, in time

O(nkTηlog4(mnδ)log(ηTα)).O\left(\sqrt{\frac{n}{k}}\cdot T\eta\log^{4}\left(\frac{mn}{\delta}\right)\log\left(\frac{\eta T}{\alpha}\right)\right).
Proof.

Define Z~0:=3Z~prev\widetilde{Z}_{0}:=3\widetilde{Z}_{\textup{prev}}, R0:=Z~0ZτR_{0}:=\frac{\widetilde{Z}_{0}}{Z_{\tau^{\prime}}}, and note that Z~0[Zτ,9CZτ]\widetilde{Z}_{0}\in[Z_{\tau^{\prime}},9CZ_{\tau^{\prime}}] by Invariant 1 and Lemma 2. Next, assuming the success of Lemma 3, we have that the success probability pp of 𝒪test\mathcal{O}_{\textup{test}} from Corollary 2 using the estimate Z~0\widetilde{Z}_{0} satisfies (for the unknown R0[1,9C]R_{0}\in[1,9C], and known C,Cu,ρC_{\ell},C_{u},\rho)

CR0ρpCuR0ρ.\frac{C_{\ell}}{\sqrt{R_{0}\rho}}\leq p\leq\frac{C_{u}}{\sqrt{R_{0}\rho}}.

For N:=27log4ηTα3CρCN:=27\log\frac{4\lceil\eta T\rceil}{\alpha}\cdot\frac{3\sqrt{C\rho}}{C_{\ell}}, we first run 𝒪test\mathcal{O}_{\textup{test}} NN times and check the number of successes, denoted by SS, which fits within the runtime budget by Corollary 2. By a Chernoff bound, we have that with probability 1α2ηT\geq 1-\frac{\alpha}{2\lceil\eta T\rceil}, we have

54log4ηTαCR023pNS43pN108log4ηTαCuCCR0.54\log\frac{4\lceil\eta T\rceil}{\alpha}\cdot\sqrt{\frac{C}{R_{0}}}\leq\frac{2}{3}pN\leq S\leq\frac{4}{3}pN\leq 108\log\frac{4\lceil\eta T\rceil}{\alpha}\cdot\frac{C_{u}}{C_{\ell}}\cdot\sqrt{\frac{C}{R_{0}}}.

Hence, S determines \sqrt{\frac{C}{R_{0}}} up to a multiplicative factor of \frac{2C_{u}}{C_{\ell}}, and thus determines R_{0} up to a multiplicative factor of \frac{4C_{u}^{2}}{C_{\ell}^{2}}\leq C, which implies the same multiplicative approximation factor for Z_{\tau^{\prime}}, as desired. ∎
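For intuition, the following is a minimal classical sketch of this estimation procedure, where test() is a hypothetical stand-in for \mathcal{O}_{\textup{test}} and the constants mirror the proof; it is illustrative only, not the quantum implementation.

import math

def estimate_R0(test, rho, C_l, C_u, C, num_phases, alpha):
    # `test` is a hypothetical stand-in for O_test: it returns True with
    # unknown probability p in [C_l / sqrt(R0 * rho), C_u / sqrt(R0 * rho)]
    # for some R0 in [1, 9C].
    log_term = math.log(4 * num_phases / alpha)
    N = math.ceil(27 * log_term * 3 * math.sqrt(C * rho) / C_l)
    S = sum(1 for _ in range(N) if test())
    # With probability >= 1 - alpha / (2 * num_phases), the Chernoff bound
    # places S in [2/3, 4/3] * p * N, so S / (54 * log_term) lies in
    # [sqrt(C / R0), (2 * C_u / C_l) * sqrt(C / R0)].
    sqrt_C_over_R0 = S / (54 * log_term)
    # Inverting, the returned estimate is within a multiplicative factor
    # 4 * C_u**2 / C_l**2 <= C of the true R0, as in the proof of Lemma 4.
    return C / sqrt_C_over_R0 ** 2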

4.3 Proof of Theorem 3

See Theorem 3

Proof.

We first claim that for any k[n]k\in[n], we can solve Problem 1 with probability 1α\geq 1-\alpha and

𝒯samp\displaystyle\mathcal{T}_{\textup{samp}} =O(nkTηlog4(mnδ)),\displaystyle=O\left(\sqrt{\frac{n}{k}}\cdot T\eta\log^{4}\left(\frac{mn}{\delta}\right)\right),
𝒯update\displaystyle\mathcal{T}_{\textup{update}} =O((nkTηlog4(mnδ))kηlog(nηTα)).\displaystyle=O\left(\left(\sqrt{\frac{n}{k}}\cdot T\eta\log^{4}\left(\frac{mn}{\delta}\right)\right)\cdot k\eta\log\left(\frac{n\eta T}{\alpha}\right)\right).

This follows from combining Lemma 3 (amortized over \lceil\frac{1}{\eta}\rceil iterations), Corollary 1, and Lemma 4, and taking a union bound over at most \lceil\eta T\rceil phases. Here we note that the O(\log m) cost per iteration of supporting \mathsf{Update} calls to x in Lemma 1, Proposition 2, and Corollary 2 is not dominant. By choosing k=\Theta(\max(1,(\eta\log\frac{mn}{\alpha\epsilon})^{-1})), we balance the costs of \mathcal{T}_{\textup{samp}} and \mathcal{T}_{\textup{update}}, yielding the conclusion. Finally, we note that by picking an appropriate constant in the definition of k, we have \delta\leq\eta\implies\delta\leq\frac{1}{16k}, as required by Lemma 3, the only component specifying a bound on \delta. ∎

Acknowledgments

We thank András Gilyén for communication regarding the prior work [vAG19]. AB was supported in part by the DOE QuantISED grant DE-SC0020360, by the AFOSR under grant FA9550-21-1-0392, and by the U.S. DOE Office of Science under Award Number DE-SC0020266. YG was supported in part by the Stanford MS&E DE&I Research program. YJ was supported in part by a Stanford Graduate Fellowship and a Danzig-Lieberman Graduate Fellowship. AS was supported in part by a Microsoft Research Faculty Fellowship, NSF CAREER Award CCF1844855, NSF Grant CCF-1955039, a PayPal research award, and a Sloan Research Fellowship. KT thanks Ewin Tang for her expertise on quantum linear algebra and for fielding many of our questions.

References

  • [Aar15] Scott Aaronson. Read the fine print. Nature Physics, 11(4):291–293, 2015.
  • [AK07] Sanjeev Arora and Satyen Kale. A combinatorial, primal-dual approach to semidefinite programs. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 227–236, 2007.
  • [AL17] Zeyuan Allen-Zhu and Yuanzhi Li. Follow the compressed leader: Faster online learning of eigenvectors and faster MMWU. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, volume 70 of Proceedings of Machine Learning Research, pages 116–125. PMLR, 2017.
  • [Alm21] Josh Alman. Limits on the universal method for matrix multiplication. Theory Comput., 17:1–30, 2021.
  • [AW21] Josh Alman and Virginia Vassilevska Williams. A refined laser method and faster matrix multiplication. In Dániel Marx, editor, Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, Virtual Conference, January 10 - 13, 2021, pages 522–539. SIAM, 2021.
  • [BHMT02] Gilles Brassard, Peter Høyer, Michele Mosca, and Alain Tapp. Quantum amplitude amplification and estimation. Quantum Computation and Quantum Information, 305:53–74, 2002.
  • [BS17] Fernando GSL Brandao and Krysta M Svore. Quantum speed-ups for solving semidefinite programs. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 415–426. IEEE, 2017.
  • [Bub15] Sébastien Bubeck. Convex optimization: Algorithms and complexity. Foundations and Trends in Machine Learning, 8(3-4):231–357, 2015.
  • [BWP+17] Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd. Quantum machine learning. Nature, 549(7671):195–202, 2017.
  • [CCLW20] Shouvanik Chakrabarti, Andrew M Childs, Tongyang Li, and Xiaodi Wu. Quantum algorithms and lower bounds for convex optimization. Quantum, 4:221, 2020.
  • [CDST19] Yair Carmon, John C. Duchi, Aaron Sidford, and Kevin Tian. A rank-1 sketch for matrix multiplicative weights. In Alina Beygelzimer and Daniel Hsu, editors, Conference on Learning Theory, COLT 2019, 25-28 June 2019, Phoenix, AZ, USA, volume 99 of Proceedings of Machine Learning Research, pages 589–623. PMLR, 2019.
  • [CGL+20] Nai-Hui Chia, András Gilyén, Tongyang Li, Han-Hsuan Lin, Ewin Tang, and Chunhao Wang. Sampling-based sublinear low-rank matrix arithmetic framework for dequantizing quantum machine learning. In Proceedings of the 52nd Annual ACM SIGACT symposium on theory of computing, pages 387–400, 2020.
  • [CJST19] Yair Carmon, Yujia Jin, Aaron Sidford, and Kevin Tian. Variance reduction for matrix games. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 11377–11388, 2019.
  • [CJST20] Yair Carmon, Yujia Jin, Aaron Sidford, and Kevin Tian. Coordinate methods for matrix games. In Sandy Irani, editor, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 283–293. IEEE, 2020.
  • [CLS21] Michael B Cohen, Yin Tat Lee, and Zhao Song. Solving linear programs in the current matrix multiplication time. Journal of the ACM (JACM), 68(1):1–39, 2021.
  • [DH96] Christoph Dürr and Peter Høyer. A quantum algorithm for finding the minimum. CoRR, quant-ph/9607014, 1996.
  • [DNV20] Daniel Dadush, Bento Natura, and László A. Végh. Revisiting Tardos's framework for linear programming: faster exact solutions using approximate solvers. In Sandy Irani, editor, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 931–942. IEEE, 2020.
  • [GAW19] András Gilyén, Srinivasan Arunachalam, and Nathan Wiebe. Optimizing quantum optimization algorithms via faster quantum gradient computation. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1425–1444. SIAM, 2019.
  • [GK95] Michael D. Grigoriadis and Leonid G. Khachiyan. A sublinear-time randomized approximation algorithm for matrix games. Operations Research Letters, 18(2):53–58, 1995.
  • [GLG22] Sevag Gharibian and François Le Gall. Dequantizing the quantum singular value transformation: Hardness and applications to quantum chemistry and the quantum PCP conjecture. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 19–32, 2022.
  • [GR02] Lov Grover and Terry Rudolph. Creating superpositions that correspond to efficiently integrable probability distributions. CoRR, abs/quant-ph/0208112, 2002.
  • [GSLW19] András Gilyén, Yuan Su, Guang Hao Low, and Nathan Wiebe. Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 193–204. ACM, 2019.
  • [Haa19] Jeongwan Haah. Product decomposition of periodic functions in quantum signal processing. Quantum, 3:190, 2019.
  • [HHL09] Aram W Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear systems of equations. Physical review letters, 103(15):150502, 2009.
  • [Jor05] Stephen P Jordan. Fast quantum algorithm for numerical gradient estimation. Physical review letters, 95(5):050501, 2005.
  • [JSWZ21] Shunhua Jiang, Zhao Song, Omri Weinstein, and Hengjie Zhang. A faster algorithm for solving general LPs. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2021, 2021, pages 823–832, 2021.
  • [Kar84] Narendra Karmarkar. A new polynomial-time algorithm for linear programming. In Proceedings of the sixteenth annual ACM symposium on Theory of computing, pages 302–311, 1984.
  • [KP16] Iordanis Kerenidis and Anupam Prakash. Quantum recommendation systems. arXiv preprint arXiv:1603.08675, 2016.
  • [KP20] Iordanis Kerenidis and Anupam Prakash. A quantum interior point method for LPs and SDPs. ACM Transactions on Quantum Computing, 1(1):1–32, 2020.
  • [LCW19] Tongyang Li, Shouvanik Chakrabarti, and Xiaodi Wu. Sublinear quantum algorithms for training linear and kernel-based classifiers. In International Conference on Machine Learning, pages 3815–3824. PMLR, 2019.
  • [LMR14] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost. Quantum principal component analysis. Nature Physics, 10(9):631–633, 2014.
  • [LS14] Yin Tat Lee and Aaron Sidford. Path finding methods for linear programming: Solving linear programs in O(\sqrt{\textup{rank}}) iterations and faster algorithms for maximum flow. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pages 424–433. IEEE, 2014.
  • [LS19] Yin Tat Lee and Aaron Sidford. Solving linear programs with \sqrt{\textup{rank}} linear system solves. arXiv preprint arXiv:1910.08033, 2019.
  • [LWCW21] Tongyang Li, Chunhao Wang, Shouvanik Chakrabarti, and Xiaodi Wu. Sublinear classical and quantum algorithms for general matrix games. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 8465–8473, 2021.
  • [Nem04] Arkadi Nemirovski. Prox-method with rate of convergence O(1/t){O}(1/t) for variational inequalities with lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004.
  • [Nes07] Yurii Nesterov. Dual extrapolation and its applications to solving variational inequalities and related problems. Mathematical Programming, 109(2-3):319–344, 2007.
  • [Neu28] John von Neumann. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100:295–320, 1928.
  • [NJLS09] Arkadi Nemirovski, Anatoli B. Juditsky, Guanghui Lan, and Alexander Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM J. Optim., 19(4):1574–1609, 2009.
  • [Ren88] James Renegar. A polynomial-time algorithm, based on newton’s method, for linear programming. Mathematical programming, 40(1):59–93, 1988.
  • [RML14] Patrick Rebentrost, Masoud Mohseni, and Seth Lloyd. Quantum support vector machine for big data classification. Physical review letters, 113(13):130503, 2014.
  • [SV14] Sushant Sachdeva and Nisheeth K. Vishnoi. Faster algorithms via approximation theory. Found. Trends Theor. Comput. Sci., 9(2):125–210, 2014.
  • [Tan19] Ewin Tang. A quantum-inspired classical algorithm for recommendation systems. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 217–228, 2019.
  • [vAG19] Joran van Apeldoorn and András Gilyén. Quantum algorithms for zero-sum games. CoRR, abs/1904.03180, 2019.
  • [vAGGdW20a] Joran van Apeldoorn, András Gilyén, Sander Gribling, and Ronald de Wolf. Convex optimization using quantum oracles. Quantum, 4:220, 2020.
  • [vAGGdW20b] Joran van Apeldoorn, András Gilyén, Sander Gribling, and Ronald de Wolf. Quantum SDP-solvers: Better upper and lower bounds. Quantum, 4:230, 2020.
  • [vdB20] Jan van den Brand. A deterministic linear program solver in current matrix multiplication time. In Proceedings of the Thirty-first Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, 2020, pages 259–278, 2020.
  • [vdBLL+21] Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, and Di Wang. Minimum cost flows, mdps, and 1\ell_{1}-regression in nearly linear time for dense instances. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2021, 2021, pages 859–869, 2021.
  • [vdBLSS20] Jan van den Brand, Yin Tat Lee, Aaron Sidford, and Zhao Song. Solving tall dense linear programs in nearly linear time. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 775–788, 2020.

Appendix A Solving matrix games with a Gibbs sampling oracle

In this section, we prove Proposition 1, which shows how to solve a zero-sum matrix game using an approximate Gibbs sampling oracle (via Algorithm 1). To briefly motivate the algorithm we use and our proof of its guarantees, we recall the problem we consider is of the form

minvΔnmaxuΔmf(u,v):=u𝐀v,where𝐀max1,\begin{gathered}\min_{v\in\Delta^{n}}\max_{u\in\Delta^{m}}f(u,v):=u^{\top}\mathbf{A}v,\leavevmode\nobreak\ \leavevmode\nobreak\ \text{where}\leavevmode\nobreak\ \leavevmode\nobreak\ \left\lVert\mathbf{A}\right\rVert_{\max}\leq 1,\end{gathered} (5)

and we define the associated gradient operator as

g(u,v)=(𝐀v,𝐀u).\displaystyle g(u,v)=(-\mathbf{A}v,\mathbf{A}^{\top}u). (6)

Taking (stochastic) mirror descent steps using the gradient operator (6) is well-known to yield an approximate NE to the matrix game (5) [Bub15]. We show that an approximate implementation of this strategy, combined with appropriate subsampling, efficiently yields an approximate NE; a classical sketch of the strategy follows.
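The sketch below is a dense classical reference implementation consistent with the iterate definitions in Proposition 3, assuming exact Gibbs samples (\delta=0) and materializing u_{t}, v_{t} explicitly at O(mn) cost per iteration; the point of this paper is precisely to avoid that cost.

import numpy as np

def sampled_mirror_descent(A, eps, rng=np.random.default_rng(0)):
    # Maintains x_t, y_t so that u_t ∝ exp(A y_t) and v_t ∝ exp(-A^T x_t),
    # as in Proposition 3, with exact sampling in place of a Gibbs oracle.
    m, n = A.shape
    eta = eps / 15
    T = int(np.ceil(6 * np.log(m * n) / (eta * eps)))
    x, y = np.zeros(m), np.zeros(n)
    u_bar, v_bar = np.zeros(m), np.zeros(n)
    for _ in range(T):
        u = np.exp(A @ y); u /= u.sum()       # u_t, row player's Gibbs iterate
        v = np.exp(-A.T @ x); v /= v.sum()    # v_t, column player's Gibbs iterate
        u_bar += u / T
        v_bar += v / T
        i = rng.choice(m, p=u)                # i_t ~ u_t
        j = rng.choice(n, p=v)                # j_t ~ v_t
        x[i] += eta                           # so -A^T x gains -eta * A_{i_t,:}
        y[j] += eta                           # so A y gains eta * A_{:,j_t}
    return u_bar, v_bar                       # the averages of Proposition 3

We begin by making the following observation.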

Lemma 5.

Let u,u~Δmu,\tilde{u}\in\Delta^{m} have uu~1δ\left\lVert u-\tilde{u}\right\rVert_{1}\leq\delta. Let g~:=𝐀i:\tilde{g}:=\mathbf{A}_{i:} where iu~i\sim\tilde{u}, and g:=𝐀ug:=\mathbf{A}^{\top}u. Then, g𝔼g~δ\left\lVert g-\mathbb{E}\tilde{g}\right\rVert_{\infty}\leq\delta.

Proof.

Note that 𝔼g~=𝐀u~\mathbb{E}\tilde{g}=\mathbf{A}^{\top}\tilde{u}, and 𝐀(uu~)uu~1δ\left\lVert\mathbf{A}^{\top}(u-\tilde{u})\right\rVert_{\infty}\leq\left\lVert u-\tilde{u}\right\rVert_{1}\leq\delta since 𝐀max1\left\lVert\mathbf{A}\right\rVert_{\max}\leq 1. ∎

We next present a variant of the classical mirror descent analysis, which bounds the expected approximation quality of iterates of Algorithm 1 prior to subsampling.

Proposition 3.

Let δϵ20\delta\leq\frac{\epsilon}{20}, η=ϵ15\eta=\frac{\epsilon}{15} and T6log(mn)ηϵT\geq\frac{6\log(mn)}{\eta\epsilon} in Algorithm 1. Let the iterates of Algorithm 1 be {xt,yt}t=0T1\{x_{t},y_{t}\}_{t=0}^{T-1}, and denote ut:=exp(𝐀yt)exp(𝐀yt)1u_{t}:=\frac{\exp(\mathbf{A}y_{t})}{\left\lVert\exp(\mathbf{A}y_{t})\right\rVert_{1}}, vt:=exp(𝐀xt)exp(𝐀xt)1v_{t}:=\frac{\exp(-\mathbf{A}^{\top}x_{t})}{\left\lVert\exp(-\mathbf{A}^{\top}x_{t})\right\rVert_{1}} for all 0t<T0\leq t<T. For (u¯,v¯):=1Tt=0T1(ut,vt)(\bar{u},\bar{v}):=\frac{1}{T}\sum_{t=0}^{T-1}(u_{t},v_{t}), we have

𝔼[maxuΔmu𝐀v¯minvΔnu¯𝐀v]ϵ.\displaystyle\mathbb{E}\left[\max_{u\in\Delta^{m}}u^{\top}\mathbf{A}\bar{v}-\min_{v\in\Delta^{n}}\bar{u}^{\top}\mathbf{A}v\right]\leq\epsilon. (7)
Proof.

By definition of the updates, at every iteration 0tT10\leq t\leq T-1, we have

ut+1\displaystyle u_{t+1} =argminuΔm{η𝐀:jt,u+i[m][u]ilog[u]i[ut]i},\displaystyle=\textup{argmin}_{u\in\Delta^{m}}\left\{\eta\langle-\mathbf{A}_{:j_{t}},u\rangle+\sum_{i\in[m]}[u]_{i}\log\frac{[u]_{i}}{[u_{t}]_{i}}\right\},
vt+1\displaystyle v_{t+1} =argminvΔn{η𝐀it:,v+j[n][v]jlog[v]j[vt]j}.\displaystyle=\textup{argmin}_{v\in\Delta^{n}}\left\{\eta\langle\mathbf{A}_{i_{t}:},v\rangle+\sum_{j\in[n]}[v]_{j}\log\frac{[v]_{j}}{[v_{t}]_{j}}\right\}.

Consequently, by the optimality conditions of ut+1u_{t+1} and vt+1v_{t+1} respectively, we have for any uΔmu\in\Delta^{m}, vΔnv\in\Delta^{n}, and letting Vx(x):=k[x]klog[x]k[x]kV_{x}(x^{\prime}):=\sum_{k}[x^{\prime}]_{k}\log\frac{[x^{\prime}]_{k}}{[x]_{k}} be the KL divergence between simplex variables of appropriate dimension,

𝐀:j,utu+𝐀i:,vtv\displaystyle\left\langle-\mathbf{A}_{:j},u_{t}-u\right\rangle+\left\langle\mathbf{A}_{i:},v_{t}-v\right\rangle 1η(Vut(u)Vut+1(u)+Vvt(v)Vvt+1(v))\displaystyle\leq\frac{1}{\eta}\left(V_{u_{t}}(u)-V_{u_{t+1}}(u)+V_{v_{t}}(v)-V_{v_{t+1}}(v)\right) (8)
+(𝐀:j,utut+11ηVut(ut+1))\displaystyle+\left(\left\langle-\mathbf{A}_{:j},u_{t}-u_{t+1}\right\rangle-\frac{1}{\eta}V_{u_{t}}(u_{t+1})\right)
+(𝐀i:,vtvt+11ηVvt(vt+1))\displaystyle+\left(\left\langle\mathbf{A}_{i:},v_{t}-v_{t+1}\right\rangle-\frac{1}{\eta}V_{v_{t}}(v_{t+1})\right)
1η(Vut(u)Vut+1(u)+Vvt(v)Vvt+1(v))\displaystyle\leq\frac{1}{\eta}\left(V_{u_{t}}(u)-V_{u_{t+1}}(u)+V_{v_{t}}(v)-V_{v_{t+1}}(v)\right)
+η2𝐀:j2+η2𝐀i:2,\displaystyle+\frac{\eta}{2}\left\lVert\mathbf{A}_{:j}\right\rVert_{\infty}^{2}+\frac{\eta}{2}\left\lVert\mathbf{A}_{i:}\right\rVert_{\infty}^{2},

where for the last inequality we use Hölder’s inequality and the fact that VV is 11-strongly convex in the 1\ell_{1} norm (by Pinsker’s inequality). Averaging the above for 0t<T0\leq t<T, and denoting wt:=(ut,vt)w_{t}:=(u_{t},v_{t}) and g~t:=(𝐀:jt,𝐀it:)\tilde{g}_{t}:=(-\mathbf{A}_{:j_{t}},\mathbf{A}_{i_{t}:}), we obtain for any w=(u,v)Δm×Δnw=(u,v)\in\Delta^{m}\times\Delta^{n},

1Tt=0T1g~t,wtw1ηT(Vu0(u)+Vv0(v))+η.\frac{1}{T}\sum_{t=0}^{T-1}\langle\tilde{g}_{t},w_{t}-w\rangle\leq\frac{1}{\eta T}\left(V_{u_{0}}(u)+V_{v_{0}}(v)\right)+\eta. (9)

In the above, we further recalled the bound 𝐀max1\left\lVert\mathbf{A}\right\rVert_{\max}\leq 1 by assumption. In order to bound the deviation of the left-hand side from its expectation, we use a “ghost iterate” argument following [NJLS09, CJST19]. In particular, we define iterates u~t\tilde{u}_{t}, v~t\tilde{v}_{t} as follows: let u~0u0\tilde{u}_{0}\leftarrow u_{0}, v~0v0\tilde{v}_{0}\leftarrow v_{0}, and then for each 0t<T0\leq t<T, define

\tilde{u}_{t+1} :=\textup{argmin}_{u\in\Delta^{m}}\left\{\eta\langle-\mathbf{A}v_{t}+\mathbf{A}_{:j_{t}},u\rangle+\sum_{i\in[m]}[u]_{i}\log\frac{[u]_{i}}{[\tilde{u}_{t}]_{i}}\right\},
\tilde{v}_{t+1} :=\textup{argmin}_{v\in\Delta^{n}}\left\{\eta\langle\mathbf{A}^{\top}u_{t}-\mathbf{A}_{i_{t}:},v\rangle+\sum_{j\in[n]}[v]_{j}\log\frac{[v]_{j}}{[\tilde{v}_{t}]_{j}}\right\},

where i_{t},j_{t} above are the same coordinates as were used in defining the updates to u_{t+1} and v_{t+1}. By an analogous bound to (8), where we note \left\lVert\mathbf{A}_{:j_{t}}-\mathbf{A}v_{t}\right\rVert_{\infty},\left\lVert\mathbf{A}^{\top}u_{t}-\mathbf{A}_{i_{t}:}\right\rVert_{\infty}\leq 2,

\left\langle-\mathbf{A}v_{t}+\mathbf{A}_{:j_{t}},\tilde{u}_{t}-u\right\rangle+\left\langle\mathbf{A}^{\top}u_{t}-\mathbf{A}_{i_{t}:},\tilde{v}_{t}-v\right\rangle \leq\frac{1}{\eta}\left(V_{\tilde{u}_{t}}(u)-V_{\tilde{u}_{t+1}}(u)+V_{\tilde{v}_{t}}(v)-V_{\tilde{v}_{t+1}}(v)\right)+4\eta.

Averaging the above for 0\leq t<T, and denoting \tilde{w}_{t}:=(\tilde{u}_{t},\tilde{v}_{t}) and g_{t}:=g(w_{t}) (see (6)), we obtain for any w=(u,v)\in\Delta^{m}\times\Delta^{n},

\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t}-\tilde{g}_{t},\tilde{w}_{t}-w\rangle\leq\frac{1}{\eta T}\left(V_{u_{0}}(u)+V_{v_{0}}(v)\right)+4\eta. (10)

Summing inequalities (9) and (10), and maximizing over w=(u,v)Δm×Δnw=(u,v)\in\Delta^{m}\times\Delta^{n}, we have

\max_{w\in\Delta^{m}\times\Delta^{n}}\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t},w_{t}-w\rangle \leq\max_{u\in\Delta^{m},v\in\Delta^{n}}\frac{2}{\eta T}\left(V_{u_{0}}(u)+V_{v_{0}}(v)\right) (11)
+5\eta+\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t}-\tilde{g}_{t},w_{t}-\tilde{w}_{t}\rangle.

Taking expectations over the above, we have

\mathbb{E}\left[\max_{w\in\Delta^{m}\times\Delta^{n}}\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t},w_{t}-w\rangle\right] \leq\max_{u\in\Delta^{m},v\in\Delta^{n}}\frac{2}{\eta T}\left[V_{u_{0}}(u)+V_{v_{0}}(v)\right]
+5\eta+\mathbb{E}\left[\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t}-\tilde{g}_{t},w_{t}-\tilde{w}_{t}\rangle\right]
\stackrel{(i)}{\leq}\frac{2\log(mn)}{\eta T}+5\eta+\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t}-\mathbb{E}\tilde{g}_{t},w_{t}-\tilde{w}_{t}\rangle,
\stackrel{(ii)}{\leq}\frac{2\log(mn)}{\eta T}+5\eta+4\delta\stackrel{(iii)}{\leq}\epsilon.

In the above, (i) used the diameter bound of the KL divergence from the uniform distribution, i.e. \max_{u\in\Delta^{m}}V_{u_{0}}(u)=\log m (and a similar bound for V_{v_{0}}(v)). Further, (ii) uses that conditioned on the randomness through iteration t, w_{t} and \tilde{w}_{t} are fixed while \left\lVert g_{t}-\mathbb{E}\tilde{g}_{t}\right\rVert_{\infty}\leq\delta by the approximate Gibbs sampler guarantee (via Lemma 5), together with Hölder's inequality and \left\lVert w_{t}-\tilde{w}_{t}\right\rVert_{1}\leq 4. Finally, (iii) uses our choices of T, \eta, and \delta.

Finally, we note that the desired claim follows by linearity: for any w=(u,v)w=(u,v),

1Tt=0T1gt,wtw\displaystyle\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t},w_{t}-w\rangle =g(1Tt=0T1wt),1Tt=0T1wtw\displaystyle=\left\langle g\left(\frac{1}{T}\sum_{t=0}^{T-1}w_{t}\right),\frac{1}{T}\sum_{t=0}^{T-1}w_{t}-w\right\rangle
=u𝐀v¯u¯𝐀v.\displaystyle=u^{\top}\mathbf{A}\bar{v}-\bar{u}^{\top}\mathbf{A}v.

By using a simple martingale argument (inspired by those in [AL17, CDST19]) to bound the error term in (11), we show that the guarantee of Proposition 3 holds with high probability.

Corollary 3.

Let α(0,1)\alpha\in(0,1), and let δϵ20\delta\leq\frac{\epsilon}{20}, η=ϵ20\eta=\frac{\epsilon}{20} and T8log(mn)ηϵ+2048log1αϵ2T\geq\frac{8\log(mn)}{\eta\epsilon}+\frac{2048\log\frac{1}{\alpha}}{\epsilon^{2}} in Algorithm 1. Then with probability at least 1α1-\alpha, following notation of Proposition 3, (u¯,v¯)(\bar{u},\bar{v}) are an ϵ\epsilon-approximate NE for 𝐀\mathbf{A}.

Proof.

Consider the filtration given by \mathcal{F}_{t}=\sigma(u_{0},v_{0},\tilde{g}_{0},\cdots,\tilde{g}_{t},u_{t+1},v_{t+1}). We will bound the term \sum_{t=0}^{T-1}\langle g_{t}-\tilde{g}_{t},w_{t}-\tilde{w}_{t}\rangle in (11). To do so, we define a martingale difference sequence of the form D_{t}:=\langle g_{t}-\tilde{g}_{t},w_{t}-\tilde{w}_{t}\rangle-\langle g_{t}-\mathbb{E}\left[\tilde{g}_{t}|\mathcal{F}_{t-1}\right],w_{t}-\tilde{w}_{t}\rangle, which is adapted to the filtration \mathcal{F}_{t}. We first note that |D_{t}|\leq\left\lVert\tilde{g}_{t}-\mathbb{E}\left[\tilde{g}_{t}|\mathcal{F}_{t-1}\right]\right\rVert_{\infty}\left\lVert w_{t}-\tilde{w}_{t}\right\rVert_{1}\leq 8 with probability 1. Consequently, applying the Azuma-Hoeffding inequality yields

t=0T1Dt128Tlog1αwith probability1α.\displaystyle\sum_{t=0}^{T-1}D_{t}\leq\sqrt{128T\log\frac{1}{\alpha}}\leavevmode\nobreak\ \text{with probability}\geq 1-\alpha.

Plugging this back into (11) and using the KL divergence range bound, Lemma 5 with our definition of 𝒪gibbs\mathcal{O}^{\textup{gibbs}}, and choices of parameters, we thus have with probability 1α1-\alpha,

maxwΔm×Δn1Tt=0T1gt,wtw2logmnηT+5η+4δ+128log1αTϵ.\displaystyle\max_{w\in\Delta^{m}\times\Delta^{n}}\frac{1}{T}\sum_{t=0}^{T-1}\langle g_{t},w_{t}-w\rangle\leq\frac{2\log mn}{\eta T}+5\eta+4\delta+\sqrt{\frac{128\log\frac{1}{\alpha}}{T}}\leq\epsilon. (12)

The remainder of the proof follows analogously to Proposition 3. ∎

The Gibbs sampling oracles implicitly maintain access to u_{t}\propto\exp(\mathbf{A}y_{t}) and v_{t}\propto\exp(-\mathbf{A}^{\top}x_{t}), whose averages give (\bar{u},\bar{v})=\frac{1}{T}\sum_{t=0}^{T-1}(u_{t},v_{t}), the approximate equilibrium guaranteed by Corollary 3. To turn these implicitly maintained iterates into an actual classical output, we subsample the iterates. Below we formally show that one can take the empirical average of independent samples from distributions close to \bar{u} and \bar{v} and obtain an approximate equilibrium (with the same approximation factor up to constant factors) with high probability; a classical sketch of this subsampling step follows.
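The sketch below assumes the approximate distributions \tilde{u}_{t},\tilde{v}_{t} are given explicitly as arrays (in the algorithm they are only accessed through sampling oracles); it illustrates why the output is a sparse classical vector.

import numpy as np

def subsample_equilibrium(u_iterates, v_iterates, rng=np.random.default_rng(0)):
    # Draw one coordinate per iterate from the (approximate) distributions
    # and return the empirical averages (u-hat, v-hat) of Lemma 6, which
    # are sparse classical vectors with at most T nonzero entries each.
    T = len(u_iterates)
    m, n = len(u_iterates[0]), len(v_iterates[0])
    u_hat, v_hat = np.zeros(m), np.zeros(n)
    for u_t, v_t in zip(u_iterates, v_iterates):
        u_hat[rng.choice(m, p=u_t)] += 1.0 / T   # e_{i_t} with i_t ~ u_t
        v_hat[rng.choice(n, p=v_t)] += 1.0 / T   # e_{j_t} with j_t ~ v_t
    return u_hat, v_hat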

Lemma 6.

Suppose \bar{u}=\frac{1}{T}\sum_{t=0}^{T-1}u_{t} for \{u_{t}\}_{t=0}^{T-1}\subset\Delta^{m} and \bar{v}=\frac{1}{T}\sum_{t=0}^{T-1}v_{t} for \{v_{t}\}_{t=0}^{T-1}\subset\Delta^{n} are an \epsilon-approximate NE for \mathbf{A}. Further suppose that for some \delta\in(0,1), \{\tilde{u}_{t}\}_{t=0}^{T-1}\subset\Delta^{m} and \{\tilde{v}_{t}\}_{t=0}^{T-1}\subset\Delta^{n} satisfy \left\lVert\tilde{u}_{t}-u_{t}\right\rVert_{1}\leq\delta and \left\lVert\tilde{v}_{t}-v_{t}\right\rVert_{1}\leq\delta for all 0\leq t<T. Let \hat{u}=\frac{1}{T}\sum_{t=0}^{T-1}e_{i_{t}} where each e_{i_{t}}\in\mathbb{R}^{m} is sampled independently according to \tilde{u}_{t}; similarly, let \hat{v}=\frac{1}{T}\sum_{t=0}^{T-1}e_{j_{t}} where each e_{j_{t}}\in\mathbb{R}^{n} is sampled independently according to \tilde{v}_{t}. Suppose T\geq\frac{16\log\frac{mn}{\alpha}}{\epsilon^{2}}. Then with probability at least 1-\alpha, (\hat{u},\hat{v}) are a (2\epsilon+2\delta)-approximate NE for \mathbf{A}.

Proof.

First, let u~avg=1Tt=0T1u~t\tilde{u}_{\textup{avg}}=\frac{1}{T}\sum_{t=0}^{T-1}\tilde{u}_{t} and v~avg=1Tt=0T1v~t\tilde{v}_{\textup{avg}}=\frac{1}{T}\sum_{t=0}^{T-1}\tilde{v}_{t}. By convexity of norms, we have u~avgu¯1δ\left\lVert\tilde{u}_{\textup{avg}}-\bar{u}\right\rVert_{1}\leq\delta and v~avgv¯1δ\left\lVert\tilde{v}_{\textup{avg}}-\bar{v}\right\rVert_{1}\leq\delta, and hence under the NE approximation guarantee of (u¯,v¯)(\bar{u},\bar{v}) and Hölder’s inequality,

\max_{u\in\Delta^{m}}u^{\top}\mathbf{A}\tilde{v}_{\textup{avg}}-\min_{v\in\Delta^{n}}\tilde{u}_{\textup{avg}}^{\top}\mathbf{A}v\leq\epsilon+2\delta.

Let zz be a fixed vector in [1,1]n[-1,1]^{n}. By Hoeffding’s inequality, since each random variable z,ejt\left\langle z,e_{j_{t}}\right\rangle lies in the range [1,1][-1,1] and 𝔼v^=v~avg\mathbb{E}\hat{v}=\tilde{v}_{\textup{avg}}, we have that

Pr[|z,v^v~avg|ϵ2]2exp(Tϵ28)αm+n.\Pr\left[\left|\left\langle z,\hat{v}-\tilde{v}_{\textup{avg}}\right\rangle\right|\geq\frac{\epsilon}{2}\right]\leq 2\exp\left(-\frac{T\epsilon^{2}}{8}\right)\leq\frac{\alpha}{m+n}. (13)

Next, note that maxuΔmu𝐀v~avg\max_{u\in\Delta^{m}}u^{\top}\mathbf{A}\tilde{v}_{\textup{avg}} is achieved by a basis vector u=eiu=e_{i}. Hence, applying a union bound over (13) for all z=𝐀i:z=\mathbf{A}_{i:} shows that with probability at least 1αmm+n1-\frac{\alpha m}{m+n},

maxuΔmu𝐀v^maxuΔmu𝐀v~avg+ϵ2.\max_{u\in\Delta^{m}}u^{\top}\mathbf{A}\hat{v}\leq\max_{u\in\Delta^{m}}u^{\top}\mathbf{A}\tilde{v}_{\textup{avg}}+\frac{\epsilon}{2}.

By symmetry, with probability at least 1αnm+n1-\frac{\alpha n}{m+n},

minvΔnu^𝐀vminvΔnu~avg𝐀vϵ2.\min_{v\in\Delta^{n}}\hat{u}^{\top}\mathbf{A}v\geq\min_{v\in\Delta^{n}}\tilde{u}_{\textup{avg}}^{\top}\mathbf{A}v-\frac{\epsilon}{2}.

The conclusion follows from a union bound, and combining the above three displays. ∎

Finally, we put these pieces together to give a complete guarantee.

See Proposition 1

Proof.

We follow notation of Proposition 3. By applying Corollary 3 (up to constant factors), we have that with probability at least 1α21-\frac{\alpha}{2}, u¯:=1Tt=0T1ut\bar{u}:=\frac{1}{T}\sum_{t=0}^{T-1}u_{t} and v¯:=1Tt=0T1vt\bar{v}:=\frac{1}{T}\sum_{t=0}^{T-1}v_{t} satisfy

maxuΔmu𝐀v¯minvΔnu¯𝐀vϵ3.\max_{u\in\Delta^{m}}u^{\top}\mathbf{A}\bar{v}-\min_{v\in\Delta^{n}}\bar{u}^{\top}\mathbf{A}v\leq\frac{\epsilon}{3}.

Finally, Lemma 6 (with failure probability α2\frac{\alpha}{2}) and a union bound yields the desired conclusion. ∎

Appendix B Quantum rejection sampling with a hint

In this section, we prove Proposition 2, which gives a dynamic quantum rejection sampling subroutine and bounds its implementation cost. Our result is an extension of analogous developments in [vAG19], but is stated more generally to allow for the use of an appropriate “hint” vector in the rejection sampling procedure. We build up to our main result in several pieces.

Amplitude amplification.

First, for a quantum decision algorithm which applies a unitary \mathbf{U} and then measures, yielding an accepting state with probability \alpha, quantum amplitude amplification [BHMT02] shows we can apply \mathbf{U} roughly \alpha^{-\frac{1}{2}} times to obtain an accepting state with high probability.

Proposition 4 (Theorem 3, [BHMT02]).

Let S\subseteq\{0,1\}^{s}, let \mathbf{U} be an s-qubit quantum oracle, and let \alpha be the probability that measuring the result of applying \mathbf{U} yields an accepting state. There is a (quantum) algorithm using O(\alpha^{-\frac{1}{2}}\log\frac{1}{\delta}) queries to \mathbf{U} and O(\log s\log\frac{1}{\delta}) additional time that returns some s\in S with probability \geq 1-\delta.
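As a numerical illustration (standard Grover-type rotation arithmetic, not the [BHMT02] construction itself): after k rounds of amplification the acceptance probability is \sin^{2}((2k+1)\theta) for \theta=\arcsin(\sqrt{\alpha}), so roughly \frac{\pi}{4}\alpha^{-\frac{1}{2}} rounds suffice, versus \Theta(\alpha^{-1}) classical repetitions.

import numpy as np

def amplified_success(alpha, k):
    # Acceptance probability after k amplification rounds, starting from
    # base acceptance probability alpha = sin^2(theta).
    theta = np.arcsin(np.sqrt(alpha))
    return np.sin((2 * k + 1) * theta) ** 2

alpha = 1e-4
k = int(np.pi / (4 * np.arcsin(np.sqrt(alpha))))  # about (pi/4) * alpha^{-1/2}
print(k, amplified_success(alpha, k))             # 78 rounds; probability ~ 1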

Loading from trees.

Given a dynamic vector x0mx\in\mathbb{R}^{m}_{\geq 0} which is supported in an appropriate efficient data structure 𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾\mathsf{SamplerTree} (see Lemma 1), and a known bound βx1\beta\geq\left\lVert x\right\rVert_{1}, we recall a result of [GR02] which allows us to form a superposition of the entries in xx (suitably rescaled).

Lemma 7.

Let x0mx\in\mathbb{R}^{m}_{\geq 0} correspond to an instance of 𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾\mathsf{SamplerTree}, and βx1\beta\geq\left\lVert x\right\rVert_{1}. We can maintain a quantum oracle 𝒪𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾\mathcal{O}_{\mathsf{SamplerTree}} which takes O(logm)O(\log m) time to apply, such that the total cost of building 𝒪𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾\mathcal{O}_{\mathsf{SamplerTree}} after TT calls to 𝖴𝗉𝖽𝖺𝗍𝖾\mathsf{Update} is O(Tlogm)O(T\log m), and

𝒪𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾|0(a+1)=i[m]xiβ|0|i+1x1β|1|g.\mathcal{O}_{\mathsf{SamplerTree}}|0\rangle^{\otimes(a+1)}=\sum_{i\in[m]}\sqrt{\frac{x_{i}}{\beta}}|0\rangle|i\rangle+\sqrt{1-\frac{\left\lVert x\right\rVert_{1}}{\beta}}|1\rangle|g\rangle.
Proof.

This is implicit in [GR02]. We first apply a 1-qubit gate to condition on selecting from the tree (with probability \frac{\left\lVert x\right\rVert_{1}}{\beta}), and then apply the [GR02] procedure conditioned on the first qubit being |0\rangle, which controls for one qubit at a time while propagating subtree sums (provided by \mathsf{SamplerTree} via \mathsf{SubtreeSum}). The cost to build the circuit follows because on an \mathsf{Update} we need to change the gates corresponding to the relevant leaf-to-root path. ∎
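For reference, the following is a minimal classical sketch of the \mathsf{SamplerTree} interface assumed above; the class is a hypothetical stand-in for the data structure of Lemma 1, and the quantum loader of Lemma 7 consumes exactly these subtree sums.

class SamplerTree:
    # A complete binary tree over m leaves storing nonnegative values,
    # supporting O(log m) Update and O(1) SubtreeSum queries.
    def __init__(self, m):
        self.size = 1
        while self.size < m:
            self.size *= 2
        self.tree = [0.0] * (2 * self.size)  # tree[1] is the root

    def update(self, i, value):
        # Set x_i := value, adjusting sums along the leaf-to-root path.
        node = self.size + i
        delta = value - self.tree[node]
        while node >= 1:
            self.tree[node] += delta
            node //= 2

    def subtree_sum(self, node):
        # Sum of all leaf values in the subtree rooted at `node`.
        return self.tree[node]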

Corollary 4.

Let x0mx\in\mathbb{R}^{m}_{\geq 0} correspond to an instance of 𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾\mathsf{SamplerTree}, and let βx1\beta\geq\left\lVert x\right\rVert_{1}, and suppose 𝐀m×n\mathbf{A}\in\mathbb{R}^{m\times n} has 𝐀max1\left\lVert\mathbf{A}\right\rVert_{\max}\leq 1. We can maintain a quantum oracle 𝒪𝐀x\mathcal{O}_{\mathbf{A}^{\top}x} which takes O(logm)O(\log m) time to apply, with total building cost O(Tlogm)O(T\log m) after TT calls to 𝖴𝗉𝖽𝖺𝗍𝖾\mathsf{Update}, such that for any j[n]j\in[n],

𝒪𝐀x|0(a+2)|j=|0(i[m]𝐀ijxiβ|0|i+|1|g)|j.\mathcal{O}_{\mathbf{A}^{\top}x}|0\rangle^{\otimes(a+2)}|j\rangle=|0\rangle\left(\sum_{i\in[m]}\sqrt{\frac{\mathbf{A}_{ij}x_{i}}{\beta}}|0\rangle|i\rangle+|1\rangle|g\rangle\right)|j\rangle.
Proof.

We apply 𝒪𝐀\mathcal{O}_{\mathbf{A}}^{\prime} (see Section 2) to the output of 𝒪𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾\mathcal{O}_{\mathsf{SamplerTree}}, ignoring the additional qubit. ∎

We remark here that the additional qubit in Corollary 4 will shortly become useful in constructing an appropriate block-encoding of a scaling of diag(𝐀x)\textbf{{diag}}\left(\mathbf{A}^{\top}x\right).

Polynomial approximation.

In order to give approximate Gibbs samplers for the types of dynamic vectors Algorithm 1 encounters, we further require some tools from polynomial approximation theory. We first state a helper result on boundedly approximating the exponential, a variant of which was also used in [vAG19]. We provide a proof in Appendix C.

Lemma 8 (Lemma 7, [vAG19]).

Let β1\beta\geq 1, ξ110\xi\leq\frac{1}{10}. There is a polynomial 𝒫β,ξ\mathcal{P}_{\beta,\xi} of degree O(βlog1ξ)O(\beta\log\frac{1}{\xi}) such that maxx[1,1]|𝒫β,ξ(x)|3\max_{x\in[-1,1]}|\mathcal{P}_{\beta,\xi}(x)|\leq 3 and maxx[1,0]|𝒫β,ξ(x)exp(βx)|ξ\max_{x\in[-1,0]}|\mathcal{P}_{\beta,\xi}(x)-\exp(\beta x)|\leq\xi.

Next, we state a further corollary of Lemma 8 to be used in our rejection sampler.

Corollary 5.

Let B,δ0B,\delta\geq 0 and suppose vnv\in\mathbb{R}^{n} has vB\left\lVert v\right\rVert_{\infty}\leq B. Further, suppose for some c0c\geq 0, cmaxj[n]vj0-c\leq\max_{j\in[n]}v_{j}\leq 0. Let q0nq\in\mathbb{R}^{n}_{\geq 0} satisfy qj[,1]q_{j}\in[\ell,1] entrywise. Finally, define uj:=vj2Bu_{j}:=\frac{v_{j}}{2B} entrywise. There is a degree-Δ\Delta polynomial 𝒫\mathcal{P}, for Δ=O(B(c+lognδ))\Delta=O(B\cdot(c+\log\frac{n}{\ell\delta})), such that for wj:=𝒫(uj)2qjw_{j}:=\mathcal{P}(u_{j})^{2}q_{j} and zj:=exp(2Buj)qjz_{j}:=\exp(2Bu_{j})q_{j} entrywise,

ww1zz11δ.\left\lVert\frac{w}{\left\lVert w\right\rVert_{1}}-\frac{z}{\left\lVert z\right\rVert_{1}}\right\rVert_{1}\leq\delta. (14)

Moreover, maxx[1,1]|𝒫(x)|12\max_{x\in[-1,1]}|\mathcal{P}(x)|\leq\frac{1}{2}, and w11δ36z1\|w\|_{1}\geq\frac{1-\delta}{36}\|z\|_{1}.

Proof.

Assume δ2\delta\leq 2 else the statement is clearly true. First, uj[12,0]u_{j}\in[-\frac{1}{2},0] entrywise by the stated assumptions (since vj[B,0]v_{j}\in[-B,0] entrywise). Let 𝒫β,ξ()\mathcal{P}_{\beta,\xi}(\cdot) be the polynomial given by Lemma 8 which ξ\xi-approximates exp(β)\exp(\beta\cdot) on [12,0][-\frac{1}{2},0]. We define

𝒫(u):=16𝒫B,ξ(u), for ξ:=δ6nexp(c).\mathcal{P}(u):=\frac{1}{6}\mathcal{P}_{B,\xi}\left(u\right),\text{ for }\xi:=\frac{\delta\ell}{6n\exp(c)}.

The degree bound and absolute value bound of this polynomial follow immediately from Lemma 8, so it remains to show the distance bound. The guarantees of Lemma 8 then imply for all j\in[n],

|6𝒫(uj)exp(Buj)|ξ.\left|6\mathcal{P}(u_{j})-\exp\left(Bu_{j}\right)\right|\leq\xi. (15)

We further have that uj0u_{j}\leq 0, so exp(Buj)1\exp(Bu_{j})\leq 1. Hence, we also have

|6𝒫(uj)+exp(Buj)|2+ξ3.\left|6\mathcal{P}(u_{j})+\exp\left(Bu_{j}\right)\right|\leq 2+\xi\leq 3.

Combining yields for all j[n]j\in[n],

|36𝒫(uj)2exp(2Buj)|3ξ.\left|36\mathcal{P}(u_{j})^{2}-\exp\left(2Bu_{j}\right)\right|\leq 3\xi. (16)

Next, let yj:=36wjy_{j}:=36w_{j} for all j[n]j\in[n], and note that yy1=ww1\frac{y}{\|y\|_{1}}=\frac{w}{\|w\|_{1}}. We bound

ww1zz11\displaystyle\left\lVert\frac{w}{\left\lVert w\right\rVert_{1}}-\frac{z}{\left\lVert z\right\rVert_{1}}\right\rVert_{1} =j[n]|yjy1zjz1|j[n]|yjy1yjz1|+j[n]|yjz1zjz1|\displaystyle=\sum_{j\in[n]}\left|\frac{y_{j}}{\left\lVert y\right\rVert_{1}}-\frac{z_{j}}{\left\lVert z\right\rVert_{1}}\right|\leq\sum_{j\in[n]}\left|\frac{y_{j}}{\left\lVert y\right\rVert_{1}}-\frac{y_{j}}{\left\lVert z\right\rVert_{1}}\right|+\sum_{j\in[n]}\left|\frac{y_{j}}{\left\lVert z\right\rVert_{1}}-\frac{z_{j}}{\left\lVert z\right\rVert_{1}}\right| (17)
|1y1z1|+yz1z12yz1z1.\displaystyle\leq\left|1-\frac{\left\lVert y\right\rVert_{1}}{\left\lVert z\right\rVert_{1}}\right|+\frac{\left\lVert y-z\right\rVert_{1}}{\left\lVert z\right\rVert_{1}}\leq\frac{2\left\lVert y-z\right\rVert_{1}}{\left\lVert z\right\rVert_{1}}.

By using the definitions of y,zy,z and (16), as well as the assumed ranges on qq,

yz13nξ,z1exp(c).\displaystyle\left\lVert y-z\right\rVert_{1}\leq 3n\xi,\;\left\lVert z\right\rVert_{1}\geq\ell\exp(-c).

The second inequality used that some vj=2Bujv_{j}=2Bu_{j} is at least c-c by assumption. Combining the above display with (17) and the definition of ξ\xi concludes the proof of (14). Finally, using the bounds on yz1,z1\left\lVert y-z\right\rVert_{1},\|z\|_{1} above shows that

\|w\|_{1}=\frac{1}{36}\|y\|_{1}\geq\frac{1-\delta}{36}\|z\|_{1}. ∎

Block-encoding.

Our approximate Gibbs oracle follows an implementation strategy pioneered by [GSLW19] termed “block-encoding.” Specifically, we follow [GSLW19] and say that 𝐔\mathbf{U}, an (a+)(a+\ell)-qubit quantum gate, is an \ell-bit block-encoding of 𝐌\mathbf{M} if the top-left 2a×2a2^{a}\times 2^{a} submatrix of 𝐔\mathbf{U} is 𝐌\mathbf{M}. Block-encoded matrices admit efficient composable operations, such as the application of linear combinations and bounded polynomials. We summarize these properties in the following.

Proposition 5 (Lemma 52, [GSLW19]).

Let 𝐔1\mathbf{U}_{1} and 𝐔2\mathbf{U}_{2} be \ell-bit block-encodings of 𝐌1\mathbf{M}_{1}, 𝐌2\mathbf{M}_{2} of the same size. There is an O()O(\ell)-bit block-encoding of 12𝐌1+12𝐌2\frac{1}{2}\mathbf{M}_{1}+\frac{1}{2}\mathbf{M}_{2} which takes the same asymptotic time to apply as applying 𝐔1\mathbf{U}_{1} and 𝐔2\mathbf{U}_{2}.

Proposition 6 (Theorem 56, [GSLW19]).

Let 𝐔\mathbf{U} be an \ell-bit block-encoding of 𝐌\mathbf{M}, and 𝒫:[1,1][12,12]\mathcal{P}:[-1,1]\to[-\frac{1}{2},\frac{1}{2}] be a degree-Δ\Delta polynomial. There is an O()O(\ell)-bit block-encoding of 𝒫(𝐌)\mathcal{P}(\mathbf{M}) which can be applied in O(Δ)O(\Delta) applications of 𝐔\mathbf{U} and 𝐔\mathbf{U}^{\dagger} and O(Δ)O(\ell\Delta) additional time.
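As a concrete classical illustration of the block-encoding definition above, the sketch below completes a diagonal contraction \mathbf{M}=\textbf{diag}(d) (the special case relevant to this appendix) to a unitary whose top-left block is \mathbf{M}; this standard completion is illustrative only and is not the circuit used in our proofs.

import numpy as np

def block_encode_diag(d):
    # Given |d_i| <= 1, return a unitary U whose top-left block is
    # M = diag(d), i.e. a one-(qu)bit block-encoding of M.
    d = np.asarray(d, dtype=float)
    assert np.all(np.abs(d) <= 1.0)
    M = np.diag(d)
    S = np.diag(np.sqrt(1.0 - d ** 2))
    U = np.block([[M, S], [S, -M]])
    assert np.allclose(U @ U.T, np.eye(2 * len(d)))  # U is indeed unitary
    return U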

We also demonstrate that an application of Corollary 4 yields a simple block-encoding of diag(𝐀xβ)\textbf{{diag}}\left(\frac{\mathbf{A}^{\top}x}{\beta}\right). A similar construction previously appeared in [vAG19].

Corollary 6.

Let x0mx\in\mathbb{R}^{m}_{\geq 0} correspond to an instance of 𝖲𝖺𝗆𝗉𝗅𝖾𝗋𝖳𝗋𝖾𝖾\mathsf{SamplerTree}, and βx1\beta\geq\left\lVert x\right\rVert_{1}. Let 𝐌:=diag(𝐀xβ)\mathbf{M}:=\textbf{{diag}}\left(\frac{\mathbf{A}^{\top}x}{\beta}\right) and 𝐔:=𝒪𝐀x(SWAP12𝐈)𝒪𝐀x\mathbf{U}:=\mathcal{O}_{\mathbf{A}^{\top}x}^{*}(\textup{SWAP}_{12}\otimes\mathbf{I})\mathcal{O}_{\mathbf{A}^{\top}x}, where SWAP12\textup{SWAP}_{12} swaps the first two qubits and 𝒪𝐀x\mathcal{O}_{\mathbf{A}^{\top}x} is from Corollary 4. Then 𝐔\mathbf{U} is a block-encoding of 𝐌\mathbf{M}, and can be applied in time O(logm)O(\log m), with total building cost O(Tlogm)O(T\log m) after TT calls to 𝖴𝗉𝖽𝖺𝗍𝖾\mathsf{Update}.

Proof.

Define wij:=𝐀ijxiβw_{ij}:=\frac{\mathbf{A}_{ij}x_{i}}{\beta} for convenience. By the definition of 𝒪𝐀x\mathcal{O}_{\mathbf{A}^{\top}x}, we have that

(SWAP12𝐈)𝒪𝐀x(|0(a+2)|j)=(|00i[m]wij|i+|10|g)|j.\left(\textup{SWAP}_{12}\otimes\mathbf{I}\right)\mathcal{O}_{\mathbf{A}^{\top}x}\left(|0\rangle^{\otimes(a+2)}|j\rangle\right)=\left(|00\rangle\sum_{i\in[m]}\sqrt{w_{ij}}|i\rangle+|10\rangle|g\rangle\right)|j\rangle.

Hence, for j,j[n]j,j^{\prime}\in[n], we compute j|0|(a+2)𝐔|0(a+2)|j\langle j^{\prime}|\langle 0|^{\otimes(a+2)}\mathbf{U}|0\rangle^{\otimes(a+2)}|j\rangle as:

j|(|00i[m]wij|i+|01|g)(|00i[m]wij|i+|10|g)|j\displaystyle\langle j^{\prime}|\left(|00\rangle\sum_{i\in[m]}\sqrt{w_{ij}}|i\rangle+|01\rangle|g\rangle\right)^{*}\left(|00\rangle\sum_{i\in[m]}\sqrt{w_{ij}}|i\rangle+|10\rangle|g\rangle\right)|j\rangle
={i[m]wij=[𝐀x]jβj=j0jj.\displaystyle=\begin{cases}\sum_{i\in[m]}w_{ij}=\frac{[\mathbf{A}^{\top}x]_{j}}{\beta}&j=j^{\prime}\\ 0&j\neq j^{\prime}\end{cases}.

In particular the |01|01\rangle and |10|10\rangle terms disappear, and |j|j\rangle, |j|j^{\prime}\rangle are orthogonal unless j=jj=j^{\prime}. In the above, we required that wijwij=wij\sqrt{w_{ij}}^{*}\sqrt{w_{ij}}=w_{ij}, which is only true if wijw_{ij} is nonnegative. To bypass this issue, we will implement the two copies of 𝒪𝐀x\mathcal{O}_{\mathbf{A}^{\top}x} in slightly different ways, to obtain the correct signing. For notational clarity, we let 𝒪L\mathcal{O}^{L} be the oracle which is conjugated on the left and 𝒪R\mathcal{O}^{R} be the oracle on the right, such that 𝐔=(𝒪L)(SWAP12𝐈)(𝒪R)\mathbf{U}=(\mathcal{O}^{L})^{*}(\textup{SWAP}_{12}\otimes\mathbf{I})(\mathcal{O}^{R}). Note that xx is entrywise nonnegative and β>0\beta>0, and hence the only factor determining the sign of wijw_{ij} is 𝐀ij\mathbf{A}_{ij}. When 𝐀ij0\mathbf{A}_{ij}\geq 0, we will define the oracles 𝒪𝐀\mathcal{O}_{\mathbf{A}}^{\prime} used to load 𝐀ij\sqrt{\mathbf{A}_{ij}} for 𝒪L\mathcal{O}^{L} and 𝒪R\mathcal{O}^{R} in a consistent way (i.e. use the same-signed square root), so that wij2=wij\sqrt{w_{ij}}^{2}=w_{ij}. When 𝐀ij<0\mathbf{A}_{ij}<0 we will define them in an inconsistent way, so that after the conjugation operation, wijwij=wij-\sqrt{w_{ij}}\sqrt{w_{ij}}=w_{ij}. We have thus shown that 0|(a+2)𝐔|0(a+2)=𝐌\langle 0|^{\otimes(a+2)}\mathbf{U}|0\rangle^{\otimes(a+2)}=\mathbf{M} which implies the first conclusion. To see the second, all our gates are reversible (arithmetic circuits are reversible, and 𝒪𝐀\mathcal{O}_{\mathbf{A}} is its own inverse), and hence the complexity of applying 𝒪𝐀x\mathcal{O}_{\mathbf{A}^{\top}x}^{*} is the same as 𝒪𝐀x\mathcal{O}_{\mathbf{A}^{\top}x}. ∎

Finally, we put together the pieces and prove Proposition 2, which we use repeatedly throughout the paper to implement our Gibbs sampling oracles.

See Proposition 2

Proof.

Throughout the proof, let δmin(12,δ)\delta\leftarrow\min(\frac{1}{2},\delta) and B:=4(β+log(Cnδ))B:=4(\beta+\log(\frac{Cn}{\delta})). Also define :=δn\ell:=\frac{\delta}{n} (following notation of Corollary 5). We first observe that since maxj[n][𝐀x]jlogZmaxj[n][𝐀x]j+logn\max_{j\in[n]}[\mathbf{A}^{\top}x]_{j}\leq\log Z\leq\max_{j\in[n]}[\mathbf{A}^{\top}x]_{j}+\log n,

log(Cn)maxj[n][𝐀x]jlog(Z~qj)0.-\log(Cn)\leq\max_{j\in[n]}[\mathbf{A}^{\top}x]_{j}-\log\left(\widetilde{Z}q_{j}\right)\leq 0.

Here, the upper bound used that for all j\in[n], \exp([\mathbf{A}^{\top}x]_{j}-\log(\widetilde{Z}q_{j}))=\frac{p_{j}}{q_{j}}\cdot\frac{Z}{\widetilde{Z}}\leq 1 by assumption. Hence, for v:=\mathbf{A}^{\top}x-\log(\widetilde{Z}q) entrywise,

cmaxj[n]vj0 for c:=log(Cn).-c\leq\max_{j\in[n]}v_{j}\leq 0\text{ for }c:=\log(Cn).

Next, we note log(Z~q)\log(\widetilde{Z}q) is entrywise bounded in magnitude by B2\frac{B}{2}:

log(Z~qj)\displaystyle\log(\widetilde{Z}q_{j}) log(CZ)log(nmaxj[n]exp([𝐀x]j))+logCB2,\displaystyle\leq\log(CZ)\leq\log\left(n\cdot\max_{j\in[n]}\exp([\mathbf{A}^{\top}x]_{j})\right)+\log C\leq\frac{B}{2},
log(Z~qj)\displaystyle\log(\widetilde{Z}q_{j}) logZ+logδnminj[n][𝐀x]jlognδB2.\displaystyle\geq\log Z+\log\frac{\delta}{n}\geq\min_{j\in[n]}[\mathbf{A}^{\top}x]_{j}-\log\frac{n}{\delta}\geq-\frac{B}{2}.

Define \mathbf{M}_{1}:=\textbf{diag}(\frac{\mathbf{A}^{\top}x}{B}) and \mathbf{M}_{2}:=\textbf{diag}(-\frac{1}{B}\log(\widetilde{Z}q)). By the calculations above, we have \left\lVert\mathbf{M}_{2}\right\rVert_{\textup{op}}\leq\frac{1}{2}, and similarly it is clear that \left\lVert\mathbf{M}_{1}\right\rVert_{\textup{op}}\leq\frac{1}{2} because \left\lVert\mathbf{A}^{\top}x\right\rVert_{\infty}\leq\beta\leq\frac{B}{2}. Moreover, by using Corollary 6 with \beta\leftarrow B, we obtain \mathbf{U}_{1}, a block-encoding of \mathbf{M}_{1} applicable in O(\log m) time. Using a similar construction as Corollary 6, since q, B, and \widetilde{Z} are all efficiently classically queriable, we obtain \mathbf{U}_{2}, a block-encoding of \mathbf{M}_{2} applicable in O(1) time. Hence, Proposition 5 yields \mathbf{U}, a block-encoding of

\frac{1}{2}\mathbf{M}_{1}+\frac{1}{2}\mathbf{M}_{2}=\textbf{diag}\left(\frac{v}{2B}\right),

which can be applied in O(logmn)O(\log mn) time. Next, let 𝒫\mathcal{P} be the degree-Δ=O(BlogCnδ)\Delta=O(B\log\frac{Cn}{\delta}) polynomial from Corollary 5, parameterized by B,v,c,q,B,v,c,q,\ell as defined earlier. Corollary 5 shows that 𝒫:[1,1][12,12]\mathcal{P}:[-1,1]\to[-\frac{1}{2},\frac{1}{2}]. Thus, Proposition 6 then yields 𝐔\mathbf{U}^{\prime}, a block-encoding of diag(𝒫(v2B))\textbf{{diag}}\left(\mathcal{P}(\frac{v}{2B})\right) which can be applied in O(Δlogmn)O(\Delta\cdot\log mn) time. Furthermore, since qq and ρ\rho are efficiently classically queriable, we can define a gate 𝒪q\mathcal{O}_{q} which is applicable in O(1)O(1) time and acts as

𝒪q|0(b+1)=|0j[n]qjρ|j+|1|g.\mathcal{O}_{q}|0\rangle^{\otimes(b+1)}=|0\rangle\sum_{j\in[n]}\sqrt{\frac{q_{j}}{\rho}}|j\rangle+|1\rangle|g\rangle.

Applying 𝐔\mathbf{U}^{\prime} to the output of 𝒪q\mathcal{O}_{q} with appropriate ancilla qubits then yields

|0O(1)j[n]qj𝒫(uj)2ρ|j|gj+|g, where uj:=vj2B for all j[n].|0\rangle^{\otimes O(1)}\sum_{j\in[n]}\sqrt{\frac{q_{j}\mathcal{P}(u_{j})^{2}}{\rho}}|j\rangle|g_{j}\rangle+|g^{\prime}\rangle,\text{ where }u_{j}:=\frac{v_{j}}{2B}\text{ for all }j\in[n].

Post-selecting on the first register being the all-zeroes state and measuring on the register corresponding to jj, we see that we obtain a sample j[n]j\in[n] with probability proportional to qj𝒫(uj)2q_{j}\mathcal{P}(u_{j})^{2}. By Corollary 5, conditioned on the sample succeeding, the resulting distribution is δ\delta-close in 1\ell_{1} to the distribution proportional to qexp(v)exp(𝐀x)q\circ\exp(v)\propto\exp(\mathbf{A}^{\top}x), and hence the result is a δ\delta-approximate Gibbs oracle. Finally, we bound the query cost of the oracle. Define wj:=𝒫(uj)2qjw_{j}:=\mathcal{P}(u_{j})^{2}q_{j} and zj:=exp(vj)qjz_{j}:=\exp(v_{j})q_{j} as in Corollary 5. By definition of v,Z~v,\widetilde{Z},

z1=j[n]exp([𝐀x]j)Z~[C1,1].\left\lVert z\right\rVert_{1}=\sum_{j\in[n]}\frac{\exp\left(\left[\mathbf{A}^{\top}x\right]_{j}\right)}{\widetilde{Z}}\in\left[C^{-1},1\right].

Moreover, the last conclusion in Corollary 5 shows w1172z1(72C)1\left\lVert w\right\rVert_{1}\geq\frac{1}{72}\left\lVert z\right\rVert_{1}\geq(72C)^{-1}. Hence,

j[n]qj𝒫(uj)2ρ=w1ρ172Cρ.\sum_{j\in[n]}\frac{q_{j}\mathcal{P}(u_{j})^{2}}{\rho}=\frac{\left\lVert w\right\rVert_{1}}{\rho}\geq\frac{1}{72C\rho}.

In other words, we have an oracle which we can apply in time O(\Delta\cdot\log mn) which correctly returns a sample with probability \alpha\geq\frac{1}{72C\rho}. By applying Proposition 4 to improve the success probability, we obtain the desired conclusion at an O(\sqrt{C\rho}\log\frac{1}{\delta}) overhead. ∎
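For intuition, the classical analogue of this rejection sampler is sketched below, assuming direct access to \mathbf{A}^{\top}x; each proposal accepts with probability \frac{\left\lVert z\right\rVert_{1}}{\left\lVert q\right\rVert_{1}}\geq\frac{1}{C\rho}, whereas the quantum implementation encodes \mathcal{P}(\frac{v_{j}}{2B})\approx\sqrt{\exp(v_{j})} in an amplitude, so amplitude amplification square-roots the expected proposal count.

import numpy as np

def hinted_rejection_sample(Ax, q, Z_tilde, rng=np.random.default_rng(0)):
    # Classical analogue of Proposition 2: propose j ~ q / ||q||_1 and
    # accept with probability exp([A^T x]_j) / (Z_tilde * q_j), which is
    # at most 1 by the assumption exp(v_j) <= 1; accepted indices then
    # follow the Gibbs distribution proportional to exp(A^T x).
    q = np.asarray(q, dtype=float)
    accept = np.exp(np.asarray(Ax, dtype=float)) / (Z_tilde * q)
    while True:
        j = rng.choice(len(q), p=q / q.sum())
        if rng.random() < accept[j]:
            return j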

See Corollary 2

Proof.

Our oracle 𝒪test\mathcal{O}_{\textup{test}} is the oracle from Proposition 2, except we will choose a sufficiently small constant value of δ\delta. It returns “success” when the sample is accepted by the rejection sampler after boosting by amplitude amplification. Before boosting, the success probability from Proposition 2 is Θ(1Rρ)\Theta(\frac{1}{R\rho}) where the constants in the upper and lower bounds are explicit. Further, the constants from Proposition 4 are explicit, and hence boosting by amplitude amplification improves the success probability to Θ(1Rρ)\Theta(\frac{1}{\sqrt{R\rho}}) with known constant bounds as required by the corollary statement. ∎

Appendix C Bounded approximation to exp\exp on [1,1][-1,1]

Here, we give a proof of a lemma (with slightly different constants) used in the prior work [vAG19]. This section builds entirely off prior results on polynomial approximation in [GSLW19]; we include it for completeness because a proof was not given in [vAG19]. As a reminder, we stated and used the following result earlier when constructing our rejection sampler in Appendix B.

See Lemma 8

To obtain the lemma, we will utilize the following result from [GSLW19].

Proposition 7 (Corollary 66, [GSLW19]).

Let x0[1,1]x_{0}\in[-1,1], r(0,2]r\in(0,2], δ(0,r]\delta\in(0,r]. Let f:[x0rδ,x0+r+δ]f:[x_{0}-r-\delta,x_{0}+r+\delta]\rightarrow\mathbb{C} be such that f(x0+x)=0axf(x_{0}+x)=\sum_{\ell\geq 0}a_{\ell}x^{\ell} for all x[rδ,r+δ]x\in[-r-\delta,r+\delta]. Suppose B>0B>0 is such that 0(r+δ)|a|B\sum_{\ell\geq 0}(r+\delta)^{\ell}|a_{\ell}|\leq B and let ϵ(0,12B]\epsilon\in(0,\frac{1}{2B}]. There is a polynomial PP (see Appendix D for its numerically stable implementation) of degree O(1δlogBϵ)O\left(\frac{1}{\delta}\log\frac{B}{\epsilon}\right) such that

maxx[x0r,x0+r]|f(x)P(x)|ϵ and maxx[1,1]|P(x)|ϵ+B.\displaystyle\max_{x\in[x_{0}-r,x_{0}+r]}|f(x)-P(x)|\leq\epsilon\text{ and }\max_{x\in[-1,1]}|P(x)|\leq\epsilon+B.
Proof of Lemma 8.

We apply Proposition 7 with f(x):=exp(βx)f(x):=\exp(\beta x) which has a convergent Taylor series everywhere, and the parameter settings x0=1x_{0}=-1, r=1r=1, δ=1β\delta=\frac{1}{\beta}, B=eB=e. We have that f(x0+x)=0exp(β)βx!=0axf(x_{0}+x)=\sum_{\ell\geq 0}\exp(-\beta)\frac{\beta^{\ell}\cdot x^{\ell}}{\ell!}=\sum_{\ell\geq 0}a_{\ell}x^{\ell} with a=exp(β)β!a_{\ell}=\exp(-\beta)\frac{\beta^{\ell}}{\ell!} for any integer 0\ell\geq 0. We also check that our choice of BB is valid, via

0(r+δ)|a|=exp(β)0(1+1β)β!=exp(β)0(β+1)!=exp(β+1β)=e.\sum_{\ell\geq 0}(r+\delta)^{\ell}|a_{\ell}|=\exp(-\beta)\sum_{\ell\geq 0}\left(1+\frac{1}{\beta}\right)^{\ell}\frac{\beta^{\ell}}{\ell!}=\exp(-\beta)\sum_{\ell\geq 0}\frac{(\beta+1)^{\ell}}{\ell!}=\exp(\beta+1-\beta)=e.

Hence by Proposition 7, for any \xi\leq\frac{1}{2e} there is a polynomial P of degree O(\beta\log\frac{1}{\xi}) such that \max_{x\in[-2,0]}|\exp(\beta x)-P(x)|\leq\xi and \max_{x\in[-1,1]}|P(x)|\leq e+\xi\leq 3; restricting the approximation guarantee to [-1,0]\subset[-2,0] gives the claim. ∎

Appendix D Numerically stable implementation of polynomial approximation

Throughout this section, let \Delta=O(\frac{1}{\epsilon}\log^{2}(\frac{mn}{\epsilon})) be the degree of the polynomial constructed in the proof of Proposition 2 in Appendix B (where we have C=O(1) and \delta=O(\epsilon) in our applications). The polynomial we use is constructed via a decomposition in the Fourier basis (see Lemmas 57 and 65, [GSLW19]). It is not immediate that this polynomial transform can be implemented stably in finite-precision arithmetic, within the quantum singular value transformation framework of [GSLW19], which is used in the proof of Proposition 2. However, [Haa19] shows that given such a decomposition in the Fourier basis, we can obtain a numerically-stable implementation of the required polynomial transformation as a quantum circuit, up to additive error \xi, in time

O(Δ3log(Δξ)).O\left(\Delta^{3}\log\left(\frac{\Delta}{\xi}\right)\right).

In our setting (in the proof of Proposition 2), it is straightforward to check that taking \xi^{-1}=\textup{poly}(m,n,\epsilon^{-1}) suffices. This construction results in the additive term in Theorem 4.