Undecidable problems associated with variational quantum algorithms

Georgios Korpas Department of Computer Science, Czech Technical University in Prague, Czechia Quantum Technologies Group, Innovation & Ventures, HSBC, Singapore Archimedes Research Unit on AI, Data Science and Algorithms, Athena RC, Marousi, Greece Vyacheslav Kungurtsev Jakub Mareček Department of Computer Science, Czech Technical University in Prague, Czechia

Abstract

Variational Quantum Algorithms (VQAs), such as the Variational Quantum Eigensolver (VQE) and the Quantum Approximate Optimization Algorithm (QAOA), are widely studied as candidates for near-term quantum advantage. Recent work has shown that training VQAs is $\mathsf{NP}$ -hard in general. In this paper, we present a conditional result suggesting that the training of VQAs is undecidable, even in idealized, noiseless settings. We reduce the decision version of the digitized VQA training problem—where circuit parameters are drawn from a discrete set—to the question of whether a universal Diophantine equation (UDE) has a root. This reduction relies on encoding the UDE into the structure of a variational quantum circuit via the matrix exponentials. The central step involves establishing a correspondence between the objective function of the VQA and a known UDE of 58 variables and degree 4. Our main result is conditional on a natural conjecture: that a certain system of structured complex polynomial equations—arising from the inner product of a VQA circuit output and a fixed observable—has at least one solution. We argue this conjecture is plausible based on dimension-counting arguments (degrees of freedom in the Hamiltonians, state vector, and observable), and the generic solvability of such systems in algebraic geometry over the complex numbers. Under this assumption, we suggest that deciding whether a digitized VQA achieves a given energy threshold is undecidable. This links the limitations of variational quantum algorithms to foundational questions in mathematics and logic, extending the known landscape of quantum computational hardness to include uncomputability. Additionally, we establish an unconditional undecidability result for VQA convergence in open quantum systems.

I Introduction

Quantum algorithms, especially those implementable on near-term hardware, have garnered significant attention in recent years [1, 2, 3]. Many financial institutions are investing in quantum computing, motivated by its potential applications in risk assessment, portfolio optimization, and stochastic control [4, 5, 6, 7] while many other industries follow this path as well. Variational Quantum Algorithms (VQAs), encompassing the Variational Quantum Eigensolver (VQE) [8], the Quantum Approximate Optimization Algorithm (QAOA) [9], and their respective variations, have sometimes been advocated as a methodology for addressing $\mathsf{NP}$ -hard optimization problems. See [2] for an extensive review. These algorithms operate by iteratively evaluating a parameterized quantum circuit and updating its parameters to minimize a cost function—typically, the expectation value of an observable (see App. A.1).

However, the expectation value of the quantum observable, as measured by a quantum computer, depends on specific parameters provided to a classical optimizer. The objective of this optimizer is to explore the parameter landscape, adjusting the parameters to minimize the observable’s expectation value. These refined parameters are then anticipated to yield a reduced estimate of the expectation value of the quantum observable, ultimately converging to the genuine minimum, ideally. This process—commonly referred to as VQA training—forms a hybrid quantum-classical loop.

The classical computation of parameters for the variational form is often achieved through the use of gradient-based approaches [10] or quasi-Newton methods [11]. In particular, it has been observed that even second-order techniques often converge to local minima and are not guaranteed to reach global optima [12], and the computation of the optima, even to a predetermined precision, is $\mathsf{NP}$ -hard [13]. Independently, there are lower bounds [14, 15] on the depth of parametrized circuits that one needs to consider when analyzing the efficiency of variational quantum algorithms. The impact of noise on the iteration complexity of VQAs was studied in [16].

In this work, we show that training Variational Quantum Algorithms (VQAs)—under certain assumptions—can be undecidable, extending the known complexity barriers beyond $\mathsf{NP}$ -hardness. Specifically, we consider the conditions under which the classical computation of the parameters of the variational form becomes uncomputable. We also examine related problems in training VQAs and analyze whether QAOA converges in the presence of noise.

Our main result, Theorem 5, connects the decision version of training VQAs to Hilbert’s 10th problem by reducing it to the solvability of a universal Diophantine equation, assuming the existence of a common solution for a certain system of complex polynomial equations. In Conjecture 4, we argue that such a solution likely exists, based on degrees-of-freedom heuristics and the typical behavior of structured polynomial systems. We then show that this result extends naturally to QAOA.

Our investigation focuses primarily on worst-case behavior. These findings do not rule out the possibility that specific subclasses of VQAs may still be successfully trainable under more favorable conditions, which we briefly discuss in Section VI.

Notation.

We use the notation $[n]\coloneqq\{1,\dots,n\}$ . The Pauli matrices are denoted by $\sigma_{x}$ , $\sigma_{y}$ , and $\sigma_{z}$ respectively. By $\operatorname{Herm}(\mathbb{C}^{n})$ we denote the space of Hermitian $n\times n$ matrices, i.e., matrices $A\in\mathbb{C}^{n\times n}$ that satisfy $A=A^{\dagger}$ , where $\dagger$ denotes complex conjugation. A complex Hilbert space $\mathcal{H}$ of dimension $n$ is isomorphic to $\mathbb{C}^{n}$ . By $\|A\|$ we refer to the operator norm of an operator $A\in{\rm End}(\mathcal{H})\cong\operatorname{Herm}(\mathbb{C}^{n})$ while by $U(n)\subset{\rm GL}_{n}(\mathbb{C})$ we denote the degree $n$ unitary group. For a field $k$ , we denote by $k[\boldsymbol{x}]$ the ring of polynomials with coefficients in $k$ where $\boldsymbol{x}=(x_{1},\ldots,x_{m})$ , for some integer $m>0$ , .

II Correspondence of VQA to Polynomial Equation Systems

Initially, let us focus on the VQA minimization problem; specifically we state the VQA minimization problem as termed in [13] and also a digitized VQA minimization problem. These problems will be fundamental in the subsequent analysis.

Problem 1 (VQA minimization problem, paraphrasing [13]).

Instance: An initial state $\ket{\Psi_{0}}\in\mathbb{C}^{n}$ , a set of generators $\{H_{i}\}_{i\in\{1,\dots,L\}}\in\operatorname{Herm}(\mathbb{C}^{n})$ , where $L\in\mathbb{Z}_{>0}$ is the number of layers and an observable $O\in\operatorname{Herm}(\mathbb{C}^{n})$ .
Optimization version: For $\ket{\Psi(\boldsymbol{\phi})}\coloneqq U_{L}(\phi_{L})\cdots U_{1}(\phi_{1})\ket{\Psi_{0}}$ with $U_{i}(\phi_{i})=\mathrm{e}^{-\mathrm{i}H_{i}\phi_{i}}$ , find $\boldsymbol{\phi}\in\mathbb{R}^{L}$ that minimizes $\left<O(\boldsymbol{\phi})\right>\coloneqq~{}\left\langle\Psi(\boldsymbol{\phi})\right|O\left|\Psi(\boldsymbol{\phi})\right\rangle$ .
Decision version for $a\in\mathbb{R}$: For $\ket{\Psi({\boldsymbol{\phi}})}\coloneqq U_{L}(\phi_{L})\cdots U_{1}(\phi_{1})\ket{\Psi_{0}}$ with $U_{i}(\phi_{i})~{}=~{}\mathrm{e}^{-\mathrm{i}H_{i}\phi_{i}}$ , determine if there exists $\boldsymbol{\phi}\in\mathbb{R}^{L}$ for which $\left\langle\Psi(\boldsymbol{\phi})\right|O\left|\Psi(\boldsymbol{\phi})\right\rangle\leq a$ .

Example.

For illustration purposes, consider the $L=1$ case of Problem 1. There, one seeks a parameter $\phi\in\mathbb{R}$ such that the state given by $\ket{\Psi({\phi})}\coloneqq U(\phi)\ket{\Psi_{0}}\coloneqq\mathrm{e}^{-\mathrm{i}H\phi}\ket{\Psi_{0}}$ satisfies $\left\langle\Psi({\phi})\right|O\left|\Psi({\phi})\right\rangle\leq a\in\mathbb{R}$ . The operator $O$ can be decomposed as

\displaystyle O=\sum\limits_{j=1}^{n}\lambda_{j}P_{\lambda_{j}},

(1)

with $P_{\lambda_{j}}$ the associated eigenspace projection operators with eigenvalue $\lambda_{j}$ , the associated unconstrained optimization problem becomes:

\min\limits_{\phi\in\mathbb{R}}\,\sum\limits_{j=1}^{n}\lambda_{j}\left\|P_{\lambda_{j}}\mathrm{e}^{-\mathrm{i}H\phi}\ket{\Psi_{0}}\right\|^{2}

(2)

which is a continuously differentiable function of a real valued variable $\phi$ . Let $n=2$ , $\ket{\Psi_{0}}=\alpha\ket{0}+\beta\ket{1}$ , $O=\begin{pmatrix}2&0\\ 0&-1\end{pmatrix}$ and $H=\sigma_{x}$ , so $\cos(H\phi)=\cos(\phi)I$ and $\sin(H\phi)=\sin(\phi)\sigma_{x}$ . Then, Problem (2) becomes,

\min\limits_{\phi\in\mathbb{R}}(2\alpha^{2}-\beta^{2})\cos^{2}\phi-(2\beta^{2}-\alpha^{2})\sin^{2}\phi

(which fits the framework of Theorem 24) and does not have any known exploitable structure such as convexity. This already hints at the undecidability of VQA.

Nevertheless, we know the form of the optimization problem, and although one can consider that it is unlikely that a special structure-exploiting algorithm for minimization could be found, this argument cannot guarantee that one does not exist.

Thus, to prove the undecidability, let us consider a floating-point representation of the parametrization of the variational form. Indeed, the pulses that implement unitary gates in NISQ devices are stored digitally, and thence the gates cannot be parametrized continuously, but only using a (possibly infinite) discrete set of values, generalizing floating-point arithmetic (IEEE 754) commonly utilized on classical computers. Formally:

Problem 2 (Digitized VQA minimization problem).

Instance: A discrete set of numbers $\mathbb{D}$ representable by a single word on a digital computer. An initial state $\ket{\Psi_{0}}\in\mathbb{C}^{n}$ , a set of generators $\{H_{i}\}_{i\in\{1,\dots,L\}}\subset\operatorname{Herm}(\mathbb{C}^{n})$ , where $L$ is the number of layers and an observable $O\subset\operatorname{Herm}(\mathbb{C}^{n})$ .
Optimization version: For $\ket{\Psi(\boldsymbol{\phi})}\coloneqq U_{L}(\phi_{L})\cdots U_{1}(\phi_{1})\ket{\Psi_{0}}$ with $U_{i}(\phi_{i})=\mathrm{e}^{-\mathrm{i}H_{i}\phi_{i}}$ , find $\boldsymbol{\phi}\in\mathbb{D}^{L}$ that minimizes $\left<O(\boldsymbol{\phi})\right>\coloneqq~{}\left\langle\Psi(\boldsymbol{\phi})\right|O\left|\Psi(\boldsymbol{\phi})\right\rangle$ .
Decision version with $a\in\mathbb{R}$: For $\ket{\Psi(\boldsymbol{\phi})}\coloneqq U_{L}(\phi_{L})\cdots U_{1}(\phi_{1})\ket{\Psi_{0}}$ with $U_{i}(\phi_{i})=\mathrm{e}^{-\mathrm{i}H_{i}\phi_{i}}$ , decide if there exists $\boldsymbol{\phi}\in\mathbb{D}^{L}$ such that $\left<O(\boldsymbol{\phi})\right>\coloneqq~{}\left\langle\Psi(\boldsymbol{\phi})\right|O\left|\Psi(\boldsymbol{\phi})\right\rangle\leq a$ .

\begin{gathered}D(58,4),D(38,8),D(32,12),D(29,16),\\ D(28,20),D(26,24),D(25,28),D(24,36),\\ D(21,96),D(19,2668),D(14,2\times 10^{5}),\\ D(13,6.6\times 10^{43}),D(12,1.3\times 10^{44}),\\ D(11,4.6\times 10^{44}),D(10,8.6\times 10^{44}),\\ D(9,1.6\times 10^{45}).\end{gathered}

(3)

Figure 1: A sample of known Universal Diophantine Equations

D(n,d)

, which are polynomials of degree

d

n

variables.

In the following, we will present evidence that this is undecidable, using the arguments of Matiâsevič [17], presented very nicely by Jones [18]. Recall that a set $W$ of integers (representing strings) is recursively enumerable (r.e.) if there exists a recursive function that can eventually generate any element in $W$ . Building on earlier results of Martin Davis, Hilary Putnam, and Julia Robinson, Matiâsevič has shown that every recursively enumerable set is Diophantine, that is, for every r.e. set $W$ , there is a polynomial $D^{\prime}_{W}$ such that for all integers $x$ :

\displaystyle x\in W\Leftrightarrow\exists v,z_{1},z_{2},\ldots,z_{v}\textrm{ s.t. }D^{\prime}_{W}(x,z_{1},z_{2},\ldots,z_{v})=0

where $z_{1},z_{2},\ldots,z_{v}$ are positive integers and the coefficients of $D^{\prime}_{W}$ are integers. In this tradition, $z_{1},z_{2},\ldots,z_{v}$ are known as variables, and $x,v$ are known as parameters. Thus, if one were able to decide the existence of the root of $D^{\prime}_{W}$ in positive integers, one could decide the halting problem [19, Chapter 4] for any Turing machine.

Furthermore, there exist universal integers $n,d$ , such that for every r.e. set $W$ , there exists its description $v$ such that:

\displaystyle x\in W\Leftrightarrow\exists z_{1},z_{2},\ldots,z_{n}\textrm{ s.t. }D^{\prime\prime}(x,v,z_{1},z_{2},\ldots,z_{n})=0

where $D^{\prime\prime}$ is a polynomial of degree $d$ . This polynomial is called a universal Diophantine equation. Here, $x,v$ would still be parameters. When deciding on the halting problem [19, Chapter 4] on the input $x$ for a Turing machine described by a number of parameters, it is convenient to consider additional parameters, as suggested in Figure 1, where the parameters are $u,x,y,z$ .

This sum-of-squares construction ensures that the optimization problem can be interpreted as minimizing a squared norm, which aligns naturally with the quantum mechanical objective (16).

The construction of universal Diophantine equations (UDE) has been the subject of much subsequent work, especially when seeking UDE such that the number $n$ of variables and the degree $d$ of the polynomial are low.¹¹1 The construction of such polynomials may seem rather obscure, but it is nicely explained in [20, Note 12.9] including Mathematica code. This subsequent work [17] has established the existence of a variety of universal Diophantine equations $D(n,d)$ , as illustrated in Figure 1, cited from [17]. In particular, it is known [21, 17] that there is a universal Diophantine equation with 58 variables and degree 4, and that solving this $D(58,4)$ is equivalent to solving $D(86\times 10^{43},10)$ , which in turn is equivalent to solving the succinct²²2Notice that $D(58,4)$ features many monomials while $D(10,86\times 10^{43})$ features large coefficients. $D(10,2\times 5^{60})$ displayed in Figure 1.

A note on matrix exponentials

We recall some fundamental properties of matrix exponentials which will be useful for the sequel. Write

\exp{(kA)}=\mathrm{e}^{kA},

(4)

where $A\in\mathbb{K}^{n\times n}$ , where $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$ usually, and $k$ an element of some ring $R$ , usually presented as the ring of real numbers $\mathbb{R}$ . We recall Sylvester’s formula:

Theorem 3 (Sylvester’s formula [22]from Caley-Hamilton theorem).

We can write

\mathrm{e}^{kA}=\sum_{j=0}^{n-1}\varkappa_{j}A^{j}.

(5)

where each $\varkappa_{j}\in\mathbb{K}$ .

Proof.

This is quite easy to see: let $A\in\mathbb{R}^{2\times 2}$ , i.e., $n=2$ , without loss of generality and let $a,b\in\mathbb{R}$ be its eigenvalues. Note that although the original form of the formula is for $k\in\mathbb{R}$ and $A\in\mathbb{R}^{n\times n}$ (for some $n$ ), the derivation of Sylvester’s Theorem can be performed using the Cayley-Hamilton Theorem, which has a generic form appropriate to any commutative ring [23, Corollary 4.2.15]³³3This will be quite important in Section III when discussing such matrix exponentials. A commutative ring is simply a set together with an operation of addition and multiplication. For example, the set of natural numbers $\mathbb{Z}$ together with the standard multiplication and addition forms a commutative ring. Similarly, the set of $n\times n$ diagonal matrices ${\rm MatDiag}(\mathbb{C}^{n})$ with values in complex numbers, together with the standard matrix addition and matrix multiplication, forms a commutative ring.. See also [24]. We assume for simplicity that $A$ is diagonalizable. In our setting, the matrices of interest (e.g., Hermitian Hamiltonians) are always diagonalizable, so this assumption holds. ∎

The eigendecomposition of a $A$ is $A=Q\left(\begin{smallmatrix}p&0\\ 0&q\end{smallmatrix}\right)Q^{-1}$ for an invertible square matrix $Q$ . Then, $\mathrm{e}^{A}=Q\left(\begin{smallmatrix}\mathrm{e}^{p}&0\\ 0&\mathrm{e}^{q}\end{smallmatrix}\right)Q^{-1}$ . Finally, observe that by finding such a $Q$ we arrive at $\mathrm{e}^{A}=(q-p)^{-1}[(q\mathrm{e}^{p}-p\mathrm{e}^{q})\mathds{1}+(\mathrm{e}^{q}-\mathrm{e}^{p})A]$ . This generalizes to a higher $n$ .

In Eq. (5) the coefficients in the finite series expansion are given as functions of the eigenvalues of $A$ ,

	$\displaystyle\mathrm{e}^{k\lambda_{i}}$	$\displaystyle=\sum_{j=0}^{n-1}\varkappa_{j}\lambda_{i}^{j}$		(6)
		$\displaystyle=\varkappa_{0}+\varkappa_{1}\lambda_{i}+\varkappa_{2}\lambda_{i}^{2}+\ldots+\varkappa_{n-1}\lambda_{i}^{n-1}$		(7)

To understand this better let us consider a concrete example where $A\in\mathbb{C}^{2\times 2}$ given as

A=\begin{pmatrix}a&b\\ c&d\ \end{pmatrix},

(8)

and let $k\in\mathbb{R}$ . The characteristic polynomial of $A$ is

p(x)=x^{2}-(a+d)x+ad-bc,

(9)

with eigenvalues

\displaystyle x_{1,2}

\displaystyle=\tfrac{1}{2}(a+d\mp\sqrt{a^{2}+4bc-2ad+d^{2}}).

(10)

Using Eq. (6) we can compute the coefficients $\{\varkappa_{j}\}$ appearing in Eq. (5) using

	$\displaystyle\mathrm{e}^{kx_{1}}$	$\displaystyle=\varkappa_{0}+\varkappa_{1}x_{1}$		(11)
	$\displaystyle\mathrm{e}^{kx_{2}}$	$\displaystyle=\varkappa_{0}+\varkappa_{1}x_{2}.$		(11)

to obtain

	$\displaystyle\varkappa_{0}(x_{1},x_{2},k)$	$\displaystyle=\frac{\mathrm{e}^{kx_{2}}x_{1}-\mathrm{e}^{kx_{1}}x_{2}}{x_{1}-x_{2}}$		(12)
	$\displaystyle\varkappa_{1}(x_{1},x_{2},k)$	$\displaystyle=\frac{\mathrm{e}^{kx_{1}}-\mathrm{e}^{kx_{2}}}{x_{1}-x_{2}}.$		(13)

From Eq. (12), we can write $\varkappa_{0},\varkappa_{1}$ as functions of the eigenvalues $x_{1},x_{2}$ using Eq. (5). We obtain:

\mathrm{e}^{kA}=\varkappa_{0}(x_{1},x_{2},k)\mathds{1}+\varkappa_{1}(x_{1},x_{2},k)A.

(14)

Observe that when $k\in\mathrm{i}\mathbb{R}$ , each coefficient $\varkappa_{j}\in\mathbb{C}$ . These coefficients are determined entirely by the eigenvalues $\left\{\lambda_{i}\right\}$ of $A$ , and depend on the scalar $k$ as well. They are therefore not independent parameters, but functions of the spectrum of $A$ . This observation is important: for a matrix $A\in\mathbb{K}^{n\times n}$ with $n^{2}$ (real) degrees of freedom, solving an equation of the form

\operatorname{Tr}\left[A\left(\varkappa_{j}B\right)\right]=0

where $B\in\mathbb{K}^{n\times n}$ , does not introduce new degrees of freedom-since $\varkappa_{j}$ is not a free variable but a fixed scalar determined by $A$ and $k$ .

III The Undecidability Conjecture

We would like to encode the universal Diophantine equation $D(58,4)$ into the digitized VQA minimization problem. That is, we wish to take one universal Diophantine equation, $D(58,4)$ , which is a system of polynomial equations, but which can be easily transformed into a single sum-of-squares polynomial $D(58,8)$ . This sum-of-squares version can be seen as:

\min\limits_{\boldsymbol{\phi}\in\mathbb{D}^{58}}\,\sum_{j}q_{j}(\boldsymbol{\phi})^{2},

(15)

where $q_{j}$ denotes one summand of the sum-of-squares form of the UDE $D(58,8)$ . This sum-of-squares construction ensures that the optimization problem can be interpreted as minimizing a squared norm, which aligns naturally with the quantum mechanical objective (16).

Now, let us consider the $L=58$ and $n=5$ case of Problem 1:

\displaystyle\min_{\boldsymbol{\phi}\in\mathbb{D}^{58}}\,\left\|O\prod_{i=0}^{57}U_{58-i}(\phi_{58-i})\ket{\Psi_{0}}\right\|^{2}

(16)

and let us show the existence of the Hermitian operator $O\in\operatorname{Herm}(\mathbb{C}^{n})$ , as well as the state $\ket{\Psi_{0}}\in\mathcal{H}$ , and unitaries $U_{58}(\phi_{58}),\ldots,U_{1}(\phi_{1})\in U(n)$ such that the classically-solved optimization problem (16) in the discrete set $\mathbb{D}^{58}$ of numbers representable by 58 words in a digital computer, attains objective value zero if and only if there is a zero of the UDE $D(58,4)$ in $\mathbb{D}^{58}$ . In other words, proving Theorem 5 amounts to finding a map (16) $\to$ (15) and the existence of such an encoding map boils down to the nontrivial task of proving the existence of a common zero of a system of complex polynomials. The challenge lies in ensuring that the variational quantum circuit evaluates to zero if and only if the UDE has a solution—a condition that hinges on the ability to construct appropriate Hamiltonians, a state, and an observable to match the algebraic structure of the UDE.

The conversion involves several layers of variables that we qualitatively describe now, and further describe in Table 2 in the Supplementary material:

•

Within any UDE, there are parameters that encode the Turing machine.⁴⁴4In the example of Figure 3, this would be $u,v,x,z$ .
•

Within the same UDE, there will be a vector of variables restricted to integers; ⁵⁵5 In the example of Figure 3, these would be $a,b,c,\ldots,s,t,w,\alpha,\gamma,\nu,\theta,\lambda,\tau,\phi$ . Notice, however, that we override the meaning of some of these symbols in this section. in $D(58,4)$ , this is a 58-vector, which we denote $\boldsymbol{\phi}$ .
•

Ultimately, we can decide on the state vector $\ket{\Psi_{0}}$ , and the operators $O,U_{58}(\phi_{58}),\ldots,U_{1}(\phi_{1})$ . In this system, $\ket{\Psi_{0}}$ will enjoy some freedom, which will be discussed later. Note that the dimensions of these objects are restricted by the dimension of the UDE⁶⁶6By Sylvester’s formula..
•

However, the existence of an encoding map is determined by the existence of a non-empty zero set $V(I)$ of simultaneous zeros of a certain system of complex polynomials closely associated with the probability amplitude of $O$ .

First, let us recall that it is sufficient to consider only universal Diophantine equations that are sum-of-squares polynomials, which is compatible with Prob. (15). Fortunately, there are a number of those, as exemplified in Figure 3. The sum-of-squares nature of many universal Diophantine equations is due to the fact that those are usually devised as systems of polynomial equations, and only at the end one obtains a single polynomial by the sum-of-squares construction.

Assumptions. Without loss of generality of the proof, we will assume that $[H_{i},H_{j}]=0$ for all $H_{i},H_{j}\in\operatorname{Herm}(\mathbb{C}^{n})$ . This assumption will allow us to efficiently make use of Sylvester’s formula (5). Furthermore, before we proceed, we need to note that when taking into account Sylverster’s formula we directly get a cut-off on the rank of the generators $\{H_{i}\}_{i=1}^{58}$ which now are elements of $\operatorname{Herm}(\mathbb{C}^{5})$ . Similarly, this enforces $\ket{\Psi_{0}}\in\mathbb{CP}^{5}$ .

With $U_{i}(\phi_{i})=e^{-\mathrm{i}\phi_{i}H_{i}}$ as in Eq. (51), we have

$\displaystyle\prod_{i=0}^{57}U_{58-i}(\phi_{58-i})$	$\displaystyle=\prod_{i=0}^{57}\mathrm{e}^{-\mathrm{i}\phi_{58-i}H_{58-i}}$	(17)
	$\displaystyle=\mathrm{e}^{-\mathrm{i}\sum\limits_{i=1}^{58}\phi_{i}H_{i}}$	(18)
	$\displaystyle=\sum\limits_{j=0}^{4}\varkappa_{j}\left(\sum\limits_{i=1}^{58}\phi_{i}H_{i}\right)^{j}$	(19)
	$\displaystyle=\boldsymbol{c}\cdot\boldsymbol{b}(\boldsymbol{\phi})=\sum_{a=1}^{m}c_{a}b_{a}(\boldsymbol{\phi}).$	(20)

Here, we applied Sylvester’s Theorem 3 to obtain the third line, Eq. (19), with the expansion up to $4$ since $H_{i}\in\operatorname{Herm}(\mathbb{C}^{5})$ . The index $a$ is a multi-index that labels the elements of $\boldsymbol{c}$ and $\boldsymbol{b}$ according to the degree of the monomial in the elements of the variational parameter vector $\boldsymbol{\phi}$ , see for example Eq. (61) below. The monomial basis vector $\boldsymbol{b}(\boldsymbol{\phi})$ encodes the possible set of monomial terms in the VQA (i.e., $\phi_{1}^{2}\phi_{36}$ , etc.), and has dimension $m=\tfrac{62!}{58!4!}\sim\mathcal{O}(10^{5})$ (generically $\tfrac{(L+n-1)!}{L!(n-1)!}$ ). Each component of the $m$ -dimensional vector $\boldsymbol{c}$ is itself an element of $\operatorname{Herm}(\mathbb{C}^{5})$ and contains the coefficients that multiply each monomial term of $\boldsymbol{b}(\boldsymbol{\phi})$ , a product of the corresponding complex number $\varkappa_{j}$ with a certain monomial in the generators $\{H_{i}\}_{i=1}^{58}$ . Note that the set of complex numbers $\{\varkappa_{j}\}_{j=0}^{4}$ is not independent. They can be described as functions of the variational parameters and the eigenvalues of the Hamiltonians by solving for them the system

$\displaystyle\mathrm{e}^{-\mathrm{i}x_{1}}$	$\displaystyle=\sum_{l=0}^{4}\varkappa_{l}x_{1}^{l}$	(21)
$\displaystyle\mathrm{e}^{-\mathrm{i}x_{2}}$	$\displaystyle=\sum_{l=0}^{4}\varkappa_{l}x_{2}^{l}$
	$\displaystyle\vdots$
$\displaystyle\mathrm{e}^{-\mathrm{i}x_{5}}$	$\displaystyle=\sum_{l=0}^{4}\varkappa_{l}x_{5}^{l},\$

as in Eq. (11), where

\displaystyle x_{i}=\sum_{j=1}^{58}(\phi_{j}H_{j})_{ii},

(22)

that is, each $x_{i}$ is given as the sum of the eigenvalues of the $i$ -th row of all Hamiltonians.

In order to clearly understand the structure of the vectors $\boldsymbol{c}$ and $\boldsymbol{b}(\boldsymbol{\phi})$ we provide a simple example in the Supplementary Material, page D.

With this new description of $\prod_{i=0}^{57}U_{58-i}(\phi_{58-i})$ as the product $\boldsymbol{c}\cdot\boldsymbol{b}(\boldsymbol{\phi})$ we can insert the latter into the VQA objective (16), with the aim of reaching (15), and seek to enforce the corresponding equation,

$\displaystyle\left\\|O\boldsymbol{c}\cdot\boldsymbol{b}(\boldsymbol{\phi})\ket{\Psi_{0}}\right\\|^{2}$	$\displaystyle=\sum\limits_{j}q_{j}(\boldsymbol{\phi})^{2}$	(23)
	$\displaystyle=\sum_{j}c^{U}_{j}b^{U}_{j}(\boldsymbol{\phi})$
	$\displaystyle=\boldsymbol{c}^{U}\boldsymbol{b}^{U}(\boldsymbol{\phi}),$

where $\boldsymbol{c}^{U}\in\mathbb{Z}^{m^{2}}$ and $\boldsymbol{b}^{U}$ are the coefficients and corresponding monomial basis vectors of the sum-of-squares form of the UDE, i.e., the decomposition of Eq. (15) as $\boldsymbol{c}^{U}\boldsymbol{b}^{U}$ . This equation is equivalent to a set of equations on the coefficients, i.e.,

\displaystyle\braket{\Psi_{0}}{(O\boldsymbol{c})^{\dagger}O\boldsymbol{c}}{\Psi_{0}}_{j}=\boldsymbol{c}^{U}_{j}

(24)

where $j$ is selecting the corresponding monomial term of $(O\boldsymbol{c})^{\dagger}O\boldsymbol{c}$ of $\boldsymbol{b}^{U}$ . In other words, the subscript $j$ denotes projection onto the monomial basis term $b_{j}^{U}(\boldsymbol{\phi})$ . That is, we extract the coefficient of the $j$ -the monomial in the expansion of the squared amplitude.

Let $\boldsymbol{x}$ denote the vector of entries of the operator $O$ and let $\boldsymbol{w}$ denote the entries of $\ket{\Psi_{0}}$ . The system of equations (24) corresponds to $2m$ equations $\{g_{i}(\boldsymbol{x},\boldsymbol{w})=0\}_{i=1}^{m^{2}}$ to solve for.

Conjecture 4 (Solution of a Highly Structured System).

For the set of coefficients $\boldsymbol{c}^{U}_{j}$ associated with the Diophantine polynomial system with 58 variables, there exists at least one pair of $O$ and $\ket{\Psi_{0}}$ such that there exists at least one solution $\boldsymbol{c}^{U}$ to Sys. (24).

Essentially, system (24) corresponds to finding the zero-set $V(I)$ of a system of complex polynomial equations $\{g_{i}=0\}_{i=1}^{2m}$ where $I$ is the radical formed out of them. By Hilbert’s Nullstellensatz [25], a solution of a system of polynomials $\{g_{1}=0,\ldots,g_{2m}=0\}$ with complex coefficients in $l$ variables has no solution in $\mathbb{C}^{l}$ if and only if one can find a set of Bézout polynomials $\{g^{\prime}_{1}=0,\ldots,g^{\prime}_{k}=0\}$ such that

\sum_{i=1}^{k}g_{i}g^{\prime}_{i}=1,

(25)

i.e., we require that the set $\{g_{1},\ldots,g_{k}\}$ forms a proper ideal in $\mathbb{C}[x_{1},\ldots,x_{l}]$ . Using the technique of Sombra [26], an upper bound on the total degree of the $g^{\prime}_{i}$ can be computed by solving an exponentially large linear system. Alternatively, explicit solutions could be obtained by [27].

Theorem 5 (Undecidability of VQAs).

Assuming Conjecture 4 (“Solution of a Highly Structured System”) holds, the decision version of digitized VQA minimization (Problem 2) is undecidable for $L=58$ layers.

Theorem 5 then implies that there is no finite-time algorithm to solve the problem, that is,

Corollary 6 (Uncomputability of VQAs).

Assuming Conjecture 4 (“Solution of a Highly Structured System”) holds, there exists no recursive function to decide digitized VQA minimization in the decision version (Problem 2) for $L=58$ layers. Consequently, the optimization version is uncomputable for $L=58$ layers.

This result is conditional on the validity of the Conjecture 4. While we do not proceed to prove that this system indeed contains at least one solution, there are various reasons for this to be true. (See above.) Note that there are a number of Diophantine equations that we can consider, with increasing degree and variable dimension. From the standpoint of the degrees of freedom arithmetic described in the following, however, it intuitively appears to be the most agreeable equation to the formulated conjecture, and so we formulate it as such. Of course, the analogous conjecture for any polynomial Diophantine system would be similarly informative in regard to VQA.

Degrees of freedom. For each Hermitian matrix $H\in\operatorname{Herm}(\mathbb{C}^{5})$ , we have $5$ real degrees of freedom from the diagonals (which are real) and $2\times 10$ real degrees of freedom ( $2\times 5$ complex numbers) from the off-diagonals, summing to 25 real degrees of freedom. Any operator $O\in\operatorname{Herm}(\mathbb{C}^{5})$ must be described by the $5^{2}-1$ generators $\{\sigma_{i}\}_{i=1}^{25}$ of the corresponding Lie algebra as $O=\sum_{i=1}^{25}\alpha_{i}\sigma_{i}$ matching the number of parameters. The degrees of freedom of $\boldsymbol{c}$ is equivalent to the number of eigenvalues of the diagonal generators $\{H_{i}\}_{i=1}^{58}$ , that is $58\times 5=290$ . Once a combination is chosen, all elements of $\boldsymbol{c}$ are fixed. Finally, naively one might think that $\ket{\Psi_{0}}\in\mathbb{C}^{5}$ but given the trace condition a complex degree of freedom is removed, while the same holds for a global phase, thus one is left with 8 real degrees of freedom. This matches the dimension of $\mathbb{CP}^{5}$ where the set of states such as $\ket{\Psi_{0}}$ belongs to. We summarize the degrees of freedom for our system in Table 1.

Object	Space	Degrees of freedom (real)
$\ket{\Psi_{0}}$	$Q_{5}\coloneqq\mathbb{CP}^{5}$	8
$O$	$\operatorname{Herm}(\mathbb{C}^{5})$	25
$H_{i}$	$\operatorname{Herm}(\mathbb{C}^{5})$	5
$\{H_{i}\}$	$\operatorname{Herm}(\mathbb{C}^{5})$	$5\times 58=290$

Table 1: The space of (pure) states of 5-dits, denoted as

Q_{5}

is a proper subset of the

d^{2}-1

d=5

, dimensional Hilbert-Schmidt sphere (ball, for mixed states). The corresponding Bloch manifold for

\{\ket{\Psi_{0}}\}_{\rm pure}\cup\{\ket{\Psi_{0}}\}_{\rm mixed}

{\rm SU}(5)

. For

\{\ket{\Psi_{0}}\}_{\rm pure}

the number of degrees of freedom is

2d-2

, the real dimension of

Q_{5}

[28, Section 8]. Finally, for

c_{(i,j)}\in\operatorname{Herm}(\mathbb{C}^{5})

, we do not obtain new degrees of freedom on top of the degrees of freedom in

\boldsymbol{\phi}

and

\{H_{i}\}

An illustration of encoding a sum-of-squares polynomial into small VQA

As an illustration on how to encode a universal Diophantine equation polynomial into VQA let us consider a simple example of how a generic second-degree sum-of-squares polynomial can be written in the form (2). Consider the sum-of-squares polynomial

\displaystyle p(x,y)=\eta_{x}x^{2}+\eta_{y}y^{2}+\eta_{xy}x^{2}y^{2},

(26)

with coefficients $\eta_{x},\eta_{y},\eta_{xy}\in\mathbb{Z},\mathbb{R}$ or $\mathbb{C}$ that are given and fixed (i.e., not free). The individual terms $\{q_{j}\}$ of the UDE have as their highest total power 2. Thus, we consider that the state dimension is $3$ in order to generate the required powers in the use of Sylvester’s formula.

We wish to construct the corresponding:

\displaystyle\left\|OU_{x}(x)U_{y}(y)|\Psi_{0}\rangle\right\|^{2}.

(27)

Let $H_{x}=X,H_{y}=Y$ . Recall that

$\displaystyle U_{x}(x)U_{y}(y)$	$\displaystyle=\mathrm{e}^{-\mathrm{i}X}\mathrm{e}^{-\mathrm{i}Y}=\mathrm{e}^{-\mathrm{i}(X+Y)}$	(28)
	$\displaystyle=\sum\limits_{j=0}^{2}\varkappa_{j}(xX+yY)^{j}$	(29)
	$\displaystyle=c(0,0)+c(0,1)x+c(1,0)y$
	$\displaystyle\quad+c(1,1)xy+c(2,0)x^{2}+c(0,2)y^{2}$	(30)
	$\displaystyle=\boldsymbol{c}\cdot\boldsymbol{b}(x,y)$	(31)

where we have applied Sylvester’s Theorem 3 at the second equality, noting that $X,Y\in\operatorname{Herm}(\mathbb{C}^{3})$ and thus there are terms up to degree two in the polynomial appearing in its application. Here we require that $[X,Y]=0$ in order to proceed to the second equality of Eq. (28). Explicitly, we have

	$\displaystyle c(0,0)$	$\displaystyle=\varkappa_{0}\mathds{1}$
	$\displaystyle c(1,0)$	$\displaystyle=\varkappa_{1}X$
	$\displaystyle c(0,1)$	$\displaystyle=\varkappa_{1}Y$
	$\displaystyle c(2,0)$	$\displaystyle=\varkappa_{2}X^{2}$
	$\displaystyle c(0,2)$	$\displaystyle=\varkappa_{2}Y^{2}$
	$\displaystyle c(1,1)$	$\displaystyle=2\varkappa_{2}XY.$

Note the dot product of the monomial vector $\boldsymbol{b}(x,y)=\begin{pmatrix}1&x&y&xy&x^{2}&y^{2}\end{pmatrix}^{T}$ with the vector of coefficients $\boldsymbol{c}$ . Applying $O$ from the left, that is, in $OU_{x}(x)U_{y}(y)$ we obtain the modified polynomial,

$\displaystyle OU_{x}(x)U_{y}(y)$	$\displaystyle=Oc(2,0)x^{2}+Oc(0,2)y^{2}$	(32)
	$\displaystyle\quad+Oc(1,1)xy+Oc(0,0)$
	$\displaystyle\quad+Oc(1,0)x+Oc(0,1)y.\$

Thus, when we substitute this expression into the VQA objective we obtain,

		$\displaystyle\left\\|OU_{x}(x)U_{y}(y)\|\Psi_{0}\rangle\right\\|^{2}=\left\\|O(\boldsymbol{c}\cdot\boldsymbol{b}(x,y))\|\Psi_{0}\right\\|^{2}$		(33)
		$\displaystyle=\langle\Psi_{0}\|\sum_{\begin{subarray}{c}0\leq j^{1}_{1}+j^{2}_{1}\leq 2\\ 0\leq j^{1}_{2}+j^{2}_{2}\leq 2\end{subarray}}\tilde{O}_{j^{1}_{1},j^{2}_{1}}^{\dagger}\tilde{O}_{j^{1}_{2},j^{2}_{2}}x^{j^{1}_{1}+j^{1}_{2}}y^{j^{2}_{1}+j^{2}_{2}}\|\Psi_{0}\rangle$		(34)

where $\tilde{O}_{j^{1},j^{2}}=Oc(j^{1},j^{2})\in\operatorname{Herm}(\mathbb{C^{5}})$ , since, as we have seen, each $c(i,j)$ corresponds to a certain monomial in Hamiltonians $\{H_{i}\}$ . Expanding yields

$\displaystyle 0$	$\displaystyle=\braket{\Psi_{0}}{\tilde{O}_{0,0}^{\dagger}\tilde{O}_{0,0}}{\Psi_{0}}$	(35)
$\displaystyle 0$	$\displaystyle=\braket{\Psi_{0}}{\tilde{O}_{1,0}^{\dagger}\tilde{O}_{0,0}+\tilde{O}_{0,0}^{\dagger}\mathcal{O}_{1,0}}{\Psi_{0}}(=0\cdot y)$
$\displaystyle 0$	$\displaystyle=\braket{\Psi_{0}}{\tilde{O}_{0,1}^{\dagger}\tilde{O}_{0,0}+\tilde{O}_{0,0}^{\dagger}\tilde{O}_{0,1}}{\Psi_{0}}(=0\cdot x)$
$\displaystyle\eta_{xy}$	$\displaystyle=\braket{\Psi_{0}}{\tilde{O}_{0,0}^{\dagger}\tilde{O}_{1,1}+\tilde{O}_{0,1}^{\dagger}\tilde{O}_{1,0}+\tilde{O}_{1,0}^{\dagger}\tilde{O}_{0,1}}{\Psi_{0}}$
$\displaystyle\eta_{y}$	$\displaystyle=\braket{\Psi_{0}}{\tilde{O}_{2,0}^{\dagger}\tilde{O}_{0,0}+\tilde{O}_{0,0}^{\dagger}\tilde{O}_{2,0}+\tilde{O}_{1,0}^{\dagger}\tilde{O}_{1,0}}{\Psi_{0}}$
$\displaystyle\eta_{x}$	$\displaystyle=\braket{\Psi_{0}}{\tilde{O}_{0,2}^{\dagger}\tilde{O}_{0,0}+\tilde{O}_{0,0}^{\dagger}\tilde{O}_{0,2}+\tilde{O}_{0,1}^{\dagger}\tilde{O}_{0,1}}{\Psi_{0}}.$

If there exist $O,H_{1},H_{2}$ and $\ket{\Psi_{0}}$ such that (35) has a solution, then we can encode the problem of finding zeros of the polynomial as this VQA. But in fact, we have two degrees of freedom in $\ket{\Psi_{0}}$ and six degrees of freedom with $O$ . We potentially have at least two additional degrees of freedom with the choice of $H_{1}$ and $H_{2}$ , which, in turn, modify $\boldsymbol{c}$ . In Fig. 4 we provide a sample SAGE script to test whether the space we are interested in is empty or not.

Summary of the construction

We started from Eq. (27) and applied Sylvester’s formula (based on Caley-Hamilton theorem) to the product of unitaries $\prod_{i=0}^{57}U_{58-i}(\phi_{58-i})$ as to obtain $\boldsymbol{c}\cdot\boldsymbol{b}(\boldsymbol{\phi})$ . This allows us to reformulate the original product of unitaries, as usually understood in the VQAs, in terms of a system of complex equations which when enforced to produce the coefficients of the square of the UDE of interest, $D(58,4)$ , achieve the encoding of the VQA into the UDE provided at least one solution exists.

IV Translating the Main Result to QAOA

The reasoning above and most importantly Theorem 5 and Corollary 6 apply equally well to QAOA in the digitized version:

Problem 7 (QAOA minimization problem, [13]).

Instance: Two Hamiltonians $H_{b},H_{c}\in\operatorname{Herm}(\mathbb{C}^{n})$ and the number of layers $L$ in unary notation⁷⁷7This means that the length of the input scales linearly with $L$ ..
Optimization version: For a tunable state $\ket{\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})}\coloneqq U_{b}(\beta_{L})U_{c}(\gamma_{L})\cdots U_{b}(\beta_{1})U_{c}(\gamma_{1})\ket{\Psi_{0}}$ , where $\ket{\Psi_{0}}$ is the ground state of $H_{b}$ , $U_{b}(\beta)=\mathrm{e}^{-\mathrm{i}H_{b}\beta}$ and $U_{c}(\gamma)=\mathrm{e}^{-\mathrm{i}H_{c}\gamma}$ , find $\boldsymbol{\beta},\boldsymbol{\gamma}\in\mathbb{R}^{d}$ which minimize $\left<O(\boldsymbol{\beta},\boldsymbol{\gamma})\right>\coloneqq\left\langle\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right|H_{c}\left|\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right\rangle$ .
Decision version for $a\in\mathbb{R}$: For a tunable state $\ket{\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})}\coloneqq U_{b}(\beta_{L})U_{c}(\gamma_{L})\cdots U_{b}(\beta_{1})U_{c}(\gamma_{1})\ket{\Psi_{0}}$ , where $\ket{\Psi_{0}}$ is the ground state of $H_{b}$ , $U_{b}(\beta)=\mathrm{e}^{-\mathrm{i}H_{b}\beta}$ and $U_{c}(\gamma)=\mathrm{e}^{-\mathrm{i}H_{c}\gamma}$ , decide if there exists $\boldsymbol{\beta},\boldsymbol{\gamma}\in\mathbb{R}^{d}$ such that $\left<O(\boldsymbol{\beta},\boldsymbol{\gamma})\right>\coloneqq\left\langle\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right|H_{c}\left|\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right\rangle\leq a$ .

Theorem 8 (Undecidability of optimization in QAOA).

Decision version of QAOA minimization problem (Problem 7) is undecidable for $L=58$ layers.

Proof.

Consider $L$ unitary operators parametrized by $\boldsymbol{\gamma}\in\mathbb{R}^{L}$ acting on some prepared state $\ket{\Psi(\boldsymbol{\beta})}$ which is a function of $\boldsymbol{\beta}\in\mathbb{R}^{L}$ . Then for any $L$ we define the VQA ansatz with parameter $\boldsymbol{\gamma}$

\displaystyle\ket{\Psi(\boldsymbol{\gamma})}=\prod_{i=0}^{L-1}U_{c}(\gamma_{L-i})\ket{\tilde{\Psi}_{0}(\boldsymbol{\beta})},

(36)

where the VQA layers of unitaries act on a parameterized state

\displaystyle\ket{\tilde{\Psi}_{0}(\boldsymbol{\beta})}=\prod_{i=0}^{L-1}U_{b}(\beta_{L-i})\ket{\Psi_{0}},

(37)

and, as before, $U_{b}\eqqcolon{\rm e}^{-\mathrm{i}H_{b}}$ and $U_{c}\eqqcolon{\rm e}^{-\mathrm{i}H_{c}}$ . Assume $H_{b}$ is diagonal, as in Eq. (38) below or by diagonalizing the usual tensor product of $\sigma_{x}$ matrices as in [9], and $H_{c}$ is symmetric. Since $H_{b}$ and $H_{c}$ commute and we form the QAOA ansatz

\displaystyle\ket{\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})}

\displaystyle=\prod_{i=0}^{L-1}U_{c}(\gamma_{L-i})U_{b}(\beta_{L-i})\ket{\Psi_{0}}

and Theorem 5 applies directly. Notice that to show undecidability, it is sufficient to show the undecidability of this special case.

In general, QAOA utilizes the Ising model Hamiltonian which is proportional to $\otimes^{N}\sigma_{z}$ , and corresponds to a symmetric block matrix indeed. This means that $U_{b}(\beta_{i+1})U_{c}(\gamma_{i})=U_{c}(\gamma_{i})U_{b}(\beta_{i+1})$ . Then Theorem 5 applies.

Alternatively, one can use the reduction from single layer VQA to QAOA, which has been introduced by Bittel and Kliesch [13], and then invoke our Theorem 5 again. ∎

For the convenience of the reader, the reduction of Bittel and Kliesch [13] considers a Hilbert space $\mathcal{H}=\mathbb{C}^{2d+1}$ and considers a mixer Hamiltonian $H_{b}$ that takes the form

\displaystyle H_{b}=\begin{pmatrix}E_{1}&&&&&\\ &-E_{1}&&&&\\ &&\ddots&&&\\ &&&E_{d}&&\\ &&&&-E_{d}&\\ &&&&&1\ \end{pmatrix}

(38)

where $|E_{i}|<1$ for all $i\in[d]$ . For the optimization Hamiltonian $H_{c}$ (which amounts to the VQE Hamiltonian), consider

\displaystyle H_{c}

\displaystyle=O\oplus 0+\tau\left(\ket{+}_{2d}\bra{2d+1}+\ket{2d+1}\bra{+}_{2d}\right),

(39)

where $\tau\in\mathbb{R}$ is some constant to be fixed, $\ket{+_{2d}}=\sum_{j=1}^{2d}\ket{j}/\sqrt{2d}$ . The observable $O$ is defined as follows:

\displaystyle O_{i,j}=\begin{cases}O_{i,j}^{\prime}&i\neq j,\\ -\sum_{\alpha=1}^{2d}O_{\alpha,j}^{\prime}&i=j,\end{cases}

(40)

where $O^{\prime}:=\frac{d}{8}A\otimes\begin{pmatrix}1&1\\ 1&1\end{pmatrix}$ and $A\in\{0,1\}^{d\times d}$ . Note that $O\oplus 0$ refers to $O$ being embedded in the first $2d$ -dimensional summand of $\mathcal{H}$ . The ground state energy of the mixer Hamiltonian $H_{b}$ is $\lambda_{\min}(H_{b})=-1$ as deducted from Eq. (38) with the ground state being $\ket{2d+1}$ . Then, applying the optimization Hamiltonian and using the fact that $\ket{+_{2d}}$ is an eigenstate of $O\oplus 0$ yields

	$\displaystyle\ket{\Psi(\gamma)}$	$\displaystyle\coloneqq U_{c}(\gamma)\ket{2d+1}$		(41)
		$\displaystyle=\cos(\tau\gamma)\ket{2d+1}+\mathrm{i}\sin(\tau\gamma)\ket{+_{2d}}.\$		(41)

Overall, the variational state $\ket{\Psi(\beta,\gamma)}$ can be written as

	$\displaystyle\ket{\Psi(\beta,\gamma)}$	$\displaystyle=U_{b}(\beta)U_{c}(\gamma)\ket{2d+1}$
		$\displaystyle=\cos(\tau\gamma)\mathrm{e}^{\mathrm{i}\beta}\ket{2d+1}$
		$\displaystyle\quad+\mathrm{i}\sin(\tau\gamma)\tfrac{1}{\sqrt{2d}}\sum_{j=1}^{d}\mathrm{e}^{-\mathrm{i}E_{j}\beta}\ket{2j-1}+\mathrm{e}^{\mathrm{i}E_{j}\beta}\ket{2j}\,,$

which we use to derive an expression for the expectation value

	$\displaystyle\left<O(\beta,\gamma)\right>$	$\displaystyle=\bra{\Psi(\beta,\gamma)}H_{c}\ket{\Psi(\beta,\gamma)}$		(42)
		$\displaystyle=\sin^{2}(\tau\gamma)f(\beta)+2\tau\cos(\tau\gamma)\sin(\tau\gamma)g(\beta)$		(42)

with

	$\displaystyle f(\beta)$	$\displaystyle=\frac{1}{4}\sum_{i,j}(A_{i,j}(\cos(E_{i}\beta)\cos(E_{j}\beta)-1)\,,$		(43)
	$\displaystyle g(\beta)$	$\displaystyle=-\frac{\sin(\beta)}{d}\sum_{i=1}^{d}\cos(E_{j}\beta)\,.$		(44)

For $\tau\ll 1$ the contribution of $g$ becomes insignificant and $\gamma=\frac{\pi}{2\tau}$ minimizes the objective function as $f(\beta)\leq 0$ .

V Further results on QAOA for open quantum systems

In this Section, we remark on how the consideration of open quantum systems can influence undecideability, with respect to the conjecture presented. In open quantum systems, the system can be modeled as $U_{b}(\beta)=\mathrm{e}^{-\mathrm{i}H_{b}\beta}+\eta_{b}$ and $U_{c}(\gamma)=\mathrm{e}^{-\mathrm{i}H_{c}\gamma}+\eta_{c}$ , for which unitarity is not assured for all realizations of noise $\eta_{b},\eta_{c}\in\operatorname{Herm}(\mathbb{C}^{n})$ . For example, consider a level two system with $\eta_{b}$ given as:

\displaystyle\eta_{b}=\left(\begin{array}[]{c c}z&w\\ -\mathrm{e}^{\mathrm{i}(\theta+\epsilon)}w^{*}&\mathrm{e}^{\mathrm{i}\theta}z^{*}\end{array}\right),

(47)

where $z,w\in\mathbb{C}$ , ^∗ denotes complex conjugation and $\theta,\epsilon\in\mathbb{R}$ . Even for $0<\epsilon\ll 1$ , $\eta_{b}$ is nonunitary. One can impose $\epsilon$ to be a monotonically increasing function of time, $\epsilon(t)$ with initial condition $\epsilon(0)=0$ , such that for small $t$ the evolution can be written as:

\displaystyle|\Psi(t)\rangle=(\mathrm{e}^{-\mathrm{i}Ht}+\eta_{b}(t))|\boldsymbol{0}\rangle.

(48)

In the context of QAOA, the parameter $t$ can be substituted with the angle $\boldsymbol{\beta}$ . Notice that this is a perfectly reasonable assumption: QAOA is precisely motivated as a variational solver to perform on noisy quantum devices.

Another possibility is to consider controlled systems such as states that evolve as

\displaystyle\mathrm{i}\frac{d}{dt}|{\Psi(t)}\rangle=\left(H+\epsilon(t)\xi\right)|\boldsymbol{0}\rangle,\left.\quad|\Psi\right\rangle|_{t=0}=|\boldsymbol{0}\rangle.

This state evolves according to Hamiltonian $H\in{\rm Herm}(\mathbb{C}^{n})$ as well as a Hermitian operator $\xi\in{\rm Herm}(\mathbb{C}^{n})$ controlled by a time-dependent external field $\epsilon(t)$ . In this context, if $H$ is the target Hamiltonian $H_{c}$ , we see that the addition of the second summand $\epsilon(t)\xi$ makes the evolution operator nonunitary.

Clearly, when one does not have access to the samples of noise, it is impossible to say a priori what would be evolution of the state under, e.g., QAOA. Interestingly, however, even in the clairvoyant setting, where we knew the realization of the noise a priori, and even in the case where the distribution of the noise were singular, i.e., the realizations $\eta_{b},\eta_{c}\in\operatorname{Herm}(\mathbb{C}^{n})$ of noise were constant throughout time, possibly very small but sufficient to make the evolution non-unitary, it is impossible to decide whether the non-unitary evolution is convergent or not.

Problem 9 (QAOA convergence).

Instance: Two Hamiltonians $H_{b},H_{c}\in\operatorname{Herm}(\mathbb{C}^{n})$ and the number of layers $L$ in unary notation. Two realizations of noise associated with the two Hamiltonians $\eta_{b},\eta_{c}\in\operatorname{Herm}(\mathbb{C}^{n})$ , which assure that neither $U_{b}(\beta)=\mathrm{e}^{-\mathrm{i}H_{b}\beta}+\eta_{b}$ nor $U_{c}(\gamma)=\mathrm{e}^{-\mathrm{i}H_{c}\gamma}+\eta_{c}$ are unitary,
Decision version: For a tunable state $\ket{\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})}\coloneqq U_{b}(\beta_{L})U_{c}(\gamma_{L})\cdots U_{b}(\beta_{1})U_{c}(\gamma_{1})\ket{\Psi_{0}}$ , where $\ket{\Psi_{0}}$ is the ground state of $H_{b}$ , decide whether there exists a limit to the sequence produced by QAOA.

Theorem 10.

The QAOA convergence problem is undecidable.

Proof.

We can prove this by reduction from [29], which shows that the stability of a switched linear system is undecidable. ∎

In practice, the pulses that implement unitary gates in NISQ devices do so in a discrete and finite fashion. Therefore, we reformulate Problem (9) to a digitized version as follows.

Problem 11 (Digitized QAOA convergence).

Instance: Two Hamiltonians $H_{b},H_{c}\in\operatorname{Herm}(\mathbb{C}^{n})$ and the number of layers $L$ in unary notation. Parameters $\beta_{i},\gamma_{i}\in\mathbb{D}$ come from some finite, discrete set $\mathbb{D}$ of numbers representable by a single word in a digital computer.
Decision version: For a tunable state $\ket{\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})}\coloneqq U_{b}(\beta_{L})U_{c}(\gamma_{L})\cdots U_{b}(\beta_{1})U_{c}(\gamma_{1})\ket{\Psi_{0}}$ , where $\ket{\Psi_{0}}$ is the ground state of $H_{b}$ , $U_{b}(\beta)=\mathrm{e}^{-\mathrm{i}H_{b}\beta}$ and $U_{c}(\gamma)=\mathrm{e}^{-\mathrm{i}H_{c}\gamma}$ , decide whether there exists a limit to the sequence produced by QAOA.

Proposition 12.

Digitized QAOA convergence is undecidable.

Proof.

Since $\beta_{i},\gamma_{i}$ come from some finite, discrete ordered set, by setting $U_{i}=U_{b}(\beta_{i})U_{c}(\gamma_{i})$ , the vocabulary $\mathcal{U}_{(L)}=\{U_{1},\ldots,U_{L}\}$ is finite. The convergence of products of operators of the form $\Pi_{U}=U_{n}\ldots U_{1}$ can be determined by computing the joint spectral radius $\hat{\rho}(\mathcal{U})$ . This is an instance of Problem (22). It is known that convergence of $\Pi_{U}$ holds only if $\hat{\rho}(\mathcal{U}_{(L)})<1$ which is itself an instance of Problem (21). It was shown in [29], that the USR is undecidable implying that BMP is undecidable too. The proof is based on a reduction of the problem of probabilistic finite automata emptiness to USR and it amounts into showing that $\hat{\rho}(\mathcal{U}_{(L)})>1$ and $\hat{\rho}(\mathcal{U}_{(L)})\leq 1$ . The proof also covers QAOA. ∎

Proposition 13.

Assuming Conjecture 4 holds, there exists no recursive function to decide whether the digitized QAOA convergence problem has a solution.

Proposition 14.

Digitized QAOA is undecidable for the product of two matrices ( $L=1$ ).

Proof.

To ease the notation, we skip the argument of the unitaries since it is fixed. The set $\mathcal{U}_{(1)}=\{U_{b},U_{c}\}$ defined a dictionary out of which one can form words of length $L$ . The decidability problem for $\mathcal{U}_{(1)}$ can be deduced from the proof of Theorem (14): First, one defines the set of two $nL\times nL$ matrices $\tilde{\mathcal{U}}_{(1)}=\{U,T\}$ where $U=\mathrm{diag}(U_{1},\ldots,U_{L})$ and

\displaystyle T=\begin{pmatrix}0&\mathds{1}_{n(L-1)}\\ \mathds{1}_{n}&0\end{pmatrix},

(49)

where $\mathbb{I}_{n}$ is the $n\times n$ identity matrix. Then it is easy to verify that $\rho(\mathcal{U}_{(n)})\leq 1$ if and only if $\rho(\tilde{\mathcal{U}}_{(1)})\leq 1$ [29]. This is based on the proof of Theorem 23. ∎

VI Conclusions

We have shown that training VQA is undecidable, assuming a certain conjecture on the solvability of a certain structured system of equations in complex numbers.

The investigation has shown that the undecidability of VQA via universal Diophantine quations is not completely straightforward. Instead, the undecidability is related to the solvability of an overdetermined, but highly-structured complex-variate system of equations. Algorithmically, confirmation of the conjecture would confirm the broadly shaping consensus that VQA is rather “heuristic”, with behaviour hard to characterize and whose use across a wide enough class of possible instances cannot be fully reliable. While this may not be an issue in many applications of optimization in engineering and chemistry, it seems unlikely that applications in number theory [30] would be successful.

Note that affirmation of the conjecture does not rule out computability in specific instances, or even specific broad classes of Hamiltonians. Indeed, computability has been shown for large-girth regular graphs [31], projectors [32], and the Sherrington-Kirkpatrick model [33, 31].

Acknowledgements.

This work has been partially supported by project MIS 5154714 of the National Recovery and Resilience Plan Greece 2.0 funded by the European Union under the NextGenerationEU Program.

DISCLAIMER

This paper was prepared for information purposes and is not a product of HSBC Bank Plc. or its affiliates. Neither HSBC Bank Plc. nor any of its affiliates make any explicit or implied representation or warranty and none of them accept any liability in connection with this paper, including, but not limited to, the completeness, accuracy, reliability of information contained herein and the potential legal, compliance, tax or accounting effects thereof. Copyright HSBC Group 2025.

References

Preskill [2018] J. Preskill, Quantum computing in the nisq era and beyond, Quantum 2, 79 (2018).
Cerezo et al. [2021] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, and et al., Variational quantum algorithms, Nature Reviews Physics 3, 625–644 (2021).
Huang et al. [2023] H.-L. Huang, X.-Y. Xu, C. Guo, G. Tian, S.-J. Wei, X. Sun, W.-S. Bao, and G.-L. Long, Near-term quantum computing techniques: Variational quantum algorithms, error mitigation, circuit compilation, benchmarking and classical simulation, Science China Physics, Mechanics & Astronomy 66, 250302 (2023).
Egger et al. [2020] D. J. Egger, C. Gambella, J. Marecek, S. McFaddin, M. Mevissen, R. Raymond, A. Simonetto, S. Woerner, and E. Yndurain, Quantum computing for finance: state of the art and future prospects, IEEE Transactions on Quantum Engineering (2020).
Gilliam et al. [2021] A. Gilliam, S. Woerner, and C. Gonciulea, Grover adaptive search for constrained polynomial binary optimization, Quantum 5, 428 (2021).
Herman et al. [2022] D. Herman, C. Googin, X. Liu, A. Galda, I. Safro, Y. Sun, M. Pistoia, and Y. Alexeev, A survey of quantum computing for finance, arXiv preprint arXiv:2201.02773 (2022).
Intallura et al. [2023] P. Intallura, G. Korpas, S. Chakraborty, V. Kungurtsev, and J. Marecek, A survey of quantum alternatives to randomized algorithms: Monte carlo integration and beyond, arXiv preprint arXiv:2303.04945 (2023).
Peruzzo et al. [2014] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, A variational eigenvalue solver on a photonic quantum processor, Nature communications 5, 1 (2014).
Farhi et al. [2014] E. Farhi, J. Goldstone, and S. Gutmann, A quantum approximate optimization algorithm, arXiv preprint arXiv:1411.4028 (2014).
Schuld et al. [2019] M. Schuld, V. Bergholm, C. Gogolin, J. Izaac, and N. Killoran, Evaluating analytic gradients on quantum hardware, Physical Review A 99, 032331 (2019).
Byrd et al. [1995] R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, A limited memory algorithm for bound constrained optimization, SIAM Journal on scientific computing 16, 1190 (1995).
Cerezo and Coles [2020] M. Cerezo and P. J. Coles, Impact of barren plateaus on the hessian and higher order derivatives, arXiv preprint arXiv:2008.07454 (2020).
Bittel and Kliesch [2021] L. Bittel and M. Kliesch, Training variational quantum algorithms is np-hard, Physical Review Letters 127 (2021), 10.1103/physrevlett.127.120502.
Bravyi et al. [2021] S. Bravyi, D. Gosset, and R. Movassagh, Classical algorithms for quantum mean values, Nature Physics 17, 337 (2021).
Bravyi et al. [2020] S. Bravyi, A. Kliesch, R. Koenig, and E. Tang, Obstacles to variational quantum optimization from symmetry protection, Physical Review Letters 125, 260505 (2020).
Kungurtsev et al. [2022] V. Kungurtsev, G. Korpas, J. Marecek, and E. Y. Zhu, Iteration complexity of variational quantum algorithms, arXiv preprint arXiv:2209.10615 (2022).
Yuri Matiâsevič and Putnam [1993] M. D. Yuri Matiâsevič and H. Putnam, Hilbert’s tenth problem, Foundations of computing (MIT Press, 1993).
Jones [1982a] J. P. Jones, Universal diophantine equation, Journal of Symbolic Logic 47, 549–571 (1982a).
Sipser [2013] M. Sipser, Introduction to the Theory of Computation, 3rd ed. (Course Technology, Boston, MA, 2013).
Wolfram [2002] S. Wolfram, A new kind of science, Vol. 5 (Wolfram media, Champaign, IL, 2002).
Jones [1982b] J. P. Jones, Universal diophantine equation, Journal of Symbolic Logic 47, 549 (1982b).
Sylvester [1883] J. Sylvester, XXXIX. on the equation to the secular inequalities in the planetary theory, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 16, 267 (1883).
Gatto and Salehyan [2016] L. Gatto and P. Salehyan, Hasse-Schmidt derivations on Grassmann algebras (Springer, 2016).
Merzbacher [1968] E. Merzbacher, Matrix methods in quantum mechanics, American Journal of Physics 36, 814 (1968).
Cox et al. [2015] D. A. Cox, J. Little, and D. O’Shea, Ideals, Varieties, and Algorithms (Springer International Publishing, 2015).
Sombra [1999] M. Sombra, A sparse effective nullstellensatz, Advances in Applied Mathematics 22, 271 (1999).
Duchi and Ruan [2019] J. C. Duchi and F. Ruan, Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval, Information and Inference: A Journal of the IMA 8, 471 (2019).
Bengtsson and Zyczkowski [2006] I. Bengtsson and K. Zyczkowski, Geometry of Quantum States: An Introduction to Quantum Entanglement (Cambridge University Press, 2006).
Blondel and Tsitsiklis [2000a] V. D. Blondel and J. N. Tsitsiklis, The boundedness of all products of a pair of matrices is undecidable, Systems & Control Letters 41, 135 (2000a).
Yan et al. [2022] B. Yan, Z. Tan, S. Wei, H. Jiang, W. Wang, H. Wang, L. Luo, Q. Duan, Y. Liu, W. Shi, et al., Factoring integers with sublinear resources on a superconducting quantum processor, arXiv preprint arXiv:2212.12372 (2022).
Basso et al. [2021] J. Basso, E. Farhi, K. Marwaha, B. Villalonga, and L. Zhou, The quantum approximate optimization algorithm at high depth for maxcut on large-girth regular graphs and the sherrington-kirkpatrick model, arXiv preprint arXiv:2110.14206 (2021).
Zhou et al. [2020] L. Zhou, S.-T. Wang, S. Choi, H. Pichler, and M. D. Lukin, Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices, Physical Review X 10, 021067 (2020).
Farhi et al. [2019] E. Farhi, J. Goldstone, S. Gutmann, and L. Zhou, The quantum approximate optimization algorithm and the Sherrington-Kirkpatrick model at infinite size, arXiv preprint arXiv:1910.08187 (2019).
Albash and Lidar [2018] T. Albash and D. A. Lidar, Adiabatic quantum computation, Reviews of Modern Physics 90, 015002 (2018).
Blum and Rivest [1992] A. L. Blum and R. L. Rivest, Training a 3-node neural network is NP-complete, Neural Networks 5, 117 (1992).
Sima [2002] J. Sima, Training a single sigmoidal neuron is hard, Neural Computation 14, 2709 (2002).
Sälzer and Lange [2021] M. Sälzer and M. Lange, Reachability is NP-complete even for the simplest neural networks, in Lecture Notes in Computer Science (Springer International Publishing, 2021) pp. 149–164.
Svozil [1993] K. Svozil, Randomness & undecidability in physics (World Scientific, 1993).
da Costa and Doria [1991] N. C. da Costa and F. A. Doria, Undecidability and incompleteness in classical mechanics, International Journal of Theoretical Physics 30, 1041 (1991).
Lloyd [1993] S. Lloyd, Quantum-mechanical computers and uncomputability, Physical review letters 71, 943 (1993).
Lloyd [2017] S. Lloyd, Uncomputability and physical law, in The Incomputable (Springer, 2017) pp. 95–104.
Cubitt et al. [2015] T. S. Cubitt, D. Perez-Garcia, and M. M. Wolf, Undecidability of the spectral gap, Nature 528, 207 (2015).
Bausch et al. [2020] J. Bausch, T. S. Cubitt, A. Lucia, and D. Perez-Garcia, Undecidability of the spectral gap in one dimension, Physical Review X 10, 031038 (2020).
Bausch et al. [2021] J. Bausch, T. S. Cubitt, and J. D. Watson, Uncomputability of phase diagrams, Nature Communications 12, 1 (2021).
Smith [2006] W. D. Smith, Three counterexamples refuting kieu’s plan for “quantum adiabatic hypercomputation”; and some uncomputable quantum mechanical tasks, Applied Mathematics and Computation 178, 184 (2006).
Bondar and Pechen [2020] D. I. Bondar and A. N. Pechen, Uncomputability and complexity of quantum control, Scientific reports 10, 1 (2020).
Blondel and Tsitsiklis [2000b] V. D. Blondel and J. N. Tsitsiklis, A survey of computational complexity results in systems and control, Automatica 36, 1249 (2000b).
Daubechies and Lagarias [1992] I. Daubechies and J. C. Lagarias, Sets of matrices all infinite products of which converge, Linear algebra and its applications 161, 227 (1992).
Rota and Strang [1960] G.-C. Rota and W. Strang, A note on the joint spectral radius, Indag. Math. 22, 379–381 (1960).
Wang and Wen [2013] S. Wang and J. Wen, The finiteness conjecture for the joint spectral radius of a pair of matrices, (IEEE, 2013).
Berger and Wang [1992] M. A. Berger and Y. Wang, Bounded semigroups of matrices, Linear Algebra and its Applications 166, 21 (1992).
Matiâsevič [1970] Û. V. Matiâsevič, The diophantineness of enumerable sets, in Doklady Akademii Nauk, Vol. 191 (Russian Academy of Sciences, 1970) pp. 279–282.
Jones and Matiâsevič [1991] J. P. Jones and Y. V. Matiâsevič, Proof of recursive unsolvability of hilbert’s tenth problem, The American Mathematical Monthly 98, 689 (1991), https://doi.org/10.1080/00029890.1991.11995778.
Matiâsevič [1993] Û. V. Matiâsevič, Hilbert’s tenth problem (MIT press, 1993).
Hilbert [1902] D. Hilbert, Mathematical problems, Bulletin of the American Mathematical Society 8, 437 (1902).
Zhu [2006] W. Zhu, Unsolvability of some optimization problems, Applied Mathematics and Computation 174, 921 (2006).
Liberti [2019] L. Liberti, Undecidability and hardness in mixed-integer nonlinear programming, RAIRO-Operations Research 53, 81 (2019).

Appendix A Background and Related Work

A.1 Variational Quantum Algorithms

Variational Quantum Algorithms comprise a category of hybrid quantum-classical algorithms [2]. A classical optimization problem is translated into a physical quantum problem associated with a Hamiltonian $H$ . The optimization process involves minimizing the ground state energy corresponding to Hamiltonian $H$ . Once this mapping is established, execution on a quantum computer requires preparing an initial state $|\psi_{0}\rangle$ and allowing it to evolve according to $H$ through a parametrized quantum circuit, with parameter vector $\boldsymbol{\phi}\in\mathbb{R}^{L}$ , also referred to as the variational form. Consequently, the evolved state $|\psi(\boldsymbol{\phi})\rangle=U(\boldsymbol{\phi})|\psi_{0}\rangle$ is obtained.

Refer to caption — Figure 2: Schematic of the architecture of variational quantum algorithms. The classical optimizer undertakes the task to update the parameters $\boldsymbol{\phi}\to\boldsymbol{\phi}^{\prime}$ such that the cost function is minimized. The loop halts once optimal parameters have been found.

The unitary operator $U$ , that is being implemented by a set of gates in the parametrized quantum circuit, approximates ${\rm e}^{-{\rm i}H(\boldsymbol{\phi})t}$ within some error margin where $t$ denotes time and $H\in\operatorname{Herm}(\mathbb{C}^{n})$ the (parametrized) system Hamiltonian. The optimal values of the parameters are not known so a random initialization is due. Subsequently, expectation values of specific observables are measured and a cost function $C(\boldsymbol{\phi})$ is constructed that is fed into a classical optimizer that minimizes it. Once an optimal value is found, the classical computer updates the values for the parameters $\boldsymbol{\phi}$ into the quantum circuit and the procedure repeats.

QAOA is a variational algorithm whose objective is to approximately solve certain types of combinatorial optimization problems such as $\mathsf{MaxCut}$ and $\mathsf{3-SAT}$ problems. QAOA uses the adiabatic theorem which states that a slow enough transition between two Hamiltonians $H_{b}$ and $H_{c}$ guarantees that the ground state $|\psi_{b}\rangle$ of $H_{b}$ slowly transitions to the ground state $|\psi_{c}\rangle$ of $H_{c}$ as long as the Hamiltonians are gapped and level crossings are avoided [34].

Typically, the QAOA first creates an initial state $|+\rangle^{\otimes n}$ , by applying $n$ Hadamard gates to $|\boldsymbol{0}\rangle^{\otimes n}$ , that evolves according to the mixer Hamiltonian $H_{b}=\sum_{i=0}^{n-1}{X}_{i}$ . The quantum circuit applies successively $L$ layers of the unitaries $\prod_{n}\mathrm{e}^{-\mathrm{i}\beta_{n}{H}_{b}}\mathrm{e}^{-\mathrm{i}\gamma_{n}{H}_{c}}$ , $n=0,\ldots,L$ to create a trial state $|\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\rangle$ . The parameters $\boldsymbol{\beta},\boldsymbol{\gamma}$ are then fed to a classical optimizer so as to be updated for the next iteration of the algorithm.

VQAs in general, and QAOA specifically, needs to overcome several difficulties in order to perform reasonably efficient computations. One such difficulty is the challenge to find variational forms such that resulting cost function provides with optimal solutions $\hat{p}_{t}$ , at each iteration $t$ , that minimize the error $\varepsilon_{p}$ to the actual but unknown optimum $p^{*}$ of the optimization problem:

\displaystyle\varepsilon_{p}=|\hat{p}-p^{*}|.

(50)

However, this is heavily constrained by the hardware capabilities of NISQ devices including gate fidelities, decoherence times, circuit depth, etc. Once a solution $\hat{p}_{t}$ has been found, the classical optimizer needs to perform efficient search within the parameter landscape such that the estimation of the next iteration is closer to $p^{*}$ and the error converges to zero, $\lim_{r}\varepsilon_{p}^{(r)}\to 0$ , for finite $r\in\mathbb{Z}_{\geq 1}$ being the iteration number.

A.2 Complexity of Classical Subproblems in Variational Quantum Algorithms

There exists a multitude of formalizations for variational quantum algorithms. In this work we use the notation of [13], presented the following:

Problem 15 (VQA minimization, oracular formulation, paraphrasing [13]).

Instance

A set of generators $\{H_{i}\}_{i\in\{1,\dots,L\}}$ and an observable $O$ acting on $\mathcal{H}=(\mathbb{C}^{2})^{\otimes N}$ , given in terms of their Pauli basis representation.

Oracle access

We set $\ket{\Psi(\boldsymbol{\phi})}\coloneqq U_{L}(\phi_{L})\cdots U_{1}(\phi_{1})\ket{\boldsymbol{0}}$ with

\displaystyle U_{i}(\phi_{i})=\mathrm{e}^{-\mathrm{i}H_{i}\phi_{i}}.

(51)

The oracle $\mathcal{O}$ returns $\left<O(\boldsymbol{\phi})\right>\coloneqq\left\langle\Psi(\boldsymbol{\phi})\right|O\left|\Psi(\boldsymbol{\phi})\right\rangle$ , given $\boldsymbol{\phi}$ , up to any desired polynomial additive error.

Optimization version

Find $\boldsymbol{\phi}\in\mathbb{R}^{L}$ that minimizes $\left<O(\boldsymbol{\phi})\right>$ provided access to $\mathcal{O}$ .

Decision version for $a\in\mathbb{R}$

Decide whether there exists $\boldsymbol{\phi}\in\mathbb{R}^{L}$ whose value $\left<O(\boldsymbol{\phi})\right>\leq a$ provided access to $\mathcal{O}$ .

Theorem 16 (Hardness of VQA optimization, decision version with the oracular formulation, paraphrasing [13]).

Assuming ${\mathsf{P}}\neq{\mathsf{NP}}$ there is no deterministic classical algorithm that solves the decision version of Problem 15 in polynomial time.

Alternatively, with computational basis of the Hilbert space $\mathcal{H}$ of dimension $\dim(\mathcal{H})=2^{N}\eqqcolon n$ we formulate the following problem:

Problem 17 (VQA minimization problem, paraphrasing [13] ).

Instance: An initial state $\ket{\Psi_{0}}\in\mathbb{C}^{n}$ , a set of generators $\{H_{i}\}_{i\in\{1,\dots,L\}}\in\operatorname{Herm}(\mathbb{C}^{n})$ , where $L$ is the number of layers and an observable $O\in\operatorname{Herm}(\mathbb{C}^{n})$ .
Optimization version: For $\ket{\Psi(\boldsymbol{\phi})}\coloneqq U_{L}(\phi_{L})\cdots U_{1}(\phi_{1})\ket{\Psi_{0}}$ with $U_{i}(\phi_{i})=\mathrm{e}^{-\mathrm{i}H_{i}\phi_{i}}$ , find $\boldsymbol{\phi}\in\mathbb{R}^{L}$ that minimizes $\left<O(\boldsymbol{\phi})\right>\coloneqq~{}\left\langle\Psi(\boldsymbol{\phi})\right|O\left|\Psi(\boldsymbol{\phi})\right\rangle$ .
Decision version for $a\in\mathbb{R}$: For $\ket{\Psi({\boldsymbol{\phi}})}\coloneqq U_{L}(\phi_{L})\cdots U_{1}(\phi_{1})\ket{\Psi_{0}}$ with $U_{i}(\phi_{i})~{}=~{}\mathrm{e}^{-\mathrm{i}H_{i}\phi_{i}}$ , determine if there exists $\boldsymbol{\phi}\in\mathbb{R}^{L}$ for which $\left\langle\Psi(\boldsymbol{\phi})\right|O\left|\Psi(\boldsymbol{\phi})\right\rangle\leq a$ .

Theorem 18 (Hardness of VQA optimization, paraphrasing [13]).

Optimization version of VQA minimization problem (Problem 1) is $\mathsf{NP}$ -hard. Decision version of VQA minimization problem (Problem 1) is $\mathsf{NP}$ -hard for $L=1$ layer.

This result implies that even when simplifying the problem to a single layer, the decision version remains computationally difficult. VQA minimization involves finding a global minimum of a multimodal objective landscape, a $\mathsf{NP}$ -hard problem. The similarity lies in that their objective functions exhibit non-linear and non-convex properties, leading to numerous local minima and complicating the search for global optima. Famously, Blum and Rivest [35] showed that training a 3-neuron multi-layer perceptron, which has a similar landscape, is $\mathsf{NP}$ -Complete. (See [36] for relevant results and [37] for some recent work on the topic from the neural network reachability point of view.)

Next we proceed to discuss analogous results about QAOA, the algorithm we will dominantly use for the main result of this paper in Sec. III.

Problem 19 (QAOA minimization problem, [13]).

Instance: Two Hamiltonians $H_{b},H_{c}\in\operatorname{Herm}(\mathbb{C}^{n})$ and the number of layers $L$ in unary notation⁸⁸8This means that the length of the input scales linearly with $L$ ..
Optimization version: For a tunable state $\ket{\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})}\coloneqq U_{b}(\beta_{L})U_{c}(\gamma_{L})\cdots U_{b}(\beta_{1})U_{c}(\gamma_{1})\ket{\Psi_{0}}$ , where $\ket{\Psi_{0}}$ is the ground state of $H_{b}$ , $U_{b}(\beta)=\mathrm{e}^{-\mathrm{i}H_{b}\beta}$ and $U_{c}(\gamma)=\mathrm{e}^{-\mathrm{i}H_{c}\gamma}$ , find $\boldsymbol{\beta},\boldsymbol{\gamma}\in\mathbb{R}^{d}$ which minimize $\left<O(\boldsymbol{\beta},\boldsymbol{\gamma})\right>\coloneqq\left\langle\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right|H_{c}\left|\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right\rangle$ .
Decision version for $a\in\mathbb{R}$: For a tunable state $\ket{\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})}\coloneqq U_{b}(\beta_{L})U_{c}(\gamma_{L})\cdots U_{b}(\beta_{1})U_{c}(\gamma_{1})\ket{\Psi_{0}}$ , where $\ket{\Psi_{0}}$ is the ground state of $H_{b}$ , $U_{b}(\beta)=\mathrm{e}^{-\mathrm{i}H_{b}\beta}$ and $U_{c}(\gamma)=\mathrm{e}^{-\mathrm{i}H_{c}\gamma}$ , decide if there exists $\boldsymbol{\beta},\boldsymbol{\gamma}\in\mathbb{R}^{d}$ such that $\left<O(\boldsymbol{\beta},\boldsymbol{\gamma})\right>\coloneqq\left\langle\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right|H_{c}\left|\Psi(\boldsymbol{\beta},\boldsymbol{\gamma})\right\rangle\leq a$ .

Theorem 20 (Hardness of optimization in QAOA, [13]).

Decision version of QAOA minimization problem (Problem 7) is $\mathsf{NP}$ -hard for $L=1$ layer.

Proof.

To show this, we show a reduction from single layer VQA to QAOA, which implies that Problem 7 is $\mathsf{NP}$ -hard. ∎

The proof of Theorem 20 is based on the proof that VQA is $\mathsf{NP}$ -hard even for $L=1$ . In the subsequent section, we show that $\mathsf{NP}$ -hardness is a very conservative estimate. In particular, we show that certain problems associated to VQAs and QAOAs are undecidable.

A.3 Undecidability in Quantum Computing

Undecidability is a much stronger notion than $\mathsf{NP}$ -hardness. Instead of proving a conditional statement on the non-existence of a polynomial-time algorithm for obtaining a solution, e.g. Proposition 16, undecidability is an unconditional statement on the non-existence of a finite-time algorithm for indicating whether a solution satisfying a certain objective even exists.

While there is a long history of work on undecidability in physics [38], starting with the influential paper [39] on the undecidability in classical mechanics, only a handful of undecidability results are known in quantum information theory and quantum computing. Following the pioneering work of Lloyd [40, 41], Cubitt et al. [42] have shown the undecidability of the spectral gap, even in one dimension [43], and uncomputability [44] of phase diagrams. Smith [45] has shown that no fully general form of the quantum adiabatic theorem can exist that yields computable upper bounds on adiabatic convergence times. Bondar and Pechen [46] have shown uncomputability of a digitized quantum optimal control.

A.4 Undecidability in Control Theory

There is, however, a much longer tradition of the study of undecidability and uncomputability in control theory [47]. Using the notion of the generalized spectral radius [48], [29] show that the convergence of products of operators of the form of $U_{L}(\phi_{L})\ldots U_{1}(\phi_{1})$ is undecidable.

The joint spectral radius [49] of a set of matrices is a measure of the maximal asymptotic growth rate that can be obtained by forming long products of matrices taken from a set $\mathcal{A}=\{A_{1},\ldots,A_{n}\}$ of real $n\times n$ matrices.

Consider products $A_{(t)}=A_{t}\ldots A_{1}$ . For example, if $\mathcal{A}=\{A_{1},A_{2}\}$ and $t=2$ ,

\displaystyle A_{(2)}=\begin{cases}A_{1}^{2}\\ A_{2}^{2}\\ A_{1}A_{2}\\ A_{2}A_{1}\ \end{cases}.

(52)

The spectral radius of a square matrix $A\in\mathcal{A}$ is defined as:

\displaystyle\rho(A)=\mathrm{max}\{|\lambda|\text{ such that }\lambda\text{ is an eigenvalue of }A\}.

(53)

A natural generalization of the spectral radius that serves as a measure of growth ( $t\to\infty$ ) of matrix products $A_{(t)}$ is the joint spectral radius:

\displaystyle\hat{\rho}(\mathcal{A})

\displaystyle=\lim_{t\to\infty}\sup\hat{\rho}_{t}(\mathcal{A}),

(54)

where for some matrix norm $\lVert\cdot\rVert$ we define

\displaystyle\hat{\rho}_{t}(\mathcal{A})=\max_{A_{i}\in\mathcal{A}}\lVert A_{t}\ldots A_{1}\rVert^{1/t}.

(55)

A very similar concept is that of the generalized spectral radius $\rho(\mathcal{A})$ is defined as:

\displaystyle\rho(\mathcal{A})=\lim_{t\to\infty}\sup\rho_{t}(A_{t}\ldots A_{1}),

(56)

where

\displaystyle\rho_{t}(\mathcal{A})=\max_{A_{i}\in\mathcal{A}}\rho(A_{t}\ldots A_{1})^{1/t}.

(57)

It is known that $\rho_{t}(\mathcal{A})\leq\rho(\mathcal{A})$ [50], while for any finite set of matrices $\mathcal{A}$ it holds that [51]

\displaystyle\hat{\rho}(\mathcal{A})=\rho(\mathcal{A}).

(58)

Problem 21 (Unit Spectral Radius (USR)).

Input: A finite set $\mathcal{A}$ of $n\times n$ $\mathbb{Q}$ -valued matrices.
Decision version for $a=1$: Is the joint spectral radius $\rho(\mathcal{A})\leq a$ ?

Problem 22 (Bounded Matrix Products (BMP)).

Input: A finite set $\mathcal{A}$ of $n\times n$ $\mathbb{Q}$ -valued matrices.
Decision version: Is the set of all matrix products bounded?

Attacking Problem (21) and Problem (22) amounts to the asymptotic computation of the joint spectral radius for $\mathcal{A}$ . If it is found to be less than one then the problem is decidable and the corresponding (switched) linear system is stable [29].

Theorem 23 (Blondel and Tsitsiklis, [29]).

The problems BMPs and USR are undecidable. They remain undecidable even in the special case where $\mathcal{A}$ consists of only two matrices.

A.5 Undecidability in Optimization

Optimization without any assumptions on the objective function is undecidable. This builds upon the work of Matiâsevič from the 1970 [52, 53, 54], which has resolved Hilbert’s tenth problem [55]. Hilbert’s tenth problem [55] asked for a general algorithm which, for any polynomial equation with integer coefficients and a finite number of unknowns that can decide whether the equation has a solution with all unknowns taking integer values. That is, given $p\in\mathbb{Z}[x]$ with $x\in\mathbb{R}^{d}$ , decide if $p(x)=0$ has integer solutions. Since Hilbert’s problem is equivalent to the decision problem of whether the real-valued optimization problem,

\displaystyle\min_{x\in\mathbb{R}^{d}}\,p(x)+\sum\limits_{i=1}^{d}\sin^{2}(\pi x_{i})

(59)

has a zero-valued optimum solution, where $p(x)$ is the original polynomial ⁹⁹9 Cf. [39, Definition 3.10] or [56, Proof of Theorem 3] or a modern survey in [57]., this can be extended to optimization without the integrality of the variables:

Theorem 24 (Unconstrained continuous global minimization is undecidable, paraphrasing [56]).

Given a continuously differentiable function $f(x_{1},\ldots,x_{L})$ , unconstrained continuous global minimization seeks to find a global minimal solution of $minf$ . The associated decision problem is: Given $L,f$ , a constant $B$ , does there exist $x_{i},i=1,2...,L$ such that $f(x_{1},\ldots,x_{L})\leq B$ ? There exists no recursive function to decide this decision problem.

We refer to [57] for a nice survey, which fixes some issues with the proofs of [56].

Appendix B A Table of Notation

Variable	Description	Space	Equation(s)
$U_{i}(\phi_{i})$	parametrized unitaries	$U(n)$	(51),(16)
$O$	VQA operator	$\operatorname{Herm}(\mathbb{C}^{n})$	(1),(16)
$\boldsymbol{\phi}$	variational parameters	$\mathbb{R}^{58}$	(51),(16)
$\ket{\Psi_{0}}$	initial state	$\mathcal{H}$	(16)
$q_{j}(\phi)$	summand of SOS polynomial	$\mathbb{R}[\boldsymbol{\phi}]$	(15)
$\varkappa_{j}$	Sylvester coefficients	$\mathbb{C}$	(5), (20)
$c_{j}$	monomial choosing vector	$\operatorname{Herm}(\mathbb{C}^{n})$	(20)
$b_{j}(\boldsymbol{\phi})$	monomial vector	$\mathbb{R}[\boldsymbol{\phi}]$	(20)
$g_{i}$	complex polynomial	$\mathbb{C}[\boldsymbol{x},\boldsymbol{w}]$	(24)
System	Parameters	Variables	Equation
UDE	$u,x,y,z$	$a,b,c,\ldots,\tau,\varphi$	(60)
System	# Variables	# DOF	Equation
Initial			(16)
New			(24)

Table 2: An overview of the notation.

Appendix C An Example of a UDE

	$\displaystyle(elg^{2}+\alpha-(b-xy)q^{2})^{2}+$		(60)
	$\displaystyle(q-b^{5^{60}})^{2}+$
	$\displaystyle(\lambda+q^{4}-1-\lambda b^{5})^{2}+$
	$\displaystyle(\theta+2z-b^{5})^{2}+$
	$\displaystyle(u+t\theta-l)^{2}+$
	$\displaystyle(y+m\theta-e)^{2}+$
	$\displaystyle(q^{16}-n)^{2}+$
	$\displaystyle\left([g+eq^{3}+lq^{5}+(2(e-z\lambda)(1+xb^{5}+g)^{4}+\quad\right.$
	$\displaystyle\lambda b^{5}+\lambda b^{5}q^{4})q^{4}][n^{2}-n]+$
	$\displaystyle\left.[q^{3}-bl+1+\theta\lambda q^{3}+(b^{5}-2)q^{5}][n^{2}-1]-r\right)^{2}+$
	$\displaystyle(2ws^{2}r^{2}n^{2}-p)^{2}+$
	$\displaystyle(p^{2}k^{2}-k^{2}+1-\tau^{2})^{2}+$
	$\displaystyle(4(c-ksn^{2})^{2}+\nu-k^{2})^{2}+$
	$\displaystyle(r+1+hp-h-k)^{2}+$
	$\displaystyle((wn^{2}+1)rsn^{2}-a)^{2}+$
	$\displaystyle(2r+1+\varphi-c)^{2}+$
	$\displaystyle(bw+ca-2c+4a\gamma-5\gamma-d)^{2}+$
	$\displaystyle((a^{2}-1)c^{2}+1-d^{2})^{2}+$
	$\displaystyle((a^{2}-1)i^{2}c^{4}+1-f^{2})^{2}+$
	$\displaystyle(((a+f^{2}(d^{2}-a))^{2}-1)(2r+1+jc)^{2}+1-(d+of)^{2})^{2}$

Figure 3: Universal Diophantine equation

D(28,2\cdot 5^{60})

based on the work of [21, 17], where

a,b,c,\ldots,s,t,w,\alpha,\gamma,\nu,\theta,\lambda,\tau,\varphi

are variables and

u,x,y,z

are parameters that encode the Turing machine. Notice that the Universal Diophantine equation is a sum-of-squares polynomial, which is compatible with (2).

Appendix D Further clarifications and a simple example

In order to assist in understanding the form of the equation system, let us consider how a potential two-layer example, instead of the 58-layer case, which is our target, corresponds to an encoding of a degree-8 polynomial. We shall see the precise form of the vectors $\boldsymbol{c}$ and $\boldsymbol{b}$ . With $H\in\operatorname{Herm}(\mathbb{C}^{5})$ , such that the expansion runs for $0\leq j\leq 4$ , and utilizing Sylvester’s formula, we obtain the following:

$\displaystyle c_{(0,0)}$	$\displaystyle=\varkappa_{0}\mathds{1}$	(61)
$\displaystyle c_{(1,0)}$	$\displaystyle=\varkappa_{1}H_{1}$
$\displaystyle c_{(0,1)}$	$\displaystyle=\varkappa_{1}H_{2}$
$\displaystyle c_{(2,0)}$	$\displaystyle=\varkappa_{2}H_{1}^{2}$
$\displaystyle c_{(0,2)}$	$\displaystyle=\varkappa_{2}H_{2}^{2}$
$\displaystyle c_{(1,1)}$	$\displaystyle=2\varkappa_{2}H_{1}H_{2}$
$\displaystyle c_{(3,0)}$	$\displaystyle=\varkappa_{3}H_{1}^{3}$
$\displaystyle c_{(0,3)}$	$\displaystyle=\varkappa_{3}H_{2}^{3}$
$\displaystyle c_{(2,1)}$	$\displaystyle=3\varkappa_{3}H_{1}^{2}H_{2}$
$\displaystyle c_{(1,2)}$	$\displaystyle=3\varkappa_{3}H_{1}H_{2}^{2}$
$\displaystyle c_{(4,0)}$	$\displaystyle=\varkappa_{4}H_{1}^{4}$
$\displaystyle c_{(0,4)}$	$\displaystyle=\varkappa_{4}H_{2}^{4}$
$\displaystyle c_{(3,1)}$	$\displaystyle=4\varkappa_{4}H_{1}^{3}H_{2}$
$\displaystyle c_{(1,3)}$	$\displaystyle=4\varkappa_{4}H_{1}H_{2}^{3}$
$\displaystyle c_{(2,2)}$	$\displaystyle=6\varkappa_{4}H_{1}^{2}H_{2}^{2},\$

where, for simplicity, we absorbed the complex unit into $\{H_{i}\}$ . All dependencies on $\boldsymbol{\phi}$ are encoded in $\boldsymbol{b}(\boldsymbol{\phi})$ . Furthermore, observe that for each monomial degree $i=j+k$ , for $c_{(j,k)}$ , only the coefficient $\varkappa_{i}$ is relevant. From this example, it becomes clear that the indexing of $\boldsymbol{c}$ directly reveals the form of the corresponding monomial on the generators $\{H_{i}\}_{i=1}^{58}$ . Furthermore, all $c(j,k)$ contain equal degrees of freedom since they depend, thorough $\varkappa_{j+k}$ , on all of the variational parameters,

\displaystyle c_{(j,k)}=c_{(j,k)}(\{\phi_{i}\},\{H_{58}\}).

(62)

Appendix E An SAGE Script

var(’x,y,a1,a2,a4,b1,b2,c1,c2,d1,d2,a,b’)
g1(x,y)=a1*x^2  + a2*y^2 - a
g2(x,y)=b1*x^2  + b2*y^2 - b
g3(x,y)=c1*x^2  + c2*y^2
g4(x,y)=d1*x^2  + d2*y^2
R.<x,y,a1,a2,b1,b2,c1,c2,d1,d2> =
   CC[x,y,a1,a2,b1,b2,c1,c2,c3,d1,d2]
I = Ideal(g1, g2, g3, g4)
S = R.quotient_Rig(I)
X = SpecS
S.is_empty()

Figure 4: A SAGE script for testing the non-emptiness of the moduli space of quadrics over

\mathds{CP}^{4}