
Quantum reservoir computing using arrays of Rydberg atoms

Rodrigo Araiza Bravo¹, Khadijeh Najafi¹﹐², Xun Gao¹, Susanne F. Yelin¹
¹Department of Physics, Harvard University, Cambridge, Massachusetts 02138, USA
²IBM Quantum, IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
Abstract

Quantum computing promises to speed up machine learning algorithms. However, noisy intermediate-scale quantum (NISQ) devices pose engineering challenges to realizing quantum machine learning (QML) advantages. Recently, a series of QML computational models inspired by the noise-tolerant dynamics of the brain has emerged as a means to circumvent the hardware limitations of NISQ devices. In this article, we introduce a quantum version of a recurrent neural network (RNN), a well-known model for neural circuits in the brain. Our quantum RNN (qRNN) makes use of the natural Hamiltonian dynamics of an ensemble of interacting spin-1/2 particles as a means for computation. In the limit where the Hamiltonian is diagonal, the qRNN recovers the dynamics of the classical version. Beyond this limit, we observe that the quantum dynamics of the qRNN endow it with quantum computational features that can aid it in computation. To this end, we study a fixed-geometry qRNN, i.e., a quantum reservoir computer, based on arrays of Rydberg atoms, and show that the Rydberg reservoir is indeed capable of replicating the learning of several cognitive tasks such as multitasking, decision-making, and long-term memory by taking advantage of several key features of this platform, such as interactions between different atomic species and quantum many-body scars.


I Introduction

Quantum computing promises to enhance machine learning algorithms. However, implementing these advantages often relies on either fault-tolerant quantum computers not yet available [1, 2, 3, 4, 5], or on decoherence-limited, variational quantum circuits which may experience training bottlenecks [6, 7]. Thus, currently available noisy intermediate-scale quantum (NISQ) devices thwart quantum advantages in machine learning algorithms.

Recently, to counteract these challenges, several quantum machine learning architectures have emerged that are inspired by models for computation in the brain [8, 9, 10]. These brain-inspired algorithms are motivated by the inherent robustness of brain-like computation to input and hardware noise, and by the possibility of using the analogue dynamics of controllable, many-body quantum systems for computation without relying on a digital circuit architecture. Broadly speaking, these brain-inspired algorithms can be put into two categories. The first encompasses systems quantizing the dynamics of biological computational models at the single-neuron level, so that the dynamics of single qubits or groups of qubits resemble those of neurons in a neural circuit of interest. Examples include quantum memristors [11], which are electrical circuits with a history-dependent resistance, quantum versions of the biologically realistic Hodgkin-Huxley model for single neurons [12, 13], and unitary adiabatic quantum perceptrons [14].

The second category of brain-inspired algorithms relies on a macroscopic resemblance between many-body quantum systems and neural circuits. In this regard, the algorithms that have received the most attention are quantum reservoir computers. Quantum reservoir computers use ensembles of quantum emitters with fixed interactions to perform versatile machine learning tasks, relying on the complexity of the unitary evolution of the system. Since these systems can couple to both classical and quantum devices, which may encode the tasks' inputs, quantum reservoirs have been used for time-series prediction [15, 16, 17], entanglement measurement [18, 19], quantum state preparation [20], continuous-variable computation [21], which can be made universal [22], reducing the depth of quantum circuits [23], finding ground states [24], and long-term memory employing ergodicity-breaking dynamics [25, 26, 27]. See [10] for a comprehensive review of quantum reservoir computing.

In both categories, however, a thorough understanding of the potential computational advantages and their origins is only slowly emerging. In this article, we contribute to this direction by proposing a quantum extension of a well-known neural circuit model called recurrent neural networks (RNNs), of which reservoir computers are a special case [28]. Our extension uses the Hamiltonian dynamics of ensembles of two-level systems. In the limit where the Hamiltonian is diagonal, we recover the classical single-neuron dynamics, naturally encoding RNNs into quantum hardware. Recently, another natural encoding of a reservoir computer was proposed using superconducting qubits [29]. In our case, the general dynamics of the quantum RNN (qRNN) present several new features that can aid in the computation of both classical and quantum tasks. In particular, a qRNN used for simulating stochastic dynamics can exhibit speedups compared to classical RNNs.

To show that our scheme is experimentally realizable, we propose that arrays of Rydberg atoms can be used as qRNNs (Sec. IV). Although our Rydberg qRNNs have restricted connectivity, we are motivated to use Rydberg arrays by recent studies with equally restricted qRNNs, which show significant computational capacity when driven near criticality [17, 24]. Moreover, recent experiments using optical tweezers [30, 31, 32, 33, 34, 35, 36, 37] have catapulted the community's interest in Rydberg arrays, as they exhibit long coherence times, controllable and scalable geometries, and increasing levels of single-atom control [38]. Additionally, Rydberg arrays can be used for novel, programmable quantum simulations and universal computation [39, 30, 40, 41, 42, 43].

We numerically implement fixed-geometry Rydberg qRNNs, i.e., Rydberg reservoir computers, and successfully perform cognitive tasks even when only a few atoms are available (Sec. V). The success of these tasks is explained by the physics of Rydberg atoms. For example, our Rydberg qRNNs excel at learning to multitask since they can naturally encode RNNs with inhibitory and excitatory neurons, which are vital for many cognitive tasks [44]. This encoding relies on the different types of interactions between Rydberg atoms with different principal quantum numbers [45]. Likewise, a Rydberg qRNN exhibits long-term memory due to the weak-ergodicity-breaking dynamics of quantum many-body scars [35, 46, 47]. Lastly, we discuss possible further research directions in Sec. VI.

We remark that the notion of a qRNN has previously been coined for schemes relying on universal quantum circuits and using measurements to implement the nonlinear dynamics of an RNN [48]. Instead, what we define as a “quantum RNN” leverages the inherent unitary dynamics of ensembles of two-level systems to compute, deviating from the digital quantum circuit model for computation.

II Classical recurrent neural networks

We begin by reviewing an archetypal RNN consisting of $N$ binary neurons. Each neuron is in one of two possible states $s_n(t)\in\{-1,1\}$ and is updated from time-step $t$ to $t+1$ following the update rule

$$s_{n}(t+1)=\text{sign}\left(h_{n}(t)\,s_{n}(t)\right),\qquad h_{n}(t)\equiv-\Delta_{n}(t)+\sum_{m}J_{nm}s_{m}(t),\qquad(1)$$

where $J_{nm}=J_{mn}$ are symmetric synaptic connections between neurons $n$ and $m$. The time-dependent biases $\Delta_n(t)$ encode the RNN's inputs. To avoid memorization during a learning task with inputs $u_n^{\text{task}}(t)$, the RNN receives Gaussian-whitened inputs

$$\Delta_{n}(t)=u_{n}^{\text{task}}(t)+\xi_{n},\qquad(2)$$

where $\xi_n$ is a zero-mean Gaussian random variable with variance $\sigma_{in}^{2}$, making the evolution of the RNN stochastic. In RNNs, the value of $\sigma_{in}^{2}$ is proportional to the value of the tasks' inputs $u_n^{\text{task}}$.
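As a concrete illustration, the following minimal sketch iterates the update rule (II) with Gaussian-whitened inputs; the random symmetric $J_{nm}$, the zero placeholder input `u_task`, and all parameter values are illustrative choices of ours, not those of any experiment in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, sigma_in = 10, 50, 0.1          # neurons, time steps, input-noise std

J = rng.normal(size=(N, N)) / np.sqrt(N)
J = (J + J.T) / 2                     # symmetric synapses, J_nm = J_mn
np.fill_diagonal(J, 0.0)

s = rng.choice([-1, 1], size=N)       # binary neuron states s_n in {-1, 1}
u_task = np.zeros(N)                  # placeholder task input

for t in range(T):
    Delta = u_task + sigma_in * rng.normal(size=N)   # Eq. (2)
    h = -Delta + J @ s                               # effective field h_n
    s_new = np.sign(h * s)                           # Eq. (1)
    s_new[s_new == 0] = s[s_new == 0]                # tie-break: keep state if h = 0
    s = s_new.astype(int)
```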

When studying learning tasks similar to those in the mammalian cortex [44], one turns to a continuous version of the rule in (II), obtained when the time interval $\tau$ in which neurons update is small compared to $J_{nm}$. In this limit,

$$\tau\dot{s}_{n}(t)=-s_{n}(t)+\text{sign}\left(h_{n}(t)\,s_{n}(t)\right).\qquad(3)$$

Thus, the RNN obeys a system of nonlinear differential equations. Note that (3) implies that $s_n\in[-1,1]$ is a continuous and bounded variable [28].
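A forward-Euler integration of (3) makes these continuous dynamics concrete; as before, $J_{nm}$, the time step, and the input are illustrative placeholders of ours.

```python
import numpy as np

rng = np.random.default_rng(1)
N, tau, dt, steps = 10, 1.0, 0.01, 2000
sigma_in = 0.1

J = rng.normal(size=(N, N)) / np.sqrt(N)
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)

s = rng.uniform(-1, 1, size=N)        # s_n is now continuous in [-1, 1]
u_task = np.zeros(N)

for _ in range(steps):
    Delta = u_task + sigma_in * rng.normal(size=N)
    h = -Delta + J @ s
    s += (dt / tau) * (-s + np.sign(h * s))   # forward-Euler step of Eq. (3)
```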

A third way to describe an RNN is via the probability distribution $p_t(\bm{s})$ of observing each of the $2^N$ different configurations $\bm{s}$ at the $t^{\text{th}}$ time-step. Due to the noise in the inputs $\Delta_n$, the dynamics of the distribution follow a Markov-chain description [28]. This description is particularly useful for analyzing the stochastic dynamics simulatable by an RNN. As we shall see in Sec. III.1, this representation will be useful in explaining how, relative to classical RNNs, the unitary dynamics of a qRNN can speed up stochastic process simulations.

Lastly, we describe how to use an RNN for computation. After the RNN evolves for a time $t_f$, a subset of $M$ neurons is used to collect the vector $\bm{r}(t_f)=(s_{n_1}(t_f),\dots,s_{n_M}(t_f),1)$, with the last entry accommodating a bias. The other $N-M$ neurons are called hidden neurons. The RNN's output is obtained via a linear transformation $\bm{y}^{\text{out}}=W^{\text{out}}\bm{r}(t_f)$, where $W^{\text{out}}$ is a real-valued matrix. Thus, the computational complexity of the RNN comes from the nonlinear activation function in (II), which enables $\bm{y}^{\text{out}}$ to be a nonlinear function of the inputs.

In a learning task with a target output $\bm{y}^{\text{targ}}$, the RNN is trained by minimizing a loss function $\mathcal{L}(\bm{y}^{\text{out}},\bm{y}^{\text{targ}})$ with respect to the network parameters such as $W^{\text{out}}$, $J_{nm}$, etc., subject to the task-determined inputs in (2). We choose the square loss

$$\mathcal{L}(\bm{y}^{\text{out}},\bm{y}^{\text{targ}})=\frac{1}{N_{s}}\sum_{i=1}^{N_{s}}\lVert\bm{y}_{i}^{\text{targ}}-\bm{y}_{i}^{\text{out}}\rVert^{2},\qquad(4)$$

where $i$ labels the $N_s$ different input instances. For the tasks in Sec. V, we fix the connections $J_{nm}$, such that our qRNNs more closely resemble quantum reservoir computers.
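Since only $W^{\text{out}}$ is trained in this setting, minimizing (4) reduces to an ordinary least-squares problem. A sketch, with random placeholder readout vectors standing in for actual network states:

```python
import numpy as np

rng = np.random.default_rng(2)
Ns, M = 200, 5                        # input instances, readout neurons

# r_i: readout vectors (s_{n_1}, ..., s_{n_M}, 1) collected at time t_f;
# random placeholders stand in for actual network states here
R = rng.choice([-1.0, 1.0], size=(Ns, M))
R = np.hstack([R, np.ones((Ns, 1))])  # trailing 1 accommodates the bias
y_targ = rng.choice([-1.0, 1.0], size=(Ns, 1))

# the W_out minimizing the square loss (4) is the least-squares solution
W_out, *_ = np.linalg.lstsq(R, y_targ, rcond=None)
loss = np.mean(np.sum((y_targ - R @ W_out) ** 2, axis=1))
```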

III Quantum recurrent neural networks

III.1 Quantum update rule

Let us now extend the classical RNN in (II) to the quantum setting. We replace each of the $N$ neurons with a spin-1/2 particle for which a spin measurement along the $z$-axis yields the values $\{-1,1\}$. Thus, each neuron $n$ is in a normalized quantum state in the Hilbert space $\mathcal{H}_n$ with basis vectors $\{|\text{-}1\rangle_n,|1\rangle_n\}$, which are eigenstates of the Pauli-Z operator $\sigma_n^z=|1\rangle\langle 1|_n-|\text{-}1\rangle\langle\text{-}1|_n$. The state of the composite system lives in the product Hilbert space $\mathcal{H}=\bigotimes_{n=1}^{N}\mathcal{H}_n$.

We choose spins interacting via the time-dependent Hamiltonian

$$H(t)=-\sum_{n=1}^{N}\Delta_{n}(t)\sigma_{n}^{z}+\sum_{nm}J_{nm}\sigma_{n}^{z}\sigma_{m}^{z}+\frac{\Omega(t)}{2}\sum_{n=1}^{N}\sigma_{n}^{x},\qquad(5)$$

where $\sigma_n^x=|1\rangle\langle\text{-}1|_n+|\text{-}1\rangle\langle 1|_n$ is the Pauli-X operator. Indeed, the evolution under (III.1) encompasses the update rule in (II). To see this, note that in the classical case of (II), the RNN evolves under the rules

If $h_n>0$, $s_n$ doesn't change. (1C)
If $h_n<0$, $s_n$ flips. (2C)

Here, “C” stands for “classical”. Now, consider a qRNN starting in the configuration $|s_1,s_2,\dots,s_N\rangle$ and evolving for a time $t=2\pi\Omega^{-1}$. In the limit where $\Delta_n\gg\Omega$ or $J_{nm}\gg\Omega$, each spin experiences the Hamiltonian $H_n=h_n\sigma_n^z+\frac{\Omega}{2}\sigma_n^x$, where $h_n=-\Delta_n+\sum_m J_{nm}s_m$ is the effective field generated by the rest of the spins and $s_m$ stands for the measurement result of $\sigma_m^z$ on the initial configuration. We then obtain the quantum update rules

If $|h_n|\gg\Omega$, $|s_n\rangle$ doesn't change. (1Q)
If $|h_n|\ll\Omega$, $|s_n\rangle$ flips. (2Q)

Here, “Q” stands for “quantum”. Therefore, (III.1) can implement (1C)-(2C) but without the use of the nonlinear activation function in (II). Nonetheless, (III.1) allows for more general dynamics beyond the perturbative limit in which (1Q)-(2Q) hold. We now highlight three features arising from the quantum evolution of the qRNN: (i) the ability to compute complex functions on the input by using quantum interference, (ii) the ability to exploit the choice of measurement basis, and (iii) the ability to efficiently achieve stochastic processes inaccessible to classical RNNs with no hidden neurons.
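Before turning to these features, the following sketch shows how one might assemble the Hamiltonian (III.1) numerically for a few spins and generate one update step of duration $t=2\pi\Omega^{-1}$; the couplings and fields are illustrative placeholders of ours.

```python
import numpy as np
from functools import reduce
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli-X
sz = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli-Z, |1> = [1, 0]
I2 = np.eye(2, dtype=complex)

def site(op, n, N):
    """Embed a single-spin operator op at site n of an N-spin register."""
    return reduce(np.kron, [op if k == n else I2 for k in range(N)])

N, Omega = 3, 1.0
Delta = np.zeros(N)                            # input biases (placeholders)
J = 0.5 * np.ones((N, N)); np.fill_diagonal(J, 0.0)

H = -sum(Delta[n] * site(sz, n, N) for n in range(N))
H = H + sum(J[n, m] * site(sz, n, N) @ site(sz, m, N)
            for n in range(N) for m in range(n + 1, N))
H = H + (Omega / 2) * sum(site(sx, n, N) for n in range(N))

U = expm(-1j * H * 2 * np.pi / Omega)  # one update step of duration 2*pi/Omega
```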

Figure 1: Computing the parity, XOR$(s_1,s_2)$, of two inputs $s_1$ and $s_2$ with a qRNN. Spin 3 (the output spin) experiences an effective field $\tilde{J}=J(s_1+s_2)$ with $J\gg\Omega$. After evolving for a time $t=2\pi\Omega^{-1}$, we measure the output spin. The measurement outcome $+1$ is obtained when $s_1=-s_2$, since $\tilde{J}=0$. If $s_1=s_2$, so that $\tilde{J}\neq 0$, the inputs constructively interfere to generate a large detuning on the output such that the measurement yields the outcome $-1$.

Quantum feature 1: quantum interference as a means for computation

The computational power of (II) is a result of its nonlinear dynamics. For example, an RNN with linear dynamics is incapable of computing the parity function $\text{XOR}(s_1,s_2)=s_1 s_2$ between two classical binary inputs. On the other hand, quantum mechanics is a unitary theory. Yet, this does not limit a qRNN to linear computation. Indeed, a qRNN can compute XOR by leveraging quantum interference, a resource fundamental to quantum computation. Thus, we can use a qRNN for complex computing tasks.

As illustrated in Fig. 1, we can compute XOR$(s_1,s_2)$ using a qRNN of three spins initially in the state $|s_1,s_2,\text{-}1\rangle$. The third spin is an output spin: it is measured to tell us information about the parity of $s_1$ and $s_2$. We let these spins evolve under the dynamics dictated by (III.1), choosing $\Delta_n=J_{12}=0$ and $J_{13}=J_{23}=J\gg\Omega$. Let $\tilde{J}=J(s_1+s_2)$. In the frame rotating at the rate $\tilde{J}$, the output spin experiences the Hamiltonian

$$H_{3}=\frac{\Omega}{2}\left(e^{2i\tilde{J}\tau}|1\rangle\langle\text{-}1|+\text{h.c.}\right).\qquad(6)$$

It is clear that if the spins have odd parity (i.e., $s_1=-s_2$, so that $\tilde{J}=0$), the output spin flips to the state $|1\rangle$ when we choose to evolve for $t=2\pi\Omega^{-1}$. On the other hand, if $\tilde{J}\neq 0$, $H_3$ contains only fast-rotating terms, and the rotating-wave approximation (RWA) allows us to neglect the evolution of the output spin [49]. Physically, the RWA can be thought of as the spin rotating along the $x$-axis by a small amount followed by a rapid precession of the spin around the $z$-axis. Indeed, as illustrated in Fig. 1, $J\gg t^{-1}$ amounts to averaging out the spin's position so that the spin remains along the $z$-axis. Overall, this computation realizes the operation $|s_1,s_2,-1\rangle\rightarrow|s_1,s_2,\text{XOR}(s_1,s_2)\rangle$.

Note that this is a result of $s_1+s_2$ constructively interfering to produce a large effective detuning on the output and blocking its evolution. Thus, interference serves as a means for computation in qRNNs.
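A minimal numerical check of this interference mechanism is sketched below, with illustrative values satisfying $J\gg\Omega$. Note that in the convention of this snippet a resonant spin completes a full flip after a $\pi$-pulse of duration $t=\pi/\Omega$.

```python
import numpy as np
from functools import reduce
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)   # |1> = [1, 0], |-1> = [0, 1]
I2 = np.eye(2, dtype=complex)
site = lambda op, n, N: reduce(np.kron, [op if k == n else I2 for k in range(N)])

Omega, J, N = 1.0, 50.0, 3            # J >> Omega, illustrative values
H = J * (site(sz, 0, N) @ site(sz, 2, N) + site(sz, 1, N) @ site(sz, 2, N))
H += (Omega / 2) * sum(site(sx, n, N) for n in range(N))
U = expm(-1j * H * np.pi / Omega)     # pi-pulse on the resonant output spin

def ket(bits):                        # |s1, s2, s3>, with s = +1 -> [1, 0]
    one, minus = np.array([1, 0], complex), np.array([0, 1], complex)
    return reduce(np.kron, [one if s == 1 else minus for s in bits])

for s1 in (1, -1):
    for s2 in (1, -1):
        psi = U @ ket([s1, s2, -1])
        p_flip = np.linalg.norm(psi.reshape(2, 2, 2)[:, :, 0]) ** 2
        print(s1, s2, round(p_flip, 3))   # ~1 iff s1 = -s2, ~0 otherwise
```

Running the loop prints a flip probability near 1 exactly when $s_1=-s_2$ and near 0 otherwise, realizing the truth table of Fig. 1.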

Figure 2: Detection of a Z-error on three spins $L_{1,2,3}$ using a quantum RNN. A Z-error is conjugated into a bit-flip-like error using a Hamiltonian generating a rotation along the $x$-axis, where $t=\pi/2\Omega$ and $\Omega$ is the dominant field of the Hamiltonian. The state of each of the $L_i$ after the rotation (orange region) depends on whether a Z-error occurred, as illustrated at the bottom of the figure. As exemplified here for $L_3$, a Z-error results in a spin flipping from what we would expect in the absence of errors. To detect the Z-error, a set of auxiliary qubits $A_{1,2}$ is brought in to perform parity measurements of the pairs $(L_1,L_2)$ and $(L_2,L_3)$. Since under no Z-error the parity measurements must match, the parity measurements allow us to detect the location of the Z-error, as specified in Table 1.

Quantum feature 2: arbitrary measurement basis as a means for computation

Equations (1Q)-(2Q) recover (II) when $t=2\pi\Omega^{-1}$. However, $t=2\pi\Omega^{-1}$ is not a necessary restriction. This freedom results in the ability to rotate each quantum neuron, which can be used as a means for computing in a different basis. Measuring in different bases reveals the quantum correlations enhancing the performance of a qRNN relative to its classical counterpart. In this section, we show how to use the qRNN's evolution to change the basis in which an error occurs. This freedom can detect a Z-error, an error proper to quantum computation.

Consider the repetition code $|0_L\rangle=|\text{-}_y\rangle^{\otimes 3}$ and $|1_L\rangle=|+_y\rangle^{\otimes 3}$ on qubits labeled $L_{1,2,3}$, where $|\pm_y\rangle=\frac{1}{\sqrt{2}}(|\text{-}1\rangle\pm i|1\rangle)$. Suppose we prepare the state $|\psi\rangle=a|0_L\rangle+b|1_L\rangle$ and subsequently a Z-error occurs. We can detect the error by rotating all three spins $L_{1,2,3}$ using (III.1), with the dominant field being $\Omega$, for a time $t=\pi/2\Omega$. Note that the rotation conjugates the Z-error by

$$e^{-i\pi\sigma^{x}/4}\sigma^{z}e^{i\pi\sigma^{x}/4}\propto\sigma^{y}\qquad(7)$$

where $\sigma_n^y=i|\text{-}1\rangle\langle 1|-i|1\rangle\langle\text{-}1|$ is like a bit-flip error except for a state-dependent phase. A bit-flip error can then be detected by bringing in two extra spins $A_{1,2}$ and performing parity measurements of the pairs $(L_1,L_2)$ and $(L_2,L_3)$, as described in Sec. III.1. Using Table 1, the final parities of $(L_1,L_2)$ and $(L_2,L_3)$ give the measurement results $a_1$ and $a_2$, which can be used to discern where the Z-error occurred.

As an example, Fig. 2 illustrates the two final states of $L_3$ if no error occurs (bottom left) and if a Z-error occurs on $L_3$ (bottom right).

Detecting the Z-error hinges on (7), which can be achieved by using the qRNN's evolution to rotate the measurement basis. Note that the rotation allows us to measure the error syndrome of the stabilizer state $|\psi\rangle$, bringing out the quantum correlations of the state. Thus, the qRNN's native evolution can be used to perform quantum computational tasks. After the error is detected on spin $L_i$, all qubits are rotated again by $U^{\dagger}$, and $\sigma_i^z$ can be applied to correct the error. We note that using a repetition code for error detection is a well-known technique in the quantum computing community.

The previous two quantum features show that qRNNs are naturally suited to solve important problems in machine learning and quantum computing. Recently, qRNNs were used to compress quantum circuits [23]. However, studies on using qRNNs for error correction in circuit-like quantum computing are warranted and left for future work.

$a_2\,\backslash\,a_1$      $-1$               $+1$
$-1$                        Error in $L_2$     Error in $L_1$
$+1$                        Error in $L_3$     No error
Table 1: Results of parity measurements for the detection of a Z-error. Measuring spin $A_i$ results in the outcome $a_i$. By comparing the outcomes, one can detect the location of the Z-error.
Figure 3: Comparing a classical and a quantum RNN stochastically evolving a distribution $p_{t_f}$ from an initial distribution $p_0$. Here, we consider $p_0(\bm{s})=p_{t_f}(\bm{s}')$ when $s_n=-s'_n$ for all $n$. In this case, the RNN needs to produce a stochastic process matrix $L^{\text{targ}}$ that flips all the spins through several time-steps. The classical RNN (top) requires $\mathcal{O}(2^{N-m})$ time-steps (i.e., applications of $P$) while using $m$ hidden neurons. A qRNN (bottom) requires one time-step and no hidden neurons.

Quantum feature 3: stochastic processes accessible to a qRNN

We now explore how a qRNN can be used to stochastically evolve a probability distribution faster than any classical RNN. Firstly, we note that if we initialize an RNN according to an initial distribution $p_0(\bm{s})$, the dynamics in (II) dictate that for $t>0$ the RNN obeys a distribution given by the Markov-chain dynamics

$$p_{t}(\bm{s})=\sum_{\bm{s}'}P(\bm{s}|\bm{s}')\,p_{t-1}(\bm{s}')\qquad(8)$$

where $P(\bm{s}|\bm{s}')$ is the transition probability between states $\bm{s}'$ and $\bm{s}$, whose particular value is given by (II) [28] (see Appendix A for details).
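In matrix form, (8) is simply the repeated application of a column-stochastic matrix to a distribution vector. A toy sketch with a random placeholder $P$ (in an actual RNN, $P(\bm{s}|\bm{s}')$ is fixed by the biases and connection weights):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 3
dim = 2 ** N                            # one entry per configuration s

# placeholder column-stochastic transition matrix: sum_s P(s|s') = 1
P = rng.random((dim, dim))
P /= P.sum(axis=0, keepdims=True)

p = np.zeros(dim)
p[0] = 1.0                              # start localized on one configuration
for t in range(10):
    p = P @ p                           # Eq. (8): p_t = P p_{t-1}
```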

Given this observation, we see that an RNN can be used for the task of evolving a probability distribution $p_0$ into $p_f=L^{\text{targ}}p_0$ via a series of stochastic transition matrices, $L^{\text{out}}=P^{t_f}$. The goal is to adjust the parameters of the RNN (i.e., biases and connection weights) to simulate the stochastic matrix $L^{\text{targ}}$, i.e., $L^{\text{out}}\approx L^{\text{targ}}$, in as few steps as possible. One may then ask if a qRNN can do this more efficiently than any classical RNN.

We answer this in the positive. It is worth noting that not all stochastic transition matrices $L^{\text{targ}}$ are embeddable in a Markov process (for a review of classical and quantum embeddability, see Appendix A). To simulate a stochastic system's future behavior, information about its past must be stored, and thus memory is a key resource. Quantum information processing promises a memory advantage for stochastic simulation [50]. In simulating stochastic evolution with classical resources, there is a trade-off between the temporal and physical resources needed [51], and it has been shown that certain stochastic evolutions, when simulated with quantum hardware, need not suffer from such a trade-off, since evolutions arising from quantum Lindbladian dynamics are far more general than classical Markovian evolution [52]. That is, there exist matrices $L^{\text{targ}}$ that are quantum embeddable but not classically embeddable. Moreover, even if $L^{\text{targ}}$ is embeddable, quantum evolution can lower the number of steps needed to produce $L^{\text{targ}}$, since the unitary dynamics of a quantum system allow a simultaneous, continuous, and coherent update of every neuron.

Let us now give an example of a matrix $L^{\text{targ}}$ that can be realized exponentially faster in a qRNN. Consider the task of realizing a transformation $F$ corresponding to a global “spin-flip”:

$$F_{\bm{s}|\bm{s}'}=\begin{cases}1&\text{if }s_{n}\neq s'_{n}\ \forall n\\ 0&\text{otherwise.}\end{cases}\qquad(9)$$

Realizing $F$ on $N$ neurons using a classical Markov process requires a number of time steps of order $\mathcal{O}(2^{N-m})$, where $m$ is the number of hidden neurons (for details, see Sec. III.A in Ref. [52]). In other words, a classical RNN cannot produce $F$ efficiently when all available neurons must be flipped. This is a result of (II) and the fact that flipping neuron $n$ requires another neuron $m$ in the opposite state so that $J_{nm}>0$ dominates $h_n$.

On the other hand, a qRNN can perform $F$ in a single step, regardless of whether all neurons need to flip. To see this, consider (III.1) with $\Omega\gg h_n$. In this case, all neurons flip simultaneously, in a single time step, under a unitary $U$. That is, if $|\psi_0\rangle=\sum_{\bm{s}}\sqrt{p_0(\bm{s})}\,|\bm{s}\rangle$, then

$$|\psi_{f}\rangle=U|\psi_{0}\rangle=\sum_{\bm{s}}\sqrt{p_{f}(\bm{s})}\,|\bm{s}\rangle.\qquad(10)$$
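A minimal check of this single-step flip, with $\Delta_n=J_{nm}=0$ and a global drive; as in the earlier snippet, the full flip corresponds to a $\pi$-pulse of duration $t=\pi/\Omega$ in this convention, and all values are illustrative.

```python
import numpy as np
from functools import reduce
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

N, Omega = 3, 1.0
Hx = (Omega / 2) * sum(
    reduce(np.kron, [sx if k == n else I2 for k in range(N)])
    for n in range(N))
U = expm(-1j * Hx * np.pi / Omega)       # one global pi-pulse

rng = np.random.default_rng(4)
p0 = rng.random(2 ** N); p0 /= p0.sum()  # arbitrary initial distribution
psi0 = np.sqrt(p0).astype(complex)       # |psi_0> = sum_s sqrt(p0(s)) |s>
pf = np.abs(U @ psi0) ** 2

flip = np.arange(2 ** N) ^ (2 ** N - 1)  # index of the globally flipped configuration
assert np.allclose(pf, p0[flip])         # p_f = F p_0 in a single step
```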

While the realization of the matrix $F$ via (III.1) signals a quantum advantage, we highlight that this advantage is extremely sensitive to the decoherence arising from spontaneous emission (i.e., spontaneous relaxation from $|1\rangle$ to $|\text{-}1\rangle$), a main source of noise in NISQ devices (see Appendix A). It remains an open problem whether there exist stochastic processes enabled by (III.1) that are robust to noise, and in the future we hope to explore how to shield unitary stochastic processes against noise in experimentally realizable NISQ devices.

The spin-flip process $F$ is efficiently simulated using a classical computer. However, $F$ exemplifies the qRNN's ability to access stochastic processes inaccessible to classical RNNs without hidden neurons. This implies that if an RNN is employed to simulate evolving $p_0$ to $p_{t_f}$ stochastically by passing it through several linear transformations, there are instances where the qRNN requires exponentially fewer steps. Stochastic simulation, of course, has applications in finance, biology, and ecology, among other fields. As an example, Ref. [53] used this quantum advantage to propose a quantum circuit algorithm for stochastic process characterization and presented applications in finance and correlated random walks. The separation above illustrates the computational advantage of quantizing an RNN.

III.2 qRNNs under spontaneous emission

Having seen how (III.1) recovers the discrete update rule (II), we now show that a qRNN under dissipation naturally evolves under continuous-time dynamics analogous to (3). This establishes a mathematical similarity between the evolution of NISQ devices and neural circuits, allowing us to use available quantum hardware for cognitive tasks, an idea that we explore further in Sec. V.

Consider the qRNN in (III.1) under spontaneous emission, where a spin relaxes from $|1\rangle$ to $|\text{-}1\rangle$ at a rate $\gamma$. To extract the dynamics of continuous variables, we focus on the dynamics of the expectation values of local Pauli operators.

The expectation value of an observable $A$ is $\langle A\rangle=\text{Tr}(A\rho)$, where $\rho$ is the density matrix describing the system. In particular, we focus on the expectations of the operators $\sigma_n^x$ and $\sigma_n^y=i|\text{-}1\rangle\langle 1|_n-i|1\rangle\langle\text{-}1|_n$. If we start the qRNN in a state for which $\langle\sigma_n^z(0)\rangle=-1$, then (see Appendix B)

$$\dot{\langle\sigma^{y}_{n}\rangle}=-\frac{1}{\tau}\langle\sigma^{y}_{n}(t)\rangle-\frac{\Omega}{2\gamma}\sum_{m}J_{nm}\langle\sigma_{n}^{x}(t)\sigma_{m}^{y}(t)\rangle+\Delta_{n}(t)\langle\sigma_{n}^{x}(t)\rangle,\qquad(11)$$

where we have defined the neural time-scale $\tau^{-1}=\gamma/2+\Omega^{2}/4\gamma$, which is different from that in (3) but bears the analogous significance of the time-scale on which $\langle\sigma_n^y\rangle$ decays.

Differently from (3), notice that the dynamics of $\langle\sigma_n^y\rangle$ are influenced by the spin's value along the $x$-axis, a consequence of the nontrivial commutation relations of spin variables. The commutation relations also make (III.2) quadratic, and therefore nonlinear. The quadratic term in (III.2) is analogous to the nonlinear term that gives RNNs their computational power.

In Appendix B, we explore the dynamics of $\langle\sigma_n^x\rangle$ as well and show that, together with $\langle\sigma_n^y\rangle$, we recover dynamics analogous to the integrate-and-fire RNN model [54], a more realistic model of neural networks in the brain than the one in (3).

Figure 4: Schematic picture of RNNs with classical and quantum neurons. (A) Classical RNN. The inputs are local biases, and the inter-neural connections $J_{nm}$ are arbitrary. A set of neurons is used for readout to produce the output $\bm{y}^{\text{out}}=W^{\text{out}}\bm{r}$. (B) qRNN made from Rydberg atoms, which restricts the connections to $J_{nm}\sim 1/R_{nm}^{6}$, where $R_{nm}$ is the physical distance between atoms $n$ and $m$. Here, we depict interactions between nearest and next-nearest neighbors; however, each neuron interacts with all others in the chain via $J_{nm}\sim 1/R_{nm}^{6}$. Local expectation values of a subset of atoms are used for readout. (C) Arrays of Rydberg atoms as qRNNs. Each atom experiences a Rabi drive $\Omega$ and a local detuning $\Delta_n$ encoding the RNN's inputs. One of the main sources of decoherence in Rydberg atoms is spontaneous emission at a rate $\gamma$.

IV Quantum reservoir computers using Rydberg atoms: An experimental proposal

The similarities between (III.2) and the evolution of RNNs suggest the ability of qRNNs to emulate neurological learning. To explore neurological learning in qRNNs, we propose to fix the architecture of the qRNN's coupling constants $J_{nm}$ based on optical-tweezer arrays of Rydberg atoms.

The natural Hamiltonian of a Rydberg array closely resembles the one in (III.1). A Rydberg atom is a single valence-electron atom that can be coherently driven between an atomic ground state $|g\rangle$ and a highly excited state $|r\rangle$ with a much larger principal quantum number. These states represent our $|\text{-}1\rangle$ and $|1\rangle$ neuronal states, respectively. A Rydberg atom in its excited state exhibits a large electronic dipole moment; consequently, a collection of Rydberg atoms interacts via a $1/R^{6}$ van der Waals potential, where $R$ denotes the physical distance between two atoms. For an array of Rydberg atoms at fixed positions, the Hamiltonian of the system is [35]

$$H_{\text{Ryd}}=\Delta\sum_{n}\hat{n}_{n}+\frac{\Omega}{2}\sum_{n}\sigma_{n}^{x}+\sum_{nm}\frac{V}{R_{nm}^{6}}\hat{n}_{n}\hat{n}_{m}\qquad(12)$$

where $\hat{n}_n=|1\rangle\langle 1|_n$, $\Omega$ is the coherent Rabi drive coupling the $|\text{-}1\rangle$ and $|1\rangle$ states, $\Delta<0$ is a global detuning of the drive frequency from the atomic transition, and $V$ is the nearest-neighbor interaction strength. Using acousto-optic deflectors (AODs) and spatial light modulators (SLMs), one can create spatially dependent light shifts resulting in site- and time-dependent detunings $\Delta_n(t)=\Delta+\alpha(t)\Delta_n$, where $\alpha(t)$ is a time-dependent envelope. With this in mind, the Hamiltonian in (12) can be mapped to a Hamiltonian like that in (III.1) with $J_{nm}=V/R_{nm}^{6}$, since $\hat{n}_n=(\sigma_n^z+\mathds{1}_n)/2$. In this paper, for concreteness, we compare our numerics against the experimental realization of Rydberg arrays in Refs. [35, 38], where the rates $\Omega,\Delta_n,V$ are all in units of megahertz, while time constants are in units of microseconds. In these experiments, an off-resonant intermediate state, $|6P_{3/2},F=3,m_F=-3\rangle$, is used to couple $|g\rangle=|5S_{1/2},F=2,m_F=-2\rangle$ and $|r\rangle=|70S_{1/2},m_J=-1/2,m_I=-3/2\rangle$ of Rubidium-87 atoms through a two-photon transition. Thus, photon scattering off the intermediate state is the dominant source of decoherence. As we show in Appendix D, we can model this with a modified spontaneous-emission process given by the jump operator

$$L^{+}=\sqrt{\gamma}\,|g\rangle\left(\alpha\langle r|+\beta\langle g|\right)\qquad(13)$$

instead of the typical $\sqrt{\gamma}|g\rangle\langle r|$ jump operator. In the equation above, $\gamma=2\pi/(20\ \mu\text{s})$ and $(\alpha,\beta)=(0.05,0.16)$ for the realistic settings we simulate. With the full unitary and dissipative dynamics, we can think of an array of Rydberg atoms as a quantum analog of a continuous-time RNN. Fig. 4 compares the architecture of a classical RNN (Fig. 4A) and a Rydberg qRNN (Figs. 4B-C).
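As a sketch of how (12) can be assembled numerically for a small chain (parameter values are illustrative, loosely modeled on the megahertz-scale rates quoted above; distances are in arbitrary units):

```python
import numpy as np
from functools import reduce

sx = np.array([[0, 1], [1, 0]], dtype=complex)
nhat = np.array([[0, 0], [0, 1]], dtype=complex)   # |1><1| in the (g, r) basis
I2 = np.eye(2, dtype=complex)

def site(op, n, N):
    return reduce(np.kron, [op if k == n else I2 for k in range(N)])

# illustrative 1D chain; rates in units of 2*pi x MHz, unit lattice spacing
N = 4
x = np.arange(N, dtype=float)                      # atom positions
Omega, Delta, V = 2 * np.pi * 4.2, -2 * np.pi * 1.0, 2 * np.pi * 10.0

H = Delta * sum(site(nhat, n, N) for n in range(N))
H = H + (Omega / 2) * sum(site(sx, n, N) for n in range(N))
for n in range(N):
    for m in range(n + 1, N):
        Rnm = abs(x[n] - x[m])                     # J_nm = V / R_nm^6
        H = H + (V / Rnm ** 6) * site(nhat, n, N) @ site(nhat, m, N)
```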

We note that training RNNs can be unstable, as it often relies on (truncated) back-propagation through time or real-time recurrent learning. One way to circumvent this problem is to keep the system's parameters fixed and train only the output filter $W^{\text{out}}$. This easier training schedule motivated the introduction of reservoir computers [55] and their quantum analogs [10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27]. Thus, in the following numerical experiments we fix the positions of the atoms in either a 1D chain or a 2D square lattice and train only $W^{\text{out}}$ and some temporal parameters, depending on the task. That is, in this article we implement Rydberg reservoir computers. Logically, successful performance on the tasks presented here suffices to show the computational ability of qRNNs. While we include the effect of small imperfections in the positions of the atoms, we see no significant effect on the performance of the tasks after averaging our results over 10 realizations of the atoms' positions. We leave full optimization of the qRNN for future work.

Lastly, several features of the many-body dynamics of arrays of Rydberg atoms are particularly well suited for emulating biological tasks. In Sec. V.1, we show how Rydberg arrays can be used to implement inhibitory and excitatory neurons, which are vital in many biological tasks such as multitasking [56]. The key idea behind encoding inhibitory neurons is to leverage positive and negative interactions between Rydberg atoms with different principal quantum numbers [45]. Additionally, in Sec. V.4 we show that Rydberg arrays can store long-term memory by taking advantage of the weak-ergodicity-breaking dynamics of quantum many-body scars [35, 46, 47].

V Learning biological tasks via reservoir computers

We focus on analyzing the Rydberg reservoir's potential to learn biologically plausible tasks. In the tasks analyzed, we fix the geometry of the atoms depending on the task at hand. As a proof of principle, we focus on four simple neurological tasks, which indicate good performance even with a small number of atoms. We show that a Rydberg reservoir can encode inhibitory and excitatory neurons vital for successful multitasking. Likewise, we show that Rydberg reservoirs can learn to make decisions by distinguishing properties of stimuli, have a working memory, and exhibit long-term memory enhanced by quantum many-body scars. Simulation details for each task can be found in Appendix D.

V.1 Multitasking

Figure 5: Encoding inhibitory neurons using Rydberg atoms and using them for multitasking. Multitasking consists of fixing the qRNN's parameters and training $W^{\text{out}}$ to produce three conflicting outputs. (A) Scheme for encoding inhibitory neurons. Rydberg atoms with different principal quantum numbers are used such that $(n_Q)(n_Q')$ pairs interact attractively while $(n_Q')(n_Q')$ and $(n_Q)(n_Q)$ pairs interact repulsively. The network receives two binary inputs $x,y$. (B) Square error for learning the functions XOR, OR, and AND on the inputs with different numbers of inhibitory neurons. Better performance is observed when 1 in every 4 neurons is inhibitory. (C)-(E) Example of learned functions using eight neurons, two of which are inhibitory, which results in performance 40% better than without inhibitory neurons.

A hallmark of classical RNNs is their ability to multitask, i.e., to simultaneously learn several output functions. Dale's principle defines an inhibitory neuron, indexed by $n$, as one with a negative sign in its interactions with all other neurons [57]:

$$J_{nm}\leq 0\quad\forall m.\qquad(14)$$

Two Rydberg atoms with different principal quantum numbers $n_Q$ and $n_Q'$, but the same angular-momentum quantum numbers, can interact via an attractive $1/R^{6}$ potential $V_{n_Q,n_Q'}$ [45]. Using the Python package PairInteraction [58], we note that if $n_Q$ represents the state $|r\rangle=|70S_{1/2},m_J=-1/2,m_I=-3/2\rangle$ and $n_Q'$ represents $|r'\rangle=|73S_{1/2},m_J=-1/2,m_I=-3/2\rangle$, then $V_{n_Q,n_Q}=V\approx-V_{n_Q,n_Q'}$, where $V$ is the interaction strength between atoms with principal quantum number $n_Q$ (see Appendix D). We can use this fact to encode inhibitory neurons. We restrict the concentration of $n_Q'$ Rydberg atoms to be sparse, such that pairs of $n_Q'$ atoms are placed as far apart as possible, at a distance $d_{\max}$, in a 1D chain arrangement. We choose the field strength $V$ so that $V/d_{\max}^{6}=10^{-2}$; as a result, we can neglect the interactions between pairs of $n_Q'$ atoms, but not the interactions between $(n_Q)(n_Q')$ and $(n_Q)(n_Q)$ pairs. This amounts to saying that if atom $n$ is driven to $n_Q'$, then $J_{nm}\lesssim 0$ for all $m$, as in (14). By implementing this in our reservoir, we can learn XOR, AND, and OR simultaneously for different concentrations of inhibitory neurons, as illustrated in Fig. 5(A).
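The resulting sign structure of the coupling matrix can be sketched as follows; the chain length, the positions of the $n_Q'$ atoms, and $V$ are illustrative placeholders of ours.

```python
import numpy as np

# illustrative 1D chain of N atoms at unit spacing; sites in `inhibitory`
# are driven to the n_Q' Rydberg level
N, V = 8, 2 * np.pi * 10.0
x = np.arange(N, dtype=float)
inhibitory = {1, 5}                      # 1 in 4 neurons, as in Fig. 5(B)

J = np.zeros((N, N))
for n in range(N):
    for m in range(N):
        if n == m:
            continue
        R6 = abs(x[n] - x[m]) ** 6
        if n in inhibitory and m in inhibitory:
            J[n, m] = 0.0                # sparse (n_Q')(n_Q') pairs: negligible
        elif n in inhibitory or m in inhibitory:
            J[n, m] = -V / R6            # mixed pairs: attractive, cf. Eq. (14)
        else:
            J[n, m] = +V / R6            # (n_Q)(n_Q) pairs: repulsive
```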

Fig. 5(B) shows the errors of simultaneously learning XOR, OR, and AND as a function of the system size $N$ for different numbers of inhibitory neurons in the array. The network is initialized in the state $|g\rangle^{\otimes N}$ and receives two binary inputs $x,y\in\{0,1\}$ (in units of MHz) for a time $\Delta t$ (in units of $\mu$s) with input noise $\sigma_{in}=0.1$. Afterwards, the network is interrogated to give XOR$(x,y)$, OR$(x,y)$, and AND$(x,y)$. $W^{\text{out}}$ is trained using the loss in (4). The errors shown in Fig. 5(B) are the minima achieved over a wide range of choices of interaction time $\Delta t\in[0,5]\ \mu$s. This shows that in some cases our reservoir can benefit from having a connectivity matrix $J_{nm}$ with both positive (excitatory) and negative (inhibitory) values, analogously to the mammalian brain. For small system sizes, a ratio of 1:4 inhibitory neurons improves the learning performance, similar to the results in [56]. This is supported by the performance at 4 and 8 neurons in Fig. 5(B). In particular, $N=8$ neurons, two of which are inhibitory, result in a 40% decrease in the loss. Nonetheless, we observe that having no inhibitory neurons is best for $N=6$ and 10 neurons; having no inhibitory neurons is never the worst choice. Fig. 5(C)-(E) shows the results of learning XOR, OR, and AND simultaneously using $N=8$ neurons, two of which are inhibitory. Note that the network is fully capable of classification with errors well below the input-noise threshold $\sigma_{in}$.

Lastly, while this task shows the success of Rydberg reservoirs at approximating Boolean functions of the input, one may also want to compute other nonlinear functions of the input. We remark that our Rydberg reservoir can approximate biologically relevant nonlinear functions such as ReLU and sigmoid.

V.2 Decision-making

Figure 6: Decision-making task using a Rydberg reservoir. (A) Schematic of the input stimuli as a pair of time-dependent detunings on two atoms. The stimuli are turned on for a normally distributed time $\Delta t$ with standard deviation $\sigma_{in}=0.1$. The network decides on a relaxation time $t_{out}$ to output the decision $\text{sign}\left(\Delta_1^{in}-\Delta_2^{in}\right)$. (B) The psychometric response of the decision-making task, which maps the accuracy of deciding that $\Delta_1^{in}$ is the largest as a function of the inputs' difference. The simulated response (dotted) is well fitted by a sigmoid function (solid curve).

One of the great successes of classical RNNs is their ability to integrate sensory stimuli to choose between two actions. Here, we present the Rydberg reservoir with a variant of the dot-motion decision-making task initially studied in monkeys, in which several inputs are analyzed to produce a scalar nonlinear function [59]. This function represents a decision. This task shows the Rydberg reservoir's ability to produce nonlinear functions of the input and perform simple cognitive tasks, a feature of most reservoirs proposed thus far [60].

In this task, a reservoir is presented with two inputs $\Delta_1^{in}$ and $\Delta_2^{in}$, and the goal is to train the network to choose which input is the largest. That is,

$$y^{\text{targ}}=\text{sign}\left(\Delta_{1}^{in}-\Delta_{2}^{in}\right).\qquad(15)$$

The stimuli, which in the case of a qRNN are local detunings on a pair of atoms, are turned on for a normally distributed time $\Delta t$ with standard deviation $\sigma_{in}=0.1$ and mean $\langle\Delta t\rangle=0.1\ \mu$s (see Fig. 6(A)). The stimuli are then turned off, and the network chooses a relaxation time $t_{out}$ after which it “makes a decision” by approximating (15). This is known as the fixed-duration protocol, since the experimentalist fixes the stimulation period and the subject, in this case the reservoir, learns to choose a response time $t_{out}$.

In the brain, we expect the performance on a decision-making task to follow a sigmoidal psychometric response [44, 59]. A psychometric response maps out the accuracy of a decision-making task as a function of stimuli distinguishability. As an example of a psychometric response, the reader could think of paying a routine visit to the eye doctor and having to discern the letters “b” and “p” written on the wall. If the letters are large enough, they become distinguishable; if the letters are too small, one often fails to make out the right letter.

Classically, a decision-making task benefits from connectivity between all neurons. Since our connectivity is limited by physical constraints, a 2D square-lattice structure was chosen to prevent neurons from being isolated from the rest. Moreover, a 2D square lattice is experimentally friendly. We set up a Rydberg reservoir of $3\times 2$ atoms with two input atoms and two different output atoms (for details, see Appendix D). The reservoir is then trained by optimizing over $t_{out}$ and $W^{\text{out}}$ such that the reservoir's output approximates (15), while keeping the network parameters $J_{nm}$, $\Omega$, and $\Delta_n^{in}$ fixed. We observe that $t_{out}\approx 1\ \mu$s is regularly obtained, as this is the time scale on which the information about $\Delta_{1,2}^{in}$ propagates through the network. In our case, $c_1=\Delta_1^{in}-\Delta_2^{in}$ is a natural choice for a measure of stimuli distinguishability. Fig. 6(B) shows the psychometric response of the task, which is qualitatively similar to those obtained in classical RNNs [44]. Moreover, we see in Fig. 6(B) that if $|c_1|\geq\sigma_{in}$, such that it is above the input-noise level, our network succeeds more than 80% of the time. The success of this task shows the Rydberg reservoir's ability to emulate simple cognitive tasks.

V.3 Parametric working memory

Figure 7: Working memory of a Rydberg quantum reservoir computer. (A) Schematic of the network's inputs, where two atoms are detuned for a time $\Delta t$ but temporally separated by a time $t_{delay}$. Two different output neurons are used for readout at a time $t_{out}$ after the second input is turned off. (B) Loss of the working-memory task as a function of the total input time $2\Delta t+t_{delay}$ (gray), and entanglement entropy between the input qubits and the rest of the reservoir as a function of $2\Delta t+t_{delay}$ (blue). Here, the mean value of $t_{out}$ is 0.1. The loss stays large for small input times, until the input qubits start entangling with the rest of the reservoir. (C) Accuracy as a function of the time the inputs are turned on ($\Delta t$) for four different choices of $t_{out}$ and with fixed $t_{delay}=0.1$. These curves show that the accuracy is largely independent of $t_{out}$ and $\Delta t$ as long as $\Delta t<0.3$. (D) Accuracy of the working-memory task at $\Delta t=0.15$ and $t_{out}=0.5$ as a function of $t_{delay}$. The blue curve is the performance when $V>\Omega$ puts the reservoir in the Rydberg-blockaded regime, while the red curve is the performance when $V<\Omega$ puts the reservoir in the disordered regime. These plots show that when $V>\Omega$, the Rydberg reservoir can hold memory for later manipulation better than when $V<\Omega$. Shaded regions indicate error bars.

Our next neurological task is parametric working memory. Working memory, one of the most important cognitive functions, deals with the brain's ability to retain and manipulate information for the later execution of a task. Here, we train a network to perform a task based on the decision-making task of Sec. V.2 but with two temporally separated stimuli (see Fig. 7(A)). We use the fixed-time protocol, where the separation between stimuli, denoted by $t_{delay}$, is fixed by us. The stimuli are both turned on for a time $\Delta t$, and after the second input the network is left to relax for a time $t_{out}$ before two output neurons are used to approximate (15). To avoid overfitting, we add Gaussian noise with zero mean and standard deviation $\sigma_{in}=0.1$ to the times $\Delta t$, $t_{out}$, and $t_{delay}$. The network optimizes over $W^{\text{out}}$. Thus, the network has to retain information about $\Delta_1^{in}$ for a few “seconds” to then compare against $\Delta_2^{in}$ and make a decision.

We set up a Rydberg reservoir of $3\times 2$ atoms with two input atoms and two different output atoms (for details, see Appendix D). Fig. 7(B) shows the loss of the network as a function of the total time the inputs are injected into the network ($\tau=2\Delta t+t_{delay}$). We note that the loss is high for small $\tau$, since it takes time for the input neurons to correlate with the rest of the reservoir. Accordingly, in Fig. 7(B) we show that the growth of the entanglement entropy of the input qubits accompanies a decrease in the loss. For Fig. 7(B) we fixed $t_{out}=0.1$, a choice which has little effect on the reservoir's performance.

In Fig. 7(C) we show the accuracy of the reservoir at reproducing (15) as a function of the time the inputs are turned on ($\Delta t$) and for different choices of $t_{out}$. For these plots, $t_{delay}=0.1$ is fixed. We notice that the accuracy is largely invariant under our sampled choices of $t_{out}$.

Lastly, in Fig. 7(D), we probe the reservoir's accuracy as a function of $t_{delay}$. For these experiments, we fix $t_{out}=0.5$ and $\Delta t=0.15$. Importantly, we set $V=2\pi\times 10$ MHz and $\Omega=2\pi\times 4.2$ MHz, such that $V>\Omega$ and neighboring Rydberg excitations are off-resonant, putting our reservoir in the so-called blockaded regime [61, 62]. While one might initially expect the accuracy to decrease with increasing $t_{delay}$, we find that this is not the case; instead, the accuracy oscillates, persistently reaching high values, as shown in the blue curve of Fig. 7(D). Interestingly, this behavior disappears when the coupling is $V=2\pi\times 0.1$ MHz, such that $V<\Omega$, as shown in the red curve of Fig. 7(D), although the performance remains statistically significant even for long $t_{delay}$, with an accuracy greater than 50%. We conclude that, in the blockaded regime, the reservoir can hold information for longer periods. We can understand this dependence on $V/\Omega$ as follows. In the disordered regime, the atoms are mostly uncorrelated and free to oscillate, with the dynamics dominated by the drive $\Omega$. Thus, after a short time, the inputs coming through a $z$-field are largely irrelevant, and the network is unable to hold the information about the first input. On the other hand, when $V>\Omega$, the atoms are strongly correlated, since neighboring excitations of Rydberg atoms are blockaded and the dynamics are slowed down. These slow dynamics allow for longer memory times. In Sec. V.4, we explore longer-term memory in the blockaded regime and show that long-term memory in a reservoir can be stabilized by the presence of quantum many-body scars.

V.4 Long-term Memory via Quantum Many-body Scars

Figure 8: A state encoding a memory $m$ is prepared. The state evolves under its natural Hamiltonian before being interrogated via local measurements to retrieve $m$. If the evolution time is short, the system is still out of equilibrium and remembers its initial condition; thus, $m$ can be retrieved. On the other hand, after a long time, the system may thermalize, and local measurements fail to provide information about the initial state. Thus, the memory-retrieval time is upper bounded by the thermalization time of the initial state $|\psi_m(0)\rangle$ under the system's dynamics. In the example in Sec. V.4, the system is a chain of Rydberg atoms, and final measurements are performed on a single atom and then linearly post-processed to retrieve $m$. In this case, a thermal state can be detected by measuring whether the entanglement entropy of the region obeys a volume law. If the dynamics can be stabilized against thermalization, the memory can be retrieved at longer times.

Finally, we examine a reservoir's ability to encode long-term memory. The task consists of encoding a classical bit $m$ in the initial state of a reservoir, $|\psi_m(0)\rangle$, so that after the system is left to evolve under its inherent dynamics for a time $T$, local measurements of the state $|\psi_m(T)\rangle$ can be used to recover $m$. However, $m$ cannot be recovered from local measurements if the dynamics obey the eigenstate-thermalization hypothesis (ETH) [63]. Instead, local measurements of $|\psi_m(T)\rangle$ obey thermal statistics described by the energy spectrum of the Hamiltonian and bear no information about the initial condition $|\psi_m(0)\rangle$. Thus, reservoirs that violate the ETH are naturally suited for memory tasks, since they can locally retain information about their initial state. Indeed, this notion has begun to be studied in quantum reservoirs [25, 27]. Recent experiments using quench dynamics in arrays of Rydberg atoms have revealed quantum many-body scarring behavior [35], which can be stabilized [46, 47] to delay the thermalization of the system. Here, we use these results to enlarge the memory lifetime of a reservoir. Simulation details are found in Appendix D.

In the case of a kicked ring of Rydberg atoms experiencing nearest-neighbor blockade, the dynamics are captured by the so-called PXP model [35, 47, 64, 65]:

$$H(t)=H_{PXP}+\hat{N}\sum_{k\in\mathbb{Z}}\theta_{k}\,\delta(t-k\tau)\qquad(16)$$
$$H_{PXP}=\Omega\sum_{n=1}^{N}P_{n-1}\sigma_{n}^{x}P_{n+1},\qquad\hat{N}=\sum_{n}\hat{n}_{n}\qquad(17)$$

where $P_n=|g\rangle\langle g|_n$ projects the atom at the $n^{\text{th}}$ site onto the ground state, and we choose periodic boundary conditions to mitigate edge effects. In (16) we let $\theta_k=\pi+\epsilon_k$, where $\epsilon_k$ is a Gaussian random variable with mean $\epsilon$ and variance $\sigma_{in}^{2}$; that is, $\epsilon_k$ plays the role of added noise in the reservoir. For this discussion we let $\gamma=0$, since we know from experiments that the quantum scarring behavior is robust to the atoms' decoherence, and the choice to work with Hamiltonian evolution helps speed up the acquisition of numerical data.

We denote $\chi_\tau=\exp(-i\pi\hat{N})\exp(-i\tau H_{PXP})$. It has been empirically observed that $\chi_\tau$ approximately exchanges the Neel states $|AF\rangle=|1010\ldots\rangle$ and $|AF'\rangle=|0101\ldots\rangle$ for $\tau\approx 1.51\pi\Omega^{-1}$ [35]. Note that $\chi_\tau\chi_\tau=\mathds{1}$, and so, under no noise, any state $|\psi\rangle$ is recovered after a cycle of evolution of duration $2\tau$. However, the noise $\epsilon_k$ destroys the revival of all initial states except $|AF\rangle$ and $|AF'\rangle$ (see Appendix C). This leads to many-body quantum scars stabilized by the operator $\exp(-i\pi\hat{N})$ [46, 47].
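A noiseless sketch of this cycle ($\theta_k=\pi$; $N$ and $\Omega$ are illustrative) that checks the approximate Neel-state revival under $\chi_\tau\chi_\tau$:

```python
import numpy as np
from functools import reduce
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
Pg = np.array([[1, 0], [0, 0]], dtype=complex)   # |g><g| in the (g, r) basis
nhat = np.array([[0, 0], [0, 1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
site = lambda op, n, N: reduce(np.kron, [op if k == n else I2 for k in range(N)])

N, Omega = 8, 1.0                                # ring with periodic boundaries
H_pxp = Omega * sum(
    site(Pg, (n - 1) % N, N) @ site(sx, n, N) @ site(Pg, (n + 1) % N, N)
    for n in range(N))
N_hat = sum(site(nhat, n, N) for n in range(N))

tau = 1.51 * np.pi / Omega
chi = expm(-1j * np.pi * N_hat) @ expm(-1j * tau * H_pxp)

AF = np.zeros(2 ** N, dtype=complex)
AF[int("10" * (N // 2), 2)] = 1.0                # Neel state |rgrg...>
print(abs(AF.conj() @ (chi @ (chi @ AF))) ** 2)  # revival fidelity after one 2*tau cycle
```

The printed fidelity is below unity, reflecting that the exchange is only approximate.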

Given the dynamics in (16), we propose the following scheme for encoding a binary memory $m\in\{0,1\}$. We choose a reference state $|\psi\rangle$ and let $|\psi_0(0)\rangle=|\psi\rangle$ and $|\psi_1(0)\rangle=\chi_\tau|\psi\rangle$. Subsequently, the state $|\psi_m(0)\rangle$ is left to evolve for $n$ cycles of duration $2\tau=2(1.51\pi)$, after which the populations $\bm{r}_m(n)=(P_g(2n\tau|m),P_r(2n\tau|m))$ of the single-atom reduced density matrix are used to retrieve $m$. The retrieval is done by training a vector $W_n^{\text{out}}$ on $M$ instances of $\bm{r}_m(n)$ in order to minimize (4), with $\bm{y}^{\text{targ}}=\bm{m}$ the binary vector of memories and $\bm{y}^{\text{out}}(n)=W_n^{\text{out}}\bm{r}(n)$ our network's output after $n$ cycles.

To quantify the quality of the memory retrieval, $R(n)$, we use the squared Pearson $r$-factor

$$R(n)=\frac{\text{cov}^{2}(\bm{m},\bm{y}^{\text{out}}(n))}{\sigma^{2}(\bm{m})\,\sigma^{2}(\bm{y}^{\text{out}}(n))}.\qquad(18)$$
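Equation (18) is straightforward to evaluate; a sketch with stand-in readouts:

```python
import numpy as np

def retrieval_quality(m, y_out):
    """Squared Pearson r between memories m and readouts y_out, Eq. (18)."""
    cov = np.cov(m, y_out, bias=True)[0, 1]
    return cov ** 2 / (np.var(m) * np.var(y_out))

rng = np.random.default_rng(5)
m = rng.integers(0, 2, size=30).astype(float)   # balanced Bernoulli memories
y_out = m + 0.1 * rng.normal(size=30)           # stand-in readout values
print(retrieval_quality(m, y_out))              # close to 1 for faithful recall
```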

Fig. 9(a) shows the memory-retrieval error as a function of the number of cycles for three different choices of reference state. Fig. 9(b) shows the average entanglement entropy ($\bar{S}_E$) of the left-most atom in the ring. Saturation of $\bar{S}_E$ signals growth in the memory-retrieval error as the state “forgets” its initial condition. From these studies, we see that memory is retrieved at longer times because quantum many-body scars slow the thermalization of the Neel states [35, 47, 64, 65, 46]. The time-crystalline nature of the reservoir using $|\psi\rangle=|AF\rangle$ signals long-time correlations, and thus the reservoir can be used to encode and predict series with long-time correlations [17].

The Neel states exhibit long-term memory due to the scarring behavior of the evolution. This can be understood by analyzing the average evolution produced by a single cycle. Up to second order in $\epsilon_k$, the state at time $2\tau n$, $\rho(n)$, evolves to the state at time $2\tau(n+1)$, $\rho(n+1)$, where (see Appendix C)

$$\rho(n+1)=\rho(n)-i\epsilon[H^{+},\rho(n)]+\sigma^{2}_{in}\left(H^{+}\rho(n)H^{+}-\tfrac{1}{2}\{H^{+}H^{+},\rho(n)\}\right)+\sigma^{2}_{in}\left(H^{-}\rho(n)H^{-}-\tfrac{1}{2}\{H^{-}H^{-},\rho(n)\}\right).\qquad(19)$$

Here, $H^{\pm}=\hat{N}\pm\chi_\tau\hat{N}\chi_\tau$ are Hermitian operators. We can rewrite (V.4) as $\rho(n+1)=\rho(n)+\mathcal{L}_{\epsilon,\sigma}(\rho(n))$. Since $[H^{+},\chi_\tau]=0$, the operator $H^{+}$ has an emergent $\mathbb{Z}_2$ symmetry, which means that the ground states of $H^{+}$ are well approximated by the states $|\pm\rangle=\frac{1}{\sqrt{2}}\left(|AF\rangle\pm|AF'\rangle\right)$ [47]. Note that

H^{+}|+\rangle\approx N|+\rangle,\quad H^{-}|+\rangle\approx 0, \quad (20)
H^{+}|-\rangle\approx N|-\rangle,\quad H^{-}|-\rangle\approx 0, \quad (21)

where $N$ is the system size. We conclude that if $\rho(n)=|AF\rangle\langle AF|$ then $\rho(n+1)\approx\rho(n)$, as this state is (approximately) in the kernel of $\mathcal{L}_{\epsilon,\sigma}$. Therefore, the Neel states are suitable memory states.

Figure 9: Dependence of memory retrieval on different reference states. We use a ring of $N=8$ Rydberg atoms with $\epsilon=\sigma=0.1$, and $M=100,30$ samples for the training and testing sets, respectively. The memories are sampled from a balanced Bernoulli distribution. (a) Memory retrieval error for three different choices of reference state, $|AF\rangle=|grgrgrgr\rangle$, $|gg\rangle=|gg...g\rangle$, and $|d_{2}\rangle=|grggggrg\rangle$. Due to the scarring behavior of $|AF\rangle$, the memory length is greatly improved. (b) The left-most atom's entanglement entropy averaged over the $M$ memory instances ($\bar{S}_{E}$). Saturation of $\bar{S}_{E}$ signals the thermalization of the system and thus a decrease in $R$.

Equation (19) also tells us that any density matrix in the kernel of $\mathcal{L}_{\epsilon,\sigma}$ may serve as a memory state, since it is a steady state of the evolution. This would allow us to enlarge the number of memories accessible in a qRNN. In Appendix C we show the existence of a large number of steady states, and we present a scheme to prepare a number of them. It is worth noting, however, that these memories may have to be distinguished from one another via global measurements. How to efficiently prepare and distinguish these memory states remains open, and is key to determining whether a memory quantum advantage can be claimed in qRNNs. As it stands, the use of quantum scars suggests that Rydberg-inspired RNNs may present enhanced memory, since quantum scars are classically simulatable owing to their low entanglement entropy. However, it is unclear whether the system can be classically simulated at late times due to the onset of thermalization. These questions are left for future studies.

Quite recently, another proposal to enlarge the number of memories accessible in a quantum reservoir was introduced using the emergent scale-free network dynamics of a melting discrete time-crystal in an Ising chain [25]. The proposal in [25] can be seen as a generalization of the quantum reservoir presented in (16), obtained by dropping the Rydberg-blockade constraint. Our results, as well as those in [25], raise the possibility of an RNN with a memory capacity that outpaces that of classical RNNs such as the Hopfield network [66].

VI Conclusions and outlook

In this article, we present a quantum extension of a classical RNN of binary neurons. This implies a deep connection between controllable many-body quantum systems and brain-inspired computational models. Our qRNN makes it possible to employ the analogue dynamics of quantum systems for computation instead of the circuit-based paradigm. We show how features of the quantum evolution of our qRNN can be used for quantum learning tasks and to speed up the simulation of stochastic dynamics. We implement a quantum reservoir using arrays of Rydberg atoms and show how Rydberg atoms perform analogues of biological tasks even with only a few atoms. This can be explained via the physics of the system; for example, we show how weak-ergodicity-breaking collective dynamics in Rydberg atoms can be employed for long-term memory.

While this article takes a first step in connecting controllable quantum systems and neural networks from a fundamental perspective, several questions remain unanswered. Firstly, the first two quantum features presented here warrant studies of how qRNNs can be used for quantum error correction in circuit-based quantum computing. Following directly from this work, investigations into advantageous stochastic processes in qRNNs that are robust to decoherence are enticing. These advantages will likely emerge from the collective behavior of quantum neurons. Therefore, the field will soon require a thorough understanding of the collective dissipative dynamics of neurons in qRNNs, which would also enable rigorous studies of the computational power of these architectures. Guided by the fact that neural networks become universal approximators by interconnecting many neurons, one may also consider the spatial and control requirements necessary for universal brain-inspired quantum machine learning.

Given the vast number of classical computational models for the brain, there are several immediate research directions. One of these is the exploration of a systematic way to quantize more biologically realistic models of a neural circuit. A possible starting point for translating different neural circuits is to exploit key engineering and fundamental features of different NISQ platforms. For example, recent experiments using Rydberg atoms in photonic cavities may provide the ability to capture neural plasticity in qRNNs by arbitrarily tuning the inter-neural interactions [67]. Likewise, superconducting circuits have lately been used to encode biologically realistic single-neuron models [13]. Alongside these explorations, it will be imperative to establish a variety of methods to analyze how quantum neural networks recover the classical protocols in certain limits, as well as the source and extent of the quantum advantages that each platform can offer.

Lastly, while our memory encoding scheme in Sec. V.4 offers a way to encode a binary memory, whether a larger number of memories can be encoded efficiently remains an important open question. In Appendix C we offer a proposal based on the steady states of the effective dissipative evolution in the pre-thermalization regime introduced by the noise in the qRNN. This already yields a theoretical number of memories greater than that attainable by the vanilla Hopfield network [66]. However, distinguishing these memories, or producing Hamiltonians with a desired memory state in mind, is left for future research. It is clear, however, that memory in a quantum reservoir relies on ergodicity-breaking dynamics [25, 27]. Hamiltonian engineering techniques, together with more general driven Hamiltonians such as those in [25], may pave the way towards programmable memories in a qRNN.

ACKNOWLEDGMENTS

The authors thank Mikhail D. Lukin and Nishad Maskara for insightful discussions. RAB acknowledges support from the NSF Graduate Research Fellowship under Grant No. DGE1745303, as well as funding from Harvard University's Graduate Prize Fellowship. XG acknowledges support from the Harvard-MPQ Center for Quantum Optics, the Templeton Religion Trust grant TRT 0159, and the Army Research Office under Grant W911NF1910302 and MURI Grant W911NF-20-1-0082. SFY acknowledges funding from NSF and AFOSR.

References

  • Biamonte et al. [2017] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Quantum machine learning, Nature 549, 195 (2017).
  • Harrow et al. [2009] A. W. Harrow, A. Hassidim, and S. Lloyd, Quantum algorithm for linear systems of equations, Physical Review Letters 103, 150502 (2009).
  • Wiebe et al. [2012] N. Wiebe, D. Braun, and S. Lloyd, Quantum algorithm for data fitting, Physical Review Letters 109, 050505 (2012).
  • Low et al. [2014] G. H. Low, T. J. Yoder, and I. L. Chuang, Quantum inference on Bayesian networks, Physical Review A 89, 062315 (2014).
  • Lloyd et al. [2014] S. Lloyd, M. Mohseni, and P. Rebentrost, Quantum principal component analysis, Nature Physics 10, 631 (2014).
  • McClean et al. [2018] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren plateaus in quantum neural network training landscapes, Nature communications 9, 1 (2018).
  • Cerezo et al. [2021] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Cost function dependent barren plateaus in shallow parametrized quantum circuits, Nature communications 12, 1 (2021).
  • Marković and Grollier [2020] D. Marković and J. Grollier, Quantum neuromorphic computing, Applied Physics Letters 117, 150501 (2020).
  • Kiraly et al. [2021] B. Kiraly, E. J. Knol, W. M. J. van Weerdenburg, H. J. Kappen, and A. A. Khajetoorians, An atomic Boltzmann machine capable of self-adaption, Nature Nanotechnology 10.1038/s41565-020-00838-4 (2021).
  • Mujal et al. [2021] P. Mujal, R. Martínez-Peña, J. Nokkala, J. García-Beni, G. L. Giorgi, M. C. Soriano, and R. Zambrini, Opportunities in quantum reservoir computing and extreme learning machines, arXiv preprint arXiv:2102.11831  (2021).
  • Pfeiffer et al. [2016] P. Pfeiffer, I. Egusquiza, M. Di Ventra, M. Sanz, and E. Solano, Quantum memristors, Scientific reports 6, 1 (2016).
  • Gonzalez-Raya et al. [2019] T. Gonzalez-Raya, X.-H. Cheng, I. L. Egusquiza, X. Chen, M. Sanz, and E. Solano, Quantized single-ion-channel Hodgkin-Huxley model for quantum neurons, Physical Review Applied 12, 014037 (2019).
  • Gonzalez-Raya et al. [2020] T. Gonzalez-Raya, E. Solano, and M. Sanz, Quantized three-ion-channel neuron model for neural action potentials, Quantum 4, 224 (2020).
  • Torrontegui and García-Ripoll [2019] E. Torrontegui and J. J. García-Ripoll, Unitary quantum perceptron as efficient universal approximator, EPL (Europhysics Letters) 125, 30004 (2019).
  • Fujii and Nakajima [2020] K. Fujii and K. Nakajima, Quantum reservoir computing: a reservoir approach toward quantum machine learning on near-term quantum devices, arXiv preprint arXiv:2011.04890  (2020).
  • Nakajima et al. [2019] K. Nakajima, K. Fujii, M. Negoro, K. Mitarai, and M. Kitagawa, Boosting computational power through spatial multiplexing in quantum reservoir computing, Phys. Rev. Applied 11, 034021 (2019).
  • Kutvonen et al. [2020] A. Kutvonen, K. Fujii, and T. Sagawa, Optimizing a quantum reservoir computer for time series prediction, Scientific Reports 10, 14687 (2020).
  • Ghosh et al. [2019a] S. Ghosh, A. Opala, M. Matuszewski, T. Paterek, and T. C. H. Liew, Quantum reservoir processing, npj Quantum Information 5, 10.1038/s41534-019-0149-8 (2019a).
  • Khan et al. [2021] S. A. Khan, F. Hu, G. Angelatos, and H. E. Türeci, Physical reservoir computing using finitely-sampled quantum systems, arXiv preprint arXiv:2110.13849  (2021).
  • Ghosh et al. [2019b] S. Ghosh, T. Paterek, and T. C. H. Liew, Quantum neuromorphic platform for quantum state preparation, Phys. Rev. Lett. 123, 260404 (2019b).
  • Govia et al. [2021] L. Govia, G. Ribeill, G. Rowlands, H. Krovi, and T. Ohki, Quantum reservoir computing with a single nonlinear oscillator, Physical Review Research 3, 013077 (2021).
  • Nokkala et al. [2021] J. Nokkala, R. Martínez-Peña, G. L. Giorgi, V. Parigi, M. C. Soriano, and R. Zambrini, Gaussian states of continuous-variable quantum systems provide universal and versatile reservoir computing, Communications Physics 4, 1 (2021).
  • Ghosh et al. [2021] S. Ghosh, T. Krisnanda, T. Paterek, and T. C. Liew, Realising and compressing quantum circuits with quantum reservoir computing, Communications Physics 4, 1 (2021).
  • Mujal [2022] P. Mujal, Quantum reservoir computing for speckle-disorder potentials, arXiv preprint arXiv:2201.11096  (2022).
  • Sakurai et al. [2021] A. Sakurai, M. P. Estarellas, W. J. Munro, and K. Nemoto, Quantum reservoir computation utilising scale-free networks, arXiv preprint arXiv:2108.12131  (2021).
  • Xia et al. [2022] W. Xia, J. Zou, X. Qiu, and X. Li, The reservoir learning power across quantum many-body localization transition, Frontiers of Physics 17, 1 (2022).
  • Martínez-Peña et al. [2021] R. Martínez-Peña, G. L. Giorgi, J. Nokkala, M. C. Soriano, and R. Zambrini, Dynamical phase transitions in quantum reservoir computing, Phys. Rev. Lett. 127, 100502 (2021).
  • Coolen [2001] A. Coolen, Statistical mechanics of recurrent neural networks I—statics, in Handbook of biological physics, Vol. 4 (Elsevier, 2001) Chap. 14, pp. 553–618.
  • Suzuki et al. [2022] Y. Suzuki, Q. Gao, K. C. Pradel, K. Yasuoka, and N. Yamamoto, Natural quantum reservoir computing for temporal information processing, Scientific Reports 12, 1 (2022).
  • Saffman et al. [2010] M. Saffman, T. G. Walker, and K. Mølmer, Quantum information with Rydberg atoms, Rev. Mod. Phys. 82, 2313 (2010).
  • Lester et al. [2015] B. J. Lester, N. Luick, A. M. Kaufman, C. M. Reynolds, and C. A. Regal, Rapid production of uniformly filled arrays of neutral atoms, Physical review letters 115, 073003 (2015).
  • Barredo et al. [2016] D. Barredo, S. De Léséleuc, V. Lienhard, T. Lahaye, and A. Browaeys, An atom-by-atom assembler of defect-free arbitrary two-dimensional atomic arrays, Science 354, 1021 (2016).
  • Endres et al. [2016] M. Endres, H. Bernien, A. Keesling, H. Levine, E. R. Anschuetz, A. Krajenbrink, C. Senko, V. Vuletic, M. Greiner, and M. D. Lukin, Atom-by-atom assembly of defect-free one-dimensional cold atom arrays, Science 354, 1024 (2016).
  • Labuhn et al. [2016] H. Labuhn, D. Barredo, S. Ravets, S. De Léséleuc, T. Macrì, T. Lahaye, and A. Browaeys, Tunable two-dimensional arrays of single Rydberg atoms for realizing quantum Ising models, Nature 534, 667 (2016).
  • Bernien et al. [2017] H. Bernien, S. Schwartz, A. Keesling, H. Levine, A. Omran, H. Pichler, S. Choi, A. S. Zibrov, M. Endres, M. Greiner, V. Vuletić, and M. D. Lukin, Probing many-body dynamics on a 51-atom quantum simulator, Nature 551, 579 (2017).
  • Cooper et al. [2018] A. Cooper, J. P. Covey, I. S. Madjarov, S. G. Porsev, M. S. Safronova, and M. Endres, Alkaline-earth atoms in optical tweezers, Physical Review X 8, 041055 (2018).
  • Wilson et al. [2022] J. Wilson, S. Saskin, Y. Meng, S. Ma, R. Dilip, A. Burgers, and J. Thompson, Trapping alkaline earth Rydberg atoms in optical tweezer arrays, Physical Review Letters 128, 033201 (2022).
  • Ebadi et al. [2020] S. Ebadi, T. T. Wang, H. Levine, A. Keesling, G. Semeghini, A. Omran, D. Bluvstein, R. Samajdar, H. Pichler, W. W. Ho, et al., Quantum phases of matter on a 256-atom programmable quantum simulator (2020), arXiv:2012.12281 [quant-ph].
  • Isenhower et al. [2010] L. Isenhower, E. Urban, X. Zhang, A. Gill, T. Henage, T. A. Johnson, T. Walker, and M. Saffman, Demonstration of a neutral atom controlled-not quantum gate, Physical review letters 104, 010503 (2010).
  • Pichler et al. [2018] H. Pichler, S.-T. Wang, L. Zhou, S. Choi, and M. D. Lukin, Quantum optimization for maximum independent set using Rydberg atom arrays (2018), arXiv:1808.10816 [quant-ph] .
  • Omran et al. [2019] A. Omran, H. Levine, A. Keesling, G. Semeghini, T. Wang, S. Ebadi, H. Bernien, A. Zibrov, H. Pichler, S. Choi, et al., Generation and manipulation of Schrödinger cat states in Rydberg atom arrays, Science 365, 570 (2019).
  • Henriet et al. [2020] L. Henriet, L. Beguin, A. Signoles, T. Lahaye, A. Browaeys, G.-O. Reymond, and C. Jurczak, Quantum computing with neutral atoms, Quantum 4, 327 (2020).
  • Cohen and Thompson [2021] S. R. Cohen and J. D. Thompson, Quantum computing with circular Rydberg atoms, PRX Quantum 2, 030322 (2021).
  • Song et al. [2016] H. F. Song, G. R. Yang, and X.-J. Wang, Training excitatory-inhibitory recurrent neural networks for cognitive tasks: A simple and flexible framework, PLOS Computational Biology 12, 1 (2016).
  • Han and Gallagher [2009] J. Han and T. F. Gallagher, Millimeter-wave rubidium Rydberg van der Waals spectroscopy, Phys. Rev. A 79, 053409 (2009).
  • Bluvstein et al. [2021] D. Bluvstein, A. Omran, H. Levine, A. Keesling, G. Semeghini, S. Ebadi, T. T. Wang, A. A. Michailidis, N. Maskara, W. W. Ho, et al., Controlling quantum many-body dynamics in driven Rydberg atom arrays, Science 371, 1355 (2021).
  • Maskara et al. [2021] N. Maskara, A. A. Michailidis, W. W. Ho, D. Bluvstein, S. Choi, M. D. Lukin, and M. Serbyn, Discrete time-crystalline order enabled by quantum many-body scars: entanglement steering via periodic driving, arXiv preprint arXiv:2102.13160  (2021).
  • Bausch [2020] J. Bausch, Recurrent quantum neural networks, arXiv preprint arXiv:2006.14619  (2020).
  • Wu and Yang [2007] Y. Wu and X. Yang, Strong-coupling theory of periodically driven two-level systems, Phys. Rev. Lett. 98, 013601 (2007).
  • Ghafari et al. [2019] F. Ghafari, N. Tischler, J. Thompson, M. Gu, L. K. Shalm, V. B. Verma, S. W. Nam, R. B. Patel, H. M. Wiseman, and G. J. Pryde, Dimensional quantum memory advantage in the simulation of stochastic processes, Phys. Rev. X 9, 041013 (2019).
  • Wolpert et al. [2019] D. H. Wolpert, A. Kolchinsky, and J. A. Owen, A space–time tradeoff for implementing a function with master equation dynamics, Nature communications 10, 1 (2019).
  • Korzekwa and Lostaglio [2021] K. Korzekwa and M. Lostaglio, Quantum advantage in simulating stochastic processes, Phys. Rev. X 11, 021019 (2021).
  • Blank et al. [2021] C. Blank, D. K. Park, and F. Petruccione, Quantum-enhanced analysis of discrete stochastic processes, npj Quantum Information 7, 1 (2021).
  • Burkitt [2006] A. N. Burkitt, A review of the integrate-and-fire neuron model: I. Homogeneous synaptic input, Biological Cybernetics 95, 1 (2006).
  • Jaeger [2002] H. Jaeger, Short term memory in echo state networks, GMD-Report 152 (German National Research Institute for Computer Science, 2002).
  • Capano et al. [2015] V. Capano, H. J. Herrmann, and L. de Arcangelis, Optimal percentage of inhibitory synapses in multi-task learning, Scientific Reports 5, 9895 (2015).
  • Eccles et al. [1954] J. C. Eccles, P. Fatt, and K. Koketsu, Cholinergic and inhibitory synapses in a pathway from motor-axon collaterals to motoneurones, The Journal of Physiology 126, 524 (1954).
  • Weber et al. [2017] S. Weber, C. Tresp, H. Menke, A. Urvoy, O. Firstenberg, H. P. Büchler, and S. Hofferberth, Tutorial: Calculation of Rydberg interaction potentials, J. Phys. B: At. Mol. Opt. Phys. 50, 133001 (2017).
  • Roitman and Shadlen [2002] J. D. Roitman and M. N. Shadlen, Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task, The Journal of Neuroscience 22, 9475 (2002).
  • Govia et al. [2022] L. Govia, G. Ribeill, G. Rowlands, and T. Ohki, Nonlinear input transformations are ubiquitous in quantum reservoir computing, Neuromorphic Computing and Engineering 2, 014008 (2022).
  • Urban et al. [2009] E. Urban, T. A. Johnson, T. Henage, L. Isenhower, D. Yavuz, T. Walker, and M. Saffman, Observation of Rydberg blockade between two atoms, Nature Physics 5, 110 (2009).
  • Gaetan et al. [2009] A. Gaetan, Y. Miroshnychenko, T. Wilk, A. Chotia, M. Viteau, D. Comparat, P. Pillet, A. Browaeys, and P. Grangier, Observation of collective excitation of two individual atoms in the Rydberg blockade regime, Nature Physics 5, 115 (2009).
  • D’Alessio et al. [2016] L. D’Alessio, Y. Kafri, A. Polkovnikov, and M. Rigol, From quantum chaos and eigenstate thermalization to statistical mechanics and thermodynamics, Advances in Physics 65, 239 (2016).
  • Fendley et al. [2004] P. Fendley, K. Sengupta, and S. Sachdev, Competing density-wave orders in a one-dimensional hard-boson model, Physical Review B 69, 075106 (2004).
  • Lesanovsky and Katsura [2012] I. Lesanovsky and H. Katsura, Interacting Fibonacci anyons in a Rydberg gas, Physical Review A 86, 041601 (2012).
  • Folli et al. [2017] V. Folli, M. Leonetti, and G. Ruocco, On the maximum storage capacity of the Hopfield model, Frontiers in Computational Neuroscience 10, 144 (2017).
  • Periwal et al. [2021] A. Periwal, E. S. Cooper, P. Kunkel, J. F. Wienand, E. J. Davis, and M. Schleier-Smith, Programmable interactions and emergent geometry in an atomic array, arXiv preprint arXiv:2106.04070  (2021).
  • Goodman [1970] G. S. Goodman, An intrinsic time for non-stationary finite Markov chains, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 16, 165 (1970).
  • Reiter and Sørensen [2012] F. Reiter and A. S. Sørensen, Effective operator formalism for open quantum systems, Physical Review A 85, 032111 (2012).
  • Nelder and Mead [1965] J. A. Nelder and R. Mead, A simplex method for function minimization, The computer journal 7, 308 (1965).
  • Serbyn et al. [2021] M. Serbyn, D. A. Abanin, and Z. Papić, Quantum many-body scars and weak breaking of ergodicity, Nature Physics 17, 675 (2021).

Appendix A Probability transformations using qRNNs

In the case of the RNN presented in (II), using Ref. [28] we can derive that $P(\bm{s}|\bm{s}^{\prime})$ in (8) is given by

P(\bm{s}|\bm{s}^{\prime})=\prod_{i=1}^{N}\frac{1}{2}\left(1+s_{i}\,g\left[h_{i}(\bm{s}^{\prime})/\sigma^{2}_{in}-1\right]\right)

where $g[x]=\text{Erf}[x/\sqrt{2}]$ is the error function arising from the Gaussian noise. Regarding the task in Sec. III.1 of flipping all neurons at once, one could naively think that this can be done classically by taking the inputs $\Delta_{n}\rightarrow\infty$. However, since the noise's strength $\sigma_{in}^{2}$ scales with the size of the inputs, one obtains $P(\bm{s}|\bm{s}^{\prime})\rightarrow\prod_{i=1}^{N}\frac{1}{2}(1+s_{i}/2)$, which is a completely random update independent of the original state.
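A minimal numerical transcription of this product formula, assuming the local fields $h_{i}(\bm{s}^{\prime})$ have already been computed (the field values below are hypothetical):

```python
import numpy as np
from scipy.special import erf

def transition_prob(s, h, sigma_in):
    """P(s|s') for spins s_i = +/-1, given local fields h_i = h_i(s')."""
    g = erf((h / sigma_in ** 2 - 1) / np.sqrt(2))   # g[x] = Erf[x/sqrt(2)]
    return float(np.prod(0.5 * (1 + s * g)))

# Example with hypothetical fields: strong positive fields make s = (+1,...,+1) likely.
s = np.ones(4)
h = 5.0 * np.ones(4)
print(transition_prob(s, h, sigma_in=1.0))           # close to 1
```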

A transition matrix $L$ obeys $L_{\bm{s}^{\prime}|\bm{s}}\geq 0$ and $\sum_{\bm{s}^{\prime}}L_{\bm{s}^{\prime}|\bm{s}}=1$. $L$ is said to be classically embeddable if it can be generated by a continuous Markov process via

\frac{d}{dt}P(t)=K(t)P(t),\quad P(0)=\mathds{1},\quad P(t_{f})=L, \quad (A.22)

where $K$ is called a generator matrix; it preserves the positivity of $P$ via the constraint $K_{\bm{s}|\bm{s}^{\prime}}\geq 0$ for $\bm{s}\neq\bm{s}^{\prime}$, and normalization via the constraint $\sum_{\bm{s}}K_{\bm{s}|\bm{s}^{\prime}}=0$. Applied to our setup, a classically embeddable stochastic process is one that can transform $p_{t_{f}}=Lp_{0}$ via an RNN without employing any hidden neurons (i.e., $M=N$ neurons are used for readout), and in a single step. In general, determining whether a matrix $L$ is embeddable is an open question, but any embeddable matrix must necessarily satisfy [68]

\prod_{\bm{s}}L_{\bm{s}|\bm{s}}\geq\det L\geq 0. \quad (A.23)

From (A.23), it immediately follows that the global "spin-flip" matrix $F$ defined in (9) is not classically embeddable: $\det F=1$ and $\prod_{\bm{s}}F_{\bm{s}|\bm{s}}=0$, violating (A.23). Notice that the impossibility of performing $F$ without hidden neurons is quite general, and is not limited to the stochastic processes allowed by (II). Moreover, the number of time-steps needed to achieve $F$ using $m$ hidden neurons is of order $\mathcal{O}(2^{N-m})$ (for details, see Sec. III.A in Ref. [52]).
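The violation of (A.23) by $F$ can be verified directly for small $N$. Below is a minimal numpy sketch; the construction of $F$ as the bit-complement permutation is our own, chosen to match the definition of the global spin-flip.

```python
import numpy as np

N = 4
dim = 2 ** N

# Global spin-flip: the basis state with bit string b maps to its complement,
# i.e. index i -> dim - 1 - i in the usual binary ordering.
F = np.zeros((dim, dim))
for i in range(dim):
    F[dim - 1 - i, i] = 1.0

det_F = np.linalg.det(F)                  # = 1 for N >= 2 (even number of swaps)
diag_prod = np.prod(np.diag(F))           # = 0: no state is its own complement
print(det_F, diag_prod)
print("necessary condition (A.23):", diag_prod >= det_F >= 0)   # False
```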

Similar definitions of embeddability exist in the quantum setting. A stochastic process $L$ is said to be quantum embeddable if there exists a Markovian quantum channel $\mathcal{E}$ such that

L_{\bm{s}^{\prime}|\bm{s}}=\langle\bm{s}^{\prime}|\mathcal{E}(|\bm{s}\rangle\langle\bm{s}|)|\bm{s}^{\prime}\rangle. \quad (A.24)

A Markovian quantum channel $\mathcal{E}$ is a channel arising from time-evolution under a master equation, and thus $\mathcal{E}$ may include unitary and dissipative terms. As pointed out in Ref. [52], all classically embeddable stochastic processes are also quantum embeddable. Moreover, permutations such as $F$ in (9) are quantum embeddable since all permutations are unitary operators.

We highlight that realizing $F$ is extremely sensitive to the decoherence arising from spontaneous emission, a main source of noise in NISQ devices. If $\gamma$ is the rate at which spin $|1\rangle$ relaxes to $|\text{-}1\rangle$, one can show that the unitary evolution leads to the stochastic process $F^{\gamma}$ with $\det F^{\gamma}=e^{-\mathcal{O}(2^{N})}$. Notice that whether $F^{\gamma}$ violates (A.23) becomes rapidly inconclusive with increasing system size.

Appendix B Continuous-time dynamics for a qRNN

A successful neural circuit model is the integrate-and-fire RNN (IF-RNN). In an IF-RNN, each of the $N$ neurons is influenced by pre-synaptic firing rates and produces a post-synaptic firing rate as an output. Each neuron is endowed with a firing rate $s_{n}(t)$, where $n$ denotes the $n^{th}$ neuron. The pre-synaptic firing rates arriving at the $n^{th}$ neuron are integrated to produce a pre-synaptic current $I_{n}(t)$. In turn, the neuron produces a firing rate $s_{n}$ influenced by its current and the firings of other neurons. Additionally, each neuron can receive a temporal input stimulus $\Delta^{in}_{n}(t)$ which affects both the currents and the firing rates. Generally, the firing rates and currents are described by non-linear, coupled differential equations of the form

\dot{I}_{n}=-\tau_{I}^{-1}I_{n}+G_{n}(\bm{s}(t),\bm{I}(t),J_{nm},\bm{\Delta}^{in}(t)) \quad (B.25)
\dot{s}_{n}=-\tau^{-1}_{s}s_{n}+F_{n}(\bm{s}(t),\bm{I}(t),J_{nm},\bm{\Delta}^{in}(t)) \quad (B.26)

where $\tau_{I}$ and $\tau_{s}$ are relaxation time constants for the currents and firing rates, respectively. The vector $\bm{s}(t)$ is defined as $\bm{s}(t)=(s_{1}(t),...,s_{N}(t))$, with $\bm{I}(t)$ and $\bm{\Delta}^{in}(t)$ defined analogously. The functions $G_{n}$ and $F_{n}$ make the dynamics non-linear, which gives RNNs their vast computational complexity. The specific forms of $G_{n}$ and $F_{n}$ depend on the application and on the relation between the currents and the firing rates one is trying to capture with the model.

The qRNN in Sec. III.2 follows the Heisenberg-Langevin equations of motion

\dot{A}=i[H,A]+\sum_{n}\left(\frac{\gamma}{2}\sigma_{n}^{+}+f_{n}^{\dagger}\right)[A,\sigma_{n}^{-}]+\sum_{n}[A,\sigma_{n}^{+}]\left(\frac{\gamma}{2}\sigma_{n}^{-}+f_{n}\right) \quad (B.27)

for any operator $A$. In (B.27), $\sigma_{n}^{+}=|1\rangle\langle\text{-}1|_{n}$, $\sigma^{-}_{n}=(\sigma_{n}^{+})^{\dagger}$, and $f_{n}$ is a Langevin noise operator with Gaussian statistics $\left\langle f_{n}(t)\right\rangle=0$ and $\left\langle f_{n}(t)f_{m}^{\dagger}(t^{\prime})\right\rangle\propto\delta_{mn}\delta(t-t^{\prime})$. In the equation above, $[A,B]\equiv AB-BA$ stands for the commutator between operators $A$ and $B$.

To extract the statistics of the system, one may choose to look at the dynamics of two different local observables' expectation values. For example, the equations of motion for the expectations of the local Pauli operators $\sigma^{x}_{n}=|\text{-}1\rangle\langle 1|_{n}+|1\rangle\langle\text{-}1|_{n}$ and $\sigma^{y}_{n}=i|\text{-}1\rangle\langle 1|_{n}-i|1\rangle\langle\text{-}1|_{n}$ are given by

\dot{\left\langle\sigma^{x}_{n}\right\rangle}=-\frac{\gamma}{2}\left\langle\sigma^{x}_{n}\right\rangle+i\left\langle\left[H(t),\sigma_{n}^{x}\right]\right\rangle \quad (B.28)
\dot{\left\langle\sigma^{y}_{n}\right\rangle}=-\frac{\gamma}{2}\left\langle\sigma^{y}_{n}\right\rangle+i\left\langle\left[H(t),\sigma_{n}^{y}\right]\right\rangle \quad (B.29)

with $H(t)$ specified by (III.1). The expectation values are calculated in the quantum-mechanical sense such that for an operator $A$, $\left\langle A\right\rangle=\text{Tr}(A\rho)$, and terms linear in $f_{n}$ cancel out. Notice that the commutators in equations (B.28)-(B.29) play the role of the functions $G_{n}$ and $F_{n}$ in (B.25)-(B.26).

For $\sigma_{n}^{z}$, (B.27) gives

\dot{\sigma^{z}_{n}}=-\frac{\gamma}{2}\sigma^{z}_{n}-\frac{\Omega}{2}\sigma_{n}^{y}+\frac{\gamma}{2}\mathds{1}-2f_{n}^{\dagger}\sigma_{n}^{-}. \quad (B.30)

This can be formally integrated to give

\sigma_{n}^{z}(t)-\sigma_{n}^{z}(0)=\int_{0}^{t}dt^{\prime}e^{-\gamma(t-t^{\prime})/2}\left(-\frac{\Omega}{2}\sigma_{n}^{y}(t^{\prime})-2f_{n}^{\dagger}(t^{\prime})\sigma_{n}^{-}(t^{\prime})+\frac{\gamma}{2}\mathds{1}\right) \quad (B.31)

We choose to start the network at $\left\langle\sigma^{z}_{n}\right\rangle=-1$ for all $n$. We plug this back into (B.29), and take expectation values to eliminate terms linear in $f_{n}$. We obtain

\dot{\left\langle\sigma^{y}_{n}\right\rangle}=-\frac{\gamma}{2}\left\langle\sigma_{n}^{y}\right\rangle+\frac{\Omega}{2}\sum_{m=1}^{N}J_{nm}\int_{0}^{t}dt^{\prime}e^{-\gamma(t-t^{\prime})}\left\langle\sigma^{x}_{n}(t)\sigma^{y}_{m}(t^{\prime})\right\rangle+\Delta_{n}(t)\left\langle\sigma_{n}^{x}\right\rangle-\frac{\Omega^{2}}{4}\int_{0}^{t}\left\langle\sigma^{y}_{n}(t^{\prime})\right\rangle e^{-\gamma(t-t^{\prime})}dt^{\prime}. \quad (B.32)

Similar equations can be found for $\left\langle\sigma_{n}^{x}\right\rangle$. Equation (B.32) tells us that $\left\langle\sigma_{n}^{y}\right\rangle$ depends on past statistics, and thus our network has a memory time bounded by $1/\gamma$. Let $J$ denote the matrix $J_{nm}$. For $\gamma t\gg 1$, we can extend the lower limit of integration to $-\infty$. Using the approximation $\int_{-\infty}^{t}e^{-\gamma(t-t^{\prime})}f(t^{\prime})dt^{\prime}\approx\gamma^{-1}f(t)$, we obtain

\dot{\left\langle\sigma_{n}^{x}\right\rangle}=-\frac{\gamma}{2}\left\langle\sigma_{n}^{x}\right\rangle-\Delta_{n}^{in}\left\langle\sigma_{n}^{y}\right\rangle-\frac{\Omega}{2\gamma}\sum_{m}J_{nm}\left\langle\sigma_{n}^{y}\sigma_{m}^{y}\right\rangle \quad (B.33)
\dot{\left\langle\sigma_{n}^{y}\right\rangle}=-\left(\frac{\gamma}{2}+\frac{\Omega^{2}}{4\gamma}\right)\left\langle\sigma_{n}^{y}\right\rangle+\Delta_{n}^{in}\left\langle\sigma_{n}^{x}\right\rangle-\frac{\Omega}{2\gamma}\sum_{m}J_{nm}\left\langle\sigma_{n}^{x}\sigma_{m}^{y}\right\rangle \quad (B.34)

thus leading to (III.2). In (B.33)-(B.34), the time dependence is implied.

Let us now define $s_{n}(t)\equiv\left\langle\sigma^{y}_{n}(t)\right\rangle$ and $I_{n}(t)\equiv\left\langle\sigma^{x}_{n}(t)\right\rangle$ so that $\bm{s}(t)=(s_{1}(t),...,s_{N}(t))$ and $\bm{I}(t)=(I_{1}(t),...,I_{N}(t))$. We see that (B.33)-(B.34) match (B.25)-(B.26) with

G_{n}=-\Delta_{n}s_{n}-\frac{\Omega}{2\gamma}\sum_{m}J_{nm}\left\langle I_{n}I_{m}\right\rangle,\quad\tau_{I}^{-1}=\frac{\gamma}{2}, \quad (B.35)
F_{n}=\Delta_{n}I_{n}-\frac{\Omega}{2\gamma}\sum_{m}J_{nm}\left\langle I_{n}s_{m}\right\rangle,\quad\tau_{s}^{-1}=\frac{\gamma}{2}+\frac{\Omega^{2}}{4\gamma}. \quad (B.36)

Equations (B.33)-(B.34) allow us to naturally interpret $\left\langle\sigma_{n}^{y}\right\rangle$ as the firing rate of the $n^{th}$ neuron, and $\left\langle\sigma_{n}^{x}\right\rangle$ as its current. That is, the rate of the pre-synaptic neuron $\left\langle\sigma^{y}_{k}\right\rangle$ amounts to a current in the post-synaptic neuron $\left\langle\sigma^{x}_{n}\right\rangle$, which in turn drives its rate $\left\langle\sigma_{n}^{y}\right\rangle$.

Equations (B.33)-(B.34) comprise a system of coupled quadratic differential equations, where the quadratic terms arise from the nontrivial commutation relations of the Pauli operators, $[\sigma_{n}^{\alpha},\sigma_{m}^{\beta}]=2i\delta_{nm}\epsilon_{\alpha\beta\gamma}\sigma_{n}^{\gamma}$, where $\epsilon_{\alpha\beta\gamma}$ is the Levi-Civita symbol. These quadratic terms make a qRNN a powerful computational system, similar to how the functions $G_{n}$ and $F_{n}$ make an RNN a powerful computational system.
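Equations (B.33)-(B.34) can be integrated numerically once the two-point functions are approximated. The sketch below uses a simple Euler scheme together with a mean-field factorization $\langle\sigma_{n}^{\alpha}\sigma_{m}^{\beta}\rangle\approx\langle\sigma_{n}^{\alpha}\rangle\langle\sigma_{m}^{\beta}\rangle$; the factorization and all parameter values are our own illustrative assumptions, not part of the derivation above.

```python
import numpy as np

# Hypothetical parameters (rates in MHz, time in microseconds).
N, gamma, Omega = 4, 1.0, 2.0
Delta = 0.5 * np.ones(N)                       # input detunings Delta_n^in
rng = np.random.default_rng(1)
J = rng.normal(0, 0.3, (N, N))
np.fill_diagonal(J, 0.0)

x = np.zeros(N)                                # <sigma_n^x>, the "currents"
y = 1e-3 * rng.standard_normal(N)              # <sigma_n^y>, the "firing rates"

dt, steps = 1e-3, 5000
for _ in range(steps):
    # Mean-field factorization of the quadratic terms in (B.33)-(B.34).
    dx = -gamma / 2 * x - Delta * y - Omega / (2 * gamma) * y * (J @ y)
    dy = (-(gamma / 2 + Omega ** 2 / (4 * gamma)) * y + Delta * x
          - Omega / (2 * gamma) * x * (J @ y))
    x, y = x + dt * dx, y + dt * dy
```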

Appendix C Memory and quantum many-body scars

As described in the main text, and more thoroughly discussed in Ref. [47], the scarring behavior of the kicked PXP model is robust to fixed imperfections in the drive. The robustness persists even for random noise. Fig. 10 shows the overlap with the initial condition for a noisy, kicked PXP model for different values of $\epsilon$ and $\sigma_{in}$, a natural extension of the model in [47]. The Neel state $|AF\rangle$ exhibits robust revivals regardless of $\sigma^{2}_{in}$. This fact can be explained with the effective theory presented below.

Figure 10: Fidelities with the initial state after evolving for $n=100$ cycles of noisy, kicked dynamics. The fidelity is defined as $F\equiv|\langle\psi|\psi(2n\tau)\rangle|^{2}$ where $|\psi\rangle$ is the initial state. Here, we use $N=8$ Rydberg atoms and define $|AF\rangle=|grgrgrgr\rangle$, $|gg\rangle=|gggggggg\rangle$, and $|d_{2}\rangle=|grggggrg\rangle$. The Neel state $|AF\rangle$ is robust to the noise in the drive since this state is invariant under the decoherence up to second order in $\epsilon_{i}$.
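The revivals of Fig. 10 can be reproduced with a short exact-diagonalization sketch. The implementation below is our own minimal one: it works on the full $2^{N}$-dimensional space, takes $H_{PXP}=\sum_{k}P_{k-1}(\sigma^{x}_{k}/2)P_{k+1}$ on a periodic ring with $\Omega=1$ (the factor-of-two convention is our assumption), and draws the kick noise as in the text.

```python
import numpy as np
from scipy.linalg import expm

N, tau = 8, 1.51 * np.pi                        # ring of N atoms, Omega = 1
dim = 2 ** N
bit = lambda i, k: (i >> k) & 1

# Diagonal of the number operator N_hat, and the PXP Hamiltonian in the
# computational basis: flip site k only if both neighbors are unexcited.
ndiag = np.array([sum(bit(i, k) for k in range(N)) for i in range(dim)], dtype=float)
H = np.zeros((dim, dim), dtype=complex)
for i in range(dim):
    for k in range(N):
        if bit(i, (k - 1) % N) == 0 and bit(i, (k + 1) % N) == 0:
            H[i ^ (1 << k), i] += 0.5            # (Omega/2) sigma^x convention

# chi_tau = exp(-i pi N_hat) exp(-i tau H); the phase factor is diagonal.
chi = np.exp(-1j * np.pi * ndiag)[:, None] * expm(-1j * tau * H)

# Noisy kicked evolution: each half-cycle applies e^{-i eps N_hat} chi,
# with eps ~ Normal(0.1, 0.1) as in the text.
rng = np.random.default_rng(0)
AF = np.zeros(dim, dtype=complex)
AF[int("10" * (N // 2), 2)] = 1.0                # Neel state |1010...>
psi = AF.copy()
for _ in range(2 * 20):                          # 20 cycles of two kicks each
    psi = np.exp(-1j * rng.normal(0.1, 0.1) * ndiag) * (chi @ psi)
print("revival fidelity after 20 cycles:", abs(AF.conj() @ psi) ** 2)
```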

To understand the robustness of the quantum scarring behavior in the Rydberg reservoir, it is instructive to seek an effective description of the system's evolution. Recall that a cycle is defined as two imperfect applications of $\chi_{\tau}$. The Hamiltonian in (16) produces the single-cycle unitary

U_{\tau}(\epsilon_{1},\epsilon_{2})=e^{-i\epsilon_{2}\hat{N}}\chi_{\tau}e^{-i\epsilon_{1}\hat{N}}\chi_{\tau}=e^{-i\epsilon_{2}\hat{N}}e^{-i\epsilon_{1}\chi_{\tau}\hat{N}\chi_{\tau}} \quad (C.37)

where we use the fact that $\chi_{\tau}$ is both Hermitian and unitary. Using the Baker-Campbell-Hausdorff formula to second order in $\epsilon_{i}$, we can rewrite (C.37) as

U_{\tau}(\epsilon_{1},\epsilon_{2})\approx e^{-i(\epsilon_{2}\hat{N}+\epsilon_{1}\chi_{\tau}\hat{N}\chi_{\tau})}. \quad (C.38)

A state $\rho(n)$ evolves to $\rho(n+1)=U_{\tau}(\epsilon_{1},\epsilon_{2})\rho(n)U^{\dagger}_{\tau}(\epsilon_{1},\epsilon_{2})$ after a cycle. Expanding this to second order in $\epsilon_{k}$ and using the fact that $\langle\epsilon_{k}\rangle=\epsilon$ and $\langle\epsilon_{k}\epsilon_{l}\rangle=\sigma_{in}^{2}\delta_{kl}$, we obtain the average evolution of the state

\rho(n+1)-\rho(n)=-i\epsilon[H^{+},\rho(n)]+\sigma_{in}^{2}\left(\hat{N}\rho(n)\hat{N}-\frac{1}{2}\{\hat{N}^{2},\rho(n)\}\right)+\sigma_{in}^{2}\left(\chi_{\tau}\hat{N}\chi_{\tau}\rho(n)\chi_{\tau}\hat{N}\chi_{\tau}-\frac{1}{2}\{\chi_{\tau}\hat{N}^{2}\chi_{\tau},\rho(n)\}\right). \quad (C.39)

Here, $\{A,B\}=AB+BA$ denotes the anticommutator, and we define $H^{+}=\hat{N}+\chi_{\tau}\hat{N}\chi_{\tau}$. For times $T\gg 2\tau$, we can take (C.39) to be a Lindbladian evolution since the noise is Markovian. We can rewrite (C.39) as

\dot{\rho}=\mathcal{L}_{\epsilon,\sigma}(\rho) \quad (C.40)
\mathcal{L}_{\epsilon,\sigma}(\cdot)=-i\frac{\epsilon}{2\tau}[H^{+},\cdot]+\frac{\sigma_{in}^{2}}{2\tau}D^{+}(\cdot)+\frac{\sigma_{in}^{2}}{2\tau}D^{-}(\cdot) \quad (C.41)
D^{\pm}(\cdot)=H^{\pm}\cdot H^{\pm}-\frac{1}{2}\{H^{\pm}H^{\pm},\cdot\} \quad (C.42)

where $H^{-}=\hat{N}-\chi_{\tau}\hat{N}\chi_{\tau}$. For $\tau=1.51\pi\Omega^{-1}$, the Neel states are approximately simultaneous eigenstates of $\chi_{\tau}\hat{N}\chi_{\tau}$ and $\hat{N}$, each with eigenvalue $N/2$ for a system of size $N$. Thus, they are simultaneous eigenstates of $H^{\pm}$, and so

\mathcal{L}_{\epsilon,\sigma}(|AF\rangle\langle AF|)\approx 0,\quad\mathcal{L}_{\epsilon,\sigma}(|AF^{\prime}\rangle\langle AF^{\prime}|)\approx 0. \quad (C.43)

Therefore, the Neel states are steady states. It is worth noting that $\mathcal{L}_{\epsilon,\sigma}$ captures only the pre-thermal evolution. Ultimately, higher-order effects in $\epsilon_{k}$ take over and lead to the thermalization of the Neel states, similar to the results in [47] and as seen in Fig. 9. Nonetheless, the thermalization of the Neel states is delayed relative to other states due to (C.43).

Figure 11: Number of zero eigenvalues of the super-operator $\mathcal{L}_{\epsilon,\sigma}$ as a function of the system size. $\mathcal{L}_{\epsilon,\sigma}$ describes the effective dynamics of a reservoir composed of kicked Rydberg atoms. The number of zeros surpasses the linear number of memories available in the Hopfield network.
Figure 12: Empirical memory states $\rho_{ss}(s)$ obtained from evolving the initial states $|s\rangle$, which are basis states of the Rydberg-blockaded Hilbert space. $N_{m}^{e}$ denotes the number of memories found using this procedure. The different plots show the fidelities $F(\rho_{ss}(s),\rho_{ss}(s^{\prime}))$ between different steady states. The red squares delimit the basis states with different numbers of Rydberg excitations, starting with the zero-excitation sector on the top-left square and ending with the $N/2$-excitation sector on the bottom-right square. Red arrows denote the initial configurations for each of the $N^{e}_{m}$ memories found empirically. While this procedure produces a number of memory states smaller than the number of zeros of $\mathcal{L}_{\epsilon,\sigma}$, we find $N_{m}^{e}>N$, a bound unattainable by common classical RNNs.

Moreover, any density matrix $\rho_{ss}$ in the kernel of $\mathcal{L}_{\epsilon,\sigma}$ can be used as a memory state. Expressing $\mathcal{L}_{\epsilon,\sigma}$ as a super-operator on density matrices, we can look at its spectrum, which is in general complex. Fig. 11 shows the number of zero eigenvalues of $\mathcal{L}_{\epsilon,\sigma}$ for different system sizes $N$. The number of zeros scales superlinearly in $N$. Therefore, a quantum reservoir evolving under $\mathcal{L}_{\epsilon,\sigma}$ may have a larger number of memory states than a classical RNN. To prepare these states, we propose to initialize the reservoir in different string configurations $|s\rangle$ satisfying the Rydberg-blockade constraint. For example, $s=rgg..g$ is allowed while $s=rrg...g$ is not. The system is left to evolve for some time $T_{ss}$ to reach a steady state $\rho_{ss}(s)$, which can then be used as a memory. Different initial strings can lead to different steady states, as exemplified in Fig. 12. Fig. 12 shows the fidelity between $\rho_{ss}(s)$ and $\rho_{ss}(s^{\prime})$, defined as

F(\rho_{ss}(s),\rho_{ss}(s^{\prime}))=\left(\text{Tr}\sqrt{\sqrt{\rho_{ss}(s)}\,\rho_{ss}(s^{\prime})\sqrt{\rho_{ss}(s)}}\right)^{2}. \quad (C.44)
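The fidelity (C.44) can be evaluated with scipy's matrix square root; a minimal sketch with two commuting test states:

```python
import numpy as np
from scipy.linalg import sqrtm

def fidelity(rho, sigma):
    """Uhlmann fidelity (C.44): (Tr sqrt( sqrt(rho) sigma sqrt(rho) ))^2."""
    s = sqrtm(rho)
    return float(np.real(np.trace(sqrtm(s @ sigma @ s))) ** 2)

# Sanity checks on commuting mixed states: F(rho, rho) = 1, and for
# diagonal states F reduces to (sum_i sqrt(p_i q_i))^2.
rho = np.diag([0.9, 0.1])
sigma = np.diag([0.1, 0.9])
print(fidelity(rho, rho), fidelity(rho, sigma))   # 1.0, 0.36
```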

The red arrows in Fig. 12 indicate the different memory states obtained by this scheme. It is worth noting that this scheme offers an empirical number of memories $N^{e}_{m}$ that scales at most as $\phi^{N}$, where $\phi\approx 1.62$ is the golden ratio, since the dimension of the Rydberg-blockaded Hilbert space grows as the Fibonacci numbers. We see that $N^{e}_{m}>N$ in all instances, a bound unattainable by classical RNNs such as the Hopfield network [66]. However, this scheme relies on an efficient way to recognize the different memory states through measurements, a question that we leave for future investigations.
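The zero-eigenvalue count of Fig. 11 can be estimated by vectorizing $\mathcal{L}_{\epsilon,\sigma}$ into a matrix acting on vectorized density matrices. The following self-contained sketch is our own construction at small size ($N=4$ for tractability); the tolerance used to declare an eigenvalue "zero" is arbitrary.

```python
import numpy as np
from scipy.linalg import expm

# Build H^+- = N_hat +- chi_tau N_hat chi_tau for a small periodic ring,
# following Eq. (16) and Appendix C; N = 4 keeps the superoperator small.
N, tau, eps, sig = 4, 1.51 * np.pi, 0.1, 0.1
dim = 2 ** N
bit = lambda i, k: (i >> k) & 1
ndiag = np.array([sum(bit(i, k) for k in range(N)) for i in range(dim)], dtype=float)
H = np.zeros((dim, dim), dtype=complex)
for i in range(dim):
    for k in range(N):
        if bit(i, (k - 1) % N) == 0 and bit(i, (k + 1) % N) == 0:
            H[i ^ (1 << k), i] += 0.5            # (Omega/2) sigma^x convention
chi = np.exp(-1j * np.pi * ndiag)[:, None] * expm(-1j * tau * H)
Nhat = np.diag(ndiag).astype(complex)
# We conjugate with chi^dagger, which equals chi N_hat chi when chi is
# Hermitian, as stated in the text; this keeps H^+- exactly Hermitian.
chiNchi = chi.conj().T @ Nhat @ chi
Hp, Hm = Nhat + chiNchi, Nhat - chiNchi

# Column-stacking convention: vec(A rho B) = (B^T kron A) vec(rho).
I = np.eye(dim)
def dissipator(L):
    LdL = L.conj().T @ L
    return np.kron(L.conj(), L) - 0.5 * np.kron(I, LdL) - 0.5 * np.kron(LdL.T, I)

Lsup = (-1j * eps * (np.kron(I, Hp) - np.kron(Hp.T, I))
        + sig ** 2 * (dissipator(Hp) + dissipator(Hm))) / (2 * tau)
evals = np.linalg.eigvals(Lsup)
print("near-zero eigenvalues:", int(np.sum(np.abs(evals) < 1e-8)))
```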

Appendix D Experimental values and numerical simulations

Figure 13: Schematic of Rydberg atoms as used in Ref. [38]. The ground state $|g\rangle=|5S_{1/2}\rangle$ and the Rydberg state $|r\rangle=|50S_{1/2}\rangle$ are coupled via a two-photon transition. An off-resonance 420 nm laser ($\Omega_{420}=2\pi\times 160\text{ MHz}$, $\delta=2\pi\times 1\text{ GHz}$) couples $|g\rangle$ with the intermediate $|6P_{3/2}\rangle$ state, and a 1013 nm laser ($\Omega_{1013}=2\pi\times 50\text{ MHz}$) couples the intermediate state and $|r\rangle$, creating an effective drive between $|g\rangle$ and $|r\rangle$ at rate $\Omega=\frac{\Omega_{420}\Omega_{1013}}{2\delta}=2\pi\times 4.2\text{ MHz}$. Four spontaneous emission processes are at play: emission to nearby Rydberg states due to black-body radiation at a rate $\gamma_{BBR}=2\pi/(250\text{ }\mu\text{s})$, photon scattering out of the intermediate state into the ground state at rate $\gamma_{420}=2\pi/(20\text{ }\mu\text{s})$ and into the Rydberg state at rate $\gamma_{1013}=2\pi/(150\text{ }\mu\text{s})$, and spontaneous emission from $|r\rangle$ to $|g\rangle$ at rate $\gamma_{SE}=2\pi/(375\text{ }\mu\text{s})$. Since $\gamma_{BBR}+\gamma_{SE}+\gamma_{1013}=2\pi/(75\text{ }\mu\text{s})$ is smaller than $\gamma_{420}$, the leading source of decoherence for short periods of time ($<10\text{ }\mu\text{s}$) is the $\gamma_{420}$ decay.

In this section, we outline the experimental values used for the numerical simulations of Sec. V. For concreteness, for simulating Rydberg atoms we use the experimental values of Ref. [38] (see Fig. 13). In this experimental platform, a two-photon transition couples $|g\rangle=|5S_{1/2}\rangle$ and $|r\rangle=|50S_{1/2}\rangle$ via an off-resonance state $|6P_{3/2}\rangle$. For this setup, and for short simulation times ($<10\text{ }\mu\text{s}$), the dominant source of decoherence is photon scattering out of the intermediate state. Using the fact that the intermediate state is off-resonance, we can adiabatically eliminate it to produce an effective decay operator (see Sec. IV.B in Ref. [69])

\sigma^{-}_{eff}=\frac{\sqrt{\gamma_{420}}}{2\delta}|g\rangle\left(\Omega_{420}\langle g|+\Omega_{1013}\langle r|\right) \quad (D.45)

which is an effective spontaneous emission from |r|r\rangle to |g|g\rangle accompanied by decoherence on the ground state.

We chose $\Omega=4.2\text{ MHz}$. Additionally, a pair of $|r\rangle$ atoms interact with a strength $C_{6}=862.9\text{ GHz}(\mu\text{m})^{6}$. We used the pairinteraction Python package [58] to determine that a pair of $|r\rangle=|70S_{1/2}\rangle$ and $|r^{\prime}\rangle=|73S_{1/2}\rangle$ atoms has a similar interaction strength of $C_{6}^{rr^{\prime}}=-836.6\text{ GHz}(\mu\text{m})^{6}\approx-C_{6}$. We used this interaction to model the inhibitory and excitatory neurons in Sec. V.1 ($V_{n_{Q},n_{Q}}=V$, $V_{n_{Q},n_{Q}^{\prime}}=-V$). We denote $V=C_{6}/a^{6}_{0}$, where $a_{0}$ is tuned to give different nearest-neighbor interaction strengths.

Next, we explain and report the numerical parameters chosen for each of the biological tasks.

D.1 Multitasking

Our scheme to encode inhibitory and excitatory neurons relies on approximating (13); as a result, one needs the "inhibitory neurons" to be as far away from each other as possible, so that they do not interact positively with each other. For this reason, this task uses a 1D open chain of atoms separated by a distance $a_{0}$, with the inhibitory neurons at opposite ends of the chain and in the bulk with maximum spacing from each other. The input neurons are chosen to be the two at one end of the chain, while the output neuron is chosen to be at the opposite end. This choice ensures that the input neurons interact with the whole chain before readout.

The inputs are uniformly sampled from $\{0,2\pi\}$ MHz with added Gaussian noise $\sigma_{in}=0.1$, and all $\Delta t$ are sampled from a Gaussian with average $\langle\Delta t\rangle\in[0,5]$ $\mu$s and standard deviation $\sigma_{in}$. For each network size and number of inhibitory neurons, we choose $a_{0}$ such that the separation $d_{max}$ between inhibitory neurons results in $V/d_{max}^{6}=10^{-2}$. For example, for the case of 4 neurons with two inhibitory neurons on either end, one needs $V/3^{6}=10^{-2}$, which amounts to choosing $V\approx 7.3$ MHz. Note that this value of $V$ is of the same order of magnitude as $\Omega=4.2$ MHz, and so the reservoir in this case is well into the non-classical regime.
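A worked check of this spacing rule for the four-neuron example (the numbers follow directly from the text):

```python
# Spacing rule V / d_max^6 = 1e-2 with the inhibitory neurons separated
# by d_max = 3 lattice spacings.
d_max = 3
V = 1e-2 * d_max ** 6
print(V)   # 7.29, i.e. V ~ 7.3 MHz, the same order as Omega = 4.2 MHz
```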

The learned parameters are the entries of the output linear map $W^{\text{out}}$, which in this case is a matrix in $\mathbb{R}^{(3+1)\times 1}$ with the last row representing a bias term. The dimensions are such because only one neuron is measured but three functions have to be fitted.

D.2 Decision making

In classical RNNs, tasks such as decision making and working memory require connectivity between all neurons. Since our connectivity is limited by physical constraints, an open 2D square-lattice structure was chosen to prevent neurons from being isolated from the rest. Moreover, a 2D square lattice is experimentally friendly. In our case, we use an open $2\times 3$ lattice with the two input neurons at the top-left corner of the lattice and the two output neurons at the bottom-right corner. Again, this architecture was chosen so that the input neurons have to interact with the rest of the system before readout. We use $V=2\pi\times 10$ MHz for our simulations, and choose $\Delta t=2\pi/V$ as the time the inputs are turned on, as that is the timescale on which the input atoms entangle with the rest of the system.

The inputs are uniformly sampled from $\{0,\pi/2,\pi,3\pi/2,2\pi\}$ MHz with added Gaussian noise $\sigma_{in}=0.1$. The time $\Delta t$ that the stimuli are turned on is fixed to a mean of $\langle\Delta t\rangle=0.1$ $\mu$s with added Gaussian noise $\sigma_{in}=0.1$. We optimize over the linear output map $W^{\text{out}}$, a matrix in $\mathbb{R}^{(1+1)\times 2}$ since one function is fitted and two neurons are measured. Additionally, we train the output time $t_{out}$, after the stimuli are turned off and before the network is probed, so that the output satisfies (15). To do the optimization, we make use of the Nelder-Mead algorithm [70], as in the sketch below.
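A minimal sketch of this optimization step. The loss below is a cheap toy surrogate: in the actual task, evaluating (4) requires simulating the reservoir for each candidate $(W^{\text{out}},t_{out})$, which we do not reproduce here.

```python
import numpy as np
from scipy.optimize import minimize

# Toy surrogate for the mean-square loss (4) as a function of the flattened
# readout W^out (2 measured neurons + bias, 1 fitted function) and t_out.
def loss(params):
    W_out = params[:3]
    t_out = params[3]
    return float(np.sum((W_out - 1.0) ** 2) + (t_out - 0.5) ** 2)

res = minimize(loss, x0=np.zeros(4), method="Nelder-Mead")
print(res.x)   # converges near [1, 1, 1, 0.5] for this surrogate
```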

In order to compute the psychometric response plotted in Fig. 6B, we measure the expectation values of the two output neurons and produce the vector $\bm{r}(\Delta^{in}_{1},\Delta^{in}_{2})=(\langle\sigma_{out1}^{y}\rangle,\langle\sigma_{out2}^{y}\rangle,1)$, which depends on the inputs $\Delta^{in}_{1,2}$ as well as the temporal parameters $(\Delta t,t_{out})$. We then compute $y^{\text{out}}(\Delta^{in}_{1,2})=W^{\text{out}}\cdot\bm{r}(\Delta^{in}_{1,2})$, and $(W^{\text{out}},t_{out})$ are optimized such that $y^{\text{out}}(\Delta^{in}_{1,2})\approx y^{targ}$ in (15). The optimization is done by generating about 40,000 different values of $\Delta_{1,2}^{in}$ with different levels of contrast $|\Delta^{in}_{1}-\Delta^{in}_{2}|$ ranging from 0 to 1 MHz. Once the optimization is done, we look at the loss with respect to $\Delta_{2}^{in}$, obtained as the error in classifying $\Delta_{2}^{in}$ as greater than $\Delta_{1}^{in}$ when indeed $\Delta_{2}^{in}>\Delta_{1}^{in}$. The error is quantified using the mean-square loss in (4).

D.3 Working memory

This task’s setup is identical to the decision-making task except that the two inputs are separated by a delay time tdelayt_{delay}. The values of the interaction strength VV used for Fig. 7 are V=2π×10V=2\pi\times 10 MHz and V=2π×0.1V=2\pi\times 0.1 MHz corresponding to V/Ω>1V/\Omega>1 and V/Ω<1V/\Omega<1 respectively. The former of which sets us in the Rydberg blockaded regime while the latter is not. In this task, the times Δt\Delta t and tdelayt_{delay} are fixed up to an added Gaussian with noise σin=0.1\sigma_{in}=0.1. In this task, we optimize over the linear output map WoutW^{\text{out}}, a matrix in 1+1,2\mathbb{R}^{1+1,2} since one function is fitted and two neurons are measured.

D.4 Long-term memory

Although quantum scars are known to exist in other geometries and dimensions [71], for this task we use a 1D chain of Rydberg atoms, since in this case quantum many-body scars have been experimentally observed [35, 46]. Furthermore, our chain has periodic boundary conditions to avoid edge effects. Since we know that scars are robust to decoherence, we set $\gamma_{420}=0$ so that we can evolve our states for longer times. The number of cycles $n$ in Fig. 9 corresponds to $n$ cycles of duration $2\tau$, each consisting of two evolutions under the PXP Hamiltonian for a time $\tau=1.51\pi\Omega^{-1}$. In this case, we take $V\gg\Omega$ and set $\Omega=1$. The noisy field in (16) is sampled according to $\epsilon_{k}\sim N(\mu=0.1,\sigma=0.1)$. The input $m$ is sampled as a fair coin flip. Lastly, after each number of cycles $n$, the only trained parameter is $W_{n}^{\text{out}}\in\mathbb{R}^{(1+1)\times 1}$, since only one atom is probed to produce an estimate of the input $m$.