Chebyshev expansion of spectral functions using restricted Boltzmann machines

Douglas Hendry Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA Hongwei Chen Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA Phillip Weinberg Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA Adrian E. Feiguin Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA

Abstract

Calculating the spectral function of two dimensional systems is arguably one of the most pressing challenges in modern computational condensed matter physics. While efficient techniques are available in lower dimensions, two dimensional systems present insurmountable hurdles, ranging from the sign problem in quantum Monte Carlo (MC), to the entanglement area law in tensor network based methods. We hereby present a variational approach based on a Chebyshev expansion of the spectral function and a neural network representation for the wave functions. The Chebyshev moments are obtained by recursively applying the Hamiltonian and projecting on the space of variational states using a modified natural gradient descent method. We compare this approach with a modified approximation of the spectral function which uses a Krylov subspace constructed from the “Chebyshev wave-functions”. We present results for the one-dimensional and two-dimensional Heisenberg model on the square lattice, and compare to those obtained by other methods in the literature.

I Introduction

The knowledge of the excitation spectrum of a quantum many-body system provides valuable information that be directly compared to experiments. While much effort is dedicated to the calculation of ground state wave functions, their energies do not have much experimental relevance since they cannot be measured. Experiments such as photoemission or inelastic neutron scattering spectroscopies, for instance, measure energy differences between excited and ground states, a density of states. In addition, the integrated weight also yields the equal time correlations in the ground state, that can also indicate the existence of order or lack thereof, or the presence of quasiparticle excitations that could be used as fingerprints to identify quantum phases. Unfortunately, the numerical evaluation of dynamical correlation functions in strongly correlated systems is a very difficult undertaking, and the development of techniques that can applied beyond one dimension is possibly one of the pressing challenges in computational condensed matter.

Since the advent of high-temperature supeconductivity, progress has been marked by ingenuity to overcome the computational limitations of the time. The first techniques to emerge were based on exact diagonalizationDagotto (1994), which is limited to small clusters, and different variants of quantum Monte Carlo, that suffers from the sign problem and requires uncontrolled analytic continuations Schüttler and Scalapino (1986); Sandvik (1998); Silver et al. (1990); Gubernatis et al. (1991); Syljuåsen (2008); Fuchs et al. (2010); Sandvik (2016); Shao et al. (2017). Soon after the invention of the density matrix renormalization group (DMRG) White (1992, 1993); Schollwöck (2005, 2011); Feiguin (2013a), several approaches to extract dynamical properties were proposed, that can be generically referred-to as dynamical DMRGHallberg (1995); Kühner and White (1999a); Jeckelmann (2002); Dargel et al. (2011, 2012); Nocera and Alvarez (2016), or DDMRG. The time-dependent density matrix renormalization group and recent variations using Chebyshev expansions have been important developments, giving access to accurate spectra for very large one-dimensional systems Daley et al. (2004); White and Feiguin (2004); Feiguin and White (2005); Feiguin (2011, 2013b); Holzner et al. (2011); Wolf et al. (2015); Xie et al. (2018) and quasi-2D cylinders Zaletel et al. (2015); Verresen et al. (2018) with moderate computational resources. An alternative approach consists of proposing variational forms for the excited states, which can be represented with matrix product states (MPS)Haegeman et al. (2013); Vanderstraeten et al. (2015a, b, 2019) or other variational wave functions that can be easily extended to higher dimensions and are free from the sign problemLi and Yang (2010); Dalla Piazza et al. (2014); Ferrari et al. (2018); Hendry and Feiguin (2019). The variational Monte Carlo (VMC) approach relies on a variational ansatz for the ground state inpired by some physical insight (typically of the form of a Gutzwiller projected wave function) and can provide a few hundred discrete poles by diagonalizing a Hamiltonian matrix projected onto a subspace of single magnon excitations.

In this work, we bring together some of these ideas translated in the context of recent advances in quantum machine learning. Attempts in this direction have already demonstrated very promising results: In Ref.Hendry and Feiguin, 2019, the concepts behind correction-vector dynamical DMRG were generalized to arbitrary variational wave functions, and applied to the Heisenberg model using restricted Boltzmann machines (RBM) Carleo and Troyer (2017); Saito (2017); Cai and Liu (2018); Glasser et al. (2018). RBMs are a type of artificial neural network widely used in machine learning that inspired Carleo and TroyerCarleo and Troyer (2017) to propose a variational wave-function for a spin- $\frac{1}{2}$ system of $N$ sites. The visible layer corresponds to the spin configurations $\vec{\sigma}^{z}=(\sigma^{z}_{1},\sigma^{z}_{2},\cdots,\sigma^{z}_{N})$ . Then the coefficients of the wave function is given as $\psi(\vec{\sigma}^{z},\vec{a},\vec{b},W)=e^{\sum_{i=1}^{N}a_{i}\sigma^{z}_{i}}\prod_{j=1}^{M}2\cosh{(\theta_{j})}$ where $\theta_{j}=b_{j}+\sum_{i=1}^{N}W_{ij}\sigma^{z}_{i}$ . The weights $W_{ij}$ and the biases $A$ and $b$ in expression are free parameters that are used to variationally minimize the energy $E=\langle\psi|H\psi\rangle/\langle\psi|\psi\rangle$ . This procedure is carried out using Monte Carlo to sample over all possible spin configurations. The remarkable aspect of this type of wave-functions is that they can encode a great deal of information in a relatively small set of variational parameters, without making any assumptions about the physics of the problem Chen et al. (2018).

The main goal of this paper is to variationally approximate the zero temperature Green’s function for a quantum many-body system:

G_{ij}(z)=\langle\psi|\hat{A}^{\dagger}_{j}\frac{1}{z-\hat{H}}\hat{A}_{i}|\psi\rangle

where $A_{i}$ and $A_{j}$ are some local operators of interest, $E_{0}$ is the ground state energy and $z=\omega+E_{0}+i\eta$ . The spectral function is thus obtained as $A_{ij}(\omega)=-\frac{1}{\pi}G_{ij}(z)$ . The correction vector DMRG method recasts this calculation as a complex system of equations for each frequency $\omega$ . In Ref.Hendry and Feiguin, 2019, the solution to this system of equations was encoded in the form of an RBM. The optimization is carried out by means of a natural gradient descent approach based on quantum geometry concepts that allows one to solve a large system of equations stochastically with RBMs, or any other variational wave-function.

We here explore a different avenue by expanding the spectral function in terms of Chebyshev polynomials. In Ref.Holzner et al., 2011 it was shown that this approach, combined with DMRG, can yield very accurate results with a fraction of the effort compared to other methods. In fact, it comes with several advantages over other methods, such as: (i) it provides uniform resolution over the entire relevant frequency range with a relatively small numbers of Chebyshev moments and (ii) the information to calculate each moment is encoded in an RBM that can be stored for later use or systematically improved if needed.

The paper is organized as follows: In Sec.II we introduce the basic ideas behind the Chebyshev representation of the spectral function; in Sec. III we discuss the implementation using RBMs, and the details about the variational optimization. Results can be improved by taking advantage of error correction measures described in Sec.IV. Sec. V covers an alternative approach using a representation of the spectral function in terms of a Krylov expansion of the basis. Results for both the one-dimensional and two-dimensional Heisenberg model on the square lattice are presented in Sec.VI. We finally close with a summary and discussion of possible strategies to improve the accuracy of the calculation.

II Chebyshev expansion

In this section we briefly describe the expansion of the spectral function $A_{ij}(\omega)$ in terms of Chebyshev moments. We refer the reader to Ref. Holzner et al., 2011 for a more detailed discussion. An important aspect of Chebyshev polynomials is they are defined in the interval $[-1,1]$ . Hence, we rescale the frequencies and the Hamiltonian such that the range of the expansion encompasses the region with the bulk of the spectral weight. If our problem has non-zero spectral weight in the interval $0\leq\omega\leq W_{A}$ , we define $\omega^{\prime}=\omega/a-W^{\prime}$ with $a=W_{*}/2W^{\prime}$ , and $H^{\prime}=(H-E_{0})/a-W^{\prime}$ . In these expressions, $W_{*}>W_{A}$ is an effective bandwidth larger than $W_{A}$ , and $W^{\prime}$ is the rescaled bandwidth, that we chose, for safety reasons, to be slightly smaller than 1 ( $W^{\prime}=1-0.0125$ in the following). In a nutshell, the spectral function can now be written as:

A_{ij}(\omega)=\frac{2W^{\prime}/W_{*}}{\pi\sqrt{1-\omega^{\prime 2}}}\left[g_{0}\mu_{0}+2\sum_{n=1}^{N-1}g_{n}\mu_{n}T_{n}(\omega^{\prime})\right]

(1)

where $T_{n}(x)$ is the $n^{th}$ Chebyshev polynomial of the first kind, the coefficients $g_{n}$ are damping factors that affect the broadening and $\mu_{n}$ are the corresponding Chebyshev moments:

\mu_{n}=\langle\psi|\hat{A}^{\dagger}_{j}T_{n}(\hat{H}^{\prime})\hat{A}_{i}|\psi\rangle.

(2)

Luckily, one can exploit the definition of the Chebyshev polynomials and their recursion relation to recast the moments as:

\mu_{n}=\langle\psi|\hat{A}_{j}|t_{n}\rangle,\,\,\,|t_{n}\rangle=T_{n}(\hat{H}^{\prime})\hat{A}_{i}|\psi\rangle,

(3)

with

	$\displaystyle T_{n+1}(x)$	$\displaystyle=$	$\displaystyle 2xT_{n}(x)-T_{n-1}(x),$
	$\displaystyle T_{0}(x)$	$\displaystyle=$	$\displaystyle 1,T_{1}(x)=x.$		(4)

Hence, given an initial wave-function $|t_{0}\rangle=\hat{A}_{i}|\psi\rangle$ , we iteratively solve for a sequence of “Chebyshev wave-functions” $|t_{1}\rangle$ , $|t_{2}\rangle$ ,… such that $|t_{n}\rangle=T_{n}(\hat{H^{\prime}})|t_{0}\rangle$ . In the following section, we describe how to carry out this task using a variational representation of $|t_{n}\rangle.$

III Variational optimization

Let $|\phi\rangle$ be the target wave function for a variational wave function $|\psi(\alpha)\rangle$ with variational parameters $\{\alpha_{1},\alpha_{2}\cdots\alpha_{M}\}\in\mathbb{C}^{M}$ that need to be optimized to make $|\psi(\alpha)\rangle$ as equal to $|\phi\rangle$ as possible up to an overall constant. This is done by minimizing Fubini-Study metric –essentially the angle– between the two wave functions, which is given by

\gamma(\psi,\phi)=\arccos{\sqrt{\frac{\langle\psi|\phi\rangle\langle\phi|\psi\rangle}{\langle\psi|\psi\rangle\langle\phi|\phi\rangle}}}.

Taking the gradient of ${\gamma}^{2}$ with respect $|\psi\rangle$ gives

|\delta\psi\rangle=\delta\gamma\cdot\,\gamma(\psi,\phi)\cot{(\gamma(\psi,\phi))}\left[\frac{\langle\psi|\psi\rangle}{\langle\psi|\phi\rangle}|\phi\rangle-|\psi\rangle\right],

which is a differential rotation by an angle $\delta\gamma\cdot\gamma$ towards $|\phi\rangle$ (with a change in the overall phase to match $|\psi\rangle$ ). As with rotations in real space, $\langle\psi|\delta\psi\rangle=0$ and $\langle\delta\psi|\delta\psi\rangle={(\delta\gamma\cdot\gamma)}^{2}\langle\psi|\psi\rangle$ .

The procedure to update the variational coefficients $\alpha$ using natural gradient descent is described in detail in Ref. Hendry and Feiguin, 2019 and we hereby summarize it. The updates $\Delta\alpha$ correspond to projecting $|\delta\psi\rangle$ onto the manifold of possible variational wave-functions of the proposed form (in our case, an RBM). To calculate $\Delta\alpha$ one must solve the following linear system of equations:

\sum_{k^{\prime}=1}^{M}S_{kk^{\prime}}\Delta\alpha_{k^{\prime}}=\frac{\langle\partial_{k}\psi|\delta\psi\rangle}{\langle\psi|\psi\rangle}-\frac{\langle\partial_{k}\psi|\psi\rangle}{\langle\psi|\psi\rangle}\frac{\langle\psi|\delta\psi\rangle}{\langle\psi|\psi\rangle},

where $|\partial_{k}\psi\rangle=\frac{\partial}{\partial\alpha_{k}}|\psi\rangle$ and $S$ is given by

S_{kk^{\prime}}=\frac{\langle\partial_{k}\psi|\partial_{k^{\prime}}\psi\rangle}{\langle\psi|\psi\rangle}-\frac{\langle\partial_{k}\psi|\psi\rangle}{\langle\psi|\psi\rangle}\frac{\langle\psi|\partial_{k^{\prime}}\psi\rangle}{\langle\psi|\psi\rangle}.

Refer to caption — Figure 1: Evolution of the spin dynamic structure factor of a Heisenberg chain with the number of Chebyshev moments using RBM wave-functions for $L=32$ at momentum $k=\pi/2$ .

To estimate the the overlap between wavefunctions in the definition of $S$ and in the linear system defined above, we use VMC. The states $|s\rangle$ are scampled according $P(s)={|\psi_{s}|}^{2}/\langle\psi|\psi\rangle$ , where $\psi_{s}\equiv\langle s|\psi\rangle$ . Using these sampled states we define estimators such as the log derivative $\mathcal{O}_{k}(s)=\frac{\partial\psi_{s}}{\partial\alpha_{k}}/\psi_{s}$ and the ratio $R(s)=\phi_{s}/\psi_{s}$ that are used to calculate the following relevant quantities:

$\displaystyle S_{kk^{\prime}}$	$\displaystyle=$	$\displaystyle\langle\mathcal{O}_{k}(s)^{}\mathcal{O}_{k^{\prime}}(s)\rangle-\langle\mathcal{O}_{k}(s)\rangle^{}\langle\mathcal{O}_{k^{\prime}}(s)\rangle,$
$\displaystyle\gamma(\psi,\phi)$	$\displaystyle=$	$\displaystyle\arccos{\sqrt{\frac{{\|\langle R(s)\rangle\|}^{2}}{\langle{\|R(s)\|}^{2}\rangle}}},$	(5)
$\displaystyle\frac{\langle\partial_{k}\psi\|\delta\psi\rangle}{\langle\psi\|\psi\rangle}$	$\displaystyle=$	$\displaystyle\delta\gamma\,\gamma(\psi,\phi)\cot{(\gamma(\psi,\phi))}\times$
		$\displaystyle\left[\frac{\langle\mathcal{O}_{k}(s)^{}R(s)\rangle}{\langle R(s)\rangle}-\langle\mathcal{O}_{k}(s)^{}\rangle\right].$

Notice that when $|\psi\rangle\propto|\phi\rangle$ , the variance of $R(s)$ goes to $0$ . As a consequence, the variance of the estimators above (except $S$ ) also goes to zero. Once the parameters have converged, the normalization constant $\beta$ can be calculated as

\beta=\frac{\langle\psi|\phi\rangle}{\langle\psi|\psi\rangle}=\langle R(s)\rangle.

This optimization algorithm is applied to every step of the Chebyshev recursion: Each wave-function in sequence is approximated by an non-normalized variational wave function $|\tilde{t}_{n}\rangle$ and corresponding normalization constant $\beta_{n}$ . The wave-functions are solved iteratively using the Chebyshev recurrence relation and is projected onto an RBM form such that the corresponding target wave functions are $|t_{1}\rangle=\beta_{0}{H^{\prime}}|\tilde{t}_{0}\rangle$ , and $|t_{n}\rangle=2\beta_{n-1}{H^{\prime}}|\tilde{t}_{n-1}\rangle-\beta_{n-2}|\tilde{t}_{n-2}\rangle$ for $n\geq 2$ , where $\beta_{0}=1$ and $|\tilde{t}_{0}\rangle=|t_{0}\rangle$ . These wave-functions can be easily stored, since the number of variational coefficients is $\mathcal{O}(N^{2})$ . This allows the calculation to proceed in three steps: First, the wave functions $|t_{n}\rangle$ are obtained; second, the Chebyshev moments are calculated by sampling over the wave functions to obtain the overlaps and, finally, the spectral function can be reconstructed.

IV Chebyshev Moment Correction

In the absence of errors, only the overlap of the Chebyshev vectors with $|t_{0}\rangle$ is needed to obtain the Chebyshev moments. Due to limitations in the expressivity of the RBM wave-function and the fact that each successive Chebyshev wave-function is obtained recursively, the variational Chebyshev wave-functions will become a less accurate approximation, amplifying the error for the higher moments. However, we can obtain better estimates for the Chebyshev moments by recalculating the Chebyshev wave-functions in the sub-space spanned by the original ones. Thus the errors are no longer dependent on the individual accuracy of each variational Chebyshev wave-function, but will depend on how well the true Chebyshev wave-functions can be represented in the new basis.

The procedure to recalculate the Chebyshev moments is as follows: The projector for the space spanned by the RBMs is given by $\hat{P}=\sum_{mn}(\sigma^{-1})_{mn}|t_{m}\rangle\langle t_{n}|$ , $\sigma_{nm}=\langle t_{n}|t_{m}\rangle$ and $H_{nm}=\langle t_{n}|\hat{H}|t_{m}\rangle$ . Let us project the Hamiltonian onto this basis as $\hat{H}_{P}=\hat{P}\hat{H}\hat{P}$ . Then, the recalculated Chebyshev wave-functions are given by

|t^{\prime}_{n}\rangle=T_{n}(\hat{H}_{P})|t^{\prime}_{0}\rangle=\sum_{m}c_{mn}|t_{m}\rangle

with $|t^{\prime}_{0}\rangle=|t_{0}\rangle$ and thus $c_{0n}=\delta_{0n}$ . The coeficients $c_{mn}$ can be obtained from the Chebyshev recursion relations which results in the following system of equations

	$\displaystyle\sum_{l}\sigma_{ml}c_{l1}=\sum_{l}H_{ml}c_{l0}$		(6)
	$\displaystyle\sum_{l}\sigma_{ml}c_{l,n+1}=\sum_{l}2H_{ml}c_{ln}-\sigma_{ml}c_{l,n-1}$		(7)

Finally the moments are obtained as

\mu_{n}=\sum_{ml}c_{m0}^{*}\sigma_{ml}c_{ln}.

V Continued fraction

Another avenue we can explore in order to approximate the spectral function from the Chebyshev wave-functions is the vector correction methodKühner and White (1999b); Jeckelmann (2002); Hendry and Feiguin (2019). The vector correction method involves finding the wave function that is the solution to the system of equations $(z-\hat{H})|\chi_{j}(z)\rangle=\hat{A}_{j}|\psi_{gs}\rangle$ . Then, the Green’s function is given by the overlap

G_{ij}(z)=\langle\psi_{gs}|\hat{A}_{i}^{\dagger}|\chi_{j}(z)\rangle

Since Chebyshev wave-functions form a non-orthogonal basis for the Krylov subspace, we can obtain an approximate solution for $|\chi_{j}(z)\rangle$ by utilizing a Krylov expansion. First, the Lanczos wave functions $|v_{0}\rangle,\dots,|v_{N-1}\rangle$ are obtained from the projected Hamiltonian $\hat{H}_{P}$ with $|v_{0}\rangle=|t_{0}\rangle$ , giving the matrix elements of the Hamiltonian in tridiagonal form. Finally, to obtain the spectral function from $H_{P}$ we use a continued-fraction expansion of the spectral function, which is equivalent to solving the system of equations projected onto this subspaceDagotto (1994); Dargel et al. (2012).

VI Results

For demonstration and benchmarking we focus on the spin- $\frac{1}{2}$ Heisenberg model in one and two dimensions on the square lattice:

\hat{H}=J\sum_{\langle ij\rangle}\vec{S}_{i}\cdot\vec{S}_{j},

(8)

where $\vec{S}=(\hat{S}^{x},\hat{S}^{y},\hat{S}^{z})$ are spin operators and the sum runs over pairs of nearest neighbor sites on the lattice. We consider periodic boundary conditions and chose $J$ as our unit of energy. We are interested in the longitudinal spin structure factor, defined as:

S^{z}(\mathbf{k},\omega)=-\frac{1}{\pi N}\mathrm{Im}\sum_{n}e^{i\mathbf{k}\cdot\mathbf{r}_{n}}{\langle\psi|\hat{S}^{z}_{0}\frac{1}{z-\hat{H}}\hat{S}^{z}_{n}|\psi\rangle}.

where we have used translational invariance. In our calculations, unless otherwise stated, we consider $W^{\prime}=1-\epsilon/4$ with $\epsilon=0.05$ ; $W_{*}=10W^{\prime}$ . In addition, we use the so-called “Jackson damping” in Eq.(1):

g_{n}=\frac{(N-n+1)\cos{\frac{\pi n}{N+1}}+\sin{\frac{\pi n}{N+1}}\cot{\frac{\pi}{N+1}}}{N+1},

where $N$ is the number of moments used. We also implement spatial symmetries to improve the accuracy of our results, as described in Appendix A.

In Fig.1 we show the evolution of the longitudinal structure factor of a 1D spin chain of length $L=32$ for a choice of momentum $k=\pi/2$ as a function of the number of Chebyshev moments $N$ . We clearly observe that the spectrum develops oscillations reproducing the well resolved high energy features in the finite system. These oscillations become more pronounced when increasing $n$ over the range of energies corresponding to the spinon continuum. This shows that one can obtain very reasonable results over the entire frequency domain with moderate effort using a small number of wave-functions, or Chebyshev iterations.

In Fig.2 we compare results with $N=100$ to those obtained using the same approach but with DMRG as a solver (i.e., using matrix product states as a variational ansatz). The agreement over the entire range of momenta and frequencies is remarkable. Notice that there is no straight forward way to quantify the error, except by comparing to such accurate calculations using other methods as benchmark (DMRG in this case). We discuss the possible sources of the discrepancies in the discussion, Sec.VII.

Fig.3 shows the same spectra but using the continued fraction approach, compared to dynamical DMRG. We note that while the positions of the peaks are the same, the profiles are different to those in Fig.2 because the spectrum in this case is described by a superposition of Lorentzians. For all practical purposes, both techniques reproduce the same data. This is more dramatically illustrated in Fig.4 where we compare all approaches as a function of momentum and frequency in color density maps. It is fair to say that they are practically indistinguishable.

While results in 1D serve as a benchmark and illustrate the accuracy of the approach, we aim at making this method also applicable in higher dimensions, where one can access richer physics, such as quantum spin liquidsSzabó and Castelnovo (2020). In this case, we focus on a $6\times 6$ square lattice, where our variational results can be contrasted directly to exact diagonalization (see Appendix B). In Figs.5 and 6 we display a comparison for several values of momenta on the two-dimensional Brillouin zone using both, a Chebyshev expansion and a continued fraction approach, respectively. We find that while the first peak is very accurately described, the high energy features with smaller spectral weight are partially lost.

VII Summary and conclusions

We have presented a highly efficient approach to calculate spectral functions of strongly correlated systems using a Chebyshev expansion of the Green’s function and a variational representation of the many-body states in the form of restricted Boltzmann machines. Unlike a previous method introduced in the literature by the authorsHendry and Feiguin (2019), these calculations are numerically inexpensive and yield very accurate results in 1D. We have also described remedial measures to improve the accuracy of the results, such as using spatial symmetries, and correcting for the limited “expressivity” of the neural network. As mentioned earlier, the method provides uniform resolution over the entire relevant frequency range with a relatively small numbers of Chebyshev moments and the information to calculate each moment is encoded in an RBM that can be stored for post-processing.

In order to understand the sources of error we start by recalling that (i) our calculations rely on a particular form of the variational wave-function, meaning that the results will depend on the representation power of a RBM; (ii) the optimization of the wave function depends on our ability to accurately minimize our loss function with respect of the variational parameters. This loss function may have a complex landscape with local minima and, in addition, (iii) we need to consider the errors introduced by the stochastic sampling of all the different quantities. Due to the recursive nature of the calculation, since every wave function depends on the previous ones, any errors are propagated down the sequence. We have shown that these errors can be mitigated by recalculating the moments in a Chebyshev basis that has support outside the space of single RBMs.

These ideas are general, as demonstrated by both prior studies using matrix product states and in this work using RBMs. The main challenge moving forward is to scale these methods to larger systems, which will be primarily determined by our ability to train larger networks, or variational wave functions, with a growing number of free parameters.

It also worth noting that one can take existing moments and approximate higher moments using extrapolation techniques Ganahl et al. (2014); Wolf et al. (2015). This can help push calculations to larger system sizes by reducing the total number of moments required.

We finally point out that the overall technique, including the error correction approach we used to improve the Chebyshev wave functions, is generic and can be implemented with any variational form of wave-function.

Acknowledgements.

AEF and HC acknowledge the National Science Foundation for support under grant No. DMR-1807814. DH is supported by a Northeastern Tier 1 grant.

Appendix A Implementing symmetries

The RBM wave functions are optimized to approximate $|t_{n00}\rangle=T_{n}(\hat{H})S^{z}_{00}|\psi_{gs}\rangle$ for the $n^{th}$ Chebyshev polynomial $T_{n}$ and approximate ground state $|\psi_{gs}\rangle$ . To obtain the Chebyshev wave functions for momentum $\mathbf{k}=(k_{1},k_{2})$ , we make use of the translational symmetry of the problem by assuming the ground state has zero momentum (translations do no result in a change in overall phase). Here we use $\hat{T}_{m_{1}m_{2}}$ to represent the translation operator on the square lattice with a shift of $(m_{1},m_{2})$ , while $T_{n}(\hat{H})$ denotes the Chebyshev polynomials of the Hamiltonian. Using this notation we can write:

|\widetilde{t}_{nk_{1}k_{2}}\rangle=\frac{1}{L}\sum_{m_{1},m_{2}=0}^{L-1}e^{-i\frac{2\pi}{L}(k_{1}m_{1}+k_{2}m_{2})}\hat{T}_{m_{1}m_{2}}\hat{S}^{z}_{00}\hat{T}_{-m_{1}-m_{2}}T_{n}(\hat{H})\hat{S}^{z}_{00}|\psi_{gs}\rangle=L\hat{P}^{(T)}_{k_{1}k_{2}}|t_{n00}\rangle

In this equation one can see that we are acting with the projection operator onto the momentum sector with momentum quantum numbers given by integers $k_{1},k_{2}$ defined as:

P^{(T)}_{k_{1}k_{2}}=\frac{1}{L^{2}}\sum_{m_{1},m_{2}=0}^{L-1}e^{-i\frac{2\pi}{L}(k_{1}m_{1}+k_{2}m_{2})}\hat{T}_{m_{1}m_{2}}.

Thus, each momentum wave-function can represented in terms of linear combinations of translations of a single RBM optimized for the local $S^{z}_{00}$ operator. Additionally, since $\hat{P}^{(T)}_{k_{1}k_{2}}$ is a projector, the resulting state will have the correct momentum and all the error from different momentum sectors will be projected out. In the same vein, rotational symmetry can be enforced by applying the following projector of rotations and translations.

\hat{P}^{(R)}=\frac{1}{4}(1+\hat{T}_{10}\hat{R}+\hat{T}_{11}\hat{R}^{2}+\hat{T}_{01}\hat{R}^{3}).

This projector is an average of each rotation followed by the translation which takes the $S^{z}$ operator back to the origin. In the exact calculation for the Chebyshev wave-functions, the application of $\hat{P}^{(R)}$ should leave the $|t_{n00}\rangle$ unchanged. However, the RBM representation will not necessarily have rotational symmetry, therefore, the projector will correct any errors related to an absence of rotational symmetry. After including rotation projector we find:

|\widetilde{t}_{nk_{1}k_{2}}\rangle=L\hat{P}^{(T)}_{k_{1}k_{2}}\hat{P}^{(R)}|t_{n00}\rangle

All that is needed to calculate the Chebyshev moments is the overlap:

\mu_{nk_{1}k_{2}}=\frac{\langle\widetilde{t}_{0k_{1}k_{2}}|\widetilde{t}_{nk_{1}k_{2}}\rangle}{\langle\psi_{gs}|\psi_{gs}\rangle}

However, due the recursive process of obtaining the Chebyshev wave-functions, the error will propagate down the sequence of wave-functions. As mentioned in the main text, we can correct some of this error by constructing new Chebyshev wave-functions as linear combinations of the original RBM wave-functions. This error correction requires us to calculate the following quantities:

G^{(k_{1}k_{2})}_{nn^{\prime}}=\frac{\langle\widetilde{t}_{nk_{1}k_{2}}|\widetilde{t}_{n^{\prime}k_{1}k_{2}}\rangle}{\langle\psi_{gs}|\psi_{gs}\rangle}

H^{(k_{1}k_{2})}_{nn^{\prime}}=\frac{\langle\widetilde{t}_{nk_{1}k_{2}}|{H}|\widetilde{t}_{n^{\prime}k_{1}k_{2}}\rangle}{\langle\psi_{gs}|\psi_{gs}\rangle}

These matrices are calculated by sampling over spin configurations states $|s\rangle$ probability $P(s)=\frac{\langle\psi_{gs}|s\rangle\langle s|\psi_{gs}\rangle}{\langle\psi_{gs}|\psi_{gs}\rangle}$ . Then, for each translation $(m_{1},m_{2})$ , and rotation $m_{r}$ of $s$ we obtain:

B_{nm_{1}m_{2}m_{r}}(s)=\frac{\langle s|\hat{T}_{m_{1}m_{2}}\hat{R}^{m_{r}}|t_{n00}\rangle}{\langle s|\psi_{gs}\rangle}

From these values, the ratio $\widetilde{B}_{nk_{1}k_{2}}(s)=\langle s|\widetilde{t}_{nk_{1}k_{2}}\rangle/\langle s|\psi_{gs}\rangle$ can be obtained by Fourier transforming $B$ .

\widetilde{B}_{nk_{1}k_{2}}(s)=\frac{1}{L}\sum_{m_{1},m_{2}=0}^{L-1}e^{-i\frac{2\pi}{L}(k_{1}m_{1}+k_{2}m_{2})}\frac{1}{4}\left[B_{n,m_{1},m_{2},0}(s)+\right.\\ \left.B_{n,m_{1}+1,m_{2},1}(s)+B_{n,m_{1}+1,m_{2}+1,2}(s)+B_{n,m_{1},m_{2}+1,3}(s)\right]

Finally, the average gives $G^{(k_{1}k_{2})}_{nn^{\prime}}=\langle\widetilde{B}_{nk_{1}k_{2}}(s)^{*}\widetilde{B}_{n^{\prime}k_{1}k_{2}}(s)\rangle$ . A similar procedure is followed to obtain $H^{(k_{1}k_{2})}_{nn^{\prime}}$ .

Appendix B Exact Calculation for 2D Heisenberg Model

In order to calculate the spectral function for $6\times 6$ lattice with periodic boundary conditions we implement the translational and point-group symmetries to block diagonalize the Hamiltonian using the QuSpin exact diagonalization library Weinberg and Bukov (2017, 2019). Any local operator $A_{i}$ explicitly breaks translational invariance, so instead we work with linear combinations of the local operators:

\hat{A}_{k_{1},k_{2}}=\frac{1}{L}\sum_{n_{1},n_{2}=0}^{L-1}e^{-i\frac{2\pi}{L}\left(k_{1}n_{1}+k_{2}n_{2}\right)}\hat{A}_{n_{1},n_{2}}.

(9)

The resulting operators have a well defined quantum number for the momentum/point-group symmetries, therefore, when applying this operator to the ground state, the resulting state lives in the sector with the momentum given by the operator:

|A(k_{1},k_{2})\rangle=\hat{A}_{k_{1},k_{2}}|\psi_{gs}\rangle

(10)

With this state calculated we can calculate the Chebyshev moments as well as solve for the exact spectral function by using an iterative linear solver on the following linear equation:

(z-\hat{H})|A(k_{1},k_{2})\rangle=|B(k_{1},k_{2},z)\rangle

(11)

which can then be used to calculate the spectral function defined in momentum space:

G_{k_{1},k_{2}}(z)=\frac{1}{\pi}\langle\psi_{gs}|\hat{A}^{\dagger}_{k_{1},k_{2}}\frac{1}{z-\hat{H}}\hat{A}_{k_{1},k_{2}}|\psi_{gs}\rangle\\ =\frac{1}{\pi}\langle A(k_{1},k_{2})|B(k_{1},k_{2},z)\rangle

(12)

References

Dagotto (1994) E. Dagotto, Rev. Mod. Phys. 66, 763 (1994).
Schüttler and Scalapino (1986) H. B. Schüttler and D. J. Scalapino, Phys. Rev. B 34, 4744 (1986).
Sandvik (1998) A. W. Sandvik, Phys. Rev. B 57, 10287 (1998).
Silver et al. (1990) R. N. Silver, D. S. Sivia, and J. E. Gubernatis, Phys. Rev. B 41, 2380 (1990).
Gubernatis et al. (1991) J. E. Gubernatis, M. Jarrell, R. N. Silver, and D. S. Sivia, Phys. Rev. B 44, 6011 (1991).
Syljuåsen (2008) O. F. Syljuåsen, Phys. Rev. B 78, 174429 (2008).
Fuchs et al. (2010) S. Fuchs, T. Pruschke, and M. Jarrell, Phys. Rev. E 81, 056701 (2010).
Sandvik (2016) A. W. Sandvik, Phys. Rev. E 94, 063308 (2016).
Shao et al. (2017) H. Shao, Y. Q. Qin, S. Capponi, S. Chesi, Z. Y. Meng, and A. W. Sandvik, Phys. Rev. X 7, 041072 (2017).
White (1992) S. R. White, Phys. Rev. Lett. 69, 2863 (1992).
White (1993) S. R. White, Phys. Rev. B 48, 10345 (1993).
Schollwöck (2005) U. Schollwöck, Rev. Mod. Phys. 77, 259 (2005).
Schollwöck (2011) U. Schollwöck, Annals of Physics 326, 96 (2011), january 2011 Special Issue.
Feiguin (2013a) A. E. Feiguin, in Strongly correlated systems: Numerical methods, edited by A. Avella and F. Mancini (Springer, Heidelberg, Berlin, 2013).
Hallberg (1995) K. Hallberg, Phys. Rev. B 52, R9827 (1995).
Kühner and White (1999a) T. D. Kühner and S. R. White, Phys. Rev. B 60, 335 (1999a).
Jeckelmann (2002) E. Jeckelmann, Phys. Rev. B 66, 045114 (2002).
Dargel et al. (2011) P. E. Dargel, A. Honecker, R. Peters, R. M. Noack, and T. Pruschke, Phys. Rev. B 83, 161104 (2011).
Dargel et al. (2012) P. E. Dargel, A. Wöllert, A. Honecker, I. P. McCulloch, U. Schollwöck, and T. Pruschke, Phys. Rev. B 85, 205119 (2012).
Nocera and Alvarez (2016) A. Nocera and G. Alvarez, Phys. Rev. E 94, 053308 (2016).
Daley et al. (2004) A. J. Daley, C. Kollath, U. Schollwöck, , and G. Vidal, J. Stat. Mech.: Theor. Exp. , P04005 (2004).
White and Feiguin (2004) S. White and A. Feiguin, Phys. Rev. Lett. 93, 076401 (2004).
Feiguin and White (2005) A. Feiguin and S. White, Phys. Rev. B 72, 20404 (2005).
Feiguin (2011) A. E. Feiguin, in XV Training Course in the Physics of Strongly Correlated Systems, Vol. 1419 (AIP Proceedings, 2011) p. 5.
Feiguin (2013b) A. E. Feiguin, in Strongly correlated systems: Numerical methods, edited by A. Avella and F. Mancini (Springer, Heidelberg, Berlin, 2013).
Holzner et al. (2011) A. Holzner, A. Weichselbaum, I. P. McCulloch, U. Schollwöck, and J. von Delft, Phys. Rev. B 83, 195115 (2011).
Wolf et al. (2015) F. A. Wolf, J. A. Justiniano, I. P. McCulloch, and U. Schollwöck, Phys. Rev. B 91, 115144 (2015).
Xie et al. (2018) H. D. Xie, R. Z. Huang, X. J. Han, X. Yan, H. H. Zhao, Z. Y. Xie, H. J. Liao, and T. Xiang, Phys. Rev. B 97, 075111 (2018).
Zaletel et al. (2015) M. P. Zaletel, R. S. K. Mong, C. Karrasch, J. E. Moore, and F. Pollmann, Phys. Rev. B 91, 165112 (2015).
Verresen et al. (2018) R. Verresen, F. Pollmann, and R. Moessner, Phys. Rev. B 98, 155102 (2018).
Haegeman et al. (2013) J. Haegeman, T. J. Osborne, and F. Verstraete, Phys. Rev. B 88, 075133 (2013).
Vanderstraeten et al. (2015a) L. Vanderstraeten, M. Mariën, F. Verstraete, and J. Haegeman, Phys. Rev. B 92, 201111 (2015a).
Vanderstraeten et al. (2015b) L. Vanderstraeten, F. Verstraete, and J. Haegeman, Phys. Rev. B 92, 125136 (2015b).
Vanderstraeten et al. (2019) L. Vanderstraeten, J. Haegeman, and F. Verstraete, SciPost Phys. Lect. Notes , 7 (2019).
Li and Yang (2010) T. Li and F. Yang, Phys. Rev. B 81, 214509 (2010).
Dalla Piazza et al. (2014) B. Dalla Piazza, M. Mourigal, N. B. Christensen, G. J. Nilsen, P. Tregenna-Piggott, T. G. Perring, M. Enderle, D. F. McMorrow, D. A. Ivanov, and H. M. Rønnow, Nature Physics 11, 62 EP (2014), article.
Ferrari et al. (2018) F. Ferrari, A. Parola, S. Sorella, and F. Becca, Phys. Rev. B 97, 235103 (2018).
Hendry and Feiguin (2019) D. Hendry and A. E. Feiguin, Phys. Rev. B 100, 245123 (2019).
Carleo and Troyer (2017) G. Carleo and M. Troyer, Science 355, 602 (2017), http://science.sciencemag.org/content/355/6325/602.full.pdf .
Saito (2017) H. Saito, Journal of the Physical Society of Japan 86, 093001 (2017), https://doi.org/10.7566/JPSJ.86.093001 .
Cai and Liu (2018) Z. Cai and J. Liu, Phys. Rev. B 97, 035116 (2018).
Glasser et al. (2018) I. Glasser, N. Pancotti, M. August, I. D. Rodriguez, and J. I. Cirac, Phys. Rev. X 8, 011006 (2018).
Chen et al. (2018) J. Chen, S. Cheng, H. Xie, L. Wang, and T. Xiang, Phys. Rev. B 97, 085104 (2018).
Kühner and White (1999b) T. D. Kühner and S. R. White, Phys. Rev. B 60, 335 (1999b).
Szabó and Castelnovo (2020) A. Szabó and C. Castelnovo, Phys. Rev. Research 2, 033075 (2020).
Ganahl et al. (2014) M. Ganahl, P. Thunström, F. Verstraete, K. Held, and H. G. Evertz, Phys. Rev. B 90, 045144 (2014).
Weinberg and Bukov (2017) P. Weinberg and M. Bukov, SciPost Phys. 2, 003 (2017).
Weinberg and Bukov (2019) P. Weinberg and M. Bukov, SciPost Phys. 7, 20 (2019).