Phase Diagram Reconstruction of the Bose-Hubbard Model with a Restricted Boltzmann Machine Wavefunction

Vladimir Vargas-Calderón [email protected] Herbert Vinck-Posada Grupo de Superconductividad y Nanotecnología, Departamento de Física, Universidad Nacional de Colombia, Bogotá, Colombia Fabio A. González Computing Systems and Industrial Eng. Department, Universidad Nacional de Colombia, Bogotá, Colombia

Abstract

Recently, the use of neural quantum states for describing the ground state of many- and few-body problems has been gaining popularity because of their high expressivity and ability to handle intractably large Hilbert spaces. In particular, methods based on variational Monte Carlo have proven to be successful in describing the physics of bosonic systems such as the Bose-Hubbard (BH) model. However, this technique has not been systematically tested on the parameter space of the BH model, particularly at the boundary between the Mott insulator and superfluid phases. In this work, we evaluate the capabilities of variational Monte Carlo with a trial wavefunction given by a Restricted Boltzmann Machine to reproduce the quantum ground state of the BH model on several points of its parameter space. To benchmark the technique, we compare its results to the ground state found through exact diagonalization for small one-dimensional chains. In general, we find that the learned ground state correctly estimates many observables, reproducing to a high degree the phase diagram for the first Mott lobe and part of the second one. However, we find that the technique is challenged whenever the system transitions between excitation manifolds, as the ground state is not learned correctly at these boundaries. We improve the quality of the results produced by the technique by proposing a method to discard noisy probabilities learned in the ground state.

I Introduction

The dimension of a Hilbert space that describes the possible states of a many-body system scales exponentially with the number of one-body states and with the number of particles. In most cases, this feature comes as an important practical difficulty for physicists to study many-body problems, because approximate techniques have to be implemented in order to perform simulations (e.g., dynamical mean-field theory (Georges et al., 1996), density matrix renormalization group (DMRG) (Schollwöck, 2005)). One of the most studied quantum objects is the ground state of a many-body system for a number of reasons: particles tend to occupy the lowest energy states first, as dictated by the aufbau principle; also excited states inherit the ground state structure; and most prominently, it is an object that encodes changes in observables that exhibit quantum phase transitions. Recently, a technique to calculate the ground state of a many-body system has been proposed, based on defining a trial wavefunction with a neural network, which is optimized to minimize the ground state energy through variational Monte Carlo(VMC) (Carleo and Troyer, 2017). This technique has proven to be successful in approximating with high fidelity the ground state of several condensed matter many- and few-body systems. Some examples are atomic and molecular systems (Han et al., 2019; Choo et al., 2019a; Pfau et al., 2019; Hermann et al., 2019; Kessler et al., 2019; Ruggeri et al., 2018), the transverse-field Ising model (Carleo et al., 2018; Sharir et al., 2020; Hibat-Allah et al., 2020) with quenching (Czischek et al., 2018), the Heisenberg model (Carleo et al., 2018) and its anti-ferromagnetic version (Sharir et al., 2020; Nomura et al., 2017), the quantum harmonic oscillator in electric field (Teng, 2018), the Hubbard model (Nomura et al., 2017), the $J_{1}$ - $J_{2}$ Heisenberg model (Hibat-Allah et al., 2020) and its antiferromagnetic version (Szabó and Castelnovo, 2020; Choo et al., 2019b; Liang et al., 2018), as well as models for interacting bosons in one and three-dimensions (Saito, 2018).

The success in approximating the ground state comes from two sources. The first one is technical: neural networks have expressivity and are able to approximate arbitrary functions (Gao and Duan, 2017), and their optimization methods in machine learning have been thoroughly studied (Sun et al., 2019). The second one is physical: even though the size of a Hilbert space exponentially grows with respect to one-body states and number of particles, only a small set of those states are needed to describe the ground state (Carleo and Troyer, 2017). Here the definition of small can vary, as it will be seen in this work.

Although reported results indicate that the quantum ground state can be represented through neural network wavefunctions, most of these studies focus on the convergence of the energy. Only a few of them use the technique to characterize quantum correlations near a quantum phase transition to study the phase space of a Hamiltonian comprehensively. Indeed, the convergence of the energy comes quicker than the convergence of the state, as any first-order error in the variational state leads to only a second-order error in its corresponding energy (Ballentine, 1998). Therefore, focusing on energy convergence might be misleading when asserting the power of VMC with neural network trial wavefunctions. In contrast to previous works, we intensively and systematically test the capabilities of VMC to reconstruct, through a Restricted Boltzmann Machine (RBM) wavefunction, the quantum ground state of a one-dimensional BH system throughout the superfluid (SF) and Mott insulator (MI) phases. We chose the BH model because of the known numerical difficulties near the MI-SF boundary. Consequently, we contrast exact diagonalization solutions with the ones provided by the VMC-RBM method in systems of 5 and 8 sites for a fine mesh in the BH parameter space at the first and part of the second Mott lobes. It is only due to the sweeping of the BH parameters that we are able to see that points near the MI-SF boundary are challenging for the VMC-RBM method. However, we also find that some of the differences between the learned ground state and the ground state found through exact diagonalization originate from badly learned probability amplitudes of Fock states that should have small contribution to the ground state. In this work, we also propose a state cleaning technique that removes those badly learned probabilities.

There have also been previous efforts to learn the ground state of the BH Hamiltonian through different neural network wavefunctions. However, as mentioned before, these studies do not extensively test the VMC framework throughout the BH parameter space. For instance, a permutation symmetric RBM implemented in NetKet Carleo et al. (2019) has been used to study the energy and particle density convergence at two points in the SF and MI phases, finding a difficulty in the convergence to the numerically exact particle density in the SF phase McBrian et al. (2019). However, a qualitatively good location of the boundary between the SF and MI phases for the region enclosing the first two Mott lobes was achieved using the particle density as a discriminator McBrian et al. (2019). RBMs have also been used to show how for a fixed number of bosons in the system, the learned ground state is able to replicate the numerically exact order parameter of the quantum phase transition Li (2019). However, fixing the number of bosons greatly reduces the size of the Hilbert space, making it easier for the VMC technique to learn the ground state since a more complete sampling of the Hilbert’s basis states is possible. Also, the BH model has been considered as a toy model for trying full-forward neural networks (FFNNs) and convolutional neural networks (CNNs) wavefunctions. For instance, the ground state of one and two-dimensional finite lattices with parabolic confinement potential for a fixed number of bosons was approximated through an FFNN ansatz (Saito, 2017). In the case of no confinement and periodic boundary conditions (which introduce displacement symmetry), both FFNNs and RBMs were used in large 1D lattices of up to 40 sites (Choo et al., 2018). CNNs have also been introduced and compared with FFNN wavefunctions to approximate this ground state (Saito and Kato, 2018). A CNN is also proposed as a means to introduce the ground state at finite temperature in a 1D lattice for larger on-site boson interaction than hopping interaction (Irikura and Saito, 2020). Despite the important results derived from these works, they do not focus on the ability of the neural network ansatz to reproduce the ground state near the quantum phase transition boundary. Nevertheless, unsupervised learning has been previously used to classify states as belonging to the Mott insulator or SF phases in the BH model with previously obtained states or physical quantities, for several values of the order parameters Liu and van Nieuwenburg (2018); Huembeli et al. (2018); Broecker et al. (2017); Pérez Díaz (2019).

We believe that the ability of VMC technique to reproduce the ground state near the quantum phase transition has to be extensively tested. Therefore, we pay close attention to the description of the variational ground state near the SF-MI phase for a small number of sites to compare it with the numerically exact ground state. This paper is organized as follows. Section II exposes the main features of the BH Hamiltonian and describes the SF-MI transition in the system. In this section, an overview of the VMC technique is given, discussing the difficulties that must be faced and overcome in order to learn the ground state of the BH model. In section III, the main results are given, including energy convergence, the overlap of the learned and exact ground states, the reproduction of the phase diagram via two different order parameters, as well as tomographies indicating relevant ¹¹1By relevant, we mean that a Fock state $\ket{{\bf\it n}}$ has a large probability amplitude with respect to other Fock states. Fock states for the ground state. Finally, in section IV conclusions of this work are given.

II Model and Variational Monte Carlo

II.1 Bose-Hubbard Hamiltonian

The BH Hamiltonian describes the interactions between bosons that can occupy sites in a $d$ -dimensional lattice. The model is characterized by a hopping energy $t$ , an on-site interaction $U$ and a chemical potential $\mu$ , so that the grand-canonical Hamiltonian reads ( $\hbar=1$ ) (Fisher et al., 1989)

\displaystyle\hat{H}=-t\sum_{\langle ij\rangle}(\hat{a}_{i}^{\dagger}\hat{a}_{j}+\text{H.c.})+\frac{U}{2}\sum_{i}\hat{n}_{i}(\hat{n}_{i}-1)-\mu\sum_{i}\hat{n}_{i},

(1)

where $\hat{a}_{i}$ is the annihilation operator at site $i$ , and $\hat{n}_{i}=\hat{a}^{\dagger}_{i}\hat{a}_{i}$ is the number operator. The notation $\langle ij\rangle$ indicates that the sum runs over pairs of neighbor sites in the lattice. The BH model is able to reproduce experimental results in Josephson-junction networks (van Oudenaarden and Mooij, 1996; Baltin and Wagenblast, 1997; Bruder et al., 2005) and in lattices of ultra-cold atoms (Greiner et al., 2003; Jaksch et al., 1998; Bloch et al., 2008; Stöferle et al., 2004; Spielman et al., 2007; Gemelke et al., 2009). The latter offers precise control of the lattice parameters (Grynberg et al., 2000; Kollath et al., 2004; Lugan et al., 2007). Theoretically, a lot of attention has been devoted both to understand the quantum phases of the system (the ones that arise from eq. 1 and from the disordered or extended BH model with longer range interactions) (Freericks and Monien, 1996; Ejima et al., 2012; Cazalilla et al., 2011; Teichmann et al., 2009; Batrouni et al., 1995; Krauth et al., 1991; Krauth and Trivedi, 1991; Niyaz et al., 1991). Another central issue is to calculate the quantum phase transition boundaries (Kühner and Monien, 1998; Ejima et al., 2011; Kühner et al., 2000; Batrouni and Scalettar, 1992; Kashurnikov et al., 1996; Batrouni et al., 1990; Elstner and Monien, 1999; Koller and Dupuis, 2006), whose precision has improved over the years with better calculation techniques and computing power, revealing features such as the re-entrance phenomenon (Pino et al., 2013; Elstner and Monien, 1999), where for particular values of the chemical potential, the system switches between the MI phase to the SF phase, and back to the MI phase before definitely entering the SF phase after an increase of $t/U$ .

For simplicity, we will restrict our analyses to the $d=1$ case, where only two quantum phases are possible. When the on-site interaction energy is much larger than the hopping energy, the latter becomes negligible, and the Hamiltonian is written as the sum of independent Hamiltonians for each site $\frac{U}{2}\hat{n}_{i}(\hat{n}_{i}-1)-\mu\hat{n}_{i}$ . These one-particle Hamiltonians can be immediately diagonalized by the number basis. The corresponding eigenenergies are $\frac{U}{2}n_{i}(n_{i}-1)-\mu n_{i}$ , which reach minimum values for fixed $\mu$ and $U$ at $n_{i}=\max\{0,\lceil\mu/U\rceil\}$ (note that all sites are equivalent). Moreover, in this regime, the expected variance of the local number operator is 0, i.e. $\langle\hat{n}_{i}^{2}\rangle-\langle\hat{n}_{i}\rangle^{2}=0$ . This regime characterizes the MI phase. On the other hand, when the hopping energy is much larger than the on-site interaction energy, the latter becomes negligible. Thus, the Hamiltonian can also be written as the sum of independent Hamiltonians, but in momentum space, where $\tilde{a}_{k}=N^{-1/2}\sum_{j=1}^{N}\hat{a}_{j}e^{-ix_{j}p_{k}/\hbar}$ is the boson annihilation operator in the momentum representation. Here, $x_{j}=c\times j$ , where $c$ is the lattice constant, and $p_{k}=2\pi k\hbar/(N\times c)$ . Each independent Hamiltonian in momentum space ( $\sum_{k}(-2t\cos(2\pi k/N)-\mu)\tilde{a}^{\dagger}_{k}\tilde{a}_{k}$ ) has eigenenergies $-2t\cos(2\pi k/N)-\mu$ , which reach their minimum when all bosons condense with 0 momenta. Note that the energies are independent of the states’ occupation, meaning that the ground state is degenerate for any number of particles. This regime is known as the SF phase, characterized by a delocalized wavefunction (formally described by algebraic decaying spatial correlations (Ejima et al., 2012; Pino et al., 2013), which is why the SF phase in 1D is not a true Bose-Einstein condensate). Thus, the ground state has a non-zero expected variance of the local number operator. In fact, the probability distribution for the local occupation is Poissonian, meaning that $\expectationvalue*{n_{i}^{2}}-\expectationvalue*{n_{i}}^{2}=\expectationvalue*{n_{i}}$ (Greiner et al., 2003). For a fixed chemical potential and an infinite number of sites, there exists a continuous phase transition from the MI phase to the SF phase as $U$ decreases with respect to $t$ (with the exception of a range of $\mu$ values in the first Mott lobe, where re-entrance exists).

II.2 Variational Monte Carlo with Restricted Boltzmann Machine Wave Function

In this section, a review of the method introduced by Carleo and Troyer (2017) is given, where a VMC setup is used to find the ground state, formally written as a trial wavefunction given by an RBM. In the Fock space basis, the ground state of the BH model can be written as Saito (2017)

\displaystyle\ket{\Psi}=\sum_{n_{1}=0,n_{2}=0,\ldots}^{\infty}\Psi(n_{1},n_{2},\ldots)\ket{n_{1},n_{2},\ldots},

(2)

where $n_{i}$ is the occupation number of the $i$ -th site, and $|\Psi(n_{1},n_{2},\ldots)|^{2}$ are the probability amplitudes corresponding to the Fock states $\ket{n_{1},n_{2},\ldots}$ . In order to map the wave function to a computer, both the number of sites and the number of possible particles in each site have to be truncated. We will refer to the number of sites as $N$ and to the maximum number of particles in each site as $M-1$ . The coefficients $\Psi(n_{1},n_{2},\ldots,n_{N})$ are approximated by an RBM. RBMs are generative neural networks, formally described by a bipartite undirected graph such as the one shown in fig. 1, where there is a layer of visible neurons denoted by ${\bf\it v}$ that are used to input real data, and a layer of hidden neurons denoted by ${\bf\it h}$ that are used as latent variables of the model (Smolensky, 1986). In particular, the wavefunction coefficients take the form $\Psi(n_{1},n_{2},\ldots,n_{N})\approx\psi_{{\bf\it\theta}}(n_{1},\ldots,n_{N})=\sum_{{\bf\it h}}e^{-E_{\text{RBM}}({\bf\it v}({\bf\it n}),{\bf\it h})}$ , where $E_{\text{RBM}}({\bf\it v}({\bf\it n}),{\bf\it h})$ is the energy of the RBM (Smolensky, 1986), and ${\bf\it\theta}$ are the variational parameters of the RBM. As a short-hand notation, an occupation configuration is denoted as ${\bf\it n}$ , and it is inputted to the visible layer of the RBM. However, the configuration first needs to be one-hot encoded as follows: each occupation $n_{i}$ is encoded into an $M$ -component vector whose $j$ -th component is $\delta_{j,n_{i}}$ , $j=0,1,\ldots,M-1$ ; then, the vectors for every occupation are concatenated into ${\bf\it v}({\bf\it n})$ . Moreover, if the $N_{H}$ hidden neurons are restricted to binary values $1$ or $-1$ , then, the approximated wavefunction coefficients can be written as Carleo and Troyer (2017)

\displaystyle\psi_{{\bf\it\theta}}({\bf\it n})=e^{\sum_{j}a_{j}v_{j}({\bf\it n})}\prod_{\ell=1}^{N_{H}}2\cosh\left(b_{\ell}+\sum_{j}W_{\ell,j}v_{j}({\bf\it n})\right),

(3)

where $a_{j},b_{\ell}$ and $W$ are the complex-valued visible bias, hidden bias and connection matrix of the RBM, respectively.

Refer to caption — Figure 1: (Color online) Illustration of the RBM, where each site occupation is one-hot encoded into $M$ visible neurons, depicted with different colors for different sites in the top layer. There are $N_{H}$ hidden neurons in the bottom layer connected with the visible neurons through weights $W_{\ell,j}$ that connect the $\ell$ -th hidden neuron with the $j$ -th visibile neuron.

Thus, the approximation of the wavefunction coefficients is done through the adjustment of the parameters ${\bf\it\theta}:\{a_{j},b_{\ell},W_{\ell,j}\}$ that minimize the energy $\bra{\psi_{{\bf\it\theta}}}\hat{H}\ket{\psi_{{\bf\it\theta}}}$ . At each step of the minimization, a set $\mathcal{M}$ of configurations ${\bf\it n}$ is sampled from $|\psi_{{\bf\it\theta}}({\bf\it n})|^{2}$ using the Metropolis-Hastings algorithm, so that the energy can be efficiently estimated as (Saito, 2017)

\displaystyle\bra{\psi_{{\bf\it\theta}}}\hat{H}\ket{\psi_{{\bf\it\theta}}}\approx\frac{1}{|\mathcal{M}|}\sum_{{\bf\it n}\in\mathcal{M}}\sum_{{\bf\it n}^{\prime}}\bra{{\bf\it n}}\hat{H}\ket{{\bf\it n}^{\prime}}\frac{\psi_{{\bf\it\theta}}({\bf\it n}^{\prime})}{\psi_{{\bf\it\theta}}({\bf\it n})}.

(4)

More explicitly, the following steps are carried out to generate the sample $\mathcal{M}$ . At the first iteration, a state ${\bf\it n}_{0}$ is randomly proposed. Then, at the $i$ -th iteration:

1.

Under some updating rule, propose a new state ${\bf\it n}^{\prime}_{i}$ from ${\bf\it n}_{i}$ .
2.

With probability $\min\{1,|\psi_{{\bf\it\theta}}({\bf\it n}^{\prime}_{i})/\psi_{{\bf\it\theta}}({\bf\it n}_{i})|^{2}\}$ accept the state ${\bf\it n}^{\prime}_{i}$ , i.e. ${\bf\it n}_{i+1}\leftarrow{\bf\it n}^{\prime}_{i}$ . If it is not accepted, then ${\bf\it n}_{i+1}\leftarrow{\bf\it n}_{i}$ .

To build the sample $\mathcal{M}$ a total of 1000 iterations are performed. The updating rule is provided by NetKet’s Local Metropolis sampler, which picks a site at random and assigns to it a uniformly random occupation, except from the current occupation. Then, through either stochastic gradient descent or stochastic reconfiguration (Sorella et al., 2007), the energy in eq. 4 can be minimized, producing a new set of parameters ${\bf\it\theta}$ , as explained by Carleo and Troyer (2017). Both the sampling and minimization of the energy with respect to the RBM parameters are repeated iteratively until the RBM parameters converge, resembling an Expectation Maximization algorithm (Dempster et al., 1977). A schematic representation of the variational Monte Carlo technique is shown in fig. 2.

The blot at the bottom of fig. 2 depicts a region of the Hilbert space that contains the most relevant Fock states $\ket{{\bf\it n}}$ for the ground state of the BH Hamiltonian, with fixed $\mu,t$ and $U$ . For instance, the only relevant Fock state within the $m$ -th Mott lobe is $\ket{n_{1}=m,n_{2}=m,\ldots}$ , whereas in the SF phase there are many relevant Fock states. In that regard, the region of relevant Fock states can contain a variable number of Fock states depending on the parameters of the Hamiltonian that one is intending to solve. Therefore, an issue immediately arises: for an unknown target probability distribution $|\Psi({\bf\it n})|^{2}$ , the sampling can be too small or too large with respect to the size of the region of relevant Fock states. If it is small, important information about the ground state might not be taken into account, whereas if it is large, noisy probability from non-relevant states can be taken into account.

III Results

We swept several values of chemical potential and hopping energy (a grid of $72\times 100$ points) corresponding to the first and part of the second Mott lobe in the $t/U$ – $\mu/U$ space, with $U=1$ , performing 12000 sampling and optimization steps with NetKet (Carleo et al., 2019), where 1000 Metropolis-Hastings steps were done for each sampling step. This was done for three different scenarios: for 5 sites, we used 8 and 20 hidden neurons, and for 8 sites, we used 11 hidden neurons. In all cases, the maximum number of bosons allowed per site was 4.

III.1 Energy convergence

Since the VMC minimizes the energy, we check that the energy computed with eq. 4 converges by measuring its variance for the last 500 sampling-optimization steps, as well as by measuring the absolute error of the energy when compared to the exact ground state energy found through Lanczsos diagonalization. For the aforementioned three scenarios, the variance for the last 500 sampling-optimization steps is shown in fig. 3(a)-(c), and the corresponding absolute errors with respect to the exact ground state energy are shown in fig. 3(d)-(f). It is seen that in the Hamiltonian parameter space, the majority of the energies have low-variance, showing convergence towards a value that is in excellent agreement with the exact ground-state energies.

Since the Hilbert spaces are small enough to compute all the probability amplitude coefficients for every state in the Fock basis, we can also compute any observable $\hat{O}$ as

\displaystyle\bra{\psi_{{\bf\it\theta}}}\hat{O}\ket{\psi_{{\bf\it\theta}}}=\frac{\sum_{{\bf\it n}\in\mathcal{H}}|\psi_{{\bf\it\theta}}({\bf\it n})|^{2}\bra{{\bf\it n}}\hat{O}\ket{{\bf\it n}}}{\sum_{{\bf\it n}\in\mathcal{H}}|\psi_{{\bf\it\theta}}({\bf\it n})|^{2}}.

(5)

In particular, the absolute error of the energy computed through eq. 5 is shown in fig. 4. It is clear that in the case of 5 sites and 20 neurons, a very large region both in the SF and MI phases shows a strong disagreement between the energy calculated through eqs. 4 and 5, showing that even though the energy converged, the state did not. Another recurrent pattern in the energies calculated through eqs. 4 and 5 is an arc of high absolute errors formed in the left-most side of the Mott lobes, which we will address later.

III.2 Overlap

The lack of convergence of the RBM state for the case of 5 sites and 20 hidden neurons is further confirmed when we measure the overlap between the exact ground state $\ket{\psi_{\text{exact}}}$ and the RBM ground state $\ket{\psi_{{\bf\it\theta}}}$ , shown in fig. 5(c). It is now clear why the expected energy with respect to the complete RBM state shown in fig. 4(c) presents large errors when compared to the expected energy with respect to the exact ground state: it appears that the RBM has not learned the ground state in the bottom-right region of the plot, which is a region that covers part of the Mott lobe, as well as part of the SF phase region. Moreover, in the case of 5 sites and 8 hidden neurons, and the case of 8 sites and 11 hidden neurons, the RBM finds difficulty in learning the ground state in the limit between the MI and the SF phase as shown in fig. 5(a) and (b). The difficulty in learning those states, and in general, in treating the ground state near the MI-SF boundary comes from the Kosterlitz-Thouless-like quantum phase transition in 1D systems (Giamarchi, 2003), where an exponentially small Mott gap exists (Ejima et al., 2012). We see once again that there are arcs of low overlap points formed in the left-most side of the Mott lobes. Within the SF phase, there are also lines of low overlap, which appear because of finite size effects. Note that there are as many of these fictitious boundaries as there are sites in the periodic chain under study. Nevertheless, when comparing the 5 sites cases, it is seen that for values of $\mu/U>1$ the states at the MI-SF phases boundary are better learned when 20 hidden neurons are used in the RBM (as in (McBrian et al., 2019), cf. fig. 5(c)) than when only 8 hidden neurons are used (see fig. 5(a)).

An interesting motif is that if one pays very close attention to the energy absolute errors in fig. 3(d)-(f), the lowest errors occur where the overlap in fig. 5(a)-(c) is lowest. Therefore, even though the convergence in the energy is excellent, the state is not learned correctly.

Since the advantage of VMC over exact diagonalization comes for intractably large Hilbert spaces, it is not always possible to compute the probability amplitudes for all of the Fock states basis. In such a case, the RBM state can be built as

\displaystyle\ket*{\tilde{\psi}_{{\bf\it\theta}}}=\frac{\sum_{{\bf\it n}\in\mathcal{M}^{\prime}}\psi_{{\bf\it\theta}}({\bf\it n})\ket{{\bf\it n}}}{\sqrt{\sum_{{\bf\it n}\in\mathcal{M}^{\prime}}|\psi_{{\bf\it\theta}}({\bf\it n})|^{2}}},

(6)

where $\mathcal{M}^{\prime}$ is a sample of Fock states sampled with the Metropolis-Hastings algorithm, with an acceptance probability of $\min\{1,p_{\text{GC}}({\bf\it n}_{i+1})/p_{\text{GC}}({\bf\it n}_{i})\}$ , where $p_{\text{GC}}({\bf\it n})=\mathcal{Z}^{-1}\exp(-\expectationvalue{\hat{H}}{{\bf\it n}})$ ²²2If instead the sample is built from the distribution $\absolutevalue{\psi_{{\bf\it n}}}^{2}$ , then the acceptance probability becomes $\min\{1,|\psi_{{\bf\it\theta}}({\bf\it n}_{i+1})/\psi_{{\bf\it\theta}}({\bf\it n}_{i})|^{2}\}$ and the state build through eq. 5 inherits the RBM state problems (data not shown). is the probability associated with the grand canonical ensemble (note that the term $\mu\expectationvalue*{\hat{N}}$ has already been introduced in the Hamiltonian). This strategy was used to generate a sample $\mathcal{M}^{\prime}$ of up to 2048 Fock states yielding a state $\ket*{\tilde{\psi}_{{\bf\it\theta}}}$ for every point in the phase diagram. The overlap between the exact ground state $\ket{\psi_{\text{exact}}}$ and the sampled RBM ground state $\ket*{\tilde{\psi}_{{\bf\it\theta}}}$ is shown in fig. 6 for the three studied scenarios.

Comparing fig. 5 with fig. 6, it is seen that the retrieved sampled state “cleans” the RBM state. In fact, the Fock states whose probability amplitudes were badly learned were removed, and only the Fock states relevant for the actual ground state were left. Despite this cleaning, the larger the Hilbert space, the more states have to be sampled to consider all relevant Fock states, as it is seen that 2048 Fock states are insufficient to capture the ground state in the case of 8 sites shown in fig. 6(b). Nonetheless, note that the low overlap in the arc from fig. 5(b) almost completely disappears after the cleaning, cf. fig. 6(b).

On the other hand, an important feature of fig. 6(b) is that the overlap of the sampled RBM state diminishes as $t$ gets larger. This happens because larger values of $t/U$ imply larger delocalization of the ground wavefunction, thus, involving more Fock states. Moreover, the Hilbert space size for 8 sites consists of $5^{8}=390625$ Fock states, which is why only sampling 2048 Fock states results in poor representations of the ground state, especially in the SF phase. Sampling more Fock states eventually reconstructs the exact ground state with very high overlap, except at the boundary between the MI and SF phases (data not shown).

III.3 Order parameter

The phase diagram of the BH model can be reconstructed by measuring quantities in the sampled RBM ground state that exhibit the phase transition. As mentioned before, we chose the variance of the local number operator, in particular, of the first site. In fig. 7(a) and (b), the order parameter measured with the exact ground state is shown for 5 and 8 sites, respectively. It is seen that in the Mott insulator phase, the variance of the number of bosons in the first lattice site is near to 0, but not exactly 0 because of finite size effects. On the other hand, fig. 7(d) and (e) show the order parameter for the sampled RBM ground state with 2048 Fock states for the cases of 8 and 20 hidden neurons for 5 sites, which show excellent agreement with their exact counterpart fig. 7(a). However, at the boundary between the MI and SF phases, there are absolute errors that could be as high as 0.15, which come from the difficulty of learning the ground state near the phase transition boundary. In spite of these differences at the boundaries, it is clear that the learned RBM ground state mimics the re-entrance found in finite 1D chains (Buonsante and Vezzani, 2007; Park et al., 2004). Finally, fig. 7(c) shows the order parameter for the exact RBM ground state for 8 sites and 11 hidden neurons. Figure 7(f) also shows the order parameter for 8 sites and 11 hidden neurons but with a different sampling limit. Instead of fixing a number of Fock states to be sampled (2048 and 4096 were insufficient, data not shown), we build the sample $\mathcal{M}^{\prime}$ by accepting states with the Metropolis-Hastings algorithm until there is a run of 400 consecutive state proposals that do not raise a new accepted state into $\mathcal{M}^{\prime}$ . In this case, the state is cleaned (note that the arc of badly learned order parameter almost completely disappears). A clear indicator of the number of Fock states to represent the ground state emerges: for very low values of $t$ , within the first Mott lobe, only one sampled state ( $\ket{1,1,1,1,1}$ ) is needed to reproduce the exact ground state; in the SF phase, up to 30000 states are sampled before hitting a 400 streak of no new accepted states into the sample.

Other quantities can be used as an order parameter, which more explicitly relate to quantum correlations such as entanglement. For instance, a partial trace carried out over all degrees of freedom except the first site yields a reduced density matrix $\rho_{1}(t,\mu)$ , from which the linear entropy $S(t,\mu)=1-\Tr{\rho^{2}_{1}(t,\mu)}$ can be measured (the von Neumann entanglement entropy can also be used, e.g., Ejima et al. (2012)). Figure 8 shows the absolute errors between the exact and the sampled RBM ground states for the three studied scenarios, where the errors at the MI-SF boundary become large.

III.4 Tomography

In all the studied scenarios, there are problems for learning the ground state at the MI-SF boundaries, as well as at the mini-plateaus boundaries within the SF phase. In order to understand the differences between the learned ground state and the one obtained through exact diagonalization, we performed a study of the composition of the ground states for 5 sites and 8 hidden neurons. For that reason, we examined the probability amplitudes of the ground state (both RBM learned and exact) at 11 different points in the $t$ - $\mu$ space, fixing $t=0.1$ , and with $\mu$ at the middle and border of each plateau in the MI and SF phases. We also examined the RBM and exact ground states for very small $t$ at the middle of the first Mott lobe, as indicated by the black dot in fig. 9, where the ground state for the RBM and for exact diagonalization was $\ket{1,1,1,1,1}$ with an associated probability amplitude of 99.99%, as expected.

Note that the BH chain is invariant (up to a phase) under displacements and inversions, i.e. there are displacement and inversion operators that act as follows on Fock states: $\hat{T}\ket{n_{1},\ldots,n_{N-1},n_{N}}=e^{i\phi}\ket{n_{N},n_{1},\ldots,n_{N-1}}$ for displacement, and $\hat{I}\ket{n_{1},n_{2},\ldots,n_{N-1},n_{N}}=e^{i\varphi}\ket{n_{N},n_{N-1},\ldots,n_{2},n_{1}}$ for inversion. Therefore, in order to perform a tomography, we must take into account that all Fock states that belong to the same rung defined by displacement and inversion operations are equivalent. Now, each Fock state can be brought to a canonical Fock state through a consecutive application of displacement and inversion operators onto the original Fock state. This canonical Fock state is selected as the lexicographically smallest one, after every possible application of displacement and inversion operators, as in (Choo et al., 2018). In fig. 9, we show the probability amplitude distribution of the Fock state manifold rungs for the exact and the RBM ground states. More explicitly, the exact ground state is

\displaystyle\ket{\Psi}=\sum_{{\bf\it n}}\Psi({\bf\it n})\ket{{\bf\it n}}=\sum_{i}\sum_{{\bf\it n}\in R_{i}}\Psi({\bf\it n})\ket{{\bf\it n}},

(7)

where $i$ indexes the rungs, and $R_{i}$ is the $i$ -th rung. The probability amplitude corresponding to a rung is, therefore, the sum of the probability amplitudes of all of its Fock states, and the bars from fig. 9 show those rungs’ probability amplitudes. Reading the plot from right to left, i.e. starting with the smallest value of $\mu$ , we see that within the first Mott lobe, both the RBM and the exact ground states show very similar distributions over three rungs of the one-filling manifold: $\ket{1,1,1,1,1}$ , $\ket{0,1,1,1,2}$ and $\ket{0,1,1,2,1}$ . Increasing $\mu$ up to the first MI-SF boundary (slightly within the SF phase), we see that the probability amplitude distribution starts to differ between the exact and the RBM ground states, and the RBM struggles to identify if the ground state is now in an excitation manifold above the one-filling manifold, represented by the rung $\ket{1,1,1,1,2}$ , or in the MI phase which is represented by the rung $\ket{1,1,1,1,1}$ . Increasing again $\mu$ in order to be at the middle of the first plateau within the SF phase shows that both the RBM and the exact ground states have similar probability amplitude distributions, even though the RBM assigns a probability to other rungs (not shown in the plot because their contribution is less than 0.01). It is clear that this first plateau is mostly represented by the rung $\ket{1,1,1,1,2}$ which is in the 6-th excitation manifold. If we continue to increase $\mu$ we see that near the boundaries between the MI-SF phases and between the plateaus within the SF phase, the RBM and the exact ground states exhibit differences in the probability amplitude distributions. On the contrary, in the middle of those plateaus, the probability amplitude distributions from both RBM and exact ground states are very similar (which is also seen in fig. 5(a)). Moreover, each time the tomography moves onto a new plateau (with a higher value of $\mu$ ), the excitation manifold increases by one, up to the 10-th excitation manifold, which corresponds to the two-filling manifold at the second Mott lobe, characterized by the rung $\ket{2,2,2,2,2}$ as well as the rungs $\ket{1,2,2,2,3}$ and $\ket{1,2,3,1,3}$ .

IV Conclusions

In this work, we systematically tested the capabilities of VMC with a trial ground wavefunction given by an RBM on the one-dimensional Bose-Hubbard model. The motivation for the technique comes from the possibility of incorporating it into the toolbox of quantum physics to tackle theoretical problems that are difficult to study numerically due to the intractably large Hilbert spaces. Thus, it is first needed to intensively test the technique to reproduce known results, and it is also needed to theoretically explain why the technique works (this is a challenging open question which involves the question of why neural networks work well). Only if the community identifies the strengths and weaknesses of the technique is it possible to use it to explore problems of interest that involve vast Hilbert spaces. Regarding the model under study, we repeatedly found differences between the exact ground state found through the exact diagonalization of the BH Hamiltonian and the learned ground state. We did so in one-dimensional chains of 5 and 8 sites. In order to better learn the ground state, the variational trial wavefunction was enriched in the case of 5 sites by increasing the number of hidden units in the RBM’s hidden layer. Although results improved near the MI-SF boundary, other regions of the BH parameter space suffered from badly learned ground states. For that reason, we proposed a sampling technique that cleans the ground state, getting rid of Fock states not relevant to the actual ground state.

In most of the BH model parameter space, it was found that the VMC-RBM method yielded good results in computing the energy (with low errors and clear signs of convergence), and also was able to compute other observables which explicitly involve quantities related to quantum correlations, such as the linear entropy of one of the chain’s sites. Moreover, we reconstructed with excellent accuracy the phase diagram of the BH model using the local occupation variance as an order parameter for most of the parameter space. Furthermore, we also carried out quantum state tomographies to understand the composition of the ground states at the interfaces between the MI-SF phases and also within the excitation manifold plateaus formed for finite chains in the SF phase, which revealed significant differences between the Fock states distributions of the exact ground state and the learned ground state.

Accordingly, we must raise the attention that the VMC technique is consistently challenged near the quantum phase transition of the BH model. We do not doubt that the high-expressivity of neural quantum states is able to improve those results under more extensive sampling (or better sampling schemes), or more complex variational trial wavefunctions apart from RBMs; however, we report that some questions have to be answered with precision before confidently using VMC as a method to accurately explore the physics of a many-body problem without the support of any other numerical method. In the first place, we observed that not every point of the phase diagram requires the same amount of computational work to learn the ground state. Most notably, the ground state in the MI phase is mostly explained by only one Fock state, whereas the number rapidly grows for ground states in the SF phase. Therefore, automated ways of stopping sampling within the sampling steps should be taken into consideration. In particular, when using the Metropolis-Hastings algorithm, we found that if a Markov chain of a certain length was formed with no new sampled states, we could stop sampling, yielding a high-quality ground state. Secondly, the VMC technique showed badly learned ground states along arcs formed within the Mott lobes. Why these form remains unanswered, as they are not related to the structure of the MI-SF boundaries. However, after we clean the learned ground state, these arcs almost completely disappear. Thus, it might be the case that they are related to noisy probability from Fock states that are not relevant to the ground state. Finally, concerning the first question, the number of optimization steps required to learn the ground state has to be better understood, as it is only clear when the energy converges, but not the state ³³3In this regard, NetKet team has included the Gelman-Rubin statistics (Vats and Knudson, 2018).

Additionally, an interesting discussion arises when paying close attention to the sampling technique of the VMC. Netket’s Local Metropolis sampler has low acceptance probability near the MI-SF transition for low values of $t$ and finite small chains such as the ones considered in our work, which complicates learning the state. However, the difference between the exact state and the learned state is constant throughout manifold excitation boundaries regardless of the $t$ value, and also regardless of being at the MI-SF transition, or within the SF transition.

Acknowledgements.

We want to thank Andrés Urquijo and J. P. Restrepo-Cuartas for useful discussions at the early stage of this study.

References

Georges et al. (1996) A. Georges, G. Kotliar, W. Krauth, and M. J. Rozenberg, Rev. Mod. Phys. 68, 13 (1996).
Schollwöck (2005) U. Schollwöck, Rev. Mod. Phys. 77, 259 (2005).
Carleo and Troyer (2017) G. Carleo and M. Troyer, Science 355, 602 (2017).
Han et al. (2019) J. Han, L. Zhang, and W. E, Journal of Computational Physics 399, 108929 (2019).
Choo et al. (2019a) K. Choo, A. Mezzacapo, and G. Carleo, “Fermionic neural-network states for ab-initio electronic structure,” (2019a), arXiv:1909.12852 [physics.comp-ph] .
Pfau et al. (2019) D. Pfau, J. S. Spencer, A. G. de G. Matthews, and W. M. C. Foulkes, “Ab-initio solution of the many-electron schrödinger equation with deep neural networks,” (2019), arXiv:1909.02487 [physics.chem-ph] .
Hermann et al. (2019) J. Hermann, Z. Schätzle, and F. Noé, “Deep neural network solution of the electronic schrödinger equation,” (2019), arXiv:1909.08423 [physics.comp-ph] .
Kessler et al. (2019) J. Kessler, F. Calcavecchia, and T. D. Kühne, “Artificial neural networks as trial wave functions for quantum monte carlo,” (2019), arXiv:1904.10251 [physics.comp-ph] .
Ruggeri et al. (2018) M. Ruggeri, S. Moroni, and M. Holzmann, Phys. Rev. Lett. 120, 205302 (2018).
Carleo et al. (2018) G. Carleo, Y. Nomura, and M. Imada, Nature communications 9, 1 (2018).
Sharir et al. (2020) O. Sharir, Y. Levine, N. Wies, G. Carleo, and A. Shashua, Phys. Rev. Lett. 124, 020503 (2020).
Hibat-Allah et al. (2020) M. Hibat-Allah, M. Ganahl, L. E. Hayward, R. G. Melko, and J. Carrasquilla, “Recurrent neural network wavefunctions,” (2020), arXiv:2002.02973 [cond-mat.dis-nn] .
Czischek et al. (2018) S. Czischek, M. Gärttner, and T. Gasenzer, Phys. Rev. B 98, 024311 (2018).
Nomura et al. (2017) Y. Nomura, A. S. Darmawan, Y. Yamaji, and M. Imada, Phys. Rev. B 96, 205152 (2017).
Teng (2018) P. Teng, Phys. Rev. E 98, 033305 (2018).
Szabó and Castelnovo (2020) A. Szabó and C. Castelnovo, “Neural network wave functions and the sign problem,” (2020), arXiv:2002.04613 [cond-mat.str-el] .
Choo et al. (2019b) K. Choo, T. Neupert, and G. Carleo, Phys. Rev. B 100, 125124 (2019b).
Liang et al. (2018) X. Liang, W.-Y. Liu, P.-Z. Lin, G.-C. Guo, Y.-S. Zhang, and L. He, Phys. Rev. B 98, 104426 (2018).
Saito (2018) H. Saito, Journal of the Physical Society of Japan 87, 074002 (2018).
Gao and Duan (2017) X. Gao and L.-M. Duan, Nature Communications 8, 662 (2017).
Sun et al. (2019) S. Sun, Z. Cao, H. Zhu, and J. Zhao, “A survey of optimization methods from a machine learning perspective,” (2019), arXiv:1906.06821 [cs.LG] .
Ballentine (1998) L. E. Ballentine, Quantum mechanics: a modern development (World Scientific Publishing Company, 1998) pp. 293–294.
Carleo et al. (2019) G. Carleo, K. Choo, D. Hofmann, J. E. Smith, T. Westerhout, F. Alet, E. J. Davis, S. Efthymiou, I. Glasser, S.-H. Lin, M. Mauri, G. Mazzola, C. B. Mendl, E. van Nieuwenburg, O. O’Reilly, H. Théveniaut, G. Torlai, F. Vicentini, and A. Wietek, SoftwareX 10, 100311 (2019).
McBrian et al. (2019) K. McBrian, G. Carleo, and E. Khatami, in Journal of Physics: Conference Series, Vol. 1290 (IOP Publishing, 2019) p. 012005.
Li (2019) Z. Li, Quantum simulation of quantum phases of matter with interacting photons, Ph.D. thesis, National University of Singapore (2019).
Saito (2017) H. Saito, Journal of the Physical Society of Japan 86, 093001 (2017).
Choo et al. (2018) K. Choo, G. Carleo, N. Regnault, and T. Neupert, Phys. Rev. Lett. 121, 167204 (2018).
Saito and Kato (2018) H. Saito and M. Kato, Journal of the Physical Society of Japan 87, 014001 (2018), https://doi.org/10.7566/JPSJ.87.014001 .
Irikura and Saito (2020) N. Irikura and H. Saito, Phys. Rev. Research 2, 013284 (2020).
Liu and van Nieuwenburg (2018) Y.-H. Liu and E. P. L. van Nieuwenburg, Phys. Rev. Lett. 120, 176401 (2018).
Huembeli et al. (2018) P. Huembeli, A. Dauphin, and P. Wittek, Phys. Rev. B 97, 134109 (2018).
Broecker et al. (2017) P. Broecker, F. F. Assaad, and S. Trebst, “Quantum phase recognition via unsupervised machine learning,” (2017), arXiv:1707.00663 [cond-mat.str-el] .
Pérez Díaz (2019) J. Pérez Díaz, Detection of quantum phase transitions via machine learning algorithms, Master’s thesis, Universitat Politècnica de Catalunya (2019).
Note (1) By relevant, we mean that a Fock state $\ket{{\bf\it n}}$ has a large probability amplitude with respect to other Fock states.
Fisher et al. (1989) M. P. Fisher, P. B. Weichman, G. Grinstein, and D. S. Fisher, Physical Review B 40, 546 (1989).
van Oudenaarden and Mooij (1996) A. van Oudenaarden and J. E. Mooij, Physical Review Letters 76, 4947 (1996).
Baltin and Wagenblast (1997) R. Baltin and K.-H. Wagenblast, Europhysics Letters (EPL) 39, 7 (1997).
Bruder et al. (2005) C. Bruder, R. Fazio, and G. Schon, Annalen der Physik 14, 566 (2005).
Greiner et al. (2003) M. Greiner, O. Mandel, T. Rom, A. Altmeyer, A. Widera, T. W. Hänsch, and I. Bloch, Physica B: Condensed Matter 329-333, 11 (2003).
Jaksch et al. (1998) D. Jaksch, C. Bruder, J. I. Cirac, C. W. Gardiner, and P. Zoller, Physical Review Letters 81, 3108 (1998).
Bloch et al. (2008) I. Bloch, J. Dalibard, and W. Zwerger, Reviews of Modern Physics 80, 885 (2008), arXiv:0704.3011 .
Stöferle et al. (2004) T. Stöferle, H. Moritz, C. Schori, M. Köhl, and T. Esslinger, Phys. Rev. Lett. 92, 130403 (2004).
Spielman et al. (2007) I. B. Spielman, W. D. Phillips, and J. V. Porto, Phys. Rev. Lett. 98, 080404 (2007).
Gemelke et al. (2009) N. Gemelke, X. Zhang, C.-L. Hung, and C. Chin, Nature 460, 995 (2009).
Grynberg et al. (2000) G. Grynberg, P. Horak, and C. Mennerat-Robilliard, Europhysics Letters (EPL) 49, 424 (2000).
Kollath et al. (2004) C. Kollath, U. Schollwöck, J. von Delft, and W. Zwerger, Phys. Rev. A 69, 031601 (2004).
Lugan et al. (2007) P. Lugan, D. Clément, P. Bouyer, A. Aspect, M. Lewenstein, and L. Sanchez-Palencia, Phys. Rev. Lett. 98, 170403 (2007).
Freericks and Monien (1996) J. K. Freericks and H. Monien, Physical Review B 53, 2691 (1996).
Ejima et al. (2012) S. Ejima, H. Fehske, F. Gebhard, K. zu Münster, M. Knap, E. Arrigoni, and W. von der Linden, Phys. Rev. A 85, 053644 (2012).
Cazalilla et al. (2011) M. A. Cazalilla, R. Citro, T. Giamarchi, E. Orignac, and M. Rigol, Rev. Mod. Phys. 83, 1405 (2011).
Teichmann et al. (2009) N. Teichmann, D. Hinrichs, M. Holthaus, and A. Eckardt, Physical Review B - Condensed Matter and Materials Physics 79, 1 (2009).
Batrouni et al. (1995) G. G. Batrouni, R. T. Scalettar, G. T. Zimanyi, and A. P. Kampf, Phys. Rev. Lett. 74, 2527 (1995).
Krauth et al. (1991) W. Krauth, N. Trivedi, and D. Ceperley, Phys. Rev. Lett. 67, 2307 (1991).
Krauth and Trivedi (1991) W. Krauth and N. Trivedi, Europhysics Letters (EPL) 14, 627 (1991).
Niyaz et al. (1991) P. Niyaz, R. T. Scalettar, C. Y. Fong, and G. G. Batrouni, Phys. Rev. B 44, 7143 (1991).
Kühner and Monien (1998) T. D. Kühner and H. Monien, Phys. Rev. B 58, R14741 (1998).
Ejima et al. (2011) S. Ejima, H. Fehske, and F. Gebhard, EPL (Europhysics Letters) 93, 30002 (2011).
Kühner et al. (2000) T. D. Kühner, S. R. White, and H. Monien, Phys. Rev. B 61, 12474 (2000).
Batrouni and Scalettar (1992) G. G. Batrouni and R. T. Scalettar, Phys. Rev. B 46, 9051 (1992).
Kashurnikov et al. (1996) V. A. Kashurnikov, A. V. Krasavin, and B. V. Svistunov, Journal of Experimental and Theoretical Physics Letters 64, 99 (1996).
Batrouni et al. (1990) G. G. Batrouni, R. T. Scalettar, and G. T. Zimanyi, Phys. Rev. Lett. 65, 1765 (1990).
Elstner and Monien (1999) N. Elstner and H. Monien, Phys. Rev. B 59, 12184 (1999).
Koller and Dupuis (2006) W. Koller and N. Dupuis, Journal of Physics: Condensed Matter 18, 9525 (2006).
Pino et al. (2013) M. Pino, J. Prior, and S. R. Clark, Physica Status Solidi (B) Basic Research 250, 51 (2013).
Smolensky (1986) P. Smolensky, “Information processing in dynamical systems: Foundations of harmony theory,” in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations (MIT Press, Cambridge, MA, USA, 1986) p. 194–281.
Sorella et al. (2007) S. Sorella, M. Casula, and D. Rocca, The Journal of Chemical Physics 127, 014105 (2007), https://doi.org/10.1063/1.2746035 .
Dempster et al. (1977) A. P. Dempster, N. M. Laird, and D. B. Rubin, Journal of the Royal Statistical Society. Series B (Methodological) 39, 1 (1977).
Giamarchi (2003) T. Giamarchi, Quantum physics in one dimension, Vol. 121 (Clarendon press, 2003).
Note (2) If instead the sample is built from the distribution $\absolutevalue{\psi_{{\bf\it n}}}^{2}$ , then the acceptance probability becomes $\mathop{min}\displaylimits\{1,|\psi_{{\bf\it\theta}}({\bf\it n}_{i+1})/\psi_{{\bf\it\theta}}({\bf\it n}_{i})|^{2}\}$ and the state build through eq. 5 inherits the RBM state problems (data not shown).
Buonsante and Vezzani (2007) P. Buonsante and A. Vezzani, Phys. Rev. Lett. 98, 110601 (2007).
Park et al. (2004) S.-h. Park, C. Park, and M.-C. Cha, Journal of the Korean Physical Society 45, 1553 (2004).
Note (3) In this regard, NetKet team has included the Gelman-Rubin statistics (Vats and Knudson, 2018).
Vats and Knudson (2018) D. Vats and C. Knudson, “Revisiting the gelman-rubin diagnostic,” (2018), arXiv:1812.09384 [stat.CO] .