This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Phase Diagram Reconstruction of the Bose-Hubbard Model with a Restricted Boltzmann Machine Wavefunction

Vladimir Vargas-Calderón [email protected]    Herbert Vinck-Posada Grupo de Superconductividad y Nanotecnología, Departamento de Física, Universidad Nacional de Colombia, Bogotá, Colombia    Fabio A. González Computing Systems and Industrial Eng. Department, Universidad Nacional de Colombia, Bogotá, Colombia
Abstract

Recently, the use of neural quantum states for describing the ground state of many- and few-body problems has been gaining popularity because of their high expressivity and ability to handle intractably large Hilbert spaces. In particular, methods based on variational Monte Carlo have proven to be successful in describing the physics of bosonic systems such as the Bose-Hubbard (BH) model. However, this technique has not been systematically tested on the parameter space of the BH model, particularly at the boundary between the Mott insulator and superfluid phases. In this work, we evaluate the capabilities of variational Monte Carlo with a trial wavefunction given by a Restricted Boltzmann Machine to reproduce the quantum ground state of the BH model on several points of its parameter space. To benchmark the technique, we compare its results to the ground state found through exact diagonalization for small one-dimensional chains. In general, we find that the learned ground state correctly estimates many observables, reproducing to a high degree the phase diagram for the first Mott lobe and part of the second one. However, we find that the technique is challenged whenever the system transitions between excitation manifolds, as the ground state is not learned correctly at these boundaries. We improve the quality of the results produced by the technique by proposing a method to discard noisy probabilities learned in the ground state.

I Introduction

The dimension of a Hilbert space that describes the possible states of a many-body system scales exponentially with the number of one-body states and with the number of particles. In most cases, this feature comes as an important practical difficulty for physicists to study many-body problems, because approximate techniques have to be implemented in order to perform simulations (e.g., dynamical mean-field theory (Georges et al., 1996), density matrix renormalization group (DMRG) (Schollwöck, 2005)). One of the most studied quantum objects is the ground state of a many-body system for a number of reasons: particles tend to occupy the lowest energy states first, as dictated by the aufbau principle; also excited states inherit the ground state structure; and most prominently, it is an object that encodes changes in observables that exhibit quantum phase transitions. Recently, a technique to calculate the ground state of a many-body system has been proposed, based on defining a trial wavefunction with a neural network, which is optimized to minimize the ground state energy through variational Monte Carlo(VMC) (Carleo and Troyer, 2017). This technique has proven to be successful in approximating with high fidelity the ground state of several condensed matter many- and few-body systems. Some examples are atomic and molecular systems (Han et al., 2019; Choo et al., 2019a; Pfau et al., 2019; Hermann et al., 2019; Kessler et al., 2019; Ruggeri et al., 2018), the transverse-field Ising model (Carleo et al., 2018; Sharir et al., 2020; Hibat-Allah et al., 2020) with quenching (Czischek et al., 2018), the Heisenberg model (Carleo et al., 2018) and its anti-ferromagnetic version (Sharir et al., 2020; Nomura et al., 2017), the quantum harmonic oscillator in electric field (Teng, 2018), the Hubbard model (Nomura et al., 2017), the J1J_{1}-J2J_{2} Heisenberg model (Hibat-Allah et al., 2020) and its antiferromagnetic version (Szabó and Castelnovo, 2020; Choo et al., 2019b; Liang et al., 2018), as well as models for interacting bosons in one and three-dimensions (Saito, 2018).

The success in approximating the ground state comes from two sources. The first one is technical: neural networks have expressivity and are able to approximate arbitrary functions (Gao and Duan, 2017), and their optimization methods in machine learning have been thoroughly studied (Sun et al., 2019). The second one is physical: even though the size of a Hilbert space exponentially grows with respect to one-body states and number of particles, only a small set of those states are needed to describe the ground state (Carleo and Troyer, 2017). Here the definition of small can vary, as it will be seen in this work.

Although reported results indicate that the quantum ground state can be represented through neural network wavefunctions, most of these studies focus on the convergence of the energy. Only a few of them use the technique to characterize quantum correlations near a quantum phase transition to study the phase space of a Hamiltonian comprehensively. Indeed, the convergence of the energy comes quicker than the convergence of the state, as any first-order error in the variational state leads to only a second-order error in its corresponding energy (Ballentine, 1998). Therefore, focusing on energy convergence might be misleading when asserting the power of VMC with neural network trial wavefunctions. In contrast to previous works, we intensively and systematically test the capabilities of VMC to reconstruct, through a Restricted Boltzmann Machine (RBM) wavefunction, the quantum ground state of a one-dimensional BH system throughout the superfluid (SF) and Mott insulator (MI) phases. We chose the BH model because of the known numerical difficulties near the MI-SF boundary. Consequently, we contrast exact diagonalization solutions with the ones provided by the VMC-RBM method in systems of 5 and 8 sites for a fine mesh in the BH parameter space at the first and part of the second Mott lobes. It is only due to the sweeping of the BH parameters that we are able to see that points near the MI-SF boundary are challenging for the VMC-RBM method. However, we also find that some of the differences between the learned ground state and the ground state found through exact diagonalization originate from badly learned probability amplitudes of Fock states that should have small contribution to the ground state. In this work, we also propose a state cleaning technique that removes those badly learned probabilities.

There have also been previous efforts to learn the ground state of the BH Hamiltonian through different neural network wavefunctions. However, as mentioned before, these studies do not extensively test the VMC framework throughout the BH parameter space. For instance, a permutation symmetric RBM implemented in NetKet Carleo et al. (2019) has been used to study the energy and particle density convergence at two points in the SF and MI phases, finding a difficulty in the convergence to the numerically exact particle density in the SF phase McBrian et al. (2019). However, a qualitatively good location of the boundary between the SF and MI phases for the region enclosing the first two Mott lobes was achieved using the particle density as a discriminator McBrian et al. (2019). RBMs have also been used to show how for a fixed number of bosons in the system, the learned ground state is able to replicate the numerically exact order parameter of the quantum phase transition Li (2019). However, fixing the number of bosons greatly reduces the size of the Hilbert space, making it easier for the VMC technique to learn the ground state since a more complete sampling of the Hilbert’s basis states is possible. Also, the BH model has been considered as a toy model for trying full-forward neural networks (FFNNs) and convolutional neural networks (CNNs) wavefunctions. For instance, the ground state of one and two-dimensional finite lattices with parabolic confinement potential for a fixed number of bosons was approximated through an FFNN ansatz (Saito, 2017). In the case of no confinement and periodic boundary conditions (which introduce displacement symmetry), both FFNNs and RBMs were used in large 1D lattices of up to 40 sites (Choo et al., 2018). CNNs have also been introduced and compared with FFNN wavefunctions to approximate this ground state (Saito and Kato, 2018). A CNN is also proposed as a means to introduce the ground state at finite temperature in a 1D lattice for larger on-site boson interaction than hopping interaction (Irikura and Saito, 2020). Despite the important results derived from these works, they do not focus on the ability of the neural network ansatz to reproduce the ground state near the quantum phase transition boundary. Nevertheless, unsupervised learning has been previously used to classify states as belonging to the Mott insulator or SF phases in the BH model with previously obtained states or physical quantities, for several values of the order parameters Liu and van Nieuwenburg (2018); Huembeli et al. (2018); Broecker et al. (2017); Pérez Díaz (2019).

We believe that the ability of VMC technique to reproduce the ground state near the quantum phase transition has to be extensively tested. Therefore, we pay close attention to the description of the variational ground state near the SF-MI phase for a small number of sites to compare it with the numerically exact ground state. This paper is organized as follows. Section II exposes the main features of the BH Hamiltonian and describes the SF-MI transition in the system. In this section, an overview of the VMC technique is given, discussing the difficulties that must be faced and overcome in order to learn the ground state of the BH model. In section III, the main results are given, including energy convergence, the overlap of the learned and exact ground states, the reproduction of the phase diagram via two different order parameters, as well as tomographies indicating relevant 111By relevant, we mean that a Fock state |𝒏\ket{{\bf\it n}} has a large probability amplitude with respect to other Fock states. Fock states for the ground state. Finally, in section IV conclusions of this work are given.

II Model and Variational Monte Carlo

II.1 Bose-Hubbard Hamiltonian

The BH Hamiltonian describes the interactions between bosons that can occupy sites in a dd-dimensional lattice. The model is characterized by a hopping energy tt, an on-site interaction UU and a chemical potential μ\mu, so that the grand-canonical Hamiltonian reads (=1\hbar=1(Fisher et al., 1989)

H^=tij(a^ia^j+H.c.)+U2in^i(n^i1)μin^i,\displaystyle\hat{H}=-t\sum_{\langle ij\rangle}(\hat{a}_{i}^{\dagger}\hat{a}_{j}+\text{H.c.})+\frac{U}{2}\sum_{i}\hat{n}_{i}(\hat{n}_{i}-1)-\mu\sum_{i}\hat{n}_{i}, (1)

where a^i\hat{a}_{i} is the annihilation operator at site ii, and n^i=a^ia^i\hat{n}_{i}=\hat{a}^{\dagger}_{i}\hat{a}_{i} is the number operator. The notation ij\langle ij\rangle indicates that the sum runs over pairs of neighbor sites in the lattice. The BH model is able to reproduce experimental results in Josephson-junction networks (van Oudenaarden and Mooij, 1996; Baltin and Wagenblast, 1997; Bruder et al., 2005) and in lattices of ultra-cold atoms (Greiner et al., 2003; Jaksch et al., 1998; Bloch et al., 2008; Stöferle et al., 2004; Spielman et al., 2007; Gemelke et al., 2009). The latter offers precise control of the lattice parameters (Grynberg et al., 2000; Kollath et al., 2004; Lugan et al., 2007). Theoretically, a lot of attention has been devoted both to understand the quantum phases of the system (the ones that arise from eq. 1 and from the disordered or extended BH model with longer range interactions) (Freericks and Monien, 1996; Ejima et al., 2012; Cazalilla et al., 2011; Teichmann et al., 2009; Batrouni et al., 1995; Krauth et al., 1991; Krauth and Trivedi, 1991; Niyaz et al., 1991). Another central issue is to calculate the quantum phase transition boundaries (Kühner and Monien, 1998; Ejima et al., 2011; Kühner et al., 2000; Batrouni and Scalettar, 1992; Kashurnikov et al., 1996; Batrouni et al., 1990; Elstner and Monien, 1999; Koller and Dupuis, 2006), whose precision has improved over the years with better calculation techniques and computing power, revealing features such as the re-entrance phenomenon (Pino et al., 2013; Elstner and Monien, 1999), where for particular values of the chemical potential, the system switches between the MI phase to the SF phase, and back to the MI phase before definitely entering the SF phase after an increase of t/Ut/U.

For simplicity, we will restrict our analyses to the d=1d=1 case, where only two quantum phases are possible. When the on-site interaction energy is much larger than the hopping energy, the latter becomes negligible, and the Hamiltonian is written as the sum of independent Hamiltonians for each site U2n^i(n^i1)μn^i\frac{U}{2}\hat{n}_{i}(\hat{n}_{i}-1)-\mu\hat{n}_{i}. These one-particle Hamiltonians can be immediately diagonalized by the number basis. The corresponding eigenenergies are U2ni(ni1)μni\frac{U}{2}n_{i}(n_{i}-1)-\mu n_{i}, which reach minimum values for fixed μ\mu and UU at ni=max{0,μ/U}n_{i}=\max\{0,\lceil\mu/U\rceil\} (note that all sites are equivalent). Moreover, in this regime, the expected variance of the local number operator is 0, i.e. n^i2n^i2=0\langle\hat{n}_{i}^{2}\rangle-\langle\hat{n}_{i}\rangle^{2}=0. This regime characterizes the MI phase. On the other hand, when the hopping energy is much larger than the on-site interaction energy, the latter becomes negligible. Thus, the Hamiltonian can also be written as the sum of independent Hamiltonians, but in momentum space, where a~k=N1/2j=1Na^jeixjpk/\tilde{a}_{k}=N^{-1/2}\sum_{j=1}^{N}\hat{a}_{j}e^{-ix_{j}p_{k}/\hbar} is the boson annihilation operator in the momentum representation. Here, xj=c×jx_{j}=c\times j, where cc is the lattice constant, and pk=2πk/(N×c)p_{k}=2\pi k\hbar/(N\times c). Each independent Hamiltonian in momentum space (k(2tcos(2πk/N)μ)a~ka~k\sum_{k}(-2t\cos(2\pi k/N)-\mu)\tilde{a}^{\dagger}_{k}\tilde{a}_{k}) has eigenenergies 2tcos(2πk/N)μ-2t\cos(2\pi k/N)-\mu, which reach their minimum when all bosons condense with 0 momenta. Note that the energies are independent of the states’ occupation, meaning that the ground state is degenerate for any number of particles. This regime is known as the SF phase, characterized by a delocalized wavefunction (formally described by algebraic decaying spatial correlations (Ejima et al., 2012; Pino et al., 2013), which is why the SF phase in 1D is not a true Bose-Einstein condensate). Thus, the ground state has a non-zero expected variance of the local number operator. In fact, the probability distribution for the local occupation is Poissonian, meaning that ni2ni2=ni\expectationvalue*{n_{i}^{2}}-\expectationvalue*{n_{i}}^{2}=\expectationvalue*{n_{i}} (Greiner et al., 2003). For a fixed chemical potential and an infinite number of sites, there exists a continuous phase transition from the MI phase to the SF phase as UU decreases with respect to tt (with the exception of a range of μ\mu values in the first Mott lobe, where re-entrance exists).

II.2 Variational Monte Carlo with Restricted Boltzmann Machine Wave Function

In this section, a review of the method introduced by Carleo and Troyer (2017) is given, where a VMC setup is used to find the ground state, formally written as a trial wavefunction given by an RBM. In the Fock space basis, the ground state of the BH model can be written as Saito (2017)

|Ψ=n1=0,n2=0,Ψ(n1,n2,)|n1,n2,,\displaystyle\ket{\Psi}=\sum_{n_{1}=0,n_{2}=0,\ldots}^{\infty}\Psi(n_{1},n_{2},\ldots)\ket{n_{1},n_{2},\ldots}, (2)

where nin_{i} is the occupation number of the ii-th site, and |Ψ(n1,n2,)|2|\Psi(n_{1},n_{2},\ldots)|^{2} are the probability amplitudes corresponding to the Fock states |n1,n2,\ket{n_{1},n_{2},\ldots}. In order to map the wave function to a computer, both the number of sites and the number of possible particles in each site have to be truncated. We will refer to the number of sites as NN and to the maximum number of particles in each site as M1M-1. The coefficients Ψ(n1,n2,,nN)\Psi(n_{1},n_{2},\ldots,n_{N}) are approximated by an RBM. RBMs are generative neural networks, formally described by a bipartite undirected graph such as the one shown in fig. 1, where there is a layer of visible neurons denoted by 𝒗{\bf\it v} that are used to input real data, and a layer of hidden neurons denoted by 𝒉{\bf\it h} that are used as latent variables of the model (Smolensky, 1986). In particular, the wavefunction coefficients take the form Ψ(n1,n2,,nN)ψ𝜽(n1,,nN)=𝒉eERBM(𝒗(𝒏),𝒉)\Psi(n_{1},n_{2},\ldots,n_{N})\approx\psi_{{\bf\it\theta}}(n_{1},\ldots,n_{N})=\sum_{{\bf\it h}}e^{-E_{\text{RBM}}({\bf\it v}({\bf\it n}),{\bf\it h})}, where ERBM(𝒗(𝒏),𝒉)E_{\text{RBM}}({\bf\it v}({\bf\it n}),{\bf\it h}) is the energy of the RBM (Smolensky, 1986), and 𝜽{\bf\it\theta} are the variational parameters of the RBM. As a short-hand notation, an occupation configuration is denoted as 𝒏{\bf\it n}, and it is inputted to the visible layer of the RBM. However, the configuration first needs to be one-hot encoded as follows: each occupation nin_{i} is encoded into an MM-component vector whose jj-th component is δj,ni\delta_{j,n_{i}}, j=0,1,,M1j=0,1,\ldots,M-1; then, the vectors for every occupation are concatenated into 𝒗(𝒏){\bf\it v}({\bf\it n}). Moreover, if the NHN_{H} hidden neurons are restricted to binary values 11 or 1-1, then, the approximated wavefunction coefficients can be written as Carleo and Troyer (2017)

ψ𝜽(𝒏)=ejajvj(𝒏)=1NH2cosh(b+jW,jvj(𝒏)),\displaystyle\psi_{{\bf\it\theta}}({\bf\it n})=e^{\sum_{j}a_{j}v_{j}({\bf\it n})}\prod_{\ell=1}^{N_{H}}2\cosh\left(b_{\ell}+\sum_{j}W_{\ell,j}v_{j}({\bf\it n})\right), (3)

where aj,ba_{j},b_{\ell} and WW are the complex-valued visible bias, hidden bias and connection matrix of the RBM, respectively.

Refer to caption
Figure 1: (Color online) Illustration of the RBM, where each site occupation is one-hot encoded into MM visible neurons, depicted with different colors for different sites in the top layer. There are NHN_{H} hidden neurons in the bottom layer connected with the visible neurons through weights W,jW_{\ell,j} that connect the \ell-th hidden neuron with the jj-th visibile neuron.

Thus, the approximation of the wavefunction coefficients is done through the adjustment of the parameters 𝜽:{aj,b,W,j}{\bf\it\theta}:\{a_{j},b_{\ell},W_{\ell,j}\} that minimize the energy ψ𝜽|H^|ψ𝜽\bra{\psi_{{\bf\it\theta}}}\hat{H}\ket{\psi_{{\bf\it\theta}}}. At each step of the minimization, a set \mathcal{M} of configurations 𝒏{\bf\it n} is sampled from |ψ𝜽(𝒏)|2|\psi_{{\bf\it\theta}}({\bf\it n})|^{2} using the Metropolis-Hastings algorithm, so that the energy can be efficiently estimated as (Saito, 2017)

ψ𝜽|H^|ψ𝜽1||𝒏𝒏𝒏|H^|𝒏ψ𝜽(𝒏)ψ𝜽(𝒏).\displaystyle\bra{\psi_{{\bf\it\theta}}}\hat{H}\ket{\psi_{{\bf\it\theta}}}\approx\frac{1}{|\mathcal{M}|}\sum_{{\bf\it n}\in\mathcal{M}}\sum_{{\bf\it n}^{\prime}}\bra{{\bf\it n}}\hat{H}\ket{{\bf\it n}^{\prime}}\frac{\psi_{{\bf\it\theta}}({\bf\it n}^{\prime})}{\psi_{{\bf\it\theta}}({\bf\it n})}. (4)

More explicitly, the following steps are carried out to generate the sample \mathcal{M}. At the first iteration, a state 𝒏0{\bf\it n}_{0} is randomly proposed. Then, at the ii-th iteration:

  1. 1.

    Under some updating rule, propose a new state 𝒏i{\bf\it n}^{\prime}_{i} from 𝒏i{\bf\it n}_{i}.

  2. 2.

    With probability min{1,|ψ𝜽(𝒏i)/ψ𝜽(𝒏i)|2}\min\{1,|\psi_{{\bf\it\theta}}({\bf\it n}^{\prime}_{i})/\psi_{{\bf\it\theta}}({\bf\it n}_{i})|^{2}\} accept the state 𝒏i{\bf\it n}^{\prime}_{i}, i.e. 𝒏i+1𝒏i{\bf\it n}_{i+1}\leftarrow{\bf\it n}^{\prime}_{i}. If it is not accepted, then 𝒏i+1𝒏i{\bf\it n}_{i+1}\leftarrow{\bf\it n}_{i}.

To build the sample \mathcal{M} a total of 1000 iterations are performed. The updating rule is provided by NetKet’s Local Metropolis sampler, which picks a site at random and assigns to it a uniformly random occupation, except from the current occupation. Then, through either stochastic gradient descent or stochastic reconfiguration (Sorella et al., 2007), the energy in eq. 4 can be minimized, producing a new set of parameters 𝜽{\bf\it\theta}, as explained by Carleo and Troyer (2017). Both the sampling and minimization of the energy with respect to the RBM parameters are repeated iteratively until the RBM parameters converge, resembling an Expectation Maximization algorithm (Dempster et al., 1977). A schematic representation of the variational Monte Carlo technique is shown in fig. 2.

Refer to caption
Figure 2: (Color online) Representation of the variational Monte Carlo technique. With randomly initialized parameters 𝜽{\bf\it\theta}, a set of states from the Hilbert space \mathcal{H} is sampled. By minimizing the energy defined in eq. 4, the parameters 𝜽{\bf\it\theta} are updated. These two steps are repeated nn times with the objective of sampling the states (in the occupation basis) that are relevant for the ground state, depicted by a blot at the bottom, with the correct probability distribution.

The blot at the bottom of fig. 2 depicts a region of the Hilbert space that contains the most relevant Fock states |𝒏\ket{{\bf\it n}} for the ground state of the BH Hamiltonian, with fixed μ,t\mu,t and UU. For instance, the only relevant Fock state within the mm-th Mott lobe is |n1=m,n2=m,\ket{n_{1}=m,n_{2}=m,\ldots}, whereas in the SF phase there are many relevant Fock states. In that regard, the region of relevant Fock states can contain a variable number of Fock states depending on the parameters of the Hamiltonian that one is intending to solve. Therefore, an issue immediately arises: for an unknown target probability distribution |Ψ(𝒏)|2|\Psi({\bf\it n})|^{2}, the sampling can be too small or too large with respect to the size of the region of relevant Fock states. If it is small, important information about the ground state might not be taken into account, whereas if it is large, noisy probability from non-relevant states can be taken into account.

III Results

We swept several values of chemical potential and hopping energy (a grid of 72×10072\times 100 points) corresponding to the first and part of the second Mott lobe in the t/Ut/Uμ/U\mu/U space, with U=1U=1, performing 12000 sampling and optimization steps with NetKet (Carleo et al., 2019), where 1000 Metropolis-Hastings steps were done for each sampling step. This was done for three different scenarios: for 5 sites, we used 8 and 20 hidden neurons, and for 8 sites, we used 11 hidden neurons. In all cases, the maximum number of bosons allowed per site was 4.

III.1 Energy convergence

Since the VMC minimizes the energy, we check that the energy computed with eq. 4 converges by measuring its variance for the last 500 sampling-optimization steps, as well as by measuring the absolute error of the energy when compared to the exact ground state energy found through Lanczsos diagonalization. For the aforementioned three scenarios, the variance for the last 500 sampling-optimization steps is shown in fig. 3(a)-(c), and the corresponding absolute errors with respect to the exact ground state energy are shown in fig. 3(d)-(f). It is seen that in the Hamiltonian parameter space, the majority of the energies have low-variance, showing convergence towards a value that is in excellent agreement with the exact ground-state energies.

Refer to caption
Figure 3: (Color online) Energy variance and absolute error of the last 500 sampling-optimization steps. Dashed lines show the phase boundaries computed for 128 sites with DMRG by Ejima et al. (2011). In (a), the two first Mott lobes and the SF region are labeled explicitly. (a)-(c) show the variance for the last 500 sampling-optimization steps for the cases of 5 sites and 8 hidden neurons, 8 sites and 11 hidden neurons, and 5 sites and 20 hidden neurons, respectively. (d)-(f) show the corresponding absolute errors between the average energy value for the last 500 sampling-optimization steps and the exact ground state energy.

Since the Hilbert spaces are small enough to compute all the probability amplitude coefficients for every state in the Fock basis, we can also compute any observable O^\hat{O} as

ψ𝜽|O^|ψ𝜽=𝒏|ψ𝜽(𝒏)|2𝒏|O^|𝒏𝒏|ψ𝜽(𝒏)|2.\displaystyle\bra{\psi_{{\bf\it\theta}}}\hat{O}\ket{\psi_{{\bf\it\theta}}}=\frac{\sum_{{\bf\it n}\in\mathcal{H}}|\psi_{{\bf\it\theta}}({\bf\it n})|^{2}\bra{{\bf\it n}}\hat{O}\ket{{\bf\it n}}}{\sum_{{\bf\it n}\in\mathcal{H}}|\psi_{{\bf\it\theta}}({\bf\it n})|^{2}}. (5)

In particular, the absolute error of the energy computed through eq. 5 is shown in fig. 4. It is clear that in the case of 5 sites and 20 neurons, a very large region both in the SF and MI phases shows a strong disagreement between the energy calculated through eqs. 4 and 5, showing that even though the energy converged, the state did not. Another recurrent pattern in the energies calculated through eqs. 4 and 5 is an arc of high absolute errors formed in the left-most side of the Mott lobes, which we will address later.

Refer to caption
Figure 4: (Color online) Absolute error between the RBM state expected energy computed through eq. 5 and the exact ground state energy for (a) 5 sites and 8 hidden neurons, (b) 8 sites and 11 hidden neurons and (c) 5 sites and 20 hidden neurons. Dashed lines are the MI-SF boundaries as in fig. 3.

III.2 Overlap

The lack of convergence of the RBM state for the case of 5 sites and 20 hidden neurons is further confirmed when we measure the overlap between the exact ground state |ψexact\ket{\psi_{\text{exact}}} and the RBM ground state |ψ𝜽\ket{\psi_{{\bf\it\theta}}}, shown in fig. 5(c). It is now clear why the expected energy with respect to the complete RBM state shown in fig. 4(c) presents large errors when compared to the expected energy with respect to the exact ground state: it appears that the RBM has not learned the ground state in the bottom-right region of the plot, which is a region that covers part of the Mott lobe, as well as part of the SF phase region. Moreover, in the case of 5 sites and 8 hidden neurons, and the case of 8 sites and 11 hidden neurons, the RBM finds difficulty in learning the ground state in the limit between the MI and the SF phase as shown in fig. 5(a) and (b). The difficulty in learning those states, and in general, in treating the ground state near the MI-SF boundary comes from the Kosterlitz-Thouless-like quantum phase transition in 1D systems (Giamarchi, 2003), where an exponentially small Mott gap exists (Ejima et al., 2012). We see once again that there are arcs of low overlap points formed in the left-most side of the Mott lobes. Within the SF phase, there are also lines of low overlap, which appear because of finite size effects. Note that there are as many of these fictitious boundaries as there are sites in the periodic chain under study. Nevertheless, when comparing the 5 sites cases, it is seen that for values of μ/U>1\mu/U>1 the states at the MI-SF phases boundary are better learned when 20 hidden neurons are used in the RBM (as in (McBrian et al., 2019), cf. fig. 5(c)) than when only 8 hidden neurons are used (see fig. 5(a)).

Refer to caption
Figure 5: (Color online) Overlap |ψexact|ψ𝜽|2\absolutevalue{\innerproduct{\psi_{\text{exact}}}{\psi_{{\bf\it\theta}}}}^{2} between the exact and RBM ground states for (a) 5 sites and 8 hidden neurons, (b) 8 sites and 11 neurons, and (c) 5 sites and 20 neurons. Dashed lines are the MI-SF boundaries as in fig. 3.

An interesting motif is that if one pays very close attention to the energy absolute errors in fig. 3(d)-(f), the lowest errors occur where the overlap in fig. 5(a)-(c) is lowest. Therefore, even though the convergence in the energy is excellent, the state is not learned correctly.

Since the advantage of VMC over exact diagonalization comes for intractably large Hilbert spaces, it is not always possible to compute the probability amplitudes for all of the Fock states basis. In such a case, the RBM state can be built as

|ψ~𝜽=𝒏ψ𝜽(𝒏)|𝒏𝒏|ψ𝜽(𝒏)|2,\displaystyle\ket*{\tilde{\psi}_{{\bf\it\theta}}}=\frac{\sum_{{\bf\it n}\in\mathcal{M}^{\prime}}\psi_{{\bf\it\theta}}({\bf\it n})\ket{{\bf\it n}}}{\sqrt{\sum_{{\bf\it n}\in\mathcal{M}^{\prime}}|\psi_{{\bf\it\theta}}({\bf\it n})|^{2}}}, (6)

where \mathcal{M}^{\prime} is a sample of Fock states sampled with the Metropolis-Hastings algorithm, with an acceptance probability of min{1,pGC(𝒏i+1)/pGC(𝒏i)}\min\{1,p_{\text{GC}}({\bf\it n}_{i+1})/p_{\text{GC}}({\bf\it n}_{i})\}, where pGC(𝒏)=𝒵1exp(𝒏|H^|𝒏)p_{\text{GC}}({\bf\it n})=\mathcal{Z}^{-1}\exp(-\expectationvalue{\hat{H}}{{\bf\it n}}) 222If instead the sample is built from the distribution |ψ𝒏|2\absolutevalue{\psi_{{\bf\it n}}}^{2}, then the acceptance probability becomes min{1,|ψ𝜽(𝒏i+1)/ψ𝜽(𝒏i)|2}\min\{1,|\psi_{{\bf\it\theta}}({\bf\it n}_{i+1})/\psi_{{\bf\it\theta}}({\bf\it n}_{i})|^{2}\} and the state build through eq. 5 inherits the RBM state problems (data not shown). is the probability associated with the grand canonical ensemble (note that the term μN^\mu\expectationvalue*{\hat{N}} has already been introduced in the Hamiltonian). This strategy was used to generate a sample \mathcal{M}^{\prime} of up to 2048 Fock states yielding a state |ψ~𝜽\ket*{\tilde{\psi}_{{\bf\it\theta}}} for every point in the phase diagram. The overlap between the exact ground state |ψexact\ket{\psi_{\text{exact}}} and the sampled RBM ground state |ψ~𝜽\ket*{\tilde{\psi}_{{\bf\it\theta}}} is shown in fig. 6 for the three studied scenarios.

Comparing fig. 5 with fig. 6, it is seen that the retrieved sampled state “cleans” the RBM state. In fact, the Fock states whose probability amplitudes were badly learned were removed, and only the Fock states relevant for the actual ground state were left. Despite this cleaning, the larger the Hilbert space, the more states have to be sampled to consider all relevant Fock states, as it is seen that 2048 Fock states are insufficient to capture the ground state in the case of 8 sites shown in fig. 6(b). Nonetheless, note that the low overlap in the arc from fig. 5(b) almost completely disappears after the cleaning, cf. fig. 6(b).

On the other hand, an important feature of fig. 6(b) is that the overlap of the sampled RBM state diminishes as tt gets larger. This happens because larger values of t/Ut/U imply larger delocalization of the ground wavefunction, thus, involving more Fock states. Moreover, the Hilbert space size for 8 sites consists of 58=3906255^{8}=390625 Fock states, which is why only sampling 2048 Fock states results in poor representations of the ground state, especially in the SF phase. Sampling more Fock states eventually reconstructs the exact ground state with very high overlap, except at the boundary between the MI and SF phases (data not shown).

Refer to caption
Figure 6: (Color online) Overlap |ψexact|ψ~𝜽|2\absolutevalue*{\innerproduct*{\psi_{\text{exact}}}{\tilde{\psi}_{{\bf\it\theta}}}}^{2} between the exact and sampled RBM ground states for (a) 5 sites and 8 hidden neurons, (b) 8 sites and 11 hidden neurons and (c) 5 sites and 20 hidden neurons, for a maximum of 2048 states sampled from the Hilbert space. Dashed lines are the MI-SF boundaries as in fig. 3.

III.3 Order parameter

The phase diagram of the BH model can be reconstructed by measuring quantities in the sampled RBM ground state that exhibit the phase transition. As mentioned before, we chose the variance of the local number operator, in particular, of the first site. In fig. 7(a) and (b), the order parameter measured with the exact ground state is shown for 5 and 8 sites, respectively. It is seen that in the Mott insulator phase, the variance of the number of bosons in the first lattice site is near to 0, but not exactly 0 because of finite size effects. On the other hand, fig. 7(d) and (e) show the order parameter for the sampled RBM ground state with 2048 Fock states for the cases of 8 and 20 hidden neurons for 5 sites, which show excellent agreement with their exact counterpart fig. 7(a). However, at the boundary between the MI and SF phases, there are absolute errors that could be as high as 0.15, which come from the difficulty of learning the ground state near the phase transition boundary. In spite of these differences at the boundaries, it is clear that the learned RBM ground state mimics the re-entrance found in finite 1D chains (Buonsante and Vezzani, 2007; Park et al., 2004). Finally, fig. 7(c) shows the order parameter for the exact RBM ground state for 8 sites and 11 hidden neurons. Figure 7(f) also shows the order parameter for 8 sites and 11 hidden neurons but with a different sampling limit. Instead of fixing a number of Fock states to be sampled (2048 and 4096 were insufficient, data not shown), we build the sample \mathcal{M}^{\prime} by accepting states with the Metropolis-Hastings algorithm until there is a run of 400 consecutive state proposals that do not raise a new accepted state into \mathcal{M}^{\prime}. In this case, the state is cleaned (note that the arc of badly learned order parameter almost completely disappears). A clear indicator of the number of Fock states to represent the ground state emerges: for very low values of tt, within the first Mott lobe, only one sampled state (|1,1,1,1,1\ket{1,1,1,1,1}) is needed to reproduce the exact ground state; in the SF phase, up to 30000 states are sampled before hitting a 400 streak of no new accepted states into the sample.

Refer to caption
Figure 7: (Color online) Order parameter Var(n^1)\text{Var}(\hat{n}_{1}) for the ground state obtained through exact diagonalization for 5 sites (a), and 8 sites (b); for the sampled RBM ground state with 2048 Fock states for 5 sites with 8 hidden neurons (d) and 20 hidden neurons (e); and for the exact RBM ground state for 8 sites with 11 neurons (c) and the sampled RBM ground state (f). The white line is the 0.001 contour line, and the yellow one corresponds to the 0.01 contour line in all plots. Dashed lines are the MI-SF boundaries as in fig. 3.

Other quantities can be used as an order parameter, which more explicitly relate to quantum correlations such as entanglement. For instance, a partial trace carried out over all degrees of freedom except the first site yields a reduced density matrix ρ1(t,μ)\rho_{1}(t,\mu), from which the linear entropy S(t,μ)=1Tr(ρ12(t,μ))S(t,\mu)=1-\Tr{\rho^{2}_{1}(t,\mu)} can be measured (the von Neumann entanglement entropy can also be used, e.g., Ejima et al. (2012)). Figure 8 shows the absolute errors between the exact and the sampled RBM ground states for the three studied scenarios, where the errors at the MI-SF boundary become large.

Refer to caption
Figure 8: (Color online) Absolute error for the linear entropy at the first site of the chain between the exact ground state and the sampled RBM ground state for (a) 5 sites, 8 hidden neurons and 2048 Fock states, (b) 8 sites, 11 hidden neurons and 4096 Fock states, and (c) 5 sites, 20 hidden neurons, and 2048 Fock states. Dashed lines are the MI-SF boundaries as in fig. 3.

III.4 Tomography

In all the studied scenarios, there are problems for learning the ground state at the MI-SF boundaries, as well as at the mini-plateaus boundaries within the SF phase. In order to understand the differences between the learned ground state and the one obtained through exact diagonalization, we performed a study of the composition of the ground states for 5 sites and 8 hidden neurons. For that reason, we examined the probability amplitudes of the ground state (both RBM learned and exact) at 11 different points in the tt-μ\mu space, fixing t=0.1t=0.1, and with μ\mu at the middle and border of each plateau in the MI and SF phases. We also examined the RBM and exact ground states for very small tt at the middle of the first Mott lobe, as indicated by the black dot in fig. 9, where the ground state for the RBM and for exact diagonalization was |1,1,1,1,1\ket{1,1,1,1,1} with an associated probability amplitude of 99.99%, as expected.

Refer to caption
Figure 9: (Color online) Quantum tomography of the probability amplitude of Fock state manifold rungs for the ground state found through exact optimization and through VMC with an RBM wavefunction ansatz. In the lower part of the figure, a 3D plot of the order parameter for the exact ground state is shown for the case of 5 sites. Each bar color represents a manifold rung depicted through its canonical Fock state as the lexicographically smallest one (e.g. the rung {|1,1,1,1,2,|1,1,1,2,1,,|2,1,1,1,1}\{\ket{1,1,1,1,2},\ket{1,1,1,2,1},\ldots,\ket{2,1,1,1,1}\} is represented by |1,1,1,1,2\ket{1,1,1,1,2}). A black dot in the middle of the first Mott lobe indicates that a tomography was made there as well.

Note that the BH chain is invariant (up to a phase) under displacements and inversions, i.e. there are displacement and inversion operators that act as follows on Fock states: T^|n1,,nN1,nN=eiϕ|nN,n1,,nN1\hat{T}\ket{n_{1},\ldots,n_{N-1},n_{N}}=e^{i\phi}\ket{n_{N},n_{1},\ldots,n_{N-1}} for displacement, and I^|n1,n2,,nN1,nN=eiφ|nN,nN1,,n2,n1\hat{I}\ket{n_{1},n_{2},\ldots,n_{N-1},n_{N}}=e^{i\varphi}\ket{n_{N},n_{N-1},\ldots,n_{2},n_{1}} for inversion. Therefore, in order to perform a tomography, we must take into account that all Fock states that belong to the same rung defined by displacement and inversion operations are equivalent. Now, each Fock state can be brought to a canonical Fock state through a consecutive application of displacement and inversion operators onto the original Fock state. This canonical Fock state is selected as the lexicographically smallest one, after every possible application of displacement and inversion operators, as in (Choo et al., 2018). In fig. 9, we show the probability amplitude distribution of the Fock state manifold rungs for the exact and the RBM ground states. More explicitly, the exact ground state is

|Ψ=𝒏Ψ(𝒏)|𝒏=i𝒏RiΨ(𝒏)|𝒏,\displaystyle\ket{\Psi}=\sum_{{\bf\it n}}\Psi({\bf\it n})\ket{{\bf\it n}}=\sum_{i}\sum_{{\bf\it n}\in R_{i}}\Psi({\bf\it n})\ket{{\bf\it n}}, (7)

where ii indexes the rungs, and RiR_{i} is the ii-th rung. The probability amplitude corresponding to a rung is, therefore, the sum of the probability amplitudes of all of its Fock states, and the bars from fig. 9 show those rungs’ probability amplitudes. Reading the plot from right to left, i.e. starting with the smallest value of μ\mu, we see that within the first Mott lobe, both the RBM and the exact ground states show very similar distributions over three rungs of the one-filling manifold: |1,1,1,1,1\ket{1,1,1,1,1}, |0,1,1,1,2\ket{0,1,1,1,2} and |0,1,1,2,1\ket{0,1,1,2,1}. Increasing μ\mu up to the first MI-SF boundary (slightly within the SF phase), we see that the probability amplitude distribution starts to differ between the exact and the RBM ground states, and the RBM struggles to identify if the ground state is now in an excitation manifold above the one-filling manifold, represented by the rung |1,1,1,1,2\ket{1,1,1,1,2}, or in the MI phase which is represented by the rung |1,1,1,1,1\ket{1,1,1,1,1}. Increasing again μ\mu in order to be at the middle of the first plateau within the SF phase shows that both the RBM and the exact ground states have similar probability amplitude distributions, even though the RBM assigns a probability to other rungs (not shown in the plot because their contribution is less than 0.01). It is clear that this first plateau is mostly represented by the rung |1,1,1,1,2\ket{1,1,1,1,2} which is in the 6-th excitation manifold. If we continue to increase μ\mu we see that near the boundaries between the MI-SF phases and between the plateaus within the SF phase, the RBM and the exact ground states exhibit differences in the probability amplitude distributions. On the contrary, in the middle of those plateaus, the probability amplitude distributions from both RBM and exact ground states are very similar (which is also seen in fig. 5(a)). Moreover, each time the tomography moves onto a new plateau (with a higher value of μ\mu), the excitation manifold increases by one, up to the 10-th excitation manifold, which corresponds to the two-filling manifold at the second Mott lobe, characterized by the rung |2,2,2,2,2\ket{2,2,2,2,2} as well as the rungs |1,2,2,2,3\ket{1,2,2,2,3} and |1,2,3,1,3\ket{1,2,3,1,3}.

IV Conclusions

In this work, we systematically tested the capabilities of VMC with a trial ground wavefunction given by an RBM on the one-dimensional Bose-Hubbard model. The motivation for the technique comes from the possibility of incorporating it into the toolbox of quantum physics to tackle theoretical problems that are difficult to study numerically due to the intractably large Hilbert spaces. Thus, it is first needed to intensively test the technique to reproduce known results, and it is also needed to theoretically explain why the technique works (this is a challenging open question which involves the question of why neural networks work well). Only if the community identifies the strengths and weaknesses of the technique is it possible to use it to explore problems of interest that involve vast Hilbert spaces. Regarding the model under study, we repeatedly found differences between the exact ground state found through the exact diagonalization of the BH Hamiltonian and the learned ground state. We did so in one-dimensional chains of 5 and 8 sites. In order to better learn the ground state, the variational trial wavefunction was enriched in the case of 5 sites by increasing the number of hidden units in the RBM’s hidden layer. Although results improved near the MI-SF boundary, other regions of the BH parameter space suffered from badly learned ground states. For that reason, we proposed a sampling technique that cleans the ground state, getting rid of Fock states not relevant to the actual ground state.

In most of the BH model parameter space, it was found that the VMC-RBM method yielded good results in computing the energy (with low errors and clear signs of convergence), and also was able to compute other observables which explicitly involve quantities related to quantum correlations, such as the linear entropy of one of the chain’s sites. Moreover, we reconstructed with excellent accuracy the phase diagram of the BH model using the local occupation variance as an order parameter for most of the parameter space. Furthermore, we also carried out quantum state tomographies to understand the composition of the ground states at the interfaces between the MI-SF phases and also within the excitation manifold plateaus formed for finite chains in the SF phase, which revealed significant differences between the Fock states distributions of the exact ground state and the learned ground state.

Accordingly, we must raise the attention that the VMC technique is consistently challenged near the quantum phase transition of the BH model. We do not doubt that the high-expressivity of neural quantum states is able to improve those results under more extensive sampling (or better sampling schemes), or more complex variational trial wavefunctions apart from RBMs; however, we report that some questions have to be answered with precision before confidently using VMC as a method to accurately explore the physics of a many-body problem without the support of any other numerical method. In the first place, we observed that not every point of the phase diagram requires the same amount of computational work to learn the ground state. Most notably, the ground state in the MI phase is mostly explained by only one Fock state, whereas the number rapidly grows for ground states in the SF phase. Therefore, automated ways of stopping sampling within the sampling steps should be taken into consideration. In particular, when using the Metropolis-Hastings algorithm, we found that if a Markov chain of a certain length was formed with no new sampled states, we could stop sampling, yielding a high-quality ground state. Secondly, the VMC technique showed badly learned ground states along arcs formed within the Mott lobes. Why these form remains unanswered, as they are not related to the structure of the MI-SF boundaries. However, after we clean the learned ground state, these arcs almost completely disappear. Thus, it might be the case that they are related to noisy probability from Fock states that are not relevant to the ground state. Finally, concerning the first question, the number of optimization steps required to learn the ground state has to be better understood, as it is only clear when the energy converges, but not the state 333In this regard, NetKet team has included the Gelman-Rubin statistics (Vats and Knudson, 2018).

Additionally, an interesting discussion arises when paying close attention to the sampling technique of the VMC. Netket’s Local Metropolis sampler has low acceptance probability near the MI-SF transition for low values of tt and finite small chains such as the ones considered in our work, which complicates learning the state. However, the difference between the exact state and the learned state is constant throughout manifold excitation boundaries regardless of the tt value, and also regardless of being at the MI-SF transition, or within the SF transition.

Acknowledgements.
We want to thank Andrés Urquijo and J. P. Restrepo-Cuartas for useful discussions at the early stage of this study.

References