This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Neural Canonical Transformations for Quantum Anharmonic Solids of Lithium

Qi Zhang Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China    Xiaoyang Wang Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Fenghao East Road 2, 100094 Beijing, China    Rong Shi CAS Key Laboratory of Quantum Information, University of Science and Technology of China, Hefei 230026, Anhui, China Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China    Xinguo Ren [email protected] Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China    Han Wang [email protected] Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Fenghao East Road 2, 100094 Beijing, China HEDPS, CAPT, College of Engineering, Peking University, 100871 Beijing, China    Lei Wang [email protected] Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
Abstract

Lithium is a typical quantum solid, characterized by cubic structures at ambient pressure. As the pressure increases, it forms more complex structures and undergoes a metal-to-semiconductor transformation, complicating theoretical and experimental analyses. We employ the neural canonical transformation approach, an ab initio variational method based on probabilistic generative models, to investigate the quantum anharmonic effects in lithium solids at finite temperatures. This approach combines a normalizing flow for phonon excited-state wave functions with a probabilistic model for the occupation of energy levels, optimized jointly to minimize the free energy. Our results indicate that quantum anharmonicity lowers the bcc-fcc transition temperature compared to classical molecular dynamics predictions. At high pressures, the predicted fractional coordinates of lithium atoms in the cI16 structure show good quantitative agreement with experimental observations. Finally, contrary to previous beliefs, we find that the poor metallic oC88 structure is stabilized by the potential energy surface obtained via high-accuracy electronic structure calculations, rather than thermal or quantum effects.

Introduction.– Accurate prediction of crystal structures has long been a central focus in materials science. At low temperatures, a deep understanding of the physical properties of crystals composed of light atoms typically requires proper treatment of nuclear quantum effects with anharmonicity Cazorla and Boronat (2017); Monserrat et al. (2013); Kapil et al. (2019); Flores-Livas et al. (2020); Monacelli et al. (2021a). These effects can play a crucial role in determining the crystal structure, as seen in solid hydrogen Azadi et al. (2014); Borinaga et al. (2016); Monacelli et al. (2021b, 2023), helium Pederiva and Chester (1998); Vitali et al. (2010); Monserrat et al. (2014), and some metal hydrides Errea et al. (2013, 2014). In this study, we explore one of the most notable examples: lithium, the lightest alkali metal, where the quantum effects of nuclei are pronounced in a wide range of pressures and temperatures Guillaume et al. (2011). Although lithium behaves as a nearly free-electron metal at low pressure and adopts simple, high-symmetry cubic structures, the free energy difference between its bcc (body-centered cubic) and fcc (face-centered cubic) phases is less than 1meV/atom1~{}\text{meV/atom} Hutcheon and Needs (2019); Jerabek et al. (2022); Riffe and Christensen (2024); Phuthi et al. (2024), making precise calculations challenging. Additionally, lithium exhibits several metastable structures that further complicate experimental measurements Ackland et al. (2017). As the pressure increases, lithium exhibits complex physical behaviors, such as anomalous melting curves Guillaume et al. (2011); Schaeffer et al. (2012); Frost et al. (2019), and intriguing phase transitions from metal to semiconductor and back Matsuoka and Shimizu (2009); Lv et al. (2011); Marqués et al. (2011). Moreover, some high-pressure phases consist of large unit cells with tens or even hundreds of atoms Marqués et al. (2011); Gorelli et al. (2012); Wang et al. (2023), posing substantial challenges for both theoretical and experimental studies.

Numerical approaches to studying quantum crystals at finite temperatures include the well-established path integral molecular dynamics Martyna et al. (1999) and path integral Monte Carlo Barker (1979). In recent years, inspired by the successful application of vibrational self-consistent field theory in molecular studies Bowman (1978, 1986); Gerber and Ratner (1979), efforts have been made to extend it to study crystals Monserrat et al. (2013, 2014); Hutcheon and Needs (2019); Kapil et al. (2019). However, it relies on a Taylor expansion of the Born-Oppenheimer energy surface (BOES), and the wave function is a simple Hartree product. The stochastic self-consistent harmonic approximation (SSCHA) Errea et al. (2013, 2014); Monacelli et al. (2021a) provides an alternative by accounting for both ionic quantum and anharmonic effects without assumptions on the specific function form of the BOES. Nevertheless, it still relies on the Gaussian variational density matrices to define the quantum probability distribution. Recent developments have extended SSCHA to non-Gaussian density matrices, yet the entropy is still restricted to Gaussian approximations Siciliano et al. (2024); Monacelli et al. (2024).

In this work, we utilize the recently developed neural canonical transformations (NCT) approach Xie et al. (2022, 2023); Zhang et al. (2024), which is an ab initio variational density matrix method based on deep generative models, to study quantum lattice dynamics of lithium. NCT constructs orthogonal variational wave functions to describe phonons through a normalizing flow model Dinh et al. (2014, 2016); Wang (2018); Papamakarios (2019); Papamakarios et al. (2021); Xie et al. (2022, 2023); Saleh et al. (2023); Zhang et al. (2024). Additionally, we create a probabilistic model to describe the classical energy occupation probabilities for these phonons, allowing for accurate entropy calculations. For electronic calculations, we employ the deep potential model Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023), a machine learning BOES, offering significant computational efficiency improvements over density functional theory (DFT) calculations. NCT’s key advantage is its ability to integrate quantum and anharmonic effects of nuclei into the wave functions, which facilitates the determination of the phonon spectrum. Moreover, the independently optimized phonon energy occupation probabilities enable the computation of anharmonic entropy. The NCT codes for lithium are open-sourced and publicly available myc .

The vibrational Hamiltonian of quantum solids.– Due to the substantial mass difference between electrons and nuclei, typically spanning several orders of magnitude, the Born-Oppenheimer approximation can be applied to decouple their motions and treat them independently. Under this approximation, the vibrational Hamiltonian is expressed as Hvib=i12Mi2+Vel(𝑹),H_{\text{vib}}=-\sum_{i}\frac{1}{2M}\nabla_{i}^{2}+V_{\text{el}}(\bm{R}), where the mass of a lithium atom is taken as M=6.941uM=6.941~{}\text{u}. The term Vel(𝑹)V_{\text{el}}(\bm{R}) is the BOES, derived from electronic structure calculations at nuclear positions 𝑹\bm{R}. In this work, to ensure both accuracy and computational efficiency, we use the deep potential model to fit the BOES Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023); sup , which is derived from DFT calculations using the Perdew-Burke-Ernzerhof (PBE) functional.

The dynamical matrix can be derived from the Hessian of the BOES at the equilibrium position 𝑹𝟎\bm{R_{0}} Martin (2020); Baroni et al. (2001); Souvatzis et al. (2008, 2009); Flores-Livas et al. (2020): C(iα),(jβ)=1M2Veluiαujβ,C_{(i\alpha),(j\beta)}=\frac{1}{M}\frac{\mathrm{\partial}^{2}V_{\text{el}}}{\mathrm{\partial}u_{i\alpha}\mathrm{\partial}u_{j\beta}}, where i,ji,j index the nuclei, α,β\alpha,\beta represent the Cartesian components, and the displacement coordinates are defined as 𝒖=𝑹𝑹𝟎\bm{u}=\bm{R}-\bm{R_{0}}. Diagonalizing the matrix in a supercell containing NN atoms yields D=3N3D=3N-3 non-zero eigenvalues, which correspond to the number of phonon modes. The eigenvalues are related to the squares of the phonon frequencies, ωk2\omega_{k}^{2}   (k=1,2,,Dk=1,2,\ldots,D), and the associated eigenvectors define the unitary transformation from displacement coordinates 𝒖\bm{u} to phonon coordinates 𝒒\bm{q}. Consequently, the Hamiltonian can be expressed in phonon coordinates:

Hvib=12k=1D(2qk2+ωk2qk2)+Vanh(𝒒),H_{\text{vib}}=\frac{1}{2}\sum_{k=1}^{D}\left(-\frac{\partial^{2}}{\partial q_{k}^{2}}+\omega_{k}^{2}q_{k}^{2}\right)+V_{\text{anh}}(\bm{q}), (1)

where the term VanhV_{\text{anh}} represents the anharmonic contributions, as detailed in sup . In this representation, the separation of high and low-frequency modes greatly enhances the efficiency in the following calculations.

Neural canonical transformation for variational density matrix.– The solution for a many-body system in the canonical ensemble can be obtained by minimizing the Helmholtz free energy for a variational density matrix:

F=kBTTr(ρlnρ)+Tr(ρHvib),F=k_{B}T~{}\mathrm{Tr}(\rho\ln\rho)+\mathrm{Tr}(\rho H_{\text{vib}}), (2)

where kBk_{B} is the Boltzmann constant and TT is the temperature. Assuming that the phonons occupy the states |Ψ𝒏|\Psi_{\bm{n}}\rangle with probability p𝒏p_{\bm{n}}, the variational density matrix can be represented in terms of these quantum states as:

ρ=𝒏p𝒏|Ψ𝒏Ψ𝒏|,\rho=\sum_{\bm{n}}p_{\bm{n}}~{}|\Psi_{\bm{n}}\rangle\langle\Psi_{\bm{n}}|, (3)

where 𝒏=(n1,n2,,nD)\bm{n}=(n_{1},n_{2},\ldots,n_{D}) indexes the energy levels of the phonons. An unbiased estimate of the anharmonic free energy for the variational density matrix Eq. 3 can be written as nested thermal and quantum expectations:

F=𝔼𝒏p𝒏[kBTlnp𝒏+𝔼𝒒|Ψ𝒏(𝒒)|2[HvibΨ𝒏(𝒒)Ψ𝒏(𝒒)]],F=\underset{\bm{n}\sim p_{\bm{n}}}{\mathbb{E}}\left[k_{B}T\ln p_{\bm{n}}+\underset{\bm{q}\sim|\Psi_{\bm{n}}(\bm{q})|^{2}}{\mathbb{E}}\left[\frac{H_{\text{vib}}\Psi_{\bm{n}}(\bm{q})}{\Psi_{\bm{n}}(\bm{q})}\right]\right], (4)

where Ψ𝒏(𝒒)=𝒒|Ψ𝒏\Psi_{\bm{n}}(\bm{q})=\langle\bm{q}|\Psi_{\bm{n}}\rangle is the wave function in phonon coordinates. The symbol 𝔼{\mathbb{E}} is the statistical expectation, which can be estimated through sampling Zhang et al. (2024); Becca and Sorella (2017). In this Letter, the variational parameters within the energy occupation probabilities and wave functions are denoted as 𝝁\bm{\mu} and 𝜽\bm{\theta}, respectively, i.e., p𝒏=p𝒏(𝝁)p_{\bm{n}}=p_{\bm{n}}({\bm{\mu}}), Ψ𝒏(𝒒)=Ψ𝒏(𝜽,𝒒)\Psi_{\bm{n}}(\bm{q})=\Psi_{\bm{n}}(\bm{\theta},\bm{q}). These parameters can be optimized via gradient descent Kingma and Ba (2014), with FF as the loss function. The gradients with respect to the probabilities 𝝁F\nabla_{\bm{\mu}}F and the wave function 𝜽F\nabla_{\bm{\theta}}F sup can be efficiently computed using automatic differentiation Bradbury et al. (2018).

In a supercell with DD vibrational modes, setting a cutoff of KK energy levels per phonon (i.e., nk=1,2,,Kn_{k}=1,2,\ldots,K) results in an exponentially large state space of KDK^{D}. For supercells containing hundreds of atoms, directly representing the energy occupation probabilities p𝒏p_{\bm{n}} becomes impractical in computations. In the study of lithium, we assume that the probability distributions take a product form Martyn and Swingle (2019): p𝒏=k=1Dp(nk)p_{\bm{n}}=\prod_{k=1}^{D}p(n_{k}), where p(nk)p(n_{k}) represents the probability of the kk-th phonon occupying state nkn_{k}, and they are governed by learnable parameters 𝝁\bm{\mu}. We have checked that an even more powerful variational autoregressive network Wu et al. (2019); Liu et al. (2021); Xie et al. (2023) does not improve results, likely due to weak coupling between phonon modes. The entropy can be evaluated as the expectation of the probabilities:

S=𝔼𝒏p𝒏[kBlnp𝒏].S=\underset{\bm{n}\sim p_{\bm{n}}}{\mathbb{E}}[-k_{B}\ln p_{\bm{n}}]. (5)

We note the nonlinear SSCHA Siciliano et al. (2024); Monacelli et al. (2024) corresponds to even further simplification of p𝒏p_{\bm{n}}, which assumes that the entropy is given by a set of independent harmonic oscillators.

To construct variational wave functions, we apply a unitary transformation to a set of orthogonal basis states Xie et al. (2022, 2023); Zhang et al. (2024): |Ψ𝒏=U𝜽|Φ𝒏|\Psi_{\bm{n}}\rangle=U_{\bm{\theta}}~{}|\Phi_{\bm{n}}\rangle, where the basis states |Φ𝒏|\Phi_{\bm{n}}\rangle are chosen as the wave functions of a DD-dimensional harmonic oscillator with frequencies ωk\omega_{k}. We implement the unitary transformation U𝜽U_{\bm{\theta}} using a normalizing flow Dinh et al. (2014, 2016); Wang (2018); Papamakarios (2019); Papamakarios et al. (2021); Xie et al. (2022, 2023); Saleh et al. (2023); Zhang et al. (2024), which establishes a learnable bijection between the phonon coordinates 𝒒\bm{q} and a set of quasi-phonon coordinates 𝝃\bm{\xi}. The bijection is represented as a smooth, reversible function 𝝃=f𝜽(𝒒)\bm{\xi}=f_{\bm{\theta}}(\bm{q}), where f𝜽f_{\bm{\theta}} consists of neural networks with learnable parameters 𝜽\bm{\theta}, specifically, a real-valued non-volume preserving network Dinh et al. (2016). Accordingly, the orthogonal variational wave functions of all energy levels can be formulated as sup ; Zhang et al. (2024):

Ψ𝒏(𝒒)=Φ𝒏(f𝜽(𝒒))|det(f𝜽(𝒒)𝒒)|1/2,\Psi_{\bm{n}}(\bm{q})=\Phi_{\bm{n}}\left(f_{\bm{\theta}}(\bm{q})\right)\left|\mathrm{det}\left(\frac{\partial f_{\bm{\theta}}(\bm{q})}{\partial\bm{q}}\right)\right|^{1/2}, (6)

where Φ𝒏(𝝃)=𝝃|Φ𝒏\Phi_{\bm{n}}(\bm{\xi})=\langle\bm{\xi}|\Phi_{\bm{n}}\rangle are basis states. Notably, in the study of lithium, the computation involves approximately ten million orthogonal states for each training. The Jacobian determinant in Eq. 6 captures phonon interactions and anharmonic effects, enabling a more flexible and accurate representation. This approach remains robust when imaginary phonons appear in strong anharmonicity systems (e.g., saddle points of BOES). In such cases, we can choose the corresponding basis states with real-valued frequencies, and the flow model will automatically optimize to find the most suitable wave functions. A detailed derivation of NCT can be found in Supplemental Material sup and our previous work Zhang et al. (2024).

We can extend NCT naturally to the isothermal-isobaric ensemble, where the goal is to minimize the Gibbs free energy at a target pressure PP^{*}, defined as:

G=F+PΩ,G=F+P^{*}\Omega, (7)

where Ω\Omega is the system volume. From the relation dG=dF+ΩσαβdεαβdG=dF+\Omega\sum\sigma_{\alpha\beta}~{}d\varepsilon_{\alpha\beta}, once the parameters 𝝁\bm{\mu} and 𝜽\bm{\theta} have converged under constant volume optimization (i.e., when dF=0dF=0), the gradient of the Gibbs free energy to the strain 𝜺\bm{\varepsilon} depends only on the stress tensor 𝝈\bm{\sigma}. The stress tensor and pressure can be calculated using the virial theorem Nielsen and Martin (1985); Martin (2020); sup . Then, we can optimize the lattice constants 𝒂\bm{a} through the strain tensor εαβ=Ω(σαβPδαβ)\varepsilon_{\alpha\beta}=\Omega(\sigma_{\alpha\beta}-P^{*}\delta_{\alpha\beta}), which is similar to the structure relaxation in other methods Monserrat et al. (2013); Monacelli et al. (2021a).

Refer to caption
Figure 1: Numerical results for fcc and bcc at a fixed volume of Ω=19.2Å3/atom\Omega=19.2~{}\AA^{3}/\text{atom} and temperature T=300KT=300~{}\text{K}. (a) Training curves of the Helmholtz free energy F(𝝁,𝜽)F(\bm{\mu},\bm{\theta}) (Eq. 4). The legend “opt 𝝁\bm{\mu} only” indicates that only the energy occupation probabilities p𝒏p_{\bm{n}} are optimized, and “opt 𝝁\bm{\mu} & 𝜽\bm{\theta}” means that both p𝒏p_{\bm{n}} and wave functions Ψ𝒏\Psi_{\bm{n}} are optimized. (b) Radial distribution functions of lithium atoms. (c) Phonon density of states per atom. The harmonic (har) frequencies ωk\omega_{k} are calculated from the dynamical matrix, and the anharmonic (anh) frequencies wkw_{k} are taken from the single-phonon excitations. The zero-point energies (ZPE) are defined as EZPE,har=k=1Dωk/2NE_{\text{ZPE},\text{har}}=\sum_{k=1}^{D}\omega_{k}/{2N} and EZPE,anh=k=1Dwk/2NE_{\text{ZPE},\text{anh}}=\sum_{k=1}^{D}w_{k}/{2N}.

Anharmonic and nuclear quantum effects on stability.– At ambient temperature and pressure, lithium adopts a simple bcc structure. As the temperature decreases, experiments have demonstrated that the true ground state of lithium is fcc Ackland et al. (2017). Some calculations have revealed that the free energies of these structures are extremely close Hutcheon and Needs (2019); Jerabek et al. (2022); Riffe and Christensen (2024); Phuthi et al. (2024), highlighting the necessity of fully accounting for quantum and anharmonic effects. To investigate the influence of anharmonicity, we first conducted NCT calculations for bcc and fcc using supercells with 250250 and 256256 atoms, respectively, at a fixed volume of 19.2Å3/atom19.2~{}\AA^{3}/\text{atom} and temperature 300K300~{}\text{K}. As a comparative study, we set f𝜽f_{\bm{\theta}} in Eq. 6 to an identity transformation, meaning the phonon wave functions are harmonic oscillators. In this case, only the phonon occupation probabilities p𝒏p_{\bm{n}} in Eq. 4 were optimized.

At a lower temperature of 100K100~{}\text{K}, the free energies of fcc are lower than that of bcc, as expected. However, as the temperature increases to 300K300~{}\text{K}, the impact of anharmonicity becomes evident, as shown in FIG. 1 (a). In fcc, the free energy difference between the two approaches remains small, about 0.11meV/atom0.11~{}\text{meV/atom}. In contrast, the anharmonic effects are much stronger in bcc, and the free energy difference expands to 2.67meV/atom2.67~{}\text{meV/atom}. It is also observed that when we only optimized p𝒏p_{\bm{n}}, the free energy of fcc is lower than that of bcc. However, when the optimization of wave functions is included, i.e., when anharmonic effects are considered, the bcc structure becomes more stable. This phenomenon suggests that bcc exhibits stronger anharmonicity than fcc, underscoring the critical role of anharmonicity in determining the stability.

The findings are further supported by the radial distribution functions (RDF) of atoms, as shown in FIG. 1 (b). The RDF for fcc exhibits only slight influence from anharmonic effects. In contrast, the RDF for bcc exhibits a smoother curve when anharmonic effects are considered, indicating a reduction in atomic localization. This behavior suggests that anharmonicity softens the system, resulting in a lower zero-point energy (ZPE) than the harmonic approximation. Further insights are provided by the phonon density of states (DOS) depicted in FIG. 1 (c), where the ZPE is determined as half the sum of all phonon frequencies per atom, as detailed in sup . Although the bcc structure is more stable at high temperatures, the numerical results reveal that the ZPE of fcc remains lower than that of bcc. This phenomenon can be explained by the differences in coordination numbers: the coordination number of bcc is 88, while that of fcc is 1212. Hence, atoms in bcc interact less strongly with their neighbors, leading to higher quantum fluctuations and greater anharmonicity.

Refer to caption
Figure 2: (a) Gibbs free energy (Eq. 7) difference between fcc and bcc at P=1GPaP=1~{}\text{GPa}, using fcc as the reference. (b) Anharmonic effects on ionic entropy. The anharmonic (anh) entropy is directly obtained from the expectation of the probability distribution (Eq. 5), while the harmonic (har) entropy is calculated from the harmonic frequencies using S=k[ωkkBT1eωk/kBT1ln(1eωk/kBT)]S=\sum_{k}\left[\frac{{\omega_{k}}}{k_{B}T}\frac{1}{\text{e}^{{\omega_{k}}/k_{B}T}-1}-\ln(1-\text{e}^{-{\omega_{k}}/k_{B}T})\right]. The xx-axis of the inset represents the temperature, and the yy-axis corresponds to kBT(SanhShar)-k_{B}T(S_{\text{anh}}-S_{\text{har}}) in units of meV/atom.

To gain deeper insight into the influence of nuclear quantum effects, we performed calculations on the bcc and fcc structures under constant pressure. The Gibbs free energies of both structures are extremely close Hutcheon and Needs (2019); Jerabek et al. (2022); Riffe and Christensen (2024); Phuthi et al. (2024). An error of just 1meV1~{}\text{meV} could lead to a shift of more than 100K100~{}\text{K} in the transition temperature Riffe and Christensen (2024). FIG. 2 (a) shows the Gibbs free energy difference between these structures at 1GPa1~{}\text{GPa}, with the fcc structure used as the reference. The two curves intersect at 142K142~{}\text{K}, indicating a phase transition at this temperature. We also calculated the transition temperature through classical MD simulations with thermodynamic integration on the same BOES, obtaining a value of 185K185~{}\text{K}. The main difference between these methods is that NCT accounts for the quantum effects, while MD simulations do not. Similar results are also observed at 0 and 2GPa2~{}\text{GPa}, as detailed in Supplemental Material sup , where NCT consistently predicts lower transition temperatures (8484 and 196K196~{}\text{K}) compared to MD (144144 and 218K218~{}\text{K}).

We also calculated the ionic entropy of both structures, as shown in FIG. 2 (b). Notably, the anharmonic entropy obtained from NCT is derived directly from the probabilities of energy occupations in Eq. 5, beyond the harmonic oscillator assumption. Under the harmonic oscillator assumption, the entropy of fcc is higher than that of the bcc. However, when anharmonicity is considered, the relationship is reversed. The higher entropy of the bcc structure is a key factor contributing to its stability in finite temperatures Souvatzis et al. (2008, 2009); Söderlind et al. (2012); Hutcheon and Needs (2019); Phuthi et al. (2024). Furthermore, we quantified the free energy difference arising from the anharmonic effects of entropy as kBT(SanhShar)-k_{B}T~{}(S_{\text{anh}}-S_{\text{har}}) (inset of FIG. 2 (b)), estimating it to be on the order of several meV. This underscores the critical importance of accurately incorporating anharmonic effect in the calculations.

Refer to caption
Figure 3: (a) Fractional coordinates of the Wyckoff position 16c16c in the cI16 (I-43d) structure. Experimental values are taken from Ref. Hanfland et al. (2000). (b) Phonon density of states (DOS). (c) Gibbs free energy difference between cI16 and oC88 at T=100KT=100~{}\text{K}, with cI16 taken as the reference. The thick orange line indicates the single-point correction for cI16 and oC88 at 70GPa70~{}\text{GPa}, based on higher-accuracy electronic structure calculations using the HSE functional, as compared to the PBE. (d) Electronic DOS for oC88 with PBE and HSE functionals.

High-pressure structural stability of lithium.– Under high pressure, lithium exhibits more complex structures and larger unit cell sizes. We first applied the NCT method to calculate the cI16 (cubic I-centered, I-43d) structure at 100K100~{}\text{K}, using a supercell of 432432 atoms under various pressures. The NCT method optimizes the equilibrium positions of atoms through coordinate transformations. Subsequently, we quenched the sampled structures to their ground state with the BOES and analyzed the fractional coordinates (x,x,x)(x,x,x) of Wyckoff position 16c16c as a function of atomic volume. As depicted in FIG. 3 (a), our results are consistent with the experiment reported in Ref. Hanfland et al. (2000), demonstrating the reliability of NCT in structure optimizations.

As illustrated in FIG. 3 (b), the phonon DOS and ZPE of the cI16 and oC88 (orthorhombic C-face centered, C2mb) structures are calculated at 70GPa70~{}\text{GPa}. Under the harmonic approximation, the ZPE of both structures are found to be comparable, consistent with the results obtained using the finite displacement and density functional perturbation theory methods Wang et al. (2023). The anharmonic effects soften the phonon spectrum of cI16, reducing the ZPE by 4.92meV4.92~{}\text{meV} and further enhancing its stability. In contrast, the ZPE of oC88 decreases by only 1.67meV1.67~{}\text{meV} under anharmonic effects, indicating a smaller impact compared to cI16. This result suggests that when the anharmonic effect is considered, the stability of oC88 decreases, contrary to the expectations of previous studies Marqués et al. (2011); Lv et al. (2011); Wang et al. (2023).

Additionally, we calculated the Gibbs free energies at 100K100~{}\text{K}, as shown in FIG. 3 (c). The free energy of oC88 remains consistently higher than that of cI16 across all pressure conditions, which contradicts previous experiments. It has been reported that the resistivity increases sharply by more than four orders of magnitude after the cI16 phase, ultimately transforming into a semiconductor state. Compression experiments in Ref. Guillaume et al. (2011) observed the cI16-oC88 transition, as evidenced by changes in crystal color and diffraction patterns. Raman spectra measurements in Ref. Gorelli et al. (2012) detected signals corresponding to the oC88 phase. Another experiment Frost et al. (2019) also observed a phase transition around 60GPa60~{}\text{GPa} through the changes in diffraction peaks.

After the oC88 structure was experimentally observed Guillaume et al. (2011), theoretical studies attempted to explain its stability. However, Ref. Marqués et al. (2011) concluded from their zero-temperature calculations that oC88 was only the second most stable phase with an enthalpy about 1meV1~{}\text{meV} higher than cI16, attributing this to an insufficient consideration of ZPE and thermal effects. In another work Lv et al. (2011), the authors also failed to identify oC88 as a stable structure. In contrast, Ref. Gorelli et al. (2012) found that oC88 is stable when using the PBE functional with a harmonic ZPE at 200K200~{}\text{K}. However, a recent study Wang et al. (2023) demonstrated that neither harmonic nor anharmonic approximations could reproduce the results of Ref. Gorelli et al. (2012) at various conditions, showing a free energy difference at least 1meV1~{}\text{meV} with oC88 consistently higher than cI16. Surprisingly, as NCT captures nuclear quantum effects and anharmonic behaviors more accurately, the difference increases to 4meV4~{}\text{meV}. Therefore, we strongly suspect that the instability of oC88 arises from the limited accuracy of the DFT (PBE) calculations used in fitting the BOES Wang et al. (2023). It has been observed that DFT often over-stabilizes metallic states relative to non-metallic states Ma et al. (2009, 2007).

To validate our hypothesis, we employed the NCT-optimized structures at 70GPa70~{}\text{GPa} and conducted single-point electronic structure calculations using the high-accuracy Heyd-Scuseria-Ernzerhof (HSE06) functional Martin (2020). The HSE functional incorporates a hybrid exchange-correlation correction, enabling a clearer distinction between metallic and non-metallic states. Notably, HSE calculations are significantly more computationally demanding, requiring approximately two orders of magnitude more computational resources than PBE. Additional details of HSE calculations are provided in sup . Our results reveal that the relative energy of oC88 compared to cI16 decreases by 6.17meV6.17~{}\text{meV} under HSE in comparison to PBE. This reduction is significantly larger than the contributions from ZPE, anharmonic, and finite temperature effects. The HSE correction, depicted as the thick line in FIG. 3 (c), predicts a cI16-oC88 phase transition at approximately 62GPa62~{}\text{GPa} and 100K100~{}\text{K}. This finding is consistent with experimental observations, which report a narrow stability range for the oC88 phase, existing between 6262 and 70GPa70~{}\text{GPa}, flanked by the cI16 and oC40 phases on either side, respectively Guillaume et al. (2011). The electronic DOS of oC88, depicted in FIG. 3 (d), shows that while the HSE correction lowers the potential energy, oC88 still behaves as a poor metal.

Conclusions.– In summary, we developed the NCT method Xie et al. (2022, 2023); Zhang et al. (2024) to study anharmonic quantum solids and applied it to lithium. It enables the calculation of excited-state wave functions of nuclear motions beyond the harmonic approximation, allowing for the extraction of anharmonic phonon spectra. Additionally, the independently optimized phonon occupation probabilities facilitate the computation of anharmonic entropy. The results demonstrate that anharmonic effects play a crucial role in the structural stability. Nuclear quantum effects also introduce significant corrections to the fcc-bcc phase transition temperature. The fractional coordinates of the cI16 structure have been determined and closely align with experimental findings. Moreover, we identified that the failure of previous numerical studies Marqués et al. (2011); Lv et al. (2011); Wang et al. (2023) to observe the stability of oC88 was due to the limitations of the PBE functional in accurately describing poor metallic states. To address this, we applied the HSE functional to refine the results and estimate the stability region of oC88. Looking ahead, both experimental and computational investigations suggest that the emergence of novel high-density lithium solid phases between the cI16 and liquid phases presents a promising avenue for future exploration Frost et al. (2019); Wang et al. (2023). In this regime, high-accuracy DFT calculations and precise treatment of nuclear quantum effects may be required. Overall, the NCT method holds great potential for advancing our understanding and addressing diverse challenges in quantum crystal studies.

Acknowledgements.– We are grateful for the useful discussions with Hao Xie, Jian Lv, Xinyang Dong, Zhendong Cao, Zihang Li, Ruisi Wang, and Peize Lin. This work is supported by the National Natural Science Foundation of China under Grants No. 92270107, No. T2225018, No. 12188101, No. T2121001, No. 12134012, and No. 12374067, and the Strategic Priority Research Program of the Chinese Academy of Sciences under Grants No. XDB0500000 and No. XDB30000000.

References

Supplemental Material: Neural Canonical Transformations for Quantum Anharmonic Solids of Lithium

Appendix SI SI. Neural Canonical Transformation approach

Refer to caption
Figure S1: A sketch of the workflow for the neural canonical transformations.

Neural canonical transformation (NCT) is a variational density matrix method that leverages multiple deep generative models. As shown in FIG. S1, this approach utilizes either a product spectrum ansatz (PSA) Martyn and Swingle (2019) or a variational autoregressive network (VAN) Wu et al. (2019); Liu et al. (2021); Xie et al. (2023) to represent phonon energy occupation probabilities (phonon Boltzmann distribution), a normalizing flow Dinh et al. (2014, 2016); Wang (2018); Papamakarios (2019); Papamakarios et al. (2021); Xie et al. (2022, 2023); Saleh et al. (2023); Zhang et al. (2024) for phonon excited-state wave functions, and a deep potential (DP) model Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023) to describe the Born-Oppenheimer energy surface (BOES). In constant-volume calculations, the goal is to minimize the Helmholtz free energy FF by jointly optimizing the wave functions and probability distributions. For constant-pressure calculations, once FF has converged, the stress tensor 𝝈\bm{\sigma} is measured to adjust the lattice constants 𝒂\bm{a}. After making these adjustments, constant-volume calculations are repeated until the Gibbs free energy GG converges.

SI.1 A. Variational wave functions in phonon coordinates

The displacement coordinates of crystals are defined as 𝒖=𝑹𝑹𝟎\bm{u}=\bm{R}-\bm{R_{0}}, where 𝑹\bm{R} represents the nuclear positions and 𝑹𝟎\bm{R_{0}} denotes the initial equilibrium positions. The BOES is expressed as a function of the coordinates and the lattice constants: Vel(𝒂,𝑹)=Vel(𝒂,𝑹𝟎+𝒖)V_{\text{el}}(\bm{a},\bm{R})=V_{\text{el}}(\bm{a},\bm{R_{0}}+\bm{u}), where we omit the lattice constants 𝒂\bm{a} in VelV_{\text{el}} in the main text for simplicity. The dynamical matrix of the lattice can then be derived from the following expression Martin (2020); Baroni et al. (2001); Souvatzis et al. (2008, 2009); Flores-Livas et al. (2020):

C(iα),(jβ)=1M2Veluiαujβ|𝒖=𝟎,C_{(i\alpha),(j\beta)}=\left.\frac{1}{M}\frac{\mathrm{\partial}^{2}V_{\text{el}}}{\mathrm{\partial}u_{i\alpha}\mathrm{\partial}u_{j\beta}}\right|_{\bm{u}=\bm{0}}, (S1)

where i,ji,j index the nuclei, α,β\alpha,\beta represent the Cartesian components, and MM is the mass of atoms. By diagonalizing the dynamical matrix, the system can be expressed in phonon coordinates. The eigenvalues correspond to the squares of phonon frequencies, ωk2\omega_{k}^{2}, where kk indices the phonon modes. The associated eigenvectors define the unitary transformation 𝑷\bm{P} between displacement and phonon coordinates: 𝒒=M𝒖𝑷\bm{q}={\sqrt{M}}\bm{u}\bm{P}, 𝒖=1M𝑷𝒒\bm{u}=\frac{1}{\sqrt{M}}\bm{P}^{\dagger}\bm{q}. Consequently, the anharmonic terms in the BOES can be expressed as:

Vanh(𝒒)=Vel(𝒂,𝑹𝟎+1M𝑷𝒒)k12ωk2qk2.V_{\text{anh}}(\bm{q})=V_{\text{el}}(\bm{a},~{}\bm{R_{0}}+\frac{1}{\sqrt{M}}\bm{P}^{\dagger}\bm{q})-\sum_{k}\frac{1}{2}\omega_{k}^{2}q_{k}^{2}. (S2)

The vibrational Hamiltonian of the system, including anharmonic terms, can then be expressed in phonon coordinates as:

Hvib=12k=1D(2qk2+ωk2qk2)+Vanh(𝒒),H_{\text{vib}}=\frac{1}{2}\sum_{k=1}^{D}\left(-\frac{\partial^{2}}{\partial q_{k}^{2}}+\omega_{k}^{2}q_{k}^{2}\right)+V_{\text{anh}}(\bm{q}), (S3)

where the atomic mass MM is absorbed in the transformation from displacement coordinates 𝒖\bm{u} to phonon coordinates 𝒒\bm{q}. In the harmonic approximation, we discard the anharmonic terms, and the system’s Hamiltonian is defined in quasi-phonon coordinates:

Hbasis=12k=1D(2ξk2+ωk2ξk2).H_{\text{basis}}=\frac{1}{2}\sum_{k=1}^{D}\left(-\frac{\partial^{2}}{\partial\xi_{k}^{2}}+\omega_{k}^{2}\xi_{k}^{2}\right). (S4)

The eigenvectors of the above Hamiltonian are product states of DD-dimensional quantum harmonic oscillators, which serve as the basis of variational wave functions mentioned in the main text:

Φ𝒏(𝝃)=k=1Dϕnk(ξk),withϕnk(ξk)=e12ωkξk2hnk(ωkξk),\Phi_{\bm{n}}(\bm{\xi})=\prod_{k=1}^{D}\phi_{n_{k}}(\xi_{k}),~{}~{}\text{with}~{}~{}\phi_{n_{k}}(\xi_{k})=e^{-\frac{1}{2}\omega_{k}\xi_{k}^{2}}h_{n_{k}}(-\sqrt{\omega_{k}}\xi_{k}), (S5)

where the normalization factor has been omitted, as only derivatives of the wave function’s logarithm are required for subsequent calculations. nkn_{k} is the energy level of the kk-th quasi-phonon, and hnkh_{n_{k}} is the nkn_{k}-th order Hermite polynomial. The variational wave functions in phonon coordinates 𝒒\bm{q} are expressed as Zhang et al. (2024):

Ψ𝒏(q1,q2,,qD)=ϕn1(ξ1)ϕn2(ξ2)ϕnD(ξD)|det((ξ1,ξ2,,ξD)(q1,q2,,qD))|,\Psi_{\bm{n}}(q_{1},q_{2},\cdots,q_{D})=\phi_{n_{1}}(\xi_{1})\phi_{n_{2}}(\xi_{2})\cdots\phi_{n_{D}}(\xi_{D})\sqrt{\left|\mathrm{det}\left(\frac{\partial(\xi_{1},\xi_{2},\cdots,\xi_{D})}{\partial(q_{1},q_{2},\cdots,q_{D})}\right)\right|}, (S6)

where the orthogonality condition is always satisfied due to the properties of the Jacobian determinant, i.e., Ψ𝒏1(𝒒)Ψ𝒏(𝒒)d𝒒=δ𝒏𝒏\int\Psi^{-1}_{\bm{n}^{\prime}}(\bm{q})\Psi_{\bm{n}}(\bm{q})~{}\mathrm{d}\bm{q}=\delta_{\bm{n}^{\prime}\bm{n}}. Using the normalizing flow, the phonon coordinates 𝒒\bm{q} are connected to the quasi-phonon coordinates 𝝃\bm{\xi} through a coordinate transformation, also known as point transformation Eger and Gross (1963, 1964); DeWitt (1952). In our work, it can be understood as a reversible function 𝝃=f𝜽(𝒒)\bm{\xi}=f_{\bm{\theta}}(\bm{q}), where f𝜽f_{\bm{\theta}} is realized through neural networks Dinh et al. (2016) with learnable parameters 𝜽\bm{\theta}.

SI.2 B. Gradients of the variational free energy

Under constant-volume conditions, the variational Helmholtz free energy is calculated through the expectation values of the phonon energy occupation probabilities p𝒏p_{\bm{n}} and phonon wave functions Ψ𝒏(𝒒)\Psi_{\bm{n}}(\bm{q}):

F=𝔼𝒏p𝒏[kBTlnp𝒏+𝔼𝒒|Ψ𝒏(𝒒)|2[E𝒏vib(𝒒)]],F=\underset{\bm{n}\sim p_{\bm{n}}}{\mathbb{E}}\left[k_{B}T\ln p_{\bm{n}}+\underset{\bm{q}\sim|\Psi_{\bm{n}}(\bm{q})|^{2}}{\mathbb{E}}\left[E_{\bm{n}}^{\text{vib}}(\bm{q})\right]\right], (S7)

where the symbol 𝔼{\mathbb{E}} represents statistical expectation, which can be estimated through sampling, and the number of samples is called the batch size BB Zhang et al. (2024); Becca and Sorella (2017). For instance, consider an observable O(x)O(x), a function of a variable xx that follows a probability distribution π(x)\pi(x). Let bb denotes the index of each sample, where b=1,2,,Bb=1,2,\dots,B, and O(xb)O(x_{b}) represents the value corresponding to the bb-th sample. The unbiased estimation of the expectation can be approximated using these samples:

O(x)π(x)dx=𝔼xπ(x)[O(x)]=1Bb=1BO(xb).\int O(x)~{}\pi(x)~{}\mathrm{d}x=\underset{x\sim\pi(x)}{\mathbb{E}}\left[O(x)\right]=\frac{1}{B}\sum_{b=1}^{B}O(x_{b}). (S8)

Returning to Eq. S7, the term E𝒏vib(𝒒)=Ψ𝒏1(𝒒)HvibΨ𝒏(𝒒)E_{\bm{n}}^{\text{vib}}(\bm{q})=\Psi^{-1}_{\bm{n}}(\bm{q})H_{\text{vib}}\Psi_{\bm{n}}(\bm{q}) represents the local energy of the vibrational Hamiltonian. It is determined by the energy level indices 𝒏\bm{n} and the phonon coordinates 𝒒\bm{q}:

E𝒏vib(𝒒)=12k=1D[2qk2ln|Ψ𝒏(𝒒)|+(qkln|Ψ𝒏(𝒒)|)2]+[12k=1Dωk2qk2+Vanh(𝒒)],E_{\bm{n}}^{\text{vib}}(\bm{q})=-\frac{1}{2}\sum_{k=1}^{D}\left[\frac{\partial^{2}}{\partial q_{k}^{2}}\ln|\Psi_{\bm{n}}(\bm{q})|+\left(\frac{\partial}{\partial q_{k}}\ln|\Psi_{\bm{n}}(\bm{q})|\right)^{2}\right]+\left[\frac{1}{2}\sum_{k=1}^{D}\omega_{k}^{2}q_{k}^{2}+V_{\text{anh}}(\bm{q})\right], (S9)

where the first part is kinetic energy and the second part is potential energy. Since the wave functions are real-valued, we directly take the logarithm of their absolute values. This technique helps prevent numerical issues during computation, as detailed in the Appendix of Ref. Zhang et al. (2024). Moreover, the local energy’s computational bottleneck lies in evaluating the second-order derivatives. To reduce the computational complexity, we adopt Hutchinson’s stochastic trace estimator Hutchinson (1989). This trick improves the computational efficiency by an order of magnitude without compromising accuracy Xie et al. (2023). The free energy FF can be minimized by optimizing the parameter 𝝁\bm{\mu} in the probabilities p𝒏p_{\bm{n}} and the parameter 𝜽\bm{\theta} in the wave functions Ψ𝒏(𝒒)\Psi_{\bm{n}}(\bm{q}). The gradients of FF to these two parameters are given by Xie et al. (2022, 2023); Zhang et al. (2024):

𝝁F\displaystyle\nabla_{\bm{\mu}}F =𝔼𝒏p𝒏[(𝝁lnp𝒏)(kBTlnp𝒏+𝔼𝒒|Ψ𝒏(𝒒)|2[E𝒏vib(𝒒)])],\displaystyle=\ \underset{\bm{n}\sim p_{\bm{n}}}{\mathbb{E}}\left[\left(\nabla_{\bm{\mu}}\ln p_{\bm{n}}\right)\cdot\left(k_{B}T\ln p_{\bm{n}}+\underset{\bm{q}\sim|\Psi_{\bm{n}}(\bm{q})|^{2}}{\mathbb{E}}\left[E_{\bm{n}}^{\text{vib}}(\bm{q})\right]\right)\right], (S10)
𝜽F\displaystyle\nabla_{\bm{\theta}}F =2𝔼𝒏p𝒏[𝔼𝒒|Ψ𝒏(𝒒)|2[(𝜽ln|Ψ𝒏(𝒒)|)(E𝒏vib(𝒒))]].\displaystyle=2\underset{\bm{n}\sim p_{\bm{n}}}{\mathbb{E}}\left[\underset{\bm{q}\sim|\Psi_{\bm{n}}(\bm{q})|^{2}}{\mathbb{E}}\left[\left(\nabla_{\bm{\theta}}\ln|\Psi_{\bm{n}}(\bm{q})|\right)\cdot\left(E_{\bm{n}}^{\text{vib}}(\bm{q})\right)\right]\right].

SI.3 C. Observables and anharmonic zero-point energy

During the training process of NCT, observables such as kinetic energy, potential energy, entropy, and Helmholtz free energy are computed at each iteration. This approach also enables the convenient computation of pressure and stress, which is crucial for structural relaxation in the isothermal-isobaric ensemble. This implementation utilizes automatic differentiation Bradbury et al. (2018); Schoenholz and Cubuk (2020). The stress tensor 𝝈\bm{\sigma}, which accounts for both nuclear quantum and anharmonic effects, can be directly derived from its definition Nielsen and Martin (1985); Martin (2020):

σαβ=1ΩFεαβ|ε=0=1Ωi12miiαiβ12ij(𝒓ij)α(𝒓ij)β|𝒓ij|dVel(|𝒓ij|)d|𝒓ij|,\sigma_{\alpha\beta}=-\frac{1}{\Omega}\left.\frac{\partial F}{\partial\varepsilon_{\alpha\beta}}\right|_{\varepsilon=0}=-\frac{1}{\Omega}\left\langle\sum_{i}\frac{1}{2m_{i}}\nabla_{i\alpha}\nabla_{i\beta}-\frac{1}{2}\sum_{i\neq j}\frac{(\bm{r}_{ij})_{\alpha}(\bm{r}_{ij})_{\beta}}{|\bm{r}_{ij}|}\frac{\mathrm{d}V_{\text{el}}(|\bm{r}_{ij}|)}{\mathrm{d}|\bm{r}_{ij}|}\right\rangle, (S11)

where Ω\Omega is the volume of the system, and the strain ε\varepsilon is defined as a transformation of each atomic coordinates rα=(δαβ+εαβ)rβr_{\alpha}=(\delta_{\alpha\beta}+\varepsilon_{\alpha\beta})r_{\beta}. The term dVel(|𝒓ij|)/d|𝒓ij|{\mathrm{d}V_{\text{el}}(|\bm{r}_{ij}|)}/{\mathrm{d}|\bm{r}_{ij}|} can be interpreted as the force, and it can be conveniently derived from BOES. Then, the pressure is determined from the trace of the stress tensor:

P=13ασαα=13Ω(2Ek12𝒓ijdVel(|𝒓ij|)d|𝒓ij|),P=-\frac{1}{3}\sum_{\alpha}\sigma_{\alpha\alpha}=\frac{1}{3\Omega}\left(2E_{k}-\frac{1}{2}\left\langle\bm{r}_{ij}\frac{\mathrm{d}V_{\text{el}}(|\bm{r}_{ij}|)}{\mathrm{d}|\bm{r}_{ij}|}\right\rangle\right), (S12)

where EkE_{k} is the kinetic energy of the ions, which can be derived from the first part of Eq. S9. In the study of lithium, we observe that the electronic contribution to pressure significantly outweighs the phononic contribution. For example, under 70GPa70~{}\text{GPa}, the phonon contribution to the pressure is less than 2GPa2~{}\text{GPa}, with the remainder arising from electrons.

In a supercell containing NN atoms, which has D=3N3D=3N-3 phonons. The zero-point energy (ZPE) in the harmonic approximation is defined using the frequencies ωk\omega_{k} derived from the dynamical matrix: EZPE,har=k=1Dωk/2NE_{\text{ZPE},\text{har}}=\sum_{k=1}^{D}\omega_{k}/{2N}. The anharmonic phonon frequencies wkw_{k} and anharmonic ZPE in NCT can be defined as follows. The energy expectation for the energy level indexed by 𝒏=(n1,n2,,nD)\bm{n}=(n_{1},n_{2},\ldots,n_{D}) can be computed through sampling Zhang et al. (2024):

E𝒏=Ψ𝒏1(𝒒)HvibΨ𝒏(𝒒)d𝒒=𝔼𝒒|Ψ𝒏(𝒒)|2[E𝒏vib(𝒒)],E_{\bm{n}}=\int\Psi^{-1}_{\bm{n}}(\bm{q})H_{\text{vib}}\Psi_{\bm{n}}(\bm{q})~{}\mathrm{d}\bm{q}=\underset{\bm{q}\sim|\Psi_{\bm{n}}(\bm{q})|^{2}}{\mathbb{E}}\left[E_{\bm{n}}^{\text{vib}}(\bm{q})\right], (S13)

where E𝒏vib(𝒒)E_{\bm{n}}^{\text{vib}}(\bm{q}) is the local energy defined in Eq. (S9). In this context, the first excited single-phonon state of the kk-th phonon is represented as (n1,n2,,nD)=(0,0,,0)(n_{1},n_{2},\ldots,n_{D})=(0,0,\ldots,0), except for nk=1n_{k}=1, and the second excited state corresponds to nk=2n_{k}=2. The anharmonic frequency of each phonon is defined by subtracting the ground-state energy from its single-phonon excitation energy: wk=Enk=1Enk=0,w_{k}=E_{n_{k}=1}-E_{n_{k}=0}, where the reduced Planck constant is set to be =1\hbar=1 for simplicity. The anharmonic ZPE is then computed as the half sum of these frequencies, normalized by the number of atoms:

EZPE,anh=1Nk=1Dwk2.E_{\text{ZPE},\text{anh}}=\frac{1}{N}\sum_{k=1}^{D}\frac{w_{k}}{2}. (S14)

Appendix SII SII. Computational details and Additional results

SII.1 A. Born-Oppenheimer energy surface and computational costs

In this work, the BOES is realized through the deep potential (DP) models Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023), which are trained on datasets generated through density functional theory (DFT) calculations. The DFT calculations were performed using the VASP (Vienna ab initio simulation package) code Kresse and Furthmüller (1996), employing the projector augmented wave (PAW) method with the Perdew-Burke-Ernzerhof (PBE) functional. For high-pressure calculations of cI16 and oC88, we utilized the same DP model (Li-DP-Hyb2) as detailed in the Supplemental Material of Ref. Wang et al. (2023). At low pressures, another DP model (Li-DP-Hyb3) was trained with additional data for bcc and fcc structures.

Table S1: Computational costs for neural canonical transformations.
structure atoms batch size epochs GPU time (hours)
bcc 250 400 15000 2*A100: 18 h or 4*V100: 26 h
fcc 256 400 15000 2*A100: 18 h or 4*V100: 26 h
cI16 432 256 15000 4*A100: 22 h or 8*V100: 29 h
oC88 352 256 67000 2*A800: 120 h or 4*V100: 140 h

In the NCT calculations presented in this work, the probability distribution of phonon energy levels p𝒏p_{\bm{n}} is modeled using the product spectrum ansatz, with energy levels truncated at K=20K=20. The function f𝜽f_{\bm{\theta}} in the variational wave functions is implemented using a real-valued non-volume preserving network Dinh et al. (2016), consisting of 66 coupling layers, each comprising two multilayer perceptrons with 128128 hidden units. For the orthorhombic crystals, oC88, structure relaxation is performed every 40004000 iterations to optimize the lattice constant 𝒂\bm{a}. For bcc and fcc structures, relaxation occurs every 5050 iterations. The NCT method demonstrates exceptional scalability for large-scale parallel computations. All calculations were conducted on multiple NVIDIA A800-80G or V100-32G GPUs (graphics processing unit). TABLE S1 summarizes the computation time for each data point, where a single data point corresponds to a specific combination of structure, pressure, and temperature. Additional details regarding parameter settings can be found in our code repository myc .

SII.2 B. Phase diagrams via neural canonical transformations

Refer to caption
Figure S2: (a) Gibbs free energy difference between the fcc and bcc structures at P=0GPaP=0~{}\text{GPa}, calculated through neural canonical transformations (NCT). (b) At P=2GPaP=2~{}\text{GPa}. (c) Phase diagram for fcc-bcc transitions. The experimental transition points at 0GPa0~{}\text{GPa} are taken from Refs. Barrett (1947); Rao and Rao (1967). The experimental transition line and density functional theory (DFT) results are sourced from Ref. Ackland et al. (2017). Both classical molecular dynamics (MD) and NCT calculations utilize the same deep potential model (Li-DP-Hyb3).

We utilized the NCT method to calculate the Gibbs free energy for fcc and bcc structures, as depicted in FIG. S2. NCT predicts transition temperatures of 8484, 142142, and 196K196~{}\text{K} at 0, 11, and 2GPa2~{}\text{GPa}, which are consistently lower than those obtained from thermodynamic integration through classical molecular dynamics (MD) simulations on the same energy surface. The latter yield transition temperatures of 144144, 185185, and 218K218~{}\text{K}. This discrepancy is primarily attributed to the quantum effects of nuclei. Notably, at 0GPa0~{}\text{GPa}, the MD-predicted phase boundary of 144K144~{}\text{K} aligns closely with the DFT value reported in Ref. Ackland et al. (2017). However, NCT significantly lowers the phase boundary to 84K84~{}\text{K}, which agrees more closely with experimental values of 7777 and 88K88~{}\text{K} in Refs. Barrett (1947); Rao and Rao (1967). Compared to the MD results, NCT provides corrections in the right direction, underscoring the importance of incorporating nuclear effects.

Refer to caption
Figure S3: (a) Kinetic energy of cI16 and oC88 structures at T=100KT=100~{}\text{K}, calculated through neural canonical transformations (NCT). (b) Ironic entropy. (c) Phase diagram. The experimental transition lines are taken from Refs. Guillaume et al. (2011); Frost et al. (2019). The NCT result has been corrected.

In the main text, we discussed calculations for the cI16 and oC88 structures using the NCT method. The results for ionic kinetic energy and entropy are presented in FIGs. S3 (a) and (b), respectively. Within the pressure range of 6060 to 70GPa70~{}\text{GPa}, the cI16 structure consistently exhibits lower kinetic energy and higher entropy. However, this outcome contradicts the hypothesis proposed in Ref. Gorelli et al. (2012), which suggested that oC88 might exhibit lower ZPE and higher entropy than the competing structures. At pressures above 75GPa75~{}\text{GPa}, both cI16 and oC88 become unstable, and a more stable semiconducting oC40 structure emerges. The accuracy of the DP model also declines significantly in this range, making the results for cI16 and oC88 less significant. FIG. S3 (c) shows the high-pressure phase diagram, where the NCT-derived cI16-oC88 phase boundary is corrected using hybrid functionals. The transition point agrees well with experimental observations Guillaume et al. (2011); Frost et al. (2019).

SII.3 C. High-accuracy electronic calculations at high-pressure

Table S2: Potential energy, enthalpy, and Gibbs free energy differences for the cI16 and oC88 structures, calculated via various electronic structure methods. The structures are obtained via NCT at T=100KT=100~{}\text{K} and P=70GPaP=70~{}\text{GPa}. All values in the table are given in meV/atom\text{meV}/\text{atom}, and take cI16 as the reference. (1) Potential energies are calculated at equilibrium positions: V=Vel(𝑹𝟎)V=V_{\text{el}}(\bm{R_{0}}). (2) Enthalpies are determined as H=PΩ+VH=P\Omega+V, where P=70GPaP=70~{}\text{GPa} and Ω\Omega is the volume per atom. (3) Gibbs free energy differences are corrected based on neural canonical transformations (NCT) presented in the main text: ΔG=ΔGNCT+ΔV\Delta G=\Delta G_{\text{NCT}}+\Delta V, where we set ΔGNCT=3.96meV/atom\Delta G_{\text{NCT}}=3.96~{}\text{meV}/\text{atom} for PBE.
Method DP (VASP) ABACUS ABACUS FHI-aims FHI-aims
Functional PBE PBE HSE06 PBE HSE06
Basis Plane Wave DZP DZP intermediate intermediate
Pseudopotential PAW Dojo-NC-FR Dojo-NC-FR - -
ΔV\Delta V oC88 - cI16 +7.06+7.06 +7.82+7.82 +1.65+1.65 +8.82+8.82 +1.15+1.15
ΔH\Delta H oC88 - cI16 +1.68+1.68 +2.44+2.44 3.73-3.73 +3.44+3.44 4.23-4.23
ΔG\Delta G oC88 - cI16 +3.96+3.96 +3.96+3.96 2.21-2.21 +3.96+3.96 3.71-3.71

Further DFT calculations were conducted using the ABACUS software package aba (a) (TABLE. S2, columns 3 and 4), which utilizes the Dojo norm-conserving fully-relativistic pseudopotentials (Dojo-NC-FR) generated with the ONCVPSP code (version 3.3.0) aba (b); Hamann (2013). A double-zeta polarized (DZP) basis set was employed for the Perdew-Burke-Ernzerhof (PBE) and Heyd-Scuseria-Ernzerhof (HSE06) hybrid functional calculations. Additional calculations were performed using the FHI-aims all-electron electronic structure code aim (TABLE. S2, columns 5 and 6), which leverages numerically tabulated atom-centered basis functions for highly accurate and efficient quantum mechanical simulations Blum et al. (2009). An intermediate basis set was chosen to balance computational cost and accuracy. The HSE06 hybrid functional was applied to capture the exchange-correlation effects, utilizing the standard mixing scheme of 1/41/4 Hartree-Fock and 3/43/4 Kohn-Sham components Heyd et al. (2003); Martin (2020).

We performed single-point calculations on the structures optimized via NCT at 100K100~{}\text{K} and 70GPa70~{}\text{GPa}, with the results shown in TABLE S2. For the cI16 structure, a 15×15×1515\times 15\times 15 kk-point grid was utilized, while an 8×8×88\times 8\times 8 grid for oC88. All input files for these calculations are accessible in the code repository myc . The second column presents the results obtained using the DP model, trained on data generated by VASP with the PAW method under the PBE functional. The third column presents the results from ABACUS using a DZP basis under PBE, which aligns with those of the DP model, confirming that the DP model’s accuracy does not influence the conclusions. In this case, the potential energy at the equilibrium position of oC88 is 7.82meV/atom7.82~{}\text{meV}/\text{atom} higher than that of cI16. The fourth column presents the results under the hybrid HSE06 functional, where the energy difference between the two structures is reduced, with oC88 being only 1.65meV/atom1.65~{}\text{meV}/\text{atom} higher than cI16. This indicates that HSE corrects the BOES by 6.17meV6.17~{}\text{meV} compared to PBE. Similar results were obtained through calculations performed with FHI-aims, where HSE yielded a correction of 7.67meV/atom7.67~{}\text{meV}/\text{atom} compared to PBE.

To further investigate the stability of these two structures, we estimated their enthalpies, incorporating corrections based on single-point electronic structure calculations. It is evident that when a more accurate functional is employed, oC88 becomes more stable than cI16. This suggests its stabilization originates from electronic structure effects rather than finite-temperature or nuclear quantum effects. In addition, we applied corrections to the Gibbs free energy obtained from the NCT calculations. Interestingly, quantum and thermal effects slightly reduce the stability of the oC88 structure, contradicting the hypotheses proposed in previous studies Marqués et al. (2011); Gorelli et al. (2012); Wang et al. (2023).