Neural Canonical Transformations for Quantum Anharmonic Solids of Lithium
Abstract
Lithium is a typical quantum solid, characterized by cubic structures at ambient pressure. As the pressure increases, it forms more complex structures and undergoes a metal-to-semiconductor transformation, complicating theoretical and experimental analyses. We employ the neural canonical transformation approach, an ab initio variational method based on probabilistic generative models, to investigate the quantum anharmonic effects in lithium solids at finite temperatures. This approach combines a normalizing flow for phonon excited-state wave functions with a probabilistic model for the occupation of energy levels, optimized jointly to minimize the free energy. Our results indicate that quantum anharmonicity lowers the bcc-fcc transition temperature compared to classical molecular dynamics predictions. At high pressures, the predicted fractional coordinates of lithium atoms in the cI16 structure show good quantitative agreement with experimental observations. Finally, contrary to previous beliefs, we find that the poor metallic oC88 structure is stabilized by the potential energy surface obtained via high-accuracy electronic structure calculations, rather than thermal or quantum effects.
Introduction.– Accurate prediction of crystal structures has long been a central focus in materials science. At low temperatures, a deep understanding of the physical properties of crystals composed of light atoms typically requires proper treatment of nuclear quantum effects with anharmonicity Cazorla and Boronat (2017); Monserrat et al. (2013); Kapil et al. (2019); Flores-Livas et al. (2020); Monacelli et al. (2021a). These effects can play a crucial role in determining the crystal structure, as seen in solid hydrogen Azadi et al. (2014); Borinaga et al. (2016); Monacelli et al. (2021b, 2023), helium Pederiva and Chester (1998); Vitali et al. (2010); Monserrat et al. (2014), and some metal hydrides Errea et al. (2013, 2014). In this study, we explore one of the most notable examples: lithium, the lightest alkali metal, where the quantum effects of nuclei are pronounced in a wide range of pressures and temperatures Guillaume et al. (2011). Although lithium behaves as a nearly free-electron metal at low pressure and adopts simple, high-symmetry cubic structures, the free energy difference between its bcc (body-centered cubic) and fcc (face-centered cubic) phases is less than Hutcheon and Needs (2019); Jerabek et al. (2022); Riffe and Christensen (2024); Phuthi et al. (2024), making precise calculations challenging. Additionally, lithium exhibits several metastable structures that further complicate experimental measurements Ackland et al. (2017). As the pressure increases, lithium exhibits complex physical behaviors, such as anomalous melting curves Guillaume et al. (2011); Schaeffer et al. (2012); Frost et al. (2019), and intriguing phase transitions from metal to semiconductor and back Matsuoka and Shimizu (2009); Lv et al. (2011); Marqués et al. (2011). Moreover, some high-pressure phases consist of large unit cells with tens or even hundreds of atoms Marqués et al. (2011); Gorelli et al. (2012); Wang et al. (2023), posing substantial challenges for both theoretical and experimental studies.
Numerical approaches to studying quantum crystals at finite temperatures include the well-established path integral molecular dynamics Martyna et al. (1999) and path integral Monte Carlo Barker (1979). In recent years, inspired by the successful application of vibrational self-consistent field theory in molecular studies Bowman (1978, 1986); Gerber and Ratner (1979), efforts have been made to extend it to study crystals Monserrat et al. (2013, 2014); Hutcheon and Needs (2019); Kapil et al. (2019). However, it relies on a Taylor expansion of the Born-Oppenheimer energy surface (BOES), and the wave function is a simple Hartree product. The stochastic self-consistent harmonic approximation (SSCHA) Errea et al. (2013, 2014); Monacelli et al. (2021a) provides an alternative by accounting for both ionic quantum and anharmonic effects without assumptions on the specific function form of the BOES. Nevertheless, it still relies on the Gaussian variational density matrices to define the quantum probability distribution. Recent developments have extended SSCHA to non-Gaussian density matrices, yet the entropy is still restricted to Gaussian approximations Siciliano et al. (2024); Monacelli et al. (2024).
In this work, we utilize the recently developed neural canonical transformations (NCT) approach Xie et al. (2022, 2023); Zhang et al. (2024), which is an ab initio variational density matrix method based on deep generative models, to study quantum lattice dynamics of lithium. NCT constructs orthogonal variational wave functions to describe phonons through a normalizing flow model Dinh et al. (2014, 2016); Wang (2018); Papamakarios (2019); Papamakarios et al. (2021); Xie et al. (2022, 2023); Saleh et al. (2023); Zhang et al. (2024). Additionally, we create a probabilistic model to describe the classical energy occupation probabilities for these phonons, allowing for accurate entropy calculations. For electronic calculations, we employ the deep potential model Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023), a machine learning BOES, offering significant computational efficiency improvements over density functional theory (DFT) calculations. NCT’s key advantage is its ability to integrate quantum and anharmonic effects of nuclei into the wave functions, which facilitates the determination of the phonon spectrum. Moreover, the independently optimized phonon energy occupation probabilities enable the computation of anharmonic entropy. The NCT codes for lithium are open-sourced and publicly available myc .
The vibrational Hamiltonian of quantum solids.– Due to the substantial mass difference between electrons and nuclei, typically spanning several orders of magnitude, the Born-Oppenheimer approximation can be applied to decouple their motions and treat them independently. Under this approximation, the vibrational Hamiltonian is expressed as where the mass of a lithium atom is taken as . The term is the BOES, derived from electronic structure calculations at nuclear positions . In this work, to ensure both accuracy and computational efficiency, we use the deep potential model to fit the BOES Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023); sup , which is derived from DFT calculations using the Perdew-Burke-Ernzerhof (PBE) functional.
The dynamical matrix can be derived from the Hessian of the BOES at the equilibrium position Martin (2020); Baroni et al. (2001); Souvatzis et al. (2008, 2009); Flores-Livas et al. (2020): where index the nuclei, represent the Cartesian components, and the displacement coordinates are defined as . Diagonalizing the matrix in a supercell containing atoms yields non-zero eigenvalues, which correspond to the number of phonon modes. The eigenvalues are related to the squares of the phonon frequencies, (), and the associated eigenvectors define the unitary transformation from displacement coordinates to phonon coordinates . Consequently, the Hamiltonian can be expressed in phonon coordinates:
(1) |
where the term represents the anharmonic contributions, as detailed in sup . In this representation, the separation of high and low-frequency modes greatly enhances the efficiency in the following calculations.
Neural canonical transformation for variational density matrix.– The solution for a many-body system in the canonical ensemble can be obtained by minimizing the Helmholtz free energy for a variational density matrix:
(2) |
where is the Boltzmann constant and is the temperature. Assuming that the phonons occupy the states with probability , the variational density matrix can be represented in terms of these quantum states as:
(3) |
where indexes the energy levels of the phonons. An unbiased estimate of the anharmonic free energy for the variational density matrix Eq. 3 can be written as nested thermal and quantum expectations:
(4) |
where is the wave function in phonon coordinates. The symbol is the statistical expectation, which can be estimated through sampling Zhang et al. (2024); Becca and Sorella (2017). In this Letter, the variational parameters within the energy occupation probabilities and wave functions are denoted as and , respectively, i.e., , . These parameters can be optimized via gradient descent Kingma and Ba (2014), with as the loss function. The gradients with respect to the probabilities and the wave function sup can be efficiently computed using automatic differentiation Bradbury et al. (2018).
In a supercell with vibrational modes, setting a cutoff of energy levels per phonon (i.e., ) results in an exponentially large state space of . For supercells containing hundreds of atoms, directly representing the energy occupation probabilities becomes impractical in computations. In the study of lithium, we assume that the probability distributions take a product form Martyn and Swingle (2019): , where represents the probability of the -th phonon occupying state , and they are governed by learnable parameters . We have checked that an even more powerful variational autoregressive network Wu et al. (2019); Liu et al. (2021); Xie et al. (2023) does not improve results, likely due to weak coupling between phonon modes. The entropy can be evaluated as the expectation of the probabilities:
(5) |
We note the nonlinear SSCHA Siciliano et al. (2024); Monacelli et al. (2024) corresponds to even further simplification of , which assumes that the entropy is given by a set of independent harmonic oscillators.
To construct variational wave functions, we apply a unitary transformation to a set of orthogonal basis states Xie et al. (2022, 2023); Zhang et al. (2024): , where the basis states are chosen as the wave functions of a -dimensional harmonic oscillator with frequencies . We implement the unitary transformation using a normalizing flow Dinh et al. (2014, 2016); Wang (2018); Papamakarios (2019); Papamakarios et al. (2021); Xie et al. (2022, 2023); Saleh et al. (2023); Zhang et al. (2024), which establishes a learnable bijection between the phonon coordinates and a set of quasi-phonon coordinates . The bijection is represented as a smooth, reversible function , where consists of neural networks with learnable parameters , specifically, a real-valued non-volume preserving network Dinh et al. (2016). Accordingly, the orthogonal variational wave functions of all energy levels can be formulated as sup ; Zhang et al. (2024):
(6) |
where are basis states. Notably, in the study of lithium, the computation involves approximately ten million orthogonal states for each training. The Jacobian determinant in Eq. 6 captures phonon interactions and anharmonic effects, enabling a more flexible and accurate representation. This approach remains robust when imaginary phonons appear in strong anharmonicity systems (e.g., saddle points of BOES). In such cases, we can choose the corresponding basis states with real-valued frequencies, and the flow model will automatically optimize to find the most suitable wave functions. A detailed derivation of NCT can be found in Supplemental Material sup and our previous work Zhang et al. (2024).
We can extend NCT naturally to the isothermal-isobaric ensemble, where the goal is to minimize the Gibbs free energy at a target pressure , defined as:
(7) |
where is the system volume. From the relation , once the parameters and have converged under constant volume optimization (i.e., when ), the gradient of the Gibbs free energy to the strain depends only on the stress tensor . The stress tensor and pressure can be calculated using the virial theorem Nielsen and Martin (1985); Martin (2020); sup . Then, we can optimize the lattice constants through the strain tensor , which is similar to the structure relaxation in other methods Monserrat et al. (2013); Monacelli et al. (2021a).

Anharmonic and nuclear quantum effects on stability.– At ambient temperature and pressure, lithium adopts a simple bcc structure. As the temperature decreases, experiments have demonstrated that the true ground state of lithium is fcc Ackland et al. (2017). Some calculations have revealed that the free energies of these structures are extremely close Hutcheon and Needs (2019); Jerabek et al. (2022); Riffe and Christensen (2024); Phuthi et al. (2024), highlighting the necessity of fully accounting for quantum and anharmonic effects. To investigate the influence of anharmonicity, we first conducted NCT calculations for bcc and fcc using supercells with and atoms, respectively, at a fixed volume of and temperature . As a comparative study, we set in Eq. 6 to an identity transformation, meaning the phonon wave functions are harmonic oscillators. In this case, only the phonon occupation probabilities in Eq. 4 were optimized.
At a lower temperature of , the free energies of fcc are lower than that of bcc, as expected. However, as the temperature increases to , the impact of anharmonicity becomes evident, as shown in FIG. 1 (a). In fcc, the free energy difference between the two approaches remains small, about . In contrast, the anharmonic effects are much stronger in bcc, and the free energy difference expands to . It is also observed that when we only optimized , the free energy of fcc is lower than that of bcc. However, when the optimization of wave functions is included, i.e., when anharmonic effects are considered, the bcc structure becomes more stable. This phenomenon suggests that bcc exhibits stronger anharmonicity than fcc, underscoring the critical role of anharmonicity in determining the stability.
The findings are further supported by the radial distribution functions (RDF) of atoms, as shown in FIG. 1 (b). The RDF for fcc exhibits only slight influence from anharmonic effects. In contrast, the RDF for bcc exhibits a smoother curve when anharmonic effects are considered, indicating a reduction in atomic localization. This behavior suggests that anharmonicity softens the system, resulting in a lower zero-point energy (ZPE) than the harmonic approximation. Further insights are provided by the phonon density of states (DOS) depicted in FIG. 1 (c), where the ZPE is determined as half the sum of all phonon frequencies per atom, as detailed in sup . Although the bcc structure is more stable at high temperatures, the numerical results reveal that the ZPE of fcc remains lower than that of bcc. This phenomenon can be explained by the differences in coordination numbers: the coordination number of bcc is , while that of fcc is . Hence, atoms in bcc interact less strongly with their neighbors, leading to higher quantum fluctuations and greater anharmonicity.

To gain deeper insight into the influence of nuclear quantum effects, we performed calculations on the bcc and fcc structures under constant pressure. The Gibbs free energies of both structures are extremely close Hutcheon and Needs (2019); Jerabek et al. (2022); Riffe and Christensen (2024); Phuthi et al. (2024). An error of just could lead to a shift of more than in the transition temperature Riffe and Christensen (2024). FIG. 2 (a) shows the Gibbs free energy difference between these structures at , with the fcc structure used as the reference. The two curves intersect at , indicating a phase transition at this temperature. We also calculated the transition temperature through classical MD simulations with thermodynamic integration on the same BOES, obtaining a value of . The main difference between these methods is that NCT accounts for the quantum effects, while MD simulations do not. Similar results are also observed at and , as detailed in Supplemental Material sup , where NCT consistently predicts lower transition temperatures ( and ) compared to MD ( and ).
We also calculated the ionic entropy of both structures, as shown in FIG. 2 (b). Notably, the anharmonic entropy obtained from NCT is derived directly from the probabilities of energy occupations in Eq. 5, beyond the harmonic oscillator assumption. Under the harmonic oscillator assumption, the entropy of fcc is higher than that of the bcc. However, when anharmonicity is considered, the relationship is reversed. The higher entropy of the bcc structure is a key factor contributing to its stability in finite temperatures Souvatzis et al. (2008, 2009); Söderlind et al. (2012); Hutcheon and Needs (2019); Phuthi et al. (2024). Furthermore, we quantified the free energy difference arising from the anharmonic effects of entropy as (inset of FIG. 2 (b)), estimating it to be on the order of several meV. This underscores the critical importance of accurately incorporating anharmonic effect in the calculations.

High-pressure structural stability of lithium.– Under high pressure, lithium exhibits more complex structures and larger unit cell sizes. We first applied the NCT method to calculate the cI16 (cubic I-centered, I-43d) structure at , using a supercell of atoms under various pressures. The NCT method optimizes the equilibrium positions of atoms through coordinate transformations. Subsequently, we quenched the sampled structures to their ground state with the BOES and analyzed the fractional coordinates of Wyckoff position as a function of atomic volume. As depicted in FIG. 3 (a), our results are consistent with the experiment reported in Ref. Hanfland et al. (2000), demonstrating the reliability of NCT in structure optimizations.
As illustrated in FIG. 3 (b), the phonon DOS and ZPE of the cI16 and oC88 (orthorhombic C-face centered, C2mb) structures are calculated at . Under the harmonic approximation, the ZPE of both structures are found to be comparable, consistent with the results obtained using the finite displacement and density functional perturbation theory methods Wang et al. (2023). The anharmonic effects soften the phonon spectrum of cI16, reducing the ZPE by and further enhancing its stability. In contrast, the ZPE of oC88 decreases by only under anharmonic effects, indicating a smaller impact compared to cI16. This result suggests that when the anharmonic effect is considered, the stability of oC88 decreases, contrary to the expectations of previous studies Marqués et al. (2011); Lv et al. (2011); Wang et al. (2023).
Additionally, we calculated the Gibbs free energies at , as shown in FIG. 3 (c). The free energy of oC88 remains consistently higher than that of cI16 across all pressure conditions, which contradicts previous experiments. It has been reported that the resistivity increases sharply by more than four orders of magnitude after the cI16 phase, ultimately transforming into a semiconductor state. Compression experiments in Ref. Guillaume et al. (2011) observed the cI16-oC88 transition, as evidenced by changes in crystal color and diffraction patterns. Raman spectra measurements in Ref. Gorelli et al. (2012) detected signals corresponding to the oC88 phase. Another experiment Frost et al. (2019) also observed a phase transition around through the changes in diffraction peaks.
After the oC88 structure was experimentally observed Guillaume et al. (2011), theoretical studies attempted to explain its stability. However, Ref. Marqués et al. (2011) concluded from their zero-temperature calculations that oC88 was only the second most stable phase with an enthalpy about higher than cI16, attributing this to an insufficient consideration of ZPE and thermal effects. In another work Lv et al. (2011), the authors also failed to identify oC88 as a stable structure. In contrast, Ref. Gorelli et al. (2012) found that oC88 is stable when using the PBE functional with a harmonic ZPE at . However, a recent study Wang et al. (2023) demonstrated that neither harmonic nor anharmonic approximations could reproduce the results of Ref. Gorelli et al. (2012) at various conditions, showing a free energy difference at least with oC88 consistently higher than cI16. Surprisingly, as NCT captures nuclear quantum effects and anharmonic behaviors more accurately, the difference increases to . Therefore, we strongly suspect that the instability of oC88 arises from the limited accuracy of the DFT (PBE) calculations used in fitting the BOES Wang et al. (2023). It has been observed that DFT often over-stabilizes metallic states relative to non-metallic states Ma et al. (2009, 2007).
To validate our hypothesis, we employed the NCT-optimized structures at and conducted single-point electronic structure calculations using the high-accuracy Heyd-Scuseria-Ernzerhof (HSE06) functional Martin (2020). The HSE functional incorporates a hybrid exchange-correlation correction, enabling a clearer distinction between metallic and non-metallic states. Notably, HSE calculations are significantly more computationally demanding, requiring approximately two orders of magnitude more computational resources than PBE. Additional details of HSE calculations are provided in sup . Our results reveal that the relative energy of oC88 compared to cI16 decreases by under HSE in comparison to PBE. This reduction is significantly larger than the contributions from ZPE, anharmonic, and finite temperature effects. The HSE correction, depicted as the thick line in FIG. 3 (c), predicts a cI16-oC88 phase transition at approximately and . This finding is consistent with experimental observations, which report a narrow stability range for the oC88 phase, existing between and , flanked by the cI16 and oC40 phases on either side, respectively Guillaume et al. (2011). The electronic DOS of oC88, depicted in FIG. 3 (d), shows that while the HSE correction lowers the potential energy, oC88 still behaves as a poor metal.
Conclusions.– In summary, we developed the NCT method Xie et al. (2022, 2023); Zhang et al. (2024) to study anharmonic quantum solids and applied it to lithium. It enables the calculation of excited-state wave functions of nuclear motions beyond the harmonic approximation, allowing for the extraction of anharmonic phonon spectra. Additionally, the independently optimized phonon occupation probabilities facilitate the computation of anharmonic entropy. The results demonstrate that anharmonic effects play a crucial role in the structural stability. Nuclear quantum effects also introduce significant corrections to the fcc-bcc phase transition temperature. The fractional coordinates of the cI16 structure have been determined and closely align with experimental findings. Moreover, we identified that the failure of previous numerical studies Marqués et al. (2011); Lv et al. (2011); Wang et al. (2023) to observe the stability of oC88 was due to the limitations of the PBE functional in accurately describing poor metallic states. To address this, we applied the HSE functional to refine the results and estimate the stability region of oC88. Looking ahead, both experimental and computational investigations suggest that the emergence of novel high-density lithium solid phases between the cI16 and liquid phases presents a promising avenue for future exploration Frost et al. (2019); Wang et al. (2023). In this regime, high-accuracy DFT calculations and precise treatment of nuclear quantum effects may be required. Overall, the NCT method holds great potential for advancing our understanding and addressing diverse challenges in quantum crystal studies.
Acknowledgements.– We are grateful for the useful discussions with Hao Xie, Jian Lv, Xinyang Dong, Zhendong Cao, Zihang Li, Ruisi Wang, and Peize Lin. This work is supported by the National Natural Science Foundation of China under Grants No. 92270107, No. T2225018, No. 12188101, No. T2121001, No. 12134012, and No. 12374067, and the Strategic Priority Research Program of the Chinese Academy of Sciences under Grants No. XDB0500000 and No. XDB30000000.
References
- Cazorla and Boronat (2017) C. Cazorla and J. Boronat, Rev. Mod. Phys. 89, 035003 (2017).
- Monserrat et al. (2013) B. Monserrat, N. D. Drummond, and R. J. Needs, Phys. Rev. B 87, 144302 (2013).
- Kapil et al. (2019) V. Kapil, E. Engel, M. Rossi, and M. Ceriotti, Journal of Chemical Theory and Computation 15, 5845 (2019), pMID: 31532997.
- Flores-Livas et al. (2020) J. A. Flores-Livas, L. Boeri, A. Sanna, G. Profeta, R. Arita, and M. Eremets, Physics Reports 856, 1 (2020), a perspective on conventional high-temperature superconductors at high pressure: Methods and materials.
- Monacelli et al. (2021a) L. Monacelli, R. Bianco, M. Cherubini, M. Calandra, I. Errea, and F. Mauri, Journal of Physics: Condensed Matter 33, 363001 (2021a).
- Azadi et al. (2014) S. Azadi, B. Monserrat, W. M. C. Foulkes, and R. J. Needs, Phys. Rev. Lett. 112, 165501 (2014).
- Borinaga et al. (2016) M. Borinaga, I. Errea, M. Calandra, F. Mauri, and A. Bergara, Phys. Rev. B 93, 174308 (2016).
- Monacelli et al. (2021b) L. Monacelli, I. Errea, M. Calandra, and F. Mauri, Nature Physics 17, 63 (2021b).
- Monacelli et al. (2023) L. Monacelli, M. Casula, K. Nakano, S. Sorella, and F. Mauri, Nature Physics 19, 845 (2023).
- Pederiva and Chester (1998) F. Pederiva and G. V. Chester, Journal of low temperature physics 113, 741 (1998).
- Vitali et al. (2010) E. Vitali, M. Rossi, L. Reatto, and D. E. Galli, Phys. Rev. B 82, 174510 (2010).
- Monserrat et al. (2014) B. Monserrat, N. D. Drummond, C. J. Pickard, and R. J. Needs, Phys. Rev. Lett. 112, 055504 (2014).
- Errea et al. (2013) I. Errea, M. Calandra, and F. Mauri, Phys. Rev. Lett. 111, 177002 (2013).
- Errea et al. (2014) I. Errea, M. Calandra, and F. Mauri, Phys. Rev. B 89, 064302 (2014).
- Guillaume et al. (2011) C. L. Guillaume, E. Gregoryanz, O. Degtyareva, M. I. McMahon, M. Hanfland, S. Evans, M. Guthrie, S. V. Sinogeikin, and H. Mao, Nature Physics 7, 211 (2011).
- Hutcheon and Needs (2019) M. Hutcheon and R. Needs, Phys. Rev. B 99, 014111 (2019).
- Jerabek et al. (2022) P. Jerabek, A. Burrows, and P. Schwerdtfeger, Chemical Communications 58, 13369 (2022).
- Riffe and Christensen (2024) D. M. Riffe and J. D. Christensen, (2024), arXiv:2406.09527 [cond-mat.mtrl-sci] .
- Phuthi et al. (2024) M. K. Phuthi, Y. Huang, M. Widom, and V. Viswanathan, (2024), arXiv:2406.15491 [cond-mat.mtrl-sci] .
- Ackland et al. (2017) G. J. Ackland, M. Dunuwille, M. Martinez-Canales, I. Loa, R. Zhang, S. Sinogeikin, W. Cai, and S. Deemyad, Science 356, 1254 (2017).
- Schaeffer et al. (2012) A. M. J. Schaeffer, W. B. Talmadge, S. R. Temple, and S. Deemyad, Phys. Rev. Lett. 109, 185702 (2012).
- Frost et al. (2019) M. Frost, J. B. Kim, E. E. McBride, J. R. Peterson, J. S. Smith, P. Sun, and S. H. Glenzer, Phys. Rev. Lett. 123, 065701 (2019).
- Matsuoka and Shimizu (2009) T. Matsuoka and K. Shimizu, Nature 458, 186 (2009).
- Lv et al. (2011) J. Lv, Y. Wang, L. Zhu, and Y. Ma, Phys. Rev. Lett. 106, 015503 (2011).
- Marqués et al. (2011) M. Marqués, M. I. McMahon, E. Gregoryanz, M. Hanfland, C. L. Guillaume, C. J. Pickard, G. J. Ackland, and R. J. Nelmes, Phys. Rev. Lett. 106, 095502 (2011).
- Gorelli et al. (2012) F. A. Gorelli, S. F. Elatresh, C. L. Guillaume, M. Marqués, G. J. Ackland, M. Santoro, S. A. Bonev, and E. Gregoryanz, Phys. Rev. Lett. 108, 055501 (2012).
- Wang et al. (2023) X. Wang, Z. Wang, P. Gao, C. Zhang, J. Lv, H. Wang, H. Liu, Y. Wang, and Y. Ma, Nature Communications 14, 2924 (2023).
- Martyna et al. (1999) G. J. Martyna, A. Hughes, and M. E. Tuckerman, The Journal of Chemical Physics 110, 3275 (1999).
- Barker (1979) J. A. Barker, The Journal of Chemical Physics 70, 2914 (1979).
- Bowman (1978) J. M. Bowman, The Journal of Chemical Physics 68, 608 (1978).
- Bowman (1986) J. M. Bowman, Accounts of Chemical Research 19, 202 (1986).
- Gerber and Ratner (1979) R. Gerber and M. Ratner, Chemical Physics Letters 68, 195 (1979).
- Siciliano et al. (2024) A. Siciliano, L. Monacelli, and F. Mauri, Phys. Rev. B 110, 134111 (2024).
- Monacelli et al. (2024) L. Monacelli, A. Siciliano, and N. Marzari, (2024), arXiv:2410.08882 [cond-mat.mtrl-sci] .
- Xie et al. (2022) H. Xie, L. Zhang, and L. Wang, Journal of Machine Learning 1, 38 (2022).
- Xie et al. (2023) H. Xie, L. Zhang, and L. Wang, SciPost Physics 14 (2023), 10.21468/scipostphys.14.6.154.
- Zhang et al. (2024) Q. Zhang, R.-S. Wang, and L. Wang, The Journal of Chemical Physics 161, 024103 (2024).
- Dinh et al. (2014) L. Dinh, D. Krueger, and Y. Bengio, arXiv preprint arXiv:1410.8516 (2014).
- Dinh et al. (2016) L. Dinh, J. Sohl-Dickstein, and S. Bengio, arXiv preprint arXiv:1605.08803 (2016).
- Wang (2018) L. Wang, (2018).
- Papamakarios (2019) G. Papamakarios, arXiv preprint arXiv:1910.13233 (2019).
- Papamakarios et al. (2021) G. Papamakarios, E. Nalisnick, D. J. Rezende, S. Mohamed, and B. Lakshminarayanan, Journal of Machine Learning Research 22, 1 (2021).
- Saleh et al. (2023) Y. Saleh, Á. F. Corral, A. Iske, J. Küpper, and A. Yachmenev, arXiv preprint arXiv:2308.16468 (2023).
- Wang et al. (2018) H. Wang, L. Zhang, J. Han, and W. E, Computer Physics Communications 228, 178 (2018).
- Zeng et al. (2023) J. Zeng, D. Zhang, D. Lu, P. Mo, Z. Li, Y. Chen, M. Rynik, L. Huang, Z. Li, S. Shi, Y. Wang, H. Ye, P. Tuo, J. Yang, Y. Ding, Y. Li, D. Tisi, Q. Zeng, H. Bao, Y. Xia, J. Huang, K. Muraoka, Y. Wang, J. Chang, F. Yuan, S. L. Bore, C. Cai, Y. Lin, B. Wang, J. Xu, J.-X. Zhu, C. Luo, Y. Zhang, R. E. A. Goodall, W. Liang, A. K. Singh, S. Yao, J. Zhang, R. Wentzcovitch, J. Han, J. Liu, W. Jia, D. M. York, W. E, R. Car, L. Zhang, and H. Wang, The Journal of Chemical Physics 159, 054801 (2023).
- (46) The code of neural canonical transformations for lithium solids, implemented with the Python package JAX Bradbury et al. (2018), is available at https://github.com/zhangqi94/lithium.
- (47) See Supplemental Material for details of the neural canonical transformation approach and additional results, which includes Refs. Barrett (1947); Rao and Rao (1967); Eger and Gross (1963, 1964); DeWitt (1952); Hutchinson (1989); Schoenholz and Cubuk (2020); Kresse and Furthmüller (1996); aba (a); Hamann (2013); aba (b); aim ; Blum et al. (2009); Heyd et al. (2003).
- Martin (2020) R. M. Martin, Electronic structure: basic theory and practical methods (Cambridge university press, 2020).
- Baroni et al. (2001) S. Baroni, S. de Gironcoli, A. Dal Corso, and P. Giannozzi, Rev. Mod. Phys. 73, 515 (2001).
- Souvatzis et al. (2008) P. Souvatzis, O. Eriksson, M. I. Katsnelson, and S. P. Rudin, Phys. Rev. Lett. 100, 095901 (2008).
- Souvatzis et al. (2009) P. Souvatzis, O. Eriksson, M. Katsnelson, and S. Rudin, Computational Materials Science 44, 888 (2009).
- Becca and Sorella (2017) F. Becca and S. Sorella, Quantum Monte Carlo Approaches for Correlated Systems (Cambridge University Press, 2017).
- Kingma and Ba (2014) D. P. Kingma and J. Ba, arXiv preprint arXiv:1412.6980 (2014).
- Bradbury et al. (2018) J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, “JAX: composable transformations of Python+NumPy programs,” (2018).
- Martyn and Swingle (2019) J. Martyn and B. Swingle, Phys. Rev. A 100, 032107 (2019).
- Wu et al. (2019) D. Wu, L. Wang, and P. Zhang, Phys. Rev. Lett. 122, 080602 (2019).
- Liu et al. (2021) J.-G. Liu, L. Mao, P. Zhang, and L. Wang, Machine Learning: Science and Technology 2, 025011 (2021).
- Nielsen and Martin (1985) O. H. Nielsen and R. M. Martin, Phys. Rev. B 32, 3780 (1985).
- Söderlind et al. (2012) P. Söderlind, B. Grabowski, L. Yang, A. Landa, T. Björkman, P. Souvatzis, and O. Eriksson, Phys. Rev. B 85, 060301 (2012).
- Hanfland et al. (2000) M. Hanfland, K. Syassen, N. Christensen, and D. Novikov, Nature 408, 174 (2000).
- Ma et al. (2009) Y. Ma, M. Eremets, A. R. Oganov, Y. Xie, I. Trojan, S. Medvedev, A. O. Lyakhov, M. Valle, and V. Prakapenka, Nature 458, 182 (2009).
- Ma et al. (2007) Y. Ma, A. R. Oganov, and C. W. Glass, Phys. Rev. B 76, 064101 (2007).
- Barrett (1947) C. S. Barrett, Phys. Rev. 72, 245 (1947).
- Rao and Rao (1967) C. Rao and K. Rao, Progress in Solid State Chemistry 4, 131 (1967).
- Eger and Gross (1963) M. Eger and E. Gross, Annals of Physics 24, 63 (1963).
- Eger and Gross (1964) F. Eger and E. Gross, Il Nuovo Cimento (1955-1965) 34, 1225 (1964).
- DeWitt (1952) B. S. DeWitt, Phys. Rev. 85, 653 (1952).
- Hutchinson (1989) M. Hutchinson, Communications in Statistics - Simulation and Computation 18, 1059 (1989).
- Schoenholz and Cubuk (2020) S. S. Schoenholz and E. D. Cubuk, in Advances in Neural Information Processing Systems, Vol. 33 (Curran Associates, Inc., 2020).
- Kresse and Furthmüller (1996) G. Kresse and J. Furthmüller, Phys. Rev. B 54, 11169 (1996).
- aba (a) (a), Atomic-orbital Based Ab-initio Computation at USTC (ABACUS) is an open-source package for density functional theory calculations. More information can be found at https://abacus.ustc.edu.cn/main.htm.
- Hamann (2013) D. R. Hamann, Phys. Rev. B 88, 085117 (2013).
- aba (b) (b), Optimized Norm-Conservinng Vanderbilt PSeudopotential (ONCVPSP) code is available at http://www.mat-simresearch.com.
- (74) Fritz Haber Institute ab initio materials simulations (FHI-aims) is a quantum mechanics software package based on numeric atom-centered orbitals. More information can be found at https://fhi-aims.org.
- Blum et al. (2009) V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. Ren, K. Reuter, and M. Scheffler, Computer Physics Communications 180, 2175 (2009).
- Heyd et al. (2003) J. Heyd, G. E. Scuseria, and M. Ernzerhof, The Journal of Chemical Physics 118, 8207 (2003).
Supplemental Material: Neural Canonical Transformations for Quantum Anharmonic Solids of Lithium
Appendix SI SI. Neural Canonical Transformation approach

Neural canonical transformation (NCT) is a variational density matrix method that leverages multiple deep generative models. As shown in FIG. S1, this approach utilizes either a product spectrum ansatz (PSA) Martyn and Swingle (2019) or a variational autoregressive network (VAN) Wu et al. (2019); Liu et al. (2021); Xie et al. (2023) to represent phonon energy occupation probabilities (phonon Boltzmann distribution), a normalizing flow Dinh et al. (2014, 2016); Wang (2018); Papamakarios (2019); Papamakarios et al. (2021); Xie et al. (2022, 2023); Saleh et al. (2023); Zhang et al. (2024) for phonon excited-state wave functions, and a deep potential (DP) model Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023) to describe the Born-Oppenheimer energy surface (BOES). In constant-volume calculations, the goal is to minimize the Helmholtz free energy by jointly optimizing the wave functions and probability distributions. For constant-pressure calculations, once has converged, the stress tensor is measured to adjust the lattice constants . After making these adjustments, constant-volume calculations are repeated until the Gibbs free energy converges.
SI.1 A. Variational wave functions in phonon coordinates
The displacement coordinates of crystals are defined as , where represents the nuclear positions and denotes the initial equilibrium positions. The BOES is expressed as a function of the coordinates and the lattice constants: , where we omit the lattice constants in in the main text for simplicity. The dynamical matrix of the lattice can then be derived from the following expression Martin (2020); Baroni et al. (2001); Souvatzis et al. (2008, 2009); Flores-Livas et al. (2020):
(S1) |
where index the nuclei, represent the Cartesian components, and is the mass of atoms. By diagonalizing the dynamical matrix, the system can be expressed in phonon coordinates. The eigenvalues correspond to the squares of phonon frequencies, , where indices the phonon modes. The associated eigenvectors define the unitary transformation between displacement and phonon coordinates: , . Consequently, the anharmonic terms in the BOES can be expressed as:
(S2) |
The vibrational Hamiltonian of the system, including anharmonic terms, can then be expressed in phonon coordinates as:
(S3) |
where the atomic mass is absorbed in the transformation from displacement coordinates to phonon coordinates . In the harmonic approximation, we discard the anharmonic terms, and the system’s Hamiltonian is defined in quasi-phonon coordinates:
(S4) |
The eigenvectors of the above Hamiltonian are product states of -dimensional quantum harmonic oscillators, which serve as the basis of variational wave functions mentioned in the main text:
(S5) |
where the normalization factor has been omitted, as only derivatives of the wave function’s logarithm are required for subsequent calculations. is the energy level of the -th quasi-phonon, and is the -th order Hermite polynomial. The variational wave functions in phonon coordinates are expressed as Zhang et al. (2024):
(S6) |
where the orthogonality condition is always satisfied due to the properties of the Jacobian determinant, i.e., . Using the normalizing flow, the phonon coordinates are connected to the quasi-phonon coordinates through a coordinate transformation, also known as point transformation Eger and Gross (1963, 1964); DeWitt (1952). In our work, it can be understood as a reversible function , where is realized through neural networks Dinh et al. (2016) with learnable parameters .
SI.2 B. Gradients of the variational free energy
Under constant-volume conditions, the variational Helmholtz free energy is calculated through the expectation values of the phonon energy occupation probabilities and phonon wave functions :
(S7) |
where the symbol represents statistical expectation, which can be estimated through sampling, and the number of samples is called the batch size Zhang et al. (2024); Becca and Sorella (2017). For instance, consider an observable , a function of a variable that follows a probability distribution . Let denotes the index of each sample, where , and represents the value corresponding to the -th sample. The unbiased estimation of the expectation can be approximated using these samples:
(S8) |
Returning to Eq. S7, the term represents the local energy of the vibrational Hamiltonian. It is determined by the energy level indices and the phonon coordinates :
(S9) |
where the first part is kinetic energy and the second part is potential energy. Since the wave functions are real-valued, we directly take the logarithm of their absolute values. This technique helps prevent numerical issues during computation, as detailed in the Appendix of Ref. Zhang et al. (2024). Moreover, the local energy’s computational bottleneck lies in evaluating the second-order derivatives. To reduce the computational complexity, we adopt Hutchinson’s stochastic trace estimator Hutchinson (1989). This trick improves the computational efficiency by an order of magnitude without compromising accuracy Xie et al. (2023). The free energy can be minimized by optimizing the parameter in the probabilities and the parameter in the wave functions . The gradients of to these two parameters are given by Xie et al. (2022, 2023); Zhang et al. (2024):
(S10) | ||||
SI.3 C. Observables and anharmonic zero-point energy
During the training process of NCT, observables such as kinetic energy, potential energy, entropy, and Helmholtz free energy are computed at each iteration. This approach also enables the convenient computation of pressure and stress, which is crucial for structural relaxation in the isothermal-isobaric ensemble. This implementation utilizes automatic differentiation Bradbury et al. (2018); Schoenholz and Cubuk (2020). The stress tensor , which accounts for both nuclear quantum and anharmonic effects, can be directly derived from its definition Nielsen and Martin (1985); Martin (2020):
(S11) |
where is the volume of the system, and the strain is defined as a transformation of each atomic coordinates . The term can be interpreted as the force, and it can be conveniently derived from BOES. Then, the pressure is determined from the trace of the stress tensor:
(S12) |
where is the kinetic energy of the ions, which can be derived from the first part of Eq. S9. In the study of lithium, we observe that the electronic contribution to pressure significantly outweighs the phononic contribution. For example, under , the phonon contribution to the pressure is less than , with the remainder arising from electrons.
In a supercell containing atoms, which has phonons. The zero-point energy (ZPE) in the harmonic approximation is defined using the frequencies derived from the dynamical matrix: . The anharmonic phonon frequencies and anharmonic ZPE in NCT can be defined as follows. The energy expectation for the energy level indexed by can be computed through sampling Zhang et al. (2024):
(S13) |
where is the local energy defined in Eq. (S9). In this context, the first excited single-phonon state of the -th phonon is represented as , except for , and the second excited state corresponds to . The anharmonic frequency of each phonon is defined by subtracting the ground-state energy from its single-phonon excitation energy: where the reduced Planck constant is set to be for simplicity. The anharmonic ZPE is then computed as the half sum of these frequencies, normalized by the number of atoms:
(S14) |
Appendix SII SII. Computational details and Additional results
SII.1 A. Born-Oppenheimer energy surface and computational costs
In this work, the BOES is realized through the deep potential (DP) models Wang et al. (2018); Zeng et al. (2023); Wang et al. (2023), which are trained on datasets generated through density functional theory (DFT) calculations. The DFT calculations were performed using the VASP (Vienna ab initio simulation package) code Kresse and Furthmüller (1996), employing the projector augmented wave (PAW) method with the Perdew-Burke-Ernzerhof (PBE) functional. For high-pressure calculations of cI16 and oC88, we utilized the same DP model (Li-DP-Hyb2) as detailed in the Supplemental Material of Ref. Wang et al. (2023). At low pressures, another DP model (Li-DP-Hyb3) was trained with additional data for bcc and fcc structures.
structure | atoms | batch size | epochs | GPU time (hours) |
bcc | 250 | 400 | 15000 | 2*A100: 18 h or 4*V100: 26 h |
fcc | 256 | 400 | 15000 | 2*A100: 18 h or 4*V100: 26 h |
cI16 | 432 | 256 | 15000 | 4*A100: 22 h or 8*V100: 29 h |
oC88 | 352 | 256 | 67000 | 2*A800: 120 h or 4*V100: 140 h |
In the NCT calculations presented in this work, the probability distribution of phonon energy levels is modeled using the product spectrum ansatz, with energy levels truncated at . The function in the variational wave functions is implemented using a real-valued non-volume preserving network Dinh et al. (2016), consisting of coupling layers, each comprising two multilayer perceptrons with hidden units. For the orthorhombic crystals, oC88, structure relaxation is performed every iterations to optimize the lattice constant . For bcc and fcc structures, relaxation occurs every iterations. The NCT method demonstrates exceptional scalability for large-scale parallel computations. All calculations were conducted on multiple NVIDIA A800-80G or V100-32G GPUs (graphics processing unit). TABLE S1 summarizes the computation time for each data point, where a single data point corresponds to a specific combination of structure, pressure, and temperature. Additional details regarding parameter settings can be found in our code repository myc .
SII.2 B. Phase diagrams via neural canonical transformations

We utilized the NCT method to calculate the Gibbs free energy for fcc and bcc structures, as depicted in FIG. S2. NCT predicts transition temperatures of , , and at , , and , which are consistently lower than those obtained from thermodynamic integration through classical molecular dynamics (MD) simulations on the same energy surface. The latter yield transition temperatures of , , and . This discrepancy is primarily attributed to the quantum effects of nuclei. Notably, at , the MD-predicted phase boundary of aligns closely with the DFT value reported in Ref. Ackland et al. (2017). However, NCT significantly lowers the phase boundary to , which agrees more closely with experimental values of and in Refs. Barrett (1947); Rao and Rao (1967). Compared to the MD results, NCT provides corrections in the right direction, underscoring the importance of incorporating nuclear effects.

In the main text, we discussed calculations for the cI16 and oC88 structures using the NCT method. The results for ionic kinetic energy and entropy are presented in FIGs. S3 (a) and (b), respectively. Within the pressure range of to , the cI16 structure consistently exhibits lower kinetic energy and higher entropy. However, this outcome contradicts the hypothesis proposed in Ref. Gorelli et al. (2012), which suggested that oC88 might exhibit lower ZPE and higher entropy than the competing structures. At pressures above , both cI16 and oC88 become unstable, and a more stable semiconducting oC40 structure emerges. The accuracy of the DP model also declines significantly in this range, making the results for cI16 and oC88 less significant. FIG. S3 (c) shows the high-pressure phase diagram, where the NCT-derived cI16-oC88 phase boundary is corrected using hybrid functionals. The transition point agrees well with experimental observations Guillaume et al. (2011); Frost et al. (2019).
SII.3 C. High-accuracy electronic calculations at high-pressure
Method | DP (VASP) | ABACUS | ABACUS | FHI-aims | FHI-aims | |
Functional | PBE | PBE | HSE06 | PBE | HSE06 | |
Basis | Plane Wave | DZP | DZP | intermediate | intermediate | |
Pseudopotential | PAW | Dojo-NC-FR | Dojo-NC-FR | - | - | |
oC88 - cI16 | ||||||
oC88 - cI16 | ||||||
oC88 - cI16 |
Further DFT calculations were conducted using the ABACUS software package aba (a) (TABLE. S2, columns 3 and 4), which utilizes the Dojo norm-conserving fully-relativistic pseudopotentials (Dojo-NC-FR) generated with the ONCVPSP code (version 3.3.0) aba (b); Hamann (2013). A double-zeta polarized (DZP) basis set was employed for the Perdew-Burke-Ernzerhof (PBE) and Heyd-Scuseria-Ernzerhof (HSE06) hybrid functional calculations. Additional calculations were performed using the FHI-aims all-electron electronic structure code aim (TABLE. S2, columns 5 and 6), which leverages numerically tabulated atom-centered basis functions for highly accurate and efficient quantum mechanical simulations Blum et al. (2009). An intermediate basis set was chosen to balance computational cost and accuracy. The HSE06 hybrid functional was applied to capture the exchange-correlation effects, utilizing the standard mixing scheme of Hartree-Fock and Kohn-Sham components Heyd et al. (2003); Martin (2020).
We performed single-point calculations on the structures optimized via NCT at and , with the results shown in TABLE S2. For the cI16 structure, a -point grid was utilized, while an grid for oC88. All input files for these calculations are accessible in the code repository myc . The second column presents the results obtained using the DP model, trained on data generated by VASP with the PAW method under the PBE functional. The third column presents the results from ABACUS using a DZP basis under PBE, which aligns with those of the DP model, confirming that the DP model’s accuracy does not influence the conclusions. In this case, the potential energy at the equilibrium position of oC88 is higher than that of cI16. The fourth column presents the results under the hybrid HSE06 functional, where the energy difference between the two structures is reduced, with oC88 being only higher than cI16. This indicates that HSE corrects the BOES by compared to PBE. Similar results were obtained through calculations performed with FHI-aims, where HSE yielded a correction of compared to PBE.
To further investigate the stability of these two structures, we estimated their enthalpies, incorporating corrections based on single-point electronic structure calculations. It is evident that when a more accurate functional is employed, oC88 becomes more stable than cI16. This suggests its stabilization originates from electronic structure effects rather than finite-temperature or nuclear quantum effects. In addition, we applied corrections to the Gibbs free energy obtained from the NCT calculations. Interestingly, quantum and thermal effects slightly reduce the stability of the oC88 structure, contradicting the hypotheses proposed in previous studies Marqués et al. (2011); Gorelli et al. (2012); Wang et al. (2023).