An adaptive preconditioning scheme for the self-consistent field iteration and generalized stacking-fault energy calculations

Sitong Zhang Center for High Pressure Science, State Key Laboratory of Metastable Materials Science and Technology, Yanshan University, Qinhuangdao 066004, China. Xingyu Gao [email protected] Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Fenghao East Road 2, Beijing 100094, People’s Republic of China. Haifeng Song Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Fenghao East Road 2, Beijing 100094, People’s Republic of China. Bin Wen [email protected] Center for High Pressure Science, State Key Laboratory of Metastable Materials Science and Technology, Yanshan University, Qinhuangdao 066004, China.

Abstract

Abstract: The generalized stacking-fault energy (GSFE) is the fundamental but key parameter for the plastic deformation of materials. We perform first-principles calculations by full-potential linearized augmented planewave (FLAPW) method to evaluate the GSFE based on the single-shift and triple-shift supercell models. Different degrees of defects are introduced in the two models, thereby affecting the convergence of the self-consistent field (SCF) iterations. We present an adaptive preconditioning scheme which can identify the long-wavelength divergence behavior of the Jacobian during the SCF iteration and automatically switch on the Kerker preconditioning to accelerate the convergence. We implement this algorithm in Elk-7.2.42 package and calculate the GSFE curves for Al, Cu, and Si (111) plane $\langle\bar{1}\bar{1}2\rangle$ direction. We found that the single-shift and triple-shift supercell models have equivalent calculation accuracy and are within the experimental data uncertainty. For computational efficiency, the triple-shift supercell model is preferable due to its better convergence, exhibiting lower degree of defect compared to the single-shift supercell model.

density-functional theory, self-consistent field methods, FLAPW, Kerker preconditioner

I INTRODUCTION

In the field of materials science, defects in materials, including point defects, line defects, planar defects, bulk defects Moitra et al. (2014); Garg et al. (2019); Yan et al. (2021); Jeong et al. (2018), significantly influence the mechanical, electronic, optical, and thermal properties of materials, thereby impacting their performance and applications Ehrhart (1992); Siegel (1981); Arrigoni and Madsen (2021); Davidsson et al. (2021). Understanding and characterizing these defects is crucial for studying material microstructure evolution. In particular, stacking-fault is an importment kind of planar defect caused by the misalignment of atomic layers in crystals Hutchinson (1977); Gilman and Read (1952), and the stacking-fault energy (SFE) refers to the energy change when stacking-fault is formed in the crystal, which directly affects the mechanical properties of materials Anderson et al. (2017); Vitek (1968). It is shown that metal materials with low SFE tend to slip mechanism, while those with high SFE are inclined towards twinning mechanism Meyers et al. (2001). Significantly, the experimentally measurable SFE typically represents the intrinsic SFE. As an extension of SFE, the generalized stacking-fault energy (GSFE) has been proposed to reflect the SFE change of the entire surface, which can help gain a deeper understanding of plastic deformation and offer theoretical support for relevant experiments Sun et al. (2022). It also reveals the energy barrier which is necessary to overcome in order to arrive at the intrinsic SFE.

For the calculation of GSFE, there are two factors need to consider: the stacking-fault construction and the energy calculation method. The stacking-fault construction includes the direct supercell method, indirect supercell method, and axial interaction method Denteneer and Van Haeringen (1987); Muzyk et al. (2012); Gholizadeh (2013). Among these, the direct supercell method is widely recognized for the ability of the atomic position relaxation. And two models are widely used for the direct supercell method in face-centered cubic (FCC) materials: the single-shift and triple-shift supercell models Gholizadeh et al. (2013). It is worth noting that different types of defects are introduced to the two models. In order to obtain accurate GSFE values, the Kohn-Sham density functional theory (DFT) is highly desirable Hohenberg and Kohn (1964); Kohn and Sham (1965). Full-potential linearized augmented planewave (FLAPW) is a high-precision DFT calculation method Wimmer et al. (1981); Ambrosch-Draxl and Sofo (2006), which has developed several computational software packages, such as WIEN2K Blaha et al. (2001), FLEUR Wortmann et al. (2023), EXCITING Gulans et al. (2014), ELK 38 (38), etc. These softwares are widely regarded as benchmarks in assessing bulk properties, including equations of state (EOS) Lejaeghere et al. (2016); Otero-de-la Roza et al. (2011), elastic constant Panda and Chandran (2006), and so on. The application of the FLAPW method in defect calculations (like GSFE calculations) is relatively few Hoshino et al. (2001); Šob . It is shown that high-precision computational results of FLAPW can be as a reference supporting the development of other material calculation methods (such as machine learning) Freitas and Cao (2022); Dragoni et al. (2018).

In periodic boundary conditions, these defect models generally have large volumes, posing a computational challenge due to the substantial resources required. Hence, optimizing computational efficiency becomes a crucial concern. Different configurations may affect the computational efficiency on two aspects: atomic relaxation and electronic self-consistent field (SCF) iteration. At the same time, the defect states introduced by defects may cause ”charge sloshing” during SCF iterations in large systems Tassone et al. (1994); Kresse and Furthmüller (1996a). The essence of charge sloshing is the long-wavelength divergence of the Jacobian caused by the complete screening effect. When the input charge changes, it will cause the output charge violently oscillate in the low-frequency part, resulting in bad convergence. Although it often appears in metals, similar phenomenon may also occur in some defective semiconductor systems. However, prior judgment of whether a system has charge sloshing usually relies on experience. And in some complex systems, such as metal-insulator contact systems, it is difficult to determine a priori whether there is charge sloshing Yu et al. (2018); Zhou et al. (2018). Kerker preconditioning is considered an effective scheme for suppressing the long-wavelength divergence behavior of the Jacobian Resta (1977); Raczkowski et al. (2001), and its effectiveness has been verified in some softwares Winkelmann et al. (2020); Kim et al. (2020). But imposing Kerker preconditioning on all systems may not achieve good convergences. This raises the question: How to identify long-wavelength divergence behavior of the Jacobian during SCF iterations?

In this work, we found a good approximation of the inverse Jacobian in the Pulay subspace and constructed a posterior indicator to identify the long-wavelength divergence behavior of the Jacobian. Then, we developed an adaptive algorithm based on the indicator and the Kerker preconditioner in ELK codes. Our algorithm can identify the charge sloshing and automatically switch to Kerker preconditioning, enabling efficient convergence of different stacking-fault models. On this bases, we used Al, Cu, and Si as representative materials and evaluated the GSFE curve calculations from two aspects of accuracy and efficiency for (111) plane $\langle\bar{1}\bar{1}2\rangle$ direction with the different planar defect models: the single-shift and triple-shift supercell models, and explored the impact of different defects in stacking-fault models on calculations.

This paper is organized as follows: In Sec II, we will introduce fundamental models and computational methods. In Sec III, we will discuss the construction of a posterior indicator and the adaptive preconditioning scheme. In Sec IV, we will present our results and analysis of the GSFE curves calculation. Concluding remarks will be presented in the last section.

II Basic models and calculation methods

II.1 Construction of stacking-fault model

Vitek defined the SFE as Vitek (1968):

\displaystyle\gamma=\frac{\Delta E}{s}.

(1)

Where $\Delta E$ is the energy difference between the original structure and the stacking-fault structure, $s$ is the area of the slip plane.

We first focus on the modeling methods of the single-shift and triple-shift supercell models for GSFE curve calculations. In Al, Cu, and Si structures, the stacking sequence of the close-packed atomic layers along the z- $\langle 111\rangle$ direction can be denoted by ABCABC. Here, we define a 6-layer stacking configuration as the supercell unit block to mitigate the interaction impact between two stacking-faults along the z-axis direction. The $\langle{1}\bar{1}0\rangle$ and $\langle{1}0\bar{1}\rangle$ directions on the (111) plane are selected as the basis vectors to make the supercell unit block as small as possible to reduce computational cost.

The single-shift supercell model consists of two interacting supercell unit blocks, where one remains stationary, and the other moves along the $\langle\bar{1}\bar{1}2\rangle$ direction. However, due to periodic boundary conditions, a relative displacement in the $\langle\bar{1}\bar{1}2\rangle$ direction between the blocks is induced. We add a 10 Å vacuum layer in the z-axis direction to eliminate the interaction between the blocks and their periodic images. The triple-shift supercell model consists of three interacting supercell blocks. They simultaneously slips in the $\langle{1}\bar{1}0\rangle$ , $\langle 01\bar{1}\rangle$ and $\langle\bar{1}01\rangle$ directions, respectively. As a result, the displacements between adjacent blocks are $\langle 2\bar{1}\bar{1}\rangle$ , $\langle\bar{1}2\bar{1}\rangle$ , and $\langle\bar{1}\bar{1}2\rangle$ . The sum of these three displacement vectors is zero, no need to add a vacuum layer and preserving the periodic boundary conditions along the z-axis direction. The modeling methods of the two models are shown in the Fig. 1. We took 18 equidistant points to approximate the unit (111) plane $\langle\bar{1}\bar{1}2\rangle$ direction GSFE curve of Al, Cu, and Si (the (111) plane of Si includes the glide plane and the shuffle plane, and we only compute the glide plane).

Refer to caption — Figure 1: Schematic representation of the single-shift and triple-shift supercell models. (a)In the single-shift supercell model, the blue and red grids represent the 6-layer supercell units block, while the arrows indicate the slip directions of the grids. (b) Side and top views of the triple-shift supercell model, the yellow parallelogram represents the original position of the supercell. The blue, green, and red ones, respectively, indicate the upper, middle, and lower grids’ displacements relative to the original position along the $\langle{1}\bar{1}0\rangle$ , $\langle 01\bar{1}\rangle$ and $\langle\bar{1}01\rangle$ directions (solid lines). The dotted-dashed lines represent the actual relative displacements between adjacent blocks, which are $\langle 2\bar{1}\bar{1}\rangle$ , $\langle\bar{1}2\bar{1}\rangle$ , and $\langle\bar{1}\bar{1}2\rangle$ .

II.2 FLAPW method

We utilized ELK-7.2.42 packages. It is based on the FLAPW method, which divides the space of the crystal into two regions: muffin-tin spheres and interstitial regions. The wavefunctions are expanded using a combination of atomic orbitals in the muffin-tin region and planewave in the interstitial region. The wave functions $\psi^{\sigma}_{\nu\textbf{k}}$ for spin- $\sigma$ valence orbital of band index $\nu$ with Bloch vector k is:

\psi^{\sigma}_{\nu\textbf{k}}(\textbf{r})=\sum_{\textbf{k}+\textbf{G}}c_{\nu\textbf{k}}^{\textbf{G}\sigma}\varphi^{\sigma}_{\textbf{{G}}}(\textbf{k},\textbf{r}).

(2)

Where $c_{\nu\textbf{k}}^{\textbf{G}\sigma}$ is the expansion coefficients of the wavefunction, G are all reciprocal lattice vectors dual to the lattice vectors defining the periodic domain up to the $\textbf{G}_{\rm{max}}$ , and the basis functions $\varphi^{\sigma}_{\textbf{{G}}}(\textbf{k},\textbf{r})$ can be written as:

\varphi^{\sigma}_{\textbf{{G}}}(\textbf{k},\textbf{r})=\begin{cases}\sum_{jL\alpha}A^{\sigma\alpha\textbf{G}}_{jL}(\textbf{{k}})u^{\sigma\alpha}_{jl}(r_{\alpha})Y_{L}(\hat{\bf{r}}_{\alpha})&\rm{muffin-tin}\\ \frac{1}{\sqrt{\Omega}}\exp({i}(\textbf{k}+\textbf{G})\cdot\textbf{{r}})&\rm{interstitial}.\\ \end{cases}

(3)

Where $L$ abbreviates the quantum numbers $l$ and $m$ , $\alpha$ is the muffin-tin sphere, $Y_{L}(\hat{\bf{r}})$ is the spherical harmonics, $A^{\sigma\alpha\textbf{G}}_{jL}(\textbf{{k}})$ is the matching coefficient, $j$ is the order of the basis function, for $j=0$ and 1, $u^{\sigma\alpha}_{jl}(r_{\alpha})$ is the solution of the radial Schrödinger equation at a linearization energy and its energy derivative evaluated at this same energy. In the interstitial region, the function is the planewave. As the basis sets in different regions of the crystal are different, it results in different representations of the density and potential in the two regions:

\rho(\textbf{{r}})=\begin{cases}\sum_{L}\rho^{\alpha}_{L}({r}_{\alpha})Y_{L}({\hat{\textbf{r}}}_{\alpha})&\rm{muffin-tin}\\ \sum_{{\textbf{k}}+{\textbf{G}}}\rho^{\rm{I}}_{{({\textbf{k}}+{\textbf{G}})}}\exp({i}({\textbf{k}}+{\textbf{G}})\cdot\textbf{{r}})&\rm{interstitial};\\ \end{cases}

(4)

V(\textbf{{r}})=\begin{cases}\sum_{L}V^{\rm{MT}}_{L}Y_{L}({\hat{\textbf{r}}}_{\alpha})&\rm{muffin-tin}\\ \sum_{{\textbf{k}}+{\textbf{G}}}V^{\rm{I}}_{({\textbf{k}}+{\textbf{G}})}\exp({i}({\textbf{k}}+{\textbf{G}})\cdot\textbf{{r}})&\rm{interstitial}.\\ \end{cases}

(5)

Based on the above, an eigenvalue problem can be derived:

\hat{H}\psi^{\sigma}_{\nu\textbf{k}}(\textbf{r})=\varepsilon_{\nu\textbf{k}}\psi^{\sigma}_{\nu\textbf{k}}(\textbf{r})

(6)

where Hamiltonian $\hat{H}$ is a sum of corresponding terms:

\hat{H}=\hat{T}_{0}+\hat{V}_{\rm ext}+\hat{V}_{\rm H}+\hat{V}_{\rm xc}.

(7)

${T}_{0}$ is kinetic energy, ${V}_{\rm ext}$ is external-potential, ${V}_{\rm H}$ is Hartree potential, ${V}_{\rm xc}$ is exchange-correlation potential. Of which, ${V}_{\rm xc}$ and ${V}_{\rm H}$ are the local potentials and explicitly density dependent, and the charge density depends on the solution of the Eq.(6), leading to the nonlinearity of the Kohn-Sham equation. One need to resort to a fixed-point iteration scheme:

\displaystyle\textbf{n}_{m+1}=\textbf{Q}(\textbf{n}_{m})

(8)

The residual $\textbf{f}(\textbf{n}_{m})$ is defined as: $\textbf{f}(\textbf{n}_{m})=\textbf{Q}({\textbf{n}_{m}})-\textbf{n}_{m}.$ Based on this idea, the simple mixing scheme for the vector $\textbf{n}_{m}$ (represents charge density $\rho_{m}$ or potential ${v}_{m}$ , in this work, we employ potential mixing) can be derived:

\displaystyle\textbf{n}_{m+1}=\textbf{n}_{m}+P\textbf{f}(\textbf{n}_{m}),

(9)

where $P$ is the preconditioner. Through SCF iterations and continuously updating the charge density and potential until the desired convergence accuracy is achieved, the ground state-energy can be obtained. From this, one can further derive various physical properties.

II.3 GSFE curve calculation test

We employed the FLAPW combined with the local orbital basis set in ELK, the muffin-tin radii were set to $R_{\rm MT}^{\rm Al}=2$ bohr, $R_{\rm MT}^{\rm Cu}=2.2$ bohr, and $R_{\rm MT}^{\rm Si}=2$ bohr for Al, Cu, and Si, respectively. The cut-off radii for these systems were all set to $R^{\rm MT}_{\rm mix}K_{\rm max}=10$ . The Brillouin zone sampling was set as $11\times 11\times 1$ for k-points. We adopt the generalized gradient approximation in the Perdew–Burke–Ernzerhof parametrization for the exchange-correlation functional Perdew et al. (1998). We employed the original Pulay mixing scheme Johnson (1988), and set the convergence criterion as energy between two consecutive iteration steps being smaller than $10^{-6}$ Hartree. All atoms relax along the z-axis direction during the calculation, and atomic forces were converged to less than $0.001$ Hatree/bohr.

To calculate the GSFE curve in the $\langle\bar{1}\bar{1}2\rangle$ direction, we took 18 equidistant points along the unit $\langle\bar{1}\bar{1}2\rangle$ direction, and adopted the 1/18 $\langle\bar{1}\bar{1}2\rangle$ point as an example to show the convergence of the single-shift and triple-shift supercell models calculations in Fig. 2. It is shown that only the single-shift supercell model of Al converges in 50 steps; others do not converge in 50 steps. Obviously, calculating the GSFE curve under current scheme is inefficient.

III Posterior Indicators and the Adaptive Preconditioning scheme

Previous studies have reported similar convergence problems, i.e., the long-wavelength divergence behavior of the Jacobian Tassone et al. (1994); Kresse and Furthmüller (1996a). We want to identify the long-wavelength divergence behavior of the Jacobian in computations and accelerate convergence. In this section, we first present the derivation of a posteriori indicator to monitor the eigenvalue behavior of the Jacobian. Then, we briefly describe the implementation of Kerker preconditioning under the FLAPW basis. And finally, we combine them to obtain an adaptive preconditioning scheme.

III.1 A Posterior Indicator

Firstly, we define the Jacobian as:

J=-\frac{d\textbf{f}(\textbf{n}_{m})}{d\textbf{n}_{m}}.

(10)

The residual propagation of Eq.(9) is given by:

\textbf{f}(\textbf{n}_{m+1})\approx(I-JP)\textbf{f}(\textbf{n}_{m}).

(11)

The convergence condition can be written as:

||I-JP||<1.

(12)

If there are small eigenvalues in $(JP)^{-1}$ , the conditions of Eq.(12) are difficult to satisfy, which implies the long-wavelength divergence behavior of the Jacobian. But it is hard to calculate the eigenvalues of $(JP)^{-1}$ . Practically, we can use the subspace of Pulay scheme to approximate $(JP)^{-1}$ .

In the Broyden’s second method, a sequence of low-rank modifications are made to modify the initial guess of the inverse Jacobian, the recursive formula can be derived from the following constrained optimization problem Marks and Luke (2008); Fang and Saad (2009):

	$\displaystyle{\rm min}_{H}\quad\frac{1}{2}\|\|H-H_{m-1}\|\|^{2}_{F}$		(13)
	$\displaystyle{\rm such\quad that}\quad H_{m}F_{m-1}=-N_{m-1}$		(13)

where $H_{m-1}$ is the approximation to the inverse Jacobian in the $(m-1)$ th Broyden update. $N_{m-1}$ and $F_{m-1}$ are:

\displaystyle N_{m-1}=(\delta\textbf{n}_{m-1},\dots,\delta\textbf{n}_{m-\ell+1});

(14)

\displaystyle F_{m-1}=(\delta\textbf{f}_{m-1},\dots,\delta\textbf{f}_{m-\ell+1});

(15)

where $\ell$ is the size of the subspace. Eq.(13) has an analytical solution as Zhou et al. (2018):

\small H_{m}=H_{m-1}-(N_{m-1}+H_{m-1}F_{m-1})(F_{m-1}^{T}F_{m-1})^{-1}F^{T}_{m-1}.\quad

(16)

And we can replace the $H_{m-1}$ in Eq.(16) to the initial guess $H_{1}$ as the inverse of Jacobian:

\small H_{m}=H_{1}-(N_{m-1}+H_{1}F_{m-1})(F_{m-1}^{T}F_{m-1})^{-1}F^{T}_{m-1}.\quad

(17)

Then we follow the quasi Newton approach to solve the fixed-point iteration, this leads to the Pulay mixing scheme, which is also the method used in Elk codes:

$\displaystyle\textbf{n}_{m+1}$	$\displaystyle=\textbf{n}_{m}+H_{m}\textbf{f}(\textbf{n}_{m})$	(18)
	$\displaystyle=\textbf{n}_{m}+H_{1}\textbf{f}(\textbf{n}_{m})-(N_{m-1}+H_{1}F_{m-1})$
	$\displaystyle\quad\times(F_{m-1}^{T}F_{m-1})^{-1}F^{T}_{m-1}\textbf{f}(\textbf{n}_{m}),$

The $H_{m}$ of Eq.(17) satisfies the constraint condition:

H_{m}F_{m-1}=-N_{m-1}

(19)

and for the Jacobian $J$ , it also satisfies:

J^{-1}F_{m-1}\approx-N_{m-1}.

(20)

From Eq.(19) and Eq.(20), we can calculate the eigenvalues of $P^{-1}H_{m}$ replacing the eigenvalues of $(JP)^{-1}$ , because $H_{m}$ provides an effective approximation to $J$ within the subspace defined by $F_{m}$ . The eigenvalues of $P^{-1}H_{m}$ can be obtained by solving a generalized eigenvalue problem:

F_{m-1}^{T}H_{m}F_{m-1}\textbf{u}_{i}=\lambda_{i}F_{m-1}^{T}PF_{m-1}\textbf{u}_{i},

(21)

Eq.(21) can be shift as:

F_{m-1}^{T}(H_{m}-P)F_{m-1}\textbf{u}_{i}=(\lambda_{i}-1)F_{m-1}^{T}PF_{m-1}\textbf{u}_{i}.\quad

(22)

We only need to calculate the eigenvalues of the left-hand side of the Eq.(22), select the smallest eigenvalue as a posterior indicator. When the indicator is too small, it signifies the long-wavelength divergence behavior of the Jacobian. Note that solving Eq.(22) requires a little computational overhead because $(H_{m}-P)F_{m-1}$ has been calculated in Pulay’s update:

(H_{m}-P)F_{m-1}=-(N_{m-1}+H_{1}F_{m-1})

(23)

and the size of the subspace $\ell$ is generally smaller than 50. The details of relevant derivation can be found in Ref Zhou et al. (2018).

In the general scheme (also in our tests), $H_{1}$ of Eq.(17) is set to $\alpha I$ , where $\alpha$ is a scalar parameter usually set to 0.4. The indicators of Al, Cu, and Si systems for the single-shift and triple-shift supercell models tests are shown in Fig. 3. We found that they all have small indicators during the SCF iterations. It means that the convergence condition of Eq.(12) cannot be satisfied, and the residual may be magnified in the corresponding step, leading to the SCF iteration is hard to converge.

III.2 Kerker preconditioning in the FLAPW method

When we find the long-wavelength divergence behavior of the Jacobian through a posterior indicator, an effective method to suppress the long-wavelength divergence behavior of the Jacobian is the Kerker preconditioner. The Kerker preconditioner is based on the Thomas-Fermi screening model. The dielectric function of the Thomas-Fermi screening model in reciprocal space can be expressed as:

\varepsilon(\textbf{G})=\frac{\textbf{G}^{2}+k_{\rm TF}^{2}}{\textbf{G}^{2}},

(24)

where $k_{\rm TF}$ is the Thomas-Fermi vector Zhou et al. (2018). The Kerker preconditioner can be derived by inverting the dielectric matrix Kresse and Furthmüller (1996b); Lin and Yang (2013), and taking the Kerker preconditioner into simple mixing scheme can be expressed as:

\displaystyle\textbf{n}_{m+1}(\textbf{G})=\textbf{n}_{m}(\textbf{G})+\alpha\frac{\textbf{G}^{2}}{\textbf{G}^{2}+k_{\rm TF}^{2}}\textbf{f}\left(\textbf{n}_{m}\left(\textbf{G}\right)\right).

(25)

In the FLAPW method, due to the violent oscillation of the charge density near the nucleus, the Fourier transform is hard, so the Kerker preconditioner must be implemented in real space Wimmer et al. (1981):

\displaystyle\textbf{n}_{m+1}({\bf r})=\textbf{n}_{m}({\bf r})+\alpha[1+\lambda^{2}(\nabla^{2}-\lambda^{2})^{-1}]\textbf{f}\left(\textbf{n}_{m}\left(\textbf{r}\right)\right).

(26)

We replace the $k_{\rm TF}$ by $\lambda$ , because Thomas-Fermi wave number of the homogeneous electron gas and the electronic properties of the actual system are hard to equal Winkelmann et al. (2020). Then, the key of the problem is the treatment of $(\nabla^{2}-\lambda^{2})^{-1}$ , which can be related to the solution of screened Poisson’s equation Lin and Yang (2013); Hinzen et al. (2020):

\displaystyle(\nabla^{2}-\lambda^{2})V(\textbf{r})=-4\pi\rho(\textbf{r}).

(27)

For the potential mixing, where ${\rho}(\textbf{r})=-{v}_{m}(\textbf{r})/4\pi$ . In order to solving the screened Poisson’s equation in the FLAPW basis, we need to divide the solution into the muffin-tin and the interstitial regions, and firstly solve the screened Poisson’s equation for the interstitial regions in reciprocal space, the solution of interstitial regions provides the boundary conditions for muffin-tin regions. Then, the solution of the muffin-tin regions can be solved by Green’s function method. This is based on the pseudo-charge method proposed by Weinert Weinert (1981).

For the solution of the interstitial region, we constructed the pseudo-charge $\bar{{\rho}}(\textbf{r})$ that can be expanded in terms of the finite set of G-vectors. The charge density in the interstitial region is extended to the entire space, and the charge density in the muffin-tin region is the original charge density subtracted from the extended interstitial charge density. We now replace the real muffin-tin charge density with a smooth density $\tilde{{\rho}}^{\alpha}(\textbf{r})$ :

\bar{{\rho}}(\textbf{r})={\rho}_{\rm I}(\textbf{r})+\sum_{\alpha}\tilde{{\rho}}^{\alpha}(\textbf{r}).

(28)

Concurrently, the pseudo-charge must preserve equivalent multipole moments (the same as the real charge distribution) within the muffin-tin region. This implies that the multipole moments of smooth density $\tilde{q}^{\alpha}_{lm}$ are determined by the difference between the muffin-tin and interstitial multipole moments:

\displaystyle\tilde{q}^{\alpha}_{lm}=q^{\alpha,\rm{MT}}_{lm}-q_{lm}^{\rm I}.

(29)

Where the multipole moments for the muffin-tin $q^{\alpha,\rm{MT}}_{lm}$ and interstitial regions $q_{lm}^{\rm I}$ can be expressed as:

\displaystyle q^{\alpha,\rm{MT}}_{lm}=\frac{{(2l+1)!!}}{\lambda^{l}}\int_{S_{\alpha}}Y_{lm}^{*}({\hat{\bf r}})i_{l}(\lambda r){\rho}{({\bf r})}d{\bf r};

(30)

	$\displaystyle q_{lm}^{\rm I}=$	$\displaystyle\frac{(2l+1)!!}{\lambda^{l}}4\pi{\rm i}^{l}\sum_{\bf G}\rho^{\rm I}({\bf G})\exp({\rm i}{\bf G}\cdot{\bf r}_{\alpha})Y_{lm}^{*}(\hat{\bf G})$		(31)
		$\displaystyle\times\int_{0}^{R_{\alpha}}i_{l}(\lambda r_{\alpha})j_{l}(Gr_{\alpha})r^{2}_{\alpha}dr_{\alpha};$		(31)

$i_{l}(r)$ and $k_{l}(r)$ are the modified spherical Bessel function of the first and second kind. Then the pseudo-charge density in reciprocal space can be written as:

	$\displaystyle\bar{\rho}{({\bf G})}=$	$\displaystyle{\rho}^{I}{({\bf G})}+\frac{4\pi}{\Omega}\exp({{\rm i}{\bf{G}\cdot\bf{R}}^{\alpha}})\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{{(-{\rm i})^{l}}}{(2l+1)!!}$		(32)
		$\displaystyle\times\frac{\lambda^{l+n+1}j_{l+n+1}(GR_{\alpha})}{i_{l+n+1}(\lambda R_{\alpha})G^{n+1}}Y_{lm}^{*}({\hat{\bf G}})\tilde{q}^{\alpha}_{lm}.$		(32)

The pseudo-charge has a more rapidly converging Fourier expansion than the original charge density, and the $V_{I}({\bf G})$ in the interstitial can be get:

\displaystyle V_{I}({\bf G})=\frac{4\pi}{G^{2}+\lambda}\bar{\rho}{({\bf G})}.

(33)

The potential of the interstitial region provides the value at the boundary of the muffin-tin sphere. The solution of potential within the muffin-tin sphere is a Dirichlet boundary-value problem, which can be solved by the Green’s function method Jackson (1999):

$\displaystyle V_{\rm MT}(\textbf{r})$	$\displaystyle=\int_{{\rm MT}_{\alpha}}G({\bf r},{\bf r}^{\prime})\rho({\bf r}^{\prime})d{\bf r}^{\prime}-\frac{R_{\alpha}^{2}}{4\pi}\oint_{S_{\alpha}}V_{I}({R_{\alpha}})\frac{\partial G}{\partial n^{\prime}}d\Omega^{\prime}$	(34)
	$\displaystyle={4\pi\lambda}[k_{l}(\lambda r)\int_{0}^{r}\rho_{lm}(r^{\prime})i_{l}(\lambda r^{\prime}){r^{\prime}}^{2}dr^{\prime}$
	$\displaystyle+i_{l}(\lambda r)\int_{r}^{R_{\alpha}}{r^{\prime}}^{2}{\rho_{lm}(r^{\prime})}{k_{l}(\lambda r^{\prime})}dr^{\prime}$
	$\displaystyle-i_{l}(\lambda r)\frac{k_{l}(\lambda R_{\alpha})}{i_{l}(\lambda R_{\alpha})}\int_{0}^{R_{\alpha}}\rho_{lm}(r^{\prime})i_{l}(\lambda r^{\prime})r^{\prime 2}dr^{\prime}]$
	$\displaystyle+V^{I}_{lm}(R_{\alpha})\left[\frac{i_{l}(\lambda r_{\alpha})}{i_{l}(\lambda R_{\alpha})}\right].$

One characteristic of Kerker preconditioner is that its convergence is not affected by an increase in the number of atoms on the long-axis for metal systems. Based on the implementation of the Kerker preconditioner, we tested the convergence for the triple-shift supercell model of Al, Cu with different thicknesses on the z-axis. We set $\lambda_{\rm Al}=0.6$ , $\lambda_{\rm Cu}=0.8$ Winkelmann et al. (2020) , the convergence results are shown in Fig. 4. We found that the convergence speed is significantly improved by using the Kerker preconditioner. At the same time, we found that the step of convergence almost unchanged for different atomic layer thicknesses. Obviously, the Kerker preconditioner successfully suppressed the long-wavelength divergence behavior of the Jacobian.

III.3 The Adaptive Preconditioning scheme

In the SCF iteration, we establish an initiation criterion for the Kerker preconditioner by setting a threshold of less than 0.2 for a posterior indicator. This leads to an adaptive preconditioning algorithm. The algorithm process is shown in Algorithm. 1.

Algorithm 1 An adaptive preconditioning algorithm

1:Input:

\textbf{n}_{0}

\alpha

\lambda

, tol;

2:repeat:

m=1,2,3,4\dots

\textbf{n}_{m+1}=\textbf{n}_{m}+H_{m}\textbf{f}(\textbf{n}_{m})

H_{1}=\alpha I

;

4: if indicator ¡ 0.2 then ;

5: Break

6: Initiate Kerker preconditioner, and reset

m=1

7: repeat:

m=1,2,3,4\dots

\textbf{n}_{m+1}=\textbf{n}_{m}+H_{m}\textbf{f}(\textbf{n}_{m})

H_{1}=\alpha(\nabla^{2}-\lambda^{2})^{-1}

;

9: until

\sqrt{\frac{||\textbf{f}||^{2}}{\Omega}}<tol

10: Output:

\textbf{n}_{m+1}

11: end if

12:until

\sqrt{\frac{||\textbf{f}||^{2}}{\Omega}}<tol

13:Output:

\textbf{n}_{m+1}

At the beginning of the program, we set $H_{1}$ to $\alpha I$ . A posterior indicator is calculated in every SCF iteration. When the indicator is less than 0.2, break the current step, and the current subspace will also be discarded. Subsequently, a new SCF iteration will be started with the Kerker preconditioner initiating, continue iteration until convergence accuracy is reached.

We tested the convergence and posterior indicators of Al, Cu, and Si using the adaptive preconditioning scheme in the two supercell models. We set $\lambda_{\rm Al}=0.6$ , $\lambda_{\rm Cu}=0.8$ and $\lambda_{\rm Si}=0.6$ . As shown in Fig. 5, small indicator appears around the 5 steps. Then the Kerker preconditioner is initiated, the indicators are approximately 1 and remains stable. It leads to a significantly improved convergence speed compared with the original algorithm. This scheme allows the system to adjust the choice of preconditioning method more reasonably without prior information.

It is worth noting that the Kerker preconditioner is aimed at metallic systems, but in out tests, it is also effective for Si, which is a semiconductor. We calculated the density of states (DOS) for the original structure, the single-shift supercell model, and the triple-shift supercell model of Si, as shown in Fig. 6. Although the original structure of Si is insulated, in the single-shift and triple-shift supercell models, planar defects in stacking-fault structures generate free electrons, giving the system metallic properties. Moreover, in the single-shift supercell model, the planar defects introduced by a vacuum layer enhanced the metallicity. Our programs identified the long-wavelength divergence behavior of the Jacobian and initiated Kerker preconditioner improving convergence in these models.

IV Evaluation of GSFE curve calculation

We first tested the effect of vacuum layers with different thicknesses on the energy for the single-shift supercell model. As shown in Fig. 7, the energies of all three materials converge in the range of $10^{-4}$ eV at 10 Å, indicating that they can prevent the interaction between the slabs and their periodic images. In addition, we also found that as the thickness of the vacuum layer increases, the convergence steps will gradually increase.

Fig. 8 shows the GSFE curves for Al, Cu, and Si of the single-shift and triple-shift supercell models.

We compared our intrinsic SFE values from other methods, and it is shown that the intrinsic SFE values of Al are between 120-166 $mJ/m^{2}$ , of Cu, they fall within 35-45 $mJ/m^{2}$ , and of Si, they are in 35-70 $mJ/m^{2}$ . Our calculated results are within this range. The results of the single-shift and triple-shift supercell models are comparable. And the shape of the GSFE curve is the same as the previous research, confirming the credibility of our computations.

Table 1: Intrinsic SFE

(mJ/m^{2})

for Al, Cu, and Si in different methods

Method	Al	Cu	Si
Sgl-shift	138	39	45
Tpl-shift	134	36	50
WIEN2KCai et al. (2014)	150	44
MEAMWei et al. (2007)	150	43
PAW	112Jin et al. (2011),142Wu et al. (2014)	36Jin et al. (2011)	47Juan and Kaxiras (1996)
EMTOLi et al. (2014)	107	47
Tight-bindingBernstein and Tadmor (2004)	99	30
EXP.	120-142Han et al. (2008),135Smallman and Dobson (1970),	35-45Thornton and Mitchell (1962),	50 $\pm$ 15Takeuchi and Suzuki (1999)
	120Rautioaho (1982),166Anderson et al. (2017)	43.5 $\pm$ 2Yamamoto et al. (1983)	60 $\pm$ 10Föll and Carter (1979)

Then, We compared the computational cost for Al, Cu, and Si in the two models, as shown in Table 2. Our CPU is Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz; every system was tested with 32 cores in parallel. The more atoms for the triple-shift supercell model produce the larger computation times than the single-shift supercell model in per electronic step, but the faster convergence results in similar computation times of the two models for SCF iteration calculations. In structural optimization, the triple-shift supercell models also have the fewer steps, leading to the less total time compared to the single-shift supercell model. Clearly, different defects yield distinct convergence behaviors, although Kerker preconditioner suppresses the long-wavelength divergence of the Jacobian, the planar defects introduced by a vacuum layer still slows convergence. In comparison, defects arising solely from stacking-faults have a milder impact than the planar defects introduced by a vacuum layer on convergence.

Table 2: Time-to-solution for Al, Cu and Si models of SFE calculations by using Kerker preconditioner

system	model	SCF-step	1-SCF(t) ¹¹1Time unit: CPU seconds	relax-step	total(t) ²²2Time unit: CPU hours
Al	Sgl-shift	30	130.80	28	16.26
	Tpl-shift	19	250.10	13	7.56
Cu	Sgl-shift	43	56.19	33	9.21
	Tpl-shift	24	116.96	11	3.98
Si	Sgl-shift	41	1240.35	27	130.69
	Tpl-shift	34	2129.94	15	70.02

V Conclusions

Different supercell models for GSFE calculations introduce different degrees of defects, which affecting the convergence of both the SCF iteration and atomic relaxation. In the perspective of accuracy of GSFE calculations, the single-shift and triple-shift supercell models are comparable. However, the latter would be much more computationally efficient than the former since the degree of defect is diminished in the triple-shift supercell. In the absense of a priori knowledge, the adaptive preconditioning scheme proposed in this work can help to identify the underlying long-wavelength divergence behavior of the Jacobian and restore the convergence of the SCF iteration. This algorithm may be also applied to some heterojunctions such as metal-insulator contacts systems.

Our scheme is implemented in the FLAPW method, and enables the Elk-7.2.42 package to work out the GSFE much more efficiently compared to the original scheme. Obviously, the construction of the a posteriori indicator is independent of the basis functions, and the corresponding computational overhead is negligible. Therefore it may be easy to implement our scheme in other first-principles codes. The ability would faciliate high-precision computations of the characteristic quantities of the microstructure, which is significant for the simulation of the microstructure evolution.

VI Declaration of Interest Statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

VII Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant Nos. 51925105, and U23A20537).

References

Moitra et al. (2014) A. Moitra, S.-G. Kim, and M. Horstemeyer, Acta materialia 75, 106 (2014).
Garg et al. (2019) P. Garg, M. Bhatia, and K. Solanki, Journal of Alloys and Compounds 788, 413 (2019).
Yan et al. (2021) J. Yan, Z. Zhang, H. Yu, K. Li, Q. Hu, J. Yang, and Z. Zhang, Journal of Alloys and Compounds 866, 158869 (2021).
Jeong et al. (2018) B. Jeong, J. Kim, T. Lee, S.-W. Kim, and S. Ryu, Scientific reports 8, 15200 (2018).
Ehrhart (1992) P. Ehrhart, Properties and interactions of atomic defects in metals and alloys, Ph.D. thesis, Verlag nicht ermittelbar (1992).
Siegel (1981) R. Siegel, Atomic defects and diffusion in metals, Tech. Rep. (Argonne National Lab.(ANL), Argonne, IL (United States), 1981).
Arrigoni and Madsen (2021) M. Arrigoni and G. K. Madsen, Computer Physics Communications 264, 107946 (2021).
Davidsson et al. (2021) J. Davidsson, V. Ivády, R. Armiento, and I. A. Abrikosov, Computer Physics Communications 269, 108091 (2021).
Hutchinson (1977) J. Hutchinson, Metallurgical Transactions A 8, 1465 (1977).
Gilman and Read (1952) J. J. Gilman and T. Read, JOM 4, 875 (1952).
Anderson et al. (2017) P. M. Anderson, J. P. Hirth, and J. Lothe, Theory of dislocations (Cambridge University Press, 2017).
Vitek (1968) V. Vitek, Philosophical Magazine 18, 773 (1968).
Meyers et al. (2001) M. Meyers, O. Vöhringer, and V. Lubarda, Acta materialia 49, 4025 (2001).
Sun et al. (2022) G. Sun, X. Feng, X. Wu, S. Zhang, and B. Wen, Journal of Materials Science & Technology 114, 215 (2022).
Denteneer and Van Haeringen (1987) P. Denteneer and W. Van Haeringen, Journal of Physics C: Solid State Physics 20, L883 (1987).
Muzyk et al. (2012) M. Muzyk, Z. Pakiela, , and K. Kurzydlowski, Scripta Materialia 66, 219 (2012).
Gholizadeh (2013) H. Gholizadeh, PhD Diss. Mont. Leoben, no. May , 193 (2013).
Gholizadeh et al. (2013) H. Gholizadeh, C. Draxl, and P. Puschnig, Acta materialia 61, 341 (2013).
Hohenberg and Kohn (1964) P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964).
Kohn and Sham (1965) W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).
Wimmer et al. (1981) E. Wimmer, H. Krakauer, M. Weinert, and A. J. Freeman, Phys. Rev. B 24, 864 (1981).
Ambrosch-Draxl and Sofo (2006) C. Ambrosch-Draxl and J. O. Sofo, Computer physics communications 175, 1 (2006).
Blaha et al. (2001) P. Blaha, K. Schwarz, G. K. Madsen, D. Kvasnicka, J. Luitz, et al., An augmented plane wave+ local orbitals program for calculating crystal properties 60 (2001).
Wortmann et al. (2023) D. Wortmann, G. Michalicek, N. Baadji, M. Betzinger, G. Bihlmayer, J. Bröder, T. Burnus, J. Enkovaara, F. Freimuth, C. Friedrich, C.-R. Gerhorst, S. Granberg Cauchi, U. Grytsiuk, A. Hanke, J.-P. Hanke, M. Heide, S. Heinze, R. Hilgers, H. Janssen, D. A. Klüppelberg, R. Kovacik, P. Kurz, M. Lezaic, G. K. H. Madsen, Y. Mokrousov, A. Neukirchen, M. Redies, S. Rost, M. Schlipf, A. Schindlmayr, M. Winkelmann, and S. Blügel, “FLEUR,” Zenodo (2023).
Gulans et al. (2014) A. Gulans, S. Kontur, C. Meisenbichler, D. Nabok, P. Pavone, S. Rigamonti, S. Sagmeister, U. Werner, and C. Draxl, Journal of Physics: Condensed Matter 26, 363202 (2014).
(26) “The Elk Code,” http://elk.sourceforge.net/.
Lejaeghere et al. (2016) K. Lejaeghere, G. Bihlmayer, T. Björkman, P. Blaha, S. Blügel, V. Blum, D. Caliste, I. E. Castelli, S. J. Clark, A. Dal Corso, et al., Science 351, aad3000 (2016).
Otero-de-la Roza et al. (2011) A. Otero-de-la Roza, D. Abbasi-Pérez, and V. Luaña, Computer Physics Communications 182, 2232 (2011).
Panda and Chandran (2006) K. Panda and K. R. Chandran, Computational Materials Science 35, 134 (2006).
Hoshino et al. (2001) T. Hoshino, T. Mizuno, M. Asato, and H. Fukushima, Materials transactions 42, 2206 (2001).
(31) M. Šob, c Masaryk University Brno, 2000 ISBN 80-210-2261-2 , 78.
Freitas and Cao (2022) R. Freitas and Y. Cao, MRS Communications 12, 510 (2022).
Dragoni et al. (2018) D. Dragoni, T. D. Daff, G. Csányi, and N. Marzari, Physical Review Materials 2, 013808 (2018).
Tassone et al. (1994) F. Tassone, F. Mauri, and R. Car, Physical Review B 50, 10561 (1994).
Kresse and Furthmüller (1996a) G. Kresse and J. Furthmüller, Physical review B 54, 11169 (1996a).
Yu et al. (2018) S. Yu, Q. Rice, B. Tabibi, Q. Li, and F. J. Seo, Nanoscale 10, 12472 (2018).
Zhou et al. (2018) Y. Zhou, H. Wang, Y. Liu, X. Gao, and H. Song, Physical Review E 97, 033305 (2018).
Resta (1977) R. Resta, Physical Review B 16, 2717 (1977).
Raczkowski et al. (2001) D. Raczkowski, A. Canning, and L. Wang, Physical Review B 64, 121101 (2001).
Winkelmann et al. (2020) M. Winkelmann, E. Di Napoli, D. Wortmann, and S. Blügel, Physical Review B 102, 195138 (2020).
Kim et al. (2020) J. Kim, A. Gulans, and C. Draxl, Electronic Structure 2, 037001 (2020).
Perdew et al. (1998) J. P. Perdew, K. Burke, and M. Ernzerhof, Physical Review Letters 80, 891 (1998).
Johnson (1988) D. D. Johnson, Physical Review B 38, 12807 (1988).
Marks and Luke (2008) L. D. Marks and D. Luke, Physical Review B 78, 075114 (2008).
Fang and Saad (2009) H.-r. Fang and Y. Saad, Numerical linear algebra with applications 16, 197 (2009).
Kresse and Furthmüller (1996b) G. Kresse and J. Furthmüller, Computational materials science 6, 15 (1996b).
Lin and Yang (2013) L. Lin and C. Yang, SIAM Journal on Scientific Computing 35, S277 (2013).
Hinzen et al. (2020) M. Hinzen, E. Di Napoli, D. Wortmann, and S. Blügel, arXiv preprint arXiv:2003.03370 (2020).
Weinert (1981) M. Weinert, Journal of Mathematical Physics 22, 2433 (1981).
Jackson (1999) J. D. Jackson, Inc., NewYork, NY , 5 (1999).
Cai et al. (2014) T. Cai, Z. Zhang, P. Zhang, J. Yang, and Z. Zhang, Journal of Applied Physics 116 (2014).
Wei et al. (2007) X.-M. Wei, J.-M. Zhang, and K.-W. Xu, Applied Surface Science 254, 1489 (2007).
Jin et al. (2011) Z. Jin, S. Dunham, H. Gleiter, H. Hahn, and P. Gumbsch, Scripta Materialia 64, 605 (2011).
Wu et al. (2014) X.-Z. Wu, L.-L. Liu, R. Wang, and Q. Liu, Chinese Physics B 23, 066104 (2014).
Juan and Kaxiras (1996) Y.-M. Juan and E. Kaxiras, Philosophical Magazine A 74, 1367 (1996).
Li et al. (2014) W. Li, S. Lu, Q.-M. Hu, S. K. Kwon, B. Johansson, and L. Vitos, Journal of Physics: Condensed Matter 26, 265005 (2014).
Bernstein and Tadmor (2004) N. Bernstein and E. Tadmor, Physical Review B 69, 094116 (2004).
Han et al. (2008) W. Han, G. Cheng, S. Li, S. Wu, and Z. Zhang, Physical review letters 101, 115505 (2008).
Smallman and Dobson (1970) R. Smallman and P. Dobson, Metallurgical Transactions 1, 2383 (1970).
Thornton and Mitchell (1962) P. Thornton and T. Mitchell, Philosophical Magazine 7, 361 (1962).
Takeuchi and Suzuki (1999) S. Takeuchi and K. Suzuki, physica status solidi (a) 171, 99 (1999).
Rautioaho (1982) R. Rautioaho, physica status solidi (b) 112, 391 (1982).
Yamamoto et al. (1983) A. Yamamoto, N. Narita, J. ichi Takamura, H. Sakamoto, and N. Matsuo, Journal of the Japan Institute of Metals 47, 903 (1983).
Föll and Carter (1979) H. Föll and C. Carter, Philosophical Magazine A 40, 497 (1979).