
Machine learning topological invariants of non-Hermitian systems

Ling-Feng Zhang Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China    Ling-Zhi Tang Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China    Zhi-Hao Huang Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China    Guo-Qing Zhang [email protected] Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China Guangdong-Hong Kong Joint Laboratory of Quantum Matter, Frontier Research Institute for Physics, South China Normal University, Guangzhou 510006, China    Wei Huang Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China    Dan-Wei Zhang [email protected] Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China Guangdong-Hong Kong Joint Laboratory of Quantum Matter, Frontier Research Institute for Physics, South China Normal University, Guangzhou 510006, China
Abstract

The study of topological properties by machine learning approaches has attracted considerable interest recently. Here we propose machine learning of the topological invariants that are unique to non-Hermitian systems. Specifically, we train neural networks to predict the winding of eigenvalues of four prototypical non-Hermitian Hamiltonians on the complex energy plane with nearly $100\%$ accuracy. Our demonstrations in the non-Hermitian Hatano-Nelson model, Su-Schrieffer-Heeger model, and generalized Aubry-André-Harper model in one dimension, and in the two-dimensional Dirac fermion model with non-Hermitian terms, show the capability of the neural networks in exploring topological invariants and the associated topological phase transitions and topological phase diagrams in non-Hermitian systems. Moreover, the neural networks trained on a small data set in the phase diagram can successfully predict topological invariants in untouched phase regions. Thus, our work paves the way to revealing non-Hermitian topology with the machine learning toolbox.

I Introduction

Machine learning, which lies at the core of artificial intelligence and data science, has recently achieved huge success in applications ranging from industry (especially computer vision and natural language processing) to fundamental research in physics, cheminformatics, and biology Jordan and Mitchell (2015); LeCun et al. (2015); Goodfellow et al. (2016); Carleo et al. (2019). In physics, machine learning has proven useful in experimental data analysis Biswas et al. (2013); Rem et al. (2019); Kasieczka et al. (2019) and in the classification of phases of matter Wang (2016); Carrasquilla and Melko (2017); Zhang and Kim (2017); Deng et al. (2017); Huembeli et al. (2019); Dong et al. (2019); Van Nieuwenburg et al. (2017); Carvalho et al. (2018); Zhang et al. (2018a); Sun et al. (2018); Huembeli et al. (2018); Tsai et al. (2020); Ming et al. (2019); Rodriguez-Nieva and Scheurer (2019); Holanda and Griffith (2020); Ohtsuki and Mano (2020). Among these applications, one of the most interesting problems is to extract the global properties of topological phases of matter from local inputs, such as topological invariants, which are intrinsically nonlocal. Recent works have shown that artificial neural networks can be trained to predict the topological invariants of band insulators with high accuracy Zhang et al. (2018a); Sun et al. (2018). The advantage of this approach is that the neural network can capture global topology directly from local raw data inputs. Other theoretical proposals for identifying topological phases by supervised or unsupervised learning have also been suggested Carvalho et al. (2018); Huembeli et al. (2018); Rodriguez-Nieva and Scheurer (2019); Holanda and Griffith (2020); Ohtsuki and Mano (2020); Zhang et al. (2020a); Long et al. (2020); Scheurer and Slager (2020); Balabanov and Granath (2020, 2021). Notably, a convolutional neural network (CNN) trained on raw experimental data has been demonstrated to identify topological phases Rem et al. (2019); Lian et al. (2019).

On the other hand, growing efforts have been invested in uncovering exotic topological states and phenomena in non-Hermitian systems in recent years Diehl et al. (2008); Malzard et al. (2015); Lee (2016); Yao and Wang (2018); Yao et al. (2018); Song et al. (2019); Kunst et al. (2018); Takata and Notomi (2018); Wang et al. (2019); Zeng et al. (2017); Lang et al. (2018); Hamazaki et al. (2019); Jin and Song (2019); Kawabata et al. (2019a); Liu et al. (2019); Lee et al. (2019); Yamamoto et al. (2019); Hatano and Nelson (1996, 1997); Gong et al. (2018); Ghatak and Das (2019); Leykam et al. (2017); Shen et al. (2018); Zhang et al. (2018b, 2020b, 2020c); Luo and Zhang ; Jiang et al. (2019); Longhi (2019); Liu et al. (2020a); Wu and An (2020); Zeng et al. (2020); Zeng and Xu (2020); Tang et al. (2020); Liu et al. (2020b); Zhang et al. (2020d); Xu and Chen (2020); Liu et al. (2020c); Xi et al. ; Lee et al. (2020); Yoshida et al. (2019). The non-Hermiticity may come from gain and loss effects Zeng et al. (2017); Lang et al. (2018); Takata and Notomi (2018); Wang et al. (2019); Hamazaki et al. (2019), non-reciprocal hoppings Hatano and Nelson (1996, 1997), or dissipation in open systems Diehl et al. (2008); Malzard et al. (2015). Non-Hermiticity-induced topological phases have also been investigated in disordered Zhang et al. (2020c); Luo and Zhang ; Jiang et al. (2019); Longhi (2019); Liu et al. (2020a); Wu and An (2020); Zeng et al. (2020); Zeng and Xu (2020); Tang et al. (2020); Liu et al. (2020b) and interacting systems Zhang et al. (2020d); Xu and Chen (2020); Liu et al. (2020c); Xi et al. ; Lee et al. (2020); Yoshida et al. (2019). In non-Hermitian topological systems, there are not only topological properties defined by the eigenstates (such as topological Bloch bands), but also topological invariants defined solely by the eigenenergies. For instance, complex energy landscapes and exceptional points give rise to different topological invariants, including the winding number (vorticity) defined solely in the complex energy plane Gong et al. (2018); Ghatak and Das (2019); Leykam et al. (2017); Shen et al. (2018). This winding number and several closely related winding numbers in the presence of symmetries can lead to a richer topological classification than that of their Hermitian counterparts. In addition, it was revealed Borgnia et al. (2020); Okuma et al. (2020); Zhang et al. (2020e) that a nonzero winding number in the complex energy plane is the topological origin of the so-called non-Hermitian skin effect Lee (2016); Yao and Wang (2018); Yao et al. (2018); Song et al. (2019); Kunst et al. (2018). Considering that topological invariants in Hermitian systems have recently been studied with machine learning approaches Carvalho et al. (2018); Zhang et al. (2018a); Sun et al. (2018); Huembeli et al. (2018); Rodriguez-Nieva and Scheurer (2019); Holanda and Griffith (2020); Ohtsuki and Mano (2020); Zhang et al. (2020a); Long et al. (2020); Scheurer and Slager (2020), whether machine learning can capture this different kind of winding number in non-Hermitian systems is an urgent and meaningful question.

In this work, we apply machine learning with neural networks to predict non-Hermitian topological invariants and classify the topological phases of several prototypical non-Hermitian models in one and two dimensions. We first take the Hatano-Nelson model Hatano and Nelson (1996, 1997) as a feasibility test of machine learning in identifying non-Hermitian topological phases. We show that the trained CNN can predict the winding numbers of eigenenergies with high accuracy even for phases not included in the training, whereas the fully connected neural network (FCNN) can only predict those within the trained phases. We interpret the intermediate values of the CNN and find a strong relationship with the winding angle of the eigenenergies in the complex plane. We then use the CNN to study topological phase transitions in a non-Hermitian Su-Schrieffer-Heeger (SSH) model Su et al. (1979) with non-reciprocal hopping. We find that the CNN can precisely detect the transition points near the phase boundaries even though it is trained only with data from the deep phase regions. Using the CNN, we further obtain the topological phase diagram of a non-Hermitian generalized Aubry-André-Harper (AAH) model Harper (1955); Aubry and Andre (1980); Liu et al. (2015) with non-reciprocal hopping and a complex quasiperiodic potential. The winding numbers evaluated by the CNN agree with the theoretical values with an accuracy of more than 99% over the whole parameter space, even though the complex on-site potential is absent in the training process. Finally, we extend our scenario to a two-dimensional non-Hermitian Dirac fermion model Shen et al. (2018) and show the feasibility of neural networks in revealing the winding numbers associated with exceptional points. Our work may provide an efficient and general approach to revealing non-Hermitian topology based on the machine learning toolbox.

The rest of this paper is organized as follows. We first study the winding number of the Hatano-Nelson model as a feasibility verification of our machine learning method in Sec. II. Different performances of the CNN and the FCNN are also discussed. Section III is devoted to revealing the topological phase transition in the non-Hermitian SSH model by the CNN. In Sec. IV, we show that the CNN can precisely predict the topological phase diagram of the non-Hermitian generalized AAH model. In Sec. V, we extend our scenario to reveal the winding numbers associated with exceptional points in a two-dimensional non-Hermitian Dirac fermion model. A further discussion and short summary are finally presented in Sec. VI.

II Learning topological invariants in Hatano-Nelson model

Let us begin with the Hatano-Nelson model, which is a prototypical single-band non-Hermitian model and takes the following Hamiltonian in a one-dimensional lattice of length $L$ Hatano and Nelson (1996, 1997):

H_{1}=\sum_{j}^{L}(t_{r}\hat{c}^{\dagger}_{j+\mu}\hat{c}_{j}+t_{l}\hat{c}^{\dagger}_{j}\hat{c}_{j+\mu}+V_{j}\hat{c}^{\dagger}_{j}\hat{c}_{j}). (1)

Here $t_{l}$ and $t_{r}$ (with $t_{l}\neq t_{r}^{*}$) are the amplitudes of the non-reciprocal hopping, $\hat{c}^{\dagger}_{j}$ ($\hat{c}_{j}$) is the creation (annihilation) operator at the $j$-th lattice site, $\mu$ denotes the hopping length between two sites, and $V_{j}$ is the on-site energy in the lattice. The original Hatano-Nelson model takes a disorder potential with random $V_{j}$ and nearest-neighbor hopping with $\mu=1$, as shown in Fig. 1(a). Here we consider the clean case by setting $V_{j}=0$ and take $\mu$ as a parameter in learning the topological phase transition with neural networks. Under the periodic boundary condition, the corresponding eigenenergies in this case are given by

E_{1}(k)=\mathcal{H}_{1}(k)=t_{r}e^{-i\mu k}+t_{l}e^{i\mu k}, (2)

where $\mathcal{H}_{1}(k)$ is the Hamiltonian in momentum space, with the quasimomentum $k=0,2\pi/L,4\pi/L,\cdots,2\pi$.

Figure 1: (Color online) (a) The Hatano-Nelson model with non-reciprocal hopping between two nearest-neighbor sites ($\mu=1$). (b) The complex eigenenergy draws a closed loop around the base energy $E_{B}=0$ as the quasimomentum $k$ varies from 0 to $2\pi$, giving rise to the winding number $w=\pm 1$ for counterclockwise and clockwise windings, respectively.

Following Ref. Gong et al. (2018), we can define the winding number in the complex energy plane as a topological invariant of the Hatano-Nelson model,

w=\int_{0}^{2\pi}\frac{dk}{2\pi i}\partial_{k}\ln\det\mathcal{H}_{1}(k)=\int_{0}^{2\pi}\frac{dk}{2\pi}\partial_{k}\arg E_{1}(k)=\left\{\begin{array}{ll}\mu,&|t_{r}|<|t_{l}|,\\ -\mu,&|t_{r}|>|t_{l}|,\end{array}\right. (5)

where $\arg$ denotes the principal value of the argument, belonging to $[0,2\pi)$. For a discretized $E_{1}(k)$ with finite lattice size $L$, the complex-energy winding number reduces to

w=\frac{1}{2\pi}\sum_{n=1}^{L}\Delta\theta(n)=\frac{1}{2\pi}\sum_{n=1}^{L}[\theta(n)-\theta(n-1)], (6)

where $\theta(n)=\arg E_{1}(2\pi n/L)$. Note that for Hermitian systems ($t_{r}=t_{l}^{*}$), one has $w=0$ due to the real energy spectrum with $\arg E_{1}(k)=0,\pi$. According to this definition, a nontrivial winding number gives the number of times the complex eigenenergy encircles the base point $E_{B}=0$, which is unique to non-Hermitian systems. The complex eigenenergy windings for two typical cases with $w=\pm 1$ are shown in Fig. 1(b). To examine whether the neural networks can learn the winding number in a general formalism, we use the parameter $\mu$ to control the number of times the complex eigenenergy encircles the origin of the complex plane. When the loop winds around the origin $\mu$ times as $k$ varies from 0 to $2\pi$, the winding number is $\pm\mu$, where $\pm$ indicates counterclockwise and clockwise windings, respectively.
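As a concrete illustration of Eq. (6), the winding number of a discretized complex spectrum can be evaluated numerically. The following Python sketch is not the code used in this work, and the parameter values in the usage example are illustrative.

```python
import numpy as np

def winding_number(E):
    """Winding number of a discretized complex spectrum E(2*pi*n/L),
    n = 0, ..., L with E[0] = E[L], following Eq. (6)."""
    dtheta = np.diff(np.angle(E))                  # theta(n) - theta(n-1)
    # fold each increment into [-pi, pi) so the sum counts net loops
    dtheta = (dtheta + np.pi) % (2 * np.pi) - np.pi
    return dtheta.sum() / (2 * np.pi)

# Hatano-Nelson spectrum of Eq. (2) with illustrative parameters
L, mu, t_r, t_l = 32, 2, 0.5, 1.0
k = 2 * np.pi * np.arange(L + 1) / L
E1 = t_r * np.exp(-1j * mu * k) + t_l * np.exp(1j * mu * k)
print(winding_number(E1))                          # ~ +2 since |t_l| > |t_r|
```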

Figure 2: (Color online) Schematic of the machine learning workflow and the structure of the neural networks for the Hatano-Nelson (HN) model, non-Hermitian SSH (NHSSH) model, and non-Hermitian generalized AAH (NHGAAH) model. The input data are represented by an $(L+1)\times 2$-dimensional matrix for the CNN and a $2\times(L+1)$-dimensional vector for the FCNN, respectively. Here $\mathbf{d}_{R}$ and $\mathbf{d}_{I}$ denote the real and imaginary parts of the input data (complex eigenenergies), respectively.

We now build a supervised task for learning the winding number given by Eq. (6) based on neural networks. First, we need labeled data sets for training and evaluation. Since the winding number is intrinsically nonlocal and characterized by the complex energy spectrum, we feed the neural networks with normalized spectrum-dependent configurations $\mathbf{d}(n)=[\mathbf{d}_{R}(n),\mathbf{d}_{I}(n)]$ at $L$ points discretized uniformly from 0 to $2\pi$, where $\mathbf{d}_{R}(n)=\mathrm{Re}[E_{1}(2\pi n/L)]$ and $\mathbf{d}_{I}(n)=\mathrm{Im}[E_{1}(2\pi n/L)]$. Therefore, the input data are an $(L+1)\times 2$-dimensional matrix of the form

\left[\begin{array}{ccccc}\mathbf{d}_{R}(0)&\mathbf{d}_{R}(2\pi/L)&\mathbf{d}_{R}(4\pi/L)&\cdots&\mathbf{d}_{R}(2\pi)\\ \mathbf{d}_{I}(0)&\mathbf{d}_{I}(2\pi/L)&\mathbf{d}_{I}(4\pi/L)&\cdots&\mathbf{d}_{I}(2\pi)\end{array}\right]^{T},

with a period of $2\pi$: $\mathbf{d}(k)=\mathbf{d}(k+2\pi)$. In the following, we set $L=32$, which is large enough for the discrete energy spectra to serve as input data for the neural networks. Labels are computed according to Eq. (6) for the corresponding configurations.
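A configuration of this form can be assembled as in the following sketch; the normalization scheme (dividing by the maximum modulus) is our assumption, since any scheme that keeps the data in a fixed range on the complex plane serves the purpose.

```python
import numpy as np

def hn_configuration(t_r, t_l, mu, L=32):
    """Build the (L+1) x 2 input matrix [d_R(n), d_I(n)] from the
    Hatano-Nelson spectrum of Eq. (2); max-modulus normalization assumed."""
    k = 2 * np.pi * np.arange(L + 1) / L
    E1 = t_r * np.exp(-1j * mu * k) + t_l * np.exp(1j * mu * k)
    E1 = E1 / np.abs(E1).max()                     # normalize the spectrum
    return np.stack([E1.real, E1.imag], axis=1)    # shape (L+1, 2)
```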

The machine learning workflow is shown schematically in Fig. 2. For the Hatano-Nelson model with different $\mu$, the output of the neural network is a real number $\tilde{w}$, and the predicted winding number is interpreted as the integer closest to $\tilde{w}$. We first train the neural networks with both complex spectrum configurations and their corresponding true winding numbers. After the training, we feed only the complex-spectrum-dependent configurations to the neural networks and compare their predictions with the true winding numbers, from which we determine the percentage of correct predictions as the accuracy. In this case, we consider two typical classes of neural networks: the CNN and the FCNN. The neural networks are similar to those in Ref. Zhang et al. (2018a) for calculating the winding number of the Bloch vectors in Hermitian topological bands.

The CNN in our training has two convolution layers, with 32 kernels of size $1\times 2\times 2$ and 1 kernel of size $32\times 1\times 1$, followed by a fully connected layer of two neurons before the output layer. The total number of trainable parameters is 262. The FCNN has two hidden layers with 32 and 2 neurons, respectively, and a total of 2213 trainable parameters. The architecture of the two classes of neural networks is shown in Fig. 2. All hidden layers have rectified linear units $f(x)=\max(0,x)$ as activation functions, and the output layer has the linear activation function $f(x)=x$. The objective function to be optimized is defined by

J_{1}=\frac{1}{N}\sum_{i=1}^{N}(\tilde{w}_{i}-w_{i})^{2}, (7)

where $\tilde{w}_{i}$ and $w_{i}$ are, respectively, the winding number of the $i$-th complex spectrum configuration predicted by the neural networks and the true winding number, and $N$ is the size of the training data set. We take $6\times 10^{4}$ training configurations, consisting of winding numbers $\{\pm 1,\pm 2,\pm 3\}$ in the ratio $1:1:1$. The test set consists of configurations with winding numbers $w\in\{\pm 1,\pm 2,\pm 3\}$ that are not included in the training set and $w\in\{\pm 4,\pm 5\}$ that are never seen by the neural networks during the training. The number of configurations for each winding number is $4\times 10^{3}$. The training details are given in Appendix A.
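For concreteness, the CNN described above can be sketched in PyTorch as follows. The layer sizes reproduce the stated parameter count (262 for $L=32$); details such as the exact ordering inside `forward` are our reading of the text rather than the authors' code.

```python
import torch
import torch.nn as nn

class WindingCNN(nn.Module):
    """Sketch of the CNN of Sec. II: two convolutions (32 kernels of size
    2x2, then 1 kernel of size 32x1x1), a 2-neuron hidden layer, and a
    linear output; 160 + 33 + 66 + 3 = 262 trainable parameters for L = 32."""
    def __init__(self, L=32):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=(2, 2))  # (L+1, 2) -> (L, 1)
        self.conv2 = nn.Conv2d(32, 1, kernel_size=(1, 1))  # mix the 32 channels
        self.fc = nn.Linear(L, 2)                          # hidden layer of 2 neurons
        self.out = nn.Linear(2, 1)                         # linear output f(x) = x

    def forward(self, x):                                  # x: (batch, 1, L+1, 2)
        a = torch.relu(self.conv1(x))
        a = torch.relu(self.conv2(a))                      # intermediate values a_n
        a = a.flatten(1)                                   # (batch, L)
        return self.out(torch.relu(self.fc(a))).squeeze(-1)
```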

Figure 3: (Color online) (a) Winding numbers predicted by the CNN on the test data sets. Different colors represent different true winding numbers, and each test set contains $4\times 10^{3}$ complex spectrum configurations. (b) Probability distribution of the predicted winding number from the CNN on the test data sets. The probability distribution for each test set (bins of the same color) sums to 1, with narrow peaks at each integer. (c) The intermediate output $a_{n}$, i.e., the activation value after the two convolutional layers, versus the corresponding exact winding angle $\Delta\theta(n)$. $10L$ points corresponding to 10 different test configurations are plotted.

After training, we test with other configurations; the predicted winding numbers $\tilde{w}$ are shown in Fig. 3(a). Note that the networks tend to produce $\tilde{w}$ close to integers, and thus we take the final winding number as the integer closest to $\tilde{w}$. In Fig. 3(b), we plot the probability distribution of $\tilde{w}$ predicted by the CNN on different test data sets. The test results of the two neural networks are presented in Table 1, which shows a very high accuracy (more than $98\%$) of the CNN and FCNN on the test data sets with winding numbers $w=\{\pm 1,\pm 2,\pm 3\}$. The CNN generally performs better than the FCNN. Surprisingly, the CNN works well even in the cases of $w=\{\pm 4,\pm 5\}$, which consist of configurations with larger winding numbers not seen during the training. In contrast, the FCNN cannot predict the true winding numbers even though it has more trainable parameters. These results indicate that the convolutional layers respect the translation symmetry of the complex spectrum in momentum space and can extract the local winding angle $\Delta\theta$ through the $2\times 2$ kernels.

To further see the advantage of the CNN, we open the black box of the neural networks and relate intermediate activation values to physical quantities, i.e., the winding angle $\Delta\theta$. Given the convolutional structure, we expect the activation value after the two convolutions to depend approximately linearly on $\Delta\theta$, with the subsequent fully connected layers performing a simple linear regression. We plot $a_{n}$ versus $\Delta\theta(n)$, with $n=1,...,L$ and $a_{n}$ being the $n$-th component of the intermediate values after the two convolution layers. As shown in Fig. 3(c), the intermediate output is approximately linear in $\Delta\theta$ within certain regions. A linear combination of these intermediate values with the correct coefficients in the following fully connected layers then easily leads to the true winding number. In this way, the CNN realizes a calculation workflow equivalent to summing the winding angles $\Delta\theta$ in Eq. (6).
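The intermediate values $a_{n}$ can be read out with a forward hook, as in this sketch built on the hypothetical `WindingCNN` above; `x` is a placeholder configuration.

```python
import torch

# Capture the activations a_n after the second convolution for comparison
# with the winding angles, as in Fig. 3(c); WindingCNN is the sketch above.
model = WindingCNN()
acts = {}
model.conv2.register_forward_hook(lambda mod, inp, out: acts.update(a=out.detach()))
x = torch.randn(1, 1, 33, 2)                 # one (L+1) x 2 configuration, L = 32
model(x)
a_n = torch.relu(acts['a']).flatten()        # the L intermediate values a_n
```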

$w$              $\pm 1$   $\pm 2$   $\pm 3$   $\pm 4$   $\pm 5$
CNN accuracy     99.8%     99.4%     98.0%     96.7%     96.0%
FCNN accuracy    99.2%     99.0%     98.5%     0.0%      0.0%
Table 1: Accuracy of the CNN and FCNN on the test data sets with winding numbers $w=\{\pm 1,\pm 2,\pm 3,\pm 4,\pm 5\}$ in the Hatano-Nelson model with $\mu=1,2,3,4,5$. The winding numbers $w=\{\pm 4,\pm 5\}$ are not seen by the neural networks during the training.

III Learning topological transition in non-Hermitian SSH model

Based on the accurate winding numbers calculated by the CNN, we further use a similar CNN to study topological phase transitions in the non-Hermitian SSH model, shown in Fig. 4(a). This model, with nonreciprocal intra-cell hopping in a one-dimensional dimerized lattice of $L$ unit cells, is described by the following Hamiltonian:

H_{2}=\sum_{n=1}^{L}[(t-\delta)\hat{a}^{\dagger}_{n}\hat{b}_{n}+(t+\delta)\hat{b}^{\dagger}_{n}\hat{a}_{n}+t^{\prime}\hat{a}^{\dagger}_{n+1}\hat{b}_{n}+t^{\prime}\hat{b}^{\dagger}_{n}\hat{a}_{n+1}]. (8)

Here $\hat{a}^{\dagger}_{n}$ and $\hat{b}^{\dagger}_{n}$ ($\hat{a}_{n}$, $\hat{b}_{n}$) denote the creation (annihilation) operators on the $n$-th $A$ and $B$ sublattices, $t$ is the uniform intra-cell hopping amplitude, $\delta$ is the non-Hermitian parameter, and $t^{\prime}$ is the inter-cell hopping amplitude. When $\delta=0$, this model reduces to the Hermitian SSH model. Under the periodic boundary condition, the corresponding Hamiltonian in momentum space is given by

\mathcal{H}_{2}(k)=\left(\begin{array}{cc}0&t^{\prime}e^{-ik}+t-\delta\\ t^{\prime}e^{ik}+t+\delta&0\end{array}\right). (9)

The two energy bands (here with $t^{\prime}=1$, as used throughout this section) are then given by

E_{\pm}(k)=\pm\sqrt{1+t^{2}-\delta^{2}+2t\cos k-2i\delta\sin k}. (10)

Following Refs. Leykam et al. (2017); Shen et al. (2018); Gong et al. (2018); Ghatak and Das (2019); Kawabata et al. (2019b) and considering the sublattice symmetry, one can define an inter-band winding number

w_{\pm}=\int_{0}^{2\pi}\frac{dk}{2\pi}\partial_{k}\arg(E_{+}-E_{-})=\int_{0}^{2\pi}\frac{dk}{4\pi}\partial_{k}\arg E_{+}^{2}. (11)

For discretized $E_{\pm}(k)$ with finite $L$, it reduces to

w_{\pm}=\frac{1}{4\pi}\sum_{n=1}^{L}[\theta^{\prime}(n)-\theta^{\prime}(n-1)], (12)

with $\theta^{\prime}(n)=\arg E_{+}^{2}(2\pi n/L)$ in this model. Notably, $w_{\pm}$ is half the total winding of $t^{\prime}e^{-ik}+t-\delta$ and $t^{\prime}e^{ik}+t+\delta$ around the origin of the complex plane as $k$ increases from 0 to $2\pi$. The inter-band winding number $w_{\pm}$ is quantized as $\mathbb{Z}/2$ because the windings of $t^{\prime}e^{-ik}+t-\delta$ and $t^{\prime}e^{ik}+t+\delta$ are always integers due to periodicity Shen et al. (2018). We consider $t^{\prime}=1$, $t\in(-6,6)$, and $\delta\in(-6,6)$ in our study.
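As a numerical cross-check of Eq. (12), the inter-band winding number can be evaluated along the same lines as in Sec. II; a minimal sketch with illustrative parameter values:

```python
import numpy as np

def interband_winding(t, delta, tp=1.0, L=32):
    """w_pm of Eq. (12): half the winding of E_+^2(k) around the origin,
    with E_+^2(k) = (t' e^{-ik} + t - delta)(t' e^{ik} + t + delta)."""
    k = 2 * np.pi * np.arange(L + 1) / L
    E2 = (tp * np.exp(-1j * k) + t - delta) * (tp * np.exp(1j * k) + t + delta)
    dtheta = np.diff(np.angle(E2))
    dtheta = (dtheta + np.pi) % (2 * np.pi) - np.pi   # fold into [-pi, pi)
    return dtheta.sum() / (4 * np.pi)

print(interband_winding(t=0.5, delta=1.0))   # -0.5 in this nontrivial region
```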

Figure 4: (Color online) (a) The non-Hermitian SSH model with non-reciprocal hopping modulated by the parameter $\delta$. (b) The accuracy of the two test sets versus the distance threshold $T$. For each $T$, the data sets are regenerated and the CNN is retrained and retested. (c) Classification probabilities output by the CNN on test set II with $T=0.2$, where the true phase transition points are located at $\delta=\{-1.5,-0.5,0.5,1.5\}$. The predicted phase transition points are located at the crossing points of the prediction probabilities. Different colors represent different winding numbers.

For this model, we set the input configuration as $\mathbf{d}(n)=\{\mathrm{Re}[E_{+}^{2}(2\pi n/L)],\mathrm{Im}[E_{+}^{2}(2\pi n/L)]\}$. To learn the topological phase transition, we treat it as a classification task assisted by neural networks. The output of the neural network is the probabilities of the different winding numbers. We define $\{P_{1},P_{2},P_{3}\}$ as the output probabilities of the winding numbers $\tilde{w}_{\pm}=\{0,0.5,-0.5\}$, respectively. The predicted winding number is the $\tilde{w}_{\pm}$ with the highest probability. The architecture of the CNN is shown in Fig. 2, with training details given in Appendix A. For this task, the objective function to be optimized is defined by

J_{2}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{n_{w}=3}1(w_{\pm}^{(i)}=\tilde{w}_{\pm,j})\log_{2}(P_{j}), (13)

where $w_{\pm}^{(i)}$ is the label of the $i$-th configuration, and the set $\{\tilde{w}_{\pm,1},\tilde{w}_{\pm,2},...,\tilde{w}_{\pm,n_{w}}\}$ represents the winding numbers predicted by the neural networks. The indicator $1(w_{\pm}^{(i)}=\tilde{w}_{\pm,j})$ takes the value 1 when the condition $w_{\pm}^{(i)}=\tilde{w}_{\pm,j}$ is satisfied and 0 otherwise. In this model, $n_{w}=3$ and $\{\tilde{w}_{\pm,1},\tilde{w}_{\pm,2},\tilde{w}_{\pm,3}\}$ represent the winding numbers $w_{\pm}=\{0,0.5,-0.5\}$, respectively.
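Up to the base of the logarithm, Eq. (13) is the standard cross entropy between the one-hot label and the softmax output; in PyTorch (used in Appendix A) it corresponds to `nn.CrossEntropyLoss`, as in this sketch with placeholder data:

```python
import math
import torch
import torch.nn as nn

# Eq. (13) as code: nn.CrossEntropyLoss uses the natural log, so we divide
# by ln 2 to match the log2 convention; logits and labels are placeholders.
logits = torch.randn(8, 3)             # raw network outputs, n_w = 3 classes
labels = torch.randint(0, 3, (8,))     # class index for w_pm in {0, 0.5, -0.5}
J2 = nn.CrossEntropyLoss()(logits, labels) / math.log(2)
```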

To see whether the CNN is a good tool for studying topological phase transitions in this model, we define the Euclidean distance $s$ between a configuration and the phase boundaries in the parameter space of the Hamiltonian:

s=\frac{|A\delta+Bt+C|}{\sqrt{A^{2}+B^{2}}}, (14)

where $A\delta+Bt+C=0$ (straight lines in the parameter space of $\delta$ and $t$) is the equation of the phase boundaries, with $A,B,C$ the parameters of the equation. In addition, we define a distance threshold $T$. In the following, we choose $T=0.2$ as a demonstration; the situation $0.2<T\leq 0.6$ is discussed later. The training data set consists of $2.4\times 10^{4}$ configurations satisfying $s\geq T$, sampled from different phases with different winding numbers.

We test the CNN with two test data sets: (I) $6\times 10^{3}$ configurations satisfying $s<T$, and (II) 300 configurations distributed uniformly along $t=0.5$, $\delta\in[-3,3]$. The data set distribution and some training details are given in Appendix A. After the training, both test data sets I and II are evaluated by the CNN. We use the same training and test workflow for $T=0.3,0.4,0.5$, and $0.6$. Figure 4(b) shows the accuracy on the test data sets versus the distance threshold $T$. We find that the CNN achieves a high accuracy for different $T$, meaning that it can detect the phase transitions precisely in these regions. Moreover, we locate the phase transition points from the crossing points of the prediction probabilities; the phase transitions determined by this method are rather accurate, as shown in Fig. 4(c). Deep in a phase, the probability for the true winding number $w_{\pm}$ stays at nearly $100\%$, while it changes approximately linearly across the phase transitions. In short, the CNN is a useful supplementary tool for studying phase transitions when only phase properties in certain confident regions (e.g., the deep phase) are provided.

Figure 5: (Color online) (a) The non-Hermitian generalized AAH model with non-reciprocal hopping and a complex quasiperiodic potential. (b) Test accuracy table with respect to the two non-Hermiticity parameters $\alpha$ and $h$. (c) The upper (lower) panel is the topological phase diagram predicted by the CNN for $h=1.2$ and $\alpha=0.55$ ($h=1.6$ and $\alpha=1.95$). Misclassified samples are distributed on the topological transition boundary.

IV Learning topological phase diagram in non-Hermitian AAH model

To show that our results can be generalized to other non-Hermitian topological models, we consider a generalized AAH model in a one-dimensional quasicrystal, as shown in Fig. 5(a), with two kinds of non-Hermiticity arising from the nonreciprocal hopping Jiang et al. (2019) and a complex on-site potential phase Longhi (2019). The Hamiltonian of this non-Hermitian AAH model is given by Tang et al. (2021)

H_{3}=\sum_{j}(t^{(r)}_{j}\hat{c}^{\dagger}_{j+1}\hat{c}_{j}+t^{(l)}_{j}\hat{c}^{\dagger}_{j}\hat{c}_{j+1}+\Delta_{j}\hat{n}_{j}), (15)

where the non-reciprocal hopping terms and the on-site potential are parameterized as

t^{(r)}_{j}=\{t+V_{2}\cos[2\pi(j+1/2)\beta]\}e^{-\alpha},
t^{(l)}_{j}=\{t+V_{2}\cos[2\pi(j+1/2)\beta]\}e^{\alpha}, (16)
\Delta_{j}=V_{1}\cos(2\pi j\beta+ih).

Here $t^{(r)}_{j}$ ($t^{(l)}_{j}$) denotes the right-hopping (left-hopping) amplitude between the $j$-th and $(j+1)$-th sites, with $t>0$ and $V_{2}$ real; $\Delta_{j}$ denotes the complex quasiperiodic potential with $V_{1}>0$ and $\beta$ an irrational number; and the parameters $\alpha$ and $h$ tune the non-reciprocity and the complex phase, respectively. For finite quasiperiodic systems, one can take the lattice site number $L=F_{j+1}$ and approximate $\beta$ by the rational number $F_{j}/F_{j+1}$, with $F_{j}$ the $j$-th Fibonacci number, since $\lim_{j\rightarrow\infty}F_{j}/F_{j+1}=(\sqrt{5}-1)/2$. In the following, we set $t=1$ and $L=89$.

The winding numbers discussed previously cannot be directly used here because the quasiperiodic potential breaks the lattice periodicity. In this case, one can consider a ring chain with an effective magnetic flux $\Phi$ penetrating through the center, such that the Hamiltonian matrix can be rewritten as

H_{3}(\Phi)=\left(\begin{array}{ccccc}\Delta_{1}&t^{(l)}_{1}&&&t^{(r)}_{L}e^{-i\Phi}\\ t^{(r)}_{1}&\Delta_{2}&t^{(l)}_{2}&&\\ &\ddots&\ddots&\ddots&\\ &&t^{(r)}_{L-2}&\Delta_{L-1}&t^{(l)}_{L-1}\\ t^{(l)}_{L}e^{i\Phi}&&&t^{(r)}_{L-1}&\Delta_{L}\end{array}\right). (17)

One can define the winding number with respect to $\Phi$ and the energy base $E_{B}$ Gong et al. (2018); Jiang et al. (2019):

w_{\Phi}=\int_{0}^{2\pi}\frac{\mathrm{d}\Phi}{2\pi i}\partial_{\Phi}\ln\det[H_{3}(\Phi)-E_{B}]. (18)

Here $w_{\Phi}$ counts the number of times the complex spectral trajectory encircles the energy base $E_{B}$ ($E_{B}\in\mathbb{C}$ does not belong to the energy spectrum) when the flux $\Phi$ varies from 0 to $2\pi$. For discretized $H_{3}(\Phi)$ with $\Phi=0,2\pi/L_{\Phi},4\pi/L_{\Phi},\cdots,2\pi$, the winding number can be rewritten as

w_{\Phi}=\frac{1}{2\pi}\sum_{n=1}^{L_{\Phi}}[\theta^{\prime\prime}(n)-\theta^{\prime\prime}(n-1)], (19)

where $\theta^{\prime\prime}(n)=\arg\det[H_{3}(2\pi n/L_{\Phi})-E_{B}]$.
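A direct evaluation of Eq. (19) from the flux-threaded Hamiltonian of Eq. (17) can be sketched as follows; the choices $E_{B}=0$ and $L_{\Phi}=64$ are illustrative assumptions ($E_{B}$ must not belong to the spectrum).

```python
import numpy as np

def aah_winding(V1, V2, alpha, h, t=1.0, L=89, L_phi=64, E_B=0.0):
    """w_Phi of Eq. (19): winding of det[H_3(Phi) - E_B] as Phi runs from
    0 to 2*pi, for Eqs. (15)-(17) with beta = F_10/F_11 = 55/89."""
    beta = 55 / 89                     # ratio of consecutive Fibonacci numbers
    j = np.arange(L)
    hop = t + V2 * np.cos(2 * np.pi * (j + 0.5) * beta)
    t_r, t_l = hop * np.exp(-alpha), hop * np.exp(alpha)
    Delta = V1 * np.cos(2 * np.pi * j * beta + 1j * h)    # complex potential
    dets = np.empty(L_phi + 1, dtype=complex)
    for n in range(L_phi + 1):
        phi = 2 * np.pi * n / L_phi
        H = np.diag(Delta) + np.diag(t_l[:-1], 1) + np.diag(t_r[:-1], -1)
        H[0, -1] = t_r[-1] * np.exp(-1j * phi)    # boundary terms of Eq. (17)
        H[-1, 0] = t_l[-1] * np.exp(1j * phi)
        dets[n] = np.linalg.det(H - E_B * np.eye(L))
    dtheta = np.diff(np.angle(dets))
    dtheta = (dtheta + np.pi) % (2 * np.pi) - np.pi
    return dtheta.sum() / (2 * np.pi)
```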

Below we show that its generalization ability enables the CNN to accurately obtain topological phase diagrams of this non-Hermitian generalized AAH model, even though only nonreciprocal-hopping configurations are used in the training. To do this, we treat the problem as a classification task and set the configuration as $\mathbf{d}(n)=\{\mathrm{Re}\det[\tilde{H}_{3}(n)],\mathrm{Im}\det[\tilde{H}_{3}(n)]\}$ with $\tilde{H}_{3}(n)\equiv H_{3}(2\pi n/L_{\Phi})-E_{B}$. The architecture of the CNN is similar to that for the non-Hermitian SSH model, but the output layer now has two neurons for the two kinds of winding numbers. We define $\{P_{1},P_{2}\}$ as the output probabilities of the winding numbers $\tilde{w}_{\Phi}=\{0,-1\}$, respectively. The objective function in this case is given by [similar to Eq. (13)]

J_{3}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{n_{w}=2}1(w_{\Phi}^{(i)}=\tilde{w}_{\Phi,j})\log_{2}(P_{j}), (20)

where $\{\tilde{w}_{\Phi,1},\tilde{w}_{\Phi,2}\}$ (with $n_{w}=2$) represent $\tilde{w}_{\Phi}=\{0,-1\}$, respectively.

To test the generalization ability of the neural network, we train it with configurations corresponding to Hamiltonians with $h=0$, and test it with configurations corresponding to Hamiltonians with both nonreciprocal hopping amplitudes ($\alpha\neq 0$) and complex potentials ($h\neq 0$). The training data set includes configurations with $\alpha\in[0.1,1.0]$ at the interval $\Delta\alpha=0.1$; for each $\alpha$, $3.2\times 10^{3}$ configurations are sampled from the two-dimensional parameter space $V_{1}\in[0,4]\times V_{2}\in[0,2]$. The test data set includes 110 pairs of parameters, with $\alpha$ from 0.15 to 1.95 at the interval $\Delta\alpha=0.2$ and $h$ from 0.0 to 2.0 at the interval $\Delta h=0.2$. We sample $3.2\times 10^{3}$ configurations from the region $V_{1}\in[0,4]\times V_{2}\in[0,2]$ for each pair of parameters.

After the training, we find that the CNN performs well even without knowledge of the complex on-site potential ($h=0$) during the training process. Figure 5(b) shows the test accuracy with respect to the two non-Hermiticity parameters $\alpha$ and $h$; the accuracy exceeds $99\%$ in the whole parameter region. Moreover, we present the topological phase diagrams with respect to $V_{1}$ and $V_{2}$ predicted by the CNN in Fig. 5(c). The CNN performs excellently in the deep phases, struggling only slightly near the topological phase transitions. We attribute the high accuracy in this learning task to two factors. First, normalizing the data confines both the training and test data distributions to the same region of the complex plane, which is important for the generalization of the neural network. Second, the topological transitions in this model coincide with the real-complex transitions of the energy spectrum Tang et al. (2021), which reduces the complexity of the problem when the input data depend on the complex spectrum.

V Generalization to two-dimensional model

Previously, we have used neural networks to investigate the topological properties of several non-Hermitian models in one dimension. In this section, we extend our scenario to reveal the winding numbers associated with exceptional points in the two-dimensional non-Hermitian Dirac fermion model proposed in Ref. Shen et al. (2018). The Dirac Hamiltonian with non-Hermitian terms in the two-dimensional momentum space $\mathbf{k}=(k_{x},k_{y})$ is given by Shen et al. (2018)

\mathcal{H}_{4}(\mathbf{k})=(k_{x}+i\kappa_{x})\sigma_{x}+(k_{y}+i\kappa_{y})\sigma_{y}+(m+i\delta_{m})\sigma_{z}, (21)

where $\sigma_{x,y,z}$ are the Pauli matrices, $\kappa_{x,y}$ denote the non-Hermitian modulation parameters, and $m$ and $\delta_{m}$ denote the real and imaginary parts of the Dirac mass, respectively. The corresponding energy dispersion is obtained as

E_{\pm}(\mathbf{k})=\pm\sqrt{k^{2}-\kappa^{2}+m^{2}-\delta_{m}^{2}+2i(\mathbf{k}\cdot\bm{\kappa}+m\delta_{m})}, (22)

where $k\equiv|\mathbf{k}|$, $\bm{\kappa}\equiv(\kappa_{x},\kappa_{y})$, and $\kappa\equiv|\bm{\kappa}|$. The inter-band winding number $w_{\pm}(\Gamma)$ is defined for the energies $E_{+}(\mathbf{k})$ and $E_{-}(\mathbf{k})$ in the complex energy plane Shen et al. (2018):

w_{\pm}(\Gamma)=\oint_{\Gamma}\frac{dk}{2\pi}\partial_{\mathbf{k}}\arg[E_{+}(\mathbf{k})-E_{-}(\mathbf{k})], (23)

where $\Gamma$ is a closed loop in the two-dimensional momentum space. A nonzero winding number $w_{\pm}(\Gamma)$ implies a band degeneracy in the region enclosed by $\Gamma$. For a pair of separable bands, the winding number can be nonzero only for non-contractible loops in momentum space. Here we choose $\Gamma$ as a unit circle that encircles an exceptional point (a band degeneracy in non-Hermitian band structures) when the Hamiltonian has exceptional points; otherwise we randomly choose a closed loop. The exact topological phase diagram Shen et al. (2018) in the parameter space spanned by $(m,\kappa)$ is shown in Fig. 6(a). The winding number is 0 in the regime $\kappa<|m|$, where the Hamiltonian has a pair of separable bands without band degeneracies. In the regime $\kappa>|m|$, the two bands $E_{\pm}(\mathbf{k})$ cross at two isolated exceptional points $\mathbf{k}_{\pm}$ in the two-dimensional momentum space Shen et al. (2018)

\mathbf{k}_{\pm}=-\frac{m\delta_{m}}{\kappa}\hat{\mathbf{n}}\pm\frac{\sqrt{(\kappa^{2}-m^{2})(\kappa^{2}+\delta^{2}_{m})}}{\kappa}\hat{\mathbf{z}}\times\hat{\mathbf{n}}, (24)

where $\hat{\mathbf{n}}=\bm{\kappa}/\kappa$. In the regime $\kappa>|m|$, the inter-band winding numbers $w_{\pm}(\Gamma)$ circling an exceptional point are half-integers with opposite signs for $\mathbf{k}_{\pm}$. Thus, the winding number $w_{\pm}(\Gamma)$ associated with the exceptional points characterizes topological phase transitions in this model. Note that here we consider the loop $\Gamma$ clockwise circling the exceptional point $\mathbf{k}_{+}$ for the two energy bands $E_{\pm}(\mathbf{k})$ in the complex plane, as displayed in Fig. 6(b).
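Numerically, the winding around $\mathbf{k}_{+}$ can be accumulated from the phase of $(E_{+}-E_{-})^{2}=4E_{+}^{2}$, which is single-valued and avoids the square-root branch cut. A sketch for $\bm{\kappa}=(\kappa,0)$ as in Fig. 6, with the unit loop radius of the text and an assumed discretization:

```python
import numpy as np

def dirac_winding(m, delta_m=1.0, kappa=3.0, L=128):
    """w_pm(Gamma) around the exceptional point k_+ of Eq. (24) for
    kappa = (kappa, 0), from the winding of E_+^2 along a unit circle."""
    if kappa <= abs(m):
        return 0.0                                   # separable bands, no EPs
    kx0 = -m * delta_m / kappa
    ky0 = np.sqrt((kappa**2 - m**2) * (kappa**2 + delta_m**2)) / kappa
    s = 2 * np.pi * np.arange(L + 1) / L             # parameterize the loop
    kx, ky = kx0 + np.cos(s), ky0 + np.sin(s)
    E2 = (kx**2 + ky**2 - kappa**2 + m**2 - delta_m**2
          + 2j * (kx * kappa + m * delta_m))         # E_+^2 from Eq. (22)
    dtheta = np.diff(np.angle(E2))
    dtheta = (dtheta + np.pi) % (2 * np.pi) - np.pi
    # arg(E_+ - E_-) winds half as fast as arg(E_+^2)
    return dtheta.sum() / (4 * np.pi)

print(dirac_winding(m=1.0))    # +/- 0.5 (sign set by the loop orientation)
```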

Figure 6: (Color online) (a) Phase diagram of the two-dimensional non-Hermitian Dirac fermion model with $\delta_{m}=1$ and $\bm{\kappa}=(\kappa_{x},0)$. (b) The pair switching of the eigenvalues $E_{\pm}(\mathbf{k})$ (solid red and dashed blue lines) around an exceptional point ($\mathbf{k}_{+}$; denoted EP) gives rise to the winding number $w_{\pm}(\Gamma)=0.5$. (c) The accuracy of the two test sets versus the distance threshold $T$. For each $T$, the data sets are regenerated and the CNN is retrained and retested. (d) Classification probabilities output by the CNN on test set II with $\kappa=3$ and $T=0.5$. The predicted topological phase transition points ($m=\pm 3$) are located at the crossing points of the prediction probabilities. Different colors represent different winding numbers.

In the training, we discretize the loop $\Gamma$ into $L$ equally spaced points and set the input configuration as $\mathbf{d}(n)=\{\mathrm{Re}\,\Delta E(n),\mathrm{Im}\,\Delta E(n)\}$ with $\Delta E(n)=E_{+}(\mathbf{k}_{n})-E_{-}(\mathbf{k}_{n})$. The corresponding winding numbers serve as the data labels. We use a workflow similar to that of Sec. III and a CNN with the same structure as in Sec. IV to study the topological phase transitions characterized by $w_{\pm}$ in this two-dimensional non-Hermitian model. The training data set consists of $3\times 10^{4}$ configurations satisfying $s\geq T$, sampled from different phases with different winding numbers. We test the CNN with two test data sets: (I) $6\times 10^{3}$ configurations satisfying $s<T$, and (II) 600 configurations distributed uniformly along $\kappa=3$, $m\in[-6,6]$. The CNN evaluates both test data sets I and II after the training. In Fig. 6(c), we plot the accuracy versus the distance threshold $T$; the CNN detects the winding number precisely for different thresholds $T$. Furthermore, the topological phase transitions can be revealed by the crossing points of the prediction probabilities, as shown in Fig. 6(d). These results demonstrate the feasibility of neural networks in learning the topological invariants of two-dimensional non-Hermitian models.

VI Conclusions

In summary, we have demonstrated that artificial neural networks can be used to predict, with high accuracy, the topological invariants and the associated topological phase transitions and topological phase diagrams in four different non-Hermitian models. The eigenenergy winding numbers in the Hatano-Nelson model were presented as a demonstration of our machine learning method. The CNN trained with data from deep within the phases has been shown to correctly detect the phase transitions near the boundaries of the non-Hermitian SSH model. We have also investigated the non-Hermitian generalized AAH model with non-reciprocal hopping and a complex quasiperiodic potential, finding that the topological phase diagram predicted by the CNN in the non-Hermiticity parameter space agrees with its theoretical counterpart to high accuracy. Furthermore, we have generalized our scenario to reveal the winding numbers associated with exceptional points in the two-dimensional non-Hermitian Dirac fermion model. Our results show the generality of the machine learning method in classifying topological phases in prototypical non-Hermitian models.

Finally, we make some remarks on future studies on machine learning non-Hermitian topology. Some exotic features of non-Hermitian topological systems are sensitive to the boundary condition, such as the non-Hermitian skin effect under open boundary conditions Lee (2016); Yao and Wang (2018); Yao et al. (2018); Song et al. (2019); Kunst et al. (2018), which is closely related to the winding number of complex eigenenergies Zhang et al. (2020e); Yang et al. (2020); Okuma et al. (2020). The energy spectrum under periodic boundary conditions may deviate drastically from that under open boundary conditions. Further studies on the non-Hermitian skin effects and the classification of non-Hermitian topological phases under open boundary conditions based on machine learning algorithms will be conducted. In addition, machine learning non-Hermitian topological invariants defined by the eigenstates would be an interesting further study.

Note added. Recently, we noticed two related works on machine learning non-Hermitian topological states Narayan and Narayan (2021); Yu and Deng (2020), which focused on the winding number of the Hamiltonian vectors and the cluster of non-Hermitian topological phases in an unsupervised fashion, respectively.

Figure 7: (Color online) (a) CNN and FCNN training loss history for the Hatano-Nelson model. CNN training loss history for (b) the non-Hermitian SSH model and (c) the non-Hermitian generalized AAH model.
Figure 8: (Color online) (a) Phase diagram of the non-Hermitian SSH model for $t\in[-6,6]$, $\delta\in[-6,6]$, and $t^{\prime}=1$. (b) Data set distribution for $T=0.2$; the training, validation, and test data sets contain about $2.4\times 10^{4}$, $6\times 10^{3}$, and $6\times 10^{3}$ configurations, respectively.

Appendix A Training details

We first describe some training details for the Hatano-Nelson model. We use the deep learning framework PyTorch Paszke et al. (2019) to construct and train the neural networks. Weights are randomly initialized from a normal distribution with the Xavier algorithm Glorot and Bengio (2010), and the biases are initialized to 0. We use the Adam optimizer Kingma and Ba (2014) to minimize the deviation of the network output $\tilde{w}$ from the true value $w$. We set the initial learning rate to 0.001 and use the ReduceLROnPlateau schedule Paszke et al. (2019) to reduce it by a factor of 10 when the validation loss stops improving for 20 epochs. All hyperparameters are set to their defaults unless mentioned otherwise. To prevent overfitting, $L_{2}$ regularization with strength $10^{-4}$ and early stopping Yao et al. (2007) are used during the training. We use mini-batch training with batch size 64 and a validation set to confirm that there is no overfitting during training. The validation set contains $4\times 10^{3}$ configurations, consisting of samples with winding numbers $w\in\{\pm 1,\pm 2,\pm 3\}$ in the ratio $1:1:1$. The typical loss during a training instance of the CNN and FCNN is shown in Fig. 7(a), from which one can see that there is no sign of overfitting.
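A minimal training-loop sketch matching these hyperparameters is given below; the tensors are random placeholders standing in for the labeled configurations, `WindingCNN` is the hypothetical sketch from Sec. II, and we approximate the $L_{2}$ regularization by Adam's weight decay.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = WindingCNN()                               # hypothetical sketch, Sec. II
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.1, patience=20)
loss_fn = torch.nn.MSELoss()                       # the objective of Eq. (7)

# random placeholders for the labeled (L+1) x 2 configurations, L = 32
train_x, train_y = torch.randn(512, 1, 33, 2), torch.randn(512)
val_x, val_y = torch.randn(128, 1, 33, 2), torch.randn(128)
loader = DataLoader(TensorDataset(train_x, train_y), batch_size=64, shuffle=True)

for epoch in range(100):
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    with torch.no_grad():
        sched.step(loss_fn(model(val_x), val_y))   # reduce lr on plateaued validation loss
```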

We now provide some training details for the non-Hermitian SSH model. In this case, the CNN has two convolution layers, with 32 kernels of size $1\times 2\times 2$ and 1 kernel of size $32\times 1\times 1$, followed by a fully connected layer of 16 neurons before the output layer. In this model, the output layer consists of three neurons for the three different inter-band winding numbers. All hidden layers have ReLU activation functions, and the output layer has the softmax function $f(\mathbf{x})_{i}=\exp(\mathbf{x}_{i})/\sum_{j=1}^{n}\exp(\mathbf{x}_{j})$. The exact topological phase diagram in the parameter space spanned by $t$ and $\delta$ is shown in Fig. 8(a). The training data set, satisfying $s\geq T$ with $T=0.2$ here, and the test data set, satisfying $s<T$, are randomly sampled from the parameter space. The data set distribution is shown in Fig. 8(b). The training, validation, and test data sets contain about $2.4\times 10^{4}$, $6\times 10^{3}$, and $6\times 10^{3}$ configurations, respectively. Typical loss during training instances of the CNN for different training data sets is plotted in Fig. 7(b), which clearly shows that the neural networks converge quickly without overfitting.

Finally, we briefly present some details for the non-Hermitian generalized AAH model. In this case, the validation set consists of $8\times 10^{3}$ configurations corresponding to non-reciprocal-hopping Hamiltonians (with $h=0$) that are not included in the training data set. The typical loss is shown in Fig. 7(c); the networks converge quickly without overfitting.

Acknowledgements.
We thank Dan-Bo Zhang for helpful discussions. This work was supported by the National Natural Science Foundation of China (Grants No. U1830111, No. U1801661, and No. 12047522), the Key-Area Research and Development Program of Guangdong Province (Grant No. 2019B030330001), and the Science and Technology Program of Guangzhou (Grants No. 201804020055 and No. 2019050001).

References