A statistical approach to topological entanglement:
Boltzmann machine representation of high-order irreducible correlations

Shi Feng, Department of Physics, The Ohio State University, Columbus, Ohio 43210, USA; Deqian Kong, Department of Statistics, University of California, Los Angeles, California 90095, USA; Nandini Trivedi, Department of Physics, The Ohio State University, Columbus, Ohio 43210, USA
Abstract

Strongly interacting systems can be described in terms of correlation functions at various orders. A quantum analog of high-order correlations is the topological entanglement in topologically ordered states of matter at zero temperature, usually quantified by the topological entanglement entropy (TEE). In this work, we propose a statistical interpretation that unifies the two under the same information-theoretic framework. We demonstrate that the existence of a non-zero TEE can be understood in the statistical view as the emergent $n$th-order mutual information $I_{n}$ (for arbitrary integer $n\geq 3$) reflected in projectively measured samples, which also makes explicit the equivalence between the two existing methods for its extraction, the Kitaev-Preskill (KP) and the Levin-Wen (LW) constructions. To exploit the statistical nature of $I_{n}$, we construct a restricted Boltzmann machine (RBM) which captures the high-order correlations, and correspondingly the topological entanglement, encoded in the distribution of projected samples by representing the entanglement Hamiltonian of a local region in the proper basis. Furthermore, we derive a closed form which provides a method to interrogate the trained RBM, making explicit the analytical form of correlations of arbitrary order relevant for $I_{n}$. We remark that the interrogation method for extracting high-order correlations can also be applied to the construction of auxiliary fields that disentangle many-body interactions relevant for diverse interacting models.

I Introduction

High-order correlation is of great theoretical importance in many fields of physics. Its presence indicates an irreducible many-body correlation/interaction that cannot be explained using pairwise relations. Such correlations usually emerge in many-body interacting systems such as frustrated magnetic systems and complex networks [1, 2, 3, 4, 5]. A quantum analog of high-order correlations is the topological entanglement in topologically ordered states of matter at zero temperature [6, 7, 8]. The topological entanglement is usually quantified by an $O(1)$ constant $\gamma$, i.e. the topological entanglement entropy (TEE), which appears as the universal contribution to the von Neumann entropy $S(L)=\alpha L-\gamma$ of a subsystem with boundary length $L$ [9, 10]. Understanding the mechanism of such emergent phenomena in many-body systems involves methods and theories beyond the standard mean-field paradigm, which is the main workhorse for many interacting models. In recent years, there has been considerable activity on long-range entangled frustrated systems [11, 12, 13] and quantum many-body systems with topological order (TO) [14, 15, 16, 17]. These systems, such as the Kitaev spin liquid [15] and the toric code (TC) model [14], feature strong interactions between local degrees of freedom, whereby mean-field or single-mode approximations fail to capture the essential physics, which requires theories beyond Landau's symmetry-breaking paradigm.

A common approach for the analysis of various many-body systems is to study their statistical behavior and exploit concepts of information theory. A quintessential information-theoretic concept is the quantum entanglement and entanglement entropy as its measure. Specifically, we are interested in the topological entanglement of frustrated quantum systems characterized by their patterns of entanglement encoded in many-body wave functions rather than local order parameters. In such systems, the local degrees of freedom get collectively correlated via the high-order and long-range entanglement of the emergent gauge fields inherent in the topological order.

It is well known that the long-range entanglement of TO can be quantified by the TEE. Usually, the TEE of a lattice gauge theory, such as the TC model, cannot be extracted by a local operational protocol such as quantum distillation [18, 19, 20]. The essential obstacle in calibrating entanglement by local operations is that the operators have to be locally gauge-invariant and therefore cannot detect other superselection sectors, so non-local operations are required. Also, from a statistical point of view, TEE is an intrinsic non-dyadic many-body correlation [1, 21], as is also reflected in the fact that any local correlation function in the TC model with fewer than four qubits vanishes. This obstacle in witnessing the TEE of TO calls for a practical representation whereby the non-local high-order correlation between qubits can be captured faithfully. Moreover, the existence of high-order statistical correlation/interaction makes the challenge coincide with those in the study of complex networks and the representation of many-body synergistic information therein [22, 3], as well as those in nuclear physics where many-nucleon forces are present [23, 24].

In this work, we adopt a statistical viewpoint on high-order correlations and on topological entanglement as their quantum analogue. We show that such long-range high-order correlation between multi-partite patches in a topologically non-trivial subsystem can be described by the high-order (quantum) mutual information $I_{n}$ with $n\geq 3$, with the Kitaev-Preskill (KP) and Levin-Wen (LW) constructions of TEE equivalent to $I_{3}$. There are two equivalent ways to interpret the TEE as $I_{n\geq 3}$. It can be viewed (1) as a property of the reduced density matrix $\rho$ of a subsystem, which may be assigned a fictitious quantum entanglement Hamiltonian according to $\rho\equiv\exp(-H)$ [25]; or (2) as a joint data distribution with high-order covariance from projective sampling on a topological quantum state [26, 27]. In analogy to the entanglement Hamiltonian in the former case, such a data distribution can be assigned a fictitious classical Hamiltonian $\mathcal{H}$ consisting of Ising interactions of different orders. The presence of TEE can then be interpreted as follows: there exist one or more bases in which the projectively sampled data set $\{i\}$ on a subsystem with closed topology is a sample of the probability distribution $p_{\{i\}}\propto e^{-\mathcal{H}\{i\}}$ generated by $\mathcal{H}\{i\}$ with high-order Ising interactions. This is summarized in Table 1. We focus on case (2) in this work and explain this argument in more detail in Sec. IV.

Table 1: Comparison between two interpretations of the topological entanglement: (1) topological entanglement as a property of the reduced density matrix $\rho$ of a subsystem; and (2) as a joint data distribution with high-order covariance of a Gibbs ensemble.
          | (1)                                           | (2)
Measure   | Density matrix $\rho$                         | Gibbs distribution $p_{\{i\}}$
Generator | Entanglement Hamiltonian $H$, $\rho=\exp(-H)$ | Ising Hamiltonian $\mathcal{H}$, $p_{\{i\}}\propto\exp(-\mathcal{H})$
Quantity  | TEE                                           | $I_{n\geq 3}$ of samples
Method    | KP or LW construction                         | Sampling + RBM

By projective sampling on $n$ qubits within a skeletal subsystem with non-trivial topology in a certain basis, we show that the resultant projective samples exhibit effective Gibbs joint distributions that feature non-trivial $I_{n}$, establishing the possibility of witnessing TEE by exploiting classical statistical methods.

We further propose an energy-based statistical representation of such high-order correlation and TEE using the restricted Boltzmann machine (RBM) widely used in machine learning. Notably, investigations of machine learning applications in quantum many-body physics have burgeoned in recent years [28, 29, 30, 31, 32]. It has been shown that by exploiting the generative power of these artificial neural networks, the phase factors of a quantum state of various models can be faithfully captured, thus providing a network-based variational quantum many-body solver [28, 31]; and the representation power of the RBM has led to numerous applications in the physics community [33, 34, 35, 36, 37, 38, 39, 24, 40]. We show that the high-order information existing in the joint distribution of samples can be represented by a bipartite Ising model, i.e. the two-body interacting network of an RBM, in which the effective many-body interactions between Ising spins in its visible layer provide a neural network representation of the high-order correlations. In order to interrogate the trained RBM and accurately extract the high-order information, we derive a closed analytical form of the effective $n$-body coupling of the RBM relevant for $I_{n}$. This allows us to determine the existence of high-order correlation or TEE by sampling a subsystem, instead of reconstructing the complete wave function. We further remark that the RBM representation developed in this paper will also be useful for modeling many-body interactions using two-body interactions. Indeed, it was demonstrated in Refs. [39, 24] that an RBM can be used to represent interacting models with three-body interactions. Hence the exact form of the effective $n$-body interaction of the RBM provides a generic pathway to construct auxiliary fields which disentangle arbitrary-order interactions into two-body interactions.

This paper is organized as follows: Section II presents the statistical interpretation of TEE using high-order mutual information, its equivalence to existing constructions, and the formulation of TEE using arbitrary partitions of a subsystem. Section III presents projective measurements on the exactly solvable TC model, which is then used to demonstrate the equivalence between TEE and the $I_{n}$ encoded in the joint probability distribution. Section IV discusses the structure of the RBM (details of the RBM are discussed in Appendix B), and presents the analytical representation of effective high-order interactions between spins $\sigma\in\{+1,-1\}$ in the visible layer, enabling the interrogation of the trained RBM (the representation for $\sigma\in\{0,1\}$ is discussed in Appendix A). Section V shows a worked example of extracting high-order information by an RBM from the joint probability distribution sampled from the TC. Section VI summarizes our results and briefly discusses the potential application of our RBM construction to many-body interacting models.

II The statistical interpretation of topological entanglement

In this section, we present a statistical viewpoint on TEE, i.e. the statistical interpretation of TEE using high-order mutual information. We start from the existing KP and LW constructions and show their equivalence to the third-order mutual information $I_{3}$. Then we generalize this information-theoretic argument to arbitrary order $I_{n}$. This statistical formulation naturally establishes the possibility of a representation via an energy-based statistical network, to be discussed in the following sections. By exploiting the properties of $I_{n}$, we show a generic construction of TEE using an arbitrary $n$-partite subsystem at the end of the section.

II.1 Kitaev-Preskill Construction

TO is characterized by global long-range entanglement. For gapped systems the von Neumann entropy of a subsystem density matrix $\rho_{A}$ for the ground state scales as

$$S(\rho_{A})=\alpha|\partial A|-\gamma+O(1/|\partial A|)\ . \tag{1}$$

The TO is reflected in the topological entanglement entropy (TEE) $S_{\rm topo}\equiv-\gamma$ [10, 9, 41]. In the Kitaev-Preskill (KP) construction, $\gamma$ is extracted from a tripartite disk $A\cup B\cup C$ within the 2D lattice according to

$$\begin{split}S_{\rm topo}&=S(\rho_{A})+S(\rho_{B})+S(\rho_{C})\\ &-S(\rho_{AB})-S(\rho_{BC})-S(\rho_{AC})+S(\rho_{ABC})\end{split} \tag{2}$$

whereby the linear combination is engineered such that the area-law contribution cancels. $\gamma$ is related to the quantum dimensions of the anyonic charges in the medium by a topological quantum field theory (TQFT), whereby $\gamma=\log\mathcal{D}$, $\mathcal{D}=\sqrt{\sum_{a}d_{a}^{2}}$. Besides the relation with TQFT, there have recently been pioneering works relating the information-theoretic framework, i.e. the quantum (conditional) mutual information, to TO [42, 43, 44], resulting in non-trivial bounds and proofs.
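As a worked instance of the relation between $\gamma$ and the quantum dimensions, assume the standard anyon content of the $Z_{2}$ toric code, which is not spelled out in the text above: four Abelian anyons $\{1, e, m, \varepsilon\}$, each with quantum dimension $d_{a}=1$. Under that assumption,

$$\mathcal{D}=\sqrt{1^{2}+1^{2}+1^{2}+1^{2}}=2,\qquad \gamma=\log\mathcal{D}=\log 2,$$

so that $S_{\rm topo}\equiv-\gamma=-\log 2$ in the sign convention used here.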

We propose to view TO from the information-theoretic point of view as an emergent statistical synergy [22] due to the underlying gauge structure, which provides more direct intuition and possible detection of TO by quantum sampling in relevant computational and experimental platforms such as tensor network and Rydberg atom arrays. The simple interpretation that is central to our argument is that the topological entropy can be written in the form of (quantum) conditional mutual information:

$$S_{\rm topo}=I(A:B)-I(A:B|C) \tag{3}$$

where $I(A:B)$ and $I(A:B|C)$ are the quantum mutual information and the quantum conditional mutual information, respectively defined by

$$I(A:B)=S(\rho_{A})+S(\rho_{B})-S(\rho_{AB}) \tag{4}$$
$$I(A:B|C)=S(\rho_{AC})+S(\rho_{BC})-S(\rho_{C})-S(\rho_{ABC}) \tag{5}$$

This is consistent with the KP construction defined in Eq. 2 if the subsystems are properly chosen. To be specific, the mutual information $I(A:B)$ quantifies the amount of information shared between $A$ and $B$, whereas the conditional mutual information $I(A:B|C)$ quantifies the amount of information shared between $A$ and $B$ given that $C$ is known (e.g. by a local projection on a quantum state), and is able to include the irreducible tripartite information, also known as synergistic information in the field of statistics [22]. While a non-zero $I(A:B|C)$ is capable of detecting synergy, i.e. that the information shared between two subsystems can be influenced by a third, it can confuse a trivial bipartite mutual information with the intrinsic synergy if there exists some shared information between $A$ and $B$ independent of $C$. A trivial case is where $A$ and $B$ are completely disjoint from $C$, so that $I(A:B|C)=I(A:B)$. Therefore, to remove such trivial cases, one must compare $I(A:B|C)$ against $I(A:B)$ and look into $I_{3}(A:B:C)$ defined in Eq. 3 or Eq. 6. It is only in the simplest case where $I(A:B)=0$ that a non-zero $I(A:B|C)$ alone suffices to determine the synergistic information (as we demonstrate in the coming section, this is exactly the case of the LW construction).
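The consistency between Eq. 3 and Eq. 2 can be checked by direct substitution of the definitions in Eq. 4 and Eq. 5; the short expansion below uses only those definitions:

$$\begin{split}I(A:B)-I(A:B|C)&=\left[S(\rho_{A})+S(\rho_{B})-S(\rho_{AB})\right]\\ &\quad-\left[S(\rho_{AC})+S(\rho_{BC})-S(\rho_{C})-S(\rho_{ABC})\right]\\ &=S(\rho_{A})+S(\rho_{B})+S(\rho_{C})\\ &\quad-S(\rho_{AB})-S(\rho_{BC})-S(\rho_{AC})+S(\rho_{ABC})\end{split}$$

The last two lines are term by term the right-hand side of Eq. 2.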

If the tripartition of a subsystem is topologically non-trivial, e.g. as shown in Fig. 1(b,c), and the ground state is topologically ordered, the resulting $I_{3}$ equals $S_{\rm topo}$ [45, 46]. In contrast, for a topologically trivial geometry, e.g. as shown in Fig. 1(a), $I_{3}=0$ always holds for gapped systems since $I(A:B)=0$ and $I(A:B|C)=0$; this is true regardless of the order of $A$, $B$ and $C$ because $I_{3}(A:B:C)$ is symmetric under permutation of its variables.

Notably, this interpretation coincides with that of the third-order mutual information $I_{3}^{c}$ for classical random variables and is related to the measure of frustration and synergy [1, 22]. In the probabilistic context, given three random variables $A,B,C$ generated from a classical ensemble or from sampling of a quantum density matrix, we can define $I_{3}^{c}$ for classical random variables as

$$I_{3}^{c}(A:B:C)=I^{c}(A:B)-I^{c}(A:B|C) \tag{6}$$

where the mutual information and the conditional mutual information are defined respectively as

$$I^{c}(A:B)=\sum_{a,b}p(a,b)\log\frac{p(a,b)}{p(a)p(b)} \tag{7}$$
$$I^{c}(A:B|C)=\sum_{a,b,c}p(a,b,c)\log\frac{p(c)p(a,b,c)}{p(a,c)p(b,c)} \tag{8}$$

hence the $I_{3}^{c}(A:B:C)$ in Eq. 6 can be expressed compactly as

$$I_{3}^{c}(A:B:C)=-\sum_{a,b,c}p(a,b,c)\log\frac{p(a,b,c)p(a)p(b)p(c)}{p(a,b)p(b,c)p(a,c)} \tag{9}$$

Note that this definition is symmetric under permutations of $A,B,C$. A negative $I_{3}^{c}$ is intimately related to the so-called synergistic information or interaction information, that is, the intrinsic many-body and non-dyadic correlation that cannot be reduced to pairwise relations [1, 22]. A negative value of $I_{3}^{c}$ is then interpreted as a statistical situation in which knowledge of any one of the three variables $A$, $B$ and $C$ enhances the correlation between the other two. Such statistical intuition remains valid in quantum many-body cases, where the many-body quantum correlation is attributed to the non-local entanglement in a topologically ordered wave function. Indeed, in pure lattice gauge theories or integrable models where the gauge sector is disjoint from matter (such as the Kitaev spin liquid), by choosing a basis in which the Wilson loops are explicitly constructed, the quantum samples taken in this basis can be interpreted using the classical $I_{3}^{c}(A:B:C)$, and relevant statistical methods such as the restricted Boltzmann machine can be straightforwardly applied to subsystems thereof.
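To make the sign of $I_{3}^{c}$ concrete, the following is a minimal Python sketch that evaluates Eq. 6 from a joint distribution, using the entropy forms of Eq. 7 and Eq. 8. The helper names and the two test distributions (three $\pm 1$ spins constrained to multiply to $+1$, and three independent spins) are illustrative assumptions and are not taken from the paper's data.

```python
import itertools
import math
from collections import defaultdict


def marginal(joint, keep):
    """Marginalise {outcome tuple: probability} onto the variable indices in `keep`."""
    out = defaultdict(float)
    for outcome, prob in joint.items():
        out[tuple(outcome[i] for i in keep)] += prob
    return dict(out)


def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def h(joint, keep):
    """Entropy of the marginal on the variables in `keep`."""
    return entropy(marginal(joint, keep))


def mutual_information(joint, a, b):
    # Eq. 7 rewritten with entropies: I(A:B) = H(A) + H(B) - H(AB)
    return h(joint, [a]) + h(joint, [b]) - h(joint, [a, b])


def conditional_mutual_information(joint, a, b, c):
    # Eq. 8 rewritten with entropies: I(A:B|C) = H(AC) + H(BC) - H(C) - H(ABC)
    return h(joint, [a, c]) + h(joint, [b, c]) - h(joint, [c]) - h(joint, [a, b, c])


def third_order_mutual_information(joint, a=0, b=1, c=2):
    # Eq. 6: I3 = I(A:B) - I(A:B|C)
    return mutual_information(joint, a, b) - conditional_mutual_information(joint, a, b, c)


spins = (+1, -1)
# Hypothetical test distributions, chosen for illustration only.
constrained = {s: 0.25 for s in itertools.product(spins, repeat=3)
               if s[0] * s[1] * s[2] == 1}
independent = {s: 0.125 for s in itertools.product(spins, repeat=3)}

i3_constrained = third_order_mutual_information(constrained)
i3_independent = third_order_mutual_information(independent)
print(i3_constrained, i3_independent)
```

For the constrained distribution every pair of spins is uncorrelated, yet fixing the third spin determines the product of the other two, so $I^{c}(A:B)=0$ while $I^{c}(A:B|C)=\log 2$, giving $I_{3}^{c}=-\log 2$; the independent distribution gives zero.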

This formulation of topological entropy can be applied in both classical and quantum contexts. Indeed, even for classically frustrated spins, the thermal fluctuations exhibit topological entropy and synergy with negative $I_{3}$ [1, 47]. In the context of topologically ordered quantum matter, the non-local constraints, e.g. plaquette operators and Wilson loops in the toric code and Kitaev spin liquids, intuitively resemble the aforesaid synergy. Indeed, as we show in detail, the topological entanglement entropy is equivalent to a quantum case of third-order mutual information that indicates a synergy of quantum fluctuations present at zero temperature. This rethinking of topological entropy allows us to unify classically frustrated systems and the TO of quantum frustrated systems in the same information-theoretic framework.

Figure 1: Different partition schemes of an annulus or disk. (a) A topologically trivial partition, which has no TEE between the three partitions. (b) A tripartite disk used in the KP construction. (c) A tripartite annulus equivalent to the Kitaev-Preskill construction. (d) A tripartite annulus used in the Levin-Wen construction. (e) A quadripartite annulus. (f) An $n$-partite annulus. $E$ denotes the environmental degrees of freedom with respect to the annulus or disk.

II.2 Levin-Wen construction

The Levin-Wen (LW) construction is equivalent to the KP construction, and thus to $I_{3}$; the aforementioned statistical interpretation of TEE remains valid as well. The LW construction extracts TEE using the partition shown in Fig. 1(d) and the following linear combination of entropies

$$S_{\rm topo}=S(\rho_{ABC})-S(\rho_{AC})-S(\rho_{BC})+S(\rho_{C}) \tag{10}$$

It is equivalent to the negative conditional mutual information $-I(A:B|C)$, which measures the information shared between $A$ and $B$ conditioned on $C$; a non-zero value indicates the existence of long-range entanglement as a global constraint. It is necessarily non-positive due to the strong subadditivity inequality. The LW construction defined above is also equivalent to $I_{3}$. This becomes clear once we realize that $I_{3}$ for the geometry of Fig. 1(d) is $I_{3}(A:B:C)=I(A:B)-I(A:B|C)=I(A:B)+S_{\rm topo}$. For a large enough $C$ whose length scale is larger than the correlation length, i.e. $\xi/|\partial C|\sim 0$, the Hilbert subspaces of $\rho_{A}$ and $\rho_{B}$ are disjoint, hence

$$\begin{split}S(\rho_{AB})&=S(\rho_{A}\otimes\rho_{B})=S(\rho_{A})+S(\rho_{B})\\ &\Rightarrow~I(A:B)=0\end{split} \tag{11}$$

for the partition of Fig. 1(d). Combining Eq. 11 with Eq. 3 or Eq. 6 gives exactly the LW construction in Eq. 10. Nevertheless, if $\xi/|\partial C|$ is not negligibly small, it is more accurate to include the term $I(A:B)$, which removes the residual non-topological information caused by finite-range correlations. We note that there are also other constructions of topological entropy that are statistically equivalent to the third-order mutual information $I_{3}$, such as the multi-partite entanglement in the context of holographic theory [48, 21, 49, 50, 51], which we do not elaborate on in this paper.

II.3 Higher order $I_{n}$ of TO is equivalent to $I_{3}$

In this section we show that, for a TO with an emergent gauge theory, the $n$-th order quantum mutual information $I_{n}$ ($n>3$) is equivalent to $I_{3}$. Following the aforementioned idea of higher-order mutual information, we can always generate higher-order constructions as descriptors of higher-order irreducible many-body correlation. The generic $n$-th order quantum mutual information can be expressed recursively as

$$\begin{split}I_{n}(P_{1}:\cdots:P_{n})=&I_{n-1}(P_{1}:\cdots:P_{n-1})\\ &-I_{n-1}(P_{1}:\cdots:P_{n-1}|P_{n})\end{split} \tag{12}$$

where the conditional mutual information satisfies

$$\begin{split}I_{n-1}(P_{1}:\cdots:P_{n-1}&|P_{n})=I_{n-1}(P_{1}:\cdots:P_{n-1}P_{n})\\ &-I_{n-1}(P_{1}:\cdots:P_{n-2}:P_{n})\end{split} \tag{13}$$

Similar to $I_{3}$, Eq. 12 can be perceived as an $n$-body irreducible correlation, or $n$-body synergy, that cannot be reduced to any correlations between fewer degrees of freedom. Indeed, it is intuitively clear that TEE is a direct consequence of the Gauss law of an emergent gauge theory, and thus encodes such synergy between all degrees of freedom that live on a closed Wilson loop. Without loss of generality, let us assume a gapped TO whose correlation length is negligibly small compared to the sizes of the subsystems. Note that for $n\geq 3$, the first term on the right-hand side of Eq. 12 and the second term on the right-hand side of Eq. 13 must be zero, since the union of partitions in the parentheses does not fill up an annular region, i.e. is not subject to the gauge constraint, and the small correlation length guarantees that there is no statistical correlation between subsystems that do not share a common boundary. Hence, we have

$$I_{n}(P_{1}:\cdots:P_{n})=-I_{n-1}(P_{1}:\cdots:P_{n-1}P_{n}) \tag{14}$$

Following such recursion

$$\begin{split}I_{n}(P_{1}:\cdots:P_{n})&=(-1)^{n-1}I_{3}(P_{1}:P_{2}:P_{3}\cdots P_{n})\\ &\equiv(-1)^{n-1}S_{\rm topo}\end{split} \tag{15}$$

Hence TEE is equivalent to $I_{n}~(\forall n\geq 3)$ up to a sign. As a concrete example, we demonstrate that the construction in Eq. 12 with $n=4$, i.e. the 4th-order mutual information with the partition shown in Fig. 1(e), is equivalent to $I_{3}$. For a quadripartite annular disk $A\cup B\cup C\cup D$, $I_{4}(A:B:C:D)$ is defined as

$$I_{4}(A:B:C:D)=I_{3}(A:B:C)-I_{3}(A:B:C|D) \tag{16}$$

where $I_{3}(A:B:C)$ is necessarily zero since $A\cup B\cup C$ does not form a closed topology and is thus equivalent to the conditional entropy of a quantum Markov state. The third-order conditional mutual information $I_{3}(A:B:C|D)$ can be expanded in a non-conditional form as

$$I_{3}(A:B:C|D)=I_{3}(A:C:BD)-I_{3}(A:C:D) \tag{17}$$

where we have exploited the permutation symmetry of $I_{n}(P_{1}:\cdots:P_{n})$ to switch $C$ and $B$. Note that in the given geometry of Fig. 1(e), $I_{3}(A:C:D)$ is necessarily zero [43], in the same way that $I_{3}(A:B:C)=0$. Therefore, for a quadripartite annular disk $A\cup C\cup B\cup D$, we have

$$I_{4}(A:B:C:D)=-I_{3}(A:C:BD) \tag{18}$$

The latter is the same as for a tripartite annular disk $A\cup C\cup BD$; thus $I_{4}$ is reduced to the LW construction of $I_{3}$ as shown in Fig. 1(d). It is then straightforward to extend the proof to $n$th order by induction. Indeed, this is in accordance with the irreducible correlation $C_{\rho}^{(k)}$, which measures the (quantum) information that is contained in $k$ parties yet absent in $(k-1)$ or fewer [52]; in particular, both the KP and LW constructions can be understood as the third-order irreducible correlation $C_{\rho}^{(3)}$ [53]. Generally, in systems where the correlation length is not negligibly small compared to the distances between subsystem partitions, we can write down the complete form of $I_{4}$ using Eq. 12 and Eq. 13 to extract the topological entanglement entropy $S_{\rm topo}^{(4)}$ using the quadripartite disk in Fig. 1(e):

$$\begin{split}-S_{\rm topo}^{(4)}&=S(\rho_{A})+S(\rho_{B})+S(\rho_{C})+S(\rho_{D})\\ &-S(\rho_{AB})-S(\rho_{BC})-S(\rho_{AC})-S(\rho_{BD})\\ &-S(\rho_{AD})-S(\rho_{CD})\\ &+S(\rho_{ABD})+S(\rho_{BCD})+S(\rho_{ACD})+S(\rho_{ABC})\\ &-S(\rho_{ABCD})\end{split} \tag{19}$$

which is the quadripartite analogue of the tripartite KP construction. It is straightforward to check that all boundary contributions cancel. Higher-order constructions can be represented in the same way. For a generic $n$-partite partition shown in Fig. 1(f), we have, $\forall n\geq 3$:

$$S_{\rm topo}^{(n)}=(-1)^{n-1}I_{n}=\sum_{i=1}^{n}(-1)^{i+n}\sum_{k_{1},\cdots,k_{i}}S(\rho_{(P_{k_{1}}\cdots P_{k_{i}})}) \tag{20}$$

as a generic recipe for extracting TEE in an $n$-partite subsystem.
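The alternating sum in Eq. 20 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each partition $P_{k}$ is taken to be a single classical $\pm 1$ variable, Shannon entropies estimated from a list of samples stand in for the von Neumann entropies, and the function names and the exhaustively enumerated test ensembles are hypothetical.

```python
import itertools
import math
from collections import Counter


def shannon_entropy(samples, parts):
    """Empirical Shannon entropy (nats) of the variables indexed by `parts`."""
    counts = Counter(tuple(s[i] for i in parts) for s in samples)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def s_topo_n(samples):
    """Alternating sum of Eq. 20 over all non-empty subsets of the n variables."""
    n = len(samples[0])
    result = 0.0
    for i in range(1, n + 1):
        sign = (-1) ** (i + n)
        for subset in itertools.combinations(range(n), i):
            result += sign * shannon_entropy(samples, subset)
    return result


def loop_ensemble(n):
    """All +/-1 configurations of n spins whose product is +1, each listed once."""
    return [s for s in itertools.product((+1, -1), repeat=n) if math.prod(s) == 1]


def free_ensemble(n):
    """All +/-1 configurations of n spins, with no constraint."""
    return list(itertools.product((+1, -1), repeat=n))


for n in (3, 4, 5):
    print(n, s_topo_n(loop_ensemble(n)), s_topo_n(free_ensemble(n)))
```

For the constrained ensemble every proper subset of spins is uniformly distributed and only the full set feels the constraint, so the sum evaluates to $-\log 2$ for each $n$ shown, while the unconstrained ensemble gives zero.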

III Statistics of entanglement sampling

The intimate relation between high-order irreducible mutual information (or information synergy) and non-local correlation in topologically ordered systems naturally calls for a probabilistic interpretation in quantum models. In this section we present the sampling process and an example based on the TC, which harbors a $Z_{2}$ gauge theory.

III.1 Joint distribution from projective sampling

In particular, we focus on the entanglement of "skeletal" regions in lattice models, which in 2D lattices are lines with no volume. The advantage of such a partition is that for gapped systems it requires the minimum number of qubit samples in order to expose the topological structure of the wave function, such as a Wilson loop operator; $I_{n}$ can thereby be interpreted as the $n$-th order mutual information between $n$ sampled qubits. By exploiting the inherent probabilistic nature of the quantum wavefunction, we can define the probability distribution given a set of projective measurements $\mathcal{P}_{i}$ [27]

$$p_{i}=\Tr(\mathcal{P}_{i}^{\dagger}\mathcal{P}_{i}\rho),~~\rho^{\prime}=\frac{\mathcal{P}_{i}\rho\mathcal{P}_{i}^{\dagger}}{\Tr(\mathcal{P}_{i}^{\dagger}\mathcal{P}_{i}\rho)} \tag{21}$$

where $i$ denotes a state in a certain complete computational basis $\{\sigma_{i}^{\alpha_{i}}|i=1\cdots n,\alpha\in\{0,x,y,z\}\}$ and $\rho^{\prime}$ is the resultant density matrix after the measurement. Assuming each $\mathcal{P}_{i}$ is a local projector, we conduct sampling on a skeletal partition of the lattice, i.e. regions (lines) that have no volume [54] but have non-trivial topology as closed loops. Then in an $n$-site subsystem one can consider the probability in the autoregressive form

$$p(\sigma_{1}^{\alpha_{1}},\cdots,\sigma_{n}^{\alpha_{n}})=\prod_{i=1}^{n}p(\sigma_{i}^{\alpha_{i}}|\vec{\sigma}_{<i}) \tag{22}$$

where $\vec{\sigma}_{<i}\equiv\{\sigma_{j}^{\alpha_{j}}|j<i\}$. In principle one can simply project the state onto a definite many-body basis without a series of local projections; however, local projections readily allow the study of local density matrices and also make the procedure amenable to tensor network methods such as the density matrix renormalization group and matrix product states. Eq. 22 can be treated as a classical probability distribution for any normalized wavefunction. Note that even though it is formalized as a classical probability, quantum information is nevertheless accessible by shuffling the local computational basis; and since the probability $p(\sigma_{i}^{\alpha_{i}}|\vec{\sigma}_{<i})$ is determined by projective measurements conditioned on the totality of previous measurements, the sampled data is in general able to reflect the entanglement structure of $\rho$ according to Eq. 21. Assume that $\{\sigma_{i}\}$ are gauge-field degrees of freedom, as required by the TO, where local contractible Wilson loops are constrained to have zero total flux, and any other operators that do not form closed loops, i.e. those that are not gauge-invariant, are not subject to such a constraint. Then the joint probability in Eq. 22 will exhibit a strong synergy if $(\sigma_{1}^{\alpha_{1}},\cdots,\sigma_{n}^{\alpha_{n}})$ forms a Wilson loop, whereby strong fluctuations are accompanied by many-body constraints, and weak synergy otherwise.
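The sequential sampling described by Eq. 21 and Eq. 22 can be sketched with NumPy as follows. This is a minimal illustration, not the procedure used in the paper: the input is a hypothetical four-qubit pure state (a uniform superposition of $z$-basis states with an even number of flipped spins) chosen only because its $z$-basis statistics obey a loop-like product constraint, and all function names are assumptions.

```python
import numpy as np


def even_parity_state(n):
    """Uniform superposition of n-qubit z-basis states with an even number of 1 bits."""
    psi = np.zeros((2,) * n)
    for idx in np.ndindex(*psi.shape):
        if sum(idx) % 2 == 0:
            psi[idx] = 1.0
    return psi / np.linalg.norm(psi)


def sample_autoregressive(psi, rng):
    """Measure the qubits one at a time in the z basis, collapsing after each outcome.

    Each step draws from p(sigma_i | earlier outcomes), as in Eq. 22, and then
    applies the projection-and-renormalisation update of Eq. 21.
    """
    n = psi.ndim
    state = psi.copy()
    outcome = []
    for i in range(n):
        # Born probabilities for qubit i, conditioned on the collapsed state so far.
        probs = np.array([np.sum(np.abs(np.take(state, b, axis=i)) ** 2) for b in (0, 1)])
        probs = probs / probs.sum()
        b = int(rng.choice(2, p=probs))
        # Local projector onto outcome b of qubit i, followed by renormalisation.
        keep = [slice(None)] * n
        keep[i] = b
        projected = np.zeros_like(state)
        projected[tuple(keep)] = state[tuple(keep)]
        state = projected / np.linalg.norm(projected)
        outcome.append(1 - 2 * b)  # map bit 0/1 to spin +1/-1
    return outcome


rng = np.random.default_rng(0)
psi = even_parity_state(4)
samples = [sample_autoregressive(psi, rng) for _ in range(200)]
print(samples[:5])
```

In this toy state the first three outcomes are drawn freely while the last one is fixed by the earlier ones, so every sampled configuration multiplies to $+1$; this is the kind of many-body constraint that the joint distribution of a Wilson loop is expected to show.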

III.2 Pure $Z_{2}$ gauge theory

In this section, we demonstrate the equivalence between the following three key concepts: (i) the zero-flux constraint for Wilson loops, (ii) the topological entanglement entropy and (iii) the higher-order mutual information $I_{3}$ in the TC model as a pure $Z_{2}$ gauge theory, by a classical probabilistic treatment.

Figure 2: The toric code lattice, with subsystem partitions as skeletal regions. (a) The two mutually commuting terms of the TC Hamiltonian in Eq. 23. (b) A skeletal subsystem consisting of four partitions $P_{1},\cdots,P_{4}$, which corresponds to the topology of Fig. 1(e), whereby TEE can be described by the mutual information $I_{4}$. (c) An RBM network representation for the sample distribution of a local skeletal region. The red nodes belong to the hidden layer of the RBM; the projectively measured spins in the TC lattice are treated as the visible nodes of the RBM. For simplicity we only draw, in pink dashed lines, the couplings between one hidden node and the visible nodes within the region of interest. If measurements are projected on the $z$ axis, the RBM will be able to discern an effective fourth-order interaction that is responsible for the $I_{4}$ encoded in the plaquette Wilson loop of the TC.

The simplest macroscopic model that realizes a pure $Z_{2}$ gauge theory and TO is the toric code model, where the local energy density is given by mutually commuting stabilizers that are responsible for the non-local loop operators. We therefore start with the toric code Hamiltonian

$$H_{\rm TC}=-J_{\rm TC}\left[\sum_{s}A_{s}+\sum_{p}B_{p}\right] \tag{23}$$

with $A_{s}$ and $B_{p}$ given by

$$A_{s}=\prod_{i\in+_{s}}\sigma_{i}^{x},~~B_{p}=\prod_{i\in\square_{p}}\sigma_{i}^{z} \tag{24}$$

as shown in Fig. 2(a). Its ground state can be exactly constructed by superposing all gauge-equivalent wave functions: $\ket{\Psi}\propto\prod_{s}(1+A_{s})\ket{\Psi_{0}}$, where $\ket{\Psi_{0}}$ can be any $z$-basis state that satisfies $\prod_{i\in\square_{p}}\sigma_{i}^{z}=1$. Let us consider a four-site plaquette subsystem consisting of the link variables $\sigma_{\square}\equiv\{\sigma_{1},\sigma_{2},\sigma_{3},\sigma_{4}\}$. The topological nature is reflected in the fact that the product of the four $\sigma^{z}$ ($\sigma^{x}$) that reside in a plaquette (star) must be $+1$ for the ground state wave function, and that the gauge operator enforces the superposition of all gauge configurations that satisfy this constraint. The ground state wave function can hence be written as

$$\ket{\Psi}\propto\sum_{\{\sigma_{\square}\}}\ket{\sigma_{\square}}\otimes\ket{\overline{\sigma_{\square}}} \tag{25}$$

up to a normalizing factor, where $\ket{\overline{\sigma_{\square}}}$ denotes the gauge configuration of the links complementary to $\sigma_{\square}$. The summation is over the configurations $\{\sigma_{\square}\}$ of the subsystem, which determine the set of configurations in the environment. Due to the zero-flux constraint in the ground state, there are three degrees of freedom that fluctuate independently; hence, the normalized reduced density matrix, summed over the $2^{3}$ zero-flux configurations of the plaquette, takes the form

$$\rho_{\square}=\Tr_{\overline{\sigma_{\square}}}(\ket{\Psi}\bra{\Psi})=\frac{1}{2^{3}}\sum_{\{\sigma_{\square}\}}\ket{\sigma_{\square}}\bra{\sigma_{\square}} \tag{26}$$

which immediately gives the entropy $S=L\log 2-\log 2$, with $L=4$ and the second term $S_{\rm topo}=-\log 2$. The simplicity of $\rho_{\square}$ in TC makes it an ideal, minimal platform to test the statistical approach. To apply a statistical measure of topological entanglement, we use projective measurements, which produce classical probabilities. Given a mixed state $\{p_{i},\ket{\psi_{i}}\}$, the (reduced) density matrix is defined by $\rho=\sum_{i}p_{i}\ket{\psi_{i}}\bra{\psi_{i}}$. A projective measurement by the projector $\mathcal{P}_{i}$ gives the outcome $i$ with probability $p_{i}$, and $\rho$ collapses into $\rho^{\prime}$ as discussed in Eq. 21.
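The entropy count above can be checked by direct enumeration. The following is a minimal sketch, assuming the reduced density matrix is diagonal in the $z$ basis with equal weight on every zero-flux configuration; the function name `plaquette_entropy` is hypothetical and introduced only for illustration.

```python
import itertools
import math

def plaquette_entropy(num_links=4):
    """Entropy (in nats) of an equal-weight diagonal rho over zero-flux configurations."""
    # Zero flux: the product of the link variables around the plaquette equals +1.
    configs = [c for c in itertools.product((+1, -1), repeat=num_links)
               if math.prod(c) == 1]
    p = 1.0 / len(configs)  # equal weights, as implied by gauge symmetry
    entropy = -sum(p * math.log(p) for _ in configs)
    return entropy, len(configs)

entropy, n_configs = plaquette_entropy(4)
area_term = 4 * math.log(2)      # L log 2 with L = 4
topo_term = entropy - area_term  # what remains after subtracting the L log 2 term
```

Under these assumptions `topo_term` plays the role of $S_{\rm topo}$ in the text: it is the deficit caused by removing one independent degree of freedom through the flux constraint.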

Note that the gauge constraint in $Z_{2}$ TO states that any contractible loop must have zero flux in the ground state expectation; hence, in the $z$ basis, the four spins in every sample must multiply to one. Equivalently, after fixing one of the four spins by a projection onto $\ket{\uparrow}$, projective measurements on the remaining three $\sigma_{i}$ should give a dependent sample distribution. This allows us to write it in the tripartite form, which immediately gives a non-zero synergistic information as the topological entanglement. To see this, we calculate the (conditional) mutual information of $\{\sigma_{1},\sigma_{2},\sigma_{3}\}$ in the classical information context given a fixed $\sigma_{4}=+1$:

$$I(\sigma_{1}:\sigma_{2}|\sigma_{4}=+1)=\sum_{\sigma_{1},\sigma_{2}}p(\sigma_{1},\sigma_{2})\log\frac{p(\sigma_{1},\sigma_{2})}{p(\sigma_{1})p(\sigma_{2})} \tag{27}$$
$$\begin{split}I(\sigma_{1}:\sigma_{2}|\sigma_{3};\sigma_{4}=+1)&=\sum_{\sigma_{1}\sigma_{2}\sigma_{3}}p(\sigma_{1},\sigma_{2},\sigma_{3})\\ &\times\log\frac{p(\sigma_{3})p(\sigma_{1},\sigma_{2},\sigma_{3})}{p(\sigma_{1},\sigma_{3})p(\sigma_{2},\sigma_{3})}\end{split} \tag{28}$$

It is straightforward to evaluate these equations, noting that only a few choices of $\{\sigma_{1},\sigma_{2},\sigma_{3}\}$ are allowed by the zero-flux constraint. Here we can directly write down the marginal probabilities. Due to the gauge symmetry, all valid gauge configurations share the same probability weight; hence we have

$$p(\sigma_{i}=\pm 1)=\frac{1}{2},\quad p(\sigma_{1}=\pm 1,\sigma_{2}=\pm 1)=\frac{1}{4} \tag{29}$$
$$p(\sigma_{1}=\pm 1,\sigma_{2}=\pm 1,\sigma_{3}=\pm 1)=\frac{1}{4} \tag{30}$$

where Eq. 30 holds for the four configurations with $\sigma_{1}\sigma_{2}\sigma_{3}=+1$, and the remaining four configurations have zero weight. These immediately give

$$I(\sigma_{1}:\sigma_{2}|\sigma_{4}=+1)=0, \tag{31}$$
$$I(\sigma_{1}:\sigma_{2}|\sigma_{3};\sigma_{4}=+1)=\log 2 \tag{32}$$

and thus, as the difference between Eq. 31 and Eq. 32, the third order mutual information, which coincides with the TEE:

$$I_{3}(\sigma_{1}:\sigma_{2}:\sigma_{3}|\sigma_{4}=+1)=-\log 2=S_{\rm topo}({\rm TC}) \tag{33}$$

By the same token we would arrive at the same $-\log 2$ with the fixed $\sigma_{4}=-1$. This is also consistent with the result obtained by the LW construction on a skeletal region [54]. Hence, in the chosen basis, the topological entanglement inside the plaquette Wilson loop can be treated as a classical statistical problem where a negative third order mutual information $I_{3}$ is indicative of the existence of TEE; this holds true for larger loops with negative $I_{n}$. Note that the equal-weight superposition of gauge configurations, and thus the presence of the gauge operator $\sum_{s}\prod_{i\in+_{s}}\sigma_{i}^{x}$ in the Hamiltonian, is essential in deriving the above result. Its absence would lead to a product state that trivially satisfies the constraint $\expectationvalue{\prod_{\square}\sigma_{i}^{z}}=1$ at zero temperature, without any fluctuation in the samples of projective measurements.
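The evaluation of Eqs. 27 to 33 can be reproduced numerically. The sketch below assumes the idealized sample distribution described in the text (uniform over the configurations with $\sigma_{1}\sigma_{2}\sigma_{3}=+1$ once $\sigma_{4}=+1$ is fixed); the helper names `marginal_entropy` and `third_order_mi` are hypothetical.

```python
import itertools
import math

def marginal_entropy(dist, keep):
    """Shannon entropy (nats) of the marginal of `dist` on the index tuple `keep`."""
    marg = {}
    for config, p in dist.items():
        key = tuple(config[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log(p) for p in marg.values() if p > 0)

def third_order_mi(dist):
    """Return I(1:2), I(1:2|3) and I3 = I(1:2) - I(1:2|3) for three variables."""
    H = lambda *idx: marginal_entropy(dist, idx)
    i12 = H(0) + H(1) - H(0, 1)
    i12_given_3 = H(0, 2) + H(1, 2) - H(2) - H(0, 1, 2)
    return i12, i12_given_3, i12 - i12_given_3

# sigma_4 = +1 is fixed, so zero flux means sigma_1 * sigma_2 * sigma_3 = +1.
allowed = [c for c in itertools.product((+1, -1), repeat=3) if math.prod(c) == 1]
tc_dist = {c: 1.0 / len(allowed) for c in allowed}
i12, i12_given_3, i3 = third_order_mi(tc_dist)
```

Here the conditional mutual information is written through entropies, $I(1:2|3)=H_{13}+H_{23}-H_{3}-H_{123}$, which is equivalent to the sum in Eq. 28.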

This toy example also clearly showcases the gauge constraint as an essential piece that gives rise to the synergistic information, in the same way it is needed in the conventional derivation of TEE in previous references. From the statistical point of view, the TEE is equivalent to the intrinsic many-body correlation that cannot be reduced to correlations between pairs of qubits. The key role played by the gauge constraint is also reflected in another kind of subsystem partition: consider a subsystem whose four constituent qubits are collinear, so that the gauge constraint does not interfere. In this case $I_{3}=0$, since no synergy emerges in the absence of the gauge constraint; this is also consistent with Ref. [43], where the authors showed that the quantum conditional mutual information vanishes for subsystems with collinear topology. Hence, to witness $S_{\rm topo}$ or $I_{n}$, we must partition a subsystem into a topology such that the gauge constraint is present, such as the skeletal loop shown in Fig. 2(b). Nevertheless, we would like to point out that it is still possible to extract TEE in certain fine-tuned scenarios using only two-point correlations if there are particles in additional quantum sectors that are coupled to the emergent gauge field, whereby the information of the gauge sector is imprinted into the two-point correlators of the matter sector [55].

IV Statistical representation by Restricted Boltzmann machine

The goal is to capture the irreducible many-body correlation, or the synergistic information present in $I_{n}$, in the probability distribution of the samples generated from projective measurements on a potentially topologically ordered pure state. The correlation of a state $\Psi$ is encoded in the reduced density matrix of a subsystem $S$, which can be formally associated with an entanglement Hamiltonian $\mathcal{H}(\sigma\in S)$ [25, 56] (we use the calligraphic $\mathcal{H}$ here in contrast to the physical Hamiltonian):

$$\rho_{S}=\Tr_{\bar{S}}\ket{\Psi}\bra{\Psi}\equiv e^{-\mathcal{H}(\sigma\in S)} \tag{34}$$

where $\bar{S}$ denotes the complement of the set of degrees of freedom of $S$. In this formulation, the many-body correlation or topological entanglement is encoded in an interacting Hamiltonian $\mathcal{H}(\sigma\in S)$, where the $n$-body correlation/entanglement of the density matrix is reflected in the $n$-body interaction of the entanglement Hamiltonian. In a pure gauge theory like TC, it suffices to include only diagonal elements in the $\sigma^{z}$ basis, and Eq. 34 reduces to a Boltzmann form $p=e^{-\mathcal{H}}/Z$.

Representing the correlation information by an interacting model $\mathcal{H}(\sigma\in S)$ in Eq. 34 requires a statistical network that is able to capture the high-order correlations in the sample distribution using a manageable number of coupling parameters. As universal approximators, deep neural networks are in general good at capturing non-local and high-order information such as TO that exhibits $I_{n}$ for large $n$. However, in order to retain analytical tractability, we choose to apply the RBM as a representation model for the projectively sampled data from a quantum state. An RBM is defined on a bipartite Ising lattice whereby only one subset of spins $\sigma$ is physical (visible), whereas the auxiliary spins $\tau$ in the other subset are deemed unphysical (hidden). It is essentially a generalized auxiliary field representation of coupled binary spins, where arbitrarily high order interactions between $\sigma$ can be represented by at most second order interactions between $\sigma$ and $\tau$. For detailed discussions of RBM we refer readers to the Appendix as well as Ref. [57].

Figure 3: An illustration of the effective interaction of visible nodes in an RBM, where the $r$ blue circles that are included in the same shaded region enjoy an $r$-body interaction $\mathcal{I}^{(r)}$. The four blue circles denote the visible nodes $\sigma$, and the $n$ red ones denote the hidden nodes $\tau$. The two-body couplings between $\sigma$ and $\tau$ give effective higher order interactions between $\sigma$ when the hidden layer is traced out.

IV.1 Many-body interaction via two-body interaction

In this subsection we explain the intuition for why a two-body interacting model like the RBM can be used to construct many-body correlations/interactions. It has been proven that, with a sufficiently large number of hidden nodes, any probability distribution can be well approximated by an RBM [58]. The essence of the RBM is the representation of an effective model with many-body interactions by a model with redundant degrees of freedom that are coupled only by two-body interactions. To make this point explicit, let $\mathcal{H}_{\tau}$ be the Hamiltonian of the hidden nodes, which can simply take the form of Pauli matrices $\tau$ in the presence of magnetic fields; let $\mathcal{H}_{\sigma}$ be the Hamiltonian of the visible nodes, which are coupled by at most two-body interactions; and let $\mathcal{H}_{\sigma\tau}$ be the Hamiltonian of the two-body interactions between the visible and the hidden nodes. The generic Green function $G$ of the whole system is given by $(E-\mathcal{H})G=I$, which in block matrix form can be written as

$$\begin{pmatrix}E-\mathcal{H}_{\tau}&-\mathcal{H}_{\sigma\tau}\\ -\mathcal{H}_{\sigma\tau}^{\dagger}&E-\mathcal{H}_{\sigma}\end{pmatrix}\begin{pmatrix}G_{\tau}&G_{\sigma\tau}\\ G_{\tau\sigma}&G_{\sigma}\end{pmatrix}=\begin{pmatrix}1&0\\ 0&1\end{pmatrix} \tag{35}$$

This immediately gives

$$\left\{E-\left[\mathcal{H}_{\sigma}+\mathcal{H}_{\sigma\tau}^{\dagger}(E-\mathcal{H}_{\tau})^{-1}\mathcal{H}_{\sigma\tau}\right]\right\}G_{\sigma}=I \tag{36}$$

which means the Green function of the visible nodes is

$$G_{\sigma}=\frac{1}{E-(\mathcal{H}_{\sigma}+\Sigma)} \tag{37}$$

with the self energy $\Sigma$ given by

$$\Sigma=\mathcal{H}_{\sigma\tau}^{\dagger}(E-\mathcal{H}_{\tau})^{-1}\mathcal{H}_{\sigma\tau} \tag{38}$$

It is $\Sigma$ that involves higher order interactions in the perturbation expansion and gives the RBM its representation power. We can therefore identify an effective Hamiltonian of the visible nodes under the influence of the hidden nodes. The effective Hamiltonian is simply

$$\mathcal{H}^{\rm eff}_{\sigma}(E)=\mathcal{P}_{\sigma}(\mathcal{H}_{\sigma}+\Sigma)\mathcal{P}_{\sigma}^{\dagger} \tag{39}$$

where $\mathcal{P}_{\sigma}$ projects states onto the manifold of visible nodes. The Green function formalism makes clear that, even though all the contributing Hamiltonians are local and two-body in nature, the self energy term in $\mathcal{H}^{\rm eff}_{\sigma}$ can contain the non-local, many-body interactions encoded in the entanglement Hamiltonian in Eq. 34, as illustrated in Fig. 3; this can be made formally explicit by a perturbation or cumulant expansion. Figure 2(c) shows an example of the RBM description for the data obtained from projections on a small Wilson loop of a lattice model.
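The elimination step that leads from the block equation of Eq. 35 to the self energy can be spelled out as a short worked derivation. It is only a sketch: it assumes that $E-\mathcal{H}_{\tau}$ is invertible, and it keeps the dagger of the lower-left block of Eq. 35 explicit (for real couplings this is a plain transpose). Reading off the second column of Eq. 35 gives two coupled relations, and solving the first for $G_{\sigma\tau}$ closes the second on $G_{\sigma}$ alone:

```latex
\begin{aligned}
(E-\mathcal{H}_{\tau})\,G_{\sigma\tau}-\mathcal{H}_{\sigma\tau}\,G_{\sigma}&=0
\;\;\Longrightarrow\;\;
G_{\sigma\tau}=(E-\mathcal{H}_{\tau})^{-1}\mathcal{H}_{\sigma\tau}\,G_{\sigma},\\
-\mathcal{H}_{\sigma\tau}^{\dagger}\,G_{\sigma\tau}+(E-\mathcal{H}_{\sigma})\,G_{\sigma}&=1
\;\;\Longrightarrow\;\;
\left[E-\mathcal{H}_{\sigma}-\mathcal{H}_{\sigma\tau}^{\dagger}(E-\mathcal{H}_{\tau})^{-1}\mathcal{H}_{\sigma\tau}\right]G_{\sigma}=1 .
\end{aligned}
```

The bracket is $E-(\mathcal{H}_{\sigma}+\Sigma)$: the hidden nodes enter the visible sector only through the resolvent $(E-\mathcal{H}_{\tau})^{-1}$ sandwiched between two coupling blocks, and expanding that resolvent generates the higher order terms.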

As we discuss in detail in the following subsections, the RBM is a specific implementation of the above picture, where $\mathcal{H}_{\sigma}(\mathbf{a})$ and $\mathcal{H}_{\tau}(\mathbf{b})$ contribute only one-body energies to the total Hamiltonian, with $\mathbf{a},\mathbf{b}$ the coupling parameters, and $\mathcal{H}_{\tau\sigma}(\mathbf{w})$ encodes all two-body interactions $w_{ij}$ between the visible and hidden nodes. By tuning the coupling parameters $\mathbf{a},\mathbf{b},\mathbf{w}$ and tracing out the redundant degrees of freedom $\tau$, we arrive at

$$\mathcal{H}^{\rm eff}=\sum_{s=1}^{N}\sum_{k_{1}<\cdots<k_{s}}\mathcal{I}^{(s)}_{k_{1},\cdots,k_{s}}\sigma_{k_{1}}\cdots\sigma_{k_{s}} \tag{40}$$

where $\mathcal{I}^{(s)}_{k_{1},\cdots,k_{s}}$ is a function of $\mathbf{a},\mathbf{b},\mathbf{w}$ that gives the effective coupling between the $s$ visible nodes $\sigma_{k_{1}},\cdots,\sigma_{k_{s}}$. Indeed, the ability to encode arbitrarily high order interactions makes the RBM a universal representation for all statistical distributions.

In general, a large $n$-body mutual information $I_{n}$ indicates a large $\mathcal{I}^{(n)}_{k_{1},\cdots,k_{n}}$ in the effective energy of the RBM defined by $p(\sigma)\sim\exp(-\mathcal{H}^{\rm eff})$, which is equivalent to Eq. 34 for the diagonal elements and will be discussed in detail in the next subsection. A similar idea based on the multi-layered deep Boltzmann machine has been used to resolve the non-local entanglement features on the boundary by a local Ising model in the context of holographic duality [59, 60]. One contribution of our work is to make explicit the form of $\mathcal{I}^{(n)}_{k_{1},\cdots,k_{n}}$ of an RBM for $\sigma\in\{+1,-1\}$, relevant for projective samples of spin-$\frac{1}{2}$, and also for other binary bits, without involving a perturbation or cumulant expansion, making it easy to extract the interaction strength of arbitrary order. This allows us to interrogate the RBM for the existence of non-local many-body correlations between the sampled degrees of freedom, where a large $\mathcal{I}^{(n)}$ of the trained network is indicative of an $n$th-order correlation, i.e. the TEE encoded in the entanglement Hamiltonian in the basis in which the Wilson loop is diagonal. Furthermore, we also present the effective interaction for $\sigma\in\{0,1\}$ in the Appendix, which can be used to represent high-order density-density interactions in fermion models [24]; however, since the presence of any zero would lead to a trivial energy contribution in Eq. 40 regardless of the order, it is not capable of representing non-trivial many-body correlations like Wilson loops of spins.

Figure 4: High-order mutual information as a function of different orders of interaction. (a) The third order mutual information as a function of two- and three-body interactions ($\mathcal{I}^{(2)}$ and $\mathcal{I}^{(3)}$ in Eq. 41). Even though a negative $I_{3}$ may be present in the absence of a three-body interaction $\mathcal{I}^{(3)}$, it requires $|\mathcal{I}^{(3)}|\gg|\mathcal{I}^{(2)}|$ in order to capture the $-\log 2$ topological entropy in the projected data (see Eq. 33). (b) The negative gradient of $I_{3}$. The arrows denote the optimization direction of the effective coupling parameters in an RBM with three-spin input (conditioned on $\sigma_{4}$). (c) The fourth order mutual information as a function of two- and four-body interactions in the frustrated Hamiltonian $\mathcal{H}_{4}=\mathcal{I}^{(2)}(\sigma_{1}\sigma_{2}-\sigma_{2}\sigma_{3}+\sigma_{3}\sigma_{4}+\sigma_{1}\sigma_{4})-\mathcal{I}^{(4)}\sigma_{1}\sigma_{2}\sigma_{3}\sigma_{4}$. Similar to the former case, it requires $|\mathcal{I}^{(4)}|\gg|\mathcal{I}^{(2)}|$ in order to capture the $-\log 2$ topological entropy in the projected data. (d) The negative gradient of $I_{4}$. The arrows denote the optimization direction of the effective coupling parameters in an RBM with four-spin input.

IV.2 Large high-order mutual information requires high-order interaction

The previous subsection has established the possibility of representing the (projected) entanglement Hamiltonian by means of an effective many-body Ising model with arbitrary orders of interaction via the RBM. However, we would like to point out a caveat that needs clarification: high-order interactions and high-order correlations address fundamentally different aspects of a system. The former concern the effective Hamiltonian dynamics that stabilizes low-energy states, while the latter concern emergent statistical properties of these states. Indeed, as pointed out in Refs. [1, 4], high-order mutual information $I_{n\geq 3}$ can arise in frustrated Ising models with only pairwise interactions. For example, consider a simple frustrated Ising model with three binary spins, whose Hamiltonian is given by

$$\mathcal{H}_{3}=-\mathcal{I}^{(2)}(\sigma_{1}\sigma_{2}+\sigma_{1}\sigma_{3}+\sigma_{2}\sigma_{3})-\mathcal{I}^{(3)}\sigma_{1}\sigma_{2}\sigma_{3} \tag{41}$$

This three-spin model is pairwise frustrated when $\mathcal{I}^{(3)}=0$ and $\mathcal{I}^{(2)}<0$, and a straightforward calculation shows that the third order mutual information is negative, indicating a frustration-induced irreducible three-body correlation. Therefore, one may ask whether an $n$th-order correlation in a closed Wilson loop can also be reflected in lower order interactions $\mathcal{I}^{(n^{\prime}<n)}$, causing a large degeneracy in the RBM representation.

Here we show that such confusion due to the potential degeneracy of representation does not arise in representing the non-local correlation of a TO with $S_{\rm topo}=-\log 2$. Indeed, even though the frustrated model $\mathcal{H}_{3}$ gives rise to a negative $I_{3}$ in the absence of the high-order interaction $\mathcal{I}^{(3)}$, it cannot generate a high-order mutual information as large as $I_{3}=S_{\rm topo}=-\log 2$ (see Eq. 33) without a dominant $\mathcal{I}^{(3)}$. A straightforward calculation shows

$$\lim_{\mathcal{I}^{(2)}\rightarrow-\infty}I_{3}[\mathcal{H}_{3}(\mathcal{I}^{(2)},\mathcal{I}^{(3)}=0)]=-\log\left(\frac{9}{8}\right) \tag{42}$$

which clearly deviates from the $-\log 2$ of a $Z_{2}$ gauge theory. Therefore, a large value of $\mathcal{I}^{(3)}$ is required in the optimized RBM in order to generate the correct many-body correlation in the projectively measured data. As shown in Fig. 4(a,b), a faithful RBM representation of $S_{\rm topo}=-\log 2$ is only possible with the highest-order interaction $\mathcal{I}^{(3)}$, and the parameters in the effective Hamiltonian of the RBM must flow along the direction in which $\mathcal{I}^{(3)}$ increases. The same holds in a four-spin model, as shown in Fig. 4(c,d), which we will demonstrate by our RBM implementation in the following sections. Indeed, this is similar to the classical picture of topological order, where the topological entropy can be perceived as thermally mixed classical loops with energetic constraints [47]. By induction one can show that higher-order scenarios have the same properties, which we do not enumerate in this paper.
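The limit in Eq. 42 and its contrast with the three-body case can be checked by brute-force enumeration of the Boltzmann distribution $p\propto e^{-\mathcal{H}_{3}}$ of Eq. 41. The sketch below uses large but finite couplings ($\pm 20$, an arbitrary illustrative choice) as a stand-in for the infinite limit; the function names are hypothetical.

```python
import itertools
import math

def boltzmann3(i2, i3):
    """Distribution p ~ exp(-H3) with H3 = -i2*(s1s2 + s1s3 + s2s3) - i3*s1s2s3."""
    energy = {s: -i2 * (s[0]*s[1] + s[0]*s[2] + s[1]*s[2]) - i3 * s[0]*s[1]*s[2]
              for s in itertools.product((+1, -1), repeat=3)}
    e_min = min(energy.values())  # shift energies for numerical stability
    weights = {s: math.exp(-(e - e_min)) for s, e in energy.items()}
    z = sum(weights.values())
    return {s: wt / z for s, wt in weights.items()}

def marginal_entropy(dist, keep):
    """Shannon entropy (nats) of the marginal of `dist` on the index tuple `keep`."""
    marg = {}
    for config, p in dist.items():
        key = tuple(config[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log(p) for p in marg.values() if p > 0)

def i3_of(dist):
    """Third order mutual information via inclusion-exclusion of entropies."""
    H = lambda *idx: marginal_entropy(dist, idx)
    return H(0) + H(1) + H(2) - H(0, 1) - H(0, 2) - H(1, 2) + H(0, 1, 2)

i3_frustrated = i3_of(boltzmann3(i2=-20.0, i3=0.0))  # pairwise frustration only
i3_three_body = i3_of(boltzmann3(i2=0.0, i3=20.0))   # dominant three-body term
```

With only pairwise frustration the six mixed configurations become equally likely, which bounds how negative $I_{3}$ can get; a dominant three-body term instead selects the four even-parity configurations that appear in the TC samples.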

IV.3 Interrogate the restricted Boltzmann machine

In this section, we discuss how to interrogate a trained RBM to extract effective interactions of arbitrary order. The physical spins are obtained by projective measurements in a definite projection basis. The RBM network is defined by the energy function

$$\mathcal{H}(\sigma,\tau)=-\sum_{i}a_{i}\sigma_{i}-\sum_{j}b_{j}\tau_{j}-\sum_{i,j}w_{ij}\sigma_{i}\tau_{j} \tag{43}$$

and the joint probability distribution is $p(\sigma,\tau)=\frac{1}{Z}e^{-\mathcal{H}(\sigma,\tau)}$ with $Z$ the partition function. The probability distribution of the physical spins $\sigma$ can be obtained by marginalization over the unphysical spins

$$p(\sigma)=\Tr_{\tau}p(\sigma,\tau)=\frac{e^{-\mathcal{H}(\sigma)}}{Z^{\prime}} \tag{44}$$

where the associated partition function is

$$Z^{\prime}=\frac{Z}{\Tr_{\tau}\exp(\sum_{j}b_{j}\tau_{j})} \tag{45}$$

The effective energy $\mathcal{H}(\sigma)$ is given by

$$\mathcal{H}(\sigma)=\sum_{i}a_{i}\sigma_{i}+K\left(\sum_{i}w_{ij}\sigma_{i}\right) \tag{46}$$

where $K\left(\sum_{i}w_{ij}\sigma_{i}\right)$ is the cumulant generating function

$$K\left(\sum_{i}w_{ij}\sigma_{i}\right)=\log\Tr_{\tau}\left[\exp(\sum_{ij}w_{ij}\sigma_{i}\tau_{j})\rho(\tau)\right] \tag{47}$$

and $\rho(\tau)$ is the probability density function of the unphysical spins

$$\rho(\tau)=\exp(\sum_{j}b_{j}\tau_{j})\Big/\Tr_{\tau}\exp(\sum_{j}b_{j}\tau_{j}) \tag{48}$$
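The marginalization over the hidden layer can be checked numerically on a small network. The sketch below assumes hidden spins $\tau_{j}=\pm 1$, for which the trace in Eq. 47 factorizes into $K=\sum_{j}\log[\cosh(b_{j}+x_{j})/\cosh b_{j}]$ with $x_{j}=\sum_{i}w_{ij}\sigma_{i}$. It compares this closed form against an explicit sum of $e^{-\mathcal{H}(\sigma,\tau)}$ over all hidden configurations. The couplings `a`, `b`, `w` are arbitrary illustrative numbers, not trained values.

```python
import itertools
import math

def cumulant_K(sigma, w, b):
    """K of Eq. 47 for tau_j = +/-1 (assumed): sum_j log[cosh(b_j + x_j) / cosh(b_j)]."""
    total = 0.0
    for j, bj in enumerate(b):
        xj = sum(w[i][j] * sigma[i] for i in range(len(sigma)))
        total += math.log(math.cosh(bj + xj)) - math.log(math.cosh(bj))
    return total

def log_weight_bruteforce(sigma, a, b, w):
    """log of sum_tau exp(-H(sigma, tau)), with H the energy function of Eq. 43."""
    acc = 0.0
    for tau in itertools.product((+1, -1), repeat=len(b)):
        minus_h = sum(a[i] * sigma[i] for i in range(len(a)))
        minus_h += sum(b[j] * tau[j] for j in range(len(b)))
        minus_h += sum(w[i][j] * sigma[i] * tau[j]
                       for i in range(len(a)) for j in range(len(b)))
        acc += math.exp(minus_h)
    return math.log(acc)

# Hypothetical couplings: 3 visible spins, 2 hidden spins.
a = [0.1, -0.2, 0.3]
b = [0.05, -0.4]
w = [[0.5, -0.3], [0.2, 0.7], [-0.6, 0.1]]

# sigma-independent constant: log Tr_tau exp(sum_j b_j tau_j), cf. Eq. 45.
log_trace_b = sum(math.log(2.0 * math.cosh(bj)) for bj in b)

max_gap = max(
    abs(log_weight_bruteforce(s, a, b, w)
        - (sum(ai * si for ai, si in zip(a, s)) + cumulant_K(s, w, b) + log_trace_b))
    for s in itertools.product((+1, -1), repeat=len(a)))
```

The check confirms that the log of the unnormalized marginal weight splits into the one-body term, the cumulant generating function $K$, and a $\sigma$-independent constant that is absorbed into $Z^{\prime}$.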

Here we use the spin states in $\{+1,-1\}$, and we seek to expand the $\mathcal{I}$ in Eq. 40, which usually requires a cumulant expansion of Eq. 47 to arbitrary orders of $\sigma$ such that the different orders of correlation become explicit:

$$K(x)=\sum_{n}\frac{1}{n!}\kappa^{(n)}x^{n},\qquad x\equiv\sum_{i}w_{ij}\sigma_{i} \tag{49}$$

where $\kappa^{(n)}$ is the $n$th cumulant

$$\kappa^{(n)}=\frac{\partial^{n}}{\partial x^{n}}K(x)\Big|_{x=0} \tag{50}$$

Indeed, this is the method usually used to extract or construct effective interactions in RBMs [39, 61, 24]. However, the $\kappa^{(n)}$ in Eq. 50 requires high-order derivatives in the presence of many-body interactions, making it inconvenient to analytically track the effective interaction of arbitrary order and to evaluate $\mathcal{I}^{(n)}$ for large $n$. Hence, the implementations in previous pioneering works have exploited only the first few orders of interaction.
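To illustrate why a truncated cumulant expansion is limiting, the sketch below treats a single hidden spin $\tau=\pm 1$ with bias $b$ (an assumption made for simplicity), for which $K(x)=\log\cosh(b+x)-\log\cosh b$ and the first three cumulants follow from Eq. 50 as $\kappa^{(1)}=\tanh b$, $\kappa^{(2)}=1-\tanh^{2}b$, $\kappa^{(3)}=-2\tanh b\,(1-\tanh^{2}b)$. The value $b=0.3$ and the test points are arbitrary.

```python
import math

def K_single(x, b):
    """Exact K for one hidden spin tau = +/-1 with bias b."""
    return math.log(math.cosh(b + x)) - math.log(math.cosh(b))

def cumulants_up_to_third(b):
    """First three derivatives of K at x = 0, cf. Eq. 50."""
    t = math.tanh(b)
    return t, 1.0 - t * t, -2.0 * t * (1.0 - t * t)

def truncated_series(x, b):
    """Eq. 49 truncated after the third order."""
    k1, k2, k3 = cumulants_up_to_third(b)
    return k1 * x + k2 * x ** 2 / 2.0 + k3 * x ** 3 / 6.0

b = 0.3
err_small = abs(K_single(0.1, b) - truncated_series(0.1, b))  # weak coupling
err_large = abs(K_single(1.5, b) - truncated_series(1.5, b))  # strong coupling
```

The truncation is accurate only when the argument $x$ is small; for couplings of order one the neglected orders matter, which is the regime relevant for a dominant high-order interaction.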

Here, instead of using the conventional cumulant expansion, we propose a projective construction and derive the closed form of the effective interaction at arbitrary order without involving derivatives or cumulants. For $\sigma\in\{+1,-1\}$, which is anti-symmetric in the $z$ basis of the Pauli matrix, it is intuitively clear that the $r$th order interaction should be captured by

$$\mathcal{I}^{(r)}_{k_{1},\cdots,k_{r}}\sim\Tr_{\{\sigma_{1},\cdots,\sigma_{N}\}}\left[\prod_{i}^{r}\sigma_{k_{i}}\mathcal{H}^{\rm eff}\right],\quad\sigma_{i}=\pm 1 \tag{51}$$

since all energy contributions cancel due to the anti-symmetry of $\sigma^{z}$, except for those of $\sigma_{k_{1}},\cdots,\sigma_{k_{r}}$, which are made symmetric by $\prod_{i}^{r}\sigma_{k_{i}}$. Indeed, we will show that this representation of $\mathcal{I}^{(r)}$ holds in the case of $\sigma\in\{+1,-1\}$ and is suitable for detecting the $I_{n}$ of an emergent gauge field. Yet, in cases such as $\sigma\in\{0,1\}$, i.e. in the absence of anti-symmetry, we need to explicitly construct the effective interaction using a different method. Below we present a general construction that is valid for all binary $\sigma$. The trick is to construct a proper resolution of identity that makes all orders of interaction explicit at once. For convenience, we here present the derivation for $\sigma\in\{+1,-1\}$, and use Dirac notation in the $z$ basis of the Hilbert space with zero off-diagonal elements. In the Appendix we present another example of such a construction, where the binary spins can be either $0$ or $1$. We first define the projector $\mathcal{P}_{k}$ that selects the state where all but the $k$th visible node $\sigma_{k}$ are $-1$:

$$\mathcal{P}_{k}=\ket{\sigma_{k}=1;\sigma_{k^{\prime}\neq k}=-1}\bra{\sigma_{k}=1;\sigma_{k^{\prime}\neq k}=-1} \tag{52}$$

Then the global projection on $\vec{\sigma}$, supported on the subspace of visible nodes, that selects the states in which exactly one visible node is active is defined by

$$\mathcal{P}^{(1)}\equiv\sum_{k=1}^{N}\mathcal{P}_{k} \tag{53}$$

Similarly, for any positive integer $m\leq N$ we can define the projector $\mathcal{P}^{(m)}$ that selects the states in which $m$ out of the $N$ visible nodes are active:

$$\mathcal{P}^{(m)}=\sum_{k_{1}<\cdots<k_{m}}\mathcal{P}_{k_{1},\cdots,k_{m}} \tag{54}$$

where $\mathcal{P}_{k_{1},\cdots,k_{m}}$ is the projector that selects the state with $\sigma_{k\in\{k_{1},\cdots,k_{m}\}}=1$ and all others $-1$:

$$\begin{split}\mathcal{P}_{k_{1},\cdots,k_{m}}=&\ket{\sigma_{k\in\{k_{1},\cdots,k_{m}\}}=1;\sigma_{k^{\prime}\notin\{k_{1},\cdots,k_{m}\}}=-1}\\ &\bra{\sigma_{k\in\{k_{1},\cdots,k_{m}\}}=1;\sigma_{k^{\prime}\notin\{k_{1},\cdots,k_{m}\}}=-1}\end{split} \tag{55}$$

These give a useful resolution of identity

$$\sum_{m=0}^{N}\mathcal{P}^{(m)}=I \tag{56}$$

by which Eq. 47, as a diagonal operator, can be readily written as a combination of different groups:

$$\begin{split}\hat{K}(\mathbf{w}^{\intercal}\vec{\sigma})&=\sum_{\vec{\sigma}}\sum_{m=0}^{N}\sum_{k_{1}<\cdots<k_{m}}K_{m}(\mathbf{w}^{\intercal}_{k_{j}})\\ &\times\delta_{\sigma_{k\in\{k_{1},\cdots,k_{m}\}},1}~\delta_{\sigma_{k\notin\{k_{1},\cdots,k_{m}\}},-1}\\ &\times\ket{\sigma_{k\in\{k_{1},\cdots,k_{m}\}}=1;\sigma_{k^{\prime}\notin\{k_{1},\cdots,k_{m}\}}=-1}\bra{\vec{\sigma}}\end{split} \tag{57}$$

where for convenience we have defined

$$K_{m}(\mathbf{w}^{\intercal}_{k_{j}})\equiv K\left(\sum_{j=1}^{m}\mathbf{w}_{k_{j}}^{\intercal}-\sum_{j=m+1}^{N}\mathbf{w}^{\intercal}_{k_{j}}\right) \tag{58}$$

Noting that $\sigma_{k}$ is binary and classical, we write the Kronecker deltas in Eq. 57 as

$$\begin{split}&\delta_{\sigma_{k\in\{k_{1},\cdots,k_{m}\}},1}~\delta_{\sigma_{k\notin\{k_{1},\cdots,k_{m}\}},-1}\\ &=\prod_{i=1}^{m}\left(\frac{1+\sigma_{k_{i}}}{2}\right)\prod_{k\notin\{k_{1},\cdots,k_{m}\}}\left(\frac{1-\sigma_{k}}{2}\right)\end{split} \tag{59}$$

Hence the diagonal terms of $\hat{K}$, given by the trace over $\sigma$, are

$$\begin{split}K(\mathbf{w}^{\intercal}\vec{\sigma})&=\sum_{m=0}^{N}\sum_{k_{1}<\cdots<k_{m}}K_{m}(\mathbf{w}^{\intercal}_{k_{j}})\\ &\times\prod_{i=1}^{m}\left(\frac{1+\sigma_{k_{i}}}{2}\right)\prod_{k\notin\{k_{1},\cdots,k_{m}\}}\left(\frac{1-\sigma_{k}}{2}\right)\end{split} \tag{60}$$

where we can expand the two products separately. The first product in Eq. 60 is written as

$$\prod_{i=1}^{m}\left(\frac{1+\sigma_{k_{i}}}{2}\right)=\frac{1}{2^{m}}\sum_{q=0}^{m}\sum_{k_{1}<\cdots<k_{q}}\sigma_{k_{1}}\cdots\sigma_{k_{q}} \tag{61}$$

and the other product as

$$\begin{split}\prod_{k\notin\{k_{1},\cdots,k_{m}\}}&\left(\frac{1-\sigma_{k}}{2}\right)=\frac{1}{2^{N-m}}\sum_{p=0}^{N-m}(-1)^{p}\\ &\times\left(\sum_{k_{m+1}<\cdots<k_{m+p}}\sigma_{k_{m+1}}\cdots\sigma_{k_{m+p}}\right)\end{split} \tag{62}$$
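As a minimal worked illustration of how the projector expansion of Eq. 60 and the product expansions of Eqs. 61 and 62 combine, consider a single visible spin ($N=1$) with weight vector $\mathbf{w}$. This is only a sketch under that simplifying assumption; the two terms correspond to $m=1$ and $m=0$:

```latex
\begin{aligned}
K(\mathbf{w}^{\intercal}\sigma)
&=K(+\mathbf{w}^{\intercal})\,\frac{1+\sigma}{2}+K(-\mathbf{w}^{\intercal})\,\frac{1-\sigma}{2}\\
&=\frac{1}{2}\left[K(+\mathbf{w}^{\intercal})+K(-\mathbf{w}^{\intercal})\right]
 +\frac{\sigma}{2}\left[K(+\mathbf{w}^{\intercal})-K(-\mathbf{w}^{\intercal})\right].
\end{aligned}
```

The coefficient of $\sigma$ is, up to the prefactor $1/2^{N}$, the sum $\sum_{\sigma=\pm 1}\sigma K(\mathbf{w}^{\intercal}\sigma)$, which is the one-spin instance of the trace form in Eq. 69.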

Then, combining them, we have $K$ in the following form:

$$\begin{split}&K(\mathbf{w}^{\intercal}\vec{\sigma})=\sum_{m=0}^{N}\sum_{p=0}^{N-m}\sum_{\begin{subarray}{c}k_{1}<\cdots<k_{m};\\ k_{m+1}<\cdots<k_{m+p}\\ \neq(k_{1},\cdots,k_{m})\end{subarray}}K_{m}(\mathbf{w}^{\intercal}_{k_{j}})\\ &(-1)^{p}\left[\sum_{q=0}^{m}\left(\sum_{k_{1}<\cdots<k_{q}}\sigma_{k_{1}}\cdots\sigma_{k_{q}}\right)\sigma_{k_{m+1}}\cdots\sigma_{k_{m+p}}\right]\end{split} \tag{63}$$

where we have left out the constant prefactor $\frac{1}{2^{N}}$. Given the size $N$ of the sample space, each term in the above expression is determined by a triplet $(m,q,p)$, where $q$ is upper bounded by $m$ and $p$ by $N-m$. This looks complicated and hard to rearrange. In order to write down an explicit energy function for the $r$th order of interaction, the above summation can be grouped according to the doublets $(p,q)$, which determine the order $r=p+q$. Replacing the index $q$ by $q=r-p$, we have

$$\begin{split}&K^{(r)}(\mathbf{w}^{\intercal}\vec{\sigma})=\sum_{m=0}^{N}\sum_{p=0}^{N-m}\sum_{\begin{subarray}{c}k_{1}<\cdots<k_{m};\\ k_{m+1}<\cdots<k_{m+p}\\ \neq(k_{1},\cdots,k_{m})\end{subarray}}K_{m}(\mathbf{w}^{\intercal}_{k_{j}})\\ &(-1)^{p}\left[\sum_{\begin{subarray}{c}k_{1}<\cdots<k_{r-p};\\ k_{i}\in\{k_{1},\cdots,k_{m}\}\end{subarray}}\sigma_{k_{1}}\cdots\sigma_{k_{r-p}}\sigma_{k_{m+1}}\cdots\sigma_{k_{m+p}}\right]\end{split} \tag{64}$$

where the first $r-p$ spins $\sigma_{k_{1}}\cdots\sigma_{k_{r-p}}$ in the product are attributed to the projection onto $\sigma=+1$, i.e. they come from Eq. 61, and are thus responsible for $+\mathbf{w}^{\intercal}_{k_{s_{i}}}$ in $K$; the nested label $k_{s_{i}}$ ($1\leq i\leq r-p$) marks the $r-p$ out of the $r$ spins that are chosen to be projected onto $+1$. The spins $\sigma_{k_{m+1}}\cdots\sigma_{k_{m+p}}$ in the product are attributed to the projection onto $\sigma=-1$, i.e. they come from Eq. 62, and are thus responsible for $-\mathbf{w}^{\intercal}_{k_{j_{i}}}$ in $K$; we group these spins using the labels $k_{j_{i}}$ with $p=|\{k_{j_{i}}\}|$. Finally, the remaining $N-r$ spins, though not among the $r$ spins of interest, still contribute to the energy (in contrast to the case of $\sigma\in\{1,0\}$) and are associated with $\pm\mathbf{w}^{\intercal}_{k_{l_{i}}}$ for $1\leq i\leq N-r$. Therefore, by rearranging indices it is straightforward to read off the interaction strength $\mathcal{I}^{(r)}_{k_{1},\cdots,k_{r}}$ between the $r$ spins $\sigma_{k_{1}}\cdots\sigma_{k_{r}}$:

$$\begin{split}\mathcal{I}^{(r)}_{k_{1},\cdots,k_{r}}=&\sum_{\begin{subarray}{c}\{k_{j_{i}}\},\{k_{s_{i}}\}\\ \subseteq\{k_{1},\cdots,k_{r}\}\end{subarray}}\sum_{\eta}(-1)^{p}K\Bigl(-\sum_{i=1}^{p}\mathbf{w}^{\intercal}_{k_{j_{i}}}\\ &+\sum_{i=1}^{r-p}\mathbf{w}^{\intercal}_{k_{s_{i}}}+\sum_{i=1}^{N-r}\eta^{k_{l_{i}}}\mathbf{w}^{\intercal}_{k_{l_{i}}}\Bigr)\end{split} \tag{65}$$

where the indices are grouped according to

$$\{k_{j_{i}}\}\cap\{k_{s_{i}}\}=\emptyset, \tag{66}$$
$$\{k_{j_{i}}\}\cup\{k_{s_{i}}\}=\{k_{1},\cdots,k_{r}\}, \tag{67}$$
$$k_{l_{i}}\in\{k_{r+1},\cdots,k_{N}\} \tag{68}$$

and $\sum_{\eta}$ is the summation over all vectors $\eta$ of dimension $|\{k_{l_{i}}\}|=N-r$ whose elements $\eta^{k_{l_{i}}}$ are $\pm 1$ binaries. It is then easy to see the compact equivalent form of Eq. 65:

$$\mathcal{I}^{(r)}_{k_{1},\cdots,k_{r}}=\Tr_{\{\sigma_{1},\cdots,\sigma_{N}\}}\left[\sigma_{k_{1}}\cdots\sigma_{k_{r}}K\left(\sum_{i}w_{ij}\sigma_{i}\right)\right] \tag{69}$$

which is consistent with Eq. 51. This is our central result for the RBM representation. We will test the strength of the formalism in the coming section in the context of $Z_{2}$ TO, such as the toric code, which harbors non-local, many-body correlations in its reduced density matrices. The construction for $\sigma\in\{0,1\}$ is presented in Appendix A for other potential applications.
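The trace formula of Eq. 69 can be sketched in a few lines and sanity-checked: if the coefficients for all subsets are computed, then $K(\vec{\sigma})=2^{-N}\sum_{S}\mathcal{I}_{S}\prod_{k\in S}\sigma_{k}$ must reproduce $K$ on every configuration, where $2^{-N}$ is the prefactor dropped in the text. The sketch assumes hidden spins $\tau_{j}=\pm 1$ so that $K$ has the log-cosh form; the weights are arbitrary illustrative numbers rather than a trained network, and all function names are hypothetical.

```python
import itertools
import math

def cumulant_K(sigma, w, b):
    """K for hidden spins tau_j = +/-1 (assumed): sum_j log[cosh(b_j + x_j) / cosh(b_j)]."""
    total = 0.0
    for j, bj in enumerate(b):
        xj = sum(w[i][j] * sigma[i] for i in range(len(sigma)))
        total += math.log(math.cosh(bj + xj)) - math.log(math.cosh(bj))
    return total

def interaction(subset, w, b):
    """Eq. 69 without the 1/2^N prefactor: trace of (prod_{k in subset} sigma_k) * K."""
    total = 0.0
    for sigma in itertools.product((+1, -1), repeat=len(w)):
        sign = math.prod(sigma[k] for k in subset)  # empty product is 1
        total += sign * cumulant_K(sigma, w, b)
    return total

def all_interactions(w, b):
    """Coefficients for every subset of visible spins, keyed by index tuple."""
    n_visible = len(w)
    return {subset: interaction(subset, w, b)
            for r in range(n_visible + 1)
            for subset in itertools.combinations(range(n_visible), r)}

def reconstruct_K(sigma, coeffs):
    """Rebuild K(sigma) from the coefficients, restoring the 1/2^N prefactor."""
    return sum(c * math.prod(sigma[k] for k in subset)
               for subset, c in coeffs.items()) / 2 ** len(sigma)

# Hypothetical couplings: 4 visible spins (rows), 3 hidden spins (columns).
w = [[0.8, -0.3, 0.1], [0.5, 0.9, -0.7], [-0.4, 0.2, 0.6], [0.3, -0.8, 0.4]]
b = [0.1, -0.2, 0.05]
coeffs = all_interactions(w, b)
max_error = max(abs(reconstruct_K(s, coeffs) - cumulant_K(s, w, b))
                for s in itertools.product((+1, -1), repeat=4))
```

The brute-force trace costs $2^{N}$ evaluations of $K$ per coefficient, which is acceptable for the small loops considered here; no derivatives of $K$ are needed at any order.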

Figure 5: (a) The number of samples corresponding to different spin configurations from the trained RBM. The eight dominant configurations have zero flux, in accordance with the gauge constraint of TC. The inset shows the network structure of the RBM with six hidden spins and four visible spins sampled from a plaquette loop of TC. (b) Visualization of the weight matrix $\mathbf{w}$ of the trained RBM. Thick and thin blue lines indicate strong and weak couplings. (c) The magnitude of effective interactions of different orders. The strongest interaction is of the fourth order, corresponding to the inset figure.

V RBM representation of the high-order correlation in TC

The smallest unit that exhibits topological entanglement in TC is the four-point plaquette operator, which is also the smallest Wilson loop that is gauge invariant in the vortex-free ground state. As discussed in previous sections, the topological entanglement in this minimal case can be understood as the higher order mutual information $I_{4}$ that is encoded in the local density matrix. In this section, we use an RBM network to represent the reduced density matrix of TC in the basis in which the Wilson loop is explicit, i.e. $\prod_{\square_{i}}\sigma_{i}^{z}$ or $\prod_{+_{i}}\sigma_{i}^{x}$, and apply the previously derived result to capture the four-body correlation by the effective four-body interaction of the trained RBM.

We first unpack Eq. 65 (or Eq. 69) and write down explicitly its first few orders of interaction, up to the fourth order, at which a Wilson loop is built. In this simple example $N=4$, so each order of effective interaction can be written out term by term. For instance, at second order,

$$\begin{split}\mathcal{I}^{(2)}_{1,2}=\sum_{\eta}\Bigl[&K\bigl(-\mathbf{w}_{1}^{\intercal}-\mathbf{w}_{2}^{\intercal}+\eta^{1}\mathbf{w}_{3}^{\intercal}+\eta^{2}\mathbf{w}_{4}^{\intercal}\bigr)\\ &+K\bigl(\mathbf{w}_{1}^{\intercal}+\mathbf{w}_{2}^{\intercal}+\eta^{1}\mathbf{w}_{3}^{\intercal}+\eta^{2}\mathbf{w}_{4}^{\intercal}\bigr)\\ &-K\bigl(\mathbf{w}_{1}^{\intercal}-\mathbf{w}_{2}^{\intercal}+\eta^{1}\mathbf{w}_{3}^{\intercal}+\eta^{2}\mathbf{w}_{4}^{\intercal}\bigr)\\ &-K\bigl(-\mathbf{w}_{1}^{\intercal}+\mathbf{w}_{2}^{\intercal}+\eta^{1}\mathbf{w}_{3}^{\intercal}+\eta^{2}\mathbf{w}_{4}^{\intercal}\bigr)\Bigr]\end{split} \tag{70}$$

and so on, where $\eta^{i}=\pm 1$ and the sum over $\eta$ applies to all four terms. The sign of the prefactor of each summand is determined by the number of minus signs among the first two spins, i.e. the $(-1)^{p}$ of Eq. 65: an even (odd) number of minus signs gives a positive (negative) prefactor. The third-order $\mathcal{I}^{(3)}$ and fourth-order $\mathcal{I}^{(4)}$ terms can be expressed in the same way, with eight and sixteen summands respectively.
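The sign-weighted sums above can be evaluated by brute force for small $N$. The following is a minimal sketch rather than the procedure used for Fig. 5. It assumes hidden units $\tau\in\{0,1\}$ with biases `b`, so that $K$ reduces to a sum of softplus terms, and it assumes a $1/2^{N}$ normalization of the trace so that the returned values are the coefficients of the spin-product expansion of $K$. The names `K`, `effective_interaction` and `all_interactions` are hypothetical.

```python
import itertools
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def K(w, b, sigma):
    # Assumed form of K for hidden units tau in {0,1}:
    # sum_j log(1 + exp(b_j + sum_i w[i][j] * sigma_i)).
    return sum(
        softplus(b[j] + sum(w[i][j] * s for i, s in enumerate(sigma)))
        for j in range(len(b))
    )

def effective_interaction(w, b, sites):
    # Sign-weighted sum over all sigma in {+1,-1}^N, as in Eqs. 69 and 70,
    # divided by 2^N (assumed normalization).
    n = len(w)
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        sign = math.prod(sigma[k] for k in sites)
        total += sign * K(w, b, sigma)
    return total / 2 ** n

def all_interactions(w, b):
    # One coefficient per subset of visible sites, for every order r = 0..N.
    n = len(w)
    return {
        sites: effective_interaction(w, b, sites)
        for r in range(n + 1)
        for sites in itertools.combinations(range(n), r)
    }
```

With this normalization the coefficients reproduce $K$ exactly when summed against the corresponding spin products, which provides a quick consistency check on any trained weight matrix. The cost grows as $2^{N}$ per coefficient, so the sketch is only meant for small visible layers such as the four-spin plaquette.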

Results of the RBM are presented in Fig. 5. The network structure is shown in the inset of Fig. 5(a), where we used six nodes in the hidden layer to capture the joint probability distribution of 5000 projective samples taken in the $z$ basis on the four spins of a plaquette. Note that it takes at least $n$ hidden nodes to represent an $n$th-order effective interaction between the visible spins. The interaction matrix $\mathbf{w}$ of the trained RBM is shown in Fig. 5(b), where thicker lines indicate stronger coupling between visible and hidden spins. Together these couplings give the effective fourth-order interaction, while leaving the lower orders of effective interaction negligible. The comparison between different orders of interaction is shown in Fig. 5(c): as expected from the fact that the four spins are entangled collectively, the fourth-order effective interaction $\sigma_{1}\sigma_{2}\sigma_{3}\sigma_{4}$ is significantly larger than all lower-order interactions. The validity of the model is further verified by direct sampling from the RBM: as shown in Fig. 5(a), the first eight configurations, whose spin product equals $+1$, are dominant in frequency over those whose product is $-1$. The same method would work equally well in other topologically ordered systems with a richer Hilbert space, such as the Kitaev spin liquid, the paradigmatic integrable TO model defined on the honeycomb lattice. Its eigenfunction factorizes into a gauge sector and a Majorana fermion sector, $\ket{\Psi}=\sum_{\mathcal{G}}\ket{M_{\mathcal{G}}}\otimes\ket{\mathcal{G}}$, of which only the gauge sector $\ket{\mathcal{G}}$ contributes to the topological entanglement entropy of the Wilson loop of spins [62, 55]. The smallest Wilson loop in the Kitaev model is the six-point hexagon with alternating spin basis, which requires at least six hidden nodes to represent the sixth-order correlation in the projective samples. The logic for the TEE of the Kitaev model is the same as that presented for the TC model, so we do not repeat the sampling here.
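The statistics that the RBM is expected to learn can be illustrated without any training. The sketch below assumes an idealized plaquette distribution that is uniform over the eight zero-flux configurations (an assumption made here for illustration, not data from the paper) and evaluates all spin correlators exactly. The names `configs`, `correlator` and `corr` are hypothetical.

```python
import itertools
import math

# The eight zero-flux configurations of a plaquette:
# sigma_1 * sigma_2 * sigma_3 * sigma_4 = +1.
configs = [s for s in itertools.product((-1, 1), repeat=4) if math.prod(s) == 1]

def correlator(samples, sites):
    # Average of prod_{k in sites} sigma_k over the given samples.
    return sum(math.prod(s[k] for k in sites) for s in samples) / len(samples)

# Correlators of every order r = 1..4 under the uniform zero-flux distribution.
corr = {
    sites: correlator(configs, sites)
    for r in range(1, 5)
    for sites in itertools.combinations(range(4), r)
}
```

Under this assumption every one-, two- and three-point correlator vanishes while the four-point correlator equals one, which is the purely fourth-order structure that the dominant effective interaction in Fig. 5(c) is meant to capture.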

In this minimal example we have shown how TEE in TC can be witnessed using projective samples in the basis proper to the Wilson loop. Even under weak perturbation, we expect the projective samples in this basis to exhibit the same dominant statistical correlation. Under random-basis measurement, on the other hand, the statistical correlation relevant for TEE may be absent for a particular choice of basis combination. However, given the low computational cost, one can always train multiple RBMs and extract the effective interactions of each of them. The TEE would then be reflected by the existence of a high-order effective interaction, which also directly informs the explicit form of the Wilson loop.

VI Conclusion and outlook

In this work we propose a statistical interpretation that unifies high-order correlation and topological entanglement under the same statistical framework. We demonstrate in Sec. II that the existence of a non-zero TEE can be understood in the statistical view as the emergent $n$th-order mutual information $I_{n}$ (for arbitrary $n\geq 3$) reflected in projectively measured samples, which also makes explicit the equivalence between the two existing methods for its extraction: the Kitaev-Preskill and the Levin-Wen constructions. The statistical nature of $I_{n}$ can be reflected in the effective $n$th-order mutual information, as discussed in a minimal example in Sec. III.2. Hence, by exploiting the universal representational power of the RBM, $I_{n}$ can be described by the effective interaction between visible nodes of a trained RBM that serves as a descriptor of the distribution of quantum samples of spins. In Sec. V, we explicitly showcased the construction of an RBM that captures the high-order correlation and/or topological entanglement encoded in the distribution of projected samples. Furthermore, in order to extract the coefficient of each order of interaction, we developed in Sec. IV.3 a method to interrogate the trained RBM, making explicit the analytical form of an arbitrary order of interaction relevant for $I_{n}$ in terms of the effective Hamiltonian $\mathcal{H}$, whereby the high-order correlation is reflected in the effective many-body interaction between visible nodes after tracing out the hidden nodes. The topological phase of TC has recently been realized in cold-atom setups [16, 17]. Through our statistical perspective and concrete neural network construction, we hope to provide useful insights for these and related investigations of topologically ordered matter.

Beyond faithfully describing the high-order correlation, such exact extraction of the effective interactions up to arbitrary order opens the door to various applications in many-body physics. Indeed, the effective many-body interaction encoded by an RBM has already been exploited in many-body systems. For example, the authors of Ref. [39] used an RBM to capture the interaction matrix of an Ising model, and the authors of Ref. [24] used an RBM to exactly represent the interaction between nucleons. In our work, the exact extraction of the $n$th-order interaction presented in Eq. 65 allows us to go further, to arbitrarily high orders of interaction and to a generic Hubbard-Stratonovich-type transformation, i.e. the representation of many-body interacting Hamiltonians in terms of two-body Hamiltonians with auxiliary fields. A potential application is a many-nucleon interaction:

$$H^{V}_{ij}=v_{ij}\hat{n}_{i}\hat{n}_{j}+v_{ijk}\hat{n}_{i}\hat{n}_{j}\hat{n}_{k}+v_{ijkl}\hat{n}_{i}\hat{n}_{j}\hat{n}_{k}\hat{n}_{l}+\cdots \tag{71}$$

where the $v$ are scalar coefficients that depend on the nucleon coordinates and are functionals of spin and isospin. It is then possible to introduce an auxiliary field so as to provide a simpler representation with fewer orders of interaction. Indeed, this is carried out in Ref. [24], where the authors used the hidden nodes of an RBM as an auxiliary field $h$ to disentangle the third-order interaction into two-body interactions between $h$ and $n$. With our representation of high-order interactions, we hope to inspire auxiliary-field constructions that decouple arbitrarily high-order interactions between nucleons and in other many-body interacting systems.

Acknowledgments

S. Feng acknowledges support from NSF Materials Research Science and Engineering Center (MRSEC) Grant No. DMR-2011876 and the Presidential Fellowship of The Ohio State University. N. Trivedi acknowledges support from NSF-DMR 2138905. D. Kong acknowledges support from UCLA Graduate Fellowship. We thank Xiaozhou Feng for insightful discussions on RBM. We also thank Kevin Zhang, Yuri Lensky and Eun-Ah Kim for enlightening discussions during our collaboration on supervised machine learning.

Appendix A Representation for $\sigma\in\{0,1\}$

The logic is the same as that for $\sigma\in\{\pm 1\}$, except that the representation of the projection operator changes. In the $\{0,1\}$ case, we apply the following resolution of identity, with projectors defined for any integer $0\leq m\leq N$:

$$\mathcal{P}^{(m)}=\sum_{k_{1}<\cdots<k_{m}}\mathcal{P}_{k_{1}\cdots k_{m}}, \tag{72}$$

where $\mathcal{P}_{k_{1},\cdots,k_{m}}$ is defined as

$$\begin{split}\mathcal{P}_{k_{1},\cdots,k_{m}}=&\ket{\sigma_{k\in\{k_{1},\cdots,k_{m}\}}=1;\sigma_{k^{\prime}\notin\{k_{1},\cdots,k_{m}\}}=0}\\ &\bra{\sigma_{k\in\{k_{1},\cdots,k_{m}\}}=1;\sigma_{k^{\prime}\notin\{k_{1},\cdots,k_{m}\}}=0}\end{split} \tag{73}$$

and the completeness is then given by $\sum_{m=0}^{N}\mathcal{P}^{(m)}=I$. In the diagonal case, it is straightforward to write the cumulant generating function in matrix form:

$$\hat{K}(\mathbf{w}^{\intercal}\sigma)=\sum_{\sigma}\log\left[\sum_{\tau}\exp(\sigma\mathbf{w}^{\intercal}\tau)\rho(\tau)\right]\ket{\sigma}\bra{\sigma} \tag{74}$$

Inserting the resolution of identity in terms of projectors gives

$$\begin{split}\hat{K}(\mathbf{w}^{\intercal}\hat{\sigma})=&\sum_{m=0}^{N}\sum_{\sigma}\sum_{k_{1}<\cdots<k_{m}}K\left(\sum_{j=1}^{m}\mathbf{w}^{\intercal}_{k_{j}}\right)\\ &\times\delta_{\sigma_{k\in\{k_{1},\cdots,k_{m}\}},1}\,\delta_{\sigma_{k\notin\{k_{1},\cdots,k_{m}\}},0}\\ &\times\ket{\sigma_{k\in\{k_{1},\cdots,k_{m}\}}=1;\sigma_{k^{\prime}\notin\{k_{1},\cdots,k_{m}\}}=0}\bra{\sigma}\end{split} \tag{75}$$

In the classical case, where $\sigma_{k}$ is binary, we write the Kronecker deltas as

$$\delta_{\sigma_{k\in\{k_{1},\cdots,k_{m}\}},1}\,\delta_{\sigma_{k\notin\{k_{1},\cdots,k_{m}\}},0}=\prod_{i=1}^{m}\sigma_{k_{i}}\prod_{k\notin\{k_{1},\cdots,k_{m}\}}(1-\sigma_{k}) \tag{76}$$

Hence the diagonal part of $\hat{K}$, obtained by taking the trace over $\sigma$, is

$$\begin{split}K(\mathbf{w}^{\intercal}\sigma)=&\sum_{m=0}^{N}\sum_{k_{1}<\cdots<k_{m}}K\left(\sum_{j=1}^{m}\mathbf{w}_{k_{j}}^{\intercal}\right)\\ &\times\prod_{i=1}^{m}\sigma_{k_{i}}\prod_{k\notin\{k_{1},\cdots,k_{m}\}}(1-\sigma_{k})\end{split} \tag{77}$$

We expand the last factor into

$$\begin{split}\prod_{k\notin\{k_{1},\cdots,k_{m}\}}&\left(1-\sigma_{k}\right)=\sum_{p=0}^{N-m}(-1)^{p}\\ &\times\left(\sum_{k_{m+1}<\cdots<k_{m+p}}\sigma_{k_{m+1}}\cdots\sigma_{k_{m+p}}\right)\end{split} \tag{78}$$

where the indices $k_{m+1},\cdots,k_{m+p}$ run over sites outside $\{k_{1},\cdots,k_{m}\}$. Substituting Eq. 78 into Eq. 77, we obtain the expansion from which the effective interaction of order $s=m+p$ can be read off:

$$\begin{split}K(\mathbf{w}^{\intercal}\sigma)=&\sum_{m=0}^{N}\sum_{p=0}^{N-m}\sum_{k_{1}<\cdots<k_{m}}\sum_{k_{m+1}<\cdots<k_{m+p}}(-1)^{p}K\left(\sum_{j=1}^{m}\mathbf{w}_{k_{j}}^{\intercal}\right)\\ &\times\sigma_{k_{1}}\sigma_{k_{2}}\cdots\sigma_{k_{m+p}}\end{split} \tag{79}$$

This is equivalent to the result derived in Ref. [63] for $\sigma\in\{0,1\}$ using a different method. By rearranging the indices, it is straightforward to get

$$\mathcal{I}^{(s)}_{k_{1},\cdots,k_{s}}=\sum_{p=0}^{s-1}(-1)^{p}\sum_{j_{1}<\cdots<j_{s-p}}^{s}K\left(\sum_{i=1}^{s-p}\mathbf{w}_{k_{j_{i}}}^{\intercal}\right) \tag{80}$$

This result can be used to represent high-order density-density interactions in fermion models [24]. However, since any $\sigma_{k}=0$ makes the corresponding term contribute trivially to the effective energy regardless of the order of interaction, this representation cannot capture non-trivial many-body correlations or interactions of spins, which require the representation for $\sigma\in\{+1,-1\}$ discussed in the main text.

Appendix B Training of Restricted Boltzmann machine

An RBM is a bipartite binary probabilistic graphical model corresponding to the following distribution,

$$p(\sigma,\tau)=\frac{1}{Z}\exp[-E(\sigma,\tau)] \tag{81}$$

which assigns a probability to every possible pair of a visible vector ($\sigma$) and a hidden vector ($\tau$) via the energy function:

$$E(\sigma,\tau)=-\sum_{i\in\mathrm{visible}}a_{i}\sigma_{i}-\sum_{j\in\mathrm{hidden}}b_{j}\tau_{j}-\sum_{i,j}w_{ij}\sigma_{i}\tau_{j} \tag{82}$$

The probability of $\sigma$ or $\tau$ is given by marginalization:

$$p(\sigma)=\frac{1}{Z}\sum_{\tau}\exp[-E(\sigma,\tau)],\qquad p(\tau)=\frac{1}{Z}\sum_{\sigma}\exp[-E(\sigma,\tau)] \tag{83}$$

The derivative of the log probability with respect to $w_{ij}$ is:

$$\begin{split}\frac{\partial\log p(\sigma)}{\partial w_{ij}}&=\frac{1}{p(\sigma)}\left(-\frac{1}{Z^{2}}\frac{\partial Z}{\partial w_{ij}}\right)\sum_{\tau}e^{-E(\sigma,\tau)}+\frac{1}{p(\sigma)}\sum_{\tau}\sigma_{i}\tau_{j}\frac{e^{-E(\sigma,\tau)}}{Z}\\ &=-\frac{1}{p(\sigma)}\left(\sum_{\tau,\sigma}\sigma_{i}\tau_{j}\frac{e^{-E(\sigma,\tau)}}{Z}\right)\left(\sum_{\tau}\frac{e^{-E(\sigma,\tau)}}{Z}\right)+\sum_{\tau}\sigma_{i}\tau_{j}\frac{p(\sigma,\tau)}{p(\sigma)}\\ &=-\sum_{\tau,\sigma}\sigma_{i}\tau_{j}p(\sigma,\tau)+\sum_{\tau}\sigma_{i}\tau_{j}p(\tau|\sigma)\\ &=-\mathds{E}_{\rm model}\left[\sigma_{i}\tau_{j}\right]+\mathds{E}_{\rm data}\left[\sigma_{i}\tau_{j}\right]\end{split} \tag{84}$$

This leads to the gradient-ascent learning rule for $w_{ij}$:

$$\delta w_{ij}=\beta\left(\mathds{E}_{\rm data}\left[\sigma_{i}\tau_{j}\right]-\mathds{E}_{\rm model}\left[\sigma_{i}\tau_{j}\right]\right) \tag{85}$$

and by the same token we can derive the update rules for $a_{i}$ and $b_{j}$:

$$\begin{split}\delta a_{i}&=\beta\left(\mathds{E}_{\rm data}\left[\sigma_{i}\right]-\mathds{E}_{\rm model}\left[\sigma_{i}\right]\right)\\ \delta b_{j}&=\beta\left(\mathds{E}_{\rm data}\left[\tau_{j}\right]-\mathds{E}_{\rm model}\left[\tau_{j}\right]\right)\end{split} \tag{86}$$

where $\beta$ is the learning rate.
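The gradient in Eq. 84 can be checked by exact enumeration on a very small network. The sketch below assumes two visible spins $\sigma\in\{+1,-1\}$ and two hidden units $\tau\in\{0,1\}$ (sizes chosen only to keep the enumeration trivial). The names `VIS`, `HID`, `energy`, `log_p` and `grad_w` are hypothetical.

```python
import itertools
import math

VIS = list(itertools.product((-1, 1), repeat=2))  # visible states
HID = list(itertools.product((0, 1), repeat=2))   # hidden states

def energy(s, t, a, b, w):
    # Energy function of Eq. 82.
    return -(
        sum(a[i] * s[i] for i in range(len(s)))
        + sum(b[j] * t[j] for j in range(len(t)))
        + sum(w[i][j] * s[i] * t[j] for i in range(len(s)) for j in range(len(t)))
    )

def log_p(s, a, b, w):
    # Log of the visible marginal, Eq. 83, by exact enumeration.
    z = sum(math.exp(-energy(v, t, a, b, w)) for v in VIS for t in HID)
    return math.log(sum(math.exp(-energy(s, t, a, b, w)) for t in HID) / z)

def grad_w(s, i, j, a, b, w):
    # Last line of Eq. 84: E_data[sigma_i tau_j] - E_model[sigma_i tau_j].
    z = sum(math.exp(-energy(v, t, a, b, w)) for v in VIS for t in HID)
    model = sum(
        v[i] * t[j] * math.exp(-energy(v, t, a, b, w)) for v in VIS for t in HID
    ) / z
    zs = sum(math.exp(-energy(s, t, a, b, w)) for t in HID)
    data = sum(s[i] * t[j] * math.exp(-energy(s, t, a, b, w)) for t in HID) / zs
    return data - model
```

Comparing `grad_w` against a central finite difference of `log_p` is a quick way to confirm the sign convention of the update rule before moving to sampled estimates.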

We now need to calculate the expectation values above. We start with the conditional expectation $\mathds{E}_{\rm data}\left[\sigma_{i}\tau_{j}\right]$. The key is to sample from the probability $p(\tau|\sigma)$. The conditional probability is

$$p(\tau|\sigma)=\frac{p(\tau,\sigma)}{p(\sigma)}=\frac{\frac{1}{Z}e^{-E(\sigma,\tau)}}{\frac{1}{Z}\sum_{\tau}e^{-E(\sigma,\tau)}}=\frac{e^{-E(\sigma,\tau)}}{\sum_{\tau}e^{-E(\sigma,\tau)}} \tag{87}$$

and the conditional probability for a single hidden node $\tau_{j}$ can be derived by marginalization:

$$p(\tau_{j}|\sigma)=\sum_{\{\tau_{k}\}-\tau_{j}}p(\{\tau_{k}\}|\sigma)=\frac{\sum_{\{\tau_{k}\}-\tau_{j}}e^{-E(\sigma,\tau)}}{\sum_{\tau}e^{-E(\sigma,\tau)}} \tag{88}$$

For convenience we rewrite the energy function in the following form, which separates the hidden and the visible nodes:

$$\begin{split}E(\sigma,\tau)&=-\sum_{j\in\rm hidden}\left[\tau_{j}\left(b_{j}+\sum_{i\in\rm visible}w_{ij}\sigma_{i}\right)\right]-\sum_{i\in\rm visible}a_{i}\sigma_{i}\\ &\equiv-\sum_{j}\gamma_{j}(\sigma)\tau_{j}-\sum_{i}a_{i}\sigma_{i}\end{split} \tag{89}$$

so the Boltzmann factor in the numerator now takes the form (note the positive exponents, which follow from the overall minus sign in Eq. 89):

$$\exp[-E(\sigma,\tau)]=\prod_{i}e^{a_{i}\sigma_{i}}\prod_{j}e^{\gamma_{j}(\sigma)\tau_{j}} \tag{90}$$

Therefore the denominator in Eq. 88 can be written as

$$\begin{split}\sum_{\tau}e^{-E(\sigma,\tau)}&=\prod_{i}e^{a_{i}\sigma_{i}}\sum_{\tau}\prod_{k}e^{\gamma_{k}(\sigma)\tau_{k}}\\ &=\left[\prod_{i}e^{a_{i}\sigma_{i}}\right]\left[\sum_{\tau_{j}\in\{0,1\}}e^{\gamma_{j}(\sigma)\tau_{j}}\right]\\ &\times\left[\sum_{\{\tau_{k}\}-\tau_{j}}\prod_{k\neq j}e^{\gamma_{k}(\sigma)\tau_{k}}\right]\end{split} \tag{91}$$

and the numerator as

$$\begin{split}\sum_{\{\tau_{k}\}-\tau_{j}}e^{-E(\sigma,\tau)}&=e^{\gamma_{j}(\sigma)\tau_{j}}\left[\prod_{i}e^{a_{i}\sigma_{i}}\right]\\ &\times\left[\sum_{\{\tau_{k}\}-\tau_{j}}\prod_{k\neq j}e^{\gamma_{k}(\sigma)\tau_{k}}\right]\end{split} \tag{92}$$

hence Eq. 88 takes a logistic form:

$$p(\tau_{j}|\sigma)=\frac{e^{\gamma_{j}(\sigma)\tau_{j}}}{1+e^{\gamma_{j}(\sigma)}} \tag{93}$$

Since each $\tau_{j}$ is binary, we can readily write down the conditional probabilities for $\tau_{j}=1$ and $\tau_{j}=0$ given $\sigma$:

$$p(\tau_{j}=1|\sigma)=\frac{\exp(b_{j}+\sum_{i}w_{ij}\sigma_{i})}{1+\exp(b_{j}+\sum_{i}w_{ij}\sigma_{i})}={\rm sigmoid}\left(b_{j}+\sum_{i}w_{ij}\sigma_{i}\right) \tag{94}$$

$$p(\tau_{j}=0|\sigma)=1-p(\tau_{j}=1|\sigma)=\frac{1}{1+\exp(b_{j}+\sum_{i}w_{ij}\sigma_{i})} \tag{95}$$
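The sigmoid form of $p(\tau_{j}=1|\sigma)$ can be verified numerically against the brute-force marginalization of Eq. 88. The following sketch assumes hidden units $\tau\in\{0,1\}$ and the energy function of Eq. 82. The function names `cond_hidden_bruteforce` and `cond_hidden_sigmoid` are hypothetical.

```python
import itertools
import math

def sigmoid(x):
    # Logistic function, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def cond_hidden_bruteforce(j, sigma, a, b, w):
    # p(tau_j = 1 | sigma) from Eq. 88, summing Boltzmann factors explicitly.
    def boltzmann(tau):
        e = -(
            sum(a[i] * sigma[i] for i in range(len(sigma)))
            + sum(b[k] * tau[k] for k in range(len(tau)))
            + sum(w[i][k] * sigma[i] * tau[k]
                  for i in range(len(sigma)) for k in range(len(tau)))
        )
        return math.exp(-e)
    states = list(itertools.product((0, 1), repeat=len(b)))
    numerator = sum(boltzmann(t) for t in states if t[j] == 1)
    return numerator / sum(boltzmann(t) for t in states)

def cond_hidden_sigmoid(j, sigma, b, w):
    # Closed form: sigmoid(b_j + sum_i w_ij sigma_i).
    return sigmoid(b[j] + sum(w[i][j] * sigma[i] for i in range(len(sigma))))
```

The two functions agree for any choice of parameters because the visible-bias factor and the contributions of the other hidden units cancel between numerator and denominator.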

By the same token, we can derive $p(\sigma_{i}|\tau)$, which in the case $\sigma\in\{+1,-1\}$ is no longer the plain sigmoid of the local field but $p(\sigma_{i}=\pm 1|\tau)={\rm sigmoid}\left[\pm 2\left(a_{i}+\sum_{j}w_{ij}\tau_{j}\right)\right]$. We are now prepared to estimate $\mathds{E}_{\rm data}\left[\sigma_{i}\tau_{j}\right]=\sum_{\tau}\sigma_{i}\tau_{j}\,p(\tau|\sigma)$ by sampling, for every pair of $i$ and $j$.

Algorithm 1 Sampling $\mathds{E}_{\rm data}\left[\sigma_{i}\tau_{j}\right]$

Input: Data batch $(\sigma_{1},\cdots,\sigma_{N})$ and initial parameters of the RBM
Output: $\mathds{E}_{\rm data}\left[\sigma_{i}\tau_{j}\right]$
1. Initialize the matrix $\mathbf{M}=0$
2. For each $\sigma_{t}$ in the data batch:
  (a) Sample $\tau\sim p(\tau|\sigma_{t})={\rm sigmoid}(\mathbf{b}+\mathbf{w}^{\top}\sigma_{t})$
  (b) $\mathbf{M}\leftarrow\mathbf{M}+\sigma_{t}\tau^{\top}$
3. $\mathds{E}_{\rm data}[\sigma\tau^{\top}]\leftarrow\mathbf{M}/N$
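Algorithm 1 can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. It assumes hidden units $\tau\in\{0,1\}$ and weights stored as nested lists `w[i][j]` (visible index first). The names `sample_hidden` and `data_expectation` are hypothetical.

```python
import math
import random

def sigmoid(x):
    # Logistic function, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sample_hidden(sigma, b, w, rng):
    # Draw each tau_j in {0,1} with p(tau_j = 1 | sigma) = sigmoid(local field).
    return [
        1 if rng.random() < sigmoid(b[j] + sum(w[i][j] * sigma[i]
                                               for i in range(len(sigma)))) else 0
        for j in range(len(b))
    ]

def data_expectation(batch, b, w, rng):
    # Accumulate M += sigma_t tau^T over the batch, then divide by its size.
    n_v, n_h = len(w), len(b)
    m = [[0.0] * n_h for _ in range(n_v)]
    for sigma in batch:
        tau = sample_hidden(sigma, b, w, rng)
        for i in range(n_v):
            for j in range(n_h):
                m[i][j] += sigma[i] * tau[j]
    return [[value / len(batch) for value in row] for row in m]
```

A typical call would be `data_expectation(batch, b, w, random.Random(0))`, where `batch` is a list of spin tuples. Passing the random generator explicitly keeps the estimate reproducible.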

Next we need to compute $\mathds{E}_{\rm model}\left[\sigma_{i}\tau_{j}\right]=\sum_{\sigma,\tau}\sigma_{i}\tau_{j}\,p(\sigma,\tau)$, which is significantly harder since it requires drawing correlated samples from the joint distribution. Nevertheless, the elements of $\sigma$ (or of $\tau$) are conditionally independent within the same layer once the other layer is fixed, so, assuming convergence is achievable, we can write down a similar algorithm that samples the hidden and visible layers one after another:

Algorithm 2 Sampling $\mathds{E}_{\rm model}\left[\sigma_{i}\tau_{j}\right]$

Input: Initial parameters of the RBM
Output: $\mathds{E}_{\rm model}\left[\sigma_{i}\tau_{j}\right]$
1. Initialize the matrix $\mathbf{M}=0$
2. Initialize $\sigma$ to be a random vector
3. Repeat $N_{c}$ times (until convergence):
  (a) Sample $\tau\sim p(\tau|\sigma)={\rm sigmoid}(\mathbf{b}+\mathbf{w}^{\top}\sigma)$
  (b) Sample $\sigma\sim p(\sigma|\tau)$, which is ${\rm sigmoid}(\mathbf{a}+\mathbf{w}\tau)$ for $\sigma\in\{0,1\}$ and ${\rm sigmoid}[2(\mathbf{a}+\mathbf{w}\tau)]$ for the probability of $\sigma_{i}=+1$ when $\sigma\in\{+1,-1\}$
  (c) $\mathbf{M}\leftarrow\mathbf{M}+\sigma\tau^{\top}$
4. $\mathds{E}_{\rm model}[\sigma\tau^{\top}]\leftarrow\mathbf{M}/N_{c}$

However, this scheme usually converges very slowly, since successive samples of $\tau$ and $\sigma$ are correlated. This is exactly where contrastive divergence (CD) comes in: the chain is simply truncated by setting $N_{c}=n$ for CD$_{n}$, where $n$ is commonly chosen to be $n=1$.
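A single CD$_{n}$ weight update can be sketched as follows. This is an illustrative sketch under several assumptions that go beyond the text: visible spins $\sigma\in\{+1,-1\}$ (so the visible conditional uses twice the local field), hidden units $\tau\in\{0,1\}$, the Gibbs chain started from each data sample as is standard practice for CD, and only the weights (not the biases) updated. The names `sample_visible` and `cd_step` are hypothetical.

```python
import math
import random

def sigmoid(x):
    # Logistic function, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sample_hidden(sigma, b, w, rng):
    # tau_j = 1 with probability sigmoid(b_j + sum_i w_ij sigma_i).
    return [
        1 if rng.random() < sigmoid(b[j] + sum(w[i][j] * sigma[i]
                                               for i in range(len(sigma)))) else 0
        for j in range(len(b))
    ]

def sample_visible(tau, a, w, rng):
    # sigma_i = +1 with probability sigmoid(2 * (a_i + sum_j w_ij tau_j)).
    return [
        1 if rng.random() < sigmoid(2.0 * (a[i] + sum(w[i][j] * tau[j]
                                                      for j in range(len(tau))))) else -1
        for i in range(len(a))
    ]

def cd_step(batch, a, b, w, n=1, lr=0.1, rng=random):
    # One CD_n update of the weights: data term minus n-step chain term.
    n_v, n_h = len(a), len(b)
    dw = [[0.0] * n_h for _ in range(n_v)]
    for sigma0 in batch:
        tau0 = sample_hidden(sigma0, b, w, rng)
        sigma, tau = list(sigma0), tau0
        for _ in range(n):
            sigma = sample_visible(tau, a, w, rng)
            tau = sample_hidden(sigma, b, w, rng)
        for i in range(n_v):
            for j in range(n_h):
                dw[i][j] += sigma0[i] * tau0[j] - sigma[i] * tau[j]
    return [[w[i][j] + lr * dw[i][j] / len(batch) for j in range(n_h)]
            for i in range(n_v)]
```

Here `lr` plays the role of the learning rate $\beta$ in Eq. 85. Repeating `cd_step` over shuffled batches gives a basic training loop, with bias updates following the same pattern as Eq. 86.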

References

  • Matsuda [2000] H. Matsuda, Physical nature of higher-order mutual information: Intrinsic correlations and frustration, Phys. Rev. E 62, 3096 (2000).
  • Boccaletti et al. [2006] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, Complex networks: Structure and dynamics, Physics Reports 424, 175 (2006).
  • Battiston et al. [2021] F. Battiston, E. Amico, A. Barrat, G. Bianconi, G. Ferraz de Arruda, B. Franceschiello, I. Iacopini, S. Kéfi, V. Latora, Y. Moreno, M. M. Murray, T. P. Peixoto, F. Vaccarino, and G. Petri, The physics of higher-order interactions in complex systems, Nature Physics 17, 1093 (2021).
  • Rosas et al. [2022] F. E. Rosas, P. A. M. Mediano, A. I. Luppi, T. F. Varley, J. T. Lizier, S. Stramaglia, H. J. Jensen, and D. Marinazzo, Disentangling high-order mechanisms and high-order behaviours in complex systems, Nature Physics 18, 476 (2022).
  • Boccaletti et al. [2023] S. Boccaletti, P. De Lellis, C. del Genio, K. Alfaro-Bittner, R. Criado, S. Jalan, and M. Romance, The structure and dynamics of networks with higher order interactions, Physics Reports 1018, 1 (2023).
  • Wen and Niu [1990] X. G. Wen and Q. Niu, Ground-state degeneracy of the fractional quantum hall states in the presence of a random potential and on high-genus riemann surfaces, Phys. Rev. B 41, 9377 (1990).
  • Wen [2002] X.-G. Wen, Quantum orders and symmetric spin liquids, Phys. Rev. B 65, 165113 (2002).
  • Chen et al. [2010] X. Chen, Z.-C. Gu, and X.-G. Wen, Local unitary transformation, long-range quantum entanglement, wave function renormalization, and topological order, Phys. Rev. B 82, 155138 (2010).
  • Kitaev and Preskill [2006] A. Kitaev and J. Preskill, Topological entanglement entropy, Phys. Rev. Lett. 96, 110404 (2006).
  • Levin and Wen [2006] M. Levin and X.-G. Wen, Detecting topological order in a ground state wave function, Phys. Rev. Lett. 96, 110405 (2006).
  • Binder and Barthel [2020] M. Binder and T. Barthel, Low-energy physics of isotropic spin-1 chains in the critical and haldane phases, Phys. Rev. B 102, 014447 (2020).
  • Feng et al. [2020] S. Feng, N. D. Patel, P. Kim, J. H. Han, and N. Trivedi, Magnetic phase transitions in quantum spin-orbital liquids, Phys. Rev. B 101, 155112 (2020).
  • Feng et al. [2022a] S. Feng, G. Alvarez, and N. Trivedi, Gapless to gapless phase transitions in quantum spin chains, Phys. Rev. B 105, 014435 (2022a).
  • Kitaev [2003] A. Y. Kitaev, Fault-tolerant quantum computation by anyons, Annals of Physics 303, 2 (2003).
  • Kitaev [2006] A. Kitaev, Anyons in an exactly solved model and beyond, Annals of Physics 321, 2 (2006).
  • Semeghini et al. [2021] G. Semeghini, H. Levine, A. Keesling, S. Ebadi, T. T. Wang, D. Bluvstein, R. Verresen, H. Pichler, M. Kalinowski, R. Samajdar, A. Omran, S. Sachdev, A. Vishwanath, M. Greiner, V. Vuletić, and M. D. Lukin, Probing topological spin liquids on a programmable quantum simulator, Science 374, 1242 (2021).
  • Ebadi et al. [2021] S. Ebadi, T. T. Wang, H. Levine, A. Keesling, G. Semeghini, A. Omran, D. Bluvstein, R. Samajdar, H. Pichler, W. W. Ho, S. Choi, S. Sachdev, M. Greiner, V. Vuletić, and M. D. Lukin, Quantum phases of matter on a 256-atom programmable quantum simulator, Nature 595, 227 (2021).
  • Nielsen and Chuang [2010] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information: 10th Anniversary Edition (Cambridge University Press, 2010).
  • Ghosh et al. [2015] S. Ghosh, R. M. Soni, and S. P. Trivedi, On the entanglement entropy for gauge theories, Journal of High Energy Physics 2015, 69 (2015).
  • Soni and Trivedi [2016] R. M. Soni and S. P. Trivedi, Aspects of entanglement entropy for gauge theories, Journal of High Energy Physics 2016, 136 (2016).
  • Ryu and Takayanagi [2006a] S. Ryu and T. Takayanagi, Holographic derivation of entanglement entropy from the anti–de sitter space/conformal field theory correspondence, Phys. Rev. Lett. 96, 181602 (2006a).
  • Williams and Beer [2010] P. L. Williams and R. D. Beer, Nonnegative decomposition of multivariate information (2010), arXiv:1004.2515 [cs.IT] .
  • Gezerlis et al. [2014] A. Gezerlis, I. Tews, E. Epelbaum, M. Freunek, S. Gandolfi, K. Hebeler, A. Nogga, and A. Schwenk, Local chiral effective field theory interactions and quantum monte carlo applications, Phys. Rev. C 90, 054323 (2014).
  • Rrapaj and Roggero [2021] E. Rrapaj and A. Roggero, Exact representations of many-body interactions with restricted-boltzmann-machine neural networks, Phys. Rev. E 103, 013302 (2021).
  • Li and Haldane [2008] H. Li and F. D. M. Haldane, Entanglement spectrum as a generalization of entanglement entropy: Identification of topological order in non-abelian fractional quantum hall effect states, Phys. Rev. Lett. 101, 010504 (2008).
  • Zhang et al. [2017] Y. Zhang, R. G. Melko, and E.-A. Kim, Machine learning $\mathbb{Z}_{2}$ quantum spin liquids with quasiparticle statistics, Phys. Rev. B 96, 245119 (2017).
  • Zhang et al. [2023] K. Zhang, S. Feng, Y. D. Lensky, N. Trivedi, and E.-A. Kim, Machine learning feature discovery of spinon fermi surface (2023), arXiv:2306.03143 [cond-mat.str-el] .
  • Carleo and Troyer [2017] G. Carleo and M. Troyer, Solving the quantum many-body problem with artificial neural networks, Science 355, 602 (2017).
  • Carrasquilla and Melko [2017] J. Carrasquilla and R. G. Melko, Machine learning phases of matter, Nature Physics 13, 431 (2017).
  • Huang et al. [2020] H.-Y. Huang, R. Kueng, and J. Preskill, Predicting many properties of a quantum system from very few measurements, Nature Physics 16, 1050 (2020).
  • Huang et al. [2022] H.-Y. Huang, R. Kueng, G. Torlai, V. V. Albert, and J. Preskill, Provably efficient machine learning for quantum many-body problems, Science 377, eabk3333 (2022).
  • Teng et al. [2023] Y. Teng, S. Sachdev, and M. S. Scheurer, Classifying topological neural network quantum states via diffusion maps (2023), arXiv:2301.02683 [quant-ph] .
  • Torlai and Melko [2016] G. Torlai and R. G. Melko, Learning thermodynamics with boltzmann machines, Phys. Rev. B 94, 165134 (2016).
  • Huang and Wang [2017] L. Huang and L. Wang, Accelerated monte carlo simulations with restricted boltzmann machines, Phys. Rev. B 95, 035105 (2017).
  • Deng et al. [2017] D.-L. Deng, X. Li, and S. Das Sarma, Quantum entanglement in neural network states, Phys. Rev. X 7, 021021 (2017).
  • Amin et al. [2018] M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, and R. Melko, Quantum boltzmann machine, Phys. Rev. X 8, 021050 (2018).
  • Tramel et al. [2018] E. W. Tramel, M. Gabrié, A. Manoel, F. Caltagirone, and F. Krzakala, Deterministic and generalized framework for unsupervised learning with restricted boltzmann machines, Phys. Rev. X 8, 041006 (2018).
  • Koch-Janusz and Ringel [2018] M. Koch-Janusz and Z. Ringel, Mutual information, neural networks and the renormalization group, Nature Physics 14, 578 (2018).
  • Cossu et al. [2019] G. Cossu, L. Del Debbio, T. Giani, A. Khamseh, and M. Wilson, Machine learning determination of dynamical parameters: The ising model case, Phys. Rev. B 100, 064304 (2019).
  • Decelle and Furtlehner [2021] A. Decelle and C. Furtlehner, Exact training of restricted boltzmann machines on intrinsically low dimensional data, Phys. Rev. Lett. 127, 158303 (2021).
  • Furukawa and Misguich [2007] S. Furukawa and G. Misguich, Topological entanglement entropy in the quantum dimer model on the triangular lattice, Phys. Rev. B 75, 214407 (2007).
  • Kim [2013] I. H. Kim, Long-range entanglement is necessary for a topological storage of quantum information, Phys. Rev. Lett. 111, 080503 (2013).
  • Shi [2019] B. Shi, Seeing topological entanglement through the information convex, Phys. Rev. Research 1, 033048 (2019).
  • Shi [2020] B. Shi, Verlinde formula from entanglement, Phys. Rev. Research 2, 023132 (2020).
  • Shirley et al. [2019] W. Shirley, K. Slagle, and X. Chen, Universal entanglement signatures of foliated fracton phases, SciPost Phys. 6, 015 (2019).
  • Feng et al. [2023] S. Feng, A. Agarwala, S. Bhattacharjee, and N. Trivedi, Anyon dynamics in field-driven phases of the anisotropic kitaev model (2023), arXiv:2206.12990 [cond-mat.str-el] .
  • Castelnovo and Chamon [2007] C. Castelnovo and C. Chamon, Topological order and topological entropy in classical systems, Phys. Rev. B 76, 174416 (2007).
  • Ryu and Takayanagi [2006b] S. Ryu and T. Takayanagi, Aspects of holographic entanglement entropy, Journal of High Energy Physics 2006, 045 (2006b).
  • Hayden et al. [2013] P. Hayden, M. Headrick, and A. Maloney, Holographic mutual information is monogamous, Phys. Rev. D 87, 046003 (2013).
  • Wen et al. [2016] X. Wen, S. Matsuura, and S. Ryu, Edge theory approach to topological entanglement entropy, mutual information, and entanglement negativity in chern-simons theories, Phys. Rev. B 93, 245140 (2016).
  • Kudler-Flam et al. [2020] J. Kudler-Flam, M. Nozaki, S. Ryu, and M. T. Tan, Quantum vs. classical information: operator negativity as a probe of scrambling, Journal of High Energy Physics 2020, 31 (2020).
  • Zhou [2008] D. L. Zhou, Irreducible multiparty correlations in quantum states without maximal rank, Phys. Rev. Lett. 101, 180505 (2008).
  • Kato et al. [2016] K. Kato, F. Furrer, and M. Murao, Information-theoretical analysis of topological entanglement entropy and multipartite correlations, Phys. Rev. A 93, 022317 (2016).
  • Berthiere and Witczak-Krempa [2022] C. Berthiere and W. Witczak-Krempa, Entanglement of skeletal regions, Phys. Rev. Lett. 128, 240502 (2022).
  • Feng et al. [2022b] S. Feng, Y. He, and N. Trivedi, Detection of long-range entanglement in gapped quantum spin liquids by local measurements, Phys. Rev. A 106, 042417 (2022b).
  • Kokail et al. [2021] C. Kokail, R. van Bijnen, A. Elben, B. Vermersch, and P. Zoller, Entanglement hamiltonian tomography in quantum simulation, Nature Physics 17, 936 (2021).
  • Hinton [2012] G. E. Hinton, A practical guide to training restricted boltzmann machines, in Neural Networks: Tricks of the Trade: Second Edition, edited by G. Montavon, G. B. Orr, and K.-R. Müller (Springer Berlin Heidelberg, Berlin, Heidelberg, 2012) pp. 599–619.
  • Le Roux and Bengio [2008] N. Le Roux and Y. Bengio, Representational Power of Restricted Boltzmann Machines and Deep Belief Networks, Neural Computation 20, 1631 (2008).
  • You et al. [2018] Y.-Z. You, Z. Yang, and X.-L. Qi, Machine learning spatial geometry from entanglement features, Phys. Rev. B 97, 045153 (2018).
  • Vasseur et al. [2019] R. Vasseur, A. C. Potter, Y.-Z. You, and A. W. W. Ludwig, Entanglement transitions from holographic random tensor networks, Phys. Rev. B 100, 134203 (2019).
  • Beentjes and Khamseh [2020] S. V. Beentjes and A. Khamseh, Higher-order interactions in statistical physics and machine learning: A model-independent solution to the inverse problem at equilibrium, Phys. Rev. E 102, 053314 (2020).
  • Baskaran et al. [2007] G. Baskaran, S. Mandal, and R. Shankar, Exact results for spin dynamics and fractionalization in the kitaev model, Phys. Rev. Lett. 98, 247201 (2007).
  • Bulso and Roudi [2021] N. Bulso and Y. Roudi, Restricted boltzmann machines as models of interacting variables (2021).