
Quantum inspired K-means algorithm using matrix product states

Xiao Shi Institute of Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China    Yun Shang [email protected] Institute of Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China    Chu Guo [email protected] Department of Physics and Synergetic Innovation Center for Quantum Effects and Applications, Hunan Normal University, Changsha 410081, China
Abstract

The matrix product state has become the algorithm of choice for studying one-dimensional interacting quantum many-body systems, as it is able to explore the most relevant portion of the exponentially large quantum Hilbert space and find accurate solutions. Here we propose a quantum inspired K-means clustering algorithm which first maps the classical data into quantum states represented as matrix product states, and then minimizes the loss function using the variational matrix product states method in the enlarged space. We demonstrate the performance of this algorithm by applying it to several commonly used machine learning datasets, and show that it reaches higher prediction accuracies and is less likely to be trapped in local minima than the classical K-means algorithm.

I Introduction

The past few years have witnessed a growing interest in the intersection between quantum physics and machine learning. On one side, machine learning tools have been used to solve quantum problems, such as phase recognition Carrasquilla and Melko (2017); Zhang et al. (2018); Ch’ng et al. (2018); Van Nieuwenburg et al. (2017); Broecker et al. (2017); Ch’Ng et al. (2017); Zhang and Kim (2017); Dong et al. (2019), quantum state tomography Torlai et al. (2018); Torlai and Melko (2018); Rocchetto et al. (2018); Quek et al. (2018); Carrasquilla et al. (2019), solving quantum many-body problems Arsenault et al. (2014, 2015); Torlai and Melko (2016); Amin et al. (2018); Liu et al. (2017); Huang and Wang (2017); Aoki and Kobayashi (2016); Carleo and Troyer (2017); Nomura et al. (2017); Czischek et al. (2018), and non-Markovian quantum dynamics Luchnikov et al. (2020). Connections have also been drawn between quantum many-body algorithms and neural network algorithms Bény (2013); Mehta and Schwab (2014); Lin et al. (2017); Deng et al. (2017). On the other side, tools from quantum many-body physics, especially tensor network states (TNS) algorithms, are used to solve classical machine learning problems. Tensor network states algorithms have been extensively used to study quantum many-body systems over the last two decades. In particular, the Matrix Product State (MPS), a one-dimensional variant of TNS, has achieved great success and become a standard numerical tool for one-dimensional strongly correlated quantum many-body systems due to its high efficiency and accuracy Schollwöck (2011). Applications of MPS to machine learning tasks include classification problems Stoudenmire and Schwab (2016); Stoudenmire (2018); Sun et al. (2020), generative modeling Han et al. (2018), and sequence to sequence modeling Guo et al. (2018), where it is shown that MPS based algorithms can achieve a learning precision close to or even better than state of the art machine learning algorithms. Tensor network states based machine learning algorithms have also been applied to quantum process tomography Guo et al. (2020); Torlai et al. (2020).

In this work, we apply matrix product states to the clustering task, an elementary machine learning task that separates unlabeled data into distinct and non-overlapping clusters. A standard algorithm for clustering is the K-means algorithm, which divides the data into $k$ different classes by minimizing the variance of the data in each class MacQueen et al. (1967). The most popular K-means clustering algorithm, Lloyd’s algorithm, starts from $k$ random centroids, and then iteratively divides the data into $k$ classes according to their distances to the centroids and recomputes the center within each cluster Lloyd (1982). This algorithm has become a standard algorithm for clustering due to its simplicity and efficiency, and we will refer to it as the classical K-means algorithm in the rest of this work. However, it is known that in some cases this algorithm results in bad clustering with high probability, even for randomized initialization. Various efforts have been made to improve the accuracy of Lloyd’s algorithm, such as a clever way of initialization Arthur and Vassilvitskii (2006); Ostrovsky et al. (2013); Celebi et al. (2013); Bachem et al. (2016), or a smoother objective loss function Zhang et al. (1999); Xu and Lange (2019); De Amorim and Mirkin (2012); Chakraborty and Das (2017).

Similar to other MPS-based machine learning algorithms, the basic idea of this work is to first map the classical data into quantum states which live in an exponentially large Hilbert space, and then explore this enlarged space in a numerically efficient way to find the optimal solution. In the context of the clustering problem, the hope is that by exploring a much larger space, the algorithm is less likely to result in bad clustering. This paper is organized as follows. In Sec. II, we describe our method in detail. In Sec. III, we apply our method to several machine learning datasets, showing that it reaches a high learning accuracy and is less likely to be trapped in local minima. We conclude in Sec. IV.

II Method

II.1 Classical K-means algorithm

Before we introduce our method, we first briefly review the standard classical K-means algorithm Lloyd (1982). For $N$ input vectors labeled as $\vec{x}_{n}$ ($1\leq n\leq N$), the classical K-means algorithm works as follows: 1) Initialize $k$ random vectors $\vec{\mathcal{O}}_{m}$ ($1\leq m\leq k$) as the initial centroids. 2) Divide all the data into $k$ classes according to their distances to the centroids, denoted as $d^{C}$. Concretely, for each data vector $\vec{x}_{n}$, one computes

\displaystyle d^{C}_{n,m}=|\vec{x}_{n}-\vec{\mathcal{O}}_{m}|^{2}, (1)

where $|\vec{v}|=\sqrt{\sum_{j}v_{j}^{2}}$ is the standard 2-norm of a vector $\vec{v}$. Then $\vec{x}_{n}$ is assigned to the $j$-th cluster with $j=\operatornamewithlimits{argmin}_{m}d^{C}_{n,m}$. After this step, the $N$ data vectors are divided into $k$ non-overlapping clusters. 3) Compute the new center of each class by minimizing its variance independently, namely by minimizing the following loss function

\displaystyle f^{C}(\vec{\mathcal{O}}_{m})=\sum_{n_{j}=1}^{N_{j}}|\vec{x}^{j}_{n_{j}}-\vec{\mathcal{O}}_{m}|^{2}=\sum_{n_{j}=1}^{N_{j}}d^{C}_{n_{j},m}, (2)

where $N_{j}$ denotes the size of the $j$-th cluster, and $\vec{x}^{j}_{n_{j}}$ denotes the $n_{j}$-th vector inside the $j$-th cluster. By repeating steps 2) and 3), one often quickly converges to a local minimum. In the prediction stage, given a new input vector $\vec{x}$, one simply computes the distance $d^{C}(\vec{x},\vec{\mathcal{O}}_{m})$ between $\vec{x}$ and each centroid $\vec{\mathcal{O}}_{m}$, and then $\vec{x}$ is assigned to the $j$-th cluster with $j=\operatornamewithlimits{argmin}_{m}d^{C}(\vec{x},\vec{\mathcal{O}}_{m})$.
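For concreteness, the following is a minimal sketch of the classical K-means procedure above in Python/NumPy; the function name, the choice of initial centroids from the data points, and the convergence test are our own illustrative choices, not part of the original algorithm specification.

import numpy as np

def classical_kmeans(X, k, n_iters=100, seed=0):
    """X: (N, L) array of data vectors; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1): pick k random data points as the initial centroids O_m.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iters):
        # Step 2): squared Euclidean distances d^C_{n,m} of Eq. 1, then assign
        # every data vector to its closest centroid (argmin over m).
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Step 3): the minimizer of the per-cluster variance, Eq. 2, is the mean.
        new_centroids = np.array([
            X[labels == m].mean(axis=0) if np.any(labels == m) else centroids[m]
            for m in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels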

II.2 Quantum K-means algorithm

The above K-means algorithm can be straightforwardly converted into a quantum machine learning algorithm, which we describe as follows and refer to as the quantum K-means algorithm. In this case we assume that there is a list of input quantum states labeled as $|x_{n}\rangle$. The centers are denoted as $|\mathcal{O}_{m}\rangle$ and are initialized using parametric quantum circuits Mitarai et al. (2018); Liu et al. (2020)

\displaystyle|\mathcal{O}_{m}\rangle=\mathcal{C}_{m}(\vec{\theta})|0\rangle^{\otimes L}, (3)

where $\mathcal{C}_{m}(\vec{\theta})$ denotes the $m$-th parametric quantum circuit with an array of parameters denoted by $\vec{\theta}$. The distance between $|x_{n}\rangle$ and $|\mathcal{O}_{m}\rangle$ can be defined as

\displaystyle d^{Q}_{n,m}=1-|\langle x_{n}|\mathcal{O}_{m}\rangle|^{2}, (4)

where the second term on the right hand side of the above equation can be efficiently computed with a quantum computer using the SWAP test technique Buhrman et al. (2001). Similarly, the loss function for the $m$-th cluster can be defined as

\displaystyle f^{Q}(|\mathcal{O}_{m}\rangle)=\sum_{n_{j}=1}^{N_{j}}\left(1-|\langle x_{n_{j}}^{j}|\mathcal{O}_{m}\rangle|^{2}\right)=\sum_{n_{j}=1}^{N_{j}}d^{Q}_{n_{j},m}. (5)

We note that the gradient of $f^{Q}(|\mathcal{O}_{m}\rangle)$ can also be computed with a quantum computer using Mitarai et al. (2018)

\displaystyle\frac{\partial f^{Q}(|\mathcal{O}_{m}\rangle)}{\partial\theta_{i}}=\frac{1}{2}f^{Q}(|\mathcal{O}_{m}\rangle^{-})-\frac{1}{2}f^{Q}(|\mathcal{O}_{m}\rangle^{+}), (6)

where $|\mathcal{O}_{m}\rangle^{\pm}$ means shifting the $i$-th parameter $\theta_{i}$ in $\vec{\theta}$ by $\pm\frac{\pi}{2}$. Implementing the SWAP test on current quantum computers is not an easy task since it requires the three-qubit controlled-SWAP (Fredkin) gate. In this work we will not go any further into this quantum machine learning algorithm but will instead focus on the quantum inspired classical algorithm described below.
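Although we do not pursue the quantum algorithm further, the following classical state-vector simulation sketches its ingredients. It assumes, purely for illustration, a simple product ansatz of single-qubit Ry rotations for the centroid state; the ansatz, the helper names and the gate convention are our own assumptions and not the circuit of Eq. 3.

import numpy as np

def centroid_state(theta):
    """|O(theta)> = tensor_j [cos(theta_j/2)|0> + sin(theta_j/2)|1>] as a dense vector."""
    psi = np.array([1.0])
    for t in theta:
        psi = np.kron(psi, np.array([np.cos(t / 2), np.sin(t / 2)]))
    return psi

def dQ(x_state, theta):
    """Fidelity distance of Eq. 4; on hardware |<x|O>|^2 would come from a SWAP test."""
    return 1.0 - abs(np.vdot(x_state, centroid_state(theta))) ** 2

def loss(theta, cluster_states):
    """Eq. 5: sum of fidelity distances over the states assigned to this cluster."""
    return sum(dQ(x, theta) for x in cluster_states)

def parameter_shift_grad(theta, cluster_states):
    """Parameter-shift gradient (cf. Eq. 6). With the convention Ry(t) = exp(-i t Y / 2)
    used here, the gradient is [loss(theta_i + pi/2) - loss(theta_i - pi/2)] / 2; the
    ordering of the two shifted evaluations depends on the circuit convention."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        plus, minus = theta.copy(), theta.copy()
        plus[i] += np.pi / 2
        minus[i] -= np.pi / 2
        grad[i] = 0.5 * (loss(plus, cluster_states) - loss(minus, cluster_states))
    return grad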

II.3 Quantum inspired K-means algorithm

Now we will describe our quantum inspired K-means algorithm in detail. Similar to the quantum K-means algorithm, the inputs are assumed to be a list of quantum states. However, here we also assume that each quantum state $|x_{n}\rangle$ can be parameterized as an MPS $X_{n}^{\vec{\sigma}}$

\displaystyle X_{n}^{\vec{\sigma}}=\sum_{a_{1},a_{2},\dots,a_{L+1}}X^{\sigma_{1}}_{n,a_{1},a_{2}}X^{\sigma_{2}}_{n,a_{2},a_{3}}\dots X^{\sigma_{L}}_{n,a_{L},a_{L+1}}, (7)

where $L$ is the number of qubits required by the corresponding quantum state $|x_{n}\rangle$. The size of an MPS is characterized by an integer $D$, referred to as the bond dimension, defined as

\displaystyle D=\max_{1\leq l\leq L}\left(\dim(a_{l})\right). (8)

We note that for a vector $\vec{x}_{n}$ with $L$ elements $\left(x_{n,1},x_{n,2},\dots,x_{n,L}\right)$, one can map it into a separable quantum state $|x_{n}\rangle$ such that each element $x_{n,l}$ is mapped into the single-qubit state $\cos(\frac{\pi}{2}x_{n,l})|0\rangle+\sin(\frac{\pi}{2}x_{n,l})|1\rangle$. This corresponds to a separable MPS $X_{n}^{\vec{\sigma}}$ with $D=1$ whose tensor at site $l$ has elements

\displaystyle X_{n,1,1}^{0}=\cos(\frac{\pi}{2}x_{n,l}),\quad X_{n,1,1}^{1}=\sin(\frac{\pi}{2}x_{n,l}). (9)
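A minimal sketch of this data-to-MPS encoding follows, storing each MPS as a Python list of rank-3 NumPy tensors in a (left bond, physical, right bond) layout; the layout and the helper name are our own conventions.

import numpy as np

def encode_vector_as_mps(x):
    """Map a length-L vector with entries in [0, 1] to a D=1 MPS (list of (1, 2, 1) tensors)."""
    mps = []
    for xl in x:
        tensor = np.zeros((1, 2, 1))               # (left bond, physical sigma_l, right bond)
        tensor[0, 0, 0] = np.cos(np.pi * xl / 2)   # amplitude of |0>, Eq. 9
        tensor[0, 1, 0] = np.sin(np.pi * xl / 2)   # amplitude of |1>, Eq. 9
        mps.append(tensor)
    return mps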

The centroids are then assumed to be MPSs with a fixed bond dimension $D$, written as $\mathcal{O}_{m}^{\vec{\sigma}}$

\displaystyle\mathcal{O}_{m}^{\vec{\sigma}}=\sum_{b_{1},b_{2},\dots,b_{L+1}}M^{\sigma_{1}}_{m,b_{1},b_{2}}M^{\sigma_{2}}_{m,b_{2},b_{3}}\dots M^{\sigma_{L}}_{m,b_{L},b_{L+1}}. (10)

The distance between two MPSs is defined as

\displaystyle d^{M}_{n,m}=\left(X_{n}^{\vec{\sigma},\dagger}-\mathcal{O}_{m}^{\vec{\sigma},\dagger}\right)\left(X_{n}^{\vec{\sigma}}-\mathcal{O}_{m}^{\vec{\sigma}}\right), (11)

and similarly, the loss function for the $m$-th cluster can be defined as

\displaystyle f^{M}(\mathcal{O}_{m}^{\vec{\sigma}})=\sum_{n_{j}=1}^{N_{j}}d^{M}_{n_{j},m}, (12)

with the normalization condition $\mathcal{O}_{m}^{\vec{\sigma},\dagger}\mathcal{O}_{m}^{\vec{\sigma}}=1$. The above cost function can be minimized by setting the gradient with respect to each tensor $M_{m,b_{l},b_{l+1}}^{\sigma_{l}}$ to $0$, namely

\displaystyle\frac{\partial f^{M}(\mathcal{O}_{m}^{\vec{\sigma}})}{\partial M^{\sigma_{l}}_{m,b_{l},b_{l+1}}}=0. (13)

The above equation can be reduced to a set of algebraic equations. To see this, we first define

\displaystyle L_{b_{l}}^{b_{l}^{\prime}}=\sum_{b_{l-1},b_{l-1}^{\prime},\sigma_{l-1}}L_{b_{l-1}}^{b_{l-1}^{\prime}}M^{\sigma_{l-1}}_{b_{l-1},b_{l}}M^{\sigma_{l-1}}_{b_{l-1}^{\prime},b_{l}^{\prime}}; (14)
\displaystyle R_{b_{l+1}}^{b_{l+1}^{\prime}}=\sum_{b_{l+2},b_{l+2}^{\prime},\sigma_{l+1}}R_{b_{l+2}}^{b_{l+2}^{\prime}}M^{\sigma_{l+1}}_{b_{l+1},b_{l+2}}M^{\sigma_{l+1}}_{b_{l+1}^{\prime},b_{l+2}^{\prime}}; (15)
\displaystyle A_{n,b_{l}}^{a_{l}}=\sum_{b_{l-1},a_{l-1},\sigma_{l-1}}A_{n,b_{l-1}}^{a_{l-1}}M^{\sigma_{l-1}}_{b_{l-1},b_{l}}X^{\sigma_{l-1}}_{n,a_{l-1},a_{l}}; (16)
\displaystyle B_{n,b_{l+1}}^{a_{l+1}}=\sum_{b_{l+2},a_{l+2},\sigma_{l+1}}B_{n,b_{l+2}}^{a_{l+2}}M^{\sigma_{l+1}}_{b_{l+1},b_{l+2}}X^{\sigma_{l+1}}_{n,a_{l+1},a_{l+2}}, (17)

with the boundary conditions $L_{b_{1}}^{b_{1}^{\prime}}=R_{b_{L+1}}^{b_{L+1}^{\prime}}=A_{n,b_{1}}^{a_{1}}=B_{n,b_{L+1}}^{a_{L+1}}=1$. With these definitions, Eq. 13 can be written as

\displaystyle\sum_{b_{l},b_{l+1}}L_{b_{l}}^{b_{l}^{\prime}}M^{\sigma_{l}}_{b_{l},b_{l+1}}R_{b_{l+1}}^{b_{l+1}^{\prime}}=\sum_{n=1}^{N}\sum_{a_{l},a_{l+1}}A_{n,b_{l}^{\prime}}^{a_{l}}X^{\sigma_{l}}_{n,a_{l},a_{l+1}}B_{n,b_{l+1}^{\prime}}^{a_{l+1}}, (18)

where $N$ denotes the number of data vectors assigned to this cluster, and we have neglected the label $m$ for the cluster since each cluster can be minimized independently. Interestingly, if we keep $\mathcal{O}_{m}^{\vec{\sigma}}$ in a mixed-canonical form with the orthogonality center at site $l$ Schollwöck (2011), then $L_{b_{l}}^{b_{l}^{\prime}}=\delta_{b_{l},b_{l}^{\prime}}$ and $R_{b_{l+1}}^{b_{l+1}^{\prime}}=\delta_{b_{l+1},b_{l+1}^{\prime}}$, such that the above equation reduces to

\displaystyle M^{\sigma_{l}}_{b_{l},b_{l+1}}=\sum_{n=1}^{N}\sum_{a_{l},a_{l+1}}A_{n,b_{l}}^{a_{l}}X^{\sigma_{l}}_{n,a_{l},a_{l+1}}B_{n,b_{l+1}}^{a_{l+1}}. (19)
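The single-site update of Eq. 19 can be sketched as follows, assuming the centroid MPS is kept in mixed-canonical form with the orthogonality center at site $l$ (so that the $L$ and $R$ environments are identities) and using the same (left bond, physical, right bond) tensor layout as the encoding sketch above; after the contraction the center tensor is renormalized so that the normalization condition on the centroid is preserved. The helper names are our own, and in practice the environments would be cached and updated during a sweep rather than recomputed from scratch.

import numpy as np

def left_env(centroid, data, l):
    """A_{n}^{a_l}_{b_l}: contraction of sites 0..l-1 of the centroid with one data MPS."""
    A = np.ones((1, 1))
    for M, X in zip(centroid[:l], data[:l]):
        # sum over the previous bonds and the physical index sigma, cf. Eq. 16
        A = np.einsum('ba,bsc,asd->cd', A, M, X)
    return A

def right_env(centroid, data, l):
    """B_{n}^{a_{l+1}}_{b_{l+1}}: contraction of sites l+1..L-1, cf. Eq. 17."""
    B = np.ones((1, 1))
    for M, X in zip(reversed(centroid[l + 1:]), reversed(data[l + 1:])):
        B = np.einsum('ba,csb,dsa->cd', B, M, X)
    return B

def update_site(centroid, cluster_data, l):
    """Replace the center tensor M_l by the (renormalized) sum over the cluster, Eq. 19."""
    M_new = np.zeros_like(centroid[l])
    for X_mps in cluster_data:
        A = left_env(centroid, X_mps, l)
        B = right_env(centroid, X_mps, l)
        # M[b_l, sigma, b_{l+1}] += A[b_l, a_l] X_l[a_l, sigma, a_{l+1}] B[b_{l+1}, a_{l+1}]
        M_new += np.einsum('ba,asc,dc->bsd', A, X_mps[l], B)
    centroid[l] = M_new / np.linalg.norm(M_new)  # keep the centroid normalized
    return centroid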

III Applications

Table 1: Datasets used for comparison between the classical K-means algorithm and our quantum inspired algorithm.
Name        Points   Dimensions   Classes
Breast      683      9            2
Ionosphere  351      34           2
Wine        178      13           3
Yeast       1484     8            10
E-coli      336      7            8

To demonstrate our method, we apply it to several commonly used machine learning datasets for clustering tasks, as shown in TABLE 1. For each dataset, we randomly pick 80% of the data as the training data, and the remaining 20% as the testing data. To ensure a fair comparison, we always use the same splitting of training and testing data for the classical K-means algorithm and our quantum inspired K-means algorithm. Since the K-means algorithm is based on a distance measure, if the scales of different feature dimensions differ too much, a small number of features may exert an excessively high influence on the whole cost and suppress the effect of the remaining features. To avoid this effect, we divide each dimension of the data by its largest value in the training set, namely

\displaystyle x_{n,j}\leftarrow\frac{x_{n,j}}{A^{\max}_{j}}, (20)

with $A^{\max}_{j}=\max_{1\leq n\leq N_{\rm train}}x_{n,j}$, where $N_{\rm train}$ is the total number of training data. As a result, we have $0\leq x_{n,j}\leq 1$ for all $1\leq n\leq N_{\rm train}$ and $1\leq j\leq L$. In the prediction stage, we also divide each dimension of the testing data by $A^{\max}_{j}$. We note that this procedure may produce elements larger than $1$ in the testing set, which may not be well distinguished by our quantum inspired algorithm since the encoding in Eq. 9 is periodic and therefore does not faithfully distinguish values outside the interval $[0,1]$. However, our numerical results show that we still obtain high precision results with our method despite those elements.
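A small sketch of this per-feature rescaling, where the maxima $A^{\max}_{j}$ are computed on the training split only and then reused for the test split; the function names are illustrative.

import numpy as np

def fit_max_scaler(X_train):
    """Largest value of every feature over the training data (A^max_j), Eq. 20."""
    return X_train.max(axis=0)

def apply_max_scaler(X, A_max):
    """Divide every feature by its training-set maximum; test data may exceed 1."""
    return X / A_max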

In the following, we compare our quantum inspired K-means algorithm with the classical K-means algorithm in terms of learning accuracy and the likelihood of getting trapped in local minima. In TABLE 2, we show the accuracies for different datasets, defined as the number of correctly predicted data points divided by the total number of testing data. Since it is possible for both algorithms to be trapped in local minima, we run the same simulation for each algorithm 50 times with randomly initialized centroids and pick the run with the highest accuracy. For our quantum inspired K-means algorithm, we also tune the bond dimension $D$ to be 8 and 15 to see the effect of different bond dimensions. We can see from TABLE 2 that with $D=8$, our quantum inspired algorithm already performs as well as or better than the classical algorithm in all cases, and for $D=15$ we obtain even higher accuracies.

Table 2: Comparison of accuracies on the testing data between the classical K-means algorithm and the quantum inspired K-means algorithm. For the E-coli dataset, the entry for $D=15$ is absent since the dimension of each data point is too small to support this bond dimension.
            Classical K-means   Quantum inspired K-means
                                D = 8      D = 15
Breast      99.27%              100%       100%
Ionosphere  60.56%              60.56%     63.38%
Wine        100%                100%       100%
Yeast       45.45%              47.47%     48.48%
E-coli      85.29%              89.71%     none

A well-known drawback of the classical K-means algorithm is that it can easily be trapped in local minima. Here we also check the likelihood of our algorithm being trapped in local minima. For this purpose, we take the Wine and Breast datasets as examples and run the same simulation with 100 randomly initialized centroids for each algorithm. We show the distribution of the number of cases against different accuracies as a histogram in Fig. 1. We can see that our quantum inspired K-means algorithm has a higher probability of reaching a higher accuracy. For example, for the Wine dataset, there are 50 cases out of 100 with accuracies beyond 0.9 for the classical K-means algorithm, while for the quantum inspired K-means algorithm with a bond dimension $D=8$ there are 53 cases, and for $D=15$ there are 60 cases.

Figure 1: Frequency as a function of the accuracy for the Wine dataset (a), and the Breast dataset (b). The red, blue and yellow columns correspond to the classical K-means algorithm, and the quantum inspired K-means algorithm with $D=8$ and $D=15$, respectively. The y-axis is the number of instances out of 100 cases with an accuracy within the interval indicated by the x-axis.

Another way to assess the quality of the learned centers is to compute the distances between the centroids in both cases. It has already been shown in Sun et al. (2020) that the classical to quantum encoding in Eq. 9 can result in better separated input data; thus, when appropriate learning algorithms are used, it is more likely to obtain a higher prediction accuracy. In the classical case, the Euclidean distance between two different centers $\vec{\mathcal{O}}_{i}$ and $\vec{\mathcal{O}}_{j}$ is

\displaystyle D_{i,j}=|\vec{\mathcal{O}}_{i}-\vec{\mathcal{O}}_{j}|, (21)

where $|\vec{v}|$ means the 2-norm of the vector $\vec{v}$. Similarly, in the quantum inspired case, the separation between two different centers $\mathcal{O}_{i}^{\vec{\sigma}}$ and $\mathcal{O}_{j}^{\vec{\sigma}}$ is measured by the squared overlap

\displaystyle D^{\vec{\sigma}}_{i,j}=|\langle\mathcal{O}_{i}^{\vec{\sigma}}|\mathcal{O}_{j}^{\vec{\sigma}}\rangle|^{2}. (22)

Since $\mathcal{O}_{i}^{\vec{\sigma}}$ lives in a much larger space than $\vec{\mathcal{O}}_{i}$, it is possible that the centroids are more likely to be orthogonal to each other in the quantum inspired case, that is, the $k\times k$ matrix formed by $D^{\vec{\sigma}}_{i,j}$ is closer to the identity matrix. As a result, it is easier to assign a new data point to the correct category. To show this clearly, we directly show the matrices formed by $D_{i,j}$ and $D^{\vec{\sigma}}_{i,j}$ based on the optimized centroids for the E-coli dataset in Fig. 2. We can see clearly that in the classical case the centroids have large overlaps with one another, while in the quantum inspired case the centers are almost orthogonal to each other.
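A minimal sketch of how the overlap matrix of Eq. 22 can be evaluated from the learned centroid MPSs, using the same tensor layout as the earlier sketches; the helper names are our own, and for the real-valued encoding used here complex conjugation can be dropped.

import numpy as np

def mps_inner(mps_a, mps_b):
    """<A|B> for two (real) MPSs by sweeping the transfer matrix from left to right."""
    E = np.ones((1, 1))
    for Ta, Tb in zip(mps_a, mps_b):
        E = np.einsum('ab,asc,bsd->cd', E, Ta, Tb)
    return E[0, 0]

def centroid_overlaps(centroids):
    """k x k matrix of squared overlaps |<O_i|O_j>|^2 between the learned centroids."""
    k = len(centroids)
    D = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            D[i, j] = abs(mps_inner(centroids[i], centroids[j])) ** 2
    return D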

Figure 2: (a) Distances between the learned centroids for the classical K-means algorithm. (b) Squared overlaps $D^{\vec{\sigma}}_{i,j}$ between the learned centroids for the quantum inspired K-means algorithm. Here the E-coli dataset is used.

Lastly, we compare the convergence rate towards minima for the two algorithms. In Fig. 3, we show the loss values as a function of the minimization steps. In the classical case, the loss values are computed after each K-means iteration, as shown in Fig. 3(a); each K-means iteration is counted as one minimization step in this case. In the quantum inspired case, in each K-means iteration we use a single variational MPS sweep, in which the local on-site loss is first minimized from the left boundary to the right boundary and then back from the right boundary to the left boundary Schollwöck (2011). Each local minimization is counted as one minimization step in the quantum inspired case. In Fig. 3(b), we show the loss values in this case. We have taken the initial loss value as 1 in Fig. 3 and the remaining loss values are normalized against the initial value. We can see that the classical K-means algorithm takes 7 minimization steps (7 K-means iterations) to reach a final loss value of 0.316, while the quantum inspired K-means algorithm uses 12 minimization steps (2 K-means iterations) to reach a final loss value of 0.091.

Figure 3: (a) Loss value as a function of K-means iterations for the classical K-means algorithm. (b) Loss value as a function of minimization steps for the quantum inspired K-means algorithm. In each K-means iteration we use 1 variational MPS sweep, which contains $2(L-1)$ local minimization steps.

IV Conclusion

In this paper, we propose a quantum inspired K-means clustering algorithm based on matrix product states. By mapping the input data into a much larger quantum Hilbert space and clustering the data in the enlarged space, we show that the learning algorithm can reach a much higher precision. We demonstrate our algorithm by applying it to several commonly used machine learning datasets. Our results show that our algorithm is advantageous over the classical K-means algorithm in that 1) it reaches a higher prediction accuracy and 2) it is less likely to be trapped in local minima. We show that, compared to the classical K-means algorithm, the centroids learned with our algorithm are better separated, which could be one reason why it makes more precise predictions. We also show the loss values as a function of the number of minimization steps to better visualize the convergence of both algorithms.

Finally, we point out at least two directions that follow from this work: 1) the classical K-means algorithm is well studied and there exist various techniques to improve the learning accuracy on top of the standard K-means algorithm; in future works those techniques could also be adapted into our quantum inspired algorithm to further increase the learning accuracy. 2) As discussed in Sec. II, our quantum inspired algorithm has a one-to-one correspondence with a pure quantum machine learning algorithm which could readily be executed on a quantum computer, the most significant difference being that in the quantum inspired algorithm the quantum state is represented using matrix product states which can be efficiently stored on a classical computer. In the near future it would be interesting to carry out the quantum K-means algorithm on a quantum computer with applications to real world clustering problems. Other than that, one could also explore the two-site variant of the quantum inspired algorithm such that the bond dimension $D$ could be dynamically adjusted. Moreover, in this work we use variational MPS sweeps in which the local loss on each site is minimized iteratively; it would also be interesting to compare this approach with gradient-based minimization methods.

V Acknowledgement

Y. Shang thanks the support of the National Key Research and Development Program of China under grant 2016YFB1000902, the National Research Foundation of China (Grant No. 61872352, 61472412), and the Program for Creative Research Group of the National Natural Science Foundation of China (Grant No. 61621003); C. Guo acknowledges support from the National Natural Science Foundation of China under Grant No. 11805279.

References

  • Carrasquilla and Melko (2017) J. Carrasquilla and R. G. Melko, Nature Physics 13, 431 (2017).
  • Zhang et al. (2018) P. Zhang, H. Shen,  and H. Zhai, Physical review letters 120, 066401 (2018).
  • Ch’ng et al. (2018) K. Ch’ng, N. Vazquez,  and E. Khatami, Physical Review E 97, 013306 (2018).
  • Van Nieuwenburg et al. (2017) E. P. Van Nieuwenburg, Y.-H. Liu,  and S. D. Huber, Nature Physics 13, 435 (2017).
  • Broecker et al. (2017) P. Broecker, J. Carrasquilla, R. G. Melko,  and S. Trebst, Scientific reports 7, 1 (2017).
  • Ch’Ng et al. (2017) K. Ch’Ng, J. Carrasquilla, R. G. Melko,  and E. Khatami, Physical Review X 7, 031038 (2017).
  • Zhang and Kim (2017) Y. Zhang and E.-A. Kim, Physical review letters 118, 216401 (2017).
  • Dong et al. (2019) X.-Y. Dong, F. Pollmann, X.-F. Zhang, et al., Physical Review B 99, 121104 (2019).
  • Torlai et al. (2018) G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko,  and G. Carleo, Nature Physics 14, 447 (2018).
  • Torlai and Melko (2018) G. Torlai and R. G. Melko, Physical review letters 120, 240503 (2018).
  • Rocchetto et al. (2018) A. Rocchetto, E. Grant, S. Strelchuk, G. Carleo,  and S. Severini, npj Quantum Information 4, 28 (2018).
  • Quek et al. (2018) Y. Quek, S. Fort,  and H. K. Ng, arXiv preprint arXiv:1812.06693  (2018).
  • Carrasquilla et al. (2019) J. Carrasquilla, G. Torlai, R. G. Melko,  and L. Aolita, Nature Machine Intelligence 1, 155 (2019).
  • Arsenault et al. (2014) L.-F. Arsenault, A. Lopez-Bezanilla, O. A. von Lilienfeld,  and A. J. Millis, Physical Review B 90, 155136 (2014).
  • Arsenault et al. (2015) L.-F. Arsenault, O. A. von Lilienfeld,  and A. J. Millis, arXiv preprint arXiv:1506.08858  (2015).
  • Torlai and Melko (2016) G. Torlai and R. G. Melko, Physical Review B 94, 165134 (2016).
  • Amin et al. (2018) M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy,  and R. Melko, Physical Review X 8, 021050 (2018).
  • Liu et al. (2017) J. Liu, Y. Qi, Z. Y. Meng,  and L. Fu, Physical Review B 95, 041101 (2017).
  • Huang and Wang (2017) L. Huang and L. Wang, Physical Review B 95, 035105 (2017).
  • Aoki and Kobayashi (2016) K.-I. Aoki and T. Kobayashi, Modern Physics Letters B 30, 1650401 (2016).
  • Carleo and Troyer (2017) G. Carleo and M. Troyer, Science 355, 602 (2017).
  • Nomura et al. (2017) Y. Nomura, A. S. Darmawan, Y. Yamaji,  and M. Imada, Physical Review B 96, 205152 (2017).
  • Czischek et al. (2018) S. Czischek, M. Gärttner,  and T. Gasenzer, Physical Review B 98, 024311 (2018).
  • Luchnikov et al. (2020) I. A. Luchnikov, S. V. Vintskevich, D. A. Grigoriev,  and S. N. Filippov, Phys. Rev. Lett. 124, 140502 (2020).
  • Bény (2013) C. Bény, arXiv preprint arXiv:1301.3124  (2013).
  • Mehta and Schwab (2014) P. Mehta and D. J. Schwab, arXiv preprint arXiv:1410.3831  (2014).
  • Lin et al. (2017) H. W. Lin, M. Tegmark,  and D. Rolnick, Journal of Statistical Physics 168, 1223 (2017).
  • Deng et al. (2017) D.-L. Deng, X. Li,  and S. D. Sarma, Physical Review X 7, 021021 (2017).
  • Schollwöck (2011) U. Schollwöck, Annals of Physics 326, 96 (2011).
  • Stoudenmire and Schwab (2016) E. Stoudenmire and D. J. Schwab, in Advances in Neural Information Processing Systems (2016) pp. 4799–4807.
  • Stoudenmire (2018) E. M. Stoudenmire, Quantum Science and Technology 3, 034003 (2018).
  • Sun et al. (2020) Z.-Z. Sun, C. Peng, D. Liu, S.-J. Ran,  and G. Su, Physical Review B 101, 075135 (2020).
  • Han et al. (2018) Z.-Y. Han, J. Wang, H. Fan, L. Wang,  and P. Zhang, Physical Review X 8, 031012 (2018).
  • Guo et al. (2018) C. Guo, Z. Jie, W. Lu,  and D. Poletti, Physical Review E 98, 042114 (2018).
  • Guo et al. (2020) C. Guo, K. Modi,  and D. Poletti, arXiv preprint arXiv:2004.11038  (2020).
  • Torlai et al. (2020) G. Torlai, C. J. Wood, A. Acharya, G. Carleo, J. Carrasquilla,  and L. Aolita, arXiv preprint arXiv:2006.02424  (2020).
  • MacQueen et al. (1967) J. MacQueen et al., in Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, Vol. 1 (Oakland, CA, USA, 1967) pp. 281–297.
  • Lloyd (1982) S. Lloyd, IEEE transactions on information theory 28, 129 (1982).
  • Arthur and Vassilvitskii (2006) D. Arthur and S. Vassilvitskii, k-means++: The advantages of careful seeding, Tech. Rep. (Stanford, 2006).
  • Ostrovsky et al. (2013) R. Ostrovsky, Y. Rabani, L. J. Schulman,  and C. Swamy, Journal of the ACM (JACM) 59, 1 (2013).
  • Celebi et al. (2013) M. E. Celebi, H. A. Kingravi,  and P. A. Vela, Expert systems with applications 40, 200 (2013).
  • Bachem et al. (2016) O. Bachem, M. Lucic, H. Hassani,  and A. Krause, in Advances in neural information processing systems (2016) pp. 55–63.
  • Zhang et al. (1999) B. Zhang, M. Hsu,  and U. Dayal, Hewlett-Packard Labs Technical Report HPL-1999-124 55 (1999).
  • Xu and Lange (2019) J. Xu and K. Lange, in International Conference on Machine Learning (2019) pp. 6921–6931.
  • De Amorim and Mirkin (2012) R. C. De Amorim and B. Mirkin, Pattern Recognition 45, 1061 (2012).
  • Chakraborty and Das (2017) S. Chakraborty and S. Das, Pattern Recognition Letters 100, 67 (2017).
  • Mitarai et al. (2018) K. Mitarai, M. Negoro, M. Kitagawa,  and K. Fujii, Physical Review A 98, 032309 (2018).
  • Liu et al. (2020) Y. Liu, D. Wang, S. Xue, A. Huang, X. Fu, X. Qiang, P. Xu, H.-L. Huang, M. Deng, C. Guo, et al., Physical Review A 101, 052316 (2020).
  • Buhrman et al. (2001) H. Buhrman, R. Cleve, J. Watrous,  and R. De Wolf, Physical Review Letters 87, 167902 (2001).