This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

A geometric conjecture about phase transitions

O. B. Eriçok [email protected] Materials Science and Engineering, University of California, Davis, CA, 95616, USA.    J. K. Mason [email protected] Materials Science and Engineering, University of California, Davis, CA, 95616, USA.
Abstract

As phenomena that necessarily emerge from the collective behavior of interacting particles, phase transitions continue to be difficult to predict using statistical thermodynamics. A recent proposal called the topological hypothesis suggests that the existence of a phase transition could perhaps be inferred from changes to the topology of the accessible part of the configuration space. This paper instead suggests that such a topological change is often associated with a dramatic change in the configuration space geometry, and that the geometric change is the actual driver of the phase transition. More precisely, a geometric change that brings about a discontinuity in the mixing time required for an initial probability distribution on the configuration space to reach steady-state is conjectured to be related to the onset of a phase transition in the thermodynamic limit. This conjecture is tested by evaluating the diffusion diameter and ϵ\epsilon-mixing time of the configuration spaces of hard disk and hard sphere systems of increasing size. Explicit geometries are constructed for the configuration spaces of these systems, and numerical evidence suggests that a discontinuity in the ϵ\epsilon-mixing time coincides with the solid-fluid phase transition in the thermodynamic limit.

I Introduction

Phase transitions are essential to a variety of scientific and engineering applications, but continue to be difficult to predict from fundamental considerations. Instead, a phase transition is usually identified by an observed discontinuity in one of the derivatives of a thermodynamic potential. There have been several proposals concerning the origin of these discontinuities. For example, Landau theory Landau (1936); Landau and Lifshitz (2013) associates first-order phase transitions with spontaneous symmetry breaking as quantified by an appropriately-constructed order parameter. Such order parameters often need to be defined ex post facto though, after the characteristics of the phases involved in the phase transition are already known. An understanding of phase transitions that derives from more fundamental considerations would therefore be valuable.

Statistical thermodynamics suggests that thermodynamic observables can be calculated as time averages of the relevant microscopic quantity along the system’s trajectory through the phase space. The ergodic hypothesis Boltzmann (1871); Ehrenfest and Ehrenfest (1990) implies that, provided such time averages are conducted over a period longer than a characteristic mixing time, they can be replaced by an average over the entire phase space using a measure that is independent of the system’s initial microstate; that is, they can be replaced by an ensemble average. One question that has been raised about this procedure is whether the ensemble average could really be independent of the system’s initial conditions in general. Consider that any system with a disconnected configuration space (i.e., a system that is not metrically transitive Birkhoff (1931); Neumann (1932)) is necessarily confined to the component of the configuration space where it begins, meaning that the average behavior of an ensemble of such systems would not resemble the time-averaged behavior of any one system. Instead, the measure used for the average should depend on the system’s initial microstate, since that defines the component of the configuration space to which the system is confined. Following this line of thought further, changes to the connectivity of the configuration space could discontinuously change the support of the measure used for thermodynamic averages, and thereby lead to the discontinuities in thermodynamic observables required for a phase transition. This is one way to motivate the topological hypothesis Caiani et al. (1997); Franzosi and Pettini (2004) which roughly proposes that changes in the topology of the accessible part of the configuration space are necessary for a phase transition to occur.

Refer to caption
Figure 1: The left panel shows the evolution of a model configuration space with points colored by their yy-coordinate. The filtration value is an upper bound for the yy-coordinate of any point on the system’s trajectory. The middle panel shows the average xx-coordinate x\langle x\rangle calculated using the ergodic hypothesis and the topological hypothesis as a function of filtration value, assuming that the system starts at the lower left corner of the configuration space. The right panel shows the diffusion diameter and ϵ\epsilon-mixing time of the configuration space as functions of filtration value and that they are sensitive to changes in the configuration space geometry. The diffusion diameter, but not the mixing time, is also sensitive to the bottleneck. The five dotted lines show the filtration values used in the left panel.

The application of these ideas to a toy model is shown in Fig. 1. Suppose that there is an isolated system with the configuration space in the left panel, and that the potential energy is strictly monotone increasing with the yy-coordinate in the figure. The system’s internal energy then defines a value of yy called the filtration value above which the system’s trajectory cannot pass. Further suppose that the system’s initial microstate corresponds to a point in the bottom left corner of the configuration space, and that the observable of interest is the xx-coordinate in the figure. The middle panel shows the ensemble average of this observable as a function of filtration value for both the ergodic hypothesis and the topological hypothesis. Since the configuration space is symmetric about x=0x=0 and the ergodic hypothesis stipulates that the measure should be independent of the initial microstate, the ensemble average of the observable is zero for all filtration values. By comparison, the topological hypothesis recognizes that for sufficiently small filtration values the configuration space is disconnected, the system is confined to the left side of the configuration space, and the ensemble average of the observable should be zero only when the filtration value increases enough for the space to become connected. Moreover, the discontinuous change in the integration measure at this filtration value leads to a discontinuous observable, precisely of the type that would be expected for a system undergoing a phase transition. It is significant that it was not necessary at any point to pass to the thermodynamic limit for there to be a discontinuity in the observable, though it is reasonable to suppose that the magnitude of the discontinuities and their location with respect to any control variables would vary systematically with size in a more realistic situation.

Let EE be the potential energy of a system, 𝒒i\bm{q}_{i} be the coordinates of the system’s particles, and V(𝒒1𝒒N)V(\bm{q}_{1}\dots\bm{q}_{N}) be a smooth, stable, confining and short-range interaction potential. The topological hypothesis Franzosi et al. (2007); Franzosi and Pettini (2007) initially claimed that a topological change in the equipotential energy submanifolds ΣE=V1((,E])\Sigma_{E}=V^{-1}\left((-\infty,E]\right) of the configuration space was a necessary condition for a phase transition. Kastner and Mehta Kastner and Mehta (2011) observed a second-order phase transition in a two-dimensional Φ4\Phi^{4} system at an energy where no topological change occurred, disproving this claim. Gori et al. Gori et al. (2018) subsequently refined the hypothesis and observed that the phase transition in the two-dimensional Φ4\Phi^{4} system was caused by a diverging transition time between two parts of the configuration space, and that this was associated with an asymptotic topological change. That is, a continuous change in the configuration space geometry brought about a divergence in the mixing time, and while such events are often accompanied by topological changes they are not always. The significance of the configuration space geometry is supported by a recent study of the topological and geometric properties of the two-dimensional XYXY model by Bel-Hadj-Aissa et al. Bel-Hadj-Aissa et al. (2021); they computed the mean geometric curvature of the equipotential energy level sets and observed that the location of the phase transition could be inferred from the level set curvatures.

With this as background, the fundamental conjecture of this work is:

Conjecture 1.

A necessary condition for a first-order phase transition is a discontinuity in the mixing time on the configuration space.

This is not intended to suggest that the mixing time directly regulates the appearance of phase transitions, but merely that the mixing time is sensitive to any geometric changes that could discontinuously change the integration measure for thermodynamic averages.

A natural question at this point is whether it is actually possible to measure the mixing time of a thermodynamic system, or whether this will always remain a vague quantity significant only in that it is necessarily shorter than the time scales of quasiequilibrium processes. We hypothesize that the relevant geometric changes are so severe that any geometric quantity reasonably sensitive to the accessible volume of the configuration space and the length and number of paths connecting distant regions should exhibit measurable discontinuities for the same values of the control variable as the mixing time. The two quantities used here are the diffusion distance Coifman and Lafon (2006); Talmon et al. (2013) and the ϵ\epsilon-mixing time. The first measures the difference between two distributions that start as Dirac delta distributions at different locations on the configuration space and evolve by diffusion; maximizing the diffusion distance over all starting locations of the two distributions gives the configuration space’s diffusion diameter. The second measures the time required for a Dirac delta distribution to diffuse to the steady-state distribution within a tolerance defined by ϵ\epsilon, averaged over all starting locations of the distribution in the configuration space. This is intended to resemble the thermodynamic mixing time, though the distribution evolves by the diffusion equation rather than the Liouville equation and the similarity to the steady-state distribution is quantified by the Kullback–Leibler divergence Kullback and Leibler (1951).

The conjecture is tested by evaluating the diffusion diameter and ϵ\epsilon-mixing time for the configuration spaces of hard disk and hard sphere systems, collectively called hard disk systems in the following. These systems are frequently used to model simple fluids Dyre (2016), and are governed by the hard disk potential for which the energy is infinite if any disks overlap and is zero otherwise. The phase transitions for these systems have been studied extensively, starting with the seminal work of Alder and Wainwright Alder and Wainwright (1962) who observed a phase transition in a system of hard disks as a function of packing fraction η\eta (the fraction of the area covered by the disks). The solid for η>0.72\eta>0.72 is characterized by the presence of a long-range translational and orientational order, whereas the fluid for η<0.70\eta<0.70 is characterized by the absence of any long-range order Weber et al. (1995); Mitus et al. (1997). The interval 0.70<η<0.720.70<\eta<0.72 was initially believed to contain coexisting solid and fluid phases as would be expected of a first-order transition, but more recently was claimed to contain a hexatic phase Halperin and Nelson (1978); Young (1979). It is significant that there is still not agreement in the literature about the order of the transitions and the phases involved Weber et al. (1995); Mitus et al. (1997); Bernard and Krauth (2011); Engel et al. (2013) even for this simple system. Similar studies of hard spheres in three dimensions include those by Isobe and Krauth Isobe and Krauth (2015) and Pieprzyk et al. Pieprzyk et al. (2019) who observed solid-fluid phase coexistence in the 0.49<η<0.5480.49<\eta<0.548 interval.

The configuration spaces of hard disk systems have been studied before. Carlsson et al. Carlsson et al. (2012) explored the topology of the configuration space of five hard disks in a square box by regularizing the potential energy surface and using classical Morse theory Morse (1934); Milnor (2016). Baryshnikov et al. Baryshnikov et al. (2014) developed a min-type Morse theory for the configuration spaces of hard disk systems and proved that the critical points where the topology changes as a function of packing fraction are precisely the mechanically-balanced disk configurations. Ritchey Ritchey (2019) studied configuration spaces of hard disk in the hexagonal torus for n=112n=1\ldots 12 disks, created a database of the critical points, defined their critical index (specifying the nature of the associated topological change), and found their plane symmetry groups. The authors previously Ericok and Mason (2021) proposed distance functions on the configuration spaces of hard disks quotiented by various symmetry groups. They subsequently Eriçok et al. (2021) used these distances to explicitly triangulate the quotients of the configuration space of two spheres in the rhombic dodecahedron and measure the diffusion diameters of the resulting spaces. They also observed an accumulation of critical points in the configuration space at packing fractions in the phase coexistence interval, suggesting the possibility of a topological and geometric catastrophe there in the thermodynamic limit.

The technical contribution of this work is a set of techniques described in Sec. II that collectively allow the configuration space geometries for hard disk systems with n7n\leq 7 disks and hard sphere systems with n6n\leq 6 spheres to be evaluated in practice. The diffusion diameters and ϵ\epsilon-mixing times (defined in Sec. III) of these spaces are computed as functions of particle diameter and particle number, and in Sec. IV significant discontinuities are observed at the packing fractions of critical points close to the phase coexistence intervals. Along with the observation Eriçok et al. (2021) that critical points accumulate in the phase coexistence intervals with increasing particle number, this suggests that there should be a discontinuity in the mixing time at the packing fraction of the phase transition in the thermodynamic limit, offering preliminary support for Conj. 1.

II Configuration Spaces

II.1 Tautological function

The configuration space of nn points on a dd-dimensional torus TdT^{d} is

Λ(n)={𝐱=(𝒙1𝒙n)|𝒙iTd}.\Lambda(n)=\{\mathbf{x}=(\bm{x}_{1}\dots\bm{x}_{n})\,|\,\bm{x}_{i}\in T^{d}\}. (1)

A two-dimensional torus T2T^{2} for hard disks is obtained by identifying opposite edges of a regular hexagon, whereas a three-dimensional torus T3T^{3} for hard spheres is obtained by identifying opposite faces of a rhombic dodecahedron. Figure 2 shows these domains with some of the periodic images of the fundamental unit cells. The center to center distance to the neighboring cells is one in both cases.

Refer to caption
Figure 2: A two-dimensional torus T2T^{2} (left) and a three-dimensional torus T3T^{3} (right) are shown with some of their periodic images. The unit vector 𝒂1\bm{a}_{1} points in the xx-direction in both cases. All the unit vectors pass through the centers of the corresponding faces.

The tautological function τ:Λ(n)R\tau:\Lambda(n)\rightarrow R is the maximum radius that the disks could have without overlapping, or

τ(𝐱)=min1i<jnrij\tau(\mathbf{x})=\min_{\begin{subarray}{c}1\leq i<j\leq n\end{subarray}}r_{ij} (2)

where rijr_{ij} is half the geodesic distance between the centers of disks ii and jj. This allows the configuration space of nn disks of radius ρ\rho to be written as

Γ(n,ρ)=τ1([ρ,))\Gamma(n,\rho)=\tau^{-1}\left([\rho,\infty)\right) (3)

or the set of all configurations of points where the minimum point separation is at least ρ\rho.

II.2 Critical configurations

Morse theory Morse (1934); Milnor (2016) stipulates that the topology of the sublevel sets of a generic real-valued function ff defined on a smooth manifold MM can change only at the critical points of ff. Intuitively, a critical point is a point where the gradient of the function vanishes. The index of a critical point is defined as the number of negative eigenvalues of the Hessian matrix there, and roughly indicates the number of independent directions along which ff decreases to second order. Let Ma={xM|f(x)<a}M_{a}=\{x\in M|f(x)<a\} denote a sublevel set of MM. One of the results of Morse theory is that the topology of MaM_{a} and MbM_{b} are equivalent if the interval [a,b][a,b] doesn’t contain any critical points of ff.

Baryshnikov et al. Baryshnikov et al. (2014) began to develop a min-type Morse theory specifically for hard disk configuration spaces, and proved that the critical points of the tautological function are precisely the mechanically-balanced configurations. These can be shown to coincide with the critical points of the regularized hard-disk potential in Ref. Eriçok et al. (2021), found using a conjugate gradient algorithm to minimize the square of the potential gradient magnitude. Interactive databases of the critical points for the hard disk111https://github.com/burakericok/hard_disk_crits and hard sphere222https://github.com/burakericok/hard_sphere_crits configuration spaces are available online. The distributions of critical points as functions of index and packing fraction in Refs. Ritchey (2019); Eriçok et al. (2021) suggest that an accumulation of critical points in a narrow packing fraction interval with increasing number of disks could be associated with the onset of a phase transition; one of the purposes of this work is to explore this observation further.

II.3 Geometric representation

The procedure followed previously Ericok and Mason (2021); Eriçok et al. (2021) when studying hard disk configuration spaces involved repeatedly sampling points in the configuration space of points Λ(n)\Lambda(n), connecting nearby points in the resulting point cloud to reconstruct Λ(n)\Lambda(n), and restricting to the hard disk configuration space Γ(n,ρ)\Gamma(n,\rho) by retaining only those regions with suitable values of the tautological function. The difficulty with this procedure is that the dimension and complexity of Γ(n,ρ)\Gamma(n,\rho) increases rapidly with nn, quickly requiring a point cloud with an overwhelming number of points to accurately capture the geometric details. This section describes several techniques that, while they do not solve the problem, reduce the required number of points enough to allow us to study the geometry of configuration spaces with up to 1818 dimensions.

The first is to quotient the configuration space by various symmetry groups, thereby reducing the relevant volume. The three isometry groups considered here are the group of rigid translations 𝒯\mathcal{T}, the group of disk label permutations 𝒫\mathcal{P}, and the point group symmetries of the fundamental unit cell \mathcal{L}. As described previously Ericok and Mason (2021); Eriçok et al. (2021), the configuration space is quotiented by these groups by changing the distance function used to identify and connect nearby points in the point cloud. More precisely, there is a natural distance function dΛ(𝐱,𝐲)=i=1n𝒙i𝒚id_{\Lambda}(\mathbf{x},\mathbf{y})=\sum_{i=1}^{n}\lVert\bm{x}_{i}-\bm{y}_{i}\rVert on Λ\Lambda that is the sum of the geodesic distances each point in a configuration 𝐱\mathbf{x} would need to travel to be converted into configuration 𝐲\mathbf{y}. Given an isometry group 𝒮\mathcal{S}, the distance dΛ/𝒮d_{\Lambda/\mathcal{S}} in the quotient space Λ/𝒮\Lambda/\mathcal{S} can be written as Burago et al. (2001)

dΛ/𝒮(𝐱,𝐲)=infS𝒮dΛ[𝐱,S(𝐲)].d_{\Lambda/\mathcal{S}}(\mathbf{x},\mathbf{y})=\inf\limits_{\begin{subarray}{c}S\in\mathcal{S}\end{subarray}}d_{\Lambda}[\mathbf{x},S(\mathbf{y})]. (4)

Observe that 𝒯\mathcal{T} is a continuous group, that Eq. 4 involves solving a global optimization problem when 𝒯𝒮\mathcal{T}\subset\mathcal{S}, and that this problem can have multiple local minima. While algorithms like Tabu search Glover (1989); Chelouah and Siarry (2000) can be used in such situations, the number of problems to be solved increases rapidly with the number of isometry groups considered. For example, when 𝒮=𝒯𝒫\mathcal{S}=\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}, the global optimization over rigid translations needs to be solved n!×O()n!\times O(\mathcal{L}) times to evaluate Eq. 4 where O()O(\mathcal{L}) is the order of \mathcal{L}.

The second technique reduces the number of optimization problems that need to be solved in Eq. 4. The idea is to construct an approximation to the solution of the global optimization problem over rigid translations, allowing the discrete symmetry operations that are unlikely to realize the minimum to be quickly rejected. The procedure followed here samples a fixed number of random translations for a fixed discrete symmetry operation, and uses the minimum over the random translations as the approximation. The details of this procedure and various numerical results are provided in App. A. This reduces the computational cost enough to be able to evaluate a small number of distances dΛ/𝒮d_{\Lambda/\mathcal{S}}, but not all the pairwise distances in the point cloud when identifying nearby points to reconstruct Λ/𝒮\Lambda/\mathcal{S}.

The third techniques reduces the number of times that dΛ/𝒮d_{\Lambda/\mathcal{S}} needs to be evaluated. Points sampled from Λ\Lambda are initially mapped to a Cartesian space with coordinates given by descriptors that are invariant to the action of 𝒯\mathcal{T}, 𝒫\mathcal{P}, and \mathcal{L} Ericok and Mason (2021); Eriçok et al. (2021). This mapping is conjectured to be an embedding and therefore to preserve the neighborhood of every point in Λ\Lambda, but is not isometric in the sense that distances between configurations are distorted. That said, there are efficient algorithms to calculate kk^{\prime}-nearest neighbor graphs in the descriptor space with the Euclidean metric. Since neighborhoods are preserved, the kk^{\prime}-nearest neighbor graph of a point in the descriptor space should contain the kk-nearest neighbor graph of the point in Λ/𝒮\Lambda/\mathcal{S} for sufficiently large k>kk^{\prime}>k; our numerical experiments suggest that k5kk^{\prime}\sim 5k is usually sufficient. The procedure therefore involves constructing the kk^{\prime}-nearest neighbor graph in the descriptor space, evaluating dΛ/𝒮d_{\Lambda/\mathcal{S}} for each edge of this graph, and constructing the approximate kk-nearest neighbor graph in Λ/𝒮\Lambda/\mathcal{S} using these distances.

It is useful at this point to discuss the relationship of the kk-nearest neighbor graph in Λ/𝒮\Lambda/\mathcal{S} to the geometry of the underlying space. It is clear that the apparent connectivity of the space depends on the value of kk; for very small kk the graph would likely contain many disconnected components, whereas for very large kk every point would share an edge with every other point, regardless of the actual properties of the space. While there does not seem to be an established canonical approach to selecting the value of kk, the strategy followed here involves the use of topological information. Observe that, for any filtration of the kk-nearest neighbor graph by the tautological function, the number of disconnected components in the resulting graph should be at most the number of index-0 critical points with disk radius larger than or equal to the filtration value. Moreover, when the filtration value is equal to the smallest value such that all of the index-11 critical points (which correspond to saddle points) are included, only a single connected component should remain. The smallest value of kk for which these conditions are satisfied seems a reasonable choice, and for the numbers of points in our point clouds (indicated in Fig. 3) is approximately k=100×nk=100\times n where nn is the number of disks.

Refer to caption
Figure 3: Number of points used in the graph representations for hard disks and spheres.

The fourth technique is to directly use the kk-nearest neighbor graph rather than a simplicial complex (a triangulation) to evaluate the geometric properties of Λ/𝒮\Lambda/\mathcal{S}. Previously, the configuration spaces for small numbers of hard disks and spheres were triangulated as α\alpha-complexes Ericok and Mason (2021); Eriçok et al. (2021). While such simplicial complexes can accurately represent all the geometric properties of the underlying space, the number of simplices required generally increases exponentially with dimension, quickly making the computational memory requirements prohibitive Sheehy (2013); Buchet et al. (2016). Fortunately, the only geometric information required to calculate the diffusion diameter and the ϵ\epsilon-mixing time are geodesic distances, and these can be reasonably approximated from the kk-nearest neighbor graph alone. Figure 4 shows the quotient space Λ(2)/{𝒯,𝒫,}\Lambda(2)/\{\mathcal{T,P,L}\} of two points on T2T^{2} represented as an α\alpha-complex (left) and as a graph (right). While the shortest path connecting two points is a straight line on the left, the path through the edges of the graph on the right is slightly longer. Since both the diffusion distance and the ϵ\epsilon-mixing time are designed to be robust to small geometric perturbations like these, using the kk-nearest neighbor graph of Λ/𝒮\Lambda/\mathcal{S} directly is sufficient for our purposes.

Refer to caption
Figure 4: The quotient space Λ(2)/{𝒯,𝒫,}\Lambda(2)/\{\mathcal{T,P,L}\} of two points on T2T^{2} represented as an α\alpha-complex (left) and a graph (right), with color indicating the value of the tautological function. The red path connecting two points is a straight line on the left, but is slightly longer on the right.

The fifth and final technique is related to the construction of the point cloud on Λ\Lambda, and is intended to reduce the number of points required to accurately represent the relevant geometric features of Λ/𝒮\Lambda/\mathcal{S}. Figure 5 shows the distributions of tautological function values ρ\rho for point clouds on Λ(4)\Lambda(4) sampled by three different procedures, with the dashed lines indicating the radii of previously-identified critical points. The left panel shows the distribution of points sampled uniformly on Λ(4)\Lambda(4), and reveals that the overwhelming majority of the space’s volume has comparatively small values of ρ\rho where only a single pair of disks would be in contact. The critical points are concentrated at relatively high values of ρ\rho though. This motivates the use of importance sampling Kahn and Harris (1951) to sample points more uniformly over ρ\rho, concentrating points in the regions where the topology and geometry of Γ(4,ρ)\Gamma(4,\rho) are most likely to change and giving the distribution in the middle panel. The sampling density around the critical points can be increased further by perturbing known critical configurations and adding the resulting configurations directly to the point cloud, giving the distribution in the right panel. The intention is to ensure that the density of sampled points is high enough to accurately reflect the parts of Λ\Lambda that are most relevant to the hypothesis without unnecessarily sampling points elsewhere in the space.

Refer to caption
Figure 5: Distributions of tautological function values for point clouds on Λ(4)\Lambda(4) sampled by three different procedures, with the dashed lines indicating the radii of previously-identified critical points. Sampling points uniformly gives the distribution on the left. The distribution in the middle uses importance sampling and is more uniform. The distribution on the right is for a point cloud that both uses importance sampling and samples additional points in the neighborhoods of the critical points.

III Configuration space geometry

Diffusion is a smoothing process. Flows from regions of high concentration to low concentration necessarily make a concentration field more uniform with time. This also makes diffusion processes relatively insensitive to small perturbations in the initial concentration field or to the geometry of the underlying space. This property is likely one of the reasons that diffusive processes are often used to learn about the geometry of a space represented by a sampled point cloud, since the resulting insights do not depend sensitively on the distribution of the points.

III.1 Diffusion Distance

Let GG be the kk-nearest neighbor graph on Λ/𝒮\Lambda/\mathcal{S} constructed by the procedure in Sec. II.3. The diffusion distance dij,td_{ij,t} on GG measures the L2L^{2} distance between two distributions that begin as Dirac delta distributions on vertices ii and jj and diffuse on GG for a time tt. Let the kernel matrix 𝐊\mathbf{K} have entries kij=exp(lij2/σ2)k_{ij}=\exp(-l_{ij}^{2}/\sigma^{2}) where lijl_{ij} is Dijkstra’s distance between vertices ii and jj and σ\sigma is twice the median length of edges in GG, and the diagonal degree matrix 𝐃\mathbf{D} has entries dii=jkijd_{ii}=\sum_{j}k_{ij}. Then 𝐏=𝐃1𝐊\mathbf{P}=\mathbf{D}^{-1}\mathbf{K} is the transition rate matrix of a continuous-time Markov process where the effect of raising 𝐏\mathbf{P} to the ttth power is equivalent to propagating the process by a time tt. Let {λl,ϕl}\{\lambda_{l},\bm{\phi}_{l}\} be the set of eigenvalues and eigenvectors of 𝐏\mathbf{P} for 0lN10\leq l\leq N-1 where NN is the number of vertices. Since 𝐏\mathbf{P} is normalized, the largest eigenvalue λ0\lambda_{0} is associated with a constant eigenvector ϕ0\bm{\phi}_{0}. Discarding λ0\lambda_{0} and ϕ0\bm{\phi}_{0}, the diffusion coordinates are defined as

𝚽t,i=[λ1tϕ1(i),,λN1tϕN1(i)]\bm{\Phi}_{t,i}=\left[\lambda_{1}^{t}\bm{\phi}_{1}(i),\dots,\lambda_{N-1}^{t}\bm{\phi}_{N-1}(i)\right] (5)

and encode a distribution starting as a Dirac delta distribution on vertex ii and diffusing for a time tt. The diffusion distance between vertices ii and jj is then defined as

dij,t=𝚽t,i𝚽t,jd_{ij,t}=\|\bm{\Phi}_{t,i}-\bm{\Phi}_{t,j}\| (6)

where \|\cdot\| is the Euclidean norm. Note that the significance of the time tt depends on the diffusion rate implicit in the kernel 𝐊\mathbf{K}. Since the eigenspectrum of 𝐏t\mathbf{P}^{t} decays very quickly for sufficiently large tt, the diffusion distance dij,td_{ij,t} can often be accurately approximated using only the first few eigenvalues and eigenvectors for a substantial time and memory savings Talmon et al. (2013). Details about this approximation are provided in App. B.

III.2 Mixing Time

Thermodynamics is generally concerned with equilibrium or quasiequilibrium systems, where there is no net redistribution of matter or energy and thermodynamic observables are time independent. From the standpoint of statistical thermodynamics, this means that the time average of any thermodynamic observable over the system’s trajectory through the phase space does not depend on the initial microstate provided that the averaging is performed over a sufficiently long time interval; the shortest such interval is known as the mixing time.

Let Ω\Omega be the phase space of a thermodynamic system, 𝒒\bm{q} and 𝒑\bm{p} be the canonical coordinates and momenta, and μ(𝒒,𝒑,t)\mu(\bm{q},\bm{p},t) be the probability distribution of microstates on Ω\Omega at time tt. Suppose that the system has an arbitrary initial condition μ0(𝒒,𝒑)=μ(𝒒,𝒑,0)\mu_{0}(\bm{q},\bm{p})=\mu(\bm{q},\bm{p},0). Briefly setting aside the specifics of the time evolution of μ(𝒒,𝒑,t)\mu(\bm{q},\bm{p},t), the steady-state distribution μ(𝒒,𝒑)=limtμ(𝒒,𝒑,t)\mu_{\infty}(\bm{q},\bm{p})=\lim_{t\rightarrow\infty}\mu(\bm{q},\bm{p},t) defines the equilibrium condition on Ω\Omega, and the mixing time could in principle be defined by the approach of μ(𝒒,𝒑,t)\mu(\bm{q},\bm{p},t) to μ(𝒒,𝒑)\mu_{\infty}(\bm{q},\bm{p}). That said, the literature does not seem to precisely define the necessary conditions for the two distributions to be effectively indistinguishable, nor the effect of the initial condition μ0(𝒒,𝒑)\mu_{0}(\bm{q},\bm{p}) on the resulting mixing time. A likely reason for this is the historical emphasis on equilibrium states for which, by hypothesis, the observation time can be made as long as necessary for there to be complete mixing.

Classically, the time evolution of μ(𝒒,𝒑,t)\mu(\bm{q},\bm{p},t) is governed by the Liouville equation Liouville (1838); Gibbs (1902). Liouville’s theorem states that the resulting convective derivative of the probability distribution is zero, or that the flow of probability density resembles that of an incompressible fluid. This raises questions relating to the convergence of probability distributions that are closely related to those about the origins of irreversibility, and that continue to be discussed in the literature. This discussion is avoided by simply supposing that μ(𝒒,𝒑,t)\mu(\bm{q},\bm{p},t) evolves by the diffusion equation, while not endorsing this approach generally. One consequence of this supposition is that the marginal probability distributions on the configuration and momentum subspaces evolve independently, though only the marginal probability distribution on the configuration space has a limiting distribution (the momentum subspace is unbounded). This suggests that the mixing time be defined by the convergence of the marginal probability distribution ν(𝒒,t)\nu(\bm{q},t) on the configuration space Ωq\Omega_{q} to the limiting distribution ν(𝒒)\nu_{\infty}(\bm{q}).

The Kullback-Leibler divergence Kullback and Leibler (1951) is often called the relative entropy, and is a standard way to quantify how much a probability distribution ν\nu differs from a reference probability distribution ν\nu_{\infty}:

I(t;ν0)=Ων(𝒒,t)logν(𝒒,t)ν(𝒒)d𝒒.I(t;\nu_{0})=\int_{\Omega}\nu(\bm{q},t)\log\frac{\nu(\bm{q},t)}{\nu_{\infty}(\bm{q})}\mathrm{d}\bm{q}. (7)

We propose that the condition for ν(𝒒,t)\nu(\bm{q},t) to have converged to ν(𝒒)\nu_{\infty}(\bm{q}) be that I(t;ν0)ϵI(t;\nu_{0})\leq\epsilon, where ϵ\epsilon is an adjustable parameter analogous to a conventional convergence threshold.

Now suppose that ν0\nu_{0} is a Dirac delta distribution centered at 𝒒0\bm{q}_{0}, and that tϵ(𝒒0)t_{\epsilon}(\bm{q}_{0}) is the minimum time required for this initial condition to converge to ν(𝒒)\nu_{\infty}(\bm{q}) on Ωq\Omega_{q} in the sense above. The ϵ\epsilon-mixing time tϵ\langle t\rangle_{\epsilon} is defined as the weighted average of tϵt_{\epsilon} over all possible choices of the initial condition, or

tϵ=Ωtϵ(𝒒)ν(𝒒)𝑑𝒒\langle t\rangle_{\epsilon}=\int_{\Omega}t_{\epsilon}(\bm{q})\nu_{\infty}(\bm{q})d\bm{q} (8)

where the measure of integration is taken to be ν(𝒒)\nu_{\infty}(\bm{q}) in the absence of a natural alternative. The ϵ\epsilon-mixing time is regarded as a precisely-defined proxy for the thermodynamic mixing time on the configuration space in the following.

This leaves only the definition of a diffusive process occurring on a connected graph GG instead of on a continuous space Ωq\Omega_{q}. Let 𝝂(t)\bm{\nu}(t) be the probability masses on the vertices of GG at time tt, and define the graph Laplacian as 𝐋=𝐃𝐊\mathbf{L}=\mathbf{D}-\mathbf{K} Chung and Graham (1997). The governing equation for a diffusive process is

(/t+𝐋)𝝂(t)=0.(\partial/\partial t+\mathbf{L})\bm{\nu}(t)=0. (9)

Let 𝚲\mathbf{\Lambda} be the diagonal matrix of the eigenvalues λ0=0<λ1λN1\lambda_{0}=0<\lambda_{1}\dots\leq\lambda_{N-1} of 𝐋\mathbf{L}, and 𝚽\mathbf{\Phi} be the matrix of the corresponding eigenvectors where the first column is a constant vector. Since 𝐋\mathbf{L} is a symmetric matrix, the columns of 𝚽\mathbf{\Phi} form an orthogonal basis and the solution to the diffusion equation is

𝝂(t)=𝚽et𝚲𝚽T𝝂0\bm{\nu}(t)=\mathbf{\Phi}e^{-t\mathbf{\Lambda}}\mathbf{\Phi}^{T}\bm{\nu}_{0} (10)

for the initial condition 𝝂0\bm{\nu}_{0}. The steady-state distribution is readily calculated using the fact that all of the diagonal elements of et𝚲e^{-t\mathbf{\Lambda}} go to zero in the limit of long time except the first term which goes to one. This suggests that the steady-state distribution ν\nu_{\infty} is always the uniform distribution over all vertices of GG.

The right panel of Fig. 1 shows the different behaviors of the diffusion distance and the mixing time in practice. Observe that while both exhibit a discontinuous jump at the filtration value where the two components of the space merge, the diffusion diameter has an additional peak at the filtration value of the bottleneck. The initial rise is attributed to the probability mass having difficulty diffusing to the point of the bottleneck through a small number of paths. Since the space is constructed by means of kk-nearest neighbor graphs, more paths spanning the neck appear with further increases in the filtration value, subsequently reducing the diffusion diameter. That is, the peak in the diffusion diameter is effectively an artifact of the construction of the space and the precise sequence in which the edges appear. By comparison, the average mixing time tϵ\langle t\rangle_{\epsilon} is unaffected by the bottleneck since the small volume of that region reduces its contribution to the average in Eq. 8. More details about the calculation of the ϵ\epsilon-mixing time for this system are given in App. C.

IV Results

The hard disk equation of state usually appears in the literature as a function of packing fraction η\eta Mulero et al. (2008); Tian et al. (2020). The radius ρ\rho of the disks appearing in Eq. 3 can be converted to a packing fraction using η=nπρ2/A\eta=n\pi\rho^{2}/A for hard disks and η=4nπρ3/(3V)\eta=4n\pi\rho^{3}/(3V) for hard spheres, where A=3/2A=\sqrt{3}/2 and V=2/2V=\sqrt{2}/2 are respectively the area of the fundamental hexagon and the volume of the fundamental rhombic dodecahedron in Fig. 2.

The authors previously constructed quotient spaces of two hard disks Ericok and Mason (2021) and two hard spheres Eriçok et al. (2021) and studied the topological and geometric properties of both the original and quotient spaces as functions of packing fraction. They observed that while the geometric and topological features of the these spaces changed dramatically with the choice of symmetry groups by which to quotient, the general behavior of the diffusion diameter (and significantly the locations of the discontinuities) as a function of packing fraction did not. This suggests that the quotient space has the advantage of a substantially reduced volume while retaining the essential geometric and topological features of the full configuration space. If the behavior of the ϵ\epsilon-mixing time is similarly robust to the choice of symmetry groups by which to quotient, this would allow Conj. 1 to be tested more easily and for systems with larger numbers of disks.

Figure 6 shows the ϵ\epsilon-mixing times of the quotient spaces Γ(2,η)/𝒯\Gamma(2,\eta)/\mathcal{T} and Γ(2,η)/{𝒯𝒫}\Gamma(2,\eta)/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} for hard disks (top) and hard spheres (bottom). Since only relative changes in the ϵ\epsilon-mixing time are significant, all the results in this section are normalized to their maximum value for ease of comparison. Initially consider the hard disk results in the top row. There are only two critical points, one index-0 and one index-11. The volumes and ϵ\epsilon-mixing times of both spaces initially grow slowly with decreasing η\eta, with a discontinuity appearing at precisely the packing fraction of the index-11 critical point; the discontinuity is stronger for Γ(2,η)/𝒯\Gamma(2,\eta)/\mathcal{T} since the geometric change is more severe, though at the price of greatly increased computational cost. The bottom row shows the results for hard spheres, for which there are two distinct index-0, one index-11, and one index-22 critical points. Since having distinct index-0 critical points results in the space having multiple disconnected components for certain intervals of η\eta, the ϵ\epsilon-mixing times for each component are shown with an opacity that indicates the fraction of vertices participating in that component. As before, there is a discontinuity in the ϵ\epsilon-mixing time at the packing fraction of the index-11 critical point, in this situation indicating that the disconnected components are joined. Notice that a similar discontinuity does not occur at the packing fraction of the index-22 critical point, nor indeed at any other critical point of the tautological function. Since the behavior of the diffusion diameter was previously found to be similarly robust to the choice of symmetry groups by which to quotient, the quotient spaces Γ/{𝒯𝒫}\Gamma/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} will be used exclusively in the following.

Refer to caption
Figure 6: ϵ\epsilon-mixing times of the quotient spaces Γ(2,η)/𝒯\Gamma(2,\eta)/\mathcal{T} and Γ(2,η)/{𝒯𝒫}\Gamma(2,\eta)/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} for hard disks (top) and hard spheres (bottom). Dashed lines represent the packing fractions of the critical points.

Figure 7 shows the diffusion diameters and ϵ\epsilon-mixing times of the quotient spaces Γ/{𝒯𝒫}\Gamma/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} for n=37n=3\dots 7 hard disks as functions of packing fraction, with dashed lines indicating the packing fractions of the critical points. The first observation is that not all of the critical points correspond to substantial geometric changes, at least not ones to which the diffusion distance and ϵ\epsilon-mixing time are sensitive. The second is that the diffusion diameter is generally much noisier than the ϵ\epsilon-mixing time, and while the structure in the signal could perhaps be analyzed for further geometric information that is not the purpose of our study. The third is that while the discontinuities in the diffusion distance and ϵ\epsilon-mixing time do not always occur at the same packing fractions, the largest discontinuities generally do.

Refer to caption
Figure 7: Diffusion diameters and ϵ\epsilon-mixing times of the spaces Γ/{𝒯𝒫}\Gamma/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} for n=37n=3\dots 7 hard disks as functions of packing fraction. Dashed lines shows the packing fractions of the critical points.

Figure 8 shows the corresponding results to Fig. 7 for n=36n=3\dots 6 hard spheres. While the number of critical points is greatly increased relative to the hard disk systems in Fig. 7, the number of discontinuities in the diffusion diameter and ϵ\epsilon-mixing time are approximately the same. This suggests that there is perhaps a small class of critical points associated with substantial geometric changes to the configuration space, and that the distribution of these critical points is the most relevant to the underlying hypothesis. Notice particularly that the packing fraction of the largest discontinuities appears to be approaching the packing fraction of the lower end of the phase-coexistence interval with increasing nn. Unfortunately, even using all of the techniques in Sec. II.3, the memory requirements increase so rapidly with nn that we could not realistically examine the spaces for n7n\geq 7. Even for five and six hard spheres, the volume of the space increases so rapidly with decreasing packing fraction that we cannot report diffusion diameters and ϵ\epsilon-mixing times over the entire domain. This does not substantially affect our conclusions though, since the sampled domain of η\eta already extends well below the coexistence interval, and notably includes all of the index-11 critical points.

Refer to caption
Figure 8: Diffusion diameters and ϵ\epsilon-mixing times of the spaces Γ/{𝒯𝒫}\Gamma/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} for n=36n=3\dots 6 hard spheres as functions of packing fraction. Dashed lines shows the packing fractions of the critical points.

There is evidence that, at least for the hard disk systems, the substantial geometric changes leading to discontinuities in the ϵ\epsilon-mixing time are often associated with the lowest-packing-fraction index-11 critical point. Figure 9 shows the packing fractions of the largest discontinuities in Figs. 7 and 8 as black squares, with the number of hard disks (top) and hard spheres (bottom) increasing on the vertical axes. The packing fractions of all known critical points are also shown, with index-0 critical points in purple, index-11 critical points in blue, and all others in green. While ϵ\epsilon-mixing time data is only available for n7n\leq 7 hard disks and n6n\leq 6 hard spheres, populations of critical points for n12n\leq 12 hard disks and hard spheres have been sampled using established techniques that are described elsewhere Ritchey (2019); Ericok and Mason (2021); Eriçok et al. (2021). Apart from the increasing concentration of low-index critical points around the phase coexistence interval with increasing nn, the most striking aspect of the figure is that the largest discontinuities almost always occur close to the packing fraction of the last index-11 critical point to appear with decreasing packing fraction (each of the black squares comes slightly before an index-11 critical point due to finite sampling). This is not entirely unexpected, since each index-11 critical point either joins previously disconnected components or add a new class of closed paths to the space. Supposing that discontinuities in the ϵ\epsilon-mixing time are associated with the former, it also makes sense that the largest discontinuity would be observed at lower packing fractions where the disconnected components being joined had the opportunity to grow to substantial volumes. The increasing concentration of low-index critical points around the phase coexistence interval draws the discontinuity closer to the liquid limit with increasing nn, though that is admittedly a noisy trend for the small numbers of disks considered here. Nevertheless, this does provide evidence that something along the lines of Conj. 1 could be true for hard sphere systems, and perhaps for thermodynamics systems as well.

Refer to caption
Figure 9: The largest discontinuous jumps (black squares) observed in Figs. 7 (top) and 8 (bottom) along with the packing fractions of all known critical points, colored by their indices (index-0 in purple, index-11 in blue, all others in green).

V Conclusion

A phase transition is necessarily related to a discontinuous change in the probability density function describing the distribution of the system’s microstates on the phase space. The question of how this could occur as the thermodynamic control variables are continuously varied has not been conclusively answered in the literature. One proposal called the topological hypothesis Caiani et al. (1997); Franzosi and Pettini (2004) suggests that topological changes to the accessible region of the configuration space is a necessary condition for a phase transition. This paper instead suggests that a substantial change to the geometry of the accessible region is a necessary condition for a phase transition, and that such a change is often (but not always) associated with a topological change. More specifically, Conj. 1 proposes that a discontinuity in the mixing time on the configuration space is a necessary condition for a first-order phase transition in the thermodynamic limit. Our main result is a preliminary test of this conjecture for hard disk and hard sphere systems with few enough disks that the configuration space geometry can be explicitly studied.

The configuration spaces of hard disks and hard spheres are represented as graphs with vertices representing specific configurations of disks and edges indicating the distances between the configurations. The vertices of the graph are sampled more densely around critical points, or locations where the topology of the configuration space is known to change as a function of disk radius, to more accurately capture the topological and geometric changes in those regions. The diffusion diameters and the ϵ\epsilon-mixing times of the resulting graphs are calculated as functions of packing fraction for n7n\leq 7 and n6n\leq 6 hard spheres; the ϵ\epsilon-mixing time is proposed here, and is designed to be an analogue to the thermodynamic mixing time that can be explicitly evaluated.

The geometric signals obtained for the diffusion diameter and the ϵ\epsilon-mixing time are consistent in the sense that their discontinuities generally occur at the same packing fractions. Apart from relating more directly to the content of Conj. 1, the ϵ\epsilon-mixing time is a much smoother function of packing fraction than the diffusion diameter. The discontinuities in the ϵ\epsilon-mixing time are found to occur at comparatively few critical points; for the hard disk and hard sphere systems at least these are predominantly index-11 critical points. Along with the observation that the low-index critical points are increasingly concentrated around the phase coexistence interval with the number of disks, this suggests that a discontinuous change in the ϵ\epsilon-mixing time could indeed coincide with the first-order phase transitions in the hard disk and hard sphere systems in the thermodynamic limit.

Future studies along these lines would likely need to extend the analysis to larger numbers of disks to make the trends in the approach to the thermodynamic limit clearer. Given the rate of increase in the computational requirements with number of disks, this would require at a minimum more efficient algorithms to search for critical points and calculate the distances between configurations. The computational requirements would also be reduced by using a further-reduced representation of the spaces by means of witness graphs Aronov et al. (2013), or by restricting to particular intervals of disk radius close to the phase coexistence region. Further development of a suitable min-type Morse theory would also be helpful Mori and Salvetti (2011); Baryshnikov et al. (2014).

Acknowledgements.
O.B.E. and J.K.M. were supported by the National Science Foundation under Grant No. 1839370.

Appendix A Heuristic for the distance

Let 𝒮=𝒯𝒫\mathcal{S}=\mathcal{T}\cup\mathcal{P}\cup\mathcal{L} in Eq. 4. Since 𝒯\mathcal{T} is a continuous group and 𝒫\mathcal{P}\cup\mathcal{L} is discrete, Eq. 4 can be rewritten as

dΛ/𝒮(𝐱,𝐲)\displaystyle d_{\Lambda/\mathcal{S}}(\mathbf{x},\mathbf{y}) =minS𝒫dΛ/𝒯[𝐱,S(𝐲)]\displaystyle=\min_{S\in\mathcal{P}\cup\mathcal{L}}d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})]
dΛ/𝒯[𝐱,S(𝐲)]\displaystyle d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] =inf𝒕𝒯dΛ[𝐱,S(𝐲)+𝐭]\displaystyle=\inf_{\bm{t}\in\mathcal{T}}d_{\Lambda}[\mathbf{x},S(\mathbf{y})+\mathbf{t}]
dΛ[𝐱,S(𝐲)+𝐭]\displaystyle d_{\Lambda}[\mathbf{x},S(\mathbf{y})+\mathbf{t}] =i=1n𝒙iS(𝒚i)𝒕.\displaystyle=\sum_{i=1}^{n}\lVert\bm{x}_{i}-S(\bm{y}_{i})-\bm{t}\rVert.

Evaluating dΛ/𝒯[𝐱,S(𝐲)]d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] involves solving a global optimization problem over rigid translations for a fixed discrete symmetry operation SS. Since evaluating dΛ/𝒮(𝐱,𝐲)d_{\Lambda/\mathcal{S}}(\mathbf{x},\mathbf{y}) involves solving n!×O()n!\times O(\mathcal{L}) of these global optimization problems, the computational cost could be significantly reduced if a suitable approximation for dΛ/𝒯[𝐱,S(𝐲)]d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] could be found to quickly reject some of the SS.

Such an approximation is constructed by, for a fixed number of iterations mm, randomly sampling a random translation 𝐭\mathbf{t}, calculating the distance dΛ[𝐱,S(𝐲)+𝐭]d_{\Lambda}[\mathbf{x},S(\mathbf{y})+\mathbf{t}] for that translation, and accepting that translation only if it reduces dΛ[𝐱,S(𝐲)+𝐭]d_{\Lambda}[\mathbf{x},S(\mathbf{y})+\mathbf{t}] Wells et al. (2004). The resulting approximations for dΛ/𝒯[𝐱,S(𝐲)]d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] are sorted by increasing magnitude, and the full optimization problem is solved only for the first MM symmetry operations in the sorted list. The rationale for this procedure is that calculating the approximations using a moderate mm is less expensive than solving the global optimization problem.

Refer to caption
Figure 10: The sorted true distances dΛ/𝒯d_{\Lambda/\mathcal{T}} calculated for a fixed generic configuration and the 7!×127!\times 12 symmetric versions of a second generic configuration for 77 hard disks.

For two generic configurations of 77 hard disks there are 7!×12=60 4807!\times 12=60\,480 symmetric versions of the second configuration due to the actions of 𝒫\mathcal{P} and \mathcal{L}. Figure 10 shows the sorted true distances dΛ/𝒯d_{\Lambda/\mathcal{T}}, and dΛ/𝒮d_{\Lambda/\mathcal{S}} is the minimum of this set. Figure 11 shows the same configurations sorted by the approximate values of dΛ/𝒯[𝐱,S(𝐲)]d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] for m=1m=1, 1010, 5050, and 100100 plotted with the true values of dΛ/𝒯[𝐱,S(𝐲)]d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] on the vertical axis. Observe that with increasing mm these curves should converge to the one in Fig. 10, and that even for small mm the true minimum distance should appear within the first M60 480M\ll 60\,480 symmetry operations. Numerical experiments suggest that m=50m=50 and M=200M=200 are generally sufficient for an approximation with a relative error on the order of 10410^{-4}.

Refer to caption
Figure 11: The same configurations as in Fig. 10 are sorted by the approximate values of dΛ/𝒯[𝐱,S(𝐲)]d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] for m=1m=1, 1010, 5050, and 100100 and are plotted with the true values of dΛ/𝒯[𝐱,S(𝐲)]d_{\Lambda/\mathcal{T}}[\mathbf{x},S(\mathbf{y})] on the vertical axis.

Appendix B Eigenspectrum analysis

The diffusion coordinates in Eq. 5 involve the ttth powers of the eigenvalues of 𝐏\mathbf{P} sorted by decreasing magnitude. This means that for sufficiently large tt, the computational expense of calculating the diffusion distance can be considerably reduced with negligible loss of accuracy by truncating the eigenspectrum and only considering the first few diffusion coordinates Talmon et al. (2013). Figure 12 shows the ttth powers of the eigenvalues of 𝐏\mathbf{P} for Λ(4)/{𝒯𝒫}\Lambda(4)/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} at times t=1t=1, 55, and 1010. Observe that even for the relatively small t=5t=5 most of the diffusion coordinates will be numerically indistinguishable from zero. Only the first 200200 diffusion coordinates are used here unless otherwise specified, changing the diffusion diameter to a relative precision of less than 10810^{-8} and making this source of error negligible compared to other sources.

Refer to caption
Figure 12: The ttth powers of the eigenvalues of 𝐏\mathbf{P} for Λ(4)/{𝒯𝒫}\Lambda(4)/\{\mathcal{T}\cup\mathcal{P}\cup\mathcal{L}\} at times t=1t=1, 55, and 1010. The values generally decay rapidly, and are often numerically indistinguishable from zero for sufficiently large tt.

Appendix C Mixing time of an example system

Figure 13 shows the ϵ\epsilon-mixing times tϵ(𝒒0)t_{\epsilon}(\bm{q}_{0}) as a function of 𝒒0\bm{q}_{0} for the example system in Fig. 1. The kk-nearest neighbor graph is constructed with k=7k=7, the length of an edge is given by the Euclidean distance between the corresponding vertices, and ϵ=0.001\epsilon=0.001 is used since the steady-state probability mass at each vertex is 0.00050.0005 (the inverse of the number of vertices). Points closer to the center have smaller ϵ\epsilon-mixing times since the initial Dirac-delta distribution can spread to the upper and lower regions more easily as compared to distributions starting near the corners. The average mixing time tϵ\langle t\rangle_{\epsilon} for this example is around 15 00015\ 000.

Refer to caption
Figure 13: Individual mixing times tϵ(𝒒0)t_{\epsilon}(\bm{q}_{0}) of the example system shown in Fig. 1. The average mixing time tϵ\langle t\rangle_{\epsilon} is around 15 00015\ 000.

References

  • Landau (1936) L. Landau, Nature 138, 840 (1936).
  • Landau and Lifshitz (2013) L. D. Landau and E. M. Lifshitz, Statistical Physics: Volume 5, vol. 5 (Elsevier, 2013).
  • Boltzmann (1871) L. E. Boltzmann, Wiener Berichte 63, 679 (1871).
  • Ehrenfest and Ehrenfest (1990) P. Ehrenfest and T. Ehrenfest, The conceptual foundations of the statistical approach in mechanics (Courier Corporation, 1990).
  • Birkhoff (1931) G. D. Birkhoff, Proceedings of the National Academy of Sciences 17, 656 (1931).
  • Neumann (1932) J. v. Neumann, Proceedings of the National Academy of Sciences 18, 70 (1932).
  • Caiani et al. (1997) L. Caiani, L. Casetti, C. Clementi, and M. Pettini, Physical review letters 79, 4361 (1997).
  • Franzosi and Pettini (2004) R. Franzosi and M. Pettini, Physical review letters 92, 060601 (2004).
  • Franzosi et al. (2007) R. Franzosi, M. Pettini, and L. Spinelli, Nuclear Physics B 782, 189 (2007).
  • Franzosi and Pettini (2007) R. Franzosi and M. Pettini, Nuclear Physics B 782, 219 (2007).
  • Kastner and Mehta (2011) M. Kastner and D. Mehta, Physical review letters 107, 160602 (2011).
  • Gori et al. (2018) M. Gori, R. Franzosi, and M. Pettini, Journal of Statistical Mechanics: Theory and Experiment 2018, 093204 (2018).
  • Bel-Hadj-Aissa et al. (2021) G. Bel-Hadj-Aissa, M. Gori, R. Franzosi, and M. Pettini, Journal of Statistical Mechanics: Theory and Experiment 2021, 023206 (2021).
  • Coifman and Lafon (2006) R. R. Coifman and S. Lafon, Applied and computational harmonic analysis 21, 5 (2006).
  • Talmon et al. (2013) R. Talmon, I. Cohen, S. Gannot, and R. R. Coifman, IEEE signal processing magazine 30, 75 (2013).
  • Kullback and Leibler (1951) S. Kullback and R. A. Leibler, The annals of mathematical statistics 22, 79 (1951).
  • Dyre (2016) J. C. Dyre, Journal of Physics: Condensed Matter 28, 323001 (2016).
  • Alder and Wainwright (1962) B. Alder and T. Wainwright, Physical Review 127, 359 (1962).
  • Weber et al. (1995) H. Weber, D. Marx, and K. Binder, Physical Review B 51, 14636 (1995).
  • Mitus et al. (1997) A. Mitus, H. Weber, and D. Marx, Physical Review E 55, 6855 (1997).
  • Halperin and Nelson (1978) B. Halperin and D. R. Nelson, Physical Review Letters 41, 121 (1978).
  • Young (1979) A. Young, Physical Review B 19, 1855 (1979).
  • Bernard and Krauth (2011) E. P. Bernard and W. Krauth, Physical review letters 107, 155704 (2011).
  • Engel et al. (2013) M. Engel, J. A. Anderson, S. C. Glotzer, M. Isobe, E. P. Bernard, and W. Krauth, Physical Review E 87, 042134 (2013).
  • Isobe and Krauth (2015) M. Isobe and W. Krauth, The Journal of chemical physics 143, 084509 (2015).
  • Pieprzyk et al. (2019) S. Pieprzyk, M. N. Bannerman, A. C. Brańka, M. Chudak, and D. M. Heyes, Physical Chemistry Chemical Physics 21, 6886 (2019).
  • Carlsson et al. (2012) G. Carlsson, J. Gorham, M. Kahle, and J. Mason, Physical Review E 85, 011303 (2012).
  • Morse (1934) M. Morse, The calculus of variations in the large, vol. 18 (American Mathematical Soc., 1934).
  • Milnor (2016) J. Milnor, Morse theory.(AM-51), vol. 51 (Princeton university press, 2016).
  • Baryshnikov et al. (2014) Y. Baryshnikov, P. Bubenik, and M. Kahle, International Mathematics Research Notices 2014, 2577 (2014).
  • Ritchey (2019) K. Ritchey, Ph.D. thesis, Ohio State University, Electronic Thesis or Dissertation (2019).
  • Ericok and Mason (2021) O. B. Ericok and J. K. Mason, arXiv preprint arXiv:2101.00780 (2021).
  • Eriçok et al. (2021) O. B. Eriçok, K. Ganesan, and J. K. Mason, Phys. Rev. E 104, 055304 (2021).
  • Burago et al. (2001) D. Burago, I. D. Burago, Y. Burago, S. Ivanov, and S. A. Ivanov, A course in metric geometry, vol. 33 (American Mathematical Soc., 2001).
  • Glover (1989) F. Glover, ORSA Journal on computing 1, 190–206 (1989).
  • Chelouah and Siarry (2000) R. Chelouah and P. Siarry, European journal of operational research 123, 256–270 (2000).
  • Sheehy (2013) D. R. Sheehy, Discrete & Computational Geometry 49, 778 (2013).
  • Buchet et al. (2016) M. Buchet, F. Chazal, S. Y. Oudot, and D. R. Sheehy, Computational Geometry 58, 70 (2016).
  • Kahn and Harris (1951) H. Kahn and T. E. Harris, National Bureau of Standards applied mathematics series 12, 27 (1951).
  • Liouville (1838) J. Liouville, Journal de Mathématiques Pures et Appliquées 3, 342 (1838).
  • Gibbs (1902) J. W. Gibbs, Elementary principles in statistical mechanics: Developed with especial reference to the rational foundations of thermodynamics (C. Scribner’s sons, 1902).
  • Chung and Graham (1997) F. R. Chung and F. C. Graham, Spectral graph theory, 92 (American Mathematical Soc., 1997).
  • Mulero et al. (2008) A. Mulero, C. Galán, M. Parra, and F. Cuadros, in Theory and Simulation of Hard-Sphere Fluids and Related Systems (Springer, 2008), pp. 37–109.
  • Tian et al. (2020) J. Tian, H. Jiang, and A. Mulero, Molecular Physics 118, e1687948 (2020).
  • Aronov et al. (2013) B. Aronov, M. Dulieu, and F. Hurtado, Computational Geometry 46, 894 (2013), ISSN 0925-7721, euroCG 2009.
  • Mori and Salvetti (2011) F. Mori and M. Salvetti, Mathematical Research Letters 18, 39 (2011).
  • Wells et al. (2004) M. T. Wells, G. Casella, and C. P. Robert, in A Festschrift for Herman Rubin (Institute of Mathematical Statistics, 2004), pp. 342–347.