Energy Landscape and Metastability of Curie–Weiss–Potts Model
Abstract.
In this paper, we thoroughly analyze the energy landscape of the Curie–Weiss–Potts model, which is a ferromagnetic spin system consisting of spins defined on complete graphs. In particular, for the Curie–Weiss–Potts model with q ≥ 3 spins and zero external field, we completely characterize all critical temperatures and phase transitions in view of the global structure of the energy landscape. We observe that there are three critical temperatures and four different regimes for q ≤ 4, whereas there are four critical temperatures and five different regimes for q ≥ 5. Our analysis extends the investigations performed in [M. Costeniuc, R. S. Ellis, H. Touchette: J. Math. Phys (2005)], which provides the precise characterization of the second critical temperature for all q ≥ 3, and in [Landim and Seo: J. Stat. Phys. (2016)], which provides a complete analysis of the energy landscape for q = 3. Based on our precise analysis of the energy landscape, we also perform a quantitative investigation of the metastable behavior of the heat-bath Glauber dynamics associated with the Curie–Weiss–Potts model.
1. Introduction
The Potts model is a well-known mathematical model suitable for studying ferromagnetic spin systems consisting of spins. We refer to [36] for a comprehensive review of the Potts model. In the present work, we focus on the Potts model defined on large complete graphs without an external field, in order to understand the associated energy landscape as well as the metastable behavior of the heat-bath Glauber dynamics at a highly precise level. This special case of the Potts model defined on complete graphs is called the Curie–Weiss–Potts model and has been investigated in various studies; e.g., [5, 13, 14, 15, 16, 21, 24, 35, 6] and references therein. The rigorous mathematical definition of the Curie–Weiss–Potts model is presented in the next section.
The Curie–Weiss model
The Ising case of the Curie–Weiss–Potts model, i.e., the corresponding spin system consisting of only two spins, is the famous Curie–Weiss model. It is well known that the Curie–Weiss model without an external field exhibits a phase transition at the critical (inverse) temperature . This is mainly because the number of global minima of the potential function associated with the empirical magnetization is one for the high temperature regime while it becomes two for the low temperature regime , where represents the inverse temperature (cf. [34, Chapter 9] for more detail). It is also well known that such a phase transition in the structure of the energy landscape is closely related to the mixing property of the associated heat-bath Glauber dynamics. In [25], it has been shown that the Glauber dynamics exhibits the so-called cut-off phenomenon, which is a signature of fast mixing, in the high-temperature regime (i.e., ) and metastability in the low-temperature regime (i.e., ). The metastability in the low-temperature regime has been investigated in more depth in [12].
The Curie–Weiss–Potts model with q = 3
The picture for the Curie–Weiss model explained above has been fully extended to the Curie–Weiss–Potts model consisting of q = 3 spins. The complete description of the energy landscape has been obtained recently in [21, 24], where three critical temperatures
are characterized. More precisely, it has been shown that the potential function associated with the empirical magnetization (which will be explained in detail in section 2.3) has
• the unique global minimum for ,
• one global minimum and three local minima for ,
• three global minima and one local minimum for , and
• three global minima for .
The articles [21, 24] also analyzed the associated saddle structure. Based on this analysis, [24] discussed the quantitative features of the metastable behavior of the heat-bath Glauber dynamics in view of the Eyring–Kramers formula and Markov chain model reduction (cf. [2, 3, 22]) throughout the low-temperature regime . Because of the abrupt change in the structure of the potential function at and , the metastable behaviors of the Glauber dynamics in the three low-temperature regimes , , and turn out to be both quantitatively and qualitatively different. For the high-temperature regime , the cut-off phenomenon has been verified in [14] for all . Taken together, these works complete the picture for the Curie–Weiss–Potts model with q = 3 spins.
The Curie–Weiss–Potts model with q ≥ 4
Compared to the Curie–Weiss–Potts model with q = 2 or q = 3 spins, the analysis of the case with q ≥ 4 spins has not been completed so far. In the literature, two critical temperatures for the Curie–Weiss–Potts model with spins have been observed, and the phase transitions near these critical temperatures have been analyzed. For instance, in [14], the phase transition from fast mixing (the cut-off phenomenon) to slow mixing (due to the appearance of new local minima) at has been confirmed. In [16], it has been observed that the limiting distribution of the empirical magnetization exhibits an abrupt change at . In [13], the phase transition around has also been studied in view of the equivalence and non-equivalence of ensembles.
These studies focus on the phase transitions involved with the local and the global minima of the potential function. However, in order to investigate the metastable behavior, whose main objective is to analyze the transitions between neighborhoods of local minima (i.e., the metastable states), a precise understanding of the saddle structure is also required. To the best of our knowledge, neither the saddle structure nor the metastable behavior of the heat-bath Glauber dynamics for q ≥ 4 has been analyzed yet.
Main contribution of the article
The main result of the present work is to provide the complete description of the energy landscape, including the saddle structure, and to analyze the dynamical features of the Glauber dynamics based on it for the Curie–Weiss–Potts model with q ≥ 4 spins.
First, we observe that for q = 4, as in the case of q = 3, the potential function has three critical temperatures
and moreover the associated metastable behavior is quite similar to that of the case q = 3. On the other hand, for q ≥ 5, we will deduce that there are four critical temperatures
where two critical temperatures and play essentially the same role as and (and hence and ), respectively. Surprisingly, our work reveals that the role of the third critical temperature for q ≤ 4 is divided into the third and fourth critical temperatures and for q ≥ 5. More precisely, for q ≤ 4, the change in the saddle gates between global minima and the disappearance of the local minimum representing the chaotic configuration happen simultaneously at ; however, for q ≥ 5, the change of saddle gates happens at and the disappearance of the chaotic local minimum occurs at . Hence, for q ≥ 5, we observe another type of metastable behavior at compared to the case q ≤ 4.
Other studies on the Potts model
Although the present work focuses on the Potts model on complete graphs, we also note that the Ising and Potts models on the lattice have been widely studied as well. For instance, we refer to [34] and the references therein for the phase transition, to [26, 27, 28] for the cut-off phenomenon in the high-temperature regime, and to [1, 4, 7, 8, 9, 10, 20, 29, 30, 31, 32] for the metastability in the low-temperature regime. In addition, we refer to [17, 19] for the Potts model with many spins or in large dimensions and to [11, 18] for the study of the metastability of the Ising model on random graphs.
2. Model
In this section, we introduce the formal definition of the Curie–Weiss–Potts model, which will be analyzed in the present work. Fix an integer and let be the set of spins.
2.1. Curie–Weiss–Potts Model
For a positive integer , let us denote by the set of sites (we write to emphasize that our model is on the complete graph). Let be the configuration space of spins on . Each configuration is represented as where denotes a spin at site . Let be the external magnetic field. The Hamiltonian associated to the Curie–Weiss–Potts model with the external field is given by
where denotes the usual indicator function. Then, the Gibbs measure associated to the Hamiltonian at the (inverse) temperature is given by
where is the partition function. The measure denotes the Curie–Weiss–Potts measure on at the inverse temperature .
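As a small illustration of these definitions, the following sketch enumerates the Gibbs measure by brute force for a tiny system. It assumes the standard mean-field form of the Hamiltonian, namely minus 1/(2N) times the number of ordered pairs of sites carrying equal spins, minus the external field summed over sites. This normalization, the 0-based spin labels, and the names `hamiltonian` and `gibbs_measure` are assumptions made for the example and are not taken from the text.

```python
import math
from collections import Counter
from itertools import product


def hamiltonian(sigma, q, h=None):
    """Assumed form: -(1/2N) * #{(u, v): sigma_u == sigma_v} - sum_u h[sigma_u]."""
    n = len(sigma)
    field = [0.0] * q if h is None else h
    counts = Counter(sigma)
    # Number of ordered pairs (u, v) with equal spins, including u == v.
    aligned_pairs = sum(c * c for c in counts.values())
    return -aligned_pairs / (2.0 * n) - sum(field[s] for s in sigma)


def gibbs_measure(n, q, beta, h=None):
    """Return {configuration: probability} by enumerating all q**n configurations."""
    configs = list(product(range(q), repeat=n))
    weights = [math.exp(-beta * hamiltonian(s, q, h)) for s in configs]
    partition_function = sum(weights)
    return {s: w / partition_function for s, w in zip(configs, weights)}
```

For instance, `gibbs_measure(4, 3, 2.0)` assigns more mass to a fully aligned configuration than to a mixed one, which is the ferromagnetic behavior the model is meant to capture. The enumeration is only feasible for very small numbers of sites.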
2.2. Heat-bath Glauber Dynamics
Now, we define a heat-bath Glauber dynamics associated with the Curie–Weiss–Potts measure . For , , and , denote by the configuration whose spin at site is flipped to , i.e.,
Then, we will consider a heat-bath Glauber dynamics associated with generator which acts on as
where
It can be observed that this dynamics is reversible with respect to the Curie–Weiss–Potts measure . Henceforth, denote by the continuous time Markov process associated with the generator .
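The reversibility claim can be checked numerically on a toy system. The sketch below assumes one common heat-bath rule (each site refreshes at rate 1 and draws its new spin from the Gibbs conditional distribution given the other spins) together with the zero-field mean-field Hamiltonian; the exact rates used in this article are not reproduced here, so both choices, as well as all function names, are assumptions of the example.

```python
import math
from itertools import product


def hamiltonian(sigma):
    """Assumed zero-field form: -(1/2N) * #{(u, v): sigma_u == sigma_v}."""
    n = len(sigma)
    return -sum(1 for a in sigma for b in sigma if a == b) / (2.0 * n)


def flip(sigma, u, a):
    """Configuration obtained from sigma by setting the spin at site u to a."""
    return sigma[:u] + (a,) + sigma[u + 1:]


def heat_bath_rate(sigma, u, a, q, beta):
    """Assumed rate of sigma -> flip(sigma, u, a): Gibbs conditional probability of a at u."""
    weights = [math.exp(-beta * hamiltonian(flip(sigma, u, b))) for b in range(q)]
    return weights[a] / sum(weights)


def max_detailed_balance_gap(n, q, beta):
    """Largest violation of w(sigma) c(sigma, tau) = w(tau) c(tau, sigma) over all single-site moves."""
    gap = 0.0
    for sigma in product(range(q), repeat=n):
        w_sigma = math.exp(-beta * hamiltonian(sigma))
        for u in range(n):
            for a in range(q):
                tau = flip(sigma, u, a)
                w_tau = math.exp(-beta * hamiltonian(tau))
                forward = w_sigma * heat_bath_rate(sigma, u, a, q, beta)
                backward = w_tau * heat_bath_rate(tau, u, sigma[u], q, beta)
                gap = max(gap, abs(forward - backward))
    return gap
```

Under these assumptions the gap vanishes up to floating-point error, because the normalizing sum at site u is the same for both configurations of a pair that differ only at u.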
2.3. Empirical Magnetization
For each spin , denote by the proportion of spin of configuration , i.e.,
and define the proportional vector as
which represents the empirical magnetization of the configuration containing the macroscopic information of .
Define as
(2.1)
and then define a discretization of as
With this notation, we immediately have for .
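The map from configurations to proportion vectors, and the discretized simplex in which it takes values, can be sketched as follows. The example keeps all coordinates of the proportion vector and labels spins from 0; these conventions and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def proportion_vector(sigma, q):
    """Fraction of sites carrying each spin value 0, ..., q-1."""
    n = len(sigma)
    return tuple(Fraction(sum(1 for s in sigma if s == a), n) for a in range(q))


def count_vectors(n, q):
    """All q-tuples of nonnegative integers summing to n."""
    if q == 1:
        return [(n,)]
    return [(k,) + rest for k in range(n + 1) for rest in count_vectors(n - k, q - 1)]


def simplex_grid(n, q):
    """Points of the simplex whose coordinates are multiples of 1/n."""
    return [tuple(Fraction(c, n) for c in counts) for counts in count_vectors(n, q)]
```

For example, `proportion_vector((0, 0, 1, 2), 3)` returns the fractions 1/2, 1/4, 1/4, and this point belongs to `simplex_grid(4, 3)`. Exact fractions are used so that membership in the grid can be tested without rounding issues.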
For the Markov process , we write which is a stochastic process on expressing the evolution of the empirical magnetization. Since the model is defined on the complete graph , we obtain the following proposition.
Proposition 2.1.
The process is a continuous time Markov chain on whose invariant measure is given by
where denotes the set . Furthermore, is reversible with respect to .
The proof of this proposition including jump rates is given in Section 5.1. Let be the law of the Markov chain starting at and let be the corresponding expectation.
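The form of the invariant measure can be made plausible by a brute-force check: if the Hamiltonian depends on a configuration only through its spin counts, then pushing the Gibbs measure forward to the counts yields a multinomial coefficient times the Boltzmann weight. The sketch below verifies this for a small system, assuming the zero-field mean-field energy -(1/2N) times the sum of squared counts; that normalization and the function names are assumptions of the example.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def energy_of_counts(counts, n):
    """Assumed zero-field Hamiltonian expressed through the spin counts."""
    return -sum(c * c for c in counts) / (2.0 * n)


def compositions(n, q):
    """Yield all q-tuples of nonnegative integers summing to n."""
    if q == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, q - 1):
            yield (first,) + rest


def pushforward_by_enumeration(n, q, beta):
    """Law of the spin counts under the Gibbs measure, summing over all configurations."""
    weights = defaultdict(float)
    for sigma in product(range(q), repeat=n):
        tally = Counter(sigma)
        counts = tuple(tally.get(a, 0) for a in range(q))
        weights[counts] += math.exp(-beta * energy_of_counts(counts, n))
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def multinomial_formula(n, q, beta):
    """Same law computed as multinomial coefficient times Boltzmann weight."""
    weights = {}
    for counts in compositions(n, q):
        coeff = math.factorial(n)
        for c in counts:
            coeff //= math.factorial(c)
        weights[counts] = coeff * math.exp(-beta * energy_of_counts(counts, n))
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}
```

The two dictionaries agree up to floating-point error, which mirrors the counting argument behind the proposition: the multinomial coefficient is the number of configurations sharing a given count vector.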
More on the measure
For , let where . Then, the Hamiltonian can be written as
where
(2.2)
Therefore, by Proposition 2.1, the invariant measure of the process on can be written as
(2.3)
where, by Stirling’s formula, we can write
where
(2.4)
In this equation, is the energy functional defined in (2.2) and is the entropy functional defined by
and converges to uniformly on every compact subset of .
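The role of Stirling's formula here can be illustrated numerically: the logarithm of a multinomial coefficient, divided by the number of sites, approaches minus the sum of x log x over the coordinates of the proportion vector. The sketch below measures that gap. The sign convention for the entropy (the sum of x log x, which is minimized at the uniform point) is an assumption chosen to match the discussion of the potential, and the function names are hypothetical.

```python
import math


def log_multinomial(counts):
    """Logarithm of n! / (c_1! ... c_q!) computed with log-gamma for stability."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)


def entropy(x):
    """Assumed convention: S(x) = sum_i x_i log x_i, with 0 log 0 read as 0."""
    return sum(xi * math.log(xi) for xi in x if xi > 0.0)


def stirling_gap(n, x):
    """Distance between -(1/n) log multinomial(n; n x) and S(x)."""
    counts = [round(n * xi) for xi in x]
    if sum(counts) != n:
        raise ValueError("n * x must have integer coordinates summing to n")
    return abs(-log_multinomial(counts) / n - entropy(x))
```

With the proportions 1/2, 1/3, 1/6, the gap shrinks as the number of sites grows, at a rate of order log(n)/n.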
Main objectives of the article
Now, we can express the main purpose of the current article in a more concrete manner. In this article, we consider the Curie–Weiss–Potts model when there is no external magnetic field; i.e., . Therefore, from now on, we assume . Under this assumption, the first objective is to analyze the function expressing the energy landscape of the empirical magnetization of the Curie–Weiss–Potts model. This result will be explained in Section 3. The second objective is to investigate the metastable behavior of the process in the low-temperature regime. This will be explained in Section 4. The latter part of the article is devoted to the proofs of these results.
3. Main Result for Energy Landscape
In view of Proposition 2.1, (2.3), and (2.4), the structure of the invariant measure of the process is essentially captured by the potential function ; hence, the investigation of is crucial in the analysis of the energy landscape and the metastable behavior of . In this section, we explain our detailed analysis of the function .
Note that the function expresses the competition between the energy and the entropy represented by and , respectively. Since there is a factor in front of the entropy functional, we can expect that the entropy dominates the competition when is small (i.e., the temperature is high). Since the entropy is uniquely minimized at the equally distributed configuration , we can expect that the potential also has a unique minimum when is small. On the other hand, if is large enough (i.e., the temperature is low), the energy with minima dominates the system, and therefore, we can expect that the potential also has global minima. In this section, we provide the complete characterization of the complicated pattern of transition from this high-temperature regime to the low-temperature regime at a precise level.
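The competition described above can be seen in a few lines of code. The sketch assumes the potential is the energy -(1/2) times the sum of squared coordinates plus the entropy (the sum of x log x) divided by the inverse temperature; this explicit form is an assumption consistent with the description here, not a formula copied from the text. The simplex vertices are used only as crude stand-ins for the aligned states, since the true low-temperature minima lie in the interior.

```python
import math


def energy(x):
    """Assumed energy functional: -(1/2) * sum_i x_i^2."""
    return -0.5 * sum(xi * xi for xi in x)


def entropy(x):
    """Assumed entropy functional: sum_i x_i log x_i, with 0 log 0 read as 0."""
    return sum(xi * math.log(xi) for xi in x if xi > 0.0)


def potential(x, beta):
    """Assumed potential: energy plus entropy divided by the inverse temperature."""
    return energy(x) + entropy(x) / beta


def uniform_point(q):
    return [1.0 / q] * q


def vertex(q, i=0):
    return [1.0 if j == i else 0.0 for j in range(q)]


def preferred_state(q, beta):
    """Compare the uniform point with a fully aligned vertex."""
    if potential(uniform_point(q), beta) < potential(vertex(q), beta):
        return "uniform"
    return "aligned"
```

With three spins, `preferred_state(3, 1.0)` returns `"uniform"` while `preferred_state(3, 10.0)` returns `"aligned"`, in line with the heuristic that entropy wins at high temperature and energy wins at low temperature.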
In Section 3.1, we define several points that will be shown to be critical points. In Section 3.2, we introduce several critical values of (inverse) temperature . In Section 3.3, we summarize the results on the energy landscape . In Section 3.4, as a by-product of these results, we compute the mean-field free energy.
3.1. Critical Points of
Let us first investigate critical points of . We recall that
Notation 3.1.
We use the following notation for convenience.
(1) Since there is no risk of confusion, we will write the point as where .
(2) Let be the orthonormal basis of and .

Now, we explain the candidates for the critical points of that play an important role in the analysis of the energy landscape. The first candidate is
which represents the state where the spins are equally distributed.
In order to introduce the other candidates, we fix and let . Define as
(3.1)
where we set so that becomes a continuous function on . We refer to Figure 3.1 for an illustration of the graph of . It will be verified in Lemma 6.1 of Section 6.1 (as can also be expected from the graph illustrated in Figure 3.1) that has at most two solutions. We denote by these solutions, provided that they exist. If there is only one solution, we let be this solution.
For , let
(3.2)
(3.3)
where and are located at the -th component of and , respectively. For (henceforth, implies that , , and ), let
(3.4)
where is located at the -th and -th components. Of course, each of these points is well defined only when , , or exists, respectively. Then, let
We remark that these sets depend on although we omit in the expressions for the simplicity of the notation.
Since we assumed that , by symmetry, we can expect that the elements in have the same properties; for instance, for all , we have , and is a critical point of if and only if is. Of course, the elements in or , respectively, have the same properties. Thus, it suffices to analyze their representatives, and hence we select these representatives as
(3.5)
Now, we have the following preliminary classification of critical points. We remark that a saddle point is a critical point at which the Hessian has only one negative eigenvalue.
Proposition 3.2.
The following hold.
(1) If is a local minimum of , then .
(2) If is a saddle point of , then for and for .
Remark 3.3.
The set is not defined for since the set is defined only when . This will be explained in Section 6.1.
The proof of this proposition is an immediate consequence of Proposition 6.3 in Section 6.1. The above proposition permits us to focus only on when we analyze the energy landscape in view of the metastable behavior, since the critical points of index greater than 1 cannot play any role, as the metastable transition always happens in the neighborhood of a saddle point (a critical point of index 1).
3.2. Critical Temperatures
In this subsection, we introduce critical temperatures
at which the phase transitions in the energy landscape occur. The precise definitions of these critical temperatures are given in (6.9) of Section 6.2. Henceforth, we write , , since there is no risk of confusion.


To describe the role of these critical temperatures, we regard as increasing from to . Figure 3.2 shows the role of , and according to the inverse temperature; the picture it summarizes is proved in Section 6.
At , the dynamics exhibits a phase transition from fast mixing to slow mixing, as proven in [14]. Furthermore, the behavior of the dynamics changes from the cutoff phenomenon to metastability. This phase transition is due to the appearance of new local minima of other than at . At , the ground states of the dynamics change from to the elements of , as observed in [13, Theorem 3.1(b)]. To explain the role of the critical temperatures and , we have to divide the explanation into several cases. Let us first assume that so that . At , the saddle gates among the ground states in change from to (since the heights and are reversed at this point) and at , the local minimum becomes a local maximum. On the other hand, for , we have . At , the change of the saddle gates and the disappearance of the local minimum occur simultaneously. We refer to [24] for the detailed description when .
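Two of these transitions can be probed numerically under an explicit normalization. The sketch below assumes the potential is -(1/2) times the sum of squared coordinates plus the sum of x log x divided by the inverse temperature. Under that assumption, the curvature of the potential at the uniform point along directions tangent to the simplex is -1 + q/beta, so the uniform point stops being a local minimum at beta = q; and the value 2(q-1)/(q-2) log(q-1), commonly quoted in the mean-field Potts literature, is the inverse temperature at which the uniform point and a point with one dominant spin have equal height. Neither closed form is taken from the text, and if the normalization intended here differs, the values rescale accordingly.

```python
import math


def potential(x, beta):
    """Assumed potential: -(1/2) sum x_i^2 + (1/beta) sum x_i log x_i."""
    energy = -0.5 * sum(xi * xi for xi in x)
    entropy = sum(xi * math.log(xi) for xi in x if xi > 0.0)
    return energy + entropy / beta


def uniform_point(q):
    return [1.0 / q] * q


def tangent_curvature_at_uniform(q, beta):
    """Hessian eigenvalue of the assumed potential at the uniform point on sum-zero directions."""
    return -1.0 + q / beta


def equal_height_beta(q):
    """Inverse temperature at which the two candidate ground states tie (assumed normalization)."""
    return 2.0 * (q - 1) / (q - 2) * math.log(q - 1)


def one_dominant_point(q):
    """Point ((q-1)/q, 1/(q(q-1)), ..., 1/(q(q-1))) with a single dominant spin."""
    return [(q - 1) / q] + [1.0 / (q * (q - 1))] * (q - 1)


def stationarity_residual(q, beta):
    """Residual of the critical-point condition beta (a - b) = log(a / b) at the dominant point."""
    point = one_dominant_point(q)
    a, b = point[0], point[1]
    return abs(beta * (a - b) - math.log(a / b))
```

For every q from 3 to 8, the two heights coincide at `equal_height_beta(q)`, the stationarity residual vanishes there, and this value lies below q, so in this normalization the exchange of ground states precedes the loss of stability of the uniform point.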
3.3. Stable and Metastable Sets
We define some metastable sets based on the results explained earlier. If , define as (cf. (3.5))
(3.6)
When we set for all (cf. Remark 3.3). It will be verified in Lemma 6.7 and (6.9) that is the height of the lowest saddle points.
Let and . Let , , be the connected component of containing and let be the connected component of containing (we define the sets , , and to be empty if the set does not contain and does not contain , respectively). For , let be a set of saddle gates of height between and .






Now, we can state the main result on the energy landscape; the proofs of the theorems in this section will be presented in Section 9. The first result holds for all .
Theorem 3.4.
For , the following hold.
(1) If , there is no critical point other than , which is the global minimum.
(2) For , we have and for , we have .
(3) Let be a set of local minima of . Then, we have
(4) Let be a set of global minima of . Then, we have
Since there is only one minimum if , we now consider . Before we state the main result on metastable sets, we would like to emphasize that [24, Proposition 4.4] proved the case when , while the proof for the case is the main novel content of the current article. We first consider the case . See Figure 3.3 (these figures are excerpted from [24, Fig 4]) for the visualization of the following theorem and the theorem above.
Theorem 3.5.
For , the following hold.
(1) .
(2) For , the sets , , are nonempty and disjoint. For , and for , .
(3) For , we have . The sets , , are nonempty and disjoint. For , .
(4) For , we have . The sets , , are nonempty and disjoint. For ,






Next, we consider the case . Note that the crucial difference compared to the previous theorem lies in the third and fifth statements. See Figure 3.4 for the visualization of the following theorem and Theorem 3.4.
Theorem 3.6.
For , the following hold.
(1) .
(2) For , the sets , , are nonempty and disjoint. For , and for ,
(3) For , the sets , , are nonempty and disjoint. For , and for , .
(4) For , the sets , , are nonempty and disjoint. For , and for , .
(5) For , we have . Furthermore, the set has only two connected components, the well and the other containing . The saddle points between them are .
3.4. Mean-field Free Energy
In this subsection, we compute the mean-field free energy of the Curie–Weiss–Potts model defined by
(3.7)
It is well known that the Curie–Weiss model with q = 2 spins exhibits a second-order phase transition at the unique critical temperature , while the Curie–Weiss–Potts model with q ≥ 3 spins exhibits a first-order phase transition at (cf. [13, 16, 33]). We now reconfirm this folklore by computing the free energy explicitly. This computation is based on the following observation (cf. [16, display (2.4)]):
(3.8)
We give a rigorous proof in Appendix B.
Corollary 3.7.
We have that
(3.10)
In particular, the Curie–Weiss–Potts model with q ≥ 3 exhibits a first-order phase transition at .
4. Main Result for Metastability
In this section, we analyze the metastable behavior of based on the analysis of the energy landscape carried out in the previous section and the general results obtained in [23]. As the inverse temperature varies, the behavior of this dynamics changes both qualitatively and quantitatively owing to the structural phase transitions explained in the previous section.
Since the invariant measure is exponentially concentrated in neighborhoods of ground states, the corresponding Markov process stays most of the time at these neighborhoods. The abrupt transitions between such stable states are the metastable behavior of the process and one of the natural ways of describing these hopping dynamics among the neighborhoods of the ground states is the Markov chain model reduction. A comprehensive understanding of such approaches can be obtained from [2, 3, 22].
When the dynamics starts from a local minimum which is not a global minimum, we have to estimate the mean of the transition time to the global minimum in order to quantitatively understand the metastable behavior. This estimation is known as the Eyring–Kramers formula. In this section we provide the Markov chain model reduction and Eyring–Kramers formula for the metastable process .
Such metastable behavior is observed only when there are multiple local minima; hence, we cannot expect metastable behavior in the high-temperature regime , for which is the unique local (and global) minimum. Hence, we assume in this section.
4.1. Some preliminaries
In this subsection, we introduce several notions crucial to the description of the metastable behavior.
Some constants
Recall the definition of from Notation 3.1. Define matrices , , and as
where . The appearance of the weight is explained in Section 5.3. Since , , are positive definite, satisfies [23, display (A.1)] and hence, by [23, Lemma A.1], for all , the matrices and have the unique negative eigenvalue which will be denoted respectively by and .
Let us define the so-called Eyring–Kramers constants corresponding to our model as
(4.1)
(4.2)
By symmetry, we have for all and and for all . Hence, let us write and . Next, define
(4.3)
(4.4)
By the symmetry, we also obtain .
Time scales
The constant defined in (3.6) denotes the height of the lowest saddle points. Let , , be the depth of well , i.e.,
Then, and represent the time scales on which exhibits metastability. For , the constant is meaningless since .
Order process and Markov chain model reduction
Let be a small enough number such that . If , since is not defined, let . For , define
For , define as
This set is called the metastable set, provided that it is not an empty set. For , we write
Let be or . Denote by the projection map given by
Let us define the so-called order process by which represents the index of metastable set at which the process is staying.
Definition 4.1 (Markov chain model reduction).
Let be a continuous time Markov chain on . We say that the metastable behavior of the process is described by a Markov process in the time scale if, for all and for every sequence such that for all , the finite dimensional marginals of the process under converge to those of the Markov chain as .
In the previous definition, it is clear that the Markov chain describes the inter-valley dynamics of the process accelerated by a factor of .
4.2. Metastability Results for
We can now state the main result for the metastable behavior. First, we consider the case whose result is essentially the same as that in [24, Section 4.3] where only the case was considered.
We define limiting Markov chains when .
Definition.
Let and . Let be a Markov chain on with jump rate given by
(1) .
(2) .
(3) .
(4) .
(5) .
The following theorem is the metastability result for . We remark that the case when is proven in [24, Section 4.3].
Theorem 4.2.
Let . Then, the metastable behavior of the process is described by (cf. Definition 4.1)
(1) : the process in the time scale .
(2) : the process in the time scale .
(3) : the process in the time scale and by the process in the time scale .
(4) : the process in the time scale .
The proof follows from Theorems 3.4 and 3.5, Proposition 5.3, and [23, Theorem 2.2, Remark 2.10, 2.11].
Remark 4.3.
As mentioned in [24], we cannot investigate the case with the current method since is a degenerate saddle point.
Remark 4.4.
The qualitative features of the metastable behavior of are essentially the same for and . The only difference is that the saddle points between metastable sets are defined in different ways; however, when , the points in for and for play the same role since all the points belonging to these sets represent states in which most sites are aligned to two spins equally.
4.3. Metastability Results for
As in the previous subsection, we start by defining limiting Markov chains. Note that there are two different Markov chains.
Definition.
Let and . Let be a Markov chain on with jump rate given by
(1) .
(2) .
(3) .
(4) .
(5) .
(6) .
(7) .
Now, we present the metastability result for . The new metastable behaviors are observed when and .
Theorem 4.5.
Let . Then, the metastable behavior of the process is described by
(1) : the process in the time scale .
(2) : the process in the time scale .
(3) : the process in the time scale and by the process in the time scale .
(4) : the process in the time scale and by the process in the time scale .
(5) : the process in the time scale and by the process in the time scale .
(6) : the process in the time scale .
The proof follows from Theorems 3.4 and 3.6, Proposition 5.3, and [23, Theorem 2.2, Remarks 2.10, 2.11].
Remark 4.6.
Notably, in contrast to the case , we can describe the metastable behavior for all since the saddle points are nondegenerate when .






We can now provide a more intuitive explanation of Theorem 4.5. See Figure 4.1 also for the description of metastable behavior. Note that if , there are two time scales.
• : If , in the time scale , starting from , reaches and stays there forever. When it goes from , , to , it visits the neighborhood of .
• : If , in the time scale , the process goes around each well in . However, when , starting from , , goes to , , it must pass through ; as in the case , it visits the neighborhood of and then the neighborhood of .
• : If , in the time scale , the process starting from travels ; however, it stays in only for a negligible time. Furthermore, when goes from , , to , , it must visit .
• : If , in the time scale , the process starting from travels ; however, it stays in only for a negligible time. Furthermore, there are two ways in which goes from , , to , . First, it goes to directly and must pass through the neighborhood of . Second, it visits and stays there for a negligible period of time.
• : If , in the time scale , the process starting from travels without visiting . As in the case , it must pass through the neighborhood of , , when it goes from to .
• : If , in the second time scale , the process starting from goes to , , and stays there forever. As , , and , it passes through the neighborhood of when it moves to from .
• : If , in the second time scale , the process starting from goes to and stays there forever. This dynamics is similar to ; however, , , are not distinguishable.
4.4. Eyring–Kramers Formulae
In this subsection, we present the Eyring–Kramers formula with regard to metastable behavior. Before we state the result, we introduce some notations. Let be the nearest point in of . If there is more than one such point, one of them is chosen arbitrarily. For , define as
Denote by the hitting time of the set by the process :
If , we simply write .
We have the following theorem.
Theorem 4.7.
Let . We have the following.
(1) For and , we have
(2) For , we have
(3) For and , we have
5. Preliminary Analysis on Potential and Generator
In this section, we conduct several preliminary analyses. In Section 5.1, we prove Proposition 2.1. In particular, we compute the jump rates of Markov chain . In Section 5.2, we decompose the generator into several simple generators , , . Via this decomposition of , we can handle using the method developed in [23] since our model is a special case of [23, Remarks 2.10, 2.11]; this correspondence will be explained in Section 5.3.
5.1. Dynamics of Proportion Vector
We prove Proposition 2.1 in this section.
Proof of Proposition 2.1.
Let , (cf. Notation 3.1). Fix configurations such that and let
For some sites such that let . Then, the Markov property of the process can be inferred from the identity
for . Hence, is a Markov chain.
Since there are sites whose spins are , the jump rate of can be written as
(5.1)
Hence, the generator of is given as
for .
Finally, this dynamics is reversible with respect to since we have the following detailed balance condition
so that is the invariant measure. ∎
5.2. Cyclic Decomposition
For , let be the cycle of length 2 on and let . Define as
Define a jump rate associated with this cycle as
where
Let , be a generator acting on as
(5.2)
Here, can be regarded as a generator of a Markov chain on the cycle .
Let
By (2.3), we can write
By elementary computations, we obtain
so that
Hence, by (5.2), we can write
(5.3)
Since converges uniformly to and is uniformly Lipschitz on every compact subset of , our model is a special case of [23, Remarks 2.10, 2.11] provided that several technical requirements are verified. These requirements will be verified in the next subsection.
Remark 5.1.
[23, Section 2] assumes that for , generates ; however, this requirement is needed only to make irreducible. Since generate , we do not need this assumption.
5.3. Requirements for and
In this subsection, we verify that our model is a special case of [23, Remarks 2.10, 2.11].
First, we give some properties of and . By the following proposition, and fulfill the requirements in the first paragraph of [23, Section 2].
Proposition 5.2.
The functions and satisfy the following properties.
(1) is twice-differentiable and there are no critical points at . For all , .
(2) The second partial derivatives of are Lipschitz-continuous on every compact subset of .
(3) On each compact subset of , is uniformly Lipschitz and converges uniformly to as .
(4) There are finitely many critical points of .
Proof.
Next, fix one of the saddle points . Note that has a unique negative eigenvalue. As in [23, Section 4.3], define as
Denote by () the eigenvalues of , and by the corresponding eigenvectors. Let so that satisfies [23, displays (4.7), (4.8)]. Define a neighborhood of as
Then, by the next proposition, definitions (4.1)-(4.4) are consistent with [23, Remarks 2.10, 2.11].
Proposition 5.3.
For a smooth function , we have uniformly on ,
Proof.
6. Investigation of Critical Points and Temperatures
This section is devoted to the investigation of critical points and temperatures, including their definitions. We will provide a preliminary analysis of the critical points in Section 6.1 and of the critical temperatures in Section 6.2.
6.1. Classification of Critical Points
By elementary computation, we can check that the equation has at most two positive real solutions for fixed . Hence, if is a critical point (recall Notation 3.1), ’s can take at most two values by (6.1). Hereafter, we assume is a critical point and
where is the number of ’s and . Observe that by symmetry, each permutation of coordinates of has the same properties. Without loss of generality, we may assume
The point will be analyzed separately.


Lemma 6.1.
Fix , and . Then, the function has the unique minimum, say . Furthermore, if , has two solutions.
Proof.
Define as (as ; the function can be continuously extended to )
(6.3)
By elementary computation, we obtain
(6.4)
There are two cases, and . By elementary computation, we can show that the graphs of are as shown in Figure 6.1, which completes the proof. ∎
For , let
(6.5)
where is the unique minimum of given in the above lemma.
If , there are one or two solutions of which will be denoted by where . Let
for . We have the following candidates of the critical points of .
Lemma 6.2.
Every critical point of falls into exactly one of the following cases.
(1) .
(2) For and , elements of .
(3) For and , elements of .
(4) For and , elements of .
Proof.
Finally, we have the following results for critical points. The proof for is given in [24, Proposition 4.2].
Proposition 6.3.
The minima and saddle points of for , , and are given by Tables 1, 2, and 3, respectively.
Table 1.
the only minimum | |
the only minimum | degenerate | degenerate
local minimum | local minima | saddle points
degenerate | local minima | degenerate
local maximum | local minima | saddle points

Table 2.
the only minimum | | |
the only minimum | degenerate | degenerate |
local minimum | local minima | saddle points |
degenerate | local minima | degenerate | degenerate
local maximum | local minima | index | saddle points

Table 3.
the only minimum | | |
the only minimum | degenerate | degenerate |
local minimum | local minima | saddle points |
local minimum | local minima | saddle points | degenerate
local minimum | local minima | saddle points | saddle points
degenerate | local minima | degenerate | degenerate
local maximum | local minima | index | saddle points
Section 7 proves the above proposition. So far, we have classified all minima and saddle points for all .
6.2. Definition of Critical Temperatures
In the previous subsection, we defined several temperatures , . In this subsection, we prove several properties of such temperatures and moreover introduce new temperatures. Then, we select the critical temperatures at which phase transitions occur.
The first lemma is about the order of .
Lemma 6.4.
We have . If is even, we have and otherwise, .
Proof.
In this proof, we regard as a real number and claim that increases as increases for fixed . By elementary computation, we obtain
By the inequality , we can conclude that . Hence, if .
Hereafter, let . Suppose . Since , we obtain
by the above claim. The first inequality holds since is a minimum of . If , since , we obtain
If , we have so that . This, together with the above argument, proves the second assertion. ∎
Remark.
In particular, by the above lemma, we have for and for .
The relative order of the heights of the critical points changes with , and the phase transitions arise from this fact. We explain below when and how this order changes. Since the proofs are technical, they are postponed to Section 8.
Order of heights of and
Lemma 6.5.
For , we have and for , we have .
The proof of the lemma is given in Section 8.1. The following lemma is an important property of .
Lemma 6.6.
Let . Then, we have
(6.7) |
This result is the same as [13, Theorem 3.1(b)]. The proof is provided in [13, Appendices A, B] via convex duality.
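The equality case of this comparison can be checked numerically. The sketch below rests on two assumptions that are not recoverable from the text above: that the potential has the standard form $F_\beta(x) = -\frac{1}{2}\sum_i x_i^2 + \frac{1}{\beta}\sum_i x_i\log x_i$, and that the temperature in the lemma is the classical value $2\frac{q-1}{q-2}\log(q-1)$ from [13], at which the ordered critical point has order parameter $s = (q-2)/(q-1)$. The function names are hypothetical.

```python
import math


def potential(x, beta):
    # Assumed potential: F_beta(x) = -1/2 * sum x_i^2 + (1/beta) * sum x_i log x_i.
    quadratic = -0.5 * sum(v * v for v in x)
    entropy = sum(v * math.log(v) for v in x) / beta
    return quadratic + entropy


def classical_critical_beta(q):
    # Assumed value of the second critical temperature (requires q >= 3).
    return 2.0 * (q - 1) / (q - 2) * math.log(q - 1)


def ordered_point(q, s):
    # One large coordinate and q - 1 equal small ones; s = 0 is the uniform point.
    return [(1.0 + (q - 1) * s) / q] + [(1.0 - s) / q] * (q - 1)


def height_gap(q):
    # Height of the ordered point minus height of the uniform point at the
    # assumed critical temperature; it should vanish up to rounding.
    beta = classical_critical_beta(q)
    uniform = ordered_point(q, 0.0)
    ordered = ordered_point(q, (q - 2.0) / (q - 1.0))
    return potential(ordered, beta) - potential(uniform, beta)
```

For instance, `height_gap(3)` and `height_gap(7)` both vanish up to floating-point rounding, which is consistent with the two heights coinciding exactly at this temperature.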
We may assume that increases from a very small positive number. Observe that the elements of and simultaneously appear when and the elements of appear when . By the above two lemmas, before the appearance of critical points in , the heights of and are reversed.
Order of heights of and
We have the following lemma about the heights of and . The critical temperature given in the following lemma is a key contribution of this article.
Lemma 6.7.
Let . We have a critical temperature such that
(6.8) |
The proof of the lemma is given in Section 8.2.
Up to this point, we have obtained four critical values
when . If , we have ; if , then is not defined. Thus, if , define so that
We conclude this section by defining the critical temperatures appearing in Section 3.2, at which the phase transitions occur. They are given by
(6.9) |
7. Critical Points of
In this section, we prove Proposition 6.3 for . For the case , we refer to [24] and we will only highlight the difference.
7.1. Eigenvalues of Hessian of at Critical Points
First, we investigate , which is always a critical point for all . The following lemma proves the property of .
Lemma 7.1.
The point is a local minimum of if , a local maximum of if , and a degenerate critical point when .
Proof.
Let be a matrix. By elementary computation, we obtain
whose eigenvalues are with multiplicity and with multiplicity 1. This completes the proof. ∎
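The eigenvalue structure at the uniform point can be sanity-checked numerically. The following is a minimal sketch, assuming the potential is $F_\beta(x) = -\frac{1}{2}\sum_i x_i^2 + \frac{1}{\beta}\sum_i x_i\log x_i$ and that the last coordinate is eliminated via $x_q = 1 - \sum_{i<q} x_i$. Under these assumptions the reduced Hessian at the uniform point equals $c\,(I + \mathbf{1}\mathbf{1}^{\top})$ with $c = q/\beta - 1$. All names below are hypothetical.

```python
def reduced_hessian(x, beta):
    # Hessian of the assumed potential in the first q - 1 coordinates,
    # with x_q = 1 - (x_1 + ... + x_{q-1}) eliminated.
    q = len(x)
    last = -1.0 + 1.0 / (beta * x[-1])
    return [
        [(-1.0 + 1.0 / (beta * x[i]) if i == j else 0.0) + last
         for j in range(q - 1)]
        for i in range(q - 1)
    ]


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def uniform_eigenpairs(q, beta):
    # Returns (c, q * c, residual), where c = q / beta - 1 is the eigenvalue
    # on vectors such as e_1 - e_2, q * c is the eigenvalue on the all-ones
    # vector, and residual measures how well both eigen-relations hold.
    # Requires q >= 3.
    hessian = reduced_hessian([1.0 / q] * q, beta)
    c = q / beta - 1.0
    ones = [1.0] * (q - 1)
    diff = [1.0, -1.0] + [0.0] * (q - 3)
    res_ones = max(abs(a - q * c * b) for a, b in zip(matvec(hessian, ones), ones))
    res_diff = max(abs(a - c * b) for a, b in zip(matvec(hessian, diff), diff))
    return c, q * c, max(res_ones, res_diff)
```

With $q = 4$, both eigenvalues are positive for $\beta = 3$, negative for $\beta = 5$, and zero for $\beta = 4$, which matches the local minimum, local maximum, and degenerate cases of the lemma under the stated assumptions.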
Now, for , , and , define and as
(7.1) |
We have the following lemma about eigenvalues of Hessian of at critical points.
Lemma 7.2.
Let and . Moreover, let and . Then, is a critical point of and eigenvalues of constitute one of the following cases.
(1) If , all eigenvalues of are , with multiplicities , , respectively, and the roots of .
(2) If , all eigenvalues of are with multiplicity and with multiplicity 1.
Proof.
By Lemma 6.2, is a critical point of since . By elementary computation, we have
so that
Then, we can write as
where
Let be a -identity matrix. By the formula
we can write
Hence, we obtain
The lemma follows directly from this explicit computation of the characteristic polynomial of the Hessian of . ∎
We have the following lemma about the sign of the eigenvalues of . Recall the definition of from Lemma 6.1.
Lemma 7.3.
Let and . Moreover, let and . Then, we have the following table regarding the sign of each value. If , we ignore and .
Proof.
First, suppose that . Then,
By substituting , one can deduce that is equivalent to which implies . Moreover, by the same argument above, we have . In the same manner, if , we obtain and .
Now, we investigate the sign of . We write
By elementary computation, if . Hence, if and only if
First, assume . Then, if and only if
By investigating the graph of (cf. Figure 6.1), the above inequality holds if and only if . Second, assume . Then, if and only if if and only if . Hence, if and only if or .
The case when can be proven by the argument in the first paragraph of this proof. If , then since . The above argument can prove the case when since . ∎
7.2. Relevant Critical Points of
In this and the next subsection, we classify nondegenerate critical points. When we consider critical points in or , we assume that since when , the elements of are degenerate. The case when is treated in Section 7.4.
By Morse theory, critical points with at least two negative eigenvalues can be neither saddle points nor minima. Hence, only the critical points whose eigenvalues are all positive, or which have exactly one negative eigenvalue with all others positive, are relevant to the landscape of . We select these critical points in this subsection.
Lemma 7.4.
Let . If , is a set of local minima. If , is a set of saddle points. If , is a set of saddle points; if , each point in has at least two negative eigenvalues.
Proof.
Consider . The eigenvalues of are with multiplicity and with multiplicity 1. By Lemma 7.3, if , then since , we obtain ; hence, is a local minimum.
Next, consider . The eigenvalues of are with multiplicity and with multiplicity 1. By Lemma 7.3, if , then since , we obtain and ; hence, it is a saddle point. If , then since , we obtain and so that has at least two negative eigenvalues.
Finally, let , , and . In this case, has eigenvalues with multiplicity and the roots of . Since for all and , by Lemma 7.3, , , and so that it has positive eigenvalues and negative eigenvalues. Hence, is a saddle point. ∎
Remark 7.5.
For , by the same argument, has only one negative eigenvalue and two positive eigenvalues for .
7.3. Irrelevant Critical Points of
In this subsection, we eliminate unneeded critical points.
Lemma 7.6.
Let . For and , each point in has at least two negative eigenvalues. Moreover, for and , each point in has at least two negative eigenvalues.
Proof.
By the proof of Lemma 7.4, for has at least two negative eigenvalues. Now, let , , and . In this case, each point in has eigenvalues with multiplicity , and the roots of . If , then so that , , and . In this case,
so that the two roots of are negative. Hence, it has positive eigenvalues and negative eigenvalues. If , then so that , and points in have at least negative eigenvalues, where since . ∎
Lemma 7.7.
Let and . Then, we have .
Proof.
Observe that . If , since there is only one solution to . Suppose . By elementary computation, we obtain
so that . Hence, is a permutation of , that is, each element of is one of the elements of so that . ∎
By the lemmas in this subsection, , , and , , are not of interest.
7.4. At Critical Temperature
In this subsection, we investigate the critical points at the critical temperatures, that is, at or . The point is degenerate when and the point is degenerate when by Lemmas 7.2 and 7.3.
Lemma 7.8.
If and , the point is not a local minimum. If , the point is not a local minimum.
Proof.
Fix such that and define as
We therefore obtain
We claim that and are not the local minima of and , respectively, and this completes the proof.
For the first claim, assume , and note that . Then, and if is in a neighborhood of and . In this case, near so that is not a local minimum. If , so that it suffices to show the second assertion.
Next, note that so that we have , if and , if . Therefore, near so that is not a local minimum. ∎
Even though , , is not a saddle point if , we cannot exclude the possibility that is a saddle point when ; however, by the next two lemmas, , , are irrelevant to the landscape of .
Lemma 7.9.
Let and . Then, if , is not a saddle point.
Proof.
Lemma 7.10.
Let . We have . Furthermore, if , we have . Hence, cannot be a saddle point lower than or .
The proof is presented in Section 8.3. We remark that if , we have so that and the second assertion is not needed.
8. Analysis of Energy Landscape
In this section, we prove the lemmas introduced in Section 6.2 and Lemma 7.10. To prove these lemmas, we need the numerical computations given in Appendix A.
8.1. Proof of Lemma 6.5
Lemma 8.1.
If , we have .
Proof.
Fix and write for convenience. Since , we have
(8.1) |
Let
(8.2) |
We claim that , that is, by (8.1),
By (8.2), the above inequality is equivalent to
By plugging given in (8.2) into this inequality, it becomes . Hence, since is increasing at , we obtain .
Finally, we claim that
According to Figure 6.1, we can show this by
This holds if or by elementary computation. Now, assume . Therefore, we obtain
which completes the proof. ∎
We can prove Lemma 6.5 by the aforementioned lemma.
8.2. Proof of Lemma 6.7
We first introduce two lemmas.
Lemma 8.2.
Let . When , we have and when , we have .
The proof of the above lemma is given in Section 8.3.
Lemma 8.3.
Let . decreases as increases in .
Proof.
For , which satisfies , let
(8.3) |
Since is a critical point, by the proof of Corollary 3.7, we have
Define a function as
(8.4) |
By elementary computations, we obtain so that we have
(8.5) |
We can now prove Lemma 6.7.
8.3. Proofs of Lemmas 7.10 and 8.2
Before we go further, we conduct some computations. Recall the definition of from Lemma 6.1. Since and is the minimum of , we have
so that
(8.9) |
For defined in (8.3), since and , we can write
(8.10) |
Hence, by (8.9) and , we have
(8.11) | ||||
By (8.9) again, we obtain
(8.12) |
Lemma 8.4.
For , we have
The proof is given in Section 10.
Lemma 8.5.
Let . Define and
Then, we have for .
Proof.
By the above lemmas, Lemma 8.2 can be proven.
Proof of Lemma 8.2.
Now, we prove Lemma 7.10.
9. Characterization of Metastable Sets
Proof of Theorem 3.4.
9.1. Proof of Theorem 3.6
Before we go further, we recall the height between two points. Let , and let be the set of all -paths , such that and . Then, we can define the height between and as . We prove Theorem 3.6 in several steps.
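To make the notion of height concrete, the following sketch computes it on a finite graph, assuming the height between two points is the usual min-max quantity: the minimum over paths of the maximum potential value along the path. The Dijkstra-type search and the toy double-well chain are illustrative choices and are not taken from the paper. All names are hypothetical.

```python
import heapq


def communication_height(values, neighbors, start, target):
    # Min over paths from start to target of the max of `values` along the path.
    # values: dict node -> potential value; neighbors: dict node -> list of nodes.
    best = {start: values[start]}
    heap = [(values[start], start)]
    while heap:
        height, node = heapq.heappop(heap)
        if node == target:
            return height
        if height > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt in neighbors[node]:
            candidate = max(height, values[nxt])
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return float("inf")  # target not reachable


def double_well_chain():
    # Toy landscape f(x) = (x^2 - 1)^2 sampled at x = -1.5, -1.0, ..., 1.5.
    # Nodes 1 and 5 are the two wells; node 3 is the barrier at x = 0.
    values = {i: ((-1.5 + 0.5 * i) ** 2 - 1.0) ** 2 for i in range(7)}
    neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
    return values, neighbors
```

On the toy chain, the height between the two wells (nodes 1 and 5) is the barrier value 1.0 attained at node 3, which plays the role of the saddle point.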
Lemma 9.1.
Let . If , the sets , , are different. In particular, they are nonempty.
Proof.
Since the elements of are the lowest minima, we have so that ’s are nonempty. Without loss of generality, suppose . Since and is connected, there is a -path , such that , . Therefore, we have for , so that
Then, there is a saddle point , such that . However, by Proposition 6.3, the values of saddle points are greater than or equal to . This is a contradiction. Hence, ’s are different. ∎
Lemma 9.2.
Let . If , the set is a singleton for all .
Proof.
First, we claim that ’s are not empty. Suppose one of ’s is empty. Then, by symmetry, all of them are empty. Let us fix . Since is a saddle point, there is a unit eigenvector that corresponds to the unique negative eigenvalue of . There exists , such that for all . Now, consider the path described by the ordinary differential equation
(9.1) |
Then, converges to a critical point whose height is less than as . If this limit point is not a local minimum, we can find an eigenvector corresponding to a negative eigenvalue of the Hessian of at that point. Then, by the same argument defining the path (9.1), the next path converges to another critical point whose height is lower than that of the previous critical point. Finally, this path converges to a local minimum. Since there is no local minimum other than , converges to some element of , say without loss of generality. Since ’s are different, converges to only one minimum.
By the same argument, a similar path starting at converges to some , say . If , so that is not empty. So, we obtain . In this case, we obtain and for . By symmetry, since has elements and the number of is , there are elements in corresponding to each , that is, , where is the number of elements of the set . If , for some , we obtain by symmetry, and therefore . Hence, we have . If for some , since and by symmetry, for some , and this contradicts the assumption that . Hence, ’s are nonempty.
Observe that the elements of are saddle points and for all . Hence, by Proposition 6.3, . Since ’s have only one negative eigenvalue, each element of connects only two wells, i.e., if . Therefore, has at most one point and from the above two paragraphs, we obtain . ∎
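The descent argument used in this proof can be illustrated on a toy example. The sketch below is not the potential of this paper: it uses the hypothetical double-well function $f(x, y) = (x^2 - 1)^2 + y^2$, whose origin is a saddle point with a single unstable direction $(1, 0)$, and follows a forward-Euler discretization of the gradient flow started slightly off the saddle in both directions.

```python
def gradient(point):
    # Gradient of the toy potential f(x, y) = (x^2 - 1)^2 + y^2.
    x, y = point
    return 4.0 * x * (x * x - 1.0), 2.0 * y


def descend(start, step=0.01, tol=1e-10, max_iter=100000):
    # Forward-Euler discretization of the gradient flow dz/dt = -grad f(z).
    x, y = start
    for _ in range(max_iter):
        gx, gy = gradient((x, y))
        if gx * gx + gy * gy < tol * tol:
            break  # numerically at a critical point
        x, y = x - step * gx, y - step * gy
    return x, y


def wells_reached_from_saddle(eps=1e-3):
    # Perturb the saddle (0, 0) along +/- its unstable direction (1, 0)
    # and follow the flow; each branch should end in a different well.
    return descend((eps, 0.0)), descend((-eps, 0.0))
```

In this toy setting the two branches end near $(1, 0)$ and $(-1, 0)$, mirroring the statement that a saddle point with exactly one negative eigenvalue connects exactly two wells.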
We can now prove Theorem 3.6.
Proof of Theorem 3.6.
By Lemma 9.2, to prove when , without loss of generality, it is sufficient to show that and . First, suppose . Then, by symmetry, we obtain , which contradicts . Second, suppose so that by symmetry, we have which is also a contradiction. Hence, we obtain .
Since is continuous in , the values and , , are continuous in . Note that for since there is no saddle point other than . Since if and there is no saddle point other than the elements of , by continuity, we obtain
Hence, is a saddle point between and and if . Coupled with Lemma 9.1, the fourth assertion holds except that .
Without loss of generality, suppose that . We, therefore, obtain so that . By continuity, we get
so that for . However, this contradicts . Hence, we obtain for .
By the same argument and symmetry, the second assertion can be proven for . By a continuity argument, we can extend these to . The third assertion holds because of the first and fourth assertions, symmetry, and continuity. Finally, the fifth assertion can be proven by the same argument. ∎
9.2. Proof of Theorem 3.5
If , cannot be proven by a symmetry argument. Hence, we prove Theorem 3.5 directly.
Proof of Theorem 3.5.
Consider . It can be observed that these six planes divide into four pieces, and each plane contains one element of and . We claim that for all if . Note that is not a local minimum if .
Let be a restriction of to and let . Since ,
so that if is a critical point, we have
Since , if , the critical points in are , . From the proof of Lemma 7.7, we obtain .
Let and . We therefore obtain
The eigenvalues of are and . By Lemma 7.3, so that is a local minimum in . Since this is the unique critical point, is the unique minimum in . Since is a closure of and there is no critical point in , is the unique minimum in . Hence, ’s are different if .
Let . By the definition of , we obtain if so that . By Lemma 9.2, are not empty. It can be observed that and if . Since , we have ; thus, the fourth assertion is proved.
For the third assertion, note that for all and is the only point in , such that . Moreover, we obtain if , and finally we can deduce if using elementary calculus. Hence, ’s are different if .
For the second assertion, we can use the symmetry argument and the proofs are the same as the proof of Theorem 3.6. ∎
10. Proof of Lemma 8.4
This section is devoted to the proof of Lemma 8.4. In Section 10.1, we provide an auxiliary lemma, Lemma 10.1, to prove Lemma 8.4. In Section 10.2, we prove this auxiliary lemma. So far, we have fixed an integer ; however, in this section, we consider as a real number and several variables as functions of . For example, , , and .
10.1. Proof of Lemma 8.4
Lemma 10.1.
The function of is defined as
(10.1) |
Then, if , .
10.2. Proof of Lemma 10.1
Let . In the first lemma, we compute , , and .
Lemma 10.2.
We have
Proof.
We observe that
so that
By differentiating this equation in , we get
By elementary computation, we can write
(10.2) |
Let . Then,
(10.3) |
Next, we compute . Note that
so that
(10.4) | ||||
∎
The next lemma provides the bound of .
Lemma 10.3.
Let . We have
Proof.
It can be observed that and if . We claim that
where . The above inequality can be written as
Since the right-hand side is smaller than , it suffices to show that
which is true if . Hence, . Next, we have since
which is true if . ∎
In the next two lemmas, we prove that some quantities are positive.
Lemma 10.4.
Let . We have
Proof.
We have
It suffices to show that
Since , we obtain
In the second and third inequalities, we use . Hence, . ∎
Lemma 10.5.
Let . We have
(10.5) |
Proof.
Since , the right-hand side is greater than
and
where the last inequality is equivalent to which is true for .
Hence, it is enough to show that
Since and , we obtain, for ,
∎
Now, we derive the proof of Lemma 10.1.
Appendix A Some Numerical Computations
Recall the definition (10.1) of . In this section, we verify several inequalities numerically. Our purpose is the following proposition. The proof is presented at the end of this section.
Proposition A.1.
The following hold.
(1) For , we have .
(2) For , we have .
(3) .
Bounds of , and .
We will obtain the bounds of , , and . Fix and let . By the gradient descent method, we obtain the following.
Algorithm A.2.
We define and in the following way.
(1) .
(2) While , let .
(3) If , let .
Let and .
We record in the above algorithm and let
Algorithm A.3.
We define and in the following way.
(1) If , let .
    (a) .
    (b) While , let .
    (c) If , let .
(2) If , let .
    (a) .
    (b) While , let .
    (c) If , let .
By Newton’s method, we approximate which satisfies .
Algorithm A.4.
We define and in the following way.
(1) Let and .
    (a) While , let .
    (b) If , let and .
(2) If , let .
(3) If , let
    (a) .
    (b) While , let .
    (c) If , let .
(4) If , let .
(5) If , let
    (a) .
    (b) While , let .
    (c) If , let .
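As an illustration of a Newton iteration of this kind, the sketch below solves a stand-in equation rather than the one treated by the algorithm above, which is not recoverable here. We assume the classical mean-field equation $s = (e^{\beta s} - 1)/(e^{\beta s} + q - 1)$ for the order parameter of the Curie–Weiss–Potts model (cf. [13, 16]) and look for its largest root starting from $s = 1$. The function names, starting point, and tolerances are hypothetical choices.

```python
import math


def residual(s, beta, q):
    # phi(s) = s - (e^{beta s} - 1) / (e^{beta s} + q - 1); roots solve the
    # assumed mean-field equation.
    e = math.exp(beta * s)
    return s - (e - 1.0) / (e + q - 1.0)


def residual_derivative(s, beta, q):
    # phi'(s) = 1 - beta * q * e^{beta s} / (e^{beta s} + q - 1)^2.
    e = math.exp(beta * s)
    return 1.0 - beta * q * e / (e + q - 1.0) ** 2


def newton_largest_root(beta, q, start=1.0, tol=1e-13, max_iter=200):
    # Plain Newton iteration started to the right of the largest root.
    s = start
    for _ in range(max_iter):
        step = residual(s, beta, q) / residual_derivative(s, beta, q)
        s -= step
        if abs(step) < tol:
            return s
    raise RuntimeError("Newton iteration did not converge")
```

For $q = 3$ and $\beta = 4\log 2$, the iteration returns a value close to $1/2 = (q-2)/(q-1)$, which can be verified by direct substitution into the assumed equation. A rigorous enclosure would additionally require sign checks on both sides of the returned value.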
Lemma A.5.
We have , , and .
Proof.
By Taylor’s theorem, we obtain
for some if or if . Since is increasing in the neighborhood of , we obtain
Since , we obtain
where the last inequality is from . Hence, we have
so that we have
which proves the first claim. In the last inequality, we use the fact that .
Since if and if , the second claim is true. Finally, since is increasing in the neighborhood of , the third claim holds. ∎
We finally prove Proposition A.1.
Appendix B Proof of (3.8)
Proof of (3.8).
Since we have
we can use the elementary bound
to obtain
Hence, by the definition of (2.4), we can obtain
and the proof is completed. ∎
Acknowledgement.
This work was supported by Samsung Science and Technology Foundation (Project Number SSTF-BA1901-03).
References
- [1] L. Alonso, R. Cerf: The three dimensional polyominoes of minimal area. Electron. J. Comb. 3, R27 (1996)
- [2] J. Beltrán, C. Landim: Tunneling and metastability of continuous time Markov chains. J. Stat. Phys. 140, 1065–1114 (2010)
- [3] J. Beltrán, C. Landim: Tunneling and metastability of continuous time Markov chains II. J. Stat. Phys. 149, 598–618 (2012)
- [4] G. Ben Arous, R. Cerf: Metastability of the three dimensional Ising model on a torus at very low temperatures. Electron. J. Probab. 1, 1–55 (1996)
- [5] A. Bianchi, A. Bovier, D. Ioffe: Sharp asymptotics for metastability in the random field Curie–Weiss model. Electron. J. Probab. 14, 1541–1603 (2009)
- [6] A. Bovier, M. Eckhoff, V. Gayrard, M. Klein: Metastability in stochastic dynamics of disordered mean-field models. Probab. Theory Relat. Fields 119, 99–161 (2001)
- [7] A. Bovier, F. den Hollander: Metastability: A Potential-Theoretic Approach. Grundlehren der mathematischen Wissenschaften. Springer. (2015)
- [8] A. Bovier, F. den Hollander, F.R. Nardi: Sharp asymptotics for Kawasaki dynamics on a finite box with open boundary. Probab. Theory Relat. Fields 135, 265–310 (2006)
- [9] A. Bovier, F. den Hollander, C. Spitoni: Homogeneous nucleation for Glauber and Kawasaki dynamics in large volumes at low temperatures. Ann. Probab. 38, 661–713 (2010)
- [10] A. Bovier, F. Manzo: Metastability in Glauber dynamics in the low-temperature limit: beyond exponential asymptotics. J. Stat. Phys. 107, 757–779 (2002)
- [11] A. Bovier, S. Marello, E. Pulvirenti: Metastability for the dilute Curie–Weiss model with Glauber dynamics. arXiv:1912.10699 (2019)
- [12] M. Cassandro, A. Galves, E. Olivieri, M.E. Vares: Metastable behavior of stochastic dynamics: A pathwise approach. J. Stat. Phys. 35, 603–634 (1984)
- [13] M. Costeniuc, R. S. Ellis, H. Touchette: Complete analysis of phase transitions and ensemble equivalence for the Curie–Weiss–Potts model. J. Math. Phys. 46, 063301 (2005)
- [14] P. Cuff, J. Ding, O. Louidor, E. Lubetzky, Y. Peres, A. Sly: Glauber dynamics for the mean-field Potts model. J. Stat. Phys. 149, 432–477 (2012)
- [15] P. Eichelsbacher, B. Martschink: On rates of convergence in the Curie–Weiss–Potts model with an external field. Ann. Inst. Henri Poincaré Probab. Stat. 51, 252–282 (2015)
- [16] R. S. Ellis, K. Wang: Limit theorems for the empirical vector of the Curie–Weiss–Potts model. Stoch. Process. Their Appl. 35, 59–79 (1990)
- [17] R. B. Griffiths, P. A. Pearce: Potts model in the many-component limit. J. Phys. A. 13, 2143–2148 (1980)
- [18] F. den Hollander, O. Jovanovski: Glauber dynamics on the Erdős-Rényi random graph. arXiv:1912.10591 (2020)
- [19] H. Kesten, R. H. Schonmann: Behavior in large dimensions of the Potts and Heisenberg models. Rev. Math. Phys. 1, 147–182 (1982)
- [20] S. Kim, I. Seo: Metastability of stochastic Ising and Potts models on lattices without external fields. arXiv:2102.05565 (2021)
- [21] C. Külske, D. Meißner: Stable and metastable phases for the Curie–Weiss–Potts model in vector-valued fields via singularity theory. J. Stat. Phys. 181, 968–989 (2020)
- [22] C. Landim: Metastable Markov chains. Probab. Surv. 16, 143–227 (2019)
- [23] C. Landim, I. Seo: Metastability of non-reversible random walks in a potential field and the Eyring–Kramers transition rate formula. Commun. Pure Appl. Math. 71, 203–266 (2018)
- [24] C. Landim, I. Seo: Metastability of non-reversible, mean-field Potts model with three spins. J. Stat. Phys. 165, 693–726 (2016)
- [25] D.A. Levin, M.J. Luczak, Y. Peres: Glauber dynamics for the mean-field Ising model: cut-off, critical power law, and metastability. Probab. Theory Relat. Fields 146, 223 (2010)
- [26] E. Lubetzky, A. Sly: Cutoff for the Ising model on the lattice. Invent. Math. 191, 719–755 (2013)
- [27] E. Lubetzky, A. Sly: Information percolation and cutoff for the stochastic Ising model. J. Am. Math. Soc. 29, 729–774 (2016)
- [28] E. Lubetzky, A. Sly: Universality of cutoff for the Ising model. Ann. Probab. 45, 3664–3696 (2017)
- [29] F.R. Nardi, A. Zocca: Tunneling behavior of Ising and Potts models in the low-temperature regime. Stoch. Process. Their Appl. 129, 4556–4575 (2019)
- [30] E.J. Neves: A discrete variational problem related to Ising droplets at low temperatures. J. Stat. Phys. 80, 103–123 (1995)
- [31] E.J. Neves, R.H. Schonmann: Critical droplets and metastability for a Glauber dynamics at very low temperatures. Commun. Math. Phys. 137, 209–230 (1991)
- [32] E. Olivieri, M.E. Vares: Large deviations and metastability. Encyclopedia of Mathematics and Its Applications. 100. Cambridge University Press, Cambridge. (2005)
- [33] M. Ostilli, F. Mukhamedov: Continuous- and discrete-time Glauber dynamics. First- and second order phase transitions in mean-field Potts models. EPL. 101(6) (2013)
- [34] F. Rassoul-Agha, T. Seppäläinen: A Course on Large Deviations with an Introduction to Gibbs Measures. Graduate Studies in Mathematics. 162. American Mathematical Society. (2015)
- [35] K. Wang: Solutions of the variational problem in the Curie–Weiss–Potts model. Stoch. Process. Their Appl. 50, 245–252 (1994)
- [36] F.Y. Wu: The Potts model. Rev. Mod. Phys. 54, 235–268 (1982)