
Energy Landscape and Metastability of Curie–Weiss–Potts Model

Jungkyoung Lee
Department of Mathematical Sciences, Seoul National University, Seoul, Republic of Korea
Abstract.

In this paper, we thoroughly analyze the energy landscape of the Curie–Weiss–Potts model, which is a ferromagnetic spin system consisting of $q\geq 3$ spins defined on complete graphs. In particular, for the Curie–Weiss–Potts model with $q\geq 3$ spins and zero external field, we completely characterize all critical temperatures and phase transitions in view of the global structure of the energy landscape. We observe that there are three critical temperatures and four different regimes for $q<5$, whereas there are four critical temperatures and five different regimes for $q\geq 5$. Our analysis extends the investigations performed in [M. Costeniuc, R. S. Ellis, H. Touchette: J. Math. Phys (2005)], which provides the precise characterization of the second critical temperature for all $q\geq 3$, and in [Landim and Seo: J. Stat. Phys. (2016)], which provides a complete analysis of the energy landscape for $q=3$. Based on our precise analysis of the energy landscape, we also perform a quantitative investigation of the metastable behavior of the heat-bath Glauber dynamics associated with the Curie–Weiss–Potts model.

1. Introduction

The Potts model is a well-known mathematical model suitable for studying ferromagnetic spin systems consisting of $q\geq 3$ spins. We refer to [36] for a comprehensive review of the Potts model. In the present work, we focus on the Potts model defined on large complete graphs without an external field, in order to understand the associated energy landscape as well as the metastable behavior of the heat-bath Glauber dynamics at a highly precise level. This special case of the Potts model defined on complete graphs is called the Curie–Weiss–Potts model and has been investigated in various studies; e.g., [5, 13, 14, 15, 16, 21, 24, 35, 6] and references therein. The rigorous mathematical definition of the Curie–Weiss–Potts model is presented in the next section.

The Curie–Weiss model

The Ising case of the Curie–Weiss–Potts model, i.e., the corresponding spin system consisting only of $q=2$ spins, is the famous Curie–Weiss model. It is well known that the Curie–Weiss model without an external field exhibits a phase transition at the critical (inverse) temperature $\beta_{c}>0$. This is mainly because the number of global minima of the potential function associated with the empirical magnetization is one in the high-temperature regime $\beta\leq\beta_{c}$, while it becomes two in the low-temperature regime $\beta>\beta_{c}$, where $\beta>0$ represents the inverse temperature (cf. [34, Chapter 9] for more detail). It is also well known that such a phase transition in the structure of the energy landscape is closely related to the mixing property of the associated heat-bath Glauber dynamics. In [25], it has been shown that the Glauber dynamics exhibits the so-called cut-off phenomenon, which is a signature of fast mixing, in the high-temperature regime (i.e., $\beta<\beta_{c}$) and metastability in the low-temperature regime (i.e., $\beta>\beta_{c}$). The metastability in the low-temperature regime has been investigated more deeply in [12].

The Curie–Weiss–Potts model with $q=3$

The picture for the Curie–Weiss model explained above has been fully extended to the Curie–Weiss–Potts model consisting of $q=3$ spins. The complete description of the energy landscape has been obtained recently in [21, 24], where three critical temperatures

\[ 0<\beta_{1}<\beta_{2}<\beta_{3}=3 \]

are characterized. More precisely, it has been shown that the potential function associated with the empirical magnetization (which will be explained in detail in Section 2.3) has

  • a unique global minimum for $\beta\in(0,\beta_{1})$,

  • one global minimum and three local minima for $\beta\in(\beta_{1},\beta_{2})$,

  • three global minima and one local minimum for $\beta\in(\beta_{2},\beta_{3})$, and

  • three global minima for $\beta\in(\beta_{3},\infty)$.

The articles [21, 24] also analyzed the associated saddle structure. Based on this analysis, [24] discussed the quantitative features of the metastable behavior of the heat-bath Glauber dynamics in view of the Eyring–Kramers formula and Markov chain model reduction (cf. [2, 3, 22]) for the whole low-temperature regime $\beta>\beta_{1}$. Because of the abrupt change in the structure of the potential function at $\beta=\beta_{2}$ and $\beta=\beta_{3}$, the metastable behaviors of the Glauber dynamics in the three low-temperature regimes $(\beta_{1},\beta_{2})$, $(\beta_{2},\beta_{3})$, and $(\beta_{3},\infty)$ turned out to be both quantitatively and qualitatively different. For the high-temperature regime $(0,\,\beta_{1})$, the cut-off phenomenon has been verified in [14] for all $q\geq 3$. Combining all these works completes the picture for the Curie–Weiss–Potts model with $q=3$ spins.

The Curie–Weiss–Potts model with $q\geq 4$

Compared to the Curie–Weiss–Potts model with $q=2$ or $3$ spins, the analysis of the case with $q\geq 4$ spins has not yet been completed. In the literature, two critical temperatures $\beta_{1}(q)<\beta_{2}(q)$ for the Curie–Weiss–Potts model with $q\geq 4$ spins have been observed, and the phase transitions near these critical temperatures have been analyzed. For instance, in [14], the phase transition from fast mixing (the cut-off phenomenon) to slow mixing (due to the appearance of new local minima) at $\beta=\beta_{1}(q)$ has been confirmed. In [16], it has been observed that the limiting distribution of the empirical magnetization exhibits an abrupt change at $\beta=\beta_{2}(q)$. In [13], the phase transition around $\beta_{2}(q)$ has also been studied in view of the equivalence and non-equivalence of ensembles.

These studies focus on the phase transitions involving the local and the global minima of the potential function. However, in order to investigate the metastable behavior, whose main objective is to analyze the transitions between neighborhoods of local minima (i.e., the metastable states), a precise understanding of the saddle structure is also required. To the best of our knowledge, neither the saddle structure nor the metastable behavior of the heat-bath Glauber dynamics for $q\geq 4$ has been analyzed yet.

Main contribution of the article

The main result of the present work is to provide the complete description of the energy landscape, including the saddle structure, and to analyze dynamical features of the Glauber dynamics based on it for the Curie–Weiss–Potts models with $q\geq 4$ spins.

First, we observe that for $q=4$, as in the case of $q=3$, the potential function has three critical temperatures

\[ 0<\beta_{1}(4)<\beta_{2}(4)<\beta_{3}(4)=4\ , \]

and moreover the associated metastable behavior is quite similar to that of the case $q=3$. On the other hand, for $q\geq 5$, we will deduce that there are four critical temperatures

\[ 0<\beta_{1}(q)<\beta_{2}(q)<\beta_{3}(q)<\beta_{4}(q)=q\ , \]

where the two critical temperatures $\beta_{1}(q)$ and $\beta_{2}(q)$ play essentially the same roles as $\beta_{1}(3)$ and $\beta_{2}(3)$ (and hence $\beta_{1}(4)$ and $\beta_{2}(4)$), respectively. Surprisingly, our work reveals that the role of the third critical temperature $\beta_{3}(q)$ for $q\leq 4$ is divided between the third and fourth critical temperatures $\beta_{3}(q)$ and $\beta_{4}(q)$ for $q\geq 5$. More precisely, for $q\leq 4$, the change in the saddle gates between global minima and the disappearance of the local minimum representing the chaotic configuration happen simultaneously at $\beta=\beta_{3}(q)=q$; however, for $q\geq 5$, the change of saddle gates happens at $\beta=\beta_{3}(q)<q$ and the disappearance of the chaotic local minimum occurs at $\beta=\beta_{4}(q)=q$. Hence, for $q\geq 5$, we observe another type of metastable behavior for $\beta\in[\beta_{3}(q),\beta_{4}(q))$ compared to the case $q\leq 4$.

Other studies on the Potts model

Although the present work focuses on the Potts model on complete graphs, we also note that the Ising and Potts models on the lattice are widely studied as well. For instance, we refer to [34] and the references therein for the phase transition, to [26, 27, 28] for the cut-off phenomenon in the high-temperature regime, and to [1, 4, 7, 8, 9, 10, 20, 29, 30, 31, 32] for the metastability in the low-temperature regime. In addition, we refer to [17, 19] for the Potts model in many spins or large dimensions and to [11, 18] for the study of metastability of the Ising model on random graphs.

2. Model

In this section, we introduce the formal definition of the Curie–Weiss–Potts model, which will be analyzed in the present work. Fix an integer $q\geq 3$ and let $S=\{1,\,\dots,\,q\}$ be the set of spins.

2.1. Curie–Weiss–Potts Model

For a positive integer $N$, let us denote by $K_{N}=\{1,\,\dots,\,N\}$ the set of sites; we write $K_{N}$ to emphasize that our model is defined on the complete graph. Let $\Omega_{N}=S^{K_{N}}$ be the configuration space of spins on $K_{N}$. Each configuration is represented as $\sigma=(\sigma_{1},\,\dots,\,\sigma_{N})\in\Omega_{N}$, where $\sigma_{v}\in S$ denotes the spin at site $v\in K_{N}$. Let $\bm{h}=(h_{1},\,\dots,\,h_{q})\in\mathbb{R}^{q}$ be the external magnetic field. The Hamiltonian associated with the Curie–Weiss–Potts model with the external field $\bm{h}$ is given by

\[ \mathbb{H}_{N}(\sigma)\,=\,-\frac{1}{2N}\sum_{1\leq u,v\leq N}\bm{1}(\sigma_{u}=\sigma_{v})\,-\,\sum_{v=1}^{N}\sum_{j=1}^{q}h_{j}\bm{1}(\sigma_{v}=j)\ \ ;\ \sigma\in\Omega_{N}\ , \]

where $\bm{1}$ denotes the usual indicator function. Then, the Gibbs measure associated with the Hamiltonian at the (inverse) temperature $\beta>0$ is given by

\[ \mu_{N}^{\beta}(\sigma)\,=\,\frac{1}{Z_{N}(\beta)}e^{-\beta\mathbb{H}_{N}(\sigma)}\ \ ;\ \sigma\in\Omega_{N}\ , \]

where $Z_{N}(\beta)=\sum_{\sigma\in\Omega_{N}}e^{-\beta\mathbb{H}_{N}(\sigma)}$ is the partition function. The measure $\mu_{N}^{\beta}(\cdot)$ denotes the Curie–Weiss–Potts measure on $\Omega_{N}$ at the inverse temperature $\beta$.

2.2. Heat-bath Glauber Dynamics

Now, we define a heat-bath Glauber dynamics associated with the Curie–Weiss–Potts measure $\mu_{N}^{\beta}(\cdot)$. For $\sigma\in\Omega_{N}$, $v\in K_{N}$, and $k\in S$, denote by $\sigma^{v,\,k}$ the configuration whose spin $\sigma_{v}$ at site $v$ is flipped to $k$, i.e.,

\[ (\sigma^{v,\,k})_{u}\,=\,\begin{cases}\sigma_{u}&u\neq v\ ,\\ k&u=v\ .\end{cases} \]

Then, we will consider the heat-bath Glauber dynamics associated with the generator $\mathcal{L}_{N}$, which acts on $f:\Omega_{N}\to\mathbb{R}$ as

\[ (\mathcal{L}_{N}f)(\sigma)\,=\,\frac{1}{N}\,\sum_{v=1}^{N}\sum_{k=1}^{q}c_{v,\,k}(\sigma)[f(\sigma^{v,\,k})-f(\sigma)]\ , \]

where

\[ c_{v,\,k}(\sigma)\,=\,\exp\left\{-\frac{\beta}{2}[\mathbb{H}_{N}(\sigma^{v,\,k})-\mathbb{H}_{N}(\sigma)]\right\}\ . \]

It can be observed that this dynamics is reversible with respect to the Curie–Weiss–Potts measure $\mu_{N}^{\beta}(\cdot)$. Henceforth, denote by $\sigma(t)=\sigma^{K_{N}}(t)=(\sigma_{1}(t),\,\dots,\,\sigma_{N}(t))$ the continuous-time Markov process associated with the generator $\mathcal{L}_{N}$.
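The reversibility claim can be checked numerically on a small system. The following is a minimal sketch, not part of the original argument: it labels spins $0,\dots,q-1$ instead of $1,\dots,q$, and the helper names (`hamiltonian`, `flip`, `rate`, `detailed_balance_gap`) as well as the parameter values are illustrative assumptions. It evaluates the relative gap in the detailed balance relation $e^{-\beta\mathbb{H}_{N}(\sigma)}c_{v,\,k}(\sigma)=e^{-\beta\mathbb{H}_{N}(\sigma^{v,\,k})}c_{v,\,\sigma_{v}}(\sigma^{v,\,k})$ over all single-site updates of one random configuration.

```python
import math
import random


def hamiltonian(sigma, h):
    """H_N(sigma); spins are labelled 0..q-1 and h is the external field (length q)."""
    n = len(sigma)
    # Sum over ordered pairs (u, v), including u = v, as in the definition.
    pair = sum(1 for a in sigma for b in sigma if a == b)
    field = sum(h[s] for s in sigma)
    return -pair / (2 * n) - field


def flip(sigma, v, k):
    """Return sigma^{v,k}: the spin at site v is replaced by k."""
    new = list(sigma)
    new[v] = k
    return tuple(new)


def rate(sigma, v, k, beta, h):
    """c_{v,k}(sigma) = exp(-(beta/2) [H_N(sigma^{v,k}) - H_N(sigma)])."""
    diff = hamiltonian(flip(sigma, v, k), h) - hamiltonian(sigma, h)
    return math.exp(-0.5 * beta * diff)


def detailed_balance_gap(sigma, v, k, beta, h):
    """Relative difference between the two sides of detailed balance."""
    tau = flip(sigma, v, k)
    lhs = math.exp(-beta * hamiltonian(sigma, h)) * rate(sigma, v, k, beta, h)
    rhs = math.exp(-beta * hamiltonian(tau, h)) * rate(tau, v, sigma[v], beta, h)
    return abs(lhs - rhs) / max(lhs, rhs)


rng = random.Random(0)          # fixed seed, illustrative only
q, n, beta = 4, 12, 2.5         # illustrative parameters
h = [0.0] * q
sigma = tuple(rng.randrange(q) for _ in range(n))
max_gap = max(detailed_balance_gap(sigma, v, k, beta, h)
              for v in range(n) for k in range(q))
print(max_gap)
```

Both sides equal $\exp\{-\frac{\beta}{2}[\mathbb{H}_{N}(\sigma)+\mathbb{H}_{N}(\sigma^{v,\,k})]\}$, which is symmetric in the two configurations, so the printed gap should be at the level of floating-point rounding.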

2.3. Empirical Magnetization

For each spin $k\in S$, denote by $r_{N}^{k}(\sigma)$ the proportion of spin $k$ in the configuration $\sigma\in\Omega_{N}$, i.e.,

\[ r_{N}^{k}(\sigma)\,:=\,\frac{1}{N}\sum_{v=1}^{N}\bm{1}(\sigma_{v}=k)\ , \]

and define the proportional vector $\bm{r}_{N}(\sigma)$ as

\[ \bm{r}_{N}(\sigma)\,:=\,(r_{N}^{1}(\sigma),\,\dots,\,r_{N}^{q-1}(\sigma))\ , \]

which represents the empirical magnetization of the configuration $\sigma$ and contains the macroscopic information of $\sigma$.

Define $\Xi$ as

\[ \Xi\,=\,\{\bm{x}=(x_{1},\,\dots,\,x_{q-1})\in(\mathbb{R}_{\geq 0})^{q-1}:\,x_{1}+\cdots+x_{q-1}\leq 1\}\ , \tag{2.1} \]

and then define a discretization of $\Xi$ as

\[ \Xi_{N}=\Xi\cap(\mathbb{Z}/N)^{q-1}\ . \]

With this notation, we immediately have $\bm{r}_{N}(\sigma)\in\Xi_{N}$ for $\sigma\in\Omega_{N}$.

For the Markov process $\big(\sigma(t)\big)_{t\geq 0}$, we write $\bm{r}_{N}(\cdot)=\bm{r}_{N}(\sigma(\cdot))$, which is a stochastic process on $\Xi_{N}$ expressing the evolution of the empirical magnetization. Since the model is defined on the complete graph $K_{N}$, we obtain the following proposition.

Proposition 2.1.

The process $\big(\bm{r}_{N}(t)\big)_{t\geq 0}$ is a continuous-time Markov chain on $\Xi_{N}$ whose invariant measure is given by

\[ \nu_{N}^{\beta}(\bm{x}):=\mu_{N}^{\beta}(\bm{r}_{N}^{-1}(\bm{x}))\ \ ;\ \bm{x}\in\Xi_{N}\ , \]

where $\bm{r}_{N}^{-1}(\bm{x})$ denotes the set $\{\sigma\in\Omega_{N}\,:\,\bm{r}_{N}(\sigma)=\bm{x}\}$. Furthermore, $\bm{r}_{N}(\cdot)$ is reversible with respect to $\nu_{N}^{\beta}$.

The proof of this proposition, including the jump rates, is given in Section 5.1. Let $\mathbb{P}_{\bm{x}}^{N,\,\beta}$ be the law of the Markov chain $\bm{r}_{N}(\cdot)$ starting at $\bm{x}\in\Xi_{N}$ and let $\mathbb{E}_{\bm{x}}^{N,\,\beta}$ be the corresponding expectation.

More on the measure $\nu_{N}^{\beta}(\cdot)$

For $\bm{y}\in\Xi$, let $\widehat{\bm{y}}=(y_{1},\,\dots,\,y_{q-1},\,y_{q})\in\mathbb{R}^{q}$ where $y_{q}=1-(y_{1}+\cdots+y_{q-1})$. Then, the Hamiltonian $\mathbb{H}_{N}$ can be written as

\[ \mathbb{H}_{N}(\sigma)\,=\,NH(\bm{r}_{N}(\sigma))\ \;;\;\sigma\in\Omega_{N}\ , \]

where

\[ H(\bm{x})\,=\,-\frac{1}{2}|\widehat{\bm{x}}|^{2}-\bm{h}\cdot\widehat{\bm{x}}\ \;;\;\bm{x}\in\Xi\;. \tag{2.2} \]
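The identity $\mathbb{H}_{N}(\sigma)=NH(\bm{r}_{N}(\sigma))$ follows from $\sum_{u,v}\bm{1}(\sigma_{u}=\sigma_{v})=\sum_{k}(Nx_{k})^{2}$, where $x_{k}$ is the proportion of spin $k$. The sketch below checks it on one random configuration; spins are labelled $0,\dots,q-1$, and the function names and the field values are illustrative assumptions rather than anything fixed by the text.

```python
import random


def hamiltonian(sigma, h):
    """H_N(sigma) computed directly from the double sum over sites."""
    n = len(sigma)
    pair = sum(1 for a in sigma for b in sigma if a == b)
    field = sum(h[s] for s in sigma)
    return -pair / (2 * n) - field


def proportions(sigma, q):
    """Full vector (x_1, ..., x_q); r_N(sigma) keeps only the first q-1 entries."""
    n = len(sigma)
    return [sum(1 for s in sigma if s == k) / n for k in range(q)]


def energy(x_hat, h):
    """H(x) = -|x_hat|^2 / 2 - h . x_hat, as in (2.2)."""
    quad = sum(x * x for x in x_hat)
    lin = sum(a * x for a, x in zip(h, x_hat))
    return -0.5 * quad - lin


rng = random.Random(1)          # fixed seed, illustrative only
q, n = 3, 30
h = [0.3, -0.1, 0.0]            # arbitrary field used only for this check
sigma = [rng.randrange(q) for _ in range(n)]
lhs = hamiltonian(sigma, h)
rhs = n * energy(proportions(sigma, q), h)
print(lhs, rhs)
```

The two printed numbers should agree up to rounding, for any configuration and any field.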

Therefore, by Proposition 2.1, the invariant measure $\nu_{N}^{\beta}(\cdot)$ of the process $\bm{r}_{N}(t)$ on $\Xi_{N}$ can be written as

\begin{align*}
\nu_{N}^{\beta}(\bm{x}) &\,=\,\sum_{\sigma:\bm{r}_{N}(\sigma)=\bm{x}}\frac{1}{Z_{N}(\beta)}\exp\{-\beta\mathbb{H}_{N}(\sigma)\}\\
&\,=\,{N\choose(Nx_{1})\cdots(Nx_{q})}\frac{1}{Z_{N}(\beta)}\exp\{-\beta NH(\bm{x})\}\\
&\,\eqqcolon\,\frac{1}{(2\pi N)^{(q-1)/2}Z_{N}(\beta)}\exp\{-\beta NF_{\beta,\,N}(\bm{x})\}\ , \tag{2.3}
\end{align*}

where, by Stirling's formula, we can write

\[ F_{\beta,\,N}(\bm{x})\,=\,F_{\beta}(\bm{x})+\frac{1}{N}G_{\beta,\,N}(\bm{x})\ , \]

where

\[ F_{\beta}(\bm{x})\,=\,H(\bm{x})+\frac{1}{\beta}S(\bm{x})\quad\text{and}\quad G_{\beta,\,N}(\bm{x})\,=\,\frac{\log(x_{1}\cdots x_{q})}{2\beta}+O(N^{-1})\ . \tag{2.4} \]

In this equation, $H(\cdot)$ is the energy functional defined in (2.2) and $S(\cdot)$ is the entropy functional defined by

\[ S(\bm{x})=\sum_{i=1}^{q}x_{i}\log(x_{i})\ , \]

and $G_{\beta,\,N}(\bm{x})$ converges to $\log(x_{1}\cdots x_{q})/(2\beta)$ uniformly on every compact subset of $\text{int}\,\Xi$.
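Comparing the second and third lines of (2.3) with (2.4), the Stirling step amounts to the approximation $\log{N\choose(Nx_{1})\cdots(Nx_{q})}\approx-NS(\bm{x})-\frac{q-1}{2}\log(2\pi N)-\frac{1}{2}\log(x_{1}\cdots x_{q})$ with an $O(N^{-1})$ error. The following sketch compares both sides for one interior point; the counts are an arbitrary illustrative choice and the function names are assumptions of this example.

```python
import math


def log_multinomial(counts):
    """Exact logarithm of N! / prod(c_i!) via the log-gamma function."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)


def entropy_S(x):
    """S(x) = sum_i x_i log x_i (the sign convention used in the text)."""
    return sum(xi * math.log(xi) for xi in x)


def stirling_approx(counts):
    """-N S(x) - ((q-1)/2) log(2 pi N) - (1/2) log(x_1 ... x_q)."""
    n = sum(counts)
    q = len(counts)
    x = [c / n for c in counts]
    return (-n * entropy_S(x)
            - 0.5 * (q - 1) * math.log(2 * math.pi * n)
            - 0.5 * sum(math.log(xi) for xi in x))


counts = [1500, 1000, 500]      # N = 3000, x = (1/2, 1/3, 1/6)
exact = log_multinomial(counts)
approx = stirling_approx(counts)
print(exact - approx)
```

The printed difference is the $O(N^{-1})$ remainder absorbed into $G_{\beta,\,N}$ (up to the factor $\beta$); it shrinks as the counts are scaled up proportionally.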

Main objectives of the article

Now, we can express the main purpose of the current article in a more concrete manner. In this article, we consider the Curie–Weiss–Potts model when there is no external magnetic field; i.e., from now on, we assume $\bm{h}=\bm{0}$. Under this assumption, the first objective is to analyze the function $F_{\beta}(\cdot)$ expressing the energy landscape of the empirical magnetization of the Curie–Weiss–Potts model. This result will be explained in Section 3. The second objective is to investigate the metastable behavior of the process $\bm{r}_{N}(\cdot)$ in the low-temperature regime. This will be explained in Section 4. The latter part of the article is devoted to the proofs of these results.

3. Main Result for Energy Landscape

In view of Proposition 2.1, (2.3), and (2.4), the structure of the invariant measure $\nu_{N}^{\beta}(\cdot)$ of the process $\bm{r}_{N}(\cdot)$ is essentially captured by the potential function $F_{\beta}(\cdot)$; hence, the investigation of $F_{\beta}(\cdot)$ is crucial in the analysis of the energy landscape and the metastable behavior of $\bm{r}_{N}(\cdot)$. In this section, we explain our detailed analysis of the function $F_{\beta}(\cdot)$.

Note that the function $F_{\beta}(\cdot)=H(\cdot)+\beta^{-1}S(\cdot)$ expresses the competition between the energy and the entropy represented by $H(\cdot)$ and $S(\cdot)$, respectively. Since there is a factor $\beta^{-1}$ in front of the entropy functional, we can expect that the entropy dominates the competition when $\beta$ is small (i.e., the temperature is high). Since the entropy is uniquely minimized at the equally distributed configuration $(1/q,\,\dots,\,1/q)\in\Xi$, we can expect that the potential $F_{\beta}(\cdot)$ also has a unique minimum when $\beta$ is small. On the other hand, if $\beta$ is large enough (i.e., the temperature is low), the energy $H(\cdot)$ with $q$ minima dominates the system, and therefore, we can expect that the potential $F_{\beta}$ also has $q$ global minima. In this section, we provide a complete and precise characterization of the complicated pattern of transitions from this high-temperature regime to the low-temperature regime.

In Section 3.1, we define several points that will be shown to be critical points. In Section 3.2, we introduce several critical values of the (inverse) temperature $\beta$. In Section 3.3, we summarize the results on the energy landscape $F_{\beta}(\cdot)$. In Section 3.4, as a by-product of these results, we compute the mean-field free energy.

3.1. Critical Points of $F_{\beta}(\cdot)$

Let us first investigate the critical points of $F_{\beta}(\cdot)$. We recall that

\[ F_{\beta}(\bm{x})\,=\,-\frac{1}{2}\sum_{k=1}^{q}x_{k}^{2}\,+\,\frac{1}{\beta}\sum_{k=1}^{q}x_{k}\log x_{k}\;\ ;\;\bm{x}\in\Xi\ . \]

Notation 3.1.

We use the following notation for convenience.

  (1) Since there is no risk of confusion, we will write the point $\bm{x}=(x_{1},\,\dots,\,x_{q-1})\in\Xi$ as $\bm{x}=(x_{1},\,\dots,\,x_{q-1},\,x_{q})\in\mathbb{R}^{q}$ where $x_{q}=1-x_{1}-\cdots-x_{q-1}$.

  (2) Let $\{\bm{e}_{1},\,\dots,\,\bm{e}_{q-1}\}$ be the standard orthonormal basis of $\mathbb{R}^{q-1}$ and $\bm{e}_{q}=\bm{0}\in\mathbb{R}^{q-1}$.

Figure 3.1. Graph of $g_{2}(t)$ for $q=10$.

Now, we explain the candidates for the critical points of $F_{\beta}(\cdot)$ that play an important role in the analysis of the energy landscape. The first candidate is

\[ \mathbf{p}\coloneqq(1/q,\,\dots,\,1/q)\in\Xi\ , \]

which represents the state where the spins are equally distributed.

In order to introduce the other candidates, we fix $i\in\mathbb{N}\cap[1,q/2]$ and let $j=q-i$. Define $g_{i}:(0,1/j)\to\mathbb{R}$ as

\[ g_{i}(t)\,\coloneqq\,\frac{i}{1-qt}\log\Big(\frac{1-jt}{it}\Big)\ , \tag{3.1} \]

where we set $g_{i}(1/q)=q$ so that $g_{i}$ becomes a continuous function on $(0,1/j)$. We refer to Figure 3.1 for an illustration of the graph of $g_{i}$. It will be verified by Lemma 6.1 in Section 6.1 (and can be expected from the graph illustrated in Figure 3.1) that $g_{i}(t)=\beta$ has at most two solutions. We denote by $u_{i}(\beta)\leq v_{i}(\beta)$ these solutions, provided that they exist. If there is only one solution, we let $u_{i}(\beta)=v_{i}(\beta)$ be this solution.
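A short heuristic computation indicates why the equation $g_{i}(t)=\beta$ is the natural one; it is only a sketch of the motivation, and the rigorous treatment is the one deferred to Section 6.1. Consider a point of $\Xi$ (in the $q$-coordinate notation) with $j$ coordinates equal to $t$ and $i$ coordinates equal to $s=(1-jt)/i$. By the Lagrange multiplier rule for the constraint $x_{1}+\cdots+x_{q}=1$, such a point is a critical point of $F_{\beta}$ exactly when $\partial F_{\beta}/\partial x_{k}$ takes the same value $\lambda$ for every $k$:

\begin{align*}
-x_{k}+\frac{1}{\beta}\big(\log x_{k}+1\big) &= \lambda \quad\text{for all }k\\
\Longrightarrow\quad -t+\frac{1}{\beta}\log t &= -s+\frac{1}{\beta}\log s\\
\Longleftrightarrow\quad \beta\,(s-t) &= \log\frac{s}{t}\ .
\end{align*}

Since $s-t=(1-qt)/i$ and $s/t=(1-jt)/(it)$, for $t\neq 1/q$ the last line reads

\[ \beta\,=\,\frac{i}{1-qt}\log\Big(\frac{1-jt}{it}\Big)\,=\,g_{i}(t)\ . \]

At $t=1/q$ one has $s=t$ and the point is $\mathbf{p}$, which satisfies the criticality condition for every $\beta$; the convention $g_{i}(1/q)=q$ is the continuous extension of the formula.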

For $k\in S$, let

\begin{align*}
\mathbf{u}_{1}^{k}=\mathbf{u}_{1}^{k}(\beta) &\coloneqq\,\Big(u_{1}(\beta),\,\dots,\,1-(q-1)u_{1}(\beta),\,\dots,\,u_{1}(\beta)\Big)\in\Xi\ , \tag{3.2}\\
\mathbf{v}_{1}^{k}=\mathbf{v}_{1}^{k}(\beta) &\coloneqq\,\Big(v_{1}(\beta),\,\dots,\,1-(q-1)v_{1}(\beta),\,\dots,\,v_{1}(\beta)\Big)\in\Xi\ , \tag{3.3}
\end{align*}

where $1-(q-1)u_{1}(\beta)$ and $1-(q-1)v_{1}(\beta)$ are located at the $k$-th component of $\mathbf{u}_{1}^{k}$ and $\mathbf{v}_{1}^{k}$, respectively. For $k,\,l\in S$ (henceforth, $a,\,b\in S$ implies that $a\in S$, $b\in S$, and $a\neq b$), let

\[ \mathbf{u}_{2}^{k,\,l}=\mathbf{u}_{2}^{k,\,l}(\beta)\,\coloneqq\,\Big(u_{2}(\beta),\,\dots,\,\frac{1-(q-2)u_{2}(\beta)}{2},\,\dots,\,\frac{1-(q-2)u_{2}(\beta)}{2},\,\dots,\,u_{2}(\beta)\Big)\in\Xi\ , \tag{3.4} \]

where $\frac{1-(q-2)u_{2}(\beta)}{2}$ is located at the $k$-th and $l$-th components. Of course, each of these points is well defined only when $u_{1}(\beta)$, $v_{1}(\beta)$, or $u_{2}(\beta)$ exists, respectively. Then, let

\[ \mathcal{U}_{1}\coloneqq\{\mathbf{u}_{1}^{k}:k\in S\}\,,\quad\mathcal{U}_{2}\coloneqq\{\mathbf{u}_{2}^{k,\,l}:k,\,l\in S\}\,,\quad\text{and}\quad\mathcal{V}_{1}\coloneqq\{\mathbf{v}_{1}^{k}:k\in S\}\ . \]

We remark that these sets depend on $\beta$, although we omit $\beta$ in the expressions for simplicity of notation.
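The points above can be computed numerically. The sketch below is a hedged illustration, not the construction used in the proofs: it assumes that $g_{i}$ is unimodal on $(0,1/j)$ (suggested by Figure 3.1 and by the at-most-two-solutions statement), locates the minimizer by ternary search, and then finds $u_{i}(\beta)\leq v_{i}(\beta)$ by bisection on either side. The function names, tolerances, and the choice $q=3$, $\beta=2.9$ are assumptions of this example. It finally checks that $\mathbf{u}_{1}^{q}$ and $\mathbf{v}_{1}^{q}$ satisfy the criticality condition that all components of the gradient of $F_{\beta}$ coincide.

```python
import math


def g(i, q, t):
    """g_i(t) from (3.1), rewritten as log1p(w) / (w t) with w = (1 - q t) / (i t)."""
    w = (1.0 - q * t) / (i * t)
    if abs(w) < 1e-12:
        return 1.0 / t  # limiting value; equals q at t = 1/q
    return math.log1p(w) / (w * t)


def argmin_g(i, q, iters=200):
    """Ternary search for the minimizer, ASSUMING g_i is unimodal on (0, 1/j)."""
    lo, hi = 1e-9, 1.0 / (q - i) - 1e-9
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if g(i, q, a) < g(i, q, b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f changes sign between lo and hi."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fm = f(mid)
        if (fm > 0) == (flo > 0):
            lo, flo = mid, fm
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solutions(i, q, beta):
    """Return (u_i, v_i) with u_i <= v_i, or None if g_i(t) = beta has no solution."""
    t_star = argmin_g(i, q)
    if g(i, q, t_star) > beta:
        return None
    f = lambda t: g(i, q, t) - beta
    u = bisect(f, 1e-9, t_star)
    v = bisect(f, t_star, 1.0 / (q - i) - 1e-9)
    return u, v


def grad_spread(x, beta):
    """max - min of -x_k + (log x_k + 1)/beta; zero at a critical point on the simplex."""
    comps = [-xk + (math.log(xk) + 1.0) / beta for xk in x]
    return max(comps) - min(comps)


q, beta = 3, 2.9                 # illustrative values
u1, v1 = solutions(1, q, beta)
point_u = [u1] * (q - 1) + [1 - (q - 1) * u1]   # representative u_1^q
point_v = [v1] * (q - 1) + [1 - (q - 1) * v1]   # representative v_1^q
print(u1, v1, grad_spread(point_u, beta), grad_spread(point_v, beta))
```

The search brackets start at $10^{-9}$ from the endpoints of $(0,1/j)$, so the sketch only covers moderate values of $\beta$; for very large $\beta$ the smaller solution falls below that cutoff.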

Since we assumed that $\bm{h}=\bm{0}$, by symmetry, we can expect that the elements in $\mathcal{U}_{1}$ have the same properties; for instance, for all $k,l\in S$, we have $F_{\beta}(\mathbf{u}_{1}^{k})=F_{\beta}(\mathbf{u}_{1}^{l})$, and $\mathbf{u}_{1}^{k}$ is a critical point of $F_{\beta}(\cdot)$ if and only if $\mathbf{u}_{1}^{l}$ is. Likewise, the elements in $\mathcal{U}_{2}$ and those in $\mathcal{V}_{1}$ respectively have the same properties. Thus, it suffices to analyze their representatives, which we select as

\[ \mathbf{u}_{1}=\mathbf{u}_{1}^{q}\,,\ \mathbf{u}_{2}=\mathbf{u}_{2}^{q-1,\,q}\,,\ \text{and}\ \mathbf{v}_{1}=\mathbf{v}_{1}^{q}\ . \tag{3.5} \]

Now, we have the following preliminary classification of critical points. We remark that a saddle point is a critical point at which the Hessian has exactly one negative eigenvalue.

Proposition 3.2.

The following hold.

  (1) If $\mathbf{c}\in\Xi$ is a local minimum of $F_{\beta}$, then $\mathbf{c}\in\{\mathbf{p}\}\cup\mathcal{U}_{1}$.

  (2) If $\mathbf{s}\in\Xi$ is a saddle point of $F_{\beta}$, then $\mathbf{s}\in\mathcal{V}_{1}\cup\mathcal{U}_{2}$ for $q\geq 4$ and $\mathbf{s}\in\mathcal{V}_{1}$ for $q=3$.

Remark 3.3.

The set $\mathcal{U}_{2}$ is not defined for $q=3$ since the set $\mathcal{U}_{i}$ is defined only when $i\leq q/2$. This will be explained in Section 6.1.

The proof of this proposition is an immediate consequence of Proposition 6.3 in Section 6.1. The above proposition permits us to focus only on $\{\mathbf{p}\}\cup\mathcal{U}_{1}\cup\mathcal{U}_{2}\cup\mathcal{V}_{1}$ when we analyze the energy landscape in view of the metastable behavior, since the critical points of index greater than $1$ cannot play any role, as the metastable transition always happens in the neighborhood of a saddle point (a critical point of index $1$).

3.2. Critical Temperatures

In this subsection, we introduce critical temperatures

\[ 0<\beta_{1}(q)<\beta_{2}(q)<\beta_{3}(q)\leq q\ , \]

at which the phase transitions in the energy landscape occur. The precise definitions of these critical temperatures are given in (6.9) of Section 6.2. Henceforth, we write $\beta_{i}=\beta_{i}(q)$, $1\leq i\leq 3$, since there is no risk of confusion.

Figure 3.2. Role of each critical point according to temperature: (a) $q=4$; (b) $q\geq 5$. Solid lines imply local minima and dashed lines imply saddle points.

To describe the role of these critical temperatures, we regard $\beta$ as increasing from $0$ to $\infty$. Figure 3.2 shows the role of $\mathbf{p},\,\mathcal{U}_{1},\,\mathcal{V}_{1}$, and $\mathcal{U}_{2}$ according to the inverse temperature. The content of this figure is proven in Section 6.

At $\beta=\beta_{1}$, the dynamics exhibits a phase transition from fast mixing to slow mixing, as proven in [14]; correspondingly, the behavior of the dynamics changes from the cut-off phenomenon to metastability. This phase transition is due to the appearance of new local minima $\mathcal{U}_{1}$ of $F_{\beta}(\cdot)$ other than $\mathbf{p}$ at $\beta=\beta_{1}$. At $\beta=\beta_{2}$, the ground states of the dynamics change from $\mathbf{p}$ to the elements of $\mathcal{U}_{1}$, as observed in [13, Theorem 3.1(b)]. To explain the roles of the critical temperatures $\beta_{3}$ and $q$, we have to divide the explanation into several cases. Let us first assume that $q\geq 5$ so that $\beta_{3}<q$. At $\beta=\beta_{3}$, the saddle gates among the ground states in $\mathcal{U}_{1}$ change from $\mathcal{V}_{1}$ to $\mathcal{U}_{2}$ (since the heights $F_{\beta}(\mathbf{v}_{1})$ and $F_{\beta}(\mathbf{u}_{2})$ are reversed at this point), and at $\beta=q$, the local minimum $\mathbf{p}$ becomes a local maximum. On the other hand, for $q\leq 4$, we have $\beta_{3}=q$, and at $\beta=\beta_{3}$ the change of the saddle gates and the disappearance of the local minimum $\mathbf{p}$ occur simultaneously. We refer to [24] for the detailed description when $q=3$.
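The change of $\mathbf{p}$ from a local minimum to a local maximum at $\beta=q$ can be read off from a direct second-order computation. The following is a sketch in the $q$-coordinate notation of Notation 3.1, with $\bm{h}=\bm{0}$; it treats $F_{\beta}$ as a function on $\mathbb{R}^{q}$ restricted to the hyperplane $x_{1}+\cdots+x_{q}=1$, which adds no curvature term because the constraint is linear. For $1\leq k,l\leq q$,

\[ \frac{\partial^{2}}{\partial x_{k}\,\partial x_{l}}\Big(-\frac{1}{2}\sum_{m=1}^{q}x_{m}^{2}+\frac{1}{\beta}\sum_{m=1}^{q}x_{m}\log x_{m}\Big)\,=\,\Big(\frac{1}{\beta x_{k}}-1\Big)\delta_{kl}\ , \]

so that at $\mathbf{p}=(1/q,\,\dots,\,1/q)$ the second-order variation along a tangent direction $\bm{w}=(w_{1},\,\dots,\,w_{q})$ with $w_{1}+\cdots+w_{q}=0$ is

\[ \sum_{k=1}^{q}\Big(\frac{q}{\beta}-1\Big)w_{k}^{2}\,=\,\Big(\frac{q}{\beta}-1\Big)|\bm{w}|^{2}\ . \]

This quantity is positive in every tangent direction when $\beta<q$ and negative in every tangent direction when $\beta>q$, which is consistent with $\mathbf{p}$ being a local minimum below $\beta=q$ and a local maximum above it. At $\beta=q$ itself the quadratic form vanishes, so this computation alone does not decide the nature of $\mathbf{p}$ there.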

3.3. Stable and Metastable Sets

We define some metastable sets based on the results explained earlier. If $q\geq 4$, define $H_{\beta}$ as (cf. (3.5))

\[ H_{\beta}=\begin{cases}F_{\beta}(\mathbf{v}_{1})\ ,&\beta\in(\beta_{1},\beta_{3})\ ,\\ F_{\beta}(\mathbf{u}_{2})\ ,&\beta\in[\beta_{3},\infty)\ .\end{cases} \tag{3.6} \]

When $q=3$, we set $H_{\beta}=F_{\beta}(\mathbf{v}_{1})$ for all $\beta>\beta_{1}$ (cf. Remark 3.3). It will be verified in Lemma 6.7 and (6.9) that $H_{\beta}$ is the height of the lowest saddle points.

Let $\widehat{S}:=S\cup\{\mathfrak{o}\}$ and $\mathbf{u}_{1}^{\mathfrak{o}}:=\mathbf{p}$. Let $\mathcal{W}_{k}=\mathcal{W}_{k}(\beta)$, $k\in S$, be the connected component of $\{F_{\beta}<H_{\beta}\}$ containing $\mathbf{u}_{1}^{k}$ and let $\mathcal{W}_{\mathfrak{o}}=\mathcal{W}_{\mathfrak{o}}(\beta)$ be the connected component of $\{F_{\beta}<F_{\beta}(\mathbf{v}_{1})\}$ containing $\mathbf{u}_{1}^{\mathfrak{o}}$. Here, we define the sets $\mathcal{W}_{k}$, $k\in S$, and $\mathcal{W}_{\mathfrak{o}}$ as the empty set if the set $\{F_{\beta}<H_{\beta}\}$ does not contain $\mathbf{u}_{1}^{k}$ and $\{F_{\beta}<F_{\beta}(\mathbf{v}_{1})\}$ does not contain $\mathbf{u}_{1}^{\mathfrak{o}}$, respectively. For $k,\,l\in\widehat{S}$, let $\Sigma_{k,\,l}=\Sigma_{k,\,l}(\beta):=\overline{\mathcal{W}_{k}}\cap\overline{\mathcal{W}_{l}}$ be the set of saddle gates of height $H_{\beta}$ between $\mathbf{u}_{1}^{k}$ and $\mathbf{u}_{1}^{l}$.

Figure 3.3. Energy landscape of $F_{\beta}$ when $q=3$: (a) $\beta\in(0,\beta_{1}]$; (b) $\beta\in(\beta_{1},\beta_{2})$; (c) $\beta=\beta_{2}$; (d) $\beta\in(\beta_{2},\beta_{3})$; (e) $\beta=\beta_{3}$; (f) $\beta\in(\beta_{3},\infty)$.

Now, we can state the main results on the energy landscape; the proofs of the theorems in this section will be presented in Section 9. The first result holds for all $q\geq 3$.

Theorem 3.4.

For $q\geq 3$, the following hold.

  (1) If $\beta\leq\beta_{1}$, there is no critical point other than $\mathbf{p}$, which is the global minimum.

  (2) For $\beta\in(\beta_{1},q)$, we have $\mathcal{W}_{\mathfrak{o}}\neq\emptyset$ and for $\beta\in[q,\infty)$, we have $\mathcal{W}_{\mathfrak{o}}=\emptyset$.

  (3) Let $\mathcal{M}_{\beta}$ be the set of local minima of $F_{\beta}$. Then, we have

    \[ \mathcal{M}_{\beta}=\begin{cases}\{\mathbf{p}\}&\beta\in(0,\beta_{1}]\ ,\\ \{\mathbf{p}\}\cup\mathcal{U}_{1}&\beta\in(\beta_{1},q)\ ,\\ \mathcal{U}_{1}&\beta\in[q,\infty)\ .\end{cases} \]

  (4) Let $\mathcal{M}_{\beta}^{\star}$ be the set of global minima of $F_{\beta}$. Then, we have

    \[ \mathcal{M}_{\beta}^{\star}=\begin{cases}\{\mathbf{p}\}&\beta\in(0,\beta_{2})\ ,\\ \{\mathbf{p}\}\cup\mathcal{U}_{1}&\beta=\beta_{2}\ ,\\ \mathcal{U}_{1}&\beta\in(\beta_{2},\infty)\ .\end{cases} \]

Since there is only one minimum if $\beta\leq\beta_{1}$, we now consider $\beta>\beta_{1}$. Before we state the main result on metastable sets, we would like to emphasize that [24, Proposition 4.4] proved the case $q=3$, while the proof for the case $q\geq 4$ is the main novel content of the current article. We first consider the case $q\leq 4$. See Figure 3.3 (these figures are excerpted from [24, Fig 4]) for the visualization of the preceding theorem and the following one.

Theorem 3.5.

For $q\leq 4$, the following hold.

  (1) $\beta_{3}=q$.

  (2) For $\beta\in(\beta_{1},q)$, the sets $\mathcal{W}_{k}$, $k\in\widehat{S}$, are nonempty and disjoint. For $k,\,l\in S$, $\Sigma_{k,\,l}=\emptyset$ and for $k\in S$, $\Sigma_{\mathfrak{o},\,k}=\{\mathbf{v}_{1}^{k}\}$.

  (3) For $\beta=q$, we have $\mathcal{W}_{\mathfrak{o}}=\emptyset$. The sets $\mathcal{W}_{k}$, $k\in S$, are nonempty and disjoint. For $k,\,l\in S$, $\Sigma_{k,\,l}=\{\mathbf{p}\}$.

  (4) For $\beta\in(q,\infty)$, we have $\mathcal{W}_{\mathfrak{o}}=\emptyset$. The sets $\mathcal{W}_{k}$, $k\in S$, are nonempty and disjoint. For $k,\,l\in S$,

    \[ \Sigma_{k,\,l}=\begin{cases}\{\mathbf{v}_{1}^{m}\}\ ,\ \text{where }m\in S\setminus\{k,l\}\ ,&\text{if }q=3\ ,\\ \{\mathbf{u}_{2}^{k,\,l}\}\ ,&\text{if }q=4\ .\end{cases} \]

Figure 3.4. Illustration of the energy landscape of $F_{\beta}$ when $q=5$: (a) $(\beta_{1},\beta_{2})$; (b) $\beta=\beta_{2}$; (c) $(\beta_{2},\beta_{3})$; (d) $\beta=\beta_{3}$; (e) $(\beta_{3},\infty)$; (f) $(\beta_{3},q)$. The first five figures are $\{F_{\beta}\leq H_{\beta}\}$ and the last figure is $\{F_{\beta}\leq F_{\beta}(\mathbf{v}_{1})\}$. The star-shaped vertices and circles represent saddle points and local minima, respectively. The empty circles are shallower minima.

Next, we consider the case $q\geq 5$. Note that the crucial difference compared to the previous theorem lies in the third and fifth statements. See Figure 3.4 for the visualization of the following theorem and Theorem 3.4.

Theorem 3.6.

For $q\geq 5$, the following hold.

  (1) $\beta_{3}<q$.

  (2) For $\beta\in(\beta_{1},\beta_{3})$, the sets $\mathcal{W}_{k}$, $k\in\widehat{S}$, are nonempty and disjoint. For $k,\,l\in S$, $\Sigma_{k,\,l}=\emptyset$ and for $k\in S$, $\Sigma_{\mathfrak{o},\,k}=\{\mathbf{v}_{1}^{k}\}$.

  (3) For $\beta=\beta_{3}$, the sets $\mathcal{W}_{k}$, $k\in\widehat{S}$, are nonempty and disjoint. For $k,\,l\in S$, $\Sigma_{k,\,l}=\{\mathbf{u}_{2}^{k,\,l}\}$ and for $k\in S$, $\Sigma_{\mathfrak{o},\,k}=\{\mathbf{v}_{1}^{k}\}$.

  (4) For $\beta\in(\beta_{3},\infty)$, the sets $\mathcal{W}_{k}$, $k\in S$, are nonempty and disjoint. For $k,\,l\in S$, $\Sigma_{k,\,l}=\{\mathbf{u}_{2}^{k,\,l}\}$ and for $k\in S$, $\Sigma_{\mathfrak{o},\,k}=\emptyset$.

  (5) For $\beta\in(\beta_{3},q)$, we have $F_{\beta}(\mathbf{v}_{1})>H_{\beta}$. Furthermore, the set $\{F_{\beta}<F_{\beta}(\mathbf{v}_{1})\}$ has only two connected components, the well $\mathcal{W}_{\mathfrak{o}}$ and the other containing $\mathcal{U}_{1}$. The saddle points between them are $\mathcal{V}_{1}$.

3.4. Mean-field Free Energy

In this subsection, we compute the mean-field free energy of the Curie–Weiss–Potts model defined by

ψ(β)limN1βNlogZN(β).\psi(\beta)\coloneqq-\lim_{N\to\infty}\frac{1}{\beta N}\log Z_{N}(\beta)\ . (3.7)

It is well known that the Curie–Weiss model with q=2q=2 spins exhibits the second-order phase transition at the unique critical temperature β=βc\beta=\beta_{c}, while the Curie–Weiss–Potts model with q3q\geq 3 spins exhibits the first-order phase transition at β=β2\beta=\beta_{2} (cf. [13, 16, 33]). We now reconfirm this folklore by computing the free energy explicitly. This computation is based on the following observation (cf. [16, display (2.4)]):

limN1βNlogZN(β)=sup𝒙Ξ{Fβ(𝒙)}.\lim_{N\to\infty}\frac{1}{\beta N}\log Z_{N}(\beta)=\sup_{\bm{x}\in\Xi}\{-F_{\beta}(\bm{x})\}\ . (3.8)

We give a rigorous proof in Appendix B.

Now, let us assume that $q\geq 3$ so that by (3.7), (3.8), and Theorem 3.4, we can deduce that

ψ(β)={Fβ(𝐩)if ββ2,Fβ(𝐮1)if β>β2.\psi(\beta)=\begin{cases}\,F_{\beta}({\bf p})&\text{if }\beta\leq\beta_{2}\ ,\\ \,F_{\beta}({\bf u}_{1})&\text{if }\beta>\beta_{2}\ .\end{cases} (3.9)
Corollary 3.7.

We have that

ψ(β)={1β2S(𝐩)if β<β2,1β2S(𝐮1)if β>β2.\psi^{\prime}(\beta)=\begin{cases}\,-\frac{1}{\beta^{2}}S({\bf p})&\text{if }\beta<\beta_{2}\ ,\\ \,-\frac{1}{\beta^{2}}S({\bf u}_{1})&\text{if }\beta>\beta_{2}\ .\end{cases} (3.10)

In particular, the Curie–Weiss–Potts model with q3q\geq 3 exhibits the first-order phase transition at β=β2\beta=\beta_{2}.

Proof.

Let 𝐜(β)Ξ{\bf c}(\beta)\in\Xi be a critical point of Fβ()F_{\beta}(\cdot). Then, since Fβ=H+β1SF_{\beta}=H+\beta^{-1}S, we have

ddβFβ(𝐜(β))=Fβ(𝐜(β))𝐜˙(β)1β2S(𝐜(β)).\frac{d}{d\beta}F_{\beta}({\bf c}(\beta))=\nabla F_{\beta}({\bf c}(\beta))\cdot\dot{{\bf c}}(\beta)-\frac{1}{\beta^{2}}S({\bf c}(\beta))\ .

Since Fβ(𝐜(β))=0\nabla F_{\beta}({\bf c}(\beta))=0, we get (3.10) from (3.9). Since SS attains its unique local minimum at 𝐩\mathbf{p} and 𝐮1𝐩\mathbf{u}_{1}\neq\mathbf{p}, ψ()\psi^{\prime}(\cdot) is discontinuous at β=β2\beta=\beta_{2}. ∎
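
As a numerical illustration of (3.9), and not as part of the proof, the following Python sketch compares $F_{\beta}({\bf p})$ with the minimum of $F_{\beta}$ over a one-parameter family of points. The family (one coordinate equal to $s$, the others equal) and the grid search are assumptions made for this sketch only; the function names are hypothetical.

```python
import math

def free_energy(x, beta):
    """F_beta(x) = -1/2 * sum x_k^2 + (1/beta) * sum x_k log x_k."""
    energy = -0.5 * sum(v * v for v in x)
    entropy = sum(v * math.log(v) for v in x if v > 0.0)
    return energy + entropy / beta

def family_point(s, q):
    """Assumed test family: one coordinate equals s, the rest share 1 - s."""
    return [s] + [(1.0 - s) / (q - 1)] * (q - 1)

def min_over_family(beta, q, grid=20000):
    """Grid minimum of F_beta over the family for s in [1/q, 1)."""
    best = free_energy(family_point(1.0 / q, q), beta)
    for i in range(1, grid):
        s = 1.0 / q + (1.0 - 1.0 / q) * i / grid
        best = min(best, free_energy(family_point(s, q), beta))
    return best

q = 3
uniform = [1.0 / q] * q
# Small beta: the uniform point p already realizes the minimum.
gap_low = free_energy(uniform, 2.0) - min_over_family(2.0, q)
# Large beta: some non-uniform point of the family lies strictly below p.
gap_high = free_energy(uniform, 3.5) - min_over_family(3.5, q)
```

For $q=3$, the gap is numerically zero at $\beta=2$ and strictly positive at $\beta=3.5$, in line with the change of minimizer in (3.9).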

4. Main Result for Metastability

In this section, we analyze the metastable behavior of 𝒓N()\bm{r}_{N}(\cdot) based on the analysis of the energy landscape carried out in the previous section and the general results obtained by [23]. As inverse temperature β\beta varies, the behavior of this dynamics changes both qualitatively and quantitatively thanks to the structural phase transitions explained in the previous section.

Since the invariant measure $\nu_{N}^{\beta}$ is exponentially concentrated in neighborhoods of ground states, the corresponding Markov process $\bm{r}_{N}(\cdot)$ spends most of its time in these neighborhoods. The abrupt transitions between such stable states constitute the metastable behavior of the process $\bm{r}_{N}(\cdot)$, and one natural way of describing these hopping dynamics among the neighborhoods of the ground states is the Markov chain model reduction. A comprehensive account of such approaches can be found in [2, 3, 22].

When the dynamics starts from a local minimum which is not a global minimum, we have to estimate the mean of the transition time to the global minimum in order to quantitatively understand the metastable behavior. This estimation is known as the Eyring–Kramers formula. In this section we provide the Markov chain model reduction and Eyring–Kramers formula for the metastable process 𝒓N()\bm{r}_{N}(\cdot).

Such a metastable behavior is observed only when there are multiple local minima; and hence we cannot expect metastable behavior at the high-temperature regime ββ1\beta\leq\beta_{1} for which 𝐩{\bf p} is the unique local (and global) minimum. Hence, we assume β>β1\beta>\beta_{1} in this section.

4.1. Some preliminaries

In this subsection, we introduce several notions crucial to the description of the metastable behavior.

Some constants

Recall the definition of {𝒆1,,𝒆q}\{\bm{e}_{1},\,\dots,\,\bm{e}_{q}\} from Notation 3.1. Define (q1)×(q1)(q-1)\times(q-1) matrices 𝔸i,j\mathbb{A}^{i,\,j}, i,jSi,j\in S, and 𝔸(𝒙)\mathbb{A}(\bm{x}) as

𝔸i,j=(𝒆j𝒆i)(𝒆j𝒆i)and 𝔸(𝒙)=1i<jqwi,j(𝒙)𝔸i,j,\mathbb{A}^{i,\,j}\,=\,(\bm{e}_{j}-\bm{e}_{i})(\bm{e}_{j}-\bm{e}_{i})^{\dagger}\ \;\;\;\text{and \;\;\;\;}\mathbb{A}(\bm{x})=\sum_{1\leq i<j\leq q}w^{i,\,j}(\bm{x})\mathbb{A}^{i,\,j}\ ,

where $w^{i,\,j}(\bm{x})\,:=\,\sqrt{x_{i}x_{j}}$. The appearance of the weight $w^{i,\,j}(\cdot)$ is explained in Section 5.3. Since $\mathbb{A}^{i,\,j}$, $i,j\in S$, are positive definite, $\mathbb{A}(\bm{x})$ satisfies [23, display (A.1)] and hence, by [23, Lemma A.1], for all $k,l\in S$, the matrices $(\nabla^{2}F_{\beta})({\bf u}_{2}^{k,\,l})\mathbb{A}({\bf u}_{2}^{k,\,l})^{\dagger}$ and $(\nabla^{2}F_{\beta})({\bf v}_{1}^{k})\mathbb{A}({\bf v}_{1}^{k})^{\dagger}$ each have a unique negative eigenvalue, denoted by $-\mu_{k,\,l}=-\mu_{k,\,l}(\beta)$ and $-\mu_{\mathfrak{o},\,k}=-\mu_{\mathfrak{o},\,k}(\beta)$, respectively.
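
To make the definition of $\mathbb{A}(\bm{x})$ concrete, the following Python sketch assembles the matrix entry by entry. It assumes, as a convention not restated in this section, that $\bm{e}_{1},\dots,\bm{e}_{q-1}$ is the standard basis of $\mathbb{R}^{q-1}$ and that $\bm{e}_{q}=0$; the helper names are hypothetical.

```python
import math

def basis(k, q):
    """Assumed convention: e_1..e_{q-1} standard basis of R^{q-1}, e_q = 0."""
    return [1.0 if (idx + 1 == k and k < q) else 0.0 for idx in range(q - 1)]

def matrix_A(x):
    """A(x) = sum_{i<j} sqrt(x_i x_j) (e_j - e_i)(e_j - e_i)^T, len(x) = q."""
    q = len(x)
    A = [[0.0] * (q - 1) for _ in range(q - 1)]
    for i in range(1, q + 1):
        for j in range(i + 1, q + 1):
            w = math.sqrt(x[i - 1] * x[j - 1])
            d = [b - a for a, b in zip(basis(i, q), basis(j, q))]
            for r in range(q - 1):
                for c in range(q - 1):
                    A[r][c] += w * d[r] * d[c]
    return A

def quadratic_form(A, v):
    """v^T A v for a square matrix stored as a list of rows."""
    n = len(v)
    return sum(v[r] * A[r][c] * v[c] for r in range(n) for c in range(n))

q = 4
A_p = matrix_A([1.0 / q] * q)          # matrix at the uniform point p
form_at_ones = quadratic_form(A_p, [1.0] * (q - 1))
```

Under this convention the quadratic form equals $\sum_{i<j}w^{i,\,j}(\bm{x})(v_{j}-v_{i})^{2}$ with $v_{q}:=0$, which is strictly positive for $v\neq 0$ whenever all coordinates of $\bm{x}$ are positive.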

Let us define the so-called Eyring–Kramers constants corresponding to our model as

ωk,l=ωk,l(β)\displaystyle\omega_{k,\,l}=\omega_{k,\,l}(\beta) :=μk,l(β)det[(2Fβ)(𝐮2k,l)]eβGβ(𝐮2k,l),k,lS,\displaystyle:=\frac{\mu_{k,\,l}(\beta)}{\sqrt{-\det[(\nabla^{2}F_{\beta})({\bf u}_{2}^{k,\,l})]}}e^{-\beta G_{\beta}({\bf u}_{2}^{k,\,l})}\ ,\ \ \ k,\,l\in S\ , (4.1)
ω𝔬,k=ω𝔬,k(β)\displaystyle\omega_{\mathfrak{o},\,k}=\omega_{\mathfrak{o},\,k}(\beta) :=μ𝔬,k(β)det[(2Fβ)(𝐯1k)]eβGβ(𝐯1k),kS.\displaystyle:=\frac{\mu_{\mathfrak{o},\,k}(\beta)}{\sqrt{-\det[(\nabla^{2}F_{\beta})({\bf v}_{1}^{k})]}}e^{-\beta G_{\beta}({\bf v}_{1}^{k})}\ ,\ \ \ k\in S\ . (4.2)

By symmetry, we have $\omega_{k,\,l}=\omega_{k^{\prime},\,l^{\prime}}$ for all $k,\,l\in S$ and $k^{\prime},\,l^{\prime}\in S$, and $\omega_{\mathfrak{o},\,k}=\omega_{\mathfrak{o},\,k^{\prime}}$ for all $k,\,k^{\prime}\in S$. Hence, let us write $\omega_{\mathfrak{o}}=\omega_{\mathfrak{o},\,1}$ and $\omega_{1}=\omega_{1,\,2}$. Next, define

νk=νk(β)\displaystyle\nu_{k}=\nu_{k}(\beta)\, :=exp(βGβ(𝐮1k))β2det[(2Fβ)(𝐮1k)],kS,\displaystyle:=\,\frac{\exp(-\beta G_{\beta}({\bf u}_{1}^{k}))}{\sqrt{\beta^{2}\det[(\nabla^{2}F_{\beta})({\bf u}_{1}^{k})]}}\ ,\ \ \ k\in S\ , (4.3)
ν𝔬=ν𝔬(β)\displaystyle\nu_{\mathfrak{o}}=\nu_{\mathfrak{o}}(\beta)\, :=exp(βGβ(𝐩))β2det[(2Fβ)(𝐩)].\displaystyle:=\,\frac{\exp(-\beta G_{\beta}({\bf p}))}{\sqrt{\beta^{2}\det[(\nabla^{2}F_{\beta})({\bf p})]}}\ . (4.4)

By symmetry, we also obtain $\nu_{1}=\cdots=\nu_{q}$.

Time scales

The constant HβH_{\beta} defined in (3.6) denotes the height of the lowest saddle points. Let θk=θk(β)\theta_{k}=\theta_{k}(\beta), kS^k\in\widehat{S}, be the depth of well 𝒲k(β)\mathcal{W}_{k}(\beta), i.e.,

{θ1==θq=β[HβFβ(𝐮1)],θ𝔬=β[Fβ(𝐯1)Fβ(𝐩)].\begin{cases}\,\theta_{1}=\cdots=\theta_{q}=\beta[H_{\beta}-F_{\beta}({\bf u}_{1})]\ ,\\ \,\theta_{\mathfrak{o}}=\beta[F_{\beta}({\bf v}_{1})-F_{\beta}({\bf p})]\ .\end{cases}

Then, eNθ1e^{N\theta_{1}} and eNθ𝔬e^{N\theta_{\mathfrak{o}}} represent the time scales on which 𝒓N()\bm{r}_{N}(\cdot) exhibits metastability. For βq\beta\geq q, the constant θ𝔬\theta_{\mathfrak{o}} is meaningless since 𝒲𝔬=\mathcal{W}_{\mathfrak{o}}=\emptyset.

Order process and Markov chain model reduction

Let δ=δ(β)>0\delta=\delta(\beta)>0 be a small enough number such that δ<min{θ𝔬,θ1}\delta<\min\{\theta_{\mathfrak{o}},\theta_{1}\}. If βq\beta\geq q, since θ𝔬\theta_{\mathfrak{o}} is not defined, let δ=(1/2)θ1\delta=(1/2)\theta_{1}. For kSk\in S, define

𝒲kδ\displaystyle\mathcal{W}_{k}^{\delta} =𝒲k{𝒙Ξ:Fβ(𝒙)<Hβδ},\displaystyle=\mathcal{W}_{k}\cap\{\bm{x}\in\Xi\,:\,F_{\beta}(\bm{x})<H_{\beta}-\delta\}\ ,
𝒲𝔬δ\displaystyle\mathcal{W}_{\mathfrak{o}}^{\delta} =𝒲𝔬{𝒙Ξ:Fβ(𝒙)<Fβ(𝐯1)δ}.\displaystyle=\mathcal{W}_{\mathfrak{o}}\cap\{\bm{x}\in\Xi\,:\,F_{\beta}(\bm{x})<F_{\beta}({\bf v}_{1})-\delta\}\ .

For kS^k\in\widehat{S}, define Nk=Nk(β)\mathcal{E}_{N}^{k}=\mathcal{E}_{N}^{k}(\beta) as

Nk=𝒲kδΞN.\mathcal{E}_{N}^{k}=\mathcal{W}_{k}^{\delta}\cap\Xi_{N}\ .

This set Nk\mathcal{E}_{N}^{k} is called the metastable set, provided that it is not an empty set. For AS^A\subset\widehat{S}, we write

NA=kANk.\mathcal{E}_{N}^{A}=\bigcup_{k\in A}\mathcal{E}_{N}^{k}\ .

Let TT be S,S^,S,\,\widehat{S}, or {𝔬,S}\{\mathfrak{o},\,S\}. Denote by ΨN=ΨNβ:ΞNT{N}\Psi_{N}=\Psi_{N}^{\beta}:\Xi_{N}\to T\cup\{N\} the projection map given by

ΨN(𝒙)=kTk𝟏{𝒙Nk}+N𝟏{𝒙ΞNNT}.\Psi_{N}(\bm{x})\,=\,\sum_{k\in T}k\bm{1}\{\bm{x}\in\mathcal{E}_{N}^{k}\}+N\bm{1}\{\bm{x}\in\Xi_{N}\setminus\mathcal{E}_{N}^{T}\}\ .

Let us define the so-called order process by 𝐗N(t)=ΨN(𝒓N(t)){\bf X}_{N}(t)=\Psi_{N}(\bm{r}_{N}(t)) which represents the index of metastable set at which the process 𝒓N()\bm{r}_{N}(\cdot) is staying.

Definition 4.1 (Markov chain model reduction).

Let ${\bf X}(\cdot)$ be a continuous-time Markov chain on $T$. We say that the metastable behavior of the process $\bm{r}_{N}(\cdot)$ is described by the Markov chain ${\bf X}(\cdot)$ in the time scale $\theta_{N}$ if, for all $k\in T$ and for every sequence $(\bm{x}_{N})_{N\geq 1}$ such that $\bm{x}_{N}\in\mathcal{E}_{N}^{k}$ for all $N\geq 1$, the finite-dimensional marginals of the process ${\bf X}_{N}(\theta_{N}\cdot)$ under $\mathbb{P}_{\bm{x}_{N}}^{N,\,\beta}$ converge to those of the Markov chain ${\bf X}(\cdot)$ as $N\to\infty$.

In the previous definition, it is clear that the Markov chain 𝐗()\mathbf{X}(\cdot) describes the inter-valley dynamics of the process 𝒓N()\bm{r}_{N}(\cdot) accelerated by a factor of θN\theta_{N}.

4.2. Metastability Results for q4q\leq 4

We can now state the main result for the metastable behavior. First, we consider the case q4q\leq 4 whose result is essentially the same as that in [24, Section 4.3] where only the case q=3q=3 was considered.

We define limiting Markov chains when q4q\leq 4.

Definition.

Let q4q\leq 4 and i{(1,2),(2),(2,3),(3,),(4)}i\in\{\,(1,2),\,(2),\,(2,3),\,(3,\infty),\,(4)\,\}. Let 𝐘qi(){\bf Y}_{q}^{i}(\cdot) be a Markov chain on TT with jump rate rqi:T×Tr_{q}^{i}:T\times T\to\mathbb{R} given by

  1. (1)

    rq(1,2)(k,l)=𝟏{l=𝔬}ω𝔬ν1,T=S^r_{q}^{(1,2)}(k,l)=\bm{1}\{l=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{1}}\,,\,T=\widehat{S}.

  2. (2)

    rq(2)(k,l)=𝟏{l=𝔬}ω𝔬ν1+𝟏{k=𝔬}ω𝔬ν𝔬,T=S^r_{q}^{(2)}(k,l)=\bm{1}\{l=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{1}}+\bm{1}\{k=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{\mathfrak{o}}}\,,\,T=\widehat{S}.

  3. (3)

    rq(2,3)(k,l)=ω𝔬qν1,T=Sr_{q}^{(2,3)}(k,l)=\frac{\omega_{\mathfrak{o}}}{q\nu_{1}}\,,\,T=S.

  4. (4)

    $r_{q}^{(3,\infty)}(k,l)=\begin{cases}\,\frac{\omega_{\mathfrak{o}}}{\nu_{1}}\ ,&q=3\\ \,\frac{\omega_{1}}{\nu_{1}}\ ,&q=4\end{cases}\,,\,T=S$.

  5. (5)

    rq(4)(k,l)=𝟏{k=𝔬}ω𝔬ν𝔬,T=S^r_{q}^{(4)}(k,l)=\bm{1}\{k=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{\mathfrak{o}}}\,,\,T=\widehat{S}.

The following theorem is the metastability result for q4q\leq 4. We remark that the case when q=3q=3 is proven in [24, Section 4.3].

Theorem 4.2.

Let q4q\leq 4. Then, the metastable behavior of the process 𝐫N()\bm{r}_{N}(\cdot) is described by (cf. Definition 4.1)

  1. (1)

    $\beta\in(\beta_{1},\beta_{2})$: the process ${\bf Y}_{q}^{(1,2)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$.

  2. (2)

    $\beta=\beta_{2}$: the process ${\bf Y}_{q}^{(2)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$.

  3. (3)

    $\beta\in(\beta_{2},q)$: the process ${\bf Y}_{q}^{(2,3)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$ and by the process ${\bf Y}_{q}^{(4)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{\mathfrak{o}}}$.

  4. (4)

    $\beta\in(q,\infty)$: the process ${\bf Y}_{q}^{(3,\infty)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$.

The proof follows from Theorems 3.4 and 3.5, Proposition 5.3, and [23, Theorem 2.2, Remarks 2.10, 2.11].

Remark 4.3.

As mentioned in [24], we cannot investigate the case β=β3=q\beta=\beta_{3}=q with the current method since 𝐩{\bf p} is a degenerate saddle point.

Remark 4.4.

The qualitative features of the metastable behavior of $\bm{r}_{N}(\cdot)$ are essentially the same for $q=3$ and $q=4$. The only difference is that the saddle points between metastable sets are defined in different ways; however, when $\beta>\beta_{3}$, the points in $\mathcal{V}_{1}$ for $q=3$ and $\mathcal{U}_{2}$ for $q=4$ play the same role since all the points belonging to these sets represent states in which most sites are aligned to two spins equally.

4.3. Metastability Results for q5q\geq 5

As in the previous subsection, we start by defining limiting Markov chains. Note that two of them, ${\bf Y}_{q}^{(3)}$ and ${\bf Y}_{q}^{(5)}$, have no counterpart in the case $q\leq 4$.

Definition.

Let $q\geq 5$ and $i\in\{\,(1,2),\,(2),\,(2,3),\,(3),\,(3,\infty),\,(4),\,(5)\,\}$. Let ${\bf Y}_{q}^{i}(\cdot)$ be a Markov chain on $T$ with jump rate $r_{q}^{i}:T\times T\to\mathbb{R}$ given by

  1. (1)

    rq(1,2)(k,l)=𝟏{l=𝔬}ω𝔬ν1,T=S^r_{q}^{(1,2)}(k,l)=\bm{1}\{l=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{1}}\,,\,T=\widehat{S}.

  2. (2)

    rq(2)(k,l)=𝟏{l=𝔬}ω𝔬ν1+𝟏{k=𝔬}ω𝔬ν𝔬,T=S^r_{q}^{(2)}(k,l)=\bm{1}\{l=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{1}}+\bm{1}\{k=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{\mathfrak{o}}}\,,\,T=\widehat{S}.

  3. (3)

    rq(2,3)(k,l)=ω𝔬qν1,T=Sr_{q}^{(2,3)}(k,l)=\frac{\omega_{\mathfrak{o}}}{q\nu_{1}}\,,\,T=S.

  4. (4)

    rq(3)(k,l)=1ν1(ω𝔬q+ω1),T=Sr_{q}^{(3)}(k,l)=\frac{1}{\nu_{1}}(\frac{\omega_{\mathfrak{o}}}{q}+\omega_{1})\,,\,T=S.

  5. (5)

    rq(3,)(k,l)=ω1ν1,T=Sr_{q}^{(3,\infty)}(k,l)=\frac{\omega_{1}}{\nu_{1}}\,,\,T=S.

  6. (6)

    rq(4)(k,l)=𝟏{k=𝔬}ω𝔬ν𝔬,T=S^r_{q}^{(4)}(k,l)=\bm{1}\{k=\mathfrak{o}\}\frac{\omega_{\mathfrak{o}}}{\nu_{\mathfrak{o}}}\,,\,T=\widehat{S}.

  7. (7)

    rq(5)(k,l)=𝟏{k=𝔬}qω𝔬ν𝔬,T={𝔬,S}r_{q}^{(5)}(k,l)=\bm{1}\{k=\mathfrak{o}\}\frac{q\omega_{\mathfrak{o}}}{\nu_{\mathfrak{o}}}\,,\,T=\{\mathfrak{o},\,S\}.
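
The seven jump rates above can be collected in a single lookup, sketched below in Python. The string labels for the regimes, the label `"o"` for the state $\mathfrak{o}$, and the numerical values of $\omega_{\mathfrak{o}}$, $\omega_{1}$, $\nu_{1}$, $\nu_{\mathfrak{o}}$ are hypothetical placeholders chosen for illustration; they are not computed from the model.

```python
def jump_rate(regime, k, l, q, omega_o, omega_1, nu_1, nu_o):
    """Jump rate r_q^{(regime)}(k, l) between distinct states k and l.

    The string "o" stands for the state attached to the well around p; for
    regime "5" the state space is {"o", "S"}. All constants are plain
    numbers supplied by the caller (hypothetical inputs for this sketch).
    """
    if k == l:
        return 0.0  # no jump from a state to itself
    if regime == "1,2":
        return omega_o / nu_1 if l == "o" else 0.0
    if regime == "2":
        to_o = omega_o / nu_1 if l == "o" else 0.0
        from_o = omega_o / nu_o if k == "o" else 0.0
        return to_o + from_o
    if regime == "2,3":
        return omega_o / (q * nu_1)
    if regime == "3":
        return (omega_o / q + omega_1) / nu_1
    if regime == "3,inf":
        return omega_1 / nu_1
    if regime == "4":
        return omega_o / nu_o if k == "o" else 0.0
    if regime == "5":
        return q * omega_o / nu_o if k == "o" else 0.0
    raise ValueError("unknown regime: " + regime)

# Placeholder constants, chosen only to exercise the formulas.
constants = dict(q=5, omega_o=2.0, omega_1=3.0, nu_1=4.0, nu_o=5.0)
rate_to_center = jump_rate("1,2", 1, "o", **constants)
rate_from_center = jump_rate("1,2", "o", 1, **constants)
```

With these placeholders, `rate_from_center` vanishes, which reflects that $\mathfrak{o}$ is absorbing for ${\bf Y}_{q}^{(1,2)}$.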

Now, we present the metastability result for q5q\geq 5. The new metastable behaviors are observed when β=β3\beta=\beta_{3} and β(β3,q)\beta\in(\beta_{3},q).

Theorem 4.5.

Let q5q\geq 5. Then, the metastable behavior of the process 𝐫N()\bm{r}_{N}(\cdot) is described by

  1. (1)

    $\beta\in(\beta_{1},\beta_{2})$: the process ${\bf Y}_{q}^{(1,2)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$.

  2. (2)

    $\beta=\beta_{2}$: the process ${\bf Y}_{q}^{(2)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$.

  3. (3)

    $\beta\in(\beta_{2},\beta_{3})$: the process ${\bf Y}_{q}^{(2,3)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$ and by the process ${\bf Y}_{q}^{(4)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{\mathfrak{o}}}$.

  4. (4)

    $\beta=\beta_{3}$: the process ${\bf Y}_{q}^{(3)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$ and by the process ${\bf Y}_{q}^{(4)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{\mathfrak{o}}}$.

  5. (5)

    $\beta\in(\beta_{3},q)$: the process ${\bf Y}_{q}^{(3,\infty)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$ and by the process ${\bf Y}_{q}^{(5)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{\mathfrak{o}}}$.

  6. (6)

    $\beta\in[q,\infty)$: the process ${\bf Y}_{q}^{(3,\infty)}(\cdot)$ in the time scale $2\pi Ne^{N\theta_{1}}$.

The proof follows from Theorems 3.4 and 3.6, Proposition 5.3, and [23, Theorem 2.2, Remarks 2.10, 2.11].

Remark 4.6.

Notably, in contrast to the case q4q\leq 4, we can describe the metastable behavior for all β(β1,)\beta\in(\beta_{1},\infty) since the saddle points are nondegenerate when β=β3\beta=\beta_{3}.

Figure 4.1. Illustration of the limiting Markov chains describing the metastable behavior when $q=5$, with panels (a) ${\bf Y}_{q}^{(1,2)}$, (b) ${\bf Y}_{q}^{(2)}$, (c) ${\bf Y}_{q}^{(2,3)}$, (d) ${\bf Y}_{q}^{(3)}$, (e) ${\bf Y}_{q}^{(3,\infty)}$, and (f) ${\bf Y}_{q}^{(4)}$.

We can now provide a more intuitive explanation of Theorem 4.5. See also Figure 4.1 for an illustration of the metastable behavior. Note that if $\beta_{2}<\beta<q$, there are two time scales.

  • ${\bf Y}_{q}^{(1,2)}$: If $\beta_{1}<\beta<\beta_{2}$, in the time scale $2\pi Ne^{N\theta_{1}}$, the process $\bm{r}_{N}(\cdot)$ starting from $\mathcal{E}_{N}^{S}$ reaches $\mathcal{E}_{N}^{\mathfrak{o}}$ and stays there forever. When it goes from $\mathcal{E}_{N}^{k}$, $k\in S$, to $\mathcal{E}_{N}^{\mathfrak{o}}$, it visits the neighborhood of ${\bf v}_{1}^{k}$.

  • ${\bf Y}_{q}^{(2)}$: If $\beta=\beta_{2}$, in the time scale $2\pi Ne^{N\theta_{1}}$, the process $\bm{r}_{N}(\cdot)$ goes around each well in $\mathcal{E}_{N}^{\widehat{S}}$. However, when $\bm{r}_{N}(\cdot)$ starting from $\mathcal{E}_{N}^{k}$, $k\in S$, goes to $\mathcal{E}_{N}^{l}$, $l\in S\setminus\{k\}$, it must pass through $\mathcal{E}_{N}^{\mathfrak{o}}$ and, as in the case $\beta\in(\beta_{1},\beta_{2})$, it visits the neighborhood of ${\bf v}_{1}^{k}$ and then the neighborhood of ${\bf v}_{1}^{l}$.

  • ${\bf Y}_{q}^{(2,3)}$: If $\beta_{2}<\beta<\beta_{3}$, in the time scale $2\pi Ne^{N\theta_{1}}$, the process $\bm{r}_{N}(\cdot)$ starting from $\mathcal{E}_{N}^{S}$ travels through $\mathcal{E}_{N}^{\widehat{S}}$; however, it stays in $\mathcal{E}_{N}^{\mathfrak{o}}$ only for a negligible time. Furthermore, when $\bm{r}_{N}(\cdot)$ goes from $\mathcal{E}_{N}^{k}$, $k\in S$, to $\mathcal{E}_{N}^{l}$, $l\in S\setminus\{k\}$, it must visit $\mathcal{E}_{N}^{\mathfrak{o}}$.

  • ${\bf Y}_{q}^{(3)}$: If $\beta=\beta_{3}$, in the time scale $2\pi Ne^{N\theta_{1}}$, the process $\bm{r}_{N}(\cdot)$ starting from $\mathcal{E}_{N}^{S}$ travels through $\mathcal{E}_{N}^{\widehat{S}}$; however, it stays in $\mathcal{E}_{N}^{\mathfrak{o}}$ only for a negligible time. Furthermore, there are two ways in which $\bm{r}_{N}(\cdot)$ goes from $\mathcal{E}_{N}^{k}$, $k\in S$, to $\mathcal{E}_{N}^{l}$, $l\in S\setminus\{k\}$. First, it goes to $\mathcal{E}_{N}^{l}$ directly and must pass through the neighborhood of ${\bf u}_{2}^{k,\,l}$. Second, it visits $\mathcal{E}_{N}^{\mathfrak{o}}$ and stays there for a negligible period of time.

  • ${\bf Y}_{q}^{(3,\infty)}$: If $\beta>\beta_{3}$, in the time scale $2\pi Ne^{N\theta_{1}}$, the process $\bm{r}_{N}(\cdot)$ starting from $\mathcal{E}_{N}^{S}$ travels through $\mathcal{E}_{N}^{S}$ without visiting $\mathcal{E}_{N}^{\mathfrak{o}}$. As in the case $\beta=\beta_{3}$, it must pass through the neighborhood of ${\bf u}_{2}^{k,\,l}$, $k,l\in S$, when it goes from $\mathcal{E}_{N}^{k}$ to $\mathcal{E}_{N}^{l}$.

  • ${\bf Y}_{q}^{(4)}$: If $\beta_{2}<\beta\leq\beta_{3}$, in the second time scale $2\pi Ne^{N\theta_{\mathfrak{o}}}$, the process $\bm{r}_{N}(\cdot)$ starting from $\mathcal{E}_{N}^{\mathfrak{o}}$ goes to $\mathcal{E}_{N}^{k}$, $k\in S$, and stays there forever. As for ${\bf Y}_{q}^{(1,2)}$, ${\bf Y}_{q}^{(2)}$, and ${\bf Y}_{q}^{(2,3)}$, it passes through the neighborhood of ${\bf v}_{1}^{k}$ when it moves to $\mathcal{E}_{N}^{k}$ from $\mathcal{E}_{N}^{\mathfrak{o}}$.

  • ${\bf Y}_{q}^{(5)}$: If $\beta_{3}<\beta<q$, in the second time scale $2\pi Ne^{N\theta_{\mathfrak{o}}}$, the process $\bm{r}_{N}(\cdot)$ starting from $\mathcal{E}_{N}^{\mathfrak{o}}$ goes to $\mathcal{E}_{N}^{S}$ and stays there forever. This dynamics is similar to ${\bf Y}_{q}^{(4)}$; however, $\mathcal{E}_{N}^{k}$, $k\in S$, are not distinguishable.

4.4. Eyring–Kramers Formulae

In this subsection, we present the Eyring–Kramers formula for the metastable behavior. Before we state the result, we introduce some notation. Let $[\bm{x}]_{N}$ be the point in $\Xi_{N}$ nearest to $\bm{x}\in\Xi$. If there is more than one such point, one of them is chosen arbitrarily. For $\mathcal{A}\subset\Xi$, define $[\mathcal{A}]_{N}$ as

[𝒜]N={[𝒙]N:𝒙𝒜}.[\mathcal{A}]_{N}\,=\,\{[\bm{x}]_{N}\,:\,\bm{x}\in\mathcal{A}\}\ .

Denote by H𝒜H_{\mathcal{A}} the hitting time of the set [𝒜]N[\mathcal{A}]_{N} by the process 𝒓N()\bm{r}_{N}(\cdot):

H𝒜=inf{t>0:𝒓N(t)[𝒜]N}.H_{\mathcal{A}}\,=\,\inf\{t>0\,:\,\bm{r}_{N}(t)\in[\mathcal{A}]_{N}\}\ .

If 𝒜={𝒙}\mathcal{A}=\{\bm{x}\}, we simply write H𝒜=H𝒙H_{\mathcal{A}}=H_{\bm{x}}.

We have the following theorem.

Theorem 4.7.

Let q3q\geq 3. We have the following.

  1. (1)

    For β1<ββ2\beta_{1}<\beta\leq\beta_{2} and kSk\in S, we have

    𝔼𝐮1kN,β[H𝐩]=[1+oN(1)]ν1ω𝔬2πNexp(Nθ1).\mathbb{E}_{{\bf u}_{1}^{k}}^{N,\,\beta}[H_{{\bf p}}]\,=\,[1+o_{N}(1)]\frac{\nu_{1}}{\omega_{\mathfrak{o}}}2\pi N\exp(N\theta_{1})\ .
  2. (2)

    For β2β<q\beta_{2}\leq\beta<q, we have

    $$\mathbb{E}_{{\bf p}}^{N,\,\beta}[H_{\mathcal{U}_{1}}]\,=\,[1+o_{N}(1)]\frac{\nu_{\mathfrak{o}}}{q\omega_{\mathfrak{o}}}2\pi N\exp(N\theta_{\mathfrak{o}})\ .$$
  3. (3)

    For β>β3\beta>\beta_{3} and kSk\in S, we have

    𝔼𝐮1kN,β[H𝒰1{𝐮1k}]=[1+oN(1)]ν1(q1)ω12πNexp(Nθ1).\mathbb{E}_{{\bf u}_{1}^{k}}^{N,\,\beta}[H_{\mathcal{U}_{1}\setminus\{{\bf u}_{1}^{k}\}}]\,=\,[1+o_{N}(1)]\frac{\nu_{1}}{(q-1)\omega_{1}}2\pi N\exp(N\theta_{1})\ .

The proof follows from Theorems 3.4-3.6, Proposition 5.3, and [23, Theorem 2.5, Remarks 2.10, 2.11].
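
The three right-hand sides in Theorem 4.7 share the form $\frac{\nu}{m\,\omega}\,2\pi N\exp(N\theta)$ with $m\in\{1,\,q,\,q-1\}$. The Python sketch below evaluates the logarithm of this leading-order expression, which avoids overflow for large $N$; the function name and the sample values of $\nu$, $\omega$, and $\theta$ are hypothetical.

```python
import math

def log_mean_transition_time(N, theta, nu, omega, multiplicity=1):
    """Logarithm of (nu / (multiplicity * omega)) * 2*pi*N * exp(N*theta).

    This is only the leading-order term of the Eyring-Kramers formula;
    the factor [1 + o_N(1)] is ignored.
    """
    prefactor = math.log(nu / (multiplicity * omega))
    return prefactor + math.log(2.0 * math.pi * N) + N * theta

# Placeholder inputs: N = 100 sites, barrier theta = 0.01, nu = omega = 1.
single_exit = log_mean_transition_time(100, 0.01, 1.0, 1.0)
# With q = 5 equivalent exits, as in statement (2), the time shrinks by 1/q.
five_exits = log_mean_transition_time(100, 0.01, 1.0, 1.0, multiplicity=5)
```

The difference `single_exit - five_exits` equals $\log 5$, which reflects the factor $q$ in the denominator of statement (2).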

5. Preliminary Analysis on Potential and Generator

In this section, we conduct several preliminary analyses. In Section 5.1, we prove Proposition 2.1. In particular, we compute the jump rates of Markov chain 𝒓N()\bm{r}_{N}(\cdot). In Section 5.2, we decompose the generator N\mathscr{L}_{N} into several simple generators N,𝒙i,j\mathscr{L}_{N,\,\bm{x}}^{i,\,j}, 𝒙ΞN\bm{x}\in\Xi_{N}, i,jSi,j\in S. Via this decomposition of N\mathscr{L}_{N}, we can handle N\mathscr{L}_{N} using the method developed in [23] since our model is a special case of [23, Remarks 2.10, 2.11]; this correspondence will be explained in Section 5.3.

5.1. Dynamics of Proportion Vector.

We prove Proposition 2.1 in this section.

Proof of Proposition 2.1.

Let 𝒆jN:=1N𝒆j\bm{e}_{j}^{N}:=\frac{1}{N}\bm{e}_{j}, jSj\in S (cf. Notation 3.1). Fix configurations σ,τΩN\sigma,\tau\in\Omega_{N} such that 𝒓N(σ)=𝒓N(τ)\bm{r}_{N}(\sigma)=\bm{r}_{N}(\tau) and let

𝒙=(x1,,xq1)=𝒓N(σ)=𝒓N(τ)ΞN.\bm{x}=(x_{1},\,\dots,\,x_{q-1})=\bm{r}_{N}(\sigma)=\bm{r}_{N}(\tau)\in\Xi_{N}\ .

For some sites $u_{1},u_{2},v_{1},v_{2}\in K_{N}$ such that $\sigma_{u_{1}}=\sigma_{v_{1}}=\tau_{u_{2}}=\tau_{v_{2}}$, let $i=\sigma_{v_{1}}$. Then, the Markov property of the process $\bm{r}_{N}(t)$ can be inferred from the identity

cu1,j(σ)=cv1,j(σ)=cu2,j(τ)=cv2,j(τ)\displaystyle c_{u_{1},\,j}(\sigma)\,=\,c_{v_{1},\,j}(\sigma)\,=\,c_{u_{2},\,j}(\tau)\,=\,c_{v_{2},\,j}(\tau) =exp{Nβ2[H(𝒓N(σv1,j))H(𝒓N(σ))]}\displaystyle=\,\exp\left\{-\frac{N\beta}{2}[H(\bm{r}_{N}(\sigma^{v_{1},\,j}))-H(\bm{r}_{N}(\sigma))]\right\}
=exp{Nβ2[H(𝒙+𝒆jN𝒆iN)H(𝒙)]},\displaystyle=\,\exp\left\{-\frac{N\beta}{2}[H(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})-H(\bm{x})]\right\}\ ,

for jij\neq i. Hence, 𝒓N()\bm{r}_{N}(\cdot) is a Markov chain.

Since there are NxiNx_{i} sites whose spins are ii, the jump rate RN(,)R_{N}(\cdot,\cdot) of 𝒓N()\bm{r}_{N}(\cdot) can be written as

RN(𝒙,𝒙+𝒆jN𝒆iN)=NxiNexp{Nβ2[H(𝒙+𝒆jN𝒆iN)H(𝒙)]}.R_{N}(\bm{x},\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})\,=\,\frac{Nx_{i}}{N}\exp\left\{-\frac{N\beta}{2}[H(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})-H(\bm{x})]\right\}\ . (5.1)

Hence, the generator N\mathscr{L}_{N} of 𝒓N()\bm{r}_{N}(\cdot) is given as

Nf(𝒙)=i,jSRN(𝒙,𝒙+𝒆jN𝒆iN)[f(𝒙+𝒆jN𝒆iN)f(𝒙)],\mathscr{L}_{N}f(\bm{x})\,=\,\sum_{i,\,j\in S}R_{N}(\bm{x},\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})\,[f(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})-f(\bm{x})]\ ,

for f:ΞNf:\Xi_{N}\to\mathbb{R}.

Finally, this dynamics is reversible with respect to νβN\nu_{\beta}^{N} since we have the following detailed balance condition

νβN(𝒙)RN(𝒙,𝒙+𝒆jN𝒆iN)=νβN(𝒙+𝒆jN𝒆iN)RN(𝒙+𝒆jN𝒆iN,𝒙),\nu_{\beta}^{N}(\bm{x})\,R_{N}(\bm{x},\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})\,=\,\nu_{\beta}^{N}(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})\,R_{N}(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N},\bm{x})\ ,

so that νβN\nu_{\beta}^{N} is the invariant measure. ∎
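
The detailed balance condition can be checked numerically on a small example. The Python sketch below assumes $H(\bm{x})=-\frac{1}{2}\sum_{k}x_{k}^{2}$ and an invariant weight proportional to the multinomial coefficient times $\exp(-\beta NH(\bm{x}))$; both are inferred from the formulas of this paper rather than quoted from Section 2, and all function names are hypothetical. Configurations are stored as integer spin counts $(Nx_{1},\dots,Nx_{q})$.

```python
import math

def energy(counts):
    """H(x) = -1/2 * sum x_k^2 with x = counts / N (assumed form of H)."""
    N = sum(counts)
    return -0.5 * sum((n / N) ** 2 for n in counts)

def log_weight(counts, beta):
    """Log of the unnormalized weight: multinomial * exp(-beta * N * H)."""
    N = sum(counts)
    log_mult = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)
    return log_mult - beta * N * energy(counts)

def move(counts, i, j):
    """One site changes its spin from i to j (0-based indices)."""
    new = list(counts)
    new[i] -= 1
    new[j] += 1
    return new

def rate(counts, i, j, beta):
    """Jump rate (5.1): x_i * exp(-(N*beta/2) * [H(x') - H(x)])."""
    N = sum(counts)
    diff = energy(move(counts, i, j)) - energy(counts)
    return (counts[i] / N) * math.exp(-0.5 * N * beta * diff)

def detailed_balance_gap(counts, i, j, beta):
    """log[nu(x) R(x, x')] - log[nu(x') R(x', x)]; zero when balanced."""
    target = move(counts, i, j)
    lhs = log_weight(counts, beta) + math.log(rate(counts, i, j, beta))
    rhs = log_weight(target, beta) + math.log(rate(target, j, i, beta))
    return lhs - rhs

gap = detailed_balance_gap([7, 5, 8], 0, 2, beta=2.5)
```

Under the stated assumptions, `gap` vanishes up to floating-point error for any admissible move.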

5.2. Cyclic Decomposition

For 1i<jq1\leq i<j\leq q, let γNi,j\gamma_{N}^{i,\,j} be the cycle (𝒆0N,𝒆jN𝒆iN,𝒆0N)(\bm{e}_{0}^{N},\bm{e}_{j}^{N}-\bm{e}_{i}^{N},\bm{e}_{0}^{N}) of length 2 on (/N)q1(\mathbb{Z}/N)^{q-1} and let (γNi,j)𝒙=𝒙+γNi,j(\gamma_{N}^{i,\,j})_{\bm{x}}=\bm{x}+\gamma_{N}^{i,\,j}. Define Ξ^Ni,j\widehat{\Xi}_{N}^{i,\,j} as

Ξ^Ni,j={𝒙ΞN:(γNi,j)𝒙ΞN}={𝒙ΞN:xj1N1,xiN1}.\widehat{\Xi}_{N}^{i,\,j}\,=\,\{\bm{x}\in\Xi_{N}\,:\,(\gamma_{N}^{i,\,j})_{\bm{x}}\subset\Xi_{N}\}\,=\,\{\bm{x}\in\Xi_{N}\,:\,x_{j}\leq 1-N^{-1},\,x_{i}\geq N^{-1}\}\ .

Define the jump rates $\widetilde{R}_{N,\,0}^{i,\,j}$ and $\widetilde{R}_{N,\,1}^{i,\,j}$ associated with this cycle as

R~N, 0i,j(𝒙)\displaystyle\widetilde{R}_{N,\,0}^{i,\,j}(\bm{x}) =exp{Nβ[F¯β,Ni,j(𝒙)Fβ,N(𝒙)]},\displaystyle\,=\,\exp\left\{-N\beta[\overline{F}_{\beta,\,N}^{i,\,j}(\bm{x})-F_{\beta,\,N}(\bm{x})]\right\}\ ,
R~N, 1i,j(𝒙)\displaystyle\widetilde{R}_{N,\,1}^{i,\,j}(\bm{x}) =exp{Nβ[F¯β,Ni,j(𝒙)Fβ,N(𝒙+𝒆jN𝒆iN)]},\displaystyle\,=\,\exp\left\{-N\beta[\overline{F}_{\beta,\,N}^{i,\,j}(\bm{x})-F_{\beta,\,N}(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})]\right\}\ ,

where

F¯β,Ni,j(𝒙)=12[Fβ,N(𝒙)+Fβ,N(𝒙+𝒆jN𝒆iN)].\overline{F}_{\beta,\,N}^{i,\,j}(\bm{x})\,=\,\frac{1}{2}[F_{\beta,\,N}(\bm{x})+F_{\beta,\,N}(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})]\ .

Let $\mathscr{L}_{N,\,\bm{x}}^{i,\,j}$, $\bm{x}\in\widehat{\Xi}_{N}^{i,\,j}$, be a generator acting on $f:\Xi_{N}\to\mathbb{R}$ as

(N,𝒙i,jf)(𝒚)={R~N, 0i,j(𝒙)[f(𝒙+𝒆jN𝒆iN)f(𝒙)]𝒚=𝒙,R~N, 1i,j(𝒙)[f(𝒙)f(𝒙+𝒆jN𝒆iN)]𝒚=𝒙+𝒆jN𝒆iN, 0otherwise.(\mathscr{L}_{N,\,\bm{x}}^{i,\,j}f)(\bm{y})=\begin{cases}\,\widetilde{R}_{N,\,0}^{i,\,j}(\bm{x})\,\Big{[}f(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})-f(\bm{x})\Big{]}&\bm{y}=\bm{x}\ ,\\ \,\widetilde{R}_{N,\,1}^{i,\,j}(\bm{x})\,\Big{[}f(\bm{x})-f(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})\Big{]}&\bm{y}=\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N}\ ,\\ \,0&\text{otherwise}\ .\end{cases} (5.2)

Here, N,𝒙i,j\mathscr{L}_{N,\,\bm{x}}^{i,\,j} can be regarded as a generator of a Markov chain on the cycle (γNi,j)𝒙(\gamma_{N}^{i,\,j})_{\bm{x}}.

Let

wi,j(𝒙):=xixj,andwNi,j(𝒙):=xi(xj+1N).w^{i,\,j}(\bm{x})\,:=\,\sqrt{x_{i}x_{j}}\ ,\ \text{and}\ \ w_{N}^{i,\,j}(\bm{x})\,:=\,\sqrt{x_{i}(x_{j}+\frac{1}{N})}\ .

By (2.3), we can write

exp{βN[Fβ,N(𝒙)H(𝒙)]}=(2πN)(q1)/2(N(Nx1)(Nxq)).\exp\{-\beta N[F_{\beta,\,N}(\bm{x})-H(\bm{x})]\}\,=\,(2\pi N)^{(q-1)/2}{N\choose(Nx_{1})\cdots(Nx_{q})}\ .

By elementary computations, we obtain

RN(𝒙,𝒙+𝒆jN𝒆iN)/R~N, 0i,j(𝒙)=RN(𝒙+𝒆jN𝒆iN,𝒙)/R~N, 1i,j(𝒙)=wNi,j(𝒙),R_{N}(\bm{x},\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})/\widetilde{R}_{N,\,0}^{i,\,j}(\bm{x})\,=\,R_{N}(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N},\bm{x})/\widetilde{R}_{N,\,1}^{i,\,j}(\bm{x})\,=\,w_{N}^{i,\,j}(\bm{x})\ ,

so that

RN(𝒙,𝒙+𝒆jN𝒆iN)[f(𝒙+𝒆jN𝒆iN)f(𝒙)]\displaystyle R_{N}(\bm{x},\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})[f(\bm{x}+\bm{e}_{j}^{N}-\bm{e}_{i}^{N})-f(\bm{x})] =wNi,j(𝒙)N,𝒙i,jf(𝒙)\displaystyle=w_{N}^{i,\,j}(\bm{x})\,\mathscr{L}_{N,\,\bm{x}}^{i,\,j}f(\bm{x})
RN(𝒙,𝒙+𝒆iN𝒆jN)[f(𝒙+𝒆iN𝒆jN)f(𝒙)]\displaystyle R_{N}(\bm{x},\bm{x}+\bm{e}_{i}^{N}-\bm{e}_{j}^{N})[f(\bm{x}+\bm{e}_{i}^{N}-\bm{e}_{j}^{N})-f(\bm{x})] =wNi,j(𝒙+𝒆iN𝒆jN)N,𝒙+𝒆iN𝒆jNi,jf(𝒙).\displaystyle=w_{N}^{i,\,j}(\bm{x}+\bm{e}_{i}^{N}-\bm{e}_{j}^{N})\,\mathscr{L}_{N,\,\bm{x}+\bm{e}_{i}^{N}-\bm{e}_{j}^{N}}^{i,\,j}f(\bm{x})\ .

Hence, by (5.2), we can write

Nf(𝒙)\displaystyle\mathscr{L}_{N}f(\bm{x}) =1i<jq[wNi,j(𝒙)N,𝒙i,j+wNi,j(𝒙+𝒆iN𝒆jN)N,𝒙+𝒆iN𝒆jNi,j]f(𝒙)\displaystyle=\sum_{1\leq i<j\leq q}[w_{N}^{i,\,j}(\bm{x})\,\mathscr{L}_{N,\,\bm{x}}^{i,\,j}+w_{N}^{i,\,j}(\bm{x}+\bm{e}_{i}^{N}-\bm{e}_{j}^{N})\,\mathscr{L}_{N,\,\bm{x}+\bm{e}_{i}^{N}-\bm{e}_{j}^{N}}^{i,\,j}]f(\bm{x})
=1i<jq𝒚Ξ^Ni,jwNi,j(𝒚)N,𝒚i,jf(𝒙).\displaystyle=\sum_{1\leq i<j\leq q}\sum_{\bm{y}\in\widehat{\Xi}_{N}^{i,\,j}}w_{N}^{i,\,j}(\bm{y})\mathscr{L}_{N,\,\bm{y}}^{i,\,j}f(\bm{x})\ . (5.3)

Since wNi,jw_{N}^{i,\,j} converges uniformly to wi,jw^{i,\,j} and is uniformly Lipschitz on every compact subset of intΞ\text{int}\,\Xi, our model is a special case of [23, Remarks 2.10, 2.11] provided that several technical requirements are verified. These requirements will be verified in the next subsection.
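
The identity relating $R_{N}$, $\widetilde{R}_{N,\,0}^{i,\,j}$, $\widetilde{R}_{N,\,1}^{i,\,j}$, and $w_{N}^{i,\,j}$ can be verified numerically. The Python sketch below defines $F_{\beta,\,N}$ through the displayed identity $\exp\{-\beta N[F_{\beta,\,N}-H]\}=(2\pi N)^{(q-1)/2}\binom{N}{(Nx_{1})\cdots(Nx_{q})}$ and assumes $H(\bm{x})=-\frac{1}{2}\sum_{k}x_{k}^{2}$; the function names are hypothetical, and configurations are stored as integer spin counts.

```python
import math

def energy(counts):
    """H(x) = -1/2 * sum x_k^2 with x = counts / N (assumed form of H)."""
    N = sum(counts)
    return -0.5 * sum((n / N) ** 2 for n in counts)

def free_energy_N(counts, beta):
    """F_{beta,N}(x) = H(x) - log[(2 pi N)^{(q-1)/2} * multinomial] / (beta N)."""
    N, q = sum(counts), len(counts)
    log_mult = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)
    log_factor = 0.5 * (q - 1) * math.log(2.0 * math.pi * N) + log_mult
    return energy(counts) - log_factor / (beta * N)

def weight_ratios(counts, i, j, beta):
    """Return (R_N(x,x')/R~_0, R_N(x',x)/R~_1, w_N) for x' = x + e_j^N - e_i^N."""
    N = sum(counts)
    target = list(counts)
    target[i] -= 1
    target[j] += 1
    f_x, f_y = free_energy_N(counts, beta), free_energy_N(target, beta)
    f_bar = 0.5 * (f_x + f_y)
    tilde_0 = math.exp(-N * beta * (f_bar - f_x))
    tilde_1 = math.exp(-N * beta * (f_bar - f_y))
    h_diff = energy(target) - energy(counts)
    forward = (counts[i] / N) * math.exp(-0.5 * N * beta * h_diff)
    backward = (target[j] / N) * math.exp(0.5 * N * beta * h_diff)
    w_N = math.sqrt((counts[i] / N) * (counts[j] / N + 1.0 / N))
    return forward / tilde_0, backward / tilde_1, w_N

ratio_forward, ratio_backward, w_N = weight_ratios([7, 5, 8], 0, 2, beta=2.5)
```

All three returned numbers agree up to rounding, which is the content of the identity used to derive (5.3).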

Remark 5.1.

[23, Section 2] assumes that for $\gamma_{N}^{i,\,j}=(\bm{z}_{0},\bm{z}_{1})$, $\bm{z}_{1}-\bm{z}_{0}$ generates $\mathbb{Z}^{q-1}$; however, this requirement is needed only to make $\bm{r}_{N}(\cdot)$ irreducible. Since $(\gamma_{N}^{i,\,j})_{1\leq i<j\leq q}$ generate $\mathbb{Z}^{q-1}$, we do not need this assumption.

5.3. Requirements for Fβ,NF_{\beta,\,N} and N\mathscr{L}_{N}

In this subsection, we verify that our model is a special case of [23, Remarks 2.10, 2.11].

First, we give some properties of Fβ()F_{\beta}(\cdot) and Gβ,N()G_{\beta,\,N}(\cdot). By the following proposition, Fβ()F_{\beta}(\cdot) and Gβ,N()G_{\beta,\,N}(\cdot) fulfill the requirements in the first paragraph of [23, Section 2].

Proposition 5.2.

The functions Fβ()F_{\beta}(\cdot) and Gβ,N()G_{\beta,\,N}(\cdot) satisfy the following properties.

  1. (1)

    $F_{\beta}$ is twice-differentiable and there are no critical points on $\partial\,\Xi$. For all $\bm{x}\in\partial\Xi$, $\nabla F_{\beta}(\bm{x})\cdot\bm{n}(\bm{x})>0$.

  2. (2)

    The second partial derivatives of Fβ()F_{\beta}(\cdot) are Lipschitz-continuous on every compact subset of int Ξ\text{int }\Xi.

  3. (3)

    On each compact subset of int Ξ\text{int }\Xi, Gβ,N()G_{\beta,\,N}(\cdot) is uniformly Lipschitz and converges uniformly to Gβ(𝒙):=(1/2β)log(x1xq)G_{\beta}(\bm{x}):=(1/2\beta)\log(x_{1}\cdots x_{q}) as NN\to\infty.

  4. (4)

    There are finitely many critical points of Fβ()F_{\beta}(\cdot).

Proof.

(1)-(3) are straightforward. By Lemma 6.2 in Section 6.1, there are finitely many critical points of Fβ()F_{\beta}(\cdot). ∎

Next, fix a saddle point ${\bf s}$. Note that $\nabla^{2}F_{\beta}({\bf s})$ has a unique negative eigenvalue. As in [23, Section 4.3], define $(\mathscr{L}_{N}^{i,\,j})^{{\bf s}}$ as

$$(\mathscr{L}_{N}^{i,\,j})^{{\bf s}}f(\bm{x})\,=\,\frac{1}{N^{2}}(\bm{e}_{j}-\bm{e}_{i})^{\dagger}\nabla^{2}f(\bm{x})(\bm{e}_{j}-\bm{e}_{i})-\frac{1}{N}\mathbb{A}^{i,\,j}\nabla^{2}F_{\beta}({\bf s})(\bm{x}-{\bf s})\cdot\nabla f(\bm{x})\ .$$

Denote by $-\lambda_{1}^{{\bf s}},\lambda_{2}^{{\bf s}},\dots,\lambda_{q-1}^{{\bf s}}$ ($\lambda_{1}^{{\bf s}},\dots,\lambda_{q-1}^{{\bf s}}>0$) the eigenvalues of $\nabla^{2}F_{\beta}({\bf s})$, and by $\bm{a}_{1}^{{\bf s}},\bm{a}_{2}^{{\bf s}},\dots,\bm{a}_{q-1}^{{\bf s}}$ the corresponding eigenvectors. Let $\epsilon_{N}:=N^{-2/5}\gg N^{-1}$ so that $\epsilon_{N}$ satisfies [23, displays (4.7), (4.8)]. Define a neighborhood of ${\bf s}$ as

$$\mathcal{C}_{N}^{{\bf s}}\,\coloneqq\,\bigg\{{\bf s}+\sum_{k=1}^{q-1}x_{k}\bm{a}_{k}^{{\bf s}}:|x_{1}|\leq\epsilon_{N},\,|x_{k}|\leq\sqrt{\frac{2\lambda_{1}^{{\bf s}}}{\lambda_{k}^{{\bf s}}}}\epsilon_{N},\,2\leq k\leq q-1\bigg\}\cap\Xi_{N}\ .$$

Then, by the next proposition, definitions (4.1)-(4.4) are consistent with [23, Remarks 2.10, 2.11].

Proposition 5.3.

For a smooth function f:Ξf:\Xi\to\mathbb{R}, we have uniformly on 𝒞N𝐬\mathcal{C}_{N}^{{\bf s}},

Nf=[1+O(ϵN)]1i<jqwi,j(𝐬)(Ni,j)𝐬f.\mathscr{L}_{N}f\,=\,[1+O(\epsilon_{N})]\sum_{1\leq i<j\leq q}w^{i,\,j}({\bf s})(\mathscr{L}_{N}^{i,\,j})^{{\bf s}}f\ .
Proof.

Since |𝒙𝐬|=O(ϵN)|\bm{x}-{\bf s}|=O(\epsilon_{N}), by (5.2) and the second order Taylor expansion on 𝒞N𝐬\mathcal{C}_{N}^{{\bf s}}, we have

𝒚Ξ^Ni,jN,𝒚i,jf(𝒙)=[1+O(ϵN)](Ni,j)𝐬f(𝒙).\sum_{\bm{y}\in\widehat{\Xi}_{N}^{i,\,j}}\mathscr{L}_{N,\,\bm{y}}^{i,\,j}f(\bm{x})\,=\,[1+O(\epsilon_{N})](\mathscr{L}_{N}^{i,\,j})^{{\bf s}}f(\bm{x})\ .

Hence, on $\mathcal{C}_{N}^{\mathbf{s}}$, since $w_{N}^{i,\,j}(\bm{x})=[1+O(N^{-1})]w^{i,\,j}(\bm{x})=[1+O(\epsilon_{N})]w^{i,\,j}(\mathbf{s})$, we have

$$\begin{aligned}
\mathscr{L}_{N}f(\bm{x})\,&=\,\sum_{1\leq i<j\leq q}\sum_{\bm{y}\in\widehat{\Xi}_{N}^{i,\,j}}w_{N}^{i,\,j}(\bm{y})\mathscr{L}_{N,\,\bm{y}}^{i,\,j}f(\bm{x})\\
&=\,[1+O(\epsilon_{N})]\sum_{1\leq i<j\leq q}w^{i,\,j}(\mathbf{s})\sum_{\bm{y}\in\widehat{\Xi}_{N}^{i,\,j}}\mathscr{L}_{N,\,\bm{y}}^{i,\,j}f(\bm{x})\\
&=\,[1+O(\epsilon_{N})]\sum_{1\leq i<j\leq q}w^{i,\,j}(\mathbf{s})(\mathscr{L}_{N}^{i,\,j})^{\mathbf{s}}f(\bm{x})\ .
\end{aligned}$$

6. Investigation of Critical Points and Temperatures

This section is devoted to the investigation of the critical points and critical temperatures, including their definitions. We provide a preliminary analysis of the critical points in Section 6.1 and of the critical temperatures in Section 6.2.

6.1. Classification of Critical Points

We recall that

Fβ(𝒙)=12k=1qxk2+1βk=1qxklogxk,F_{\beta}(\bm{x})\,=\,-\frac{1}{2}\sum_{k=1}^{q}x_{k}^{2}\,+\,\frac{1}{\beta}\sum_{k=1}^{q}x_{k}\log x_{k}\ ,

and that xq=1(x1++xq1)x_{q}=1-(x_{1}+\cdots+x_{q-1}). For 1kq11\leq k\leq q-1,

xkFβ(𝒙)=(xkxq)+1β(logxklogxq).\frac{\partial}{\partial x_{k}}F_{\beta}(\bm{x})\,=\,-(x_{k}-x_{q})\,+\,\frac{1}{\beta}(\log x_{k}-\log x_{q})\ .

If xkFβ(𝒙)=0\frac{\partial}{\partial x_{k}}F_{\beta}(\bm{x})=0, we must have xk1βlogxk=xq1βlogxqx_{k}-\frac{1}{\beta}\log x_{k}=x_{q}-\frac{1}{\beta}\log x_{q}. Hence,

Fβ(𝒙)=0if and only ifxk1βlogxk, 1kq,are the same.\nabla F_{\beta}(\bm{x})=0\ \text{if and only if}\ x_{k}-\frac{1}{\beta}\log x_{k},\ 1\leq k\leq q,\ \text{are the same}\ . (6.1)

By (6.1), 𝐩=(1/q,, 1/q){\bf p}=(1/q,\,\dots,\,1/q) is a critical point.

By elementary computation, we can check that the equation $t-\frac{1}{\beta}\log t=c$ has at most two positive real solutions for fixed $\beta,\,c>0$. Hence, if $(x_{1},\dots,x_{q})$ is a critical point (recall Notation 3.1), the coordinates $x_{k}$ can take at most two distinct values by (6.1). Hereafter, we assume $\mathbf{c}$ is a critical point and

𝐜=(t,,t,(1jt)/i,,(1jt)/i),{\bf c}\,=\,(t,\,\dots,\,t,(1-jt)/i,\,\dots,\,(1-jt)/i)\ ,

where jj is the number of tt’s and i=qji=q-j. Observe that by symmetry, each permutation of coordinates of 𝐜{\bf c} has the same properties. Without loss of generality, we may assume

1iq/2jq1andt1/q.1\leq i\leq q/2\leq j\leq q-1\ \text{and}\ t\neq 1/q\ .

The point 𝐩{\bf p} will be analyzed separately.

[Figure 6.1. Graphs of $g_{i}(t)$, $h_{i}(t)$, and $h_{i}^{\prime}(t)$ when $q=10$: (a) $i=2$; (b) $i=q/2$.]

By (6.1), we obtain

t1βlogt=1jti1βlog(1jti),t-\frac{1}{\beta}\log t\,=\,\frac{1-jt}{i}-\frac{1}{\beta}\log\Big{(}\frac{1-jt}{i}\Big{)}\ ,

which implies

β=i1qtlog(1jtit)=gi(t).\beta\,=\,\frac{i}{1-qt}\log\Big{(}\frac{1-jt}{it}\Big{)}\,=\,g_{i}(t)\ . (6.2)
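As a sanity check on (6.1) and (6.2), the following minimal Python sketch evaluates $\beta=g_{i}(t)$ for one choice of $t$ and confirms that the resulting two-valued point satisfies the criterion (6.1). The function names and the parameters $q=10$, $i=2$, $t=0.03$ are illustrative assumptions, not taken from the paper.

```python
import math

def g(i, q, t):
    """g_i(t) as in (6.2), with j = q - i; needs 0 < t < 1/j and t != 1/q."""
    j = q - i
    return i / (1.0 - q * t) * math.log((1.0 - j * t) / (i * t))

def candidate(i, q, t):
    """The point (t, ..., t, (1-jt)/i, ..., (1-jt)/i) with j copies of t."""
    j = q - i
    return [t] * j + [(1.0 - j * t) / i] * i

def criterion_values(x, beta):
    """The q numbers x_k - (1/beta) log x_k that (6.1) requires to coincide."""
    return [xk - math.log(xk) / beta for xk in x]

# Illustrative parameters (our choice, not from the paper).
q, i, t = 10, 2, 0.03
beta = g(i, q, t)
c = candidate(i, q, t)
values = criterion_values(c, beta)
spread = max(values) - min(values)  # close to 0 when (6.1) holds
print(f"beta = {beta:.6f}, sum of coordinates = {sum(c):.12f}, spread = {spread:.2e}")
```

A spread at the level of rounding error indicates that all $q$ quantities in (6.1) agree, so the point is critical for this value of $\beta$.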
Lemma 6.1.

Fix $q\geq 3$, $1\leq i\leq q/2$ and $j=q-i$. Then, the function $g_{i}:(0,1/j)\to\mathbb{R}$ attains its minimum at a unique point, say $m_{i}$. Furthermore, if $\beta>g_{i}(m_{i})$, the equation $\beta=g_{i}(t)$ has two solutions.

Proof.

Define $h_{i}:(0,1/j)\to\mathbb{R}$ as follows (like $g_{i}(\cdot)$, the function $h_{i}(\cdot)$ can be continuously extended to $(0,1/j)$):

hi(t)log1jtit+qt1qt(1jt).h_{i}(t)\,\coloneqq\,\log\frac{1-jt}{it}+\frac{qt-1}{qt(1-jt)}\ . (6.3)

By elementary computation, we obtain

gi(t)=qi(1qt)2hi(t)andhi(t)=(qt1)(2jt1)q(1jt)2t2.g_{i}^{\prime}(t)\,=\,\frac{qi}{(1-qt)^{2}}h_{i}(t)\ \ \text{and}\ \ h_{i}^{\prime}(t)\,=\,\frac{(qt-1)(2jt-1)}{q(1-jt)^{2}t^{2}}\ . (6.4)

There are two cases: $i<q/2$ and $i=q/2$. By elementary computation, we can show that the graphs of $g_{i}$, $h_{i}$, and $h_{i}^{\prime}$ are as shown in Figure 6.1, which completes the proof. ∎

For 1iq/21\leq i\leq q/2, let

βs,i=βs,i(q)gi(mi),\beta_{s,\,i}=\beta_{s,\,i}(q)\,\coloneqq\,g_{i}(m_{i})\ , (6.5)

where mim_{i} is the unique minimum of gi()g_{i}(\cdot) given in the above lemma.
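The quantities $m_{i}$ and $\beta_{s,\,i}$ have no closed form in general, but they are easy to approximate. The sketch below is a hedged illustration: it assumes $i<q/2$ and uses the sign pattern of $h_{i}^{\prime}$ from (6.4), which makes $h_{i}$ increasing on $(0,1/(2j))$ with a single zero there, to locate $m_{i}$ by bisection. The helper names `minimizer` and `beta_s` are ours.

```python
import math

def g(i, q, t):
    """g_i(t) as in (6.2), with j = q - i."""
    j = q - i
    return i / (1.0 - q * t) * math.log((1.0 - j * t) / (i * t))

def h(i, q, t):
    """h_i(t) as in (6.3); by (6.4) it has the same sign as g_i'(t)."""
    j = q - i
    return math.log((1.0 - j * t) / (i * t)) + (q * t - 1.0) / (q * t * (1.0 - j * t))

def minimizer(i, q, iters=200):
    """Bisection for the zero m_i of h_i on (0, 1/(2j)); assumes i < q/2.

    On this interval h_i is increasing by (6.4), very negative near 0 and
    positive at 1/(2j), so the zero found here is the minimizer of g_i.
    """
    j = q - i
    lo, hi = 1e-9, 1.0 / (2 * j)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(i, q, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def beta_s(i, q):
    """beta_{s,i}(q) = g_i(m_i) as in (6.5), for i < q/2."""
    return g(i, q, minimizer(i, q))

q = 10  # illustrative value
for i in range(1, 5):
    print(i, minimizer(i, q), beta_s(i, q))
```

For $i=q/2$ the minimizer is $1/q$, where $g_{i}$ is only defined by continuous extension, so that case is deliberately left out of the sketch.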

If $\beta\geq\beta_{s,\,i}$, there are one or two solutions of $\beta=g_{i}(t)$, which will be denoted by $u_{i}=u_{i}(\beta)$ and $v_{i}=v_{i}(\beta)$, where $u_{i}\leq v_{i}$. Let

$$\begin{aligned}
\mathcal{U}_{i}=\mathcal{U}_{i}(\beta)&=\{\text{permutations of }(u_{i},\,\dots,\,u_{i},\,(1-ju_{i})/i,\,\dots,\,(1-ju_{i})/i)\}\ ,\\
\mathcal{V}_{i}=\mathcal{V}_{i}(\beta)&=\{\text{permutations of }(v_{i},\,\dots,\,v_{i},\,(1-jv_{i})/i,\,\dots,\,(1-jv_{i})/i)\}\ ,
\end{aligned}$$

for $\beta\geq\beta_{s,\,i}$. We have the following candidates for the critical points of $F_{\beta}$.

Lemma 6.2.

A critical point of FβF_{\beta} is exactly one of the following cases.

  1. (1)

    𝐩=(1/q,,1/q){\bf p}=(1/q,\dots,1/q).

  2. (2)

    For 1iq/21\leq i\leq q/2 and β(βs,i,)\beta\in(\beta_{s,\,i},\infty), elements of 𝒰i\mathcal{U}_{i}.

  3. (3)

    For 1iq/21\leq i\leq q/2 and β(βs,i,){q}\beta\in(\beta_{s,\,i},\infty)\setminus\{q\}, elements of 𝒱i\mathcal{V}_{i}.

  4. (4)

    For 1i<q/21\leq i<q/2 and β=βs,i\beta=\beta_{s,\,i}, elements of 𝒰i=𝒱i\mathcal{U}_{i}=\mathcal{V}_{i}.

Proof.

By part (1) of Proposition 5.2, points in Ξ\partial\Xi cannot be critical points. Then, the proof follows from (6.1) and Lemma 6.1. ∎

Finally, we have the following results for critical points. The proof for q=3q=3 is given in [24, Proposition 4.2].

Proposition 6.3.

The minima and saddle points of $F_{\beta}$ for $q=3$, $q=4$, and $q\geq 5$ are given by Tables 1, 2, and 3, respectively.

| | $\mathbf{p}$ | $\mathcal{U}_{1}(\beta)$ | $\mathcal{V}_{1}(\beta)$ |
|---|---|---|---|
| $\beta\in(0,\beta_{s,\,1})$ | the only minimum | | |
| $\beta=\beta_{s,\,1}$ | the only minimum | degenerate | degenerate |
| $\beta\in(\beta_{s,\,1},q)$ | local minimum | local minima | saddle points |
| $\beta=q$ | degenerate | local minima | degenerate |
| $\beta\in(q,\infty)$ | local maximum | local minima | saddle points |

Table 1. Classification of critical points when $q=3$

| | $\mathbf{p}$ | $\mathcal{U}_{1}(\beta)$ | $\mathcal{V}_{1}(\beta)$ | $\mathcal{U}_{2}(\beta)=\mathcal{V}_{2}(\beta)$ |
|---|---|---|---|---|
| $\beta\in(0,\beta_{s,\,1})$ | the only minimum | | | |
| $\beta=\beta_{s,\,1}$ | the only minimum | degenerate | degenerate | |
| $\beta\in(\beta_{s,\,1},q)$ | local minimum | local minima | saddle points | |
| $\beta=q$ | degenerate | local minima | degenerate | degenerate |
| $\beta\in(q,\infty)$ | local maximum | local minima | index $\geq 2$ | saddle points |

Table 2. Classification of critical points when $q=4$

| | $\mathbf{p}$ | $\mathcal{U}_{1}(\beta)$ | $\mathcal{V}_{1}(\beta)$ | $\mathcal{U}_{2}(\beta)$ |
|---|---|---|---|---|
| $\beta\in(0,\beta_{s,\,1})$ | the only minimum | | | |
| $\beta=\beta_{s,\,1}$ | the only minimum | degenerate | degenerate | |
| $\beta\in(\beta_{s,\,1},\beta_{s,\,2})$ | local minimum | local minima | saddle points | |
| $\beta=\beta_{s,\,2}$ | local minimum | local minima | saddle points | degenerate |
| $\beta\in(\beta_{s,\,2},q)$ | local minimum | local minima | saddle points | saddle points |
| $\beta=q$ | degenerate | local minima | degenerate | degenerate |
| $\beta\in(q,\infty)$ | local maximum | local minima | index $\geq 2$ | saddle points |

Table 3. Classification of critical points when $q\geq 5$

Section 7 proves the above proposition. With this, all minima and saddle points are classified for every $q\geq 3$.

6.2. Definition of Critical Temperatures

In the previous subsection, we defined several temperatures βs,i\beta_{s,\,i}, 1iq/21\leq i\leq q/2. In this subsection, we prove several properties of such temperatures and moreover introduce new temperatures. Then, we select the critical temperatures at which phase transitions occur.

The first lemma is about the order of βs,i\beta_{s,\,i}.

Lemma 6.4.

We have βs, 1<βs, 2<<βs,q/2\beta_{s,\,1}<\beta_{s,\,2}<\cdots<\beta_{s,\,\lfloor q/2\rfloor}. If qq is even, we have βs,q/2=q\beta_{s,\,q/2}=q and otherwise, βs,q/2<q\beta_{s,\,\lfloor q/2\rfloor}<q.

Proof.

In this proof, we regard ii as a real number and claim that gi(t)g_{i}(t) increases as i[1,q]i\in[1,q] increases for fixed t<1/qt<1/q. By elementary computation, we obtain

ddigi(t)=11qt(log1jtit+it1jt1).\frac{d}{di}g_{i}(t)\,=\,\frac{1}{1-qt}\Big{(}\log\frac{1-jt}{it}+\frac{it}{1-jt}-1\Big{)}\ .

By the inequality x1>logxx-1\,>\,\log x, we can conclude that ddigi(t)>0\frac{d}{di}g_{i}(t)>0. Hence, gi(t)<gi+1(t)g_{i}(t)<g_{i+1}(t) if t<1/qt<1/q.

Hereafter, let ii\in\mathbb{Z}. Suppose i<q/21i<q/2-1. Since mi,mi+1<1/qm_{i},\,m_{i+1}<1/q, we obtain

βs,i=gi(mi)gi(mi+1)<gi+1(mi+1)=βs,i+1,\beta_{s,\,i}=g_{i}(m_{i})\leq g_{i}(m_{i+1})<g_{i+1}(m_{i+1})=\beta_{s,\,i+1}\ ,

by the above claim. The first inequality holds since mim_{i} is a minimum of gig_{i}. If i=q/21i=q/2-1, since mi<1/q=mi+1m_{i}<1/q=m_{i+1}, we obtain

βs,i=gi(mi)<gi(mi+1)=q=βs,q/2.\beta_{s,\,i}=g_{i}(m_{i})<g_{i}(m_{i+1})=q=\beta_{s,\,q/2}\ .

If $i<q/2$, we have $m_{i}<1/q$ so that $\beta_{s,\,i}<g_{i}(1/q)=q$. This, together with the above argument, proves the second assertion. ∎

Remark.

In particular, by the above lemma, we have βs, 1<βs, 2=q\beta_{s,\,1}<\beta_{s,\,2}=q for q=4q=4 and βs, 1<βs, 2<q\beta_{s,\,1}<\beta_{s,\,2}<q for q5q\geq 5.
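The key step in the proof of Lemma 6.4 is the closed form for $\frac{d}{di}g_{i}(t)$ with $i$ treated as a real variable. The following sketch compares that closed form with a central finite difference and checks its positivity for a few sample points with $t<1/q$. The sample grid, the step size, and the function names are assumptions made for illustration only.

```python
import math

def g(i, q, t):
    """g_i(t) from (6.2) with i treated as a real number and j = q - i."""
    j = q - i
    return i / (1.0 - q * t) * math.log((1.0 - j * t) / (i * t))

def dg_di(i, q, t):
    """Closed form for d/di g_i(t) stated in the proof of Lemma 6.4."""
    j = q - i
    return (math.log((1.0 - j * t) / (i * t)) + i * t / (1.0 - j * t) - 1.0) / (1.0 - q * t)

def central_difference(i, q, t, step=1e-5):
    """Numerical derivative of g in the variable i."""
    return (g(i + step, q, t) - g(i - step, q, t)) / (2.0 * step)

q = 10  # illustrative value
samples = [(i, t) for i in (1.0, 2.5, 4.0) for t in (0.01, 0.05, 0.09)]
for i, t in samples:
    print(i, t, dg_di(i, q, t), central_difference(i, q, t))
```

Positivity follows from $x-1>\log x$ applied to $x=it/(1-jt)$, which differs from $1$ exactly when $t\neq 1/q$.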

The relative order of the heights of the critical points changes as $\beta$ varies, and the phase transitions are due to this fact. We now explain when and how this order changes. Since the proofs are technical, they are postponed to Section 8.

Order of heights of 𝐩{\bf p} and 𝒰1\mathcal{U}_{1}

Define βc\beta_{c} as

βc(q):=2(q1)q2log(q1),\beta_{c}(q)\,:=\,\frac{2(q-1)}{q-2}\log(q-1)\ , (6.6)

which is introduced in [13, display (3.3)]. Then, we obtain the following.

Lemma 6.5.

For q3q\geq 3, we have βs, 1<βc\beta_{s,\,1}<\beta_{c} and for q4q\geq 4, we have βs, 1<βc<βs, 2\beta_{s,\,1}<\beta_{c}<\beta_{s,\,2}.

The proof of the lemma is given in Section 8.1. The following lemma is an important property of βc\beta_{c}.

Lemma 6.6.

Let q3q\geq 3. Then, we have

{Fβ(𝐩)<Fβ(𝐮1)if β(βs, 1,βc),Fβ(𝐩)=Fβ(𝐮1)if β=βc,Fβ(𝐩)>Fβ(𝐮1)if β(βc,).\begin{cases}F_{\beta}({\bf p})<F_{\beta}({\bf u}_{1})&\text{if }\beta\in(\beta_{s,\,1},\beta_{c})\ ,\\ F_{\beta}({\bf p})=F_{\beta}({\bf u}_{1})&\text{if }\beta=\beta_{c}\ ,\\ F_{\beta}({\bf p})>F_{\beta}({\bf u}_{1})&\text{if }\beta\in(\beta_{c},\infty)\ .\end{cases} (6.7)

This result is the same as [13, Theorem 3.1(b)]. The proof is provided in [13, Appendices A, B] via convex-duality.
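The equality case of (6.7) can be checked numerically. The sketch below relies on the identity $\beta_{c}=g_{1}\big(\frac{1}{q(q-1)}\big)$ used in the proof of Lemma 6.5, and on the assumption that $\frac{1}{q(q-1)}$ is the smaller of the two solutions, so that it equals $u_{1}(\beta_{c})$. The function names are ours.

```python
import math

def F(x, beta):
    """F_beta(x) = -(1/2) sum x_k^2 + (1/beta) sum x_k log x_k."""
    return -0.5 * sum(v * v for v in x) + sum(v * math.log(v) for v in x) / beta

def beta_c(q):
    """The value defined in (6.6)."""
    return 2.0 * (q - 1) / (q - 2) * math.log(q - 1)

def u1_at_beta_c(q):
    """Assumed representative of U_1 at beta_c: q-1 coordinates equal to 1/(q(q-1))."""
    t = 1.0 / (q * (q - 1))
    return [t] * (q - 1) + [1.0 - (q - 1) * t]

for q in range(3, 9):
    b = beta_c(q)
    p = [1.0 / q] * q
    print(q, b, F(p, b), F(u1_at_beta_c(q), b))
```

For each $q$ the two printed heights should agree up to rounding error, which is the middle case of (6.7).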

We may assume that β\beta increases from a very small positive number. Observe that the elements of 𝒰1\mathcal{U}_{1} and 𝒱1\mathcal{V}_{1} simultaneously appear when β=βs, 1\beta=\beta_{s,\,1} and the elements of 𝒰2\mathcal{U}_{2} appear when β=βs, 2\beta=\beta_{s,\,2} . By the above two lemmas, before the appearance of critical points in 𝒰2\mathcal{U}_{2}, the heights of 𝐩{\bf p} and 𝐮1{\bf u}_{1} are reversed.

Order of heights of 𝒱1\mathcal{V}_{1} and 𝒰2\mathcal{U}_{2}

We have the following lemma about the heights of 𝐮2{\bf u}_{2} and 𝐯1{\bf v}_{1}. The critical temperature βm\beta_{m} given in the following lemma is the crucial development of this article.

Lemma 6.7.

Let q5q\geq 5. We have a critical temperature βm(βs, 2,q)\beta_{m}\in(\beta_{s,\,2},q) such that

{Fβ(𝐯1)<Fβ(𝐮2)if βs, 2β<βm,Fβ(𝐯1)=Fβ(𝐮2)if β=βm,Fβ(𝐯1)>Fβ(𝐮2)if βm<βq.\begin{cases}F_{\beta}({\bf v}_{1})<F_{\beta}({\bf u}_{2})&\text{if }\beta_{s,\,2}\leq\beta<\beta_{m}\ ,\\ F_{\beta}({\bf v}_{1})=F_{\beta}({\bf u}_{2})&\text{if }\beta=\beta_{m}\ ,\\ F_{\beta}({\bf v}_{1})>F_{\beta}({\bf u}_{2})&\text{if }\beta_{m}<\beta\leq q\ .\end{cases} (6.8)

The proof of the lemma is given in Section 8.2.

Up to this point, we have obtained four critical values

0<βs, 1<βc<βs, 2<βm<q,0<\beta_{s,\,1}<\beta_{c}<\beta_{s,\,2}<\beta_{m}<q\ ,

when $q\geq 5$. If $q=4$, we have $\beta_{s,\,2}=q$, and if $q=3$, $\beta_{s,\,2}$ is not defined. Thus, if $q\leq 4$, define $\beta_{m}=q$ so that

0<βs, 1<βc<βm=q.0<\beta_{s,\,1}<\beta_{c}<\beta_{m}=q\ .

We conclude this section with the definition of the critical temperatures at which the phase transitions occur. We can now define critical temperatures β1,β2,β3\beta_{1},\beta_{2},\beta_{3} appearing in Section 3.2. The critical temperatures are given by

β1(q)βs, 1(q),β2(q)βc(q),β3(q)βm(q).\beta_{1}(q)\coloneqq\beta_{s,\,1}(q),\ \ \beta_{2}(q)\coloneqq\beta_{c}(q),\ \ \beta_{3}(q)\coloneqq\beta_{m}(q)\ . (6.9)

7. Critical Points of FβF_{\beta}

In this section, we prove Proposition 6.3 for q4q\geq 4. For the case q=3q=3, we refer to [24] and we will only highlight the difference.

7.1. Eigenvalues of Hessian of FβF_{\beta} at Critical Points

First, we investigate $\mathbf{p}=(1/q,\,\dots,\,1/q)$, which is a critical point for all $\beta>0$. The following lemma describes the nature of $\mathbf{p}$.

Lemma 7.1.

The point 𝐩{\bf p} is a local minimum of FβF_{\beta} if β<q\beta<q, a local maximum of FβF_{\beta} if β>q\beta>q, and a degenerate critical point when β=q\beta=q.

Proof.

Let 𝟙=(1,,1)\mathbbm{1}=(1,\dots,1)^{\dagger} be a (q1)×1(q-1)\times 1 matrix. By elementary computation, we obtain

2Fβ(𝐩)=qββ(diag(1,,1)+𝟙𝟙)\nabla^{2}F_{\beta}({\bf p})\,=\,\frac{q-\beta}{\beta}\Big{(}\text{diag}(1,\dots,1)+\mathbbm{1}\mathbbm{1}^{\dagger}\Big{)}

whose eigenvalues are $(q-\beta)/\beta$ with multiplicity $q-2$ and $q(q-\beta)/\beta$ with multiplicity $1$. This completes the proof. ∎
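The Hessian formula at $\mathbf{p}$ and its eigenvalues can be checked with a short computation. The sketch below works in the reduced coordinates $(x_{1},\dots,x_{q-1})$, compares a finite-difference Hessian with $\frac{q-\beta}{\beta}(\mathbb{I}+\mathbb{1}\mathbb{1}^{\dagger})$, and verifies two eigenvector relations. The values $q=5$, $\beta=3$, the step size, and all function names are illustrative assumptions.

```python
import math

def F_reduced(y, beta):
    """F_beta in the coordinates (x_1, ..., x_{q-1}), with x_q = 1 - sum(y)."""
    x = list(y) + [1.0 - sum(y)]
    return -0.5 * sum(v * v for v in x) + sum(v * math.log(v) for v in x) / beta

def hessian_formula(q, beta):
    """(q - beta)/beta * (I + 1 1^T), the matrix stated for the point p."""
    c = (q - beta) / beta
    n = q - 1
    return [[c * (2.0 if k == l else 1.0) for l in range(n)] for k in range(n)]

def hessian_fd(y, beta, step=1e-4):
    """Central finite-difference Hessian of F_reduced at y."""
    n = len(y)
    def shifted(k, sk, l, sl):
        z = list(y)
        z[k] += sk * step
        z[l] += sl * step
        return F_reduced(z, beta)
    return [[(shifted(k, 1, l, 1) - shifted(k, 1, l, -1)
              - shifted(k, -1, l, 1) + shifted(k, -1, l, -1)) / (4.0 * step * step)
             for l in range(n)] for k in range(n)]

def matvec(M, v):
    return [sum(M[k][l] * v[l] for l in range(len(v))) for k in range(len(M))]

q, beta = 5, 3.0  # illustrative values
H = hessian_formula(q, beta)
H_fd = hessian_fd([1.0 / q] * (q - 1), beta)
max_diff = max(abs(H[k][l] - H_fd[k][l]) for k in range(q - 1) for l in range(q - 1))

ones = [1.0] * (q - 1)               # eigenvector for q(q - beta)/beta
diff12 = [1.0, -1.0] + [0.0] * (q - 3)  # eigenvector for (q - beta)/beta
print(max_diff, matvec(H, ones), matvec(H, diff12))
```

Since both eigenvalues carry the sign of $q-\beta$, the sign change at $\beta=q$ is visible directly from the printed vectors.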

Now, for $i\in[1,q/2]\cap\mathbb{N}$, $j=q-i$, and $\beta=g_{i}(t)$, define $a\in\mathbb{R}$ and $b\in\mathbb{R}$ as

$$a=a(i,t)=-1+\frac{1}{\beta t}\ ,\ \ b=b(i,t)=-1+\frac{i}{\beta(1-jt)}\ . \tag{7.1}$$

We have the following lemma about eigenvalues of Hessian of FβF_{\beta} at critical points.

Lemma 7.2.

Let i[1,q/2]i\in[1,q/2]\cap\mathbb{N} and j=qij=q-i. Moreover, let t(0,1/j)t\in(0,1/j) and β=gi(t)\beta=g_{i}(t). Then, 𝐜=(t,,t,(1jt)/i,,(1jt)/i){\bf c}=(t,\dots,t,(1-jt)/i,\dots,(1-jt)/i) is a critical point of FβF_{\beta} and eigenvalues of 2Fβ(𝐜)\nabla^{2}F_{\beta}({\bf c}) constitute one of the following cases.

  1. (1)

    If $i\geq 2$, the eigenvalues of $\nabla^{2}F_{\beta}(\mathbf{c})$ are $a$ and $b$ with multiplicities $j-1$ and $i-2$, respectively, together with the two roots of $\lambda^{2}-(a+qb)\lambda+b(ia+jb)$.

  2. (2)

    If $i=1$, the eigenvalues of $\nabla^{2}F_{\beta}(\mathbf{c})$ are $a$ with multiplicity $j-1$ and $a+(q-1)b$ with multiplicity $1$.

Proof.

By Lemma 6.2, $\mathbf{c}$ is a critical point of $F_{\beta}$ since $\beta=g_{i}(t)$. By elementary computation, for $1\leq k,\,l\leq q-1$ with $k\neq l$, we have

$$\begin{aligned}
\frac{\partial^{2}}{\partial x_{k}^{2}}F_{\beta}(\bm{x})\,&=\,-1+\frac{1}{\beta x_{k}}+\Big(-1+\frac{1}{\beta x_{q}}\Big)\ ,\\
\frac{\partial^{2}}{\partial x_{k}\partial x_{l}}F_{\beta}(\bm{x})\,&=\,-1+\frac{1}{\beta x_{q}}\ ,
\end{aligned}$$

so that

2xkxlFβ(𝐜)={1+1βt+(1+iβ(1jt))if 1k=lj2(1+iβ(1jt))if j+1k=lq11+iβ(1jt)if kl\frac{\partial^{2}}{\partial x_{k}\partial x_{l}}F_{\beta}({\bf c})=\begin{cases}-1+\frac{1}{\beta t}+(-1+\frac{i}{\beta(1-jt)})&\text{if }1\leq k=l\leq j\\ 2(-1+\frac{i}{\beta(1-jt)})&\text{if }j+1\leq k=l\leq q-1\\ -1+\frac{i}{\beta(1-jt)}&\text{if }k\neq l\end{cases}

Then, we can write 2Fβ(𝐜)\nabla^{2}F_{\beta}({\bf c}) as

2Fβ(𝐜)=𝔻+b𝟙𝟙,\nabla^{2}F_{\beta}({\bf c})\,=\,\mathbb{D}+b\mathbbm{1}\mathbbm{1}^{\dagger}\ ,

where

𝔻=diag(a,,a𝑗,b,,bq1j).\mathbb{D}\,=\,\text{diag}(\underset{j}{\underbrace{a,\dots,a}},\underset{q-1-j}{\underbrace{b,\dots,b}})\ .

Let 𝕀=𝕀q1\mathbb{I}=\mathbb{I}_{q-1} be a (q1)(q-1)-identity matrix. By the formula

$$\det(A+\bm{v}\bm{w}^{\dagger})=\det A\,(1+\bm{w}^{\dagger}A^{-1}\bm{v})\ ,$$

we can write

det(2Fβ(𝐜)λ𝕀)=det(𝔻λ𝕀+b𝟙𝟙)=(aλ)j(bλ)i1[1+b(jaλ+i1bλ)].\det(\nabla^{2}F_{\beta}({\bf c})-\lambda\mathbb{I})\,=\,\det(\mathbb{D}-\lambda\mathbb{I}+b\mathbbm{1}\mathbbm{1}^{\dagger})\,=\,(a-\lambda)^{j}(b-\lambda)^{i-1}\Big{[}1+b\big{(}\frac{j}{a-\lambda}+\frac{i-1}{b-\lambda}\big{)}\Big{]}\ .

Hence, we obtain

det(2Fβ(𝐜)λ𝕀)={(aλ)j1(bλ)i2(λ2(a+qb)λ+b(ia+jb))if i2,(aλ)j1(a+jbλ)=(aλ)q2(a+(q1)bλ)if i=1.\det(\nabla^{2}F_{\beta}({\bf c})-\lambda\mathbb{I})=\begin{cases}(a-\lambda)^{j-1}(b-\lambda)^{i-2}(\lambda^{2}-(a+qb)\lambda+b(ia+jb))&\text{if }i\geq 2\ ,\\ (a-\lambda)^{j-1}(a+jb-\lambda)=(a-\lambda)^{q-2}(a+(q-1)b-\lambda)&\text{if }i=1\ .\end{cases}

The lemma follows directly from this explicit computation of the characteristic polynomial of the Hessian $\nabla^{2}F_{\beta}(\mathbf{c})$. ∎
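The factorization of the characteristic polynomial can be checked numerically. The sketch below builds the matrix $\mathbb{D}+b\mathbb{1}\mathbb{1}^{\dagger}$, evaluates $\det(\nabla^{2}F_{\beta}(\mathbf{c})-\lambda\mathbb{I})$ by Gaussian elimination at a few values of $\lambda$, and compares the result with the product formula. The parameter triples and the sample values of $\lambda$ are arbitrary illustrative choices, and the helper names are ours.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def hessian(i, q, t):
    """D + b 1 1^T with D = diag(a (j times), b (i-1 times)) and beta = g_i(t)."""
    j = q - i
    beta = i / (1.0 - q * t) * math.log((1.0 - j * t) / (i * t))
    a = -1.0 + 1.0 / (beta * t)
    b = -1.0 + i / (beta * (1.0 - j * t))
    diag = [a] * j + [b] * (i - 1)
    n = q - 1
    H = [[b + (diag[k] if k == l else 0.0) for l in range(n)] for k in range(n)]
    return H, a, b

def char_poly_formula(i, q, a, b, lam):
    """Right-hand side of the factorization in the proof of Lemma 7.2."""
    j = q - i
    if i >= 2:
        quad = lam * lam - (a + q * b) * lam + b * (i * a + j * b)
        return (a - lam) ** (j - 1) * (b - lam) ** (i - 2) * quad
    return (a - lam) ** (q - 2) * (a + (q - 1) * b - lam)

def char_poly_direct(H, lam):
    n = len(H)
    return det([[H[k][l] - (lam if k == l else 0.0) for l in range(n)] for k in range(n)])

cases = [(3, 7, 0.05), (2, 5, 0.1), (1, 4, 0.2)]  # (i, q, t), illustrative
lambdas = (-1.3, 0.2, 2.7)
for i, q, t in cases:
    H, a, b = hessian(i, q, t)
    for lam in lambdas:
        print(i, q, lam, char_poly_direct(H, lam), char_poly_formula(i, q, a, b, lam))
```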

We have the following lemma about the sign of the eigenvalues of 2Fβ(𝐜)\nabla^{2}F_{\beta}({\bf c}). Recall the definition of mim_{i} from Lemma 6.1.

Lemma 7.3.

Let $i\in[1,q/2]\cap\mathbb{N}$ and $j=q-i$. Moreover, let $t\in(0,1/j)$ and $\beta=g_{i}(t)$. Then, we have the following table regarding the sign of each value. If $i=q/2$, we ignore the columns $t=m_{i}$ and $t\in(m_{i},1/q)$.

| | $t\in(0,m_{i})$ | $t=m_{i}$ | $t\in(m_{i},1/q)$ | $t=1/q$ | $t\in(1/q,1/j)$ |
|---|---|---|---|---|---|
| $a$ | $+$ | $+$ | $+$ | $0$ | $-$ |
| $b$ | $-$ | $-$ | $-$ | $0$ | $+$ |
| $ia+jb$ | $+$ | $0$ | $-$ | $0$ | $+$ |
| $b(ia+jb)$ | $-$ | $0$ | $+$ | $0$ | $+$ |

Proof.

First, suppose that t<1/qt<1/q. Then,

a>0\displaystyle a>0\, 1t>β=i1qtlog(1jtit)1qtit>log(1jtit).\displaystyle\Longleftrightarrow\,\frac{1}{t}>\beta=\frac{i}{1-qt}\log\Big{(}\frac{1-jt}{it}\Big{)}\Leftrightarrow\,\frac{1-qt}{it}>\log\Big{(}\frac{1-jt}{it}\Big{)}\ .

By substituting $x=(1-jt)/(it)$, the last inequality reads $x-1>\log x$, which holds for every $x\neq 1$, that is, for every $t\neq 1/q$; hence, $a>0$. Moreover, by the same argument, we have $b<0$. In the same manner, if $t>1/q$, we obtain $a<0$ and $b>0$.

Now, we investigate the sign of ia+jbia+jb. We write

ia+jb=i+iβtj+ijβ(1jt)=q+iβt(1jt).ia+jb\,=\,-i+\frac{i}{\beta t}-j+\frac{ij}{\beta(1-jt)}\,=\,-q+\frac{i}{\beta t(1-jt)}\ .

By elementary computation, ia+jb=0ia+jb=0 if t=1/qt=1/q. Hence, ia+jb>0ia+jb>0 if and only if

iqt(1jt)>β=i1qtlog(1jtit).\frac{i}{qt(1-jt)}\,>\,\beta\,=\,\frac{i}{1-qt}\log\Big{(}\frac{1-jt}{it}\Big{)}\ .

First, assume t<1/qt<1/q. Then, ia+jb>0ia+jb>0 if and only if

hi(t)=log(1jtit)+qt1qt(1jt)< 0.h_{i}(t)\,=\,\log\Big{(}\frac{1-jt}{it}\Big{)}+\frac{qt-1}{qt(1-jt)}\,<\,0\ .

By investigating the graph of $h_{i}$ (cf. Figure 6.1), the above inequality holds if and only if $t<m_{i}$. Second, assume $t>1/q$. Then, $ia+jb>0$ if and only if $h_{i}(t)>0$, which holds for every $t\in(1/q,1/j)$. Hence, $ia+jb>0$ if and only if $t<m_{i}$ or $t>1/q$.

The case when t=1/qt=1/q can be proven by the argument in the first paragraph of this proof. If t=mit=m_{i}, then ia+jb=0ia+jb=0 since hi(mi)=0h_{i}(m_{i})=0. The above argument can prove the case when i=q/2i=q/2 since mq/2=1/qm_{q/2}=1/q. ∎

7.2. Relevant Critical Points of FβF_{\beta}

In this and the next subsection, we classify nondegenerate critical points. When we consider critical points in 𝒰i\mathcal{U}_{i} or 𝒱i\mathcal{V}_{i} , we assume that β>βs,i\beta>\beta_{s,\,i} since when β=βs,i\beta=\beta_{s,\,i}, the elements of 𝒰i=𝒱i\mathcal{U}_{i}=\mathcal{V}_{i} are degenerate. The case when β=βs,i\beta=\beta_{s,\,i} is treated in Section 7.4.

By Morse theory, critical points with at least two negative eigenvalues can be neither saddle points nor minima. Hence, only the critical points with only positive eigenvalues, or with exactly one negative eigenvalue and $q-2$ positive eigenvalues, are relevant to the landscape of $F_{\beta}$. We select these critical points in this subsection.

As in (7.1), for $i\in[1,q/2]\cap\mathbb{N}$, $j=q-i$, and $\beta>\beta_{s,\,i}$, when we consider $\mathbf{u}_{i}\in\mathcal{U}_{i}$, let

$$a=a(\mathbf{u}_{i}):=-1+\frac{1}{\beta u_{i}}\ ,\ \ b=b(\mathbf{u}_{i}):=-1+\frac{i}{\beta(1-ju_{i})}\ ,$$

and when we consider $\mathbf{v}_{i}\in\mathcal{V}_{i}$, let

$$a=a(\mathbf{v}_{i}):=-1+\frac{1}{\beta v_{i}}\ ,\ \ b=b(\mathbf{v}_{i}):=-1+\frac{i}{\beta(1-jv_{i})}\ .$$
Lemma 7.4.

Let $q\geq 4$. If $\beta>\beta_{s,\,1}$, $\mathcal{U}_{1}$ is a set of local minima. If $\beta>\beta_{s,\,2}$, $\mathcal{U}_{2}$ is a set of saddle points. If $\beta_{s,\,1}<\beta<q$, $\mathcal{V}_{1}$ is a set of saddle points, whereas if $\beta>q$, the Hessian at each point in $\mathcal{V}_{1}$ has at least two negative eigenvalues.

Proof.

Consider $\mathbf{u}_{1}\in\mathcal{U}_{1}$. The eigenvalues of $\nabla^{2}F_{\beta}(\mathbf{u}_{1})$ are $a$ with multiplicity $q-2$ and $a+(q-1)b$ with multiplicity $1$. By Lemma 7.3, if $\beta>\beta_{s,\,1}$, then since $u_{1}<m_{1}<1/q$, we obtain $a,\,a+(q-1)b>0$; hence, $\mathbf{u}_{1}$ is a local minimum.

Next, consider $\mathbf{v}_{1}\in\mathcal{V}_{1}$. The eigenvalues of $\nabla^{2}F_{\beta}(\mathbf{v}_{1})$ are $a$ with multiplicity $q-2$ and $a+(q-1)b$ with multiplicity $1$. By Lemma 7.3, if $\beta_{s,\,1}<\beta<q$, then since $m_{1}<v_{1}<1/q$, we obtain $a>0$ and $a+(q-1)b<0$; hence, it is a saddle point. If $\beta>q$, then since $v_{1}>1/q$, we obtain $a<0$ and $a+(q-1)b>0$, so that $\nabla^{2}F_{\beta}(\mathbf{v}_{1})$ has $q-2\geq 2$ negative eigenvalues.

Finally, let $i\geq 2$, $j=q-i$, and $\beta>\beta_{s,\,i}$. In this case, $\nabla^{2}F_{\beta}(\mathbf{u}_{i})$ has eigenvalues $a$ and $b$ with multiplicities $j-1$ and $i-2$, respectively, and the two roots of $\lambda^{2}-(a+qb)\lambda+b(ia+jb)$. Since $u_{i}<m_{i}\leq 1/q$ for all $i$ and $\beta>\beta_{s,\,i}$, by Lemma 7.3, $a>0$, $b<0$, and $b(ia+jb)<0$, so that it has $j$ positive eigenvalues and $i-1$ negative eigenvalues. Hence, taking $i=2$, $\mathbf{u}_{2}$ is a saddle point. ∎

Remark 7.5.

For $q=3$, by the same argument, $\nabla^{2}F_{\beta}(\mathbf{v}_{1})$ has exactly one negative eigenvalue and one positive eigenvalue for $\beta\in(\beta_{s,\,1},\infty)\setminus\{q\}$.

7.3. Irrelevant Critical Points of FβF_{\beta}

In this subsection, we eliminate unneeded critical points.

Lemma 7.6.

Let $q\geq 5$. For $i\in[3,q/2]\cap\mathbb{N}$ and $\beta>\beta_{s,\,i}$, the Hessian at each point in $\mathcal{U}_{i}$ has at least two negative eigenvalues. Moreover, for $i\in[2,q/2]\cap\mathbb{N}$ and $\beta\in(\beta_{s,\,i},\infty)\setminus\{q\}$, the Hessian at each point in $\mathcal{V}_{i}$ has at least two negative eigenvalues.

Proof.

By the proof of Lemma 7.4, $\mathbf{u}_{i}$ for $i\geq 3$ has at least two negative eigenvalues. Now, let $i\geq 2$, $j=q-i$, and $\beta\in(\beta_{s,\,i},\infty)\setminus\{q\}$. In this case, the Hessian at each point in $\mathcal{V}_{i}$ has eigenvalues $a$ and $b$ with multiplicities $j-1$ and $i-2$, respectively, and the two roots of $\lambda^{2}-(a+qb)\lambda+b(ia+jb)$. If $\beta_{s,\,i}<\beta<q$, then $v_{i}<1/q$ so that $a>0$, $b<0$, and $b(ia+jb)>0$. In this case,

a+qb=ia+jb+(1i)a+(qj)b<ia+jb< 0,a+qb\,=\,ia+jb+(1-i)a+(q-j)b\,<\,ia+jb\,<\,0\ ,

so that the two roots of λ2(a+qb)λ+b(ia+jb)\lambda^{2}-(a+qb)\lambda+b(ia+jb) are negative. Hence, it has j1j-1 positive eigenvalues and ii negative eigenvalues. If β>q\beta>q, then vi>1/qv_{i}>1/q so that a<0a<0, and points in 𝒱i\mathcal{V}_{i} have at least j1j-1 negative eigenvalues, where j12j-1\geq 2 since q5q\geq 5. ∎

Lemma 7.7.

Let q=4q=4 and βq\beta\geq q. Then, we have 𝒱2=𝒰2\mathcal{V}_{2}=\mathcal{U}_{2}.

Proof.

Observe that βs, 2=q\beta_{s,\,2}=q. If β=q\beta=q, 𝒱2=𝒰2\mathcal{V}_{2}=\mathcal{U}_{2} since there is only one solution m2m_{2} to q=g2(t)q=g_{2}(t). Suppose β>q\beta>q. By elementary computation, we obtain

g2(14t)=g2(14+t)fort[0,14),g_{2}\Big{(}\frac{1}{4}-t\Big{)}=g_{2}\Big{(}\frac{1}{4}+t\Big{)}\ \ \ \text{for}\ t\in\Big{[}0,\frac{1}{4}\Big{)}\ ,

so that $v_{2}=(1/2)-u_{2}$. Hence, $\mathbf{v}_{2}=(v_{2},v_{2},u_{2},u_{2})$ is a permutation of $\mathbf{u}_{2}=(u_{2},u_{2},v_{2},v_{2})$, that is, each element of $\mathcal{V}_{2}$ is an element of $\mathcal{U}_{2}$, so that $\mathcal{V}_{2}=\mathcal{U}_{2}$. ∎
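The reflection symmetry of $g_{2}$ for $q=4$ used in this proof is easy to confirm numerically. The sketch below evaluates $g_{2}$ at mirrored points around $1/4$; the offsets are arbitrary illustrative choices and the function name `g2` is ours.

```python
import math

def g2(t):
    """g_2 for q = 4 (so i = j = 2), for t in (0, 1/2) with t != 1/4."""
    return 2.0 / (1.0 - 4.0 * t) * math.log((1.0 - 2.0 * t) / (2.0 * t))

offsets = (0.01, 0.1, 0.2)  # illustrative values of t in [0, 1/4)
pairs = [(g2(0.25 - s), g2(0.25 + s)) for s in offsets]
for s, (left, right) in zip(offsets, pairs):
    print(s, left, right)

# If u solves g2(u) = beta with u < 1/4, the mirrored solution is 1/2 - u,
# and the second block of coordinates (1 - 2u)/2 equals 1/2 - u as well.
u = 0.25 - 0.1
second_block = (1.0 - 2.0 * u) / 2.0
print(second_block, 0.5 - u)
```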

By lemmas in this subsection, 𝒰i\mathcal{U}_{i}, i3i\geq 3, and 𝒱i\mathcal{V}_{i}, i2i\geq 2, are not of interest.

7.4. At Critical Temperature

In this subsection, we investigate the critical points at the critical temperatures, that is, at $\beta=\beta_{s,\,i}$ or $\beta=q$. The point $\mathbf{u}_{i}=\mathbf{v}_{i}$ is degenerate when $\beta=\beta_{s,\,i}$ and the point $\mathbf{p}=\mathbf{v}_{i}$ is degenerate when $\beta=q$ by Lemmas 7.2 and 7.3.

Lemma 7.8.

If iq/2i\leq q/2 and β=βs,i\beta=\beta_{s,\,i}, the point 𝐮i=𝐯i{\bf u}_{i}={\bf v}_{i} is not a local minimum. If β=q\beta=q, the point 𝐩=𝐯i{\bf p}={\bf v}_{i} is not a local minimum.

Proof.

Fix 1ijq11\leq i\leq j\leq q-1 such that i+j=qi+j=q and define i:[0,1/j]Ξ\bm{\ell}_{i}:[0,1/j]\to\Xi as

i(s)=(s,,s,1jsi,,1jsi).\bm{\ell}_{i}(s)\,=\,\Big{(}s,\dots,s,\frac{1-js}{i},\dots,\frac{1-js}{i}\Big{)}\ .

We therefore obtain

$$\begin{aligned}
F_{\beta}(\bm{\ell}_{i}(s))\,&=\,-\frac{1}{2}\bigg[js^{2}+i\Big(\frac{1-js}{i}\Big)^{2}\bigg]+\frac{1}{\beta}\bigg[js\log s+(1-js)\log\Big(\frac{1-js}{i}\Big)\bigg]\\
&=\,-\frac{1}{2i}(jqs^{2}-2js+1)+\frac{1}{\beta}\bigg[\frac{(1-js)(1-qs)}{i}g_{i}(s)+\log s\bigg]\ .
\end{aligned}$$

By (6.3) and (6.4), we have

ddsFβ(i(s))=ji(1qs)+jβi(qs1)gi(s)=jβi(1qs)(βgi(s)).\frac{d}{ds}F_{\beta}(\bm{\ell}_{i}(s))\,=\,\frac{j}{i}(1-qs)+\frac{j}{\beta i}(qs-1)g_{i}(s)\,=\,\frac{j}{\beta i}(1-qs)(\beta-g_{i}(s))\ .

We claim that Fβs,i(i(mi))F_{\beta_{s,\,i}}(\bm{\ell}_{i}(m_{i})) and Fq(i(1/q))F_{q}(\bm{\ell}_{i}(1/q)) are not the local minima of Fβs,i(i(s))F_{\beta_{s,\,i}}(\bm{\ell}_{i}(s)) and Fq(i(s))F_{q}(\bm{\ell}_{i}(s)), respectively, and this completes the proof.

For the first claim, assume i<ji<j, and note that mi<1/qm_{i}<1/q. Then, 1qs>01-qs>0 and βs,igi(s)<0\beta_{s,\,i}-g_{i}(s)<0 if ss is in a neighborhood of mim_{i} and smis\neq m_{i}. In this case, ddsFβs,i(i(s))<0\frac{d}{ds}F_{\beta_{s,\,i}}(\bm{\ell}_{i}(s))<0 near mim_{i} so that 𝐮i=𝐯i{\bf u}_{i}={\bf v}_{i} is not a local minimum. If i=ji=j, βs,i=q\beta_{s,\,i}=q so that it suffices to show the second assertion.

Next, note that vi(q)=1/qv_{i}(q)=1/q so that we have gi(s)<β=qg_{i}(s)<\beta=q, 1qs>01-qs>0 if s<1/qs<1/q and gi(s)>qg_{i}(s)>q, 1qs<01-qs<0 if s>1/qs>1/q. Therefore, ddsFq(i(s))>0\frac{d}{ds}F_{q}(\bm{\ell}_{i}(s))>0 near 1/q1/q so that 𝐩=𝐯i{\bf p}={\bf v}_{i} is not a local minimum. ∎
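The derivative formula along the segment $\bm{\ell}_{i}$ drives both claims in this proof, so a numerical check may be helpful. The sketch below compares the closed form $\frac{j}{\beta i}(1-qs)(\beta-g_{i}(s))$ with a central finite difference of $F_{\beta}(\bm{\ell}_{i}(s))$, and then inspects the sign of the derivative on both sides of $s=1/q$ when $\beta=q$. The parameters $q=10$, $i=2$, $\beta=8$, the sample points, and the function names are illustrative assumptions.

```python
import math

def g(i, q, s):
    """g_i(s) as in (6.2), with j = q - i."""
    j = q - i
    return i / (1.0 - q * s) * math.log((1.0 - j * s) / (i * s))

def F_line(i, q, s, beta):
    """F_beta evaluated at l_i(s) = (s, ..., s, (1-js)/i, ..., (1-js)/i)."""
    j = q - i
    x = [s] * j + [(1.0 - j * s) / i] * i
    return -0.5 * sum(v * v for v in x) + sum(v * math.log(v) for v in x) / beta

def dF_formula(i, q, s, beta):
    """Closed form of d/ds F_beta(l_i(s)) from the proof of Lemma 7.8."""
    j = q - i
    return j / (beta * i) * (1.0 - q * s) * (beta - g(i, q, s))

def dF_numeric(i, q, s, beta, step=1e-6):
    """Central finite difference of s -> F_beta(l_i(s))."""
    return (F_line(i, q, s + step, beta) - F_line(i, q, s - step, beta)) / (2.0 * step)

q, i, beta = 10, 2, 8.0  # illustrative values
points = (0.02, 0.05, 0.11)
for s in points:
    print(s, dF_formula(i, q, s, beta), dF_numeric(i, q, s, beta))

# At beta = q the derivative is positive on both sides of s = 1/q,
# so s = 1/q is not a local minimum along the segment.
left = dF_formula(i, q, 1.0 / q - 0.01, float(q))
right = dF_formula(i, q, 1.0 / q + 0.01, float(q))
print(left, right)
```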

Even though 𝐮i{\bf u}_{i}, i3i\geq 3, is not a saddle point if β>βs,i\beta>\beta_{s,\,i}, we cannot exclude the possibility that 𝐮i{\bf u}_{i} is a saddle point when β=βs,i\beta=\beta_{s,\,i}; however, by the next two lemmas, 𝒰i(βs,i)\mathcal{U}_{i}(\beta_{s,\,i}), i3i\geq 3, are irrelevant to the landscape of FβF_{\beta}.

Lemma 7.9.

Let q8q\geq 8 and i4i\geq 4. Then, if β=βs,i\beta=\beta_{s,\,i}, 𝐮i=𝐯i{\bf u}_{i}={\bf v}_{i} is not a saddle point.

Proof.

By Lemma 7.2, $b=-1+i\,/\,[\beta_{s,\,i}\{1-ju_{i}(\beta_{s,\,i})\}]$ is an eigenvalue of $\nabla^{2}F_{\beta_{s,\,i}}$ at $\mathbf{u}_{i}$ with multiplicity $i-2\geq 2$. By Lemma 7.3, this eigenvalue is negative; hence, the Hessian has at least two negative eigenvalues. ∎

Lemma 7.10.

Let q6q\geq 6. We have Fβs, 3(𝐮3)>Fβs, 3(𝐮2)F_{\beta_{s,\,3}}({\bf u}_{3})>F_{\beta_{s,\,3}}({\bf u}_{2}). Furthermore, if q7q\geq 7, we have Fβs, 3(𝐮3)>Fβs, 3(𝐯1)F_{\beta_{s,\,3}}({\bf u}_{3})>F_{\beta_{s,\,3}}({\bf v}_{1}). Hence, 𝐮3{\bf u}_{3} cannot be a saddle point lower than 𝐮2{\bf u}_{2} or 𝐯1{\bf v}_{1}.

The proof is presented in Section 8.3. We remark that if q=6q=6, we have βs, 3=q\beta_{s,\,3}=q so that 𝐯1(βs, 3)=𝐩{\bf v}_{1}(\beta_{s,\,3})={\bf p} and the second assertion is not needed.

8. Analysis of Energy Landscape

In this section, we prove lemmas introduced in Section 6.2 and Lemma 7.10. To prove these lemmas, we need numerical computation given in Appendix A.

8.1. Proof of Lemma 6.5

Lemma 8.1.

If q4q\geq 4, we have v1(βs, 2)>12(q1)v_{1}(\beta_{s,\,2})>\frac{1}{2(q-1)}.

Proof.

Fix β=βs, 2\beta=\beta_{s,\,2} and write v1=v1(βs, 2)v_{1}=v_{1}(\beta_{s,\,2}) for convenience. Since βs, 2=g2(m2)=g1(v1)\beta_{s,\,2}=g_{2}(m_{2})=g_{1}(v_{1}), we have

21qm2log1(q2)m22m2=11qv1log1(q1)v1v1\frac{2}{1-qm_{2}}\log\frac{1-(q-2)m_{2}}{2m_{2}}\,=\,\frac{1}{1-qv_{1}}\log\frac{1-(q-1)v_{1}}{v_{1}} (8.1)

Let

v1=12q+m22,so that11qv1=21qm2.v_{1}^{*}\,=\,\frac{1}{2q}+\frac{m_{2}}{2}\ ,\ \text{so that}\ \frac{1}{1-qv_{1}^{*}}\,=\,\frac{2}{1-qm_{2}}\ . (8.2)

We claim that g1(v1)g1(v1)g_{1}(v_{1}^{*})\,\leq\,g_{1}(v_{1}), that is, by (8.1),

11qv1log1(q1)v1v121qm2log1(q2)m22m2.\frac{1}{1-qv_{1}^{*}}\log\frac{1-(q-1)v_{1}^{*}}{v_{1}^{*}}\,\leq\,\frac{2}{1-qm_{2}}\log\frac{1-(q-2)m_{2}}{2m_{2}}\ .

By (8.2), the above inequality is equivalent to

1(q1)v1v11(q2)m22m2.\frac{1-(q-1)v_{1}^{*}}{v_{1}^{*}}\leq\frac{1-(q-2)m_{2}}{2m_{2}}\ .

By plugging $v_{1}^{*}$ given in (8.2) into this inequality, it becomes $q^{2}m_{2}^{2}-2qm_{2}+1=(qm_{2}-1)^{2}\geq 0$, which always holds. Hence, since $g_{1}$ is increasing at $v_{1}$, we obtain $v_{1}^{*}\leq v_{1}$.

Finally, we claim that

v1=1+qm22q>12(q1),i.e.,m2>1q(q1).v_{1}^{*}=\frac{1+qm_{2}}{2q}>\frac{1}{2(q-1)}\ ,\ \text{i.e.,}\ m_{2}>\frac{1}{q(q-1)}\ .

According to Figure 6.1, we can show this by

h2(1q(q1))=logq22q+22q(q1)(q2)q22q+2<0.h_{2}\Big{(}\frac{1}{q(q-1)}\Big{)}=\log\frac{q^{2}-2q+2}{2}-\frac{q(q-1)(q-2)}{q^{2}-2q+2}<0\ .

This holds if $q=4$ or $q=5$ by elementary computation. Now, assume $q\geq 6$. Then, we obtain

logq22q+22<logq2=2logq<q2<q(q1)(q2)q22q+2,\log\frac{q^{2}-2q+2}{2}<\log q^{2}=2\log q<q-2<\frac{q(q-1)(q-2)}{q^{2}-2q+2}\ ,

which completes the proof. ∎
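The closed form for $h_{2}\big(\frac{1}{q(q-1)}\big)$ and its negativity can be spot-checked numerically. The following sketch evaluates $h_{2}$ from its definition (6.3) and compares it with the closed form displayed above for a range of $q$. The range of $q$ and the function names are illustrative assumptions, and a finite check of this kind does not replace the proof.

```python
import math

def h2(q, t):
    """h_i(t) from (6.3) with i = 2 and j = q - 2."""
    j = q - 2
    return math.log((1.0 - j * t) / (2.0 * t)) + (q * t - 1.0) / (q * t * (1.0 - j * t))

def closed_form(q):
    """log((q^2 - 2q + 2)/2) - q(q-1)(q-2)/(q^2 - 2q + 2)."""
    d = q * q - 2 * q + 2
    return math.log(d / 2.0) - q * (q - 1) * (q - 2) / d

for q in (4, 5, 6, 10, 100):
    t = 1.0 / (q * (q - 1))
    print(q, h2(q, t), closed_form(q))
```

A negative value means that $\frac{1}{q(q-1)}$ lies to the left of the zero $m_{2}$ of $h_{2}$, which is the inequality $m_{2}>\frac{1}{q(q-1)}$ needed in the proof.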

We can prove Lemma 6.5 by the aforementioned lemma.

Proof of Lemma 6.5.

Since βc=g1(1q(q1))=g1(12(q1))\beta_{c}=g_{1}(\frac{1}{q(q-1)})=g_{1}(\frac{1}{2(q-1)}), we have βs, 1<βc\beta_{s,\,1}<\beta_{c}. By Lemma 8.1, since g1(t)g_{1}(t) is increasing on (m1,1/(q1))(m_{1},1/(q-1)\,) and m1<1/(2q2)m_{1}<1/(2q-2), we obtain

βs, 2=g1(v1)>g1(12(q1))=βc.\beta_{s,\,2}=g_{1}(v_{1})>g_{1}(\frac{1}{2(q-1)})=\beta_{c}\ .

8.2. Proof of Lemma 6.7

We first introduce two lemmas.

Lemma 8.2.

Let q5q\geq 5. When β=βs, 2\beta=\beta_{s,\,2}, we have Fβs, 2(𝐯1)<Fβs, 2(𝐮2)F_{\beta_{s,\,2}}({\bf v}_{1})<F_{\beta_{s,\,2}}({\bf u}_{2}) and when β=q\beta=q, we have Fq(𝐯1)=Fq(𝐩)>Fq(𝐮2)F_{q}({\bf v}_{1})=F_{q}({\bf p})>F_{q}({\bf u}_{2}).

The proof of the above lemma is given in Section 8.3.

Lemma 8.3.

Let $q\geq 5$. Then, $\beta^{2}\frac{d}{d\beta}[F_{\beta}(\mathbf{u}_{2})-F_{\beta}(\mathbf{v}_{1})]$ decreases as $\beta$ increases in $(\beta_{s,\,2},q)$.

Proof.

For $t=t(\beta)$, which satisfies $\beta=g_{i}(t)$, let

$$\mathbf{c}_{i}\,=\,\mathbf{c}_{i}(\beta)\,=\,\Big(t,\dots,t,\frac{1-jt}{i},\dots,\frac{1-jt}{i}\Big)\ . \tag{8.3}$$

Since 𝐜i{\bf c}_{i} is a critical point, by the proof of Corollary 3.7, we have

ddβFβ(𝐜i)=1β2S(𝐜i).\frac{d}{d\beta}F_{\beta}({\bf c}_{i})=-\frac{1}{\beta^{2}}S({\bf c}_{i})\ .

Define a function $k_{i}:(0,1/j)\to\mathbb{R}$ as

ki(t):=(1jt)log1jtit+logt.k_{i}(t):=(1-jt)\log\frac{1-jt}{it}+\log t\ . (8.4)

By elementary computations, we obtain S(𝐜i)=ki(t)S({\bf c}_{i})=k_{i}(t) so that we have

ddβFβ(𝐜i)=1β2ki(t).\frac{d}{d\beta}F_{\beta}({\bf c}_{i})\,=\,-\frac{1}{\beta^{2}}k_{i}(t)\ . (8.5)

Now, by (8.5), we obtain

β2ddβ[Fβ(𝐮2)Fβ(𝐯1)]=k1(v1(β))k2(u2(β)).\beta^{2}\frac{d}{d\beta}\Big{[}F_{\beta}({\bf u}_{2})-F_{\beta}({\bf v}_{1})\Big{]}\,=\,k_{1}(v_{1}(\beta))-k_{2}(u_{2}(\beta))\ . (8.6)

Observe that the value u2(β)u_{2}(\beta) decreases and the value v1(β)v_{1}(\beta) increases as β\beta increases. By elementary computation, for t(0,1/q)t\in(0,1/q), we obtain

ki(t)=jlog1jtit+(1jt)(j1jt1t)+1t=jlog1jtit<0,k_{i}^{\prime}(t)=-j\log\frac{1-jt}{it}+(1-jt)\Big{(}\frac{-j}{1-jt}-\frac{1}{t}\Big{)}+\frac{1}{t}=-j\log\frac{1-jt}{it}<0\ , (8.7)

so that $k_{i}(t)$ is decreasing on $(0,1/q)$. Hence, (8.6) decreases as $\beta$ increases in $(\beta_{s,\,2},q)$. ∎

We can now prove Lemma 6.7.

Proof of Lemma 6.7.

By Lemma 8.2, $F_{\beta}({\bf u}_{2})-F_{\beta}({\bf v}_{1})$ is positive at $\beta=\beta_{s,\,2}$ and negative at $\beta=q$, so there is $\beta_{0}\in(\beta_{s,\,2},q)$ at which $\frac{d}{d\beta}[F_{\beta}({\bf u}_{2})-F_{\beta}({\bf v}_{1})]<0$. By Lemma 8.3, this derivative can change sign at most once on $(\beta_{s,\,2},q)$, from positive to negative. Hence, we can deduce that there is exactly one critical value $\beta_{m}\in(\beta_{s,\,2},q)$ such that

\[ F_{\beta_{m}}({\bf u}_{2})=F_{\beta_{m}}({\bf v}_{1})\ . \tag{8.8} \]

8.3. Proofs of Lemmas 7.10 and 8.2

Before we go further, we carry out some computations. Recall the definition of $m_{2}$ from Lemma 6.1. Since $\beta_{s,\,i}=g_{i}(m_{i})=\frac{i}{1-qm_{i}}\log\frac{1-jm_{i}}{im_{i}}$ and $m_{i}$ is the minimum of $g_{i}$, we have

\begin{align*}
0=h_{i}(m_{i}) &= \log\frac{1-jm_{i}}{im_{i}}+\frac{qm_{i}-1}{qm_{i}(1-jm_{i})} \\
&= \frac{1-qm_{i}}{i}\beta_{s,\,i}-\frac{1-qm_{i}}{qm_{i}(1-jm_{i})}\ ,
\end{align*}

so that

\[ qjm_{i}^{2}-qm_{i}=qm_{i}(jm_{i}-1)=-\frac{i}{\beta_{s,\,i}}\ . \tag{8.9} \]

For ${\bf c}_{i}$ defined in (8.3), since $S({\bf c}_{i})=k_{i}(t)$ and $\beta=g_{i}(t)$, we can write

\[ F_{\beta}({\bf c}_{i})=\frac{1}{2i}\Big[qjt^{2}-2qt+1\Big]+\frac{1}{\beta}\log t\ . \tag{8.10} \]

Hence, by (8.9) and $\beta_{s,\,i}=g_{i}(m_{i})$, we have

\begin{align*}
F_{\beta_{s,\,i}}({\bf u}_{i}) &= \frac{1-qm_{i}}{2i}+\frac{1}{\beta_{s,\,i}}\Big(\log m_{i}-\frac{1}{2}\Big) \tag{8.11} \\
&= \frac{1}{2\beta_{s,\,i}}\log\frac{1-jm_{i}}{im_{i}}+\frac{1}{\beta_{s,\,i}}\log m_{i}-\frac{1}{2\beta_{s,\,i}}\ .
\end{align*}

By (8.9) again, we obtain

\[ F_{\beta_{s,\,i}}({\bf u}_{i})=-\frac{1}{2\beta_{s,\,i}}\log(qe\beta_{s,\,i})\ . \tag{8.12} \]
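Identity (8.12) can be checked numerically for a concrete $q$. The sketch below is only an illustration under stated assumptions: it takes $F_{\beta}(\bm{x})=-\frac{1}{2}\sum_{k}x_{k}^{2}+\frac{1}{\beta}\sum_{k}x_{k}\log x_{k}$ (the form consistent with (8.10)), locates $m_{i}$ as the sign change of $h_{i}$ in $(0,1/q)$ by bisection, and uses a bracket $[10^{-9},\,0.95/q]$ that is a choice made here, not one taken from the paper. All function names are hypothetical.

```python
import math


def h(q, i, t):
    """h_i(t) = log((1 - jt)/(it)) + (qt - 1)/(qt(1 - jt)), with j = q - i."""
    j = q - i
    return math.log((1 - j * t) / (i * t)) + (q * t - 1) / (q * t * (1 - j * t))


def g(q, i, t):
    """g_i(t) = i/(1 - qt) * log((1 - jt)/(it)), with j = q - i."""
    j = q - i
    return i / (1 - q * t) * math.log((1 - j * t) / (i * t))


def find_m(q, i, iters=200):
    """Bisection for the zero m_i of h_i; the bracket is an assumption of this sketch."""
    lo, hi = 1e-9, 0.95 / q
    assert h(q, i, lo) < 0 < h(q, i, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(q, i, mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def free_energy(beta, x):
    """Assumed form of F_beta: -1/2 sum x_k^2 + (1/beta) sum x_k log x_k."""
    return -0.5 * sum(v * v for v in x) + sum(v * math.log(v) for v in x) / beta


def check_812(q, i):
    """Return (m_i, beta_{s,i}, F evaluated directly, right-hand side of (8.12))."""
    j = q - i
    m = find_m(q, i)
    beta = g(q, i, m)
    point = [m] * j + [(1 - j * m) / i] * i
    direct = free_energy(beta, point)
    closed_form = -math.log(q * math.e * beta) / (2 * beta)
    return m, beta, direct, closed_form
```

Calling `check_812(5, 2)` returns the approximate minimizer $m_{2}$, the value $\beta_{s,\,2}$, and the two sides of (8.12), which should agree up to rounding.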

Now, we introduce two technical lemmas required in the proof of Lemmas 7.10 and 8.2.

Lemma 8.4.

For $q\geq 6500$, we have

\[ \frac{1}{\beta_{s,\,2}}\Big(\log qm_{2}-\frac{1}{2}\Big)>\frac{(q-1)}{8q}(qm_{2})^{2}-\frac{1}{4}m_{2}+\frac{-q^{2}+4q+1}{8q(q-1)}\ . \]

The proof is given in Section 10.

Lemma 8.5.

Let $q\geq 5$. Define $f_{c}(\beta)=-\frac{1}{2\beta}\log(qe\beta)$ and

\[ \Phi(\beta)=\frac{d}{d\beta}[f_{c}(\beta)-F_{\beta}({\bf u}_{2})]\ . \]

Then, we have $\Phi(\beta)>0$ for $\beta>\beta_{s,\,2}$.

Proof.

We have

\[ \frac{d}{d\beta}f_{c}(\beta)=\frac{1}{2\beta^{2}}\log(qe\beta)-\frac{1}{2\beta}\cdot\frac{1}{\beta}=\frac{1}{2\beta^{2}}\log(q\beta)\ . \]

By (8.5), we obtain

\[ \beta^{2}\frac{d}{d\beta}[f_{c}(\beta)-F_{\beta}({\bf u}_{2})]=\frac{1}{2}[\log(q\beta)+2k_{2}(u_{2})]\ . \]

By (8.7), the above expression is an increasing function of $\beta$, since $u_{2}$ decreases as $\beta$ increases. Hence, it suffices to show that $\Phi(\beta_{s,\,2})>0$. First, let $q\geq 55>e^{4}$. By (8.7), at $\beta=\beta_{s,\,2}$,

\begin{align*}
\log(q\beta)+2k_{2}(u_{2}) &> \log(q\beta_{s,\,2})+2k_{2}\Big(\frac{1}{2j}\Big) \\
&= \log(q\beta_{s,\,2})+\log\frac{2j-j}{i}+2\log\frac{1}{2j}=\log\frac{q\beta_{s,\,2}}{4ij}\ ,
\end{align*}

where we use $u_{2}<1/(2j)$ for the inequality. Since $\beta_{s,\,2}>\beta_{c}>2\log q$, we obtain

\[ \frac{q\beta_{s,\,2}}{4ij}>\frac{2q\log q}{8(q-2)}>\frac{q}{q-2}>1\ , \]

so that the logarithm above is positive. Finally, for $5\leq q\leq 54$, by Proposition A.1, we have $\Phi(\beta_{s,\,2})>0$. ∎

By the above lemmas, Lemma 8.2 can be proven.

Proof of Lemma 8.2.

By Proposition A.1 in the appendix, we can check that $F_{\beta_{s,\,2}}({\bf u}_{2})>F_{\beta_{s,\,2}}({\bf v}_{1})$ holds for $5\leq q\leq 6500$. Now, suppose that $q>6500$. By (8.10) and (8.11), we can write

\begin{align*}
F_{\beta_{s,\,2}}({\bf u}_{2}) &= -\frac{1}{4}qm_{2}+\frac{1}{4}+\frac{1}{\beta_{s,\,2}}\Big(\log m_{2}-\frac{1}{2}\Big)\ , \\
F_{\beta_{s,\,2}}({\bf v}_{1}) &= \frac{1}{2}\bigg[q(q-1)\Big(v_{1}-\frac{1}{q-1}\Big)^{2}-\frac{1}{q-1}\bigg]+\frac{1}{\beta_{s,\,2}}\log v_{1}\ .
\end{align*}

By the proof of Lemma 8.1, we have

\[ \frac{qm_{2}+1}{2q}=v_{1}^{*}\leq v_{1}<\frac{1}{q}\ , \]

so that

\[ F_{\beta_{s,\,2}}({\bf v}_{1})<\frac{1}{2}\bigg[q(q-1)\Big(\frac{qm_{2}+1}{2q}-\frac{1}{q-1}\Big)^{2}-\frac{1}{q-1}\bigg]-\frac{1}{\beta_{s,\,2}}\log q\ . \]

Hence, the lemma can be proven if we can prove

14qm2+14+1βs, 2(logm212)\displaystyle-\frac{1}{4}qm_{2}+\frac{1}{4}+\frac{1}{\beta_{s,\,2}}(\log m_{2}-\frac{1}{2})
>12[q(q1)(qm2+12q1q1)21q1]1βs, 2logq\displaystyle>\,\frac{1}{2}\bigg{[}q(q-1)\Big{(}\frac{qm_{2}+1}{2q}-\frac{1}{q-1}\Big{)}^{2}-\frac{1}{q-1}\bigg{]}-\frac{1}{\beta_{s,\,2}}\log q
=18q(q1)(m2)214(q+1)m2+(q+1)28q(q1)1βs, 2logq.\displaystyle=\,\frac{1}{8}q(q-1)(m_{2})^{2}-\frac{1}{4}(q+1)m_{2}+\frac{(q+1)^{2}}{8q(q-1)}-\frac{1}{\beta_{s,\,2}}\log q\ .

This is the content of Lemma 8.4. Finally, by Lemma 8.5, we obtain $F_{q}({\bf p})-F_{q}({\bf u}_{2})=f_{c}(q)-F_{q}({\bf u}_{2})>0$, since $f_{c}(\beta_{s,\,2})=F_{\beta_{s,\,2}}({\bf u}_{2})$ by (8.12). ∎

Now, we prove Lemma 7.10.

Proof of Lemma 7.10.

Since the proof of $F_{\beta_{s,\,3}}({\bf u}_{3})>F_{\beta_{s,\,3}}({\bf v}_{1})$ is exactly the same as the proof of Lemma 8.2, including the numerical verification, we omit it. By (8.12), we can write

\[ F_{\beta_{s,\,3}}({\bf u}_{3})=f_{c}(\beta_{s,\,3})\ . \]

Hence, by Lemma 8.5 and by Proposition A.1, we have

\[ F_{\beta_{s,\,3}}({\bf u}_{3})=f_{c}(\beta_{s,\,3})>F_{\beta_{s,\,3}}({\bf u}_{2})\ . \]

9. Characterization of Metastable Sets

In this section, we prove Theorems 3.4-3.6. First, we prove Theorem 3.4.

Proof of Theorem 3.4.

The first assertion is immediate from Lemmas 6.2 and 6.4. The third assertion is proven by Proposition 6.3 and Lemma 7.8. The fourth assertion is Lemma 6.6.

Now, it remains to show the second assertion. For $\beta\in(\beta_{1},\beta_{2}]$, since ${\bf p}$ is the global minimum and ${\bf v}_{1}$ is a saddle point, we have $F_{\beta}({\bf p})<F_{\beta}({\bf v}_{1})$, so that $\mathcal{W}_{\mathfrak{o}}\neq\emptyset$. By the same argument as in the proof of Lemma 8.3, we have

\[ \frac{d}{d\beta}[F_{\beta}({\bf v}_{1})-F_{\beta}({\bf p})]=-\frac{1}{\beta^{2}}[k_{1}(v_{1}(\beta))+\log q]\ . \]

By (8.7), $k_{1}(\cdot)$ is decreasing on $(0,\,1/q)$ and increasing on $(1/q,\,1/(q-1))$. Since $k_{1}(1/q)=-\log q$, we have $k_{1}(v_{1}(\beta))+\log q>0$ for $\beta\in(\beta_{1},q)$, so that

\[ \frac{d}{d\beta}[F_{\beta}({\bf v}_{1})-F_{\beta}({\bf p})]<0\ . \]

Since ${\bf v}_{1}={\bf p}$ when $\beta=q$, we have $F_{\beta}({\bf v}_{1})>F_{\beta}({\bf p})$ for $\beta<q$ and $F_{\beta}({\bf v}_{1})<F_{\beta}({\bf p})$ for $\beta>q$. ∎

9.1. Proof of Theorem 3.6

Before we go further, we recall the height between two points. Let $\bm{a},\,\bm{b}\in\text{int}\,\Xi$, and let $\Gamma_{\bm{a},\,\bm{b}}$ be the set of all $C^{1}$-paths $\gamma:[0,1]\to\text{int}\,\Xi$ such that $\gamma(0)=\bm{a}$ and $\gamma(1)=\bm{b}$. Then, we define the height $\mathfrak{H}(\bm{a},\bm{b})$ between $\bm{a}$ and $\bm{b}$ as $\mathfrak{H}(\bm{a},\bm{b})=\inf_{\gamma\in\Gamma_{\bm{a},\,\bm{b}}}\,\sup_{0\leq t\leq 1}\,F_{\beta}(\gamma(t))$. We prove Theorem 3.6 in several steps.
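To build intuition for the inf-sup definition of the height, one can look at a discrete analogue: on a finite grid of energy values, the height between two cells is the smallest possible maximum energy along a nearest-neighbour path joining them. The sketch below is only an illustration on a toy grid, not the continuous object $\mathfrak{H}$ used in the proofs; the function name and the grid are hypothetical. It uses a Dijkstra-type search in which the cost of a path is the maximum value visited.

```python
import heapq


def communication_height(grid, start, goal):
    """Min over nearest-neighbour paths from start to goal of the max grid value on the path."""
    rows, cols = len(grid), len(grid[0])
    best = {start: grid[start[0]][start[1]]}
    heap = [(best[start], start)]
    while heap:
        height, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return height
        if height > best[(r, c)]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                candidate = max(height, grid[nr][nc])
                if candidate < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = candidate
                    heapq.heappush(heap, (candidate, (nr, nc)))
    raise ValueError("goal is not reachable from start")


# Toy landscape: two low cells in the top corners separated by a high ridge.
toy_grid = [
    [0, 9, 0],
    [2, 9, 4],
    [1, 3, 1],
]
```

In `toy_grid`, every path between the two top corners either crosses the ridge of value 9 or goes around through the bottom row, where the highest cell visited has value 4; the latter value plays the role of the saddle height.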

Lemma 9.1.

Let $q\geq 4$. If $\beta>\beta_{m}$, the sets $\mathcal{W}_{i}(\beta)$, $i\in S$, are different. In particular, they are nonempty.

Proof.

Since the elements of $\mathcal{U}_{1}$ are the lowest minima, we have $F_{\beta}({\bf u}_{1})<H_{\beta}$, so that the sets $\mathcal{W}_{i}$ are nonempty. Without loss of generality, suppose that $\mathcal{W}_{1}=\mathcal{W}_{2}$. Since ${\bf u}_{1}^{1},\,{\bf u}_{1}^{2}\in\mathcal{W}_{1}$ and $\mathcal{W}_{1}$ is connected, there is a $C^{1}$-path $\gamma:[0,1]\to\mathcal{W}_{1}$ such that $\gamma(0)={\bf u}_{1}^{1}$ and $\gamma(1)={\bf u}_{1}^{2}$. Therefore, we have $F_{\beta}(\gamma(t))<H_{\beta}$ for $0\leq t\leq 1$, so that

\[ F_{\beta}({\bf u}_{1}^{1})<\mathfrak{H}({\bf u}_{1}^{1},{\bf u}_{1}^{2})<H_{\beta}\ . \]

Then, there is a saddle point $\bm{\sigma}({\bf u}_{1}^{1},{\bf u}_{1}^{2})$ such that $F_{\beta}(\bm{\sigma}({\bf u}_{1}^{1},{\bf u}_{1}^{2}))=\mathfrak{H}({\bf u}_{1}^{1},{\bf u}_{1}^{2})$. However, by Proposition 6.3, the values of $F_{\beta}$ at saddle points are greater than or equal to $H_{\beta}$. This is a contradiction. Hence, the sets $\mathcal{W}_{i}$ are different. ∎

Lemma 9.2.

Let $q\geq 4$. If $\beta>q$, the set $\Sigma_{i,\,j}$ is a singleton for all $i,j\in S$.

Proof.

First, we claim that the sets $\Sigma_{i,\,j}$ are not empty. Suppose that one of the sets $\Sigma_{i,\,j}$ is empty. Then, by symmetry, all of them are empty. Let us fix $1\leq k<l\leq q$. Since ${\bf u}_{2}^{k,\,l}$ is a saddle point, there is a unit eigenvector $\bm{w}$ that corresponds to the unique negative eigenvalue of $\nabla^{2}F_{\beta}({\bf u}_{2}^{k,\,l})$. There exists $\eta>0$ such that $F_{\beta}({\bf u}_{2}^{k,\,l}+t\bm{w})<H_{\beta}$ for all $0<|t|<\eta$. Now, consider the path $\bm{y}(t)$ described by the ordinary differential equation

\[ \dot{\bm{y}}(t)=-\nabla F_{\beta}(\bm{y}(t)),\qquad\bm{y}(0)={\bf u}_{2}^{k,\,l}+\eta\bm{w}\ . \tag{9.1} \]

Then, $\bm{y}(t)$ converges to a critical point whose height is less than $H_{\beta}$ as $t\to\infty$. If this limit point is not a local minimum, we can find an eigenvector $\bm{w}_{1}$ corresponding to a negative eigenvalue of the Hessian of $F_{\beta}$ at that point. Then, by the same argument used to define the path (9.1), the next path converges to another critical point whose height is lower than that of the previous critical point. Repeating this procedure, the path eventually converges to a local minimum. Since there is no local minimum other than the elements of $\mathcal{U}_{1}$, $\bm{y}(t)$ converges to some element of $\mathcal{U}_{1}$, say ${\bf u}_{1}^{1}$ without loss of generality. Since the sets $\mathcal{W}_{i}$ are different, $\bm{y}(\cdot)$ converges to only one minimum.

By the same argument, the analogous path starting at ${\bf u}_{2}^{k,\,l}-\eta\bm{w}$ converges to some element of $\mathcal{U}_{1}$, say ${\bf u}_{1}^{m}$. If $m\neq 1$, then ${\bf u}_{2}^{k,\,l}\in\Sigma_{1,\,m}$, so that $\Sigma_{1,\,m}$ is not empty, which contradicts the assumption. So, we obtain $m=1$. In this case, we obtain ${\bf u}_{2}^{k,\,l}\in\overline{\mathcal{W}_{1}}$ and ${\bf u}_{2}^{k,\,l}\notin\overline{\mathcal{W}_{m}}$ for $m\neq 1$. By symmetry, since $\mathcal{U}_{2}$ has $q(q-1)/2$ elements and the number of sets $\mathcal{W}_{i}$ is $q$, there are $(q-1)/2$ elements of $\mathcal{U}_{2}$ corresponding to each $\mathcal{W}_{i}$, that is, $|\overline{\mathcal{W}_{1}}\cap\mathcal{U}_{2}|=(q-1)/2$, where $|A|$ is the number of elements of the set $A$. If ${\bf u}_{2}^{1,\,a}\in\overline{\mathcal{W}_{1}}$ for some $2\leq a\leq q$, we obtain ${\bf u}_{2}^{1,\,a}\in\overline{\mathcal{W}_{a}}$ by symmetry, and therefore $\Sigma_{1,\,a}=\overline{\mathcal{W}_{1}}\cap\overline{\mathcal{W}_{a}}\neq\emptyset$. Hence, we have ${\bf u}_{2}^{1,\,a}\notin\overline{\mathcal{W}_{1}}$. If ${\bf u}_{2}^{a,\,b}\in\overline{\mathcal{W}_{1}}$ for some $1<a,\,b$, then, since $q\geq 4$ and by symmetry, ${\bf u}_{2}^{a,\,b}\in\overline{\mathcal{W}_{m}}$ for some $m\neq 1,\,a,\,b$, and this contradicts the assumption that $\Sigma_{1,\,m}=\overline{\mathcal{W}_{1}}\cap\overline{\mathcal{W}_{m}}=\emptyset$. Hence, the sets $\Sigma_{i,\,j}$ are nonempty.

Observe that the elements of $\Sigma_{i,\,j}$ are saddle points and $F_{\beta}(\bm{x})=H_{\beta}$ for all $\bm{x}\in\Sigma_{i,\,j}$. Hence, by Proposition 6.3, $\Sigma_{i,\,j}\subset\mathcal{U}_{2}$. Since each Hessian $\nabla^{2}F_{\beta}({\bf u}_{2})$ has only one negative eigenvalue, each element of $\mathcal{U}_{2}$ connects only two wells, i.e., $\Sigma_{i,\,j}\cap\Sigma_{k,\,l}=\emptyset$ if $\{i,j\}\neq\{k,l\}$. Therefore, $\Sigma_{i,\,j}$ has at most one point, and from the above two paragraphs, we obtain $|\Sigma_{i,\,j}|=1$. ∎

We can now prove Theorem 3.6.

Proof of Theorem 3.6.

The first assertion follows from the definition of critical temperatures (6.9) and Lemma 6.7.

By Lemma 9.2, to prove $\Sigma_{i,\,j}=\{{\bf u}_{2}^{i,\,j}\}$ when $\beta>q$, without loss of generality, it is sufficient to show that $\Sigma_{1,\,2}\neq\{{\bf u}_{2}^{1,\,4}\}$ and $\Sigma_{1,\,2}\neq\{{\bf u}_{2}^{3,\,4}\}$. First, suppose that $\Sigma_{1,\,2}=\{{\bf u}_{2}^{1,\,4}\}$. Then, by symmetry, we obtain ${\bf u}_{2}^{1,\,4}\in\Sigma_{1,\,3}$, which contradicts $\Sigma_{1,\,2}\cap\Sigma_{1,\,3}=\emptyset$. Second, suppose that $\Sigma_{1,\,2}=\{{\bf u}_{2}^{3,\,4}\}$, so that by symmetry we have $\Sigma_{1,\,5}=\{{\bf u}_{2}^{3,\,4}\}$, which is also a contradiction. Hence, we obtain $\Sigma_{1,\,2}=\{{\bf u}_{2}^{1,\,2}\}$.

Since $F_{\beta}$ is continuous in $\beta$, the values $H_{\beta}$ and $\mathfrak{H}({\bf u}_{1}^{i}(\beta),{\bf u}_{1}^{j}(\beta))$, $i,j\in S$, are continuous in $\beta$. Note that $\mathfrak{H}({\bf u}_{1}^{i}(\beta),{\bf u}_{1}^{j}(\beta))=F_{\beta}({\bf u}_{2})=H_{\beta}$ for $\beta\geq q$, since there is no saddle point other than the elements of $\mathcal{U}_{2}$. Since $F_{\beta}({\bf v}_{1})>H_{\beta}$ if $\beta>\beta_{m}=\beta_{3}$ and there is no saddle point other than the elements of $\mathcal{U}_{2}\cup\mathcal{V}_{1}$, by continuity, we obtain

\[ \mathfrak{H}({\bf u}_{1}^{i}(\beta),{\bf u}_{1}^{j}(\beta))=H_{\beta}\quad\text{if}\ \beta\geq\beta_{3}\ . \]

Hence, ${\bf u}_{2}^{i,\,j}$ is a saddle point between ${\bf u}_{1}^{i}$ and ${\bf u}_{1}^{j}$, and $\Sigma_{i,\,j}=\{{\bf u}_{2}^{i,\,j}\}$ if $\beta>\beta_{3}$. Coupled with Lemma 9.1, the fourth assertion holds except for the claim that $\Sigma_{0,\,i}=\emptyset$.

To prove this remaining claim, suppose, without loss of generality, that $\Sigma_{0,\,1}=\overline{\mathcal{W}_{0}}\cap\overline{\mathcal{W}_{1}}\neq\emptyset$. We, therefore, obtain $\mathfrak{H}({\bf p},{\bf u}_{1}^{1}(\beta))<F_{\beta}({\bf v}_{1})$, so that $\mathfrak{H}({\bf p},{\bf u}_{1}^{1}(\beta))=H_{\beta}$. By continuity, we get

\[ \mathfrak{H}({\bf p},{\bf u}_{1}^{1}(\beta))=H_{\beta}\quad\text{for}\ \beta_{3}\leq\beta<q\ , \]

so that $F_{\beta}({\bf p})\leq H_{\beta}$ for $\beta_{3}\leq\beta<q$. However, this contradicts $F_{q}({\bf p})=F_{q}({\bf v}_{1})>H_{q}$. Hence, we obtain $\Sigma_{0,\,i}=\emptyset$ for $i\in S$.

By the same argument and symmetry, the second assertion can be proven for $\beta\in(\beta_{s,\,1},\beta_{s,\,2})=(\beta_{1},\beta_{s,\,2})$. By a continuity argument, we can extend this to $\beta\in(\beta_{1},\beta_{3})$. The third assertion holds because of the first and fourth assertions, symmetry, and continuity. Finally, the fifth assertion can be proven by the same argument. ∎

9.2. Proof of Theorem 3.5

If $q=4$, the claim $\Sigma_{1,\,2}\neq\{{\bf u}_{2}^{3,\,4}\}$ cannot be proven by the symmetry argument. Hence, we prove Theorem 3.5 directly.

Proof of Theorem 3.5.

By Lemma 6.7 and (6.9), we obtain the first assertion.

Consider $\mathcal{K}_{i,\,j}=\{\,\bm{x}\in\Xi\,:\,x_{i}=x_{j}=\max\{x_{1},\dots,x_{4}\}\,\}$. It can be observed that these six planes divide $\Xi$ into four pieces, and that each plane contains one element of $\mathcal{U}_{2}$, namely ${\bf u}_{2}^{i,\,j}\in\mathcal{K}_{i,\,j}$. We claim that $H_{\beta}=F_{\beta}({\bf u}_{2}^{i,\,j})<F_{\beta}(\bm{x})$ for all $\bm{x}\in\mathcal{K}_{i,\,j}\setminus\{{\bf u}_{2}^{i,\,j}\}$ if $\beta>q$. Note that ${\bf p}$ is not a local minimum if $\beta\geq q$.

Let $\widetilde{F}_{\beta}(\bm{x})$ be the restriction of $F_{\beta}$ to $\mathcal{K}_{3,\,4}$ and let $\mathcal{K}_{3,\,4}^{o}=\{\bm{x}\in\mathcal{K}_{3,\,4}\,:\,x_{3}=x_{4}>x_{1},x_{2}\}$. Since $x_{3}=x_{4}=\frac{1}{2}(1-x_{1}-x_{2})$, for $i=1,2$,

\[ \frac{\partial}{\partial x_{i}}\widetilde{F}_{\beta}(\bm{x})=-x_{i}+\frac{1}{\beta}\log x_{i}+x_{3}-\frac{1}{\beta}\log x_{3}\ , \]

so that if $\bm{x}\in\mathcal{K}_{3,\,4}$ is a critical point, we have

\[ -x_{1}+\frac{1}{\beta}\log x_{1}=-x_{2}+\frac{1}{\beta}\log x_{2}=-x_{3}+\frac{1}{\beta}\log x_{3}\ . \]

Since $x_{3}=x_{4}>x_{1},\,x_{2}$, if $\beta\geq q$, the critical points in $\mathcal{K}_{3,\,4}^{o}$ are ${\bf u}_{2}^{3,\,4}$ and ${\bf v}_{2}^{1,\,2}$. From the proof of Lemma 7.7, we obtain ${\bf u}_{2}^{3,\,4}={\bf v}_{2}^{1,\,2}=(u_{2},u_{2},v_{2},v_{2})$.

Let $a=-1+\frac{1}{\beta u_{2}}$ and $b=-1+\frac{1}{\beta v_{2}}$. We therefore obtain

\[ \nabla^{2}\widetilde{F}_{\beta}({\bf u}_{2}^{3,\,4})=\begin{pmatrix}a+\frac{1}{2}b&\frac{1}{2}b\\ \frac{1}{2}b&a+\frac{1}{2}b\end{pmatrix}\ . \]

The eigenvalues of $\nabla^{2}\widetilde{F}_{\beta}({\bf u}_{2}^{3,\,4})$ are $a$ and $a+b$. By Lemma 7.3, $a,\,b>0$, so that ${\bf u}_{2}^{3,\,4}$ is a local minimum in $\mathcal{K}_{3,\,4}^{o}$. Since this is the unique critical point, ${\bf u}_{2}^{3,\,4}$ is the unique minimum in $\mathcal{K}_{3,\,4}^{o}$. Since $\mathcal{K}_{3,\,4}$ is the closure of $\mathcal{K}_{3,\,4}^{o}$ and there is no critical point in $\mathcal{K}_{3,\,4}^{o}\setminus\{{\bf u}_{2}^{3,\,4}\}$, ${\bf u}_{2}^{3,\,4}$ is the unique minimum in $\mathcal{K}_{3,\,4}$. Hence, the sets $\mathcal{W}_{i}$ are different if $\beta>q$.
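The eigenvalue claim is elementary: the matrix has equal diagonal entries $a+\frac{1}{2}b$ and off-diagonal entry $\frac{1}{2}b$, so $(1,-1)$ and $(1,1)$ are eigenvectors with eigenvalues $a$ and $a+b$. As a quick illustrative check (the function names and the sample values of $a$ and $b$ below are arbitrary choices, not quantities from the paper), one can compare this with the closed-form eigenvalues of a symmetric $2\times 2$ matrix.

```python
import math


def eigenvalues_2x2_symmetric(p, r, s):
    """Eigenvalues of the symmetric matrix [[p, r], [r, s]], smallest first."""
    mean = 0.5 * (p + s)
    radius = math.hypot(0.5 * (p - s), r)
    return mean - radius, mean + radius


def restricted_hessian_eigenvalues(a, b):
    """Eigenvalues of [[a + b/2, b/2], [b/2, a + b/2]]."""
    return eigenvalues_2x2_symmetric(a + 0.5 * b, 0.5 * b, a + 0.5 * b)
```

For positive sample values such as `restricted_hessian_eigenvalues(0.3, 1.2)`, the two returned numbers should match $a$ and $a+b$ up to rounding, and both are positive whenever $a,b>0$.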

Let $\beta>q$. By the definition of $\mathcal{K}_{i,\,j}$, we obtain $\overline{\mathcal{W}}_{k}\cap\mathcal{K}_{i,\,j}=\emptyset$ if $k\neq i,j$, so that $\Sigma_{i,\,j}\subset\mathcal{K}_{i,\,j}$. By Lemma 9.2, the sets $\Sigma_{i,\,j}$ are not empty. It can be observed that $F_{\beta}(\bm{x})=H_{\beta}$ and $\nabla F_{\beta}(\bm{x})=0$ if $\bm{x}\in\Sigma_{i,\,j}$. Since $\Sigma_{i,\,j}\subset\mathcal{K}_{i,\,j}$, we have $\Sigma_{i,\,j}=\{{\bf u}_{2}^{i,\,j}\}$; thus, the fourth assertion is proved.

For the third assertion, note that $F_{q}(\bm{x})=H_{q}$ for all $\bm{x}\in\Sigma_{i,\,j}$ and that ${\bf p}$ is the only point in $\mathcal{K}_{i,\,j}$ such that $F_{q}(\bm{x})=H_{q}$. Moreover, we obtain $F_{q}(\bm{x})>H_{q}=F_{q}({\bf p})$ if $\bm{x}\in\mathcal{K}_{i,\,j}^{o}$, and finally we can deduce that $F_{q}(\bm{x})>H_{q}=F_{q}({\bf p})$ if $\bm{x}\in\mathcal{K}_{i,\,j}\setminus\{{\bf p}\}$ using elementary calculus. Hence, the sets $\mathcal{W}_{i}$ are different if $\beta=q$.

For the second assertion, we can use the symmetry argument, and the proof is the same as that of Theorem 3.6. ∎

10. Proof of Lemma 8.4

This section is devoted to the proof of Lemma 8.4. In Section 10.1, we state an auxiliary lemma, Lemma 10.1, and use it to prove Lemma 8.4. In Section 10.2, we prove this auxiliary lemma. So far, we have fixed an integer $q\geq 3$; however, in this section, we regard $q$ as a real number and several variables as functions of $q$. For example, $m_{2}=m_{2}(q)$, $j(q)=q-2$, and $\beta_{s,\,2}=\beta_{s,\,2}(q)$.

10.1. Proof of Lemma 8.4

Lemma 10.1.

Define the function $f_{\star}$ of $q$ as

\[ f_{\star}(q)=\frac{1}{\beta_{s,\,2}}\Big(\log qm_{2}-\frac{1}{2}\Big)-\frac{1}{8}(qm_{2})^{2}+\frac{1}{4}m_{2}+\frac{251}{2002}\ . \tag{10.1} \]

Then, if $q>e^{8}$, we have $f_{\star}^{\prime}(q)=\frac{d}{dq}f_{\star}(q)>0$.

Proof of Lemma 8.4.

By Proposition A.1, we obtain f(6500)>0f_{\star}(6500)>0. We observe that (q1)8q<18\frac{(q-1)}{8q}<\frac{1}{8} and q2+4q+18q(q1)<2512002\frac{-q^{2}+4q+1}{8q(q-1)}<-\frac{251}{2002} if q>1000q>1000. Hence, Lemma 10.1 proves Lemma 8.4. ∎

10.2. Proof of Lemma 10.1

Let $s_{2}=s_{2}(q)=qm_{2}$. In the first lemma, we compute $m_{2}^{\prime}=(d/dq)m_{2}$, $s_{2}^{\prime}=(d/dq)s_{2}$, and $\beta_{s,\,2}^{\prime}=(d/dq)\beta_{s,\,2}$.

Lemma 10.2.

We have

\begin{align*}
m_{2}^{\prime}=\frac{d}{dq}m_{2} &= -\frac{m_{2}(1-jm_{2}-qjm_{2}^{2})}{q(1-2jm_{2})}\ , \\
s_{2}^{\prime}=\frac{d}{dq}s_{2} &= -\frac{js_{2}^{2}(1-s_{2})}{q(q-2js_{2})}\ , \\
\beta_{s,\,2}^{\prime}=\frac{d}{dq}\beta_{s,\,2} &= \frac{1}{1-s_{2}}\bigg(\beta_{s,\,2}s_{2}^{\prime}-2\frac{-s_{2}+s_{2}^{2}+qs_{2}^{\prime}}{(q-js_{2})s_{2}}\bigg)\ .
\end{align*}
Proof.

We observe that

\[ \beta_{s,\,2}=g_{2}(m_{2})=\frac{2}{1-qm_{2}}\log\frac{1-jm_{2}}{2m_{2}}=\frac{2}{qm_{2}(1-jm_{2})}\ , \]

so that

\[ \log(1-jm_{2})-\log 2m_{2}=\frac{2}{q}\Big(\frac{1}{2m_{2}}-\frac{1}{1-jm_{2}}\Big)\ . \]

By differentiating this equation in $q$, we get

\[ \frac{-m_{2}-jm_{2}^{\prime}}{1-jm_{2}}-\frac{m_{2}^{\prime}}{m_{2}}=-\frac{2}{q^{2}}\Big(\frac{1}{2m_{2}}-\frac{1}{1-jm_{2}}\Big)+\frac{2}{q}\Big(-\frac{m_{2}^{\prime}}{2m_{2}^{2}}+\frac{-m_{2}-jm_{2}^{\prime}}{(1-jm_{2})^{2}}\Big)\ . \]

By elementary computation, we can write

\[ m_{2}^{\prime}=-\frac{m_{2}(1-jm_{2}-qjm_{2}^{2})}{q(1-2jm_{2})}\ . \tag{10.2} \]

Let $s_{2}=qm_{2}$. Then,

\[ s_{2}^{\prime}=m_{2}+qm_{2}^{\prime}=-\frac{js_{2}^{2}(1-s_{2})}{q(q-2js_{2})}\ . \tag{10.3} \]

Next, we compute $\beta_{s,\,2}^{\prime}$. Note that

\[ \beta_{s,\,2}=\frac{2}{1-s_{2}}\log\frac{q-js_{2}}{2s_{2}}\ , \]

so that

\begin{align*}
\beta_{s,\,2}^{\prime} &= \frac{2s_{2}^{\prime}}{(1-s_{2})^{2}}\log\frac{q-js_{2}}{2s_{2}}+\frac{2}{1-s_{2}}\Big(\frac{1-s_{2}-js_{2}^{\prime}}{q-js_{2}}-\frac{s_{2}^{\prime}}{s_{2}}\Big) \\
&= \frac{1}{1-s_{2}}\bigg(s_{2}^{\prime}\frac{2}{1-s_{2}}\log\frac{q-js_{2}}{2s_{2}}+2\frac{s_{2}-s_{2}^{2}-js_{2}s_{2}^{\prime}-qs_{2}^{\prime}+js_{2}s_{2}^{\prime}}{(q-js_{2})s_{2}}\bigg) \tag{10.4} \\
&= \frac{1}{1-s_{2}}\bigg(\beta_{s,\,2}s_{2}^{\prime}-2\frac{-s_{2}+s_{2}^{2}+qs_{2}^{\prime}}{(q-js_{2})s_{2}}\bigg)\ .
\end{align*}

The next lemma provides bounds on $m_{2}(q)$.

Lemma 10.3.

Let $q>e^{8}$. We have

\[ \frac{1}{2q\log q}<m_{2}(q)<\frac{1}{q\log q}\ . \]
Proof.

It can be observed that $h_{2}(m_{2})=0$ and $h_{2}(t)>0$ if $m_{2}<t<1/q$. We claim that

\[ h_{2}(a)=\log\frac{1-ja}{2a}+\frac{qa-1}{qa(1-ja)}>0\ , \]

where $a=1/(q\log q)$. The above inequality can be written as

\[ \log\frac{q\log q-j}{2}>\Big(\frac{q\log q-q}{q\log q-j}\Big)\log q\ . \]

Since the right-hand side is smaller than $\log q$, it suffices to show that

\[ \log q+\log\frac{\log q-1+2/q}{2}>\log q\ , \]

which is true if $q>e^{3}$. Hence, $m_{2}<1/(q\log q)$. Next, we have $m_{2}>1/(2q\log q)$, since $h_{2}(1/(2q\log q))<0$ can be written as

\[ \log\Big(q\log q-\frac{j}{2}\Big)-2\Big\{\frac{q\log q-q/2}{q\log q-j/2}\Big\}\log q<0\ , \]

which is true if $q>e^{8}$. ∎
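The two-sided bound of Lemma 10.3 can be illustrated numerically for a few values of $q>e^{8}\approx 2981$. The sketch below computes $m_{2}(q)$ as the sign change of $h_{2}$ in $(0,1/q)$ by bisection; the bracket $[10^{-12},\,0.95/q]$ and the function names are assumptions of this illustration, and the check is of course no substitute for the proof.

```python
import math


def h2(q, t):
    """h_2(t) = log((1 - jt)/(2t)) + (qt - 1)/(qt(1 - jt)), with j = q - 2."""
    j = q - 2
    return math.log((1 - j * t) / (2 * t)) + (q * t - 1) / (q * t * (1 - j * t))


def m2(q, iters=200):
    """Bisection for the zero of h_2 in (0, 1/q); the bracket is an assumption of this sketch."""
    lo, hi = 1e-12, 0.95 / q
    assert h2(q, lo) < 0 < h2(q, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h2(q, mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lemma_10_3_bounds_hold(q):
    """Check 1/(2 q log q) < m_2(q) < 1/(q log q) for one value of q."""
    m = m2(q)
    log_q = math.log(q)
    return 1.0 / (2 * q * log_q) < m < 1.0 / (q * log_q)
```

For example, `lemma_10_3_bounds_hold(6500)` evaluates the bound at the threshold value of $q$ used in Lemma 8.4.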

In the next two lemmas, we prove that some quantities are positive.

Lemma 10.4.

Let $q>e^{8}$. We have

\[ m_{2}^{\prime}-s_{2}s_{2}^{\prime}>0\ . \]
Proof.

We have

\begin{align*}
m_{2}^{\prime}-s_{2}s_{2}^{\prime} &= -\frac{m_{2}(1-jm_{2}-jqm_{2}^{2})}{q(1-2jm_{2})}+\frac{js_{2}^{3}(1-s_{2})}{q(q-2js_{2})} \\
&= \frac{s_{2}(-1+jm_{2}+jq(q+1)m_{2}^{2}-jq^{3}m_{2}^{3})}{q(q-2js_{2})}\ .
\end{align*}

It suffices to show that

\[ jq(q+1)m_{2}^{2}-jq^{3}m_{2}^{3}-1>0\ . \]

Since $\frac{1}{2q\log q}<m_{2}<\frac{1}{q\log q}$, we obtain

\begin{align*}
jq(q+1)m_{2}^{2}-jq^{3}m_{2}^{3}-1 &> \frac{jq(q+1)}{4q^{2}(\log q)^{2}}-\frac{jq^{3}}{q^{3}(\log q)^{3}}-1 \\
&= \frac{1}{q(\log q)^{3}}\Big[\frac{(q+1)(q-2)}{4}\log q-q(q-2)-q(\log q)^{3}\Big] \\
&> \frac{1}{q(\log q)^{3}}\big[2(q+1)(q-2)-q(q-2)-q(\log q)^{3}\big] \\
&= \frac{1}{q(\log q)^{3}}\big[q^{2}-q(\log q)^{3}-4\big]>0\ .
\end{align*}

In the second and third inequalities, we use $q>e^{8}$. Hence, $m_{2}^{\prime}-s_{2}s_{2}^{\prime}>0$. ∎

Lemma 10.5.

Let $q>e^{8}$. We have

\[ \Big(\frac{1}{2}-\log s_{2}\Big)\beta_{s,\,2}^{\prime}+\beta_{s,\,2}\frac{s_{2}^{\prime}}{s_{2}}>0\ . \tag{10.5} \]
Proof.

Let $A(q)=\frac{1}{2}-\log s_{2}$. From Lemma 10.3, we obtain

\[ \frac{5}{2}<\frac{1}{2}+\log 8<\frac{1}{2}+\log\log q<A(q)<\frac{1}{2}+\log(2\log q)\ , \]

and

\begin{align*}
A(q)\beta_{s,\,2}^{\prime}+\beta_{s,\,2}\frac{s_{2}^{\prime}}{s_{2}} &= \frac{s_{2}^{\prime}}{1-s_{2}}\Big[A(q)\beta_{s,\,2}-\frac{2q}{q-2}\frac{A(q)}{s_{2}^{2}}\Big]+\frac{s_{2}^{\prime}}{1-s_{2}}\Big[\frac{1-s_{2}}{s_{2}}\beta_{s,\,2}\Big] \\
&= \frac{s_{2}^{\prime}}{1-s_{2}}\Big[\beta_{s,\,2}\Big(\frac{1}{s_{2}}+A(q)-1\Big)-\frac{2q}{q-2}\frac{A(q)}{s_{2}^{2}}\Big]\ .
\end{align*}

Hence, since $s_{2}^{\prime}<0$, it suffices to show that

\[ \Big(\frac{2q}{q-2}\Big)\frac{A(q)}{s_{2}^{2}}>\beta_{s,\,2}\Big(\frac{1}{s_{2}}+A(q)-1\Big)=\frac{\beta_{s,\,2}}{s_{2}}[1+(A(q)-1)s_{2}]\ , \]

i.e.,

\[ \beta_{s,\,2}<\frac{1}{1+(A(q)-1)s_{2}}\cdot\frac{2q}{q-2}\cdot\frac{A(q)}{s_{2}}\ . \]

Since $s_{2}<1/\log q$, the right-hand side is greater than

\begin{align*}
\frac{1}{1+(A(q)-1)s_{2}}\cdot\frac{2qA(q)}{q-2}\log q &> \frac{1}{1+(A(q)-1)s_{2}}\Big(\frac{5q}{q-2}\Big)\log q \\
&> \frac{5}{1+(A(q)-1)s_{2}}\log q\ ,
\end{align*}

and

\[ \beta_{s,\,2}<g_{2}\Big(\frac{1}{q\log q}\Big)=\frac{2\log q}{\log q-1}\log\frac{q\log q-(q-2)}{2}<\frac{5}{2}\log(q\log q)<\frac{15}{4}\log q\ , \]

where the last inequality is equivalent to $1/2>\log(\log q)/\log q$, which is true for $q>e^{8}$.

Hence, it is enough to show that

\[ \frac{1}{1+(A(q)-1)s_{2}}>\frac{3}{4}\ ,\quad\text{i.e.,}\quad\frac{1}{3}>(A(q)-1)s_{2}\ . \]

Since $0<A(q)<1/2+\log(2\log q)$ and $s_{2}<1/\log q$, we obtain, for $q>e^{8}$,

\[ (A(q)-1)s_{2}<\frac{\log(2\log q)-1/2}{\log q}<\frac{1}{3}\ . \]

Now, we derive the proof of Lemma 10.1.

Proof of Lemma 10.1.

By Lemma 10.2,

\[ -s_{2}+s_{2}^{2}+qs_{2}^{\prime}=-\frac{js_{2}^{2}(1-s_{2})}{q-2js_{2}}-s_{2}(1-s_{2})=(q-js_{2})\frac{q}{js_{2}}s_{2}^{\prime}\ , \]

so that

\[ \beta_{s,\,2}^{\prime}=\frac{1}{1-s_{2}}\Big(\beta_{s,\,2}s_{2}^{\prime}-2\frac{q}{js_{2}^{2}}s_{2}^{\prime}\Big)=\frac{1}{1-s_{2}}\Big(\beta_{s,\,2}-\frac{2q}{js_{2}^{2}}\Big)s_{2}^{\prime}\ . \]

Now, we return to $f_{\star}(q)$. We have

\begin{align*}
f_{\star}(q) &= \frac{1}{\beta_{s,\,2}}\Big(\log qm_{2}-\frac{1}{2}\Big)-\frac{1}{8}(qm_{2})^{2}+\frac{1}{4}m_{2}+\frac{251}{2002} \\
&= \frac{1}{\beta_{s,\,2}}\Big(\log s_{2}-\frac{1}{2}\Big)-\frac{1}{8}(s_{2})^{2}+\frac{1}{4}m_{2}+\frac{251}{2002}\ ,
\end{align*}

so that

\begin{align*}
f_{\star}^{\prime}(q) &= -\frac{\beta_{s,\,2}^{\prime}}{\beta_{s,\,2}^{2}}\Big(\log s_{2}-\frac{1}{2}\Big)+\frac{1}{\beta_{s,\,2}}\Big(\frac{s_{2}^{\prime}}{s_{2}}\Big)+\frac{1}{4}(m_{2}^{\prime}-s_{2}s_{2}^{\prime}) \\
&= \frac{1}{\beta_{s,\,2}^{2}}\Big[\beta_{s,\,2}^{\prime}\Big(\frac{1}{2}-\log s_{2}\Big)+\beta_{s,\,2}\Big(\frac{s_{2}^{\prime}}{s_{2}}\Big)\Big]+\frac{1}{4}(m_{2}^{\prime}-s_{2}s_{2}^{\prime})\ .
\end{align*}

Both terms on the right-hand side are positive by Lemmas 10.5 and 10.4, respectively, which proves Lemma 10.1. ∎

Appendix A Some Numerical Computations

Recall the definition (10.1) of f()f_{\star}(\cdot). In this section, we verify several inequalities numerically. Our purpose is the following proposition. The proof is presented at the end of this section.

Proposition A.1.

The following hold.

  (1) For $5\leq q\leq 6500$, we have $F_{\beta_{s,\,2}}({\bf u}_{2})>F_{\beta_{s,\,2}}({\bf v}_{1})$.

  (2) For $6\leq q\leq 54$, we have $\frac{d}{d\beta}[f_{c}(\beta)-F_{\beta}({\bf u}_{2})]\Big|_{\beta=\beta_{s,\,2}}>0$.

  (3) $f_{\star}(6500)>0$.

Bounds of $\beta_{s,\,2}$, $m_{2}$ and $v_{1}$.

We will obtain bounds on $\beta_{s,\,2}$, $m_{2}$, and $v_{1}$. Fix $q\geq 5$ and let $j=q-2$. By the gradient descent method, we obtain the following.

Algorithm A.2.

We define $\beta_{s,\,2}^{u}$ and $\beta_{s,\,2}^{l}$ in the following way.

  (1) $t_{0}\leftarrow 1/(2q-4)$.

  (2) While $g_{2}^{\prime}(t_{i})>10^{-6}$, let $t_{i+1}\leftarrow t_{i}-g_{2}^{\prime}(t_{i})/(300q^{2})$.

  (3) If $g_{2}^{\prime}(t_{i})\leq 10^{-6}$, let $m_{2}^{*}\leftarrow t_{i}$.

Let $\beta_{s,\,2}^{u}\coloneqq g_{2}(m_{2}^{*})+(36/q)|g_{2}^{\prime}(m_{2}^{*})|$ and $\beta_{s,\,2}^{l}\coloneqq g_{2}(m_{2}^{*})-(36/q)|g_{2}^{\prime}(m_{2}^{*})|$.
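The steps of Algorithm A.2 can be sketched in code as follows. This is a minimal illustration rather than the authors' implementation: it uses the expression $g_{2}^{\prime}(t)=\frac{2q}{(1-qt)^{2}}h_{2}(t)$, which follows from differentiating $g_{2}$ directly, and it adds an iteration cap (`max_iter`) as a safeguard that is not part of the algorithm as stated. The function names are hypothetical.

```python
import math


def g2(q, t):
    """g_2(t) = 2/(1 - qt) * log((1 - jt)/(2t)), with j = q - 2."""
    j = q - 2
    return 2.0 / (1 - q * t) * math.log((1 - j * t) / (2 * t))


def h2(q, t):
    """h_2(t) = log((1 - jt)/(2t)) + (qt - 1)/(qt(1 - jt)), with j = q - 2."""
    j = q - 2
    return math.log((1 - j * t) / (2 * t)) + (q * t - 1) / (q * t * (1 - j * t))


def g2_prime(q, t):
    """Derivative of g_2, written as 2q/(1 - qt)^2 * h_2(t)."""
    return 2.0 * q / (1 - q * t) ** 2 * h2(q, t)


def algorithm_a2(q, tol=1e-6, max_iter=10**6):
    """Return (m2_star, beta_lower, beta_upper) following steps (1)-(3) of Algorithm A.2."""
    t = 1.0 / (2 * q - 4)                 # step (1)
    for _ in range(max_iter):             # iteration cap: safeguard added in this sketch
        slope = g2_prime(q, t)
        if slope <= tol:                  # step (3): stop once g_2'(t_i) <= 10^{-6}
            break
        t -= slope / (300 * q * q)        # step (2): gradient descent update
    else:
        raise RuntimeError("gradient descent did not reach the tolerance")
    m2_star = t
    margin = (36.0 / q) * abs(g2_prime(q, m2_star))
    return m2_star, g2(q, m2_star) - margin, g2(q, m2_star) + margin
```

For $q=5$, `algorithm_a2(5)` should return $m_{2}^{*}$ close to $0.15$ and an interval $[\beta_{s,\,2}^{l},\beta_{s,\,2}^{u}]$ of width of order $10^{-5}$ around $g_{2}(m_{2}^{*})$.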

We record $m_{2}^{*}$ in the above algorithm and let

\[ \rho_{m}\coloneqq g_{2}^{\prime}(m_{2}^{*})/q\ . \]

Algorithm A.3.

We define $m_{2}^{u}$ and $m_{2}^{l}$ in the following way.

  (1) If $h_{2}(m_{2}^{*})\geq 0$, let $m_{2}^{u}\coloneqq m_{2}^{*}+\rho_{m}$.

    (a) $t_{0}\leftarrow m_{2}^{*}$.

    (b) While $h_{2}(t_{i})\geq 0$, let $t_{i+1}\leftarrow t_{i}-\rho_{m}$.

    (c) If $h_{2}(t_{i})<0$, let $m_{2}^{l}\coloneqq t_{i}-\rho_{m}$.

  (2) If $h_{2}(m_{2}^{*})<0$, let $m_{2}^{l}\coloneqq m_{2}^{*}-\rho_{m}$.

    (a) $t_{0}\leftarrow m_{2}^{*}$.

    (b) While $h_{2}(t_{i})\leq 0$, let $t_{i+1}\leftarrow t_{i}+\rho_{m}$.

    (c) If $h_{2}(t_{i})>0$, let $m_{2}^{u}\coloneqq t_{i}+\rho_{m}$.

By Newton method, we approximate v1v_{1} which satisfies g1(v1)=βs, 2g_{1}(v_{1})=\beta_{s,\,2}.

Algorithm A.4.

We define $v_{1}^{u}$ and $v_{1}^{l}$ in the following way.

  1. (1)

    Let $t_{0}\,=\,0.8/q$ and $t_{-1}=0$.

    1. (a)

      While $|t_{i}-t_{i-1}|>10^{-5}/q$, let $t_{i+1}\leftarrow t_{i}-(g_{1}(t_{i})-\beta_{s,\,2}^{u})/g_{1}^{\prime}(t_{i})$.

    2. (b)

      If $|t_{i}-t_{i-1}|\leq 10^{-5}/q$, let $v_{1}^{*}\coloneqq t_{i}$ and $\rho_{v}\coloneqq|t_{i}-t_{i-1}|$.

  2. (2)

    If $g_{1}(v_{1}^{*})>\beta_{s,\,2}^{u}$, let $v_{1}^{u}\coloneqq v_{1}^{*}+\rho_{v}$.

  3. (3)

    If $g_{1}(v_{1}^{*})\leq\beta_{s,\,2}^{u}$, proceed as follows.

    1. (a)

      $a_{0}\leftarrow v_{1}^{*}$.

    2. (b)

      While $g_{1}(a_{i})\leq\beta_{s,\,2}^{u}$, let $a_{i+1}\leftarrow a_{i}+\rho_{v}$.

    3. (c)

      If $g_{1}(a_{i})>\beta_{s,\,2}^{u}$, let $v_{1}^{u}\coloneqq a_{i}+\rho_{v}$.

  4. (4)

    If $g_{1}(v_{1}^{*})<\beta_{s,\,2}^{l}$, let $v_{1}^{l}\coloneqq v_{1}^{*}-\rho_{v}$.

  5. (5)

    If $g_{1}(v_{1}^{*})\geq\beta_{s,\,2}^{l}$, proceed as follows.

    1. (a)

      $b_{0}\leftarrow v_{1}^{*}$.

    2. (b)

      While $g_{1}(b_{i})\geq\beta_{s,\,2}^{l}$, let $b_{i+1}\leftarrow b_{i}-\rho_{v}$.

    3. (c)

      If $g_{1}(b_{i})<\beta_{s,\,2}^{l}$, let $v_{1}^{l}\coloneqq b_{i}-\rho_{v}$.
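
A compact Python sketch of Algorithm A.4 is given below, again as an illustration only. The names `bound_v1` and `_walk`, the iteration caps, and the check that $\rho_{v}\neq 0$ are assumptions of the sketch; $g_{1}$ and $g_{1}^{\prime}$ come from the main text and are passed in as `g1` and `dg1`. Steps (2)-(3) and (4)-(5) are merged into one helper: if the stopping condition already holds at $v_{1}^{*}$, the walk returns immediately, which reproduces steps (2) and (4).

```python
from typing import Callable, Tuple

Func = Callable[[float], float]


def _walk(g1: Func, start: float, step: float,
          keep_going: Callable[[float], bool], max_steps: int) -> float:
    """Move from start in increments of step while keep_going(g1(x)) holds."""
    x = start
    for _ in range(max_steps):
        if not keep_going(g1(x)):
            return x
        x += step
    raise RuntimeError("walk did not terminate within max_steps")


def bound_v1(g1: Func, dg1: Func, beta_l: float, beta_u: float, q: int,
             max_iter: int = 100, max_steps: int = 1_000_000
             ) -> Tuple[float, float, float]:
    """Return (v1_lower, v1_star, v1_upper) as in Algorithm A.4."""
    prev, t = 0.0, 0.8 / q                   # t_{-1} = 0, t_0 = 0.8/q
    tol = 1e-5 / q
    for _ in range(max_iter):                # step (1a): Newton on g1(t) = beta_u
        if abs(t - prev) <= tol:
            break
        prev, t = t, t - (g1(t) - beta_u) / dg1(t)
    else:
        raise RuntimeError("Newton iteration did not converge")
    v_star, rho_v = t, abs(t - prev)         # step (1b)
    if rho_v == 0.0:
        # Safeguard added in this sketch: a zero step would never leave v_star.
        raise ValueError("rho_v is zero; the widening walk cannot proceed")
    a = _walk(g1, v_star, rho_v, lambda y: y <= beta_u, max_steps)
    b = _walk(g1, v_star, -rho_v, lambda y: y >= beta_l, max_steps)
    return b - rho_v, v_star, a + rho_v      # steps (2)-(5)
```

The walks terminate only if `g1` is increasing around the root, which is the property of $g_{1}$ near $v_{1}$ that the proof of Lemma A.5 relies on.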

Lemma A.5.

We have $\beta_{s,\,2}^{l}<\beta_{s,\,2}<\beta_{s,\,2}^{u}$, $m_{2}^{l}<m_{2}<m_{2}^{u}$, and $v_{1}^{l}<v_{1}<v_{1}^{u}$.

Proof.

By Taylor's theorem, we obtain

\[
g_{2}(m_{2}+t)=g_{2}(m_{2})+g_{2}^{\prime}(m_{2}+t^{*})\,t
\]

for some $t^{*}\in(0,t)$ if $t>0$ or $t^{*}\in(t,0)$ if $t<0$. Since $h_{2}$ is increasing in a neighborhood of $m_{2}$, we obtain

\begin{align*}
|g_{2}^{\prime}(m_{2}+t^{*})| &= \bigg|\frac{2q}{[1-q(m_{2}+t^{*})]^{2}}h_{2}(m_{2}+t^{*})\bigg| \\
&\leq \bigg|\frac{2q}{[1-q(m_{2}+t^{*})]^{2}}h_{2}(m_{2}+|t|)\bigg| = \bigg(\frac{1-q(m_{2}+|t|)}{1-q(m_{2}+t^{*})}\bigg)^{2}|g_{2}^{\prime}(m_{2}+|t|)|\ .
\end{align*}

Since $m_{2}+t^{*},\,m_{2}<1/(2j)$, we obtain

\[
\frac{1-q(m_{2}+|t|)}{1-q(m_{2}+t^{*})}\,\leq\,\frac{1}{1-q(m_{2}+t^{*})}\,\leq\,\frac{1}{1-q/(2j)}\,=\,\frac{2q-4}{q-4}\,\leq\,6\ ,
\]

where the last inequality follows from $q\geq 5$. Hence, we have

\[
|g_{2}^{\prime}(m_{2}+t^{*})|\leq 36\,|g_{2}^{\prime}(m_{2}+|t|)|\ ,
\]

so that

\[
|\beta_{s,\,2}-g_{2}(m_{2}+t)|\,=\,|g_{2}(m_{2})-g_{2}(m_{2}+t)|\,\leq\,|g_{2}^{\prime}(m_{2}+t^{*})||t|\,\leq\,\frac{36}{q}|g_{2}^{\prime}(m_{2}+|t|)|\ ,
\]

which proves the first claim. In the last inequality, we use the fact that $|t|<1/q$.

Since $h_{2}(t)>0$ if $t>m_{2}$ and $h_{2}(t)<0$ if $t<m_{2}$, the second claim holds. Finally, since $g_{1}$ is increasing in a neighborhood of $v_{1}$, the third claim holds. ∎
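
The constant 36 in the first claim is the square of the bound 6 on $1/(1-q/(2j))$ with $j=q-2$. As a quick sanity check, the following sketch evaluates this quantity in exact rational arithmetic; the function name `amplification_factor` is hypothetical.

```python
from fractions import Fraction


def amplification_factor(q: int) -> Fraction:
    """Return 1 / (1 - q/(2j)) with j = q - 2, as an exact fraction (q >= 5)."""
    j = q - 2
    return 1 / (1 - Fraction(q, 2 * j))


# The factor equals (2q-4)/(q-4) = 2 + 4/(q-4), so it is largest at q = 5.
worst = max(amplification_factor(q) for q in range(5, 6501))
print(worst)
```

Since $(2q-4)/(q-4)=2+4/(q-4)$ is decreasing in $q$, the maximum over $q\geq 5$ is attained at $q=5$.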

We finally prove Proposition A.1.

Proof of Proposition A.1.

From Lemma A.5, we obtain

\[
\beta_{s,\,2}^{l}<\beta_{s,\,2}<\beta_{s,\,2}^{u}\ ,\quad m_{2}^{l}<m_{2}<m_{2}^{u}\ ,\quad\text{and}\quad v_{1}^{l}<v_{1}<v_{1}^{u}\ .
\]

By elementary computation, we have

\begin{align*}
F_{\beta_{s,\,2}}({\bf u}_{2})-F_{\beta_{s,\,2}}({\bf v}_{1}) &\geq \frac{1}{4}\Big[q(q-2)\Big(m_{2}^{u}-\frac{1}{q-2}\Big)^{2}-\frac{2}{q-2}\Big]+\frac{1}{\beta_{s,\,2}^{l}}\log m_{2}^{l} \\
&\quad -\frac{1}{2}\Big[q(q-1)\Big(v_{1}^{l}-\frac{1}{q-1}\Big)^{2}-\frac{1}{q-1}\Big]-\frac{1}{\beta_{s,\,2}^{u}}\log v_{1}^{u}\ , \\
\log(q\beta_{s,\,2})+2k_{2}(m_{2}) &\geq \log(q\beta_{s,\,2}^{l})+2k_{2}(m_{2}^{u})\ , \\
f_{\star}(6500) &\geq \frac{1}{\beta_{s,\,2}^{l}}\Big(\log qm_{2}^{l}-\frac{1}{2}\Big)-\frac{1}{8}(qm_{2}^{u})^{2}+\frac{1}{4}m_{2}^{l}+\frac{251}{2002}\ .
\end{align*}

The second inequality holds since $k_{2}(\cdot)$ is decreasing according to (8.7). Numerical computation shows that the right-hand sides of the displayed inequalities are positive for $5\leq q\leq 6500$, which completes the proof. ∎

Appendix B Proof of (3.8)

Proof of (3.8).

Since we have

\[
Z_{N}(\beta)=\sum_{\bm{x}\in\Xi}\frac{N!}{(Nx_{1})!\cdots(Nx_{q})!}\exp\{-\beta NH(\bm{x})\}\ ,
\]

we can use the elementary bound

\[
k\log k-k\leq\log k!\leq(k+1)\log(k+1)-k\ ,
\]

to obtain

\begin{align*}
&\sum_{\bm{x}\in\Xi}\exp\Big\{-\beta N\Big[-\frac{1}{2}(x_{1}^{2}+\cdots+x_{q}^{2})+\frac{1}{\beta}\sum_{i=1}^{q}\big(x_{i}+\frac{1}{N}\big)\log\big(x_{i}+\frac{1}{N}\big)\Big]-q\log N\Big\} \\
&\qquad\leq\ Z_{N}(\beta)\ \leq\ \sum_{\bm{x}\in\Xi}\exp\Big\{-\beta NF_{\beta}(\bm{x})+\log(N+1)+N\log\big(1+\frac{1}{N}\big)\Big\}\ .
\end{align*}

Hence, by the definition (2.4) of $F_{\beta}$, we obtain

\[
\sup_{\bm{x}\in\Xi}\{-F_{\beta}(\bm{x})\}+O\Big(\frac{\log N}{N}\Big)\leq\frac{1}{\beta N}\log Z_{N}(\beta)\leq\sup_{\bm{x}\in\Xi}\{-F_{\beta}(\bm{x})\}+O\Big(\frac{\log N}{N}\Big)
\]

and the proof is completed. ∎
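
The elementary factorial bound used in this proof, $k\log k-k\leq\log k!\leq(k+1)\log(k+1)-k$, can be checked numerically. The sketch below is only such a check, with the hypothetical helper name `log_factorial_bounds`; it evaluates $\log k!$ as `math.lgamma(k + 1)` and uses the convention $0\log 0=0$.

```python
import math
from typing import Tuple


def log_factorial_bounds(k: int) -> Tuple[float, float, float]:
    """Return (lower, log k!, upper) for the bound k log k - k <= log k! <= (k+1) log(k+1) - k."""
    lower = k * math.log(k) - k if k > 0 else 0.0   # convention 0 log 0 = 0
    upper = (k + 1) * math.log(k + 1) - k
    return lower, math.lgamma(k + 1), upper          # lgamma(k + 1) = log k!


for k in (1, 10, 100, 1000):
    lo, mid, hi = log_factorial_bounds(k)
    print(k, lo <= mid <= hi)
```

Both inequalities follow from comparing $\log k!=\sum_{i=1}^{k}\log i$ with the integrals of $\log x$ over $[0,k]$ and $[1,k+1]$.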

Acknowledgement.

This work was supported by the Samsung Science and Technology Foundation (Project Number SSTF-BA1901-03).

References

  • [1] L. Alonso, R. Cerf: The three dimensional polyominoes of minimal area. Electron. J. Comb. 3, R27 (1996)
  • [2] J. Beltrán, C. Landim: Tunneling and metastability of continuous time Markov chains. J. Stat. Phys. 140, 1065–1114 (2010)
  • [3] J. Beltrán, C. Landim: Tunneling and metastability of continuous time Markov chains II. J. Stat. Phys. 149, 598–618 (2012)
  • [4] G. Ben Arous, R. Cerf: Metastability of the three dimensional Ising model on a torus at very low temperatures. Electron. J. Probab. 1, 1–55 (1996)
  • [5] A. Bianchi, A. Bovier, D. Ioffe: Sharp asymptotics for metastability in the random field Curie-Weiss model. Electron. J. Probab. 14, 1541–1603 (2009)
  • [6] A. Bovier, M. Eckhoff, V. Gayrard, M. Klein: Metastability in stochastic dynamics of disordered mean-field models. Probab. Theory Relat. Fields 119, 99–161 (2001)
  • [7] A. Bovier, F. den Hollander: Metastability: A Potential-Theoretic Approach. Grundlehren der mathematischen Wissenschaften. Springer. (2015)
  • [8] A. Bovier, F. den Hollander, F.R. Nardi: Sharp asymptotics for Kawasaki dynamics on a finite box with open boundary. Probab. Theory Relat. Fields 135, 265–310 (2006)
  • [9] A. Bovier, F. den Hollander, C. Spitoni: Homogeneous nucleation for Glauber and Kawasaki dynamics in large volumes at low temperatures. Ann. Probab. 38, 661–713 (2010)
  • [10] A. Bovier, F. Manzo: Metastability in Glauber dynamics in the low-temperature limit: beyond exponential asymptotics. J. Stat. Phys. 107, 757–779 (2002)
  • [11] A. Bovier, S. Marello, E. Pulvirenti: Metastability for the dilute Curie–Weiss model with Glauber dynamics. arXiv:1912.10699 (2019)
  • [12] M. Cassandro, A. Galves, E. Olivieri, M.E. Vares: Metastable behavior of stochastic dynamics: A pathwise approach. J. Stat. Phys. 35, 603–634 (1984)
  • [13] M. Costeniuc, R. S. Ellis, H. Touchette: Complete analysis of phase transitions and ensemble equivalence for the Curie–Weiss–Potts model. J. Math. Phys. 46, 063301 (2005)
  • [14] P. Cuff, J. Ding, O. Louidor, E. Lubetzky, Y. Peres, A. Sly: Glauber dynamics for the mean-field Potts model. J. Stat. Phys. 149, 432–477 (2012)
  • [15] P. Eichelsbacher, B. Martschink: On rates of convergence in the Curie–Weiss–Potts model with an external field. Ann. Inst. Henri Poincaré Probab. Stat. 51, 252–282 (2015)
  • [16] R. S. Ellis, K. Wang: Limit theorems for the empirical vector of the Curie–Weiss–Potts model. Stoch. Process. Their Appl. 35, 59–79 (1990)
  • [17] R. B. Griffiths, P. A. Pearce: Potts model in the many-component limit. J. Phys. A. 13, 2143–2148 (1980)
  • [18] F. den Hollander, O. Jovanovski: Glauber dynamics on the Erdős-Rényi random graph. arXiv:1912.10591 (2020)
  • [19] H. Kesten, R. H. Schonmann: Behavior in large dimensions of the Potts and Heisenberg models. Rev. Math. Phys. 1, 147–182 (1982)
  • [20] S. Kim, I. Seo: Metastability of stochastic Ising and Potts models on lattices without external fields. arXiv:2102.05565 (2021)
  • [21] C. Külske, D. Meißner: Stable and metastable phases for the Curie–Weiss–Potts model in vector-valued fields via singularity theory. J. Stat. Phys. 181, 968–989 (2020)
  • [22] C. Landim: Metastable Markov chains. Probab. Surv. 16, 143–227 (2019)
  • [23] C. Landim, I. Seo: Metastability of non-reversible random walks in a potential field, the Eyring–Kramers transition rate formula. Commun. Pure Appl. Math. 71, 203–266 (2018)
  • [24] C. Landim, I. Seo: Metastability of non-reversible, mean-field Potts model with three spins. J. Stat. Phys. 165, 693–726 (2016)
  • [25] D.A. Levin, M.J. Luczak, Y. Peres: Glauber dynamics for the mean-field Ising model: cut-off, critical power law, and metastability. Probab. Theory Relat. Fields 146, 223 (2010)
  • [26] E. Lubetzky, A. Sly: Cutoff for the Ising model on the lattice. Invent. Math. 191, 719–755 (2013)
  • [27] E. Lubetzky, A. Sly: Information percolation and cutoff for the stochastic Ising model. J. Am. Math. Soc. 29, 729–774 (2016)
  • [28] E. Lubetzky, A. Sly: Universality of cutoff for the Ising model. Ann. Probab. 45, 3664–3696 (2017)
  • [29] F.R. Nardi, A. Zocca: Tunneling behavior of Ising and Potts models in the low-temperature regime. Stoch. Process. Their Appl. 129, 4556–4575 (2019)
  • [30] E.J. Neves: A discrete variational problem related to Ising droplets at low temperatures. J. Stat. Phys. 80, 103–123 (1995)
  • [31] E.J. Neves, R.H. Schonmann: Critical droplets and metastability for a Glauber dynamics at very low temperatures. Commun. Math. Phys. 137, 209–230 (1991)
  • [32] E. Olivieri, M.E. Vares: Large deviations and metastability. Encyclopedia of Mathematics and Its Applications. 100. Cambridge University Press, Cambridge. (2005)
  • [33] M. Ostilli, F. Mukhamedov: Continuous- and discrete-time Glauber dynamics. First- and second order phase transitions in mean-field Potts models. EPL. 101(6) (2013)
  • [34] F. Rassoul-Agha, T. Seppäläinen: A Course on Large Deviations with an Introduction to Gibbs Measures. Graduate Studies in Mathematics. 162. American Mathematical Society. (2015)
  • [35] K. Wang: Solutions of the variational problem in the Curie–Weiss–Potts model. Stoch. Process. Their Appl. 50, 245–252 (1994)
  • [36] F.Y. Wu: The Potts model. Rev. Mod. Phys. 54, 235–268 (1982)