
The Critical Mean-field Chayes-Machta Dynamics

Antonio Blanca, Pennsylvania State University. Email: [email protected]. Research supported in part by NSF grant CCF-1850443.    Alistair Sinclair, UC Berkeley. Email: [email protected]. Research supported in part by NSF grant CCF-1815328.    Xusheng Zhang, Pennsylvania State University. Email: [email protected]. Research supported in part by NSF grant CCF-1850443.
Abstract

The random-cluster model is a unifying framework for studying random graphs, spin systems and electrical networks that plays a fundamental role in designing efficient Markov Chain Monte Carlo (MCMC) sampling algorithms for the classical ferromagnetic Ising and Potts models.

In this paper, we study a natural non-local Markov chain known as the Chayes-Machta dynamics for the mean-field case of the random-cluster model, where the underlying graph is the complete graph on n vertices. The random-cluster model is parametrized by an edge probability p and a cluster weight q. Our focus is on the critical regime: p=p_{c}(q) and q\in(1,2), where p_{c}(q) is the threshold corresponding to the order-disorder phase transition of the model. We show that the mixing time of the Chayes-Machta dynamics is O(\log n\cdot\log\log n) in this parameter regime, which reveals that the dynamics does not undergo an exponential slowdown at criticality, a surprising fact that had been predicted (but not proved) by statistical physicists. This also provides a nearly optimal bound (up to the \log\log n factor) for the mixing time of the mean-field Chayes-Machta dynamics in the only regime of parameters where no non-trivial bound was previously known. Our proof consists of a multi-phased coupling argument that combines several key ingredients, including a new local limit theorem, a precise bound on the maximum of symmetric random walks with varying step sizes, and tailored estimates for critical random graphs.

In addition, we derive an improved comparison inequality between the mixing time of the Chayes-Machta dynamics and that of the local Glauber dynamics on general graphs; this results in better mixing time bounds for the local dynamics in the mean-field setting.

1 Introduction

The random-cluster model generalizes classical random graph and spin system models, providing a unifying framework for their study [14]. It plays an indispensable role in the design of efficient Markov Chain Monte Carlo (MCMC) sampling algorithms for the ferromagnetic Ising/Potts model [31, 8, 20] and has become a fundamental tool in the study of phase transitions [2, 12, 11].

The random-cluster model is defined on a finite graph G=(V,E) with an edge probability parameter p\in(0,1) and a cluster weight q>0. The set of configurations of the model is the set of all subsets of edges A\subseteq E. The probability of each configuration A is given by the Gibbs distribution:

\mu_{G,p,q}(A)=\frac{1}{Z}\cdot p^{|A|}(1-p)^{|E|-|A|}q^{c(A)}; (1)

where c(A) is the number of connected components in (V,A) and Z:=Z(G,p,q) is the normalizing factor called the partition function.
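As a concrete illustration of (1), here is a minimal Python sketch (an assumption of this note, not part of the paper; it uses the networkx library and brute-force enumeration, so it is feasible only for very small graphs) computing the unnormalized weight of a configuration and the partition function Z:

```python
import itertools
import networkx as nx

def rc_weight(V, A, E, p, q):
    """Unnormalized random-cluster weight p^|A| (1-p)^(|E|-|A|) q^c(A)."""
    H = nx.Graph()
    H.add_nodes_from(V)          # isolated vertices still count as components
    H.add_edges_from(A)
    c = nx.number_connected_components(H)
    return p**len(A) * (1 - p)**(len(E) - len(A)) * q**c

def partition_function(V, E, p, q):
    """Brute-force Z(G,p,q): sum of weights over all 2^|E| edge subsets."""
    return sum(rc_weight(V, set(A), E, p, q)
               for r in range(len(E) + 1)
               for A in itertools.combinations(E, r))

# Example: a triangle with p = 1/2, q = 1.5.
V = [0, 1, 2]
E = [(0, 1), (1, 2), (0, 2)]
print(partition_function(V, E, 0.5, 1.5))
# Sanity check: for q = 1 (bond percolation) the weights sum to 1.
print(partition_function(V, E, 0.5, 1.0))
```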

The special case when q=1 corresponds to the independent bond percolation model, where each edge of the graph G appears independently with probability p. Independent bond percolation is also known as the Erdős-Rényi random graph model when G is the complete graph.

For integer q\geq 2, the random-cluster model is closely related to the ferromagnetic q-state Potts model. Configurations in the q-state Potts model are the assignments of spin values \{1,\dots,q\} to the vertices of G; the q=2 case corresponds to the Ising model. A sample A\subseteq E from the random-cluster distribution can be easily transformed into one for the Ising/Potts model by independently assigning a random spin from \{1,\dots,q\} to each connected component of (V,A), as illustrated in the sketch below. Random-cluster based sampling algorithms, which include the widely-studied Swendsen-Wang dynamics [30], are an attractive alternative to Ising/Potts Markov chains since they are often efficient at “low temperatures” (large p). In this parameter regime, several standard Ising/Potts Markov chains are known to converge slowly.
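A minimal sketch of this random-cluster-to-Potts transformation (assuming integer q and a networkx representation of the configuration; the helper name is ours):

```python
import random
import networkx as nx

def rc_to_potts(V, A, q):
    """Map a random-cluster sample A on vertex set V to a q-state Potts
    sample: each connected component of (V, A) gets one uniform spin."""
    H = nx.Graph()
    H.add_nodes_from(V)
    H.add_edges_from(A)
    spin = {}
    for component in nx.connected_components(H):
        s = random.randint(1, q)   # one spin shared by the whole component
        for v in component:
            spin[v] = s
    return spin
```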

In this paper we investigate the Chayes-Machta (CM) dynamics [10], a natural Markov chain on random-cluster configurations that converges to the random-cluster measure. The CM dynamics is a generalization to non-integer values of q of the widely studied Swendsen-Wang dynamics [30]. As with all applications of the MCMC method, the primary object of study is the mixing time, i.e., the number of steps until the dynamics is close to its stationary distribution, starting from the worst possible initial configuration. We are interested in understanding how the mixing time of the CM dynamics grows as the size of the graph G increases, and in particular how it relates to the phase transition of the model.

Given a random-cluster configuration (V,A), one step of the CM dynamics is defined as follows:

(i) activate each connected component of (V,A) independently with probability 1/q;

(ii) remove all edges connecting active vertices;

(iii) add each edge between active vertices independently with probability p, leaving the rest of the configuration unchanged.

We call (i) the activation sub-step, and (ii) and (iii) combined the percolation sub-step. It is easy to check that this dynamics is reversible with respect to the Gibbs distribution (1) and thus converges to it [10]. For integer q, the CM dynamics may be viewed as a variant of the Swendsen-Wang dynamics. In the Swendsen-Wang dynamics, each connected component of (V,A) receives a random color from \{1,\dots,q\}, and the edges are updated within each color class as in (ii) and (iii) above; in contrast, the CM dynamics updates the edges of exactly one color class. However, note that the Swendsen-Wang dynamics is only well-defined for integer q, while the CM dynamics is feasible for any real q>1. Indeed, the CM dynamics was introduced precisely to allow this generalization.
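For intuition, one step of the CM dynamics can be written in a few lines. The following Python sketch (using networkx, with the configuration stored as a graph; illustrative rather than an optimized implementation) follows sub-steps (i)-(iii) directly:

```python
import random
import networkx as nx

def cm_step(X, p, q):
    """One Chayes-Machta step on a random-cluster configuration X
    (an nx.Graph on the n vertices). Returns the new configuration."""
    # (i) activation: each component independently with probability 1/q
    active = set()
    for component in nx.connected_components(X):
        if random.random() < 1.0 / q:
            active.update(component)
    # (ii) remove all edges joining two active vertices
    Y = X.copy()
    Y.remove_edges_from([(u, v) for u, v in X.edges()
                         if u in active and v in active])
    # (iii) percolation: re-add each edge between active vertices w.p. p,
    # i.e., place a fresh G(|active|, p) on the active vertices
    fresh = nx.gnp_random_graph(len(active), p)
    relabel = dict(enumerate(sorted(active)))
    Y.add_edges_from((relabel[u], relabel[v]) for u, v in fresh.edges())
    return Y
```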

The study of the interplay between phase transitions and the mixing time of Markov chains goes back to pioneering work in mathematical physics in the late 1980s. This connection for the specific case of the CM dynamics on the complete n-vertex graph, known as the mean-field model, has received some attention in recent years (see [7, 15, 18]) and is the focus of this paper. As we shall see, the mean-field case is already quite non-trivial, and has historically proven to be a useful starting point in understanding various types of dynamics on more general graphs. We note that, so far, the mean-field is the only setting in which there are tight mixing time bounds for the CM dynamics; all other known bounds are deduced indirectly via comparison with other Markov chains, thus incurring significant overhead [8, 6, 17, 5, 31, 7].

The phase transition for the mean-field random-cluster model is fairly well-understood [9, 25]. In this setting, it is natural to re-parameterize by setting p=\zeta/n; the phase transition then occurs at the critical value \zeta=\zeta_{\rm cr}(q), where \zeta_{\rm cr}(q)=q when q\in(0,2] and \zeta_{\rm cr}(q)=2\left(\frac{q-1}{q-2}\right)\log(q-1) for q>2. For \zeta<\zeta_{\rm cr}(q) all components are of size O(\log n) with high probability (w.h.p.); that is, with probability tending to 1 as n\rightarrow\infty. This regime is known as the disordered phase. On the other hand, for \zeta>\zeta_{\rm cr}(q) there is a unique giant component of size \approx\theta n, where \theta=\theta(\zeta,q); this regime of parameters is known as the ordered phase. The phase transition is thus analogous to that in G(n,p) corresponding to the emergence of a giant component.

The phase structure of the mean-field random-cluster model, however, is more subtle and depends crucially on the second parameter q. In particular, when q>2 the model exhibits phase coexistence at the critical threshold \zeta=\zeta_{\rm cr}(q). Roughly speaking, this means that when \zeta=\zeta_{\rm cr}(q), the set of configurations in which all connected components have size O(\log n) and the set of configurations with a unique giant component each contribute a constant fraction of the probability mass. For q\leq 2, on the other hand, there is no phase coexistence. These subtleties are illustrated in Figure 1.

Figure 1: (a): phase structure when q>2: the disordered phase for \zeta<\zeta_{\rm l}, the metastability window between \zeta_{\rm l} and \zeta_{\rm r} (with phase coexistence at \zeta_{\rm cr}), and the ordered phase for \zeta>\zeta_{\rm r}. (b): phase structure when q\in(1,2]: the disordered phase for \zeta<\zeta_{\rm cr} and the ordered phase for \zeta>\zeta_{\rm cr}.

Phase coexistence at \zeta=\zeta_{\rm cr}(q) when q>2 has significant implications for the speed of convergence of Markov chains, including the CM dynamics. The following detailed connection between the phase structure of the model and the mixing time \tau_{\rm mix}^{\mathrm{CM}} of the CM dynamics was recently established in [7, 4, 18]. When q>2, we have:

\tau_{\rm mix}^{\mathrm{CM}}=\begin{cases}\Theta(\log n)&\text{if }\zeta\not\in[\zeta_{\rm l},\zeta_{\rm r});\\ \Theta(n^{1/3})&\text{if }\zeta=\zeta_{\rm l};\\ e^{\Omega(n)}&\text{if }\zeta\in(\zeta_{\rm l},\zeta_{\rm r}),\end{cases} (2)

where (\zeta_{\rm l},\zeta_{\rm r}) is the so-called metastability window. It is known that \zeta_{\rm r}=q, but \zeta_{\rm l} does not have a closed form; see [7, 25]. We note that \zeta_{\rm cr}(q)\in(\zeta_{\rm l},\zeta_{\rm r}) for q>2.

When q\in(1,2], there is no metastability window, and the mixing time of the mean-field CM dynamics is \Theta(\log n) for all \zeta\neq\zeta_{\rm cr}(q). In view of these results, the only case remaining open is when q\in(1,2] and \zeta=\zeta_{\rm cr}(q). Our main result, stated below, concerns precisely this regime, which is particularly delicate and had resisted analysis until now for reasons we explain in our proof overview.

Theorem 1.1.

The mixing time of the CM dynamics on the complete n-vertex graph when \zeta=\zeta_{\rm cr}(q)=q and q\in(1,2) is O(\log n\cdot\log\log n).

An \Omega(\log n) lower bound is known for the mixing time of the mean-field CM dynamics that holds for all p\in(0,1) and q>1 [7]. Therefore, our result is tight up to the lower order O(\log\log n) factor, and in fact even better, as we explain in Remark 4. The conjectured tight bound when \zeta=\zeta_{\rm cr}(q) and q\in(1,2) is \Theta(\log n). We mention that the \zeta=\zeta_{\rm cr}(q) and q=2 case, which is quite different and not covered by Theorem 1.1, was considered earlier in [24] for the closely related Swendsen-Wang dynamics, and a tight \Theta(n^{1/4}) bound was established for its mixing time. The same mixing time bound is expected for the CM dynamics in this regime.

Our result establishes a striking behavior for random-cluster dynamics when q\in(1,2). Namely, there is no slowdown (exponential or power law) in this regime at the critical threshold \zeta=\zeta_{\rm cr}(q). Note that for q>2, as described in (2) above, the mixing time of the dynamics undergoes an exponential slowdown, transitioning from \Theta(\log n) when \zeta<\zeta_{\rm l}, to a power law at \zeta=\zeta_{\rm l}, and to exponential in n when \zeta\in(\zeta_{\rm l},\zeta_{\rm r}). The absence of a critical slowdown for q\in(1,2) was in fact predicted by the statistical physics community [16], and our result provides the first rigorous proof of this phenomenon. See Remark 1 for further comments.

Our second result concerns the local Glauber dynamics for the random-cluster model. In each step, the Glauber dynamics updates a single edge of the current configuration chosen uniformly at random; a precise definition of this Markov chain is given in Section 6. In [7], it was established that any upper bound on the mixing time \tau_{\rm mix}^{\mathrm{CM}} of the CM dynamics can be translated to one for the mixing time \tau_{\rm mix}^{\mathrm{GD}} of the Glauber dynamics, at the expense of an \tilde{O}(n^{4}) factor; the \tilde{O} notation hides polylogarithmic factors. In particular, it was proved in [7] that \tau_{\rm mix}^{\mathrm{GD}}\leq\tau_{\rm mix}^{\mathrm{CM}}\cdot\tilde{O}(n^{4}). We provide here an improvement of this comparison inequality.

Theorem 1.2.

For all q>1 and all \zeta=O(1), \tau_{\rm mix}^{\mathrm{GD}}\leq\tau_{\rm mix}^{\mathrm{CM}}\cdot O(n^{3}(\log n)^{2}).

To prove this theorem, we establish a general comparison inequality that holds for any graph, any q\geq 1 and any p\in(0,1); see Theorem 6.1 for a precise statement. When combined with the known mixing time bounds for the CM dynamics on the complete graph, Theorem 1.2 yields that the random-cluster Glauber dynamics mixes in \tilde{O}(n^{3}) steps when q>2 and \zeta\not\in(\zeta_{\rm l},\zeta_{\rm r}), or when q\in(1,2) and \zeta=O(1). In these regimes, the mixing time of the Glauber dynamics was previously known to be \tilde{O}(n^{4}) and is conjectured to be \tilde{O}(n^{2}); the improved comparison inequality in Theorem 1.2 gets us closer to this conjectured tight bound. We note, however, that even if one showed the conjectured optimal bound for the mixing time of the Glauber dynamics, the CM dynamics would still be faster, even taking into account the computational cost of implementing its steps.

We conclude this introduction with some brief remarks about our analysis techniques, which combine several key ingredients in a non-trivial way. Our bound on the mixing time uses the well-known technique of coupling: in order to show that the mixing time is O(\log n\cdot\log\log n), it suffices to couple the evolutions of two copies of the dynamics, starting from two arbitrary configurations, in such a way that they arrive at the same configuration after O(\log n) steps with probability \Omega(1/\log\log n). (The moves of the two copies can be correlated any way we choose, provided that each copy, viewed in isolation, is a valid realization of the dynamics.) Because of the delicate nature of the phase transition in the random-cluster model, combined with the fact that the percolation sub-step of the CM dynamics is critical when \zeta=q, our coupling is somewhat elaborate and proceeds in multiple phases. The first phase consists of a burn-in period, where the two copies of the chain are run independently and the evolution of their largest components is observed until they have shrunk to their “typical” sizes. This part of the analysis is inspired by similar arguments in earlier work [7, 24, 15].

In the second phase, we design a coupling of the activation of the connected components of the two copies which uses: (i) a local limit theorem, which can be thought of as a stronger version of a central limit theorem; (ii) a precise understanding of the distribution of the maximum of symmetric random walks on \mathbb{Z} with varying step sizes; and (iii) precise estimates for the component structure of random graphs. We develop tailored versions of these probabilistic tools for our setting and combine them to guarantee that the same number of vertices from each copy are activated in each step w.h.p. for sufficiently many steps. This phase of the coupling is the main novelty in our analysis, and allows us to quickly converge to the same configuration. We give a more detailed overview of our proof in the following section.

2 Proof sketch and techniques

We now give a detailed sketch of the multi-phased coupling argument for proving Theorem 1.1. We start by formally defining the notions of mixing and coupling times. Let \Omega_{\rm RC} be the set of random-cluster configurations of a graph G; let \mathcal{M} be the transition matrix of a random-cluster Markov chain with stationary distribution \mu=\mu_{G,p,q}, and let \mathcal{M}^{t}(X_{0},\cdot) be the distribution of the chain after t steps starting from X_{0}\in\Omega_{\rm RC}. The \varepsilon-mixing time of \mathcal{M} is given by

\tau_{\rm mix}^{\mathcal{M}}(\varepsilon):=\max_{X_{0}\in\Omega_{\rm RC}}\,\min\left\{t\geq 0:\|\mathcal{M}^{t}(X_{0},\cdot)-\mu(\cdot)\|_{\rm TV}\leq\varepsilon\right\},

where \|\cdot\|_{\rm TV} denotes total variation distance. In particular, the mixing time of \mathcal{M} is \tau_{\rm mix}^{\mathcal{M}}:=\tau_{\rm mix}^{\mathcal{M}}(1/4).

A (one step) coupling of the Markov chain \mathcal{M} specifies, for every pair of states (X_{t},Y_{t})\in\Omega_{\rm RC}\times\Omega_{\rm RC}, a probability distribution over (X_{t+1},Y_{t+1}) such that the processes \{X_{t}\} and \{Y_{t}\} are valid realizations of \mathcal{M}, and if X_{t}=Y_{t} then X_{t+1}=Y_{t+1}. The coupling time, denoted T_{\rm coup}, is the minimum T such that \Pr[X_{T}\neq Y_{T}]\leq 1/4, starting from the worst possible pair of configurations in \Omega_{\rm RC}. It is a standard fact that \tau_{\rm mix}^{\mathcal{M}}\leq T_{\rm coup}; moreover, when \Pr[X_{T}=Y_{T}]\geq\delta for some coupling, then \tau_{\rm mix}^{\mathcal{M}}=O(T\delta^{-1}) (see, e.g., [22]).
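The latter fact follows by iterating the coupling in independent epochs of length T; for completeness, the standard calculation is:

\Pr[X_{kT}\neq Y_{kT}]\leq(1-\delta)^{k}\leq e^{-\delta k}\leq\frac{1}{4}\quad\text{once}\quad k\geq\frac{\ln 4}{\delta},\qquad\text{so}\qquad\tau_{\rm mix}^{\mathcal{M}}\leq T\cdot\left\lceil\frac{\ln 4}{\delta}\right\rceil=O(T\delta^{-1}).

In our application, T=O(\log n) and \delta=\Omega(1/\log\log n), which is precisely how the O(\log n\cdot\log\log n) bound in Theorem 1.1 arises.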

We provide first a high-level description of our coupling for the CM dynamics. For this, we require the following notation. For a random-cluster configuration X, let L_{i}(X) denote the size of the i-th largest connected component in (V,X), and let \mathcal{R}_{i}(X):=\sum_{j\geq i}L_{j}(X)^{2}; in particular, \mathcal{R}_{1}(X) is the sum of the squares of the sizes of all the components of (V,X). Our coupling has three main phases:

1. Burn-in period: run two copies \{X_{t}\}, \{Y_{t}\} independently, starting from a pair of arbitrary initial configurations, until \mathcal{R}_{1}(X_{T})=O(n^{4/3}) and \mathcal{R}_{1}(Y_{T})=O(n^{4/3}).

2. Coupling to the same component structure: starting from X_{T} and Y_{T} such that \mathcal{R}_{1}(X_{T})=O(n^{4/3}) and \mathcal{R}_{1}(Y_{T})=O(n^{4/3}), we design a two-phased coupling that reaches two configurations with the same component structure as follows:

2a. a two-step coupling after which the two configurations agree on all “large components”;

2b. a coupling that, after O(\log n) additional steps, reaches two configurations that also have the same “small component” structure.

3. Coupling to the same configuration: starting from two configurations with the same component structure, there is a straightforward coupling that couples the two configurations in O(\log n) steps w.h.p.

We proceed to describe each of these phases in detail.

2.1 The burn-in period

During the initial phase, two copies of the dynamics evolve independently. This is called a burn-in period and in our case consists of three sub-phases.

In the first sub-phase of the burn-in period the goal is to reach a configuration X such that \mathcal{R}_{2}(X)=O(n^{4/3}). For this, we use a lemma from [4], which shows that after T=O(\log n) steps of the CM dynamics \mathcal{R}_{2}(X_{T})=O(n^{4/3}) with at least constant probability; this holds when \zeta=q for any initial configuration X_{0} and any q>1.

Lemma 2.1 ([4], Lemma 3.42).

Let q>1 and \zeta=q, and let X_{0} be an arbitrary random-cluster configuration. Then, for any constant C\geq 0, after T=O(\log n) steps \mathcal{R}_{2}(X_{T})=O(n^{4/3}) and L_{1}(X_{T})>Cn^{2/3} with probability \Omega(1).

In the second and third sub-phases of the burn-in period, we use the fact that when \mathcal{R}_{2}(X_{t})=O(n^{4/3}), the number of activated vertices is well concentrated around n/q (its expectation). This is used to show that the size of the largest component contracts at a constant rate for T=O(\log n) steps until a configuration X_{T} is reached such that \mathcal{R}_{1}(X_{T})=O(n^{4/3}). This part of the analysis is split into two sub-phases because the contraction for L_{1}(X_{t}) requires a more delicate analysis when L_{1}(X_{t})=o(n); this is captured in the following two lemmas.

Lemma 2.2.

Let \zeta=q and q\in(1,2). Suppose \mathcal{R}_{2}(X_{0})=O(n^{4/3}). Then, for any constant \delta>0, there exists T=T(\delta)=O(1) such that \mathcal{R}_{2}(X_{T})=O(n^{4/3}) and L_{1}(X_{T})\leq\delta n with probability \Omega(1).

Lemma 2.3.

Let \zeta=q and q\in(1,2). Suppose \mathcal{R}_{2}(X_{0})=O(n^{4/3}) and that L_{1}(X_{0})\leq\delta n for a sufficiently small constant \delta. Then, with probability \Omega(1), after T=O(\log n) steps \mathcal{R}_{1}(X_{T})=O(n^{4/3}).

Lemmas 2.2 and 2.3 are proved in Section 4. Combining them with Lemma 2.1 immediately yields the following theorem.

Theorem 2.4.

Let \zeta=q, q\in(1,2) and let X_{0} be an arbitrary random-cluster configuration of the complete n-vertex graph. Then, with probability \Omega(1), after T=O(\log n) steps \mathcal{R}_{1}(X_{T})=O(n^{4/3}).

Remark 1.

The contraction of L_{1}(X_{t}) established by Lemmas 2.2 and 2.3 only occurs when q\in(1,2); when q>2 the quantity L_{1}(X_{t}) may increase in expectation, whereas for q=2 we have \mathrm{E}[L_{1}(X_{t+1})\mid X_{t}]\approx L_{1}(X_{t}), and the contraction of the size of the largest component is due instead to fluctuations caused by a large second moment. (This is what causes the power law slowdown when \zeta=q=2.)

Remark 2.

Sub-steps (ii) and (iii) of the CM dynamics are equivalent to replacing the active portion of the configuration by a G(m,q/n) random graph, where m is the number of active vertices. Since \mathrm{E}[m]=n/q, one key challenge in the proofs of Lemmas 2.2 and 2.3, and in fact in the entirety of our analysis, is that the random graph G(m,q/n) is critical or almost critical w.h.p. since m\cdot q/n\approx 1; consequently its structural properties are not well concentrated and cannot be maintained for the required O(\log n) steps of the coupling. This is one of the key reasons why the \zeta=\zeta_{\rm cr}(q)=q regime is quite delicate.

2.2 Coupling to the same component structure

For the second phase of the coupling, we assume that we start from a pair of configurations X_{0}, Y_{0} such that \mathcal{R}_{1}(X_{0})=O(n^{4/3}) and \mathcal{R}_{1}(Y_{0})=O(n^{4/3}). The goal is to show that after T=O(\log n) steps, with probability \Omega(1/\log\log n), we reach two configurations X_{T} and Y_{T} with the same component structure; i.e., L_{j}(X_{T})=L_{j}(Y_{T}) for all j\geq 1. In particular, we prove the following.

Theorem 2.5.

Let \zeta=q, q\in(1,2) and suppose X_{0},Y_{0} are random-cluster configurations such that \mathcal{R}_{1}(X_{0})=O(n^{4/3}) and \mathcal{R}_{1}(Y_{0})=O(n^{4/3}). Then, there exists a coupling of the CM steps such that after T=O(\log n) steps X_{T} and Y_{T} have the same component structure with probability \Omega\left((\log\log n)^{-1}\right).

Our coupling construction for proving Theorem 2.5 has two main sub-phases. The first is a two-step coupling after which the two configurations agree on all the components of size above a certain threshold B_{\omega}=n^{2/3}/\omega(n), where \omega(n) is a slowly increasing function. For convenience and definiteness we set \omega(n)=\log\log\log\log n. In the second sub-phase we take care of matching the small component structures.

We note that when the same number of vertices are activated from each copy of the chain, we can easily couple the percolation sub-step (with an arbitrary bijection between the activated vertices) and replace the configuration on the active vertices in both chains with the same random sub-graph; consequently, the component structure in the updated sub-graph is identical. Our goal is thus to design a coupling of the activation of the components that activates the same number of vertices in both copies in every step.
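A sketch of this coupled percolation update (the helper name is ours; it assumes the two copies have already activated vertex sets of the same size):

```python
import networkx as nx

def coupled_percolation(X, Y, active_x, active_y, p):
    """Coupled percolation sub-step: plant ONE shared G(m, p) sample into
    both chains via arbitrary bijections onto their active vertices.
    Assumes len(active_x) == len(active_y) == m."""
    assert len(active_x) == len(active_y)
    shared = nx.gnp_random_graph(len(active_x), p)   # the common sample
    for Z, order in ((X, sorted(active_x)), (Y, sorted(active_y))):
        act = set(order)
        Z.remove_edges_from([(u, v) for u, v in Z.edges()
                             if u in act and v in act])
        Z.add_edges_from((order[u], order[v]) for u, v in shared.edges())
```

After this update, the active portions of the two configurations are isomorphic copies of the same random graph, so their component structures on the updated sub-graph coincide.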

In order for the initial two-step coupling to succeed, certain (additional) properties of the configurations are required. These properties are achieved with a continuation of the initial burn-in phase for a small number of O(\log\omega(n)) steps. For a random-cluster configuration X, let \widetilde{\mathcal{R}}_{\omega}(X)=\sum_{j:L_{j}(X)\leq B_{\omega}}L_{j}(X)^{2} and let I(X) denote the number of isolated vertices of X. Our extension of the burn-in period is captured by the following lemma.

Lemma 2.6.

Let \zeta=q, q\in(1,2) and suppose X_{0} is such that \mathcal{R}_{1}(X_{0})=O(n^{4/3}). Then, there exist T=O(\log\omega(n)) and a constant \beta>0 such that \widetilde{\mathcal{R}}_{\omega}(X_{T})=O(n^{4/3}\omega(n)^{-1/2}), \mathcal{R}_{1}(X_{T})=O(n^{4/3}) and I(X_{T})=\Omega(n) with probability \Omega(\omega(n)^{-\beta}).

The proof of Lemma 2.6 is provided in Section 5.1.

With these bounds on \widetilde{\mathcal{R}}_{\omega}(X_{T}), \widetilde{\mathcal{R}}_{\omega}(Y_{T}), I(X_{T}) and I(Y_{T}), we construct the two-step coupling for matching the large component structure. The construction crucially relies on a new local limit theorem (Theorem 5.1). In particular, under our assumptions, when \omega(n) is small enough, there are few components with sizes above B_{\omega}. Hence, we can condition on the event that all of them are activated simultaneously. The difference in the number of active vertices generated by the activation of these large components can then be “corrected” by a coupling of the activation of the smaller components; for this we use our new local limit theorem.

Specifically, our local limit theorem applies to the random variables corresponding to the number of activated vertices from the small components of each copy. We prove it using a result of Mukhin [28] and the fact that, among the small components, there are (roughly speaking) many components of many different sizes. To establish the latter we require a refinement of known random graph estimates (see Lemma 3.11).

To formally state our result we introduce some additional notation. Let \mathcal{S}_{\omega}(X) be the set of connected components of X with sizes greater than B_{\omega}. At step t, the activation of the components of two random-cluster configurations X_{t} and Y_{t} is done using a maximal matching W_{t} between the components of X_{t} and Y_{t}, with the restriction that only components of equal size are matched to each other. For an increasing positive function g and each integer k\geq 0, define \hat{N}_{k}(t,g):=\hat{N}_{k}(X_{t},Y_{t},g) as the number of matched pairs in W_{t} whose component sizes are in the interval

\mathcal{I}_{k}(g)=\Big[\frac{\vartheta n^{2/3}}{2g(n)^{2^{k}}},\frac{\vartheta n^{2/3}}{g(n)^{2^{k}}}\Big],

where \vartheta>0 is a fixed large constant (independent of n).

Lemma 2.7.

Let \zeta=q, q\in(1,2) and suppose X_{0},Y_{0} are random-cluster configurations such that \mathcal{R}_{1}(X_{0})=O(n^{4/3}), \widetilde{\mathcal{R}}_{\omega}(X_{0})=O(n^{4/3}\omega(n)^{-1/2}), I(X_{0})=\Omega(n) and similarly for Y_{0}. Then, there exists a two-step coupling of the CM dynamics such that \mathcal{S}_{\omega}(X_{2})=\mathcal{S}_{\omega}(Y_{2}) with probability \exp\left(-O(\omega(n)^{9})\right).

Moreover, L_{1}(X_{2})=O(n^{2/3}\omega(n)), \mathcal{R}_{2}(X_{2})=O(n^{4/3}), \widetilde{\mathcal{R}}_{\omega}(X_{2})=O(n^{4/3}\omega(n)^{-1/2}), I(X_{2})=\Omega(n), \hat{N}_{k}(2,\omega(n))=\Omega(\omega(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k-1}}\rightarrow\infty, and similarly for Y_{2}.

From the first part of the lemma we obtain two configurations that agree on all of their large components, as desired, while the second part guarantees additional structural properties for the resulting configurations so that the next sub-phase of the coupling can also succeed with the required probability. The proof of Lemma 2.7 is given in Section 5.2.

In the second sub-phase, after the large components are matched, we can design a coupling that activates exactly the same number of vertices from each copy of the chain. To analyze this coupling we use a precise estimate on the distribution of the maximum of symmetric random walks over the integers (with steps of different sizes). We are first required to run the chains coupled for T=O(\log\omega(n)) steps, so that certain additional structural properties appear. Let M(X_{t}) and M(Y_{t}) be the components in the matching W_{t} that belong to X_{t} and Y_{t}, respectively, and let D(X_{t}) and D(Y_{t}) be the complements of M(X_{t}) and M(Y_{t}). Let

Z_{t}=\sum\nolimits_{\mathcal{C}\in D(X_{t})\cup D(Y_{t})}|\mathcal{C}|^{2}.
Lemma 2.8.

Let \zeta=q, q\in(1,2). Suppose X_{0} and Y_{0} are random-cluster configurations such that \mathcal{S}_{\omega}(X_{0})=\mathcal{S}_{\omega}(Y_{0}), and \hat{N}_{k}(0,\omega(n))=\Omega(\omega(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k-1}}\rightarrow\infty. Suppose also that L_{1}(X_{0})=O(n^{2/3}\omega(n)), \mathcal{R}_{2}(X_{0})=O(n^{4/3}), \widetilde{\mathcal{R}}_{\omega}(X_{0})=O(n^{4/3}\omega(n)^{-1/2}), I(X_{0})=\Omega(n), and similarly for Y_{0}.

Then, there exists a coupling of the CM steps such that with probability \exp(-O((\log\omega(n))^{2})) after T=O(\log\omega(n)) steps: \mathcal{S}_{\omega}(X_{T})=\mathcal{S}_{\omega}(Y_{T}), Z_{T}=O(n^{4/3}\omega(n)^{-1/2}), \hat{N}_{k}(T,\omega(n)^{1/2})=\Omega(\omega(n)^{3\cdot 2^{k-2}}) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k-1}}\rightarrow\infty, \mathcal{R}_{1}(X_{T})=O(n^{4/3}), I(X_{T})=\Omega(n), and similarly for Y_{T}.

The proof of Lemma 2.8 also uses our local limit theorem (Theorem 5.1) and is provided in Section 5.3.

The final step of our construction is a coupling of the activation of the components of size less than B_{\omega}, so that exactly the same number of vertices are activated from each copy in each step w.h.p.

Lemma 2.9.

Let \zeta=q, q\in(1,2) and suppose X_{0} and Y_{0} are random-cluster configurations such that \mathcal{S}_{\omega}(X_{0})=\mathcal{S}_{\omega}(Y_{0}), Z_{0}=O(n^{4/3}\omega(n)^{-1/2}), and \hat{N}_{k}\left(0,\omega(n)^{1/2}\right)=\Omega(\omega(n)^{3\cdot 2^{k-2}}) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k-1}}\rightarrow\infty. Suppose also that \mathcal{R}_{1}(X_{0})=O(n^{4/3}), I(X_{0})=\Omega(n) and similarly for Y_{0}. Then, there exist a coupling of the CM steps and a constant \beta>0 such that after T=O(\log n) steps, X_{T} and Y_{T} have the same component structure with probability \Omega\left((\log\log\log n)^{-\beta}\right).

We comment briefly on how we prove this lemma. Our starting point is two configurations with the same “large” component structure; i.e., \mathcal{S}_{\omega}(X_{0})=\mathcal{S}_{\omega}(Y_{0}). We use the maximal matching W_{0} to couple the activation of the large components in X_{0} and Y_{0}. The small components not matched by W_{0}, i.e., those counted in Z_{0}, are then activated independently. This creates a discrepancy \mathcal{D}_{0} between the number of active vertices from each copy. Since \mathrm{E}[\mathcal{D}_{0}]=0 and \mathrm{Var}(\mathcal{D}_{0})=\Theta(Z_{0})=\Theta(n^{4/3}\omega(n)^{-1/2}), it follows from Hoeffding’s inequality that \mathcal{D}_{0}\leq n^{2/3}\omega(n)^{-1/4} w.h.p. To fix this discrepancy, we use the small components matched by W_{0}. Specifically, under the assumptions in Lemma 2.9, we can construct a coupling of the activation of the small components so that the difference in the number of activated vertices from the small components from each copy is exactly \mathcal{D}_{0} with probability \Omega(1). This part of the construction utilizes random walks over the integers; in particular, we use a lower bound for the maximum of such a random walk.

We need to repeat this process until Z_{t}=0; this takes O(\log n) steps since Z_{t}\approx(1-1/q)^{t}Z_{0}. However, there are a few complications. First, the initial assumptions on the component structure of the configurations are not preserved for this many steps w.h.p., so we need to relax the requirements as the process evolves. This is in turn possible because the discrepancy \mathcal{D}_{t} decreases with each step, which implies that the probability of success of the coupling increases at each step. See Section 5.4 for the detailed proof.
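To see why O(\log n) steps suffice, note that Z_{t} is a non-negative integer; so, assuming the approximation Z_{t}\approx(1-1/q)^{t}Z_{0} holds at every step, Z_{t}=0 as soon as

(1-1/q)^{t}Z_{0}\leq(1-1/q)^{t}\cdot O(n^{4/3})<1,\qquad\text{i.e., once}\qquad t\geq\frac{\frac{4}{3}\log n+O(1)}{\log\frac{q}{q-1}}=O(\log n).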

We now indicate how these lemmas lead to a proof of Theorem 2.5 stated earlier.

Proof of Theorem 2.5.

Suppose \mathcal{R}_{1}(X_{0})=O(n^{4/3}) and \mathcal{R}_{1}(Y_{0})=O(n^{4/3}). It follows from Lemmas 2.6, 2.7, 2.8 and 2.9 that there exists a coupling of the CM steps such that after T=O(\log n) steps, X_{T} and Y_{T} have the same component structure. This coupling succeeds with probability at least

\rho=\Omega(\omega(n)^{-\beta_{1}})\cdot\exp\big(-O(\omega(n)^{9})\big)\cdot\exp\big(-O\big((\log\omega(n))^{2}\big)\big)\cdot\Omega\big((\log\log\log n)^{-\beta_{2}}\big),

where \beta_{1},\beta_{2}>0 are constants. Thus, \rho=\Omega\big((\log\log n)^{-1}\big), since \omega(n)=\log\log\log\log n. ∎

Remark 3.

We pause to mention that this delicate coupling for the activation of the components is not required when \zeta=q and q>2. In that regime, the random-cluster model is super-critical, so after the first O(\log n) steps, the component structure is much simpler, with exactly one large component. On the other hand, when \zeta=q and q\in(1,2] the model is critical, which, combined with the fact mentioned earlier that the percolation sub-step of the dynamics is also critical when \zeta=q, makes the analysis of the CM dynamics in this regime quite subtle.

2.3 Coupling to the same configuration

In the last phase of the coupling, suppose we start with two configurations X_{0}, Y_{0} with the same component structure. We are still required to bound the number of steps until the same configuration is reached. The following lemma from [7] supplies the desired bound.

Lemma 2.10 ([7], Lemma 24).

Let q>1, \zeta>0 and let X_{0}, Y_{0} be two random-cluster configurations with the same component structure. Then, there exists a coupling of the CM steps such that after T=O(\log n) steps, X_{T}=Y_{T} w.h.p.

Combining the results for each of the phases of the coupling, we now prove Theorem 1.1.

Proof of Theorem 1.1.

By Theorem 2.4, after t_{0}=O(\log n) steps, with probability \Omega(1), we have \mathcal{R}_{1}(X_{t_{0}})=O(n^{4/3}) and \mathcal{R}_{1}(Y_{t_{0}})=O(n^{4/3}). If this is the case, Theorem 2.5 and Lemma 2.10 imply that there exists a coupling of the CM steps such that with probability \Omega\left((\log\log n)^{-1}\right), after an additional t_{1}=O(\log n) steps, X_{t_{0}+t_{1}}=Y_{t_{0}+t_{1}}. Consequently, we obtain that \tau_{\rm mix}^{\mathrm{CM}}=O(\log n\cdot\log\log n) as claimed. ∎

Remark 4.

The probability of success in Theorem 2.5, which governs the lower order term O(\log\log n) in our mixing time bound, is controlled by our choice of the function \omega(n) in the definition of “large components”. By choosing a function \omega(n) that goes to \infty more slowly, we could improve our mixing time bound to O(\log n\cdot g(n)), where g(n) is any function that tends to infinity arbitrarily slowly. However, it seems that new ideas are required to obtain a bound of O(\log n) (matching the known lower bound). In particular, the fact that \omega(n)\rightarrow\infty is crucially used in some of our proofs. Our specific choice of \omega(n) yields the O(\log n\cdot\log\log n) bound and makes our analysis cleaner.

3 Random graph estimates

In this section, we compile a number of standard facts about the G(n,p) random graph model which will be useful in our proofs. We use G\sim G(n,p) to denote a random graph G sampled from the standard G(n,p) model, in which every edge appears independently with probability p. A G(n,p) random graph is said to be sub-critical when np<1. It is called super-critical when np>1 and critical when np=1. For a graph G, with a slight abuse of notation, let L_{i}(G) denote the size of the i-th largest connected component in G, and let \mathcal{R}_{i}(G):=\sum_{j\geq i}L_{j}(G)^{2}; note that the same notation is used for the components of a random-cluster configuration, but it will always be clear from context which case is meant.
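These quantities are straightforward to compute; a small Python helper (using networkx; the function names are ours) illustrating L_{i}(G) and \mathcal{R}_{i}(G):

```python
import networkx as nx

def component_sizes(G):
    """Component sizes of G, in decreasing order."""
    return sorted((len(c) for c in nx.connected_components(G)), reverse=True)

def L(G, i):
    """L_i(G): the size of the i-th largest component (0 if fewer exist)."""
    sizes = component_sizes(G)
    return sizes[i - 1] if i <= len(sizes) else 0

def R(G, i):
    """R_i(G) := sum_{j >= i} L_j(G)^2."""
    return sum(s * s for s in component_sizes(G)[i - 1:])

# A critical random graph (np = 1): by Lemma 3.9, E[R_1(G)] = O(n^{4/3}).
n = 10000
G = nx.gnp_random_graph(n, 1.0 / n)
print(L(G, 1), R(G, 1))
```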

Fact 3.1.

Given 0<N_{1}<N_{2} and p\in[0,1], let G_{1}\sim G(N_{1},p) and G_{2}\sim G(N_{2},p). For any K>0, \Pr[L_{1}(G_{1})>K]\leq\Pr[L_{1}(G_{2})>K].

Proof.

Consider the coupling of (G_{1},G_{2}) in which G_{1} is a subgraph of G_{2}. Then L_{1}(G_{1})\leq L_{1}(G_{2}) with probability 1, and the claim follows from Strassen’s theorem (see, e.g., Theorem 22.6 in [22]). ∎

Lemma 3.2 ([24], Lemma 5.7).

Let I(G) denote the number of isolated vertices in G. If np=O(1), then there exists a constant C>0 such that

\Pr[I(G)>Cn]=1-O(n^{-1}).

Consider the equation

e^{-dx}=1-x (3)

and let \beta(d) be defined as its unique positive root. Observe that \beta is well-defined for d>1.

Lemma 3.3 ([4], Lemma 2.7).

Let G\sim G(n+m,d_{n}/n) be a random graph, where |m|=o(n) and \lim_{n\rightarrow\infty}d_{n}=d. Assume 1<d_{n}=O(1) and that d_{n} is bounded away from 1 for all n\in\mathbb{N}. Then, for A=o(\log n) and sufficiently large n, there exists a constant c>0 such that

\Pr\left[\lvert L_{1}(G)-\beta(d)n\rvert>\lvert m\rvert+A\sqrt{n}\right]\leq e^{-cA^{2}}.
Lemma 3.4 ([4], Lemma 2.16).

For np>0, we have \mathrm{E}\left[\mathcal{R}_{2}(G)\right]=O\left(n^{4/3}\right).

Consider the near-critical random graph G\left(n,\frac{1+\varepsilon}{n}\right) with \varepsilon=\varepsilon(n)=o(1).

Lemma 3.5 ([24], Theorem 5.9).

Assume \varepsilon^{3}n\geq 1. Then for any A satisfying 2\leq A\leq\sqrt{\varepsilon^{3}n}/10, there exists some constant c>0 such that

\Pr\left[\left|L_{1}(G)-2\varepsilon n\right|>A\sqrt{\frac{n}{\varepsilon}}\right]=O\left(e^{-cA^{2}}\right).
Corollary 3.6.

Let G\sim G\left(n,\frac{1+\varepsilon}{n}\right) with \varepsilon=o(1). For any positive constant \rho\leq 1/10, there exist constants C\geq 1 and c>0 such that if \varepsilon^{3}n\geq C, then

\Pr\left[\left\lvert L_{1}(G)-2\varepsilon n\right\rvert>\rho\varepsilon n\right]=O(e^{-c\varepsilon^{3}n}).
Lemma 3.7 ([24], Theorem 5.12).

Let \varepsilon<0; then \mathrm{E}[\mathcal{R}_{1}(G)]=O\left(n/|\varepsilon|\right).

Lemma 3.8 ([24], Theorem 5.13).

Let \varepsilon>0 and \varepsilon^{3}n\geq 1 for large n; then \mathrm{E}[\mathcal{R}_{2}(G)]=O\left(n/\varepsilon\right).

For the next results, suppose that G\sim G(n,\frac{1+\lambda n^{-1/3}}{n}), where \lambda=\lambda(n) may depend on n.

Lemma 3.9.

If |\lambda|=O(1), then \mathrm{E}\left[\mathcal{R}_{1}(G)\right]=O\left(n^{4/3}\right).

Proof.

Follows from Lemmas 2.13, 2.15 and 2.16 in [4]. ∎

All the random graph facts stated so far can either be found in the literature or follow directly from well-known results. The following lemmas are slightly more refined versions of similar results in the literature.

Lemma 3.10.

Suppose |\lambda|=O(h(n)) and let B_{h}=n^{2/3}h(n)^{-1}, where h:\mathbb{N}\rightarrow\mathbb{R} is a positive increasing function such that h(n)=o(\log n). Then, for any \alpha\in(0,1) there exists a constant C=C(\alpha)>0 such that, with probability at least \alpha,

\sum\nolimits_{j:L_{j}(G)\leq B_{h}}L_{j}(G)^{2}\leq Cn^{4/3}h(n)^{-1/2}.
Lemma 3.11.

Let S_{B}=\{j:B\leq L_{j}(G)\leq 2B\} and suppose there exists a positive increasing function g such that g(n)\rightarrow\infty, g(n)=o(n^{1/3}), |\lambda|\leq g(n) and B\leq\frac{n^{2/3}}{g(n)^{2}}. If B\rightarrow\infty, then there exist constants \delta_{1},\delta_{2}>0 independent of n such that

\Pr\left[|S_{B}|\leq\frac{\delta_{1}n}{B^{3/2}}\right]\leq\frac{\delta_{2}B^{3/2}}{n}.

The proofs of Lemmas 3.10 and 3.11 are provided in Appendix C. Finally, the following corollary of Lemma 3.11 will also be useful. For a graph H, let N_{k}(H,g) be the number of components of H whose sizes are in the interval \mathcal{I}_{k}(g). We note that, with a slight abuse of notation, for a random-cluster configuration X we also use N_{k}(X,g) for the number of connected components of X in \mathcal{I}_{k}(g).

Corollary 3.12.

Let m\in(n/2q,n] and let g be an increasing positive function such that g(n)=o(m^{1/3}), g(n)\rightarrow\infty and |\lambda|\leq g(m). If H\sim G\big(m,\frac{1+\lambda m^{-1/3}}{m}\big), there exists a constant b>0 such that, with probability at least 1-O\left(g(n)^{-3}\right), N_{k}(H,g)\geq bg(n)^{3\cdot 2^{k-1}} for all k\geq 1 such that n^{2/3}g(n)^{-2^{k}}\rightarrow\infty.

4 The burn-in period: proofs

In this section we provide the proofs of Lemmas 2.2 and 2.3.

4.1 A drift function

Consider the mean-field random-cluster model with parameters q\geq 1 and p=\zeta/n. In this subsection, we introduce a drift function that captures the rate of decay of the size of the largest component of a configuration under steps of the CM dynamics, which will be helpful for proving Lemma 2.2; this function was first studied in [7].

Given \theta\in(0,1], consider the equation

e^{-\zeta x}=1-\frac{qx}{1+(q-1)\theta} (4)

and let \phi(\theta,\zeta,q) be defined as the largest positive root of (4). We shall see that \phi is not defined for all q and \zeta since there may not be a positive root. When \zeta and q are clear from the context we use \phi(\theta)=\phi(\theta,\zeta,q). Note that \beta(\zeta) defined by equation (3) is the special case of (4) when q=1; observe that \beta is only well-defined when \zeta>1.

We let k(\theta,q):=(1+(q-1)\theta)/q so that \phi(\theta,\zeta,q)=\beta(\zeta\cdot k(\theta,q))\cdot k(\theta,q). Hence, \phi(\theta,\zeta,q) is only defined when \zeta\cdot k(\theta,q)>1; that is, \theta\in(\theta_{\min},1], where \theta_{\min}=\frac{q-\zeta}{\zeta(q-1)}. Note that when \zeta=q, \phi(\theta) is defined for every \theta\in(0,1].

For fixed \zeta and q, we call f(\theta):=\theta-\phi(\theta) the drift function, which is defined on (\max\{\theta_{\min},0\},1].
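Since \beta and \phi are defined only implicitly by (3) and (4), it may help to see them computed numerically. The following Python sketch (using scipy root-finding; the bracketing choices are heuristic assumptions of ours, not taken from the paper) evaluates the drift:

```python
import math
from scipy.optimize import brentq

def beta(d):
    """beta(d): the unique positive root of e^{-d x} = 1 - x (equation (3));
    well-defined for d > 1."""
    f = lambda x: math.exp(-d * x) - (1 - x)
    return brentq(f, 1e-9, 1.0)   # f < 0 just above 0 and f(1) = e^{-d} > 0

def k(theta, q):
    return (1 + (q - 1) * theta) / q

def phi(theta, zeta, q):
    """phi(theta) = beta(zeta * k(theta, q)) * k(theta, q), defined
    whenever zeta * k(theta, q) > 1."""
    return beta(zeta * k(theta, q)) * k(theta, q)

def drift(theta, zeta, q):
    """Drift function f(theta) = theta - phi(theta)."""
    return theta - phi(theta, zeta, q)

# Illustration of Lemma 4.1: for zeta = q in (1,2) the drift is non-negative.
q = zeta = 1.5
for theta in (0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(round(theta, 2), drift(theta, zeta, q))
```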

Lemma 4.1.

When q=\zeta<2, the drift function f is non-negative for any \theta\in[\xi,1], where \xi is an arbitrarily small positive constant.

Proof.

When \zeta=q<2, the drift function f has no positive root, is continuous on (0,1], and satisfies f(1)>0; see Lemma 2.5 in [9] and Fact 3.5 in [4]. Since \lim_{\theta\rightarrow 0}f(\theta)=0, the result follows. ∎

4.2 Shrinking a large component: proof of Lemma 2.2

The proof of Lemma 2.2 uses the following lemma, which follows directly from standard random graph estimates and Hoeffding’s inequality. To simplify the notation, we let \hat{L}(X):=L_{1}(X)/n^{2/3}. We use A(X) to denote the number of vertices activated by one step of the CM dynamics from configuration X. Let \Lambda_{t} denote the event that the largest component of the configuration is activated in step t.

Lemma 4.2 ([4], Claim 3.45).

Suppose \mathcal{R}_{2}(X_{t})=O(n^{4/3}) and \hat{L}(X_{t})\geq B for a large constant B, and let C be a fixed large constant. Then

1. \Pr\left[\mathcal{R}_{2}(X_{t+1})<\mathcal{R}_{2}(X_{t})+\frac{Cn^{4/3}}{\sqrt{\hat{L}(X_{t})}}\;\middle|\;X_{t},\Lambda_{t}\right]=1-O\left(\hat{L}(X_{t})^{-1/2}\right);

2. \Pr\left[\mathcal{R}_{2}(X_{t+1})<\mathcal{R}_{2}(X_{t})+\frac{Cn^{4/3}}{\sqrt{\hat{L}(X_{t})}}\;\middle|\;X_{t},\neg\Lambda_{t}\right]=1-O\left(\hat{L}(X_{t})^{-1/2}\right).

Proof of Lemma 2.2.

Let \hat{T} be the first time t when \hat{L}(X_{t})\leq\delta n^{1/3}, and let T^{\prime} be a large constant chosen later; we set T:=\min\{\hat{T},T^{\prime}\}. Observe that with constant probability the largest component in the configuration is activated by the CM dynamics in every step t\leq T^{\prime}; i.e., the event \Lambda_{t} occurs for every t\leq T^{\prime}. Let us assume this is the case and fix t<T. Suppose \mathcal{R}_{2}(X_{t})\leq\mathcal{R}_{2}(X_{0})+t\cdot\frac{C}{\sqrt{\delta}}n^{7/6}, where C is the positive constant from Lemma 4.2. We show that with high probability:

(i) \mathcal{R}_{2}(X_{t+1})\leq\mathcal{R}_{2}(X_{0})+(t+1)\cdot\frac{C}{\sqrt{\delta}}n^{7/6}; and

(ii) L_{1}(X_{t+1})\leq L_{1}(X_{t})-\xi n, where \xi is a positive constant independent of t and n.

In particular, it suffices to set T^{\prime}=(1-\delta)/\xi for the lemma to hold.

First, we show that A(X_{t}) is concentrated around its mean. Let L_{1}(X_{t}):=\theta_{t}n and L_{1}(X_{t+1}):=\theta_{t+1}n. Let \mathrm{E}[A(X_{t})\mid\Lambda_{t}]=\mu_{t}=\frac{n}{q}+(1-\frac{1}{q})\cdot\theta_{t}n, \gamma:=n^{5/6}, and J_{t}:=[\mu_{t}-\gamma,\mu_{t}+\gamma]. Hoeffding’s inequality implies

\Pr\left[A(X_{t})\in J_{t}\mid\Lambda_{t}\right]\geq 1-2\exp\left(\frac{-2\gamma^{2}}{\mathcal{R}_{2}(X_{t})}\right)=1-e^{-\Omega(n^{1/3})}.

If A(X_{t})\in J_{t}, then the random graph G(A(X_{t}),p) is super-critical since

A(X_{t})\cdot p\geq(\mu_{t}-\gamma)\cdot\frac{q}{n}=\left[\frac{n}{q}+\left(1-\frac{1}{q}\right)\cdot\theta_{t}n-n^{5/6}\right]\cdot\frac{q}{n}=1+(q-1)\theta_{t}-o(1)>1.

Next, for a super-critical random graph, Lemma 3.3 provides a concentration bound for the size of the largest new component, provided A(X_{t})\in J_{t}. To see this, we write G(A(X_{t}),\zeta/n) as

G\left(\mu_{t}+m,k(\theta_{t},q)\cdot q/\mu_{t}\right),

where m:=A(X_{t})-\mu_{t}; notice that |m|\leq\gamma=o(n). Let H\sim G\left(\mu_{t}+m,k(\theta_{t},q)\cdot q/\mu_{t}\right). Since

k(\theta_{t},q)\cdot q=1+(q-1)\theta_{t}>1+\delta(q-1)>1

holds regardless of n, Lemma 3.3 implies that, for \phi(\theta_{t})>0 defined in Section 4.1, with high probability

L_{1}(H)\in\left[\phi(\theta_{t})n-\sqrt{n\log n}-|m|,\;\phi(\theta_{t})n+\sqrt{n\log n}+|m|\right].

Note that L_{1}(H)=\Omega(n) w.h.p.; hence, since L_{2}(X_{t})=O(n^{2/3}) we have L_{1}(X_{t+1})=L_{1}(H) w.h.p. We have shown that w.h.p.

\theta_{t+1}-\theta_{t}\leq\phi(\theta_{t})-\theta_{t}+\frac{\sqrt{n\log n}}{n}+\frac{|m|}{n}=-f(\theta_{t})+\sqrt{\frac{\log n}{n}}+\frac{|m|}{n},

where f is the drift function defined in Section 4.1. By Lemma 4.1, we know f(\theta_{t})>\xi_{1}>0 for a sufficiently small constant \xi_{1} (independent of n and t). Hence, w.h.p., for sufficiently large n,

L_{1}(X_{t+1})-L_{1}(X_{t})\leq-\xi_{1}n+o(n)\leq\frac{-\xi_{1}n}{2};

this establishes (ii) from above.

For (i), note that for t<T we have \hat{L}(X_{t})>\delta n^{1/3}, so Lemma 4.2 implies

\Pr\left[\mathcal{R}_{2}(X_{t+1})<\mathcal{R}_{2}(X_{0})+t\cdot\frac{C}{\sqrt{\delta}}n^{7/6}+\frac{Cn^{4/3}}{\sqrt{\delta}n^{1/6}}\right]=1-o(1).

A union bound implies that these two events occur simultaneously w.h.p. and the result follows. ∎

4.3 Shrinking a medium size component: proof of Lemma 2.3

In the third sub-phase of the burn-in period, we show that L_{1}(X_{t}) contracts at a constant rate; the precise description of this phenomenon is captured in the following lemma.

Lemma 4.3.

Suppose \mathcal{R}_{2}(X_{t})=O(n^{4/3}) and \delta n^{1/3}\geq\hat{L}(X_{t})\geq B, where B:=B(q) is a large constant and \delta:=\delta(q,B) is a small constant. Then:

1. There exists a constant \alpha:=\alpha(B,q,\delta)<1 such that

\Pr\left[L_{1}(X_{t+1})\leq\max\{\alpha L_{1}(X_{t}),L_{2}(X_{t})\}\mid X_{t},\Lambda_{t}\right]\geq 1-\exp\left(-\Omega\left(\hat{L}(X_{t})\right)\right);

2. \Pr\left[L_{1}(X_{t+1})=L_{1}(X_{t})\mid X_{t},\neg\Lambda_{t}\right]\geq 1-O\left(\hat{L}(X_{t})^{-3}\right).

Since Lemma 4.2 implies that \mathcal{R}_{2}(X_{t})=O(n^{4/3}) with reasonably high probability throughout the execution of this sub-phase, Lemmas 4.3 and 4.2 can be combined to derive the following more precise contraction estimate, which will be crucial in the proof of Lemma 2.3.

Lemma 4.4.

Suppose g(n) is an arbitrary function with range in the interval \left[B^{6},\delta n^{1/3}\right], where B is a large enough constant such that for x\geq B^{6} we have x\geq B(\log x)^{8}, and \delta:=\delta(q,B) is a small constant.

Suppose X_{0} is such that g(n)\geq\hat{L}(X_{0})\geq B(\log g(n))^{8} and \mathcal{R}_{2}(X_{0})=O(n^{4/3}). Then there exist a constant D and T=O(\log g(n)) such that, at time T, \hat{L}(X_{T})\leq\max\{B(\log g(n))^{8},D\} and \mathcal{R}_{2}(X_{T})\leq\mathcal{R}_{2}(X_{0})+O\left(\frac{n^{4/3}}{\log g(n)}\right) with probability at least 1-O\left(\log^{-1}g(n)\right).

We first provide a proof for Lemma 2.3 that recursively uses the contraction estimate of Lemma 4.4.

Proof of Lemma 2.3.

Let B be a constant large enough so that x\geq(\log x)^{48} for all x\geq B^{6}. Suppose \hat{L}(X_{0})\leq\delta n^{1/3} and \mathcal{R}_{2}(X_{0})=O(n^{4/3}) for the constant \delta=\delta(q,B) from Lemma 4.4. Suppose also \hat{L}(X_{0})\geq B^{6}; otherwise there is nothing to prove.

Let g_{0}(n):=\delta n^{1/3} and g_{i+1}(n):=B(\log g_{i}(n))^{8} for all i\geq 0. Let K be defined as the minimum natural number such that g_{K}(n)\leq B^{6}. Note that K=O(\log^{*}n). Assume that at time t\geq 0 there exists an integer j\geq 0 such that X_{t} satisfies:

1. g_{j+1}(n)\leq\hat{L}(X_{t})\leq g_{j}(n), and

2. \mathcal{R}_{2}(X_{t})=O(n^{4/3})+O\left(\sum_{k=0}^{j-1}\frac{n^{4/3}}{\log g_{k}(n)}\right).

We show that there exists a time t^{\prime}>t such that properties 1 and 2 hold for X_{t^{\prime}} with a different index j^{\prime}>j. The following bounds on sums and products involving the g_{i}’s will be useful; the proof is elementary and delayed to the end of this section.

Claim 4.5.

Let K be defined as above. For all j<K:

(i) for any positive constant c, we have \prod_{i=0}^{j}\left(1-\frac{c}{\log g_{i}(n)}\right)\geq 1-\frac{1.5c}{\log g_{j}(n)};

(ii) \sum_{i=0}^{j}\frac{1}{\log g_{i}(n)}\leq\frac{1.5}{\log g_{j}(n)}.

By part (ii) of this claim, note that

O\left(\sum_{k=0}^{j-1}\frac{n^{4/3}}{\log g_{k}(n)}\right)=O\left(\frac{n^{4/3}}{\log g_{j-1}(n)}\right)=O(n^{4/3}).

Hence, Lemma 4.4 implies that with probability 1-O\left((\log g_{j}(n))^{-1}\right) there exist a time t^{\prime}\leq t+O(\log g_{j}(n)) and a large constant D such that \hat{L}(X_{t^{\prime}})\leq\max\{B(\log g_{j}(n))^{8},D\} and \mathcal{R}_{2}(X_{t^{\prime}})\leq\mathcal{R}_{2}(X_{t})+O\left(\frac{n^{4/3}}{\log g_{j}(n)}\right). If \hat{L}(X_{t^{\prime}})\leq\max\{D,B^{6}\} we are done. Hence, suppose otherwise that \hat{L}(X_{t^{\prime}})\in(B^{6},g_{j+1}(n)]. Since the interval (B^{6},g_{j+1}(n)] is completely covered by the union of the intervals [g_{j+2},g_{j+1}], …, [g_{K},g_{K-1}], there must be an integer j^{\prime}\geq j+1 such that g_{j^{\prime}+1}(n)\leq\hat{L}(X_{t^{\prime}})\leq g_{j^{\prime}}(n). Also, notice

\mathcal{R}_{2}(X_{t^{\prime}})\leq\mathcal{R}_{2}(X_{t})+O\left(\frac{n^{4/3}}{\log g_{j}(n)}\right)=O(n^{4/3})+O\left(\sum_{k=0}^{j-1}\frac{n^{4/3}}{\log g_{k}(n)}\right)+O\left(\frac{n^{4/3}}{\log g_{j}(n)}\right)
=O(n^{4/3})+O\left(\sum_{k=0}^{j}\frac{n^{4/3}}{\log g_{k}(n)}\right)=O(n^{4/3})+O\left(\sum_{k=0}^{j^{\prime}-1}\frac{n^{4/3}}{\log g_{k}(n)}\right).

By taking at most K steps of induction, we obtain that there exist constants C and c such that, with probability at least \rho:=\prod_{i=0}^{K-1}\left(1-\frac{c}{\log g_{i}(n)}\right), there exists a time

tKi=0K1Cloggi(n)t_{K}\leq\sum_{i=0}^{K-1}C\log g_{i}(n)

that satisfies L^(XtK)gK(n)B6\hat{L}(X_{t_{K}})\leq g_{K}(n)\leq B^{6} and 2(XtK)=O(n4/3)\mathcal{R}_{2}(X_{t_{K}})=O(n^{4/3}). Observe that tKt_{K} is a time when our goal has been achieved, so it only remains to show that ρ=Ω(1)\rho=\Omega(1) and tK=O(logn)t_{K}=O(\log n). The lower bound on ρ\rho follows from part (i) of Claim 4.5:

\prod_{i=0}^{K-1}\left(1-\frac{c}{\log g_{i}(n)}\right)\geq 1-\frac{1.5c}{\log g_{K-1}(n)}>1-\frac{1.5c}{\log B^{6}}=\Omega(1).

By noting that K=O(\log^{*}n), we can also bound t_{K}, since \sum_{i=0}^{K-1}C\log g_{i}(n) is at most C\log g_{0}(n)+C(K-1)\log g_{1}(n)=O(\log n). ∎

Before proving Lemma 4.4 we provide the proof of Lemma 4.3.

Proof of Lemma 4.3.

We start with part 1. Let μt:=E[A(Xt)|Λt,Xt]\mu_{t}:=\operatorname*{{\rm E}}[A(X_{t})|\Lambda_{t},X_{t}], γt:=L^(Xt)n2/3\gamma_{t}:=\sqrt{\hat{L}(X_{t})}\cdot n^{2/3}, and Jt:=[μtγt,μt+γt]J_{t}:=[\mu_{t}-\gamma_{t},\mu_{t}+\gamma_{t}]. Hoeffding’s inequality implies that

Pr[A(Xt)JtΛt,Xt]12exp(2γt22(Xt))=1exp(Ω(L^(Xt))).\begin{split}\Pr\left[A(X_{t})\in J_{t}\mid\Lambda_{t},X_{t}\right]&\geq 1-2\exp\left(\frac{-2\gamma_{t}^{2}}{\mathcal{R}_{2}(X_{t})}\right)=1-\exp\left(-\Omega(\hat{L}(X_{t}))\right).\end{split}

Let m:=μt+γtm:=\mu_{t}+\gamma_{t}, GG(m,qn)G\sim G(m,\frac{q}{n}) and G^G(A(Xt),p)\hat{G}\sim G(A(X_{t}),p). Then, the monotonicity of the largest component in a random graph implies that for any >0\ell>0

Pr[L1(G^)>A(Xt)Jt]\displaystyle\Pr\left[L_{1}(\hat{G})>\ell\mid A(X_{t})\in J_{t}\right]
=aJtPr[L1(G^)>A(Xt)=a]Pr[A(Xt)=aA(Xt)Jt]\displaystyle=\sum_{a\in J_{t}}\Pr\left[L_{1}(\hat{G})>\ell\mid A(X_{t})=a\right]\Pr\left[A(X_{t})=a\mid A(X_{t})\in J_{t}\right]
aJtPr[L1(G^)>A(Xt)=m]Pr[A(Xt)=aA(Xt)Jt]\displaystyle\leq\sum_{a\in J_{t}}\Pr\left[L_{1}(\hat{G})>\ell\mid A(X_{t})=m\right]\Pr\left[A(X_{t})=a\mid A(X_{t})\in J_{t}\right]
=Pr[L1(G)>]aJtPr[A(Xt)=aA(Xt)Jt]\displaystyle=\Pr[L_{1}(G)>\ell]\sum_{a\in J_{t}}\Pr[A(X_{t})=a\mid A(X_{t})\in J_{t}]
=Pr[L1(G)>].\displaystyle=\Pr[L_{1}(G)>\ell].

Next, we bound \Pr[L_{1}(G)>\ell]. For this, we rewrite G(m,\frac{q}{n}) as G\left(m,\frac{1+\varepsilon}{m}\right); since

μt=L^(Xt)n2/3+(nL^(Xt)n2/3)q1\mu_{t}=\hat{L}(X_{t})\cdot n^{2/3}+\left(n-\hat{L}(X_{t})n^{2/3}\right)q^{-1}

we have

ε=mqn1=(q1+qL^(Xt))L^(Xt)n1/3.\varepsilon=m\cdot\frac{q}{n}-1=\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}\right)\frac{\hat{L}(X_{t})}{n^{1/3}}.
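This expression for \varepsilon can be checked directly by substituting m=\mu_{t}+\gamma_{t}:

\varepsilon=\frac{q}{n}\left(\hat{L}(X_{t})n^{2/3}+\frac{n-\hat{L}(X_{t})n^{2/3}}{q}+\sqrt{\hat{L}(X_{t})}n^{2/3}\right)-1=\left((q-1)\hat{L}(X_{t})+q\sqrt{\hat{L}(X_{t})}\right)n^{-1/3},

which factors into the form above.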

Thus,

ε3m\displaystyle\varepsilon^{3}\cdot m =(q1+qL^(Xt))3L^(Xt)3n(L^(Xt)n2/3+nL^(Xt)n2/3q+L^(Xt)n2/3)\displaystyle=\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}\right)^{3}\frac{\hat{L}(X_{t})^{3}}{n}\left(\hat{L}(X_{t})n^{2/3}+\frac{n-\hat{L}(X_{t})n^{2/3}}{q}+\sqrt{\hat{L}(X_{t})}n^{2/3}\right)
(q1+qL^(Xt))3L^(Xt)3nnq\displaystyle\geq\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}\right)^{3}\cdot\frac{\hat{L}(X_{t})^{3}}{n}\cdot\frac{n}{q}
1q((q1)3L^(Xt)3+q3L^(Xt)3)\displaystyle\geq\frac{1}{q}\cdot\left((q-1)^{3}\hat{L}(X_{t})^{3}+q^{3}\sqrt{\hat{L}(X_{t})}^{3}\right)
q2L^(Xt)3/2100L^(Xt),\displaystyle\geq q^{2}\hat{L}(X_{t})^{3/2}\geq 100\hat{L}(X_{t}),

where the last inequality uses that \hat{L}(X_{t})>B for a sufficiently large constant B=B(q).

Since ε3m1\varepsilon^{3}\cdot m\geq 1, Lemma 3.5 implies

Pr[|L1(G)2εm|>L^(Xt)mε]=eΩ(L^(Xt)).\Pr\left[\lvert L_{1}(G)-2\varepsilon m\rvert>\sqrt{\hat{L}(X_{t})}\sqrt{\frac{m}{\varepsilon}}\right]=e^{-\Omega\left(\hat{L}(X_{t})\right)}.

Let c1=21+(q1)δq(q1)c_{1}=2\sqrt{\frac{1+(q-1)\delta}{q(q-1)}}. The upper tail bound implies

Pr[L1(G)2εm+c1n2/3]1eΩ(L^(Xt)).\Pr\left[L_{1}(G)\leq 2\varepsilon m+c_{1}n^{2/3}\right]\geq 1-e^{-\Omega\left(\hat{L}(X_{t})\right)}.
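This choice of c_{1} suffices because, by the assumption \hat{L}(X_{t})\leq\delta n^{1/3}, we have m\leq\frac{(1+(q-1)\delta)n}{q}(1+o(1)) and \varepsilon\geq(q-1)\hat{L}(X_{t})n^{-1/3}, so the deviation term in the tail bound satisfies

\sqrt{\hat{L}(X_{t})}\sqrt{\frac{m}{\varepsilon}}\leq\sqrt{\frac{(1+(q-1)\delta)(1+o(1))}{q(q-1)}}\,n^{2/3}\leq c_{1}n^{2/3}.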

We show next that 2εm+c1n2/3αL1(Xt)2\varepsilon m+c_{1}n^{2/3}\leq\alpha L_{1}(X_{t}) for some α(0,1)\alpha\in(0,1).

2εm+c1n2/3\displaystyle 2\varepsilon m+c_{1}n^{2/3}
=\displaystyle= 2(q1+qL^(Xt))L^(Xt)n1/3(L^(Xt)n2/3+nL^(Xt)n2/3q+L^(Xt)n2/3)+c1n2/3\displaystyle 2\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}\right)\frac{\hat{L}(X_{t})}{n^{1/3}}\left(\hat{L}(X_{t})n^{2/3}+\frac{n-\hat{L}(X_{t})n^{2/3}}{q}+\sqrt{\hat{L}(X_{t})}n^{2/3}\right)+c_{1}n^{2/3}
=\displaystyle= 2q(q1+qL^(Xt))L^(Xt)n1/3[n+(q1+qL^(Xt))L^(Xt)n2/3]+c1n2/3\displaystyle\frac{2}{q}\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}\right)\frac{\hat{L}(X_{t})}{n^{1/3}}\left[n+\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}\right)\hat{L}(X_{t})n^{2/3}\right]+c_{1}n^{2/3}
\displaystyle=\frac{2}{q}\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}+\frac{c_{1}q}{2\hat{L}(X_{t})}\right)\hat{L}(X_{t})n^{2/3}+\frac{2}{q}\left(q-1+\frac{q}{\sqrt{\hat{L}(X_{t})}}\right)^{2}\hat{L}(X_{t})^{2}n^{1/3}
\displaystyle\leq 2q[δ(q1+O(L^(Xt)1/2))2+(q1+O(L^(Xt)1/2))]L^(Xt)n2/3,\displaystyle\frac{2}{q}\left[\delta\left(q-1+O\left(\hat{L}(X_{t})^{-1/2}\right)\right)^{2}+\left(q-1+O\left(\hat{L}(X_{t})^{-1/2}\right)\right)\right]\hat{L}(X_{t})n^{2/3},

where in the last inequality we use the assumption that \delta n^{1/3}\geq\hat{L}(X_{t}). Since \frac{2(q-1)}{q}<1 precisely when q<2, for sufficiently small \delta and sufficiently large B there exists a constant \alpha<1 such that

α>2q[δ(q1+2qB1/2)2+(q1+2qB1/2)].\alpha>\frac{2}{q}\left[\delta\left(q-1+\frac{2q}{B^{1/2}}\right)^{2}+\left(q-1+\frac{2q}{B^{1/2}}\right)\right].

Consequently, L_{1}(G)\leq 2\varepsilon m+c_{1}n^{2/3}\leq\alpha L_{1}(X_{t}) with probability 1-\exp\left(-\Omega(\hat{L}(X_{t}))\right). If that is the case, L_{1}(X_{t+1})\leq\max\left\{\alpha L_{1}(X_{t}),L_{2}(X_{t})\right\}=:L^{+}. Therefore,

Pr[L1(Xt+1)L+Xt,Λt]\displaystyle\Pr\left[L_{1}(X_{t+1})\leq L^{+}\mid X_{t},\Lambda_{t}\right]
Pr[L1(Xt+1)L+Xt,Λt,A(Xt)Jt]Pr[A(Xt)JtXt,Λt]\displaystyle\geq\Pr\left[L_{1}(X_{t+1})\leq L^{+}\mid X_{t},\Lambda_{t},A(X_{t})\in J_{t}\right]\cdot\Pr\left[A(X_{t})\in J_{t}\mid X_{t},\Lambda_{t}\right]
1exp(Ω(L^(Xt))),\displaystyle\geq 1-\exp\left(-\Omega(\hat{L}(X_{t}))\right),

which concludes the proof of part 1.

For part 2, note first that when the largest component is inactive, we have L1(Xt+1)L1(Xt)L_{1}(X_{t+1})\geq L_{1}(X_{t}); hence, it is sufficient to show that L1(Xt+1)L1(Xt)L_{1}(X_{t+1})\leq L_{1}(X_{t}) with the desired probability.

Let μt:=E[A(Xt)¬Λt,Xt]=(nL^(Xt)n2/3)q1\mu_{t}^{\prime}:=\operatorname*{{\rm E}}\left[A(X_{t})\mid\neg\Lambda_{t},X_{t}\right]=\left(n-\hat{L}(X_{t})n^{2/3}\right)q^{-1}, γt:=L^(Xt)n2/3\gamma_{t}^{\prime}:=\sqrt{\hat{L}(X_{t})}\cdot n^{2/3}, and Jt:=[μtγt,μt+γt]J_{t}^{\prime}:=[\mu_{t}^{\prime}-\gamma_{t}^{\prime},\mu_{t}^{\prime}+\gamma_{t}^{\prime}]. By Hoeffding’s inequality,

Pr[A(Xt)Jt¬Λt,Xt]1exp(Ω(L^(Xt))).\Pr\left[A(X_{t})\in J_{t}^{\prime}\mid\neg\Lambda_{t},X_{t}\right]\geq 1-\exp\left(-\Omega\left(\hat{L}(X_{t})\right)\right).

Let G\sim G(A(X_{t}),p), m=\mu_{t}^{\prime}+\gamma_{t}^{\prime}, and let G^{+}\sim G\left(\mu_{t}^{\prime}+\gamma_{t}^{\prime},p\right). By monotonicity of the largest component in a random graph,

Pr[L1(G)>L1(Xt)A(Xt)Jt]Pr[L1(G+)>L1(Xt)].\Pr\left[L_{1}(G)>L_{1}(X_{t})\mid A(X_{t})\in J_{t}^{\prime}\right]\leq\Pr\left[L_{1}(G^{+})>L_{1}(X_{t})\right].

Rewrite G(μt+γt,p)G\left(\mu_{t}^{\prime}+\gamma_{t}^{\prime},p\right) as G(m,1+εm)G\left(m,\frac{1+\varepsilon}{m}\right), where

ε=(nL^(Xt)n2/3q+L^(Xt)n2/3)qn1=(L^(Xt)qL^(Xt))n1/3.\varepsilon=\left(\frac{n-\hat{L}(X_{t})n^{2/3}}{q}+\sqrt{\hat{L}(X_{t})}n^{2/3}\right)\cdot\frac{q}{n}-1=\left(\sqrt{\hat{L}(X_{t})}q-\hat{L}(X_{t})\right)n^{-1/3}.

Note that \varepsilon<0, with |\varepsilon|=\left(\hat{L}(X_{t})-q\sqrt{\hat{L}(X_{t})}\right)n^{-1/3}=\Theta\left(\hat{L}(X_{t})n^{-1/3}\right), since \hat{L}(X_{t})>B and B=B(q) is sufficiently large. Applying Lemma 3.7 to G^{+} and using m=\Theta(n), we obtain

\operatorname*{{\rm E}}\left[\mathcal{R}_{1}(G^{+})\right]=O\left(\frac{m}{|\varepsilon|}\right)=O\left(\frac{n^{4/3}}{\hat{L}(X_{t})}\right).

Hence, since L_{1}(G^{+})^{2}\leq\mathcal{R}_{1}(G^{+}), we have \operatorname*{{\rm E}}\left[L_{1}(G^{+})^{2}\right]=O\left(n^{4/3}/\hat{L}(X_{t})\right), and by Markov’s inequality

Pr[L1(G+)>L^(Xt)n2/3]=Pr[L1(G+)2>L^(Xt)2n4/3]E[L1(G+)2]L^(Xt)2n4/3=O(1L^(Xt)3).\Pr\left[L_{1}(G^{+})>\hat{L}(X_{t})n^{2/3}\right]=\Pr\left[L_{1}(G^{+})^{2}>\hat{L}(X_{t})^{2}n^{4/3}\right]\leq\frac{\operatorname*{{\rm E}}[L_{1}(G^{+})^{2}]}{\hat{L}(X_{t})^{2}n^{4/3}}=O\left(\frac{1}{\hat{L}(X_{t})^{3}}\right).

To conclude, we observe that

Pr[L1(Xt+1)L1(Xt)Xt,¬Λt]\displaystyle\Pr[L_{1}(X_{t+1})\leq L_{1}(X_{t})\mid X_{t},\neg\Lambda_{t}]
Pr[L1(G)L1(Xt)Xt,¬Λt,A(Xt)Jt]Pr[A(Xt)JtXt,¬Λt]\displaystyle\geq\Pr\left[L_{1}(G)\leq L_{1}(X_{t})\mid X_{t},\neg\Lambda_{t},A(X_{t})\in J_{t}^{\prime}\right]\Pr\left[A(X_{t})\in J_{t}^{\prime}\mid X_{t},\neg\Lambda_{t}\right]
(1eΩ(L^(Xt)))(1O(1L^(Xt)3))=1O(1L^(Xt)3),\displaystyle\geq\left(1-e^{-\Omega\left(\hat{L}(X_{t})\right)}\right)\left(1-O\left(\frac{1}{\hat{L}(X_{t})^{3}}\right)\right)=1-O\left(\frac{1}{\hat{L}(X_{t})^{3}}\right),

as desired. ∎

We are now ready to prove Lemma 4.4.

Proof of Lemma 4.4.

Suppose 2(X0)D12n4/3\mathcal{R}_{2}(X_{0})\leq D_{1}^{2}n^{4/3} for a constant D1D_{1}. Let T:=Blogg(n)T^{\prime}:=B^{\prime}\log g(n), where BB^{\prime} is a constant such that Blogg(n)=2qlog1/α(g(n)B(logg(n))8)B^{\prime}\log g(n)=2q\log_{1/\alpha}\left(\frac{g(n)}{B(\log g(n))^{8}}\right) and α:=α(B,q,δ)\alpha:=\alpha(B,q,\delta) is the constant from Lemma 4.3. Let T^\hat{T} be the first time

L^(Xt)max{B(logg(n))8,D},\hat{L}(X_{t})\leq\max\{B(\log g(n))^{8},D\},

where DD is a large constant we choose later. Let T:=TT^T:=T^{\prime}\wedge\hat{T}, where the operator \wedge takes the minimum of the two numbers. Define e(t)e(t) as the number of steps up to time tt in which the largest component of the configuration is activated.

To facilitate the notation, we define the following events. (The constants C and \alpha are those from Lemmas 4.2 and 4.3, respectively.)

  1. Let H_{i} denote the event \hat{L}(X_{i})>\max\{B(\log g(n))^{8},D\};

  2. Let F_{i} denote the event \mathcal{R}_{2}(X_{i})\leq\mathcal{R}_{2}(X_{i-1})+Cn^{4/3}\hat{L}(X_{i-1})^{-1/2}; let us assume F_{0} occurs;

  3. Let F^{\prime}_{i} denote the event \mathcal{R}_{2}(X_{i})\leq\mathcal{R}_{2}(X_{i-1})+Cn^{4/3}(\log g(n))^{-4}B^{-1/2}; again, we assume F^{\prime}_{0} occurs;

  4. Let Q_{i} denote the event \hat{L}(X_{i})\leq\max\{\alpha^{e(i)}\hat{L}(X_{0}),D\};

  5. Let Base_{i} be the intersection of the events \{F^{\prime}_{0},Q_{0},H_{0}\},\dots,\{F^{\prime}_{i-1},Q_{i-1},H_{i-1}\}, and \{F^{\prime}_{i},Q_{i}\}.

By induction, we find a lower bound for the probability of BaseTBase_{T}. For the base case, note that Pr[Base0]\Pr[Base_{0}] =1=1 by assumption. Next we show

Pr[Basei+1TBaseiT]=1O((logg(n))4).\Pr\left[Base_{i+1\wedge T}\mid Base_{i\wedge T}\right]=1-O((\log g(n))^{-4}).

If T\leq i, then Base_{i\wedge T}=Base_{T}=Base_{i+1\wedge T}, so the induction holds. If T>i, then we have H_{i}. By the induction hypothesis F^{\prime}_{1},F^{\prime}_{2},\dots,F^{\prime}_{i},

2(Xi)2(X0)+iCn4/3(logg(n))4B1/2.\mathcal{R}_{2}(X_{i})\leq\mathcal{R}_{2}(X_{0})+i\cdot Cn^{4/3}(\log g(n))^{-4}B^{-1/2}.

Moreover, since i<TT=Blogg(n)i<T\leq T^{\prime}=B^{\prime}\log g(n) and 2(X0)D12n4/3\mathcal{R}_{2}(X_{0})\leq D_{1}^{2}n^{4/3}, we have

2(Xi)D12n4/3+CBn4/3(logg(n))3B1/2.\mathcal{R}_{2}(X_{i})\leq D_{1}^{2}n^{4/3}+CB^{\prime}n^{4/3}(\log g(n))^{-3}B^{-1/2}.

Given 2(Xi)=O(n4/3)\mathcal{R}_{2}(X_{i})=O(n^{4/3}) and HiH_{i}, Lemma 4.2 implies that Fi+1F_{i+1} occurs with probability

1O(L^(Xi)1/2)=1O((logg(n))4).1-O(\hat{L}(X_{i})^{-1/2})=1-O((\log g(n))^{-4}).

In addition, note that the event F_{i+1}\cap H_{i} implies F^{\prime}_{i+1}. Let \mathbbm{1}(\Lambda_{i}) be the indicator function for the event \Lambda_{i} that the largest component of X_{i} is activated. Given H_{i},Q_{i} and \mathcal{R}_{2}(X_{i})=O(n^{4/3}), Lemma 4.3 implies

L_{1}(X_{i+1})\leq\max\{\alpha^{\mathbbm{1}(\Lambda_{i})}L_{1}(X_{i}),L_{2}(X_{i})\} (5)

with probability at least 1-O\left(\hat{L}(X_{i})^{-3}\right)=1-O\left((\log g(n))^{-24}\right).

Dividing equation (5) by n^{2/3} and noting that L_{2}(X_{i})\leq\sqrt{\mathcal{R}_{2}(X_{i})}, we obtain Q_{i+1} for large enough D; in particular, we can choose D to be D_{1}+2. A union bound then implies

Pr[Basei+1TBaseiT]Pr[Basei+1TBasei,Hi]=1O((logg(n))4).\Pr[Base_{i+1\wedge T}\mid Base_{i\wedge T}]\geq\Pr\left[Base_{i+1\wedge T}\mid Base_{i},H_{i}\right]=1-O((\log g(n))^{-4}).

The probability for BaseTBase_{T} can then be bounded as follows:

\Pr[Base_{T}]\geq\prod_{i=0}^{T-1}\Pr[Base_{i+1\wedge T}\mid Base_{i\wedge T}]=\prod_{i=0}^{T-1}\left(1-O((\log g(n))^{-4})\right)=1-O((\log g(n))^{-3}).

Next, let us assume BaseTBase_{T}. Then we have

2(XT)2(X0)+TCn4/3(logg(n))4B1/2=2(X0)+O(n4/3(logg(n))3).\mathcal{R}_{2}(X_{T})\leq\mathcal{R}_{2}(X_{0})+T^{\prime}\cdot Cn^{4/3}(\log g(n))^{-4}B^{-1/2}=\mathcal{R}_{2}(X_{0})+O\left(n^{4/3}(\log g(n))^{-3}\right).

Notice that if T=T^T=\hat{T} then the proof is complete. Consequently, it suffices to show T^T\hat{T}\leq T^{\prime} with probability at least 1g(n)Ω(1)1-g(n)^{-\Omega(1)}.

Observe that K:=e(T^{\prime}) is a binomial random variable \mathrm{Bin}\left(T^{\prime},1/q\right), whose expectation is \frac{T^{\prime}}{q}=\frac{B^{\prime}}{q}\log g(n). By a Chernoff bound,

Pr[K<B2qlogg(n)]exp(B16qlogg(n))=g(n)Ω(1).\Pr\left[K<\frac{B^{\prime}}{2q}\log g(n)\right]\leq\exp\left(-\frac{B^{\prime}}{16q}\log g(n)\right)=g(n)^{-\Omega(1)}.

If indeed T=T^{\prime} and K\geq\frac{B^{\prime}}{2q}\log g(n), then (unless \hat{L}(X_{T})\leq D, in which case \hat{T}\leq T holds trivially) the event Q_{T} implies

L^(XT)<αe(T)L^(X0)αlogα(B(logg(n))8g(n))L^(X0)=B(logg(n))8g(n)L^(X0)B(logg(n))8,\hat{L}(X_{T})<\alpha^{e(T)}\hat{L}(X_{0})\leq\alpha^{\log_{\alpha}\left(\frac{B(\log g(n))^{8}}{g(n)}\right)}\hat{L}(X_{0})=\frac{B(\log g(n))^{8}}{g(n)}\hat{L}(X_{0})\leq B(\log g(n))^{8},

which leads to T^T\hat{T}\leq T. Therefore,

Pr[T^>TBaseT]Pr[K<B2qlogg(n)]=g(n)Ω(1),\Pr\left[\hat{T}>T^{\prime}\mid Base_{T}\right]\leq\Pr\left[K<\frac{B^{\prime}}{2q}\log g(n)\right]=g(n)^{-\Omega(1)},

as desired. ∎

Proof of Claim 4.5.

We first show the following inequality:

1.5loggj(n)+1loggj+1(n)1.5loggj+1(n).\frac{1.5}{\log g_{j}(n)}+\frac{1}{\log g_{j+1}(n)}\leq\frac{1.5}{\log g_{j+1}(n)}. (6)

Note that by direct computation,

1.5loggj(n)+1loggj+1(n)=1.5(logB+log(loggj(n))8)+loggj(n)loggj(n)loggj+1(n).\begin{split}\frac{1.5}{\log g_{j}(n)}+\frac{1}{\log g_{j+1}(n)}&=\frac{1.5(\log B+\log(\log g_{j}(n))^{8})+\log g_{j}(n)}{\log g_{j}(n)\log g_{j+1}(n)}.\\ \end{split}

From the definition of K, we know that g_{j}(n)>B^{6} for all j<K. Hence, \log B<\frac{1}{6}\log g_{j}(n). In addition, recall that B is such that x\geq(\log x)^{48} for all x\geq B^{6}; therefore, g_{j}(n)\geq(\log g_{j}(n))^{48}. Then, \log(\log g_{j}(n))^{8}=8\log\log g_{j}(n)\leq\frac{1}{6}\log g_{j}(n). Putting all these together,

\begin{split}\frac{1.5(\log B+\log(\log g_{j}(n))^{8})+\log g_{j}(n)}{\log g_{j}(n)\log g_{j+1}(n)}&\leq\frac{1.5(\frac{1}{6}\log g_{j}(n)+\frac{1}{6}\log g_{j}(n))+\log g_{j}(n)}{\log g_{j}(n)\log g_{j+1}(n)}\\ &=\frac{1.5\log g_{j}(n)}{\log g_{j}(n)\log g_{j+1}(n)}=\frac{1.5}{\log g_{j+1}(n)}.\end{split}

The proof of part (i) is inductive. The base case (j=0) holds trivially. For the inductive step note that

i=0j+1(1cloggi(n))=(1cloggj+1(n))i=0j(1cloggi(n))(1cloggj+1(n))(11.5cloggj(n))1c(1.5loggj(n)+1loggj+1(n))11.5cloggj+1(n),\begin{split}\prod_{i=0}^{j+1}\left(1-\frac{c}{\log g_{i}(n)}\right)&=\left(1-\frac{c}{\log g_{j+1}(n)}\right)\prod_{i=0}^{j}\left(1-\frac{c}{\log g_{i}(n)}\right)\\ &\geq\left(1-\frac{c}{\log g_{j+1}(n)}\right)\left(1-\frac{1.5c}{\log g_{j}(n)}\right)\\ &\geq 1-c\left(\frac{1.5}{\log g_{j}(n)}+\frac{1}{\log g_{j+1}(n)}\right)\\ &\geq 1-\frac{1.5c}{\log g_{j+1}(n)},\end{split}

where the last inequality follows from (6).

For part (ii) we also use induction. The base case (j=0) can be checked straightforwardly. For the inductive step,

\begin{split}\sum_{i=0}^{j+1}\frac{1}{\log g_{i}(n)}&\leq\frac{1}{\log g_{j+1}(n)}+\sum_{i=0}^{j}\frac{1}{\log g_{i}(n)}\leq\frac{1}{\log g_{j+1}(n)}+\frac{1.5}{\log g_{j}(n)}\leq\frac{1.5}{\log g_{j+1}(n)},\end{split}

where the last inequality follows from (6). ∎

5 Coupling to the same component structure: proofs

In this section we provide the proofs of Lemma 2.6, Lemma 2.7, Lemma 2.8 and Lemma 2.9.

5.1 Continuation of the burn-in phase: proof of Lemma 2.6

Recall that for a random-cluster configuration X, A(X) denotes the random variable corresponding to the number of vertices activated by step (i) of the CM dynamics from X.
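For concreteness, the following minimal Python sketch (ours, not part of the paper) simulates one CM step on a configuration represented as a list of components, each a list of vertices; q is the cluster weight and p the edge probability of the percolation sub-step.

    import random
    from itertools import combinations

    def cm_step(components, q, p):
        """One Chayes-Machta step (illustrative sketch).
        Sub-step (i): activate each component independently with prob. 1/q.
        Sub-step (ii): replace the active subgraph by a G(A(X), p) sample."""
        active, kept = [], []
        for comp in components:
            (active if random.random() < 1.0 / q else kept).append(comp)
        vertices = [v for comp in active for v in comp]  # the A(X) active vertices

        parent = {v: v for v in vertices}  # union-find over the active vertices
        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]  # path halving
                v = parent[v]
            return v

        for u, v in combinations(vertices, 2):  # percolation sub-step: G(A(X), p)
            if random.random() < p:
                parent[find(u)] = find(v)

        clusters = {}
        for v in vertices:  # collect the new components of the percolated subgraph
            clusters.setdefault(find(v), []).append(v)
        return kept + list(clusters.values())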

Proof of Lemma 2.6.

We show that there exist suitable constants CC, D>0D>0 and α(0,1)\alpha\in(0,1) such that if 1(Xt)Cn4/3\mathcal{R}_{1}(X_{t})\leq Cn^{4/3} and ~ω(Xt)>Dn4/3ω(n)1/2\widetilde{\mathcal{R}}_{\omega}(X_{t})>Dn^{4/3}\omega(n)^{-1/2}, then

1(Xt+1)\displaystyle\mathcal{R}_{1}(X_{t+1}) Cn4/3,and\displaystyle\leq Cn^{4/3},~{}\text{and} (7)
~ω(Xt+1)\displaystyle\widetilde{\mathcal{R}}_{\omega}(X_{t+1}) (1α)~ω(Xt)\displaystyle\leq(1-\alpha)\widetilde{\mathcal{R}}_{\omega}(X_{t}) (8)

with probability ρ=Ω(1)\rho=\Omega(1). This implies that we can maintain (7)-(8) for TT steps with probability ρT\rho^{T}. Precisely, if we let

τ1\displaystyle\tau_{1} =min{t>0:1(Xt)>Cn4/3},\displaystyle=\min\{t>0:\mathcal{R}_{1}(X_{t})>Cn^{4/3}\},
τ2\displaystyle\tau_{2} =min{t>0:~ω(Xt)>(1α)~ω(Xt1)},\displaystyle=\min\{t>0:\widetilde{\mathcal{R}}_{\omega}(X_{t})>(1-\alpha)\widetilde{\mathcal{R}}_{\omega}(X_{t-1})\},
T\displaystyle T =min{τ1,τ2,clogω(n)},\displaystyle=\min\{\tau_{1},\tau_{2},c\log\omega(n)\},

where the constant c>0c>0 is chosen such that (1α)clogω(n)=O(ω(n)1/2)(1-\alpha)^{c\log\omega(n)}=O(\omega(n)^{-1/2}), then T=clogω(n)T=c\log\omega(n) with probability ρclogω(n)\rho^{c\log\omega(n)}. (Note that ρclogω(n)=ω(n)β\rho^{c\log\omega(n)}={\omega(n)}^{-\beta} for a suitable constant β>0\beta>0.) Hence, 1(XT)=O(n4/3)\mathcal{R}_{1}(X_{T})=O(n^{4/3}) and

~ω(XT)~ω(X0)O(ω(n)1/2)1(X0)O(ω(n)1/2)=O(n4/3ω(n)1/2).\widetilde{\mathcal{R}}_{\omega}(X_{T})\leq\widetilde{\mathcal{R}}_{\omega}(X_{0})\cdot O(\omega(n)^{-1/2})\leq\mathcal{R}_{1}(X_{0})\cdot O(\omega(n)^{-1/2})=O(n^{4/3}\omega(n)^{-1/2}).

The lemma then follows from the fact that I(XT)=Ω(n)I(X_{T})=\Omega(n) with probability 1o(1)1-o(1) by Lemma 3.2 and a union bound.

To establish (7)-(8), let \mathcal{H}^{1}_{t} be the event that A(X_{t})\in\left[n/q-\delta n^{2/3},n/q+\delta n^{2/3}\right], where \delta>0 is a constant. By Hoeffding’s inequality, for a suitable \delta>0, \Pr[\mathcal{H}^{1}_{t}]\geq 1-\frac{1}{8q^{2}} since \mathcal{R}_{1}(X_{t})=O(n^{4/3}). Let K_{t} denote the subgraph induced on the inactivated vertices at step t. Observe that \operatorname*{{\rm E}}\big{[}\widetilde{\mathcal{R}}_{\omega}(K_{t})\big{]}=\left(1-\frac{1}{q}\right)\widetilde{\mathcal{R}}_{\omega}(X_{t}). Similarly, \operatorname*{{\rm E}}\big{[}\mathcal{R}_{1}(K_{t})-\widetilde{\mathcal{R}}_{\omega}(K_{t})\big{]}=\left(1-\frac{1}{q}\right)\big{(}\mathcal{R}_{1}(X_{t})-\widetilde{\mathcal{R}}_{\omega}(X_{t})\big{)}. Hence, by Markov’s inequality and the independence between the activation of distinct components, with probability at least 1/4q^{2}, the activation sub-step is such that K_{t} satisfies

~ω(Kt)(112q)~ω(Xt),\widetilde{\mathcal{R}}_{\omega}(K_{t})\leq\left(1-\frac{1}{2q}\right)\widetilde{\mathcal{R}}_{\omega}(X_{t}),

and

\mathcal{R}_{1}(K_{t})-\widetilde{\mathcal{R}}_{\omega}(K_{t})\leq\left(1-\frac{1}{2q}\right)\big{(}\mathcal{R}_{1}(X_{t})-\widetilde{\mathcal{R}}_{\omega}(X_{t})\big{)}.

We denote this event by t2\mathcal{H}^{2}_{t}. It follows by a union bound that t1\mathcal{H}^{1}_{t} and t2\mathcal{H}^{2}_{t} happen simultaneously with probability at least 1/8q21/8q^{2}. We assume that this is indeed the case and proceed to discuss the percolation sub-step.

Lemma 3.10 implies that there exists C1>0C_{1}>0 such that with probability 99/10099/100,

~ω(G(A(Xt),qn))C1n4/3ω(n)1/2.\widetilde{\mathcal{R}}_{\omega}\left(G\left(A(X_{t}),\frac{q}{n}\right)\right)\leq C_{1}\frac{n^{4/3}}{{\omega(n)}^{1/2}}.

Hence,

~ω(Xt+1)=~ω(Kt)+~ω(G(A(Xt),qn))(112q)~ω(Xt)+C1n4/3ω(n)1/2(1α)~ω(Xt),\displaystyle\widetilde{\mathcal{R}}_{\omega}(X_{t+1})=\widetilde{\mathcal{R}}_{\omega}(K_{t})+\widetilde{\mathcal{R}}_{\omega}\left(G\left(A(X_{t}),\frac{q}{n}\right)\right)\leq\left(1-\frac{1}{2q}\right)\widetilde{\mathcal{R}}_{\omega}(X_{t})+C_{1}\frac{n^{4/3}}{{\omega(n)}^{1/2}}\leq(1-\alpha)\widetilde{\mathcal{R}}_{\omega}(X_{t}),

where the last inequality holds for a suitable constant α(0,1)\alpha\in(0,1) and a sufficiently large DD since ~ω(Xt)>Dn4/3ω(n)1/2\widetilde{\mathcal{R}}_{\omega}(X_{t})>Dn^{4/3}\omega(n)^{-1/2}.

On the other hand, Lemma 3.9 implies E[1(G(A(Xt),qn))]=O(n4/3)\operatorname*{{\rm E}}\left[\mathcal{R}_{1}\left(G\left(A(X_{t}),\frac{q}{n}\right)\right)\right]=O(n^{4/3}). By Markov’s inequality, there exists C2C_{2} such that, with probability 99/10099/100,

1(G(A(Xt),qn))C2n4/3.\mathcal{R}_{1}\left(G\left(A(X_{t}),\frac{q}{n}\right)\right)\leq C_{2}n^{4/3}.

For large enough CC,

1(Xt+1)\displaystyle\mathcal{R}_{1}(X_{t+1}) 1(Kt)+1(G(A(Xt),qn))(112q)1(Xt)+1(G(A(Xt),qn))\displaystyle\leq\mathcal{R}_{1}(K_{t})+\mathcal{R}_{1}\left(G\left(A(X_{t}),\frac{q}{n}\right)\right)\leq\big{(}1-\frac{1}{2q}\big{)}\mathcal{R}_{1}(X_{t})+\mathcal{R}_{1}\left(G\left(A(X_{t}),\frac{q}{n}\right)\right)
\displaystyle\leq\big{(}1-\frac{1}{2q}\big{)}Cn^{4/3}+C_{2}n^{4/3}\leq Cn^{4/3}.

Finally, it follows from a union bound that (7) and (8) hold simultaneously with probability at least 981008q2\frac{98}{100\cdot 8q^{2}}. ∎

5.2 Coupling to the same large component structure: proof of Lemma 2.7

To prove Lemma 2.7, we use a local limit theorem to construct a two-step coupling of the CM dynamics that reaches two configurations with the same large component structure. The construction of Markov chain couplings using local limit theorems is not common (see [24] for another example), but it appears to be a powerful technique that may have other interesting applications. We provide next a brief introduction to local limit theorems.

Local limit theorem.  Let m be an integer. Let c_{1}\leq\dots\leq c_{m} be integers and, for i=1,\dots,m, let X_{i} be the random variable that is equal to c_{i} with probability r\in(0,1) and is zero otherwise. Let us assume that X_{1},\dots,X_{m} are independent random variables. Let S_{m}=\sum_{i=1}^{m}X_{i}, \mu_{m}=\operatorname*{{\rm E}}[S_{m}] and \sigma_{m}^{2}=\operatorname*{{\rm Var}}(S_{m}). We say that a local limit theorem holds for S_{m} if for every integer a\in\mathbb{Z}:

Pr[Sm=a]=12πσmexp((aμm)22σm2)+o(σm1).\Pr[S_{m}=a]=\frac{1}{\sqrt{2\pi}\sigma_{m}}\exp\Big{(}-\frac{(a-\mu_{m})^{2}}{2\sigma_{m}^{2}}\Big{)}+o(\sigma_{m}^{-1}). (9)
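For example, when c_{i}=1 for all i, S_{m} is distributed as \mathrm{Bin}(m,r) with \mu_{m}=rm and \sigma_{m}^{2}=r(1-r)m, and (9) reduces to the classical local limit theorem for the binomial distribution: every integer a with |a-rm|=O(\sigma_{m}) is attained with probability \Theta(m^{-1/2}).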

We prove, under some conditions, a local limit theorem that applies to the random variables corresponding to the number of active vertices from small components. Recall that for an increasing positive function gg and each integer k0k\geq 0, we defined the intervals

\mathcal{I}_{k}=\left[\frac{\vartheta m^{2/3}}{2g(m)^{2^{k}}},\frac{\vartheta m^{2/3}}{g(m)^{2^{k}}}\right],

where ϑ>0\vartheta>0 is a fixed large constant.

Theorem 5.1.

Let mm be an integer. Let c1cmc_{1}\leq\dots\leq c_{m} be integers, and suppose X1,,XmX_{1},...,X_{m} are independent random variables such that XiX_{i} is equal to cic_{i} with probability r(0,1)r\in(0,1), and XiX_{i} is zero otherwise. Let g:g:\mathbb{N}\rightarrow\mathbb{R} be an increasing positive function such that g(m)g(m)\rightarrow\infty and g(m)=o(logm)g(m)=o(\log m). Suppose cm=O(m2/3g(m)1)c_{m}=O\left(m^{2/3}g(m)^{-1}\right), i=1mci2=O(m4/3g(m)1/2)\sum_{i=1}^{m}c_{i}^{2}=O\left(m^{4/3}g(m)^{-1/2}\right) and ci=1c_{i}=1 for all iρmi\leq\rho m, where ρ(0,1)\rho\in(0,1) is independent of mm. Let =(m,g)>0\ell=\ell(m,g)>0 be the smallest integer such that m2/3g(m)2=o(m1/4)m^{2/3}g(m)^{-2^{\ell}}=o(m^{1/4}). If for all 1k1\leq k\leq\ell, we have |{i:cik(g)}|=Ω(g(m)32k1)|\{i:c_{i}\in\mathcal{I}_{k}(g)\}|=\Omega(g(m)^{3\cdot 2^{k-1}}), then a local limit theorem holds for Sm=i=1mXiS_{m}=\sum_{i=1}^{m}X_{i}.
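For intuition on the scale parameter \ell: the defining condition m^{2/3}g(m)^{-2^{\ell}}=o(m^{1/4}) amounts to g(m)^{2^{\ell}}=\omega(m^{5/12}), which certainly holds once 2^{\ell}\geq\frac{\log m}{\log g(m)}; since \ell is the smallest such integer, \ell=O(\log\log m), so the hypothesis of the theorem only concerns the first few scales \mathcal{I}_{1},\dots,\mathcal{I}_{\ell}.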

Theorem 5.1 follows from a general local limit theorem proved in [28]; a proof is given in Appendix A. We provide next the proof of Lemma 2.7.

Proof of Lemma 2.7.

First, both {Xt}\{X_{t}\}, {Yt}\{Y_{t}\} perform one independent CM step from the initial configurations X0X_{0}, Y0Y_{0}. We start by establishing that X1X_{1} and Y1Y_{1} preserve the structural properties assumed for X0X_{0} and Y0Y_{0}.

By assumption 1(X0)=O(n4/3)\mathcal{R}_{1}(X_{0})=O(n^{4/3}), so Hoeffding’s inequality implies that the number of activated vertices from X0X_{0} is such that

A(X0)I:=[n/qO(n2/3),n/q+O(n2/3)]A(X_{0})\in I:=\left[n/q-O(n^{2/3}),n/q+O(n^{2/3})\right]

with probability Ω(1)\Omega(1). Then, the percolation step is distributed as a

G(A(X0),1+λA(X0)1/3A(X0))G\left(A(X_{0}),\frac{1+\lambda A(X_{0})^{-1/3}}{A(X_{0})}\right)

random graph, with |λ|=O(1)|\lambda|=O(1) with probability Ω(1)\Omega(1). Conditioning on this event, from Lemma 3.2 we obtain that I(X1)=Ω(n)I(X_{1})=\Omega(n) w.h.p. Moreover, from Lemma 3.9 and Markov’s inequality we obtain that 1(X1)=O(n4/3)\mathcal{R}_{1}(X_{1})=O(n^{4/3}) with probability at least 99/10099/100 and from Lemma 3.10 that ~ω(X1)=O(n4/3ω(n)1/2)\widetilde{\mathcal{R}}_{\omega}(X_{1})=O(n^{4/3}{\omega(n)}^{-1/2}) also with probability at least 99/10099/100.

We show next that X_{1} and Y_{1}, in addition to preserving the structural properties of X_{0} and Y_{0}, also have many connected components with sizes in certain carefully chosen intervals. This fact will be crucial in the design of our coupling. When A(X_{0})\in I, by Lemmas 3.11 and 3.12 and a union bound, for every integer k\geq 0 such that n^{2/3}\omega(n)^{-2^{k}}\rightarrow\infty, we have N_{k}(X_{1},\omega)=\Omega(\omega(n)^{3\cdot 2^{k-1}}) w.h.p. (Recall that N_{k}(X_{1},\omega) denotes the number of connected components of X_{1} with sizes in the interval \mathcal{I}_{k}(\omega).) We will also require a bound for the number of components with sizes in the interval

J=[cn2/3ω(n)6,2cn2/3ω(n)6],J=\left[\frac{cn^{2/3}}{\omega(n)^{6}},\frac{2cn^{2/3}}{\omega(n)^{6}}\right],

where c>0 is a constant such that J does not intersect any of the intervals \mathcal{I}_{k}(\omega). Let W_{X} (resp., W_{Y}) be the set of components of X_{1} (resp., Y_{1}) with sizes in the interval J. Lemma 3.11 then implies that for some positive constants \delta_{1},\delta_{2} independent of n,

Pr[|WX|δ1n(ω(n)6cn2/3)3/2]1δ2n(cn2/3ω(n)6)3/2=1O(ω(n)9).\Pr\Big{[}|W_{X}|\geq\delta_{1}n\big{(}\frac{\omega(n)^{6}}{cn^{2/3}}\big{)}^{3/2}\Big{]}\geq 1-\frac{\delta_{2}}{n}\Big{(}\frac{cn^{2/3}}{\omega(n)^{6}}\Big{)}^{3/2}=1-O(\omega(n)^{-9}).

All the bounds above apply also to the analogous quantities for Y1Y_{1} with the same respective probabilities. Therefore, by a union bound, all these properties hold simultaneously for both X1X_{1} and Y1Y_{1} with probability Ω(1)\Omega(1). We assume that this is indeed the case and proceed to describe the second step of the coupling, in which we shall use each of the established properties for X1X_{1} and Y1Y_{1}.

Recall 𝒮ω(X1)\mathcal{S}_{\omega}(X_{1}) and 𝒮ω(Y1)\mathcal{S}_{\omega}(Y_{1}) denote the sets of connected components in X1X_{1} and Y1Y_{1}, respectively, with sizes larger than BωB_{\omega}. (Recall that Bω=n2/3ω(n)1B_{\omega}=n^{2/3}\omega(n)^{-1}, where ω(n)=loglogloglogn\omega(n)=\log\log\log\log n.) Since 1(X1)=O(n4/3)\mathcal{R}_{1}(X_{1})=O(n^{4/3}), the total number of components in 𝒮ω(X1)\mathcal{S}_{\omega}(X_{1}) is O(ω(n)2)O(\omega(n)^{2}); moreover, it follows from the Cauchy–Schwarz inequality that the total number of vertices in the components in 𝒮ω(X1)\mathcal{S}_{\omega}(X_{1}), denoted 𝒮ω(X1)\|\mathcal{S}_{\omega}(X_{1})\|, is O(n2/3ω(n))O(n^{2/3}\omega(n)); the same holds for 𝒮ω(Y1)\mathcal{S}_{\omega}(Y_{1}). Without loss of generality, let us assume that 𝒮ω(X1)𝒮ω(Y1)\|\mathcal{S}_{\omega}(X_{1})\|\geq\|\mathcal{S}_{\omega}(Y_{1})\|. Let

Γ={CWY:𝒮ω(Y1)C𝒮ω(X1)},\varGamma=\{C\subset W_{Y}:\|\mathcal{S}_{\omega}(Y_{1})\cup C\|\geq\|\mathcal{S}_{\omega}(X_{1})\|\},

and let C_{\rm min}=\arg\min_{C\in\varGamma}\|\mathcal{S}_{\omega}(Y_{1})\cup C\|. In words, C_{\rm min} is a subset C of components of W_{Y} such that the number of vertices in the union of \mathcal{S}_{\omega}(Y_{1}) and C is at least that in \mathcal{S}_{\omega}(X_{1}), and is minimal with this property. Since every component in W_{Y} has size at least cn^{2/3}\omega(n)^{-6} and |W_{Y}|=\Omega(\omega(n)^{9}), the number of vertices in W_{Y} is \Omega(n^{2/3}\omega(n)^{3}) and so \varGamma\neq\emptyset. In addition, the number of components in C_{\rm min} is O(\omega(n)^{9}). Let \mathcal{S}_{\omega}^{\prime}(Y_{1})=\mathcal{S}_{\omega}(Y_{1})\cup C_{\rm min} and observe that the number of components in \mathcal{S}_{\omega}^{\prime}(Y_{1}) is also O(\omega(n)^{9}) and that

0𝒮ω(Y1)𝒮ω(X1)2cn2/3ω(n)6.0\leq\|\mathcal{S}_{\omega}^{\prime}(Y_{1})\|-\|\mathcal{S}_{\omega}(X_{1})\|\leq 2cn^{2/3}\omega(n)^{-6}.

Note that 𝒮ω(X1)𝒮ω(Y1)\|\mathcal{S}_{\omega}(X_{1})\|-\|\mathcal{S}_{\omega}(Y_{1})\| may be Ω(n2/3ω(n))\Omega(n^{2/3}\omega(n)) (i.e., much larger than 𝒮ω(Y1)𝒮ω(X1)\|\mathcal{S}_{\omega}^{\prime}(Y_{1})\|-\|\mathcal{S}_{\omega}(X_{1})\|). Hence, if all the components from 𝒮ω(Y1)\mathcal{S}_{\omega}(Y_{1}) and 𝒮ω(X1)\mathcal{S}_{\omega}(X_{1}) were activated, the difference in the number of active vertices could be Ω(n2/3ω(n))\Omega(n^{2/3}\omega(n)). This difference cannot be corrected by our coupling for the activation of the small components. We shall require instead that all the components from 𝒮ω(Y1)\mathcal{S}_{\omega}^{\prime}(Y_{1}) and 𝒮ω(X1)\mathcal{S}_{\omega}(X_{1}) are activated so that the difference is O(n2/3ω(n)6)O(n^{2/3}\omega(n)^{-6}) instead.

We now describe a coupling of the activation sub-step for the second step of the CM dynamics. As mentioned, our goal is to design a coupling in which the same number of vertices are activated from each copy. If indeed A(X_{1})=A(Y_{1}), then we can choose an arbitrary bijective map \varphi between the activated vertices of X_{1} and the activated vertices of Y_{1} and use \varphi to couple the percolation sub-step. Specifically, if u and v were activated in X_{1}, the state of the edge \{u,v\} in X_{2} and that of \{\varphi(u),\varphi(v)\} in Y_{2} would be the same. This yields a coupling of the percolation sub-step such that X_{2} and Y_{2} agree on the subgraph update at time 1.
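As an illustration of this bijective coupling, here is a minimal Python sketch (our own; the identity-order bijection \varphi mapping active_x[i] to active_y[i] and the list representation are assumptions of the sketch): one shared coin per vertex pair makes the two resampled subgraphs isomorphic via \varphi.

    import random

    def coupled_percolation(active_x, active_y, p):
        """Couple the percolation sub-step of two copies, given equal numbers
        of activated vertices; each potential edge uses one shared coin."""
        assert len(active_x) == len(active_y)
        edges_x, edges_y = [], []
        k = len(active_x)
        for i in range(k):
            for j in range(i + 1, k):
                if random.random() < p:  # the same coin decides both edges
                    edges_x.append((active_x[i], active_x[j]))
                    edges_y.append((active_y[i], active_y[j]))
        return edges_x, edges_y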

Suppose then that in the second CM step all the components in 𝒮ω(X1)\mathcal{S}_{\omega}(X_{1}) and 𝒮ω(Y1)\mathcal{S}_{\omega}^{\prime}(Y_{1}) are activated simultaneously. If this is the case, then the difference in the number of activated vertices is d2cn2/3ω(n)6d\leq 2cn^{2/3}\omega(n)^{-6}. We will use a local limit theorem (i.e., Theorem 5.1) to argue that there is a coupling of the activation of the remaining components in X1X_{1} and Y1Y_{1} such that the total number of active vertices in both copies is the same with probability Ω(1)\Omega(1). Since all the components in 𝒮ω(X1)\mathcal{S}_{\omega}(X_{1}) and 𝒮ω(Y1)\mathcal{S}_{\omega}^{\prime}(Y_{1}) are activated with probability exp(O(ω(n)9))\exp(-O(\omega(n)^{9})), the overall success probability of the coupling will be exp(O(ω(n)9))\exp(-O(\omega(n)^{9})).

Now, let x_{1},x_{2},\dots,x_{m} be the sizes of the components of X_{1} that are not in \mathcal{S}_{\omega}(X_{1}) (in increasing order). Let \hat{A}(X_{1}) be the random variable corresponding to the number of active vertices from these components. Observe that \hat{A}(X_{1}) is the sum of m independent random variables, where the j-th variable in the sum is equal to x_{j} with probability 1/q, and it is 0 otherwise. We claim that the sequence x_{1},x_{2},\dots,x_{m} satisfies all the conditions in Theorem 5.1.

First, note that since the number of isolated vertices in X_{1} is \Omega(n), we have m=\Theta(n) and consequently x_{m}=O(m^{2/3}\omega(m)^{-1}), \sum_{i=1}^{m}x_{i}^{2}=\widetilde{\mathcal{R}}_{\omega}(X_{1})=O(m^{4/3}\omega(m)^{-1/2}) and x_{i}=1 for all i\leq\rho m, where \rho\in(0,1) is independent of m. Moreover, since N_{k}(X_{1},\omega)=\Omega(\omega(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k}}\rightarrow\infty,

|{i:xik(ω)}|=Ω(ω(m)32k1).|\{i:x_{i}\in\mathcal{I}_{k}(\omega)\}|=\Omega(\omega(m)^{3\cdot 2^{k-1}}).

Since N0(X1,ω)=Ω(ω(n)3/2)N_{0}(X_{1},\omega)=\Omega(\omega(n)^{3/2}), we also have

i=1mxi2N0(X1,ω)ϑ2n4/34ω(n)2=Ω(m4/3ω(m)1/2).\sum\nolimits_{i=1}^{m}x_{i}^{2}\geq N_{0}(X_{1},\omega)\cdot\frac{\vartheta^{2}n^{4/3}}{4\omega(n)^{2}}=\Omega({m^{4/3}\omega(m)^{-1/2}}).

Let μX=E[A^(X1)]=q1i=1mxi\mu_{X}=\operatorname*{{\rm E}}[\hat{A}(X_{1})]=q^{-1}\sum_{i=1}^{m}x_{i} and let

σX2=Var(A^(X1))=q1(1q1)i=1mxi2=Θ(m4/3ω(m)1/2).\sigma_{X}^{2}=\operatorname*{{\rm Var}}(\hat{A}(X_{1}))=q^{-1}(1-q^{-1})\sum_{i=1}^{m}x_{i}^{2}=\Theta(m^{4/3}\omega(m)^{-1/2}).

Hence, Theorem 5.1 implies that Pr[A^(X1)=a]=Ω(σX1)\Pr[\hat{A}(X_{1})=a]=\Omega\left(\sigma_{X}^{-1}\right) for any a[μXσX,μX+σX]a\in[\mu_{X}-\sigma_{X},\mu_{X}+\sigma_{X}]. Similarly, we get Pr[A^(Y1)=a]=Ω(σY1)\Pr[\hat{A}(Y_{1})=a]=\Omega(\sigma_{Y}^{-1}) for any a[μYσY,μY+σY]a\in[\mu_{Y}-\sigma_{Y},\mu_{Y}+\sigma_{Y}], with A^(Y1)\hat{A}(Y_{1}), μY\mu_{Y} and σY\sigma_{Y} defined analogously for Y1Y_{1}. Note that μXμY=O(n2/3ω(n)6)\mu_{X}-\mu_{Y}=O(n^{2/3}\omega(n)^{-6}) and σX,σY=Θ(n2/3ω(n)1/4)\sigma_{X},\sigma_{Y}=\Theta(n^{2/3}\omega(n)^{-1/4}). Without loss of generality, suppose σX<σY\sigma_{X}<\sigma_{Y}. Then for any a[μXσX/2,μY+σX/2]a\in[\mu_{X}-\sigma_{X}/2,\mu_{Y}+\sigma_{X}/2] and d=O(n2/3ω(n)6)d=O(n^{2/3}\omega(n)^{-6}), we have

min{Pr[A^(X1)=a],Pr[A^(Y1)=ad]}=min{Ω(σX1),Ω(σY1)}=Ω(σY1).\min\left\{\Pr[\hat{A}(X_{1})=a],\Pr[\hat{A}(Y_{1})=a-d]\right\}=\min\left\{\Omega(\sigma_{X}^{-1}),\Omega(\sigma_{Y}^{-1})\right\}=\Omega(\sigma_{Y}^{-1}).

Hence, there exists a coupling \mathbb{P} of A^(X1)\hat{A}(X_{1}) and A^(Y1)\hat{A}(Y_{1}) so that [A^(X1)=a,A^(Y1)=ad]=Ω(σY1)\mathbb{P}[\hat{A}(X_{1})=a,\hat{A}(Y_{1})=a-d]=\Omega(\sigma_{Y}^{-1}) for all a[μXσX/2,μY+σX/2]a\in\left[\mu_{X}-\sigma_{X}/2,\mu_{Y}+\sigma_{X}/2\right]. Therefore, there is a coupling of A^(X1)\hat{A}(X_{1}) and A^(Y1)\hat{A}(Y_{1}) such that

Pr[A^(X1)A^(Y1)=d]=Ω(σX/σY)=Ω(1).\Pr[\hat{A}(X_{1})-\hat{A}(Y_{1})=d]=\Omega\left({\sigma_{X}}/{\sigma_{Y}}\right)=\Omega(1).
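Such a coupling can be constructed by the standard recipe: for each integer a\in[\mu_{X}-\sigma_{X}/2,\mu_{Y}+\sigma_{X}/2], assign the pair (a,a-d) joint probability \min\{\Pr[\hat{A}(X_{1})=a],\Pr[\hat{A}(Y_{1})=a-d]\}=\Omega(\sigma_{Y}^{-1}), which is feasible since distinct values of a use distinct values in each coordinate, and complete the remaining mass to a valid coupling arbitrarily; summing over the \Omega(\sigma_{X}) integers a in the interval gives the stated bound \Omega(\sigma_{X}/\sigma_{Y}).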

Putting all these together, we deduce that A(X1)=A(Y1)A(X_{1})=A(Y_{1}) with probability exp(O(ω(n)9))\exp(-O(\omega(n)^{9})). If this is the case, the edge re-sampling step is coupled bijectively (as described above) so that 𝒮ω(X2)=𝒮ω(Y2)\mathcal{S}_{\omega}(X_{2})=\mathcal{S}_{\omega}(Y_{2}).

It remains for us to guarantee the additional desired structural properties of X2X_{2} and Y2Y_{2}, which follow straightforwardly from the random graph estimates stated in Section 3. First note that by Hoeffding’s inequality, with probability Ω(1)\Omega(1),

|A(X1)nq(q1)𝒮ω(X1)q|=O(n2/3).\left|A(X_{1})-\frac{n}{q}-\frac{(q-1)\|\mathcal{S}_{\omega}(X_{1})\|}{q}\right|=O(n^{2/3}).

Hence, in the percolation sub-step the active subgraph is replaced by FG(A(X1),1+λA(X1)1/3A(X1))F\sim G\left(A(X_{1}),\frac{1+\lambda A(X_{1})^{-1/3}}{A(X_{1})}\right), where |λ|=O(ω(n))|\lambda|=O(\omega(n)) with probability Ω(1)\Omega(1) since 𝒮ω(X1)=O(n2/3ω(n))\|\mathcal{S}_{\omega}(X_{1})\|=O(n^{2/3}\omega(n)). Conditioning on this event, since the components of FF contribute to both X2X_{2} and Y2Y_{2}, Corollary 3.12 implies that w.h.p. N^k(2,ω(n))\hat{N}_{k}(2,\omega(n)) =Ω(ω(n)32k1)=\Omega(\omega(n)^{3\cdot 2^{k-1}}) for all k1k\geq 1 such that n2/3ω(n)2kn^{2/3}\omega(n)^{-2^{k}}\rightarrow\infty. Moreover, from Lemma 3.2 we obtain that I(X2)=Ω(n)I(X_{2})=\Omega(n) w.h.p. From Lemma 3.4 and Markov’s inequality, we obtain that 2(X2)=O(n4/3)\mathcal{R}_{2}(X_{2})=O(n^{4/3}) with probability at least 99/10099/100 and from Lemma 3.10 that ~ω(X2)=O(n4/3ω(n)1/2)\widetilde{\mathcal{R}}_{\omega}(X_{2})=O(n^{4/3}{\omega(n)}^{-1/2}) also with probability at least 99/10099/100. All these bounds apply also to the analogous quantities for Y2Y_{2} with the same respective probabilities.

Finally, we derive the bound for L1(X2)L_{1}(X_{2}) and L1(Y2)L_{1}(Y_{2}). First, notice L1(F)L_{1}(F) is stochastically dominated by L1(F)L_{1}(F^{\prime}), where FG(A(X1),1+|λ|A(X1)1/3A(X1))F^{\prime}\sim G\Big{(}A(X_{1}),\frac{1+|\lambda|A(X_{1})^{-1/3}}{A(X_{1})}\Big{)}. Under the assumption that |λ|=O(ω(n))|\lambda|=O(\omega(n)), if |λ||\lambda|\rightarrow\infty, then Corollary 3.6 implies that L1(F)=O(|λ|A(X1)2/3)=O(n2/3ω(n))L_{1}(F^{\prime})=O(|\lambda|A(X_{1})^{2/3})=O(n^{2/3}\omega(n)) w.h.p.; otherwise, |λ|=O(1)|\lambda|=O(1) and by Lemma 3.9 and Markov’s inequality, L1(F)=O(n2/3)L_{1}(F^{\prime})=O(n^{2/3}) with probability at least 99/10099/100. Thus, L1(F)=O(n2/3ω(n))L_{1}(F)=O(n^{2/3}\omega(n)) with probability at least 99/10099/100. We also know that the largest inactivated component in X1X_{1} has size less than n2/3ω(n)1n^{2/3}\omega(n)^{-1}, so L1(X2)=O(n2/3ω(n))L_{1}(X_{2})=O(n^{2/3}\omega(n)) with probability at least 99/10099/100. The same holds for Y2Y_{2}. Therefore, by a union bound, all these properties hold simultaneously for both X2X_{2} and Y2Y_{2} with probability Ω(1)\Omega(1), as claimed. ∎

5.3 Re-contracting largest component: proof of Lemma 2.8

In Section 5.2, we designed a coupling argument to ensure that the largest components of both configurations have the same size. For this, we needed to relax our constraint on the size of the largest component of the configurations. In this section we prove Lemma 2.8, which ensures that after O(logω(n))O(\log\omega(n)) steps the largest components of each configuration have size O(n2/3)O(n^{2/3}) again.

The following lemma is the core of the proof of Lemma 2.8, and it may be viewed as a generalization of the coupling from the proof of Lemma 2.7 using the local limit theorem from Section 5.2.

We recall some notation from the proof sketch. Given two random-cluster configurations X_{t} and Y_{t}, W_{t} is a maximal matching between the components of X_{t} and Y_{t} that only matches components of equal size to each other. We use M(X_{t}), M(Y_{t}) for the components in W_{t} from X_{t}, Y_{t}, respectively, D(X_{t}), D(Y_{t}) for the complements of M(X_{t}), M(Y_{t}), and Z_{t}=\sum_{\mathcal{C}\in D(X_{t})\cup D(Y_{t})}|\mathcal{C}|^{2}. For an increasing positive function g and each integer k\geq 1, define \hat{N}_{k}(t,g):=\hat{N}_{k}(X_{t},Y_{t},g) as the number of matched pairs in W_{t} whose component sizes are in the interval

k(g)=[ϑn2/32g(n)2k,ϑn2/3g(n)2k],\mathcal{I}_{k}(g)=\left[\frac{\vartheta n^{2/3}}{2g(n)^{2^{k}}},\frac{\vartheta n^{2/3}}{g(n)^{2^{k}}}\right],

where ϑ>0\vartheta>0 is a fixed large constant (independent of nn).

Lemma 5.2.

There exists a coupling of the activation sub-step of the CM dynamics such that A(X_{t})=A(Y_{t}) with probability \Omega\left(\frac{1}{\omega(n)}\right), provided X_{t} and Y_{t} are random-cluster configurations satisfying

  1. \mathcal{S}_{\omega}(X_{t})=\mathcal{S}_{\omega}(Y_{t});

  2. Z_{t}=O\left(\frac{n^{4/3}}{\omega(n)^{1/2}}\right);

  3. \hat{N}_{k}(X_{t},Y_{t},\omega(n))=\Omega\left(\omega(n)^{3\cdot 2^{k-1}}\right) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k}}\rightarrow\infty;

  4. I(X_{t}),I(Y_{t})=\Omega(n).

Proof.

The activation coupling has two parts. First we use the maximal matching WtW_{t} to couple the activation of a subset of the components in M(Xt)M(X_{t}) and M(Yt)M(Y_{t}). Specifically, let \ell be defined as in Theorem 5.1; for all k[1,]k\in[1,\ell], we exclude Θ(ω(n)32k1)\Theta(\omega(n)^{3\cdot 2^{k-1}}) pairs of components of size in the interval k(ω)\mathcal{I}_{k}(\omega) and we exclude Θ(n)\Theta(n) pairs of matched isolated vertices. (These components exist by assumptions 3 and 4.) All other pairs of components matched by WtW_{t} are jointly activated (or not). Hence, the number of vertices activated from XtX_{t} in this first part of the coupling is the same as that from YtY_{t}.
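In code, the first part of this coupling might look as follows; this is a minimal sketch under our own data representation (matched_pairs lists the component pairs of W_t, and excluded is the set of indices of the pairs reserved for the second part), not part of the paper's formal argument.

    import random

    def joint_activation(matched_pairs, q, excluded):
        """First part of the activation coupling: every matched pair not
        excluded is activated jointly, so both copies activate the same
        number of vertices in this part."""
        active_x, active_y = [], []
        for idx, (comp_x, comp_y) in enumerate(matched_pairs):
            if idx in excluded:
                continue  # coupled separately via Theorem 5.1
            if random.random() < 1.0 / q:  # one coin: both activate or neither
                active_x.append(comp_x)
                active_y.append(comp_y)
        return active_x, active_y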

Let \mathcal{C}(X_{t}) and \mathcal{C}(Y_{t}) denote the sets of components of X_{t} and of Y_{t}, respectively, that were not considered for activation in the first part of the coupling. This includes all the components from D(X_{t}) and D(Y_{t}), and all the components from M(X_{t}) and M(Y_{t}) excluded in the first part of the coupling. Let A^{\prime}(X_{t}) and A^{\prime}(Y_{t}) denote the number of activated vertices from \mathcal{C}(X_{t}) and \mathcal{C}(Y_{t}), respectively. The second part is a coupling of the activation sub-step in such a way that

Pr[A(Xt)=A(Yt)]=Ω(ω(n)1).\Pr\left[A^{\prime}(X_{t})=A^{\prime}(Y_{t})\right]=\Omega(\omega(n)^{-1}).

Let m_{x}:=|\mathcal{C}(X_{t})|=\Theta(n), and similarly m_{y}:=|\mathcal{C}(Y_{t})|. Let \mathcal{C}_{1}\leq\dots\leq\mathcal{C}_{m_{x}} (resp., \mathcal{C^{\prime}}_{1}\leq\dots\leq\mathcal{C^{\prime}}_{m_{y}}) be the sizes of the components in \mathcal{C}(X_{t}) (resp., \mathcal{C}(Y_{t})) in ascending order. For all i\leq m_{x}, let \mathcal{X}_{i} be a random variable that equals \mathcal{C}_{i} with probability 1/q and 0 otherwise; this corresponds to the number of activated vertices from the i-th component in \mathcal{C}(X_{t}). Note that \mathcal{X}_{1},\dots,\mathcal{X}_{m_{x}} are independent. We check that \mathcal{X}_{1},\dots,\mathcal{X}_{m_{x}} satisfy all other conditions of Theorem 5.1.

The assumption \mathcal{S}_{\omega}(X_{t})=\mathcal{S}_{\omega}(Y_{t}) and the first part of the activation coupling ensure that

𝒞mxBω=O(n2/3ω(n)1)=O(mx2/3ω(mx)1).\mathcal{C}_{m_{x}}\leq B_{\omega}=O\left(n^{2/3}\omega(n)^{-1}\right)=O\left(m_{x}^{2/3}\omega(m_{x})^{-1}\right).

Observe also that there exists a constant ρ\rho such that 𝒞i=1\mathcal{C}_{i}=1 for iρmxi\leq\rho m_{x} and |{i:𝒞ik(ω)}|=Θ(ω(n)32k1)\lvert\{i:\mathcal{C}_{i}\in\mathcal{I}_{k}(\omega)\}\rvert=\Theta\left(\omega(n)^{3\cdot 2^{k-1}}\right) for 1k1\leq k\leq\ell; lastly, from assumption Zt=O(n4/3ω(n)1/2)Z_{t}=O\left(\frac{n^{4/3}}{\omega(n)^{1/2}}\right), we obtain

i=1mx𝒞i2\displaystyle\sum_{i=1}^{m_{x}}\mathcal{C}_{i}^{2} Zt+O(ρmx)+k=1ϑn4/3ω(n)2k+1O(ω(n)32k1)\displaystyle\leq Z_{t}+O(\rho m_{x})+\sum_{k=1}^{\ell}\frac{\vartheta n^{4/3}}{\omega(n)^{2^{k+1}}}\cdot O\left(\omega(n)^{3\cdot 2^{k-1}}\right)
=O(mx4/3ω(mx))+O(k=1mx4/3ω(mx)2k1)\displaystyle=O\left(\frac{m_{x}^{4/3}}{\sqrt{\omega(m_{x})}}\right)+O\left(\sum_{k=1}^{\ell}\frac{m_{x}^{4/3}}{\omega(m_{x})^{2^{k-1}}}\right)
=O(mx4/3ω(mx))+O(k=1mx4/3ω(mx)k)\displaystyle=O\left(\frac{m_{x}^{4/3}}{\sqrt{\omega(m_{x})}}\right)+O\left(\sum_{k=1}^{\ell}\frac{m_{x}^{4/3}}{\omega(m_{x})^{k}}\right)
=O(mx4/3ω(mx))+O(mx4/3ω(mx))=O(mx4/3ω(mx)).\displaystyle=O\left(\frac{m_{x}^{4/3}}{\sqrt{\omega(m_{x})}}\right)+O\left(\frac{m_{x}^{4/3}}{\omega(m_{x})}\right)=O\left(\frac{m_{x}^{4/3}}{\sqrt{\omega(m_{x})}}\right). (10)

Therefore, if μx=E[i=1mx𝒳i]\mu_{x}=\operatorname*{{\rm E}}\left[\sum_{i=1}^{m_{x}}\mathcal{X}_{i}\right] and σx2=Var(i=1mx𝒳i)\sigma_{x}^{2}=Var\left(\sum_{i=1}^{m_{x}}\mathcal{X}_{i}\right), Theorem 5.1 implies that for any x[μxσx,μx+σx]x\in[\mu_{x}-\sigma_{x},\mu_{x}+\sigma_{x}],

Pr[A(Xt)=x]=Pr[i=1mx𝒳i=x]=12πσxexp((xμx)22σx2)+o(1σx)=Ω(1σx).\Pr\left[A^{\prime}(X_{t})=x\right]=\Pr\left[\sum_{i=1}^{m_{x}}\mathcal{X}_{i}=x\right]=\frac{1}{\sqrt{2\pi}\sigma_{x}}\exp\left(-\frac{(x-\mu_{x})^{2}}{2\sigma_{x}^{2}}\right)+o\left(\frac{1}{\sigma_{x}}\right)=\Omega\left(\frac{1}{\sigma_{x}}\right).

Similarly, we get that Pr[A(Yt)=y]=Ω(σy1)\Pr\left[A^{\prime}(Y_{t})=y\right]=\Omega(\sigma_{y}^{-1}) for any y[μyσy,μy+σy]y\in[\mu_{y}-\sigma_{y},\mu_{y}+\sigma_{y}], with μy\mu_{y} and σy\sigma_{y} defined analogously. Without loss of generality, suppose σyσx\sigma_{y}\leq\sigma_{x}. Since μx=μy\mu_{x}=\mu_{y}, for x[μxσy,μx+σy]x\in\left[\mu_{x}-\sigma_{y},\mu_{x}+\sigma_{y}\right], we obtain

min{Pr[A(Xt)=x],Pr[A(Yt)=x]}=Ω(1σx).\min\left\{\Pr\left[A^{\prime}(X_{t})=x\right],\Pr\left[A^{\prime}(Y_{t})=x\right]\right\}=\Omega\left(\frac{1}{\sigma_{x}}\right).

Hence, we can couple (A(Xt),A(Yt))(A^{\prime}(X_{t}),A^{\prime}(Y_{t})) so that Pr[A(Xt)=A(Yt)=x]=Ω(σx1)\Pr[A^{\prime}(X_{t})=A^{\prime}(Y_{t})=x]=\Omega(\sigma_{x}^{-1}) for all x[μxσy,μx+σy]x\in[\mu_{x}-\sigma_{y},\mu_{x}+\sigma_{y}]. Consequently, under this coupling,

Pr[A(Xt)=A(Yt)]=Ω(σyσx).\Pr\left[A^{\prime}(X_{t})=A^{\prime}(Y_{t})\right]=\Omega\left(\frac{\sigma_{y}}{\sigma_{x}}\right).

Since 𝒳1,,𝒳mx\mathcal{X}_{1},\dots,\mathcal{X}_{m_{x}} are independent, σx2=Θ(i=1mx𝒞i2)\sigma_{x}^{2}=\Theta\left(\sum_{i=1}^{m_{x}}\mathcal{C}_{i}^{2}\right), and similarly σy2=Θ(i=1my𝒞i2)\sigma_{y}^{2}=\Theta\left(\sum_{i=1}^{m_{y}}\mathcal{C^{\prime}}_{i}^{2}\right). Hence, inequality (10) gives an upper bound for σx2\sigma_{x}^{2}; meanwhile, a lower bound for σy2\sigma_{y}^{2} can be obtained by counting components in the largest interval:

\sum_{i=1}^{m_{y}}\mathcal{C^{\prime}}_{i}^{2}\geq\sum_{i:\mathcal{C^{\prime}}_{i}\in\mathcal{I}_{1}(\omega)}\mathcal{C^{\prime}}_{i}^{2}\geq\frac{\vartheta^{2}n^{4/3}}{4\omega(n)^{4}}\cdot\Theta\left(\omega(n)^{3}\right)=\Omega\left(\frac{n^{4/3}}{\omega(n)}\right).

Therefore,

Pr[A(Xt)=A(Yt)]=Ω(n2/3ω(n)1/2ω(mx)1/4mx2/3)=Ω(1ω(n)),\Pr\left[A^{\prime}(X_{t})=A^{\prime}(Y_{t})\right]=\Omega\left(\frac{n^{2/3}}{\omega(n)^{1/2}}\cdot\frac{\omega(m_{x})^{1/4}}{m_{x}^{2/3}}\right)=\Omega\left(\frac{1}{\omega(n)}\right),

as desired. ∎

We are now ready to prove Lemma 2.8.

Proof of Lemma 2.8.

Let C1C_{1} be a suitable constant that we choose later. We wish to maintain the following properties for all tT:=C1logω(n)t\leq T:=C_{1}\log\omega(n):

  1. \mathcal{S}_{\omega}(X_{t})=\mathcal{S}_{\omega}(Y_{t});

  2. Z_{t}=O\left(\frac{n^{4/3}}{\omega(n)^{1/2}}\right);

  3. \hat{N}_{k}(X_{t},Y_{t},\omega(n))=\Omega\left(\omega(n)^{3\cdot 2^{k-1}}\right) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k}}\rightarrow\infty;

  4. I(X_{t}),I(Y_{t})=\Omega(n);

  5. \mathcal{R}_{2}(X_{t}),\mathcal{R}_{2}(Y_{t})=O(n^{4/3});

  6. L_{1}(X_{t})\leq\alpha^{t}L_{1}(X_{0}) and L_{1}(Y_{t})\leq\alpha^{t}L_{1}(Y_{0}) for some constant \alpha independent of t.

By assumption, X0X_{0} and Y0Y_{0} satisfy these properties. Suppose that XtX_{t} and YtY_{t} satisfy these properties at step tTt\leq T. We show that there exists a one-step coupling of the CM dynamics such that Xt+1X_{t+1} and Yt+1Y_{t+1} preserve all six properties with probability Ω(ω(n)1)\Omega\left(\omega(n)^{-1}\right).

We provide the high-level ideas of the proof first. We will crucially exploit the coupling from Lemma  5.2. Assuming A(Xt)=A(Yt)A(X_{t})=A(Y_{t}), properties 1 and 2 hold immediately at t+1t+1, and properties 3 and 4 can be shown by a “standard” approach used throughout the paper. In addition, we reuse simple arguments from previous stages to guarantee properties 5 and 6.

Consider first the activation sub-step. By Lemma 5.2, A(X_{t})=A(Y_{t}) with probability at least \Omega(\omega(n)^{-1}). If the number of vertices in the percolation sub-step is the same in both copies, we can couple the edge re-sampling so that the updated part of the configuration is identical in both copies. In other words, all new components created in this step are automatically contained in the component matching W_{t+1}; this includes all new components whose sizes are greater than B_{\omega}. Since none of the new components contributes to Z_{t+1}, we obtain Z_{t+1}\leq Z_{t}=O\left(\frac{n^{4/3}}{\omega(n)^{1/2}}\right). Therefore, A(X_{t})=A(Y_{t}) immediately implies properties 1 and 2 at time t+1.

With probability 1/q, the largest components of X_{t} and Y_{t} are activated simultaneously. Suppose that this is the case. By Hoeffding’s inequality, for a constant K>0, we have

Pr[|A(Xt)E[A(Xt)]|Kn2/3]exp(K2n4/32(Xt)).\Pr\left[\left\lvert A(X_{t})-\operatorname*{{\rm E}}\left[A(X_{t})\right]\right\rvert\geq Kn^{2/3}\right]\leq\exp\left(-\frac{K^{2}n^{4/3}}{\mathcal{R}_{2}(X_{t})}\right).

Property 5 and the observation that \operatorname*{{\rm E}}\left[A(X_{t})\right]=L_{1}(X_{t})+\frac{n-L_{1}(X_{t})}{q} imply that

\Pr\left[\left\lvert A(X_{t})-L_{1}(X_{t})-\frac{n-L_{1}(X_{t})}{q}\right\rvert\geq Kn^{2/3}\right]\leq e^{-\Omega(K^{2})},

which is bounded away from 1.

By noting that L1(Y0),L1(X0)n2/3ω(n)L_{1}(Y_{0}),L_{1}(X_{0})\leq n^{2/3}\omega(n), property 6 implies that

Pr[A(Xt)qn1+(q1)ω(n)+Kqn1/3]=Ω(1).\Pr\left[A(X_{t})\cdot\frac{q}{n}\leq 1+\frac{(q-1)\omega(n)+Kq}{n^{1/3}}\right]=\Omega(1). (11)

We denote A(Xt)=A(Yt)A(X_{t})=A(Y_{t}) by mm. By inequality (11), with at least constant probability, the random graph for both chains is HG(m,1+λm1/3m)H\sim G\left(m,\frac{1+\lambda m^{-1/3}}{m}\right), where λω(m)\lambda\leq\omega(m). Let us assume that is the case. Corollary 3.12 ensures that there exists a constant b>0b>0 such that, with probability at least 1O(ω(n)3)1-O(\omega(n)^{-3}), Nk(H,ω(n))bω(n)32k1N_{k}(H,\omega(n))\geq b\omega(n)^{3\cdot 2^{k-1}} for all k1k\geq 1 such that n2/3ω(n)2kn^{2/3}\omega(n)^{-2^{k}}\rightarrow\infty. Since components in HH are simultaneously added to both Xt+1X_{t+1} and Yt+1Y_{t+1}, property 3 is satisfied. Moreover, Lemma 3.2 implies that with high probability Ω(n)\Omega(n) isolated vertices are added to Xt+1X_{t+1} and Yt+1Y_{t+1}, and thus property 4 is satisfied at time t+1t+1.

In addition, Lemma 3.4 and Markov’s inequality imply that there exists a constant C2C_{2} such that

\Pr\left[\mathcal{R}_{2}(X_{t+1})\leq C_{2}n^{4/3}\right]\geq\frac{99}{100}.

By Lemma 4.3, there exists \alpha<1 such that with probability at least 99/100

L1(Xt+1)max{αL1(Xt),L2(Xt)},L_{1}(X_{t+1})\leq\max\{\alpha L_{1}(X_{t}),L_{2}(X_{t})\},

where α\alpha is independent of tt and nn. Potentially, property 6 may not hold when αL1(Xt)<L1(Xt+1)L2(Xt)=O(n2/3)\alpha L_{1}(X_{t})<L_{1}(X_{t+1})\leq L_{2}(X_{t})=O(n^{2/3}), but then we stop at this point. (We will argue that in this case all the desired properties are also established shortly.) Hence, we suppose otherwise and establish properties 5 and 6 for Xt+1X_{t+1}. Similar bounds hold for Yt+1Y_{t+1}.

By a union bound, Xt+1X_{t+1} and Yt+1Y_{t+1} have all six properties with probability at least 92/10092/100, assuming the activation sub-step satisfies all the desired properties, and thus overall with probability Ω(ω(n)1)\Omega\left(\omega(n)^{-1}\right). Inductively, the probability that XTX_{T} and YTY_{T} satisfy the six properties is

\left(\Omega\left(\omega(n)^{-1}\right)\right)^{C_{1}\log\omega(n)}=\exp\left(-C_{1}\log\omega(n)\cdot O\left(\log\omega(n)\right)\right)=\exp\left(-O\left((\log\omega(n))^{2}\right)\right).

Suppose XTX_{T} and YTY_{T} have the six properties. By choosing C1>1/log1αC_{1}>1/\log\frac{1}{\alpha}, properties 5 and 6 imply

1(XT)=L1(XT)2+2(XT)(αC1logω(n)n2/3ω(n))2+O(n4/3)=O(n4/3),\mathcal{R}_{1}(X_{T})=L_{1}(X_{T})^{2}+\mathcal{R}_{2}(X_{T})\leq\left(\alpha^{C_{1}\log\omega(n)}n^{2/3}\omega(n)\right)^{2}+O(n^{4/3})=O(n^{4/3}),

and 1(YT)=O(n4/3)\mathcal{R}_{1}(Y_{T})=O(n^{4/3}). While the lemma almost follows from these properties, notice that property 3 does not match the desired bounds on the components in the lemma statement. To fix this issue, we perform one additional step of the coupling.

Consider the activation sub-step at TT. Assume again A(XT)=A(YT)=:mA(X_{T})=A(Y_{T})=:m^{\prime}. By Hoeffding’s inequality, for some constant KK^{\prime}, we obtain

\Pr\left[\left|m^{\prime}\cdot\frac{q}{n}-1\right|>\frac{K^{\prime}}{n^{1/3}}\right]=\Pr\left[\left|m^{\prime}-\frac{n}{q}\right|>\frac{K^{\prime}n^{2/3}}{q}\right]\leq\exp\left(-\Omega\left(\frac{{K^{\prime}}^{2}n^{4/3}}{\mathcal{R}_{1}(X_{T})}\right)\right)=e^{-\Omega({K^{\prime}}^{2})}. (12)

Let \lambda^{\prime}:=(m^{\prime}qn^{-1}-1)\cdot m^{\prime 1/3}. Inequality (12) implies that, with at least constant probability, the random graph in the percolation step is H^{\prime}\sim G\left(m^{\prime},\frac{1+\lambda^{\prime}m^{\prime-1/3}}{m^{\prime}}\right), where |\lambda^{\prime}|\leq K^{\prime} and m^{\prime}\in(n/2q,n). If so, Corollary 3.12 ensures that with high probability \hat{N}_{k}(X_{T+1},Y_{T+1},\omega(n)^{1/2})=\Omega\left({\omega(n)}^{3\cdot 2^{k-2}}\right) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k-1}}\rightarrow\infty.

By the preceding argument, with \Omega(\omega(n)^{-1}) probability, the six properties are still valid at step T+1, so the proof is complete. We note that if we had to stop earlier because property 6 did not hold, we perform one extra step (as above) to ensure that \hat{N}_{k}(X_{T+1},Y_{T+1},\omega(n)^{1/2})=\Omega({\omega(n)}^{3\cdot 2^{k-2}}) for all k\geq 1 such that n^{2/3}\omega(n)^{-2^{k-1}}\rightarrow\infty. ∎

5.4 A four-phase analysis using random walks couplings: proof of Lemma 2.9

We introduce first some notation that will be useful in the proof of Lemma 2.9. Let S(X0)=S(X_{0})=\emptyset, and given S(Xt)S(X_{t}), S(Xt+1)S(X_{t+1}) is obtained as follows:

  (i) S(X_{t+1})=S(X_{t});

  (ii) every component in S(X_{t}) activated by the CM dynamics at time t is removed from S(X_{t+1}); and

  (iii) the largest new component (breaking ties arbitrarily) is added to S(X_{t+1}).
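For concreteness, a literal rendering of this update rule as a Python sketch (the list representation is our own choice):

    def update_S(S_prev, activated, new_components):
        """One update of S: start from S(X_t) (step (i)), drop the components
        that were activated (step (ii)), then add the largest new component,
        breaking ties arbitrarily (step (iii))."""
        S_next = [comp for comp in S_prev if comp not in activated]
        if new_components:
            S_next.append(max(new_components, key=len))
        return S_next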

Let 𝒞(Xt)\mathcal{C}(X_{t}) denote the set of connected components of XtX_{t} and note that S(Xt)S(X_{t}) is a subset of 𝒞(Xt)\mathcal{C}(X_{t}); we use |S(Xt)||S(X_{t})| to denote the total number of vertices of the components in S(Xt)S(X_{t}). Finally, let

Q(Xt)=𝒞𝒞(Xt)S(Xt)|𝒞|2.Q(X_{t})=\sum_{\mathcal{C}\in\mathcal{C}(X_{t})\setminus S(X_{t})}|\mathcal{C}|^{2}.

In the proof of Lemma 2.9, we use the following lemmas.

Lemma 5.3.

Let rr be an increasing positive function such that r(n)=o(n1/15)r(n)=o(n^{1/15}) and let c>0c>0 be a sufficiently large constant. Suppose |S(Xt)|ctn2/3r(n)|S(X_{t})|\leq ctn^{2/3}r(n), Q(Xt)tn4/3r(n)+O(n4/3)Q(X_{t})\leq tn^{4/3}r(n)+O(n^{4/3}) and tr(n)/logr(n)t\leq r(n)/\log r(n). Then, with probability at least 1O(r(n)1)1-O\left(r(n)^{-1}\right), |S(Xt+1)|c(t+1)n2/3r(n)|S(X_{t+1})|\leq c(t+1)n^{2/3}r(n) and Q(Xt+1)(t+1)n4/3r(n)+O(n4/3)Q(X_{t+1})\leq(t+1)n^{4/3}r(n)+O(n^{4/3}).

Lemma 5.4.

Let ff be a positive function such that f(n)=o(n1/3)f(n)=o(n^{1/3}). Suppose a configuration XtX_{t} satisfies 1(Xt)=O(n4/3f(n)2(logf(n))1)\mathcal{R}_{1}(X_{t})=O\left(n^{4/3}f(n)^{2}(\log f(n))^{-1}\right). Let mm denote the number of activated vertices in this step, and λ:=(mq/n1)m1/3\lambda:=(mq/n-1)\cdot m^{1/3}. With probability 1O(f(n)1)1-O\left(f(n)^{-1}\right), m(n/2q,n)m\in(n/2q,n) and |λ|f(n)|\lambda|\leq f(n).

Lemma 5.5.

Let g and h be two increasing positive functions of n. Assume g(n)=o(n^{1/6}). Let X_{t} and Y_{t} be two random-cluster configurations such that \hat{N}_{k}(X_{t},Y_{t},g)\geq bg(n)^{3\cdot 2^{k-1}} for some fixed constant b>0 independent of n and for all k\geq 1 such that n^{2/3}g(n)^{-2^{k}}\rightarrow\infty. Assume also that Z_{t}\leq Cn^{4/3}h(n)^{-1} for some constant C>0. Lastly, assume I(X_{t}),I(Y_{t})=\Omega(n). Then for every positive function \eta there exists a coupling for the activation sub-step of the components of X_{t} and Y_{t} such that

Pr[A(Xt)=A(Yt)]14e2η(n)g(n)η(n)h(n)δg(n),\Pr[A(X_{t})=A(Y_{t})]\geq 1-4e^{-2\eta(n)}-\sqrt{\frac{g(n)\eta(n)}{h(n)}}-\frac{\delta}{g(n)},

for some constant δ>0\delta>0 independent of nn.

The proofs of these lemmas are given in Section 5.4.1. In particular, as mentioned, to prove Lemma 5.5 we use a precise estimate on the maximum of a random walk on \mathbb{Z} with steps of different sizes (see Theorem 5.7).

Proof of Lemma 2.9.

The coupling has four phases: we run O(\log\log\log\log n) steps of the coupling in phase 1, O(\log\log\log n) steps in phase 2, O(\log\log n) steps in phase 3, and O(\log n) steps in phase 4.

We will keep track of the random variables 1(Xt),1(Yt),I(Xt),I(Yt),Zt\mathcal{R}_{1}(X_{t}),\mathcal{R}_{1}(Y_{t}),I(X_{t}),I(Y_{t}),Z_{t} and N^k(t,g)\hat{N}_{k}(t,g) for a function gg we shall carefully choose for each phase, and use these random variables to derive bounds on the probability of various events.

Phase 1. We set g_{1}(n)=\omega(n)^{1/2} and h_{1}(n)=K^{2}\omega(n)^{1/2}, where K>0 is a constant we choose below. Let a_{1}:=1-\frac{1}{2q}, let T_{1}:=-12\log_{a_{1}}(\log\log\log n), and fix t<T_{1}. Suppose that \mathcal{R}_{1}(X_{t})+\mathcal{R}_{1}(Y_{t})\leq C_{1}n^{4/3}, I(X_{t}),I(Y_{t})=\Omega(n), and \hat{N}_{k}(t,g_{1})=\Omega(g_{1}(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g_{1}(n)^{-2^{k}}\rightarrow\infty, where C_{1}>0 is a large constant chosen later.

By Lemma 5.5, for a sufficiently large constant K>0K>0, we obtain a coupling for the activation of XtX_{t} and YtY_{t} such that the same number of vertices are activated in XtX_{t} and YtY_{t}, with probability at least

14e2KKω(n)1/2K2ω(n)1/2δω(n)1/21116q2.1-4e^{-2K}-\sqrt{\frac{K\omega(n)^{1/2}}{K^{2}\omega(n)^{1/2}}}-\frac{\delta}{\omega(n)^{1/2}}\geq 1-\frac{1}{16q^{2}}.

By Lemma 5.4, A(X_{t})\in(n/2q,n) and \lambda:=(A(X_{t})q/n-1)\cdot A(X_{t})^{1/3}=O(1) with probability at least 1-\frac{1}{16q^{2}}. It follows by a union bound that A(X_{t})=A(Y_{t}), A(X_{t})\in(n/2q,n) and \lambda=O(1) with probability at least 1-\frac{1}{8q^{2}}. We call this event \mathcal{H}^{1}_{t}.

Let D_{t}^{\prime} denote the set of inactivated components in D(X_{t})\cup D(Y_{t}) at step t, and M_{t}^{\prime} the set of inactivated components in M(X_{t})\cup M(Y_{t}). Observe that

E[𝒞Dt|𝒞|2]=(11q)𝒞D(Xt)D(Yt)|𝒞|2=(11q)Zt.\operatorname*{{\rm E}}\left[\sum\nolimits_{\mathcal{C}\in D_{t}^{\prime}}|\mathcal{C}|^{2}\right]=\left(1-\frac{1}{q}\right)\sum\nolimits_{\mathcal{C}\in D(X_{t})\cup D(Y_{t})}|\mathcal{C}|^{2}=\left(1-\frac{1}{q}\right)Z_{t}.

Similarly,

E[𝒞Mt|𝒞|2]=(11q)𝒞M(Xt)M(Yt)|𝒞|2=(11q)(1(Xt)+1(Yt)Zt).\operatorname*{{\rm E}}\left[\sum\nolimits_{\mathcal{C}\in M_{t}^{\prime}}|\mathcal{C}|^{2}\right]=\left(1-\frac{1}{q}\right)\sum\nolimits_{\mathcal{C}\in M(X_{t})\cup M(Y_{t})}|\mathcal{C}|^{2}=\left(1-\frac{1}{q}\right)\left(\mathcal{R}_{1}(X_{t})+\mathcal{R}_{1}(Y_{t})-Z_{t}\right).

Hence, by Markov’s inequality and the independence between the activation of components in D_{t}^{\prime} and those in M_{t}^{\prime}, with probability at least 1/4q^{2} the activation sub-step is such that

𝒞Dt|𝒞|2(112q)Zt,\sum\nolimits_{\mathcal{C}\in D_{t}^{\prime}}|\mathcal{C}|^{2}\leq\left(1-\frac{1}{2q}\right)Z_{t},

and

𝒞Mt|𝒞|2(112q)(1(Xt)+1(Yt)Zt).\sum\nolimits_{\mathcal{C}\in M_{t}^{\prime}}|\mathcal{C}|^{2}\leq\left(1-\frac{1}{2q}\right)\left(\mathcal{R}_{1}(X_{t})+\mathcal{R}_{1}(Y_{t})-Z_{t}\right).

We denote this event by \mathcal{H}^{2}_{t}. By a union bound, \mathcal{H}^{1}_{t} and \mathcal{H}^{2}_{t} happen simultaneously with probability at least 1/8q^{2}.

Suppose all these events indeed happen; then we couple the percolation step so that the components newly generated in both copies are exactly identical, and we claim that all of the following holds with at least constant probability:

  1. \mathcal{R}_{1}(X_{t+1})+\mathcal{R}_{1}(Y_{t+1})\leq C_{1}n^{4/3};

  2. Z_{t+1}\leq a_{1}Z_{t};

  3. I(X_{t+1}),I(Y_{t+1})=\Omega(n);

  4. \hat{N}_{k}(t+1,g_{1})=\Omega(g_{1}(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g_{1}(n)^{-2^{k}}\rightarrow\infty.

First, note that Z_{t+1} cannot increase: under the coupling, the matching W_{t+1} can only grow, since indeed A(X_{t})=A(Y_{t}). Moreover, only the inactivated components of X_{t} and Y_{t} contribute to Z_{t+1}, so

Zt+1=𝒞Dt|𝒞|2a1Zt.Z_{t+1}=\sum\nolimits_{\mathcal{C}\in D_{t}^{\prime}}|\mathcal{C}|^{2}\leq a_{1}Z_{t}.

Next, we establish properties 3 and 4. For this, notice that the percolation step is distributed as a random graph H\sim G\left(A(X_{t}),\frac{1+\lambda A(X_{t})^{-1/3}}{A(X_{t})}\right). Corollary 3.12 implies that N_{k}(H,g_{1})=\Omega(g_{1}(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g_{1}(n)^{-2^{k}}\rightarrow\infty, with probability at least 1-O\left(g_{1}(n)^{-3}\right). Moreover, Lemma 3.2 implies that with high probability I(H)=\Omega(n). Since the percolation step is coupled, both X_{t+1} and Y_{t+1} contain all the components of H, so \hat{N}_{k}(t+1,g_{1})=\Omega(g_{1}(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g_{1}(n)^{-2^{k}}\rightarrow\infty, and I(X_{t+1}),I(Y_{t+1})=\Omega(n), w.h.p.

Finally, assuming that |\lambda|=O(1), Lemma 3.9 gives \operatorname*{{\rm E}}\left[\mathcal{R}_{1}(H)\right]=O(n^{4/3}); hence, by Markov’s inequality, there exists a constant C_{2}>0 such that \mathcal{R}_{1}(H)\leq C_{2}n^{4/3} with probability at least 99/100. Then

1(Xt+1)+1(Yt+1)\displaystyle\mathcal{R}_{1}(X_{t+1})+\mathcal{R}_{1}(Y_{t+1}) =𝒞Dt|𝒞|2+𝒞Mt|𝒞|2+1(H)\displaystyle=\sum\nolimits_{\mathcal{C}\in D_{t}^{\prime}}|\mathcal{C}|^{2}+\sum\nolimits_{\mathcal{C}\in M_{t}^{\prime}}|\mathcal{C}|^{2}+\mathcal{R}_{1}(H)
a1(1(Xt)+1(Yt))+C2n4/3a1C1n4/3+C2n4/3C1n4/3,\displaystyle\leq a_{1}\left(\mathcal{R}_{1}(X_{t})+\mathcal{R}_{1}(Y_{t})\right)+C_{2}n^{4/3}\leq a_{1}C_{1}n^{4/3}+C_{2}n^{4/3}\leq C_{1}n^{4/3},

for large enough C1C_{1}. A union bound implies that all four properties hold with at least constant probability ϱ>0\varrho>0.

Thus, the probability that all four properties are maintained at every step throughout Phase 1 is at least

\varrho^{T_{1}}=\varrho^{-12\log_{a_{1}}(\log\log\log n)}=\left(\log\log\log n\right)^{-12\log_{a_{1}}(\varrho)},

where the last equality uses the identity x^{\log_{a}y}=y^{\log_{a}x}.

If property 2 holds at every step of Phase 1, then at the end of the phase we have

ZT1=O(n4/3h1(n)a1T1)=O(n4/3h1(n)a112loga1(logloglogn))=O(n4/3(logloglogn)12).Z_{T_{1}}=O\left(\frac{n^{4/3}}{h_{1}(n)}\cdot a_{1}^{T_{1}}\right)=O\left(\frac{n^{4/3}}{h_{1}(n)}\cdot a_{1}^{-12\log_{a_{1}}(\log\log\log n)}\right)=O\left(\frac{n^{4/3}}{(\log\log\log n)^{12}}\right).

To facilitate the discussion of Phase 2, we show that the two copies of the chain satisfy one additional property at the end of Phase 1: namely, a lower bound on the number of components with sizes in a different collection of intervals. Consider the last percolation step of Phase 1. Then, Corollary 3.12 with g_{2}(n):=\left(\log\log\log n\cdot\log\log\log\log n\right)^{2} implies that \hat{N}_{k}(T_{1},g_{2})=\Omega(g_{2}(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g_{2}(n)^{-2^{k}}\rightarrow\infty, with high probability.

Recall S(X) and Q(X) defined at the beginning of Section 5.4. In Phases 2, 3 and 4, a new element of the argument is to also control the behavior of S(X_{t}) and Q(X_{t}). We provide a general result that will be used in the analysis of all three phases:

Claim 5.6.

Let T, g, h and r be positive increasing functions that tend to infinity and satisfy:

  1. g(n)=o(n^{1/6});

  2. T=o(g);

  3. r(n)=o(n^{1/15});

  4. T(n)^{2}\cdot r(n)^{2}\leq g(n)^{2}/\log g(n);

  5. T(n)\leq r(n)/\log r(n);

  6. g(n)\log g(n)\leq h(n)^{1/3};

and let X_{0},Y_{0} be random-cluster configurations satisfying:

  1. Z_{0}=O\left(\frac{n^{4/3}}{h(n)}\right);

  2. |S(X_{0})|,|S(Y_{0})|\leq n^{2/3}r(n);

  3. Q(X_{0}),Q(Y_{0})=O(n^{4/3}r(n));

  4. I(X_{0}),I(Y_{0})=\Omega(n);

  5. \hat{N}_{k}(0,g)=\Omega\left(g(n)^{3\cdot 2^{k-1}}\right) for all k\geq 1 such that n^{2/3}g(n)^{-2^{k}}\rightarrow\infty.

Then there exists a coupling of the CM steps such that after T=T(n) steps, with \Omega(1) probability:

  1. Z_{T}=O\left(\frac{n^{4/3}}{a^{T(n)}}\right), where a:=q/(q-1);

  2. |S(X_{T})|,|S(Y_{T})|=O(n^{2/3}r(n)T(n));

  3. Q(X_{T}),Q(Y_{T})=O(n^{4/3}r(n)T(n));

  4. I(X_{T}),I(Y_{T})=\Omega(n);

  5. if a function g^{\prime} satisfies g^{\prime}\geq g and g^{\prime}(n)=o(n^{1/3}), then \hat{N}_{k}(T,g^{\prime})=\Omega\left(g^{\prime}(n)^{3\cdot 2^{k-1}}\right) for all k\geq 1 such that n^{2/3}g^{\prime}(n)^{-2^{k}}\rightarrow\infty.

The proof of this claim is provided in Section 5.4.1.

Phase 2. Let a=q/(q-1). For Phase 2, we set g_{2}(n)=\left(\log\log\log n\cdot\log\log\log\log n\right)^{2}, g_{3}(n)=(\log\log n\cdot\log\log\log n)^{2}, h_{2}(n)=\left(\log\log\log n\right)^{12}, r_{2}(n)=13\log_{a}\log\log n\cdot\log\log_{a}\log\log n and T_{2}=T_{1}+12\log_{a}\log\log n. Note that these functions satisfy the conditions of Claim 5.6:

  1. g_{2}(n)=o(n^{1/6});

  2. T_{2}-T_{1}=o(g_{2}(n));

  3. r_{2}(n)=o(n^{1/15});

  4. (T_{2}-T_{1})^{2}r_{2}(n)^{2}\leq 10^{6}(\log_{a}\log\log n)^{4}(\log\log_{a}\log\log n)^{2}\leq g_{2}(n)^{2}/\log g_{2}(n);

  5. T_{2}-T_{1}=12\log_{a}\log\log n\leq r_{2}(n)/\log r_{2}(n);

  6. g_{2}(n)\log g_{2}(n)\leq\left(\log\log\log n\right)^{4}=h_{2}(n)^{1/3}.

Suppose that we have all the desired properties from Phase 1, so at the beginning of Phase 2 we have:

  1. Z_{T_{1}}=O\left(\frac{n^{4/3}}{(\log\log\log n)^{12}}\right)=O\left(\frac{n^{4/3}}{h_{2}(n)}\right);

  2. |S(X_{T_{1}})|\leq\sqrt{\mathcal{R}_{1}(X_{T_{1}})}\leq n^{2/3}r_{2}(n) and |S(Y_{T_{1}})|\leq\sqrt{\mathcal{R}_{1}(Y_{T_{1}})}\leq n^{2/3}r_{2}(n);

  3. I(X_{T_{1}})=\Omega(n),I(Y_{T_{1}})=\Omega(n);

  4. Q(X_{T_{1}})\leq\mathcal{R}_{1}(X_{T_{1}})=O(n^{4/3}) and Q(Y_{T_{1}})\leq\mathcal{R}_{1}(Y_{T_{1}})=O(n^{4/3});

  5. \hat{N}_{k}(T_{1},g_{2})=\Omega\left(g_{2}(n)^{3\cdot 2^{k-1}}\right) for all k\geq 1 such that n^{2/3}g_{2}(n)^{-2^{k}}\rightarrow\infty.

Claim 5.6 implies there exists a coupling such that with Ω(1)\Omega(1) probability

  1. Z_{T_{2}}=O\left(\frac{n^{4/3}}{(\log\log n)^{12}}\right);

  2. |S(X_{T_{2}})|\leq n^{2/3}r_{2}(n)\log_{a}\log\log n and |S(Y_{T_{2}})|\leq n^{2/3}r_{2}(n)\log_{a}\log\log n;

  3. Q(X_{T_{2}})=O(n^{4/3}r_{2}(n)\log_{a}\log\log n) and Q(Y_{T_{2}})=O(n^{4/3}r_{2}(n)\log_{a}\log\log n);

  4. I(X_{T_{2}})=\Omega(n),I(Y_{T_{2}})=\Omega(n);

  5. \hat{N}_{k}(T_{2},g_{3})=\Omega(g_{3}(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g_{3}(n)^{-2^{k}}\rightarrow\infty.

Phase 3. Suppose the coupling in Phase 2 succeeds.

For Phase 3, we set g_{3}(n)=\left(\log\log n\cdot\log\log\log n\right)^{2}, g_{4}(n)=(\log n\cdot\log\log n)^{2}, h_{3}(n)=\left(\log\log n\right)^{12}, r_{3}(n)=20\log_{a}\log n\cdot\log\log_{a}\log n and T_{3}=T_{2}+10\log_{a}\log n; these functions satisfy the conditions of Claim 5.6 just as in Phase 2. Claim 5.6 then implies there exists a coupling such that with \Omega(1) probability

  1. Z_{T_{3}}=O\left(\frac{n^{4/3}}{(\log n)^{10}}\right);

  2. |S(X_{T_{3}})|=O(n^{2/3}r_{3}(n)\log_{a}\log n) and |S(Y_{T_{3}})|=O(n^{2/3}r_{3}(n)\log_{a}\log n);

  3. Q(X_{T_{3}})=O(n^{4/3}r_{3}(n)\log_{a}\log n) and Q(Y_{T_{3}})=O(n^{4/3}r_{3}(n)\log_{a}\log n);

  4. I(X_{T_{3}})=\Omega(n),I(Y_{T_{3}})=\Omega(n);

  5. \hat{N}_{k}(T_{3},g_{4})=\Omega(g_{4}(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g_{4}(n)^{-2^{k}}\rightarrow\infty.

Phase 4. Suppose the coupling in Phase 3 succeeds. Let C_{2} be a constant greater than 4/3. We set g_{4}(n)=\left(\log n\cdot\log\log n\right)^{2}, h_{4}(n)=\left(\log n\right)^{10}, r_{4}(n)=2C_{2}\log_{a}n\cdot\log\log_{a}n and T_{4}=T_{3}+C_{2}\log_{a}n. Claim 5.6 implies there exists a coupling such that with \Omega(1) probability Z_{T_{4}}=O(n^{4/3}/a^{C_{2}\log_{a}n})=O(n^{4/3-C_{2}})<1. Since Z_{T_{4}} is a non-negative integer-valued random variable, \Pr[Z_{T_{4}}<1]=\Pr[Z_{T_{4}}=0]. When Z_{T_{4}}=0, X_{T_{4}} and Y_{T_{4}} have the same component structure.

Therefore, if the coupling in every phase succeeds, X_{T_{4}} and Y_{T_{4}} have the same component structure. The probability that the coupling in Phase 1 succeeds is \left(\log\log\log n\right)^{-12\log_{a_{1}}(\varrho)}. Conditional on the success of the previous phases, the couplings in Phases 2, 3 and 4 each succeed with at least constant probability. Thus, the entire coupling succeeds with probability at least

\left(\log\log\log n\right)^{-12\log_{a_{1}}(\varrho)}\cdot\Omega(1)\cdot\Omega(1)\cdot\Omega(1)=\left(\frac{1}{\log\log\log n}\right)^{\beta},

where β\beta is a positive constant. ∎

5.4.1 Proof of lemmas used in Section 5.4

Proof of Claim 5.6.

We will show that given the following properties at any time tT(n)t\leq T(n), we can maintain them at time t+1t+1 with probability at least 1O(g(n)1)O(r(n)1)1-O(g(n)^{-1})-O(r(n)^{-1}):

  1. Z_{t}=O\left(\frac{n^{4/3}}{h(n)}\right);

  2. |S(X_{t})|,|S(Y_{t})|\leq C_{3}tn^{2/3}r(n) for a constant C_{3}>0;

  3. Q(X_{t}),Q(Y_{t})\leq tn^{4/3}r(n)+O(n^{4/3});

  4. I(X_{t}),I(Y_{t})=\Omega(n);

  5. \hat{N}_{k}(t,g)=\Omega(g(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g(n)^{-2^{k}}\rightarrow\infty.

By assumption, tT(n)r(n)/logr(n)t\leq T(n)\leq r(n)/\log r(n). According to Lemma 5.3, Xt+1X_{t+1} and Yt+1Y_{t+1} retain properties 2 and 3 with probability at least 1O(r(n)1)1-O(r(n)^{-1}).

Given properties 1, 4 and 5, Lemma 5.5 (with \eta=\log g(n)/2) implies that there exist a constant \delta>0 and a coupling for the activation sub-step of X_{t} and Y_{t} such that

Pr[A(Xt)=A(Yt)]\displaystyle\Pr\left[A(X_{t})=A(Y_{t})\right] 14elogg(n)g(n)logg(n)2h(n)δg(n)\displaystyle\geq 1-4e^{-\log g(n)}-\sqrt{\frac{g(n)\log g(n)}{2h(n)}}-\frac{\delta}{g(n)}
=1O(1h(n)1/3)O(1g(n))=1O(1g(n)).\displaystyle=1-O\left(\frac{1}{h(n)^{1/3}}\right)-O\left(\frac{1}{g(n)}\right)=1-O\left(\frac{1}{g(n)}\right).

Note that the condition g(n)\log g(n)\leq h(n)^{1/3} is used to deduce the inequality above. Suppose A(X_{t})=A(Y_{t}); then we couple the components generated in the percolation step identically, which precludes the growth of Z_{t}. Hence, Z_{t+1}\leq Z_{t}=O\left(\frac{n^{4/3}}{h(n)}\right), and property 1 holds immediately.

Recall that \mathcal{R}_{1}(X)\leq Q(X)+\lvert S(X)\rvert^{2}. Properties 2 and 3 imply that \mathcal{R}_{1}(X_{t})=O(t^{2}n^{4/3}r(n)^{2}) and \mathcal{R}_{1}(Y_{t})=O(t^{2}n^{4/3}r(n)^{2}). Since t<T(n) and T(n)^{2}\cdot r(n)^{2}\leq g(n)^{2}/\log g(n), we can upper bound \mathcal{R}_{1}(X_{t}) and \mathcal{R}_{1}(Y_{t}) by

O(n4/3(T(n)r(n))2)=O(n4/3g(n)2logg(n)).O\left(n^{4/3}\left(T(n)\cdot r(n)\right)^{2}\right)=O\left(\frac{n^{4/3}g(n)^{2}}{\log g(n)}\right).

We establish properties 4 and 5 with a similar argument as the one used in Phase 1.

Let H_{t}\sim G(A(X_{t}),q/n). By Lemma 5.4 (with f=g) and Corollary 3.12, with probability at least 1-O(g(n)^{-1}) we have N_{k}(H_{t},g)=\Omega(g(n)^{3\cdot 2^{k-1}}) for all k\geq 1 such that n^{2/3}g(n)^{-2^{k}}\rightarrow\infty. In addition, I(H_{t})=\Omega(n) with probability 1-O(n^{-1}) by Lemma 3.2. Since the coupling adds the components of H_{t} to both X_{t+1} and Y_{t+1}, properties 4 and 5 are maintained at time t+1 with probability at least 1-O(g(n)^{-1}).

A union bound then shows that all five properties are maintained at time t+1 with probability at least 1-O(g(n)^{-1})-O(r(n)^{-1}). Hence, the probability that X_{T(n)} and Y_{T(n)} still satisfy the five properties listed above is at least

[1O(1g(n))O(1r(n))]T(n)=1o(1).\left[1-O\left(\frac{1}{g(n)}\right)-O\left(\frac{1}{r(n)}\right)\right]^{T(n)}=1-o(1).

It remains for us to establish the bound for Z_{T}, and to show that for a given function g^{\prime} satisfying g^{\prime}\geq g and g^{\prime}(n)=o(n^{1/3}) we have \hat{N}_{k}(T,g^{\prime})=\Omega\left(g^{\prime}(n)^{3\cdot 2^{k-1}}\right) for all k\geq 1 such that n^{2/3}g^{\prime}(n)^{-2^{k}}\rightarrow\infty.

Conditioned on the event that A(X_{t})=A(Y_{t}) in every activation sub-step of this phase, a bound for Z_{T} can be obtained through a first moment argument. In expectation, Z_{t} contracts by a factor of \frac{1}{a}=1-\frac{1}{q} in each step. Thus, we can compute \operatorname*{{\rm E}}[Z_{T}] recursively:

E[ZT]\displaystyle\operatorname*{{\rm E}}[Z_{T}] =E[E[ZTZT1]]=1aE[ZT1]==(1a)TE[Z0]=O((1a)Tn4/3h(n)).\displaystyle=\operatorname*{{\rm E}}[\operatorname*{{\rm E}}[Z_{T}\mid Z_{T-1}]]=\frac{1}{a}\cdot\operatorname*{{\rm E}}[Z_{T-1}]=...=\left(\frac{1}{a}\right)^{T}\operatorname*{{\rm E}}[Z_{0}]=O\left(\left(\frac{1}{a}\right)^{T}\cdot\frac{n^{4/3}}{h(n)}\right). (13)

It follows from Markov’s inequality that with at least constant probability

ZT=O(n4/3aT(n)).Z_{T}=O\left(\frac{n^{4/3}}{a^{T(n)}}\right).

Finally, in the last percolation step in this phase, Corollary 3.12 guarantees that with high probability N^k(T,g)\hat{N}_{k}(T,g^{\prime}) =Ω(g(n)32k1)=\Omega(g^{\prime}(n)^{3\cdot 2^{k-1}}) for all k1k\geq 1 such that n2/3g(n)2kn^{2/3}g^{\prime}(n)^{-2^{k}}\rightarrow\infty. The claim follows from a union bound. ∎

Proof of Lemma 5.3.

We first establish the bound for |S(X_{t+1})|. Suppose s vertices are activated from S(X_{t}). By assumption,

Q(Xt)tn4/3r(n)+O(n4/3)2n4/3r(n)2logr(n),Q(X_{t})\leq tn^{4/3}r(n)+O(n^{4/3})\leq\frac{2n^{4/3}r(n)^{2}}{\log r(n)},

for sufficiently large nn. Hence, Hoeffding’s inequality implies that

A(Xt)s+n|S(Xt)|q+n2/3r(n)nq+(q1)sq+n2/3r(n),A(X_{t})\leq s+\frac{n-|S(X_{t})|}{q}+n^{2/3}r(n)\leq\frac{n}{q}+\frac{(q-1)s}{q}+n^{2/3}r(n),

with probability at least 1O(r(n)1)1-O(r(n)^{-1}).

We consider two cases. First suppose that δ(q1)s/qn2/3r(n)\delta(q-1)s/q\geq n^{2/3}r(n), where δ>0\delta>0 is a sufficiently small constant we choose later. Then,

A(Xt)nq+(1+δ)(q1)sq=:M.A(X_{t})\leq\frac{n}{q}+\frac{(1+\delta)(q-1)s}{q}=:M.

The largest new component corresponds to the largest component of a G(A(Xt),q/n)G(A(X_{t}),q/n) random graph. Let NN be the size of that component, and let NMN_{M} be the size of the largest component of a G(M,1+εM)G\left(M,\frac{1+\varepsilon}{M}\right) random graph, where ε=qM/n1\varepsilon=qM/n-1. By Fact 3.1, NN is stochastically dominated by NMN_{M}. Then by Corollary 3.6 there exists a constant c>0c>0 such that

Pr[N>(2+ρ)εM]Pr[NM>(2+ρ)εM]=O(ecε3M),\Pr\left[N>(2+\rho)\varepsilon M\right]\leq\Pr[N_{M}>(2+\rho)\varepsilon M]=O({\operatorname*{{\rm e}}}^{-c\varepsilon^{3}M}), (14)

for any ρ<1/10\rho<1/10. Now,

εM\displaystyle\varepsilon M =(1+δ)(q1)sn(nq+(1+δ)(q1)sq)\displaystyle=\frac{(1+\delta)(q-1)s}{n}\left(\frac{n}{q}+\frac{(1+\delta)(q-1)s}{q}\right)
=(1+δ)(q1)sq+O(s2n)\displaystyle=\frac{(1+\delta)(q-1)s}{q}+O\left(\frac{s^{2}}{n}\right)
(1+δ)(q1)sq+O(n1/3r(n)4),\displaystyle\leq\frac{(1+\delta)(q-1)s}{q}+O(n^{1/3}r(n)^{4}),
(1+2δ)(q1)sq,\displaystyle\leq\frac{(1+2\delta)(q-1)s}{q}, (15)

where for the second to last inequality we use that s|S(Xt)|=O(n2/3r(n)2)s\leq|S(X_{t})|=O(n^{2/3}r(n)^{2}), and the last inequality follows from the assumptions δ(q1)sqn2/3r(n)\frac{\delta(q-1)s}{q}\geq n^{2/3}r(n) and r(n)=o(n1/15)r(n)=o\left(n^{1/15}\right). Also, since s=O(n2/3r(n)2)s=O(n^{2/3}r(n)^{2}) and r(n)=o(n1/15)r(n)=o\left(n^{1/15}\right),

ε3M=[(1+δ)(q1)sn]3(nq+(1+δ)(q1)sq)=Ω(s3n2+s4n3)=Ω(r(n)3).\displaystyle\varepsilon^{3}M=\left[\frac{(1+\delta)(q-1)s}{n}\right]^{3}\left(\frac{n}{q}+\frac{(1+\delta)(q-1)s}{q}\right)=\Omega\left(\frac{s^{3}}{n^{2}}+\frac{s^{4}}{n^{3}}\right)=\Omega(r(n)^{3}). (16)

Hence, (14), (15) and (16) imply

Pr[N(2+ρ)(1+2δ)(q1)sq]=eΩ(r(n)3).\Pr\left[N\geq\frac{(2+\rho)(1+2\delta)(q-1)s}{q}\right]={\operatorname*{{\rm e}}}^{-\Omega(r(n)^{3})}.

Since q<2q<2, for sufficiently small ρ\rho and δ\delta

(2+ρ)(1+2δ)(q1)q<1.\frac{(2+\rho)(1+2\delta)(q-1)}{q}<1.

Therefore, NsN\leq s with probability 1exp(Ω(r(n)3))1-\exp(-\Omega(r(n)^{3})). If this is the case, then |S(Xt+1)||S(Xt)||S(X_{t+1})|\leq|S(X_{t})| and so by a union bound |S(Xt+1)|c(t+1)n2/3r(n)|S(X_{t+1})|\leq c(t+1)n^{2/3}r(n) with probability at least 1O(r(n)1)1-O(r(n)^{-1}).

For the second case we assume \frac{\delta(q-1)s}{q}<n^{2/3}r(n) and proceed in a similar fashion. In this case, Hoeffding’s inequality implies that with probability at least 1-O(r(n)^{-1}),

A(Xt)nq+(1+1/δ)n2/3r(n)=:M.A(X_{t})\leq\frac{n}{q}+(1+1/\delta)n^{2/3}r(n)=:M^{\prime}.

The size of the largest new component, denoted NN^{\prime}, is stochastically dominated by the size of the largest component of a G(M,1+εM)G(M^{\prime},\frac{1+\varepsilon^{\prime}}{M^{\prime}}) random graph, with ε=qM/n1\varepsilon^{\prime}=qM^{\prime}/n-1. Now, since we assume r(n)=o(n1/15)r(n)=o\left(n^{1/15}\right),

εM\displaystyle\varepsilon^{\prime}M^{\prime} q(1+1/δ)r(n)n1/3[nq+(1+1/δ)n2/3r(n)]\displaystyle\leq\frac{q(1+1/\delta)r(n)}{n^{1/3}}\left[\frac{n}{q}+(1+1/\delta)n^{2/3}r(n)\right]
=(1+1/δ)n2/3r(n)+O(n1/3r(n)2)c3n2/3r(n),\displaystyle=(1+1/\delta)n^{2/3}r(n)+O(n^{1/3}r(n)^{2})\leq\frac{c}{3}n^{2/3}r(n),

where the last inequality holds for large nn and a sufficiently large constant cc. Moreover,

(ε)3M=Ω(r(n)3n[nq+n2/3r(n)])=Ω(r(n)3).\displaystyle\left({\varepsilon^{\prime}}\right)^{3}M^{\prime}=\Omega\left(\frac{r(n)^{3}}{n}\left[\frac{n}{q}+n^{2/3}r(n)\right]\right)=\Omega(r(n)^{3}).

Hence,

Pr[Ncn2/3r(n)]Pr[N(2+ρ)cn2/3r(n)3]Pr[N(2+ρ)εM],\Pr\left[N^{\prime}\geq cn^{2/3}r(n)\right]\leq\Pr\left[N^{\prime}\geq\frac{(2+\rho)cn^{2/3}r(n)}{3}\right]\leq\Pr\left[N^{\prime}\geq(2+\rho)\varepsilon^{\prime}M^{\prime}\right],

where ρ<1/10\rho<1/10, and by Corollary 3.6

Pr[N(2+ρ)εM]=eΩ((ε)3M)=eΩ(r(n)3).\displaystyle\Pr\left[N^{\prime}\geq(2+\rho)\varepsilon^{\prime}M^{\prime}\right]={\operatorname*{{\rm e}}}^{-\Omega(\left({\varepsilon^{\prime}}\right)^{3}M^{\prime})}={\operatorname*{{\rm e}}}^{-\Omega(r(n)^{3})}.

Since |S(X_{t+1})|\leq|S(X_{t})|+N^{\prime}, a union bound implies that |S(X_{t+1})|\leq c(t+1)n^{2/3}r(n) with probability at least 1-O(r(n)^{-1}), as desired.

Finally, to bound Q(Xt+1)Q(X_{t+1}) we observe that if C1,,CkC_{1},\dots,C_{k} are all the new components in order of their sizes, then by Lemma 3.4 and Markov’s inequality:

Pr[j2|Cj|2n4/3r(n)]=O(r(n)1).\Pr\left[\sum_{j\geq 2}|C_{j}|^{2}\geq n^{4/3}r(n)\right]=O(r(n)^{-1}).

Thus, Q(X_{t+1})\leq Q(X_{t})+n^{4/3}r(n)\leq(t+1)n^{4/3}r(n)+O(n^{4/3}) with probability at least 1-O(r(n)^{-1}), as claimed. The lemma follows from a union bound. ∎

Proof of Lemma 5.4.

Since 1(Xt)=O(n4/3f(n)2(logf(n))1)\mathcal{R}_{1}(X_{t})=O\left(n^{4/3}f(n)^{2}(\log f(n))^{-1}\right), by Hoeffding’s inequality

A(Xt)[nn2/3f(n)q,n+n2/3f(n)q]=:J,A(X_{t})\in\left[\frac{n-n^{2/3}f(n)}{q},\frac{n+n^{2/3}f(n)}{q}\right]=:J,

with probability at least 1O(f(n)1)1-O(f(n)^{-1}). The new connected components in Xt+1X_{t+1} correspond to those of a G(A(Xt),1+εA(Xt))G(A(X_{t}),\frac{1+\varepsilon}{A(X_{t})}) random graph, where ε=A(Xt)q/n1\varepsilon=A(X_{t})q/n-1. If A(Xt)JA(X_{t})\in J, then

n1/3f(n)εn1/3f(n).-n^{-1/3}f(n)\leq\varepsilon\leq n^{-1/3}f(n). (17)

Since A(Xt)JA(X_{t})\in J we can also define m:=A(Xt)=θnm:=A(X_{t})=\theta n for θ(1/2q,1)\theta\in(1/2q,1), and λ:=εm1/3\lambda:=\varepsilon m^{1/3}, so we may rewrite (17) as

f(n)θ1/3f(n)λθ1/3f(n)f(n),-f(n)\leq-\theta^{1/3}f(n)\leq\lambda\leq\theta^{1/3}f(n)\leq f(n),

and the lemma follows. ∎

An important tool used in the proof of Lemma 5.5 is the following coupling of (lazy) symmetric random walks on \mathbb{Z}; its proof is given in Appendix B.

Theorem 5.7.

Let A>0A>0 and let Ac1,c2,,cm2AA\leq c_{1},c_{2},\dots,c_{m}\leq 2A be positive integers. Let r(0,1/2]r\in(0,1/2] and consider the sequences of random variables X1,,XmX_{1},\dots,X_{m} and Y1,,YmY_{1},\dots,Y_{m} where for each i=1,,mi=1,\dots,m: Xi=ciX_{i}=c_{i} with probability rr; Xi=ciX_{i}=-c_{i} with probability rr; Xi=0X_{i}=0 otherwise and YiY_{i} has the same distribution as XiX_{i}. Let X=i=1mXiX=\sum_{i=1}^{m}X_{i} and Y=i=1mYiY=\sum_{i=1}^{m}Y_{i}. Then for any d>0d>0, there exist a constant δ:=δ(r)>0\delta:=\delta(r)>0 and a coupling of XX and YY such that

Pr[d+2AXYd]1δ(d+A)Am.\Pr[d+2A\geq X-Y\geq d]\geq 1-\frac{\delta(d+A)}{A\sqrt{m}}.
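The coupling realizing Theorem 5.7, constructed in Appendix B, evolves the two sequences independently until the running difference D_{k}=\sum_{i\leq k}(X_{i}-Y_{i}) first reaches d, and sets X_{i}=Y_{i} thereafter, freezing the difference. A minimal Python sketch of this construction (assuming numpy; all names are ours):

    import numpy as np

    def coupled_sums(c, d, r, rng):
        """Sample (X, Y) under the coupling from the proof of Theorem 5.7:
        the step pairs are independent until the partial difference D_k
        first reaches d, and identical (mirrored) afterwards."""
        def step(ci):
            return int(ci) * rng.choice([1, -1, 0], p=[r, r, 1.0 - 2.0 * r])
        X = Y = D = 0
        for ci in c:
            xi = step(ci)
            yi = xi if D >= d else step(ci)  # mirror once the difference is >= d
            X, Y, D = X + xi, Y + yi, D + (xi - yi)
        return X, Y

    # e.g., m steps with sizes c_i in [A, 2A]:
    rng = np.random.default_rng(1)
    A, m, d = 10, 10000, 25
    c = rng.integers(A, 2 * A + 1, size=m)
    print(coupled_sums(c, d, 0.5, rng))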

We note that Theorem 5.7 is a generalization of the following more standard fact which will also be useful to us.

Lemma 5.8 ([4], Lemma 2.18).

Let XX and YY be binomial random variables with parameters mm and rr, where r(0,1)r\in(0,1) is a constant. Then, for any integer y>0y>0, there exists a coupling (X,Y)(X,Y) such that for a suitable constant γ=γ(r)>0\gamma=\gamma(r)>0,

Pr[XY=y]1γym.\Pr[X-Y=y]\geq 1-\frac{\gamma y}{\sqrt{m}}.
Proof of Lemma 5.5.

For ease of notation let k=k(g)\mathcal{I}_{k}=\mathcal{I}_{k}(g) and N^k=N^k(t,g)\hat{N}_{k}=\hat{N}_{k}(t,g) for each k1k\geq 1. Also recall the notations WtW_{t}, M(X)M(X) and D(X)D(X) defined in Section 2.2. Let I^(Xt)\hat{I}(X_{t}) and I^(Yt)\hat{I}(Y_{t}) be the isolated vertices in WtW_{t} from XtX_{t} and YtY_{t}, respectively.

Let k^{*}:=\min\{k\in\mathbb{Z}:g(n)^{2^{k}}\geq\vartheta n^{1/3}\}. The activation of the non-trivial components in M(X_{t}) and M(Y_{t}) whose sizes are not in \{1\}\cup\mathcal{I}_{1}\cup\dots\cup\mathcal{I}_{k^{*}} is coupled using the matching W_{t}; that is, c\in M(X_{t}) and W_{t}(c)\in M(Y_{t}) are activated simultaneously with probability 1/q. The components in D(X_{t}) and D(Y_{t}) are activated independently. After independently activating these components, the number of active vertices in the two copies is not necessarily the same; the idea is to couple the activation of the remaining components in M(X_{t}) and M(Y_{t}) in a way that corrects this difference, as sketched below.
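A minimal Python sketch of this basic scheme (the function and argument names are ours; the correction stages via Theorem 5.7 and the isolated vertices are omitted):

    import random

    def couple_activation(matched_pairs, distinct_X, distinct_Y, q):
        """Activate matched components simultaneously and the remaining
        components independently, each with probability 1/q; returns the
        activated components of the two copies."""
        act_X, act_Y = [], []
        for cx, cy in matched_pairs:   # pairs (c, W_t(c)) of equal size
            if random.random() < 1.0 / q:
                act_X.append(cx)
                act_Y.append(cy)
        for c in distinct_X:           # components of D(X_t): independent coins
            if random.random() < 1.0 / q:
                act_X.append(c)
        for c in distinct_Y:           # components of D(Y_t): independent coins
            if random.random() < 1.0 / q:
                act_Y.append(c)
        return act_X, act_Y

Since matched pairs have equal sizes, only the independent activations of D(X_{t}) and D(Y_{t}) (and of the components with sizes in \mathcal{I}_{1}\cup\dots\cup\mathcal{I}_{k^{*}}) contribute to the discrepancy that the subsequent stages must correct.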

Let A_{0}(X_{t}) and A_{0}(Y_{t}) be the numbers of active vertices from X_{t} and Y_{t}, respectively, after the activation of the components from D(X_{t}) and D(Y_{t}). Observe that \operatorname*{{\rm E}}[A_{0}(X_{t})]=\operatorname*{{\rm E}}[A_{0}(Y_{t})]=:\mu and that, by Hoeffding’s inequality, for any \eta(n)>0

Pr[|A0(Xt)μ|η(n)Zt]2e2η(n).\Pr\left[\lvert A_{0}(X_{t})-\mu\rvert\geq\sqrt{\eta(n)Z_{t}}\right]\leq 2e^{-2\eta(n)}.

Recall ZtCn4/3h(n)Z_{t}\leq\frac{Cn^{4/3}}{h(n)}. Hence, with probability at least 14exp(2η(n))1-4\exp\left(-2\eta(n)\right),

d0:=|A0(Xt)A0(Yt)|2η(n)Zt2Cη(n)n2/3h(n).d_{0}:=\left|A_{0}(X_{t})-A_{0}(Y_{t})\right|\leq 2\sqrt{\eta(n)Z_{t}}\leq\frac{2{\sqrt{C\eta(n)}n^{2/3}}}{\sqrt{h(n)}}.

We first couple the activation of the components in 1\mathcal{I}_{1}, then in 2\mathcal{I}_{2} and so on up to k\mathcal{I}_{k^{*}}. Without loss of generality, suppose that d0=A0(Yt)A0(Xt)d_{0}=A_{0}(Y_{t})-A_{0}(X_{t}). If d0ϑn2/3g(n)2d_{0}\leq\frac{\vartheta n^{2/3}}{g(n)^{2}}, we simply couple the components with sizes in 1\mathcal{I}_{1} using the matching WtW_{t}. Suppose otherwise that d0>ϑn2/3g(n)2d_{0}>\frac{\vartheta n^{2/3}}{g(n)^{2}}. Let A1(Xt)A_{1}(X_{t}) and A1(Yt)A_{1}(Y_{t}) be random variables corresponding to the numbers of active vertices from M(Xt)M(X_{t}) and M(Yt)M(Y_{t}) with sizes in 1\mathcal{I}_{1} respectively. By assumption N^1bg(n)3\hat{N}_{1}\geq bg(n)^{3}. Hence, Theorem 5.7 implies that for δ=δ(q)>0\delta=\delta(q)>0, there exists a coupling for the activation of the components in M(Xt)M(X_{t}) and M(Yt)M(Y_{t}) with sizes in 1\mathcal{I}_{1} such that

d0A1(Xt)A1(Yt)d0ϑn2/3g(n)2d_{0}\geq A_{1}(X_{t})-A_{1}(Y_{t})\geq d_{0}-\frac{\vartheta n^{2/3}}{g(n)^{2}}

with probability at least

1δ(d0ϑn2/32g(n)2)ϑn2/32g(n)2bg(n)31δd0ϑn2/32g(n)2bg(n)314δCη(n)g(n)ϑbh(n)1η(n)g(n)h(n),1-\frac{\delta\left(d_{0}-\frac{\vartheta n^{2/3}}{2g(n)^{2}}\right)}{\frac{\vartheta n^{2/3}}{2g(n)^{2}}\sqrt{bg(n)^{3}}}\geq 1-\frac{\delta d_{0}}{\frac{\vartheta n^{2/3}}{2g(n)^{2}}\sqrt{bg(n)^{3}}}\geq 1-\frac{4\delta\sqrt{C\eta(n)g(n)}}{\vartheta\sqrt{bh(n)}}\geq 1-\sqrt{\frac{\eta(n)g(n)}{h(n)}},

where the last inequality holds for ϑ\vartheta large enough. Let d1:=(A0(Yt)A0(Xt))+(A1(Yt)A1(Xt))d_{1}:=\left(A_{0}(Y_{t})-A_{0}(X_{t})\right)+\left(A_{1}(Y_{t})-A_{1}(X_{t})\right). If the coupling succeeds, we have 0d1ϑn2/3g(n)20\leq d_{1}\leq\frac{\vartheta n^{2/3}}{g(n)^{2}}. Thus, we have shown that d1ϑn2/3g(n)2d_{1}\leq\frac{\vartheta n^{2/3}}{g(n)^{2}} with probability at least

(14e2η(n))(1η(n)g(n)h(n))14e2η(n)η(n)g(n)h(n).\left(1-4e^{-2\eta(n)}\right)\left(1-\sqrt{\frac{\eta(n)g(n)}{h(n)}}\right)\geq 1-4e^{-2\eta(n)}-\sqrt{\frac{\eta(n)g(n)}{h(n)}}.

Now, let dkd_{k} be the difference in the number of active vertices after activating the components in k\mathcal{I}_{k}. Suppose that dkϑn2/3g(n)2kd_{k}\leq\frac{\vartheta n^{2/3}}{g(n)^{2^{k}}}, for kkk\leq k^{*}. By assumption, N^k+1bg(n)32k\hat{N}_{k+1}\geq bg(n)^{3\cdot 2^{k}}. Thus, using Theorem 5.7 again we get that there exists a coupling for the activation of the components in k+1\mathcal{I}_{k+1} such that

Pr[dk+1ϑn2/3g(n)2k+1|dkϑn2/3g(n)2k]1δdkϑn2/32g(n)2k+1bg(n)32k12δbg(n)2k1.\Pr\left[d_{k+1}\leq\frac{\vartheta n^{2/3}}{g(n)^{2^{k+1}}}\,\middle|\,d_{k}\leq\frac{\vartheta n^{2/3}}{g(n)^{2^{k}}}\right]\geq 1-\frac{\delta d_{k}}{\frac{\vartheta n^{2/3}}{2g(n)^{2^{k+1}}}\sqrt{bg(n)^{3\cdot 2^{k}}}}\geq 1-\frac{2\delta}{\sqrt{b}g(n)^{2^{k-1}}}.

Therefore, there is a coupling of the activation components in 2,3,,k\mathcal{I}_{2},\mathcal{I}_{3},\dots,\mathcal{I}_{k^{*}} such that

Pr[dkn1/3|d1ϑn2/3g(n)2]k=2k(1δg(n)2k1),\Pr\left[d_{k^{*}}\leq n^{1/3}\,\middle|\,d_{1}\leq\frac{\vartheta n^{2/3}}{g(n)^{2}}\right]\geq\prod_{k=2}^{k^{*}}\left(1-\frac{\delta^{\prime}}{g(n)^{2^{k-1}}}\right),

where δ=2δ/b\delta^{\prime}=2\delta/\sqrt{b}. Note that for a suitable constant δ′′>0\delta^{\prime\prime}>0, we have

\prod_{k=2}^{k^{*}}\left(1-\frac{\delta^{\prime}}{g(n)^{2^{k-1}}}\right)=\exp\left(\sum_{k=1}^{k^{*}-1}\ln\left(1-\frac{\delta^{\prime}}{g(n)^{2^{k}}}\right)\right)\geq\exp\left(-\delta^{\prime\prime}\sum_{k=1}^{k^{*}-1}\frac{1}{g(n)^{2^{k}}}\right),

and since

\sum_{k=1}^{k^{*}-1}\frac{1}{g(n)^{2^{k}}}\leq\sum_{k\geq 1}\frac{1}{g(n)^{2^{k}}}\leq\sum_{k\geq 2}\frac{1}{g(n)^{k}}=\frac{1}{g(n)^{2}-g(n)},

we get

k=2k(1δg(n)2k1)exp(δ′′g(n)2g(n))1δ′′g(n)2g(n).\prod_{k=2}^{k^{*}}\left(1-\frac{\delta^{\prime}}{g(n)^{2^{k-1}}}\right)\geq\exp\left(-\frac{\delta^{\prime\prime}}{g(n)^{2}-g(n)}\right)\geq 1-\frac{\delta^{\prime\prime}}{g(n)^{2}-g(n)}.

Finally, we couple \hat{I}(X_{t}) and \hat{I}(Y_{t}) to correct d_{k^{*}}. By assumption I(X_{t}),I(Y_{t})=\Omega(n), so m:=|\hat{I}(X_{t})|=|\hat{I}(Y_{t})|=\Omega(n). Let A_{I}(X_{t}) and A_{I}(Y_{t}) denote the total numbers of activated isolated vertices from \hat{I}(X_{t}) and \hat{I}(Y_{t}), respectively. We activate all isolated vertices independently, so A_{I}(X_{t}) and A_{I}(Y_{t}) are two binomial random variables with the same parameters m and 1/q. Lemma 5.8 gives a coupling of these binomial random variables such that, for any y\leq n^{1/3},

\Pr\left[A_{I}(X_{t})-A_{I}(Y_{t})=y\right]\geq 1-O\left(\frac{1}{n^{1/6}}\right)=1-o\left(\frac{1}{g(n)}\right).

Therefore,

Pr[A(Xt)=A(Yt)]14e2η(n)η(n)g(n)h(n)O(1g(n)),\Pr\left[A(X_{t})=A(Y_{t})\right]\geq 1-4e^{-2\eta(n)}-\sqrt{\frac{\eta(n)g(n)}{h(n)}}-O\left(\frac{1}{g(n)}\right),

as claimed. ∎

6 New mixing time for the Glauber dynamics via comparison

In this section, we establish a comparison inequality between the mixing times of the CM dynamics and of the heat-bath Glauber dynamics for the random-cluster model on a general graph G=(V,E). The Glauber dynamics is defined as follows. Given a random-cluster configuration A_{t}, one step of this chain is given by:

  (i) pick an edge e\in E uniformly at random;

  (ii) replace the current configuration A_{t} by A_{t}\cup\{e\} with probability

\frac{\mu_{G,p,q}(A_{t}\cup\{e\})}{\mu_{G,p,q}(A_{t}\cup\{e\})+\mu_{G,p,q}(A_{t}\setminus\{e\})};

  (iii) otherwise, replace A_{t} by A_{t}\setminus\{e\}.

It is immediate from this definition that the chain is reversible with respect to \mu=\mu_{G,p,q} and thus converges to it.
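For concreteness, here is a minimal Python sketch of one such update (the representation and helper names are ours, not from the paper). From the definition of \mu_{G,p,q}, the ratio \mu_{G,p,q}(A_{t}\cup\{e\})/\mu_{G,p,q}(A_{t}\setminus\{e\})=\frac{p}{1-p}\,q^{c(A_{t}\cup\{e\})-c(A_{t}\setminus\{e\})}, so the selected edge is set to open with probability p/(p+q(1-p)) when e is a cut edge (its endpoints lie in different components of A_{t}\setminus\{e\}) and with probability p otherwise; this is the form used later in the proof of Theorem 6.1.

    import random

    def glauber_step(n, edges, A, p, q):
        """One heat-bath update: A is the current set of open edges
        (tuples from edges) on the vertex set {0, ..., n-1}."""
        e = random.choice(edges)
        rest = A - {e}
        cut = not connected(rest, n, e[0], e[1])  # does e join two components?
        p_open = p / (p + q * (1.0 - p)) if cut else p
        return rest | {e} if random.random() < p_open else rest

    def connected(A, n, u, v):
        """DFS check of whether u and v are joined by the open edges in A."""
        adj = {i: [] for i in range(n)}
        for a, b in A:
            adj[a].append(b)
            adj[b].append(a)
        stack, seen = [u], {u}
        while stack:
            w = stack.pop()
            if w == v:
                return True
            for x in adj[w]:
                if x not in seen:
                    seen.add(x)
                    stack.append(x)
        return False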

The following comparison inequality was proved in [7]:

gap1(GD)O(mlogm)gap1(CM),\mathrm{gap}^{-1}(\mathrm{GD})\leq O(m\log m)\cdot\mathrm{gap}^{-1}(\mathrm{CM}), (18)

where mm denotes the number of edges in GG, and gap(CM)\mathrm{gap}(\mathrm{CM}), gap(GD)\mathrm{gap}(\mathrm{GD}) the spectral gaps of the transition matrices of the CM and Glauber dynamics, respectively. The standard connection between the spectral gap and the mixing time (see, e.g., Theorem 12.3 in [23]) yields

τmixGDO(mlogm)τmixCMlogμmin1,\tau_{\rm mix}^{\mathrm{GD}}\leq O(m\log m)\cdot\tau_{\rm mix}^{\mathrm{CM}}\cdot\log\mu_{\mathrm{min}}^{-1}, (19)

where μmin=minAΩμ(A)\mu_{\mathrm{min}}=\min_{A\in\Omega}\mu(A) with Ω\Omega denoting the set of random-cluster configurations on GG. In some cases, such as in the mean-field model with p=Θ(n1)p=\Theta(n^{-1}), logμmin1=Ω(mlogm)\log\mu_{\mathrm{min}}^{-1}=\Omega(m\log m), and a factor of O(m2(logm)2)O(m^{2}(\log m)^{2}) is thus lost in the comparison. We provide here an improved version of this inequality.

Theorem 6.1.

For any q>1q>1 and any p(0,1)p\in(0,1), the mixing time of Glauber dynamics for the random-cluster model on a graph GG with nn vertices and mm edges satisfies

τmixGDO(mnlogn+pm2lognlog1min{p,1p})τmixCM.\tau_{\rm mix}^{\mathrm{GD}}\leq O\left(mn\log n+pm^{2}\log n\cdot\log\frac{1}{\min\{p,1-p\}}\right)\cdot\tau_{\rm mix}^{\mathrm{CM}}.

We note that in the mean-field model, where m=Θ(n2)m=\Theta(n^{2}) and we take p=ζ/np=\zeta/n with ζ=O(1)\zeta=O(1), this theorem yields that τmixGD=O(n3(logn)2)τmixCM\tau_{\rm mix}^{\mathrm{GD}}=O(n^{3}(\log n)^{2})\cdot\tau_{\rm mix}^{\mathrm{CM}}, which establishes Theorem 1.2 from the introduction and improves by a factor of O(n)O(n) the best previously known bound for the Glauber dynamics on the complete graph.
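For concreteness, this bound follows by direct substitution: with m=\Theta(n^{2}), p=\zeta/n and \zeta=O(1) we have \log\frac{1}{\min\{p,1-p\}}=\Theta(\log n), so

mn\log n=\Theta(n^{3}\log n)\quad\text{and}\quad pm^{2}\log n\cdot\log\frac{1}{\min\{p,1-p\}}=\Theta(n^{3}(\log n)^{2}).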

To prove Theorem 6.1 we use the following standard fact.

Theorem 6.2.

Let PP be a Markov chain on state space Γ\Gamma with stationary distribution π\pi. Suppose there exist a subset of states Γ0Γ\Gamma_{0}\subseteq\Gamma and a time TT, such that for any tTt\geq T and any xΓx\in\Gamma we have Pt(x,ΓΓ0)116.P^{t}(x,\Gamma\setminus\Gamma_{0})\leq\frac{1}{16}. Then

τmixP=O(T+gap1(P)log(8π01)),\tau_{\rm mix}^{P}=O\left(T+\mathrm{gap}^{-1}(P)\log(8\pi_{0}^{-1})\right), (20)

where π0:=minωΓ0π(ω)\pi_{0}:=\min_{\omega\in\Gamma_{0}}\pi(\omega).

Note that \pi_{0} is the minimum probability of any configuration in \Gamma_{0}. Without the additional assumptions in the theorem, the best possible bound involves a factor of \pi_{\mathrm{min}}=\min_{A\in\Gamma}\pi(A) instead. We remark that there are related conditions under which (20) holds; we choose the condition that P^{t}(x,\Gamma\setminus\Gamma_{0})\leq\frac{1}{16} for every x and every t\geq T for convenience.

We can now provide the proof of Theorem 6.1.

Proof of Theorem 6.1.

First note that if p=Ω(1)p=\Omega(1), it suffices to prove that

τmixGD=O(mnlogn+m2lognlog1min{p,1p})τmixCM.\tau_{\rm mix}^{\mathrm{GD}}=O\left(mn\log n+m^{2}\log n\log\frac{1}{\min\{p,1-p\}}\right)\cdot\tau_{\rm mix}^{\mathrm{CM}}.

This follows from (19) and the fact that

μminmin{p,1p}mqn1\mu_{\mathrm{min}}\geq\frac{\min\{p,1-p\}^{m}}{q^{n-1}}

since every configuration A satisfies \mu(A)=p^{|A|}(1-p)^{m-|A|}q^{c(A)}/Z_{G}\geq\min\{p,1-p\}^{m}\,q/Z_{G}, and the partition function for the random-cluster model on G satisfies Z_{G}\leq q^{n} (see, e.g., Theorem 3.60 in [19]).

Thus, we may assume p1/100p\leq 1/100. From (18) and the standard relationship between the spectral gap and the mixing time (see, e.g., Theorem 12.4 in [23]) we obtain:

gap1(GD)τmixCMO(mlogn).\mathrm{gap}^{-1}(\mathrm{GD})\leq\tau_{\rm mix}^{\mathrm{CM}}\cdot O(m\log n). (21)

Let PP denote the transition matrix of the Glauber dynamics. In order to apply Theorem 6.2, we have to find a suitable subset of states Ω0Ω\Omega_{0}\subseteq\Omega and a suitable time TT so that Pt(A,ΩΩ0)116P^{t}(A,\Omega\setminus\Omega_{0})\leq\frac{1}{16}, for every AΩA\in\Omega and every tTt\geq T.

We let \Omega_{0}=\{A\subseteq E:|A|\leq 100mp\} and T=Cm\log m for a sufficiently large constant C>0. When an edge is selected for update by the Glauber dynamics, it is set to open with probability p/(p+q(1-p)) if it is a “cut edge”, and with probability p otherwise; recall that an edge e is open if it is present in the random-cluster configuration. Since p/(p+q(1-p))\leq p when q>1, after every edge has been updated at least once, the number of open edges in the configuration is stochastically dominated by a binomial random variable with parameters m and p. By the coupon collector bound, every edge has been updated at least once by time T w.h.p. for large enough C. Moreover, if all edges have indeed been updated by time T, then for any t\geq T the number of open edges in the configuration is at most 100mp with probability at least 19/20 by Markov’s inequality. Therefore, the Glauber dynamics satisfies the condition in Theorem 6.2 for these choices of T and \Omega_{0}.

It remains for us to estimate π0\pi_{0}. Let πm\pi_{m} denote the probability of the configuration where all the edges are open; then,

πm=pmqZGpmqn1,\pi_{m}=\frac{p^{m}q}{Z_{G}}\geq\frac{p^{m}}{q^{n-1}}, (22)

where the inequality follows from the fact that ZGqnZ_{G}\leq q^{n}. Moreover, since 1p>p1-p>p when p1/100p\leq 1/100, then π0qp100mp(1p)m100mp/ZG\pi_{0}\geq qp^{100mp}(1-p)^{m-100mp}/Z_{G} and so

π0πmqp100mp(1p)m100mppmq=(1pp)m100mp.\frac{\pi_{0}}{\pi_{m}}\geq\frac{qp^{100mp}(1-p)^{m-100mp}}{p^{m}q}=\left(\frac{1-p}{p}\right)^{m-100mp}. (23)

Using (22), (23) and the fact that p1/100p\leq 1/100, we obtain:

log1π0\displaystyle\log\frac{1}{\pi_{0}} =log1πm+logπmπ0(n1)logq+mlogp1(m100mp)log1pp\displaystyle=\log\frac{1}{\pi_{m}}+\log\frac{\pi_{m}}{\pi_{0}}\leq(n-1)\log q+m\log p^{-1}-(m-100mp)\log\frac{1-p}{p}
=100mplogp1+m(1100p)log11p+O(n)\displaystyle=100mp\log p^{-1}+m(1-100p)\log\frac{1}{1-p}+O(n)
100mplogp1+mp(1100p)1p+O(n)=O(n+mp(logp1)).\displaystyle\leq 100mp\log p^{-1}+\frac{mp(1-100p)}{1-p}+O(n)=O(n+mp(\log p^{-1})).

Therefore, from (21) and Theorem 6.2 we obtain:

τmixGDO(mlogm)+τmixCMO(mlogn)O(n+mp(logp1))=O(mlogn(n+mplogp1))τmixCM,\tau_{\rm mix}^{\mathrm{GD}}\leq O(m\log m)+\tau_{\rm mix}^{\mathrm{CM}}\cdot O(m\log n)\cdot O(n+mp(\log p^{-1}))=O(m\log n\cdot(n+mp\log p^{-1}))\cdot\tau_{\rm mix}^{\mathrm{CM}},

as claimed. ∎

For the sake of completeness, we conclude this section with a proof of Theorem 6.2.

Proof of Theorem 6.2.

For xΓx\in\Gamma and tTt\geq T, we have

Pt(x,)π()TV\displaystyle{\|P^{t}(x,\cdot)-\pi(\cdot)\|}_{\textsc{\tiny TV}} =yΓ:Pt(x,y)>π(y)Pt(x,y)π(y)\displaystyle=\sum_{y\in\Gamma:P^{t}(x,y)>\pi(y)}P^{t}(x,y)-\pi(y)
yΓ0:Pt(x,y)>π(y)π(y)|1Pt(x,y)π(y)|+yΓ0:Pt(x,y)>π(y)Pt(x,y)|1π(y)Pt(x,y)|\displaystyle\leq\sum_{y\in\Gamma_{0}:P^{t}(x,y)>\pi(y)}\pi(y)\left|1-\frac{P^{t}(x,y)}{\pi(y)}\right|+\sum_{y\notin\Gamma_{0}:P^{t}(x,y)>\pi(y)}P^{t}(x,y)\left|1-\frac{\pi(y)}{P^{t}(x,y)}\right|
π(Γ0)maxyΓ0|1Pt(x,y)π(y)|+Pt(x,ΓΓ0)\displaystyle\leq\pi(\Gamma_{0})\max_{y\in\Gamma_{0}}\left|1-\frac{P^{t}(x,y)}{\pi(y)}\right|+P^{t}(x,\Gamma\setminus\Gamma_{0})
maxyΓ0|1Pt(x,y)π(y)|+116,\displaystyle\leq\max_{y\in\Gamma_{0}}\left|1-\frac{P^{t}(x,y)}{\pi(y)}\right|+\frac{1}{16},

where the last inequality follows from the theorem assumption for tTt\geq T.

For any yΓy\in\Gamma, we have

|1Pt(x,y)π(y)|egap(P)tπ(x)π(y);\left|1-\frac{P^{t}(x,y)}{\pi(y)}\right|\leq\frac{e^{-\mathrm{gap}(P)\cdot t}}{\sqrt{\pi(x)\pi(y)}};

see inequality (12.11) in [23]. Hence, for any xΓ0x\in\Gamma_{0} we have

Pt(x,)π()TVmaxyΓ0egap(P)tπ(x)π(y)+116egap(P)tπ0+116.{\|P^{t}(x,\cdot)-\pi(\cdot)\|}_{\textsc{\tiny TV}}\leq\max_{y\in\Gamma_{0}}\frac{e^{-\mathrm{gap}(P)\cdot t}}{\sqrt{\pi(x)\pi(y)}}+\frac{1}{16}\leq\frac{e^{-\mathrm{gap}(P)\cdot t}}{\pi_{0}}+\frac{1}{16}. (24)

Letting τmixP(x)=min{t0:Pt(x,)π()TV1/4}\tau_{\rm mix}^{P}(x)=\min\left\{{t\geq 0:\|P^{t}(x,\cdot)-\pi(\cdot)\|}_{\textsc{\tiny TV}}\leq 1/4\right\}, we deduce from (24) that for xΓ0x\in\Gamma_{0}

τmixP(x)max{T,gap1(P)log8π0}.\tau_{\rm mix}^{P}(x)\leq\max\left\{T,\mathrm{gap}^{-1}(P)\log\frac{8}{\pi_{0}}\right\}. (25)

Since τmixP=maxxΓτmix(x)\tau_{\rm mix}^{P}=\max_{x\in\Gamma}\tau_{\rm mix}(x), it remains for us to provide a bound for τmix(x)\tau_{\rm mix}(x) when xΓΓ0x\in\Gamma\setminus\Gamma_{0}. Consider two copies {Xt}\{X_{t}\}, {Yt}\{Y_{t}\} of the chain PP. For t>Tt>T let \mathbb{P} be the coupling of XtX_{t}, YtY_{t} such that the two copies evolve independently up to time TT and if XT=xX_{T}=x^{\prime} and YT=yY_{T}=y^{\prime} for some x,yΓ0x^{\prime},y^{\prime}\in\Gamma_{0} then the optimal coupling is used so that

[XtYtXT=x,YT=y]=PtT(x,)PtT(y,)TV;\mathbb{P}[X_{t}\neq Y_{t}\mid X_{T}=x^{\prime},Y_{T}=y^{\prime}]={\|P^{t-T}(x^{\prime},\cdot)-P^{t-T}(y^{\prime},\cdot)\|}_{\textsc{\tiny TV}};

recall that the existence of an optimal coupling is guaranteed by the coupling lemma (see, e.g., Proposition 4.7 in [23]). Then, for any x,yΓx,y\in\Gamma

[XtYt\displaystyle\mathbb{P}[X_{t}\neq Y_{t} X0=x,Y0=y]\displaystyle\mid X_{0}=x,Y_{0}=y]
[XTΓ0X0=x]+[YTΓ0Y0=y]+maxx,yΓ0[XtYtXT=x,YT=y]\displaystyle\leq\mathbb{P}[X_{T}\notin\Gamma_{0}\mid X_{0}=x]+\mathbb{P}[Y_{T}\notin\Gamma_{0}\mid Y_{0}=y]+\max_{x^{\prime},y^{\prime}\in\Gamma_{0}}\mathbb{P}[X_{t}\neq Y_{t}\mid X_{T}=x^{\prime},Y_{T}=y^{\prime}]
maxx,yΓ0[XtYtXT=x,YT=y]+18\displaystyle\leq\max_{x^{\prime},y^{\prime}\in\Gamma_{0}}\mathbb{P}[X_{t}\neq Y_{t}\mid X_{T}=x^{\prime},Y_{T}=y^{\prime}]+\frac{1}{8}
maxx,yΓ0PtT(x,)PtT(y,)TV+18\displaystyle\leq\max_{x^{\prime},y^{\prime}\in\Gamma_{0}}{\|P^{t-T}(x^{\prime},\cdot)-P^{t-T}(y^{\prime},\cdot)\|}_{\textsc{\tiny TV}}+\frac{1}{8}
2maxxΓ0PtT(x,)π()TV+18,\displaystyle\leq 2\max_{x^{\prime}\in\Gamma_{0}}{\|P^{t-T}(x^{\prime},\cdot)-\pi(\cdot)\|}_{\textsc{\tiny TV}}+\frac{1}{8},

where the last inequality follows from the triangle inequality. Now,

\max_{x\in\Gamma}{\|P^{t}(x,\cdot)-\pi(\cdot)\|}_{\textsc{\tiny TV}}\leq\max_{x,y\in\Gamma}\mathbb{P}[X_{t}\neq Y_{t}\mid X_{0}=x,Y_{0}=y]\leq 2\max_{x^{\prime}\in\Gamma_{0}}{\|P^{t-T}(x^{\prime},\cdot)-\pi(\cdot)\|}_{\textsc{\tiny TV}}+\frac{1}{8}\leq\frac{5}{8},

provided tT+maxzΓ0τmixP(z)t\geq T+\max_{z\in\Gamma_{0}}\tau_{\rm mix}^{P}(z). Using a standard boosting argument (see (4.36) in [23]) and (25) we deduce that τmixP=O(T+gap1(P)log8π0)\tau_{\rm mix}^{P}=O(T+\mathrm{gap}^{-1}(P)\log\frac{8}{\pi_{0}}) as claimed. ∎

References

  • [1] N. Alon and J.H. Spencer. The probabilistic method. John Wiley & Sons, 2000.
  • [2] V. Beffara and H. Duminil-Copin. The self-dual point of the two-dimensional random-cluster model is critical for q1q\geq 1. Probability Theory and Related Fields, 153:511–542, 2012.
  • [3] A.C. Berry. The accuracy of the Gaussian approximation to the sum of independent variates. Transactions of the American Mathematical Society, 49(1):122–136, 1941.
  • [4] A. Blanca. Random-cluster dynamics. PhD thesis, UC Berkeley, 2016.
  • [5] A. Blanca and R. Gheissari. Random-cluster dynamics on random regular graphs in tree uniqueness. Communications in Mathematical Physics, 2021.
  • [6] A. Blanca, R. Gheissari, and E. Vigoda. Random-cluster dynamics in 2\mathbb{Z}^{2}: rapid mixing with general boundary conditions. Annals of Applied Probability, 30(1):418–459, 2020.
  • [7] A. Blanca and A. Sinclair. Dynamics for the mean-field random-cluster model. Proceedings of the 19th International Workshop on Randomization and Computation (RANDOM), pages 528–543, 2015.
  • [8] A. Blanca and A. Sinclair. Random-Cluster Dynamics in 2\mathbb{Z}^{2}. Probability Theory and Related Fields, 168:821–847, 2017.
  • [9] B. Bollobás, G.R. Grimmett, and S. Janson. The random-cluster model on the complete graph. Probability Theory and Related Fields, 104(3):283–317, 1996.
  • [10] L. Chayes and J. Machta. Graphical representations and cluster algorithms II. Physica A, 254:477–516, 1998.
  • [11] H. Duminil-Copin, M. Gagnebin, M. Harel, I. Manolescu, and V. Tassion. Discontinuity of the phase transition for the planar random-cluster and Potts models with q>4q>4. Annales de l’ENS, 2016. To Appear.
  • [12] H. Duminil-Copin, V. Sidoravicius, and V. Tassion. Continuity of the Phase Transition for Planar Random-Cluster and Potts Models with 1q41\!\leq\!q\!\leq\!4. Communications in Mathematical Physics, 349(1):47–107, 2017.
  • [13] R. Durrett. Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2010.
  • [14] C.M. Fortuin and P.W. Kasteleyn. On the random-cluster model I. Introduction and relation to other models. Physica, 57(4):536–564, 1972.
  • [15] A. Galanis, D. Štefankovič, and E. Vigoda. Swendsen-Wang algorithm on the mean-field Potts model. Proceedings of the 19th International Workshop on Randomization and Computation (RANDOM), pages 815–828, 2015.
  • [16] T. Garoni. Personal communication, 2015.
  • [17] R. Gheissari and E. Lubetzky. Quasi-polynomial mixing of critical two-dimensional random cluster models. Random Structures & Algorithms, 56(2):517–556, 2020.
  • [18] R. Gheissari, E. Lubetzky, and Y. Peres. Exponentially slow mixing in the mean-field Swendsen-Wang dynamics. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1981–1988. SIAM, 2018.
  • [19] G.R. Grimmett. The Random-Cluster Model, volume 333 of Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 2006.
  • [20] H. Guo and M. Jerrum. Random cluster dynamics for the Ising model is rapidly mixing. In Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1818–1827. SIAM, 2017.
  • [21] S. Janson, T. Łuczak, and A. Ruciński. Random Graphs. Wiley Series in Discrete Mathematics and Optimization. Wiley, 2011.
  • [22] D.A. Levin and Y. Peres. Markov Chains and Mixing Times. MBK. American Mathematical Society, 2017.
  • [23] D.A. Levin, Y. Peres, and E.L. Wilmer. Markov Chains and Mixing Times. American Mathematical Society, 2008.
  • [24] Y. Long, A. Nachmias, W. Ning, and Y. Peres. A power law of order 1/4 for critical mean-field Swendsen-Wang dynamics. Memoirs of the American Mathematical Society, 232(1092), 2011.
  • [25] M. Luczak and T. Łuczak. The phase transition in the cluster-scaled model of a random graph. Random Structures & Algorithms, 28(2):215–246, 2006.
  • [26] T. Luczak. Cycles in a random graph near the critical point. Random Structures & Algorithms, 2(4):421–439, 1991.
  • [27] T. Łuczak, B. Pittel, and J.C. Wierman. The structure of a random graph at the point of the phase transition. Transactions of the American Mathematical Society, 341(2):721–748, 1994.
  • [28] A.B. Mukhin. Local limit theorems for lattice random variables. Theory of Probability & Its Applications, 36(4):698–713, 1992.
  • [29] B. Pittel. On tree census and the giant component in sparse random graphs. Random Structures & Algorithms, 1(3):311–342, 1990.
  • [30] R.H. Swendsen and J.S. Wang. Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters, 58:86–88, 1987.
  • [31] M. Ullrich. Swendsen-Wang is faster than single-bond dynamics. SIAM Journal on Discrete Mathematics, 28(1):37–48, 2014.
  • [32] R. Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press, 2018.

Appendix A Proof of the local limit theorem

In this appendix, we prove Theorem 5.1. First, we introduce some notation. For a random variable XX and dd\in\mathbb{R}, let H(X,d)=E[Xd2]H(X,d)=\operatorname*{{\rm E}}[\langle X^{*}d\rangle^{2}], where \langle\cdot\rangle denotes distance to the closest integer and XX^{*} is a symmetrized version of XX; i.e., X=XXX^{*}=X-X^{\prime} where XX^{\prime} is an i.i.d. copy of XX. Let Hm=infd[14,12]i=1mH(Xi,d)H_{m}=\inf_{d\in[\frac{1}{4},\frac{1}{2}]}\,\,\sum_{i=1}^{m}H(X_{i},d). The following local limit theorem is due to Mukhin [28] (all limits are taken as mm\rightarrow\infty).

Theorem A.1 ([28], Theorem 1).

Suppose that the sequence Smμmσm\frac{S_{m}-\mu_{m}}{\sigma_{m}} converges in distribution to a standard normal random variable and that σm\sigma_{m}\rightarrow\infty. If HmH_{m}\rightarrow\infty and there exists α>0\alpha>0 such that u[Hm1/4,σm]\forall u\in[H_{m}^{1/4},\sigma_{m}] we have i:ciuci2αuσm,\sum_{i:c_{i}\leq u}c_{i}^{2}\geq\alpha u\sigma_{m}, then the local limit theorem holds.

Next, we show how to derive Theorem 5.1 from Theorem A.1. The proof involves the following two lemmas.

Lemma A.2.

For the random variables satisfying the conditions from Theorem 5.1, σm\sigma_{m}\rightarrow\infty and Smμmσm\frac{S_{m}-\mu_{m}}{\sigma_{m}} converges in distribution to a standard normal random variable.

Proof.

Observe that

σm2=r(1r)i=1mci2r(1r)i:ciI1ci2=Ω(m4/3g(m)4g(m)3)=Ω(m4/3g(m)),\sigma_{m}^{2}=r(1-r)\sum_{i=1}^{m}c_{i}^{2}\geq r(1-r)\sum_{i:c_{i}\in I_{1}}c_{i}^{2}=\Omega\left(\frac{m^{4/3}}{g(m)^{4}}\cdot g(m)^{3}\right)=\Omega\left(\frac{m^{4/3}}{g(m)}\right)\rightarrow\infty,

and also

1σm3i=1mE[|XiE[Xi]|3]\displaystyle\frac{1}{\sigma_{m}^{3}}\sum_{i=1}^{m}\operatorname*{{\rm E}}[|X_{i}-\operatorname*{{\rm E}}[X_{i}]|^{3}] =1σm3i=1mr(1r)ci3σm2cmσm3=O(cmσm)\displaystyle=\frac{1}{\sigma_{m}^{3}}\sum_{i=1}^{m}r(1-r)c_{i}^{3}\leq\frac{\sigma^{2}_{m}c_{m}}{\sigma^{3}_{m}}=O\left(\frac{c_{m}}{\sigma_{m}}\right)
=O(m2/3g(m)1m2/3g(m)1/2)=O(g(m)1/2)0.\displaystyle=O\left(\frac{m^{2/3}g(m)^{-1}}{m^{2/3}g(m)^{-1/2}}\right)=O\left(g(m)^{-1/2}\right)\rightarrow 0.

Hence, the random variables {Xi}\{X_{i}\} satisfy Lyapunov’s central limit theorem conditions (see, e.g., [13]), and so Smμmσm\frac{S_{m}-\mu_{m}}{\sigma_{m}} converges in distribution to a standard normal random variable. ∎

Lemma A.3.

Suppose c1,,cmc_{1},\dots,c_{m} satisfy the conditions from Theorem 5.1. For any uu satisfying σmu1\sigma_{m}\geq u\geq 1, j:cjucj2uσm/r(1r)\sum_{j:c_{j}\leq u}c_{j}^{2}\geq u\sigma_{m}/r(1-r).

Proof.

We have σm2=r(1r)i=1mci2=O(m4/3g(m)).\sigma_{m}^{2}=r(1-r)\sum_{i=1}^{m}c_{i}^{2}=O\left(\frac{m^{4/3}}{\sqrt{g(m)}}\right). We consider three cases. First, if m1/4ucm=O(m2/3g(m)1)m^{1/4}\leq u\leq c_{m}=O\left(m^{2/3}g(m)^{-1}\right), there exists a largest integer k[0,)k\in[0,\ell) such that u=O(ϑm2/3g(m)2k)u=O\left(\frac{\vartheta m^{2/3}}{g(m)^{2^{k}}}\right), where >0\ell>0 is the smallest integer such that m2/3g(m)2=o(m1/4)m^{2/3}g(m)^{-2^{\ell}}=o(m^{1/4}). Then,

i:ciuci2i:ciIk+1ci2ϑ2m4/34g(m)2k+2g(m)32k=ϑ2m4/34g(m)2kuσm;\sum_{i:c_{i}\leq u}c_{i}^{2}\geq\sum_{i:c_{i}\in I_{k+1}}c_{i}^{2}\geq\frac{\vartheta^{2}m^{4/3}}{4g(m)^{2^{k+2}}}g(m)^{3\cdot 2^{k}}=\frac{\vartheta^{2}m^{4/3}}{4g(m)^{2^{k}}}\gg u\sigma_{m};

by \gg we mean that uσmu\sigma_{m} is of lower order with respect to ϑ2m4/34g(m)2k\frac{\vartheta^{2}m^{4/3}}{4g(m)^{2^{k}}}. Now, when σmucm\sigma_{m}\geq u\geq c_{m}, we have

i:ciuci2=i=1mci2=σm2r(1r)uσmr(1r).\sum_{i:c_{i}\leq u}c_{i}^{2}=\sum_{i=1}^{m}c_{i}^{2}=\frac{\sigma_{m}^{2}}{r(1-r)}\geq\frac{u\sigma_{m}}{r(1-r)}.

Finally, if 1um1/41\leq u\leq m^{1/4}, uσmu\sigma_{m} is sublinear and so

i:ciuci2i=1ρmci2=ρmm1/4σmuσm,\sum_{i:c_{i}\leq u}c_{i}^{2}\geq\sum_{i=1}^{\rho m}c_{i}^{2}=\rho m\gg m^{1/4}\sigma_{m}\geq u\sigma_{m},

as claimed. ∎

Proof of Theorem 5.1.

We check that the XiX_{i}’s satisfy the conditions from Theorem A.1. Lemma A.2 implies σm\sigma_{m}\rightarrow\infty and SmμmσmN(0,1)\frac{S_{m}-\mu_{m}}{\sigma_{m}}\rightarrow N(0,1); by Lemma A.3 we also have that for any uu satisfying σmu1\sigma_{m}\geq u\geq 1, j:cjucj2uσm/r(1r)\sum_{j:c_{j}\leq u}c_{j}^{2}\geq u\sigma_{m}/r(1-r). It remains to show that HmH_{m}\rightarrow\infty.

Now, for i\leq\rho m, observe that X^{*}_{i} equals 1 with probability r(1-r), -1 with probability r(1-r), and 0 otherwise. Then for 1/4\leq d\leq 1/2, \langle X_{i}^{*}d\rangle^{2} evaluates to d^{2} with probability 2r(1-r), and 0 otherwise. Therefore, for i\leq\rho m and 1/4\leq d\leq 1/2 we have \operatorname*{{\rm E}}[\langle X_{i}^{*}d\rangle^{2}]=2r(1-r)d^{2}. Thus,

Hm=inf14d12i=1mH(Xi,d)inf14d12i=1ρmH(Xi,d)=inf14d12i=1ρm2r(1r)d2=Ω(m).H_{m}=\inf_{\frac{1}{4}\leq d\leq\frac{1}{2}}\sum_{i=1}^{m}H(X_{i},d)\geq\inf_{\frac{1}{4}\leq d\leq\frac{1}{2}}\sum_{i=1}^{\lfloor\rho m\rfloor}H(X_{i},d)=\inf_{\frac{1}{4}\leq d\leq\frac{1}{2}}\sum_{i=1}^{\lfloor\rho m\rfloor}2r(1-r)d^{2}=\Omega(m)\rightarrow\infty.

Since we have shown that the XiX_{i}’s satisfy all the conditions from Theorem A.1, the result follows. ∎

For completeness, we also derive Theorem 5.1 from first principles (i.e. without using Mukhin’s result [28]) in Appendix D.

Appendix B Proofs of random walk couplings

Other important tools in our proofs are couplings based on the evolutions of certain random walks. In this section we consider a (lazy) symmetric random walk (S_{k}) on \mathbb{Z} with bounded step size, and the first result we present is an estimate on M_{k}=\max\{S_{1},\dots,S_{k}\} based on the well-known reflection principle (see, e.g., Chapter 2.7 in [22]).

Lemma B.1.

Let A>0A>0 and let Ac1,c2,,cn4AA\leq c_{1},c_{2},\dots,c_{n}\leq 4A be positive integers. Let r(0,1/2]r\in(0,1/2] and consider the sequence of random variables X1,,XnX_{1},\dots,X_{n} where for each i=1,,ni=1,\dots,n: Xi=ciX_{i}=c_{i} with probability rr; Xi=ciX_{i}=-c_{i} with probability rr; and Xi=0X_{i}=0 otherwise. Let Sk=i=1kXiS_{k}=\sum_{i=1}^{k}X_{i} and Mk=max{S1,,Sk}M_{k}=\max\{S_{1},\dots,S_{k}\}. Then, for any y0y\geq 0

Pr[Mny]2Pr[Sny+8A+1].\Pr[M_{n}\geq y]\geq 2\Pr[S_{n}\geq y+8A+1].
Proof.

First, note that

Pr[Mny]\displaystyle\Pr[M_{n}\geq y] =k=y4AnPr[Mny,Sn=k]+k=4Any1Pr[Mny,Sn=k]\displaystyle=\sum_{k=y}^{4An}\Pr[M_{n}\geq y,S_{n}=k]+\sum_{k=-4An}^{y-1}\Pr[M_{n}\geq y,S_{n}=k]
=Pr[Sny]+k=4Any1Pr[Mny,Sn=k].\displaystyle=\Pr[S_{n}\geq y]+\sum_{k=-4An}^{y-1}\Pr[M_{n}\geq y,S_{n}=k]. (26)

If M_{n}\geq y, let W_{n} be the value of the random walk \{S_{i}\} at the first time its value is at least y. Then,

Pr[Mny,Sn=k]\displaystyle\Pr[M_{n}\geq y,S_{n}=k] =b=yy+4A1Pr[Mny,Sn=k,Wn=b]\displaystyle=\sum_{b=y}^{y+4A-1}\Pr[M_{n}\geq y,S_{n}=k,W_{n}=b]
=b=yy+4A1Pr[Mny,Sn=2bk,Wn=b]\displaystyle=\sum_{b=y}^{y+4A-1}\Pr[M_{n}\geq y,S_{n}=2b-k,W_{n}=b]
=b=yy+4A1Pr[Sn=2bk,Wn=b],\displaystyle=\sum_{b=y}^{y+4A-1}\Pr[S_{n}=2b-k,W_{n}=b],

where in the second equality we used the fact that the random walk is symmetric and the last one follows from the fact that 2bky2b-k\geq y. Plugging this into (26), we get

Pr[Mny]\displaystyle\Pr[M_{n}\geq y] =Pr[Sny]+b=yy+4A1k=4Any1Pr[Sn=2bk,Wn=b]\displaystyle=\Pr[S_{n}\geq y]+\sum_{b=y}^{y+4A-1}\sum_{k=-4An}^{y-1}\Pr[S_{n}=2b-k,W_{n}=b]
=Pr[Sny]+b=yy+4A1k=2by+14AnPr[Sn=k,Wn=b]\displaystyle=\Pr[S_{n}\geq y]+\sum_{b=y}^{y+4A-1}\sum_{k=2b-y+1}^{4An}\Pr[S_{n}=k,W_{n}=b]
=Pr[Sny]+b=yy+4A1Pr[Sn2by+1,Wn=b]\displaystyle=\Pr[S_{n}\geq y]+\sum_{b=y}^{y+4A-1}\Pr[S_{n}\geq 2b-y+1,W_{n}=b]
Pr[Sny]+b=yy+4A1Pr[Sny+8A+1,Wn=b],\displaystyle\geq\Pr[S_{n}\geq y]+\sum_{b=y}^{y+4A-1}\Pr[S_{n}\geq y+8A+1,W_{n}=b],

since b<y+4Ab<y+4A. Finally, observe that

b=yy+4A1Pr[Sny+8A+1,Wn=b]=Pr[Sny+8A+1]\sum_{b=y}^{y+4A-1}\Pr[S_{n}\geq y+8A+1,W_{n}=b]=\Pr[S_{n}\geq y+8A+1]

and so

Pr[Mny]Pr[Sny]+Pr[Sny+8A+1]2Pr[Sny+8A+1],\Pr[M_{n}\geq y]\geq\Pr[S_{n}\geq y]+\Pr[S_{n}\geq y+8A+1]\geq 2\Pr[S_{n}\geq y+8A+1],

as desired. ∎
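Although not needed for the proofs, the inequality of Lemma B.1 is easy to probe numerically. The following minimal Monte Carlo sketch (Python) estimates both sides; the values of A, r, n and y are illustrative choices of ours, not parameters from the paper.

import numpy as np

# Monte Carlo sanity check of Lemma B.1 (illustrative parameters).
rng = np.random.default_rng(0)
A, r, n, y, trials = 3, 0.3, 2000, 50, 20000

# Step magnitudes c_i with A <= c_i <= 4A, as in the lemma.
c = rng.integers(A, 4 * A + 1, size=n)

# X_i = +c_i w.p. r, -c_i w.p. r, and 0 otherwise (lazy symmetric walk).
u = rng.random((trials, n))
steps = np.where(u < r, c, np.where(u < 2 * r, -c, 0))

S = np.cumsum(steps, axis=1)                   # partial sums S_1,...,S_n
M = S.max(axis=1)                              # running maximum M_n

lhs = (M >= y).mean()                          # estimate of Pr[M_n >= y]
rhs = 2 * (S[:, -1] >= y + 8 * A + 1).mean()   # 2 * Pr[S_n >= y + 8A + 1]
print(f"Pr[M_n >= y] ~ {lhs:.4f}  >=  {rhs:.4f} ~ 2 Pr[S_n >= y+8A+1]")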

We can now prove Theorem 5.7.

Proof of Theorem 5.7.

Set δ=10r\delta=\frac{10}{\sqrt{r}}. Let Dk=i=1k(XiYi)D_{k}=\sum_{i=1}^{k}(X_{i}-Y_{i}) for each k{1,,m}k\in\{1,\dots,m\}. We construct a coupling for (X,Y)(X,Y) by coupling each (Xk,Yk)(X_{k},Y_{k}) as follows:

  1. If D_{k}<d, sample X_{k+1} and Y_{k+1} independently.

  2. If D_{k}\geq d, set X_{k+1}=Y_{k+1}.

Observe that if D_{k}\geq d for some k\leq m, then d\leq X-Y\leq d+2A. Therefore,

Pr[d+2AXYd]Pr[Mmd]\Pr[d+2A\geq X-Y\geq d]\geq\Pr[M_{m}\geq d]

where Mm=max{D0,,Dm}M_{m}=\max\{D_{0},...,D_{m}\}. Note that {Dk}\{D_{k}\} behaves like a (lazy) symmetric random walk until the first time τ\tau it is at least dd; after that {Dk}\{D_{k}\} stays put.
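A minimal sketch of this coupling in Python (the parameters m, A, r and d are illustrative choices of ours; we take c_{i}\in[A,2A] so that the non-zero steps of the difference walk lie between A and 4A):

import numpy as np

# Sketch of the coupling: sample the pair independently while D_k < d,
# and set X_{k+1} = Y_{k+1} once D_k >= d (illustrative parameters).
rng = np.random.default_rng(1)
m, A, r, d = 5000, 2, 0.25, 40
c = rng.integers(A, 2 * A + 1, size=m)   # step magnitudes c_i

def sample(ci):
    u = rng.random()
    return ci if u < r else (-ci if u < 2 * r else 0)

X = Y = D = 0                # D tracks D_k = sum_{i <= k} (X_i - Y_i)
for ci in c:
    x = sample(ci)
    y = x if D >= d else sample(ci)      # identical step after crossing
    X += x; Y += y; D += x - y

# Once D crosses d it stays put, so D >= d at the end iff it ever
# crossed, in which case d <= X - Y < d + 2 * max(c_i).
print("X - Y =", X - Y, "; crossed:", D >= d)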

Let \{D^{\prime}_{k}\} denote the corresponding random walk that does not stop after time \tau, and let M^{\prime}_{m}:=\max\{D^{\prime}_{0},...,D^{\prime}_{m}\}. Notice that

Pr[Mmd]=Pr[Mmd].\Pr[M_{m}\geq d]=\Pr[M^{\prime}_{m}\geq d].

Since the step size of {Dk}\{D^{\prime}_{k}\} is at least AA and at most 4A4A, by Lemma B.1 for any d0d\geq 0

Pr[Mmd]2Pr[Dmd+8A+1].\Pr[M^{\prime}_{m}\geq d]\geq 2\Pr[D^{\prime}_{m}\geq d+8A+1].

Let σ2=i=1mE[(XiYi)2]=4ri=1mci2\sigma^{2}=\sum_{i=1}^{m}\operatorname*{{\rm E}}[(X_{i}-Y_{i})^{2}]=4r\sum_{i=1}^{m}c_{i}^{2} and ρ=i=1mE[|XiYi|3]=4r(1+2r)i=1mci3\rho=\sum_{i=1}^{m}\operatorname*{{\rm E}}[|X_{i}-Y_{i}|^{3}]=4r(1+2r)\sum_{i=1}^{m}c_{i}^{3}. By the Berry-Esséen theorem for independent (but not necessarily identical) random variables (see, e.g. [3]), we get that for any yy\in\mathbb{R}

|Pr[Dm>yσ]Pr[N>y]|cρσ32cAσ.\left|\Pr[D^{\prime}_{m}>y\sigma]-\Pr[N>y]\right|\leq\frac{c\rho}{\sigma^{3}}\leq\frac{2cA}{\sigma}.

where NN is a standard normal random variable, and c[0.4,0.6]c\in[0.4,0.6] is an absolute constant. Then,

Pr[Dm>yσ]Pr[N>y]2cAσ.\Pr[D^{\prime}_{m}>y\sigma]\geq\Pr[N>y]-\frac{2cA}{\sigma}. (27)

Notice σ2Arm\sigma\geq 2A\sqrt{rm}. If d+8Aσd+8A\geq\sigma, the theorem holds vacuously since

1δ(d+A)Am=110(d+A)Arm<1d+8AArm1σArm12<0.1-\frac{\delta(d+A)}{A\sqrt{m}}=1-\frac{10(d+A)}{A\sqrt{rm}}<1-\frac{d+8A}{A\sqrt{rm}}\leq 1-\frac{\sigma}{A\sqrt{rm}}\leq 1-2<0.

If d+8A<\sigma, since it can be checked via a Taylor expansion that 2\Pr[N>y]\geq 1-\sqrt{2/\pi}\,y for y<1, we get from (27)

Pr[Mmd]2Pr[Dm>d+8A]\displaystyle\Pr[M_{m}\geq d]\geq 2\Pr[D^{\prime}_{m}>d+8A] 2Pr[N>d+8Aσ]4cAσ\displaystyle\geq 2\Pr\left[N>\frac{d+8A}{\sigma}\right]-\frac{4cA}{\sigma}
12/π(d+8A)σ4cAσ\displaystyle\geq 1-\frac{\sqrt{2/\pi}(d+8A)}{\sigma}-\frac{4cA}{\sigma}
19(d+A)σ\displaystyle\geq 1-\frac{9(d+A)}{\sigma}
1δ(d+A)Am,\displaystyle\geq 1-\frac{\delta(d+A)}{A\sqrt{m}},

as claimed. ∎

Appendix C Random graph estimates

In this section, we provide proofs of the lemmas that do not appear in the literature.

Recall that G\sim G(n,\frac{1+\lambda n^{-1/3}}{n}), where \lambda=\lambda(n) may depend on n. Lemmas 3.10 and 3.11 are both proved using the following precise estimates on the moments of the number of trees of a given size in G. We note that similar estimates can be found in the literature (see, e.g., [29, 1]); a proof is included for completeness.

Claim C.1.

Let tkt_{k} be the number of trees of size kk in GG. Suppose there exists a positive increasing function gg such that g(n)g(n)\rightarrow\infty, |λ|g(n)|\lambda|\leq g(n) and i,j,kn2/3g(n)2i,j,k\leq\frac{n^{2/3}}{g(n)^{2}}. If i,j,ki,j,k\rightarrow\infty as nn\rightarrow\infty, then:

  (i) \operatorname*{{\rm E}}[t_{k}]=\Theta\left(\frac{n}{k^{5/2}}\right);

  (ii) \operatorname*{{\rm Var}}(t_{k})\leq\operatorname*{{\rm E}}[t_{k}]+\frac{(1+o(1))\lambda n^{2/3}}{2\pi k^{3}};

  (iii) For i\neq j, \operatorname*{{\rm Cov}}(t_{i},t_{j})\leq\frac{(1+o(1))\lambda n^{2/3}}{2\pi i^{3/2}j^{3/2}}.
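Before turning to the proofs, we record a minimal simulation sketch (Python) that probes part (i) empirically: it counts tree components of G via union-find and compares windowed tree counts against the n/k^{5/2} scaling. The parameters n, \lambda and the number of trials are illustrative choices of ours.

import numpy as np
from collections import Counter

# Empirical probe of Claim C.1(i): the average number of trees with
# size in [k, 2k) should scale like Theta(n / k^{3/2}).
def tree_size_counts(n, lam, rng):
    p = (1.0 + lam * n ** (-1.0 / 3.0)) / n
    iu, ju = np.triu_indices(n, k=1)          # all candidate edges
    keep = rng.random(iu.size) < p
    parent = np.arange(n)
    size = np.ones(n, dtype=int)              # vertices per root
    edges = np.zeros(n, dtype=int)            # edges per root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in zip(iu[keep], ju[keep]):
        ra, rb = find(a), find(b)
        if ra == rb:
            edges[ra] += 1                    # extra edge closes a cycle
        else:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
            edges[ra] += edges[rb] + 1
    roots = [v for v in range(n) if find(v) == v]
    # a component is a tree iff it has exactly (size - 1) edges
    return Counter(int(size[v]) for v in roots if edges[v] == size[v] - 1)

rng = np.random.default_rng(0)
n, lam, trials = 2000, 1.0, 20
acc = Counter()
for _ in range(trials):
    acc += tree_size_counts(n, lam, rng)
for k in (5, 10, 20, 40):
    t = sum(acc[s] for s in range(k, 2 * k)) / trials
    print(f"k={k:3d}: avg #trees in [k,2k) = {t:8.2f}, k^1.5 * t / n = {k**1.5 * t / n:.3f}")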

To prove Lemma 3.10, we also use the following result.

Lemma C.2.

Suppose ε3n\varepsilon^{3}n\rightarrow\infty and ε=o(1)\varepsilon=o(1). Then w.h.p. the largest component of GG(n,1+εn)G\sim G\left(n,\frac{1+\varepsilon}{n}\right) is the only component of GG which contains more than one cycle. Also, w.h.p. the number of vertices contained in the unicyclic components of GG is less than g(n)ε2g(n)\varepsilon^{-2} for any function g(n)g(n)\rightarrow\infty.

Proof.

An equivalent result was established in [26] for the G(n,M) model, in which a set of exactly M edges is chosen uniformly at random from the set of all \binom{n}{2} possible edges (see Theorem 7 in [26]). The result follows from the asymptotic equivalence between the G(n,p) and G(n,M) models when M=\binom{n}{2}p (see, e.g., Proposition 1.12 in [21]). ∎

Proof of Lemma 3.10.

Let us fix α>0\alpha>0 and consider first the case when |λ||\lambda| is large. If λ<0\lambda<0 and |λ|=Ω(h(n)1/2)|\lambda|=\Omega(h(n)^{1/2}), then Lemma 3.7 implies that

E[j:Lj(G)BhLj(G)2]E[1(G)]=O(nλn1/3)=O(n4/3h(n)1/2).\operatorname*{{\rm E}}\left[\sum\nolimits_{j:L_{j}(G)\leq B_{h}}L_{j}(G)^{2}\right]\leq\operatorname*{{\rm E}}[\mathcal{R}_{1}(G)]=O\left(\frac{n}{\lambda n^{-1/3}}\right)=O\left(\frac{n^{4/3}}{h(n)^{1/2}}\right).

Similarly, if λ>0\lambda>0 and λ=Ω(h(n)1/2)\lambda=\Omega(h(n)^{1/2}), then Lemma 3.8 implies that E[2(G)]=O(n4/3h(n)1/2)\operatorname*{{\rm E}}[\mathcal{R}_{2}(G)]=O(n^{4/3}h(n)^{-1/2}). We may assume L1(G)BhL_{1}(G)\leq B_{h} since otherwise the size of the largest component does not contribute to the sum. Then,

E[j:Lj(G)BhLj(G)2]E[2(G)]+Bh2=O(n4/3h(n)1/2).\operatorname*{{\rm E}}\left[\sum\nolimits_{j:L_{j}(G)\leq B_{h}}L_{j}(G)^{2}\right]\leq\operatorname*{{\rm E}}[\mathcal{R}_{2}(G)]+B_{h}^{2}=O\left(\frac{n^{4/3}}{h(n)^{1/2}}\right).

Hence, if |\lambda|=\Omega(h(n)^{1/2}), the result follows from Markov's inequality.

Suppose next |λ|h(n)|\lambda|\leq\sqrt{h(n)}. Let tkt_{k} be the number of trees of size kk in GG and let 𝒯Bh\mathcal{T}_{B_{h}} be the set of trees of size at most BhB_{h} in GG. By Claim C.1.i,

E[τ𝒯Bh|τ|2]\displaystyle\operatorname*{{\rm E}}\left[\sum\nolimits_{\tau\in\mathcal{T}_{B_{h}}}|\tau|^{2}\right] =k=1Bhk2E[tk]=O(nh(n)2)+k=h(n)Bhk2E[tk]\displaystyle=\sum_{k=1}^{B_{h}}k^{2}\operatorname*{{\rm E}}[t_{k}]=O(nh(n)^{2})+\sum_{k=\lfloor h(n)\rfloor}^{B_{h}}k^{2}\operatorname*{{\rm E}}[t_{k}]
=O(nh(n)2)+O(n)k=h(n)Bh1k1/2=O(n4/3h(n)1/2).\displaystyle=O(nh(n)^{2})+O(n)\sum_{k=\lfloor h(n)\rfloor}^{B_{h}}\frac{1}{k^{1/2}}=O\left(\frac{n^{4/3}}{h(n)^{1/2}}\right). (28)

By Markov’s inequality, we get that τ𝒯Bh|τ|2An4/3h(n)1/2\sum\nolimits_{\tau\in\mathcal{T}_{B_{h}}}|\tau|^{2}\leq An^{4/3}h(n)^{-1/2} with probability at least γ\gamma, for any desired γ>0\gamma>0 for a suitable constant A=A(γ)>0A=A(\gamma)>0.

All that is left to prove is that the contribution from complex (non-tree) components is small. When |λ|=O(1)|\lambda|=O(1), this follows immediately from the fact that the expected number of complex components is O(1)O(1) (see, e.g., Lemma 2.1 in [27]). Then, if 𝒞Bh\mathcal{C}_{B_{h}} is the set of complex components in GG of size at most BhB_{h}, we have

E[C𝒞Bh|C|2]=O(n4/3h(n)2)E[|𝒞Bh|]=O(n4/3h(n)2),\operatorname*{{\rm E}}\left[\sum\nolimits_{C\in\mathcal{C}_{B_{h}}}|C|^{2}\right]=O\left(\frac{n^{4/3}}{h(n)^{2}}\right)\operatorname*{{\rm E}}\left[\left\lvert\mathcal{C}_{B_{h}}\right\rvert\right]=O\left(\frac{n^{4/3}}{h(n)^{2}}\right),

and the result follows again from Markov’s inequality and a union bound.

Finally, when h(n)|λ|\sqrt{h(n)}\geq|\lambda|\rightarrow\infty, Lemma C.2 implies that w.h.p. there is no multicyclic component except the largest component and that the number of vertices in unicyclic components is bounded by n2/3g(n)/λ2n^{2/3}g(n)/\lambda^{2}, for any function g(n)g(n)\rightarrow\infty. Hence, w.h.p.,

C𝒞Bh|C|n2/3g(n)λ2+Bh.\sum\nolimits_{C\in\mathcal{C}_{B_{h}}}|C|\leq\frac{n^{2/3}g(n)}{\lambda^{2}}+B_{h}.

Setting g(n)=λ2g(n)=\lambda^{2}, it follows that w.h.p.

C𝒞Bh|C|2Bh(n2/3g(n)λ2+Bh)n4/3h(n).\sum\nolimits_{C\in\mathcal{C}_{B_{h}}}|C|^{2}\leq B_{h}\left(\frac{n^{2/3}g(n)}{\lambda^{2}}+B_{h}\right)\leq\frac{n^{4/3}}{h(n)}.

This, combined with (28), Markov's inequality and a union bound, yields the result. ∎

Proof of Lemma 3.11.

Let T_{B} be the number of trees in G with size in the interval [B,2B]; then |S_{B}|\geq T_{B}. By Chebyshev's inequality, for any a>0:

Pr[TBE[TB]aσ]1a2,\Pr[T_{B}\leq\operatorname*{{\rm E}}[T_{B}]-a\sigma]\leq\frac{1}{a^{2}},

where σ2=Var(TB)\sigma^{2}=\operatorname*{{\rm Var}}(T_{B}). By Claim C.1.i,

\operatorname*{{\rm E}}[T_{B}]=\sum_{k=B}^{2B}\operatorname*{{\rm E}}[t_{k}]\geq\frac{c_{1}n}{B^{3/2}}

for a suitable constant c1>0c_{1}>0. Now,

Var(TB)=k=B2BVar(tk)+ji:j,i[B,2B]Cov(ti,tj).\operatorname*{{\rm Var}}(T_{B})=\sum_{k=B}^{2B}\operatorname*{{\rm Var}}(t_{k})+\sum_{j\neq i:j,i\in[B,2B]}\operatorname*{{\rm Cov}}(t_{i},t_{j}).

By Claim C.1.i and Claim C.1.ii,

k=B2BVar(tk)k=B2BE[tk]+k=B2B(1+o(1))λn2/32πk3=O(nB3/2)+O(|λ|n2/3B2)=O(nB3/2),\sum_{k=B}^{2B}\operatorname*{{\rm Var}}(t_{k})\leq\sum_{k=B}^{2B}\operatorname*{{\rm E}}[t_{k}]+\sum_{k=B}^{2B}\frac{(1+o(1))\lambda n^{2/3}}{2\pi k^{3}}=O\left(\frac{n}{B^{3/2}}\right)+O\left(\frac{|\lambda|n^{2/3}}{B^{2}}\right)=O\left(\frac{n}{B^{3/2}}\right),

where in the last equality we used the assumption that λ=o(n1/3)\lambda=o(n^{1/3}). Similarly, by Claim C.1.iii

ji:j,i[B,2B]Cov(ti,tj)ji:j,i[B,2B](1+o(1))λn2/32πi3/2j3/2(1+o(1))|λ|n2/32πB=O(nB3/2),\sum_{j\neq i:j,i\in[B,2B]}\operatorname*{{\rm Cov}}(t_{i},t_{j})\leq\sum_{j\neq i:j,i\in[B,2B]}\frac{(1+o(1))\lambda n^{2/3}}{2\pi i^{3/2}j^{3/2}}\leq\frac{(1+o(1))|\lambda|n^{2/3}}{2\pi B}=O\left(\frac{n}{B^{3/2}}\right),

where the last inequality follows from the assumption that Bn2/3g(n)2B\leq\frac{n^{2/3}}{g(n)^{2}}. Hence, for a suitable constant c2>0c_{2}>0

Var(TB)c2nB3/2\operatorname*{{\rm Var}}(T_{B})\leq\frac{c_{2}n}{B^{3/2}}

and taking a=c1n2B3/2σa=\frac{c_{1}n}{2B^{3/2}\sigma} we get

Pr[|SB|c1n2B3/2]Pr[TBc1n2B3/2](2B3/2σc1n)24c2B3/2c12n,\Pr\left[|S_{B}|\leq\frac{c_{1}n}{2B^{3/2}}\right]\leq\Pr\left[T_{B}\leq\frac{c_{1}n}{2B^{3/2}}\right]\leq\left(\frac{2B^{3/2}\sigma}{c_{1}n}\right)^{2}\leq\frac{4c_{2}B^{3/2}}{c_{1}^{2}n},

as desired. ∎

Proof of Corollary 3.12.

Lemma 3.11 implies that for a suitable constant b>0b>0

Pr[Nk(Xt+1,g)<bg(n)32k1]=O(g(n)32k1),\Pr\left[N_{k}(X_{t+1},g)<bg(n)^{3\cdot 2^{k-1}}\right]=O(g(n)^{-3\cdot 2^{k-1}}),

for any k\geq 1 such that g(n)^{2^{k}}=o(n^{2/3}). Observe that

k11g(n)32k1i11g(n)3i=O(g(n)3).\sum_{k\geq 1}\frac{1}{g(n)^{3\cdot 2^{k-1}}}\leq\sum_{i\geq 1}\frac{1}{g(n)^{3i}}=O(g(n)^{-3}).

Hence, a union bound over kk, i.e., over the intervals k(g)\mathcal{I}_{k}(g), implies that, with probability at least 1O(g(n)3)1-O(g(n)^{-3}), Nk(Xt+1,g)bg(n)32k1N_{k}(X_{t+1},g)\geq bg(n)^{3\cdot 2^{k-1}} for all k1k\geq 1 such that n2/3g(n)2kn^{2/3}g(n)^{-2^{k}}\rightarrow\infty, as claimed. ∎

Proof of Claim C.1.

Let c=1+\lambda n^{-1/3}. The following combinatorial identity follows immediately from Cayley's formula, which states that there are exactly k^{k-2} labeled trees on a given set of k vertices.

E[tk]=(nk)kk2(cn)k1(1cn)k(nk)+(k2)k+1.\operatorname*{{\rm E}}[t_{k}]=\binom{n}{k}k^{k-2}\left(\frac{c}{n}\right)^{k-1}\left(1-\frac{c}{n}\right)^{k(n-k)+\binom{k}{2}-k+1}.

Using the Taylor expansion for ln(1x)\ln(1-x) and the fact that k=o(n2/3)k=o(n^{2/3}), we get

n!(nk)!\displaystyle\frac{n!}{(n-k)!} =nki=1k1(1in)=nkexp(k22nk36n2+o(1)).\displaystyle=n^{k}\,\prod_{i=1}^{k-1}\left(1-\frac{i}{n}\right)=n^{k}\exp\left(-\frac{k^{2}}{2n}-\frac{k^{3}}{6n^{2}}+o(1)\right). (29)

Similarly,

(cn)k1\displaystyle\left(\frac{c}{n}\right)^{k-1} =1nk1exp(λkn1/3λ2k2n2/3+o(1)),\displaystyle=\frac{1}{n^{k-1}}\exp\left(\frac{\lambda k}{n^{1/3}}-\frac{\lambda^{2}k}{2n^{2/3}}+o(1)\right),
(1cn)k(nk)+(k2)k+1\displaystyle\left(1-\frac{c}{n}\right)^{k(n-k)+\binom{k}{2}-k+1} =exp(kλkn1/3+k22n+λk22n4/3+o(1)).\displaystyle=\exp\left(-k-\frac{\lambda k}{n^{1/3}}+\frac{k^{2}}{2n}+\frac{\lambda k^{2}}{2n^{4/3}}+o(1)\right).

Since kk\rightarrow\infty, Stirling’s approximation gives

kk2k!=(1+o(1))ek2πk5/2.\frac{k^{k-2}}{k!}=\frac{(1+o(1))e^{k}}{\sqrt{2\pi}k^{5/2}}. (30)

Putting all these bounds together, we get

E[tk]=(1+o(1))n2πk5/2exp(λ2k2n2/3+λk22n4/3k36n2)=Θ(nk5/2),\displaystyle\operatorname*{{\rm E}}[t_{k}]=\frac{(1+o(1))n}{\sqrt{2\pi}k^{5/2}}\exp\left(-\frac{\lambda^{2}k}{2n^{2/3}}+\frac{\lambda k^{2}}{2n^{4/3}}-\frac{k^{3}}{6n^{2}}\right)=\Theta\left(\frac{n}{k^{5/2}}\right), (31)

where in the last equality we used the assumptions that |\lambda|\leq g(n) and k\leq\frac{n^{2/3}}{g(n)^{2}}. This establishes part (i).
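For the reader's convenience, the exponent bookkeeping behind (31) is the following: multiplying (29), (30) and the two displays above, the exponents collect as

\left(-\frac{k^{2}}{2n}-\frac{k^{3}}{6n^{2}}\right)+\left(\frac{\lambda k}{n^{1/3}}-\frac{\lambda^{2}k}{2n^{2/3}}\right)+\left(-k-\frac{\lambda k}{n^{1/3}}+\frac{k^{2}}{2n}+\frac{\lambda k^{2}}{2n^{4/3}}\right)+k=-\frac{\lambda^{2}k}{2n^{2/3}}+\frac{\lambda k^{2}}{2n^{4/3}}-\frac{k^{3}}{6n^{2}},

and each term on the right-hand side is O(1) under our assumptions: \frac{\lambda^{2}k}{n^{2/3}}\leq 1, \frac{|\lambda|k^{2}}{n^{4/3}}\leq g(n)^{-3} and \frac{k^{3}}{n^{2}}\leq g(n)^{-6}.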

For part (ii) we proceed in a similar fashion, starting instead from the following combinatorial identity:

E[tk(tk1)]=n!k!k!(n2k)!(kk2)2(cn)2k2(1cn)m,\operatorname*{{\rm E}}[t_{k}(t_{k}-1)]=\frac{n!}{k!k!(n-2k)!}(k^{k-2})^{2}\left(\frac{c}{n}\right)^{2k-2}\left(1-\frac{c}{n}\right)^{m},

where m=2(k2)2(k1)+k2+2k(n2k)m=2\binom{k}{2}-2(k-1)+k^{2}+2k(n-2k) (see, e.g., [29]). Using the Taylor expansion for ln(1x)\ln(1-x), we get

n!(n2k)!\displaystyle\frac{n!}{(n-2k)!} =n2kexp(2k2n4k33n2+o(1)),\displaystyle=n^{2k}\exp\left(-\frac{2k^{2}}{n}-\frac{4k^{3}}{3n^{2}}+o(1)\right),
(cn)2k2\displaystyle\left(\frac{c}{n}\right)^{2k-2} =1n2k2exp(2λkn1/3λ2kn2/3+o(1)),\displaystyle=\frac{1}{n^{2k-2}}\exp\left(\frac{2\lambda k}{n^{1/3}}-\frac{\lambda^{2}k}{n^{2/3}}+o(1)\right),
(1cn)m\displaystyle\left(1-\frac{c}{n}\right)^{m} =exp(2k+2k2n2λkn1/3+2λk2n4/3+o(1)).\displaystyle=\exp\left(-2k+\frac{2k^{2}}{n}-\frac{2\lambda k}{n^{1/3}}+\frac{2\lambda k^{2}}{n^{4/3}}+o(1)\right).

These three bounds together with (30) imply

E[tk(tk1)]=(1+o(1))n22πk5exp(4k33n2λ2kn2/3+2λk2n4/3).\operatorname*{{\rm E}}[t_{k}(t_{k}-1)]=\frac{(1+o(1))n^{2}}{2\pi k^{5}}\exp\left(-\frac{4k^{3}}{3n^{2}}-\frac{\lambda^{2}k}{n^{2/3}}+\frac{2\lambda k^{2}}{n^{4/3}}\right).

From (31), we get

E[tk]2=(1+o(1))n22πk5exp(λ2kn2/3+λk2n4/3k33n2).\operatorname*{{\rm E}}[t_{k}]^{2}=\frac{(1+o(1))n^{2}}{2\pi k^{5}}\exp\left(-\frac{\lambda^{2}k}{n^{2/3}}+\frac{\lambda k^{2}}{n^{4/3}}-\frac{k^{3}}{3n^{2}}\right).

Hence,

Var(tk)\displaystyle\operatorname*{{\rm Var}}(t_{k}) =E[tk]+(1+o(1))n22πk5exp(λ2kn2/3+λk2n4/3k33n2)[exp(λk2n4/3k3n2)1]\displaystyle=\operatorname*{{\rm E}}[t_{k}]+\frac{(1+o(1))n^{2}}{2\pi k^{5}}\exp\left(-\frac{\lambda^{2}k}{n^{2/3}}+\frac{\lambda k^{2}}{n^{4/3}}-\frac{k^{3}}{3n^{2}}\right)\left[\exp\left(\frac{\lambda k^{2}}{n^{4/3}}-\frac{k^{3}}{n^{2}}\right)-1\right]
=E[tk]+(1+o(1))n22πk5[exp(λk2n4/3k3n2)1]\displaystyle=\operatorname*{{\rm E}}[t_{k}]+\frac{(1+o(1))n^{2}}{2\pi k^{5}}\left[\exp\left(\frac{\lambda k^{2}}{n^{4/3}}-\frac{k^{3}}{n^{2}}\right)-1\right]
E[tk]+(1+o(1))n22πk5[exp(λk2n4/3)1]\displaystyle\leq\operatorname*{{\rm E}}[t_{k}]+\frac{(1+o(1))n^{2}}{2\pi k^{5}}\left[\exp\left(\frac{\lambda k^{2}}{n^{4/3}}\right)-1\right]
E[tk]+(1+o(1))λn2/32πk3\displaystyle\leq\operatorname*{{\rm E}}[t_{k}]+\frac{(1+o(1))\lambda n^{2/3}}{2\pi k^{3}}

where in the second equality we used the assumptions that |λ|g(n)|\lambda|\leq g(n) and kn2/3g(n)2k\leq\frac{n^{2/3}}{g(n)^{2}} and for the last inequality we used the Taylor expansion for exe^{x}. This completes the proof of part (ii).

For part (iii), let =i+j\ell=i+j. When iji\neq j we have the following combinatorial identity (see, e.g., [29]):

E[titj]=n!i!j!(n)!ii2jj2(cn)2(1cn)m,\operatorname*{{\rm E}}[t_{i}t_{j}]=\frac{n!}{i!j!(n-\ell)!}i^{i-2}j^{j-2}\left(\frac{c}{n}\right)^{\ell-2}\left(1-\frac{c}{n}\right)^{m^{\prime}},

where m=(i2)(i1)+(j2)(j1)+ij+(n)m^{\prime}=\binom{i}{2}-(i-1)+\binom{j}{2}-(j-1)+ij+\ell(n-\ell). Using Taylor expansions and Stirling’s approximation as in the previous two parts, we get

E[titj]=(1+o(1))n22πi5/2j5/2exp(36n2λ22n2/3+λ22n4/3).\operatorname*{{\rm E}}[t_{i}t_{j}]=\frac{(1+o(1))n^{2}}{2\pi i^{5/2}j^{5/2}}\exp\left(-\frac{\ell^{3}}{6n^{2}}-\frac{\lambda^{2}\ell}{2n^{2/3}}+\frac{\lambda\ell^{2}}{2n^{4/3}}\right).

Moreover, from (31) we have

E[ti]E[tj]=(1+o(1))n22πi5/2j5/2exp(λ22n2/3+λ(i2+j2)2n4/3i3+j36n2+o(1)),\operatorname*{{\rm E}}[t_{i}]\operatorname*{{\rm E}}[t_{j}]=\frac{(1+o(1))n^{2}}{2\pi i^{5/2}j^{5/2}}\exp\left(-\frac{\lambda^{2}\ell}{2n^{2/3}}+\frac{\lambda(i^{2}+j^{2})}{2n^{4/3}}-\frac{i^{3}+j^{3}}{6n^{2}}+o(1)\right),

and so

Cov(ti,tj)\displaystyle\operatorname*{{\rm Cov}}(t_{i},t_{j}) =E[titj]E[ti]E[tj]\displaystyle=\operatorname*{{\rm E}}[t_{i}t_{j}]-\operatorname*{{\rm E}}[t_{i}]\operatorname*{{\rm E}}[t_{j}]
=(1+o(1))n22πi5/2j5/2exp(36n2λ22n2/3+λ22n4/3)[1exp(λijn4/3+ij2n2)]\displaystyle=\frac{(1+o(1))n^{2}}{2\pi i^{5/2}j^{5/2}}\exp\left(-\frac{\ell^{3}}{6n^{2}}-\frac{\lambda^{2}\ell}{2n^{2/3}}+\frac{\lambda\ell^{2}}{2n^{4/3}}\right)\left[1-\exp\left(-\frac{\lambda ij}{n^{4/3}}+\frac{ij\ell}{2n^{2}}\right)\right]
=(1+o(1))n22πi5/2j5/2[1exp(λijn4/3+ij(i+j)2n2)]\displaystyle=\frac{(1+o(1))n^{2}}{2\pi i^{5/2}j^{5/2}}\left[1-\exp\left(-\frac{\lambda ij}{n^{4/3}}+\frac{ij(i+j)}{2n^{2}}\right)\right]
(1+o(1))λn2/32πi3/2j3/2\displaystyle\leq\frac{(1+o(1))\lambda n^{2/3}}{2\pi i^{3/2}j^{3/2}}

where in the second equality we used the identity \ell^{3}=i^{3}+j^{3}+3ij\ell, in the third equality the assumptions that |\lambda|\leq g(n) and i,j\leq\frac{n^{2/3}}{g(n)^{2}}, and in the last inequality the Taylor expansion for e^{x}. ∎

Appendix D A second proof of the local limit theorem

In this appendix, we provide an alternative proof of Theorem 5.1 that does not use Theorem A.1.

Proof of Theorem 5.1.

Let \Phi(\cdot) denote the probability density function of the standard normal distribution. We will show that for any fixed a\in\mathbb{R},

|Pr[Smμmσm=a]Φ(a)σm|=o(1σm),\left\lvert\Pr\left[\frac{S_{m}-\mu_{m}}{\sigma_{m}}=a\right]-\frac{\Phi(a)}{\sigma_{m}}\right\rvert=o\left(\frac{1}{\sigma_{m}}\right), (32)

which is equivalent to (9).
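As a quick numerical illustration of (32), one can compute the exact law of S_{m} by convolution and compare it with the rescaled normal density. The sketch below (Python) does this for X_{i}=c_{i}\cdot\mathrm{Bernoulli}(r); the weights c_{i} and the value of r are illustrative choices of ours, not taken from the paper.

import numpy as np

# Exact pmf of S_m = sum_i c_i * Bernoulli(r) via iterated convolution,
# compared with the normal density, as in (32) (illustrative parameters).
r, m = 0.3, 400
c = np.concatenate([np.ones(m // 2, dtype=int),          # unit steps
                    np.arange(2, m // 2 + 2) % 7 + 1])   # bounded steps
mu = r * c.sum()
sigma = np.sqrt(r * (1 - r) * (c ** 2).sum())

pmf = np.ones(1)                           # law of the empty sum
for ci in c:
    new = (1 - r) * np.pad(pmf, (0, ci))   # X_i = 0 w.p. 1 - r
    new[ci:] += r * pmf                    # X_i = c_i w.p. r
    pmf = new

s = np.arange(pmf.size)
gauss = np.exp(-((s - mu) / sigma) ** 2 / 2) / (np.sqrt(2 * np.pi) * sigma)
err = sigma * np.abs(pmf - gauss).max()    # should be small, cf. (32)
print(f"sigma = {sigma:.1f}, sigma * max_s |pmf(s) - density(s)| = {err:.4f}")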

Let ϕ(t)\phi(t) denote the characteristic function for the random variable (Smμm)/σm(S_{m}-\mu_{m})/\sigma_{m}. By applying the inversion formula (see Theorem 3.3.14 and Exercise 3.3.2 in [13]),

Φ(a)=12πeitaet2/2𝑑t,\Phi(a)=\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-ita}e^{-t^{2}/2}dt,

and

Pr[Smμmσm=a]=12πσmπσmπσmeitaϕ(t)𝑑t.\Pr\left[\frac{S_{m}-\mu_{m}}{\sigma_{m}}=a\right]=\frac{1}{2\pi\sigma_{m}}\int_{-\pi\sigma_{m}}^{\pi\sigma_{m}}e^{-ita}\phi(t)dt.

Hence, the left hand side of (32) can be bounded from above by

12πσm[πσmπσm|eita(ϕ(t)et22)|𝑑t+2πσmet22𝑑t].\frac{1}{2\pi\sigma_{m}}\left[\int_{-\pi\sigma_{m}}^{\pi\sigma_{m}}\left\lvert e^{-ita}\left(\phi(t)-e^{-\frac{t^{2}}{2}}\right)\right\rvert dt+2\int_{\pi\sigma_{m}}^{\infty}e^{-\frac{t^{2}}{2}}dt\right].

Since |eita|1|e^{-ita}|\leq 1, it suffices to show that for all ε>0\varepsilon>0 there exists M>0M>0 such that if m>Mm>M then

πσmπσm|ϕ(t)et22|𝑑t+2πσmet22𝑑tε.\int_{-\pi\sigma_{m}}^{\pi\sigma_{m}}\left\lvert\phi(t)-e^{-\frac{t^{2}}{2}}\right\rvert dt+2\int_{\pi\sigma_{m}}^{\infty}e^{-\frac{t^{2}}{2}}dt\leq\varepsilon. (33)

We can bound from above the left hand side of (33) by:

AA|ϕ(t)et22|𝑑t+2Aσm/2|ϕ(t)|𝑑t+2σm/2πσm|ϕ(t)|𝑑t+2Aet22𝑑t.\int_{-A}^{A}\left\lvert\phi(t)-e^{-\frac{t^{2}}{2}}\right\rvert dt+2\int_{A}^{\sigma_{m}/2}\left\lvert\phi(t)\right\rvert dt+2\int_{\sigma_{m}/2}^{\pi\sigma_{m}}\left\lvert\phi(t)\right\rvert dt+2\int_{A}^{\infty}e^{-\frac{t^{2}}{2}}dt. (34)

The decomposition depends on a constant A that we will choose below. We proceed to bound each of the four integrals in (34) separately.

Lemma A.2 implies that Smμmσm\frac{S_{m}-\mu_{m}}{\sigma_{m}} converges in distribution to a standard normal. Combined with the continuity theorem (see Theorem 3.3.17 in [13]), we get that ϕ(t)et22\phi(t)\rightarrow e^{-\frac{t^{2}}{2}} as mm\rightarrow\infty. The dominated convergence theorem (see Theorem 1.5.8 in [13]) hence implies that for any A<A<\infty the first integral of (34) converges to 0. We select MM large enough so that the integral is less than ε/4\varepsilon/4.

The last integral of (34) is a standard normal tail, which goes to 0 exponentially fast as A increases (see, e.g., Proposition 2.1.2 in [32]). Therefore, we can select A large enough so that this tail has mass less than \varepsilon/8.

To bound the remaining two terms we use the properties of the characteristic function \phi(t). By definition and the independence of the X_{i}'s,

ϕ(t)=E[exp(itSmμmσm)]=exp(itμmσm)j=1mϕj(t),\phi(t)=\operatorname*{{\rm E}}\left[\exp\left(it\cdot\frac{S_{m}-\mu_{m}}{\sigma_{m}}\right)\right]=\exp\left(-\frac{it\mu_{m}}{\sigma_{m}}\right)\prod_{j=1}^{m}\phi_{j}(t),

where \phi_{j}(t) denotes the characteristic function of X_{j}/\sigma_{m}. Since \exp(-\frac{it\mu_{m}}{\sigma_{m}}) has modulus 1, we have \lvert\phi(t)\rvert\leq\prod_{j=1}^{m}\lvert\phi_{j}(t)\rvert.

We proceed to bound the third integral of (34). Note that |ϕj(t)|1\lvert\phi_{j}(t)\rvert\leq 1 for all jj and tt. Therefore,

|ϕ(t)|j=1m|ϕj(t)|jρm|ϕj(t)|.\lvert\phi(t)\rvert\leq\prod_{j=1}^{m}|\phi_{j}(t)|\leq\prod_{j\leq\rho m}|\phi_{j}(t)|.

Notice that the X_{j}'s for j\leq\rho m are Bernoulli random variables. By periodicity (see Theorem 3.5.2 in [13]), |\phi_{j}(t)| equals 1 only when t is an integer multiple of 2\pi\sigma_{m}. For t\in[\sigma_{m}/2,\pi\sigma_{m}], |\phi_{j}(t)| is therefore bounded away from 1: there exists a constant \eta<1 such that \lvert\phi_{j}(t)\rvert\leq\eta. Hence, \lvert\phi(t)\rvert\leq\eta^{\rho m}. By choosing M sufficiently large, we may bound the integral for m>M:

σm/2πσm|ϕ(t)|𝑑tσm/2πσmηρm𝑑tπσmηρmmηρmε8.\int_{\sigma_{m}/2}^{\pi\sigma_{m}}\left\lvert\phi(t)\right\rvert dt\leq\int_{\sigma_{m}/2}^{\pi\sigma_{m}}\eta^{\rho m}dt\leq\pi\sigma_{m}\eta^{\rho m}\leq m\eta^{\rho m}\leq\frac{\varepsilon}{8}.

Finally, we bound the second integral of (34). By the definition of XjX_{j}, we have

ϕj(t)=reitcjσm+(1r)=r(coscjtσm+isincjtσm)+1r,\phi_{j}(t)=re^{it\cdot\frac{c_{j}}{\sigma_{m}}}+(1-r)=r\cdot\left(\cos\frac{c_{j}t}{\sigma_{m}}+i\cdot\sin\frac{c_{j}t}{\sigma_{m}}\right)+1-r,

where the last identity uses Euler's formula. Taking the modulus of both sides,

\displaystyle\lvert\phi_{j}(t)\rvert =\sqrt{r^{2}\sin^{2}\frac{c_{j}t}{\sigma_{m}}+\left(r\cos\frac{c_{j}t}{\sigma_{m}}+1-r\right)^{2}}
=r2sin2cjtσm+r2cos2cjtσm+(1r)2+2r(1r)coscjtσm\displaystyle=\sqrt{r^{2}\sin^{2}\frac{c_{j}t}{\sigma_{m}}+r^{2}\cos^{2}\frac{c_{j}t}{\sigma_{m}}+(1-r)^{2}+2r(1-r)\cos\frac{c_{j}t}{\sigma_{m}}}
=r2+(1r)2+2r(1r)coscjtσm\displaystyle=\sqrt{r^{2}+(1-r)^{2}+2r(1-r)\cos\frac{c_{j}t}{\sigma_{m}}}
=12r(1r)(1coscjtσm)\displaystyle=\sqrt{1-2r(1-r)\left(1-\cos\frac{c_{j}t}{\sigma_{m}}\right)}
=1r(1r)(1coscjtσm)12r2(1r)2(1coscjtσm)2,\displaystyle=1-r(1-r)\left(1-\cos\frac{c_{j}t}{\sigma_{m}}\right)-\frac{1}{2}r^{2}(1-r)^{2}\left(1-\cos\frac{c_{j}t}{\sigma_{m}}\right)^{2}-\dots,

where the last equality corresponds to the Taylor expansion for 1+y\sqrt{1+y} when |y|1\lvert y\rvert\leq 1. We can also Taylor expand coscjtσm\cos\frac{c_{j}t}{\sigma_{m}} as

1cj2t22σm2+cj4t44!σm4cj6t66!σm6+1-\frac{c_{j}^{2}t^{2}}{2\sigma_{m}^{2}}+\frac{c_{j}^{4}t^{4}}{4!\sigma_{m}^{4}}-\frac{c_{j}^{6}t^{6}}{6!\sigma_{m}^{6}}+\dots

Observe that if \frac{c_{j}t}{\sigma_{m}}<1, then we can bound \cos\frac{c_{j}t}{\sigma_{m}} from above by 1-\frac{c_{j}^{2}t^{2}}{4\sigma_{m}^{2}}; indeed, for 0\leq x<1 the alternating series gives \cos x\leq 1-\frac{x^{2}}{2}+\frac{x^{4}}{24}\leq 1-\frac{x^{2}}{4}. Furthermore, keeping only the first-order term of the expansion for \lvert\phi_{j}(t)\rvert (all subsequent terms are non-positive), we have

|ϕj(t)|1r(1r)cj2t24σm2exp(r(1r)cj2t24σm2).\lvert\phi_{j}(t)\rvert\leq 1-r(1-r)\frac{c_{j}^{2}t^{2}}{4\sigma_{m}^{2}}\leq\exp\left(-r(1-r)\frac{c_{j}^{2}t^{2}}{4\sigma_{m}^{2}}\right). (35)

Note that (35) only holds if c_{j}t<\sigma_{m}. However, for every t\in[A,\sigma_{m}/2] there exists a real number u(t)\geq 1 such that u(t)t<\sigma_{m}\leq 2u(t)t, and so (35) holds for all j with c_{j}\leq u(t). We now aggregate the bounds (35) over all such j:

|ϕ(t)|j:cju(t)|ϕj(t)|exp(r(1r)j:cju(t)cj2t24σm2).\lvert\phi(t)\rvert\leq\prod_{j:c_{j}\leq u(t)}\lvert\phi_{j}(t)\rvert\leq\exp\left(-r(1-r)\sum_{j:c_{j}\leq u(t)}\frac{c_{j}^{2}t^{2}}{4\sigma_{m}^{2}}\right). (36)

Without loss of generality, we assume A>1; consequently, u(t)\leq\sigma_{m}. Lemma A.3 then implies that for \sigma_{m}\geq u(t)\geq 1, \sum_{j:c_{j}\leq u(t)}c_{j}^{2}\geq u(t)\sigma_{m}/r(1-r). Plugging this inequality into (36), we obtain

|ϕ(t)|exp(u(t)σmt24σm2)=exp(t82u(t)tσm)exp(t8).\lvert\phi(t)\rvert\leq\exp\left(-\frac{u(t)\sigma_{m}t^{2}}{4\sigma_{m}^{2}}\right)=\exp\left(-\frac{t}{8}\cdot\frac{2u(t)t}{\sigma_{m}}\right)\leq\exp\left(-\frac{t}{8}\right).

Therefore, for sufficiently large AA,

\int_{A}^{\sigma_{m}/2}\left\lvert\phi(t)\right\rvert dt\leq\int_{A}^{\infty}\exp\left(-\frac{t}{8}\right)dt=8e^{-A/8}\leq\frac{\varepsilon}{8}.

Thus we have established (33), and the proof is complete. ∎