The Critical Mean-field Chayes-Machta Dynamics
Abstract
The random-cluster model is a unifying framework for studying random graphs, spin systems and electrical networks that plays a fundamental role in designing efficient Markov Chain Monte Carlo (MCMC) sampling algorithms for the classical ferromagnetic Ising and Potts models.
In this paper, we study a natural non-local Markov chain known as the Chayes-Machta dynamics for the mean-field case of the random-cluster model, where the underlying graph is the complete graph on vertices. The random-cluster model is parametrized by an edge probability and a cluster weight . Our focus is on the critical regime: and , where is the threshold corresponding to the order-disorder phase transition of the model. We show that the mixing time of the Chayes-Machta dynamics is in this parameter regime, which reveals that the dynamics does not undergo an exponential slowdown at criticality, a surprising fact that had been predicted (but not proved) by statistical physicists. This also provides a nearly optimal bound (up to the factor) for the mixing time of the mean-field Chayes-Machta dynamics in the only regime of parameters where no non-trivial bound was previously known. Our proof consists of a multi-phased coupling argument that combines several key ingredients, including a new local limit theorem, a precise bound on the maximum of symmetric random walks with varying step sizes, and tailored estimates for critical random graphs.
In addition, we derive an improved comparison inequality between the mixing time of the Chayes-Machta dynamics and that of the local Glauber dynamics on general graphs; this results in better mixing time bounds for the local dynamics in the mean-field setting.
1 Introduction
The random-cluster model generalizes classical random graph and spin system models, providing a unifying framework for their study [14]. It plays an indispensable role in the design of efficient Markov Chain Monte Carlo (MCMC) sampling algorithms for the ferromagnetic Ising/Potts model [31, 8, 20] and has become a fundamental tool in the study of phase transitions [2, 12, 11].
The random-cluster model is defined on a finite graph with an edge probability parameter and a cluster weight . The set of configurations of the model is the set of all subsets of edges . The probability of each configuration is given by the Gibbs distribution:
(1) |
where is the number of connected components in and is the normalizing factor called the partition function.
The special case when corresponds to the independent bond percolation model, where each edge of the graph appears independently with probability . Independent bond percolation is also known as the Erdős-Rényi random graph model when is the complete graph.
For integer , the random-cluster model is closely related to the ferromagnetic -state Potts model. Configurations in the -state Potts model are the assignments of spin values to the vertices of ; the case corresponds to the Ising model. A sample from the random-cluster distribution can be easily transformed into one for the Ising/Potts model by independently assigning a random spin from to each connected component of . Random-cluster based sampling algorithms, which include the widely-studied Swendsen-Wang dynamics [30], are an attractive alternative to Ising/Potts Markov chains since they are often efficient at “low-temperatures” (large ). In this parameter regime, several standard Ising/Potts Markov chains are known to converge slowly.
In this paper we investigate the Chayes-Machta (CM) dynamics [10], a natural Markov chain on random-cluster configurations that converges to the random-cluster measure. The CM dynamics is a generalization to non-integer values of of the widely studied Swendsen-Wang dynamics [30]. As with all applications of the MCMC method, the primary object of study is the mixing time, i.e., the number of steps until the dynamics is close to its stationary distribution, starting from the worst possible initial configuration. We are interested in understanding how the mixing time of the CM dynamics grows as the size of the graph increases, and in particular how it relates to the phase transition of the model.
Given a random-cluster configuration , one step of the CM dynamics is defined as follows:
-
(i)
activate each connected component of independently with probability ;
-
(ii)
remove all edges connecting active vertices;
-
(iii)
add each edge between active vertices independently with probability , leaving the rest of the configuration unchanged.
We call (i) the activation sub-step, and (ii) and (iii) combined the percolation sub-step. It is easy to check that this dynamics is reversible with respect to the Gibbs distribution (1) and thus converges to it [10]. For integer , the CM dynamics may be viewed as a variant of the Swendsen-Wang dynamics. In the Swendsen-Wang dynamics, each connected component of receives a random color from , and the edges are updated within each color class as in (ii) and (iii) above; in contrast, the CM dynamics updates the edges of exactly one color class. However, note that the Swendsen-Wang dynamics is only well-defined for integer , while the CM dynamics is feasible for any real . Indeed, the CM dynamics was introduced precisely to allow this generalization.
The study of the interplay between phase transitions and the mixing time of Markov chains goes back to pioneering work in mathematical physics in the late 1980s. This connection for the specific case of the CM dynamics on the complete -vertex graph, known as the mean-field model, has received some attention in recent years (see [7, 15, 18]) and is the focus of this paper. As we shall see, the mean-field case is already quite non-trivial, and has historically proven to be a useful starting point in understanding various types of dynamics on more general graphs. We note that, so far, the mean-field is the only setting in which there are tight mixing time bounds for the CM dynamics; all other known bounds are deduced indirectly via comparison with other Markov chains, thus incurring significant overhead [8, 6, 17, 5, 31, 7].
The phase transition for the mean-field random-cluster model is fairly well-understood [9, 25]. In this setting, it is natural to re-parameterize by setting ; the phase transition then occurs at the critical value , where when and for . For all components are of size with high probability (w.h.p.); that is, with probability tending to as . This regime is known as the disordered phase. On the other hand, for there is a unique giant component of size , where ; this regime of parameters is known as the ordered phase. The phase transition is thus analogous to that in corresponding to the emergence of a giant component.
The phase structure of the mean-field random-cluster model, however, is more subtle and depends crucially on the second parameter . In particular, when the model exhibits phase coexistence at the critical threshold . Roughly speaking, this means that when , the set of configurations with all connected components of size , and set of configurations with a unique giant component, contribute each a constant fraction of the probability mass. For , on the other hand, there is no phase coexistence. These subtleties are illustrated in Figure 1.
Phase coexistence at when has significant implications for the speed of convergence of Markov chains, including the CM dynamics. The following detailed connection between the phase structure of the model and the mixing time of the CM dynamics was recently established in [7, 4, 18]. When , we have:
(2) |
where is the so-called metastability window. It is known that , but does not have a closed form; see [7, 25]; we note that for .
When , there is no metastability window, and the mixing time of the mean-field CM dynamics is for all . In view of these results, the only case remaining open is when and . Our main result shown below concerns precisely this regime, which is particularly delicate and had resisted analysis until now for reasons we explain in our proof overview.
Theorem 1.1.
The mixing time of the CM dynamics on the complete -vertex graph when and is .
A lower bound is known for the mixing time of the mean-field CM dynamics that holds for all and [7]. Therefore, our result is tight up to the lower order factor, and in fact even better as we explain in Remark 4. The conjectured tight bound when and is . We mention that the and case, which is quite different and not covered by Theorem 1.1, was considered earlier in [24] for the closely related Swendsen-Wang dynamics, and a tight bound was established for its mixing time. The same mixing time bound is expected for the CM dynamics in this regime.
Our result establishes a striking behavior for random-cluster dynamics when . Namely, there is no slowdown (exponential or power law) in this regime at the critical threshold . Note that for , as described in (2) above, the mixing time of the dynamics undergoes an exponential slowdown, transitioning from when , to a power law at , and to exponential in when . The absence of a critical slowdown for was in fact predicted by the statistical physics community [16], and our result provides the first rigorous proof of this phenomenon. See Remark 1 for further comments.
Our second result concerns the local Glauber dynamics for the random-cluster model. In each step, the Glauber dynamics updates a single edge of the current configuration chosen uniformly at random; a precise definition of this Markov chain is given in Section 6. In [7], it was established that any upper bound on the mixing time of the CM dynamics can be translated to one for the mixing time of the Glauber dynamics, at the expense of a factor; the notation hides polylogarithmic factors. In particular, it was proved in [7] that We provide here an improvement of this comparison inequality.
Theorem 1.2.
For all and all ,
To prove this theorem, we establish a general comparison inequality that holds for any graph, any and any ; see Theorem 6.1 for a precise statement. When combined with the known mixing time bounds for the CM dynamics on the complete graph, Theorem 1.2 yields that the random-cluster Glauber dynamics mixes in steps when and , or when and . In these regimes, the mixing time of the Glauber dynamics was previously known to be and is conjectured to be ; the improved comparison inequality in Theorem 1.2 gets us closer to this conjectured tight bound. We note, however, that even if one showed the conjectured optimal bound for the mixing time of the Glauber dynamics, the CM is faster, even if we take into account the computational cost associated to implementing its steps.
We conclude this introduction with some brief remarks about our analysis techniques, which combine several key ingredients in a non-trivial way. Our bound on the mixing time uses the well-known technique of coupling: in order to show that the mixing time is , it suffices to couple the evolutions of two copies of the dynamics, starting from two arbitrary configurations, in such a way that they arrive at the same configuration after steps with probability . (The moves of the two copies can be correlated any way we choose, provided that each copy, viewed in isolation, is a valid realization of the dynamics.) Because of the delicate nature of the phase transition in the random-cluster model, combined with the fact that the percolation sub-step of the CM dynamics is critical when , our coupling is somewhat elaborate and proceeds in multiple phases. The first phase consists of a burn-in period, where the two copies of the chain are run independently and the evolution of their largest components is observed until they have shrunk to their “typical” sizes. This part of the analysis is inspired by similar arguments in earlier work [7, 24, 15].
In the second phase, we design a coupling of the activation of the connected components of the two copies which uses: (i) a local limit theorem, which can be thought of as a stronger version of a central limit theorem; (ii) a precise understanding of the distribution of the maximum of symmetric random walks on with varying step sizes; and (iii) precise estimates for the component structure of random graphs. We develop tailored versions of these probabilistic tools for our setting and combine them to guarantee that the same number of vertices from each copy are activated in each step w.h.p. for sufficiently many steps. This phase of the coupling is the main novelty in our analysis, and allows us to quickly converge to the same configuration. We give a more detailed overview of our proof in the following section.
2 Proof sketch and techniques
We now give a detailed sketch of the multi-phased coupling argument for proving Theorem 1.1. We start by formally defining the notions of mixing and coupling times. Let be the set of random-cluster configurations of a graph ; let be the transition matrix of a random-cluster Markov chain with stationary distribution , and let be the distribution of the chain after steps starting from . The -mixing time of is given by
where denotes total variation distance. In particular, the mixing time of is .
A (one step) coupling of the Markov chain specifies, for every pair of states , a probability distribution over such that the processes and are valid realizations of , and if then . The coupling time, denoted , is the minimum such that , starting from the worst possible pair of configurations in . It is a standard fact that ; moreover, when for some coupling, then (see, e.g., [22]).
We provide first a high level description of our coupling for the CM dynamics. For this, we require the following notation. For a random cluster configuration , let denote the size of the -th largest connected component in , and let ; in particular, is the sum of the squares of the sizes of all the components of . Our coupling has three main phases:
-
1.
Burn-in period: run two copies , independently, starting from a pair of arbitrary initial configurations, until and .
-
2.
Coupling to the same component structure: starting from and such that and , we design a two-phased coupling that reaches two configurations with the same component structure as follows:
-
2a.
A two-step coupling after which the two configurations agree on all “large components”;
-
2b.
A coupling that after additional steps reaches two configurations that will also have the same “small component” structure.
-
2a.
-
3.
Coupling to the same configuration: starting from two configurations with the same component structure, there is a straightforward coupling that couples the two configurations in steps w.h.p.
We proceed to describe each of these phases in detail.
2.1 The burn-in period
During the initial phase, two copies of the dynamics evolve independently. This is called a burn-in period and in our case consists of three sub-phases.
In the first sub-phase of the burn-in period the goal is to reach a configuration such that . For this, we use a lemma from [4], which shows that after steps of the CM dynamics with at least constant probability; this holds when for any initial configuration and any .
Lemma 2.1 ([4], Lemma 3.42).
Let and , and let be an arbitrary random-cluster configuration. Then, for any constant , after steps and with probability .
In the second and third sub-phases of the burn-in period, we use the fact that when , the number of activated vertices is well concentrated around (its expectation). This is used to show that the size of the largest component contracts at a constant rate for steps until a configuration is reached such that . This part of the analysis is split into two sub-phases because the contraction for requires a more delicate analysis when ; this is captured in the following two lemmas.
Lemma 2.2.
Let and . Suppose . Then, for any constant , there exists such that and with probability .
Lemma 2.3.
Let and . Suppose and that for a sufficiently small constant . Then, with probability , after steps .
Lemmas 2.2 and 2.3 are proved in Section 4. Combining them with Lemma 2.1 immediately yields the following theorem.
Theorem 2.4.
Let , and let be an arbitrary random-cluster configuration of the complete -vertex graph. Then, with probability , after steps .
Remark 1.
The contraction of established by Lemmas 2.2 and 2.3 only occurs when ; when the quantity may increase in expectation, whereas for we have , and the contraction of the size of the largest component is due instead to fluctuations caused by a large second moment. (This is what causes the power law slowdown when .)
Remark 2.
Sub-steps (ii) and (iii) of the CM dynamics are equivalent to replacing the active portion of the configuration by a random graph, where is the number of active vertices. Since , one key challenge in the proofs of Lemmas 2.2 and 2.3, and in fact in the entirety of our analysis, is that the random graph is critical or almost critical w.h.p. since ; consequently its structural properties are not well concentrated and cannot be maintained for the required steps of the coupling. This is one of the key reasons why the regime is quite delicate.
2.2 Coupling to the same component structure
For the second phase of the coupling, we assume that we start from a pair of configurations , such that , . The goal is to show that after steps, with probability , we reach two configurations and with the same component structure; i.e., for all . In particular, we prove the following.
Theorem 2.5.
Let , and suppose are random-cluster configurations such that and . Then, there exists a coupling of the CM steps such that after steps and have the same component structure with probability .
Our coupling construction for proving Theorem 2.5 has two main sub-phases. The first is a two-step coupling after which the two configurations agree on all the components of size above a certain threshold , where is a slowly increasing function. For convenience and definiteness we set . In the second sub-phase we take care of matching the small component structures.
We note that when the same number of vertices are activated from each copy of the chain, we can easily couple the percolation sub-step (with an arbitrary bijection between the activated vertices) and replace the configuration on the active vertices in both chains with the same random sub-graph; consequently, the component structure in the updated sub-graph would be identical. Our goal is thus to design a coupling of the activation of the components that activates the same number of vertices in both copies in every step.
In order for the initial two-step coupling to succeed, certain (additional) properties of the configurations are required. These properties are achieved with a continuation of the initial burn-in phase for a small number of steps. For a random-cluster configuration , let and let denote the number of isolated vertices of . Our extension of the burn-in period is captured by the following lemma.
Lemma 2.6.
Let , and suppose is such that . Then, there exists and a constant such that , and with probability .
With these bounds on , , and , we construct the two-step coupling for matching the large component structure. The construction crucially relies on a new local limit theorem (Theorem 5.1). In particular, under our assumptions, when is small enough, there are few components with sizes above . Hence, we can condition on the event that all of them are activated simultaneously. The difference in the number of active vertices generated by the activation of these large components can then be “corrected” by a coupling of the activation of the smaller components; for this we use our new local limit theorem.
Specifically, our local limit theorem applies to the random variables corresponding to the number of activated vertices from the small components of each copy. We prove it using a result of Mukhin [28] and the fact that, among the small components, there are (roughly speaking) many components of many different sizes. To establish the latter we require a refinement of known random graph estimates (see Lemma 3.11).
To formally state our result we introduce some additional notation. Let be the set of connected components of with sizes greater than . At step , the activation of the components of two random-cluster configurations and is done using a maximal matching between the components of and , with the restriction that only components of equal size are matched to each other. For an increasing positive function and each integer , define as the number of matched pairs in whose component sizes are in the interval
where is a fixed large constant (independent of ).
Lemma 2.7.
Let , and suppose are random-cluster configurations such that , , and similarly for . Then, there exists a two-step coupling of the CM dynamics such that with probability .
Moreover, , , , , for all such that , and similarly for .
From the first part of the lemma we obtain two configurations that agree on all of their large components, as desired, while the second part guarantees additional structural properties for the resulting configurations so that the next sub-phase of the coupling can also succeed with the required probability. The proof of Lemma 2.7 is given in Section 5.2.
In the second sub-phase, after the large component are matched, we can design a coupling that activates exactly the same number of vertices from each copy of the chain. To analyze this coupling we use a precise estimate on the distribution of the maximum of symmetric random walks over integers (with steps of different sizes). We are first required to run the chains coupled for steps, so that certain additional structural properties appear. Let and be the components in the matching that belong to and , respectively, and let and be the complements of and . Let
Lemma 2.8.
Let , . Suppose and are random-cluster configurations such that , and for all such that . Suppose also that , , , , and similarly for .
Then, there exists a coupling of the CM steps such that with probability after steps: , , for all such that , , , and similarly for .
The proof of Lemma 2.8 also uses our local limit theorem (Theorem 5.1) and is provided in Section 5.3.
The final step of our construction is a coupling of the activation of the components of size less than , so that exactly the same number of vertices are activated from each copy in each step w.h.p.
Lemma 2.9.
Let , and suppose and are random-cluster configurations such that , , and for all such that . Suppose also that , and similarly for . Then, there exist a coupling of the CM steps and a constant such that after steps, and have the same component structure with probability .
We comment briefly on how we prove this lemma. Our starting point is two configurations with the same “large” component structure; i.e., . We use the maximal matching to couple the activation of the large components in and . The small components not matched by , i.e., those counted in , are then activated independently. This creates a discrepancy between the number of active vertices from each copy. Since and , it follows from Hoeffding’s inequality that w.h.p. To fix this discrepancy, we use the small components matched by . Specifically, under the assumptions in Lemma 2.9, we can construct a coupling of the activation of the small components so that the difference in the number of activated vertices from the small components from each copy is exactly with probability . This part of the construction utilizes random walks over the integers; in particular, we use a lower bound for the maximum of such a random walk.
We need to repeat this process until ; this takes steps since . However, there are a few complications. First, the initial assumptions on the component structure of the configurations are not preserved for this many steps w.h.p., so we need to relax the requirements as the process evolves. This is in turn possible because the discrepancy decreases with each step, which implies that the probability of success of the coupling increases at each step. See Section 5.4 for the detailed proof.
We now indicate how these lemmas lead to a proof of Theorem 2.5 stated earlier.
Proof of Theorem 2.5.
Remark 3.
We pause to mention that this delicate coupling for the activation of the components is not required when and . In that regime, the random-cluster model is super-critical, so after the first steps, the component structure is much simpler, with exactly one large component. On the other hand, when and the model is critical, which, combined with the fact mentioned earlier that the percolation sub-step of the dynamics is also critical when , makes the analysis of the CM dynamics in this regime quite subtle.
2.3 Coupling to the same configuration
In the last phase of the coupling, suppose we start with two configurations , with the same component structure. We are still required to bound the number of steps until the same configuration is reached. The following lemma from [7] supplies the desired bound.
Lemma 2.10 ([7], Lemma 24).
Let , and let , be two random-cluster configurations with the same component structure. Then, there exists a coupling of the CM steps such that after steps, w.h.p.
Combining the results for each of the phases of the coupling, we now prove Theorem 1.1.
Proof of Theorem 1.1.
Remark 4.
The probability of success in Theorem 2.5, which governs the lower order term in our mixing time bound, is controlled by our choice of the function for the definition of “large components”. By choosing that goes to more slowly, we could improve our mixing time bound to where is any function that tends to infinity arbitrarily slowly. However, it seems that new ideas are required to obtain a bound of (matching the known lower bound). In particular, the fact that is crucially used in some of our proofs. Our specific choice of yields the bound and makes our analysis cleaner.
3 Random graph estimates
In this section, we compile a number of standard facts about the random graph model which will be useful in our proofs. We use to denote a random graph sampled from the standard model, in which every edge appears independently with probability . A random graph is said to be sub-critical when . It is called super-critical when and critical when . For a graph , with a slight abuse of notation, let denote the size of the -th largest connected component in , and let ; note that the same notation is used for the components of a random-cluster configuration, but it will always be clear from context which case is meant.
Fact 3.1.
Given and . Let and . For any , .
Proof.
Consider the coupling of such that is a subgraph of . with probability 1. Proposition just follows from Strassen’s theorem (see, e.g., Theorem 22.6 in [22]). ∎
Lemma 3.2 ([24], Lemma 5.7).
Let denote the number of isolated vertices in . If , then there exists a constant such that
Consider the equation
(3) |
and let be defined as its unique positive root. Observe that is well-defined for .
Lemma 3.3 ([4], Lemma 2.7).
Let random graph where and . Assume and is bounded away from for all . Then, For and sufficiently large , there exists a constant such that
Lemma 3.4 ([4], Lemma 2.16).
For , we have .
Consider the near-critical random graph with .
Lemma 3.5 ([24], Theorem 5.9).
Assume , then for any satisfying , there exists some constant such that
Corollary 3.6.
Let with . For any positive constant , there exist constants and such that if , then
Lemma 3.7 ([24], Theorem 5.12).
Let , then
Lemma 3.8 ([24], Theorem 5.13).
Let and for large , then .
For the next results, suppose that , where may depend on .
Lemma 3.9.
If , then .
Proof.
Follows from Lemmas 2.13, 2.15 and 2.16 in [4]. ∎
All the random graph facts stated so far can be either found in the literature, or follow directly from well-known results. The following lemmas are slightly more refined versions of similar results in the literature.
Lemma 3.10.
Suppose and let , where is a positive increasing function such that . Then, for any there exists a constant such that, with probability at least ,
Lemma 3.11.
Let and suppose there exists a positive increasing function such that , , and . If , then there exists constants independent of such that
The proofs of Lemmas 3.10 and 3.11 are provided in Appendix C. Finally, the following corollary of Lemma 3.11 will also be useful. For a graph , let be the number of components of whose sizes are in the interval . We note that with a slight abuse of notation, for a random-cluster configuration , we also use for the number of connected components of in .
Corollary 3.12.
Let and let be an increasing positive function that such that , and . If , there exists a constant such that, with probability at least , for all such that .
4 The burn-in period: proofs
4.1 A drift function
Consider the mean-field random-cluster model with parameters and . In this subsection, we introduce a drift function captures the rate of decay of the size of the largest component in a configuration under steps of the CM dynamics which will be helpful for proving Lemma 2.2; this function was first studied in [7].
Given , consider the equation
(4) |
and let be defined as the largest positive root of (4). We shall see that is not defined for all and since there may not be a positive root. When and are clear from the context we use . Note that defined by equation (3) is the special case of (4) when ; observe that is only well-defined when .
We let so that Hence, is only defined when ; that is, , where . Note that when , is defined for every .
For fixed and , we call the drift function. which is defined on .
Lemma 4.1.
When , the drift function is non-negative for any , where is an arbitrarily small positive constant.
4.2 Shrinking a large component: proof of Lemma 2.2
The proof of Lemma 2.2 uses the following lemma, which follows directly from standard random graph estimates and Hoeffding’s inequality. To simplify the notation, we let . We use to denote the number of vertices activated by the step CM dynamics from configuration . Let denote the event that the largest component of the configuration is activated in step .
Lemma 4.2 ([4], Claim 3.45).
Suppose and for a large constant , and let be a fixed large constant. Then
-
1.
.
-
2.
.
Proof of Lemma 2.2.
Let be the first time when , let be a large constant we choose later; we set . Observe that with constant probability the largest component in the configuration is activated by the CM dynamics for every ; i.e., the event occurs for every . Let us assume this is the case and fix . Suppose where is the positive constant from Lemma 4.2. We show that with high probability:
-
(i)
; and
-
(ii)
where is a positive constant independent of and .
In particular, it suffices to set for the lemma to hold.
First, we show that is concentrated around its mean. Let and . Let , , and . Hoeffding’s inequality implies
If , then the random graph is super-critical since
Next, for a super-critical random graph, Lemma 3.3 provides a concentration bound for the size of largest new component, provided . To see this, we write as
where ; notice that . Let . Since
holds regardless of , Lemma 3.3 implies that for defined in Section 4.1, with high probability
Note that w.h.p.; hence, since we have w.h.p. We have shown that w.h.p.
where is the drift function defined in Section 4.1. By Lemma 4.1, we know for sufficiently small constant (independent of and ). Hence, w.h.p. for sufficiently large
this establishes (ii) from above.
For (i), note that for we have , so Lemma 4.2 implies,
A union bound implies that these two events occur simultaneously w.h.p. and the result follows. ∎
4.3 Shrinking a medium size component: proof of Lemma 2.3
In the third sub-phase of the burn-in period, we show that contracts at a constant rate; the precise description of this phenomenon is captured in the following lemma.
Lemma 4.3.
Suppose , for a large constant , and a small constant . Then:
-
1.
There exists a constant such that
-
2.
Since Lemmas 4.2 suggests with reasonably high probability throughout the execution of the sub-phase, Lemmas 4.3 and 4.2 can be combined to derive the following more accurate contraction estimate which will be crucial in the proof Lemma 2.3.
Lemma 4.4.
Suppose is an arbitrary function with range in the interval where is a large enough constant such that for we have , and is a small constant.
Suppose is such that and , then there exists a constant and such that at time , and with probability at least .
Proof of Lemma 2.3.
Let be a constant large enough so that , we have . Suppose and for the constant from Lemma 4.4. Suppose also ; otherwise there is nothing to prove.
Let and for all . Let be defined as the minimum natural number such that . Note that . Assume at time , there exists an integer such that satisfies:
-
1.
, and
-
2.
.
We show there exists time such that properties 1 and 2 hold for for a different index . The following bounds on sums and products involving the ’s will be useful; the proof is elementary and delayed to the end of this section.
Claim 4.5.
Let be defined as above.
-
(i)
For any positive constant , we have
-
(ii)
By part (ii) of this claim, note that
Hence, Lemma 4.4 implies that with probability there exist a time and a large constant such that and If we are done. Hence, suppose otherwise that . Since the interval is completely covered by the union of the intervals , …, , there must be an integer such that . Also, notice
By taking at most steps of induction, we obtain that there exist constants and such that with probability at least there exists a time
that satisfies and . Observe that is a time when our goal has been achieved, so it only remains to show that and . The lower bound on follows from part (i) of Claim 4.5:
By noting that , we can also bound since is at most . ∎
Proof of Lemma 4.3.
We start with part 1. Let , , and . Hoeffding’s inequality implies that
Let , and . Then, the monotonicity of the largest component in a random graph implies that for any
We bound next . For this, we rewrite as ; since
we have
Thus,
where the last inequality follows from the fact that , where is a sufficiently large constant.
Since , Lemma 3.5 implies
Let . The upper tail bound implies
We show next that for some .
where in the last inequality we use the assumption that . For sufficiently small and sufficiently large , such that
Consequently, with probability . If that is the case, . Therefore,
which concludes the proof of part 1.
For part 2, note first that when the largest component is inactive, we have ; hence, it is sufficient to show that with the desired probability.
Let , , and . By Hoeffding’s inequality,
Let , and let , By monotonicity of the largest component in a random graph,
Rewrite as , where
To conclude, we observe that
as desired. ∎
We are now ready to prove Lemma 4.4.
Proof of Lemma 4.4.
Suppose for a constant . Let , where is a constant such that and is the constant from Lemma 4.3. Let be the first time
where is a large constant we choose later. Let , where the operator takes the minimum of the two numbers. Define as the number of steps up to time in which the largest component of the configuration is activated.
To facilitate the notation, we define the following events. (The constants and are those from Lemmas 4.3 and 4.2, respectively).
-
1.
Let denote ;
-
2.
Let denote ; let us assume occurs;
-
3.
Let denote ; again, we assume occurs;
-
4.
Let denote ;
-
5.
Let be the intersection of and }.
By induction, we find a lower bound for the probability of . For the base case, note that by assumption. Next we show
If , then , so the induction holds. If , then we have . By the induction hypothesis ,
Moreover, since and , we have
Given and , Lemma 4.2 implies that occurs with probability
In addition, note that leads to . Let be the indicator function for the event . Given and , Lemma 4.3 implies
(5) |
with probability at least .
Dividing equation (5) by , we obtain for large enough . In particular, we can choose to be . A union bound then implies
The probability for can then be bounded as follows:
Next, let us assume . Then we have
Notice that if then the proof is complete. Consequently, it suffices to show with probability at least .
Observe that is a binomial random variable , whose expectation is . By Chernoff bound
If indeed and , then the event implies
which leads to . Therefore,
as desired. ∎
Proof of Claim 4.5.
We first show the following inequality:
(6) |
Note that by direction computation
From the definition of , we know that for all . Hence, . In addition, recall that is such that , we have ; therefore, . Then, . Putting all these together,
The proof of part (i) is inductive. The base case () holds trivially. For the inductive step note that
where the last inequality follows from (6).
For part (ii) we also use induction. The base case () can be checked straightforwardly. For the inductive step,
where the last inequality follows from (6). ∎
5 Coupling to the same component structure: proofs
5.1 Continuation of the burn-in phase: proof of Lemma 2.6
Recall that for a random-cluster configuration , let denote the random variable corresponding to the number of vertices activated by step (i) of the CM dynamics from .
Proof of Lemma 2.6.
We show that there exist suitable constants , and such that if and , then
(7) | ||||
(8) |
with probability . This implies that we can maintain (7)-(8) for steps with probability . Precisely, if we let
where the constant is chosen such that , then with probability . (Note that for a suitable constant .) Hence, and
The lemma then follows from the fact that with probability by Lemma 3.2 and a union bound.
To establish (7)-(8), let be the event that , where is a constant. By Hoeffding’s inequality, for a suitable , since . Let denote the subgraph induced on the inactivated vertices at the step . Observe that . Similarly, . Hence, by Markov’s inequality and independence between activation of each component, with probability at least , the activation sub-step is such that satisfies
and
We denote this event by . It follows by a union bound that and happen simultaneously with probability at least . We assume that this is indeed the case and proceed to discuss the percolation sub-step.
Lemma 3.10 implies that there exists such that with probability ,
Hence,
where the last inequality holds for a suitable constant and a sufficiently large since .
5.2 Coupling to the same large component structure: proof of Lemma 2.7
To prove Lemma 2.7, we use a local limit theorem to construct a two-step coupling of the CM dynamics that reaches two configurations with the same large component structure. The construction of Markov chain couplings using local limit theorems is not common (see [24] for another example), but it appears to be a powerful technique that may have other interesting applications. We provide next a brief introduction to local limit theorems.
Local limit theorem. Let be an integer. Let be integers and for , let be the random variable that is equal to with probability , and it is zero otherwise. Let us assume that are independent random variables. Let , and . We say that a local limit theorem holds for if for every integer :
(9) |
We prove, under some conditions, a local limit theorem that applies to the random variables corresponding to the number of active vertices from small components. Recall that for an increasing positive function and each integer , we defined the intervals
where is a fixed large constant.
Theorem 5.1.
Let be an integer. Let be integers, and suppose are independent random variables such that is equal to with probability , and is zero otherwise. Let be an increasing positive function such that and . Suppose , and for all , where is independent of . Let be the smallest integer such that . If for all , we have , then a local limit theorem holds for .
Theorem 5.1 follows from a general local limit theorem proved in [28]; a proof is given in Appendix A. We provide next the proof of Lemma 2.7.
Proof of Lemma 2.7.
First, both , perform one independent CM step from the initial configurations , . We start by establishing that and preserve the structural properties assumed for and .
By assumption , so Hoeffding’s inequality implies that the number of activated vertices from is such that
with probability . Then, the percolation step is distributed as a
random graph, with with probability . Conditioning on this event, from Lemma 3.2 we obtain that w.h.p. Moreover, from Lemma 3.9 and Markov’s inequality we obtain that with probability at least and from Lemma 3.10 that also with probability at least .
We show next that and , in addition to preserving the structural properties of and , also have many connected components with sizes in certain carefully chosen intervals. This fact will be crucial in the design of our coupling. When , by Lemmas 3.11 and 3.12 and a union bound, for all integer such that , w.h.p. (Recall, that denotes the number of connected components of with sizes in the interval .) We will also require a bound for the number of components with sizes in the interval
where is a constant such that does not intersect any of the ’s intervals. Let (resp., ) be the set of components of (resp., ) with sizes in the interval . Lemma 3.11 then implies that for some positive constants independent of ,
All the bounds above apply also to the analogous quantities for with the same respective probabilities. Therefore, by a union bound, all these properties hold simultaneously for both and with probability . We assume that this is indeed the case and proceed to describe the second step of the coupling, in which we shall use each of the established properties for and .
Recall and denote the sets of connected components in and , respectively, with sizes larger than . (Recall that , where .) Since , the total number of components in is ; moreover, it follows from the Cauchy–Schwarz inequality that the total number of vertices in the components in , denoted , is ; the same holds for . Without loss of generality, let us assume that . Let
and let . In words, is the smallest subset of components of so that the number of vertices in the union of and is greater than that in . Since every component in has size at least and , the number of vertices in is and so . In addition, the number of components in is . Let and observe that the number of components in is also and that
Note that may be (i.e., much larger than ). Hence, if all the components from and were activated, the difference in the number of active vertices could be . This difference cannot be corrected by our coupling for the activation of the small components. We shall require instead that all the components from and are activated so that the difference is instead.
We now describe a coupling of the activation sub-step for the second step of the CM dynamics. As mentioned, our goal is to design a coupling in which the same number of vertices are activated from each copy. If indeed , then we can choose an arbitrary bijective map between the activated vertices of and the activated vertices of and use to couple the percolation sub-step. Specifically, if and were activated in , the state of the edges in and in would be the same. This yields a coupling of the percolation sub-step such that and agree on the subgraph update at time .
Suppose then that in the second CM step all the components in and are activated simultaneously. If this is the case, then the difference in the number of activated vertices is . We will use a local limit theorem (i.e., Theorem 5.1) to argue that there is a coupling of the activation of the remaining components in and such that the total number of active vertices in both copies is the same with probability . Since all the components in and are activated with probability , the overall success probability of the coupling will be .
Now, let be the sizes of the components of that are not in (in increasing order). Let be the random variable corresponding to the number of active vertices from these components. Observe that is the sum of independent random variables, where the -th variable in the sum is equal to with probability , and it is otherwise. We claim that sequence satisfies all the conditions in Theorem 5.1.
First, note that since the number of isolated vertices in is , and consequently , and for all , where is independent of . Moreover, since for all such that ,
Since , we also have
Let and let
Hence, Theorem 5.1 implies that for any . Similarly, we get for any , with , and defined analogously for . Note that and . Without loss of generality, suppose . Then for any and , we have
Hence, there exists a coupling of and so that for all . Therefore, there is a coupling of and such that
Putting all these together, we deduce that with probability . If this is the case, the edge re-sampling step is coupled bijectively (as described above) so that .
It remains for us to guarantee the additional desired structural properties of and , which follow straightforwardly from the random graph estimates stated in Section 3. First note that by Hoeffding’s inequality, with probability ,
Hence, in the percolation sub-step the active subgraph is replaced by , where with probability since . Conditioning on this event, since the components of contribute to both and , Corollary 3.12 implies that w.h.p. for all such that . Moreover, from Lemma 3.2 we obtain that w.h.p. From Lemma 3.4 and Markov’s inequality, we obtain that with probability at least and from Lemma 3.10 that also with probability at least . All these bounds apply also to the analogous quantities for with the same respective probabilities.
Finally, we derive the bound for and . First, notice is stochastically dominated by , where . Under the assumption that , if , then Corollary 3.6 implies that w.h.p.; otherwise, and by Lemma 3.9 and Markov’s inequality, with probability at least . Thus, with probability at least . We also know that the largest inactivated component in has size less than , so with probability at least . The same holds for . Therefore, by a union bound, all these properties hold simultaneously for both and with probability , as claimed. ∎
5.3 Re-contracting largest component: proof of Lemma 2.8
In Section 5.2, we designed a coupling argument to ensure that the largest components of both configurations have the same size. For this, we needed to relax our constraint on the size of the largest component of the configurations. In this section we prove Lemma 2.8, which ensures that after steps the largest components of each configuration have size again.
The following lemma is the core of the proof Lemma 2.8 and it may be viewed as a generalization of the coupling from the proof of Lemma 2.7 using the local limit theorem from Section 5.2.
We recall some notation from the proof sketch. Given two random-cluster configurations and , is maximal matching between the components of and that only matches components of equal size to each other. We use , for the components in from , , respectively, , for the complements of , , and For an increasing positive function and each integer , define as the number of matched pairs in whose component sizes are in the interval
where is a fixed large constant (independent of ).
Lemma 5.2.
There exists a coupling of the activation sub-step of the CM dynamics such that with at least probability, provided and are random-cluster configurations satisfying
-
1.
;
-
2.
;
-
3.
for all such that ;
-
4.
.
Proof.
The activation coupling has two parts. First we use the maximal matching to couple the activation of a subset of the components in and . Specifically, let be defined as in Theorem 5.1; for all , we exclude pairs of components of size in the interval and we exclude pairs of matched isolated vertices. (These components exist by assumptions 3 and 4.) All other pairs of components matched by are jointly activated (or not). Hence, the number of vertices activated from in this first part of the coupling is the same as that from .
Let and denote the sets containing the components in and components in not considered to be activated in the first step of the coupling. This includes all the components from and , and all the components from and excluded in the first part of the coupling. Let and denote the number of activated vertices from and respectively. The second part is a coupling of the activation sub-step in a way such that
Let , and similarly for . Let (resp., ) be sizes of components in (resp., ) in ascending order. For all , let be a random variable that equals to with probability and otherwise, which corresponds to the number of activated vertices from th component in . Note that are independent. We check that satisfy all other conditions of Theorem 5.1.
Assumption and the first part of the activation ensure that
Observe also that there exists a constant such that for and for ; lastly, from assumption , we obtain
(10) |
Therefore, if and , Theorem 5.1 implies that for any ,
Similarly, we get that for any , with and defined analogously. Without loss of generality, suppose . Since , for , we obtain
Hence, we can couple so that for all . Consequently, under this coupling,
Since are independent, , and similarly . Hence, inequality (10) gives an upper bound for ; meanwhile, a lower bound for can be obtained by counting components in the largest interval:
Therefore,
as desired. ∎
We are now ready to prove Lemma 2.8.
Proof of Lemma 2.8.
Let be a suitable constant that we choose later. We wish to maintain the following properties for all :
-
1.
;
-
2.
;
-
3.
for all such that ;
-
4.
;
-
5.
;
-
6.
for some constant independent of .
By assumption, and satisfy these properties. Suppose that and satisfy these properties at step . We show that there exists a one-step coupling of the CM dynamics such that and preserve all six properties with probability .
We provide the high-level ideas of the proof first. We will crucially exploit the coupling from Lemma 5.2. Assuming , properties 1 and 2 hold immediately at , and properties 3 and 4 can be shown by a “standard” approach used throughout the paper. In addition, we reuse simple arguments from previous stages to guarantee properties 5 and 6.
Consider first the activation sub-step. By Lemma 5.2, with probability at least . If the number of vertices in the percolation is the same in both copies, we can couple the edge re-sampling so that the updated part of the configuration is identical in both copies. In other words, all new components created in this step are automatically contained in the component matching ; this includes all new components whose sizes are greater than . Since none of the new components contributes to , we obtain . Therefore, immediately implies properties 1 and 2 at time .
With probability , the largest components of and are activated simultaneously. Suppose that this is the case. By Hoeffding’s inequality, for constant , we have
Property 5 and the observation that imply that
By noting that , property 6 implies that
(11) |
We denote by . By inequality (11), with at least constant probability, the random graph for both chains is , where . Let us assume that is the case. Corollary 3.12 ensures that there exists a constant such that, with probability at least , for all such that . Since components in are simultaneously added to both and , property 3 is satisfied. Moreover, Lemma 3.2 implies that with high probability isolated vertices are added to and , and thus property 4 is satisfied at time .
In addition, Lemma 3.4 and Markov’s inequality imply that there exists a constant such that
By Lemma 4.3, there exists such that with at least probability
where is independent of and . Potentially, property 6 may not hold when , but then we stop at this point. (We will argue that in this case all the desired properties are also established shortly.) Hence, we suppose otherwise and establish properties 5 and 6 for . Similar bounds hold for .
By a union bound, and have all six properties with probability at least , assuming the activation sub-step satisfies all the desired properties, and thus overall with probability . Inductively, the probability that and satisfy the six properties is
Suppose and have the six properties. By choosing , properties 5 and 6 imply
and . While the lemma almost follows from these properties, notice that property 3 does not match the desired bounds on the components in the lemma statement. To fix this issue, we perform one additional step of the coupling.
Consider the activation sub-step at . Assume again . By Hoeffding’s inequality, for some constant , we obtain
(12) |
Let . Inequality (12) implies with at least constant probability the random graph in the percolation step is , where and . If so, Corollary 3.12 ensures with high probability for all such that .
By the preceding argument, with probability, the six properties are still valid at step , so the proof is complete. We note that if we had to stop earlier because property 6 did not hold, we perform one extra step (as above) to ensure that for all such that . ∎
5.4 A four-phase analysis using random walks couplings: proof of Lemma 2.9
We introduce first some notation that will be useful in the proof of Lemma 2.9. Let , and given , is obtained as follows:
-
(i)
;
-
(ii)
every component in activated by the CM dynamics at time is removed from ; and
-
(iii)
the largest new component (breaking ties arbitrarily) is added to .
Let denote the set of connected components of and note that is a subset of ; we use to denote the total number of vertices of the components in . Finally, let
In the proof of Lemma 2.9, we use the following lemmas.
Lemma 5.3.
Let be an increasing positive function such that and let be a sufficiently large constant. Suppose , and . Then, with probability at least , and .
Lemma 5.4.
Let be a positive function such that . Suppose a configuration satisfies . Let denote the number of activated vertices in this step, and . With probability , and .
Lemma 5.5.
Let and be two increasing positive functions of . Assume . Let and be two random-cluster configuration such that for some fixed constant independent of and for all such that . Assume also that for some constant . Lastly, assume . Then for every positive function there exists a coupling for the activation sub-step of the components of and such that
for some constant independent of .
The proofs of these lemmas are given in Section 5.4.1. In particular, as mentioned, to prove Lemma 5.5 we use a precise estimate on the maximum of a random walk on with steps of different sizes (see Theorem 5.7).
Proof of Lemma 2.9.
The coupling has four phases: in phase 1 we will consider steps of the coupling, steps in phase 2, steps in phase 3 and phase 4 consists of steps.
We will keep track of the random variables and for a function we shall carefully choose for each phase, and use these random variables to derive bounds on the probability of various events.
Phase 1. We set and where is a constant we choose. Let and let , and we fix . Suppose we have , , and for all such that , where is a large constant that we choose later.
By Lemma 5.5, for a sufficiently large constant , we obtain a coupling for the activation of and such that the same number of vertices are activated in and , with probability at least
By Lemma 5.4, and with probability at least . It follows a union bound that , and with probability . We call this event .
Let denote the inactivated components in at the step , and the inactivated components in . Observe that
Similarly,
Hence, by Markov’s inequality and independence between activation of components in and components in , with probability at least , the activation sub-step is such that
and
We denote this event by . By a union bound, and happen simultaneously with probability .
Suppose all these events indeed happen; then we couple the percolation step so that the components newly generated in both copies are exactly identical, and we claim that all of the following holds with at least constant probability:
-
1.
;
-
2.
;
-
3.
;
-
4.
for all such that .
First, note that can not possibly increase because the matching can only grow under the coupling if indeed . Observe that only the inactivated components in and would contribute to , so
Next, we establish the properties 3 and 4. For this, notice that the percolation step is distributed as a random graph. Corollary 3.12 implies for all such that , with probability at least . Moreover, Lemma 3.2 implies that with high probability . Since the percolation step is coupled, this implies that both and will have all the components in , so we have for all such that , and , w.h.p.
Finally, assuming that , by Lemma 3.9 and Markov’s inequality, there exists such that with probability at least . Then
for large enough . A union bound implies that all four properties hold with at least constant probability .
Thus, the probability that at each step of update all four properties can be maintained throughout Phase 1 is at least
If property 2 holds at the end of Phase 1, we have
To facilitate discussions in Phase 2, we show that the two copies of chains satisfy one additional property at the end of Phase 1. In particular, there exist a lower bound for the number of components in a different set of intervals. We consider the last percolation step in Phase 1. Then, Corollary 3.12 with implies for all such that , with high probability.
Recall and defined at the beginning of Section 5.4. In Phase 2, 3 and 4, a new element of the argument is to also control the behavior and . We provide a general result that will be used in the analysis of the three phases:
Claim 5.6.
Given positive increasing functions and that tend to infinity and satisfy
-
1.
;
-
2.
;
-
3.
;
-
4.
;
-
5.
;
-
6.
.
and random-cluster configurations satisfying
-
1.
;
-
2.
;
-
3.
;
-
4.
;
-
5.
for all such that .
There exists a coupling of CM steps such that after steps, with probability,
-
1.
;
-
2.
;
-
3.
;
-
4.
;
-
5.
If a function satisfies and , then for all such that .
Proof of this claim is provided later in Section 5.4.1.
Phase 2. Let . For Phase 2, we set , , , and . Notice these functions satisfy the conditions of 5.6:
-
1.
;
-
2.
;
-
3.
;
-
4.
;
-
5.
;
-
6.
.
Suppose that we have all the desired properties from Phase 1, so at the beginning of Phase 2 we have:
-
1.
;
-
2.
, ;
-
3.
;
-
4.
, ;
-
5.
for all such that .
Claim 5.6 implies there exists a coupling such that with probability
-
1.
;
-
2.
, ;
-
3.
, ;
-
4.
;
-
5.
for all such that .
Phase 3. Suppose the coupling in Phase 2 succeeds.
For Phase 3, we set the functions as , , , and . Claim 5.6 implies there exists a coupling such that with probability
-
1.
;
-
2.
;
-
3.
, ;
-
4.
,
-
5.
for all such that .
Phase 4. Suppose the coupling in Phase 3 succeeds. Let be a constant greater than . We set , , and . Claim 5.6 implies there exists a coupling such that with probability . Since is a non-negative integer-value random variable, . When , and have the same component structure.
Therefore, if the coupling in every phase succeeds, and have the same component structure. The probability that coupling in Phase 1 succeeds is . Conditional on the success of their previous phases, couplings in Phase 2, 3 and 4 succeed respectively with at least constant probability. Thus, the entire coupling succeeds with probability
where is a positive constant. ∎
5.4.1 Proof of lemmas used in Section 5.4
Proof of Claim 5.6.
We will show that given the following properties at any time , we can maintain them at time with probability at least :
-
1.
;
-
2.
for a constant ;
-
3.
;
-
4.
;
-
5.
for all such that .
By assumption, . According to Lemma 5.3, and retain properties 2 and 3 with probability at least .
Given properties 1, 4 and 5, Lemma 5.5 (with ) implies that there exist a constant and a coupling for the activation sub-step of and
Note that condition is used to deduce the inequality above. Suppose ; we couple components generated in the percolation step and preclude the growth of . Hence, , and property 1 holds immediately.
Recall that . Properties 2 and 3 imply that and . Since and , we can upper bound and by
We establish properties 4 and 5 with a similar argument as the one used in Phase 1.
Let . Due to Lemma 5.4 (with ) and Corollary 3.12 with probability at least , for all such that . In addition, with probability by Lemma 3.2. Since the coupling adds components in to both and , properties 4 and 5 are maintained at time , with probability at least .
A union bound concludes that at time we can maintain all five properties with probability at least . Hence, the probability that and still satisfy the listed 5 properties above is
It remains for us to show the bound for and that for a given function satisfying and , then for all such that .
Conditioned on for every activation sub-step in this phase, a bound for can be obtained through a first moment method. On expectation contract by a factor of each step. Thus, we can recursively compute the expectation of :
(13) |
It follows from Markov’s inequality that with at least constant probability
Finally, in the last percolation step in this phase, Corollary 3.12 guarantees that with high probability for all such that . The claim follows from a union bound. ∎
Proof of Lemma 5.3.
We establish first the bound for . Suppose vertices are activated from . By assumption
for sufficiently large . Hence, Hoeffding’s inequality implies that
with probability at least .
We consider two cases. First suppose that , where is a sufficiently small constant we choose later. Then,
The largest new component corresponds to the largest component of a random graph. Let be the size of that component, and let be the size of the largest component of a random graph, where . By Fact 3.1, is stochastically dominated by . Then by Corollary 3.6 there exists a constant such that
(14) |
for any . Now,
(15) |
where for the second to last inequality we use that , and the last inequality follows from the assumptions and . Also, since and ,
(16) |
Hence, (14), (15) and (16) imply
Since , for sufficiently small and
Therefore, with probability . If this is the case, then and so by a union bound with probability at least .
For the second case we assume and proceed in similar fashion. In this case, Hoeffding’s inequality implies with probability at least ,
The size of the largest new component, denoted , is stochastically dominated by the size of the largest component of a random graph, with . Now, since we assume ,
where the last inequality holds for large and a sufficiently large constant . Moreover,
Hence,
where , and by Corollary 3.6
Since, , a union bound implies that with probability at least as desired.
Finally, to bound we observe that if are all the new components in order of their sizes, then by Lemma 3.4 and Markov’s inequality:
Thus, with probability at least as claimed. The lemma follows from a union bound. ∎
Proof of Lemma 5.4.
Since , by Hoeffding’s inequality
with probability at least . The new connected components in correspond to those of a random graph, where . If , then
(17) |
An important tool used in the proof of Lemma 5.5 is the following coupling on a (lazy) symmetric random walk on ; its proof is given in Appendix B.
Theorem 5.7.
Let and let be positive integers. Let and consider the sequences of random variables and where for each : with probability ; with probability ; otherwise and has the same distribution as . Let and . Then for any , there exist a constant and a coupling of and such that
We note that Theorem 5.7 is a generalization of the following more standard fact which will also be useful to us.
Lemma 5.8 ([4], Lemma 2.18).
Let and be binomial random variables with parameters and , where is a constant. Then, for any integer , there exists a coupling such that for a suitable constant ,
Proof of Lemma 5.5.
For ease of notation let and for each . Also recall the notations , and defined in Section 2.2. Let and be the isolated vertices in from and , respectively.
Let . The activation of the non-trivial components in and whose sizes are not in is coupled using the matching . That is, and are activated simultaneously with probability . The components in and are activated independently. After independently activating these components, the number of active vertices from each copy is not necessarily the same. The idea is to couple the activation of the remaining components in and in way that corrects this difference.
Let and be number of active vertices from and , respectively, after the activation of the components from and . Observe that and that by Hoeffding’s inequality, for any
Recall . Hence, with probability at least ,
We first couple the activation of the components in , then in and so on up to . Without loss of generality, suppose that . If , we simply couple the components with sizes in using the matching . Suppose otherwise that . Let and be random variables corresponding to the numbers of active vertices from and with sizes in respectively. By assumption . Hence, Theorem 5.7 implies that for , there exists a coupling for the activation of the components in and with sizes in such that
with probability at least
where the last inequality holds for large enough. Let . If the coupling succeeds, we have . Thus, we have shown that with probability at least
Now, let be the difference in the number of active vertices after activating the components in . Suppose that , for . By assumption, . Thus, using Theorem 5.7 again we get that there exists a coupling for the activation of the components in such that
Therefore, there is a coupling of the activation components in such that
where . Note that for a suitable constant , we have
and since
we get
Finally, we couple and to fix . By assumption , so . Let and denote the total number of activated isolated vertices from and respectively. We activate all isolated vertices independently, so and can be seen as two binomial random variables with the same parameters and . Lemma 5.8 gives a coupling for binomial random variables such that for ,
Therefore,
as claimed. ∎
6 New mixing time for the Glauber dynamics via comparison
In this section, we establish a comparison inequality between the mixing times of the CM dynamics and of the heat-bath Glauber dynamics for the random-cluster model for a general graph . The Glauber dynamics is defined as follows. Given a random-cluster configuration , one step of this chain is given by:
-
(i)
pick an edge uniformly at random;
-
(ii)
replace the current configuration by with probability
-
(iii)
else replace by .
It is immediate from its definition this chain is reversible with respect to and thus converges to it.
The following comparison inequality was proved in [7]:
(18) |
where denotes the number of edges in , and , the spectral gaps of the transition matrices of the CM and Glauber dynamics, respectively. The standard connection between the spectral gap and the mixing time (see, e.g., Theorem 12.3 in [23]) yields
(19) |
where with denoting the set of random-cluster configurations on . In some cases, such as in the mean-field model with , , and a factor of is thus lost in the comparison. We provide here an improved version of this inequality.
Theorem 6.1.
For any and any , the mixing time of Glauber dynamics for the random-cluster model on a graph with vertices and edges satisfies
We note that in the mean-field model, where and we take with , this theorem yields that , which establishes Theorem 1.2 from the introduction and improves by a factor of the best previously known bound for the Glauber dynamics on the complete graph.
To prove Theorem 6.1 we use the following standard fact.
Theorem 6.2.
Let be a Markov chain on state space with stationary distribution . Suppose there exist a subset of states and a time , such that for any and any we have Then
(20) |
where .
Note that is the minimum probability of any configuration on . Without the additional assumptions in the theorem, the best possible bound involves a factor of instead. We remark that there are related conditions under which (20) holds; we choose the condition that for every and every for convenience.
We can now provide the proof of Theorem 6.1.
Proof of Theorem 6.1.
First note that if , it suffices to prove that
This follows from (19) and the fact that
since the partition function for the random-cluster model on satisfies (see, e.g., Theorem 3.60 in [19]).
Thus, we may assume . From (18) and the standard relationship between the spectral gap and the mixing time (see, e.g., Theorem 12.4 in [23]) we obtain:
(21) |
Let denote the transition matrix of the Glauber dynamics. In order to apply Theorem 6.2, we have to find a suitable subset of states and a suitable time so that , for every and every .
We let and for a sufficiently large constant . When an edge is selected for update by the Glauber dynamics, it is set to be open with probability if it is a “cut edge” or with probability if it is not; recall that we say an edge is open if the edge is present in the random-cluster configuration. Therefore, since when , after every edge has been updated at least once the number of open edges in any configuration is stochastically dominated by the number of edges in a random graph. By the coupon collector bound, every edge has been updated at least once at time w.h.p. for large enough . Moreover, if all edges are indeed updated by time , the number of open edges in at any time is at most with probability at least by Markov’s inequality. Therefore, the Glauber dynamics satisfies condition in Theorem 6.2 for these choices of and .
For the sake of completeness, we conclude this section with a proof of Theorem 6.2.
Proof of Theorem 6.2.
For and , we have
where the last inequality follows from the theorem assumption for .
For any , we have
see inequality (12.11) in [23]. Hence, for any we have
(24) |
Letting , we deduce from (24) that for
(25) |
Since , it remains for us to provide a bound for when . Consider two copies , of the chain . For let be the coupling of , such that the two copies evolve independently up to time and if and for some then the optimal coupling is used so that
recall that the existence of an optimal coupling is guaranteed by the coupling lemma (see, e.g., Proposition 4.7 in [23]). Then, for any
where the last inequality follows from the triangle inequality. Now,
provided . Using a standard boosting argument (see (4.36) in [23]) and (25) we deduce that as claimed. ∎
References
- [1] N. Alon and J.H. Spencer. The probabilistic method. John Wiley & Sons, 2000.
- [2] V. Beffara and H. Duminil-Copin. The self-dual point of the two-dimensional random-cluster model is critical for . Probability Theory and Related Fields, 153:511–542, 2012.
- [3] A. C. Berry. The accuracy of the gaussian approximation to the sum of independent variates. 1941.
- [4] A. Blanca. Random-cluster dynamics. PhD thesis, UC Berkeley, 2016.
- [5] A. Blanca and R. Gheissari. Random-cluster dynamics on random regular graphs in tree uniqueness. Communications in Mathematical Physics, 2021.
- [6] A. Blanca, R. Gheissari, and E. Vigoda. Random-cluster dynamics in : rapid mixing with general boundary conditions. Annals of Applied Probability, 30(1):418–459, 2020.
- [7] A. Blanca and A. Sinclair. Dynamics for the mean-field random-cluster model. Proceedings of the 19th International Workshop on Randomization and Computation (RANDOM), pages 528–543, 2015.
- [8] A. Blanca and A. Sinclair. Random-Cluster Dynamics in . Probability Theory and Related Fields, 168:821–847, 2017.
- [9] B. Bollobás, G.R. Grimmett, and S. Janson. The random-cluster model on the complete graph. Probability Theory and Related Fields, 104(3):283–317, 1996.
- [10] L. Chayes and J. Machta. Graphical representations and cluster algorithms II. Physica A, 254:477–516, 1998.
- [11] H. Duminil-Copin, M. Gagnebin, M. Harel, I. Manolescu, and V. Tassion. Discontinuity of the phase transition for the planar random-cluster and Potts models with . Annales de l’ENS, 2016. To Appear.
- [12] H. Duminil-Copin, V. Sidoravicius, and V. Tassion. Continuity of the Phase Transition for Planar Random-Cluster and Potts Models with . Communications in Mathematical Physics, 349(1):47–107, 2017.
- [13] R. Durrett. Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2010.
- [14] C.M. Fortuin and P.W. Kasteleyn. On the random-cluster model I. Introduction and relation to other models. Physica, 57(4):536–564, 1972.
- [15] A. Galanis, D. Štefankovič, and E. Vigoda. Swendsen-Wang algorithm on the mean-field Potts model. Proceedings of the 19th International Workshop on Randomization and Computation (RANDOM), pages 815–828, 2015.
- [16] T. Garoni. Personal communication, 2015.
- [17] R. Gheissari and E. Lubetzky. Quasi-polynomial mixing of critical two-dimensional random cluster models. Random Structures & Algorithms, 56(2):517–556, 2020.
- [18] R. Gheissari, E. Lubetzky, and Y. Peres. Exponentially slow mixing in the mean-field Swendsen-Wang dynamics. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1981–1988. SIAM, 2018.
- [19] G.R. Grimmett. The Random-Cluster Model, volume 333 of Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Ver- lag, Berlin, 2006.
- [20] H. Guo and M. Jerrum. Random cluster dynamics for the Ising model is rapidly mixing. In Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1818–1827. SIAM, 2017.
- [21] S. Janson, T. Łuczak, and A. Ruciński. Random Graphs. Wiley Series in Discrete Mathematics and Optimization. Wiley, 2011.
- [22] D.A. Levin and Y. Peres. Markov Chains and Mixing Times. MBK. American Mathematical Society, 2017.
- [23] D.A. Levin, Y. Peres, and E.L. Wilmer. Markov Chains and Mixing Times. American Mathematical Society, 2008.
- [24] Y. Long, A. Nachmias, W. Ning, and Y. Peres. A power law of order 1/4 for critical mean-field Swendsen-Wang dynamics. Memoirs of the American Mathematical Society, 232(1092), 2011.
- [25] Malwina Luczak and Tomasz Luczak. The phase transition in the cluster-scaled model of a random graph. Random Structures & Algorithms, 28(2):215–246, 2006.
- [26] T. Luczak. Cycles in a random graph near the critical point. Random Structures & Algorithms, 2(4):421–439, 1991.
- [27] T. Łuczak, B. Pittel, and J.C. Wierman. The structure of a random graph at the point of the phase transition. Transactions of the American Mathematical Society, 341(2):721–748, 1994.
- [28] A.B. Mukhin. Local limit theorems for lattice random variables. Theory of Probability & Its Applications, 36(4):698–713, 1992.
- [29] B. Pittel. On tree census and the giant component in sparse random graphs. Random Structures & Algorithms, 1(3):311–342, 1990.
- [30] R.H. Swendsen and J.S. Wang. Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters, 58:86–88, 1987.
- [31] M. Ullrich. Swendsen-Wang is faster than single-bond dynamics. SIAM Journal on Discrete Mathematics, 28(1):37–48, 2014.
- [32] Roman Vershynin. High-dimensional probability. 2019.
Appendix A Proof of the local limit theorem
In this appendix, we prove Theorem 5.1. First, we introduce some notation. For a random variable and , let , where denotes distance to the closest integer and is a symmetrized version of ; i.e., where is an i.i.d. copy of . Let . The following local limit theorem is due to Mukhin [28] (all limits are taken as ).
Theorem A.1 ([28], Theorem 1).
Suppose that the sequence converges in distribution to a standard normal random variable and that . If and there exists such that we have then the local limit theorem holds.
Next, we show how to derive Theorem 5.1 from Theorem A.1. The proof involves the following two lemmas.
Lemma A.2.
For the random variables satisfying the conditions from Theorem 5.1, and converges in distribution to a standard normal random variable.
Proof.
Observe that
and also
Hence, the random variables satisfy Lyapunov’s central limit theorem conditions (see, e.g., [13]), and so converges in distribution to a standard normal random variable. ∎
Lemma A.3.
Suppose satisfy the conditions from Theorem 5.1. For any satisfying , .
Proof.
We have We consider three cases. First, if , there exists a largest integer such that , where is the smallest integer such that . Then,
by we mean that is of lower order with respect to . Now, when , we have
Finally, if , is sublinear and so
as claimed. ∎
Proof of Theorem 5.1.
We check that the ’s satisfy the conditions from Theorem A.1. Lemma A.2 implies and ; by Lemma A.3 we also have that for any satisfying , . It remains to show that .
Now, for , observe that the value of equals to with probability , with probability , and otherwise. Then for , evaluates to with probability , and otherwise. Therefore, for and we have that Thus,
Since we have shown that the ’s satisfy all the conditions from Theorem A.1, the result follows. ∎
Appendix B Proofs of random walk couplings
Another important tool in our proofs are couplings based on the evolutions of certain random walks. In this section we consider a (lazy) symmetric random walk on with bounded step size, and the first result we present is an estimate on which is based on the well-known reflection principle (see, e.g., Chapter 2.7 in [22]).
Lemma B.1.
Let and let be positive integers. Let and consider the sequence of random variables where for each : with probability ; with probability ; and otherwise. Let and . Then, for any
Proof.
First, note that
(26) |
If , let be the value of the random walk the first time its value was at least . Then,
where in the second equality we used the fact that the random walk is symmetric and the last one follows from the fact that . Plugging this into (26), we get
since . Finally, observe that
and so
as desired. ∎
We can now prove Theorem 5.7.
Proof of Theorem 5.7.
Set . Let for each . We construct a coupling for by coupling each as follows:
-
1.
If , sample and independently.
-
2.
If , set .
Observe that if for any , then . Therefore,
where . Note that behaves like a (lazy) symmetric random walk until the first time it is at least ; after that stays put.
Let denote such random walk which does not stop after , and . Notice
Since the step size of is at least and at most , by Lemma B.1 for any
Let and . By the Berry-Esséen theorem for independent (but not necessarily identical) random variables (see, e.g. [3]), we get that for any
where is a standard normal random variable, and is an absolute constant. Then,
(27) |
Notice . If , the theorem holds vacuously since
Appendix C Random graphs estimates
In this section, we provide proofs of lemmas which do not appear in the literature.
Recall , where may depend on . Both of Lemmas 3.10 and 3.11 are proved using the following precise estimates on the moments of the number of trees of a given size in . We note that similar estimates can be found in the literature (see, e.g., [29, 1]); a proof is included for completeness.
Claim C.1.
Let be the number of trees of size in . Suppose there exists a positive increasing function such that , and . If as , then:
-
(i)
;
-
(ii)
;
-
(iii)
For , .
To prove Lemma 3.10, we also use the following result.
Lemma C.2.
Suppose and . Then w.h.p. the largest component of is the only component of which contains more than one cycle. Also, w.h.p. the number of vertices contained in the unicyclic components of is less than for any function .
Proof.
Proof of Lemma 3.10.
Let us fix and consider first the case when is large. If and , then Lemma 3.7 implies that
Similarly, if and , then Lemma 3.8 implies that . We may assume since otherwise the size of the largest component does not contribute to the sum. Then,
Hence, if , the result follows from Markov’s inequality.
Suppose next . Let be the number of trees of size in and let be the set of trees of size at most in . By Claim C.1.i,
(28) |
By Markov’s inequality, we get that with probability at least , for any desired for a suitable constant .
All that is left to prove is that the contribution from complex (non-tree) components is small. When , this follows immediately from the fact that the expected number of complex components is (see, e.g., Lemma 2.1 in [27]). Then, if is the set of complex components in of size at most , we have
and the result follows again from Markov’s inequality and a union bound.
Finally, when , Lemma C.2 implies that w.h.p. there is no multicyclic component except the largest component and that the number of vertices in unicyclic components is bounded by , for any function . Hence, w.h.p.,
Setting , it follows that w.h.p.
This, combined with (C), Markov’s inequality and a union bound yields the result. ∎
Proof of Lemma 3.11.
Let be number of trees in with size in the interval ; then . By Chebyshev’s inequality for :
for a suitable constant . Now,
By Claim C.1.i and Claim C.1.ii,
where in the last equality we used the assumption that . Similarly, by Claim C.1.iii
where the last inequality follows from the assumption that . Hence, for a suitable constant
and taking we get
as desired. ∎
Proof Corollary 3.12.
Lemma 3.11 implies that for a suitable constant
for any such that . Observe that
Hence, a union bound over , i.e., over the intervals , implies that, with probability at least , for all such that , as claimed. ∎
Proof of Claim C.1.
Let . The following combinatorial identity follows immediately from the fact that there are exactly trees of size .
Using the Taylor expansion for and the fact that , we get
(29) |
Similarly,
Since , Stirling’s approximation gives
(30) |
Putting all these bounds together, we get
(31) |
where in the last inequality we used the assumptions that and . This establishes part (i).
For part (ii) we proceed in similar fashion, starting instead from the following combinatorial identity:
where (see, e.g., [29]). Using the Taylor expansion for , we get
These three bounds together with (30) imply
From (31), we get
Hence,
where in the second equality we used the assumptions that and and for the last inequality we used the Taylor expansion for . This completes the proof of part (ii).
For part (iii), let . When we have the following combinatorial identity (see, e.g., [29]):
where . Using Taylor expansions and Stirling’s approximation as in the previous two parts, we get
Moreover, from (31) we have
and so
where in the third equality we used the assumptions that and and the last inequality follows from the Taylor expansion for . ∎
Appendix D The second proof of the local limit theorem
Proof of Theorem 5.1.
Let denote the probability density function of a standard normal distribution. We will show for any fixed ,
(32) |
which is equivalent to (9).
Let denote the characteristic function for the random variable . By applying the inversion formula (see Theorem 3.3.14 and Exercise 3.3.2 in [13]),
and
Hence, the left hand side of (32) can be bounded from above by
Since , it suffices to show that for all there exists such that if then
(33) |
We can bound from above the left hand side of (33) by:
(34) |
The division depends on some constant that we will choose soon. We proceed to bound integral terms in (34) independently.
Lemma A.2 implies that converges in distribution to a standard normal. Combined with the continuity theorem (see Theorem 3.3.17 in [13]), we get that as . The dominated convergence theorem (see Theorem 1.5.8 in [13]) hence implies that for any the first integral of (34) converges to 0. We select large enough so that the integral is less than .
The last integral of (34) is the standard normal tail that goes to exponentially fast as increases (see e.g. Proposition 2.1.2 in [32]). Therefore, we are able to select large enough so that each tail has probability mass less than .
To bound the remaining two terms we use the properties of the characteristic function . By definition and the independence between ’s,
where denotes the characteristic function of . Since always has modulo 1, .
We proceed to bound the third integral of (34). Note that for all and . Therefore,
Notice that the ’s for are Bernoulli random variables. By periodicity (see Theorem 3.5.2 in [13]), equals to 1 only when equals to the multiples of . For , is bounded away from , and there exists a constant such that . Hence, . By choosing to be sufficiently large, we may bound the integral for :
Finally, we bound the second integral of (34). By the definition of , we have
where the last identity uses Euler’ formula. Take the modulo of both sides,
where the last equality corresponds to the Taylor expansion for when . We can also Taylor expand as
Observe that if , then we can bound from above by . Furthermore, if we keep only the first order term from the expansion for , we have
(35) |