This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Central Limit Theorem for Majority Dynamics: Bribing Three Voters Suffices

Ross Berkowitz111Dept. of Mathematics, Yale University, New Haven CT, USA   [email protected]    Pat Devlin222Dept. of Mathematics, Yale University, New Haven CT, USA   [email protected]
(October 15, 2020)
Abstract

Given a graph GG and some initial labelling σ:V(G){Red,Blue}\sigma:V(G)\to\{Red,Blue\} of its vertices, the majority dynamics model is the deterministic process where at each stage, every vertex simultaneously replaces its label with the majority label among its neighbors (remaining unchanged in the case of a tie). We prove—for a wide range of parameters—that if an initial assignment is fixed and we independently sample an Erdős–Rényi random graph, Gn,pG_{n,p}, then after one step of majority dynamics, the number of vertices of each label follows a central limit law. As a corollary, we provide a strengthening of a theorem of Benjamini, Chan, O’Donnell, Tamuz, and Tan about the number of steps required for the process to reach unanimity when the initial assignment is also chosen randomly.

Moreover, suppose there are initially three more red vertices than blue. In this setting, we prove that if we independently sample the graph Gn,1/2G_{n,1/2}, then with probability at least 51%51\%, the majority dynamics process will converge to every vertex being red. This improves a result of Tran and Vu who addressed the case that the initial lead is at least 10.

footnotetext: AMS 2010 subject classification: 05C80, 05C78, 60F05, 68R10, 60C05footnotetext: Key words and phrases: majority dynamics, random graph, central limit theorem, Fourier analysis

1 Introduction

Majority dynamics is a model for belief propagation on a graph. In this process, each vertex of the graph has one of two opinions (which we will refer to as the colors ‘red’ and ‘blue’). Each round, the opinions of the vertices are all updated simultaneously, and each vertex replaces their opinion so as to match the majority opinion that was held by their neighbors (in the case of a tie, a vertex does not change its opinion). This updating step is then iterated, and the resulting process is referred to as majority dynamics. This model has been used in a variety of settings including social choice theory [14, 17], economics [3, 8], and statistical physics [6, 7], and many related processes have been proposed as well (see [1, 2, 4, 24] for just a few examples).

The majority dynamics model has gotten a lot of attention for fixed deterministic graphs, with many of the primary questions being related to the long-term behavior [13, 12]. The problem has also been studied in the settings of fixed graphs with random initial colorings [15, 21, 10]. We direct the interested reader to [16] and [19] for helpful surveys.

The situation of majority dynamics on random graphs has also been a focus of interest, but as discussed below this setting is not particularly well understood. Before diving into this literature, let us first establish the following notation to be used throughout.

Notation

The Erdős–Rényi random graph, Gn,pG_{n,p}, is defined as the graph with a vertex set of size nn and each potential edge independently included with probability pp. Given a graph and an intial coloring R0R_{0} and B0B_{0}, we will let RiR_{i} (resp. BiB_{i}) denote the set of vertices that are red (resp. blue) after ii steps of majority dynamics. The initial coloring R0,B0R_{0},B_{0} may be fixed or determined randomly, but in either case it will always be independent of the random graph Gn,pG_{n,p}.

Asymptotic notation is understood to mean as the relevant parameter (typically nn) tends to infinity. We say an event happens with high probability to mean its probability tends to 11. All logarithms have base ee. Finally, we use the standard notation Φ(t)=(2π)1/2tex2/2𝑑x\Phi(t)=(2\pi)^{-1/2}\int_{-\infty}^{t}e^{-x^{2}/2}dx to denote the probability that a mean 0, variance 1 Gaussian random variable is at most tt.

Previous results for random graphs

In studying majority dynamics on random graphs, a common approach is as follows. One first tries to control the number of red vertices after one [5, 23, 22] or two [9, 23] steps (hoping that in these initial steps, one color will have managed to establish a substantial lead). After this, one typically argues that with probability tending to 1, if any color ever manages to establish a sufficiently large lead, then this lead is going to increase even if the vertices were adversarially recolored (provided this recoloring preserves the number of vertices of each color).

The second part of this strategy (i.e., showing that with high probability large leads cannot be squandered) often involves some variant of a relatively direct union bound using the fact that the random graph is sufficiently expansive. On the other hand, the first part of this strategy (i.e., controlling the number of vertices of each color after one or two steps) is considerably more nuanced.

Following this general strategy, Benjamini, Chan, O’Donnell, Tamuz, and Tan [5] obtained rough estimates for the mean and variance of |R1||R_{1}| under the assumption that the initial coloring R0,B0R_{0},B_{0} is assigned uniformly at random with |R0||B0||R_{0}|\geq|B_{0}|. Their argument was based on a combination of Fourier analysis and spectral graph techniques, and using their estimates, they were able to prove that if pC/np\geq C/\sqrt{n} and |R0||B0||R_{0}|\geq|B_{0}|, then (R4=V)0.4o(1)\mathbb{P}(R_{4}=V)\geq 0.4-o(1). Thus, for pC/np\geq C/\sqrt{n}, they proved that with probability at least 0.40.4, for a randomized initial coloring of Gn,pG_{n,p}, the color that was initially ahead will propagate to include all the vertices of the graph within at most four steps. When pC/n1/3p\geq C/n^{1/3}, their same estimates on |R1||R_{1}| also prove that with probability at least 0.40.4, the same result holds after at most 3 steps.

Analyzing the same setting, Fountoulakis, Kang, and Makai [9] improved on the first of these results by focusing instead on estimating the mean and variance of |R2||R_{2}|. Namely, they proved that if R0,B0R_{0},B_{0} is chosen uniformly at random and pC/np\geq C/\sqrt{n}, then with probability tending to 1, the color that was initially ahead will propagate to the entire graph within at most four steps.

For smaller values of pp, Zehmakan [23] studied the case that the vertices of Gn,pG_{n,p} are each initially colored blue independently with probability qq. For q1/2ω(1/np)q\leq 1/2-\omega(1/\sqrt{np}) he proved that when p(1+ε)log(n)/np\geq(1+\varepsilon)\log(n)/n, then with high probability, majority dynamics converges to all vertices being red after a bounded number of steps. (For smaller pp, there will be isolated blue vertices.) Gärtner and Zehmakan [11] also proved a similar result for dd-regular random graphs. See [5] for several conjectures about how the majority dynamics process evolves for values of pp less than C/nC/\sqrt{n} and |R0||B0||R_{0}|\sim|B_{0}|.

Finally, in a different direction, Tran and Vu [22] considered the case that the initial coloring is fixed with |R0||B0|10|R_{0}|-|B_{0}|\geq 10. In this setting, they argued that with probability at least 90%90\%, the graph Gn,1/2G_{n,1/2} will result in all of its vertices being red.

Our results

It is worth emphasizing that in the above papers, the key first step has been to analyze the distribution of |R1||R_{1}| or |R2||R_{2}|. In fact, essentially all of these previous results have followed from some new estimate on the mean or variance of these random variables. To quote [5], “the main task is to analyze what happens at time 1.” This brings us to our primary result.

Theorem 1.

Let R0R_{0}, B0B_{0} be any fixed initial coloring of nn vertices, and suppose we independently sample the random graph Gn,pG_{n,p}. Let |R1||R_{1}| denote the number of vertices that will be red after one step of majority dynamics. Let Δ=||R0||B0||\Delta=\Big{|}|R_{0}|-|B_{0}|\Big{|}, let σ=np(1p)\sigma=\sqrt{np(1-p)}, and define

μ=(Bin(|R0|,p)Bin(|B0|,p))(Bin(|R0|,p)Bin(|B0|,p)).\mu=\mathbb{P}\Big{(}Bin(|R_{0}|,p)\geq Bin(|B_{0}|,p)\Big{)}\cdot\mathbb{P}\Big{(}Bin(|R_{0}|,p)\leq Bin(|B_{0}|,p)\Big{)}.
  • (i)

    For any choice of parameters, the variance of |R1||R_{1}| satisfies Var(|R1|)=nμ+𝒪(Δ+n/σ)Var(|R_{1}|)=n\mu+\mathcal{O}(\Delta+n/\sigma).

  • (ii)

    If log(1/μ)=o(log(σ))\log(1/\mu)=o(\log(\sigma)) and σ\sigma\to\infty, then Var(|R1|)nμVar(|R_{1}|)\sim n\mu, and |R1||R_{1}| obeys a central limit law. Namely, for all fixed reals a<ba<b, we have

    limn(|R1|𝔼|R1|nμ[a,b])=12πabex2/2𝑑x=Φ(b)Φ(a).\lim_{n\to\infty}\mathbb{P}\left(\dfrac{|R_{1}|-\mathbb{E}|R_{1}|}{\sqrt{n\mu}}\in[a,b]\right)=\dfrac{1}{\sqrt{2\pi}}\int_{a}^{b}e^{-x^{2}/2}dx=\Phi(b)-\Phi(a).

In particular, if Δp=𝒪(σ)\Delta p=\mathcal{O}(\sigma) and σ\sigma\to\infty, then the hypotheses and conclusion of (ii) hold.

This provides an unqualified upper bound that Var(|R1|)=𝒪(n)Var(|R_{1}|)=\mathcal{O}(n), and for a wide range of parameters, it also gives a central limit theorem. Crucially, the statement is for a given value of |R0||R_{0}| (which may certainly depend on nn) as opposed to |R0||R_{0}| being a random variable. Conditioning on |R0||R_{0}| in this way enables us to control all the moments of |R1||R_{1}|, whereas otherwise the variance in this quantity would be dominated by the variance of |R0||R_{0}|.

Theorem 1 is proven by the method of moments, but to illustrate the main challenge, consider the calculation of the variance of |R1||R_{1}|. The difficulty here is that one quickly arrives at this variance as a sum of covariances, but each covariance term is large in absolute value. Bounding this sum by its maximum term gives the incorrect asymptotic growth, and the key idea is therefore devising a way to analyze the large amount of cancellation in these sums. We do this by studying the relevant pp-biased Fourier coefficients (with random variables viewed as functions of the edges present in Gn,pG_{n,p}).

At this point, it is worth noting that for a wide range of parameters (concretely, for |R0|=|B0||R_{0}|=|B_{0}| and pp constant) the distribution of |R1||R_{1}| converges to a normal, but the distribution of |R2||R_{2}| certainly will not. This is simply because if |R1||R_{1}| is sufficiently far from n/2n/2 (e.g., at least Ω(n/p)\Omega(\sqrt{n/p}) away [see Proposition 3]) then this would force |R2||R_{2}| to be very far away from n/2n/2. Thus, the mass of the distribution of |R1||R_{1}| is pushed very far away from n/2n/2, resulting in a particularly bimodal distribution (having mean n/2n/2, variance Ω(n2)\Omega(n^{2}), but being bounded between 0 and nn). The distribution of |R3||R_{3}| would be even more exaggerated, and it would likely be essentially two peaks at 0 and nn. For many settings of parameters, we believe the distribution of |R4||R_{4}| would converge to literally two point masses at 0 and nn. Thus, we ask the following (with the unproven belief that |Rk||R_{k}| is often only interesting for k{1,2}k\in\{1,2\}).

Question 2.

How is |Rk||R_{k}| distributed? (For concreteness, take |R0|=|B0||R_{0}|=|B_{0}| and p=1/2p=1/2)

Regardless of the above, Theorem 1 is already strong enough to imply most of the known results (including the bounds on |R2||R_{2}| from [9]), but to leverage this we use the following.

Proposition 3.

For each ε>0\varepsilon>0, there are constants C>0C>0 and q(0,1)q\in(0,1) such that if np(1p)>Cnp(1-p)>C and GGn,pG\sim G_{n,p}, then with probability at least 1qn1-q^{n}, the following statement simultaneously holds for every choice of δ(ε,1ε)\delta\in(\varepsilon,1-\varepsilon) and every RV(G)R\subseteq V(G). If |R|n/2+αn(1p)/p|R|\geq n/2+\alpha\sqrt{n(1-p)/p} and

sup0γ2eγ2/2[Φ(γ2α1δ)]δ14ε,\sup_{0\leq\gamma\leq 2}e^{-\gamma^{2}/2}\left[\Phi\left(\frac{\gamma-2\alpha}{\sqrt{1-\delta}}\right)\right]^{\delta}\leq\dfrac{1}{4}-\varepsilon,

then the number of vertices of GG having at least as many edges to VRV\setminus R as to RR is at most δn\delta n. In particular, if ε=1010\varepsilon=10^{-10} and δ=0.499\delta=0.499, we may take α=0.85\alpha=0.85.

For δ=0.499\delta=0.499, a result such as the above is not difficult to prove for sufficiently large α\alpha, and in fact versions of this have also appeared in [5, 22]. But we state the above so as to improve the Δ=10\Delta=10 case of Gn,1/2G_{n,1/2} as studied in Tran and Vu. In that setting, the constants involved are frustratingly important, and a proof with α=0.85\alpha=0.85 becomes necessary.

Applications

From Theorem 1—and in fact merely from our variance estimate—we readily obtain:

Corollary 4.

There is a universal constant C>0C>0 such that the following holds. Suppose we fix any initial red-blue partition R0R_{0}, B0B_{0} of nn vertices where |R0||B0|Δ>0|R_{0}|-|B_{0}|\geq\Delta>0, and we independently sample a random graph Gn,pG_{n,p}. Then for all real t1t\geq 1, we have

(|R1|nΦ(Δpnp(1p))ntnp(1p))1Cp(1p)t2.\mathbb{P}\left(|R_{1}|\geq n\Phi\left(\dfrac{\Delta p}{\sqrt{np(1-p)}}\right)-\dfrac{nt}{\sqrt{np(1-p)}}\right)\geq 1-\dfrac{Cp(1-p)}{t^{2}}.

Moreover, if 5Δp5\leq\Delta p and np(1p)25\sqrt{np(1-p)}\geq 25, then we have

(|R1|n/2+min(0.3n,0.11Δnp/(1p)))1CΔ.\mathbb{P}\Big{(}|R_{1}|\geq n/2+\min\left(0.3n,0.11\Delta\sqrt{np/(1-p)}\right)\Big{)}\geq 1-\dfrac{C}{\Delta}.

The above gives us sufficient control of |R1||R_{1}| to immediately improve several results of Benjamini, Chan, O’Donnell, Tamuz, and Tan [5], which we summarize below.

Theorem 5.

For each ε>0\varepsilon>0, there is a C>0C>0 such that the following holds. Let GGn,pG\sim G_{n,p}, and initially assign each vertex one of two colors (independently uniformly at random). Without loss of generality, suppose this assignment initially colors at least half of the vertices red.

  • (a)

    If pC/np\geq C/\sqrt{n}, then with probability at least 1ε1-\varepsilon, the majority dynamics process on GG will be unanimously red after at most four steps.

  • (b)

    If pC/n1/3p\geq C/n^{1/3}, then with probability at least 1ε1-\varepsilon, the majority dynamics process on GG will be unanimously red after at most three steps.

  • (c)

    If p1C1p\geq 1-C^{-1}, then with probability at least 1ε1-\varepsilon, the majority dynamics process on GG will be unanimously red after at most two steps.

A proof of part (a) was the main theorem of [9], but results (b) and (c) are new. Parts (a) and (b) directly generalize Theorems 2 and 3.9 (respectively) of [5], where these events were only proven to hold with probability at least 0.4. That said, essentially the only difference in our proof is that we are able to use Corollary 4, which strengthens a weaker result of their paper (namely Theorem 3.5 of [5]). Part (c) of our result also follows in this same way, and we include it here mostly for completeness.

As a second application, the following improves upon a recent paper of Tran and Vu [22].

Theorem 6.

Suppose we have a red-blue coloring of a vertex set V=R0B0V=R_{0}\cup B_{0} such that there are initially at least Δ>0\Delta>0 more red vertices than blue. We then independently sample a random graph G|V|,1/2G_{|V|,1/2}. For λ{R,B}\lambda\in\{\texttt{R},\texttt{B}\}, let P(λ)P(\lambda) denote the probability that the majority dynamics process on GG eventually results in the vertices all being colored λ\lambda. Then

P(R)P(B)o(1)+22π02(Δ/2π0.85)ex2/2𝑑x=2Φ(2(Δ/2π0.85))1o(1).P(\texttt{R})-P(\texttt{B})\geq-o(1)+\dfrac{2}{\sqrt{2\pi}}\int_{0}^{2(\Delta/\sqrt{2\pi}-0.85)}e^{-x^{2}/2}dx=2\Phi\left(2(\Delta/\sqrt{2\pi}-0.85)\right)-1-o(1).

In particular, if Δ3\Delta\geq 3, and |V||V| is sufficiently large, then we have P(R)P(B)0.511P(\texttt{R})-P(\texttt{B})\geq 0.511.

Interpreting this in a somewhat playful light, the above result argues that if majority dynamics represents the propagation of opinions in a social network, then one side can obtain a noticeable advantage merely by bribing three extra vertices. This improves on a result of Tran and Vu [22] who argued a similar claim but with an initial lead of at least 1010.

The bottleneck in our argument is our reliance on Proposition 3, for which the value of α\alpha surprisingly becomes important. We make no claim that our constant α\alpha is especially close to the best possible, and sufficiently improving this constant would immediately prove a version of Theorem 6 for an initial lead of 11 vertex. However, it is easy to imagine that such incremental refinements might be somewhat unsatisfactory, and we instead propose the following, which gets to the heart of the matter.

Conjecture 7.

For each fixed p(0,1)p\in(0,1), there is a corresponding value of ε>0\varepsilon>0 such that if there are initially more red vertices than blue, then with probability at least 0.5+εo(1)0.5+\varepsilon-o(1), the majority dynamics process on Gn,pG_{n,p} will result in all vertices being red.

This conjecture, if true, would likely require a new approach, and we also believe the following stronger conjecture is true (which, in light of our central limit theorem, would exactly determine the value of ε\varepsilon in Conjecture 7).

Conjecture 8.

Given any initial coloring of vertices and any probability p(1+ε)log(n)/np\geq(1+\varepsilon)\log(n)/n, if we sample Gn,pG_{n,p} (independently of the initial coloring) then with probability tending to 1, whichever color has more vertices after the first step of the majority dynamics process will have increasingly more vertices after each subsequent step until this is the unanimous color.

Simulations suggest that both conjectures are true, but we are unable to prove either even when pp is constant. One heuristic argument is that after the first step, one side is likely to have a relatively extreme advantage (e.g., in the case of pp constant, after one step one side will be winning by Ω(n)\Omega(\sqrt{n})). If we were to resample the random graph at this point, then such an advantage would be nearly impossible to squander. We aren’t sure how to make this argument rigorous, but we think a resolution of this either way would be very interesting.

Outline of paper

We begin in Section 2 with a brief collection of known technical results. In Section 3, we present our Fourier-analytic proof of our main result, Theorem 1. We continue in Section 4 with Proposition 3 and Theorem 4. We then show quick applications in Section 5 by deriving Theorems 5 and 6. Finally, for the sake of clarity, we prove several lemmas in an Appendix.

2 Known technical lemmas

We will need the following. Throughout, we let Bin(m,p)Bin(m,p) denote a binomial random variable with mean mpmp and variance mp(1p)mp(1-p).

Bernstein’s inequality: Let X1,X2,,XnX_{1},X_{2},\ldots,X_{n} be independent random variables with aiXiai+Ma_{i}\leq X_{i}\leq a_{i}+M for all ii. Then for all t0t\geq 0, we have

(|i=1n(Xi𝔼Xi)|>t)2exp(2t2/2i=1nVar(Xi)+Mt/3).\mathbb{P}\left(\left|\sum_{i=1}^{n}\left(X_{i}-\mathbb{E}X_{i}\right)\right|>t\right)\leq 2\exp\left(\dfrac{-2t^{2}/2}{\sum_{i=1}^{n}\text{Var}(X_{i})+Mt/3}\right).
Lemma 9.

Let W=Bin(n,p)Bin(m,p)W=Bin(n,p)-Bin(m,p) with p{0,1}p\notin\{0,1\}. There is a universal constant CC (independent of mm, nn, and pp) such that the following holds.

  • (i)

    For all tt, we have (W=t)C(m+n)p(1p)\mathbb{P}(W=t)\leq\dfrac{C}{\sqrt{(m+n)p(1-p)}}.

  • (ii)

    For all tt, we have |(W=t+1)(W=t)|C(m+n)p(1p)|\mathbb{P}(W=t+1)-\mathbb{P}(W=t)|\leq\dfrac{C}{(m+n)p(1-p)}.

Proof.

For each of these claims, we may assume that m=0m=0. To see this, we would first condition on the value of either Bin(n,p)Bin(n,p) or Bin(m,p)Bin(m,p) (whichever has smaller variance), and then the claim reduces to that of a single binomial (with a different value of tt and CC).

When W=Bin(n,p)W=Bin(n,p), the first claim is a particularly well-known anticoncentration result. As for the second claim, consider F(t)=(W=t+1)(W=t)F(t)=\mathbb{P}(W=t+1)-\mathbb{P}(W=t). Then by routine manipulation, we see F(t)F(t1)F(t)-F(t-1) is positive iff |t+1/2p(n+1)|<1+4p(1p)(n+1)/2|t+1/2-p(n+1)|<\sqrt{1+4p(1-p)(n+1)}/2. Thus, we need only verify the claim for values of tt with |tnp|2np(1p)|t-np|\leq 2\sqrt{np(1-p)}. For these, we have

F(t)=(W=t)[1+(W=t+1)(W=t)]=(W=t)[1+(nt)p(t+1)(1p)].F(t)=\mathbb{P}(W=t)\left[-1+\dfrac{\mathbb{P}(W=t+1)}{\mathbb{P}(W=t)}\right]=\mathbb{P}(W=t)\left[-1+\dfrac{(n-t)p}{(t+1)(1-p)}\right].

We bound |(W=t)|C/np(1p)|\mathbb{P}(W=t)|\leq C/\sqrt{np(1-p)} by (i), and for tt in the given range, we readily see that the second term is also at most C/np(1p)C/\sqrt{np(1-p)}. ∎

3 Proof of Theorem 1

For each vVv\in V, define

μv={2(Bin(|R0|1,p)Bin(|B0|,p))1if vR0,2(Bin(|B0|1,p)Bin(|R0|,p))1if vB0,\mu_{v}=\begin{cases}2\mathbb{P}\Big{(}Bin(|R_{0}|-1,p)\geq Bin(|B_{0}|,p)\Big{)}-1\qquad&\text{if $v\in R_{0}$},\\ 2\mathbb{P}\Big{(}Bin(|B_{0}|-1,p)\geq Bin(|R_{0}|,p)\Big{)}-1\qquad&\text{if $v\in B_{0}$},\end{cases}

and

ε(v)={1,if vR0,1,if vB0.\varepsilon(v)=\begin{cases}1,\qquad&\text{if $v\in R_{0}$,}\\ -1,\qquad&\text{if $v\in B_{0}$}.\end{cases}

We will use the notation that R1R_{1} (resp. B1B_{1}) denotes the set of red (resp. blue) vertices after one step of majority dynamics. Then let ZvZ_{v} denote the following (centered) indicator

Zv={1μvif vR0R1 or vB0B1,1μvotherwise,Z_{v}=\begin{cases}1-\mu_{v}\qquad&\text{if $v\in R_{0}\cap R_{1}$ or $v\in B_{0}\cap B_{1}$},\\ -1-\mu_{v}\qquad&\text{otherwise},\end{cases}

so that by our choice of μv\mu_{v}, we have 𝔼[Zv]=0\mathbb{E}[Z_{v}]=0. Finally, define ZZ as

Z=rR0ZrbB0Zb=vVε(v)Zv=2|R1|nμr|R0|+μb|B0|.Z=\sum_{r\in R_{0}}Z_{r}-\sum_{b\in B_{0}}Z_{b}=\sum_{v\in V}\varepsilon(v)Z_{v}=2|R_{1}|-n-\mu_{r}|R_{0}|+\mu_{b}|B_{0}|.

Since ZZ is an affine transformation of |R1||R_{1}|, we need only prove a central limit theorem for ZZ. In fact, for each fixed kk, we will prove

𝔼[Zk]={𝒪(nk/2(1/σ+Δ/n))if k is odd𝒪(nk/2(1/σ+Δ/n))+(k1)!!(4nμ)k/2if k is even,\mathbb{E}\left[Z^{k}\right]=\begin{cases}\mathcal{O}\Big{(}n^{k/2}(1/\sigma+\Delta/n)\Big{)}\qquad&\text{if $k$ is odd}\\ \mathcal{O}\Big{(}n^{k/2}(1/\sigma+\Delta/n)\Big{)}+(k-1)!!(4n\mu)^{k/2}\qquad&\text{if $k$ is even,}\end{cases} (1)

where the implied constants depend only on kk.

Derivation of the theorem statement assuming (1)

Since 𝔼[Z]=0\mathbb{E}[Z]=0, establishing (1) would immediately prove statement (i) of the theorem. As for statement (ii), suppose log(1/μ)=o(log(σ))\log(1/\mu)=o(\log(\sigma)) and σ\sigma\to\infty. By Bernstein’s inequality, we have

μ(|Bin(|R0|,p)Bin(|B0|,p)Δp|Δp)2exp[(Δp)2/2np(1p)+Δp/3].\mu\leq\mathbb{P}\Big{(}|Bin(|R_{0}|,p)-Bin(|B_{0}|,p)-\Delta p|\geq\Delta p\Big{)}\leq 2\exp\left[\dfrac{-(\Delta p)^{2}/2}{np(1-p)+\Delta p/3}\right].

Since σ\sigma\to\infty, and log(1/μ)=o(log(σ))\log(1/\mu)=o(\log(\sigma)), this would imply Δpσlog(σ)\Delta p\ll\sigma\sqrt{\log(\sigma)}. Thus, we would need to have log(σ)=𝒪(log(n/Δ))\log(\sigma)=\mathcal{O}(\log(n/\Delta)). Therefore, (1)\eqref{moments of Z} would imply that for all fixed kk

𝔼[Zk](4nμ)k/2={𝒪((1/μk/2)(1/σ+Δ/n))if k is odd𝒪((1/μk/2)(1/σ+Δ/n))+(k1)!!if k is even.\dfrac{\mathbb{E}\left[Z^{k}\right]}{(4n\mu)^{k/2}}=\begin{cases}\mathcal{O}\Big{(}(1/\mu^{k/2})(1/\sigma+\Delta/n)\Big{)}\qquad&\text{if $k$ is odd}\\ \mathcal{O}\Big{(}(1/\mu^{k/2})(1/\sigma+\Delta/n)\Big{)}+(k-1)!!\qquad&\text{if $k$ is even.}\end{cases}

And since log(1/μ)=o(log(σ))\log(1/\mu)=o(\log(\sigma)) and log(1/μ)=o(log(n/Δ))\log(1/\mu)=o(\log(n/\Delta)), these error terms converge to zero for any fixed kk. Thus, the moments of Z/4nμZ/\sqrt{4n\mu} converge to those of the standard Gaussian, which establishes a central limit theorem by the method of moments.

Finally, we need to argue that if Δp=𝒪(σ)\Delta p=\mathcal{O}(\sigma) and σ\sigma\to\infty, then the hypotheses of (ii) hold. For this, we simply note if Δp=𝒪(σ)\Delta p=\mathcal{O}(\sigma), then μ\mu is bounded below,111This is by an immediate application of the usual central limit theorem for the binomial random variables in the definition of μ\mu. and log(1/μ)=𝒪(1)\log(1/\mu)=\mathcal{O}(1).

Set up for establishing (1)

Our proof of (1) is via standard techniques of (pp-biased) Fourier analysis. We first give a brief review and relevant set up, but for more, we refer the reader to O’Donnell’s helpful text [18].

Let =(V2)\mathcal{E}={{V\choose 2}} and consider the space of functions mapping from {1,1}\{-1,1\}^{\mathcal{E}}\to\mathbb{R}. Each input string is indexed by the 22-element subsets of VV, and each x{1,1}\vec{x}\in\{-1,1\}^{\mathcal{E}} corresponds to a graph whose edge set is {e:xe=1}\{e\ :\ x_{e}=1\}. Consider the product measure on {1,1}\{-1,1\}^{\mathcal{E}} where each bit is independently equal to 11 with probability pp. With this, we identify random variables depending on GG|V|,pG\sim G_{|V|,p} and functions f:{1,1}f:\{-1,1\}^{\mathcal{E}}\to\mathbb{R} in the obvious way.

For each SS\subseteq\mathcal{E}, we define the function ΦS:{1,1}\Phi_{S}:\{-1,1\}^{\mathcal{E}}\to\mathbb{R} via

ΦS(x)=eSxe+12p2p(1p),\Phi_{S}(\vec{x})=\prod_{e\in S}\dfrac{x_{e}+1-2p}{2\sqrt{p(1-p)}},

with Φ=1\Phi_{\emptyset}=1. Then it can be checked that for all S,TS,T we have

𝔼[ΦS(x)ΦT(x)]={1if S=T,0otherwise.\mathbb{E}[\Phi_{S}(\vec{x})\Phi_{T}(\vec{x})]=\begin{cases}1\qquad&\text{if $S=T$,}\\ 0\qquad&\text{otherwise.}\end{cases}

These ΦS\Phi_{S} therefore form an orthogonal basis for this space (with inner product f,g=𝔼[fg]\langle f,g\rangle=\mathbb{E}[fg]), and each ff can uniquely be written as

f(x)=Sf^(S)ΦS(x), where f^(S)=𝔼[f(x)ΦS(x)].f(\vec{x})=\sum_{S\subseteq\mathcal{E}}\widehat{f}(S)\Phi_{S}(\vec{x}),\qquad\text{ where }\qquad\widehat{f}(S)=\mathbb{E}[f(\vec{x})\Phi_{S}(\vec{x})].

We will need the following Lemma, which we prove in an Appendix.

Lemma 10.

With notation as before,

  • (i)

    We have 𝔼[Zv2]=1μv2=4μ+𝒪(1/σ)\mathbb{E}[Z_{v}^{2}]=1-\mu_{v}^{2}=4\mu+\mathcal{O}(1/\sigma).

  • (ii)

    Let Γu={{u,v}:vV{u}}\Gamma_{u}=\{\{u,v\}:v\in V\setminus\{u\}\}\subseteq\mathcal{E}. If SΓvS\not\subseteq\Gamma_{v}, then Zv^(S)=0\widehat{Z_{v}}(S)=0. Moreover, Zv^()=0\widehat{Z_{v}}(\emptyset)=0.

  • (iii)

    Zv^(S)=𝒪(1/n)\widehat{Z_{v}}(S)=\mathcal{O}(1/\sqrt{n}) if |S|10k2|S|\leq 10k^{2}

  • (iv)

    Zv^(S)=𝒪(1/n)\widehat{Z_{v}}(S)=\mathcal{O}(1/n) if 2|S|10k22\leq|S|\leq 10k^{2}

  • (v)

    If r,b,vr,b,v are distinct with rR0r\in R_{0} and bB0b\in B_{0}, then e(Zr^(e)Zb^(e))Zv^(e)=𝒪(1nσ)\displaystyle\sum_{e\in\mathcal{E}}\Big{(}\widehat{Z_{r}}(e)-\widehat{Z_{b}}(e)\Big{)}\widehat{Z_{v}}(e)=\mathcal{O}\left(\frac{1}{n\sigma}\right)

  • (vi)

    If SS\neq\emptyset and L1L\geq 1, then |ZvL^(S)|2L|Zv^(S)|2L+1\Big{|}\widehat{\ Z_{v}^{L}\ }(S)\Big{|}\leq 2^{L}\Big{|}\widehat{Z_{v}}(S)\Big{|}\leq 2^{L+1}

Let’s prove the moment estimate assuming Lemma 10.

Estimating 𝔼[Zk]\mathbb{E}[Z^{k}] assuming Lemma 10

For each tuple w=(w1,,wk)Vk\vec{w}=(w_{1},\ldots,w_{k})\in V^{k} consisting of (not necessarily distinct) vertices, set Zw=i=1kZwiZ_{\vec{w}}=\prod_{i=1}^{k}Z_{w_{i}} and ε(w)=i=1kε(wi)\varepsilon(\vec{w})=\prod_{i=1}^{k}\varepsilon(w_{i}). In this notation, we have 𝔼[Zk]=wVkε(w)𝔼[Zw]\mathbb{E}[Z^{k}]=\sum_{\vec{w}\in V^{k}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}].

  • For each ckc\leq k, let ΛcVk\Lambda_{c}\subseteq V^{k} denote the strings in which there are exactly cc vertices each appearing exactly once.

We will argue that for 𝔼[Zk]\mathbb{E}[Z^{k}], the contribution due to Λ0\Lambda_{0} is the main term, and all of the others are small. Namely, we’ll prove

wΛ0ε(w)𝔼[Zw]={𝒪(n(k1)/2)if k is odd,𝒪(n(k1)/2)+(k1)!![|R0|(1μr2)+|B0|(1μb2)]k/2if k is even.\sum_{\vec{w}\in\Lambda_{0}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}]=\begin{cases}\mathcal{O}\left(n^{(k-1)/2}\right)\quad&\text{if $k$ is odd,}\\ \mathcal{O}\left(n^{(k-1)/2}\right)+(k-1)!!\cdot\Big{[}|R_{0}|(1-\mu_{r}^{2})+|B_{0}|(1-\mu_{b}^{2})\Big{]}^{k/2}\quad&\text{if $k$ is even}.\\ \end{cases}

And moreover for all c1c\geq 1, we’ll show wΛcε(w)𝔼[Zw]=𝒪(nk/2(1/σ+Δ/n+1/n))\sum_{\vec{w}\in\Lambda_{c}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}]=\mathcal{O}(n^{k/2}(1/\sigma+\Delta/n+1/\sqrt{n})).

Together, this will yield (1) since Lemma 10.(i) implies [|R0|(1μr2)+|B0|(1μb2)]k/2\Big{[}|R_{0}|(1-\mu_{r}^{2})+|B_{0}|(1-\mu_{b}^{2})\Big{]}^{k/2} is of the form nk/2(4μ±𝒪(1/σ))k/2=(4nμ)k/2±𝒪(nk/2/σ)n^{k/2}(4\mu\pm\mathcal{O}(1/\sigma))^{k/2}=(4n\mu)^{k/2}\pm\mathcal{O}(n^{k/2}/\sigma) [valid since 4μ+𝒪(1/σ)4\mu+\mathcal{O}(1/\sigma) is bounded].

Expansion of 𝔼[Zw]\mathbb{E}[Z_{\vec{w}}]

Suppose TT and UU are disjoint sets of vertices such that |T|+|U|k|T|+|U|\leq k, and suppose for each uUu\in U, there is an integer αu[2,k]\alpha_{u}\in[2,k]. Then we have

𝔼[tTZtuUZuαu]=(Sv)vTUtTZt^(St)uUZuαu^(Su)𝔼[vTUΦSv],\mathbb{E}\left[\prod_{t\in T}Z_{t}\cdot\prod_{u\in U}Z_{u}^{\alpha_{u}}\right]=\sum_{(S_{v})_{v\in T\cup U}}\prod_{t\in T}\widehat{\ Z_{t}\ }(S_{t})\prod_{u\in U}\widehat{\ Z_{u}^{\alpha_{u}}\ }(S_{u})\cdot\mathbb{E}\left[\prod_{v\in T\cup U}\Phi_{S_{v}}\right],

where the sum is taken over all choices of (Sv)vTU(S_{v})_{v\in T\cup U} in |T|+|U|\mathcal{E}^{|T|+|U|} (so that each vTUv\in T\cup U is independently given a subset of \mathcal{E}).

Let E\mathcal{H}\subseteq E denote the 22-element subsets of TUT\cup U, and suppose (Sv)vTU(S_{v})_{v\in T\cup U} corresponds to a term in the above summation that isn’t zero. By Lemma 10.(ii), for each vv we must have SvΓvS_{v}\subseteq\Gamma_{v} (since otherwise one of the Fourier coefficients would be 0). Moreover every edge of uSu\bigcup_{u}S_{u} must appear in at least two sets SvS_{v} (otherwise the expectation of the Φ\Phi functions would be 0). Together, this implies that each edge of uSu\bigcup_{u}S_{u} appears in exactly two sets, that uSu\bigcup_{u}S_{u}\subseteq\mathcal{H} and that for all vv we have Sv=[uSu]ΓvS_{v}=\left[\bigcup_{u}S_{u}\right]\cap\Gamma_{v}. Thus, the above summation becomes

𝔼[tTZtuUZuαu]=HtTZt^(HΓt)uUZuαu^(HΓu).\mathbb{E}\left[\prod_{t\in T}Z_{t}\cdot\prod_{u\in U}Z_{u}^{\alpha_{u}}\right]=\sum_{H\subseteq\mathcal{H}}\prod_{t\in T}\widehat{\ Z_{t}\ }(H\cap\Gamma_{t})\prod_{u\in U}\widehat{\ Z_{u}^{\alpha_{u}}\ }(H\cap\Gamma_{u}).

Since ||10k2|\mathcal{H}|\leq 10k^{2}, Lemma 10.(iii) implies that each term Zt^(HΓt)\widehat{Z_{t}}(H\cap\Gamma_{t}) is 𝒪(1/n)\mathcal{O}(1/\sqrt{n}). Therefore, for any fixed HH\subseteq\mathcal{H} we have

tTZt^(HΓt)uUZuαu^(HΓu)=𝒪(1n|T|/2)uUZuαu^(HΓu)=𝒪(n|T|/2).\prod_{t\in T}\widehat{\ Z_{t}\ }(H\cap\Gamma_{t})\prod_{u\in U}\widehat{\ Z_{u}^{\alpha_{u}}\ }(H\cap\Gamma_{u})=\mathcal{O}\left(\dfrac{1}{n^{|T|/2}}\right)\prod_{u\in U}\widehat{\ Z_{u}^{\alpha_{u}}\ }(H\cap\Gamma_{u})=\mathcal{O}\left(n^{-|T|/2}\right). (2)

If there is a value uUu\in U for which HΓuH\cap\Gamma_{u}\neq\emptyset, then we could apply part (vi) [and (iii)] of Lemma 10 to show the above expression is 𝒪(n(|T|+1)/2)\mathcal{O}(n^{-(|T|+1)/2}). Similarly, if there is a vertex tTt\in T for which |HΓt|1|H\cap\Gamma_{t}|\neq 1, then we could apply (iv) to again conclude this expression is 𝒪(n(|T|+1)/2)\mathcal{O}(n^{-(|T|+1)/2}). Together, since there are at most 2||=𝒪(1)2^{|\mathcal{H}|}=\mathcal{O}(1) choices for HH, this implies

𝔼[tTZtuUZuαu]=𝒪(n(|T|+1)/2)+[uUZuαu^()]HtTZt^(HΓt),\mathbb{E}\left[\prod_{t\in T}Z_{t}\cdot\prod_{u\in U}Z_{u}^{\alpha_{u}}\right]=\mathcal{O}(n^{-(|T|+1)/2})+\left[\prod_{u\in U}\widehat{\ Z_{u}^{\alpha_{u}}\ }(\emptyset)\right]\cdot\sum_{H\in\mathcal{M}}\prod_{t\in T}\widehat{Z_{t}}(H\cap\Gamma_{t}), (3)

where \mathcal{M} is the set of all perfect matchings on the vertex set TT.

Contribution from Λ0\Lambda_{0}

Let Λ0+Λ0\Lambda_{0}^{+}\subseteq\Lambda_{0} denote the set of strings where each vertex appears at most twice. Then we have

wΛ0ε(w)𝔼[Zw]=wΛ0+𝔼[Zw]+𝒪(|Λ0Λ0+|)=wΛ0+𝔼[Zw]+𝒪(n(k1)/2),\sum_{\vec{w}\in\Lambda_{0}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}]=\sum_{\vec{w}\in\Lambda_{0}^{+}}\mathbb{E}[Z_{\vec{w}}]+\mathcal{O}(|\Lambda_{0}\setminus\Lambda_{0}^{+}|)=\sum_{\vec{w}\in\Lambda_{0}^{+}}\mathbb{E}[Z_{\vec{w}}]+\mathcal{O}(n^{(k-1)/2}),

which holds since each term of the summation is at most 𝒪(1)\mathcal{O}(1) and each string in Λ0Λ0+Vk\Lambda_{0}\setminus\Lambda_{0}^{+}\subseteq V^{k} has an element appearing more than twice (and none appearing only once). When kk is odd, Λ0+=\Lambda_{0}^{+}=\emptyset, so the above summation is 𝒪(n(k1)/2)\mathcal{O}(n^{(k-1)/2}). Otherwise, for kk even, (3) [together with Lemma 10.(i)] implies

wΛ0+𝔼[Zw]\displaystyle\sum_{\vec{w}\in\Lambda_{0}^{+}}\mathbb{E}[Z_{\vec{w}}] =\displaystyle= UV,|U|=k/2k!2k/2𝔼[uUZu2]=𝒪(n(k1)/2)+UV,|U|=k/2k!2k/2uUZu2^()\displaystyle\sum_{U\subseteq V,\ \ |U|=k/2}\dfrac{k!}{2^{k/2}}\mathbb{E}\left[\prod_{u\in U}Z_{u}^{2}\right]=\mathcal{O}\left(n^{(k-1)/2}\right)+\sum_{U\subseteq V,\ \ |U|=k/2}\dfrac{k!}{2^{k/2}}\prod_{u\in U}\widehat{\ Z_{u}^{2}\ }(\emptyset)
=\displaystyle= 𝒪(n(k1)/2)+k!2k/2j=0k/2(|R0|j)(|B0|k/2j)(1μr2)j(1μb2)k/2j\displaystyle\mathcal{O}\left(n^{(k-1)/2}\right)+\dfrac{k!}{2^{k/2}}\sum_{j=0}^{k/2}{|R_{0}|\choose j}{|B_{0}|\choose{k/2-j}}(1-\mu_{r}^{2})^{j}(1-\mu_{b}^{2})^{k/2-j}
=\displaystyle= 𝒪(n(k1)/2)+k!2k/2(1(k/2)![|R0|(1μr2)+|B0|(1μb2)]k/2+𝒪(nk/21))\displaystyle\mathcal{O}\left(n^{(k-1)/2}\right)+\dfrac{k!}{2^{k/2}}\left(\dfrac{1}{(k/2)!}\Big{[}|R_{0}|(1-\mu_{r}^{2})+|B_{0}|(1-\mu_{b}^{2})\Big{]}^{k/2}+\mathcal{O}\left(n^{k/2-1}\right)\right)
=\displaystyle= 𝒪(n(k1)/2)+(k1)!![|R0|(1μr2)+|B0|(1μb2)]k/2.\displaystyle\mathcal{O}\left(n^{(k-1)/2}\right)+(k-1)!!\cdot\Big{[}|R_{0}|(1-\mu_{r}^{2})+|B_{0}|(1-\mu_{b}^{2})\Big{]}^{k/2}.

Contribution from Λc\Lambda_{c} for c1c\geq 1

For ease of notation, let us now assume without loss of generality that |R0||B0||R_{0}|\geq|B_{0}|, and let us write R0={r1,r2,,rm}XR_{0}=\{r_{1},r_{2},\ldots,r_{m}\}\cup X and B0={b1,b2,,bm}B_{0}=\{b_{1},b_{2},\ldots,b_{m}\} where |X|=Δ=|R0||B0||X|=\Delta=|R_{0}|-|B_{0}|. For each c1c\geq 1, let Λc+\Lambda_{c}^{+} denote the strings wΛc\vec{w}\in\Lambda_{c} such that (a) w\vec{w} contains no elements of XX, (b) w\vec{w} contains no elements more than twice, and (c) for each ii, if rir_{i} appears in w\vec{w}, then bib_{i} does not. Then using (2) we have

wΛcΛc+ε(w)𝔼[Zw]\displaystyle\sum_{\vec{w}\in\Lambda_{c}\setminus\Lambda_{c}^{+}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}] =\displaystyle= |ΛcΛc+|𝒪(nc/2)=𝒪(nk/21(Δ+n)),\displaystyle|\Lambda_{c}\setminus\Lambda_{c}^{+}|\cdot\mathcal{O}(n^{-c/2})=\mathcal{O}\left(n^{k/2-1}(\Delta+\sqrt{n})\right),

which follows since

|ΛcΛc+|=𝒪(n(k+c)/21|X|)+𝒪(n(k+c1)/2)+𝒪(n(k+c)/21).|\Lambda_{c}\setminus\Lambda_{c}^{+}|=\mathcal{O}\left(n^{(k+c)/2-1}|X|\right)+\mathcal{O}\left(n^{(k+c-1)/2}\right)+\mathcal{O}\left(n^{(k+c)/2-1}\right).

Moreover, we have

wΛc+ε(w)𝔼[Zw]=k!2(kc)/2UVTVU𝔼[tT(ZrtZbt)uU(Zru2+Zbu2)].\sum_{\vec{w}\in\Lambda_{c}^{+}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}]=\dfrac{k!}{2^{(k-c)/2}}\sum_{U\subseteq V}\sum_{T\subseteq V\setminus U}\mathbb{E}\left[\prod_{t\in T}(Z_{r_{t}}-Z_{b_{t}})\prod_{u\in U}(Z_{r_{u}}^{2}+Z_{b_{u}}^{2})\right].

Fixing UU and TT and expanding this product out as in (3), we see

𝔼[tT(ZrtZbt)uU(Zru2+Zbu2)]=𝒪(n(c+1)/2+HtT[Zrt^(HΓrt)Zbt^(HΓbt)]),\mathbb{E}\left[\prod_{t\in T}(Z_{r_{t}}-Z_{b_{t}})\prod_{u\in U}(Z_{r_{u}}^{2}+Z_{b_{u}}^{2})\right]=\mathcal{O}\left(n^{-(c+1)/2}+\sum_{H\in\mathcal{M}}\prod_{t\in T}[\widehat{Z_{r_{t}}}(H\cap\Gamma_{r_{t}})-\widehat{Z_{b_{t}}}(H\cap\Gamma_{b_{t}})]\right),

where \mathcal{M} is the set of graphs HH on the vertex set tT{rt,bt}\bigcup_{t\in T}\{r_{t},b_{t}\} such that for all tTt\in T, we have rt≁btr_{t}\not\sim b_{t} and there is exactly one edge of HH meeting {rt,bt}\{r_{t},b_{t}\}. For distinct i,jTi,j\in T, let i,j\mathcal{M}_{i,j}\subseteq\mathcal{M} denote the subset of graphs containing one of the edges in {ri,bi}×{rj,bj}\{r_{i},b_{i}\}\times\{r_{j},b_{j}\}. Then we have

HtT[Zrt^(HΓrt)Zbt^(HΓbt)]\displaystyle\sum_{H\in\mathcal{M}}\prod_{t\in T}[\widehat{Z_{r_{t}}}(H\cap\Gamma_{r_{t}})-\widehat{Z_{b_{t}}}(H\cap\Gamma_{b_{t}})] =\displaystyle= 1|T|i,jHi,jtT[Zrt^(HΓrt)Zbt^(HΓbt)]\displaystyle\dfrac{1}{|T|}\sum_{i,j}\sum_{H\in\mathcal{M}_{i,j}}\prod_{t\in T}[\widehat{Z_{r_{t}}}(H\cap\Gamma_{r_{t}})-\widehat{Z_{b_{t}}}(H\cap\Gamma_{b_{t}})]
=\displaystyle= 1|T|i,jHi,jt{i,j}[Zrt^(HΓrt)Zbt^(HΓbt)]\displaystyle\dfrac{1}{|T|}\sum_{i,j}\sum_{H\in\mathcal{M}_{i,j}}\prod_{t\notin\{i,j\}}[\widehat{Z_{r_{t}}}(H\cap\Gamma_{r_{t}})-\widehat{Z_{b_{t}}}(H\cap\Gamma_{b_{t}})]
×(14e[Zri^(e)Zbi^(e)][Zri^(e)Zbi^(e)])\displaystyle\qquad\times\left(\dfrac{1}{4}\sum_{e}\Big{[}\widehat{Z_{r_{i}}}(e)-\widehat{Z_{b_{i}}}(e)\Big{]}\cdot\Big{[}\widehat{Z_{r_{i}}}(e)-\widehat{Z_{b_{i}}}(e)\Big{]}\right)
=\displaystyle= 𝒪(n1c/2)𝒪(1nσ)=𝒪(nc/2/σ),\displaystyle\mathcal{O}\left(n^{1-c/2}\right)\cdot\mathcal{O}\left(\dfrac{1}{n\sigma}\right)=\mathcal{O}(n^{-c/2}/\sigma),

where the last line follows from Lemma 10.(v). Therefore, we have

wΛcε(w)𝔼[Zw]=wΛcΛc+ε(w)𝔼[Zw]+wΛc+ε(w)𝔼[Zw]=𝒪(nk/21(Δ+n))+𝒪(nk/2/σ),\sum_{\vec{w}\in\Lambda_{c}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}]=\sum_{\vec{w}\in\Lambda_{c}\setminus\Lambda_{c}^{+}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}]+\sum_{\vec{w}\in\Lambda_{c}^{+}}\varepsilon(\vec{w})\mathbb{E}[Z_{\vec{w}}]=\mathcal{O}\left(n^{k/2-1}(\Delta+\sqrt{n})\right)+\mathcal{O}\left(n^{k/2}/\sigma\right),

as desired.

4 Derivations of Proposition 3 and Corollary 4

Proof of Proposition 3

Proof.

For GGn,pG\sim G_{n,p}, call a pair of sets (R,U)(R,U) bad iff (1) |R|n/2+αn(1p)/p|R|\geq n/2+\alpha\sqrt{n(1-p)/p}, (2) |U|=δn|U|=\delta n, and (3) every vertex of UU has at least as many neighbors in VRV\setminus R as it does in RR. For each pair (R,U)(R,U), we will show under the hypotheses of the proposition that if np(1p)\sqrt{np(1-p)} is large enough (depending only on ε\varepsilon), there is a value q<1q<1 such that ((R,U) is bad)(q/4)n\mathbb{P}((R,U)\text{ is bad})\leq(q/4)^{n}, which will prove that with probability at least 1qn1-q^{n}, the desired conclusion holds for every set RR and every δ(ε,1ε)\delta\in(\varepsilon,1-\varepsilon).

Let R,UR,U be fixed satisfying conditions (1) and (2) above, and define B=VRB=V\setminus R. Note that if we condition on the edges in G[U]G[U], then the events “dR(u)dB(u)d_{R}(u)\leq d_{B}(u)” are independent as uu ranges over UU. Thus, we have

((R,U) is bad|G[U])\displaystyle\mathbb{P}((R,U)\textit{ is bad}\ |\ G[U]) =\displaystyle= uU(dRU(u)dBU(u)dBU(u)dRU(u)|G[U])\displaystyle\prod_{u\in U}\mathbb{P}\Big{(}d_{R\setminus U}(u)-d_{B\setminus U}(u)\leq d_{B\cap U}(u)-d_{R\cap U}(u)\ |\ G[U]\Big{)}
\displaystyle\leq (dRU(u)dBU(u)1+2e(BU)2e(RU)|U||G[U])|U|,\displaystyle\mathbb{P}\left(d_{R\setminus U}(u)-d_{B\setminus U}(u)\leq 1+\dfrac{2e(B\cap U)-2e(R\cap U)}{|U|}\ |\ G[U]\right)^{|U|},

where e(S)e(S) denotes the number of edges of G[S]G[S], and this inequality holds due to the log-concavity of binomial distributions.222Namely, if Q(x)=(Bin(m,p)x)Q(x)=\mathbb{P}(Bin(m,p)\leq x), then a routine computation shows for any x<yx<y we have Q(x)Q(y)Q(x+1)Q(y1)Q(x)Q(y)\leq Q(x+1)Q(y-1). Thus, this concavity holds for linear combinations of binomial random variables as well.

Letting MM denote the random variable M=1+2e(BU)2e(BU)|U|M=1+\dfrac{2e(B\cap U)-2e(B\cap U)}{|U|}, we have

((R,U) bad)t(M=t)[(Bin(|RU|,p)Bin(|BU|,p)t)]|U|.\mathbb{P}((R,U)\textit{ bad})\leq\sum_{t}\mathbb{P}\left(M=t\right)\left[\mathbb{P}\Big{(}Bin(|R\setminus U|,p)-Bin(|B\setminus U|,p)\leq t\Big{)}\right]^{|U|}. (4)

The mean of MM is easily computed as

𝔼[M1]/p\displaystyle\mathbb{E}[M-1]/p =\displaystyle= 2|U|[(|BU|2)(|RU|2)]=|U|2|RU|1+2|RU|/|U|\displaystyle\dfrac{2}{|U|}\left[{|B\cap U|\choose 2}-{|R\cap U|\choose 2}\right]=|U|-2|R\cap U|-1+2|R\cap U|/|U|
=\displaystyle= |B||R|+|RU||BU|1+2|RU|/|U|.\displaystyle|B|-|R|+|R\setminus U|-|B\setminus U|-1+2|R\cap U|/|U|.

And for any fixed 0<γ20<\gamma\leq 2, setting Λ=np(1p)\Lambda=\sqrt{np(1-p)}, Bernstein’s inequality implies that

(M𝔼[M]+γΛ)\displaystyle\mathbb{P}(M\geq\mathbb{E}[M]+\gamma\Lambda) \displaystyle\leq exp(|U|2γ2np(1p)/8[(|RU|2)+(|BU|2)]p(1p)+|U|γnp(1p)/6)\displaystyle\exp\left(\dfrac{-|U|^{2}\gamma^{2}np(1-p)/8}{\left[\begin{pmatrix}|R\cap U|\\ 2\end{pmatrix}+\begin{pmatrix}|B\cap U|\\ 2\end{pmatrix}\right]p(1-p)+|U|\gamma\sqrt{np(1-p)}/6}\right)
\displaystyle\leq exp((1+o(1))|U|2γ2n/4|U|22|RU||BU|)exp((1+o(1))γ2n2),\displaystyle\exp\left(-(1+o(1))\dfrac{|U|^{2}\gamma^{2}n/4}{|U|^{2}-2|R\cap U||B\cap U|}\right)\leq\exp\left(-(1+o(1))\dfrac{\gamma^{2}n}{2}\right),

where the o(1)o(1) term tends to 0 (as np(1p)\sqrt{np(1-p)}\to\infty) uniformly over all γ[0,2]\gamma\in[0,2].

Let ZBin(|RU|,p)Bin(|BU|,p)Z\sim Bin(|R\setminus U|,p)-Bin(|B\setminus U|,p). Since np(1p)\sqrt{np(1-p)}\to\infty and since |RU|+|BU||R\setminus U|+|B\setminus U| is order nn, the central limit theorem applies to ZZ, and for all fixed γ0\gamma\geq 0 we have

(Z𝔼[M]+γΛ)\displaystyle\mathbb{P}\Big{(}Z\leq\mathbb{E}[M]+\gamma\Lambda\Big{)} =\displaystyle= (Z𝔼[Z]+1+(|B||R|)pp+2p|RU|/|U|+γΛ)\displaystyle\mathbb{P}\Big{(}Z\leq\mathbb{E}[Z]+1+(|B|-|R|)p-p+2p|R\cap U|/|U|+\gamma\Lambda\Big{)}
\displaystyle\leq o(1)+(Z𝔼[Z](2αγ)Λ)=o(1)+Φ(γ2α1|U|/n).\displaystyle o(1)+\mathbb{P}\Big{(}Z\leq\mathbb{E}[Z]-(2\alpha-\gamma)\Lambda\Big{)}=o(1)+\Phi\left(\frac{\gamma-2\alpha}{\sqrt{1-|U|/n}}\right).

Thus, putting these together we obtain

((R,U) bad)\displaystyle\mathbb{P}((R,U)\text{ bad}) \displaystyle\leq γ(M=𝔼[M]+γΛ)(Z𝔼[M]+γΛ)|U|\displaystyle\sum_{\gamma}\mathbb{P}\left(M=\mathbb{E}[M]+\gamma\Lambda\right)\mathbb{P}\Big{(}Z\leq\mathbb{E}[M]+\gamma\Lambda\Big{)}^{|U|}
\displaystyle\leq (Z𝔼[M])|U|+(M𝔼[M]+2Λ)\displaystyle\mathbb{P}\Big{(}Z\leq\mathbb{E}[M]\Big{)}^{|U|}+\mathbb{P}\Big{(}M\geq\mathbb{E}[M]+2\Lambda\Big{)}
+0<γ<2(M=𝔼[M]+γΛ)(Z𝔼[M]+γΛ)|U|\displaystyle\qquad+\sum_{0<\gamma<2}\mathbb{P}\left(M=\mathbb{E}[M]+\gamma\Lambda\right)\mathbb{P}\Big{(}Z\leq\mathbb{E}[M]+\gamma\Lambda\Big{)}^{|U|}
\displaystyle\leq (0.2+o(1))n+n2((1+o(1))sup0γ2eγ2/2[Φ(γ2α1|U|/n)]|U|/n)n.\displaystyle(0.2+o(1))^{n}+n^{2}\Bigg{(}(1+o(1))\sup_{0\leq\gamma\leq 2}e^{-\gamma^{2}/2}\left[\Phi\left(\frac{\gamma-2\alpha}{\sqrt{1-|U|/n}}\right)\right]^{|U|/n}\Bigg{)}^{n}.

Finally, by absorbing smaller terms into the o(1)o(1) error, we obtain the desired bound. ∎

Proof of Corollary 4

Proof.

We first prove the second part of the corollary given the first. For this, we assume 5Δp5\leq\Delta p and 25np(1p)25\leq\sqrt{np(1-p)}.

Case 1: First suppose Δpnp(1p)\Delta p\leq\sqrt{np(1-p)}. Note that for all x0x\geq 0, an application of the mean value theorem implies Φ(x)1/2+x/(2πex2)1/2\Phi(x)\geq 1/2+x/(2\pi e^{x^{2}})^{1/2}. With a degree of foresight, let us set t=0.9(Δp)/2πet=0.9(\Delta p)/\sqrt{2\pi e}, which is greater than 11 since 5Δp5\leq\Delta p. Since Δpnp(1p)\Delta p\leq\sqrt{np(1-p)}, we have

nΦ(Δpnp(1p))ntnp(1p)n2+nΔp/2πenp(1p)ntnp(1p)n2+0.02Δnp/(1p).n\Phi\left(\dfrac{\Delta p}{\sqrt{np(1-p)}}\right)-\dfrac{nt}{\sqrt{np(1-p)}}\geq\dfrac{n}{2}+\dfrac{n\Delta p/\sqrt{2\pi e}}{\sqrt{np(1-p)}}-\dfrac{nt}{\sqrt{np(1-p)}}\geq\dfrac{n}{2}+0.02\Delta\sqrt{np/(1-p)}.

Thus, using the first part of the lemma, we obtain

(Zn/2+0.02Δnp/(1p))1Cp(1p)t21CΔ.\mathbb{P}\Big{(}Z\geq n/2+0.02\Delta\sqrt{np/(1-p)}\Big{)}\geq 1-\dfrac{Cp(1-p)}{t^{2}}\geq 1-\dfrac{C}{\Delta}.

Case 2: Now suppose Δpnp(1p)\Delta p\geq\sqrt{np(1-p)}. Numerical computation shows Φ(1)0.040.8\Phi(1)-0.04\geq 0.8, so setting t=0.04np(1p)t=0.04\sqrt{np(1-p)} (which is in fact at least 11 by assumption) we have

(Z0.8n)(ZnΦ(Δpnp(1p))ntnp(1p))1Cp(1p)t21625Cn.\mathbb{P}(Z\geq 0.8n)\geq\mathbb{P}\left(Z\geq n\Phi\left(\dfrac{\Delta p}{\sqrt{np(1-p)}}\right)-\dfrac{nt}{\sqrt{np(1-p)}}\right)\geq 1-\dfrac{Cp(1-p)}{t^{2}}\geq 1-\dfrac{625C}{n}.

Thus, in either case, we have a lower bound of at least 1625C/Δ1-625C/\Delta for each corresponding probability, so in both cases, we know one (if not both) of these probabilities is at least this large, which finishes the proof (with the value of CC replaced by 625C625C).

We now turn our attention to proving the first statement. For each vV,v\in V, let ZvZ_{v} be the indicator for the event that vv will be red after one step of the majority dynamics process. Then 𝔼[Zv](Bin(|R0|,p)Bin(|B0|,p)>0)\mathbb{E}[Z_{v}]\geq\mathbb{P}(Bin(|R_{0}|,p)-Bin(|B_{0}|,p)>0). Setting W=Bin(|R0|,p)Bin(|B0|,p)W=Bin(|R_{0}|,p)-Bin(|B_{0}|,p), we have 𝔼[W]=Δp\mathbb{E}[W]=\Delta p, and using a version of the Berry-Esseen theorem from Shevstova [20], we obtain

𝔼[Zv]\displaystyle\mathbb{E}[Z_{v}] \displaystyle\geq (W>0)=(W𝔼[W]>Δp)=(W𝔼[W]np(1p)>Δpnp(1p))\displaystyle\mathbb{P}(W>0)=\mathbb{P}(W-\mathbb{E}[W]>-\Delta p)=\mathbb{P}\left(\dfrac{W-\mathbb{E}[W]}{\sqrt{np(1-p)}}>\dfrac{-\Delta p}{\sqrt{np(1-p)}}\right)
\displaystyle\geq Φ(Δpnp(1p))0.4748p(1p)3+(1p)p3(n1)p3(1p)3\displaystyle\Phi\left(\dfrac{\Delta p}{\sqrt{np(1-p)}}\right)-0.4748\dfrac{p(1-p)^{3}+(1-p)p^{3}}{\sqrt{(n-1)p^{3}(1-p)^{3}}}
\displaystyle\geq Φ(Δpnp(1p))0.475np(1p).\displaystyle\Phi\left(\dfrac{\Delta p}{\sqrt{np(1-p)}}\right)-\dfrac{0.475}{\sqrt{np(1-p)}}.

Thus, since |R1|=vZv|R_{1}|=\sum_{v}Z_{v}, we have

𝔼|R1|nΦ(Δpnp(1p))0.475np(1p).\mathbb{E}|R_{1}|\geq n\Phi\left(\dfrac{\Delta p}{\sqrt{np(1-p)}}\right)-\dfrac{0.475\sqrt{n}}{\sqrt{p(1-p)}}.

The variance of |R1||R_{1}| is computed in Theorem 1 (i) as Var|R1|nμ+𝒪(Δ+n/σ)cn\text{Var}|R_{1}|\leq n\mu+\mathcal{O}(\Delta+n/\sigma)\leq cn. Finally, we finish by applying Chebyshev’s inequality (with t1t\geq 1) as follows:

(|R1|nΦ(Δpnp(1p))ntnp(1p))\displaystyle\mathbb{P}\left(|R_{1}|\geq n\Phi\left(\dfrac{\Delta p}{\sqrt{np(1-p)}}\right)-\dfrac{nt}{\sqrt{np(1-p)}}\right) \displaystyle\geq (𝔼|R1||R1|(t0.475)np(1p))\displaystyle\mathbb{P}\left(\mathbb{E}|R_{1}|-|R_{1}|\leq\dfrac{(t-0.475)\sqrt{n}}{\sqrt{p(1-p)}}\right)
\displaystyle\geq (𝔼|R1||R1|t(10.475)np(1p))\displaystyle\mathbb{P}\left(\mathbb{E}|R_{1}|-|R_{1}|\leq\dfrac{t(1-0.475)\sqrt{n}}{\sqrt{p(1-p)}}\right)
\displaystyle\geq 1Cp(1p)t2.\displaystyle 1-\dfrac{Cp(1-p)}{t^{2}}.\qed

5 Applications

Proof of Theorem 5

Proof.

Each of the three parts of the theorem are proven from Theorem 4 in an analogous way exactly as in [5]. We will treat the three cases simultaneously, and we assume throughout that nn is large enough to support our argument (valid by choosing CC large enough).

Let R0,B0R_{0},B_{0} be the initial (uniformly random) coloring of GG. Pick a fixed γ(0,1)\gamma\in(0,1) small enough so that (||R0||B0||γn)>1ε/2\mathbb{P}\left(\Big{|}|R_{0}|-|B_{0}|\Big{|}\geq\gamma\sqrt{n}\right)>1-\varepsilon/2, which is possible since |R0|=n|B0||R_{0}|=n-|B_{0}| is distributed as a binomial random variable with mean n/2n/2 and variance n/4n/4. Define the events:

  • E0E_{0} is the event that |R0||B0|γn|R_{0}|-|B_{0}|\geq\gamma\sqrt{n}

  • E1E_{1} is the event that |R1|n/2+min(.03n,0.11γnp/(1p))n/2+0.11γnp|R_{1}|\geq n/2+\min(.03n,0.11\gamma n\sqrt{p/(1-p)})\geq n/2+0.11\gamma n\sqrt{p}

  • E2E_{2} is the event that |B2|32(0.22γp)2p700/γ2p2|B_{2}|\leq\dfrac{32}{(0.22\gamma\sqrt{p})^{2}p}\leq\dfrac{700/\gamma^{2}}{p^{2}}

  • E3E_{3} is the event that |B3|32(0.8)2p50p|B_{3}|\leq\dfrac{32}{(0.8)^{2}p}\leq\dfrac{50}{p}

  • FF is the event that every vertex of GG has degree at least np/2np/2

If p(100/γ)/np\geq(100/\gamma)/\sqrt{n}, we claim that all of these events happen simultaneously with probability at least 1/2ε/4o(1)1/2-\varepsilon/4-o(1), which—by increasing the constants of the theorem statement to allow us to assume nn is sufficiently large—will show they happen with probability at least 1/2ε/21/2-\varepsilon/2. Having shown this, we will be done as follows.

  • (i)

    If p(100/γ)/np\geq(100/\gamma)/\sqrt{n}, then p100/np\geq 100/\sqrt{n}, so we have 200/n<p2200/n<p^{2} implying 50/p<np/450/p<np/4. Therefore, assuming E3E_{3} and FF, we would deterministically need B4=B_{4}=\emptyset.

  • (ii)

    If p15/(γ2n)1/3p\geq 15/(\gamma^{2}n)^{1/3} then (2800/γ2)/n<p3(2800/\gamma^{2})/n<p^{3} implying (700/γ2)/p2<np/4(700/\gamma^{2})/p^{2}<np/4. Thus, assuming E2E_{2} and FF, we would need to have B3=B_{3}=\emptyset.

  • (iii)

    If p1(1+8/γ2)1p\geq 1-(1+8/\gamma^{2})^{-1} then 0.11γp/(1p)0.30.11\gamma\sqrt{p/(1-p)}\geq 0.3. Therefore, assuming E1E_{1}, we would have |B1|0.2n|B_{1}|\leq 0.2n, which is less than np/4np/4 since our assumption on pp (and γ<1\gamma<1) implies p>0.8p>0.8. Therefore, E1E_{1} and FF would imply B2=B_{2}=\emptyset.

Thus, it only remains to show that if p(100/γ)/np\geq(100/\gamma)/\sqrt{n}, then events E0,E1,E2,E3E_{0},E_{1},E_{2},E_{3} and FF all simultaneously occur with probability at least 1/2ε/4o(1)1/2-\varepsilon/4-o(1). We prove these in sequential order except for the claim (E1|E0)=1o(1)\mathbb{P}(E_{1}|E_{0})=1-o(1), which we present last.

First note that (E0)1/2ε/4\mathbb{P}(E_{0})\geq 1/2-\varepsilon/4 by our choice of γ\gamma. To show (E2|E1)=1o(1)\mathbb{P}(E_{2}|E_{1})=1-o(1) and (E3|E2)=1o(1)\mathbb{P}(E_{3}|E_{2})=1-o(1), we refer the reader to Lemmas 3.6 and 3.7 of [5], which prove the following.

Lemma 11 (Benjamini, Chan, O’Donnell, Tamuz, and Tan [5]).

If plog(n)5/np\geq\log(n)^{5}/n, then with probability tending to 1, the graph Gn,pG_{n,p} will satisfy the following property for all α>0\alpha>0. If RBR\cup B is any initial red-blue coloring with |R||B|αn|R|-|B|\geq\alpha n, then after one step of majority dynamics the number of blue vertices will be at most 32/(α2p)32/(\alpha^{2}p).

Therefore, with high probability we have that E1E_{1} will imply E2E_{2}, and E2E_{2} in turn implies E3E_{3} (because (700/γ2)/p2<0.1n(700/\gamma^{2})/p^{2}<0.1n). We control (F)\mathbb{P}(F) by a simple union bound together with Chernoff’s inequality to obtain

(F)1n(Bin(n1,p)<np/2)1nexp(np(1/2)22)=1exp(log(n)np8),\mathbb{P}(F)\geq 1-n\mathbb{P}(Bin(n-1,p)<np/2)\geq 1-n\exp\left(\dfrac{-np(1/2)^{2}}{2}\right)=1-\exp\left(\log(n)-\dfrac{np}{8}\right),

which tends to 11 since p(100/γ)/nlog(n)/np\geq(100/\gamma)/\sqrt{n}\gg\log(n)/n.

Finally, we must show (E1|E0)1o(1)\mathbb{P}(E_{1}|E_{0})\geq 1-o(1). If np(1p)25\sqrt{np(1-p)}\geq 25, then our choice of pp immediately allows us to apply Theorem 4, which implies (E1|E0)1𝒪((γn)1)\mathbb{P}(E_{1}|E_{0})\geq 1-\mathcal{O}((\gamma\sqrt{n})^{-1}). On the other hand, if np(1p)<25\sqrt{np(1-p)}<25, then we would need to have p2/3p\geq 2/3 (since npnp\to\infty) and thus 1p<2(3/2)252/n1000/n1-p<2\cdot(3/2)25^{2}/n\leq 1000/n. Therefore, with probability at least 1o(1)1-o(1), every vertex in GG has degree at least nlog(n)n-\log(n) since

(min deg(G)nlog(n))\displaystyle\mathbb{P}(\text{min deg}(G)\leq n-\log(n)) \displaystyle\leq n(Bin(n,1p)log(n))\displaystyle n\cdot\mathbb{P}(Bin(n,1-p)\geq\log(n))
\displaystyle\leq n(Bin(n,1000/n)log(n))\displaystyle n\cdot\mathbb{P}(Bin(n,1000/n)\geq\log(n))
\displaystyle\leq nexp((3/2+o(1))log(n))=o(1),\displaystyle n\exp\left(-(3/2+o(1))\log(n)\right)=o(1),

where the last inequality follows from Bernstein’s inequality. Therefore, if E0E_{0} occurs, then with probability 1o(1)1-o(1) we would need R1=VR_{1}=V (and in particular (E1|E0)=1o(1)\mathbb{P}(E_{1}|E_{0})=1-o(1)). This is because with probability 1o(1)1-o(1) every vertex of GG has degree at least nlog(n)n-\log(n), so if E0E_{0} occurs, then initially every vertex would have at least |R0||B0|log(n)>0|R_{0}|-|B_{0}|-\log(n)>0 more red neighbors than blue. (So in this case, majority dynamics will typically end after only one step.)

In any case, regardless of the value of np(1p)\sqrt{np(1-p)}, we’ve shown (E1|E0)=1o(1)\mathbb{P}(E_{1}|E_{0})=1-o(1), completing the proof. ∎

Proof of Theorem 6

For this proof, we need the following lemma, as proven in the Appendix.

Lemma 12.

Let V=XRBV=X\cup R\cup B where |R|=|B||R|=|B|, and suppose we independently sample the random graph G|V|,pG_{|V|,p}. Define the sets TR={vR: 0<dB(v)dR(v)dX(v)}T_{R}=\{v\in R\ :\ 0<d_{B}(v)-d_{R}(v)\leq d_{X}(v)\} and TB={vB: 0dB(v)dR(v)<dX(v)}T_{B}=\{v\in B\ :\ 0\leq d_{B}(v)-d_{R}(v)<d_{X}(v)\}, and let σ=2|R|p(1p)\sigma=\sqrt{2|R|p(1-p)}. If 1+|X|p=o(σ)1+|X|p=o\Big{(}\sigma\Big{)} then for any fixed ε>0\varepsilon>0, with high probability we have

||TRTB|𝔼[|TRTB|]||V|(|X|pσ)1/2ε.\Bigg{|}|T_{R}\cup T_{B}|-\mathbb{E}[|T_{R}\cup T_{B}|]\Bigg{|}\leq\sqrt{|V|}\cdot\left(\dfrac{|X|p}{\sigma}\right)^{1/2-\varepsilon}.

Moreover, 𝔼[|TRTB|]=|V||X|pσ2π+𝒪(|V|(1+|X|p)2σ2)\mathbb{E}[|T_{R}\cup T_{B}|]=\dfrac{|V|\cdot|X|p}{\sigma\sqrt{2\pi}}+\mathcal{O}\left(\dfrac{|V|(1+|X|p)^{2}}{\sigma^{2}}\right).

Assuming this lemma, we now proceed to a proof of the theorem at hand.

Proof of Theorem 6.

Suppose there are initially Δ\Delta more red vertices than blue, and by monotonicity of the majority dynamics process, we may first assume Δ=o(n1/5)\Delta=o(n^{1/5}) [since reducing Δ\Delta only decreases the quantity in question, and this would only change the integral by at most o(1)o(1)]. Before we run the majority dynamics process, let us consider the following coupling. We first decompose the vertices of GG as V=XR0B0V=X\cup R_{0}\cup B_{0}, where |R0|=|B0||R_{0}|=|B_{0}| and |X|=Δ|X|=\Delta. The vertices in R0R_{0} will be initially red, the vertices of B0B_{0} initially blue, and the vertices of XX will either be initially all red or initially all blue with these two options being equally likely.

Suppose we sample the random graph Gn,pG_{n,p}, but we have not yet decided on the color for XX. For each λ{R,B}\lambda\in\{\texttt{R},\texttt{B}\}, consider the majority dynamics process that would evolve if XX is initially colored λ\lambda, and define W(λ)W(\lambda) to be either (i) T if this process does not lead to unanimity or (ii) the unanimous color that the process eventually attains.

In this language, we seek to bound (W(R)=R)(W(R)=B)\mathbb{P}(W(\texttt{R})=\texttt{R})-\mathbb{P}(W(\texttt{R})=\texttt{B}). By monotonicity, we have (W(R)=B and W(B)B)=0\mathbb{P}(W(\texttt{R})=\texttt{B}\text{ and }W(\texttt{B})\neq\texttt{B})=0. Using this and exploiting symmetry, we have

(W(R)=R)(W(R)=B)\displaystyle\mathbb{P}(W(\texttt{R})=\texttt{R})-\mathbb{P}(W(\texttt{R})=\texttt{B}) =\displaystyle= (W(B)=B and W(R)=R)+(W(B)=B and W(R)=T)\displaystyle\mathbb{P}(W(\texttt{B})=\texttt{B}\text{ and }W(\texttt{R})=\texttt{R})+\mathbb{P}(W(\texttt{B})=\texttt{B}\text{ and }W(\texttt{R})=\texttt{T})
\displaystyle\geq (W(B)=B and W(R)=R).\displaystyle\mathbb{P}(W(\texttt{B})=\texttt{B}\text{ and }W(\texttt{R})=\texttt{R}).

Let |Ri(λ)||R_{i}(\lambda)| denote the number of red vertices that the graph would have after ii steps if XX is initially colored λ\lambda. By Proposition 3, with probability 1o(1)1-o(1), we have the two implications

|R1(R)|n/2+0.85n|R2(R)|0.501n|R3(R)|0.999n.|R_{1}(\texttt{R})|\geq n/2+0.85\sqrt{n}\quad\Rightarrow\quad|R_{2}(\texttt{R})|\geq 0.501n\quad\Rightarrow\quad|R_{3}(\texttt{R})|\geq 0.999n.

By the same reasoning as in the proof of Theorem 5, this in turn would imply R4(R)=VR_{4}(\texttt{R})=V with probability tending to 1. Therefore, with high probability |R1(R)|n/2+0.85n|R_{1}(\texttt{R})|\geq n/2+0.85\sqrt{n} would imply W(R)=RW(\texttt{R})=\texttt{R} and similarly |R1(B)|<n/20.85n|R_{1}(\texttt{B})|<n/2-0.85\sqrt{n} would imply W(B)=BW(\texttt{B})=\texttt{B}.

Let Z0Z_{0} denote the number of vertices that would be red after one step if the set XX were completely removed from GG. Since |X|=o(n1/5)|X|=o(n^{1/5}), we can use Lemma 12 (see above) with ε=1/6\varepsilon=1/6 to argue that with high probability

||R1(R)||R1(B)|2|X|n/(2π)|\displaystyle\Bigg{|}|R_{1}(\texttt{R})|-|R_{1}(\texttt{B})|-2|X|\sqrt{n/(2\pi)}\Bigg{|} \displaystyle\leq ||R1(R)|Z0|X|n/(2π)|\displaystyle\Big{|}|R_{1}(\texttt{R})|-Z_{0}-|X|\sqrt{n/(2\pi)}\Big{|}
+||R1(B)|Z0+|X|n/(2π)|\displaystyle\qquad\qquad+\Big{|}|R_{1}(\texttt{B})|-Z_{0}+|X|\sqrt{n/(2\pi)}\Big{|}
\displaystyle\leq 2(cn0.4+c|X|2+|X|)Cn0.4,\displaystyle 2\left(cn^{0.4}+c|X|^{2}+|X|\right)\leq Cn^{0.4},

for some constants cc and CC (here, the term c|X|2c|X|^{2} is from the error term in the estimate of 𝔼[|TRTB|]\mathbb{E}[|T_{R}\cup T_{B}|], and |X||X| is to account for the number of red vertices that would be in XX after one step). Thus, we have

(W(R)=R)(W(R)=B)(W(B)=B and W(R)=R)\displaystyle\mathbb{P}(W(\texttt{R})=\texttt{R})-\mathbb{P}(W(\texttt{R})=\texttt{B})\geq\mathbb{P}(W(\texttt{B})=\texttt{B}\text{ and }W(\texttt{R})=\texttt{R})
o(1)+(0.85n|R1(R)|n/2(2|X|/2π0.85)n2Cn0.4).\displaystyle\quad\qquad\qquad\geq-o(1)+\mathbb{P}\left(0.85\sqrt{n}\leq|R_{1}(\texttt{R})|-n/2\leq(2|X|/\sqrt{2\pi}-0.85)\sqrt{n}-2Cn^{0.4}\right).

Finally, using the central limit law for |R1(R)||R_{1}(\texttt{R})| given by Theorem 1 together with the fact that Δ=|X|\Delta=|X|, we obtain that the above expression is at least

(W(R)=R)(W(R)=B)\displaystyle\mathbb{P}(W(\texttt{R})=\texttt{R})-\mathbb{P}(W(\texttt{R})=\texttt{B}) \displaystyle\geq o(1)+(||R1(R)|𝔼|R1(R)|n/2|2(|X|/2π0.85))\displaystyle-o(1)+\mathbb{P}\left(\left|\dfrac{|R_{1}(\texttt{R})|-\mathbb{E}|R_{1}(\texttt{R})|}{\sqrt{n}/2}\right|\leq 2(|X|/\sqrt{2\pi}-0.85)\right)
\displaystyle\geq o(1)+22π02(Δ/2π0.85)ex2/2𝑑x.\displaystyle-o(1)+\dfrac{2}{\sqrt{2\pi}}\int_{0}^{2(\Delta/\sqrt{2\pi}-0.85)}e^{-x^{2}/2}dx.\qed

References

  • [1] Mohammed Amin Abdullah and Moez Draief. Global majority consensus by local majority polling on graphs of a given degree sequence. Discrete Applied Mathematics, 180:1–10, 2015.
  • [2] Gideon Amir, Rangel Baldasso, and Nissan Beilin. Majority dynamics and the median process: connections, convergence and some new conjectures. arXiv preprint arXiv:1911.08613, 2019.
  • [3] Venkatesh Bala and Sanjeev Goyal. Learning from neighbours. The review of economic studies, 65(3):595–621, 1998.
  • [4] József Balogh and Boris G Pittel. Bootstrap percolation on the random regular graph. Random Structures & Algorithms, 30(1-2):257–286, 2007.
  • [5] Itai Benjamini, Siu-On Chan, Ryan O’Donnell, Omer Tamuz, and Li-Yang Tan. Convergence, unanimity and disagreement in majority dynamics on unimodular graphs and random graphs. Stochastic Processes and their Applications, 126(9):2719–2733, 2016.
  • [6] Emilio De Santis and Charles M Newman. Convergence in energy-lowering (disordered) stochastic spin systems. Journal of statistical physics, 110(1-2):431–442, 2003.
  • [7] Sinziana M Eckner and Charles M Newman. Fixation to consensus on tree-related graphs. ALEA, 12(1):357–374, 2015.
  • [8] Glenn Ellison and Drew Fudenberg. Rules of thumb for social learning. Journal of political Economy, 101(4):612–643, 1993.
  • [9] Nikolaos Fountoulakis, Mihyun Kang, and Tamás Makai. Resolution of a conjecture on majority dynamics: rapid stabilisation in dense random graphs. arXiv preprint arXiv:1910.05820, 2019.
  • [10] Bernd Gärtner and Ahad N. Zehmakan. Color war: Cellular automata with majority-rule. In Frank Drewes, Carlos Martín-Vide, and Bianca Truthe, editors, Language and Automata Theory and Applications, pages 393–404, Cham, 2017. Springer International Publishing.
  • [11] Bernd Gärtner and Ahad N Zehmakan. Majority model on random regular graphs. In Latin American Symposium on Theoretical Informatics, pages 572–583. Springer, 2018.
  • [12] Yuval Ginosar and Ron Holzman. The majority action on infinite graphs: strings and puppets. Discrete Mathematics, 215(1-3):59–71, 2000.
  • [13] Eric Goles and Jorge Olivos. Periodic behaviour of generalized threshold functions. Discrete mathematics, 30(2):187–189, 1980.
  • [14] Mark Granovetter. Threshold models of collective behavior. American journal of sociology, 83(6):1420–1443, 1978.
  • [15] Yashodhan Kanoria, Andrea Montanari, et al. Majority dynamics on trees and the dynamic cavity method. The Annals of Applied Probability, 21(5):1694–1748, 2011.
  • [16] Elchanan Mossel, Joe Neeman, and Omer Tamuz. Majority dynamics and aggregation of information in social networks. Autonomous Agents and Multi-Agent Systems, 28(3):408–429, 2014.
  • [17] Vu Xuan Nguyen, Gaoxi Xiao, Xin-Jian Xu, Qingchu Wu, and Cheng-Yi Xia. Dynamics of opinion formation under majority rules on complex social networks. Scientific Reports, 10(1):1–9, 2020.
  • [18] Ryan O’Donnell. Analysis of boolean functions. Cambridge University Press, 2014.
  • [19] Devavrat Shah. Gossip algorithms. Now Publishers Inc, 2009.
  • [20] Irina Shevtsova. On the absolute constants in the berry-esseen type inequalities for identically distributed summands. arXiv preprint arXiv:1111.6554, 2011.
  • [21] Omer Tamuz and Ran J Tessler. Majority dynamics and the retention of information. Israel Journal of Mathematics, 206(1):483–507, 2015.
  • [22] Linh Tran and Van Vu. Reaching a consensus on random networks: The power of few. arXiv preprint arXiv:1911.10279, 2019.
  • [23] Ahad N. Zehmakan. Opinion Forming in Erdös-Rényi Random Graph and Expanders. In Wen-Lian Hsu, Der-Tsai Lee, and Chung-Shou Liao, editors, 29th International Symposium on Algorithms and Computation (ISAAC 2018), volume 123 of Leibniz International Proceedings in Informatics (LIPIcs), pages 4:1–4:13, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
  • [24] Ahad N Zehmakan. Two phase transitions in two-way bootstrap percolation. arXiv preprint arXiv:1809.10764, 2018.

6 Appendix

Proof of Lemma 10

Proof.

We prove each claim in turn.

Claims (i) and (ii):

These essentially follow immediately from the facts that Zv+μv{1,1}Z_{v}+\mu_{v}\in\{-1,1\}, that ZvZ_{v} has mean 0, and that ZvZ_{v} depends only on Γv\Gamma_{v}. We’ll prove 1μv2=4μ+𝒪(1/σ)1-\mu_{v}^{2}=4\mu+\mathcal{O}(1/\sigma), in the case that v=rR0v=r\in R_{0} (the case vB0v\in B_{0} being analogous), for which we have

1μr2\displaystyle 1-\mu_{r}^{2} =\displaystyle= 1(2(Bin(|R0|1,p)Bin(|B0|,p))1)2\displaystyle 1-\Bigg{(}2\mathbb{P}\Big{(}Bin(|R_{0}|-1,p)\geq Bin(|B_{0}|,p)\Big{)}-1\Bigg{)}^{2}
=\displaystyle= 4(Bin(|R0|1,p)Bin(|B0|,p))(Bin(|R0|1,p)<Bin(|B0|,p)).\displaystyle 4\cdot\mathbb{P}\Big{(}Bin(|R_{0}|-1,p)\geq Bin(|B_{0}|,p)\Big{)}\cdot\mathbb{P}\Big{(}Bin(|R_{0}|-1,p)<Bin(|B_{0}|,p)\Big{)}.

And then we simply use the facts333To prove these, without loss of generality, suppose yx=Ω(n)y\leq x=\Omega(n). We then condition on the value of Bin(y,p)Bin(y,p) and use that (Bin(x,p)=t)=𝒪(1/xp(1p))=𝒪(1/σ)\mathbb{P}(Bin(x,p)=t)=\mathcal{O}(1/\sqrt{xp(1-p)})=\mathcal{O}(1/\sigma) uniformly for all tt. that if x+y=Ω(n)x+y=\Omega(n) then [with σ=np(1p)\sigma=\sqrt{np(1-p)}]

(Bin(x,p)Bin(y,p))\displaystyle\mathbb{P}\Big{(}Bin(x,p)\geq Bin(y,p)\Big{)} =\displaystyle= (Bin(x+1,p)Bin(y,p))+𝒪(1/σ),\displaystyle\mathbb{P}\Big{(}Bin(x+1,p)\geq Bin(y,p)\Big{)}+\mathcal{O}(1/\sigma),
(Bin(x,p)Bin(y,p))\displaystyle\mathbb{P}\Big{(}Bin(x,p)\geq Bin(y,p)\Big{)} =\displaystyle= (Bin(x,p)Bin(y+1,p))+𝒪(1/σ),\displaystyle\mathbb{P}\Big{(}Bin(x,p)\geq Bin(y+1,p)\Big{)}+\mathcal{O}(1/\sigma),
(Bin(x,p)>Bin(y,p))\displaystyle\mathbb{P}\Big{(}Bin(x,p)>Bin(y,p)\Big{)} =\displaystyle= (Bin(x,p)Bin(y,p))+𝒪(1/σ).\displaystyle\mathbb{P}\Big{(}Bin(x,p)\geq Bin(y,p)\Big{)}+\mathcal{O}(1/\sigma).
Claims (iii) and (iv):

For these, suppose S\emptyset\neq S\subseteq\mathcal{E}. Write SS as S=S1S2S=S_{1}\cup S_{2} where S1=S(R0×B0)S_{1}=S\setminus(R_{0}\times B_{0}) and S2=S(R0×B0)S_{2}=S\cap(R_{0}\times B_{0}) [so SS consists of |S1||S_{1}| edges from vv to a vertex of the same color and |S2||S_{2}| edges from vv to the other color]. Define the random variable

Wv;S:={Bin(|R0|1|S1|,p)Bin(|B0||S2|,p) if vR0Bin(|B0|1|S1|,p)Bin(|R0||S2|,p) if vB0.W_{v;S}:=\begin{cases}Bin(|R_{0}|-1-|S_{1}|,p)-Bin(|B_{0}|-|S_{2}|,p)\qquad&\text{ if $v\in R_{0}$}\\ Bin(|B_{0}|-1-|S_{1}|,p)-Bin(|R_{0}|-|S_{2}|,p)\qquad&\text{ if $v\in B_{0}$}.\end{cases}

Then by conditioning on the edges in SS, we have (for SS\neq\emptyset)

Zv^(S)\displaystyle\widehat{Z_{v}}(S) =\displaystyle= 𝔼[Zv(x)ΦS(x)]\displaystyle\mathbb{E}\left[Z_{v}(\vec{x})\Phi_{S}(\vec{x})\right]
=\displaystyle= IS1JS2p|I|+|J|(1p)|S||I||J|[2(Wv;S|J||I|)1]\displaystyle\sum_{I\subseteq S_{1}}\sum_{J\subseteq S_{2}}p^{|I|+|J|}(1-p)^{|S|-|I|-|J|}\bigg{[}2\mathbb{P}\Big{(}W_{v;S}\geq|J|-|I|\Big{)}-1\bigg{]}
×(22p2p(1p))|I|+|J|(2p2p(1p))|S||I||J|\displaystyle\qquad\times\left(\dfrac{2-2p}{2\sqrt{p(1-p)}}\right)^{|I|+|J|}\left(\dfrac{-2p}{2\sqrt{p(1-p)}}\right)^{|S|-|I|-|J|}
=\displaystyle= 2(p(1p))|S|IJS(1)|I|+|J|(Wv;S|J||I|).\displaystyle 2\left(-\sqrt{p(1-p)}\right)^{|S|}\sum_{I\cup J\subseteq S}(-1)^{|I|+|J|}\mathbb{P}\Big{(}W_{v;S}\geq|J|-|I|\Big{)}.

Since SS\neq\emptyset, let eSe\in S. If eS1e\in S_{1}, then by conditioning on whether eIJe\in I\cup J, we have

IJS(1)|I|+|J|(Wv;S|J||I|)\displaystyle\sum_{I\cup J\subseteq S}(-1)^{|I|+|J|}\mathbb{P}\Big{(}W_{v;S}\geq|J|-|I|\Big{)}
=IJ(S{e})(1)|I|+|J|[(WS|J||I|)(Wv;S|J||I|1)]\displaystyle\qquad\quad=\sum_{I\cup J\subseteq(S\setminus\{e\})}(-1)^{|I|+|J|}\Bigg{[}\mathbb{P}\Big{(}W_{S}\geq|J|-|I|\Big{)}-\mathbb{P}\Big{(}W_{v;S}\geq|J|-|I|-1\Big{)}\Bigg{]}
=IJ(S{e})(1)|I|+|J|[(Wv;S=|J||I|1)].\displaystyle\qquad\quad=-\sum_{I\cup J\subseteq(S\setminus\{e\})}(-1)^{|I|+|J|}\Bigg{[}\mathbb{P}\Big{(}W_{v;S}=|J|-|I|-1\Big{)}\Bigg{]}.

Similarly if eS2e\in S_{2}, then Zv^(S)=2(p(1p))|S|IJ(S{e})(1)|I|+|J|(Wv;S=|J||I|)\displaystyle\widehat{Z_{v}}(S)=2(-\sqrt{p(1-p)})^{|S|}\sum_{I\cup J\subseteq(S\setminus\{e\})}(-1)^{|I|+|J|}\mathbb{P}\Big{(}W_{v;S}=|J|-|I|\Big{)}. In either case, if |S|10k2|S|\leq 10k^{2} is bounded, these probabilities are at most 𝒪(1/np(1p))\mathcal{O}(1/\sqrt{np(1-p)}), establishing (iii). We prove (iv) by picking some other edge eSe^{\prime}\in S and conditioning on whether or not eIJe^{\prime}\in I\cup J to obtain

|Z^v(S)|\displaystyle|\widehat{Z}_{v}(S)| \displaystyle\leq 2(p(1p))|S|IJ(S{e,e})supL|(Wv;S=L+1)(Wv;S=L)|\displaystyle 2(\sqrt{p(1-p)})^{|S|}\sum_{I\cup J\subseteq(S\setminus\{e,e^{\prime}\})}\sup_{L}\bigg{|}\mathbb{P}(W_{v;S}=L+1)-\mathbb{P}(W_{v;S}=L)\bigg{|}
=\displaystyle= 𝒪((p(1p))|S|supL|(Wv;S=L+1)(Wv;S=L)|).\displaystyle\mathcal{O}\left(\left(\sqrt{p(1-p)}\right)^{|S|}\sup_{L}\bigg{|}\mathbb{P}(W_{v;S}=L+1)-\mathbb{P}(W_{v;S}=L)\bigg{|}\right).

And this supremum is 𝒪(1np(1p))\mathcal{O}\left(\frac{1}{np(1-p)}\right) by Lemma 9.

Claim (v):

For this, we first note that by claim (ii), this sum reduces to only two terms:

eE(Zr^(e)Zb^(e))Zv^(e)\displaystyle\sum_{e\in E}\Big{(}\widehat{Z_{r}}(e)-\widehat{Z_{b}}(e)\Big{)}\widehat{Z_{v}}(e) =\displaystyle= Zr^(rv)Zv^(rv)Zb^(bv)Zv^(bv)\displaystyle\widehat{Z_{r}}(rv)\widehat{Z_{v}}(rv)-\widehat{Z_{b}}(bv)\widehat{Z_{v}}(bv)
=\displaystyle= Zv^(rv)(Zr^(rv)+Zb^(bv))Zb^(bv)(Zv^(bv)+Zv^(rv)).\displaystyle\widehat{Z_{v}}(rv)\Big{(}\widehat{Z_{r}}(rv)+\widehat{Z_{b}}(bv)\Big{)}-\widehat{Z_{b}}(bv)\Big{(}\widehat{Z_{v}}(bv)+\widehat{Z_{v}}(rv)\Big{)}.

Suppose rR0r^{\prime}\in R_{0} and bB0b^{\prime}\in B_{0} are additional vertices. From our computation in (iii), we have

Zr^(rr)\displaystyle\widehat{Z_{r}}(rr^{\prime}) =\displaystyle= 2p(1p)(Bin(|R0|2,p)Bin(|B0|,p)=1),\displaystyle 2\sqrt{p(1-p)}\cdot\mathbb{P}\Big{(}Bin(|R_{0}|-2,p)-Bin(|B_{0}|,p)=-1\Big{)},
Zb^(bb)\displaystyle\widehat{Z_{b}}(bb^{\prime}) =\displaystyle= 2p(1p)(Bin(|B0|2,p)Bin(|R0|,p)=1),\displaystyle 2\sqrt{p(1-p)}\cdot\mathbb{P}\Big{(}Bin(|B_{0}|-2,p)-Bin(|R_{0}|,p)=-1\Big{)},
Zr^(rb)=Zb^(rb)\displaystyle\widehat{Z_{r}}(rb)=\widehat{Z_{b}}(rb) =\displaystyle= 2p(1p)(Bin(|R0|1,p)Bin(|B0|1,p)=0).\displaystyle-2\sqrt{p(1-p)}\cdot\mathbb{P}\Big{(}Bin(|R_{0}|-1,p)-Bin(|B_{0}|-1,p)=0\Big{)}.

By Lemma 9, the above three probabilities are all within Cnp(1p)\dfrac{C}{np(1-p)} of each other, which implies Zr^(rv)+Zb^(bv)\widehat{Z_{r}}(rv)+\widehat{Z_{b}}(bv) as well as Zv^(bv)+Zv^(rv)\widehat{Z_{v}}(bv)+\widehat{Z_{v}}(rv) are both at most 𝒪(1np(1p))\mathcal{O}\left(\dfrac{1}{n\sqrt{p(1-p)}}\right). Combining this with claim (iii) then establishes (v).

Claim (vi):

For this, we use the fact that Zv+μv{1,1}Z_{v}+\mu_{v}\in\{-1,1\} to obtain

ZvL\displaystyle Z_{v}^{L} =\displaystyle= (Zv+μvμv)L=i=0L(Li)(Zv+μv)i(μv)Li\displaystyle(Z_{v}+\mu_{v}-\mu_{v})^{L}=\sum_{i=0}^{L}{L\choose i}(Z_{v}+\mu_{v})^{i}(-\mu_{v})^{L-i}
=\displaystyle= Zv(1μv)L(1)L(1+μv)L2+(1μv2)(1μv)L1+(1)L(1+μv)L12.\displaystyle Z_{v}\dfrac{(1-\mu_{v})^{L}-(-1)^{L}(1+\mu_{v})^{L}}{2}+(1-\mu_{v}^{2})\dfrac{(1-\mu_{v})^{L-1}+(-1)^{L}(1+\mu_{v})^{L-1}}{2}.\qed

Proof of Lemma 12

Proof.

We will prove that with high probability |TR||T_{R}| and |TB||T_{B}| are both close to their means and that each has expected value of about (|V|/2)|X|pσ2π\frac{(|V|/2)|X|p}{\sigma\sqrt{2\pi}}, which will finish the proof since TRTB=T_{R}\cap T_{B}=\emptyset. For this, we’ll focus on TRT_{R} (the case of TBT_{B} being almost identical). In what follows, let m=|R|=|B|m=|R|=|B|.

For vRv\in R, let TvT_{v} denote the indicator for the event that 0<dB(v)dR(v)dX(v)0<d_{B}(v)-d_{R}(v)\leq d_{X}(v). Then we have |TR|=vRTv|T_{R}|=\sum_{v\in R}T_{v}. So the expected value of |TR||T_{R}| is mm times 𝔼[Tv]\mathbb{E}[T_{v}], and moreover

𝔼[Tv]\displaystyle\mathbb{E}[T_{v}] =\displaystyle= (0<Bin(m,p)Bin(m1,p)Bin(|X|,p))\displaystyle\mathbb{P}(0<Bin(m,p)-Bin(m-1,p)\leq Bin(|X|,p))
=\displaystyle= k=1|X|i=1k(Bin(m,p)Bin(m1,p)=i)(Bin(|X|,p)=k)\displaystyle\sum_{k=1}^{|X|}\sum_{i=1}^{k}\mathbb{P}(Bin(m,p)-Bin(m-1,p)=i)\mathbb{P}(Bin(|X|,p)=k)

Let W=Bin(m,p)Bin(m1,p)W=Bin(m,p)-Bin(m-1,p). For each ii we have |(W=i)(W=0)|Ci/σ2\Big{|}\mathbb{P}(W=i)-\mathbb{P}(W=0)\Big{|}\leq Ci/\sigma^{2}, by Lemma 9. Therefore we have

𝔼[Tv]\displaystyle\mathbb{E}[T_{v}] =\displaystyle= k=1|X|i=1k((W=0)+𝒪(i/σ2))(Bin(|X|,p)=k)\displaystyle\sum_{k=1}^{|X|}\sum_{i=1}^{k}\left(\mathbb{P}(W=0)+\mathcal{O}(i/\sigma^{2})\right)\mathbb{P}(Bin(|X|,p)=k)
=\displaystyle= (W=0)𝔼[Bin(|X|,p))]+𝒪(1σ2k=1|X|k(k+1)2(Bin(|X|,p)=k))\displaystyle\mathbb{P}(W=0)\mathbb{E}[Bin(|X|,p))]+\mathcal{O}\left(\dfrac{1}{\sigma^{2}}\sum_{k=1}^{|X|}\dfrac{k(k+1)}{2}\mathbb{P}(Bin(|X|,p)=k)\right)
=\displaystyle= (W=0)|X|p+𝒪(1mp(1p)|X|p(1+|X|p))=(W=0)|X|p+𝒪((1+|X|p)2σ2)\displaystyle\mathbb{P}(W=0)|X|p+\mathcal{O}\left(\dfrac{1}{mp(1-p)}|X|p(1+|X|p)\right)=\mathbb{P}(W=0)|X|p+\mathcal{O}\left(\dfrac{(1+|X|p)^{2}}{\sigma^{2}}\right)
=\displaystyle= |X|pσ2π+𝒪((1+|X|p)2σ2).\displaystyle\dfrac{|X|p}{\sigma\sqrt{2\pi}}+\mathcal{O}\left(\dfrac{(1+|X|p)^{2}}{\sigma^{2}}\right).

Moreover, for uvu\neq v, the covariance Cov(Tu,Tv)Cov(T_{u},T_{v}) is given by

Cov(Tu,Tv)\displaystyle Cov(T_{u},T_{v}) =\displaystyle= p(1p)[𝔼(Tu|uv)𝔼(Tu|u≁v)][𝔼(Tv|uv)𝔼(Tv|u≁v)]\displaystyle p(1-p)\Bigg{[}\mathbb{E}(T_{u}|u\sim v)-\mathbb{E}(T_{u}|u\not\sim v)\Bigg{]}\Bigg{[}\mathbb{E}(T_{v}|u\sim v)-\mathbb{E}(T_{v}|u\not\sim v)\Bigg{]}
=\displaystyle= p(1p)[(Bin(m,p)Bin(m2,p)=Bin(|X|,p)+1)\displaystyle p(1-p)\Bigg{[}\mathbb{P}(Bin(m,p)-Bin(m-2,p)=Bin(|X|,p)+1)
(Bin(m,p)Bin(m2,p)=1)]2\displaystyle\qquad\qquad\qquad\qquad-\mathbb{P}(Bin(m,p)-Bin(m-2,p)=1)\Bigg{]}^{2}
\displaystyle\leq p(1p)𝔼[(CBin(|X|,p)mp(1p))2]=𝒪((1+|X|p)2mσ2),\displaystyle p(1-p)\mathbb{E}\left[\left(\dfrac{CBin(|X|,p)}{mp(1-p)}\right)^{2}\right]=\mathcal{O}\left(\dfrac{(1+|X|p)^{2}}{m\sigma^{2}}\right),

where the last inequality again follows from Lemma 9. Thus, the variance of |TR||T_{R}| is satisfies

Var(|TR|)\displaystyle Var(|T_{R}|) =\displaystyle= mVar(Tu)+m(m1)Cov(Tu,Tv)=𝒪(m|X|pσ)+𝒪(m(1+p|X|)2σ2)\displaystyle mVar(T_{u})+m(m-1)Cov(T_{u},T_{v})=\mathcal{O}\left(\dfrac{m|X|p}{\sigma}\right)+\mathcal{O}\left(\dfrac{m(1+p|X|)^{2}}{\sigma^{2}}\right)
=\displaystyle= 𝒪(m|X|p/σ)=𝒪(|V|p|X|σ)=o(|V|[|X|pσ]12ε),\displaystyle\mathcal{O}(m|X|p/\sigma)=\mathcal{O}\left(|V|p\dfrac{|X|}{\sigma}\right)=o\left(|V|\cdot\left[\dfrac{|X|p}{\sigma}\right]^{1-2\varepsilon}\right),

which holds because of the assumption that 1+|X|p=o(σ)1+|X|p=o\Big{(}\sigma\Big{)}. Thus, the second-moment method provides the desired concentration statement of |TR||T_{R}| about its mean. ∎