
Tight Bounds on Probabilistic Zero Forcing on Hypercubes and Grids

Natalie C. Behague Department of Mathematics, Ryerson University, Toronto, ON, Canada [email protected] Trent Marbach Department of Mathematics, Ryerson University, Toronto, ON, Canada [email protected]  and  Paweł Prałat Department of Mathematics, Ryerson University, Toronto, ON, Canada [email protected]
Abstract.

Zero forcing is a deterministic iterative graph colouring process in which vertices are coloured either blue or white and, in every round, any blue vertex that has a single white neighbour forces that neighbour to become blue. Here we study probabilistic zero forcing, where blue vertices have a non-zero probability of forcing each white neighbour to become blue. We explore the propagation time for probabilistic zero forcing on hypercubes and grids.

Key words and phrases:
zero forcing, probabilistic zero forcing, hypercubes, grids

1. Introduction

1.1. Definition of Zero Forcing

Zero forcing is an iterative graph colouring procedure which can model certain real-world propagation and search processes such as rumor spreading. Given a graph G and a set of marked, or blue, vertices Z\subseteq V(G), the process of zero forcing involves the application of the zero forcing colour change rule in which a blue vertex u forces a non-blue (white) vertex v to become blue if N(u)\setminus Z=\{v\}, that is, u forces v to become blue if v is the only white neighbour of u.

We say that Z is a zero forcing set if, when starting with Z as the set of initially blue vertices and iteratively applying the zero forcing colour change rule until no more vertices can be forced blue, the entire vertex set of G becomes blue. Note that the order in which forces happen is arbitrary: if u is in a position in which it can force v, this property is not destroyed when other vertices are turned blue. As a result, we may process vertices sequentially (in any order), or all vertices that are ready to turn blue can do so simultaneously. The zero forcing number, denoted z(G), is the cardinality of the smallest zero forcing set of G.

Zero forcing has sparked a lot of interest recently. Some work has been done on calculating or bounding the zero forcing number for specific structures such as graph products [12], graphs with large girth [8] and random graphs [1, 16], while others have studied variants of zero forcing such as connected zero forcing [4] or positive semi-definite zero forcing [2].

While zero forcing is a relatively new concept (first introduced in [12]), the problem has many applications to other branches of mathematics and physics. For example, zero forcing can give insight into linear and quantum controllability for systems that stem from networks. More precisely, in [5], it was shown that for both classical and quantum control systems that can be modelled by networks, the existence of certain zero forcing sets guarantees controllability of the system.

Another area closely related to zero forcing is power domination [19], designed to model the situation where an electric company needs to continually monitor its power network. One method is to place phase measurement units (PMUs) at selected points of the network; to reduce the associated cost, one asks for the least number of PMUs necessary to observe a specific network fully. To be more specific, given a network modelled by a simple graph G, a PMU placed at a vertex will be able to observe every adjacent vertex. Furthermore, if an observed vertex has exactly one unobserved neighbour, this observed vertex can observe that neighbour. In this way, power domination involves an observation rule compatible with the zero forcing colour change rule.

In the present paper we are concerned with a parameter associated with zero forcing known as the propagation time, the smallest number of rounds necessary for a zero forcing set of size z(G) to turn the entire graph blue. More formally, given a graph G and a zero forcing set Z, we generate a finite sequence of sets Z_{0}\subsetneq Z_{1}\subsetneq\dots\subsetneq Z_{t}, where Z_{0}=Z, Z_{t}=V(G), and given Z_{i}, we define Z_{i+1}=Z_{i}\cup Y_{i}, where Y_{i}\subseteq V(G)\setminus Z_{i} is the set of white vertices that can be forced in the next round if Z_{i} is the set of blue vertices. Then the propagation time of Z, denoted pt(G,Z), is defined to be t. The propagation time of the graph G is then given by pt(G)=\min_{Z}pt(G,Z), where the minimum is taken over all zero forcing sets Z of cardinality z(G). The propagation time for zero forcing has been studied in [13].
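To make the colour change rule and the propagation time concrete, the following minimal Python sketch (our own illustration; the representation of a graph as a dictionary of neighbour sets and the function name are assumptions, not notation from the literature) applies all possible forces simultaneously in rounds and reports pt(G,Z).

    def zf_propagation_time(adj, Z):
        """Deterministic zero forcing: rounds until no force is possible."""
        blue = set(Z)
        rounds = 0
        while True:
            # A blue vertex u forces its unique white neighbour, if it has one;
            # all forces of a round are applied simultaneously.
            forced = {next(iter(adj[u] - blue))
                      for u in blue if len(adj[u] - blue) == 1}
            if not forced:
                return rounds, blue
            blue |= forced
            rounds += 1

    # Example: on the path 0-1-2-3-4, the endpoint {0} is a zero forcing set
    # of minimum size, and the sketch reports propagation time 4.
    path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 5} for i in range(5)}
    print(zf_propagation_time(path, {0}))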

1.2. Definition of Probabilistic Zero Forcing

Zero forcing was initially formulated to bound a problem in linear algebra known as the min-rank problem [12]. In addition to this application to mathematics, zero forcing also models many real-world propagation processes. One specific application of zero forcing could be to rumor spreading, but the deterministic nature of zero forcing may not fit the chaotic nature of real-life situations. As such, probabilistic zero forcing has also been proposed and studied, where blue vertices have a non-zero probability of forcing white neighbours, even if there is more than one white neighbour. More specifically, given a graph G, a set of blue vertices Z, and vertices u\in Z and v\in V(G)\setminus Z such that uv\in E(G), in a given time step, vertex u will force vertex v to become blue with probability

\mathbb{P}(u\text{ forces }v)=\frac{|N[u]\cap Z|}{\deg(u)},

where N[u] is the closed neighbourhood of u.

In a given round, each blue vertex attempts to force each white neighbour independently; if an attempt on v by u succeeds, we say that the edge uv is forced. A vertex becomes blue as long as it is forced by at least one blue neighbour, or in other words, if at least one edge incident with it is forced. Note that if v is the only white neighbour of u, then u forces v with probability 1, so given an initial set of blue vertices, the set of vertices forced via probabilistic zero forcing is always a superset of the set of vertices forced by traditional zero forcing. In this sense, probabilistic zero forcing and traditional zero forcing can be coupled. In the context of rumor spreading, the probabilistic colour change rule captures the idea that someone is more likely to spread a rumor if many of their friends have already heard the rumor.
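A direct simulation may again help to make the probabilistic rule concrete. In the Python sketch below (an illustration under the same assumed graph representation as before, with hypothetical function names), every blue vertex attempts to force every white neighbour independently with the probability displayed above.

    import random

    def pzf_round(adj, blue):
        """One round of probabilistic zero forcing applied to the blue set."""
        newly_forced = set()
        for u in blue:
            # |N[u] ∩ Z| = |N(u) ∩ Z| + 1, since u itself is blue.
            p = (len(adj[u] & blue) + 1) / len(adj[u])
            for v in adj[u] - blue:
                if random.random() < p:      # u forces v along the edge uv
                    newly_forced.add(v)
        return blue | newly_forced

    def pzf_propagation_time(adj, Z):
        """Rounds until every vertex is blue (finite a.s. for connected G)."""
        blue, t = set(Z), 0
        while len(blue) < len(adj):
            blue, t = pzf_round(adj, blue), t + 1
        return t

Note that if v is the only white neighbour of u then p = 1 in the sketch, which is exactly the coupling with traditional zero forcing described above.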

Under probabilistic zero forcing, it is clear that, on a connected graph, starting with any non-empty subset of blue vertices will with probability 1 eventually turn the entire graph blue, so the zero forcing number of a graph is not an interesting parameter to study for probabilistic zero forcing. Initially, in [17], the authors studied a parameter that quantifies how likely it is for a subset of vertices to have become a traditional zero forcing set at the first time-step at which it theoretically could under probabilistic zero forcing.

Instead, in this paper, we will be concerned with a parameter that generalizes the zero forcing propagation time. This generalization was first introduced in [11]. Given a graph G and a set Z\subseteq V(G), let pt_{pzf}(G,Z) be the random variable that outputs the propagation time when probabilistic zero forcing is run with the initial blue set Z. For ease of notation, we will write pt_{pzf}(G,v)=pt_{pzf}(G,\{v\}). The propagation time for the graph G will be defined as the random variable pt_{pzf}(G)=\min_{v\in V(G)}pt_{pzf}(G,v). More specifically, pt_{pzf}(G) is a random variable for the experiment in which n iterations of probabilistic zero forcing are performed independently, one for each vertex of G, and then the minimum is taken over the propagation times of these n independent iterations.
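With the simulator sketched above, the distribution of pt_{pzf}(G,v) can be estimated empirically; for example (an illustration only, with Q_4 encoded via bit flips):

    # Empirical estimate of pt_pzf(Q_n, v) for the 4-dimensional hypercube.
    n = 4
    adj = {v: {v ^ (1 << i) for i in range(n)} for v in range(2 ** n)}
    samples = [pzf_propagation_time(adj, {0}) for _ in range(10_000)]
    print(min(samples), sum(samples) / len(samples))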

1.3. Asymptotic Notation

Our results are asymptotic in nature, that is, we will assume that n\to\infty. Formally, we consider a sequence of graphs G_{n}=(V_{n},E_{n}) (for example, G_{n} is an n-dimensional hypercube or an n by n grid) and we are interested in events that hold asymptotically almost surely (a.a.s.), that is, events that hold with probability tending to 1 as n\to\infty.

Given two functions f=f(n) and g=g(n), we will write f(n)=O(g(n)) if there exists an absolute constant c\in{\mathbb{R}}_{+} such that |f(n)|\leq c|g(n)| for all n; f(n)=\Omega(g(n)) if g(n)=O(f(n)); f(n)=\Theta(g(n)) if f(n)=O(g(n)) and f(n)=\Omega(g(n)); and f(n)=o(g(n)) or f(n)\ll g(n) if \lim_{n\to\infty}f(n)/g(n)=0. In addition, we write f(n)\gg g(n) if g(n)=o(f(n)), and we write f(n)\sim g(n) if f(n)=(1+o(1))g(n), that is, \lim_{n\to\infty}f(n)/g(n)=1.

Finally, as is typical in the field of random graphs, for expressions that clearly have to be an integer, we round up or down but do not specify which: the choice does not affect the argument.

1.4. Results on Probabilistic Zero Forcing

In [11], the authors studied probabilistic zero forcing, and more specifically the expected propagation time for many specific structures. A summary of this work is provided in the following theorem.

Theorem 1.1 ([11]).

Let n>2. Then

  • \min_{v\in V(P_{n})}\mathbb{E}(pt_{pzf}(P_{n},v))=\begin{cases}n/2+2/3&\text{ if }n\text{ is even}\\ n/2+1/2&\text{ if }n\text{ is odd,}\end{cases}

  • \min_{v\in V(C_{n})}\mathbb{E}(pt_{pzf}(C_{n},v))=\begin{cases}n/2+1/3&\text{ if }n\text{ is even}\\ n/2+1/2&\text{ if }n\text{ is odd,}\end{cases}

  • \min_{v\in V(K_{1,n})}\mathbb{E}(pt_{pzf}(K_{1,n},v))=\Theta(\log n),

  • \Omega(\log\log n)=\min_{v\in V(K_{n})}\mathbb{E}(pt_{pzf}(K_{n},v))=O(\log n).

In [6], the authors used tools developed for Markov chains to analyze the expected propagation time for many small graphs. Among other things, the authors also showed that \min_{v\in V(K_{n})}\mathbb{E}(pt_{pzf}(K_{n},v))=\Theta(\log\log n) and that for any connected graph G, \min_{v\in V(G)}\mathbb{E}(pt_{pzf}(G,v))=O(n). This result was then improved in [18], where the authors showed that

\log_{2}\log_{2}(n)\leq\min_{v\in V(G)}\mathbb{E}(pt_{pzf}(G,v))\leq\frac{n}{2}+o(n)

for general connected graphs G. In the same paper, the authors also showed that

\min_{v\in V(G)}\mathbb{E}(pt_{pzf}(G,v))=O(r\log(n/r)), (1.1)

where r\geq 1 denotes the radius of the connected graph G. Moreover, they provided a class of graphs for which the bounds of the theorem are tight.

In addition to the results mentioned above, the authors of [11] also considered the binomial random graph {\cal G}(n,p), proving the following theorem.

Theorem 1.2 ([11]).

Let 0<p<1 be constant. Then a.a.s. we have that

\min_{v\in V({\cal G}(n,p))}\mathbb{E}(pt_{pzf}({\cal G}(n,p),v))=O((\log n)^{2}).

However, the authors in [18] conjectured that for the random graph, a.a.s.

\min_{v\in V({\cal G}(n,p))}\mathbb{E}(pt_{pzf}({\cal G}(n,p),v))=(1+o(1))\log\log n.

Of course, Theorem 1.2 can be improved immediately via (1.1) and the fact that for constant 0<p<1, the radius of {\cal G}(n,p) is a.a.s. 2 (see e.g. [10]), but even with this improvement, the bound is still far from the conjectured value. In [9], the authors explored probabilistic zero forcing on {\cal G}(n,p) in more detail and, in particular, proved the above conjecture. Their results can be summarized in the following theorem, which shows that probabilistic zero forcing occurs much faster on {\cal G}(n,p) than on a general graph G, as evidenced by the bounds of (1.1).

Theorem 1.3 ([9]).

Let v\in V({\cal G}(n,p)) be any vertex of {\cal G}(n,p). If p=\log^{-o(1)}n (in particular, if p is a constant), then a.a.s.

pt_{pzf}({\cal G}(n,p),v)\sim\log_{2}\log_{2}n.

On the other hand, if \log n/n\ll p=\log^{-\Omega(1)}n, then a.a.s.

pt_{pzf}({\cal G}(n,p),v)=\Theta(\log(1/p)).

The results of most interest to us are from [14]. The authors of that paper are concerned with hypercubes Q_{n} and grids G_{m\times n}, and prove the following result.

Theorem 1.4 ([14]).

The following bounds hold:

  • \min_{v\in V(Q_{n})}\mathbb{E}(pt_{pzf}(Q_{n},v))=O(n\log n),

  • (1/2+o(1))(m+n)\leq\min_{v\in V(G_{m\times n})}\mathbb{E}(pt_{pzf}(G_{m\times n},v))\leq(4+o(1))(m+n).

1.5. Our Results

In this paper, we improve both results from [14]. Instead of considering the expectation of the propagation time, we will calculate bounds on the propagation time that a.a.s. hold. The lower bounds for both families of graphs are trivial so the improvement is for the upper bounds.

Theorem 1.5.

The following bounds hold a.a.s.:

  • (a) pt_{pzf}(Q_{n})\sim n,

  • (b) (1+o(1))(m+n)/2\leq pt_{pzf}(G_{m\times n})\leq(1+10^{-7})(m+n)/2.

The bounds for Q_{n} are asymptotically tight. Unfortunately, we could not prove the matching upper bound for G_{m\times n}, but our upper bound is very close to the trivial lower bound, and so it supports the conjecture that a.a.s. pt_{pzf}(G_{n\times n})\sim n. This conjecture is also supported by independent simulations performed in [14] as well as by our own.

2. Preliminaries

2.1. Chernoff inequality

Let us first state a specific instance of Chernoff’s bound that we will find useful. Let X\in\textrm{Bin}(n,p) be a random variable with the binomial distribution with parameters n and p. Then, a consequence of Chernoff’s bound (see e.g. [15, Corollary 2.3]) is that

\mathbb{P}(|X-\mathbb{E}X|\geq\varepsilon\mathbb{E}X)\leq 2\exp\left(-\frac{\varepsilon^{2}\mathbb{E}X}{3}\right) (2.1)

for 0<\varepsilon<3/2. Moreover, let us mention that the bound holds in the more general case in which X=\sum_{i=1}^{n}X_{i} and X_{i}\in\textrm{Bernoulli}(p_{i}) with (possibly) different p_{i} (again, see e.g. [15] for more details).
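As a quick numerical sanity check of (2.1) (a toy illustration, not part of any proof), one can compare the empirical tail of a binomial random variable with the bound:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, eps = 1000, 0.3, 0.2
    X = rng.binomial(n, p, size=200_000)
    mean = n * p
    empirical = np.mean(np.abs(X - mean) >= eps * mean)
    bound = 2 * np.exp(-eps ** 2 * mean / 3)
    print(empirical, bound)   # the empirical tail sits well below the bound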

2.2. Hoeffding–Azuma inequality

Let X_{0},X_{1},\ldots be an infinite sequence of random variables that make up a martingale; that is, for any a\in{\mathbb{N}} we have \mathbb{E}[X_{a}|X_{a-1}]=X_{a-1}. Suppose that there exist constants c_{a}>0 such that |X_{a}-X_{a-1}|\leq c_{a} for each a\leq t. Then, the Hoeffding–Azuma inequality implies that for every b>0,

\mathbb{P}\Big(\exists i\,(0\leq i\leq t):|X_{i}-X_{0}|\geq b\Big)\leq 2\exp\left(-\frac{b^{2}}{2\sum_{a=1}^{t}c_{a}^{2}}\right). (2.2)
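For intuition, (2.2) can be tested on the simplest martingale, a ±1 random walk with c_a = 1 (again a toy illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    t, b, runs = 400, 60, 100_000
    steps = rng.choice((-1, 1), size=(runs, t))     # increments with |X_a - X_{a-1}| = 1
    max_dev = np.abs(np.cumsum(steps, axis=1)).max(axis=1)
    empirical = np.mean(max_dev >= b)               # P(exists i <= t : |X_i - X_0| >= b)
    bound = 2 * np.exp(-b ** 2 / (2 * t))           # here the sum of c_a^2 equals t
    print(empirical, bound)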

2.3. Chernoff–Hoeffding bounds for Markov chains

This result is due to Chung, Lam, Liu and Mitzenmacher [7]. Let M be a discrete time ergodic Markov chain with state space [n] and stationary distribution \pi. M may be interpreted as either the chain itself or the corresponding n by n transition matrix. The total variation distance between u and w, two distributions over [n], is defined as

\left\lVert u-w\right\rVert_{TV}=\max_{A\subseteq[n]}\left|\sum_{i\in A}u_{i}-\sum_{i\in A}w_{i}\right|.

For any \varepsilon>0, the mixing time of the Markov chain M is defined as

T(\varepsilon)=\min\left\{t\in{\mathbb{N}}:\max_{x}\left\lVert xM^{t}-\pi\right\rVert_{TV}\leq\varepsilon\right\},

where x is an arbitrary initial distribution. We will also need a definition of the \pi-norm. The inner product under the \pi-kernel is defined as \langle u,v\rangle_{\pi}=\sum_{i\in[n]}u_{i}v_{i}/\pi_{i}. Then, the \pi-norm of u is defined as \left\lVert u\right\rVert_{\pi}=\sqrt{\langle u,u\rangle_{\pi}}. Note that, in particular, \left\lVert\pi\right\rVert_{\pi}=1.

Now we are ready to state the main result from [7]. Let T=T(\varepsilon) be the \varepsilon-mixing time of the Markov chain M for \varepsilon\leq 1/8. Let (V_{1},V_{2},\ldots,V_{t}) denote a t-step random walk on M starting from an initial distribution \varphi on [n]. For every i\in[t], let f_{i}:[n]\rightarrow[0,1] be a weight function at step i such that the expected weight \mathbb{E}_{v\leftarrow\pi}[f_{i}(v)]=\mu for all i. Define the total weight of the walk by X=\sum^{t}_{i=1}f_{i}(V_{i}). Then there exists some constant c (independent of \mu, \delta and \varepsilon) such that for 0\leq\delta\leq 1,

\mathbb{P}\Big((1-\delta)\mu t\leq X\leq(1+\delta)\mu t\Big)\geq 1-c\left\lVert\varphi\right\rVert_{\pi}\exp\left(-\delta^{2}\mu t/(72T)\right). (2.3)
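The statement is easiest to digest on a toy example: take a two-state ergodic chain and let every f_i be the indicator of one state, so that X counts visits to that state and μ is its stationary mass. The following numerical illustration (with assumed toy parameters; it does not compute the constant c or the mixing time from [7]) shows the concentration predicted by (2.3):

    import numpy as np

    rng = np.random.default_rng(0)
    P = np.array([[0.9, 0.1],          # toy ergodic chain on two states
                  [0.2, 0.8]])
    mu = 1 / 3                          # stationary distribution is (2/3, 1/3)
    t, runs, delta = 5000, 20_000, 0.1
    state = np.zeros(runs, dtype=int)   # all walks start in state 0
    X = np.zeros(runs)                  # total weight: number of visits to state 1
    for _ in range(t):
        state = (rng.random(runs) < P[state, 1]).astype(int)
        X += state
    inside = np.mean((X >= (1 - delta) * mu * t) & (X <= (1 + delta) * mu * t))
    print(inside)                       # close to 1, as (2.3) predicts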

2.4. Useful Coupling

We will be using the following obvious coupling to simplify both upper and lower bounds. Having said that, in our case we will only need it to prove upper bounds. Indeed, for lower bounds, it might be convenient to make some white vertices blue at some point of the process; similarly, for upper bounds, one might want to make some blue vertices white. Given a graph G, sets S,T\subseteq V(G), and \ell\in{\mathbb{N}}, let A(S,T,\ell) be the event that, starting with blue set S, after \ell rounds every vertex in T is blue. The following simple observation was proved in [9].

Lemma 2.1 ([9]).

For all sets S_{1}\subseteq S_{2}\subseteq V(G), T\subseteq V(G), and \ell\in{\mathbb{N}},

\mathbb{P}(A(S_{1},T,\ell))\leq\mathbb{P}(A(S_{2},T,\ell)).

3. Hypercubes

This section is devoted to proving part (a) of Theorem 1.5. Let us start with a formal definition of the hypercube. The n-dimensional hypercube Q_{n} has vertex set consisting of all binary strings of length n, and there is an edge between two vertices if and only if their binary strings differ in exactly one bit. For 0\leq k\leq n, level k of the hypercube Q_{n} is defined to be the set of all vertices whose binary strings contain exactly k ones. Note that each vertex in level k has k neighbours in level k-1 and n-k neighbours in level k+1.
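This level structure is easy to verify computationally when vertices are encoded as integers whose binary digits form the strings (a small sanity check, not part of the proof):

    n = 6
    for v in range(2 ** n):
        k = bin(v).count("1")                                  # level of v
        nbr_levels = [bin(v ^ (1 << i)).count("1") for i in range(n)]
        assert nbr_levels.count(k - 1) == k                    # k neighbours one level down
        assert nbr_levels.count(k + 1) == n - k                # n - k neighbours one level up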

Proof of Theorem 1.5(a).

Trivially, for any vertex v\in V(Q_{n}), we deterministically have that pt_{pzf}(Q_{n},v)\geq n. Hence, in order to show that pt_{pzf}(Q_{n})\sim n, it is enough to show that a.a.s. pt_{pzf}(Q_{n},v)\leq n+o(n) for some vertex v\in V(Q_{n}). Let v be the vertex (0,0,\ldots,0) at level 0; in fact, since Q_{n} is a vertex-transitive graph, v can be any vertex.

Suppose that for some integer k, 0\leq k<n/\ln^{2}n, the following property \mathcal{P}(k) holds at some point of the process: all vertices at levels up to k are blue (including level k). We will show that with probability 1-o(n^{-1}) property \mathcal{P}(k+1) holds after an additional O(\ln n) rounds. Since trivially \mathcal{P}(0) holds, by the union bound (over n/\ln^{2}n possible values of k) we will conclude that a.a.s. \mathcal{P}(n/\ln^{2}n) holds after O(n/\ln n)=o(n) rounds.

Suppose that property \mathcal{P}(k) holds for some k, 0\leq k<n/\ln^{2}n. By Lemma 2.1, we may assume that all vertices at level k+1 are white. It will be convenient to independently consider 3 phases, after which property \mathcal{P}(k+1) will hold with probability 1-o(n^{-1}). The first phase lasts 320\ln n rounds. The probability that a given vertex at level k+1 stays white during this phase is at most

\left(1-\frac{k+1}{n}\right)^{320\ln n}\leq\exp\left(-\frac{320(k+1)\ln n}{n}\right)\leq 1-\frac{160(k+1)\ln n}{n}=:p,

since k\leq n/\ln^{2}n. Hence, for a given vertex w at level k, the number of neighbours at level k+1 that turned blue during this phase is stochastically lower bounded by {\rm Bin}(n-k,1-p)\geq{\rm Bin}(n/2,1-p)\ni X with \mathbb{E}[X]=(1-p)n/2=80(k+1)\ln n. It follows from Chernoff’s bound (2.1) (applied with \varepsilon=1/2) that w has at most 40(k+1)\ln n blue neighbours at level k+1 with probability at most

2\exp\left(-\frac{\mathbb{E}[X]}{12}\right)=\frac{2}{n^{(80/12)(k+1)}}=o\left(\frac{1}{n^{k+1}}\right).

By the union bound (over at most n^{k} vertices at level k), with probability 1-o(n^{-1}), each vertex at level k has at least 40(k+1)\ln n blue neighbours at level k+1 at the end of the first phase. As we aim for a statement that holds with probability 1-o(n^{-1}), we may assume that this property holds once we enter the second phase.

The second phase lasts \log_{5/4}n rounds. As before, let us concentrate on a given vertex w at level k. Suppose that w has \ell blue neighbours for some integer \ell such that 40(k+1)\ln n\leq\ell\leq n/2 (w has only k neighbours at level k-1 so, of course, this count includes neighbours at level k+1). The number of white neighbours of w (at level k+1) that turn blue in one round of the process is stochastically lower bounded by {\rm Bin}(n-\ell,(\ell+1)/n)\geq{\rm Bin}(n/2,(\ell+1)/n)\ni Y with \mathbb{E}[Y]=(\ell+1)/2\geq 20(k+1)\ln n. We get from Chernoff’s bound (2.1) (applied with \varepsilon=1/2) that Y\leq\frac{\ell+1}{4} with probability at most

2\exp\left(-\frac{\mathbb{E}[Y]}{12}\right)=\frac{2}{n^{(20/12)(k+1)}}=o\left(\frac{1}{n^{k+1}\ln n}\right).

By the union bound (over at most n^{k} vertices at level k and at most \log_{5/4}n rounds), with probability 1-o(n^{-1}), each vertex at level k increases its number of blue neighbours by a multiplicative factor of 5/4 each round, reaching at least n/2 blue neighbours by the end of the second phase. We may assume that this property holds once we enter the third phase.

The third (and last) phase lasts 3\log_{2}n rounds. This time, let us concentrate on a given white vertex w at level k+1. This vertex has k+1 neighbours at level k, each of which has at least n/2 blue neighbours. Hence, vertex w stays white until the end of this phase with probability at most

\left(\left(\frac{1}{2}\right)^{k+1}\right)^{3\log_{2}n}=2^{-3(k+1)\log_{2}n}=\frac{1}{n^{3(k+1)}}=o\left(\frac{1}{n^{k+2}}\right).

By the union bound (over at most n^{k+1} vertices at level k+1), with probability 1-o(n^{-1}), all vertices at level k+1 turn blue by the end of this phase, and so property \mathcal{P}(k+1) holds.

Since we aim for a statement that holds a.a.s., we may assume that \mathcal{P}(n/\ln^{2}n) holds after o(n) rounds, and continue the process from there. We say that a vertex at level k is happy if all but at most \ell=\ell(n)=\ln^{5}n of its neighbours at level k-1 are blue. Similarly, a vertex at level k is very happy if not only is it happy but also all but at most \ell of its neighbours at level k+1 are blue. (Note that a happy or even a very happy vertex might still be white.) Trivially, all vertices at levels up to n/\ln^{2}n+1 are happy (including level n/\ln^{2}n+1), and all vertices at levels before level n/\ln^{2}n are very happy.

Suppose that for some integer k, n/\ln^{2}n\leq k\leq n-1, all vertices at levels up to k are happy (including level k). We will show that after one single round all vertices at level k+1 are going to be happy and all vertices at level k-1 are going to be very happy with probability 1-o(n^{-1}). By the union bound (over all possible values of k), we will get that a.a.s. after fewer than n rounds all vertices of the hypercube are going to be very happy.

For simplicity, by Lemma 2.1 we may assume that all vertices at level k are white (despite the fact that they are happy). Let us concentrate on a given vertex w at level k. Since w is happy and all of its blue neighbours at level k-1 are happy too, the probability that w stays white is at most

\left(1-\frac{k-\ell}{n}\right)^{k-\ell}\leq\exp\left(-(1+o(1))\frac{k^{2}}{n}\right)=:p.

Now, the probability that a given vertex at level k+1 is not happy is at most

\binom{k+1}{\ell}p^{\ell}\leq n^{\ell}\exp\left(-(1+o(1))\frac{k^{2}}{n}\,\ell\right)=\exp\left(\ell\ln n-(1+o(1))\frac{k^{2}}{n}\,\ell\right)\leq\exp\left(\ln^{6}n-(1+o(1))n\ln n\right)=o\left(\frac{1}{2^{n}\,n}\right).

Then the property that all vertices at level k+1 are happy holds by the union bound (over at most 2^{n} vertices at level k+1).

We may also show that all vertices at level k-1 are very happy. If k\geq n-\ell, then all vertices at level k-1 are trivially very happy, as they have fewer than \ell neighbours at level k, so there is nothing to show. If k\leq n-\ell (but still k\geq n/\ln^{2}n), then a given vertex at level k-1 is not very happy with probability at most

\binom{n-(k-1)}{\ell}p^{\ell}\leq n^{\ell}\exp\left(-(1+o(1))\frac{k^{2}}{n}\,\ell\right)=o\left(\frac{1}{2^{n}\,n}\right),

and the desired bound holds by the union bound (over at most 2^{n} vertices at level k-1).

At this point we may assume that all vertices of the hypercube are very happy. It is easy to see that all remaining white vertices will turn blue in one single round. Indeed, since all vertices are very happy, the probability that some vertex stays white is at most

2^{n}\left(1-\frac{n-2\ell}{n}\right)^{n-2\ell}=2^{n}\left(\frac{2\ell}{n}\right)^{(1+o(1))n}=\exp\left(O(n)-(1+o(1))n\ln n\right)=o(1).

The proof is finished. ∎

4. Grids

This section is devoted to proving part (b) of Theorem 1.5. Let us start with a formal definition of grids and tori. The m by n grid graph G_{m\times n} is the mn-vertex graph that is the Cartesian product of a path on m vertices and a path on n vertices. The m by n torus graph T_{m\times n} is the Cartesian product of a cycle on m vertices and a cycle on n vertices.

We can restrict our focus to the square grid G_{n\times n} and the square torus graph T_{n\times n}. For a non-square grid G_{n\times m} with n<m, once a central n\times n square is all blue, the two adjacent columns turn blue with probability one on the next time-step, and so on. Thus pt\left(G_{n\times m}\right)\leq pt\left(G_{n\times n}\right)+\left\lceil\frac{m-n}{2}\right\rceil. A similar argument holds for the non-square torus. Hence, it will be straightforward to generalize the results to the asymmetric cases, which we will do once we are done with the symmetric cases.

We define the origin of G_{n\times n} to be the central vertex if n is odd and one of the four central vertices if n is even (the choice can be made arbitrarily, as all of them are the same up to symmetry). Since T_{n\times n} is vertex-transitive, we may fix any vertex of T_{n\times n} to be the origin of T_{n\times n}. For a given positive integer k<\frac{n}{2}, we define the k-principal-square to be the 2k+1 by 2k+1 sub-grid centred on the origin.

Let s=\left\lfloor n/\ln^{2}n\right\rfloor. We will work in phases, where in phase i we start with all vertices in the is-principal-square being blue and end with all vertices in the (i+1)s-principal-square being blue. For simplicity, by Lemma 2.1 we may assume that at the beginning of each phase the only vertices that are blue are the ones that belong to the corresponding principal-square. In fact, during each phase we will turn a few more vertices white (if needed) so that the process behaves more predictably. We aim to bound the number of time-steps it takes to go from the is-principal-square to the (i+1)s-principal-square. As a result, in total there will be at most \frac{1}{2}\ln^{2}{n} phases. Note that the only phase that potentially differs between the grid and the torus is the final phase.

Let us start with the following simple observation that will allow us to ignore a few initial and final phases.

Lemma 4.1.

Let k\geq 1. Suppose that at time t_{0} all vertices in a k-principal-square are blue. Then with probability 1-o\left(1/n\right), at time t_{0}+6\ln{n} all vertices in a (k+1)-principal-square are blue.

Proof.

Let k\geq 1, and suppose that at time t_{0} all vertices in a k-principal-square are blue. Note that vertices adjacent to a blue vertex with three blue neighbours turn blue with probability 1, and so all vertices adjacent to a non-corner vertex of the k-principal-square will turn blue at time t_{0}+1. There are twelve (eight if k=1) vertices in the (k+1)-principal-square that do not turn blue deterministically: the four corner vertices and the eight (four if k=1) vertices adjacent to the corners. We will call the latter vertices corner-adjacent.

A corner-adjacent vertex has a blue neighbour at time t_{0}, and so the probability that it stays white in one time-step is at most 3/4. The probability that it remains white after 4\ln{n} steps is at most \left(\frac{3}{4}\right)^{4\ln{n}}, and so the probability that all corner-adjacent vertices are blue at time t_{0}+4\ln{n} is at least 1-8\left(\frac{3}{4}\right)^{4\ln{n}}=1-o\left(1/n\right).

Conditioning on a corner vertex having two blue neighbours at time t_{1}=t_{0}+4\ln{n}, the probability that it stays white after the next time-step is at most (3/4)^{2}=9/16. The probability that it is still white after a further 2\ln{n} time-steps is at most \left(\frac{9}{16}\right)^{2\ln{n}}. In particular, the probability that all four corner vertices are blue at time t_{1}+2\ln{n} is at least 1-4\left(\frac{9}{16}\right)^{2\ln{n}}=1-o\left(1/n\right). ∎

Lemma 4.1 tells us that with high probability any single phase takes O\left(6s\ln{n}\right)=o(n) time-steps, so we may safely ignore what happens in the first five phases and the last phase. In particular, our argument is the same whether we run the process on the grid or on the torus. Hence, we may focus on the square grid graph G_{n\times n}.

Recall that s=\lfloor n/\ln^{2}n\rfloor. Let 5\leq i<n/(2s) and suppose that at time t_{0} all vertices in the is-principal-square are blue. By Lemma 2.1, we may assume that these are the only blue vertices at that point of the process. We will show that with probability 1-o(1/n), at time t_{0}+(1+\varepsilon+o(1))2s all vertices in the (i+1)s-principal-square are blue, where \varepsilon>0 will be an explicit, very small constant that we are not ready to introduce yet.

For simplicity, we identify the vertices of G_{n\times n} with pairs (a,b) from the set

\left\{-\left\lfloor\frac{n-1}{2}\right\rfloor,\ldots,\left\lfloor\frac{n}{2}\right\rfloor\right\}\times\left\{-\left\lfloor\frac{n-1}{2}\right\rfloor,\ldots,\left\lfloor\frac{n}{2}\right\rfloor\right\}

in the natural way, with the origin at (0,0). In particular, note that a k-principal-square contains all vertices (a,b) with |a|\leq k and |b|\leq k. We will focus on the top-right quadrant, where both coordinates are positive; by symmetry, the argument for the other quadrants will be exactly the same.

In order to control how the process progresses with time, we will pay attention to a few blue vertices at the top-right corner that are at the same distance from the origin. Hence, the following definition will be useful. For a fixed positive integer d, we define a d-window (rooted at vertex (a-d+1,b)) to be a d-tuple of vertices

\Big((a-d+1,b),(a-d+2,b-1),(a-d+3,b-2),\ldots,(a,b-d+1)\Big),

where d<a,b<n/2. Note that each vertex of such a d-window is at distance a+b-d+1 from the origin.

Blue vertices from a d-window at distance \ell from the origin will most likely turn some other vertices at distance \ell+1 from the origin blue. If that happens, we will simply move the window appropriately and continue the process. In order to compute the transition probabilities between given configurations that might occur in a d-window, it will be convenient to assume that each blue vertex in the window has exactly two blue neighbours, namely the bottom and the left neighbours. We may do so based on the following observation.

Lemma 4.2.

Suppose that at time t the vertices of a k\times k sub-grid are all blue, for some integer k\geq 2. Let v be the top-right corner (blue) vertex of this sub-grid. If a neighbour of v turns blue at time t+1, then it is the top-right corner of a (k-1)\times(k-1) sub-grid of vertices that are all blue.

Proof.

Consider the four vertices adjacent to v. Clearly, the vertex below v and the vertex to the left of v are each the top-right corner of a (k-1)\times(k-1) all-blue sub-grid. Let u be the vertex to the right of v. The column of k-2 vertices directly below u are all blue at time t+1 with probability 1 (as each is adjacent to a blue vertex with three blue neighbours at time t). Thus, if u turns blue at time t+1, then it is the top-right corner of a (k-1)\times(k-1) blue sub-grid; see Figure 1. An analogous argument holds for the vertex above v. ∎

Figure 1. A 5\times 5 blue sub-grid at time t and a 4\times 4 blue sub-grid at time t+1, as described in Lemma 4.2.

Now we are ready to define an auxiliary process which monitors the behaviour of the original process, giving a sequence of triples (D_{j},C_{j},t_{j}) of d-windows D_{j} at times t_{j} and associated binary d-tuples C_{j} indicating which vertices in the d-window are blue at time t_{j} and which are white (1 represents blue and 0 represents white). This process will control the expansion of the blue principal-squares and so can be used to upper bound the number of rounds needed to reach the end of a given phase. We will run this process for at most 3s steps and stop it prematurely if the end of the phase is not reached after 3s rounds. However, we will show that with probability 1-o(1/n) we do not stop prematurely and, in fact, when d is large enough, with this probability we will finish in almost 2s rounds, which is a trivial lower bound for the number of rounds needed to finish a given phase.

  1. We start with D_{0}=((is-d+1,is),\ldots,(is,is-d+1)) at time-step t_{0}. Since all vertices in D_{0} are blue, the corresponding configuration is C_{0}=(1,\ldots,1). (In fact, the initial triple is not important. Another natural starting point would be a d-window including the vertex (is,is), which would have only one blue vertex in the corresponding configuration.)

  2. Given the triple (D_{j},C_{j},t_{j}), we define the triple (D_{j+1},C_{j+1},t_{j+1}) differently according to whether or not C_{j} contains a blue vertex. Suppose that D_{j} is a d-window rooted at vertex (a-d+1,b).

  3. If C_{j} contains a blue vertex, let t_{j+1}=t_{j}+1 and consider the d-windows

    X=((a-d+1,b+1),\ldots,(a,b-d+2))\quad\text{and}\quad Y=((a-d+2,b),\ldots,(a+1,b-d+1)).

    Let D_{j+1} be whichever of X or Y has more blue vertices at time t_{j+1} (see Figure 2 for an illustration). If they have an equal number of blue vertices, then pick one of the two uniformly at random. Let C_{j+1} be the corresponding configuration capturing which vertices in the window are blue at time t_{j+1}. Go to step (2).

    Figure 2. A 6-window D_{j} and 6-windows X, Y as defined in Step (3) of the auxiliary process. Since Y has more blue vertices, D_{j+1}=Y.
  4. If C_{j} is all white (that is, all vertices in D_{j} are white at time t_{j}), we set t_{j+1}=t_{j}. In this situation we need to introduce an auxiliary triple (D_{j+1},C_{j+1},t_{j+1}) that corresponds to an earlier d-window that is not all white at time t_{j}. To that end, we choose D_{j+1} to have the same distance from the origin as D_{j-1}. Note that C_{j-1} is not all zeros and represents a configuration in the d-window D_{j-1} at time t_{j-1}; indeed, the process is designed in such a way that C_{j-1} and C_{j} cannot both be all zeros at any step of this process. We could simply fix D_{j+1}=D_{j-1} and C_{j+1}=C_{j-1}, but our goal is to create a memoryless Markov chain, so we will follow a different strategy.

    Let (x,y) be a random vertex in D_{j-1} that was blue at time t_{j-1}. Let D_{j+1} be the d-window centred on (x,y). To be specific, if d is odd then let

    D_{j+1}=\left(\left(x-\frac{d-1}{2},y+\frac{d-1}{2}\right),\ldots,\left(x+\frac{d-1}{2},y-\frac{d-1}{2}\right)\right),

    and if d is even then let D_{j+1} be one of

    \left(\left(x-\frac{d-2}{2},y+\frac{d-2}{2}\right),\ldots,\left(x+\frac{d}{2},y-\frac{d}{2}\right)\right)\quad\text{and}\quad\left(\left(x-\frac{d}{2},y+\frac{d}{2}\right),\ldots,\left(x+\frac{d-2}{2},y-\frac{d-2}{2}\right)\right)

    chosen uniformly at random. At time t_{j-1}, (x,y) is blue, and it follows from Lemma 4.2 that (x-1,y) and (x,y-1) are blue and have three blue neighbours. (See the discussion below for details about the implications of Lemma 4.2.) This means that at time t_{j+1}=t_{j}=t_{j-1}+1, the vertices (x-1,y+1),(x,y),(x+1,y-1) are each blue and in D_{j+1}. Applying Lemma 2.1, we may assume that any vertices in D_{j+1} other than (x-1,y+1),(x,y),(x+1,y-1) are white at time t_{j+1}=t_{j}. Let us stress again that, by our choice of (x,y), D_{j+1} is not all white at time t_{j+1}. Go to step (2).

Recall that we start the current phase at time t_{0} with all vertices in the is-principal-square being blue. Moreover, we have already dealt with the first few phases, so i\geq 5. By this assumption, every vertex in the starting d-window D_{0} is the top-right corner of a (2is+1-d)\times(2is+1-d) blue sub-grid at time t_{0}. Applying Lemma 4.2 recursively for each j, we see that if a vertex in D_{j} turns blue at time t_{j} because of its blue neighbour in D_{j-1}, then it is the top-right corner of a (2is+1-d-j)\times(2is+1-d-j) blue sub-grid. In particular, since we run the auxiliary process for at most 3s steps, it will always be the top-right corner of at least a 3\times 3 blue sub-grid.

We are now ready to investigate the transition probabilities between configurations C_{j} and guide the auxiliary process so that it yields a Markov chain. Suppose that D_{j}=((a-d+1,b),\ldots,(a,b-d+1)). Let us first investigate step (3) of the auxiliary process. By the assumption of this step, C_{j} is not all zeros; that is, D_{j} contains a blue vertex at time t_{j}. By Lemma 2.1 we may assume that at time t_{j} all vertices in the next d-windows X=((a-d+1,b+1),\ldots,(a,b-d+2)) and Y=((a-d+2,b),\ldots,(a+1,b-d+1)) are white, and that the vertices (a-d,b+1) and (a+1,b-d) are white. Moreover, as discussed above, Lemma 4.2 implies that any vertex in D_{j} that is blue at time t_{j} has exactly two blue neighbours. Thus the probability that a vertex in X\cup Y turns blue at time t_{j}+1 is exactly 1-(1/4)^{2}=15/16 if it has two blue neighbours in D_{j} at time t_{j}, 1-1/4=3/4 if it has one blue neighbour in D_{j} at time t_{j}, and 0 otherwise. In particular, the configuration C_{j+1} representing the state of D_{j+1} at time t_{j+1} depends only on the configuration C_{j} representing the state of D_{j} at time t_{j}. The transition probability P_{d}(C,C^{\prime}) between any two possible configurations C and C^{\prime} can be easily computed.

Now, let us investigate step (4). This step is set up so that if C_{j} is all white (that is, D_{j} contains no blue vertices at time t_{j}), then deterministically C_{j+1} consists of three centred consecutive blue vertices, with all other vertices white. In particular, for d\geq 3 odd and C all white, P_{d}(C,C^{\prime})=1 if C^{\prime}=(\underbrace{0,\ldots,0}_{(d-3)/2},1,1,1,\underbrace{0,\ldots,0}_{(d-3)/2}) and 0 otherwise. For d\geq 4 even and C all white,

P_{d}(C,C^{\prime})=\begin{cases}1/2&\text{if }C^{\prime}=(\underbrace{0,\ldots,0}_{(d-4)/2},1,1,1,\underbrace{0,\ldots,0}_{(d-2)/2})\text{ or }(\underbrace{0,\ldots,0}_{(d-2)/2},1,1,1,\underbrace{0,\ldots,0}_{(d-4)/2}),\\ 0&\text{otherwise.}\end{cases}

Finally, for the degenerate case d=2 and C all white, P_{2}(C,C^{\prime})=1 if C^{\prime} is all blue, and 0 otherwise.

Since the state of D_{j+1} at time t_{j+1} (namely, configuration C_{j+1}) depends only on the state of D_{j} at time t_{j} (namely, configuration C_{j}), independently of the time and of the choice of vertices, we can capture the behaviour of the process using a Markov chain (C_{j})_{j=0}^{\infty} on the state space \Omega=\{0,1\}^{d}. For example, when d=2 the Markov chain has transition matrix

P_{2}=\begin{pmatrix}0&0&0&1\\ 1/16&9/32&3/32&9/16\\ 1/16&3/32&9/32&9/16\\ 1/256&15/256&15/256&225/256\end{pmatrix}

where the states are listed in the order (0,0),(0,1),(1,0),(1,1). It is easy to see that for any d this Markov chain is irreducible and aperiodic, and so it has a limiting distribution, which is equal to the stationary distribution. We define \mu_{d} to be the probability of the all-white state C=(0,0,\ldots,0) in the limiting distribution.
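The exact and numerical values of \mu_{d} reported below were computed in Julia [3]. As an independent check, the following self-contained Python sketch (our own reconstruction of the transition rule described above; the function names are ours) builds P_{d} and extracts \mu_{d} from its stationary distribution. It uses the observation that, by their definitions, the windows X and Y overlap in d-1 vertices, so their union consists of d+1 vertices u_0,...,u_d, where u_j is adjacent to the window vertices D[j-1] and D[j] (when those indices exist) and turns blue independently with probability 1-(1/4)^k, k being the number of blue vertices among these neighbours.

    import itertools
    import numpy as np

    def build_Pd(d):
        """Transition matrix of the auxiliary chain on configurations {0,1}^d."""
        states = list(itertools.product((0, 1), repeat=d))
        index = {s: i for i, s in enumerate(states)}
        P = np.zeros((2 ** d, 2 ** d))
        for C in states:
            i = index[C]
            if sum(C) == 0:            # step (4): restart with three centred blue vertices
                if d == 2:
                    P[i, index[(1, 1)]] = 1.0
                elif d % 2 == 1:
                    k = (d - 3) // 2
                    P[i, index[(0,) * k + (1, 1, 1) + (0,) * k]] = 1.0
                else:
                    k = (d - 4) // 2
                    P[i, index[(0,) * k + (1, 1, 1) + (0,) * (k + 1)]] += 0.5
                    P[i, index[(0,) * (k + 1) + (1, 1, 1) + (0,) * k]] += 0.5
                continue
            # Step (3): each u_j turns blue with probability 1 - (1/4)^k, where k
            # counts blue vertices among its window neighbours D[j-1] and D[j].
            q = [1 - 0.25 ** sum(C[m] for m in (j - 1, j) if 0 <= m < d)
                 for j in range(d + 1)]
            for outcome in itertools.product((0, 1), repeat=d + 1):
                pr = 1.0
                for bit, qq in zip(outcome, q):
                    pr *= qq if bit else 1 - qq
                X, Y = outcome[:d], outcome[1:]    # the two candidate windows
                if sum(X) > sum(Y):
                    P[i, index[X]] += pr
                elif sum(Y) > sum(X):
                    P[i, index[Y]] += pr
                else:                              # tie broken uniformly at random
                    P[i, index[X]] += pr / 2
                    P[i, index[Y]] += pr / 2
        return P

    def mu(d):
        """Stationary probability of the all-white configuration."""
        P = build_Pd(d)
        m = P.shape[0]
        A = np.vstack([P.T - np.eye(m), np.ones(m)])
        rhs = np.append(np.zeros(m), 1.0)
        pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
        return pi[0]                               # the all-white state has index 0

    for d in (2, 3, 4):
        m = mu(d)
        print(d, m, m / (1 - 2 * m))   # reproduces mu_2 = 1/77 and epsilon_2 = 1/75

Replacing the floating-point probabilities by exact rationals (for example, fractions.Fraction) and the least-squares solve by an exact linear solver should reproduce the exact fractions listed below for d\leq 7.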

The next lemma is the crux of our argument.

Lemma 4.3.

Take s=\lfloor n/\ln^{2}n\rfloor. Let 5\leq i<n/(2s) and suppose that at time t_{0} all vertices in the is-principal-square are blue. Then with probability 1-o(1/n), before the end of round t_{0}+(1+o(1))\frac{1-\mu_{d}}{1-2\mu_{d}}2s all vertices in the (i+1)s-principal-square are blue, where \mu_{d} is the probability of the all-white state C=(0,0,\ldots,0) in the limiting (stationary) distribution of the Markov chain P_{d} defined above.

Applying Lemma 4.3 together with the earlier observations, we immediately obtain the following corollary.

Corollary 4.4.

The following bound holds a.a.s.:

pt\left(G_{n\times n}\right)\leq(1+o(1))\frac{1-\mu_{d}}{1-2\mu_{d}}\,n,

where \mu_{d} is the probability of the all-white state C=(0,0,\ldots,0) in the limiting (stationary) distribution of the Markov chain P_{d} defined above.

Proof of Lemma 4.3.

Let us fix an integer d\geq 2. Suppose we run the auxiliary process (D_{j},C_{j},t_{j}), monitoring the trajectory of the d-window, for \ell steps. Let w be the number of steps we spend in the all-white state C=(0,0,\ldots,0). We have t_{\ell}=t_{0}+\ell-w, since t_{j+1}=t_{j} when we are in the all-white state, and t_{j+1}=t_{j}+1 otherwise. The distance of D_{0} from the origin is 2(is)-d+1. The distance of D_{\ell} from the origin is 2(is)-d+1+\ell-2w, since when we are in the all-white state D_{j+1} is one closer to the origin than D_{j}, and otherwise D_{j+1} is one further from the origin than D_{j}.

Applying the Chernoff–Hoeffding bounds for Markov chains (equation (2.3)), we have that

\mathbb{P}\Big((1-\delta)\mu_{d}\ell\leq w\leq(1+\delta)\mu_{d}\ell\Big)\geq 1-c\exp(-c\delta^{2}\ell)

for 0\leq\delta\leq 1, where c is some positive constant depending only on the Markov chain (in particular, independent of \ell and \delta). Set

\ell=\left(\frac{2}{1-2\mu_{d}}+\frac{\ln^{4}{n}}{\sqrt{n}}\right)s\geq s\quad\text{ and }\quad\delta=\frac{\ln^{2}{n}}{\sqrt{n}}.

(Note that \mu_{d}\leq 1/77 implies that \ell<3s, the maximum number of steps taken; we will show below that \mu_{d} satisfies this bound, and so we may assume that the process does not stop prematurely.) Then, using that s=\lfloor n/\ln^{2}{n}\rfloor, we have that with probability 1-o(1/n) the distance of D_{\ell} from the origin is at least

2is-d+1+\ell-2(1+\delta)\mu_{d}\ell =2is+O(1)+(1-2\mu_{d})\left(1-\Theta\left(\frac{\ln^{2}n}{\sqrt{n}}\right)\right)\ell
=2is+O(1)+(1-2\mu_{d})\left(1-\Theta\left(\frac{\ln^{2}n}{\sqrt{n}}\right)\right)\frac{2}{1-2\mu_{d}}\left(1+\Theta\left(\frac{\ln^{4}n}{\sqrt{n}}\right)\right)s
=2(i+1)s+\Theta(\sqrt{n}\,\ln^{2}n),

and the number of rounds of this phase that have passed in the original zero forcing process is equal to

t_{\ell}-t_{0}=\ell-w\leq(1-(1-\delta)\mu_{d})\ell=(1-\mu_{d})\left(1+\Theta\left(\frac{\ln^{2}n}{\sqrt{n}}\right)\right)\ell=(1+o(1))\frac{1-\mu_{d}}{1-2\mu_{d}}2s.

It remains to show that the d-windows D_{j} travel broadly North-East rather than North or East, in order to be sure of obtaining a principal-square as opposed to a rectangle. To that end, let us define the discrepancy \operatorname{disc}(D) of a d-window D=((a-d+1,b),\ldots,(a,b-d+1)) to be a-b. The discrepancy captures how far from the North-East diagonal the d-window is shifted. The discrepancy of D_{0} is 0. By definition, \operatorname{disc}\left(D_{j+1}\right) differs from \operatorname{disc}\left(D_{j}\right) by at most one if D_{j} is not all white at time t_{j}, and differs from \operatorname{disc}\left(D_{j}\right) by at most d if D_{j} is all white at time t_{j}.

Define a sequence (j_{k})_{k\geq 0}, where j_{0}=0 and, for k>0, j_{k} is the least j>j_{k-1} such that D_{j} is all white at time t_{j} (that is, C_{j}=(0,0,\ldots,0)). For any j for which C_{j}\neq(0,0,\ldots,0), the probability that C_{j+1}=(0,0,\ldots,0) is at least (1/16)^{d+1}. Hence, the probability that we do not hit an all-white state in 2\cdot 16^{d+1}\ln{n} consecutive steps is at most

\left(1-(1/16)^{d+1}\right)^{2\cdot 16^{d+1}\ln{n}}\leq\exp(-2\ln n)=1/n^{2}.

Thus, since there are at most \ell=o(n) terms in the sequence, we have j_{k}-j_{k-1}\leq 2\cdot 16^{d+1}\ln{n} for all k with probability 1-o(1/n). Since we aim for a statement that holds with probability 1-o(1/n), we may assume that this property is deterministically satisfied.

Consider the sequence (\operatorname{disc}\left(D_{j_{k}}\right))_{k\geq 0}. Based on our assumption, for all k\geq 1,

|\operatorname{disc}\left(D_{j_{k}}\right)-\operatorname{disc}\left(D_{j_{k-1}}\right)|\leq d+j_{k}-j_{k-1}\leq\left(2\cdot 16^{d+1}+o(1)\right)\ln{n}\leq 3\cdot 16^{d+1}\ln n.

By the symmetry of the auxiliary process, we know that, given \operatorname{disc}\left(D_{j_{k-1}}\right) and an integer a, the probability that \operatorname{disc}\left(D_{j_{k}}\right)=\operatorname{disc}\left(D_{j_{k-1}}\right)+a must be the same as the probability that \operatorname{disc}\left(D_{j_{k}}\right)=\operatorname{disc}\left(D_{j_{k-1}}\right)-a. Thus the sequence (\operatorname{disc}\left(D_{j_{k}}\right))_{k\geq 0} is a martingale. We can apply the Hoeffding–Azuma inequality (2.2) with b=\sqrt{\ell}\,\ln^{2}n and 2\cdot 16^{d+1}\ln^{2}n\leq j_{k}\leq\ell to see that

\mathbb{P}\left(|\operatorname{disc}\left(D_{j_{k}}\right)|\geq b\right)\leq 2\exp\left(\frac{-b^{2}}{2\cdot k\cdot 3\cdot 16^{d+1}\ln^{2}{n}}\right)=2\exp\left(\frac{-\ell\,\ln^{4}n}{O(\ell\ln^{2}n)}\right)=o(1/n),

and so

\mathbb{P}\left(|\operatorname{disc}\left(D_{\ell}\right)|\leq\sqrt{\ell}\ln^{2}{n}+3\cdot 16^{d+1}\ln n\right)\geq 1-o(1/n),

where the additional term 3\cdot 16^{d+1}\ln n needs to be added because the last term in the sequence (j_{k})_{k\geq 0} can be smaller than \ell but, with the desired probability, not smaller than \ell-3\cdot 16^{d+1}\ln n.

Putting all of this together, we conclude that with probability 1-o(1/n), at time t_{\ell}=t_{0}+(1+o(1))\frac{1-\mu_{d}}{1-2\mu_{d}}2s there is a blue vertex at distance 2(i+1)s+\Theta\left(\sqrt{n}\,\ln^{2}{n}\right) from the origin with discrepancy less than \sqrt{\ell}\ln^{2}{n}=O\left(\sqrt{n}\,\ln{n}\right)=o\left(\sqrt{n}\,\ln^{2}{n}\right). Using Lemma 4.2 for the last time, we get that this blue vertex is the top-right corner of a (2is+1-d-\ell)\times(2is+1-d-\ell) all-blue sub-grid at time t_{\ell}. Since i\geq 5, we get that

2is+1-d-\ell\geq(i+1)s+4s+1-d-3s>(i+1)s,

and so this all-blue sub-grid entirely contains the top-right quadrant of the (i+1)s-principal-square.

By symmetry, the same conclusion holds for the other three quadrants, and so with probability 1-o(1/n), at time t_{0}+(1+o(1))\frac{1-\mu_{d}}{1-2\mu_{d}}2s the (i+1)s-principal-square is entirely blue. The proof of the lemma is finished. ∎

The only thing remaining to finish the proof of the main theorem is to calculate the stationary distribution of the Markov chain, and thus \mu_{d}. Based on Corollary 4.4, each value of \mu_{d} implies that a.a.s.

pt\left(G_{n\times n}\right)\leq(1+\varepsilon_{d}+o(1))\,n,\quad\text{ where }\quad\varepsilon_{d}:=\frac{1-\mu_{d}}{1-2\mu_{d}}-1=\frac{\mu_{d}}{1-2\mu_{d}}.

When d=2 or d=3, the transition matrix is small enough to be calculated by hand, and one can check that \mu_{2}=1/77 and \mu_{3}=1861/491117, respectively. This gives us \varepsilon_{2}=1/75\leq 0.01334 and \varepsilon_{3}=1861/487395\leq 0.003819. For d\leq 7, one can use a computer to calculate the exact fractions in the transition matrix and thus the exact values of \mu_{d}. This gives

\mu_{4}=\frac{11439524}{9092101243}
\mu_{5}=\frac{1133763610798567}{2542177028478096119}
\mu_{6}=\frac{112666827183116235892325831063}{686127236264864409019398540749761}
\mu_{7}=\frac{536778086928248989283123883507309287148693034345565}{8663791645046173690408989931892492266198652103814670581}.

For larger values of d, we must move to numerical approximations. Rounding errors become a concern for d>14, when the minimum entry in the transition matrix is close to the precision of the computer. We summarize these numerical values in the table below. In particular, when d=14 we obtain \varepsilon_{14}<10^{-7}, and so the main theorem holds.

d     \mu_{d}                      \varepsilon_{d}
2     0.012987012987012988         0.013333333333333334
3     0.0037893210782766634        0.0038182582915294574
4     0.0012581826460420552        0.001261356680213082
5     0.0004459813766302923        0.0004463795305453566
6     0.00016420690103551534       0.00016426084656466706
7     6.1956486134\cdot 10^{-5}    6.1964164298\cdot 10^{-5}
8     2.3776197997\cdot 10^{-5}    2.3777328666\cdot 10^{-5}
9     9.2381456535\cdot 10^{-6}    9.2383163433\cdot 10^{-6}
10    3.6235531968\cdot 10^{-6}    3.6235794573\cdot 10^{-6}
11    1.4319129399\cdot 10^{-6}    1.4319170406\cdot 10^{-6}
12    5.6925354755\cdot 10^{-7}    5.6925419565\cdot 10^{-7}
13    2.2742611942\cdot 10^{-7}    2.2742622287\cdot 10^{-7}
14    9.1236746477\cdot 10^{-8}    9.1236763126\cdot 10^{-8}

5. Acknowledgment

This research has been partially supported by NSERC and by the Ryerson University Faculty of Science Dean’s Research Fund. The numerical results presented in this paper were obtained using the Julia language [3]. We would like to thank Bogumił Kamiński from the SGH Warsaw School of Economics for helping us to implement it. The program is available online at https://math.ryerson.ca/~pralat/.

References

  • [1] D. Bal, P. Bennett, S. English, C. MacRury, and P. Prałat. Zero-forcing in random regular graphs. Journal of Combinatorics, 12(1):85–116, 2021.
  • [2] F. Barioli, W. Barrett, S. Fallat, H. Hall, L. Hogben, B. Shader, P. van den Driessche, and H. van der Holst. Zero forcing parameters and minimum rank problems. Linear Algebra Appl., 433(2):401–411, 2010.
  • [3] J. Bezanson, A. Edelman, S. Karpinski, and V.B. Shah. Julia: A fresh approach to numerical computing. SIAM review, 59(1):65–98, 2017.
  • [4] B. Brimkov and I. Hicks. Complexity and computation of connected zero forcing. Discrete Appl. Math., 229:31–45, 2017.
  • [5] D. Burgarth, D. D’Alessandro, L. Hogben, S. Severini, and M. Young. Zero forcing, linear and quantum controllability for systems evolving on networks. IEEE Trans. Automat. Control, 58(9):2349–2354, 2013.
  • [6] Y. Chan, E. Curl, J. Geneson, L. Hogben, K. Liu, I. Odegard, and M. Ross. Using Markov chains to determine expected propagation time for probabilistic zero forcing. Electronic Journal of Linear Algebra, 36:318–333, 2020.
  • [7] K. Chung, H. Lam, Z. Liu, and M. Mitzenmacher. Chernoff–Hoeffding bounds for Markov chains: generalized and simplified. In 29th International Symposium on Theoretical Aspects of Computer Science, page 124, 2012.
  • [8] R. Davila and F. Kenter. Bounds for the zero forcing number of graphs with large girth. Theory Appl. Graphs, 2(2):Art. 1, 8, 2015.
  • [9] S. English, C. MacRury, and P. Prałat. Probabilistic zero forcing on random graphs. European J. Combin., 91:103207, 22, 2021.
  • [10] A. Frieze and M. Karoński. Introduction to random graphs. Cambridge University Press, Cambridge, 2016.
  • [11] J. Geneson and L. Hogben. Propagation time for probabilistic zero forcing. arXiv:1812.10476, 2018.
  • [12] AIM Minimum Rank-Special Graphs Work Group. Zero forcing sets and the minimum rank of graphs. Linear Algebra Appl., 428(7):1628–1648, 2008.
  • [13] L. Hogben, M. Huynh, N. Kingsley, S. Meyer, S. Walker, and M. Young. Propagation time for zero forcing on a graph. Discrete Appl. Math., 160(13-14):1994–2005, 2012.
  • [14] D. Hu and A. Sun. Probabilistic zero forcing on grid, regular, and hypercube graphs. arXiv:2010.12343, 2020.
  • [15] S. Janson, T. Łuczak, and A. Rucinski. Random graphs. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley-Interscience, New York, 2000.
  • [16] T. Kalinowski, N. Kamčev, and B. Sudakov. The zero forcing number of graphs. SIAM J. Discrete Math., 33(1):95–115, 2019.
  • [17] C. Kang and E. Yi. Probabilistic zero forcing in graphs. Bull. Inst. Combin. Appl., 67:9–16, 2013.
  • [18] S. Narayanan and A. Sun. Bounds on expected propagation time of probabilistic zero forcing. arXiv:1909.04482, 2019.
  • [19] M. Zhao, L. Kang, and G. Chang. Power domination in graphs. Discrete Math., 306(15):1812–1816, 2006.