Square percolation and the threshold for quadratic divergence in random right-angled Coxeter groups
Abstract.
Given a graph , its auxiliary square-graph is the graph whose vertices are the non-edges of and whose edges are the pairs of non-edges which induce a square (i.e., a -cycle) in . We determine the threshold edge-probability at which the Erdős–Rényi random graph begins to asymptotically almost surely have a square-graph with a connected component whose squares together cover all the vertices of . We show , a polylogarithmic improvement on earlier bounds on due to Hagen and the authors. As a corollary, we determine the threshold at which the random right-angled Coxeter group asymptotically almost surely becomes strongly algebraically thick of order and has quadratic divergence.
2010 Mathematics Subject Classification:
05C80, 20F65, 57M15, 60B99, 20F55, 20F691. Introduction
In this paper we investigate the phase transition for a variant of “square percolation”, with motivation coming from both previous work on clique percolation and from questions in geometric group theory.
Clique percolation was introduced by Derényi, Palla and Vicsek [9] as a simple model for community detection, and quickly became well-studied in network science, from computational, empirical, and theoretical perspectives, see e.g. [7, 9, 18, 20, 21, 22]. In –clique percolation, to investigate the “community structure” of a graph or network one studies the auxiliary -clique graph whose vertices are the –cliques of and whose edges are those pairs of –cliques having at least vertices in common.
One of the main research questions in the area was determining the threshold for the emergence of a “giant component” in the auxiliary -clique graph when the original graph is an Erdős–Rényi random graph on vertices with edge-probability . This was completely resolved in 2009 by Bollobás and Riordan [7], in a highly impressive paper making sophisticated use of branching processes. In the concluding remarks of their paper, Bollobás and Riordan suggested a study of “square percolation” as a natural extension of their work. More precisely, given a graph they suggested studying the component structure of the auxiliary graph whose vertices are the not necessarily induced -cycles in , and whose edges are pairs of –cycles with a diagonal111Note that given a –cycle in a graph, we use the term “diagonal” to refer to the pair of vertices of a diagonal, even though they they may not span an edge in the graph; indeed most of this paper concerns induced –cycles so that the edge spanned by the diagonal is not in the graph. in common. For , they stated that they believed the threshold for the associated auxiliary graph to contain a giant component containing a positive proportion of all squares of should be , where (see the discussion around equation (19) in Section 2.3 of [7]).
A related (but slightly different) notion of “square percolation” arose independently in joint work of the authors with Hagen [5] on the divergence of the random right-angled Coxeter group, providing motivation from geometric group theory for understanding the phase transition in an auxiliary graph formed from the induced -cycles of an Erdős–Rényi random graph . To make this more precise, we make the following definition.
Definition 1.1.
To any graph , we associate an auxiliary square-graph, , whose vertices are the non-edges of , and whose edges are the pairs of non-edges of that together induce a –cycle (a.k.a. square) in .
Thus for vertices in a graph , the pair is an edge of if and only if (i) and are non-edges of (and thus vertices of ), and (ii) , , and are all edges of .
Remark 1.2.
This definition of the auxiliary square-graph differs slightly from the one used in the related papers [5, 8]. In those papers, the auxiliary graph had the induced -cycles as its vertices, and its edges were those pairs of induced -cycles having a diagonal in common. These two variants of auxiliary square-graphs encode essentially the same information, but the formulation above is more natural from a combinatorial perspective and more convenient for the exploration processes we shall consider in this paper.
We investigate the component structure of , albeit with an unusual twist. With a view to applications in geometric group theory, we will be interested in the question of whether or not has a component that “covers” all of the vertex-set of the original graph .
Definition 1.3.
We refer to connected components of as square-components of . Given a square-component we define its support to be the collection of vertices of given by:
and say that the component of covers the vertex set . If covers all of , we say it is a square-component with full support
Write to denote that is an instance of the Erdős–Rényi random graph model with parameters and , i.e., that is a graph on vertices obtained by including each edge at random with probability , independently of all the others.
Our main combinatorial result in this paper is pinpointing the precise threshold at which asymptotically almost surely222As usual, asymptotically almost surely or a.a.s. is shorthand for “with probability tending to as .” experiences a phase transition from having only square-components with support of logarithmic order to having a square-component with full support. Throughout this paper we set . The following two results establish that the critical threshold probability is by showing highly disparate behavior on either side of this threshold as given by the following two contrasting results.
Theorem 1.4 (Subcritical Behavior).
Let be fixed. Suppose that . Then for , a.a.s. every square-component of covers at most vertices.
Theorem 1.5 (Supercritical Behavior).
Let be fixed, and let be a function with and as . Let be an edge-probability with
Then for , a.a.s. there is a square-component of covering all vertices of .
Our proofs of Theorems 1.4 and 1.5 confirm the conjecture of Bollobás and Riordan regarding the location of the phase transition for their version of (non-induced) square percolation. Further, Theorems 1.4 and 1.5 have a direct application to the study of the geometric properties of random right-angled Coxeter group, which we now describe.
Given a graph , we define the associated right-angled Coxeter group (RACG) by taking the free group on and adding the relations and for all , . In this way, the graph encodes a finite presentation for the right-angled Coxeter group . Given graphs and , it is well-known that the associated groups and are isomorphic if and only if the graphs and are isomorphic, see [19]. Thus algebraic and geometric properties of can be studied via purely graph-theoretic means, as we do in this paper. Indeed, a number of geometric properties of a right-angled Coxeter group admit encodings as graph-theoretic properties of the presentation graph . Such properties include thickness and having quadratic divergence, which are both important in geometric group theory (see Section 3 below for a formal definition of these notions). An investigation of right-angled Coxeter groups with quadratic divergence was the main motivation for the work undertaken in this paper.
The correspondence between right-angled Coxeter groups and graphs allows one to define models of random groups based on random graph models. In particular, in this paper we consider the random right-angled Coxeter group, where the presentation graph is an Erdős–Rényi random graph. Using Theorems 1.4 and 1.5 above on square-components in Erdős–Rényi random graphs, we prove the following.
Theorem 1.6 (Criticality for quadratic divergence of RACGs).
Let . If
and . Then, a.a.s. the right-angled Coxeter group has quadratic divergence and is strongly algebraically thick of order exactly .
On the other hand, if satisfies
then the right-angled Coxeter group a.a.s has at least cubic divergence and is not strongly algebraically thick of order or .
The geometric properties of when and was described in detail by Behrstock, Hagen and Sisto in [6, Theorem V]. Together with their work, our results give an essentially complete picture of quadratic and linear divergence in random right-angled Coxeter groups.
Organization of the paper
In Section 3 we provide additional background material on the geometry of random groups and derive Theorem 1.6 from Theorems 1.4–1.5. In Section 4, we recall some basic facts about branching processes and give an outline of the proof strategy we follow for our main results, and of the ways in which it departs from the framework used by Bolloás and Riordan in their study of clique percolation in random graphs. Theorem 1.4 is proved in Section 5, while Theorem 1.5 is derived in Section 6. We end the paper in Section 7 with some discussion of the results and of further work and related problems.
Acknowledgments
The authors thank Mark Hagen for discussions during the early stages of this work. Behrstock was supported by NSF grant DMS-1710890. Falgas-Ravry was supported by Swedish Research Council grant VR 2016-03488. The authors thank Ela Behrstock for her skillful rendering of Figures 2 and 2. Aspects of this work were motivated by output from software written by the authors (available upon request from the authors, some available online at http://comet.lehman.cuny.edu/behrstock/random.html). Accordingly, this research was supported, in part, by a grant of computer time from the City University of New York High Performance Computing Center under NSF Grants CNS-0855217, CNS-0958379, and ACI-1126113.
2. Graph-theoretic notation and standard notions
Given a set and , let denote the collection of all subsets of of cardinality . So for example is the collection unordered distinct pairs of elements of . As a notational convenience, we set , and we often denote the unordered set by .
A graph is a pair , where is a set of vertices and is a collection of pairs of vertices referred to as the edges of . A subgraph of is a graph with and . If and , then we say is the subgraph of induced by and denote this fact by . When there is no risk of confusion, we may abuse notation and use to refer to both the subset of and the associated induced subgraph . The complement of a graph is the graph .
A path of length in a graph is an ordered sequence of distinct vertices together with a set of edges . Such a path is said to join to . Two vertices are said to be connected in if there is a path joining them. Being connected is an equivalence relation on the vertices of . A (connected) component of is then a nonempty set of vertices from that forms an equivalence class under this relation.
In this paper we study squares in graphs. An square, or –cycle, in is a copy of the graph as an subgraph of . In an abuse of notation, we will denote such a by . In other words, if we say “ is a copy of /a square in ”, we mean “”. Further if we say “ is an induced /square in ”, we mean that is a square in and that in addition . A useful notion for studying squares in graphs is that of a link graph: given a vertex , the link graph of is the collection of neighbors of in , i.e., .
By we mean that is a random graph on the vertex set obtained by including each edge in with probability , independently of all the others. This is known as the Erdős–Rényi random graph model. Given a sequence of edge probabilities and a graph property , we say that a typical instance of has property , or, equivalently, that holds asymptotically almost surely (a.a.s.) if
Throughout the paper, we use standard Landau notation: given functions , we write for and if there exists a constant such that . Further we write for , for . Finally if and both hold, we denote this fact by .
3. Geometric group theory and the property
Our main result in this paper establishes that is the threshold for a typical instance of the Erdős–Rényi random graph model to have a square-graph with a component covering all of . This property is a.a.s. equivalent to possessing the –property, defined below.
3.1. Background
Recall that the graph joint of two graphs and is the graph obtained by taking disjoint unions of and , and adding in all edges from to .
Definition 3.1.
A finite graph is defined to be (constructed from squares) if has induced subgraphs and with a (possibly empty) clique so that:
-
•
, and
-
•
has a component with .
Dani–Thomas were the first to introduce a special case of the property for triangle-free graphs in [8]. The property for arbitrary graphs was then studied by Hagen and the authors in [5], with an eye towards establishing when this property holds a.a.s. in random graphs, while in [15] Levcovitz studied the geometric properties of right-angled Coxeter groups whose presentation graphs do not possess the property.
With Hagen, the authors determined in [5] the threshold for the property to hold a.a.s. in Erdős–Rényi random graphs up to a polylogarithmic factor.
Theorem 3.2 (Theorems 5.1 and 5.7 in [5]).
If , then a.a.s. a graph does not have the property. On the other hand if and , then a.a.s. does have the property.
Our contribution in this paper is to eliminate the polylogarithmic gap in Theorem 3.2 and thus to determine the precise threshold for the property in random graphs.
The property is closely linked to the large scale geometry of right-angled Coxeter groups, connected to divergence and (strong algebraic) thickness. Divergence is a quasi-isometry invariant of groups introduced by Gersten [12] and further developed by Druţu, Mozes and Sapir [10], while thickness was introduced by Behrstock–Druţu–Mosher in [4] and then further refined by Behrstock–Druţu in [3]. We define these notions and explain how they are related below.
Definition 3.3.
Let be a geodesic metric space, let and let . Given with , we define to be the infimum of the lengths of paths in between and , if such a path exists, and otherwise; here denotes the ball of radius about . We then set
The divergence of is defined to be the collection of functions ,
Given two non-decreasing functions , we say that if there exists so that:
and we say if and . Importantly, two polynomials that are non decreasing and have the same degree are equivalent under this relation, and further for we have if and only if .
When is the Cayley graph of a right-angled Coxeter group, it is straightforward to see that . Therefore when we are referring to the divergence function of , we will mean . We say that a RACG has quadratic divergence if and linear divergence if .
Definition 3.4.
Let be a finitely generated group.
-
•
We say that is strongly algebraically thick of order if it has linear divergence.
-
•
We say that is strongly algebraically thick of order at most if has a collection of subgroups so that:
-
–
has finite index in
-
–
for there exists a sequence of elements of so that is infinite for each
-
–
there exists a constant so that each is –quasiconvex, that is to say, every pair of points in can be connected by an –quasigeodesic contained in
-
–
each is strongly algebraically thick of order at most .
-
–
Further, we say that is strongly algebraically thick of order exactly if it is strongly algebraically thick of order at most and not strongly algebraically thick of order at most . We also usually write “thick” as a shorthand for “strongly algebraically thick”.
Behrstock and Druţu discovered that the order of thickness provides upper bounds on the divergence of a metric space. In particular they proved:
Proposition 3.5 ([3, Corollary 4.17]).
Let be a finitely generated group which is strongly algebraically thick of order at most . Then for every , .
The group theoretic motivation for studying the property is that it provides a graph theoretical proxy for certain geometric properties of right-angled Coxeter groups, such as their divergence. To see that has quadratic divergence when has the property is straightforward, since interpreting the definition of in the Cayley graph yields a chain of linearly many spaces with linear divergence with each intersecting the next in an infinite diameter set. Indeed, it is an immediate consequence of the definitions that if if is the direct product of two infinite groups then has linear divergence, just as a path avoiding a linear-sized ball in the plane has linear length. Hence every finitely generated abelian group of rank at least has linear divergence (and is thick of order ).
Now, if where is a clique, then . In such a case is a finite-index subgroup of and thus, up to finite index, we can assume that does not contain a vertex sending an edge to all other vertices of .
Now, contains a network of convex subgroups generated by the induced square in the full-support component of . Each of these groups is virtually , that is to say has a finite index subgroup which is a copy of . Further, two induced squares in correspond to incident edges in if and only if the intersection of the associated virtual subgroups is virtually , that is to say has a finite index subgroup which is a copy of .
Thus, paths in the full-support component of give the connecting sequences needed in Definition 3.4. Hence if has the property, is thick of order at most and has at most quadratic divergence.
As shown by Dani–Thomas in the triangle-free case [8, Theorem 1.1 and Remark 4,8], and by the present authors with Hagen in the general case (as above) [5, Proposition 3.1], if has the property then the associated right-angled Coxeter group has thickness of order at most , and hence has at most quadratic divergence. Further, in [6], Behrstock, Hagen and Sisto show that a right-angled Coxeter group has linear divergence (and is thick of order ) if and only if is the join of two non-complete graphs. Finally Levcovitz proved that any graph without has at least cubic divergence [15], and so we see that has exactly quadratic divergence if and only if is not the join of two non-complete graphs and has the property.
3.2. Proof of threshold for quadratic divergence in random RACGs
Assuming our main theorems about square percolation, Theorems 1.4 and 1.5, we are now in a position to provide a proof of Theorem 1.6 on the threshold for quadratic divergence in RACGs:
Proof of Theorem 1.6 from Theorems 1.4 and 1.5.
Let , and suppose that
By Theorem 1.5, the graph a.a.s. has the property. Thus, by [5, Proposition 3.1], a.a.s. has at most quadratic divergence and is thick of order at most 1. Further, since , standard results on the connectivity of Erdős–Rényi random graphs tell us that a.a.s. the complement of is connected, and thus that itself is a.a.s. not the join of two non-trivial graphs. Thus, [6] implies that is not thick of order and hence is thick of order exactly one and has precisely quadratic divergence.
4. Branching processes and proof strategy
4.1. Branching processes
We recall here some basic facts and definitions from the theory of branching processes that we will use in our argument; for a more general treatment of such processes, see e.g. [2].
Definition 4.1.
A Galton–Watson branching process with offspring distribution is a sequence of non-negative integer-valued random variables with and for all , , where the : are independent, identically-distributed random variables with for all .
A Galton–Watson branching process can be viewed as a random rooted tree: in the zero-th generation there is a root or ancestor, who begets a random number of children that form the first generation. In every subsequent generation, each child independently begets a random number of children, with the -th member of generation begetting children.
Galton–Watson branching processes are a widely studied family of random processes and are the subject of much probabilistic research; see e.g. [2] and the references therein. Here we introduce only some fairly standard elements of the theory that are needed for our argument. A Galton–Watson process is said to become extinct if for some . The total progeny of is the total number of vertices in the associated tree, which we denote by ; this quantity is finite if and only if becomes extinct.
A key tool in the study of is the generating function of its offspring distribution, . The following standard results from the theory of branching processes relate the probability of extinction for to the mean and generating function of its offspring distribution .
Proposition 4.2 (See e.g. [2]).
Let and . Let be a Galton–Watson branching process with offspring distribution . Then the following hold:
-
(i)
(subcritical regime) if , then almost surely becomes extinct, and what is more
-
(ii)
(supercritical regime) if , then the probability that becomes extinct is the smallest solution to the equation
and satisfies .
We shall also need the following result on the distribution of the total progeny of .
Proposition 4.3 (Dwass’s formula [11]).
Let be a Galton–Watson branching process with offspring distribution . Then the total progeny satisfies
where are independent, identically distributed random variables with for all .
4.2. Departures from the Bollobás–Riordan framework
Bollobás and Riordan in [7] developed a powerful branching process framework for the study of clique percolation. Much of that framework can be adapted to the study of the non-induced square percolation we are concerned with in this paper. However there remains a number of significant hurdles which need to be overcome in order to extend their techniques to the present setting.
In the subcritical regime, the structure of squares makes the analysis of exceptional edges and offspring distributions (which are the crux of the argument) differ significantly from the Bollobás–Riordan paper; care is needed to handle the resulting complications correctly. Indeed, Bollobás and Riordan are able to model clique percolation using a Galton–Watson branching process whose offspring distribution is roughly Poisson; however, for square percolation, the offspring distribution is more heavy-tailed, forcing us to resort to somewhat delicate technical arguments.
Further, in the supercritical regime, because of our motivation from geometric group theory, we are interested in the study of induced square percolation. In particular, adding new edges to a graph could destroy some induced squares and hence split apart square-components even as we are trying to build a giant square-component. This situation is quite unlike that in clique percolation, and we have to use a completely different sprinkling argument to obtain our results (inter alia sprinkling vertices rather than edges). Thus here again there are significant complications and major departures from Bollobás and Riordan’s framework in [7].
4.3. Proof strategy
Our results rely on the analysis of a branching process exploration of the square-components of a graph for some fixed where .
We begin with an arbitrary induced square in . Its diagonals and give us two pairs of non-edges which can be used to discover further non-edges of belonging to the same square-component. The size of the set of common neighbors of and in is a binomially distributed random variable . Assuming that is an independent set (i.e., contains no edge of ) these common neighbors together with give rise to non-edges that lie in the same square-component as ; however, since we already knew about the pair , only of these are new. We then pursue our exploration of the square-component of by iterating this procedure: for each as-yet untested non-edge in our square-component, we can first find the common neighbors of , and add as “children” of all the new non-edges discovered in this way.
This can be viewed as a Galton–Watson branching process with offspring distribution in a natural way. Assuming the past exploration does not greatly interfere with the distribution of the number of children in our process, the expected number of children at each step is roughly equal to
The expected value of is readily computed from the first and second moments of the binomial distribution of , yielding
The Galton–Watson process becomes critical when the expectation of its offspring distribution is . Solving
and selecting the non-negative root , we thus see that for for any fixed , our branching process is subcritical. We thus expect it to terminate a.a.s. after a fairly small number of steps, from which one can hope to, in turn, deduce that a.a.s. all square-components are small. On the other hand, for any fixed , is supercritical, and with probability strictly bounded away from zero it does not terminate before we have discovered a reasonably large number of non-edges. A second-moment argument can then be used to show that a strictly positive proportion of non-edges must lie in reasonably large square-components. With a little glueing work, we can then hope to show that in fact a strictly positive proportion of non-edges lie in a giant square-component that covers all the vertices of .
The above is however a simplification of what is actually required to make the arguments go through, and the situation turns out to be considerably more nuanced than what we described above. A first issue is our assumption that the vertices in form an independent set: in the subcritical regime, we need to consider what happens if the set of common neighbors of some non-edge which we are testing interacts with some other previously discovered vertices, or with vertices in . In particular, any “exceptional” edge from to previously discovered vertices other than , could potentially create many additional squares, and hence add many new pairs to our square-component which are not accounted for by our branching process. Bollobás and Riordan faced a similar problem in their work on clique percolation. However, as stated in the previous subsection, the way they dealt with “exceptional edges” does not quite work for us in the square percolation setting. One issue is that in a copy of , vertices on opposite sides of a diagonal are not adjacent, so that the number of -cycles created by an exceptional edge cannot always be bounded by the degree of a newly discovered vertex. In addition, we note that if it is not dealt with properly, the presence of exceptional edges could significantly affect the future distribution of the number of children in our branching process: if in the example above had three common neighbors among the already discovered vertices rather than two, then the correct number of children for in the exploration process would be , which has expectation equal to when . Finally, for the argument to work, we need not only for a Galton–Watson branching process with offspring distribution to become a.a.s. extinct within a few generations (which is an easy first moment argument): we also need its total progeny to be a.a.s. small. Here the fact that is a quadratic function of the binomial random variable (and thus rather heavy-tailed) causes complicated issues, which were not faced in [7] (where the offspring distribution was essentially Poisson with mean ). Overcoming these problems is the main work done in Section 5.
Secondly, in the supercritical argument, after establishing the a.a.s. existence of many non-edges in reasonably large square-components, we must prove the a.a.s. existence of a giant square-component covering all vertices and a strictly positive proportion of non-edges of . Here the crucial point is that, because of the applications in geometric group theory motivating our work, we are considering induced square percolation. The size of a largest square-component in is not monotone with respect to the addition of edges to the graph — adding an edge could very well destroy an induced square, thus potentially breaking a large square-component in to several smaller pieces. So we have to use a completely new sprinkling argument to be able to agglomerate all “reasonably large” square-components into a single giant square-component. To do this we reserve some vertices for sprinkling, rather than edges. We use these vertices to build bridges between reasonably large square-components in a sequence of rounds until all such components are joined into one. Finally once we have established the a.a.s. existence of a giant square-component, some care is needed to ensure this square-component covers every vertex of . Assembling a giant square-component and ensuring it has full support in this way involves overcoming a number of interesting obstacles, and is the main work done in Section 6.
5. The subcritical regime: proof of Theorem 1.4
Theorem 1.4 will be established as an immediate consequence of a stronger result, Theorem 5.1, which we state and prove below after providing a few preliminary definitions.
Given a graph , in addition to the square-graph from Definition 1.1 we shall consider a different but closely related auxiliary graph that includes information about all squares in (rather than just the induced squares). Explicitly we let be the graph whose vertices are pairs of vertices from and whose edges correspond to (not necessarily induced) copies of . The support of a component of is defined as in Definition 1.3, mutatis mutandis.
Note that the square graph is exactly the subgraph of induced by the set of non-edges of . In particular, for every square-component in , there is a component in with and thus . To establish Theorem 1.4, it is thus enough to prove the following stronger theorem that bounds the size of the support in of components of .
Theorem 5.1.
Let be fixed. Suppose that . Then for , a.a.s. every component of has a support of size .
Since the order of the support of the largest component in is monotone non-decreasing with respect to the addition of edges to , we may assume in the remainder of this section that . Further, since is fixed, there exists a constant such that for a binomially distributed random variable we have
(5.1) |
With this last equality in hand, we are now ready to present and analyse the exploration process that lies at the heart of our proof of Theorem 5.1.
We shall discover a superset of the component of which contains sine fixed pair . We begin our exploration by finding common neighbors of and , then adding all pairs of such newly discovered vertices to a set of unexplored pairs. These new pairs obviously lie in the same component of as .
After this initial step in the exploration, we proceed as follows. First, we choose a new pair from our set of unexplored pairs. By assumption, there exists a previously explored pair such that all edges from to are present. We continue our exploration by finding the set of common neighbors of the vertices in among the previously undiscovered vertices of . We then add all pairs to our set of active pairs — these obviously lie in the same component of as — and delete from that set. We then repeat the procedure, choosing a new unexplored pair , finding its common neighbors among undiscovered vertices, etc.
This, however, is not enough to discover the totality of the component of in . Indeed, it is possible that the pair has additional common neighbors among already discovered vertices (in addition to the two neighbors in ), which could give rise to additional, unexplored pairs that lie in the same component of as . To deal with this possibility, we have to add in an exceptional phase in our exploration, which takes care of potential additional edges among the vertices we have discovered. In this exceptional phase, we generously overestimate how many new unexplored pairs could be discovered, and for each of these new pairs we start new and essentially independent versions of our exploration process, which we think of as children processes of our original exploration process.
The key is that, with the exceptional phase factored in, we do discover a superset of the collection of all pairs in the same component of as our starting pair . We compare the non-exceptional phase of our exploration to a subcritical branching process and give upper bound on its total progeny, which is small. Obtaining this bound is somewhat tricky (due to the nature of our offspring distribution) and relies on a rather technical application of Dwass’s formula (Proposition 4.3). The remainder of the proof is provided in Lemma 5.8 which shows that, with high probability, we do not run through more than five exceptional phases. In particular, we do not start too many child processes, so that with high probability our overall exploration process stops before we have discovered a large number of vertices.
5.1. An exploration process
Our exploration process will proceed by considering the following for each time :
-
•
(Discovered vertices.) An ordered set of vertices: .
-
•
(Active pairs.) A set of pairs of vertices (ordered lexicographically with respect to the ordering on ): .
-
•
(Discovered pairs.) A set of pairs of vertices: .
-
•
(Explored edge set.) A set of edges: .
-
•
(Epoch.) An integer: .
These sets will satisfy:
-
()
for all and for every active pair , the vertices and have either or common neighbors in the graph .
The initial state of the exploration consists of the following data, which is seeded by a choice of , an arbitrary pair of vertices from (note that this pair can, alternatively, be thought of as a vertex of ). We set , , and .
At each time step our exploration proceeds as follows, with as given in equation (5.1):
-
1.
If and , let be the first pair in . For each we test whether or not sends an edge in to both vertices of . Set and to be the collection of joint neighbors of and in . (Note, by property , the set is either empty or consists of a pair of discovered vertices.)
We arbitrarily order the vertices in as , and add them to to form . We then set
to be the new collection of active pairs (which again is ordered lexicographically with respect to the ordering on ), set , set , and then proceed to the next time step of the process. Note that since the only new edges being added are ones connecting a new vertex to and and since each pair in has either or common neighbors in , i.e., property () is still satisfied in the next time-step.
-
2.
If , then we terminate the process and declare large stop.
-
3.
If and , then we consider .
If or , then we terminate our exploration process and declare extinction stop or exceptional stop, respectively.
Otherwise we set , update by setting , and set (which by assumption is ). We then run the following subroutines:-
3A.
Set
We then update our value of , setting .
-
3B.
Let
Since subroutine 3A. terminated with each vertex in sends exactly two edges into . We set , and let consist of all pairs of vertices in containing at least one vertex of . Further, we set .
Once this is done, we proceed to the next time-step in the overall exploration process, observing that property has been preserved (since by construction every vertex in has degree exactly two in ).
-
3A.
2pt \pinlabel [ ] at 7 8 \pinlabel [ ] at 12 40 \pinlabel [ ] at 36 91 \pinlabel [ ] at 36 9 \pinlabel [ ] at 27 51 \pinlabel [ ] at 46 51 \pinlabel [ ] at 136 40 \pinlabel [ ] at 153 40 \endlabellist\includegraphics[scale=1.0]Figure_Process_1
2pt \pinlabel [ ] at 119 99 \pinlabel [ ] at 119 18 \pinlabel [ ] at 51 64 \pinlabel [ ] at 154 64 \pinlabel [ ] at 87 65 \pinlabel [ ] at 124 3 \endlabellist\includegraphics[scale=1.0]Figure_Process_2
5.2. Analysing the process
The exploration process defined in the previous subsection can terminate for one of three reasons:
-
(1)
(large stop);
-
(2)
(exceptional stop);
-
(3)
and (extinction stop).
It follows from the above that the process must in fact terminate within time-steps. We begin our analysis by noting that, given our aim of proving Theorem 1.4, extinction stops are good for us:
Lemma 5.2.
Suppose that the exploration from terminates at time with an extinction stop. Let be the component of containing . Then . Furthermore, the number of vertices in the support of is at most and .
Proof.
We perform our exploration process from the pair , and assume it terminates with an extinction stop at time . It is enough to show that given , for every neighbor of in , there is some such that .
If was discovered at a time-step where 1. applied or if , then and was an active pair at some time . Thus, at some later time-step where 1. applies, our exploration process selects as its “exploration pair” and discover all neighbors of in with — where is the pair we used to discover , and is the collection of joint neighbors of and that lie in . If is in this set, then .
Otherwise, if we failed to find at this time , must contain at least one vertex from . By property of our exploration, at least one of the edges , lies outside .
In particular, since we do not end with a large or exceptional stop, this edge will be uncovered at later time step where 3.3A. applies. But by the end of 3.3B., all vertices sending at least two edges into have been added to . Thus all common neighbors of will be found (since , and a common neighbor of has at least two neighbors in ). Hence after the updates . There are two options: if both vertices of are present after 3.3A., then , since after 3.3A. all possible pairs of discovered vertices (not already tested) are added to . Otherwise, both vertices of are present after 3.3B., and since at least one of them was discovered in 3.3B., is added to , and we are done again.
If on the other hand was discovered at a step where 3.3B. applies, then is added to , and the above applies. Finally, if was discovered at a time step where 3.3A. applies, then in 3.3A. and 3.3B., all common neighbors of are added to , and all such pairs are added to or .
Either way, since for all , we see that . Thus every neighbor of in is eventually discovered by our exploration process, and as claimed. Further, since by construction, and since our exploration ends with an extinction stop by the hypothesis of the lemma, we have as claimed. ∎
We now turn to the technical crux of the analysis.
Lemma 5.3.
If we are at a time-step of the process where 1. applies, then given the past history of the process, the random variable counting the number of new active pairs discovered by is stochastically dominated by a random variable , where is a binomial random variable with parameters and .
Proof.
As observed by Bollobás and Riordan [7, Inequality (3)], the past of the exploration is the intersection of the principal increasing event (corresponding to the edges that we have discovered are present in ) and a decreasing event (corresponding to the intersection of a number of events of the form “at least one of is not in ”). In particular appealing to Harris’s Lemma [13], the conditional probability that and both send an edge to in given the history is at most the unconditional probability . (We note here that the fact is essential — we have no control over the conditional probabilities of edges inside the set of discovered vertices .)
Thus conditional on the past history of the exploration process, the distribution of is stochastically dominated by a random variable . The Lemma then immediately follows from the definition of our exploration process. (Note that here we are using the fact property () is maintained throughout our exploration.) ∎
Lemma 5.4.
Let and . Then .
Proof.
Recall that in this section, , for some constant . Since , we have:
where in the last two inequalities we used the fact . Since , this immediately gives us the desired bound
∎
Corollary 5.5.
.
Proof.
Fix . By Lemma 5.4 with , we have
Taking a union bound over all possible choices of the pair , the lemma follows. ∎
We now analyse the total progeny of the Galton–Watson branching process with offspring distribution given by the random variable from the statement of Lemma 5.3. By (5.1), and thus the branching process is subcritical. Unfortunately, combining the Markovian bound on the extinction time from Proposition 4.2(i), with the bounds on the maximum degree in from Corollary 5.5 does not give us sufficiently good control on the extinction time and total progeny of our Galton–Watson process. Thus we turn to an application of Dwass’s formula to obtain the tighter bounds needed for the proof of Theorem 1.4.
Lemma 5.6.
Let be a Galton–Watson branching process with an offspring distribution as in Lemma 5.3. Set . Then is subcritical, and its total progeny satisfies
Proof.
Let be an infinite family of independent, identically distributed copies of . For each and every , write as , where . Set . By construction, . Since , where , and since , Lemma 5.4 implies that for every and ,
(5.2) |
Applying Dwass’s formula, Proposition 4.3, to our branching process , we have that for any ,
(5.3) |
Since , for we have that and hence . Thus we have
Since the random variables are by construction independent random variables with mean zero and absolute value at most , we can apply a standard Chernoff bound [1, Theorem A.1.16] in (5.2) to obtain:
Letting, we have that for sufficiently large, all satisfy . In particular for such we have
We now use Lemma 5.6 to estimate the probability that our exploration ends with a large stop. The key is to view our exploration as a branching process of branching processes: we begin with a subcritical branching process, corresponding to step 1. in our exploration process. Call this the ancestral branching process. When this process becomes extinct, we run through step 3. of our exploration process, potentially adding new active pairs to our otherwise empty set of active pairs . For each of these new pairs, we start an independent child branching process. For each of these we repeat the same procedure as for the ancestral branching process (so the child processes can generate their own child processes, and so forth). Thus to bound the total number of pairs discovered over the entire course of our exploration, we must control the growth of this branching process of branching processes.
Lemma 5.7.
The probability that the exploration from terminates with a large stop is .
Proof.
We view our exploration process as a kind of branching process of branching processes, with parent processes begetting children processes as described at the start of this section. Beginning from the single pair , the exploration of its component in undertaken at time-steps when 1. applies is dominated by a subcritical branching process as in the statement of Lemma 5.6. When that process terminates, we have discovered a certain set of vertices of . If 3. applies, then we add a certain number of vertices to to form and then add a subset of the pairs from to form . We may view each of the pairs added to at this time as the root of a subcritical independent branching process . There are at most of these “child-processes”, and they are stochastically dominated by independent copies of the subcritical branching process from Lemma 5.6. When all these branching processes have become extinct, we now have a (larger) set of discovered vertices and we may be back at a time step where 3. applies. We repeat our procedure — adding vertices to form , adding new pairs to form , etc. The whole procedure can begin again at most times (for otherwise an exceptional stop must have occurred).
We run our exploration ignoring large stops and only applying 1. and 3. until the process terminates (with an exceptional stop or an extinction stop), and show that if an exceptional stop does not occur then with probability the final size of the discovered set of vertices is at most .
Since we never can consider more than different active pairs, it follows that we never start more than branching processes in the course of our exploration. By Lemma 5.6, and a union bound, the probability that one of our at most branching processes has a total progeny of more than is . Assume from now on this does not happen. Since the progeny of our processes correspond to pairs , none of our processes can add more than vertices to the set of discovered vertices .
Further, by Corollary 5.5, the probability that there is any pair such that is at most . Assume from now on this does not happen. Then in the first time-step where 3. applies, we can bound the number of vertices added to : each pair can contribute at most vertices to a or , and as we do not have an exceptional stop we can repeat the addition procedure at most times. In particular we have
The number of child processes started at that time step is at most ; by our assumption, each of these discovers at most vertices in total, so that by the next time-step when 3. applies, we have
Repeating the analysis above, we obtain
and in the next time-step where 3. applies we have
and we can keep going in this way, defining , mutatis mutandis. If we avoid an exceptional stop, then we must terminate by the fifth time-step when 3. applies. Iterating our analysis, we see that the size of the final set of discovered vertices is as most
This shows that the probability our process terminates with a large stop is . ∎
Finally, we compute the probability that an exploration ends with an exceptional stop.
Lemma 5.8.
The probability that the exploration from terminates with an exceptional stop is .
Proof.
Suppose that the exploration from terminates at time with an exceptional stop. Then we must have discovered at least exceptional edges at time-steps when 3. applied, where we call an edge exceptional if it appeared in (type 1) or as the third or above edge from some vertex to (type 2), and where such edges are ordered according to the ordering of the vertices of .
Since the exploration did not terminate with a large stop, at each of these time steps we had at the start of each time-step . Also, since we did not terminate with a large stop, the time at which the process terminated must satisfy (this is an upper bound on the number of pairs we could have tested at time steps where 1. or 3. applied).
In any time-step where 3. applies and we are testing for membership in one of the (at most five) sets considered in that turn, the probability that a vertex in sends at least three edges of to the set is at most
Since we have at most vertices in and at most time-steps to choose from, the probability of having found at least type 2 exceptional edges for some is
(5.6) |
A similar (but simpler) calculation yields that the probability of having found edges of type 1 is:
(5.7) |
Adding the bounds (5.6) and (5.7) together and substituting in the value of , the Lemma follows. ∎
We are now in a position to prove Theorem 5.1, which, as previously noted, immediately implies Theorem 1.4.
Proof of Theorem 5.1.
Let be an arbitrary pair of vertices from . By Lemmas 5.7 and 5.8, with probability the exploration from terminates with an extinction stop. By Lemma 5.2 we obtain a bound on the size of each component of in found by this exploration, and a bound on its support as well. By a simple union bound, with probability , all pairs lie in components of supported on sets of size at most in . ∎
6. The supercritical regime: proof of Theorem 1.5
Fix . Suppose , where is a function with and . Let . By [5, Theorem 5.1], we know that if , then a.a.s. there is a square-component covering all of . We may thus restrict our attention in the proof of Theorem 1.5 to the range . For such , there exists such that if , then
(6.1) |
We shall prove the existence of a giant square-component with full support in four stages: first, we define an exploration process in . In a second stage, we analyse the process to show that a.a.s. a large proportion of non-edges of lie in “somewhat large” square-components of . Next, in the (more involved) third stage of the argument, we perform vertex-sprinkling to show a.a.s. a large proportion of non-edges of lie in a giant square-component. Finally, we show there a.a.s. a giant square-component covering all of .
6.1. An exploration process
We consider an exploration process consisting of the following data at each time :
-
•
(Discovered vertices.) An ordered set of vertices: .
-
•
(Active pairs.) A set of pairs of vertices (ordered lexicographically with respect to the ordering on ): .
-
•
(Reached pairs.) A set of pairs of vertices: .
These sets will satisfy:
-
()
for every active pair , the vertices and have at least common neighbors in the subgraph of induced by .
The initial state of the exploration consists of an arbitrary induced of , denoted , and the sets: , , and .
Our exploration then proceeds as follows:
1. If
, then we terminate the
process.
2. If and , then for each we test whether or not sends an edge in
to both of which are the vertices of the first
pair in the ordered set . We then set . Denote by
the set of common neighbors of and in (which by
property has size at least ).
We then set , and . We then proceed to the next time-step in the exploration process, noting that property is maintained.
3. If and , then we terminate the process.
6.2. Many non-edges in somewhat large components
The exploration process defined in the previous subsection can terminate for one of two reasons:
-
(1)
(Large stop.) , or
-
(2)
(Extinction stop.) .
The process always terminates after some number of time-steps. By construction, at all times the collection of pairs is a subset of the square-component of and in . Our aim is to show that with somewhat large probability .
Lemma 6.1.
At any time-step , the distribution conditional on the past history of the process of the random variable counting the number of new active pairs discovered by stochastically dominates a random variable with mean .
Proof.
Write for the set of edges discovered by the process. As observed by Bollobás and Riordan [7, Inequality (3)], the past of the process is the intersection of the principal increasing event and a decreasing event (corresponding to the intersection of a number of events of the form “at least one of is not in ” for some previously tested pair ). In particular, for and ,
Let denote the decreasing event
Note that the event is independent of . Using this fact and appealing to Harris’s Lemma [13], we have
Now conditional on , the probability that both and are in is readily computed: it is equal to
which is equal to (since).
The indicator function of the event thus stochastically dominates a Bernoulli random variable with mean . Further the are independent events given our conditioning (since each such event is only affected by the state of edges from to ). The random variable thus stochastically dominates a random variable . Let denote the sum of independent Bernoulli random variables with parameter . For , we have
where the inequality follows from the stochastic domination of by , and where in the last line we have used (6.1) and the fact that a binomial distribution with parameters and is close to . ∎
Let denote the extinction probability of the supercritical branching process with the offspring distribution given in the proof of Lemma 6.1. Note that by Proposition 4.2(b), is bounded away from .
Up to the time when it terminates, our exploration process on the square-component of stochastically dominates . We now use this fact to show many non-edges of lie in “somewhat large” square-components.
Lemma 6.2 (Many squares in large square-components).
Fix , satisfying , and as above. Then, with probability the number of induced s in which are part of square-components of order at least satisfies
Proof.
Given a collection of vertices , let be the indicator function of the event that and that our exploration process from terminates with a large stop (which is equivalent to being part of a square-component of order at least ). Conditional on , Lemma 6.1 implies that (which is the probability that the branching process does not become extinct). Applying Wald’s identity, the expectation of thus satisfies
(6.2) |
We now use Chebyshev’s inequality to show is concentrated around its mean. To do this, we must bound . Consider two collections of vertices .
Claim 1.
If , then and satisfy
Proof.
Our claim is that and are essentially independent. Indeed, let us first perform our exploration process from (stopping immediately if does not induce a copy of ). For (and as everywhere in this section), we have
(6.3) |
Thus with probability , the number of vertices added to in the last stage of the exploration process from is at most , implying that the set of vertices discovered by the process from has size at most . Further, the exploration process from tests at most pairs in total. This allows us to bound the probability that the exploration process from interacts with the exploration process from .
First of all, by Markov’s inequality the probability that a vertex in is discovered by the process from is at most
Secondly, the exploration process from tests at most pairs . By Markov’s inequality again, the probability that some sends an edge to both vertices in such a pair is at most
In particular, the probability of given differs from by at most , as claimed. ∎
Claim 2.
For any , we have
Proof.
Fix and consider the various ways in which and could intersect non-trivially.
-
•
There is one choice of with , for which we have .
-
•
Next, we have at most choices of with . Write and . For to be non-zero, both and must induce copies of in and moreover must occur. This is only possible if and sends edges to the two neighbors of in . Arguing as in Claim 1, these two events are almost independent and occur with probability . Thus the contribution of with to the left-hand side of is at most .
-
•
There are at most choices of with . For to be non-zero, it is necessary for to induce a copy of in and for . Arguing as in Claim 1, these two events are almost independent and occur with probability . Thus the contribution of with to the left-hand side of is at most .
-
•
Finally, there are at most choices of with . For to be non-zero, it is necessary for the vertices in to be incident at least three edges in and for . Arguing as in Claim 1, these two events are almost independent and occur with probability at most . Thus the contribution of with to the left-hand side of is at most .
Since , we have , and the analysis above shows the left-hand side of is at most , as claimed. ∎
Corollary 6.3.
Let be fixed, and let be an edge-probability satisfying . Then for all sufficiently small, there exist a constant such that if , then with probability the number of non-edges of that lie in square-components of of order at least , satisfies
Proof.
Let . For sufficiently small, we have . We now consider .
Ideally, we would now like to directly apply Lemma 6.2 in . However, to ensure the stochastic domination in Lemma 6.1, we started our exploration process from an induced rather than a non-edge — so we know that induced s are part of square-components of order at least whereas we want to show non-edges lie in such components. Since some non-edges could have as many as common neighbors in , it would in principle be possible for of order that, for example, the collection of the diagonals of the induced s contained in such “large” components consists of a set of only non-edges. We must thus rule out situation.
The simplest way to do this is to run through our proof of Lemma 6.2 again, but this time for the variant of our exploration process from Section 6.1 where we begin with an arbitrary non-edge of , set , and . We say such an exploration survives infancy if at the first time-step the pair discovers a set of joint neighbors that spans at least one non-edge of .
For in the range we are considering the random graph a.a.s. does not contain a complete graph on vertices, and we can use this to give a constant order lower bound on , the probability the process survives infancy:
Conditional on surviving infancy, by Lemma 6.1 the exploration process from stochastically dominates a supercritical branching process with extinction probability . Applying Wald’s identity, this implies that the number of non-edges of that belong to square-components of order at least satisfies
We now bound much as we did in Lemma 6.2. Given a pair , write for the event that is a non-edge and that our exploration process from terminates with a large stop. Claim 1 from the proof of Lemma 6.2 shows mutatis mutandis that if then
For non-disjoint pairs and , the situation is actually easier than it was in Claim 2: such pairs contribute at most to . Thus
and we conclude that with probability there are non-edges contained in square-components of order at least . The Corollary then follows from a suitable choice of the constant . ∎
6.3. A connecting lemma
The key to our sprinkling argument is the following, which we use to connect the somewhat large square-components into even larger square-components. We connect square-components by sprinkling in vertices, and looking for complete bipartite graphs with bipartition , where , are non-edges in distinct square-components, and is a non-edge inside the set of newly sprinkled vertices — see Figure 3 below.
Recall that a -random bipartite graph with partition is a graph on the vertex set obtained by including each pair with as an edge independently at random with probability .
Lemma 6.4 (Connecting Lemma).
Let , and be fixed. Let be a set of vertices, and be a set of vertices disjoint from . Suppose we are given disjoint subsets of and a subset with the following properties:
-
(1)
;
-
(2)
for every : , and some satisfying: ;
-
(3)
.
Let be an edge probability with
Consider the -random bipartite graph with bipartition . Let Boost be the event that for every with there exists and a triple such that the restriction of to is complete. Then for all sufficiently large we have
The proof of the connecting lemma relies on a celebrated inequality of Janson and some careful book-keeping.
Proposition 6.5 (The extended Janson inequality [14]).
Let be a finite set and a -random subset of for some . Let be a family of subsets of , and for every let be the indicator function of the event . Set , and let and . Then
Proof of Lemma 6.4.
Fix with . Set . Let denote the collection of connecting triples . Further let
Observe that the elements of are subsets of either or edges (depending on whether the pairs and overlap or not) of the complete bipartite graph with bipartition . We shall apply Janson’s inequality to to give an upper bound on the probability that does not connect to via a pair of squares of . To this end, we must compute and bound the and parameters for . The first of these is straightforward:
(6.4) |
To bound the parameter, fix a connecting triple , and consider the contribution to made by pairs of connecting triples that share at least one edge of ; call such pairs of connecting triples dependent.
Write for the set (which can have size either or — the latter if one of the is equal to one of the ) and for the pair . Also let be the collection of edges of from to . Clearly if or , then do not form a pair of dependent connecting triples.
Fix a connecting triple . For , let denote the collection of connecting triples with and . Further let and denote the collection of in with and respectively. We shall bound the sizes of the sets and . Note to begin with that there are at most ways of deleting a vertex in and replacing it by a different vertex in . In particular, for all connecting triples and all , we have
and |
Thus we may focus on bounding the sizes of and in the case where .
Case 1, :
-
•
there are at most ways of splitting into a pair from and a pair from , and at most ways of deleting a vertex from and viewing the remaining vertices as the union of a pair from and an (overlapping) pair from , whence and ;
-
•
there are at most ways of deleting one vertex from and replacing it by another vertex from . As noted above, there are most ways of splitting the resulting -set into a pair from and a pair from , whence ;
-
•
there are at most ways of deleting two vertices from and replacing them by two other vertices from , whence (similarly to the above) we have ; further, there are at most ways of deleting a pair of vertices from and replacing them by a single vertex from , whence (similarly to the above, since there are most ways of viewing three vertices of a pair from and an (overlapping) pair from ) we get ;
-
•
since contains at most pairs, there are at most ways of deleting three vertices in and replacing them by another three vertices from in such a way that the resulting set can still be viewed as the union of a pair from and a pair from , whence (similarly to the above) we have ; further and similarly there are at most ways of deleting three vertices in and replacing them by a pair of vertices from , whence (as before) we have ;
Case 2, :
-
•
there are at most ways of splitting into a pair from from and a pair from , whence ; further there are at most ways of adding a vertex to and splitting the resulting -set into two disjoint pairs, whence ;
-
•
there are at most ways of deleting one vertex from and replacing it by another vertex from , whence (as above) ; further, there are at most ways of deleting one vertex from and replacing it by a pair from , whence (as above) ;
-
•
since contains at most pairs, there are at most ways of deleting two vertices in and replacing them by another two vertices from in such a way that the resulting set can still be viewed as the union of a pair from and a pair from , whence ; similarly, there are at most ways of deleting two vertices in and replacing them by a triple of vertices from in such a way that the resulting set can be viewed as the disjoint union of a pair from and a pair from , whence . .
Given and considering the edges between and , we see that
(6.5) |
Similarly, for we have
(6.6) |
With the bounds on the size of and derived above and equalities (6.5) and (6.6) in hand, we are now ready to bound the contribution to from a connecting triple .
Case 1: .
(6.7) |
(Note in the last line we use the fact that , whence .)
Case 2: .
(6.8) |
Together, inequalities (6.7) and (6.8) yield that
(6.9) |
Applying the Extended Janson Inequality, Proposition 6.5, together with the bounds (6.4) and (6.9) on and , we get:
(6.13) |
Now, the probability that fails to connect to via a connecting triple is exactly the probability that . Applying Markov’s inequality together with (6.3) and using our assumptions that and that are disjoint, we have for sufficiently large that
provided is sufficiently large. This concludes the proof of the connecting lemma. ∎
6.4. Sprinkling vertices
With Lemma 6.4 in hand, we can return to the proof of Theorem 1.5. To complete the proof, we shall use a multiple-round vertex-sprinkling argument. We partition into the union of
-
(i)
on ,
-
(ii)
on , and
-
(iii)
a -random bipartite graph with bipartition .
We further partition into sets of size , (and ignore rounding errors). We say that is a good configuration if it satisfies the conclusion of Corollary 6.3, i.e., if at least non-edges of lie in square-components in of order at least (this is actually slightly weaker than what Corollary 6.3 gives us, but is all we need here).
We shall condition on being a good configuration when we perform our vertex-sprinkling. By Corollary 6.3, this occurs with probability . A key observation is that the state of the edges in and are independent of our conditioning. Our strategy is then to reveal the sprinkling sets one by one, and use them to create bridges between “somewhat large” square-components and thereby increase the minimum order of all “somewhat large” square-components.
More precisely, before stage we have revealed all the edges inside
At this stage, we deem a square-component “large” if it contains at least non-edges of , and “very large” if it contains at least non-edges of (which constitute, as we recall, the vertices of the square-graph). Now in stage , we reveal the set of non-edges of that lie inside and the edges between and . We then merge components as follows: given two square-components and , a connecting triple is a triple . Such a connecting triple is active if all edges between the sets and are in ; in this case the components and lie inside the same square-component in (see Figure 3). In particular, if both and contained at least non-edges, then must contain at least non-edges.
[ ] at 0 107 \pinlabel [ ] at 0 26 \pinlabel [ ] at 0 270 \pinlabel [ ] at 0 189 \pinlabel [ ] at 210 189 \pinlabel [ ] at 210 107 \endlabellist\includegraphics[scale=0.5]Connecting_triple
The connecting lemma we proved in the previous subsection immediately implies that with high probability at each stage , all components which are “large” but not “very large” must join up with at least one other “large” component. We make this explicit with a lemma below. Recall that throughout this section, is fixed and the edge-probability satisfies . Let be the constants whose existence is guaranteed by Corollary 6.3.
Lemma 6.6 (Sprinkling lemma).
Suppose that before stage , at least non-edges of lie in square-components of order at least in . Suppose . Then with probability at least
when we have revealed the edges from to at least non-edges of lie in square-components of order at least in .
In particular, with probability at least
we have that starting from a good configuration , the sprinkling process described above has discovered within steps a giant square-component containing at least non-edges, and all non-edges from that lie inside components of of size at least
Proof.
Let denote the set of non-edges of . We have By a standard Chernoff bound,
If holds, we can apply Lemma 6.4, concluding that every component of size is joined with at least one other, resulting in a component of size . Thus the desired conclusion for the first part of the lemma holds with probability at least
as desired.
For the “in particular” part, we first apply a simple union bound to the first steps of the process, to show that with probability at least
(6.14) |
our sprinkling process has uncovered a collection of square-components, each of which contains at least non-edges, and, whose union contains at least non-edges and includes all non-edges of coming from components of of size at least . There can be at most such components. By (6.3), the probability that a fixed pair of such components fails to join up in the next round of sprinkling is at most
Taking the union bound over the at most pairs of components, we have that the probability any pair of these components fail to join up is, for large , a lot less than the last term in equation (6.14):
Combining this with (6.14), we get the claimed bound on the probability of having discovered a giant square-component containing at least non-edges and all non-edges of contained in components of of size at least . ∎
6.5. Covering the whole world
All that now remains to complete the proof of Theorem 1.5 is to show that a.a.s. there is a square-component covering all vertices of . Note that while in Lemma 6.6 we showed that has a giant component, we have not quite shown it is unique: in principle, one could stitch together a rival giant component at the last stage of sprinkling by building numerous bridges between small components. This is of course highly unlikely, and one could show uniqueness of the giant by exploiting the fact that the number of non-edges of lying in square-components of order at least is a.a.s. concentrated around its expectation (as shown in Corollary 6.3). However we do not have a nice form for this expectation, so a little care would be needed to show it changes continuously with to make the argument above fully rigorous. As this paper is already sufficiently long and as the uniqueness of the giant is not our main concern here, we eschew this and focus instead on the problem of ensuring we have a giant component whose support covers all the vertices. We sidestep the issue of the uniqueness of the giant by considering a partition of which allows us to both build a preferred giant and, crucially, to ensure this preferred giant has full support. We begin by establishing a useful corollary of the work in the previous subsections.
Corollary 6.7.
Let be fixed, and let be an edge-probability satisfying . Let . Then there for every sufficiently small, there exists a constant such that given fixed sets with , all of the following hold with probability :
-
(i)
there are at least non-edges in contained in square-components of of order at least ;
-
(ii)
there is a unique square-component in containing all non-edges in contained in square-components of of order at least ;
-
(iii)
there is a unique square-component in containing all non-edges contained in square-components of or of order at least .∎
Proof.
We now apply this Corollary to prove Theorem 1.5.
Proof of Theorem 1.5.
First note that if is any function with and , and , then has the property, by [5, Theorem 5.1]. Now assume that and . Pick sufficiently small, and let be the constant whose existence is guaranteed by Corollary 6.7. Partition into sets
For each pair of distinct elements of , we apply Corollary 6.7 to the sets and ; taking a union bound over all such pairs , we see that with probability , for every pair of distinct elements the following hold:
-
(1)
at least non-edges in are contained in components of of order at least ;
-
(2)
there is a unique component of containing all non-edges of contained in components of of order at least ;
-
(3)
there is a unique component of containing as well as all non-edges of contained in components of of order at least .
We claim that we have . Indeed for , note that and . Since both and contain all of the at least non-edges contained in components of of order at least , it follows that . Since their intersection is non-empty, and are the same component of , as claimed. We may thus let denote the a.a.s. unique square-component with for all .
We now show that a.a.s. the support of this component is the whole vertex set . Pick and condition on the event that there is a square-component in of order at least (an event which occurs with probability , as we saw in (2) above). If two or more such components exist, pick a largest one. Further, condition on each vertex having at least non-neighbors in . By a standard application of Chernoff bounds and a union-bound, this event occurs with probability .
Having thus conditioned on the state of pairs in and , we now show that a.a.s. for every vertex , there exist and such that induces a copy of — so that that belongs to the component of containing . Combining this with (3) above (which implie that a.a.s. ) and a simple union bound will then yield Theorem 1.5.
Given non-edges and , let be the indicator function of the event that all of , , and are edges of . Observe that this event is independent of our conditioning. For , set
to be the number of induced ’s of with and . We shall again apply the Extended Janson Inequality ((Proposition 6.5) to bound . Given our conditioning, the expectation of is
We now compute the corresponding -parameter in Janson’s inequality. For and , write if . Note that the random variables and are independent unless , and further that implies .
Pick and fix and , which can be done in at most ways. We perform a case analysis to determine the contributions to the -parameter from terms of the form with :
-
•
there are at most choices of with such that , and for such we have ;
-
•
there are at most choices of with such that , and for such we have (with the contribution from the symmetric cases analysed similarly);
-
•
finally, there are at most choices of with , , and such that , and for such we have (with the contribution from the symmetric cases analysed similarly).
Putting it all together, we see
Since , and , the above implies
which for large enough is at most . Applying Janson’s inequality,
Thus with probability , and the vertex is covered by the square-component in that contains . Taking a union bound over , with probability , every vertex is covered by this square-component (which as we showed is the a.a.s uniquely determined giant component ). Taking a union bound over and combining this with (1)–(3) above, we see that with probability the square-component covers all of . This concludes the proof of Theorem 1.5. ∎
7. Concluding remarks
There are several natural questions arising from our work. To begin with, one could ask for more information about the component structure in : in the subcritical regime, can one get a better upper bound on the order of square-components? In the supercritical regime, can one give good bounds on the order of the second-largest square-component? In particular, can one give better bounds than just , and can one show its support has size ? This may be feasible albeit technically challenging.
Another question on the probabilistic side is determining the range of for which the square graph of is a.a.s. connected, a very different question from the ones considered in this paper. Investigating other parameters such as the diameter of may also lead in interesting directions.
Further afield, one could consider percolation problems for similar structures. We could for example consider the graph on all triples of independent vertices in obtained by setting two such triples to be adjacent if they induce a copy of the -cycle or of the complete balanced bipartite graph in . We would guess the techniques in this paper would adapt well to the latter problem, as Bollobás and Riordan also suggest, but that new ideas would be needed for the former.
One could also consider a similar problem for a -random -uniform hypergraph , by setting a set of vertex-disjoint non-edges to be adjacent if all of the edges meeting each in exactly one vertex are present in (in other words, if and only if the union of the s induces a copy of the complete balanced -partite -uniform hypergraph on vertices). Note the case corresponds exactly to square percolation. Levcovitz [16] has provided a quasi-isometry invariant for right-angled Coxeter groups by associating a hypergraph to any such group, so analysis of suitable variants of square percolation for hypergraphs may yield interesting applications in geometric group theory (besides constituting a challenging and rather natural problem in combinatorial probability).
Finally it would be interesting to study other properties of the right-angled Coxeter group when using tools from random graph theory. In particular, determining the threshold for algebraic thickness of every order, or the exact rate of divergence of for all would be of great interest (see [6, Question 1]). Doing so will require new group theoretic ideas to translate these properties into graph theoretic language, and the identification of suitably tractable graph theoretic proxies for these in . Work of Levcovitz [16] provides promising progress towards finding combinatorial properties to encode higher rates of polynomial divergence in right-angled Coxeter groups; indeed, as we finalized this paper, Levcovitz released a new preprint [17] that provides such a translation, which we expect will be of use in future work on this problem. As Levcovitz’s work involves hypergraphs, developing new techniques for generalizations of square percolation to hypergraphs will likely be key to further progress.
Finally, one could study thickness and relative hyperbolicity in random right-angled Coxeter groups with presentation graphs drawn from other distributions than the Erdős–Rényi random graph model, such as random regular graphs. We do not know of any work which has been done in this direction at the present time.
References
- [1] Alon, N., and Spencer, J. H. The probabilistic method, fourth ed. Wiley-Interscience Series in Discrete Mathematics and Optimization. John Wiley & Sons, Inc., Hoboken, NJ, 2015.
- [2] Balister, P. Branching processes. Lecture Note Series, IMS, NUS (2006).
- [3] Behrstock, J., and Druţu, C. Divergence, thick groups, and short conjugators. Illinois J. Math. 58, 4 (2014), 939–980.
- [4] Behrstock, J., Druţu, C., and Mosher, L. Thick metric spaces, relative hyperbolicity, and quasi-isometric rigidity. Mathematische Annalen 344, 3 (2009), 543–595.
- [5] Behrstock, J., Falgas-Ravry, V., Hagen, M. F., and Susse, T. Global structural properties of random graphs. Int. Math. Res. Not. 2018, 5 (2018), 1411–1441.
- [6] Behrstock, J., Hagen, M., and Sisto, A. Thickness, relative hyperbolicity, and randomness in Coxeter groups. Algebr. Geom. Topol. 17, 2 (2017), 705–740.
- [7] Bollobás, B., and Riordan, O. Clique percolation. Random Structures Algorithms 35, 3 (2009), 294–322.
- [8] Dani, P., and Thomas, A. Divergence in right-angled Coxeter groups. Trans. Amer. Math. Soc. 367, 5 (2015), 3549–3577.
- [9] Derényi, I., Palla, G., and Vicsek, T. Clique percolation in random networks. Phys. Rev. Lett. 94 (Apr 2005), 160202.
- [10] Druţu, C., Mozes, S., and Sapir, M. Divergence in lattices in semisimple Lie groups and graphs of groups. Trans. Amer. Math. Soc. 362, 5 (2010), 2451–2505.
- [11] Dwass, M. The total progeny in a branching process and a related random walk. Journal of Applied Probability 6, 3 (1969), 682–686.
- [12] Gersten, S. M. Divergence in -manifold groups. Geom. Funct. Anal. 4, 6 (1994), 633–647.
- [13] Harris, T. E. A lower bound for the critical probability in a certain percolation process. In Mathematical Proceedings of the Cambridge Philosophical Society (1960), vol. 56, Cambridge University Press, pp. 13–20.
- [14] Janson, S., Łuczak, T., and Ruciński, A. Random graphs, vol. 45. John Wiley & Sons, 2011.
- [15] Levcovitz, I. Divergence of cube complexes and Coxeter groups. Algebr. Geom. Topol. 18, 3 (2018), 1633–1673.
- [16] Levcovitz, I. A quasi-isometry invariant and thickness bounds for right-angled Coxeter groups. Groups Geom. Dyn. 13, 1 (2019), 349–378.
- [17] Levcovitz, I. Characterizing divergence and thickness in right-angled Coxeter groups. arXiv:2007.13796, 2020.
- [18] Li, M., Deng, Y., and Wang, B.-H. Clique percolation in random graphs. Phys. Rev. E (3) 92, 4 (2015), 042116–042122.
- [19] Mühlherr, B. Automorphisms of graph-universal coxeter groups. Journal of Algebra 200, 2 (1998), 629–649.
- [20] Palla, G., Derényi, I., Farkas, I., and Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435 (June 2005), 814–818.
- [21] Tóth, B., Vicsek, T., and Palla, G. Overlapping modularity at the critical point of -clique percolation. J. Stat. Phys. 151, 3-4 (2013), 689–706.
- [22] Wang, B., Cao, L., Suzuki, H., and Aihara, K. Impacts of clustering on interacting epidemics. J. Theoret. Biol. 304 (2012), 121–130.