The upper tail problem for induced 4-cycles in sparse random graphs
Abstract.
Building on the techniques from the breakthrough paper of Harel, Mousset and Samotij, which solved the upper tail problem for cliques, we compute the asymptotics of the upper tail for the number of induced copies of the 4-cycle in the binomial random graph . We observe a new phenomenon in the theory of large deviations of subgraph counts. This phenomenon is that, in a certain (large) range of , the upper tail of the induced 4-cycle does not admit a naive mean-field approximation.
1. Introduction
The study of the random variable counting the number of copies of a given graph in the binomial random graph has a very long history and many things are known about it. In particular, for every graph , when the expected number of copies of in tends to infinity a ‘law of large numbers’, see Bollobás [6], and a central limit theorem are known, see Ruciński [28].
After establishing such results, it is natural to ask what is the probability of the event that this random variable differs from its expectation by a significant amount. In this paper, we will consider only the ‘upper tail’ problem: what is the probability of a random variable exceeding its expectation by a multiplicative factor , where is some positive real.
We continue by recalling some definitions to make our discussion more rigorous. For every non-negative integer and a sequence , we let be the binomial random graph with vertices and density . Furthermore, for any graph , let be the random variable counting the number of (labeled) copies of in . Further, for any a positive real and a random variable we write for the event .
The work on the problem of estimating (the logarithm of) the upper tail probability of was initiated by the famous works of Kim and Vu [24], Vu [29], Janson and Ruciński [23]. This problem turns out to be so difficult111This problem was called ‘the infamous upper tail’ by Janson and Ruciński [23]. because when is a connected graph with at least two edges is not a linear function of independent Bernoulli random variables. In the case of linear functions, life is easier and much is known about its large deviation properties, see [15].
In a famous paper [22], Janson, Oleszkiewicz, and Ruciński estimated the logarithm of the upper tail probability for every graph up to a multiplicative factor of . Seven years later, Chatterjee [8] and DeMarco–Kahn [14] closed this gap up to a multiplicative factor of when is a triangle and DeMarco–Kahn [13] generalized this for a clique of arbitrary size.
Next, one would like to obtain a first order approximation of the logarithm of the upper tail probability. The first to do that were Chatterjee and Varadhan [10]. They obtained a first order approximation of the upper tail probability for any graph, under the assumption that is a constant. Their proof relied on the regularity method and therefore extends only to tending to zero very slowly, for more discussion about this see [26].
The general strategy for estimating the logarithm of the upper tail probability, used by Chatterjee and Varadhan as well as all later works on this subject, is to establish a ‘large deviation principle’. That is to prove that the logarithm of the upper tail probability is asymptotically equal to the solution to a minimization problem over a non-trivial set of product probability measures. These minimization problems are called ‘variational problems’. After achieving such large deviation principle one is then left with solving the variational problem.
In a breakthrough paper of Chatterjee–Dembo [9], the authors established a ‘large deviation principle’ not only for when , but also for a large class of functions of independent Bernoulli random variables with mean . They proved that for any ‘smooth enough’ function and a sequence of independent Bernoulli random variables , all with mean , we have the following:
(1) |
where is the relative entropy (the Kullback–Leibler divergence) between and , and is a sequence of independent Bernoulli random variables with for each .
This was further developed by Eldan [16] and Augeri [2, 3]. These methods are completely different from the ones used in the dense regime.
Revisiting the ideas from Chatergee–Varhadan [10], Cook–Dembo [11] developed a decomposition theorem similar to Szemerédi’s regularity lemma and a corresponding counting lemma which are suitable for sparse graphs. Using these they extended the range of sparsity of were the variational approximation (1) holds (for every graph ). Generalizing this method, Cook–Dembo–Pham [12] pushed the bounds even further in the case of subgraph count. They also obtained an approximation of the logarithm of the upper tail probability for the count of uniform sub-hypergraphs in the binomial random uniform hypergraph model, and the induced count of graphs in . Their result combined with the solution to the corresponding variational problem for uniform hypergraph cliques, and one more non-trivial 3-uniform hypergraph which were given by Liu–Zhao in [25], yields estimations for the sub-hypergraph count in the binomial random hypergraph model in the sparse regime. We also note that to the best of our knowledge the solution to the ‘induced variational problem’ is not known apart from the case where is a clique. The case of the induced subgraph count will be of main interest in this paper, and will be discussed later.
To estimate the asymptotics of the logarithm of the upper tail probability, one also needs to solve the variational problem in the right-hand side of (1). In the case of the random variable , Lubetzky and Zhao [27], and Bhattacharya, Ganguly, Lubetzky, and Zhao [5] solved this variational problem for all and satisfying . This then leads to a first order approximation of the logarithm of the upper tail probability for for every graph and . We wish to emphasize that, in all known cases, when counts the number of copies of a given in , the solution to the corresponding variational problem (the right hand side of (1)) always satisfies when .
A recent surprising result due to Gunby [20] studied the upper tail probability for subgraph counts in a random regular graph , where he also proved a large deviation principle. However, for a particular graph, and certain range of (the regularity) the answer for this variational problem is achieved with three possible values of and not two as in the binomial random graph model.
In a breakthrough paper of Harel, Mousset and Samotij [21], using a combinatorial approach, the authors managed to extend the range where the variational problem (the same one as before) bounds from above the logarithm of the upper tail probability for the count of cliques to the optimal range of , which is for a clique on vertices222That is because below this threshold they also showed a Poisson approximation.. We wish to emphasize that their approach for solving the upper tail problem is completely different from any previous techniques in the papers mentioned above. This, together with the solution to the variational problem, settled the problem of the first order approximation of the logarithm of the upper tail probability for cliques. Moreover, they established the correct range of , up to a polylogarithmic factor, where the variational problem bounds from above the logarithm of the upper tail probability for non-bipartite regular graphs333For bipartite regular graphs their result was not optimal, but also a lot of progress was made..
Building on the combinatorial ideas of Harel, Mousset and Samotij, Basak and Basu [4] extended the range of to the optimal range for all regular graphs, including bipartite graphs. Again, this with the solution to the variational problem settled the problem of estimating the logarithm of the upper tail probability for regular graphs. We summarize this discussion with the following first order approximation of the upper tail problem for regular graphs in the ‘localized regime’.
Theorem 1.1 (Due to [21, 4]).
For any and a -regular connected graph the following holds:
where is the unique solution to the equation where is the independence polynomial of 444The independent polynomial of is where is the number of independent sets in of size ..
In this paper we estimate the logarithm of the upper tail probability for the count of induced copies of in in the range , which is optimal up to in the lower bound. To the best of our knowledge, this is the first exact result for the upper tail probability of a random variable counting the number of induced copies of a given non-complete graph in . Our proofs rely heavily on the combinatorial approach of Harel, Mousset and Samotij.
We now present the main result of this paper. For this, from now on let be the random variable counting the number of induced copies of in .
thmthebesttheorem There is an explicit sequence such that the following holds for :
where .
The sequence above is explicitly defined in Section 6.
There are several interesting phenomena which distinguish Theorem 1.1 from previous results concerning the upper tail problem for subgraph counts.
The first, and most noticeable difference is the infinite number of phase transitions. To the best of our knowledge, there is no earlier example of an upper tail problem exhibiting infinitely many phase transitions in its first order approximation. To understand the reasons for this phenomenon let us first discuss further Theorem 1.1. A strategy to bound from below the upper tail probability is to observe that, conditioned on the event for some satisfying , the upper tail event holds with ‘decent’ probability. This allows us to bound the lower tail probability from below by the probability of (which is ) times the ‘decent’ probability above (which is ). We refer to this as ‘planting a copy of ’. Of course one would like to consider the smallest such , to increase the lower bound as much as possible.
In the previous works mentioned earlier, it was shown that for -regular graphs there are two natural candidates for . These candidates are: a clique on vertices or a spanning complete bipartite graph with the smaller side of size . The second construction is often called a Hub. When , the small side of the Hub construction needs to be smaller than one and hence at that point the Hub construction is no longer valid, and we are left only with the clique construction as a lower bound for the upper tail probability. This is the reason for the two different regimes in Theorem 1.1.
Even though is a regular graph, the presence of a clique in does not boost the expected number of induced 4-cycles by a significant amount. Therefore, the first construction is no longer valid. The analog of this construction in our case is a balanced and complete bipartite graph with vertices. Surprisingly, although amongst all graphs with a given number of edges the complete and balanced bipartite graph maximizes the number of induced copies of (see [7, 19] and Lemma 6.2) planting it does not always give the strongest lower bound on the upper tail probability.
This is because, in a range of densities , there is a large family of subgraphs of , where each satisfies and has only slightly suboptimal size (amongst all with this property), that is so ‘large’ that
where the minimum ranges over all graphs with ; moreover, conditioned on the event that for some , the upper tail event still holds with ‘decent’ probability. Note that this is very different from the case of non-induced regular graphs, as here we condition on the appearance of some graph from rather than a single graph (which corresponds to tilting the measure of to for some ). In fact, we will show (in Section 7) that the lower bound obtained with the use of is stronger than any bound corresponding to the more general tilting to for some .
Let us elaborate on the structure of the graphs in . The family depends on in the following way: provided for some integer , the set comprises all complete bipartite graphs with sides and . Note that, for any fixed and large , the graph has more edges than the complete and balanced bipartite graph with the same number of induced copies of . However, the key point here is that we are no longer planting a single copy of this graph. Since is large, there are many embeddings of in and the probability that contains some such copy is affected by the combinatorial factor of the number of possible embeddings. It turns out that the lower bound obtained by considering these families is actually tight in the range where . Further, between and the family changes infinitely many times depending on , which explains the infinite number of phase transitions. This will be explained in more details in Section 7.
Organisation of the paper
In Section 2, we prove a general large deviation principle for nonnegative functions of independents Bernoulli variables which is a version of Theorem 9.1 in [21], which was stated in [21] but was not proven. In Section 3, we continue by connecting the general large deviation principle to our large deviation problem. To this end we define special classes of graphs called core graphs. Then we modify the general result proved in Section 2 using this notion of core graphs.
Section 4 is independent of the others and gives lower bounds for the upper tail probability by planting graphs in the denser regimes and families of graphs in the sparser regimes.
In Section 5, we solve the variational problem presented in Section 3. Furthermore, we make a connection between the number of core graphs with edges and the maximum number of vertices a core graph with edges might have.
Section 6 uses the results from Section 5 to provide upper bounds for the logarithm of the upper tail probability. This section is divided into three parts. First is the dense regime , second is the sparse regime which we also split into two cases: the dense case in the sparse regime and the sparser cases in the sparse regime . Before the second split, we develop some tools used in both cases.
In Section 7 we solve the naive mean field variational problem, showing that it is different the solution to the variational problem we used.
Acknowledgments
The author thanks his advisor Wojciech Samotij for his guidance through out the write-up of this paper, as well as helpful and fruitful discussions. The author also thanks Arnon Chor and Dor Elboim for helpful discussions. Lastly, the author thanks Eden Kuperwasser for fruitful discussions and for creating the figure in this paper.
2. Main tools - polynomials in the hypercube
We start with some notations and definitions which can also be found in [21]. We work within the probability space . Suppose and . We say that
is a subcube centered at with codimension denoted by . Note that every subcube is centered at some pair . Moreover, if a subcube is centered at and it is also centered at then . Hence, the codimension is well defined. For every subcube centered at we define the one-supcube and the zero-supcube of to be and where for . Moreover, for every subcube we say that the one-codimension of it is the codimension of and the zero-codimension of it is the codimension of denoted by for and respectively. A simple observation is that any subcube always satisfies and . For every we define the complexity of denoted by to be the smallest integer for which it is possible to represent as a linear combination with nonnegative coefficients of indicator functions of subcubes with codimension at most . Note that the complexity is well defined as for every such function we can write
Moreover, this shows that for any we have . Assume now that is a random variable taking values in and that . Given a subcube , we write for the expectation of conditioned on . We further define by
Our main tool is [21, Theorem 9.1], which was stated but not proved. We prove a slightly different version of the theorem, but the essence remains the same.
Theorem 2.1.
For every positive integer and all positive real numbers and with , there is a positive constant such that the following holds. Let be a sequence of independent random variables for some and assume that is nonnegative with complexity at most and satisfies . Denote by the collection of all subcubes satisfying
-
()
,
-
()
.
Then,
To prove the theorem we state and prove the following lemmas.
Lemma 2.2.
Let be a random variable taking values in and let be a real-valued function of . Suppose that and that always. Then for all positive and ,
Proof.
Let . If then the claim is vacuously true. Otherwise, there exists a subcube such that and . By we have . Thus, and hence,
Now taking the negative logarithm gives the assertion of the lemma. ∎
Fact 2.3.
Suppose are subcubes of . If is nonempty, then it is also a subcube of and, moreover, .
Proof.
We prove the statement for ; the case follows by a simple inductive argument. Let and be such that is a subcube centered at for . Define in the following way: for all and put and and for all put . This is indeed a well defined element in as for all we have, , otherwise it will contradict our assumption that . It follows from the definition of that and hence is a subcube. Furthermore, we have
Lemma 2.4.
Let be a random variable taking values in and let be a nonnegative real-valued function with complexity bounded by . Then for every positive integer and all positive real numbers and with ,
where
Proof.
Given a subcube, let be the indicator random variable of the event that for all with . Note that implies and let . Since and , Markov’s inequality gives
(2) |
To simplify the notation, for every we write for . Write , where the sum ranges over all subcubes of , each coefficient is nonnegative, and for all with . Then for every ,
where we may let the third sum range only over sequences for which the event has a positive probability of occurring.
Claim 2.5.
For any such sequence,
Proof.
To see this, note that if has positive probability of occurring, then there exists a such that and . That means and for any subcube such that . Hence, . For the second part assume towards a contradiction that , meaning that there is such that . Therefore, there exists a subcube such that, . This is a contradiction as now, but by our assumption, as . ∎
Now we use the previous lemmas to prove the theorem.
Proof of Theorem 2.1.
Let be a large constant that may depend on and . Furthermore, let and . Define
It follows from Lemma 2.4 that
As , we can write
for some so that, for all , the coefficient is nonnegative and is a subcube with . Put and note that always. Applying Lemma 2.2 gives
Note also that as and therefore for every . Therefore,
Thus, provided is sufficiently large we also have
as we assumed that . Putting all of this together gives that for sufficiently large , we have
The assertion of the theorem now follows:
where the last inequality holds for . ∎
When and , the majority of the contribution to both and comes from the one-supcube . We prove a straightforward corollary of Theorem 2.1 which will be more convenient to work with. We start by proving the following lemma.
Lemma 2.6.
The following holds for every positive integer and all positive real numbers and with . Let be a sequence of independent random variables for some and assume that has complexity at most . Let be a subcube satisfying . Then, where is the one-supcube of .
Proof.
As has complexity we can write
for some so that, for all , the coefficient is nonnegative and is a subcube with . Moreover, by our assumptions on , we have . Thus, we have the following inequality,
It follows that
(3) |
where the first inequality in (3) follows from the assumption on and the second inequality in (3) follows from the assumption on . ∎
Using this lemma we now state and prove a straightforward corollary of Theorem 2.1.
Corollary 2.7.
For every positive integer and all positive real numbers and with , there is a positive constant such that the following holds. Let be a sequence of independent random variables for some and assume that has complexity at most and satisfies . Denote by the collection of all subcubes satisfying
-
()
,
-
()
,
-
()
.
Then
Proof.
Applying Theorem 2.1 with replaced by we obtain
(4) |
where is the collection of all subcubes satisfying () and () where is replaced with . Letting , we obtain () and () for every subcube due to () and the definition of one-supcubes. Noting that for every subcube we have we obtain also that
(5) |
Combining (4) and (5) we obtain that
We call a subcube a seed if is an element of defined in Corollary 2.7. A formal definition is given in the following section. A general way one would use Corollary 2.7 to bound upper tails of counts of induced subgraphs is to define a special type of seeds called structured seeds. These structured seeds are subcubes in from Corollary 2.7, with a stronger condition than (). This condition is a combinatorial condition involving various supcubes of the subcubes in (this will be explained further in Section 3). Then one would define a core subcube to be a subcube containing a structured seed such that ‘every coordinate counts’. In the following section we define these special subcubes. We will also show that Corollary 2.7 can be modified to bound the upper tail probability via the probability of being an element in a core subcube.
3. From seeds to modified cores
Note that for any subcube , the one-supcube of corresponds to all subgraphs of containing a specific subgraph of with edges. Therefore, from this point onwards we think of one-supcubes as a family of subgraphs of containing a specific subgraph. In particular, instead of writing for some one-supcube (of some subcube) we will write for the graph corresponding to . The subgraphs corresponding to the members of from Corollary 2.7 will be called seeds. We start with a definition which will be useful in this section as well as the next ones.
Definition.
Suppose and are graphs and let be an edge of . We define:
-
•
is the number of induced copies of in .
-
•
is the number of induced copies of in that contain .
Our main concern in this paper is the random variable counting the number of induced copies of in . Therefore, we let . We will use this notation from this point onward. We now continue by defining seed graphs.
Definition.
Let be positive reals. In addition let . Then we define to be the collection of all spanning subgraphs satisfying:
-
()
and
-
()
.
We call the graphs in seeds. Furthermore, for every positive integer we define to be the set of all seeds with edges.
Since is determined by the number of induced copies of various subgraphs of in , it will be convenient to restate () in the above definition with a graph theoretic condition, giving rise to the notion of structured seeds. In the definition we use the graph obtained from by adding an isolated vertex; we denote this graph by . Now we define the structured seeds.
Definition.
Let be positive reals. In addition let . Then we define to be the collection of all spanning subgraphs satisfying:
-
()
and
-
()
.
We call the graphs in structured seeds. Furthermore, for every positive integer we define to be the set of all structured seeds with edges.
The following is a lemma relating the seeds and the structured seeds, which we prove later.
Lemma 3.1.
Let be positive reals and suppose also that and . Then, there exists a positive constant such that for any and large enough we have,
Now we are ready to define what a core graph is. This is given in the following definition.
Definition.
Let be positive reals. In addition let . Then we define to be collection of all structured seeds satisfying the following:
For all
We call the graphs in core graphs. Furthermore, for every positive integer we define to be the set of all cores with edges.
In cases where the parameters and can be understood from the context we omit them. The main aim of this section is to prove the following theorem which we derive using Corollary 2.7 and several lemmas.
Theorem 3.2.
For all positive real numbers with and , there is a positive constant such that following holds:
(6) |
In particular,
We start by proving Lemma 3.1 relating seeds and structured seeds.
Proof of Lemma 3.1.
Suppose . By () we have
where stands for spanning subgraphs. Observe that
where is the graph on four vertices and one edge, is a matching of size two, and is the path with 3 edges; this indeed holds as two edges of span at most one induced or . Therefore, we obtain
(7) |
In Section 5 (Claim 5.1) we prove that . Therefore, as , we have . Further, as we get from (3) the following inequality for large enough ,
Therefore, we obtain for large enough . This is the assertion of the lemma. ∎
Informally, the next claim is that every structured seed contains a core. Formally this is given in the following claim.
Lemma 3.3.
Suppose are positive reals. Then for every structured seed and every nonnegative real , there exists a subgraph such that:
-
()
,
-
()
, and
-
()
for every edge .
Proof.
In this proof, it would be convenient to let
This is because then, () is equivalent to and () is equivalent to for every an edge in .
Define the sequences and by repeatedly setting to be a subgraph of obtained by the deletion of an edge such that
as long as such edge exists. The graph satisfies () as it is a subgraph of a structured seed. We also claim that, the subgraph satisfies (). That is because, if there is an edge in with the process would have continued by deleting this edge a contradiction. Recalling that is a structured seed we find that, and thus,
Finally, since , we have
Rearranging this inequality and recalling that is a structured seed we obtain the assertion of the lemma,
Applying Lemma 3.3 invoked with replaced by and yields the following corollary.
Corollary 3.4.
Suppose are positive reals. Suppose further that is a structured seed. Then, there exists a core such that .
4. Lower bounds
The aim of this section is to give lower bounds for for every positive . We do so by presenting a family of graphs such that ‘planting’ them in increases the expectation of by a multiplicative factor of . More formally, when we say ‘planting’ we mean changing the probability measure on the hypercube from the usual product measure of to the probability measure of conditioned on the existence of a subgraph from a predetermined family of graphs. The graphs in the families that we will consider will satisfy . We choose these families as such because, on the event that for such that the probability of is pretty ‘large’.
Since for every labeled graph the probability of containing is it makes sense to consider such graphs with the smallest . In contrast, in our case there is a wide range of where the strongest lower bound is obtained by planting some member of a large family of graphs; each individual graph in the family is sub-optimal in terms of the number of edges, but the size of the family compensates for the difference in the number of edges between the optimal graph and the graphs in our family. This family is, roughly speaking all embeddings of an unbalanced complete bipartite graph into .
More precisely, denoting the complete bipartite graph with sides of size and by , we will define integers and such that for and and for , , and .
Note that provided that the constructions and contain induced copies of , up to lower order terms. In addition, we will show later that admits .
Denote by the set of all copies of in when . Denote by the set of all in and and by denote the set of all copies of in . Planting one of or yields a lower bound for the probability of the upper tail event which is valid for all values of . As was mentioned in the introduction the significant different between this work and previous ones is the need to plant a large family of sub-optimal graphs and not a single optimal graph. This is true only for the families where . In the case of we could as well plant a single graph from these sets as and are negligible.
One can compare these bounds, and see that the best one depends on in the following way. There exists an increasing sequence with and so that, provided for some integer we obtain the strongest lower bound on the upper tail probability by planting . The best family to plant when is which should be thought of as the ‘limit’ of when goes to infinity. Lastly, is the best family to plant when .
The main result of this section is a formalisation of the above discussion. In order to make things rigorous from now on we fix to be positive reals and let
when is some non negative integer. Moreover, note the following: For all we have . Therefore, for clarity of the presentation we assume that is an integer divisible by for any . Furthermore, if we have , hence for clarity of the presentation we assume is an integer where will be specified later. The lower bounds given by the following theorem should be thought of as the probability of the appearance of in for some fixed integer (provided ) and the appearance of in . Where we define:
where and and . Now we are ready to state the main result of this section. We wish to emphasize that both and depend on and .
Theorem 4.1.
Let be positive real numbers and let be a positive integer. Then the following holds
for large enough .
In the following lemma we claim that the expected number of induced copies of conditioned on one labeled copy of or being a subgraph of in a suitable range of is at least provided is large enough. To this end let us introduce the following notations. From now on we assume that the vertex set of is . For every positive integer and with define the following events:
-
(I)
is the event that and there are no edges between any ,
-
(II)
.
Now we are ready to state the lemma.
Lemma 4.2.
Suppose are positive reals, is some positive integer and with . Then the following holds:
-
(1)
Suppose . Then, for large enough we have
-
(2)
Suppose . Then, for large enough we have
Proof.
We start with the first item in the lemma. Assume . First, note that
Let us compute these two quantities.
Since contains exactly induced copies of we obtain that contains
induced copies of as . Further, as and for all nonnegative integers we have , we deduce
Combining the above we obtain,
For the second item assume . Similar to the first case we have
We now compute these quantities.
Since contains exactly induced copies of , we obtain that contains induced copies of . Moreover, thinking of as a spanning subgraph of by adding isolated vertices, we see that contains at least induced copies of as we can choose one vertex from one side, two form the other side and another isolated vertex. Furthermore, as for all nonnegative integers we have and , we deduce
taking large enough we have, . Recalling that and combining all the above bounds we obtain,
By this inequality and the definition of one can check that for large enough ,
This completes the proof. ∎
Now we are ready to prove Theorem 4.1. Before proving the theorem we make the following remark. Suppose is a real number, and . Then provided is sufficiently large. Therefore, Theorem 4.1 invoked with implies:
for large enough . Thus, by setting , and letting be any increasing sequence satisfying and , we have the following corollary of Theorem 4.1, which will be shown to be tight in the next sections for some specific sequence .
Corollary 4.3.
Let and be positive real numbers and let be positive integer. Then the following holds
for large enough .
Proof of Theorem 4.1.
Through out the proof we assume that the vertex set of is .
We start with the first item. To this end fix an integer and some positive real and assume that . Let with and recall the definitions of the events and :
-
(I)
is the event that and there are no edges between any ,
-
(II)
.
Note that for any such that with we have . Therefore, we have the following,
(8) |
Lemma 4.2 asserts that provided we have the following for all with :
Note that always, we can bound from above (similar to the proof of Lemma 2.2) as follows:
Combining the two inequalities we obtain,
Therefore, for large enough we also have,
(9) |
Substituting (9) into (4) gives,
(10) |
We now show that and .
First, as and , we have the following for large :
(11) |
where the first inequality follows as for all small enough . Second, for all nonnegative integers we have , and thus we obtain that
In addition, as and , we have
Furthermore, as and we have and hence, for sufficiently large we have,
(12) |
Combining (4), (4), and (12) gives,
This finishes the first part of the proof.
For the second item, define the event to be the event that contains as a subgraph on the vertex set with sides and . Note that always. Further, Lemma 4.2 asserts that provided we have
and thus, . Therefore, applying Lemma 2.2 gives,
Moreover, for sufficiently large we have, . Thus, taking negative logarithms we obtain the following for large enough :
This is as claimed. ∎
5. Counting the number of cores
Recall that is the random variable counting the number of induced copies of in . In this section we prove a general upper bound on the logarithmic probability of the upper tail event of . The main tool we use in this section is Theorem 2.7 which is a variation of . There are two major parts in this section. The first is an evaluation of in different regimes of . The second is an evaluation of the entropic term . The main results of this section are the following lemmas.
Lemma 5.1.
Suppose are positive real numbers with being small as a function of . Then the following hold:
-
(i)
If then for large enough we have,
-
(ii)
If then for large enough we have,
Before we present the second lemma, let us remind the reader the definition of .
Definition.
Let be positive reals. In addition let . Then we define to be collection of all spanning subgraphs satisfying the following:
-
()
,
-
()
, and
-
()
For all
We call the graphs in core graphs. Furthermore, for every positive integer we define be the set of all cores with edges.
Lemma 5.2.
Suppose are positive reals with . Furthermore, suppose as tends to infinity. Then, there exist and such that the following holds for all :
Let be a positive integer with . Furthermore, let
Then,
We start with the first lemma. Let us give some motivation and history of the problem.
Definition.
For every graph and any positive integer define,
where the maximum ranges over all graphs with edges.
This definition was first presented by Erdős and Hanani in [17]. In their paper they computed the asymptotic of this function where is a clique. In [1] Alon generalized this and computed the asymptotic of this function for all . Later, in [18] Friedgut and Kahn reproved Alon’s result using entropy methods which they were also able to generalize for the case of hypergraphs.
In [22] Janson, Oleszkiewicz, and Ruciński found a relation between and a related parameter and the probability that the random variable counting the number of copies of a fixed graph in exceeds its expectation by a multiplicative factor. This result used and generalized the machinery developed by Friedgut and Kahn. This led us to generalize the definition of to fit our setting. Let us recall some definitions from Section 3 and introduce some new ones.
Definition.
Suppose and are graphs. Let be positive integers and let be an edge of . We define:
-
•
is the number of induced copies of in .
-
•
is the number of induced copies of in that contain .
-
•
is the maximum of over all graphs such that the number of vertices of is at most and the number of edges of is .
Note that similar generalizations have been studied in [7] and [19]. The following is a simple corollary of Lemma 6.2 which we prove later in Section 6.
Corollary 5.3.
For every positive integers such that we have,
In [7] the authors determined the asymptotic behaviour of for any (and actually obtained optimal bounds), and gave tight bounds in some ranges of (in the above we consider only ). In [19] the authors considered the problem of determining the asymptotic in and of the maximum number of copies of a fixed bipartite graph in a bipartite host graph with vertices and edges. They solved this problem for some class of graphs which includes all complete bipartite graphs . Here we prove a similar statement to the ones in [7, 19].
We wish to emphasize that there is a significant difference in the behaviour of the problem depending on whether or . This can be seen for example in the first result of this section which we now start to prove. To this end we start with an observation. Recall that we denote by the disjoint union of a star with two leaves and an isolated vertex.
Observation 5.4.
Suppose are positive integers with . Then,
Proof.
Let be a graph achieving the maximum in the definition of . Let and denote . Then,
Moreover we have,
Proof of Lemma 5.1.
For the upper bound in (i), assume that . Since we may treat as an integer and let . Note that has edges and provided is large enough contains at least induced copies of . Note further that,
As we also have the following for sufficiently large ,
Combining the above inequalities we find the following for large enough ,
Therefore, for large enough we have
Hence, by the definition of ,
For the upper bound in (ii), assume . Letting
and recalling that we may treat as an integer and consider the following graph. Let be a graph on the vertex set and such that and all vertices in are isolated. Note that contains edges. Let be a small positive real. Assuming is large enough, contains at least induced copies of . Furthermore, if is large enough, contains at least induced copies of . Let be a small positive real (that might depend on ). Then provided is large enough,
Where the first inequality holds as the first term is at most the expected number of induced copies with at least one vertex in , and the second term is the expected number of induced copies of with no vertices in .
By the definition of and the choices of being small enough we obtain,
Further, for large enough we have,
Combining all of these inequalities we have,
Hence, by the definition of we have,
Claim 5.5.
Suppose is a spanning subgraph of achieving the minimum in the definition of and let . Then for any small enough and large enough
Proof.
Let be a spanning subgraph of achieving the minimum in the definition of and put . By the definition of we have . Note that we can bound from above in the following way:
(13) |
where stands for spanning subgraphs. Let be the path with three edges, be a matching of size two and be a disjoint union of an edge and an independent set of size two. Since any matching of size two span at most one induced copy of or we have
Therefore we obtain
By Observation 5.4 we have
Furthermore we always have,
Combining the two we find that,
Hence, by the definition of and Corollary 5.3, for large enough we have,
(14) |
As and the upper bound achieved before, we have . Therefore, for any and sufficiently large we obtain from (5) the following inequality
We now prove the lower bound in (i). For this assume that and note that this implies that . Thus, for any and large enough we deduce the following from Claim 5.5,
Taking sufficiently small we obtain the following bound on ,
implying
This proves the lower bound in (i) as for sufficiently large we obtain,
This finishes the first evaluation we prove in this section. The second evaluation is the evaluation of the entropic term given by Lemma 5.2. Roughly speaking Lemma 5.2 shows that the number of core graphs with edges is determined by the maximum number of non-isolated vertices over all graphs in . As seen already, there is a big difference in the behaviour of the problem depending on whether or . Later, we will derive two corollaries from Lemma 5.2, corresponding to these regimes, these corollaries will be very important in Section 6.
Proof of Lemma 5.2.
Let be a core graph with edges and let be an edge in . Then by the definition of a core graph we have,
Therefore,
(15) |
or
(16) |
call this property . To bound it is enough to bound the number of subgraphs of with edges satisfying property . This can be bounded from above by multiplying the following quantities:
-
(1)
The number of possible ways to choose the set of non-isolated vertices of such graph.
-
(2)
The number of possible choices of the edges of the graph such that property is being satisfied.
The first item can be bounded from above by the number of ways to choose a set of at most vertices which is at most .
We now bound the second item. Let be a graph with vertices and edges that satisfies property . For every integer define . Furthermore, for every integer define . Using a standard double counting argument and the fact that we have
for all . By property edges can be placed between the sets in two ways.
The first option is when the edges satisfy (15). Since, the degree of each vertex is always bounded from above by , we obtain that the degrees of the endpoints of each edge satisfying (15) are bounded from below by defined as
where is some positive real that might depend on and . Thus, such edges can only connect a vertex in and a vertex in for integers and . We denote the number of such options by .
The second option is that the edge has an endpoint in where is defined as . That happens when the edge satisfies (16). We denote the number of such options by .
Furthermore, note that provided is large enough there are at most
partitions of the non-isolated vertices of our graph into sets where is an integer and . We will think of the vertices in as vertices which will have degrees between and where is an integer.
We conclude that given the vertex set and a partition as explained above, there are at most pairs of vertices that can be edges of . Hence, the number of graphs with property is bounded from above by
Let us now estimate and .
Recalling the assumption that we obtain,
Applying Lemma 5.1 we find that there is such that,
By our assumptions, . This implies there are such that for large enough we have,
For recall that there are at most vertices in and further recall that . Hence,
where . This implies that for large enough we have . Putting it all together there exists such that provided is large enough we have,
This proves the lemma. ∎
We now deduce two corollaries and discuss how one would use them in order to obtain an upper bound on the upper tail probability of . The first corollary will be used when and the second when .
Corollary 5.6.
Suppose are positive reals with . Furthermore, suppose as tends to infinity. Then, there exist and such that the following holds for all :
Let be a positive integer with . Furthermore, let
Then,
Proof.
By Lemma 5.2 and the assumption that we have the following provided is sufficiently large,
Since, we deduce that
We now split into two cases. First, assume . This assumption implies that
On the other hand, if we assume we obtain
where the equality is due to the assumption that . Since , in both cases we obtain that, provided is large enough, the following holds:
Corollary 5.6 suggests the following ‘plan of attack’ which we implement in Section 6. Assuming the term is negative. Therefore in order to estimate estimate the maximum number of vertices that a core with edges can have for each . This estimation and Corollary 5.6 then yield an upper bound on the number of cores with edges, denote this bound by . Then we compare for every , where is the minimum number of edges in a core graph and denote this maximum by . This implies,
We also show that is negligible compared to and therefore we obtain,
Then Theorem 3.2 will imply
Note that depends on . We will also show that this upper bound is matched to the lower bounds that we obtained in Section 4.
Next we present a corollary of Lemma 5.2 which will be used in the range where .
Corollary 5.7.
Suppose are positive reals with . Furthermore, suppose as tends to infinity. Then, there exists such that the following holds for all :
Let be a positive integer with . Then,
This is also an immediate corollary of Lemma 5.2. To see this note that in this range of we have and thus . Hence Lemma 5.2 implies
Similar to before will be shown to be negligible and therefore Corollary 5.7 will yield the following provided ,
where is the minimum number of edges in a core graph. Which then by Theorem 3.2 implies
In the next section we will compute this and show that the above matches the lower bounds given in Section 4.
6. Upper bounds
As can be seen in previous sections there is a big difference in the behaviour of the problem depending on the regime lies at. As explained in Section 5, in order to obtain quantitative bounds on the upper tail probability we need to bound the entropic term when , and when we only need to compute the minimum number of edges in a core graph. Actually, the situation is more involved in the sparse regime where we see a surprising change in the behaviour of the problem. To present the main result of this section let us define a sequence for :
We note that is an increasing sequence and . This is the sequence promised in the our main theorem 1.1 and in Section 4. Furthermore, recall the definitions of and which were given at the beginning of Section 4. For the convenience of the reader we give these definitions here as well. First, we define , where
Second, we define
where and . We are now ready to state the main theorem of this section which is the following.
Theorem 6.1.
Suppose is an integer greater than and suppose that are positive reals with small enough. Then there exists such that for any the following holds:
-
(1)
If , then
-
(2)
If , then
-
(3)
If , then
The proof naturally splits into three parts. We will prove each part in a different subsection. Before we split into cases let us prove a lemma which makes a connection between the number of induced copies of in a graph to the numbers of edges and vertices of that graph. This lemma will be important for us both in the sparse regime and the dense regime.
Lemma 6.2.
Suppose are positive integers such that and let be real. Let be a graph with vertices and edges and assume . Then,
Remark.
The bound from the lemma is sharp when for some as the graph has vertices and edges and contains induced copies of .
Proof of Lemma 6.2.
Since an -vertex graph with minimum degree at least has at least and at most edges we may assume that . Since,
We only need to prove that for all . Note that every containing is determined by the ‘parallel’ edge to meaning the edge between the two other vertices in that . This is because we are considering induced copies. In other words, the number of induced copies containing is bounded by the number of edges between and . Let be the set of vertices not in and note that . The number of edges between and is bounded from above by where is the number of edges with an endpoint in . Since we may estimate as follows,
Thus, the number of edges between and is bounded from above by
and the lemma follows. ∎
In order to prove Theorem 6.1 we now split into cases, each dealing with each item of the theorem.
6.1. The dense regime
Assume that and assume that is any positive real which is small enough as a function of . Corollary 5.7 asserts that in this regime of the number of core graphs is negligible. Therefore, as explained at the end of Section 5, to bound the upper tail probability using Theorem 3.2 we are only left with showing that is a lower bound on the minimum number of edges in a core. This is given in the following lemma.
Lemma 6.3.
Suppose are positive real numbers with small enough as a function of . Further suppose . Then, there exists such that for all
Before proving the claim let us recall Observation 5.4. For the convenience of the reader we state it here as well:
Observation 6.4.
Suppose are positive integers with . Then,
Proof of Lemma 6.3.
As explained after Corollary 5.7, this implies the following corollary which is the third item in Theorem 6.1.
Corollary 6.5.
Suppose are positive real numbers with small enough as a function of . Further suppose . Then, for large enough we have
6.2. The sparse regime
Throughout this subsection, assume . As mentioned earlier, there is another change in the behaviour of the problem when and . Before splitting into these two cases we show a reduction and develop some tools which we use in both cases.
We start with the following fact which plays a major role in the later proofs.
Lemma 6.6.
Suppose are reals and positive with . Furthermore, suppose . Then, there exist and such that the following holds for all :
Suppose where and . Then,
and furthermore, .
Proof.
Let be a core graph with edges and let be an edge of . By the definition of we have the following:
Note that . Lemma 5.1 asserts that there exists such that provided is large enough we have
Hence as we deduce the following for large enough :
and
(17) |
Note also that and thus the assertion follows for small enough ,
and if or then which is a contradiction to (17). ∎
The above lemma will be used many times throughout this subsection, mostly the fact that when the minimum degree of a core graph is at least . We will omit the reference to this lemma and keep in mind that the minimum degree of all graphs considered is at least .
The following is a corollary which bounds . This will be used afterwards to formalize the discussion after Corollary 5.6.
Lemma 6.7.
Suppose are positive reals. Suppose further that . Then, there exists an integer such that for all integers the following holds:
Suppose is an integer with . Then,
Proof.
Fix an arbitrary and define, . We have
Thus, .
By Theorem 5.1 we have and hence,
Thus, by Lemma 6.6 for sufficiently large no two vertices of can be connected by an edge. Therefore,
Since for all of the vertices of , we obtain .
Therefore for large enough ,
Provided is large enough, . Hence, for large enough the assertion of the lemma holds. ∎
We now formalize the discussion after Corollary 5.6 in the following proposition.
Proposition 6.8.
Suppose is a positive real and are sufficiently small positive reals. Furthermore, suppose . Then, there exists such that for all we have,
To prove this proposition we start with a lemma reducing the problem of estimating only when .
Lemma 6.9.
Suppose is a positive real and are sufficiently small positive reals. Furthermore, suppose . Then, there exists such that for all we have,
Proof.
Since , provided is sufficiently large, we have for all by Lemma 6.7. Thus,
For any positive we may choose small enough such that we have the following for sufficiently large ,
We now use the above lemma to prove Proposition 6.8
Proof of Proposition 6.8.
By Lemma 6.9 it is sufficient to prove that for all we have the following provided is sufficiently small
Provided is large enough Lemma 6.7 gives for all . Therefore,
Noting that each copy of in is a core graph with edges we find that . Therefore,
These inequalities imply the following for all and sufficiently small
This establish the proposition provided is small enough. ∎
Proposition 6.8 and Corollary 5.6 show that to prove Theorem 6.1 when we only need to bound for . Therefore, to prove the first item in Theorem 6.1 it is enough to prove that, when we have
for all . Moreover, to prove the second item in Theorem 6.1 it is enough to prove that, when we have
for all . We now split into these two cases mentioned above and prove them. First we deal with the denser case where .
6.2.1. The denser case in the sparse regime
Assume . Proposition 6.8 implies that we need to evaluate the term only for . To do this we use Lemma 6.2 which implies a connection between the number of induced copies of in a graph and the number of edges and vertices in it. We start with a simple corollary of Lemma 6.2.
Corollary 6.10.
Suppose are positive reals. Suppose further that and is an integer. Then, for any the number of vertices of is at most
Proof.
Since the minimum degree of is at least , we may apply Lemma 6.2 and obtain,
(18) |
By algebraic manipulations we see that, which is the assertion of the corollary. ∎
Now we obtain the desired bound for in this regime of .
Lemma 6.11.
Suppose that are positive reals with small enough, and small enough as a function of . Furthermore, suppose and is some constant that might depends on and . Then there exists such that for any and any we have,
Proof.
Let be such that and recall that . Applying Corollary 6.10 we obtain the following for any and sufficiently large :
Recalling that we may rewrite the above as follows provided is large enough:
We claim that for all . Indeed, for and . This concludes the proof as now provided is small enough we have the following for all :
6.2.2. The sparser case in the sparse regime
Assume . It was already seen that there is a big difference in the behaviour of the upper tail probability depending on the regime lies at. In this section we present a surprising change of the behaviour in the sparser regime. This phenomenon comes from the fact that the number of core graphs is significant and matters a lot in the evaluation of the upper tail probability. This is completely different from the previous cases where the number of cores was negligible. This will be explained in more details later on.
The essence of the proofs in this subsection is still to make use of the connection between the number of induced copies of in a core graph and the number of vertices in it. To state the claims in this section it is useful to introduce the following notations.
Notation.
Suppose is a graph and is a nonnegative integer. Then we denote the following sets accordingly:
-
•
Denote by the set of all edges of with an endpoint of degree and denote its cardinality by .
-
•
Denote by the set of all edges of whose both endpoints have degree greater than ; similarly we denote by the cardinality of .
In most cases we omit ‘’ for brevity.
The following lemma bounds from above the number of induced copies of in any core graph with edges where .
Lemma 6.12.
Suppose are positive reals and . Then there exists such that for any the following holds:
Let be a core graph with edges. Then
(19) |
and
(20) |
Proof.
Let be a core graph. First, let us make some notations for the proof. For every nonnegative integer let be set of all vertices of with degree and denote its cardinality by . Furthermore, let be the set of all vertices of with degree greater than and denote its cardinality by .
We now proceed with the proof. Note that taking large enough according to Lemma 6.6 applied with , we are guaranteed that no two vertices of degree less or equal to can be connected by an edge. This implies that is a partition of the edges of . Furthermore, and for any we have . This follows by a standard double counting argument and as the set of all vertices of degree at most is an independent set. We thus obtain (20) as
To prove (19) we note that there are three types of induced copies of in core graphs:
-
(i)
Induced copies of with exactly two vertices in .
-
(ii)
Induced copies of with exactly one vertex in .
-
(iii)
Induced copies of with no vertices in i.e. all vertices in .
Let us bound from above the number of induced copies of of each type. We claim that the number of induced copies of of the first type is at most
Indeed, for every , there are at most copies of with two vertices in and, for all there are at most copies of with one vertex in and one vertex in .
We now bound the number of induced copies of of the second kind. Note that each such induced copy of is determined by choosing its vertex of degree less or equal to , two of its neighbours and another vertex of degree larger than . Therefore, the number of induced copies of of the second kind it at most
To bound from above the number of induced copies of of the third kind we observe that each such induced copy of is determined by each of the two perfect matchings in it. Since each induced copy of of the third type contains only edges of , there are at most such copies. Summing all of these bounds we obtain the assertion of the lemma. ∎
Our only use of Lemma 6.12 is when . Therefore, form now on we set . We will only consider core graphs with at most edges, thus we may bound from above the right-hand side of (19) by
In particular, for every core graph with at most edges,
Recall that in this range of we have and note that by Lemma 6.12 for any core graph the term can be bounded from above using by
Furthermore, for we have,
Therefore, letting sufficiently large we find that,
(21) |
We now restate this bound in a more compact way. To this end we introduce the following notations. For every graph define the following:
-
•
is the vector defined by in its -th coordinate for and in the last coordinate.
-
•
is defined by as the first coordinate, as the -th coordinate for , and in the last coordinate.
Using these notations we may rewrite (6.2.2) as follows,
(22) |
From now on we let . Since for every ,
Note that in the above we dropped the condition that came from a graph. By this we only consider more possible cases and thus obtain an upper bound on the maximization problem.
Now we are ready to state the main technical proposition in order to bound from above the term when .
Proposition 6.13.
Suppose is an integer and suppose that are positive reals with small enough as a function of and . Then, there exists such that for any and any we have
This proposition and Proposition 6.8 imply the following corollary. This corollary yields the first item in Theorem 6.1 as explained before.
Corollary 6.14.
Suppose is an integer and suppose that are positive reals with small as a function of and . Then, there exists such that for any and any we have the following:
for every .
We now turn to the proof of Proposition 6.13. We show that for the maximum of is achieved by a vector of the form where is the -th element in the standard basis. The strategy of the proof is to think of the vector as a distribution vector and push its mass towards the ‘center of mass’ while keeping constant and ensuring that evaluated on the vector after the transformation is still greater than . To be more precise, the center of mass will be the -th coordinate such that satisfies . This will be done iteratively by showing that if is a vector achieving the maximum of and for all then we may define a vector which also achieves this maximum and satisfies for all . A similar statement will be shown for the other direction i.e. pushing the mass from the right to the left.
Let us introduce the following notations of pushing mass. Let be some integer (should be thought of as the center of mass) and suppose that . Then for any integers define the following vectors,
One should think of these operations as pushing the mass towards the -th coordinate while staying on the hyperplane defined by for some constant .
Proof of Proposition 6.13.
As explained earlier we prove this proposition iteratively. We do so in several claims. First, we show that we may assume that the distribution vector satisfies . This is given in the following claim.
Claim 6.15.
If for some positive , then provided is small enough there exists such that for any the following holds:
Proof.
Let
Furthermore, let be such that and . By the definition of we have,
and therefore, . In order to prove the lemma it is sufficient to prove . To do so let
which depends only on , and
Note that for all we have
Note also that for all we have . Therefore, we obtain that . As and a straightforward computation gives
Note that and . This holds as and we assume is small enough so that , as then the left-hand side decays to zero and the right-hand side approaches infinity. This implies that
Therefore,
Proving that the above is nonnegative provided is sufficiently large will conclude the proof. Indeed, letting
is equivalent to,
which is also equivalent to
Since and for all , it is enough to prove that
First, as , the right inequality is equivalent to
Equivalently, , which holds for satisfying as then the left-hand side is positive for and the right-hand side tends to minus infinity as tends to infinity. Second, as , the left inequality is equivalent to
This holds if and only if . Finally, assuming that is small enough so that implying the claim. ∎
We now continue to the more technical part of the proof which is to push the mass to the -th coordinate provided that the last coordinate is , which we may assume due to the above claim.
Claim 6.16.
If then there exists such that for any the following holds:
Proof.
Let
Furthermore, let be such that and . By Claim 6.15 we may assume . As satisfies we also have for any .
We will start with the push of the mass from the left to the right. To this end we prove iteratively the following: Let and suppose that for all . Then provided is large enough. To see this let
which depends only on , and let
which depends only on . Note that for all with for we have . This implies that . By the definition of we have
Thus, to show that it is suffices to show the following:
(23) |
and
(24) |
As we have . Therefore, inequality (23) is equivalent to
By algebraic manipulation the above is equivalent to
This always holds in our settings provided is sufficiently large. Indeed, the above is equivalent to which holds as . It is left to show inequality (24). We prove this in the following claim.
Claim 6.17.
Proof.
The inequality in the left-hand side is equivalent to as . Recall that and plug this in the above inequality to obtain
Equivalently,
Thus we conclude that our assumption is equivalent to the following:
Since we assumed and for all we have we also have . Therefore, for any we may iterate this process for and obtain that . This conclude the push of the mass of from the -th coordinates for to the -th coordinate.
Now provided what we showed so far we prove a similar statement for . More specifically, we prove iteratively the following: Let and suppose that for all and for all . Then provided is large enough. To see this let
which depends only on for , and let
which depends only on for .
Note that for all with for all or we have . Note also that for all thus, . This implies that
Thus, to show that it is suffices to show the following:
and
Note that these two inequalities are respectively equivalent to the following inequalities:
(25) |
(26) |
Inequality (25) holds always as . Moreover, by Claim 6.17 inequality (26) holds if and only if .This indeed holds as we assume and we also have for all . This conclude the proof of the lemma. ∎
Now we deduce the proposition from the above claims. Assuming is large enough and using Claim 6.16 it is sufficient to bound from above
Let be a witness for that. We have,
Denoting implies
Hence, for every positive and sufficiently small we obtain
The second option is not possible as is nonnegative. Thus we have, and hence
This is as claimed. ∎
7. The solution to the variational problem
In this section we solve the naive mean field variational problem (when ) for the upper tail of the number of induced copies of in . Let us recall the definition of the variational problem.
Let be a positive integer and let be a sequence of independent Bernoulli random variables with mean . Further, let be a function from the hypercube to . Then the naive mean field variational problem associated to the upper tail of is the function such that for every :
where , and the infimum is taken over all sequences of Bernoulli random variables with means respectively.
We will be interested in the case where , the random variables correspond to the edges of , and is the random variable counting the number of induced copies of . The main proposition of this section is the following.
Proposition 7.1.
Suppose are positive reals. Further, suppose that . Then, for large enough , we have
The upper bound follows immediately from
In Section 4 we this minimum. It is attained when is the complete bipartite graph where . For more details the reader is referred to Section 4.
For the lower bound we start with the following reduction.
Lemma 7.2.
Suppose are positive reals. Furthermore, suppose . Then, for large enough , we have
To prove the lemma we define the set as follows: Let be a set of representatives for the set of all tuples with distinct coordinates and the equivalence relation, if and only if ; i.e. both and induce the same 4-cycle in .
Proof.
Let be such that . Define by setting . Since is monotone decreasing between and , we obtain that
To conclude the proof, we prove that . Indeed, provided is sufficiently large, we have
Next we present a lemma about the structure of when is close to the infimum in Lemma 7.2.
lemmalemmaone Suppose are positive reals with sufficiently small. Suppose also that . Then, for any sequence satisfying:
-
(1)
for some ,
-
(2)
, and
-
(3)
,
we have provided is large enough.
This lemma is a variant of the methods in the papers of Lubetzky and Zhao [27] and in Bhattacharya, Ganguly, Lubetzky and Zhao [5, Section 5.2]. In these papers the authors use the language of graph limits or ‘graphons’; we recreate their proof in the language of graphs in the Appendix. Further, Lubetzky and Zhao [27] and Bhattacharya, Ganguly, Lubetzky and Zhao [5], analysed the function and proved several lemmas in the language of graphons. We use these lemmas in the proof of Proposition 7.1, once again in the language of graphs and not graphons. For a proof of the first lemma we refer the reader to [27] and [5]; the second lemma will be proven in the Appendix with some extensions.
[[27]]lemmalemmathree Suppose are positive reals and . Suppose also that such that . Let be such that . Then, provided is large enough there is a constant such that the following holds:
-
(i)
and
-
(ii)
.
Now we are ready to prove Proposition 7.1.
Proof of Proposition 7.1.
We wish to prove the following:
for sufficiently small and for such that
We may assume that
for some absolute constant as otherwise there is nothing to prove. By Lemma 7.2 and Lemma 7.2 we may assume that for some which satisfies
where is the number of copies of in . Note that
In the next three claims we collect some properties of with non-negligible contribution to the sum above. Afterwards, we use these properties to show that the set of all 4-cycles with non-negligible contribution to the sum above is a subset of
where the summation is modulo and is some positive constant depending on . We now explain briefly the proof strategy: First, we show that to have a non-negligible contribution a 4-cycle must contain a with both edge weights at least ; for convenience assume that these edges are . Afterwards, we show that, among such 4-cycles, the only ones contributing a non-negligible amount must additionally satisfy or , assume . Then, we prove that a 4-cycle with the above properties contributes a non-negligible amount only if (in the complementary case we show ). Using symmetries of the 4-cycle, we may use this argument again from a different point of view. We then obtain that the only 4-cycles contributing a non-negligible amount to the sum are ones with for all . This explanation can also be viewed in Figure 1.
We now execute this plan rigorously.
Claim 7.3.
For all the following holds for large enough
where is the set of all tuples with
Proof.
Suppose otherwise, meaning,
By the definition of we have
This implies that
and therefore,
By Lemma 7.2 we have . This implies,
contradicting the assumption that . ∎
Claim 7.4.
For all the following holds for large enough
where , and is the set of all with
Proof.
Assume towards contradiction that
We claim that for any there is satisfying
where summation is taken modulo . Indeed, let . Since we have
and
Without loss of generality, assume that . Note that as there is such that . For we have implying that and therefore, . We continue by letting be the set of all such that , and we claim that
This follows from the fact that any in can be extended to a in at most ways. Therefore, we obtain the following:
By Lemma 7.2 we have,
This is a contradiction to our assumption that as and therefore, . ∎
Claim 7.5.
For any there exists such that the following holds provided is large enough
where , and is the set of all tuples with
and
Proof.
Let and assume towards contradiction that
Note that any satisfies
and
for some and where summation is taken modulo . Letting be the set of all such that we have the following by the definition of :
This implies that
Therefore, by the definition of and by Lemma 7.2 we have,
This is a contradiction to the choice of as by our assumptions we have . ∎
We now proceed with the proof of Proposition 7.1. By the our assumptions and the above three claims we obtain the following for any and small enough :
We claim that provided is large enough we have:
Indeed, as , there exists such that provided and is large enough we have:
Further, as we have
Without loss of generality, assume the latter holds. In particular we have,
Since , we have
Finally, as , for large enough we have implying that
Letting be the set of all with , we claim the following holds:
(27) |
Indeed,
is at most
where the sum ranges over unordered pairs, , such that are all distinct. We also have
This implies that
where the summation in the right-hand side is over unordered pairs, , such that are all distinct. This implies (27) as
where the unlabeled summations are over unordered pairs . We conclude that:
Now we can bound from below the ‘cost’ using Lemma 7.2 as follows:
This finishes the proof of the proposition as ∎
References
- [1] Noga Alon, On the number of subgraphs of prescribed type of graphs with a given number of edges, Israel J. Math. 38 (1981), no. 1-2, 116–130. MR 599482
- [2] Fanny Augeri, Nonlinear large deviation bounds with applications to Wigner matrices and sparse Erdős-Rényi graphs, Ann. Probab. 48 (2020), no. 5, 2404–2448. MR 4152647
- [3] by same author, A transportation approach to the mean-field approximation, Probability Theory and Related Fields 180 (2021), no. 1, 1–32.
- [4] Anirban Basak and Riddhipratim Basu, Upper tail large deviations of regular subgraph counts in Erdős–Rényi graphs in the full localized regime, arXiv preprint arXiv:1912.11410 (2019).
- [5] Bhaswar B Bhattacharya, Shirshendu Ganguly, Eyal Lubetzky, and Yufei Zhao, Upper tails and independence polynomials in random graphs, Advances in Mathematics 319 (2017), 313–347.
- [6] Béla Bollobás, Random graphs, Combinatorics (Swansea, 1981), London Math. Soc. Lecture Note Ser., vol. 52, Cambridge Univ. Press, Cambridge-New York, 1981, pp. 80–102. MR 633650
- [7] Béla Bollobás, Chiê Nara, and Shun-ichi Tachibana, The maximal number of induced complete bipartite graphs, Discrete Math. 62 (1986), no. 3, 271–275. MR 866942
- [8] Sourav Chatterjee, The missing log in large deviations for triangle counts, Random Structures & Algorithms 40 (2012), no. 4, 437–451.
- [9] Sourav Chatterjee and Amir Dembo, Nonlinear large deviations, Advances in Mathematics 299 (2016), 396–450.
- [10] Sourav Chatterjee and SR Srinivasa Varadhan, The large deviation principle for the Erdős–Rényi random graph, European Journal of Combinatorics 32 (2011), no. 7, 1000–1017.
- [11] Nicholas Cook and Amir Dembo, Large deviations of subgraph counts for sparse Erdős–Rényi graphs, Advances in Mathematics 373 (2020), 107289.
- [12] Nicholas A Cook, Amir Dembo, and Huy Tuan Pham, Regularity method and large deviation principles for the Erdős–Rényi hypergraph, arXiv preprint arXiv:2102.09100 (2021).
- [13] Robert DeMarco and Jeff Kahn, Tight upper tail bounds for cliques, Random Structures & Algorithms 41 (2012), no. 4, 469–487.
- [14] by same author, Upper tails for triangles, Random Structures Algorithms 40 (2012), no. 4, 452–459. MR 2925307
- [15] Amir Dembo and Ofer Zeitouni, Large deviations techniques and applications, Stochastic Modelling and Applied Probability, vol. 38, Springer-Verlag, Berlin, 2010, Corrected reprint of the second (1998) edition. MR 2571413
- [16] Ronen Eldan, Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations, Geometric and Functional Analysis 28 (2018), no. 6, 1548–1596.
- [17] Paul Erdős, On the number of complete subgraphs contained in certain graphs, Magyar Tud. Akad. Mat. Kutató Int. Közl 7 (1962), no. 3, 459–464.
- [18] Ehud Friedgut and Jeff Kahn, On the number of copies of one hypergraph in another, Israel J. Math. 105 (1998), 251–256. MR 1639767
- [19] Dániel Gerbner, Dániel T. Nagy, Balázs Patkós, and Máté Vizer, On the maximum number of copies of in graphs with given size and order, J. Graph Theory 96 (2021), no. 1, 34–43. MR 4191113
- [20] Benjamin Gunby, Upper tails of subgraph counts in sparse regular graphs, arXiv preprint arXiv:2010.00658 (2020).
- [21] Matan Harel, Frank Mousset, and Wojciech Samotij, Upper tails via high moments and entropic stability, arXiv:1904.08212 (2019).
- [22] Svante Janson, Krzysztof Oleszkiewicz, and Andrzej Ruciński, Upper tails for subgraph counts in random graphs, Israel J. Math. 142 (2004), 61–92. MR 2085711
- [23] Svante Janson and Andrzej Ruciński, The infamous upper tail, Random Structures & Algorithms 20 (2002), no. 3, 317–342.
- [24] Jeong Han Kim and Van H Vu, Divide and conquer martingales and the number of triangles in a random graph, Random Structures & Algorithms 24 (2004), no. 2, 166–174.
- [25] Yang P Liu and Yufei Zhao, On the upper tail problem for random hypergraphs, Random Structures & Algorithms 58 (2021), no. 2, 179–220.
- [26] Eyal Lubetzky and Yufei Zhao, On replica symmetry of large deviations in random graphs, Random Structures & Algorithms 47 (2015), no. 1, 109–146.
- [27] by same author, On the variational problem for upper tails in sparse random graphs, Random Structures & Algorithms 50 (2017), no. 3, 420–436.
- [28] Andrzej Ruciński, When are small subgraphs of a random graph normally distributed?, Probab. Theory Related Fields 78 (1988), no. 1, 1–10. MR 940863
- [29] Van H. Vu, A large deviation result on the number of small subgraphs of a random graph, Combin. Probab. Comput. 10 (2001), no. 1, 79–94. MR 1827810
Appendix A Translating graphons’ language into graphs’ language
The aim of this Appendix is to add a proof (in the language of graphs instead of graphons) for Lemmas 7.2 and 7.2 which were proven by Lubetzky–Zhao [27] and Bhattacharya, Ganguly, Lubetzky and Zhao [5] in the language of graphons.
We start with a discussion of how one would prove Lemma 7.2, then we state some lemmas and an extension of Lemma 7.2. To prove Lemma 7.2 we will use Lemma 7.2, and therefore it will be proven before Lemma 7.2. Recall Lemma 7.2:
*
We start with by analysing the term . Suppose satisfies the conditions of Lemma 7.2. Note that
Since we have the following for large enough :
where stands for a spanning subgraph which is not equal to the host graph. Recalling that , we obtain,
where is the path with three edges, is the matching of size two, is the complete bipartite graph with sides of size and and an extra isolated vertex and is the disjoint union of an edge and an independent set of size two. Therefore, to prove the lemma, it is enough to show that all of the terms in the right-hand side of the above inequality except for are negligible in comparison to .
*
Lemma A.1 ([27]).
There exists such that for all and we have,
Corollary A.2 ([27]).
There exists such that for all and all we have,
where the -term goes to zero as .
Another useful tool we use is the generalized Hölder inequality:
Lemma A.3 (Generalized Hölder inequality [5]).
Let be probability measures on respectively, and let . Let be non-empty subsets of and for put and . Let for each , and suppose that for all . Then,
In particular, if every element of is contained in at most two sets , then we can take all and obtain:
Next, we recreate a proof of Lubetzky–Zhao [27] and Bhattacharya, Ganguly, Lubetzky and Zhao [5] of an extension of Lemma 7.2. Then, we will derive the required bounds.
Lemma A.4.
Suppose are positive reals and . Suppose also that such that . Let be such that . Then, provided is large enough there is a constant such that the following holds:
-
(i)
-
(ii)
, and
-
(iii)
Proof.
We first show that the set is empty, provided is large enough. If this were not true, then, as is a convex function of , we have the following by Jensen’s inequality
where the last inequality follows from Lemma 7.2. Since and we obtain the following:
and therefore, is empty. To prove (i), we use the convexity of and Lemma A.1,
Since, we obtain:
Now we prove (ii). By convexity we have,
Since we may use Lemma 7.2, and obtain the following for large enough :
This implies the following provided is large enough,
By the monotonicity of for we obtain (ii). That is
Lastly, we prove (iii). By Corollary A.2,
This and the assumption of the lemma implies (iii). ∎
Now we can finish the proof of Lemma 7.2.
Lemma A.5.
Suppose are positive reals and . Suppose also that is a sequence of reals between and such that . Then,
Proof.
By item (ii) in Lemma A.4 we have , and therefore,
where is the number of in . Let . Note that each with vertices can be represented in exactly two ways as a tuple where edges are consecutive vertices in this tuple. Let be a collection of exactly one such representative for each in . Observe the following:
where is the number of in . By the generalized Hölder inequality we have:
Applying items (i) and (iii) in Lemma A.4 we obtain
This establishes the third assertion of the lemma. The fourth assertion of the lemma follows immediately from Lemma A.4 item (i) and the definition of :