The upper tail problem for induced 4-cycles in sparse random graphs

Asaf Cohen Antonir

Abstract.

Building on the techniques from the breakthrough paper of Harel, Mousset and Samotij, which solved the upper tail problem for cliques, we compute the asymptotics of the upper tail for the number of induced copies of the 4-cycle in the binomial random graph $G_{n,p}$ . We observe a new phenomenon in the theory of large deviations of subgraph counts. This phenomenon is that, in a certain (large) range of $p$ , the upper tail of the induced 4-cycle does not admit a naive mean-field approximation.

1. Introduction

The study of the random variable counting the number of copies of a given graph in the binomial random graph $G_{n,p}$ has a very long history, and much is known about it. In particular, for every graph $H$, when the expected number of copies of $H$ in $G_{n,p}$ tends to infinity, a ‘law of large numbers’ (see Bollobás [6]) and a central limit theorem (see Ruciński [28]) are known.

After establishing such results, it is natural to ask what is the probability of the event that this random variable differs from its expectation by a significant amount. In this paper, we will consider only the ‘upper tail’ problem: what is the probability of a random variable exceeding its expectation by a multiplicative factor $1+\delta$ , where $\delta$ is some positive real.

We continue by recalling some definitions to make our discussion more rigorous. For every non-negative integer $n$ and a sequence $p=p(n)\in(0,1)$ , we let $G_{n,p}$ be the binomial random graph with $n$ vertices and density $p$ . Furthermore, for any graph $H$ , let $X^{H}_{n,p}$ be the random variable counting the number of (labeled) copies of $H$ in $G_{n,p}$ . Further, for any $\delta$ a positive real and a random variable $Y$ we write $\mathbf{UT}(Y,\delta)$ for the event $\{Y\geq(1+\delta)\mathbb{E}[Y]\}$ .

The work on the problem of estimating (the logarithm of) the upper tail probability of $X_{n,p}^{H}$ was initiated by the famous works of Kim and Vu [24], Vu [29], and Janson and Ruciński [23]. This problem turns out to be difficult (Janson and Ruciński [23] called it ‘the infamous upper tail’) because, when $H$ is a connected graph with at least two edges, $X_{n,p}^{H}$ is not a linear function of independent Bernoulli random variables. Linear functions are easier to handle, and much is known about their large deviation properties; see [15].

In a famous paper [22], Janson, Oleszkiewicz, and Ruciński estimated the logarithm of the upper tail probability for every graph $H$ up to a multiplicative factor of $O(\log(1/p))$ . Seven years later, Chatterjee [8] and DeMarco–Kahn [14] closed this gap up to a multiplicative factor of $O(1)$ when $H$ is a triangle and DeMarco–Kahn [13] generalized this for a clique of arbitrary size.

Next, one would like to obtain a first order approximation of the logarithm of the upper tail probability. The first to do that were Chatterjee and Varadhan [10]. They obtained a first order approximation of the upper tail probability for any graph, under the assumption that $p$ is a constant. Their proof relied on the regularity method and therefore extends only to $p$ tending to zero very slowly; for more discussion, see [26].

The general strategy for estimating the logarithm of the upper tail probability, used by Chatterjee and Varadhan as well as all later works on this subject, is to establish a ‘large deviation principle’, that is, to prove that the logarithm of the upper tail probability is asymptotically equal to the solution to a minimization problem over a non-trivial set of product probability measures. These minimization problems are called ‘variational problems’. Having established such a large deviation principle, one is then left with solving the variational problem.

In a breakthrough paper of Chatterjee–Dembo [9], the authors established a ‘large deviation principle’ not only for $X_{n,p}^{H}$ when $p=\Omega(n^{-\varepsilon})$ , but also for a large class of functions of $N$ independent Bernoulli random variables with mean $p=\Omega(N^{-\varepsilon})$ . They proved that for any ‘smooth enough’ function $f\colon\{0,1\}^{N}\to\mathbb{R}$ and a sequence of independent Bernoulli random variables $Y=(Y_{1},Y_{2},\ldots,Y_{N})\in\{0,1\}^{N}$ , all with mean $p$ , we have the following:

-\log\mathbb{P}\left(\mathbf{UT}(f(Y),\delta)\right)=(1+o(1))\inf\left\{\sum_{i=1}^{N}I_{p}(q_{i}):\mathbb{E}[f(\tilde{Y})]\geq(1+\delta)\mathbb{E}[f(Y)]\right\},

(1)

where $I_{p}(q)=q\log\left(\frac{q}{p}\right)+(1-q)\log\left(\frac{1-q}{1-p}\right)$ is the relative entropy (the Kullback–Leibler divergence) between $\operatorname{Ber}(q)$ and $\operatorname{Ber}(p)$ , and $\tilde{Y}=(\tilde{Y}_{1},\tilde{Y}_{2},\ldots,\tilde{Y}_{N})\in\{0,1\}^{N}$ is a sequence of independent Bernoulli random variables with $\mathbb{E}[\tilde{Y}_{i}]=q_{i}$ for each $i$ .
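As a small illustration of the cost functional in (1), the following sketch evaluates $I_{p}(q)$ and the total cost $\sum_{i}I_{p}(q_{i})$ of a product measure. The function names `relative_entropy` and `tilt_cost` are ours and do not appear in the paper; the convention $0\log 0=0$ is assumed at the endpoints $q\in\{0,1\}$.

```python
import math


def relative_entropy(q: float, p: float) -> float:
    """I_p(q): Kullback-Leibler divergence between Ber(q) and Ber(p).

    Uses the convention 0 * log(0) = 0, so q = 0 and q = 1 are allowed.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    total = 0.0
    if q > 0.0:
        total += q * math.log(q / p)
    if q < 1.0:
        total += (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return total


def tilt_cost(qs, p: float) -> float:
    """Cost sum_i I_p(q_i) of tilting Ber(p)^N to the product measure with means qs."""
    return sum(relative_entropy(q, p) for q in qs)


if __name__ == "__main__":
    p = 0.01
    # Forcing a coordinate to 1 costs log(1/p); leaving it at p costs nothing.
    print(relative_entropy(1.0, p), math.log(1.0 / p))
    print(tilt_cost([1.0, 1.0, p, p], p))
```

In particular, a tilt with $q_{i}\in\{p,1\}$ costs $\log(1/p)$ per coordinate set to $1$, which is why planting a graph $C$ has cost $e_{C}\log(1/p)$.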

This was further developed by Eldan [16] and Augeri [2, 3]. These methods are completely different from the ones used in the dense regime.

Revisiting the ideas of Chatterjee–Varadhan [10], Cook–Dembo [11] developed a decomposition theorem similar to Szemerédi’s regularity lemma and a corresponding counting lemma which are suitable for sparse graphs. Using these, they extended the range of sparsity of $p$ where the variational approximation (1) holds (for every graph $H$). Generalizing this method, Cook–Dembo–Pham [12] pushed the bounds even further in the case of subgraph counts. They also obtained an approximation of the logarithm of the upper tail probability for the count of uniform sub-hypergraphs in the binomial random uniform hypergraph model, and for the induced count of graphs in $G_{n,p}$. Their result, combined with the solution to the corresponding variational problem for uniform hypergraph cliques and one more non-trivial 3-uniform hypergraph, which were given by Liu–Zhao in [25], yields estimates for the sub-hypergraph count in the binomial random hypergraph model in the sparse regime. We also note that, to the best of our knowledge, the solution to the ‘induced variational problem’ is not known apart from the case where $H$ is a clique. The case of the induced subgraph count will be of main interest in this paper, and will be discussed later.

To estimate the asymptotics of the logarithm of the upper tail probability, one also needs to solve the variational problem in the right-hand side of (1). In the case of the random variable $X_{n,p}^{H}$ , Lubetzky and Zhao [27], and Bhattacharya, Ganguly, Lubetzky, and Zhao [5] solved this variational problem for all $H,n$ and $p$ satisfying $n^{-1/\Delta_{H}}\ll p\ll 1$ . This then leads to a first order approximation of the logarithm of the upper tail probability for $X_{n,p}^{H}$ for every graph $H$ and $p\geq n^{-c_{H}}$ . We wish to emphasize that, in all known cases, when $f$ counts the number of copies of a given $H$ in $G_{n,p}$ , the solution to the corresponding variational problem (the right hand side of (1)) always satisfies $q_{i}\in\{p,1\}$ when $p=o(1)$ .

In a recent surprising result, Gunby [20] studied the upper tail probability for subgraph counts in a random regular graph $G_{n,d}$ and also proved a large deviation principle. However, for a particular graph and a certain range of $d$ (the regularity), the solution to the variational problem is achieved with three possible values of $q_{i}$, and not two as in the binomial random graph model.

In a breakthrough paper of Harel, Mousset and Samotij [21], using a combinatorial approach, the authors managed to extend the range where the variational problem (the same one as before) bounds from above the logarithm of the upper tail probability for the count of cliques to the optimal range of $p$, which is $p^{\frac{r-1}{2}}\gg n^{-1}(\log(n))^{\frac{1}{r-2}}$ for a clique on $r$ vertices (this range is optimal because, below this threshold, they also showed a Poisson approximation). We wish to emphasize that their approach for solving the upper tail problem is completely different from any previous techniques in the papers mentioned above. This, together with the solution to the variational problem, settled the problem of the first order approximation of the logarithm of the upper tail probability for cliques. Moreover, they established the correct range of $p$, up to a polylogarithmic factor, where the variational problem bounds from above the logarithm of the upper tail probability for non-bipartite regular graphs (for bipartite regular graphs their result was not optimal, but it also represented a lot of progress).

Building on the combinatorial ideas of Harel, Mousset and Samotij, Basak and Basu [4] extended the range of $p$ to the optimal range for all regular graphs, including bipartite graphs. Again, this, together with the solution to the variational problem, settled the problem of estimating the logarithm of the upper tail probability for regular graphs. We summarize this discussion with the following first order approximation of the upper tail problem for regular graphs in the ‘localized regime’.

Theorem 1.1 (Due to [21, 4]).

For any $\Delta\geq 2$ and a $\Delta$ -regular connected graph $H$ the following holds:

\frac{-\log\mathbb{P}\left(X_{n,p}^{H}\geq(1+\delta)\mathbb{E}\left[X_{n,p}^{H}\right]\right)}{n^{2}p^{\Delta}\log(1/p)}=(1+o(1))\begin{cases}\min\{\theta_{H},\frac{1}{2}\delta^{2/v_{H}}\}&\text{ if }\frac{1}{\sqrt{n}}\ll p^{\Delta/2}\ll 1,\\ \frac{1}{2}\delta^{2/v_{H}}&\text{ if }\frac{\log^{\frac{1}{v_{H}-2}}(n)}{n}\ll p^{\Delta/2}\ll\frac{1}{\sqrt{n}},\end{cases}

where $\theta_{H}$ is the unique positive solution to the equation $P_{H}(\theta)=1+\delta$ and $P_{H}$ is the independence polynomial of $H$, that is, $P_{H}(X)=\sum_{k=0}^{\alpha(H)}s_{k}X^{k}$, where $s_{k}$ is the number of independent sets in $H$ of size $k$.
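To make the quantities in Theorem 1.1 concrete, here is a minimal brute-force sketch that computes the independence polynomial of a small graph, solves $P_{H}(\theta)=1+\delta$ for its positive root by bisection, and evaluates the first case $\min\{\theta_{H},\frac{1}{2}\delta^{2/v_{H}}\}$. All function names are ours, and the enumeration is only meant for graphs with a handful of vertices. For $H=C_{4}$ (non-induced copies) one gets $P_{C_{4}}(X)=1+4X+2X^{2}$.

```python
from itertools import combinations


def independence_polynomial(num_vertices, edges):
    """Return [s_0, s_1, ..., s_alpha], where s_k counts independent sets of size k."""
    edge_set = {frozenset(e) for e in edges}
    coeffs = []
    for k in range(num_vertices + 1):
        count = sum(
            1
            for subset in combinations(range(num_vertices), k)
            if all(frozenset(pair) not in edge_set for pair in combinations(subset, 2))
        )
        if count == 0:  # no independent set of size k, hence none of larger size
            break
        coeffs.append(count)
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def solve_theta(coeffs, delta):
    """Positive root of P_H(theta) = 1 + delta, found by bisection.

    P_H is increasing on [0, infinity) because its coefficients are nonnegative.
    """
    lo, hi = 0.0, 1.0
    while evaluate(coeffs, hi) < 1.0 + delta:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if evaluate(coeffs, mid) < 1.0 + delta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def localized_rate(coeffs, num_vertices, delta):
    """First case of Theorem 1.1: min(theta_H, delta^(2/v_H) / 2)."""
    return min(solve_theta(coeffs, delta), 0.5 * delta ** (2.0 / num_vertices))


if __name__ == "__main__":
    c4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    coeffs = independence_polynomial(4, c4_edges)
    print(coeffs, solve_theta(coeffs, 1.0), localized_rate(coeffs, 4, 100.0))
```

For small $\delta$ the minimum is attained by $\theta_{H}$ (the Hub term), while for large $\delta$ the clique term $\frac{1}{2}\delta^{2/v_{H}}$ is smaller.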

In this paper we estimate the logarithm of the upper tail probability for the count of induced copies of $C_{4}$ in $G_{n,p}$ in the range $\frac{\log^{9}(n)}{n}\ll p\ll 1$ , which is optimal up to $\log^{9}(n)$ in the lower bound. To the best of our knowledge, this is the first exact result for the upper tail probability of a random variable counting the number of induced copies of a given non-complete graph in $G_{n,p}$ . Our proofs rely heavily on the combinatorial approach of Harel, Mousset and Samotij.

We now present the main result of this paper. For this, from now on let $X$ be the random variable counting the number of induced copies of $C_{4}$ in $G_{n,p}$ .

Theorem 1.2.

There is an explicit sequence $0=c_{1}<c_{2}<\ldots\leq 1/3$ such that the following holds for $p\gg\frac{\log^{9}(n)}{n}$:

\frac{-\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])}{n^{2}p^{2}\log(1/p)}=(1+o(1))\begin{cases}\rho_{k}(n,p)\sqrt{\frac{\delta}{2}}&\text{ if }n^{-1+c_{k-1}}\leq p\leq n^{-1+c_{k}},\\ \sqrt{\frac{\delta}{2}}&\text{ if }n^{-2/3}\leq p\ll\frac{1}{\sqrt{n}\log(n)},\\ \sqrt{\frac{\delta}{2}+\frac{1}{{128}}}-\frac{1}{\sqrt{128}}&\text{ if }n^{-1/2}\ll p\ll 1,\end{cases}

where $\rho_{k}(n,p)=\sqrt{\frac{k}{k-1}}\left(1-\frac{2}{k}+\frac{\log(n)}{k\log(1/p)}\right)<1$.

The sequence above is explicitly defined in Section 6.

There are several interesting phenomena which distinguish our main theorem from previous results concerning the upper tail problem for subgraph counts.

The first, and most noticeable, difference is the infinite number of phase transitions. To the best of our knowledge, there is no earlier example of an upper tail problem exhibiting infinitely many phase transitions in its first order approximation. To understand the reasons for this phenomenon, let us first discuss Theorem 1.1 further. A strategy to bound from below the upper tail probability is to observe that, conditioned on the event $C\subseteq G_{n,p}$ for some $C$ satisfying $\mathbb{E}\left[X_{n,p}^{H}\mid C\subseteq G_{n,p}\right]\geq(1+\delta)\mathbb{E}\left[X_{n,p}^{H}\right]$, the upper tail event holds with ‘decent’ probability. This allows us to bound the upper tail probability from below by the probability of $C\subseteq G_{n,p}$ (which is $p^{e_{C}}$) times the ‘decent’ probability above (which is $p^{o(e_{C})}$). We refer to this as ‘planting a copy of $C$’. Of course, one would like to consider the smallest such $C$, to increase the lower bound as much as possible.

In the previous works mentioned earlier, it was shown that for $\Delta$ -regular graphs there are two natural candidates for $C$ . These candidates are: a clique on $\Theta(np^{\Delta/2})$ vertices or a spanning complete bipartite graph with the smaller side of size $\Theta(np^{\Delta})$ . The second construction is often called a Hub. When $p\ll n^{-1/\Delta}$ , the small side of the Hub construction needs to be smaller than one and hence at that point the Hub construction is no longer valid, and we are left only with the clique construction as a lower bound for the upper tail probability. This is the reason for the two different regimes in Theorem 1.1.

Even though $C_{4}$ is a regular graph, the presence of a clique in $G_{n,p}$ does not boost the expected number of induced 4-cycles by a significant amount. Therefore, the first construction is no longer valid. The analog of this construction in our case is a balanced and complete bipartite graph with $\Theta(np)$ vertices. Surprisingly, although amongst all graphs with a given number of edges the complete and balanced bipartite graph maximizes the number of induced copies of $C_{4}$ (see [7, 19] and Lemma 6.2) planting it does not always give the strongest lower bound on the upper tail probability.

This is because, in a range of densities $p$ , there is a large family $\mathcal{K}_{p}$ of subgraphs of $K_{n}$ , where each $C\in\mathcal{K}_{p}$ satisfies $\mathbb{E}[X\mid C\subseteq G_{n,p}]\geq(1+\delta)\mathbb{E}[X]$ and has only slightly suboptimal size (amongst all $C$ with this property), that is so ‘large’ that

\log\mathbb{P}(C\subseteq G_{n,p}\text{ for some }C\in\mathcal{K}_{p})\geq-(1-\Omega(1))\min_{C}e(C)\log(1/p),

where the minimum ranges over all graphs $C$ with $\mathbb{E}[X\mid C\subseteq G_{n,p}]\geq(1+\delta)\mathbb{E}[X]$ ; moreover, conditioned on the event that $C\subseteq G_{n,p}$ for some $C\in\mathcal{K}_{p}$ , the upper tail event still holds with ‘decent’ probability. Note that this is very different from the case of non-induced regular graphs, as here we condition on the appearance of some graph from $\mathcal{K}_{p}$ rather than a single graph (which corresponds to tilting the measure of $G_{n,p}$ to $G_{n,\bar{q}}$ for some $\bar{q}\in\{p,1\}^{\binom{n}{2}}$ ). In fact, we will show (in Section 7) that the lower bound obtained with the use of $\mathcal{K}_{p}$ is stronger than any bound corresponding to the more general tilting $G_{n,p}$ to $G_{n,\bar{q}}$ for some $\bar{q}\in[0,1]^{\binom{n}{2}}$ .

Let us elaborate on the structure of the graphs in $\mathcal{K}_{p}$. The family $\mathcal{K}_{p}$ depends on $p$ in the following way: provided $n^{-1+c_{k-1}}\leq p\leq n^{-1+c_{k}}$ for some integer $k\geq 2$, the set $\mathcal{K}_{p}$ comprises all complete bipartite graphs with sides $k$ and $\ell=2\sqrt{\frac{\delta\mathbb{E}[X]}{k(k-1)}}$. Note that, for any fixed $k$ and large $n$, the graph $K_{k,\ell}$ has more edges than the complete and balanced bipartite graph with the same number of induced copies of $C_{4}$. However, the key point here is that we are no longer planting a single copy of this graph. Since $\ell$ is large, there are many embeddings of $K_{k,\ell}$ in $K_{n}$ and the probability that $G_{n,p}$ contains some such copy is affected by the combinatorial factor of the number of possible embeddings. It turns out that the lower bound obtained by considering these families is actually tight in the range where $\frac{\log^{9}(n)}{n}\ll p\ll n^{-2/3-o(1)}$. Further, between $\frac{\log^{9}(n)}{n}$ and $n^{-2/3-o(1)}$ the family $\mathcal{K}_{p}$ changes infinitely many times depending on $p$, which explains the infinite number of phase transitions. This will be explained in more detail in Section 7.
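The edge comparison between $K_{k,\ell}$ and the balanced complete bipartite graph can be checked by brute force on small examples. The sketch below (helper names are ours) counts unlabeled induced copies of $C_{4}$, that is, vertex 4-sets inducing a 4-cycle, and compares the result with the closed form $\binom{a}{2}\binom{b}{2}$ for $K_{a,b}$.

```python
from itertools import combinations
from math import comb


def complete_bipartite(a, b):
    """Return (number of vertices, edge set) of K_{a,b} on vertices 0..a+b-1."""
    edges = {frozenset((i, a + j)) for i in range(a) for j in range(b)}
    return a + b, edges


def count_induced_c4(num_vertices, edges):
    """Count vertex 4-sets whose induced subgraph is a 4-cycle."""
    count = 0
    for quad in combinations(range(num_vertices), 4):
        induced = [pair for pair in combinations(quad, 2) if frozenset(pair) in edges]
        if len(induced) != 4:
            continue
        degree = {v: 0 for v in quad}
        for u, v in induced:
            degree[u] += 1
            degree[v] += 1
        # Four vertices, four edges, all degrees 2: the induced graph is C_4.
        if all(d == 2 for d in degree.values()):
            count += 1
    return count


if __name__ == "__main__":
    for a, b in [(4, 4), (2, 8)]:
        n, edges = complete_bipartite(a, b)
        print(a, b, len(edges), count_induced_c4(n, edges), comb(a, 2) * comb(b, 2))
```

Both $K_{4,4}$ and $K_{2,8}$ have 16 edges, but the balanced graph contains more induced 4-cycles. Equivalently, matching the count $\binom{k}{2}\binom{\ell}{2}\approx k(k-1)\ell^{2}/4$ to that of a balanced graph $K_{m,m}$, which is about $m^{4}/4$, costs roughly a factor $\sqrt{k/(k-1)}$ more edges.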

Organisation of the paper

In Section 2, we prove a general large deviation principle for nonnegative functions of independent Bernoulli variables. It is a version of Theorem 9.1 in [21], which was stated there without proof. In Section 3, we continue by connecting the general large deviation principle to our large deviation problem. To this end we define special classes of graphs called core graphs. Then we modify the general result proved in Section 2 using this notion of core graphs.

Section 4 is independent of the others and gives lower bounds for the upper tail probability by planting graphs in the denser regimes and families of graphs in the sparser regimes.

In Section 5, we solve the variational problem presented in Section 3. Furthermore, we make a connection between the number of core graphs with $m$ edges and the maximum number of vertices a core graph with $m$ edges might have.

Section 6 uses the results from Section 5 to provide upper bounds for the logarithm of the upper tail probability. This section is divided into three parts. First is the dense regime $n^{-1/2}\ll p\ll 1$ , second is the sparse regime $\frac{\log^{9}(n)}{n}\ll p\ll\frac{1}{\sqrt{n}\log(n)}$ which we also split into two cases: the dense case in the sparse regime $n^{-2/3}\ll p\ll\frac{1}{\sqrt{n}\log(n)}$ and the sparser cases in the sparse regime $n^{-1+c_{k-1}}\ll p\ll n^{-1+c_{k}}$ . Before the second split, we develop some tools used in both cases.

In Section 7 we solve the naive mean-field variational problem, showing that its solution differs from the solution to the variational problem we used.

Appendix A was added for completeness, where we reprove results of [27, 5] in a language of graphs rather than graphons.

Acknowledgments

The author thanks his advisor Wojciech Samotij for his guidance throughout the write-up of this paper, as well as for helpful and fruitful discussions. The author also thanks Arnon Chor and Dor Elboim for helpful discussions. Lastly, the author thanks Eden Kuperwasser for fruitful discussions and for creating the figure in this paper.

2. Main tools - polynomials in the hypercube

We start with some notations and definitions which can also be found in [21]. We work within the probability space $(\{0,1\}^{N},\operatorname{Ber}({p})^{N})$ . Suppose $I\subseteq[N]$ and $x\in\{0,1\}^{N}$ . We say that

F(I,x)=\{y\in\{0,1\}^{N}:y_{n}=x_{n}\text{ for all }n\in I\}

is a subcube centered at $(I,x)$ with codimension $|I|$ denoted by $\operatorname{codim}(F)=|I|$ . Note that every subcube is centered at some pair $(I,x)$ . Moreover, if a subcube $F$ is centered at $(I,x)$ and it is also centered at $(J,y)$ then $I=J$ . Hence, the codimension is well defined. For every subcube $F$ centered at $(I,x)$ we define the one-supcube and the zero-supcube of $F$ to be $F^{(1)}=F(I_{1},x)$ and $F^{(0)}=F(I_{0},x)$ where $I_{i}=\{n\in I:x_{n}=i\}$ for $i=0,1$ . Moreover, for every subcube $F$ we say that the one-codimension of it is the codimension of $F^{(1)}$ and the zero-codimension of it is the codimension of $F^{(0)}$ denoted by $\operatorname{codim}_{i}(F)$ for $i=1$ and $i=0$ respectively. A simple observation is that any subcube $F$ always satisfies $F=F^{(0)}\cap F^{(1)}$ and $\operatorname{codim}(F)=\operatorname{codim}(F^{(0)})+\operatorname{codim}(F^{(1)})$ . For every $X\colon\{0,1\}^{N}\rightarrow\mathbb{R}_{\geq 0}$ we define the complexity of $X$ denoted by $\operatorname{comp}(X)$ to be the smallest integer $d$ for which it is possible to represent $X$ as a linear combination with nonnegative coefficients of indicator functions of subcubes with codimension at most $d$ . Note that the complexity is well defined as for every such function $X$ we can write

X=\sum\limits_{x\in\{0,1\}^{N}}X(x)\mathbbm{1}_{F([N],x)}.

Moreover, this shows that for any $X$ we have $\operatorname{comp}(X)\leq N$ . Assume now that $Y$ is a random variable taking values in $\{0,1\}^{N}$ and that $X=X(Y)$ . Given a subcube $F\subseteq\{0,1\}^{N}$ , we write $\mathbb{E}_{F}[X]=\mathbb{E}[X\mid Y\in F]$ for the expectation of $X$ conditioned on $Y\in F$ . We further define $\Phi_{X}\colon\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}_{\geq 0}\cup\{\infty\}$ by

\Phi_{X}(\delta)=\min\{-\log\mathbb{P}(Y\in F):F\subseteq\{0,1\}^{N}\text{ is a subcube with }\mathbb{E}_{F}[X]\geq(1+\delta)\mathbb{E}[X]\}.
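The definitions above can be explored on tiny examples. In the following sketch a subcube $F(I,x)$ is encoded as a dictionary mapping each coordinate of $I$ to its fixed bit; this encoding, and the names `one_supcube`, `zero_supcube`, `points` and `phi`, are our own choices. The function `phi` computes $\Phi_{X}(\delta)$ by enumerating all $3^{N}$ subcubes, so it is only usable for very small $N$, and it assumes independent $\operatorname{Ber}(p)$ coordinates.

```python
import math
from itertools import product


def codim(fixed):
    """Codimension of the subcube encoded by `fixed`."""
    return len(fixed)


def one_supcube(fixed):
    """Keep only the coordinates fixed to 1."""
    return {i: b for i, b in fixed.items() if b == 1}


def zero_supcube(fixed):
    """Keep only the coordinates fixed to 0."""
    return {i: b for i, b in fixed.items() if b == 0}


def points(n, fixed):
    """All points of {0,1}^n that lie in the subcube."""
    return [y for y in product((0, 1), repeat=n)
            if all(y[i] == b for i, b in fixed.items())]


def phi(X, n, p, delta):
    """Brute-force Phi_X(delta) for independent Ber(p) coordinates."""
    def weight(y):
        ones = sum(y)
        return p ** ones * (1 - p) ** (n - ones)

    mean = sum(X(y) * weight(y) for y in points(n, {}))
    best = math.inf
    # Each pattern in {0, 1, None}^n encodes one subcube; None marks a free coordinate.
    for pattern in product((0, 1, None), repeat=n):
        fixed = {i: b for i, b in enumerate(pattern) if b is not None}
        members = points(n, fixed)
        prob = sum(weight(y) for y in members)
        cond_mean = sum(X(y) * weight(y) for y in members) / prob
        if cond_mean >= (1 + delta) * mean:
            best = min(best, -math.log(prob))
    return best


if __name__ == "__main__":
    def all_ones(y):
        return y[0] * y[1] * y[2]  # indicator that all three coordinates equal 1

    print(phi(all_ones, 3, 0.1, 1.0), phi(all_ones, 3, 0.1, 20.0))
```

For the indicator of the all-ones point with $N=3$ and $p=0.1$, fixing a single coordinate to $1$ already multiplies the mean by $10$, so a deviation with $\delta=1$ costs $\log(1/p)$, while $\delta=20$ requires fixing two coordinates.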

Our main tool is [21, Theorem 9.1], which was stated but not proved. We prove a slightly different version of the theorem, but the essence remains the same.

Theorem 2.1.

For every positive integer $d$ and all positive real numbers $\varepsilon$ and $\delta$ with $\varepsilon<\frac{1}{2}$ , there is a positive constant $K=K(\varepsilon,\delta,d)$ such that the following holds. Let $Y$ be a sequence of $N$ independent $\operatorname{Ber}(p)$ random variables for some $p\in(0,1/2)$ and assume that $X=X(Y)$ is nonnegative with complexity at most $d$ and satisfies $\Phi_{X}(\delta-\varepsilon)\geq K\log(\frac{1}{p})$ . Denote by $\mathcal{F}$ the collection of all subcubes $F\subseteq\{0,1\}^{N}$ satisfying

(F1) $\mathbb{E}_{F}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X]$,

(F2) $\operatorname{codim}(F)\leq K\cdot\Phi_{X}(\delta+\varepsilon)$.

Then,

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F}).

To prove the theorem we state and prove the following lemmas.

Lemma 2.2.

Let $Y$ be a random variable taking values in $\{0,1\}^{N}$ and let $X=X(Y)$ be a real-valued function of $Y$ . Suppose that $\mathbb{E}[X]>0$ and that $X\leq M$ always. Then for all positive $\varepsilon$ and $\delta$ ,

-\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq\Phi_{X}(\delta+\varepsilon)+\log\bigg{(}\frac{M}{\varepsilon\mathbb{E}[X]}\bigg{)}

Proof.

Let $t=(1+\delta)\mathbb{E}[X]$. If $\Phi_{X}(\delta+\varepsilon)=\infty$ then the claim holds trivially. Otherwise, there exists a subcube $F$ such that $-\log\mathbb{P}(Y\in F)=\Phi_{X}(\delta+\varepsilon)$ and $\mathbb{E}_{F}[X]\geq(1+\delta+\varepsilon)\mathbb{E}[X]=t+\varepsilon\mathbb{E}[X]$. Since $X\leq M$ and $t>0$, splitting the conditional expectation according to whether $X\geq t$ gives $\mathbb{E}_{F}[X]\leq M\cdot\mathbb{P}(X\geq t\mid Y\in F)+t$. Thus, $\mathbb{P}(X\geq t\mid Y\in F)\geq\frac{\varepsilon\mathbb{E}[X]}{M}$ and hence,

\mathbb{P}(X\geq t)\geq\mathbb{P}(X\geq t\mid Y\in F)\mathbb{P}(Y\in F)\geq\frac{\varepsilon\mathbb{E}[X]}{M}\mathbb{P}(Y\in F).

Now taking the negative logarithm gives the assertion of the lemma. ∎

Fact 2.3.

Suppose $F_{1},\ldots,F_{k}$ are subcubes of $\{0,1\}^{N}$ . If $F_{1}\cap\cdots\cap F_{k}$ is nonempty, then it is also a subcube of $\{0,1\}^{N}$ and, moreover, $\operatorname{codim}(F_{1}\cap\cdots\cap F_{k})\leq\sum_{i=1}^{k}\operatorname{codim}(F_{i})$ .

Proof.

We prove the statement for $k=2$ ; the case $k>2$ follows by a simple inductive argument. Let $I_{1},I_{2}$ and $x_{1},x_{2}$ be such that $F_{i}$ is a subcube centered at $(I_{i},x_{i})$ for $i=1,2$ . Define $x\in\{0,1\}^{N}$ in the following way: for all $i\in I_{1}$ and $j\in I_{2}$ put $(x)_{i}=(x_{1})_{i}$ and $(x)_{j}=(x_{2})_{j}$ and for all $i\not\in I_{1}\cup I_{2}$ put $(x)_{i}=0$ . This is indeed a well defined element in $\{0,1\}^{N}$ as for all $i\in I_{1}\cap I_{2}$ we have, $(x_{1})_{i}=(x_{2})_{i}$ , otherwise it will contradict our assumption that $F_{1}\cap F_{2}\neq\emptyset$ . It follows from the definition of $x$ that $F_{1}\cap F_{2}=F(I_{1}\cup I_{2},x)$ and hence $F_{1}\cap F_{2}$ is a subcube. Furthermore, we have

\operatorname{codim}(F_{1}\cap F_{2})=|I_{1}\cup I_{2}|\leq|I_{1}|+|I_{2}|=\operatorname{codim}(F_{1})+\operatorname{codim}(F_{2}).\qed
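The construction in the proof of Fact 2.3 is easy to mirror in code. In this sketch a subcube centered at $(I,x)$ is encoded as a dictionary from the coordinates in $I$ to their fixed bits (our own encoding, not notation from the paper), and `intersect` returns `None` when the intersection is empty.

```python
def intersect(first, second):
    """Intersect two subcubes given as {coordinate: fixed bit} dictionaries.

    Returns the dictionary of the intersection, fixed on the union of the two
    coordinate sets, or None if the subcubes disagree on a shared coordinate.
    """
    merged = dict(first)
    for index, bit in second.items():
        if merged.get(index, bit) != bit:
            return None  # conflicting constraint: the intersection is empty
        merged[index] = bit
    return merged


if __name__ == "__main__":
    f1 = {0: 1, 1: 0}
    f2 = {1: 0, 2: 1}
    both = intersect(f1, f2)
    # The codimension of the intersection is |I_1 ∪ I_2| <= |I_1| + |I_2|.
    print(both, len(both), len(f1) + len(f2))
    print(intersect({0: 1}, {0: 0}))
```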

Lemma 2.4.

Let $Y$ be a random variable taking values in $\{0,1\}^{N}$ and let $X=X(Y)$ be a nonnegative real-valued function with complexity bounded by $d$ . Then for every positive integer $\ell$ and all positive real numbers $\varepsilon$ and $\delta$ with $\varepsilon<1+\delta$ ,

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X]\text{ and }Y\not\in F\text{ for all }F\in\mathcal{F})\leq{\bigg{(}\frac{1+\delta-\varepsilon}{1+\delta}\bigg{)}}^{\ell},

where

\mathcal{F}=\{F\subseteq\{0,1\}^{N}:F\text{ is a subcube and }\operatorname{codim}(F)\leq d\ell\text{ and }\mathbb{E}_{F}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X]\}.

Proof.

Given $S\subseteq\{0,1\}^{N}$ a subcube, let $Z_{S}$ be the indicator random variable of the event that $Y\not\in F$ for all $F\in\mathcal{F}$ with $F\supseteq S$ . Note that $S\subseteq S^{\prime}$ implies $Z_{S}\leq Z_{S^{\prime}}$ and let $Z=Z_{\emptyset}$ . Since $XZ\geq 0$ and $Z^{\ell}=Z$ , Markov’s inequality gives

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X]\text{ and }Z=1)=\mathbb{P}(XZ\geq(1+\delta)\mathbb{E}[X])\leq\frac{\mathbb{E}[X^{\ell}Z]}{((1+\delta)\mathbb{E}[X])^{\ell}}.

(2)

To simplify the notation, for every $S\subseteq\{0,1\}^{N}$ we write $\mathbbm{1}_{S}$ for $\mathbbm{1}_{\{Y\in S\}}$. Write $X=\sum_{F}\alpha_{F}\mathbbm{1}_{F}$, where the sum ranges over all subcubes of $\{0,1\}^{N}$, each coefficient $\alpha_{F}$ is nonnegative, and $\alpha_{F}=0$ for all $F$ with $\operatorname{codim}(F)>d$. Then for every $k\in[\ell]$,

\begin{aligned}
\mathbb{E}[X^{k}Z]&=\sum\limits_{F_{1},\ldots,F_{k}}\alpha_{F_{1}}\cdots\alpha_{F_{k}}\mathbb{E}[\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k}}\cdot Z]\\
&\leq\sum\limits_{F_{1},\ldots,F_{k}}\alpha_{F_{1}}\cdots\alpha_{F_{k}}\mathbb{E}[\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k}}\cdot Z_{F_{1}\cap\cdots\cap F_{k}}]\\
&\leq\sum\limits_{F_{1},\ldots,F_{k-1}}\alpha_{F_{1}}\cdots\alpha_{F_{k-1}}\mathbb{E}[\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}]\\
&\qquad\qquad\cdot\mathbb{E}[X\mid\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}=1],
\end{aligned}

where we may let the third sum range only over sequences $F_{1},\ldots,F_{k-1}$ for which the event $\{\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}=1\}$ has a positive probability of occurring.

Claim 2.5.

For any such sequence,

F_{1}\cap\ldots\cap F_{k-1}\not\in\mathcal{F}\;\text{ and }\;\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}=\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}.

Proof.

To see this, note that if $\{\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}=1\}$ has positive probability of occurring, then there exists a $y$ such that $y\in F_{1}\cap\ldots\cap F_{k-1}$ and $Z_{F_{1}\cap\ldots\cap F_{k-1}}(y)=1$. That means $y\in F_{1}\cap\ldots\cap F_{k-1}$ and $y\not\in F$ for any subcube $F\in\mathcal{F}$ such that $F\supseteq F_{1}\cap\ldots\cap F_{k-1}$. Hence, $F_{1}\cap\ldots\cap F_{k-1}\not\in\mathcal{F}$. For the second part, assume towards a contradiction that $\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}<\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}$, meaning that there is $y^{\prime}\in F_{1}\cap\ldots\cap F_{k-1}$ such that $Z_{F_{1}\cap\ldots\cap F_{k-1}}(y^{\prime})=0$. Therefore, there exists a subcube $F^{*}\in\mathcal{F}$ such that $F^{*}\supseteq F_{1}\cap\ldots\cap F_{k-1}$. This is a contradiction, as now $y\in F_{1}\cap\ldots\cap F_{k-1}\subseteq F^{*}$, but by the choice of $y$ we have $y\not\in F^{*}$, since $F^{*}\in\mathcal{F}$ and $F^{*}\supseteq F_{1}\cap\ldots\cap F_{k-1}$. ∎

Since $\{\mathbbm{1}_{F_{1}\cap\cdots\cap F_{k-1}}=1\}$ has a positive probability of occurring, Fact 2.3 asserts that $F_{1}\cap\cdots\cap F_{k-1}$ is a subcube, and $\operatorname{codim}({F_{1}\cap\cdots\cap F_{k-1}})\leq d(k-1)\leq d\ell$ . Therefore

\mathbb{E}[X\mid\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}=1]=\mathbb{E}_{F_{1}\cap\cdots\cap F_{k-1}}[X]<(1+\delta-\varepsilon)\mathbb{E}[X],

as otherwise $F_{1}\cap\cdots\cap F_{k-1}$ would belong to $\mathcal{F}$ . It follows that

\begin{aligned}
&\sum\limits_{F_{1},\ldots,F_{k}}\alpha_{F_{1}}\cdots\alpha_{F_{k}}\mathbb{E}[\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k}}\cdot Z_{F_{1}\cap\cdots\cap F_{k}}]\\
&\qquad<(1+\delta-\varepsilon)\mathbb{E}[X]\sum\limits_{F_{1},\ldots,F_{k-1}}\alpha_{F_{1}}\cdots\alpha_{F_{k-1}}\mathbb{E}[\mathbbm{1}_{F_{1}}\cdots\mathbbm{1}_{F_{k-1}}\cdot Z_{F_{1}\cap\cdots\cap F_{k-1}}].
\end{aligned}

By induction, we see that $\mathbb{E}[X^{\ell}Z]<((1+\delta-\varepsilon)\mathbb{E}[X])^{\ell}$ . Substituting this inequality into (2) completes the proof. ∎

Now we use the previous lemmas to prove the theorem.

Proof of Theorem 2.1.

Let $K$ be a large constant that may depend on $\varepsilon,\delta$ and $d$ . Furthermore, let $t=(1+\delta)\mathbb{E}[X]$ and $\ell=\left\lfloor{K/d\cdot\Phi_{X}(\delta+\varepsilon)}\right\rfloor$ . Define

\mathcal{F}=\{F\subseteq\{0,1\}^{N}:F\text{ is a subcube with }\operatorname{codim}(F)\leq d\ell\text{ and }\mathbb{E}_{F}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X]\}.

It follows from Lemma 2.4 that

\mathbb{P}(X\geq t\text{ and }Y\not\in F\text{ for all }F\in\mathcal{F})\leq\bigg{(}1-\frac{\varepsilon}{1+\delta}\bigg{)}^{\ell}.

As $\operatorname{comp}(X)\leq d$ , we can write

X=\sum\limits_{j\in J}\alpha_{j}\mathbbm{1}_{\{Y\in F_{j}\}}

for some $J$ so that, for all $j\in J$, the coefficient $\alpha_{j}$ is nonnegative and $F_{j}$ is a subcube with $\operatorname{codim}(F_{j})\leq d$. Put $M=\sum_{j\in J}\alpha_{j}$ and note that $X\leq M$ always. Applying Lemma 2.2 gives

\displaystyle-\log\mathbb{P}(X\geq t)\leq\Phi_{X}(\delta+\varepsilon)+\log\bigg{(}\frac{M}{\varepsilon\mathbb{E}[X]}\bigg{)}.

Note also that $\mathbb{E}[X]\geq Mp^{d}$ as $\operatorname{comp}(X)\leq d$ and therefore $\mathbb{P}(Y\in F_{j})\geq\min\{p,1-p\}^{d}=p^{d}$ for every $j\in J$ . Therefore,

\displaystyle-\log\mathbb{P}(X\geq t)\leq\Phi_{X}(\delta+\varepsilon)+\log\bigg{(}\frac{1}{\varepsilon p^{d}}\bigg{)}.

Thus, provided $K$ is sufficiently large we also have

\displaystyle-\log\mathbb{P}(X\geq t)\leq(1+\varepsilon)\Phi_{X}(\delta+\varepsilon),

as we assumed that $\Phi_{X}(\delta+\varepsilon)\geq\Phi_{X}(\delta-\varepsilon)\geq K\log(1/p)$ . Putting all of this together gives that for sufficiently large $K$ , we have

\begin{aligned}
\mathbb{P}(X\geq t\text{ and }Y\not\in F\text{ for all }F\in\mathcal{F})&\leq\bigg(1-\frac{\varepsilon}{1+\delta}\bigg)^{\ell}\leq e^{-\frac{\varepsilon}{1+\delta}\left\lfloor K/d\cdot\Phi_{X}(\delta+\varepsilon)\right\rfloor}\\
&\leq(\varepsilon/2)\cdot\mathbb{P}(X\geq t).
\end{aligned}

The assertion of the theorem now follows:

\mathbb{P}(X\geq t)\leq{(1-\varepsilon/2)}^{-1}\cdot\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F})\leq(1+\varepsilon)\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F})

where the last inequality holds for $\varepsilon<1/2$ . ∎
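For completeness, the last step of the proof can be unpacked as follows. This is only a sketch of the routine bookkeeping; the abbreviation $A$ for the event that $Y\in F$ for some $F\in\mathcal{F}$ is our notation and is not used in the paper.

```latex
\begin{aligned}
\mathbb{P}(X\geq t)
  &\leq \mathbb{P}(A)+\mathbb{P}(X\geq t\text{ and }Y\notin F\text{ for all }F\in\mathcal{F})\\
  &\leq \mathbb{P}(A)+(\varepsilon/2)\,\mathbb{P}(X\geq t),
\end{aligned}
\qquad\text{so}\qquad
\mathbb{P}(X\geq t)\leq(1-\varepsilon/2)^{-1}\,\mathbb{P}(A).
% Final comparison: (1+\varepsilon)(1-\varepsilon/2)=1+\varepsilon(1-\varepsilon)/2\geq 1
% whenever 0<\varepsilon\leq 1, hence (1-\varepsilon/2)^{-1}\leq 1+\varepsilon.
```

Note also that every subcube in the family $\mathcal{F}$ used in the proof has codimension at most $d\ell\leq K\cdot\Phi_{X}(\delta+\varepsilon)$, so it belongs to the family from the statement of the theorem.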

When $p\ll 1$ and $\operatorname{comp}(X)=O(1)$ , the majority of the contribution to both $\mathbb{E}_{F}[X]$ and $\mathbb{P}(Y\in F)$ comes from the one-supcube $F^{(1)}\supseteq F$ . We prove a straightforward corollary of Theorem 2.1 which will be more convenient to work with. We start by proving the following lemma.

Lemma 2.6.

The following holds for every positive integer $d$ and all positive real numbers $\varepsilon$ and $\delta$ with $\varepsilon<1$ . Let $Y$ be a sequence of $N$ independent $\operatorname{Ber}(p)$ random variables for some $p<1-\big{(}{\frac{1+\delta-\varepsilon}{1+\delta-\varepsilon/2}\big{)}^{1/d}}$ and assume that $X=X(Y)$ has complexity at most $d$ . Let $F$ be a subcube satisfying $\mathbb{E}_{F}[X]\geq(1+\delta-\varepsilon/2)\mathbb{E}[X]$ . Then, $\mathbb{E}_{F^{(1)}}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X]$ where $F^{(1)}$ is the one-supcube of $F$ .

Proof.

As $X$ has complexity at most $d$, we can write

X=\sum\limits_{j\in J}\alpha_{j}\mathbbm{1}_{\{Y\in F_{j}\}}

for some $J$ so that, for all $j\in J$ , the coefficient $\alpha_{j}$ is nonnegative and $F_{j}$ is a subcube with $\operatorname{codim}(F_{j})\leq d$ . Moreover, by our assumptions on $F_{j}$ , we have $\operatorname{codim}_{0}(F_{j})\leq\operatorname{codim}(F_{j})\leq d$ . Thus, we have the following inequality,

\mathbb{P}(Y\in F_{j}\mid Y\in F^{(1)})\geq(1-p)^{d}\cdot\mathbb{P}(Y\in F_{j}\mid Y\in F).

It follows that

	$\displaystyle\mathbb{E}_{F^{(1)}}[X]$	$\displaystyle=\sum\limits_{j\in J}\alpha_{j}\mathbb{E}_{F^{(1)}}[\mathbbm{1}_{\{Y\in F_{j}\}}]\geq\sum\limits_{j\in J}\alpha_{j}(1-p)^{d}\cdot\mathbb{E}_{F}[\mathbbm{1}_{\{Y\in F_{j}\}}]$
		$\displaystyle\geq\sum\limits_{j\in J}\alpha_{j}\Big{(}\frac{1+\delta-\varepsilon}{1+\delta-\varepsilon/2}\Big{)}\mathbb{E}_{F}[\mathbbm{1}_{\{Y\in F_{j}\}}]=\Big{(}\frac{1+\delta-\varepsilon}{1+\delta-\varepsilon/2}\Big{)}\mathbb{E}_{F}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X],$		(3)

where the first inequality in (3) follows from the assumption on $p$ and the second inequality in (3) follows from the assumption on $\mathbb{E}_{F}[X]$ . ∎
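The conditional-probability inequality preceding (3) can be checked by exact computation on small examples. The following Python sketch is illustrative only and is not part of the proof. It models a subcube as a dictionary from coordinates to forced bits, and it assumes that the one-supcube $F^{(1)}$ is obtained from $F$ by dropping the coordinates forced to 0; all function names are hypothetical.

```python
import random

def prob_in(cube, given, p):
    """P(Y in cube | Y in given) for independent Ber(p) coordinates.

    A subcube is a dict {coordinate: forced bit}.
    """
    prob = 1.0
    for i, bit in cube.items():
        if i in given:
            if given[i] != bit:
                return 0.0  # the two subcubes are disjoint
        else:
            prob *= p if bit == 1 else 1.0 - p
    return prob

def one_supcube(cube):
    """Keep only the coordinates forced to 1 (assumed definition of F^(1))."""
    return {i: b for i, b in cube.items() if b == 1}

def random_subcube(n_coords, codim, rng):
    """A random subcube of {0,1}^n_coords with the given codimension."""
    coords = rng.sample(range(n_coords), codim)
    return {i: rng.randint(0, 1) for i in coords}

def check_key_inequality(p, d, trials=2000, n_coords=8, seed=0):
    """Test P(F_j | F^(1)) >= (1-p)^d P(F_j | F) on random small subcubes."""
    rng = random.Random(seed)
    for _ in range(trials):
        F = random_subcube(n_coords, rng.randint(0, n_coords), rng)
        Fj = random_subcube(n_coords, rng.randint(0, d), rng)  # codim(F_j) <= d
        lhs = prob_in(Fj, one_supcube(F), p)
        rhs = (1.0 - p) ** d * prob_in(Fj, F, p)
        if lhs < rhs - 1e-12:
            return False
    return True

def p_threshold(delta, eps, d):
    """The upper bound on p assumed in Lemma 2.6."""
    return 1.0 - ((1 + delta - eps) / (1 + delta - eps / 2)) ** (1.0 / d)
```

The ratio of the two conditional probabilities is $(1-p)$ raised to the number of coordinates that both $F$ and $F_{j}$ force to 0, which is at most $d$ ; the threshold on $p$ is exactly what makes $(1-p)^{d}$ at least the ratio appearing in (3).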

Using this lemma we now state and prove a straightforward corollary of Theorem 2.1.

Corollary 2.7.

For every positive integer $d$ and all positive real numbers $\varepsilon$ and $\delta$ with $\varepsilon<{1}$ , there is a positive constant $K=K(\varepsilon,\delta,d)$ such that the following holds. Let $Y$ be a sequence of $N$ independent $\operatorname{Ber}(p)$ random variables for some $p<\min\{1-\big{(}{\frac{1+\delta-\varepsilon}{1+\delta-\varepsilon/2}\big{)}^{1/d}},1/2\}$ and assume that $X=X(Y)$ has complexity at most $d$ and satisfies $\Phi_{X}(\delta-\varepsilon)\geq K\log(\frac{1}{p})$ . Denote by $\mathcal{F}_{1}$ the collection of all subcubes $F\subseteq\{0,1\}^{N}$ satisfying

( $H{1}$ )

$\mathbb{E}_{F}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X]$ ,
( $H{2}$ )

$\operatorname{codim}F\leq K\cdot\Phi_{X}(\delta+\varepsilon)$ ,
( $H{3}$ )

$\operatorname{codim}_{1}(F)=\operatorname{codim}(F)$ .

Then

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F}_{1}).

Proof.

Applying Theorem 2.1 with $\varepsilon$ replaced by $\varepsilon/2$ we obtain

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F}),

(4)

where $\mathcal{F}$ is the collection of all subcubes $F\subseteq\{0,1\}^{N}$ satisfying ( $F{1}$ ) and ( $F{2}$ ) where $\varepsilon$ is replaced with $\varepsilon/2$ . Letting $\mathcal{F}_{1}=\{F^{(1)}:F\in\mathcal{F}\}$ , we obtain ( $H{2}$ ) and ( $H{3}$ ) for every subcube $F\in\mathcal{F}_{1}$ due to ( $F{2}$ ) and the definition of one-supcubes. Noting that for every subcube $F$ we have $F\subseteq F^{(1)}$ we obtain also that

\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F})\leq\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F}_{1}).

(5)

Combining (4) and (5) we obtain that

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\mathbb{P}(Y\in F\text{ for some }F\in\mathcal{F}_{1}).

Furthermore, Lemma 2.6 implies that $\mathcal{F}_{1}$ also satisfies ( $H{1}$ ). ∎

We call a subcube $F$ a seed if $F$ is an element of $\mathcal{F}_{1}$ defined in Corollary 2.7. A formal definition is given in the following section. A general way one would use Corollary 2.7 to bound upper tails of counts of induced subgraphs is to define a special type of seeds called structured seeds. These structured seeds are subcubes in $\mathcal{F}_{1}$ from Corollary 2.7, with a stronger condition than ( $H{1}$ ). This condition is a combinatorial condition involving various supcubes of the subcubes in $\mathcal{F}_{1}$ (this will be explained further in Section 3). Then one would define a core subcube to be a subcube containing a structured seed such that ‘every coordinate counts’. In the following section we define these special subcubes. We will also show that Corollary 2.7 can be modified to bound the upper tail probability via the probability of $Y$ being an element in a core subcube.

3. From seeds to modified cores

Note that for any subcube $F\subseteq\{0,1\}^{E(K_{n})}$ , the one-supcube $F^{(1)}$ of $F$ corresponds to the family of all subgraphs of $K_{n}$ containing a specific subgraph of $K_{n}$ with $\operatorname{codim}_{1}({F})$ edges. Therefore, from this point onwards we think of a one-supcube as a family of subgraphs of $K_{n}$ containing a specific subgraph. In particular, instead of writing $\mathbb{E}_{F}[X]$ for a one-supcube $F$ (of some subcube), we will write $\mathbb{E}_{G}[X]$ , where $G$ is the graph corresponding to $F$ . The subgraphs corresponding to the members of $\mathcal{F}_{1}$ from Corollary 2.7 will be called seeds. We start with a definition which will be useful in this section as well as the next ones.

Definition.

Suppose $H$ and $G$ are graphs and let $e$ be an edge of $G$ . We define:

•

$N_{ind}(H,G)$ is the number of induced copies of $H$ in $G$ .
•

$N_{ind}(e,H,G)$ is the number of induced copies of $H$ in $G$ that contain $e$ .
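For small graphs these quantities can be computed by brute force. The Python sketch below handles the case $H=C_{4}$ and is meant only as an illustration: it assumes that copies are counted as vertex subsets (unlabeled copies), that the vertex set is $\{0,\ldots,n-1\}$ , and that `through`, when given, is an edge of $G$ . The function names are hypothetical.

```python
from itertools import combinations

def induced_edges(edge_set, vertices):
    """Edges of the subgraph induced on `vertices`."""
    return [frozenset(pair) for pair in combinations(vertices, 2)
            if frozenset(pair) in edge_set]

def is_induced_c4(edge_set, quad):
    """A 4-vertex graph is a 4-cycle iff it has 4 edges and is 2-regular."""
    sub = induced_edges(edge_set, quad)
    return len(sub) == 4 and all(sum(v in e for e in sub) == 2 for v in quad)

def n_ind_c4(edges, n, through=None):
    """N_ind(C4, G), or N_ind(e, C4, G) when `through` is an edge e of G."""
    edge_set = {frozenset(e) for e in edges}
    required = set(through) if through is not None else set()
    return sum(is_induced_c4(edge_set, quad)
               for quad in combinations(range(n), 4)
               if required <= set(quad))

def complete_bipartite(s, t):
    """Edge list of K_{s,t} with sides {0..s-1} and {s..s+t-1}."""
    return [(i, s + j) for i in range(s) for j in range(t)]
```

For instance, `n_ind_c4(complete_bipartite(3, 4), 7)` counts the induced 4-cycles of $K_{3,4}$ , which should agree with $\binom{3}{2}\binom{4}{2}$ .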

Our main concern in this paper is the random variable counting the number of induced copies of $C_{4}$ in $G_{n,p}$ . Therefore, we let $X=N_{ind}(C_{4},G_{n,p})$ . We will use this notation from this point onward. We now continue by defining seed graphs.

Definition.

Let $\varepsilon,\delta,K$ be positive reals. In addition let $p\in(0,1)$ . Then we define $\mathcal{S}(\varepsilon,\delta,K)$ to be the collection of all spanning subgraphs $G\subseteq K_{n}$ satisfying:

( $S{1}$ )

$e(G)\leq K\cdot\Phi_{X}(\delta+\varepsilon)$ and
( $S{2}$ )

$\mathbb{E}_{G}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X]$ .

We call the graphs in $\mathcal{S}(\varepsilon,\delta,K)$ seeds. Furthermore, for every positive integer $m$ we define $\mathcal{S}_{m}(\varepsilon,\delta,K)$ to be the set of all seeds with $m$ edges.

Since $\mathbb{E}_{G}[X]$ is determined by the number of induced copies of various subgraphs of $C_{4}$ in $G$ , it will be convenient to restate ( $S{2}$ ) in the above definition with a graph theoretic condition, giving rise to the notion of structured seeds. In the definition we use the graph obtained from $K_{1,2}$ by adding an isolated vertex; we denote this graph by $K_{1,2}\sqcup K_{1}$ . Now we define the structured seeds.

Definition.

Let $\varepsilon,\delta,K$ be positive reals. In addition let $p\in(0,1)$ . Then we define $\mathcal{S}^{\prime}(\varepsilon,\delta,K)$ to be the collection of all spanning subgraphs $G\subseteq K_{n}$ satisfying:

( $S^{\prime}{1}$ )

$e(G)\leq K\cdot\Phi_{X}(\delta+\varepsilon)$ and
( $S^{\prime}{2}$ )

$N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}\geq(\delta-\varepsilon)\mathbb{E}[X]$ .

We call the graphs in $\mathcal{S}^{\prime}(\varepsilon,\delta,K)$ structured seeds. Furthermore, for every positive integer $m$ we define $\mathcal{S}^{\prime}_{m}(\varepsilon,\delta,K)$ to be the set of all structured seeds with $m$ edges.

The following is a lemma relating the seeds and the structured seeds, which we prove later.

Lemma 3.1.

Let $\varepsilon,\delta,K$ be positive reals and suppose also that $\varepsilon\leq\delta/2$ and $p\ll 1$ . Then, there exists a positive constant $C=C(\varepsilon,\delta,K)$ such that for any $m\leq K\Phi_{X}(\delta-\varepsilon)$ and large enough $n$ we have,

\mathcal{S}_{m}(\varepsilon,\delta,C)\subseteq\mathcal{S}^{\prime}_{m}(2\varepsilon,\delta,C).

Now we are ready to define what a core graph is. This is given in the following definition.

Definition.

Let $\varepsilon,\delta,K$ be positive reals. In addition let $p\in(0,1)$ . Then we define $\mathcal{C}(\varepsilon,\delta,K)$ to be the collection of all structured seeds $G\in\mathcal{S}^{\prime}(\varepsilon,\delta,K)$ satisfying the following:
For all $e\in E(G)$

N_{ind}(e,C_{4},G)+N_{ind}(e,K_{1,2}\sqcup K_{1},G)p^{2}\geq\varepsilon\mathbb{E}[X]/(2K\cdot\Phi_{X}(\delta+\varepsilon)).

We call the graphs in $\mathcal{C}(\varepsilon,\delta,K)$ core graphs. Furthermore, for every positive integer $m$ we define $\mathcal{C}_{m}(\varepsilon,\delta,K)$ to be the set of all cores with $m$ edges.

In cases where the parameters $\varepsilon,\delta$ and $K$ can be understood from the context we omit them. The main aim of this section is to prove the following theorem which we derive using Corollary 2.7 and several lemmas.

Theorem 3.2.

For all positive real numbers $\varepsilon,\delta,p$ with $\varepsilon<{\delta}$ and $p<\min\{1-\big{(}{\frac{1+\delta-\varepsilon}{1+\delta-\varepsilon/2}\big{)}^{1/6}},1/2\}$ , there is a positive constant $K=K(\varepsilon,\delta)$ such that the following holds:

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\mathbb{P}(G\subseteq G_{n,p}\text{ for some }G\in\mathcal{C}(\varepsilon,\delta,K)).

(6)

In particular,

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\sum\limits_{m}|\mathcal{C}_{m}(\varepsilon,\delta,K)|p^{m}.

We start by proving Lemma 3.1 relating seeds and structured seeds.

Proof of Lemma 3.1.

Suppose $G\in\mathcal{S}_{m}(\varepsilon,\delta,K)$ . By ( $S{2}$ ) we have

\displaystyle(\delta-\varepsilon)\mathbb{E}[X]\leq\mathbb{E}_{G}[X]-\mathbb{E}[X]\leq\sum\limits_{\emptyset\neq H{\subseteq}^{*}C_{4}}N_{ind}(H,G)p^{4-e(H)},

where ${\subseteq}^{*}$ stands for spanning subgraphs. Observe that

N_{ind}(H,G)\leq\begin{cases}mn^{2}&\text{ if }H=K_{2}\sqcup 2K_{1}\\ m^{2}&\text{ if }H=M_{2}\text{ or }P_{4},\end{cases}

where $K_{2}\sqcup 2K_{1}$ is the graph on four vertices and one edge, $M_{2}$ is a matching of size two, and $P_{4}$ is the path with 3 edges; this indeed holds as two edges of $G$ span at most one induced $M_{2}$ or $P_{4}$ . Therefore, we obtain

	$\displaystyle(\delta-\varepsilon)\mathbb{E}[X]\leq$	$\displaystyle N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}+\underbrace{m^{2}p}_{N_{ind}(P_{4},G)}+\underbrace{m^{2}p^{2}}_{N_{ind}(M_{2},G)}+\underbrace{mn^{2}p^{3}}_{N_{ind}(K_{2}\sqcup 2K_{1},G)}$
	$\displaystyle\leq$	$\displaystyle N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}+2pm^{2}+mn^{2}p^{3}.$		(7)

In Section 5 (Lemma 5.1) we prove that $\Phi_{X}(\delta)=O_{\delta}(\sqrt{\mathbb{E}[X]}\log(1/p))$ . Therefore, as $\mathbb{E}[X]=\Theta(n^{4}p^{4})$ , we have $m=O_{\varepsilon,\delta}(n^{2}p^{2}\log(1/p))$ . Further, as $p\ll 1$ we get from (7) the following inequality for large enough $n$ ,

	$\displaystyle(\delta-\varepsilon)\mathbb{E}[X]$	$\displaystyle\leq N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}+O(p\log^{2}(1/p)\mathbb{E}[X])$
		$\displaystyle\leq N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}+\varepsilon\mathbb{E}[X].$

Therefore, we obtain $N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}\geq(\delta-2\varepsilon)\mathbb{E}[X]$ for large enough $n$ . This is the assertion of the lemma. ∎

Informally, the next lemma states that every structured seed contains a core. The formal statement is as follows.

Lemma 3.3.

Suppose $\varepsilon,\delta,K$ are positive reals. Then for every structured seed $G\in\mathcal{S}^{\prime}$ and every nonnegative real $s$ , there exists a subgraph $G^{*}\subseteq G$ such that:

( $C{1}$ )

$e(G^{*})\leq K\cdot\Phi_{X}(\delta+\varepsilon)$ ,
( $C{2}$ )

$N_{ind}(C_{4},G^{*})+N_{ind}(K_{1,2}\sqcup K_{1},G^{*})p^{2}\geq(\delta-\varepsilon)\mathbb{E}[X]-s$ , and
( $C{3}$ )

$N_{ind}(e,C_{4},G^{*})+N_{ind}(e,K_{1,2}\sqcup K_{1},G^{*})p^{2}\geq\frac{s}{K\Phi_{X}(\delta+\varepsilon)}$ for every edge $e\in E(G^{*})$ .

Proof.

In this proof, it would be convenient to let

N(G)=N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}.

With this notation, ( $C{2}$ ) is equivalent to $N(G^{*})\geq(\delta-\varepsilon)\mathbb{E}[X]-s$ and ( $C{3}$ ) is equivalent to $N(G^{*})-N(G^{*}\setminus e)\geq\frac{s}{K\Phi_{X}(\delta+\varepsilon)}$ for every edge $e$ of $G^{*}$ .

Define the sequences $G=G_{0}\supseteq G_{1}\supseteq\cdots\supseteq G_{r}=G^{*}$ and $e_{1},e_{2},\ldots,e_{r}\in E(G)$ by repeatedly setting $G_{k+1}$ to be the subgraph of $G_{k}$ obtained by deleting an edge $e_{k+1}$ such that

N(G_{k})-N(G_{k+1})<\frac{s}{e(G)},

as long as such an edge exists. The graph $G^{*}$ satisfies ( $C{1}$ ) as it is a subgraph of a structured seed. We also claim that the subgraph $G^{*}$ satisfies ( $C{3}$ ). Indeed, if there were an edge $e$ in $G^{*}$ with $N(G^{*})-N(G^{*}\setminus e)<\frac{s}{e(G)}$ , the process would have continued by deleting this edge, a contradiction. Recalling that $G$ is a structured seed, we find that $e(G)\leq K\Phi_{X}(\delta+\varepsilon)$ and thus,

N(G^{*})-N(G^{*}\setminus e)\geq\frac{s}{e(G)}\geq\frac{s}{K\Phi_{X}(\delta+\varepsilon)}\quad\text{for every edge }e\text{ of }G^{*}.

Finally, since $r\leq e(G)$ , we have

\displaystyle N(G)-N(G^{*})=\sum\limits_{k=0}^{r-1}\big{(}N(G_{k})-N(G_{k+1})\big{)}\leq\frac{rs}{e(G)}\leq s.

Rearranging this inequality and recalling that $G$ is a structured seed we obtain the assertion of the lemma,

N(G^{*})=N_{ind}(C_{4},G^{*})+N_{ind}(K_{1,2}\sqcup K_{1},G^{*})p^{2}\geq(\delta-\varepsilon)\mathbb{E}[X]-s.\qed
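The deletion process in this proof is a simple greedy procedure, and it can be run on small examples. The Python sketch below is an illustration under stated assumptions, not the paper's implementation: vertices are $\{0,\ldots,n-1\}$ , copies are counted as vertex subsets, the input graph has at least one edge, and the names `weight` and `prune_to_core` are hypothetical.

```python
from itertools import combinations

def weight(edge_set, n, p):
    """N(G) = N_ind(C4, G) + p^2 * N_ind(K_{1,2} plus an isolated vertex, G)."""
    total = 0.0
    for quad in combinations(range(n), 4):
        sub = [frozenset(pr) for pr in combinations(quad, 2)
               if frozenset(pr) in edge_set]
        if len(sub) == 4 and all(sum(v in e for e in sub) == 2 for v in quad):
            total += 1.0      # the four vertices induce a 4-cycle
        elif len(sub) == 2 and sub[0] & sub[1]:
            total += p * p    # two edges sharing a vertex, fourth vertex isolated
    return total

def prune_to_core(edges, n, p, s):
    """Delete edges whose removal lowers N by less than s / e(G), until none is left."""
    current = {frozenset(e) for e in edges}
    threshold = s / len(current)  # e(G) refers to the original graph G
    changed = True
    while changed:
        changed = False
        base = weight(current, n, p)
        for e in sorted(current, key=sorted):  # fixed order, for reproducibility
            if base - weight(current - {e}, n, p) < threshold:
                current = current - {e}
                changed = True
                break
    return current
```

Each round removes one edge, so the loop stops after at most $e(G)$ rounds, and the total loss in $N$ is less than $e(G)\cdot s/e(G)=s$ , mirroring the telescoping sum in the proof. As an example, starting from $K_{3,3}$ together with a disjoint pendant edge, one would expect the pendant edge to be pruned for moderate $s$ , since it lies in no induced $C_{4}$ .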

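Applying Lemma 3.3 with $\varepsilon$ replaced by $\varepsilon/2$ and $s=\varepsilon\mathbb{E}[X]/2$ , and using that $\Phi_{X}$ is nondecreasing so that $\Phi_{X}(\delta+\varepsilon/2)\leq\Phi_{X}(\delta+\varepsilon)$ , yields the following corollary.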

Corollary 3.4.

Suppose $\varepsilon,\delta,K$ are positive reals. Suppose further that $G\in\mathcal{S}^{\prime}(\varepsilon/2,\delta,K)$ is a structured seed. Then, there exists a core $G^{*}\in\mathcal{C}(\varepsilon,\delta,K)$ such that $G^{*}\subseteq G$ .

Now we derive Theorem 3.2 from the above claims and Corollary 2.7.

Proof of Theorem 3.2.

Applying Lemma 3.1 with $\varepsilon$ replaced by $\varepsilon/4$ and Corollary 3.4 with $\varepsilon$ replaced by $\varepsilon/2$ we find that

	$\displaystyle\mathbb{P}(G\subseteq G_{n,p}\text{ for some }G\in\mathcal{S}(\varepsilon/4,\delta,K))$	$\displaystyle\leq\mathbb{P}(G\subseteq G_{n,p}\text{ for some }G\in\mathcal{S}^{\prime}(\varepsilon/2,\delta,K))$
		$\displaystyle\leq\mathbb{P}(G\subseteq G_{n,p}\text{ for some }G\in\mathcal{C}(\varepsilon,\delta,K)).$

Applying Corollary 2.7 with $\varepsilon$ replaced with $\varepsilon/2$ we obtain

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\mathbb{P}(G\subseteq G_{n,p}\text{ for some }G\in\mathcal{S}(\varepsilon/4,\delta,K)).

Combining the above inequalities we obtain the assertion of the theorem, that is

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1+\varepsilon)\mathbb{P}(G\subseteq G_{n,p}\text{ for some }G\in\mathcal{C}(\varepsilon,\delta,K)).\qed

4. Lower bounds

The aim of this section is to give lower bounds for $\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])$ for every positive $\delta$ . We do so by presenting families of graphs such that ‘planting’ one of their members in $G_{n,p}$ increases the expectation of $X$ by a multiplicative factor of at least $1+\delta$ . More formally, by ‘planting’ we mean changing the probability measure on the hypercube $\{0,1\}^{\binom{n}{2}}$ from the usual product measure of $G_{n,p}$ to the probability measure of $G_{n,p}$ conditioned on the existence of a subgraph from a predetermined family of graphs. The graphs in the families that we consider satisfy $\mathbb{E}_{G}[X]\geq(1+\delta)\mathbb{E}[X]$ . We choose such families because, conditioned on the event $G\subseteq G_{n,p}$ for a graph $G\subseteq K_{n}$ with $\mathbb{E}_{G}[X]\geq(1+\delta)\mathbb{E}[X]$ , the probability of $X\geq(1+\delta)\mathbb{E}[X]$ is not too small.

Since for every labeled graph $G\subseteq K_{n}$ the probability of $G_{n,p}$ containing $G$ is $p^{e(G)}$ , it makes sense to consider such graphs with the smallest $e(G)$ . In our case, however, there is a wide range of $p$ where the strongest lower bound is obtained by planting some member of a large family of graphs; each individual graph in the family is sub-optimal in terms of the number of edges, but the size of the family compensates for the difference in the number of edges between the optimal graph and the graphs in our family. This family consists, roughly speaking, of all embeddings of an unbalanced complete bipartite graph into $K_{n}$ .

More precisely, denoting the complete bipartite graph with sides of size $s$ and $t$ by $K_{s,t}$ , we will define integers $m_{0},m_{2},m_{3},\ldots$ and $m_{*}$ such that $k\mid m_{k}$ for $k\neq 0$ , $n\mid m_{*}$ , $m_{k}\approx 2\sqrt{\frac{k}{k-1}\delta\mathbb{E}[X]}$ for $k\neq 0$ , $m_{0}\approx 2\sqrt{\delta\mathbb{E}[X]}$ , and $m_{*}\approx(\sqrt{1+8\delta}-1)\sqrt{\mathbb{E}[X]/2}$ .

Note that provided that $n^{-1}\ll p\ll n^{-1/2}$ the constructions $K_{k,m_{k}/k}$ and $K_{\sqrt{m_{0}},\sqrt{m_{0}}}$ contain $\delta\mathbb{E}[X]$ induced copies of $C_{4}$ , up to lower order terms. In addition, we will show later that $H=K_{2m_{*}/n,n/2}$ admits $\mathbb{E}_{H}[X]\geq(1+\delta)\mathbb{E}[X]$ .

Denote by $\mathcal{E}_{k}$ the set of all copies of $K_{k,m_{k}/k}$ in $K_{n}$ when $k\neq 0,1$ . Denote by $\mathcal{E}_{0}$ the set of all copies of $K_{\sqrt{m_{0}},\sqrt{m_{0}}}$ in $K_{n}$ and by $\mathcal{E}_{*}$ the set of all copies of $H$ in $K_{n}$ . Planting one of $\mathcal{E}_{k},\mathcal{E}_{0}$ or $\mathcal{E}_{*}$ yields a lower bound for the probability of the upper tail event which is valid for all values of $p$ . As was mentioned in the introduction, the significant difference between this work and previous ones is the need to plant a large family of sub-optimal graphs rather than a single optimal graph. This is true only for the families $\mathcal{E}_{k}$ where $k\neq 0$ . In the case of $\mathcal{E}_{0}$ and $\mathcal{E}_{*}$ we could as well plant a single graph from these sets, as $|\mathcal{E}_{0}|$ and $|\mathcal{E}_{*}|$ are negligible.

One can compare these bounds and see that the best one depends on $p$ in the following way. There exists an increasing sequence $\{c_{k}\}_{k=1}^{\infty}$ with $c_{1}=0$ and $\lim_{k\rightarrow\infty}c_{k}=1/3$ so that, provided $n^{-1+c_{k-1}}\ll p\ll n^{-1+c_{k}}$ for some integer $k\geq 2$ , we obtain the strongest lower bound on the upper tail probability by planting $\mathcal{E}_{k}$ . The best family to plant when $n^{-2/3}\ll p\ll n^{-1/2}$ is $\mathcal{E}_{0}$ , which should be thought of as the ‘limit’ of $\mathcal{E}_{k}$ when $k$ goes to infinity. Lastly, $\mathcal{E}_{*}$ is the best family to plant when $n^{-1/2}\ll p\ll 1$ .
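To get a feeling for this comparison, one can evaluate the heuristic exponents $m_{k}\log p+\log\binom{n}{m_{k}/k}$ and $m_{0}\log p$ numerically. The Python sketch below is only suggestive at finite $n$ and rests on several assumptions that are not part of the formal statements: it uses $\mathbb{E}[X]=3\binom{n}{4}p^{4}(1-p)^{2}$ (copies counted as vertex subsets), the approximate values $m_{k}\approx 2\sqrt{\frac{k}{k-1}\delta\mathbb{E}[X]}$ and $m_{0}\approx 2\sqrt{\delta\mathbb{E}[X]}$ , ignores all integrality constraints and $(1+\varepsilon)$ factors, and extends the binomial coefficient to real arguments via the log-gamma function. All function names are hypothetical.

```python
import math

def expected_induced_c4(n, p):
    """E[X]: each 4-set carries 3 possible 4-cycles, with 4 edges present and 2 absent."""
    return 3 * math.comb(n, 4) * p**4 * (1 - p)**2

def log_binom(n, x):
    """log of binom(n, x) for real 0 <= x <= n, via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)

def log_bound_unbalanced(n, p, delta, k):
    """Heuristic exponent m_k log p + log binom(n, m_k / k) for planting E_k."""
    m_k = 2 * math.sqrt(k / (k - 1) * delta * expected_induced_c4(n, p))
    return m_k * math.log(p) + log_binom(n, m_k / k)

def log_bound_balanced(n, p, delta):
    """Heuristic exponent m_0 log p for planting E_0."""
    m_0 = 2 * math.sqrt(delta * expected_induced_c4(n, p))
    return m_0 * math.log(p)

def best_k(n, p, delta, k_max):
    """The k in [2, k_max] maximising the heuristic exponent."""
    return max(range(2, k_max + 1),
               key=lambda k: log_bound_unbalanced(n, p, delta, k))
```

Scanning `best_k` over a range of exponents $c$ with $p=n^{-1+c}$ gives a rough picture of how the preferred value of $k$ changes with $p$ ; the precise thresholds $c_{k}$ are asymptotic statements and are not determined by such a finite computation.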

The main result of this section is a formalisation of the above discussion. In order to make things rigorous from now on we fix $\delta,\varepsilon$ to be positive reals and let

r_{k}=\begin{cases}2&\text{ for }k=0,\\ 2\sqrt{k/(k-1)}&\text{ for }k\geq 2,\end{cases}

where $k$ is a nonnegative integer with $k\neq 1$ . Moreover, note that for all $n^{-1}\ll p\ll 1$ we have $\mathbb{E}[X]=\Omega(n^{4}p^{4})$ . Therefore, for clarity of the presentation we assume that $r_{k}\sqrt{(\delta+\varepsilon)\mathbb{E}[X]}$ is an integer divisible by $k$ for any $2\leq k\leq O(np)$ . Furthermore, if $p\gg n^{-1/2}$ we have $\sqrt{\mathbb{E}[X]}/n=\Omega(np^{2})$ , hence for clarity of the presentation we assume that $C(\varepsilon,\delta)\sqrt{\mathbb{E}[X]}/n$ is an integer, where $C(\varepsilon,\delta)$ will be specified later. The lower bounds given by the following theorem should be thought of as the probability of the appearance of $K_{k,m_{k}/k}$ in $G_{n,p}$ for some fixed integer $2\leq k\leq O(np)$ (provided $p\ll n^{-1/2}$ ) and the appearance of $H$ in $G_{n,p}$ . Here we define:

	$\displaystyle m_{k}$	$\displaystyle=r_{k}\sqrt{(\delta+\varepsilon)\mathbb{E}[X]},$
	$\displaystyle m_{*}$	$\displaystyle=C(\varepsilon,\delta)\sqrt{\mathbb{E}[X]}/2,$

where $C(\varepsilon,\delta)=\sqrt{r+d^{2}}-d$ and $r=16(\delta+3\varepsilon/2)$ and $d=\sqrt{2}/(1+\varepsilon)$ . Now we are ready to state the main result of this section. We wish to emphasize that both $m_{k}$ and $m_{*}$ depend on $\delta$ and $\varepsilon$ .

Theorem 4.1.

Let $\varepsilon,\delta,C$ be positive real numbers and let $2\leq k\leq Cnp$ be a positive integer. Then the following holds

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\geq\begin{cases}\left(p^{m_{k}}\binom{n}{m_{k}/k}\right)^{1+\varepsilon}&\text{ for }n^{-1}\ll p\ll n^{-1/2}\\ p^{({1+\varepsilon})m_{*}}&\text{ for }n^{-1/2}\ll p\ll 1\\ \end{cases}

for large enough $n$ .

In the following lemma we claim that the expected number of induced copies of $C_{4}$ conditioned on one labeled copy of $K_{k,m_{k}/k}$ or $H$ being a subgraph of $G_{n,p}$ in a suitable range of $p$ is at least $(1+\delta+\varepsilon/2)\mathbb{E}[X]$ provided $n$ is large enough. To this end let us introduce the following notations. From now on we assume that the vertex set of $G_{n,p}$ is $[n]$ . For every positive integer $k\leq n$ and $A\subset[n]\setminus[k]$ with $|A|=m_{k}/k$ define the following events:

(I)

$F_{A,k}$ is the event that $\cap_{i=1}^{k}N(i)=A$ and there are no edges between any $i,j\in[k]$ ,
(II)

$F_{k}=\cup_{A\in\binom{[n]\setminus[k]}{m_{k}/k}}F_{A,k}$ .

Now we are ready to state the lemma.

Lemma 4.2.

Suppose $\varepsilon,\delta,C$ are positive reals, $2\leq k\leq Cnp$ is some positive integer and $A\subseteq[n]\setminus[k]$ with $|A|=m_{k}/k$ . Then the following holds:

(1)

Suppose $n^{-1}\ll p\ll n^{-1/2}$ . Then, for large enough $n$ we have

$\mathbb{E}[X\mid F_{A,k}]\geq(1+\delta+3\varepsilon/4)\mathbb{E}[X].$
(2)

Suppose $n^{-1/2}\ll p\ll 1$ . Then, for large enough $n$ we have

$\mathbb{E}_{H}[X]\geq(1+\delta+\varepsilon)\mathbb{E}[X].$

Proof.

We start with the first item in the lemma. Assume $n^{-1}\ll p\ll n^{-1/2}$ . First, note that

\mathbb{E}[X\mid F_{A,k}]\geq N_{ind}(C_{4},K_{k,m_{k}/k})(1-p)+\mathbb{E}[N_{ind}(C_{4},G_{n-k,p})].

Let us compute these two quantities.

Since $K_{s,t}$ contains exactly $\binom{s}{2}\binom{t}{2}$ induced copies of $C_{4}$ we obtain that $K_{k,m_{k}/k}$ contains

\frac{(k-1)m_{k}^{2}}{4k}-O(m_{k})=(\delta+\varepsilon)\mathbb{E}[X]-O(\mathbb{E}[X]^{1/2})

induced copies of $C_{4}$ as $r_{k}=2\sqrt{k/(k-1)}$ . Further, as $k=O(np)$ and for all nonnegative integers $a,b,c$ we have $\binom{a-c}{b}/\binom{a}{b}\geq\left(\frac{a-b-c}{a-b}\right)^{b}$ , we deduce

\mathbb{E}[N_{ind}(C_{4},G_{n-k,p})]=\frac{\binom{n-k}{4}}{\binom{n}{4}}\mathbb{E}[X]\geq\left(1-\frac{k}{n-4}\right)^{4}\mathbb{E}[X]=(1-o(1))\mathbb{E}[X].

Combining the above we obtain,

\mathbb{E}[X\mid F_{A,k}]\geq(1-o(1))(\delta+\varepsilon)\mathbb{E}[X]-O(\mathbb{E}[X]^{1/2})+(1-o(1))\mathbb{E}[X]\geq(1+\delta+3\varepsilon/4)\mathbb{E}[X].
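The count of induced 4-cycles in $K_{k,m_{k}/k}$ used above can be verified exactly. Expanding $\binom{k}{2}\binom{m/k}{2}$ gives $\frac{(k-1)m^{2}}{4k}-\frac{(k-1)m}{4}$ , and the choice $r_{k}^{2}=4k/(k-1)$ is precisely what turns the main term into $(\delta+\varepsilon)\mathbb{E}[X]$ . The following Python sketch checks both facts with exact arithmetic; the function names are hypothetical and $k$ is assumed to divide $m$ .

```python
from fractions import Fraction
from math import comb

def c4_in_complete_bipartite(s, t):
    """Number of induced 4-cycles in K_{s,t}: two vertices from each side."""
    return comb(s, 2) * comb(t, 2)

def c4_in_planted_graph(k, m):
    """Induced C4 count of K_{k, m/k}; requires k to divide m."""
    if m % k != 0:
        raise ValueError("k must divide m")
    return c4_in_complete_bipartite(k, m // k)

def main_term_coefficient(k):
    """Coefficient of (delta + eps) E[X] in (k-1) m_k^2 / (4k),
    when m_k^2 = r_k^2 (delta + eps) E[X] and r_k^2 = 4k / (k-1)."""
    r_k_squared = Fraction(4 * k, k - 1)
    return Fraction(k - 1, 4 * k) * r_k_squared
```

The coefficient returned by `main_term_coefficient` equals 1 for every $k\geq 2$ , which is the reason for the factor $\sqrt{k/(k-1)}$ in the definition of $r_{k}$ .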

For the second item assume $n^{-1/2}\ll p\ll 1$ . Similar to the first case we have

\mathbb{E}_{H}[X]\geq(N_{ind}(C_{4},H)+N_{ind}(K_{1,2}\sqcup K_{1},H)p^{2})(1-p)^{2}+\mathbb{E}[N_{ind}(C_{4},G_{n-2m_{*}/n,p})].

We now compute these quantities.

Since $K_{s,t}$ contains exactly $\binom{s}{2}\binom{t}{2}$ induced copies of $C_{4}$ , we obtain that $H$ contains ${m_{*}^{2}}/{4}+O(m_{*}n)$ induced copies of $C_{4}$ . Moreover, thinking of $H$ as a spanning subgraph of $K_{n}$ by adding isolated vertices, we see that $H$ contains at least $m_{*}n^{2}/8+O(m_{*}^{2})$ induced copies of $K_{1,2}\sqcup K_{1}$ , as we can choose one vertex from one side, two from the other side, and one further vertex that is isolated in $H$ . Furthermore, as for all nonnegative integers $a,b,c$ we have $\binom{a-c}{b}/\binom{a}{b}\geq\left(\frac{a-b-c}{a-b}\right)^{b}$ and $2m_{*}/n=O(np^{2})=o(n)$ , we deduce

\mathbb{E}[N_{ind}(C_{4},G_{n-2m_{*}/n,p})]=\frac{\binom{n-2m_{*}/n}{4}}{\binom{n}{4}}\mathbb{E}[X]\geq\left(1-\frac{2m_{*}}{n(n-4)}\right)^{4}\mathbb{E}[X]\geq(1-o(1))\mathbb{E}[X],

so, taking $n$ large enough, we have $\mathbb{E}[N_{ind}(C_{4},G_{n-2m_{*}/n,p})]\geq(1-\varepsilon/4)\mathbb{E}[X]$ . Recalling that $p\gg n^{-1/2}$ and combining all the above bounds we obtain,

\mathbb{E}_{H}[X]\geq{m_{*}^{2}}/{4}+m_{*}n^{2}p^{2}/8+(1-\varepsilon/4)\mathbb{E}[X]+o(m_{*}^{2}).

By this inequality and the definition of $m_{*}$ one can check that for large enough $n$ ,

\mathbb{E}_{H}[X]\geq(\delta+3\varepsilon/2)\mathbb{E}[X]+(1-\varepsilon/4)\mathbb{E}[X]-(\varepsilon/4)\mathbb{E}[X]=(1+\delta+\varepsilon)\mathbb{E}[X].

This completes the proof. ∎
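The step "one can check" in the second item amounts to a quadratic identity. Writing $m_{*}=C\sqrt{\mathbb{E}[X]}/2$ with $C=\sqrt{r+d^{2}}-d$ , we have $C^{2}+2dC=r$ , and since $n^{2}p^{2}\geq\sqrt{8\mathbb{E}[X]}$ and $d\leq\sqrt{2}$ , the two leading terms satisfy $m_{*}^{2}/4+m_{*}n^{2}p^{2}/8\geq(C^{2}+2dC)\mathbb{E}[X]/16=(\delta+3\varepsilon/2)\mathbb{E}[X]$ . The Python sketch below checks this numerically; it assumes $\mathbb{E}[X]=3\binom{n}{4}p^{4}(1-p)^{2}$ (copies counted as vertex subsets), ignores integrality, and uses hypothetical function names.

```python
import math

def m_star_coefficient(delta, eps):
    """C(eps, delta) = sqrt(r + d^2) - d with r = 16(delta + 3 eps / 2), d = sqrt(2)/(1 + eps)."""
    r = 16 * (delta + 1.5 * eps)
    d = math.sqrt(2) / (1 + eps)
    return math.sqrt(r + d * d) - d

def planted_gain_ratio(n, p, delta, eps):
    """(m_*^2 / 4 + m_* n^2 p^2 / 8) / E[X] for m_* = C sqrt(E[X]) / 2."""
    ex = 3 * math.comb(n, 4) * p**4 * (1 - p)**2
    m_star = m_star_coefficient(delta, eps) * math.sqrt(ex) / 2
    return (m_star**2 / 4 + m_star * n**2 * p**2 / 8) / ex
```

For any admissible parameters the returned ratio should be at least $\delta+3\varepsilon/2$ , which is the bound used in the last display of the proof.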

Now we are ready to prove Theorem 4.1. Before proving the theorem we make the following remark. Suppose $\gamma>0$ is a real number, $k=np$ and $p\gg n^{-1}$ . Then $m_{0}\leq m_{k}\leq(1+\gamma)m_{0}$ provided $n$ is sufficiently large. Therefore, Theorem 4.1 invoked with $\varepsilon=\gamma$ implies:

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\geq\left(p^{m_{k}}\binom{n}{m_{k}/np}\right)^{(1+\gamma)}\geq p^{(1+\gamma)^{2}m_{0}}

for large enough $n$ . Thus, by setting $\gamma=\sqrt{1+\varepsilon}-1$ , and letting $\{c_{k}\}_{k=1}^{\infty}$ be any increasing sequence satisfying $c_{1}=0$ and $\lim_{k\rightarrow\infty}c_{k}=1/3$ , we have the following corollary of Theorem 4.1, which will be shown to be tight in the next sections for some specific sequence $c_{k}$ .

Corollary 4.3.

Let $\varepsilon$ and $\delta$ be positive real numbers and let $k\geq 2$ be an integer. Then the following holds

\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\geq\begin{cases}(1+\varepsilon)(m_{k}\log p+\log\binom{n}{m_{k}/k})&\text{ for }n^{-1+c_{k-1}}\ll p\ll n^{-1+c_{k}},\\ (1+\varepsilon)m_{0}\log(p)&\text{ for }n^{-2/3}\ll p\ll n^{-1/2},\\ {(1+\varepsilon)m_{*}}\log(p)&\text{ for }n^{-1/2}\ll p\ll 1\\ \end{cases}

for large enough $n$ .

Note that in the proof of Theorem 4.1 we use a similar method as was used in [21].

Proof of Theorem 4.1.

Throughout the proof we assume that the vertex set of $G_{n,p}$ is $[n]$ .

We start with the first item. To this end, fix a positive real $\varepsilon$ and an integer $k$ with $2\leq k\leq Cnp$ , and assume that $n^{-1}\ll p\ll n^{-1/2}$ . Let $A\subset[n]\setminus[k]$ with $|A|=m_{k}/k$ and recall the definitions of the events $F_{A,k}$ and $F_{k}$ :

(I)

$F_{A,k}$ is the event that $\cap_{i=1}^{k}N(i)=A$ and there are no edges between any $i,j\in[k]$ ,
(II)

$F_{k}=\cup_{A\in\binom{[n]\setminus[k]}{m_{k}/k}}F_{A,k}$ .

Note that for any $A,B\subseteq[n]\setminus[k]$ such that $A\neq B$ with $|A|=|B|=m_{k}/k$ we have $F_{A,k}\cap F_{B,k}=\emptyset$ . Therefore, we have the following,

$\displaystyle\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])$	$\displaystyle\geq\mathbb{P}(F_{k}\text{ and }X\geq(1+\delta)\mathbb{E}[X])$
	$\displaystyle=\sum\limits_{A\in\binom{[n]\setminus[k]}{m_{k}/k}}\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X]\mid F_{A,k})\cdot\mathbb{P}(F_{A,k})$
	$\displaystyle\geq\sum\limits_{A\in\binom{[n]\setminus[k]}{m_{k}/k}}\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X]\mid F_{A,k})\cdot p^{m_{k}}(1-p)^{k^{2}}(1-p^{k})^{(n-k)}.$	(8)

Lemma 4.2 asserts that provided $n^{-1}\ll p\ll n^{-1/2}$ we have the following for all $A\subset[n]\setminus[k]$ with $|A|=m_{k}/k$ :

\mathbb{E}[X\mid F_{A,k}]\geq(1+\delta+\varepsilon/2)\mathbb{E}[X].

Since $X\leq n^{4}$ always, we can bound $\mathbb{E}[X\mid F_{A,k}]$ from above (similarly to the proof of Lemma 2.2) as follows:

\mathbb{E}[X\mid F_{A,k}]\leq(1+\delta)\mathbb{E}[X]+\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X]\mid F_{A,k})\cdot n^{4}.

Combining the two inequalities we obtain,

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X]\mid F_{A,k})\geq\frac{\varepsilon\cdot\mathbb{E}[X]}{2n^{4}}.

Therefore, for large enough $n$ we also have,

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X]\mid F_{A,k})\geq\varepsilon p^{4}(1-p)^{2}/(2\cdot 4^{4})\geq p^{5}.

(9)

Substituting (9) into (8) gives,

	$\displaystyle\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])$	$\displaystyle\geq\sum\limits_{A\in\binom{[n]\setminus[k]}{m_{k}/k}}p^{5}p^{m_{k}}(1-p)^{k^{2}}(1-p^{k})^{n-k}$
		$\displaystyle=\binom{n-k}{m_{k}/k}p^{m_{k}+5}(1-p)^{k^{2}}(1-p^{k})^{n-k}.$		(10)

We now show that $p^{5}(1-p)^{k^{2}}(1-p^{k})^{n-k}\geq p^{o(m_{k})}$ and $\binom{n-k}{m_{k}/k}\geq\binom{n}{m_{k}/k}^{1+o(1)}p^{o(m_{k})}$ .

First, as $p\ll n^{-1/2},m_{k}\gg 1$ and $2\leq k\leq Cnp$ , we have the following for large $n$ :

$\displaystyle p^{5}(1-p)^{k^{2}}(1-p^{k})^{(n-k)}$	$\displaystyle\geq\exp(5\log(p)-k^{2}(p+p^{2})-(n-k)(p^{k}+p^{2k}))$
	$\displaystyle\geq\exp(5\log(p)-C^{2}n^{2}p^{3}(1+p)-np^{2}(1+p^{k}))$
	$\displaystyle=\exp(-o(n^{2}p^{2}))$
	$\displaystyle\geq p^{\varepsilon m_{k}/2},$	(11)

where the first inequality follows as $1-x\geq\exp(-x-x^{2})$ for all small enough $x$ . Second, for all nonnegative integers $a,b,c$ we have $\binom{a-c}{b}/\binom{a}{b}\geq\left(\frac{a-b-c}{a-b}\right)^{b}$ , and thus we obtain that

\binom{n-k}{m_{k}/k}/\binom{n}{m_{k}/k}\geq\left({1-\frac{k}{n-m_{k}/k}}\right)^{m_{k}/k}=e^{-O(m_{k}/n)}.
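The binomial ratio inequality used here, and twice in the proof of Lemma 4.2, follows by writing $\binom{a-c}{b}/\binom{a}{b}=\prod_{i=0}^{b-1}\big(1-\frac{c}{a-i}\big)$ and bounding each factor from below by $1-\frac{c}{a-b}$ . A quick exact check in Python, with a hypothetical function name and assuming $a>b\geq 0$ and $0\leq c\leq a-b$ , is sketched below.

```python
from fractions import Fraction
from math import comb

def ratio_inequality_holds(a, b, c):
    """Check binom(a-c, b) / binom(a, b) >= ((a-b-c) / (a-b))^b with exact rationals."""
    if not (a > b >= 0 and 0 <= c <= a - b):
        raise ValueError("need a > b >= 0 and 0 <= c <= a - b")
    lhs = Fraction(comb(a - c, b), comb(a, b))
    rhs = Fraction(a - b - c, a - b) ** b
    return lhs >= rhs
```

Using `Fraction` avoids any floating-point rounding, so the comparison is exact for the tested triples $(a,b,c)$ .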

In addition, as $2\leq k\leq Cnp$ and $n^{-1}\ll p\ll n^{-1/2}$ , we have

	$\displaystyle\binom{n}{m_{k}/k}p^{m_{k}}$	$\displaystyle\leq\left(\frac{enk}{m_{k}}\right)^{m_{k}/k}p^{m_{k}}\leq\left(\frac{eCn^{2}p}{m_{k}}\right)^{m_{k}/k}p^{m_{k}}$
		$\displaystyle\leq\exp\left(\left(1-\frac{1}{k}+O\left(\frac{1}{k\log(p)}\right)\right)m_{k}\log(p)\right)$
		$\displaystyle\leq\exp((1+o(1))m_{k}\log(p)).$

Furthermore, as $2\leq k\leq Cnp$ and $n^{-1}\ll p\ll n^{-1/2}$ we have $m_{k}/n=o(m_{k}\log(1/p))$ and hence, for sufficiently large $n$ we have,

\binom{n-k}{m_{k}/k}/\binom{n}{m_{k}/k}\geq\binom{n}{m_{k}/k}^{\varepsilon}p^{\varepsilon m_{k}/2}.

(12)

Combining (10), (11), and (12) gives,

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\geq\left(\binom{n}{m_{k}/k}p^{m_{k}}\right)^{1+\varepsilon}.

This finishes the first part of the proof.

For the second item, define the event $F^{*}$ to be the event that $G_{n,p}$ contains $K_{2m_{*}/n,n/2}$ as a subgraph on the vertex set $[n/2+2m_{*}/n]$ with sides $[2m_{*}/n]$ and $[n/2+2m_{*}/n]\setminus[2m_{*}/n]$ . Note that $X\leq n^{4}$ always. Further, Lemma 4.2 asserts that provided $n^{-1/2}\ll p\ll 1$ we have

\mathbb{E}[X\mid F^{*}]\geq(1+\delta+\varepsilon)\mathbb{E}[X],

and thus, $\Phi_{X}(\delta+\varepsilon)\leq-\log\mathbb{P}(F^{*})=-\log(p^{m_{*}})$ . Therefore, applying Lemma 2.2 gives,

\displaystyle-\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq\Phi_{X}(\delta+\varepsilon)+\log\left(\frac{n^{4}}{\varepsilon\mathbb{E}[X]}\right)\leq-\log\left(\frac{p^{m_{*}}\varepsilon\mathbb{E}[X]}{n^{4}}\right).

Moreover, for sufficiently large $n$ we have $\frac{\varepsilon\mathbb{E}[X]}{n^{4}}\geq p^{5}$ . Thus, exponentiating, we obtain the following for large enough $n$ :

\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\geq p^{m_{*}+5}\geq p^{(1+\varepsilon)m_{*}}.

This is as claimed. ∎

5. Counting the number of cores

Recall that $X$ is the random variable counting the number of induced copies of $C_{4}$ in $G_{n,p}$ . In this section we prove a general upper bound on the logarithmic probability of the upper tail event of $X$ . The main tool we use in this section is Theorem 2.7 which is a variation of . There are two major parts in this section. The first is an evaluation of $\Phi_{X}(\delta)$ in different regimes of $p$ . The second is an evaluation of the entropic term $|\mathcal{C}_{m}|$ . The main results of this section are the following lemmas.

Lemma 5.1.

Suppose $\varepsilon,\delta$ are positive real numbers with $\varepsilon$ being small as a function of $\delta$ . Then the following hold:

(i)

If $p\ll n^{-1/2}$ then for large enough $n$ we have,

2(1-\varepsilon)\sqrt{\delta\mathbb{E}[X]}\leq\Phi_{X}(\delta)/\log(1/p)\leq 2(1+\varepsilon)\sqrt{\delta\mathbb{E}[X]}.

(ii)

If $n^{-1/2}\ll p\ll 1$ then for large enough $n$ we have,

(1-\varepsilon)\left(\sqrt{\frac{n^{4}p^{4}}{16}+4\delta\mathbb{E}[X]}-\frac{n^{2}p^{2}}{4}\right)\leq\frac{\Phi_{X}(\delta)}{\log(1/p)}\leq(1+\varepsilon)\left(\sqrt{\frac{n^{4}p^{4}}{16}+4\delta\mathbb{E}[X]}-\frac{n^{2}p^{2}}{4}\right).

Before we present the second lemma, let us remind the reader of the definition of $\mathcal{C}_{m}$ .

Definition.

Let $\varepsilon,\delta,K$ be positive reals. In addition let $p\in(0,1)$ . Then we define $\mathcal{C}(\varepsilon,\delta,K)$ to be the collection of all spanning subgraphs $G\subseteq K_{n}$ satisfying the following:

( $C{1}$ )

$e(G)\leq K\cdot\Phi_{X}(\delta+\varepsilon)$ ,
( $C{2}$ )

$N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}\geq(\delta-\varepsilon)\mathbb{E}[X]$ , and

( $C{3}$ )

For all $e\in E(G)$

N_{ind}(e,C_{4},G)+N_{ind}(e,K_{1,2}\sqcup K_{1},G)p^{2}\geq\varepsilon\mathbb{E}[X]/(2K\cdot\Phi_{X}(\delta+\varepsilon)).

We call the graphs in $\mathcal{C}(\varepsilon,\delta,K)$ core graphs. Furthermore, for every positive integer $m$ we define $\mathcal{C}_{m}(\varepsilon,\delta,K)$ to be the set of all cores with $m$ edges.

Lemma 5.2.

Suppose $\varepsilon,\delta,C,K$ are positive reals with $\varepsilon<1$ . Furthermore, suppose $n^{-1}\ll p\ll 1$ as $n$ tends to infinity. Then, there exist $D$ and $n_{0}$ such that the following holds for all $n>n_{0}$ :
Let $m$ be a positive integer with $Cn^{2}p^{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ . Furthermore, let

v_{m}=\max\{|\{v\in V(G):\deg(v)\neq 0\}|:G\in\mathcal{C}_{m}(\varepsilon,\delta,K)\}.

Then,

|\mathcal{C}_{m}|\leq\log(1/p)^{Dm}\binom{n}{v_{m}}.

We start with the first lemma. Let us give some motivation and history of the problem.

Definition.

For every graph $H$ and any positive integer $m$ define,

N(m,H)=\max|\{T\subset G:T\cong H\}|

where the maximum ranges over all graphs with $m$ edges.

This definition was first presented by Erdős and Hanani in [17]. In their paper they computed the asymptotics of this function when $H$ is a clique. In [1] Alon generalized this and computed the asymptotics of this function for all $H$ . Later, in [18], Friedgut and Kahn reproved Alon’s result using entropy methods, which they were also able to extend to the case of hypergraphs.

In [22] Janson, Oleszkiewicz, and Ruciński found a relation between $N(m,H)$ and a related parameter $N(n,m,H)$ and the probability that the random variable counting the number of copies of a fixed graph $H$ in $G_{n,p}$ exceeds its expectation by a multiplicative factor. This result used and generalized the machinery developed by Friedgut and Kahn. This led us to generalize the definition of $N(m,H)$ to fit our setting. Let us recall some definitions from Section 3 and introduce some new ones.

Definition.

Suppose $H$ and $G$ are graphs. Let $n,m$ be positive integers and let $e$ be an edge of $G$ . We define:

•

$N_{ind}(H,G)$ is the number of induced copies of $H$ in $G$ .
•

$N_{ind}(e,H,G)$ is the number of induced copies of $H$ in $G$ that contain $e$ .
•

${N}_{ind}(n,m,H)$ is the maximum of $N_{ind}(H,G)$ over all graphs $G$ such that the number of vertices of $G$ is at most $n$ and the number of edges of $G$ is $m$ .

Note that similar generalizations have been studied in [7] and [19]. The following is a simple corollary of Lemma 6.2 which we prove later in Section 6.

Corollary 5.3.

For every $n,m$ positive integers such that $3<m\leq n$ we have,

{N}_{ind}(n,m,C_{4})\leq\frac{m(m-n+1)}{4}\leq\frac{m^{2}}{4}.

In [7] the authors determined the asymptotic behaviour of $\max\{N_{ind}(n,m,K_{s,s}):n\in\mathbb{N}\}$ for any $s\in\mathbb{N}$ (and actually obtained optimal bounds), and gave tight bounds in some ranges of $m$ (in the above we consider only $K_{2,2}$ ). In [19] the authors considered the problem of determining the asymptotic in $n$ and $m$ of the maximum number of copies of a fixed bipartite graph $H$ in a bipartite host graph with $n$ vertices and $m$ edges. They solved this problem for some class of graphs which includes all complete bipartite graphs $K_{s,t}$ . Here we prove a similar statement to the ones in [7, 19].

We wish to emphasize that there is a significant difference in the behaviour of the problem depending on whether $n^{-1}\ll p\ll n^{-1/2}$ or $n^{-1/2}\ll p\ll 1$ . This can be seen for example in the first result of this section which we now start to prove. To this end we start with an observation. Recall that we denote by $K_{1,2}\sqcup K_{1}$ the disjoint union of a star with two leaves and an isolated vertex.

Observation 5.4.

Suppose $n,m$ are positive integers with $m\leq\binom{n}{2}$ . Then,

N_{ind}(n,m,K_{1,2}\sqcup K_{1})\leq mn^{2}/8.

Proof.

Let $G$ be a graph achieving the maximum in the definition of $N_{ind}(n,m,K_{1,2}\sqcup K_{1})$ . Let $uv=e\in E(G)$ and denote $x=|N(u)\cup N(v)\setminus\{u,v\}|$ . Then,

\displaystyle N_{ind}(e,K_{1,2}\sqcup K_{1},G)\leq x(n-x)\leq n^{2}/4.

Moreover we have,

N_{ind}(K_{1,2}\sqcup K_{1},{G})\leq\frac{1}{2}\sum_{e\in E({G})}N_{ind}(e,K_{1,2}\sqcup K_{1},G)\leq\frac{mn^{2}}{8}.\qed
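As a sanity check that is not part of the original argument, the bound of Observation 5.4 can be verified by brute force on very small graphs. The sketch below assumes unlabeled induced copies (one per vertex 4-subset), and the helper names `count_induced_cherry_plus_vertex` and `max_violation` are hypothetical, introduced only for this illustration.

```python
from itertools import combinations

def count_induced_cherry_plus_vertex(n, edges):
    """Count 4-subsets of range(n) inducing K_{1,2} plus an isolated vertex."""
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for quad in combinations(range(n), 4):
        inside = [pq for pq in combinations(quad, 2) if frozenset(pq) in edge_set]
        # Exactly two induced edges that share a vertex: a path on three
        # vertices together with one isolated vertex.
        if len(inside) == 2 and set(inside[0]) & set(inside[1]):
            count += 1
    return count

def max_violation(n):
    """Largest value of N_ind - m*n^2/8 over all graphs on n labelled vertices."""
    pairs = list(combinations(range(n), 2))
    worst = float("-inf")
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if mask >> i & 1]
        m = len(edges)
        excess = count_induced_cherry_plus_vertex(n, edges) - m * n * n / 8
        worst = max(worst, excess)
    return worst

# A non-positive value means the bound m*n^2/8 held for every graph examined.
print(max_violation(4), max_violation(5))
```

The exhaustive search is only feasible for a handful of vertices; it illustrates the statement and does not replace the proof.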

Proof of Lemma 5.1.

We start with the upper bounds both in (i) and in (ii).

For the upper bound in (i), assume that $n^{-1}\ll p\ll n^{-1/2}$ . Since $\lim_{n\rightarrow\infty}\mathbb{E}[X]=\infty$ we may treat $m^{\prime}=\sqrt[4]{4(1+\varepsilon)\delta\mathbb{E}[X]}$ as an integer and let $H=K_{m^{\prime},m^{\prime}}$ . Note that $H$ has $2\sqrt{(1+\varepsilon)\delta\mathbb{E}[X]}$ edges and provided $n$ is large enough $H$ contains at least $(1+\varepsilon/2)\delta\mathbb{E}[X]>\delta\mathbb{E}[X]$ induced copies of $C_{4}$ . Note further that,

\mathbb{E}_{H}[X]\geq N_{ind}(C_{4},H)(1-p)^{2}+\mathbb{E}[N_{ind}(C_{4},G_{n-2m^{\prime},p})]\geq(1+\varepsilon/2)\delta\mathbb{E}[X]+\frac{\binom{n-2m^{\prime}}{4}}{\binom{n}{4}}\mathbb{E}[X].

As $m^{\prime}\ll n$ we also have the following for sufficiently large $n$ ,

\binom{n-2m^{\prime}}{4}/{\binom{n}{4}}\geq\left(\frac{n-2m^{\prime}-3}{n}\right)^{4}\geq 1-\delta\varepsilon/2.

Combining the above inequalities we find the following for large enough $n$ ,

\mathbb{E}_{H}[X]\geq(1+\varepsilon/2)\delta\mathbb{E}[X]+(1-\varepsilon\delta/2)\mathbb{E}[X]=(1+\delta)\mathbb{E}[X].

Therefore, for large enough $n$ we have

\mathbb{E}_{H}[X]-\mathbb{E}[X]\geq\delta\mathbb{E}[X].

Hence, by the definition of $\Phi_{X}(\delta)$ ,

\Phi_{X}(\delta)\leq-\log(p^{e(H)})=2\sqrt{(1+\varepsilon)\delta\mathbb{E}[X]}\log(1/p)\leq 2(1+\varepsilon)\sqrt{\delta\mathbb{E}[X]}\log(1/p).
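The construction above rests on the fact that $K_{m^{\prime},m^{\prime}}$ has $e(H)=m^{\prime 2}$ edges and roughly $e(H)^{2}/4$ induced copies of $C_{4}$ . The following short sketch, with a hypothetical helper name and arbitrarily chosen part sizes, illustrates how quickly the ratio approaches $1$ .

```python
from math import comb

def induced_c4_in_balanced_bipartite(a):
    """Induced C4 copies in K_{a,a}: choose two vertices on each side."""
    return comb(a, 2) ** 2

for a in (10, 100, 1000):
    edges = a * a
    # Ratio of the exact count to the leading-order value e(H)^2 / 4.
    ratio = induced_c4_in_balanced_bipartite(a) / (edges ** 2 / 4)
    print(a, edges, ratio)
```

The ratio equals $((a-1)/a)^{2}$ , which is why the factor $(1+\varepsilon)$ in the definition of $m^{\prime}$ leaves room for at least $(1+\varepsilon/2)\delta\mathbb{E}[X]$ copies once $n$ is large.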

For the upper bound in (ii), assume $n^{-1/2}\ll p\ll 1$ . Letting

\tilde{m}=(1+\varepsilon)\left(\sqrt{\frac{n^{4}p^{4}}{16}+4\delta\mathbb{E}[X]}-\frac{n^{2}p^{2}}{4}\right),

and recalling that $\lim_{n\rightarrow\infty}\sqrt{\mathbb{E}[X]}/n=\infty$ we may treat $\tilde{m}/n$ as an integer and consider the following graph. Let $H$ be a graph on the vertex set $[n]$ and such that $H\left[[2\tilde{m}/n+n/2]\right]\cong K_{2\tilde{m}/n,n/2}$ and all vertices in $[n]\setminus[2\tilde{m}/n+n/2]$ are isolated. Note that $H$ contains $\tilde{m}$ edges. Let $\xi$ be a small positive real. Assuming $n$ is large enough, $H$ contains at least $(1-\xi)\tilde{m}^{2}/4$ induced copies of $C_{4}$ . Furthermore, if $n$ is large enough, $H$ contains at least $(1-\xi)\frac{\tilde{m}n^{2}}{8}$ induced copies of $K_{1,2}\sqcup K_{1}$ . Let $\eta$ be a small positive real (that might depend on $\xi$ ). Then provided $n$ is large enough,

\begin{aligned}\mathbb{E}_{H}[X]&\geq\left(N_{ind}(C_{4},H)+N_{ind}(K_{1,2}\sqcup K_{1},H)p^{2}\right)(1-p)^{2}+\mathbb{E}[N_{ind}(C_{4},G_{n-2\tilde{m}/n,p})]\\ &\geq(1-\eta)(1-\xi)\left(\frac{\tilde{m}^{2}}{4}+\frac{\tilde{m}n^{2}p^{2}}{8}\right)+\frac{\binom{n-2\tilde{m}/n}{4}}{\binom{n}{4}}\mathbb{E}[X].\end{aligned}

Here the first inequality holds because its first term is at most the expected number of induced copies of $C_{4}$ with at least one vertex in $[2\tilde{m}/n]$ , while its second term is the expected number of induced copies of $C_{4}$ with no vertices in $[2\tilde{m}/n]$ .

By the definition of $\tilde{m}$ and the choices of $\xi,\eta$ being small enough we obtain,

\displaystyle(1-\eta)(1-\xi)\left(\frac{\tilde{m}^{2}}{4}+\frac{\tilde{m}n^{2}p^{2}}{8}\right)\geq(1+\varepsilon/2)\delta\mathbb{E}[X].

Further, for large enough $n$ we have,

\binom{n-2\tilde{m}/n}{4}/\binom{n}{4}\geq\left(\frac{n-2\tilde{m}/n-3}{n}\right)^{4}\geq{(1-\delta\varepsilon/2)}.

Combining all of these inequalities we have,

\mathbb{E}_{H}[X]-\mathbb{E}[X]\geq\delta\mathbb{E}[X].

Hence, by the definition of $\Phi_{X}(\delta)$ we have,

\Phi_{X}(\delta)\leq-\log(p^{e(H)})=(1+\varepsilon)\left(\sqrt{\frac{n^{4}p^{4}}{16}+4\delta\mathbb{E}[X]}-\frac{n^{2}p^{2}}{4}\right)\log(1/p).

For the lower bounds of both (i) and (ii) we start with a claim.

Claim 5.5.

Suppose $G$ is a spanning subgraph of $K_{n}$ achieving the minimum in the definition of $\Phi_{X}(\delta)$ and let $m=e(G)$ . Then for any small enough $\gamma>0$ and large enough $n$

\displaystyle\delta\mathbb{E}[X]\leq m^{2}/4+\min\{mn^{2}p^{2}/8,m^{2}np^{2}/2\}+\gamma n^{4}p^{4}.

Proof.

Let $G$ be a spanning subgraph of $K_{n}$ achieving the minimum in the definition of $\Phi_{X}(\delta)$ and put $m=e(G)$ . By the definition of $\Phi_{X}(\delta)$ we have $\mathbb{E}_{G}[X]-\mathbb{E}[X]\geq\delta\mathbb{E}[X]$ . Note that we can bound $\mathbb{E}_{G}[X]-\mathbb{E}[X]$ from above in the following way:

\displaystyle\mathbb{E}_{{G}}[X]-\mathbb{E}[X]\leq\sum\limits_{\emptyset\neq H\subseteq^{*}C_{4}}N_{ind}(H,{G})p^{4-e(H)},

(13)

where $\subseteq^{*}$ stands for spanning subgraphs. Let $P_{4}$ be the path with three edges, $M_{2}$ be a matching of size two and $K_{2}\sqcup I_{2}$ be a disjoint union of an edge and an independent set of size two. Since any matching of size two spans at most one induced copy of $P_{4}$ or $M_{2}$ , we have

N_{ind}(P_{4},G),N_{ind}(M_{2},G)\leq m^{2}.

Therefore we obtain

\begin{aligned}\mathbb{E}_{G}[X]-\mathbb{E}[X]&\leq N_{ind}(C_{4},{G})+N_{ind}(K_{1,2}\sqcup K_{1},{G})p^{2}\\ &\quad+\underbrace{m^{2}p}_{N_{ind}(P_{4},{G})}+\underbrace{m^{2}p^{2}}_{N_{ind}(M_{2},{G})}+\underbrace{mn^{2}p^{3}}_{N_{ind}(K_{2}\sqcup I_{2},{G})}\\ &\leq N_{ind}(C_{4},{G})+N_{ind}(K_{1,2}\sqcup K_{1},{G})p^{2}+mp(m+mp+n^{2}p^{2}).\end{aligned}

By Observation 5.4 we have

N_{ind}(K_{1,2}\sqcup K_{1},{G})\leq N_{ind}(n,m,K_{1,2}\sqcup K_{1})\leq\frac{mn^{2}}{8}.

Furthermore we always have,

N_{ind}(K_{1,2}\sqcup K_{1},{G})\leq\binom{m}{2}\cdot n\leq\frac{m^{2}n}{2}.

Combining the two we find that,

\displaystyle N_{ind}(K_{1,2}\sqcup K_{1},{G})p^{2}\leq\min\{mn^{2}p^{2}/8,m^{2}np^{2}/2\}.

Hence, by the definition of $\Phi_{X}(\delta)$ and Corollary 5.3, for large enough $n$ we have,

\begin{aligned}\delta\mathbb{E}[X]&\leq N_{ind}(C_{4},{G})+N_{ind}(K_{1,2}\sqcup K_{1},{G})p^{2}+mp(m+mp+n^{2}p^{2})\\ &\leq m^{2}/4+\min\{mn^{2}p^{2}/8,m^{2}np^{2}/2\}+mp(m+mp+n^{2}p^{2}).\end{aligned}

(14)

By the upper bound established above we have $m=O(n^{2}p^{2})$ , and since $p\ll 1$ the last term of (14) is $o(n^{4}p^{4})$ . Therefore, for any $\gamma>0$ and sufficiently large $n$ we obtain from (14) the following inequality

\delta\mathbb{E}[X]\leq m^{2}/4+\min\{mn^{2}p^{2}/8,m^{2}np^{2}/2\}+\gamma n^{4}p^{4}.\qed

We now prove the lower bound in (i). For this assume that $n^{-1}\ll p\ll n^{-1/2}$ and note that this implies that $\min\{mn^{2}p^{2}/8,m^{2}np^{2}/2\}=o(m^{2})$ . Thus, for any $\gamma>0$ and large enough $n$ we deduce the following from Claim 5.5,

\delta\mathbb{E}[X]\leq m^{2}/4+\min\{mn^{2}p^{2}/8,m^{2}np^{2}/2\}+\gamma n^{4}p^{4}\leq(1/4+\gamma)m^{2}+\gamma n^{4}p^{4}.

Taking $\gamma$ sufficiently small we obtain the following bound on $m$ ,

(1-\varepsilon)\delta\mathbb{E}[X]\leq\frac{1}{4(1-\varepsilon)}m^{2},

implying

m\geq 2(1-\varepsilon)\sqrt{\delta\mathbb{E}[X]}.

This proves the lower bound in (i) as for sufficiently large $n$ we obtain,

\Phi_{X}(\delta)=-\log(p^{m})\geq 2(1-\varepsilon)\sqrt{\delta\mathbb{E}[X]}\log(1/p).

Lastly, we prove the lower bound in (ii). For this assume $n^{-1/2}\ll p\ll 1$ . Let $\gamma>0$ be some small constant, then for large enough $n$ Claim 5.5 implies,

\displaystyle\delta\mathbb{E}[X]\leq m^{2}/4+mn^{2}p^{2}/8+\gamma\mathbb{E}[X].

Solving this quadratic inequality in $m$ and using the fact that $\mathbb{E}[X]=(1+o(1))\frac{n^{4}p^{4}}{8}$ , we conclude that

m\geq\sqrt{\frac{n^{4}p^{4}}{16}+4(\delta-\gamma)\mathbb{E}[X]}-\frac{n^{2}p^{2}}{4}\geq(1-\varepsilon)\left(\sqrt{\frac{n^{4}p^{4}}{16}+4\delta\mathbb{E}[X]}-\frac{n^{2}p^{2}}{4}\right).

This implies (ii) as follows,

\Phi_{X}(\delta)=-\log(p^{m})\geq(1-\varepsilon)\left(\sqrt{\frac{n^{4}p^{4}}{16}+4\delta\mathbb{E}[X]}-\frac{n^{2}p^{2}}{4}\right)\log(1/p).\qed
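The expression appearing in (ii) is the positive root of the quadratic $m^{2}/4+mn^{2}p^{2}/8=\delta\mathbb{E}[X]$ . The sketch below checks this numerically; the parameter values are arbitrary, the function name is hypothetical, and $\mathbb{E}[X]$ is replaced by its leading-order value $n^{4}p^{4}/8$ , which is an assumption of the illustration.

```python
from math import sqrt, isclose

def min_edges_dense(n, p, delta, expected):
    """Positive root m of m^2/4 + m*n^2*p^2/8 = delta * expected."""
    b = n * n * p * p / 4
    return sqrt(b * b + 4 * delta * expected) - b

n, p, delta = 10_000, 0.05, 1.0
expected = n ** 4 * p ** 4 / 8  # leading-order stand-in for E[X] (assumption)
m = min_edges_dense(n, p, delta, expected)

# Plug the root back into the left-hand side of the quadratic.
lhs = m * m / 4 + m * n * n * p * p / 8
print(m, lhs, delta * expected)
assert isclose(lhs, delta * expected, rel_tol=1e-9)
```

With $\mathbb{E}[X]$ replaced by $n^{4}p^{4}/8$ , the root simplifies to $(\sqrt{1+8\delta}-1)n^{2}p^{2}/4$ , so in this regime the edge count is of order $n^{2}p^{2}$ for every fixed $\delta$ .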

This completes the first evaluation of this section. The second is the evaluation of the entropic term $|\mathcal{C}_{m}|$ given by Lemma 5.2. Roughly speaking, Lemma 5.2 shows that the number of core graphs with $m$ edges is controlled by $v_{m}$ , the maximum number of non-isolated vertices over all graphs in $\mathcal{C}_{m}$ . As seen already, there is a big difference in the behaviour of the problem depending on whether $p\ll n^{-1/2}$ or $p\gg n^{-1/2}$ . Later, we will derive two corollaries from Lemma 5.2, corresponding to these two regimes; these corollaries will be very important in Section 6.

Proof of Lemma 5.2.

Let $G\in\mathcal{C}_{m}$ be a core graph with $m$ edges and let $uv$ be an edge in $G$ . Then by the definition of a core graph we have,

N_{ind}(uv,C_{4},G)+N_{ind}(uv,K_{1,2},G)np^{2}\geq\varepsilon\mathbb{E}[X]/(2K\cdot\Phi_{X}(\delta+\varepsilon)).

Therefore, at least one of the following two inequalities holds:

\deg(u)\deg(v)\geq N_{ind}(uv,C_{4},G)\geq\varepsilon\mathbb{E}[X]/(4K\cdot\Phi_{X}(\delta+\varepsilon)),

(15)

\deg(u)+\deg(v)\geq N_{ind}(uv,K_{1,2},G)\geq\varepsilon\mathbb{E}[X]/(4K\cdot\Phi_{X}(\delta+\varepsilon)np^{2}),

(16)

We call this property $(*)$ . To bound $|\mathcal{C}_{m}|$ it is enough to bound the number of subgraphs of $K_{n}$ with $m$ edges satisfying property $(*)$ . This number is at most the product of the following two quantities:

(1)

The number of possible ways to choose the set of non-isolated vertices of such a graph.
(2)

The number of possible choices of the edges of the graph such that property $(*)$ is satisfied.

The first item can be bounded from above by the number of ways to choose a set of at most $v_{m}$ vertices which is at most $2^{v_{m}}\binom{n}{v_{m}}$ .

We now bound the second item. Let $H$ be a graph with $v$ vertices and $m$ edges that satisfies property $(*)$ . For every integer $0\leq t\leq\log_{2}(m)$ define $V_{t}(H)=\{v\in V(H):2^{t}\leq\deg(v)<2^{t+1}\}$ . Furthermore, for every integer $0\leq t\leq\log_{2}(m)$ define $U_{t}(H)=\cup_{\ell\geq t}V_{\ell}$ . Using a standard double counting argument and the fact that $U_{t}(H)\subseteq V(H)$ we have

|V_{t}(H)|\leq|U_{t}(H)|\leq\min\left\{\frac{m}{2^{t-1}},v_{m}\right\}

for all $t\geq 0$ . By property $(*)$ edges can be placed between the sets $V_{t}$ in two ways.

The first option is when the edges satisfy (15). Since the degree of each vertex is always bounded from above by $n$ , we obtain that the degrees of the endpoints of each edge satisfying (15) are bounded from below by $d^{*}$ defined as

d^{*}=\frac{\varepsilon\mathbb{E}[X]}{4K\cdot n\Phi_{X}(\delta+\varepsilon)}=\frac{C_{0}np^{2}}{\log(1/p)}

where $C_{0}$ is some positive real that might depend on $\varepsilon,\delta$ and $K$ . Thus, such edges can only connect a vertex in $V_{t}$ and a vertex in $U_{\lfloor\log_{2}\ell(t)\rfloor}$ for integers $\lfloor\log_{2}(d^{*})\rfloor\leq t\leq\log_{2}(n)$ and $\ell(t)=\varepsilon\mathbb{E}[X]/(4K\cdot\Phi_{X}(\delta+\varepsilon)2^{t})$ . We denote the number of such options by $T_{m}$ .

The second option is that the edge has an endpoint in $U_{\lfloor\log_{2}(r)\rfloor}(H)$ where $r$ is defined as $r=\varepsilon\mathbb{E}[X]/\left(8K\cdot\Phi_{X}(\delta+\varepsilon)np^{2}\right)$ . That happens when the edge satisfies (16). We denote the number of such options by $S_{m}$ .
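The dyadic degree classes $V_{t}$ and $U_{t}$ used above, together with the double-counting bound $|U_{t}|\leq m/2^{t-1}$ , can be sketched as follows. The input is a uniformly random graph chosen purely for illustration (it is not claimed to be a core graph), and all function names are hypothetical.

```python
import random

def dyadic_classes(degrees):
    """Group non-isolated vertices by t = floor(log2(deg)), so 2^t <= deg < 2^(t+1)."""
    classes = {}
    for v, d in degrees.items():
        if d > 0:
            classes.setdefault(d.bit_length() - 1, set()).add(v)
    return classes

def upper_class_size(classes, t):
    """Size of U_t, the union of all V_s with s >= t."""
    return sum(len(vs) for s, vs in classes.items() if s >= t)

def random_graph_degrees(n, m, seed=0):
    """Degrees of a random graph with n vertices and m edges (illustrative input)."""
    rng = random.Random(seed)
    pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    degrees = {v: 0 for v in range(n)}
    for u, v in rng.sample(pairs, m):
        degrees[u] += 1
        degrees[v] += 1
    return degrees

m = 200
classes = dyadic_classes(random_graph_degrees(60, m))
for t in sorted(classes):
    # Columns: t, |V_t|, |U_t|, and the bound m / 2^(t-1).
    print(t, len(classes[t]), upper_class_size(classes, t), m / 2 ** (t - 1))
```

The bound holds for every graph because the vertices of $U_{t}$ each contribute at least $2^{t}$ to the degree sum $2m$ .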

Furthermore, note that provided $n$ is large enough there are at most

{\left(\log_{2}(n)-\log_{2}(d^{*})+2\right)^{v_{m}}}={\left(\log_{2}\left(\frac{\log(1/p)}{C_{0}p^{2}}\right)+2\right)^{v_{m}}}\leq 3^{v_{m}}\log(1/p)^{v_{m}}

partitions of the non-isolated vertices of our graph into sets $V_{t}$ where $\lfloor\log_{2}(d^{*})\rfloor\leq t\leq\log_{2}(n)$ is an integer and $\cup_{t=0}^{\lfloor\log_{2}(d^{*})\rfloor-1}V_{t}$ . We will think of the vertices in $V_{t}$ as vertices which will have degrees between $2^{t}$ and $2^{t+1}$ where $\lfloor\log_{2}(d^{*})\rfloor\leq t\leq\log_{2}(n)$ is an integer.

We conclude that given the vertex set and a partition as explained above, there are at most $T_{m}+S_{m}$ pairs of vertices that can be edges of $H$ . Hence, the number of graphs with property $(*)$ is bounded from above by

{\left(3\log(1/p)\right)^{v_{m}}}2^{v_{m}}\binom{n}{v_{m}}\binom{S_{m}+T_{m}}{m}\leq(6\log(1/p))^{v_{m}}\binom{n}{v_{m}}\binom{S_{m}+T_{m}}{m}.

Let us now estimate $S_{m}$ and $T_{m}$ .

\begin{aligned}T_{m}&=\sum\limits_{t=\lfloor\log_{2}(d^{*})\rfloor}^{\log_{2}(n)}|V_{t}||U_{\lfloor{\log_{2}(\ell(t))}\rfloor}|\leq\sum\limits_{t=\lfloor\log_{2}(d^{*})\rfloor}^{\log_{2}(n)}\frac{m}{2^{t-1}}\frac{4m}{\ell(t)}=\sum\limits_{t=\lfloor\log_{2}(d^{*})\rfloor}^{\log_{2}(n)}\frac{m^{2}}{2^{t-3}}\frac{4K\cdot\Phi_{X}(\delta+\varepsilon)2^{t}}{\varepsilon\mathbb{E}[X]}\\ &\leq(\log_{2}(n)-\lfloor\log_{2}(d^{*})\rfloor+1)\frac{64K\Phi_{X}(\delta+\varepsilon)m^{2}}{\varepsilon\mathbb{E}[X]}\leq\frac{192K\Phi_{X}(\delta+\varepsilon)m^{2}\log(1/p)}{\varepsilon\mathbb{E}[X]}.\end{aligned}

Recalling the assumption that $m\leq K\Phi_{X}(\delta+\varepsilon)$ we obtain,

T_{m}\leq\frac{192K^{3}\Phi^{3}_{X}(\delta+\varepsilon)\log(1/p)}{\varepsilon\mathbb{E}[X]}.

Applying Lemma 5.1 we find that there is $C_{1}>0$ such that,

\Phi_{X}(\delta+\varepsilon)\leq\frac{C_{1}\mathbb{E}[X]}{n^{2}p^{2}}\log(1/p).

By our assumptions, $Cn^{2}p^{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)=O(n^{2}p^{2}\log(1/p))$ . This implies that there are $C_{2},C_{3}>0$ such that for large enough $n_{0}$ we have,

	$\displaystyle T_{m}\leq$	$\displaystyle C_{2}n^{2}p^{2}\log^{4}(1/p)$
	$\displaystyle\leq$	$\displaystyle C_{3}m\log^{5}(1/p).$

For $S_{m}$ recall that there are at most $4m/r$ vertices in $U_{\lfloor\log_{2}(r)\rfloor}(H)$ and further recall that $\Phi_{X}(\delta+\varepsilon)=O(n^{2}p^{2}\log(1/p))$ . Hence,

S_{m}\leq n\frac{4m\log(1/p)}{r}\leq C_{4}m\log(1/p),

where $C_{4}=C_{4}(\varepsilon,\delta,K)>0$ . This implies that for large enough $n_{0}$ we have $S_{m},T_{m}=O(m\log^{5}(1/p))$ . Putting it all together there exists $C_{5}>0$ such that provided $n_{0}$ is large enough we have,

\begin{aligned}|\mathcal{C}_{m}|/\binom{n}{v_{m}}&\leq{(6\log(1/p))}^{v_{m}}\binom{T_{m}+S_{m}}{m}\\ &\leq{(6\log_{2}(1/p))}^{2m}\left(\frac{C_{5}m\log^{5}(1/p)}{m}\right)^{m}\\ &\leq{\log(1/p)}^{O(m)}.\end{aligned}

This proves the lemma. ∎

We now deduce two corollaries and discuss how one would use them in order to obtain an upper bound on the upper tail probability of $X$ . The first corollary will be used when $n^{-1}\ll p\ll n^{-1/2}$ and the second when $n^{-1/2}\ll p\ll 1$ .

Corollary 5.6.

Suppose $\varepsilon,\delta,\eta,C,K$ are positive reals with $\varepsilon<1$ . Furthermore, suppose $n^{-1}\ll p\ll n^{-1/2}$ as $n$ tends to infinity. Then, there exists $n_{0}$ such that the following holds for all $n>n_{0}$ :
Let $m$ be a positive integer with $Cn^{2}p^{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ . Furthermore, let

v_{m}=\max\{|\{v\in V(G):\deg(v)\neq 0\}|:G\in\mathcal{C}_{m}(\varepsilon,\delta,K)\}.

Then,

\log\left(|\mathcal{C}_{m}|\right)\leq-v_{m}\log\left(np^{2}\right)+\eta m\log(n).

Proof.

By Lemma 5.2 and the assumption that $p\gg n^{-1}$ we have the following provided $n_{0}$ is sufficiently large,

\log(|\mathcal{C}_{m}|)\leq\log\binom{n}{v_{m}}+Dm\log\log(1/p)\leq v_{m}\log(en/v_{m})+\eta m\log(n)/3.

Since $v_{m}\leq 2m$ we deduce that

\log(|\mathcal{C}_{m}|)\leq v_{m}\log(n/v_{m})+2\eta m\log(n)/3.

We now split into two cases. First, assume $v_{m}\leq\eta m/3$ . This assumption implies that

v_{m}\log(n/v_{m})\leq\eta m\log(n)/3.

On the other hand, if we assume $v_{m}\geq\eta m/3$ we obtain

v_{m}\log(n/v_{m})\leq v_{m}\log(3n/\eta m)=-v_{m}\log(np^{2})+v_{m}\log(O(1)),

where the equality is due to the assumption that $m=\Omega(n^{2}p^{2})$ . Since $v_{m}\leq 2m$ , in both cases we obtain that, provided $n$ is large enough, the following holds:

\log\left(|\mathcal{C}_{m}|\right)\leq-v_{m}\log\left(np^{2}\right)+\eta m\log(n).\qed

Corollary 5.6 suggests the following ‘plan of attack’, which we implement in Section 6. Assuming $p\ll n^{-1/2}$ , the term $\log(np^{2})$ is negative. Therefore, in order to estimate $|\mathcal{C}_{m}|$ , we estimate the maximum number of vertices that a core with $m$ edges can have, for each $m$ . This estimate and Corollary 5.6 then yield an upper bound on the number of cores with $m$ edges; denote this bound by $\beta_{m}$ . Then we maximize $p^{m}\beta_{m}$ over all $m_{0}\leq m\leq K\Phi_{X}(\delta)$ , where $m_{0}$ is the minimum number of edges in a core graph, and denote this maximum by $\beta^{*}$ . This implies,

\sum\limits_{m=m_{0}}^{K\Phi_{X}(\delta)}p^{m}|\mathcal{C}_{m}|\leq K\Phi_{X}(\delta)\beta^{*}.

We also show that $K\Phi_{X}(\delta)$ is negligible compared to $\beta^{*}$ and therefore we obtain,

\log\left(\sum\limits_{m=m_{0}}^{K\Phi_{X}(\delta)}p^{m}|\mathcal{C}_{m}|\right)\leq(1-\eta)\log(\beta^{*}).

Then Theorem 3.2 will imply

\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1-\varepsilon)\log(\beta^{*}).

Note that $\beta^{*}$ depends on $p$ . We will also show that this upper bound matches the lower bounds that we obtained in Section 4.

Next we present a corollary of Lemma 5.2 which will be used in the range where $p\gg n^{-1/2}$ .

Corollary 5.7.

Suppose $\varepsilon,\delta,C,K$ are positive reals with $\varepsilon<1$ . Furthermore, suppose $n^{-1/2}\ll p\ll 1$ as $n$ tends to infinity. Then, there exists $n_{0}$ such that the following holds for all $n>n_{0}$ :
Let $m$ be a positive integer with $Cn^{2}p^{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ . Then,

|\mathcal{C}_{m}|\leq\left(\frac{1}{p}\right)^{\varepsilon m}.

This is also an immediate corollary of Lemma 5.2. To see this note that in this range of $p$ we have $n\ll m$ and thus $\binom{n}{v_{m}}\leq 2^{n}\leq\log(1/p)^{m}$ . Hence Lemma 5.2 implies

\log|\mathcal{C}_{m}|\leq O(m\log\log(1/p))=o(m\log(1/p)).

As before, $\Phi_{X}(\delta+\varepsilon)$ will be shown to be negligible, and therefore Corollary 5.7 will yield the following provided $p\gg n^{-1/2}$ ,

\log\left(\sum_{m=m_{*}}^{K\Phi_{X}(\delta)}p^{m}|\mathcal{C}_{m}|\right)\leq(1-\varepsilon)m_{*}\log\left(p\right)

where $m_{*}$ is the minimum number of edges in a core graph. By Theorem 3.2, this then implies

\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1-\varepsilon)m_{*}\log(p).

In the next section we will compute this $m_{*}$ and show that the above matches the lower bounds given in Section 4.

6. Upper bounds

As can be seen in previous sections, there is a big difference in the behaviour of the problem depending on the regime in which $p$ lies. As explained in Section 5, in order to obtain quantitative bounds on the upper tail probability we need to bound the entropic term $|\mathcal{C}_{m}|$ when $\frac{\log^{9}(n)}{n}\ll p\ll\frac{1}{\sqrt{n}\log(n)}$ , whereas when $p\gg n^{-1/2}$ we only need to compute the minimum number of edges in a core graph. Actually, the situation is more involved in the sparse regime, where we see a surprising change in the behaviour of the problem. To present the main result of this section let us define a sequence $c_{k}$ for $k\geq 1$ :

c_{k}=\begin{cases}0&\text{for }k=1,\\ \frac{1}{2+\sqrt{\frac{k+1}{k-1}}}&\text{for }k\geq 2.\end{cases}

We note that $c_{k}$ is an increasing sequence and $\lim_{k\rightarrow\infty}c_{k}=1/3$ . This is the sequence promised in our main theorem, Theorem 1.1, and in Section 4. Furthermore, recall the definitions of $m_{k}$ and $m_{*}$ , which were given at the beginning of Section 4. For the convenience of the reader we repeat these definitions here. First, we define $m_{k}=r_{k}\sqrt{(\delta+\varepsilon)\mathbb{E}[X]}$ , where

r_{k}=\begin{cases}2&\text{for }k=0,\\ 2\sqrt{k/(k-1)}&\text{for }k\geq 2.\end{cases}

Second, we define

m_{*}=(\sqrt{r+d^{2}}-d)\sqrt{\mathbb{E}[X]}/2,

where $r=16(\delta+3\varepsilon/2)$ and $d=\sqrt{2}/(1+\varepsilon)$ . We are now ready to state the main theorem of this section which is the following.

Theorem 6.1.

Suppose $k$ is an integer greater than $1$ and suppose that $\varepsilon,\delta$ are positive reals with $\varepsilon$ small enough. Then there exists $n_{0}$ such that for any $n>n_{0}$ the following holds:

(1)

If $\max\left\{\frac{\log^{9}(n)}{n},n^{-1+c_{k-1}}\right\}\leq p\leq n^{-1+c_{k}}$ , then

$\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq{(1-\varepsilon)}\log\left(p^{m_{k}}\binom{n}{m_{k}/{k}}\right).$
(2)

If $n^{-2/3}\leq p\leq{\frac{1}{\sqrt{n}\log(n)}}$ , then

$\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq{(1-\varepsilon)}m_{0}\log\left(p\right).$
(3)

If $n^{-1/2}\ll p\ll 1$ , then

$\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq{(1-\varepsilon)}m_{*}\log(p).$
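To get a feeling for the regimes in item (1), the exponents $-1+c_{k}$ and the coefficients $r_{k}$ can be tabulated numerically. The sketch below uses only the definitions of $c_{k}$ and $r_{k}$ given above; the function names `c` and `r` are hypothetical.

```python
from math import sqrt

def c(k):
    """The sequence c_k: 0 for k = 1 and 1 / (2 + sqrt((k+1)/(k-1))) for k >= 2."""
    return 0.0 if k == 1 else 1.0 / (2.0 + sqrt((k + 1) / (k - 1)))

def r(k):
    """The coefficient r_k = 2 * sqrt(k / (k - 1)) for k >= 2."""
    return 2.0 * sqrt(k / (k - 1))

for k in range(2, 8):
    # Columns: k, lower exponent -1 + c_{k-1}, upper exponent -1 + c_k, r_k.
    print(k, round(-1 + c(k - 1), 4), round(-1 + c(k), 4), round(r(k), 4))
```

Since $c_{k}$ increases to $1/3$ , the upper exponents $-1+c_{k}$ increase towards $-2/3$ , the left endpoint of the range in item (2), while $r_{k}$ decreases towards $2=r_{0}$ .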

The proof naturally splits into three parts, and we prove each part in a separate subsection. Before we split into cases, let us prove a lemma which relates the number of induced copies of $C_{4}$ in a graph to the numbers of edges and vertices of that graph. This lemma will be important for us both in the sparse regime and in the dense regime.

Lemma 6.2.

Suppose $n,m$ are positive integers such that $m>3$ and let $\gamma\geq 0$ be real. Let $G$ be a graph with $n$ vertices and $m$ edges and assume $\delta(G)\geq 2$ . Then,

N_{ind}(C_{4},G)\leq\frac{m(m-n+1)}{4}.

Remark.

The bound from the lemma is sharp when $n=\frac{m}{k}+k$ for some $k\in\mathbb{N}$ as the graph $K_{k,\frac{m}{k}}$ has $n$ vertices and $m$ edges and contains $\binom{k}{2}\binom{\frac{m}{k}}{2}=\frac{m}{4}(m-k-\frac{m}{k}+1)$ induced copies of $C_{4}$ .
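The identity in the remark can be checked directly for a few complete bipartite graphs. This is a small illustrative sketch; the helper name and the chosen part sizes are hypothetical.

```python
from math import comb

def induced_c4_complete_bipartite(a, b):
    """Induced C4 copies in K_{a,b}: choose two vertices from each side."""
    return comb(a, 2) * comb(b, 2)

for k, t in [(2, 2), (2, 5), (3, 4), (5, 7)]:
    n, m = k + t, k * t  # vertices and edges of K_{k,t}, so t plays the role of m/k
    count = induced_c4_complete_bipartite(k, t)
    # Compare 4 * count with m * (m - n + 1) to stay in exact integer arithmetic.
    print(k, t, 4 * count, m * (m - n + 1))
```

In each row the two printed values coincide, which is the equality case of Lemma 6.2.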

Proof of Lemma 6.2.

Since an $n$ -vertex graph with minimum degree at least $2$ has at least $n$ and at most $\binom{n}{2}$ edges, we may assume that $n\leq m\leq\binom{n}{2}$ . Note that

N_{ind}(C_{4},G)=\frac{1}{4}\sum_{e\in E(G)}N_{ind}(e,C_{4},G)\leq\frac{m}{4}\max_{e\in E(G)}N_{ind}(e,C_{4},G).

We only need to prove that $N_{ind}(uv,C_{4},G)\leq m-n+1$ for all $uv\in E(G)$ . Note that every $C_{4}$ containing $uv$ is determined by the ‘parallel’ edge to $uv$ meaning the edge between the two other vertices in that $C_{4}$ . This is because we are considering induced copies. In other words, the number of induced copies containing $uv$ is bounded by the number of edges between $N(u)\setminus\{v\}$ and $N(v)\setminus\{u\}$ . Let $X$ be the set of vertices not in $N(u)\cup N(v)$ and note that $|X|\geq n-\deg(u)-\deg(v)$ . The number of edges between $N(u)\setminus\{v\}$ and $N(v)\setminus\{u\}$ is bounded from above by $m-e(X)-(\deg(u)+\deg(v)-1)$ where $e(X)$ is the number of edges with an endpoint in $X$ . Since $\delta(G)\geq 2$ we may estimate $e(X)$ as follows,

e(X)\geq\frac{1}{2}\sum\limits_{x\in X}\deg(x)\geq|X|.

Thus, the number of edges between $N(u)\setminus\{v\}$ and $N(v)\setminus\{u\}$ is bounded from above by

m-(|X|+\deg(v)+\deg(u)-1)\leq m-n+1

and the lemma follows. ∎
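As an independent check that is not part of the original proof, Lemma 6.2 can be tested exhaustively on all graphs with four or five vertices. The sketch counts unlabeled induced copies (one per vertex 4-subset); the function names are hypothetical.

```python
from itertools import combinations

def induced_c4_count(n, edges):
    """Number of 4-subsets of range(n) whose induced subgraph is a 4-cycle."""
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for quad in combinations(range(n), 4):
        inside = [pq for pq in combinations(quad, 2) if frozenset(pq) in edge_set]
        # Four edges on four vertices form a C4 exactly when every vertex has degree 2.
        if len(inside) == 4 and all(sum(v in pq for pq in inside) == 2 for v in quad):
            count += 1
    return count

def lemma_violations(n):
    """Graphs on n vertices with m > 3 and minimum degree >= 2 that break the bound."""
    pairs = list(combinations(range(n), 2))
    bad = []
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if mask >> i & 1]
        m = len(edges)
        min_deg = min(sum(v in e for e in edges) for v in range(n))
        if m > 3 and min_deg >= 2 and 4 * induced_c4_count(n, edges) > m * (m - n + 1):
            bad.append(edges)
    return bad

# Empty lists mean no counterexample was found among the graphs examined.
print(len(lemma_violations(4)), len(lemma_violations(5)))
```

The search space grows as $2^{\binom{n}{2}}$ , so this approach is only practical for very small $n$ .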

In order to prove Theorem 6.1 we now split into cases, one for each item of the theorem.

6.1. The dense regime

Assume that $n^{-1/2}\ll p\ll 1$ and assume that $\varepsilon$ is any positive real which is small enough as a function of $\delta$ . Corollary 5.7 asserts that in this regime of $p$ the number of core graphs is negligible. Therefore, as explained at the end of Section 5, to bound the upper tail probability using Theorem 3.2 we are only left with showing that $m_{*}$ is a lower bound on the minimum number of edges in a core. This is given in the following lemma.

Lemma 6.3.

Suppose $\varepsilon,\delta$ are positive real numbers with $\varepsilon$ small enough as a function of $\delta$ . Further suppose $p\gg n^{-1/2}$ . Then, there exists $n_{0}$ such that for all $n>n_{0}$

\min\{e(G):G\in\mathcal{C}\}\geq m_{*}.

Before proving the lemma let us recall Observation 5.4. For the convenience of the reader we state it here as well:

Observation 6.4.

Suppose $n,m$ are positive integers with $m\leq\binom{n}{2}$ . Then,

N_{ind}(n,m,K_{1,2}\sqcup K_{1})\leq mn^{2}/8.

Proof of Lemma 6.3.

Suppose $G\in\mathcal{C}_{m}$ is a core graph with $m$ edges. Then $G$ is also a structured seed and thus

N_{ind}(C_{4},G)+N_{ind}(K_{1,2}\sqcup K_{1},G)p^{2}\geq(\delta-\varepsilon)\mathbb{E}[X].

Lemma 6.2 and Observation 6.4 assert that

N_{ind}(C_{4},G)\leq N_{ind}(m,C_{4})\leq m^{2}/4,

and

N_{ind}(K_{1,2}\sqcup K_{1},G)\leq N_{ind}(n,m,K_{1,2}\sqcup K_{1})\leq mn^{2}/8.

Therefore,

m^{2}+mn^{2}p^{2}/2-4(\delta-\varepsilon)\mathbb{E}[X]\geq 0.

Taking $n_{0}$ large enough we have $\mathbb{E}[X]\geq(1-\varepsilon)^{2}n^{4}p^{4}/8$ and thus,

m^{2}+\frac{m\sqrt{2\mathbb{E}[X]}}{(1-\varepsilon)}-4(\delta-\varepsilon)\mathbb{E}[X]\geq 0.

This implies that for large enough $n_{0}$ and small enough $\varepsilon$ we have

m\geq\left(\sqrt{\frac{2}{(1-\varepsilon)^{2}}+16(\delta-\varepsilon)}-\frac{\sqrt{2}}{(1-\varepsilon)}\right)\sqrt{\mathbb{E}[X]}/2=m_{*}.\qed

As explained after Corollary 5.7, this implies the following corollary which is the third item in Theorem 6.1.

Corollary 6.5.

Suppose $\varepsilon,\delta$ are positive real numbers with $\varepsilon$ small enough as a function of $\delta$ . Further suppose $n^{-1/2}\ll p\ll 1$ . Then, for large enough $n$ we have

\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq{(1-\varepsilon)}m_{*}\log(p).

6.2. The sparse regime

Throughout this subsection, assume $\frac{\log^{9}(n)}{n}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ . As mentioned earlier, the behaviour of the problem changes once more between the ranges $\frac{\log^{9}(n)}{n}\leq p\leq n^{-2/3-\varepsilon}$ and $n^{-2/3}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ . Before splitting into these two cases we show a reduction and develop some tools which we use in both cases.

We start with the following fact which plays a major role in the later proofs.

Lemma 6.6.

Suppose $\varepsilon,\delta,s,K$ are positive reals with $\varepsilon<1$ . Furthermore, suppose $\frac{\log(n)}{n}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ . Then, there exist $n_{0}$ and $\xi>0$ such that the following holds for all $n>n_{0}$ :
Suppose $G\in\mathcal{C}_{m}$ where $Cn^{2}p^{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ and $uv\in E(G)$ . Then,

\deg(u)\deg(v)\geq\frac{\xi n^{2}p^{2}}{\log(1/p)}\geq s,

and furthermore, $\deg(u),\deg(v)\geq 2$ .

Proof.

Let $G\in\mathcal{C}_{m}$ be a core graph with $Cn^{2}p^{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ edges and let $uv\in E(G)$ be an edge of $G$ . By the definition of $\mathcal{C}_{m}$ we have the following:

N_{ind}(uv,C_{4},G)+N_{ind}(uv,K_{1,2},G)np^{2}\geq\frac{\varepsilon\mathbb{E}[X]}{2K\Phi_{X}(\delta+\varepsilon)}.

Note that $N_{ind}(uv,K_{1,2},G)\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ . Lemma 5.1 asserts that there exists $D>0$ such that provided $n_{0}$ is large enough we have

\Phi_{X}(\delta+\varepsilon)\leq D\sqrt{\mathbb{E}[X]}\log(1/p).

Hence as $p\ll\frac{1}{\sqrt{n}\log(n)}$ we deduce the following for large enough $n_{0}$ :

N_{ind}(uv,K_{1,2},G)np^{2}\leq KD\sqrt{\mathbb{E}[X]}np^{2}\log(1/p)\ll\sqrt{\mathbb{E}[X]}/\log(1/p),

and

N_{ind}(uv,C_{4},G)\geq\frac{\varepsilon\sqrt{\mathbb{E}[X]}}{2DK\log(1/p)}-N_{ind}(uv,K_{1,2},G)np^{2}\geq\frac{\varepsilon\sqrt{\mathbb{E}[X]}}{4DK\log(1/p)}.

(17)

Note also that $\deg(u)\deg(v)\geq N_{ind}(uv,C_{4},G)$ and thus the assertion follows for small enough $\xi$ ,

\deg(u)\deg(v)\geq\frac{\varepsilon\sqrt{\mathbb{E}[X]}}{4DK\log(1/p)}\geq\frac{\xi n^{2}p^{2}}{\log(1/p)}

and if $\deg(u)=1$ or $\deg(v)=1$ then $N_{ind}(uv,C_{4},G)=0$ , which contradicts (17). ∎

The above lemma will be used many times throughout this subsection, mostly through the fact that when $\frac{\log^{9}(n)}{n}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ the minimum degree of a core graph is at least $2$ . We will omit the reference to this lemma and keep in mind that the minimum degree of all graphs considered is at least $2$ .

The following is a corollary which bounds $v_{m}=\max|\{v\in V(G):G\in\mathcal{C}_{m}\text{ and }\deg(v)>0\}|$ . This will be used afterwards to formalize the discussion after Corollary 5.6.

Lemma 6.7.

Suppose $\varepsilon,\delta,K,\gamma$ are positive reals. Suppose further that $\frac{\log^{9}(n)}{n}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ . Then, there exists an integer $n_{0}$ such that for all integers $n>n_{0}$ the following holds:
Suppose $m$ is an integer with $m_{0}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ . Then,

v_{m}\leq\frac{m}{2}+m^{3/4}.

Proof.

Fix an arbitrary $G\in\mathcal{C}_{m}$ and define $A=\{v\in V(G):\deg(v)<\sqrt{m}/\log^{2}(n)\}$ . We have

2m=\sum_{v\in V(G)}\deg(v)=\sum_{v\in A}\deg(v)+\sum_{v\in A^{c}}\deg(v)\geq|A^{c}|\sqrt{m}/\log^{2}(n).

Thus, $|A^{c}|\leq 2\sqrt{m}\cdot\log^{2}(n)$ .

By Theorem 5.1 we have $\Phi_{X}(\delta+\varepsilon)=O(n^{2}p^{2}\log(1/p))$ and hence,

\left(\frac{\sqrt{m}}{\log^{2}(n)}\right)^{2}\leq\frac{O(n^{2}p^{2}\log(1/p))}{\log^{4}(n)}\ll\frac{n^{2}p^{2}}{\log(1/p)}.

Thus, by Lemma 6.6 for sufficiently large $n_{0}$ no two vertices of $A$ can be connected by an edge. Therefore,

m\geq\sum_{v\in A}\deg(v).

Since $\deg(v)\geq 2$ for all of the vertices of $G$ , we obtain $|A|\leq\frac{m}{2}$ .

Therefore for large enough $n_{0}$ ,

|V(G)|=|A\sqcup A^{c}|=|A|+|A^{c}|\leq\frac{m}{2}+2\sqrt{m}\cdot\log^{2}(n).

Provided $n_{0}$ is large enough, $2\sqrt{m}\log^{2}(n)\leq m^{3/4}$ . Hence, for large enough $n_{0}$ the assertion of the lemma holds. ∎

We now formalize the discussion after Corollary 5.6 in the following proposition.

Proposition 6.8.

Suppose $\delta$ is a positive real and $\varepsilon,\eta$ are sufficiently small positive reals. Furthermore, suppose $\frac{\log^{9}(n)}{n}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ . Then, there exists $n_{0}$ such that for all $n>n_{0}$ we have,

\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1-\eta)\max_{m_{0}\leq m\leq m_{2}}\{m\log(p)-v_{m}\log(np^{2})\}.

To prove this proposition we start with a lemma that reduces the problem to estimating $m\log(p)-v_{m}\log(np^{2})$ only for $m_{0}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ .

Lemma 6.9.

Suppose $\delta$ is a positive real and $\varepsilon,\eta$ are sufficiently small positive reals. Furthermore, suppose $n^{-1}\ll p\ll n^{-1/2}/\log(n)$ . Then, there exist $K,n_{0}>0$ such that for all $n>n_{0}$ we have,

\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1-\eta)\max_{m_{0}\leq m\leq K\Phi_{X}(\delta+\varepsilon)}\left\{m\log(p)-v_{m}\log(np^{2})\right\}.

Proof.

For all $\eta_{1},K^{\prime}>0$ Corollary 5.6 asserts that provided $n_{0}$ is large enough we have the following for all $m_{0}\leq m\leq K^{\prime}\Phi_{X}(\delta+\varepsilon)$ :

\log(p^{m}|\mathcal{C}_{m}|)\leq m\log(p)-v_{m}\log(np^{2})+\eta_{1}m\log(n),

where $v_{m}=\max|\{v\in V(G):G\in\mathcal{C}_{m}\text{ and }\deg(v)>0\}|$ .

Since $n^{-1}\ll p\ll\frac{1}{\sqrt{n}\log(n)}$ , provided $n$ is sufficiently large, we have $v_{m}\leq m/2+m^{3/4}$ for all $m_{0}\leq m\leq K^{\prime}\Phi_{X}(\delta+\varepsilon)$ by Lemma 6.7. Thus,

m\log(p)-v_{m}\log(np^{2})\leq-\Omega\left(m\log(n)\right).

For any positive $\eta_{2}$ we may choose $\eta_{1}$ small enough such that we have the following for sufficiently large $n$ ,

m\log(p)-v_{m}\log(np^{2})+\eta_{1}m\log(n)\leq(1-\eta_{2})(m\log(p)-v_{m}\log(np^{2})).

Therefore, provided $\eta_{2}$ is sufficiently small Theorem 3.2 and Theorem 5.1 then yield that there exist $K,n_{0}>0$ such that for all $n>n_{0}$ :

\log\mathbb{P}(X\geq(1+\delta)\mathbb{E}[X])\leq(1-\eta)\max_{m_{0}\leq m\leq K\Phi_{X}(\delta+\varepsilon)}\{m\log(p)-v_{m}\log(np^{2})\}.\qed

We now use the above lemma to prove Proposition 6.8.

Proof of Proposition 6.8.

By Lemma 6.9 it is sufficient to prove that, provided $\eta^{\prime}$ is sufficiently small, for all $m_{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ we have the following:

m\log(p)-v_{m}\log(np^{2})\leq(1-\eta^{\prime})(m_{2}\log(p)-v_{m_{2}}\log(np^{2})).

Provided $n_{0}$ is large enough Lemma 6.7 gives $v_{m}\leq m/2+m^{3/4}\leq m/2+\eta^{\prime}m$ for all $m_{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ . Therefore,

	$\displaystyle m\log(p)-v_{m}\log(np^{2})$	$\displaystyle\leq m\log(p)-(1/2+\eta^{\prime})m\log\left(np^{2}\right)$
		$\displaystyle\leq(-1/2+\eta^{\prime})m\log(n)\leq(-1/2+\eta^{\prime})m_{2}\log(n).$

Noting that each copy of $K_{m_{2}/2,2}$ in $K_{n}$ is a core graph with $m_{2}$ edges we find that $v_{m_{2}}\geq m_{2}/2$ . Therefore,

m_{2}\log(p)-v_{m_{2}}\log(np^{2})\geq m_{2}\log(p)-\frac{m_{2}}{2}\log(np^{2})=-\frac{m_{2}}{2}\log(n).

These inequalities imply the following for all $m_{2}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ and sufficiently small $\eta^{\prime\prime}$

m\log(p)-v_{m}\log(np^{2})\leq(1-\eta^{\prime\prime})(m_{2}\log(p)-v_{m_{2}}\log(np^{2})).

This establishes the proposition provided $\eta^{\prime\prime}$ is small enough. ∎

Proposition 6.8 and Corollary 5.6 show that to prove Theorem 6.1 when $\frac{\log^{9}(n)}{n}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ we only need to bound $m\log(p)-v_{m}\log(np^{2})$ for $m_{0}\leq m\leq m_{2}$ . Therefore, to prove the first item in Theorem 6.1 it is enough to prove that, when $n^{-1+c_{k-1}}\leq p\leq n^{-1+c_{k}}$ we have

m\log(p)-v_{m}\log(np^{2})\leq(1-\eta)(m_{k}\log(p)-v_{m_{k}}\log(np^{2})),

for all $m_{0}\leq m\leq m_{2}$ . Moreover, to prove the second item in Theorem 6.1 it is enough to prove that, when $n^{-2/3}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ we have

m\log(p)-v_{m}\log(np^{2})\leq(1-\eta)m_{0}\log(p),

for all $m_{0}\leq m\leq m_{2}$ . We now treat the two cases mentioned above separately. First we deal with the denser case where $n^{-2/3}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ .

6.2.1. The denser case in the sparse regime

Assume $n^{-2/3}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ . Proposition 6.8 implies that we need to evaluate the term $m\log(p)-v_{m}\log(np^{2})$ only for $m_{0}\leq m\leq m_{2}$ . To do this we use Lemma 6.2 which implies a connection between the number of induced copies of $C_{4}$ in a graph and the number of edges and vertices in it. We start with a simple corollary of Lemma 6.2.

Corollary 6.10.

Suppose $\varepsilon,\delta,K$ are positive reals. Suppose further that $\frac{\log(n)}{n}\leq p\ll 1$ and $m$ is an integer. Then, for any $G\in\mathcal{C}_{m}$ the number of vertices of $G$ is at most

m-4(\delta-\varepsilon)\mathbb{E}[X]/m+1.

Proof.

Since the minimum degree of $G$ is at least $2$ , we may apply Lemma 6.2 and obtain,

(\delta-\varepsilon)\mathbb{E}[X]\leq N(C_{4},G)\leq N_{ind}(v(G),m,C_{4})\leq\frac{m(m-v(G)+1)}{4}

(18)

By algebraic manipulations we see that $v(G)\leq m-4(\delta-\varepsilon)\mathbb{E}[X]/m+1$ , which is the assertion of the corollary. ∎

Now we obtain the desired bound for $m\log(p)-v_{m}\log(np^{2})$ in this regime of $p$ .

Lemma 6.11.

Suppose that $\varepsilon,\delta,\gamma$ are positive reals with $\varepsilon$ small enough, and $\gamma$ small enough as a function of $\varepsilon$ . Furthermore, suppose $n^{-2/3}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ and $K(\varepsilon,\delta)$ is some constant that may depend on $\varepsilon$ and $\delta$ . Then there exists $n_{0}$ such that for any $n>n_{0}$ and any $m_{0}\leq m\leq m_{2}$ we have,

m\log(p)-v_{m}\log(np^{2})\leq(1-\gamma)m_{0}\log(p).

Proof.

Let $c\in[1/3,1/2]$ be such that $p=n^{-1+c}$ and recall that $m=\Theta(n^{2}p^{2})$ . Applying Corollary 6.10 we obtain the following for any $m_{0}\leq m\leq m_{2}$ and sufficiently large $n_{0}$ :

	$\displaystyle m\log(p)-v_{m}\log(np^{2})$	$\displaystyle\leq(-1+c)m\log(n)+(1-2c)(m-4(\delta-\varepsilon)\mathbb{E}[X]/m+\eta m)\log(n)$
		$\displaystyle\leq-cm\log(n)-4(1-2c)(\delta-\varepsilon)\mathbb{E}[X]\log(n)/m+{\eta m\log(n)}/{3}.$

Recalling that $m_{0}=2\sqrt{(\delta-\varepsilon)\mathbb{E}[X]}$ we may rewrite the above as follows provided $n_{0}$ is large enough:

\displaystyle m\log(p)-v_{m}\log(np^{2})\leq(-cm-(1-2c)m_{0}^{2}/m)\log(n)+\eta m\log(n)/3.

We claim that $g_{c}(m)\coloneqq-cm-(1-2c)m_{0}^{2}/m\leq(-1+c)m_{0}$ for all $m_{0}\leq m\leq m_{2}$ . Indeed, $g_{c}(m_{0})=(-1+c)m_{0}$ and $g^{\prime}_{c}(m)=-c+(1-2c)(m_{0}/m)^{2}\leq 1-3c\leq 0$ for $m\geq m_{0}$ and $c\geq 1/3$ , so $g_{c}$ is nonincreasing on this range. This concludes the proof, as now, provided $\gamma$ is small enough, we have the following for all $m_{0}\leq m\leq m_{2}$ :

	$\displaystyle m\log(p)-v_{m}\log(np^{2})$	$\displaystyle\leq(-1+c)m_{0}\log(n)+\eta m\log(n)/3$
		$\displaystyle=m_{0}\log(p)+\eta m\log(n)/3\leq m_{0}\log(p)+\eta m_{2}\log(n)/3$
		$\displaystyle\leq(1-\gamma)m_{0}\log(p).\qed$
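The monotonicity claim for $g_{c}$ in the proof above can be sanity-checked numerically. The following is only an illustrative sketch: the function names `g` and `claim_holds`, the grid, and the sample value of `m0` are assumptions made for the example and do not come from the paper.

```python
def g(c: float, m: float, m0: float) -> float:
    """g_c(m) = -c*m - (1 - 2c) * m0^2 / m, as in the proof of Lemma 6.11."""
    return -c * m - (1.0 - 2.0 * c) * m0 ** 2 / m


def claim_holds(c: float, m0: float, m_max: float, steps: int = 1000) -> bool:
    """Check g_c(m) <= (-1 + c) * m0 on an evenly spaced grid of m in [m0, m_max]."""
    target = (-1.0 + c) * m0
    tol = 1e-9 * m0  # small absolute slack for floating-point rounding
    for j in range(steps + 1):
        m = m0 + (m_max - m0) * j / steps
        if g(c, m, m0) > target + tol:
            return False
    return True


if __name__ == "__main__":
    m0 = 100.0  # illustrative value only
    for c in (1 / 3, 0.4, 0.45, 0.5, 0.2):
        print(c, claim_holds(c, m0, 3 * m0))
```

The last sample value, $c=0.2<1/3$, is included to show that the inequality can fail outside the range $c\in[1/3,1/2]$ assumed in the lemma.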

This, together with Proposition 6.8, concludes the proof of the second item in Theorem 6.1.

6.2.2. The sparser case in the sparse regime

Assume $\frac{\log^{9}(n)}{n}\leq p\ll n^{-2/3-\varepsilon}$ . We have already seen that the behaviour of the upper tail probability differs greatly depending on the regime in which $p$ lies. In this section we present a surprising change of behaviour in the sparser regime. This phenomenon comes from the fact that the number of core graphs is significant and matters in the evaluation of the upper tail probability. This is completely different from the previous cases, where the number of cores was negligible. This will be explained in more detail later on.

The essence of the proofs in this subsection is still to make use of the connection between the number of induced copies of $C_{4}$ in a core graph and the number of vertices in it. To state the claims in this section it is useful to introduce the following notations.

Notation.

Suppose $G$ is a graph and $k$ is a nonnegative integer. Then we denote the following sets accordingly:

•

Denote by $X_{k}(G)$ the set of all edges of $G$ with an endpoint of degree $k$ and denote its cardinality by $x_{k}(G)$ .
•

Denote by $X_{>k}(G)$ the set of all edges of $G$ both of whose endpoints have degree greater than $k$ ; similarly we denote by $x_{>k}(G)$ the cardinality of $X_{>k}(G)$ .

In most cases we omit ‘ $G$ ’ for brevity.

The following lemma bounds from above the number of induced copies of $C_{4}$ in any core graph with $m$ edges where $p\ll\frac{1}{\sqrt{n}\log(n)}$ .

Lemma 6.12.

Suppose $\varepsilon,\delta,R,K$ are positive reals and $\frac{\log(n)}{n}\leq p\ll\frac{1}{\sqrt{n}\log(n)}$ . Then there exists $n_{0}$ such that for any $n>n_{0}$ the following holds:
Let $G$ be a core graph with $m$ edges. Then

N_{ind}(C_{4},G)\leq\sum^{R}_{i=2}\frac{x_{i}}{i}\binom{i}{2}\left(\sum_{i<j}\frac{x_{j}}{j}+\frac{x_{i}}{2i}\right)+\sum^{R}_{i=2}\frac{x_{i}e(G)(i-1)}{R}+\frac{x_{>R}^{2}}{4},

(19)

and

v(G)\leq\sum_{i=2}^{R}\frac{x_{i}}{i}+\frac{2e(G)}{R}.

(20)

Proof.

Let $G$ be a core graph. First, let us introduce some notation for the proof. For every nonnegative integer $k$ let $V_{k}(G)$ be the set of all vertices of $G$ with degree $k$ and denote its cardinality by $v_{k}(G)$ . Furthermore, let $V_{>k}(G)$ be the set of all vertices of $G$ with degree greater than $k$ and denote its cardinality by $v_{>k}(G)$ .

We now proceed with the proof. Taking $n_{0}$ large enough according to Lemma 6.6 applied with $s=R^{2}+1$ , we are guaranteed that no two vertices of degree at most $R$ are connected by an edge. This implies that $\{X_{i}:i\leq R\}\cup\{X_{>R}\}$ is a partition of the edges of $G$ . Furthermore, $v_{>R}(G)\leq 2e(G)/R$ and for any $i\leq R$ we have $v_{i}=x_{i}/i$ . This follows from a standard double counting argument together with the fact that the set of all vertices of degree at most $R$ is an independent set. We thus obtain (20) as

v(G)=\sum_{i=2}^{R}v_{i}(G)+v_{>R}(G)\leq\sum_{i=2}^{R}\frac{x_{i}}{i}+\frac{2e(G)}{R}.

To prove (19) we note that there are three types of induced copies of $C_{4}$ in core graphs:

(i)

Induced copies of $C_{4}$ with exactly two vertices in $\cup_{i=2}^{R}V_{i}$ .
(ii)

Induced copies of $C_{4}$ with exactly one vertex in $\cup_{i=2}^{R}V_{i}$ .
(iii)

Induced copies of $C_{4}$ with no vertices in $\cup_{i=2}^{R}V_{i}$ i.e. all vertices in $V_{>R}$ .

Let us bound from above the number of induced copies of $C_{4}$ of each type. We claim that the number of induced copies of $C_{4}$ of the first type is at most

\displaystyle\sum^{R}_{i=2}v_{i}\binom{i}{2}\left(\sum_{i<j}v_{j}+\frac{v_{i}}{2}\right)=\sum^{R}_{i=2}\frac{x_{i}}{i}\binom{i}{2}\left(\sum_{i<j}\frac{x_{j}}{j}+\frac{x_{i}}{2i}\right).

Indeed, for every $i$ , there are at most $\binom{v_{i}}{2}\binom{i}{2}$ copies of $C_{4}$ with two vertices in $V_{i}$ and, for all $i<j$ , there are at most $v_{i}v_{j}\binom{i}{2}$ copies of $C_{4}$ with one vertex in $V_{i}$ and one vertex in $V_{j}$ .

We now bound the number of induced copies of $C_{4}$ of the second kind. Note that each such induced copy of $C_{4}$ is determined by choosing its vertex of degree at most $R$ , two of its neighbours and another vertex of degree larger than $R$ . Therefore, the number of induced copies of $C_{4}$ of the second kind is at most

\sum_{i=2}^{R}v_{i}v_{>R}\binom{i}{2}=\sum_{i=2}^{R}\frac{x_{i}v_{>R}(i-1)}{2}\leq\sum_{i=2}^{R}\frac{x_{i}e(G)(i-1)}{R}.

To bound from above the number of induced copies of $C_{4}$ of the third kind we observe that each such induced copy of $C_{4}$ is determined by each of the two perfect matchings in it. Since each induced copy of $C_{4}$ of the third type contains only edges of $X_{>R}$ , there are at most $\frac{1}{2}\binom{x_{>R}}{2}\leq\frac{x_{>R}^{2}}{4}$ such copies. Summing all of these bounds we obtain the assertion of the lemma. ∎
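As an illustration of Lemma 6.12, the bounds (19) and (20) can be evaluated on a small explicit graph and compared with a brute-force count of induced 4-cycles. This is a sketch under stated assumptions: the function names are hypothetical, the count is of unlabeled copies, and the example graph $K_{2,6}$ with $R=3$ is chosen only because it has the two structural properties used in the proof (minimum degree at least $2$ and no edge between two vertices of degree at most $R$); it is not claimed to be a core graph for any particular $n$ and $p$.

```python
from itertools import combinations
from math import comb


def induced_c4_count(n_vertices, edges):
    """Count unlabeled induced 4-cycles by brute force over all 4-subsets."""
    adj = {frozenset(e) for e in edges}
    count = 0
    for quad in combinations(range(n_vertices), 4):
        inside = [pr for pr in combinations(quad, 2) if frozenset(pr) in adj]
        if len(inside) != 4:
            continue
        # Four edges on four vertices form a 4-cycle iff every vertex has degree 2.
        deg = {v: 0 for v in quad}
        for a, b in inside:
            deg[a] += 1
            deg[b] += 1
        if all(d == 2 for d in deg.values()):
            count += 1
    return count


def lemma_6_12_bounds(n_vertices, edges, R):
    """Return the right-hand sides of (19) and (20) for the given graph."""
    deg = [0] * n_vertices
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    e = len(edges)
    x = {i: 0 for i in range(2, R + 1)}  # x[i] = |X_i|
    x_big = 0                            # |X_{>R}|
    for a, b in edges:
        lo, hi = sorted((deg[a], deg[b]))
        if lo < 2 or hi <= R:
            raise ValueError("graph violates the structural assumptions")
        if lo <= R:
            x[lo] += 1
        else:
            x_big += 1
    first = sum(
        x[i] / i * comb(i, 2)
        * (sum(x[j] / j for j in range(i + 1, R + 1)) + x[i] / (2 * i))
        for i in range(2, R + 1)
    )
    second = sum(x[i] * e * (i - 1) / R for i in range(2, R + 1))
    bound_19 = first + second + x_big ** 2 / 4
    bound_20 = sum(x[i] / i for i in range(2, R + 1)) + 2 * e / R
    return bound_19, bound_20


if __name__ == "__main__":
    # K_{2,6}: vertices 0, 1 on one side, vertices 2..7 on the other.
    edges = [(s, t) for s in (0, 1) for t in range(2, 8)]
    print(induced_c4_count(8, edges), lemma_6_12_bounds(8, edges, 3))
```

In this example every induced 4-cycle is of type (i), and the first sum in (19) alone already dominates the true count.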

Our only use of Lemma 6.12 is when $R=\lceil 1/\varepsilon\rceil$ . Therefore, from now on we set $R=\left\lceil 1/\varepsilon\right\rceil$ . We will only consider core graphs $G$ with at most $m_{2}$ edges, thus we may bound from above the right-hand side of (19) by

	$\displaystyle f(x_{1},x_{2},\ldots,x_{R},x_{>R})=$	$\displaystyle\sum^{R}_{i=2}\binom{i}{2}\frac{x_{i}}{i}\left(\sum_{i<j}\frac{x_{j}}{j}+\frac{x_{i}}{2i}\right)+\varepsilon m_{2}\sum^{R}_{i=2}{(i-1)x_{i}}+\frac{x_{>R}^{2}}{4}$
	$\displaystyle=$	$\displaystyle\sum^{R}_{i=2}{(i-1)x_{i}}\left(\sum_{i<j\leq R}\frac{x_{j}}{2j}+\frac{x_{i}}{4i}+\varepsilon m_{2}\right)+\frac{x_{>R}^{2}}{4}.$

In particular, for every core graph $G$ with at most $m_{2}$ edges,

(\delta-\varepsilon)\mathbb{E}[X]\leq N_{ind}(C_{4},G)\leq f(x_{1}(G),x_{2}(G),\ldots,x_{R}(G),x_{>R}(G)).
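The two displayed expressions for $f$ can be checked against each other numerically. The sketch below is illustrative only: the function names are hypothetical, vectors are stored as Python lists `[x_1, ..., x_R, x_{>R}]`, and `eps` and `m2` are passed as free parameters rather than being tied to $R=\lceil 1/\varepsilon\rceil$ and the paper's definition of $m_{2}$.

```python
from math import comb


def f_first_form(x, eps, m2):
    """f written with binomial coefficients (first displayed line)."""
    R = len(x) - 1
    total = 0.0
    for i in range(2, R + 1):
        xi = x[i - 1]
        inner = sum(x[j - 1] / j for j in range(i + 1, R + 1)) + xi / (2 * i)
        total += comb(i, 2) * xi / i * inner
        total += eps * m2 * (i - 1) * xi
    return total + x[R] ** 2 / 4


def f_second_form(x, eps, m2):
    """f after factoring out (i - 1) * x_i (second displayed line)."""
    R = len(x) - 1
    total = 0.0
    for i in range(2, R + 1):
        xi = x[i - 1]
        inner = sum(x[j - 1] / (2 * j) for j in range(i + 1, R + 1))
        total += (i - 1) * xi * (inner + xi / (4 * i) + eps * m2)
    return total + x[R] ** 2 / 4


if __name__ == "__main__":
    x = [0.0, 3.0, 1.5, 2.0, 5.0]  # R = 4, last entry plays the role of x_{>R}
    print(f_first_form(x, 0.25, 10.0), f_second_form(x, 0.25, 10.0))
```

For a vector of the form $\alpha e_{k}$ with $2\leq k\leq R$ both functions reduce to $\alpha^{2}(k-1)/(4k)+\varepsilon\alpha(k-1)m_{2}$, which is the expression used at the end of the proof of Proposition 6.13.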

Recall that in this range of $p$ we have $\log(np^{2})<0$ and note that by Lemma 6.12 for any core graph $G$ the term $e(G)\log(p)-v(G)\log(np^{2})$ can be bounded from above using $x_{i}(G)$ by

\displaystyle\sum_{i=2}^{R}\left(x_{i}(G)\log(p)-x_{i}(G)\log(np^{2})/i\right)+x_{>R}(G)\log(p)-2e(G)\log(np^{2})/R.

Furthermore, for $\varepsilon<\eta/2$ we have,

2e(G)/R\leq 2e(G)\varepsilon\leq\eta e(G).

Therefore, taking $n_{0}$ sufficiently large, we find that

	$\displaystyle\max_{m_{0}\leq m\leq m_{2}}m\log(p)-v_{m}\log(np^{2})\leq$	$\displaystyle\max_{G\in\bigcup_{m=m_{0}}^{m_{2}}\mathcal{C}_{m}}\sum_{i=2}^{R}\left(x_{i}(G)\log(p)-x_{i}(G)\log(np^{2})/i\right)$
		$\displaystyle+x_{>R}(G)\log(p)+\eta m_{2}\log(n).$		(21)

We now restate this bound in a more compact way. To this end we introduce the following notations. For every graph $G$ define the following:

•

$x(G)\in\mathbb{R}^{R+1}$ is the vector defined by $x_{i}(G)$ in its $i$ -th coordinate for $1\leq i\leq R$ and $x_{>R}(G)$ in the last coordinate.
•

$u\in\mathbb{R}^{R+1}$ is defined by $0$ as the first coordinate, $\log(p)-\log(np^{2})/i$ as the $i$ -th coordinate for $2\leq i\leq R$ , and $\log(p)$ in the last coordinate.

Using this notation we may rewrite (21) as follows,

\displaystyle\max_{m_{0}\leq m\leq m_{2}}m\log(p)-v_{m}\log(np^{2})\leq\max_{G\in\cup_{m=m_{0}}^{m_{2}}\mathcal{C}_{m}}\langle x(G),u\rangle+\eta m_{2}\log(n).

(22)

From now on we let $t=(\delta-\varepsilon)\mathbb{E}[X]$ . Since $t\leq f(x(G))$ for every $G\in\bigcup_{m=m_{0}}^{m_{2}}\mathcal{C}_{m}$ ,

\displaystyle\max_{G\in\cup_{m=m_{0}}^{m_{2}}\mathcal{C}_{m}}\langle x(G),u\rangle\leq\max\{\langle x,u\rangle:x\in\mathbb{R}_{\geq 0}^{R+1}\land f(x)\geq t\}.

Note that in the above we dropped the condition that $x$ comes from a graph. This only enlarges the feasible set and thus yields an upper bound on the maximization problem.

Now we are ready to state the main technical proposition in order to bound from above the term $m\log(p)-v_{m}\log(np^{2})$ when $\frac{\log^{9}(n)}{n}\leq p\leq n^{-2/3-\varepsilon}$ .

Proposition 6.13.

Suppose $k>1$ is an integer and suppose that $\eta,\varepsilon,\delta$ are positive reals with $\varepsilon$ small enough as a function of $\eta$ and $k$ . Then, there exists $n_{0}$ such that for any $n>n_{0}$ and any $n^{-1+c_{k-1}}\leq p\leq n^{-1+c_{k}}$ we have

\max\{\langle x,u\rangle:x\in\mathbb{R}_{\geq 0}^{R+1}\land f(x)\geq t\}\leq(1-\eta)m_{k}u_{k}.

This proposition and Proposition 6.8 imply the following corollary. This corollary yields the first item in Theorem 6.1 as explained before.

Corollary 6.14.

Suppose $k>1$ is an integer and suppose that $\eta,\varepsilon,\delta,K$ are positive reals with $\varepsilon$ small as a function of $\eta$ and $k$ . Then, there exists $n_{0}$ such that for any $n>n_{0}$ and any $\max\left\{\frac{\log^{9}(n)}{n},n^{-1+c_{k-1}}\right\}\leq p\leq n^{-1+c_{k}}$ we have the following:

m\log(p)-v_{m}\log(np^{2})\leq(1-\eta)m_{k}u_{k}=(1-\eta)m_{k}(\log(p)-\log(np^{2})/k)

for every $m_{0}\leq m\leq K\Phi_{X}(\delta+\varepsilon)$ .

We now turn to the proof of Proposition 6.13. We show that for $\max\left\{\frac{\log^{9}(n)}{n},n^{-1+c_{k-1}}\right\}\leq p\leq n^{-1+c_{k}}$ the maximum of $\langle x,u\rangle$ is achieved by a vector $x$ of the form $\alpha\cdot e_{k}$ where $e_{k}$ is the $k$ -th element in the standard basis. The strategy of the proof is to think of the vector $x$ as a distribution vector and push its mass towards the ‘center of mass’ while keeping $\langle x,u\rangle$ constant and ensuring that $f$ evaluated on the vector after the transformation is still greater than $t$ . To be more precise, the center of mass will be the $k$ -th coordinate such that $k$ satisfies $\max\left\{\frac{\log^{9}(n)}{n},n^{-1+c_{k-1}}\right\}\leq p\leq n^{-1+c_{k}}$ . This will be done iteratively by showing that if $x$ is a vector achieving the maximum of $\langle x,u\rangle$ and $x_{i}=0$ for all $i\leq j<k$ then we may define a vector $\tilde{x}$ which also achieves this maximum and satisfies $\tilde{x_{i}}=0$ for all $i\leq j+1$ . A similar statement will be shown for the other direction i.e. pushing the mass from the right to the left.

Let us introduce the following notations of pushing mass. Let $2\leq k\leq R$ be some integer (should be thought of as the center of mass) and suppose that $x\in\mathbb{R}^{R+1}$ . Then for any integers $2\leq i<k<j\leq R+1$ define the following vectors,

x^{(i,k)}=x-x_{i}e_{i}+\frac{u_{i}x_{i}}{u_{i+1}}e_{i+1},

x^{(j,k)}=x-x_{j}e_{j}+\frac{u_{j}x_{j}}{u_{j-1}}e_{j-1}.

One should think of these operations as pushing the mass towards the $k$ -th coordinate while staying on the hyperplane defined by $\langle x,u\rangle=s$ for some constant $s$ .
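The mass-pushing operation can be written out concretely. In the sketch below (illustrative only; `u_vector`, `push` and `inner` are hypothetical names, and the target coordinate is passed explicitly instead of being derived from $k$), moving the mass of coordinate `src` to an adjacent coordinate `dst` with the rescaling factor $u_{\mathrm{src}}/u_{\mathrm{dst}}$ leaves $\langle x,u\rangle$ unchanged.

```python
from math import log


def u_vector(n, p, R):
    """u = (0, u_2, ..., u_R, log p) with u_i = log p - log(n p^2) / i."""
    u = [0.0]
    for i in range(2, R + 1):
        u.append(log(p) - log(n * p * p) / i)
    u.append(log(p))
    return u


def inner(a, b):
    """Standard inner product of two equal-length lists."""
    return sum(s * t for s, t in zip(a, b))


def push(x, u, src, dst):
    """Move the mass of coordinate src (1-indexed) to the adjacent coordinate dst,
    rescaled by u_src / u_dst so that <x, u> stays the same."""
    if abs(src - dst) != 1 or dst < 2:
        raise ValueError("dst must be adjacent to src and at least 2")
    y = list(x)
    y[dst - 1] += u[src - 1] * x[src - 1] / u[dst - 1]
    y[src - 1] = 0.0
    return y


if __name__ == "__main__":
    n, R = 10 ** 6, 4
    p = n ** -0.7          # illustrative density with n * p^2 < 1
    u = u_vector(n, p, R)
    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    for src, dst in ((2, 3), (5, 4)):
        y = push(x, u, src, dst)
        print(src, dst, inner(x, u), inner(y, u))
```

Here `push(x, u, i, i + 1)` corresponds to $x^{(i,k)}$ and `push(x, u, j, j - 1)` to $x^{(j,k)}$; since the relevant coordinates of $u$ are negative in this range of $p$, the pushed vector stays nonnegative.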

Proof of Proposition 6.13.

As explained earlier we prove this proposition iteratively. We do so in several claims. First, we show that we may assume that the distribution vector $x$ satisfies $x_{R+1}=0$ . This is given in the following claim.

Claim 6.15.

If $p\leq n^{-2/3-\eta}$ for some positive $\eta$ , then provided $\varepsilon$ is small enough there exists $n_{0}$ such that for any $n>n_{0}$ the following holds:

\max\{\langle x,u\rangle:x\in\mathbb{R}_{\geq 0}^{R+1}\land f(x)\geq t\}=\max\{\langle x,u\rangle:x\in\mathbb{R}_{\geq 0}^{R+1}\land x_{R+1}=0\land f(x)\geq t\}.

Proof.

Let

s=\max\{\langle x,u\rangle:x\in\mathbb{R}_{\geq 0}^{R+1}\land f(x)\geq t\}.

Furthermore, let $x\in\mathbb{R}_{\geq 0}^{R+1}$ be such that $s=\langle x,u\rangle$ and $f(x)\geq t$ . By the definition of $x^{(R+1,k)}$ we have,

\langle x,u\rangle=\langle x^{(R+1,k)},u\rangle

and therefore, $s=\langle x^{(R+1,k)},u\rangle$ . In order to prove the claim it is sufficient to prove $f(x^{(R+1,k)})\geq f(x)$ . To do so let

f_{1}(x_{1},x_{2},\ldots,x_{R+1})=\sum^{R-1}_{i=2}(i-1){x_{i}}\left(\sum_{i<j\leq R-1}\frac{x_{j}}{2j}+\frac{x_{i}}{4i}+\varepsilon m_{2}\right),

which depends only on $x_{1},x_{2},\ldots,x_{R-1}$ , and

f_{2}(x_{1},x_{2},\ldots,x_{R+1})=\sum^{R-1}_{i=2}\frac{(i-1)x_{i}x_{R}}{2R}+(R-1)x_{R}\left(\frac{x_{R}}{4R}+\varepsilon m_{2}\right)+\frac{x_{R+1}^{2}}{4}.

Note that for all $y\in\mathbb{R}^{R+1}$ we have

f(y)=f_{1}(y)+f_{2}(y).

Note also that for all $1\leq i\leq R-1$ we have $x^{(R+1,k)}_{i}=x_{i}$ . Therefore, we obtain that $f(x^{(R+1,k)})-f(x)=f_{2}(x^{(R+1,k)})-f_{2}(x)$ . As $x_{R}^{(R+1,k)}=x_{R}+\frac{u_{R+1}}{u_{R}}x_{R+1}$ and $x^{(R+1,k)}_{R+1}=0$ a straightforward computation gives

	$\displaystyle f(x^{(R+1,k)})-f(x)$	$\displaystyle=\left(\sum^{R}_{i=2}\frac{(i-1)x_{i}}{2R}+\varepsilon m_{2}(R-1)\right)\frac{u_{R+1}}{u_{R}}x_{R+1}$
		$\displaystyle\quad+\left(\frac{(R-1)u^{2}_{R+1}}{Ru^{2}_{R}}-1\right)\frac{x_{R+1}^{2}}{4}.$

Note that $u_{R+1}=\log(p)<0$ and $u_{R}<0$ . The latter holds as $u_{R}=\log(p)-\log(np^{2})/R$ is negative exactly when $p^{R-2}<n$ , and we assume $\varepsilon$ is small enough so that $R\geq 2$ ; then the left-hand side is at most $1$ while the right-hand side tends to infinity. This implies that

\frac{u_{R+1}}{u_{R}}>0.

Therefore,

f(x^{(R+1,k)})-f(x)\geq\left(\frac{(R-1)u^{2}_{R+1}}{Ru^{2}_{R}}-1\right)\frac{x^{2}_{R+1}}{4}.

Proving that the above is nonnegative provided $n_{0}$ is sufficiently large will conclude the proof. Indeed, letting $y=\frac{\log(np^{2})}{R\log(p)}$

\frac{(R-1)u^{2}_{R+1}}{Ru^{2}_{R}}=\frac{(R-1)\log^{2}(p)}{R(\log(p)-\log(np^{2})/R)^{2}}\geq 1

is equivalent to,

-y^{2}+2y-1/R\geq 0,

which is also equivalent to

1-\sqrt{1-1/R}\leq y\leq 1+\sqrt{1-1/R}.

Since $1-\sqrt{1-x}\leq x/2+x^{2}$ and $1+\sqrt{1-x}\geq 2-x$ for all $0\leq x\leq 1$ , it is enough to prove that

1/(2R)+1/R^{2}\leq\frac{\log(np^{2})}{R\log(p)}\leq 2-1/R.

First, as $\log(p)<0$ , the right inequality is equivalent to

\log(np^{2})\geq(2R-1)\log(p).

Equivalently, $\log(n)\geq(2R-3)\log(p)$ , which holds for $\varepsilon$ satisfying $2R-3\geq 0$ as then the left-hand side is positive for $n>1$ and the right-hand side is nonpositive, tending to minus infinity as $p$ tends to zero. Second, as $\log(p)<0$ , the left inequality is equivalent to

(1/2+1/R)\log(p)\geq\log(np^{2}).

This holds if and only if $p\leq n^{-1/(3/2-1/R)}$ . Finally, since $p\leq n^{-2/3-\eta}$ , this is satisfied once $\varepsilon$ is small enough so that $1/R\leq 3/2-1/(2/3+\eta)$ , which implies the claim. ∎

We now continue to the more technical part of the proof which is to push the mass to the $k$ -th coordinate provided that the last coordinate is $0$ , which we may assume due to the above claim.

Claim 6.16.

If $n^{-1+c_{k-1}}\leq p\leq n^{-1+c_{k}}$ then there exists $n_{0}$ such that for any $n>n_{0}$ the following holds:

\max\{\langle x,u\rangle:x\in\mathbb{R}_{\geq 0}^{R+1}\land f(x)\geq t\}=\max\{\alpha u_{k}:f(\alpha e_{k})\geq t\}.

Proof.

Let

s=\max\{\langle x,u\rangle:x\in\mathbb{R}_{\geq 0}^{R+1}\land f(x)\geq t\}.

Furthermore, let $x\in\mathbb{R}_{\geq 0}^{R+1}$ be such that $s=\langle x,u\rangle$ and $f(x)\geq t$ . By Claim 6.15 we may assume $x_{R+1}=0$ . As $x$ satisfies $\langle x,u\rangle=s$ we also have $\langle x^{(i,k)},u\rangle=\langle x^{(j,k)},u\rangle=s$ for any $2\leq i<k<j\leq R$ .

We will start with the push of the mass from the left to the right. To this end we prove iteratively the following: Let $x\in\mathbb{R}_{\geq 0}^{R+1}$ and suppose that $x_{r}=0$ for all $1\leq r<i<k$ . Then $f\left(x^{(i,k)}\right)-f(x)\geq 0$ provided $n_{0}$ is large enough. To see this let

\displaystyle f_{1}(x_{1},x_{2},\ldots,x_{R+1})=\sum^{R}_{\ell=i+2}{(\ell-1)x_{\ell}}\left(\sum_{\ell<j\leq R}\frac{x_{j}}{2j}+\frac{x_{\ell}}{4\ell}+\varepsilon m_{2}\right),

which depends only on $x_{i+2},x_{i+3},\ldots,x_{R}$ , and let

\displaystyle f_{2}(x_{1},x_{2},\ldots,x_{R+1})=\sum^{i+1}_{\ell=i}{(\ell-1)x_{\ell}}\left(\sum_{\ell<j\leq R}\frac{x_{j}}{2j}+\frac{x_{\ell}}{4\ell}+\varepsilon m_{2}\right),

which depends only on $x_{i},x_{i+1},\ldots,x_{R}$ . Note that for all $y$ with $y_{r}=0$ for $r<i$ we have $f(y)=f_{1}(y)+f_{2}(y)$ . This implies that $f(x^{(i,k)})-f(x)=f_{2}(x^{(i,k)})-f_{2}(x)$ . By the definition of $x^{(i,k)}$ we have

\displaystyle f\left(x^{(i,k)}\right)-f(x)=

\displaystyle\left(\sum_{i+1\leq j}\frac{x_{i}x_{j}}{2j}+\varepsilon m_{2}x_{i}\right)\left(\frac{iu_{i}}{u_{i+1}}-(i-1)\right)+\frac{x_{i}^{2}}{4}\left(\frac{iu^{2}_{i}}{(i+1)u^{2}_{i+1}}-\frac{i-1}{i}\right).

Thus, to show that $f\left(x^{(i,k)}\right)-f(x)\geq 0$ it suffices to show the following:

\displaystyle\frac{iu_{i}}{u_{i+1}}-(i-1)\geq 0,

(23)

and

\displaystyle\frac{iu^{2}_{i}}{(i+1)u^{2}_{i+1}}-\frac{i-1}{i}\geq 0.

(24)

As $np^{2}=o(1)$ we have $0>u_{i}>u_{i+1}$ . Therefore, inequality (23) is equivalent to

\displaystyle i\log(p)-\log(np^{2})\leq(i-1)\log(p)-\frac{i-1}{i+1}\log(np^{2}).

By algebraic manipulation the above is equivalent to

\displaystyle\log(p)\leq\frac{2}{i+1}\log(np^{2}).

This always holds in our setting provided $n$ is sufficiently large. Indeed, the above is equivalent to $p^{i-3}\leq n^{2}$ , which holds as $n^{-1}\leq p\ll 1$ . It remains to show inequality (24). We prove this in the following claim.

Claim 6.17.

i^{2}u_{i}^{2}\geq(i+1)(i-1)u^{2}_{i+1}\iff p\geq n^{-1+c_{i}}.

Proof.

The inequality on the left-hand side is equivalent to $iu_{i}\leq\sqrt{(i+1)(i-1)}u_{i+1}$ as $0>u_{i}>u_{i+1}$ . Recall that $u_{\ell}=\log(p)-\log(np^{2})/\ell$ and plug this into the above inequality to obtain

i\log(p)-\log(np^{2})\leq\sqrt{{(i-1)}{(i+1)}}\log(p)-\sqrt{\frac{i-1}{i+1}}\log(np^{2}).

Equivalently,

\left(i-2-\sqrt{(i+1)(i-1)}+2\sqrt{\frac{i-1}{i+1}}\right)\log(p)\leq\left(1-\sqrt{\frac{i-1}{i+1}}\right)\log(n)

Thus we conclude that our assumption is equivalent to the following:

\log(p)\geq\frac{\sqrt{i+1}-\sqrt{i-1}}{(i-2)\sqrt{i+1}-(i-1)^{3/2}}\log(n)=\left(\frac{1}{2+\sqrt{\frac{i+1}{i-1}}}-1\right)\log(n)=(-1+c_{i})\log(n).\qed
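The threshold in Claim 6.17 can be checked numerically. The sketch below is illustrative only (the names `c`, `u_scaled` and `gap` are hypothetical); it uses the closed form $c_{i}=1/(2+\sqrt{(i+1)/(i-1)})$ read off from the last display, and writes $p=n^{a}$ so that $u_{i}/\log(n)=a-(1+2a)/i$, which removes the dependence on $n$.

```python
from math import sqrt


def c(i: int) -> float:
    """c_i = 1 / (2 + sqrt((i + 1) / (i - 1))) for i >= 2."""
    return 1.0 / (2.0 + sqrt((i + 1) / (i - 1)))


def u_scaled(i: int, a: float) -> float:
    """u_i / log(n) when p = n^a, i.e. a - (1 + 2a) / i."""
    return a - (1.0 + 2.0 * a) / i


def gap(i: int, a: float) -> float:
    """(i^2 u_i^2 - (i + 1)(i - 1) u_{i+1}^2) / log(n)^2 for p = n^a."""
    return i ** 2 * u_scaled(i, a) ** 2 - (i + 1) * (i - 1) * u_scaled(i + 1, a) ** 2


if __name__ == "__main__":
    for i in range(2, 8):
        a = -1.0 + c(i)  # exponent of the threshold p = n^(-1 + c_i)
        print(i, round(c(i), 6), gap(i, a - 0.01), gap(i, a), gap(i, a + 0.01))
```

The gap vanishes at $p=n^{-1+c_{i}}$, is positive just above the threshold and negative just below it, in line with the stated equivalence. The values $c_{i}$ increase with $i$ and stay below $1/3$.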

Since we assumed $p\geq n^{-1+c_{k-1}}$ and for all $i<k$ we have $c_{i}\leq c_{k-1}$ we also have $p\geq n^{-1+c_{k-1}}\geq n^{-1+c_{i}}$ . Therefore, for any $x\in\mathbb{R}^{R+1}$ we may iterate this process for $2\leq i\leq k-1$ and obtain that $f\left(x^{(2,k)(3,k)\ldots(k-1,k)}\right)\geq f(x)$ . This concludes the push of the mass of $x$ from the $i$ -th coordinates with $i<k$ to the $k$ -th coordinate.

Now, given what we have shown so far, we prove a similar statement for $x^{(j,k)}$ . More specifically, we prove iteratively the following: Let $x\in\mathbb{R}^{R+1}$ and suppose that $x_{r}=0$ for all $k<j<r\leq R+1$ and $x_{\ell}=0$ for all $\ell<k$ . Then $f\left(x^{(j,k)}\right)-f(x)\geq 0$ provided $n_{0}$ is large enough. To see this let

\displaystyle f_{3}(x_{1},x_{2},\ldots,x_{R+1})=\sum^{j-2}_{i=k}{(i-1)x_{i}}\left(\sum_{i<\ell\leq j-2}\frac{x_{\ell}}{2\ell}+\frac{x_{i}}{4i}+\varepsilon m_{2}\right),

which depends only on $x_{i}$ for $k\leq i\leq j-2$ , and let

\displaystyle f_{4}(x_{1},x_{2},\ldots,x_{R+1})=

\displaystyle\sum^{j-2}_{i=k}{(i-1)x_{i}}\sum_{\ell=j-1}^{j}\frac{x_{\ell}}{2\ell}+\sum^{j}_{i=j-1}{(i-1)x_{i}}\left(\sum_{i<\ell\leq j}\frac{x_{\ell}}{2\ell}+\frac{x_{i}}{4i}+\varepsilon m_{2}\right),

which depends only on $x_{i}$ for $k\leq i\leq j$ .

Note that for all $y$ with $y_{i}=0$ for all $i<k$ or $i\geq j+1$ we have $f(y)=f_{3}(y)+f_{4}(y)$ . Note also that $x^{(j,k)}_{i}=x_{i}$ for all $i\leq j-2$ ; thus, $f(x^{(j,k)})-f(x)=f_{4}(x^{(j,k)})-f_{4}(x)$ . This implies that

\displaystyle f\left(x^{(j,k)}\right)-f(x)=

\displaystyle\sum_{i=k}^{j-1}\frac{(i-1)x_{i}x_{j}}{2}\left(\frac{u_{j}}{u_{j-1}(j-1)}-\frac{1}{j}\right)+\frac{x_{j}^{2}}{4}\left(\frac{(j-2)u_{j}^{2}}{(j-1)u_{j-1}^{2}}-\frac{j-1}{j}\right).

Thus, to show that $f\left(x^{(j,k)}\right)-f(x)\geq 0$ it suffices to show the following:

\frac{u_{j}}{u_{j-1}}\geq\frac{j-1}{j},

and

\displaystyle\frac{u^{2}_{j}}{u^{2}_{j-1}}\geq\frac{(j-1)^{2}}{j(j-2)}.

Note that these two inequalities are respectively equivalent to the following inequalities:

\displaystyle j\log(p)-\log(np^{2})\leq(j-1)\log(p)-\log(np^{2}),

(25)

\displaystyle\frac{u^{2}_{j-1}}{u^{2}_{j}}\leq\frac{j(j-2)}{(j-1)^{2}}.

(26)

Inequality (25) always holds as $p<1$ . Moreover, by Claim 6.17, inequality (26) holds if and only if $p\leq n^{-1+c_{j-1}}$ . This indeed holds as we assume $p\leq n^{-1+c_{k}}$ and we also have $c_{k}\leq c_{j-1}$ for all $j>k$ . This concludes the proof of the claim. ∎

Now we deduce the proposition from the above claims. Assuming $n_{0}$ is large enough and using Claim 6.16 it is sufficient to bound from above

s\coloneqq\max\{\alpha u_{k}:f(\alpha e_{k})\geq t\}.

Let $\alpha\in\mathbb{R}$ be a value attaining this maximum. We have,

	$\displaystyle t\leq f(\alpha e_{k})$	$\displaystyle=\alpha^{2}\frac{k-1}{4k}+\varepsilon\alpha(k-1)m_{2}$
		$\displaystyle=\frac{(\delta+\varepsilon)\mathbb{E}[X]\alpha^{2}}{m_{k}^{2}}+2\varepsilon\alpha(k-1)\sqrt{(\delta+\varepsilon)\mathbb{E}[X]}$
		$\displaystyle=\frac{\delta+\varepsilon}{\delta-\varepsilon}\cdot\left(\left(\frac{\alpha}{m_{k}}\right)^{2}t+4\varepsilon\sqrt{k(k-1)}\left(\frac{\alpha}{m_{k}}\right)t\right).$

Setting $x=\alpha/m_{k}$ and dividing by $t$ , we obtain

x\geq-2\varepsilon\sqrt{k(k-1)}+\sqrt{4\varepsilon^{2}k(k-1)+\frac{\delta-\varepsilon}{\delta+\varepsilon}}\text{ or }x\leq-2\varepsilon\sqrt{k(k-1)}-\sqrt{4\varepsilon^{2}k(k-1)+\frac{\delta-\varepsilon}{\delta+\varepsilon}}.

Hence, for every positive $\eta$ and sufficiently small $\varepsilon$ we obtain

x\geq 1-\eta\text{ or }x\leq-1+\eta.

The second option is not possible as $\alpha$ is nonnegative. Thus we have $\alpha/m_{k}\geq 1-\eta$ and hence, as $u_{k}<0$ ,

s=\alpha u_{k}\leq(1-\eta)m_{k}u_{k}.

This is as claimed. ∎
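The final step of the proof can be illustrated numerically. Dividing the last line of the chain $t\leq f(\alpha e_{k})$ by $t$ and writing $x=\alpha/m_{k}$ gives the quadratic condition $x^{2}+4\varepsilon\sqrt{k(k-1)}\,x\geq(\delta-\varepsilon)/(\delta+\varepsilon)$. The sketch below (illustrative only; `min_ratio` is a hypothetical name) computes the smallest nonnegative $x$ satisfying this condition and shows that it approaches $1$ as $\varepsilon$ tends to zero.

```python
from math import sqrt


def min_ratio(eps: float, delta: float, k: int) -> float:
    """Smallest x >= 0 with x^2 + 4*eps*sqrt(k(k-1))*x >= (delta-eps)/(delta+eps)."""
    b = 2.0 * eps * sqrt(k * (k - 1))
    r = (delta - eps) / (delta + eps)
    # Positive root of x^2 + 2*b*x - r = 0.
    return -b + sqrt(b * b + r)


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001, 0.0001):
        print(eps, min_ratio(eps, 1.0, 3))
```

For fixed $\delta$ and $k$ the printed values increase towards $1$, which is the content of the bound $\alpha/m_{k}\geq 1-\eta$ for sufficiently small $\varepsilon$.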

7. The solution to the variational problem

In this section we solve the naive mean field variational problem (when $\sqrt{\log(n)}/n\ll p\ll n^{-1/2-\varepsilon}$ ) for the upper tail of the number of induced copies of $C_{4}$ in $G_{n,p}$ . Let us recall the definition of the variational problem.

Let $N$ be a positive integer and let $Y=(Y_{1},Y_{2},\ldots,Y_{N})$ be a sequence of independent Bernoulli random variables with mean $p$ . Further, let $X$ be a function from the hypercube to $\mathbb{R}$ . Then the naive mean field variational problem associated to the upper tail of $X(Y)$ is the function $\Psi_{X}\colon\mathbb{R}_{\geq 0}\to\mathbb{R}_{\geq 0}\cup\{\infty\}$ such that for every $\delta\geq 0$ :

\Psi_{X}(\delta)=\inf\left\{\sum_{i\in[N]}I_{p}(z_{i}):\mathbb{E}[X(Z)]\geq(1+\delta)\mathbb{E}[X(Y)]\right\},

where $I_{p}(q)=D_{KL}(\operatorname{Ber}(q)\|\operatorname{Ber}(p))=q\log\left(\frac{q}{p}\right)+(1-q)\log\left(\frac{1-q}{1-p}\right)$ , and the infimum is taken over all $Z=(Z_{1},Z_{2},\ldots,Z_{N})$ sequences of $N$ Bernoulli random variables with means $z_{i}$ respectively.
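To make the cost function concrete, the following sketch evaluates $I_{p}(q)$ with the usual convention $0\log 0=0$. The names `relative_entropy` and `planting_cost` are hypothetical; the second function computes the total cost of the particular choice $z_{i}=1$ on a fixed set of coordinates and $z_{i}=p$ elsewhere, which is one feasible way (an assumption made for illustration) to raise $\mathbb{E}[X(Z)]$.

```python
from math import log


def relative_entropy(q: float, p: float) -> float:
    """I_p(q) = q log(q/p) + (1 - q) log((1 - q)/(1 - p)), with 0 log 0 = 0."""
    if not 0.0 < p < 1.0 or not 0.0 <= q <= 1.0:
        raise ValueError("need 0 < p < 1 and 0 <= q <= 1")
    total = 0.0
    if q > 0.0:
        total += q * log(q / p)
    if q < 1.0:
        total += (1.0 - q) * log((1.0 - q) / (1.0 - p))
    return total


def planting_cost(num_edges: int, p: float) -> float:
    """Total cost when z_i = 1 on num_edges coordinates and z_i = p on the rest."""
    return num_edges * relative_entropy(1.0, p)


if __name__ == "__main__":
    p = 0.01
    print(relative_entropy(p, p), relative_entropy(1.0, p), planting_cost(50, p))
```

Since $I_{p}(p)=0$ and $I_{p}(1)=\log(1/p)$, forcing a set of $e$ coordinates to equal $1$ costs exactly $e\log(1/p)$.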

We will be interested in the case where $N=\binom{n}{2}$ , the random variables $Y_{i}$ correspond to the edges of $G_{n,p}$ , and $X$ is the random variable counting the number of induced copies of $C_{4}$ . The main proposition of this section is the following.

Proposition 7.1.

Suppose $\varepsilon,\delta$ are positive reals. Further, suppose that $\sqrt{\log(n)}/n\ll p\ll n^{-1/2-\varepsilon}$ . Then, for large enough $n$ , we have

(1-\varepsilon)\sqrt{\frac{\delta}{2}}\leq\frac{\Psi_{X}(\delta)}{n^{2}p^{2}\log(1/p)}\leq(1+\varepsilon)\sqrt{\frac{\delta}{2}}.

The upper bound follows immediately from

\Psi_{X}(\delta)\leq\log(1/p)\min\{e(C):\mathbb{E}[X\mid C\subseteq G_{n,p}]\geq(1+\delta)\mathbb{E}[X]\}.

In Section 4 we computed this minimum. It is attained when $C$ is the complete bipartite graph $K_{m_{0},m_{0}}$ where $m_{0}^{2}\approx\sqrt{\frac{\delta}{2}}n^{2}p^{2}$ . For more details the reader is referred to Section 4.

For the lower bound we start with the following reduction.

Lemma 7.2.

Suppose $\varepsilon,\delta$ are positive reals. Furthermore, suppose $p\ll 1$ . Then, for large enough $n$ , we have

\displaystyle\Psi_{X}(\delta)\geq\inf_{\bar{q}}\left\{\sum_{i\in\binom{[n]}{2}}I_{p}(q_{i}):\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]\geq(1+\delta-\varepsilon)\mathbb{E}[X]\text{ and }\forall i\in\binom{[n]}{2}\,q_{i}\geq p\right\}.

To prove the lemma we first fix some notation. Consider the set of all tuples $(x,y,z,w)\in[n]^{4}$ with distinct coordinates, equipped with the equivalence relation $(x,y,z,w)\sim(x^{\prime},y^{\prime},z^{\prime},w^{\prime})$ if and only if $\{xy,yz,zw,wx\}=\{x^{\prime}y^{\prime},y^{\prime}z^{\prime},z^{\prime}w^{\prime},w^{\prime}x^{\prime}\}$ ; i.e. both $(x,y,z,w)$ and $(x^{\prime},y^{\prime},z^{\prime},w^{\prime})$ trace the same 4-cycle in $K_{n}$ . We let $\mathcal{C}_{4}$ be a set of representatives of the equivalence classes, so that every 4-cycle of $K_{n}$ is represented exactly once.

Proof.

Let $\bar{q}\in[0,1]^{\binom{n}{2}}$ be such that $\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]\geq(1+\delta)\mathbb{E}[X]$ . Define $\bar{r}$ by setting $r_{i}=\max\{p,q_{i}\}$ . Since $I_{p}(x)$ is monotone decreasing between $0$ and $p$ , we obtain that

\sum I_{p}(q_{i})\geq\sum I_{p}(r_{i}).

To conclude the proof, we prove that $\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{r}})]\geq(1+\delta-\varepsilon)\mathbb{E}[X]$ . Indeed, provided $n$ is sufficiently large, we have

	$\displaystyle\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{r}})]$	$\displaystyle=\sum_{(x,y,z,w)\in\mathcal{C}_{4}}r_{xy}r_{yz}r_{zw}r_{wx}(1-r_{xz})(1-r_{yw})$
		$\displaystyle\geq\sum_{(x,y,z,w)\in\mathcal{C}_{4}}q_{xy}q_{yz}q_{zw}q_{wx}(1-q_{xz})(1-q_{yw})(1-p)^{2}$
		$\displaystyle\geq(1+\delta)(1-p)^{2}\mathbb{E}[X]\geq(1+\delta-\varepsilon)\mathbb{E}[X].\qed$
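The two inequalities used in this proof can be spot-checked numerically on a small instance. The Python sketch below draws a random $\bar{q}$, clips it to $r_{i}=\max\{p,q_{i}\}$, and returns both costs and both expected induced 4-cycle counts. The parameters $n=7$, $p=0.1$ and all function names are illustrative choices of ours.

```python
import itertools
import math
import random


def kl_bernoulli(q, p):
    """I_p(q), with the convention 0 * log(0) = 0."""
    first = q * math.log(q / p) if q > 0.0 else 0.0
    second = (1.0 - q) * math.log((1.0 - q) / (1.0 - p)) if q < 1.0 else 0.0
    return first + second


def four_cycles(n):
    """Yield every 4-cycle of K_n once, as (x, y, z, w) with edges xy, yz, zw, wx."""
    for a, b, c, d in itertools.combinations(range(n), 4):
        yield (a, b, c, d)
        yield (a, b, d, c)
        yield (a, c, b, d)


def expected_induced_c4(prob, n):
    """E[N_ind(C4, G_{n, prob})]; prob maps frozenset({x, y}) to an edge probability."""
    def w(x, y):
        return prob[frozenset((x, y))]

    return sum(
        w(x, y) * w(y, z) * w(z, t) * w(t, x) * (1 - w(x, z)) * (1 - w(y, t))
        for x, y, z, t in four_cycles(n)
    )


def clipping_experiment(n=7, p=0.1, seed=0):
    """Return (cost of q, cost of r, E_ind under q, E_ind under r) for r = max(p, q)."""
    rng = random.Random(seed)
    edges = [frozenset(e) for e in itertools.combinations(range(n), 2)]
    q = {e: rng.random() for e in edges}
    r = {e: max(p, q[e]) for e in edges}
    cost_q = sum(kl_bernoulli(q[e], p) for e in edges)
    cost_r = sum(kl_bernoulli(r[e], p) for e in edges)
    return cost_q, cost_r, expected_induced_c4(q, n), expected_induced_c4(r, n)
```

The check relies on the termwise bounds $r_{i}\geq q_{i}$ and $1-r_{i}\geq(1-q_{i})(1-p)$, which are exactly what the displayed chain of inequalities uses.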

Next we present a lemma about the structure of $G_{n,\bar{q}}$ when $\sum I_{p}(q_{i})$ is close to the infimum in Lemma 7.2.

Lemma.

Suppose $\varepsilon,\delta$ are positive reals with $\varepsilon$ sufficiently small. Suppose also that $\sqrt{\log(n)}/n\ll p\ll n^{-1/2}$ . Then, for any sequence $\bar{q}$ satisfying:

(1)

$q_{i}=p+u_{i}$ for some $u_{i}\geq 0$ ,
(2)

$\sum_{i\in\binom{[n]}{2}}I_{p}(q_{i})\leq\sqrt{2\delta}n^{2}p^{2}\log(1/p)$ , and
(3)

$\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]\geq(1+\delta-\varepsilon)\mathbb{E}[X]$ ,

we have $\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]\geq(\delta-2\varepsilon)\mathbb{E}[X]$ provided $n$ is large enough.

This lemma is a variant of the methods in the papers of Lubetzky and Zhao [27] and in Bhattacharya, Ganguly, Lubetzky and Zhao [5, Section 5.2]. In these papers the authors use the language of graph limits or ‘graphons’; we recreate their proof in the language of graphs in the Appendix. Further, Lubetzky and Zhao [27] and Bhattacharya, Ganguly, Lubetzky and Zhao [5], analysed the function $I_{p}(p+x)$ and proved several lemmas in the language of graphons. We use these lemmas in the proof of Proposition 7.1, once again in the language of graphs and not graphons. For a proof of the first lemma we refer the reader to [27] and [5]; the second lemma will be proven in the Appendix with some extensions.

Lemma ([27]).

The following holds:

I_{p}(p+x)=\begin{cases}(1+o(1))\frac{x^{2}}{2p},&0\leq x\ll p,\\ (1+o(1))x\log(x/p),&p\ll x\leq 1-p.\end{cases}
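To get a feeling for the two regimes, the following Python sketch compares $I_{p}(p+x)$ with $x^{2}/(2p)$ and with $x\log(x/p)$. The function names and the sample values of $p$ and $x$ are our own choices; the sketch only illustrates the asymptotics numerically and is not part of the proof in [27].

```python
import math


def rate(p: float, x: float) -> float:
    """I_p(p + x) for 0 <= x <= 1 - p, with the convention 0 * log(0) = 0."""
    q = p + x
    first = q * math.log(q / p) if q > 0.0 else 0.0
    second = (1.0 - q) * math.log((1.0 - q) / (1.0 - p)) if q < 1.0 else 0.0
    return first + second


def small_deviation_ratio(p: float, x: float) -> float:
    """I_p(p + x) divided by x^2 / (2p); expected to be close to 1 when x << p."""
    return rate(p, x) / (x * x / (2.0 * p))


def large_deviation_ratio(p: float, x: float) -> float:
    """I_p(p + x) divided by x * log(x / p); expected to approach 1 when x >> p."""
    return rate(p, x) / (x * math.log(x / p))


if __name__ == "__main__":
    print(small_deviation_ratio(1e-3, 1e-6))   # regime x << p
    print(large_deviation_ratio(1e-12, 1e-3))  # regime x >> p
```

In the second regime the convergence is slow: for $p\ll x\ll 1$ one has $I_{p}(p+x)\approx x\log(x/p)-x$, so the ratio differs from $1$ by roughly $1/\log(x/p)$, which is about $0.05$ for the sample values above.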

Lemma ([27]).

Suppose $\varepsilon,\delta,C$ are positive reals and $\sqrt{\log(n)}/n\ll p\ll n^{-1/2}$ . Suppose also that $\bar{u}\in[0,1-p]^{\binom{n}{2}}$ satisfies $\sum_{i\in\binom{[n]}{2}}I_{p}(p+u_{i})\leq Cn^{2}p^{2}\log(1/p)$ . Let $b=b(n)$ be such that $\max\{np^{2},\sqrt{p\log(1/p)}\}\ll b\leq 1-\varepsilon$ . Then there is a constant $D>0$ such that, provided $n$ is large enough, the following holds:

(i)

$\sum_{x\in[n]}\left(\sum_{y\neq x}u_{xy}\right)^{2}\leq Dn^{3}p^{2}b$ and
(ii)

$\sum_{i\in\binom{[n]}{2}}u_{i}\leq Dn^{2}p^{3/2}\sqrt{\log(1/p)}$ .

Now we are ready to prove Proposition 7.1.

Proof of Proposition 7.1.

We wish to prove the following:

\sum_{i\in\binom{[n]}{2}}I_{p}(q_{i})\geq(1-\eta)\sqrt{\frac{\delta}{2}}n^{2}p^{2}\log(1/p)

for sufficiently small $\eta>0$ and for $\bar{q}\in[0,1]^{\binom{n}{2}}$ such that

\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]\geq(1+\delta)\mathbb{E}[X].

We may assume that

\sum_{i\in\binom{[n]}{2}}I_{p}(q_{i})\leq Cn^{2}p^{2}\log(1/p),

for some absolute constant $C>0$ as otherwise there is nothing to prove. By Lemma 7.2 and the lemma stated after it (the one with conditions (1)–(3)), we may assume that $q_{i}=p+u_{i}$ for some $u_{i}\geq 0$ which satisfy

\mathbb{E}[N(C_{4},G_{n,\bar{u}})]\geq\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]\geq(\delta-\varepsilon)\mathbb{E}[X]

where $N(C_{4},G_{n,\bar{u}})$ is the number of copies of $C_{4}$ in $G_{n,\bar{u}}$ . Note that

\displaystyle\mathbb{E}[N(C_{4},G_{n,\bar{u}})]

\displaystyle={\sum_{(x,y,z,w)\in\mathcal{C}_{4}}u_{xy}u_{yz}u_{zw}u_{wx}}\geq(\delta-\varepsilon)\mathbb{E}[X].

In the next three claims we collect some properties of $(x,y,z,w)\in\mathcal{C}_{4}$ with non-negligible contribution to the sum above. Afterwards, we use these properties to show that the set of all 4-cycles with non-negligible contribution to the sum above is a subset of

\mathcal{C}_{4}^{\gamma}\coloneqq\{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}:u_{a_{i}a_{i+1}}>\gamma\text{ for all }i=0,1,2,3\},

where indices are taken modulo $4$ and $\gamma$ is some positive constant depending on $\varepsilon$ . We now briefly explain the proof strategy. First, we show that to have a non-negligible contribution a 4-cycle $(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}$ must contain a $K_{1,2}$ with both edge weights at least $\sqrt{p}/\log(1/p)$ ; for convenience assume that these edges are $a_{0}a_{1},a_{1}a_{2}$ . Afterwards, we show that, among such 4-cycles, the only ones contributing a non-negligible amount must additionally satisfy $u_{a_{2}a_{3}}\geq p^{1-2\varepsilon}$ or $u_{a_{3}a_{0}}\geq p^{1-2\varepsilon}$ ; assume the latter. Then, we prove that a 4-cycle with the above properties contributes a non-negligible amount only if $u_{a_{0}a_{1}},u_{a_{2}a_{3}}\geq\gamma$ (in the complementary case we show $u_{a_{1}a_{2}},u_{a_{3}a_{0}}\geq\gamma$ ). Using the symmetries of the 4-cycle, we may apply this argument again with the roles of the two pairs of opposite edges exchanged. We then obtain that the only 4-cycles contributing a non-negligible amount to the sum are ones with $u_{a_{i}a_{i+1}}\geq\gamma$ for all $i$ . This strategy is illustrated in Figure 1.

Figure 1.

We now execute this plan rigorously.

Claim 7.3.

For all $\eta>0$ the following holds for large enough $n$

\sum_{(x,y,z,w)\in\mathcal{A}}u_{xy}u_{yz}u_{zw}u_{wx}\leq\eta^{2}n^{4}p^{4},

where $\mathcal{A}$ is the set of all tuples $(x,y,z,w)\in\mathcal{C}_{4}$ with

u_{xy},u_{zw}\leq\sqrt{p}/\log(1/p)\quad\text{or}\quad u_{xw},u_{yz}\leq\sqrt{p}/\log(1/p).

Proof.

Suppose otherwise, meaning,

\sum_{(x,y,z,w)\in\mathcal{A}}u_{xy}u_{yz}u_{zw}u_{wx}>\eta^{2}n^{4}p^{4}.

By the definition of $\mathcal{A}$ , every $(x,y,z,w)\in\mathcal{A}$ satisfies

\displaystyle u_{xy},u_{zw}\leq\sqrt{p}/\log(1/p)\quad\text{or}\quad u_{xw},u_{yz}\leq\sqrt{p}/\log(1/p).

This implies that

\eta^{2}n^{4}p^{4}<\sum_{(x,y,z,w)\in\mathcal{A}}u_{xy}u_{yz}u_{zw}u_{wx}\leq\left(\sum_{x,y}u_{xy}\right)^{2}\frac{p}{\log^{2}(1/p)},

and therefore,

\eta n^{2}p^{3/2}\log(1/p)\leq\sum_{x,y}u_{xy}.

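By item (ii) of the lemma above that bounds $\sum_{i}u_{i}$ (see also Lemma A.4 (ii)), we have $\sum_{x,y}u_{xy}=O\left(n^{2}p^{3/2}\sqrt{\log(1/p)}\right)$ . This implies

\eta n^{2}p^{3/2}\log(1/p)\leq O\left(n^{2}p^{3/2}\sqrt{\log(1/p)}\right),

which is impossible for large $n$ , since $p=o(1)$ gives $\log(1/p)\gg\sqrt{\log(1/p)}$ . ∎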

Claim 7.4.

For all $\eta>0$ the following holds for large enough $n$

\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{B}\setminus\mathcal{A}}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}\leq\eta^{2}n^{4}p^{4},

where $\mathcal{B}=\cup_{i=0}^{3}\mathcal{B}_{i}$ , and $\mathcal{B}_{i}$ is the set of all $(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}$ with

u_{a_{i}a_{i+1}},u_{a_{i+1}a_{i+2}}\leq p^{1-2\varepsilon}.

Proof.

Assume towards contradiction that

\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{B}\setminus\mathcal{A}}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}>\eta^{2}n^{4}p^{4}.

We claim that for any $(a_{0},a_{1},a_{2},a_{3})\in\mathcal{B}\setminus\mathcal{A}$ there is $i$ satisfying

u_{a_{i}a_{i+1}},u_{a_{i+1}a_{i+2}}>\sqrt{p}/\log(1/p)\quad\text{and}\quad u_{a_{i+2}a_{i+3}},u_{a_{i+3}a_{i}}\leq p^{1-2\varepsilon},

where indices are taken modulo $4$ . Indeed, let $(a_{0},a_{1},a_{2},a_{3})\in\mathcal{B}\setminus\mathcal{A}$ . Since $(a_{0},a_{1},a_{2},a_{3})\not\in\mathcal{A}$ we have

u_{a_{0}a_{1}}>\sqrt{p}/\log(1/p)\quad\text{or}\quad u_{a_{2}a_{3}}>\sqrt{p}/\log(1/p)

and

u_{a_{0}a_{3}}>\sqrt{p}/\log(1/p)\quad\text{or}\quad u_{a_{1}a_{2}}>\sqrt{p}/\log(1/p).

Without loss of generality, assume that $u_{a_{0}a_{1}},u_{a_{1}a_{2}}>\sqrt{p}/\log(1/p)$ . Note that as $(a_{0},a_{1},a_{2},a_{3})\in\mathcal{B}$ there is $i$ such that $u_{a_{i}a_{i+1}},u_{a_{i+1}a_{i+2}}\leq p^{1-2\varepsilon}$ . For $\varepsilon<1/4$ and $n$ large enough we have $\sqrt{p}/\log(1/p)\geq p^{1-2\varepsilon}$ , implying that $i\neq 0,1,3$ and therefore $i=2$ . We continue by letting $\mathcal{L}_{1}$ be the set of all $\{x,y\}\in\binom{[n]}{2}$ such that $u_{xy}>\sqrt{p}/\log(1/p)$ , and we claim that

\displaystyle\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{B}\setminus\mathcal{A}}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}\leq np^{2(1-2\varepsilon)}\left(\sum_{\{x,y\}\in\mathcal{L}_{1}}u_{xy}\right)^{2}.

This follows from the fact that any $K_{1,2}$ in $K_{n}$ can be extended to a $C_{4}$ in at most $n$ ways. Therefore, we obtain the following:

\sum_{\{x,y\}\in\mathcal{L}_{1}}u_{xy}>\eta n^{3/2}p^{1+2\varepsilon}.

By the asymptotics of $I_{p}(p+x)$ in the regime $x\gg p$ , we have

\sum_{x,y}I_{p}(p+u_{xy})\geq\sum_{\{x,y\}\in\mathcal{L}_{1}}I_{p}(p+u_{xy})=\sum_{\{x,y\}\in\mathcal{L}_{1}}(1+o(1))u_{xy}\log(u_{xy}/p)=\Omega\left(n^{3/2}p^{1+2\varepsilon}\log(1/p)\right).

This is a contradiction to our assumption that $\sum_{x,y}I_{p}(p+u_{xy})=O(n^{2}p^{2}\log(1/p))$ as $p\ll n^{-1/2-\varepsilon}$ and therefore, $n^{3/2}p^{1+2\varepsilon}=\omega(n^{2}p^{2})$ . ∎

Claim 7.5.

For any $\eta>0$ there exists $\gamma>0$ such that the following holds provided $n$ is large enough

\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{D}(\gamma)}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}\leq\eta^{2}n^{4}p^{4},

where $\mathcal{D}(\gamma)=\mathcal{D}_{0}(\gamma)\cup\mathcal{D}_{1}(\gamma)$ , and $\mathcal{D}_{i}(\gamma)$ is the set of all tuples $(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}$ with

u_{a_{i}a_{i+1}}\leq\gamma\quad\text{or}\quad u_{a_{i+2}a_{i+3}}\leq\gamma,

and

u_{a_{i+1}a_{i+2}},u_{a_{i}a_{i+3}}\geq p^{1-2\varepsilon}.

Proof.

Let $\gamma=\left({\eta\varepsilon}/{C}\right)^{2}$ and assume towards contradiction that

\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{D}(\gamma)}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}>\eta^{2}n^{4}p^{4}.

Note that any $(a_{0},a_{1},a_{2},a_{3})\in\mathcal{D}(\gamma)$ satisfies

u_{a_{i}a_{i+1}}\leq\gamma\quad\text{or}\quad u_{a_{i+2}a_{i+3}}\leq\gamma

and

u_{a_{i}a_{i+3}},u_{a_{i+1}a_{i+2}}\geq p^{1-2\varepsilon},

for some $i\in\{0,1\}$ , where indices are taken modulo $4$ . Letting $\mathcal{L}_{2}$ be the set of all $\{x,y\}\in\binom{[n]}{2}$ such that $u_{xy}\geq p^{1-2\varepsilon}$ we have the following by the definition of $\mathcal{D}(\gamma)$ :

\displaystyle\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{D}(\gamma)}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}\leq\gamma\left(\sum_{\{x,y\}\in\mathcal{L}_{2}}u_{xy}\right)^{2}.

This implies that

\sum_{\{x,y\}\in\mathcal{L}_{2}}u_{xy}>\eta n^{2}p^{2}/\sqrt{\gamma}.

Therefore, by the definition of $\mathcal{L}_{2}$ and by the asymptotics of $I_{p}(p+x)$ in the regime $x\gg p$ , we have

\displaystyle\sum_{x,y}I_{p}(p+u_{xy})

\displaystyle\geq\sum_{\{x,y\}\in\mathcal{L}_{2}}I_{p}(p+u_{xy})\geq\sum_{\{x,y\}\in\mathcal{L}_{2}}u_{xy}\log(u_{xy}/p)\geq\frac{2\eta\varepsilon}{\sqrt{\gamma}}n^{2}p^{2}\log(1/p).

This is a contradiction to the choice of $\gamma$ as by our assumptions we have $\sum_{x,y}I_{p}(p+u_{xy})\leq Cn^{2}p^{2}\log(1/p)$ . ∎

We now proceed with the proof of Proposition 7.1. By our assumptions and the above three claims we obtain the following for any $\eta>0$ and small enough $\gamma>0$ :

\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}\setminus\left(\mathcal{A}\cup\mathcal{B}\cup\mathcal{D}(\gamma)\right)}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}(1-u_{a_{0}a_{2}})(1-u_{a_{1}a_{3}})\geq\left(\delta-\eta\right)\mathbb{E}[X].

We claim that provided $n$ is large enough we have:

\mathcal{C}_{4}\setminus\left(\mathcal{A}\cup\mathcal{B}\cup\mathcal{D}(\gamma)\right)\subseteq\mathcal{C}_{4}^{\gamma}.

Indeed, as $(a_{0},a_{1},a_{2},a_{3})\not\in\mathcal{A}$ , there exists $i$ such that provided $\varepsilon<1/4$ and $n$ is large enough we have:

u_{a_{i}a_{i+1}},u_{a_{i+1}a_{i+2}}>\sqrt{p}/\log(1/p)>p^{1-2\varepsilon}.

Further, as $(a_{0},a_{1},a_{2},a_{3})\not\in\mathcal{B}_{i+2}$ we have

u_{a_{i+2}a_{i+3}}>p^{1-2\varepsilon}\quad\text{or}\quad u_{a_{i}a_{i+3}}>p^{1-2\varepsilon}.

Without loss of generality, assume the latter holds. In particular we have,

u_{a_{i}a_{i+3}},u_{a_{i+1}a_{i+2}}>p^{1-2\varepsilon}.

Since $(a_{0},a_{1},a_{2},a_{3})\not\in\mathcal{D}_{i}(\gamma)$ , we have

u_{a_{i}a_{i+1}},u_{a_{i+2}a_{i+3}}>\gamma.

Finally, for large enough $n$ we have $p^{1-2\varepsilon}<\gamma$ , so the previous display gives $u_{a_{i}a_{i+1}},u_{a_{i+2}a_{i+3}}>p^{1-2\varepsilon}$ . As $(a_{0},a_{1},a_{2},a_{3})\not\in\mathcal{D}_{i+1}(\gamma)$ , this implies that

u_{a_{i}a_{i+3}},u_{a_{i+1}a_{i+2}}>\gamma.

Letting $\mathcal{L}_{3}$ be the set of all $\{x,y\}\in\binom{[n]}{2}$ with $u_{xy}>\gamma$ , we claim the following holds:

\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}^{\gamma}}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}(1-u_{a_{0}a_{2}})(1-u_{a_{1}a_{3}})\leq\frac{\left(\sum_{\{a_{0},a_{1}\}\in\mathcal{L}_{3}}u_{a_{0}a_{1}}\right)^{2}}{4}.

(27)

Indeed,

\displaystyle 2\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}^{\gamma}}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}(1-u_{a_{0}a_{2}})(1-u_{a_{1}a_{3}})

is at most

\sum u_{a_{0}a_{1}}u_{a_{2}a_{3}}(\underbrace{u_{a_{1}a_{2}}u_{a_{3}a_{0}}(1-u_{a_{0}a_{2}})(1-u_{a_{1}a_{3}})+u_{a_{0}a_{2}}u_{a_{1}a_{3}}(1-u_{a_{1}a_{2}})(1-u_{a_{3}a_{0}})}_{(\star)}),

where the sum ranges over unordered pairs, $\{\{a_{0},a_{1}\},\{a_{2},a_{3}\}\}\subseteq\mathcal{L}_{3}$ , such that $a_{0},a_{1},a_{2},a_{3}$ are all distinct. We also have

(\star)\leq(1-u_{a_{0}a_{2}})+u_{a_{0}a_{2}}=1.

This implies that

(\delta-\eta)\mathbb{E}[X]\leq\sum_{(a_{0},a_{1},a_{2},a_{3})\in\mathcal{C}_{4}^{\gamma}}u_{a_{0}a_{1}}u_{a_{1}a_{2}}u_{a_{2}a_{3}}u_{a_{3}a_{0}}(1-u_{a_{0}a_{2}})(1-u_{a_{1}a_{3}})\leq\frac{\sum u_{a_{0}a_{1}}u_{a_{2}a_{3}}}{2},

where the summation in the right-hand side is over unordered pairs, $\{\{a_{0},a_{1}\},\{a_{2},a_{3}\}\}\subseteq\mathcal{L}_{3}$ , such that $a_{0},a_{1},a_{2},a_{3}$ are all distinct. This implies (27) as

\frac{\left(\sum_{\{a_{0},a_{1}\}\in\mathcal{L}_{3}}u_{a_{0}a_{1}}\right)^{2}}{2}=\frac{\sum_{\{a_{0},a_{1}\}\in\mathcal{L}_{3}}u^{2}_{a_{0}a_{1}}+2\sum u_{a_{0}a_{1}}u_{a_{2}a_{3}}}{2}\geq{\sum u_{a_{0}a_{1}}u_{a_{2}a_{3}}},

where the unlabeled summations are over unordered pairs $\{\{a_{0},a_{1}\},\{a_{2},a_{3}\}\}\subseteq\mathcal{L}_{3}$ . We conclude that:

2\sqrt{(\delta-\eta)\mathbb{E}[X]}\leq\sum_{\{a_{0},a_{1}\}\in\mathcal{L}_{3}}u_{a_{0}a_{1}}.
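Inequality (27) holds for arbitrary weights $u_{xy}\in[0,1]$, so it can be spot-checked by brute force on small vertex sets. The Python sketch below computes both sides for random weights and for the weights of a complete bipartite graph; the function names, the choice $n=7$ and the thresholds are illustrative assumptions of ours.

```python
import itertools
import math
import random


def four_cycles(n):
    """Yield every 4-cycle of K_n once, as (x, y, z, w) with edges xy, yz, zw, wx."""
    for a, b, c, d in itertools.combinations(range(n), 4):
        yield (a, b, c, d)
        yield (a, b, d, c)
        yield (a, c, b, d)


def lhs_27(u, n, gamma):
    """Left-hand side of (27): sum over 4-cycles all of whose edges have weight > gamma."""
    def w(x, y):
        return u[frozenset((x, y))]

    total = 0.0
    for x, y, z, t in four_cycles(n):
        cycle_edges = (w(x, y), w(y, z), w(z, t), w(t, x))
        if min(cycle_edges) > gamma:
            total += math.prod(cycle_edges) * (1 - w(x, z)) * (1 - w(y, t))
    return total


def rhs_27(u, gamma):
    """Right-hand side of (27): (sum of the weights exceeding gamma)^2 / 4."""
    heavy = sum(weight for weight in u.values() if weight > gamma)
    return heavy**2 / 4


def random_weights(n, seed=0):
    """Independent uniform weights in [0, 1) on the edges of K_n."""
    rng = random.Random(seed)
    return {frozenset(e): rng.random() for e in itertools.combinations(range(n), 2)}


def biclique_weights(m):
    """Weights of K_{m,m} on range(2m): 1 across the two sides, 0 inside each side."""
    return {
        frozenset((x, y)): 1.0 if (x < m) != (y < m) else 0.0
        for x, y in itertools.combinations(range(2 * m), 2)
    }
```

For the weights of $K_{m,m}$ the two sides equal $\binom{m}{2}^{2}$ and $m^{4}/4$ respectively, so their ratio is $((m-1)/m)^{2}$ and (27) is asymptotically tight for this configuration.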

Now we can bound the ‘cost’ from below, using the asymptotics of $I_{p}(p+x)$ in the regime $x\gg p$ , as follows:

	$\displaystyle\sum_{\{x,y\}\in\binom{[n]}{2}}I_{p}(p+u_{xy})$	$\displaystyle\geq\sum_{\{x,y\}\in\mathcal{L}_{3}}I_{p}(p+u_{xy})=(1+o(1))\sum_{\{x,y\}\in\mathcal{L}_{3}}u_{xy}\log(u_{xy}/p)$
		$\displaystyle\geq(1+o(1))\sum_{\{x,y\}\in\mathcal{L}_{3}}u_{xy}\log(1/p)\geq(1+o(1))2\sqrt{(\delta-\eta)\mathbb{E}[X]}\log(1/p).$

This finishes the proof of the proposition as $\mathbb{E}[X]=(1+o(1))\frac{n^{4}p^{4}}{8}$ , which gives $2\sqrt{(\delta-\eta)\mathbb{E}[X]}=(1+o(1))\sqrt{\frac{\delta-\eta}{2}}n^{2}p^{2}$ . ∎

References

[1] Noga Alon, On the number of subgraphs of prescribed type of graphs with a given number of edges, Israel J. Math. 38 (1981), no. 1-2, 116–130. MR 599482
[2] Fanny Augeri, Nonlinear large deviation bounds with applications to Wigner matrices and sparse Erdős-Rényi graphs, Ann. Probab. 48 (2020), no. 5, 2404–2448. MR 4152647
[3] Fanny Augeri, A transportation approach to the mean-field approximation, Probability Theory and Related Fields 180 (2021), no. 1, 1–32.
[4] Anirban Basak and Riddhipratim Basu, Upper tail large deviations of regular subgraph counts in Erdős–Rényi graphs in the full localized regime, arXiv preprint arXiv:1912.11410 (2019).
[5] Bhaswar B Bhattacharya, Shirshendu Ganguly, Eyal Lubetzky, and Yufei Zhao, Upper tails and independence polynomials in random graphs, Advances in Mathematics 319 (2017), 313–347.
[6] Béla Bollobás, Random graphs, Combinatorics (Swansea, 1981), London Math. Soc. Lecture Note Ser., vol. 52, Cambridge Univ. Press, Cambridge-New York, 1981, pp. 80–102. MR 633650
[7] Béla Bollobás, Chiê Nara, and Shun-ichi Tachibana, The maximal number of induced complete bipartite graphs, Discrete Math. 62 (1986), no. 3, 271–275. MR 866942
[8] Sourav Chatterjee, The missing log in large deviations for triangle counts, Random Structures & Algorithms 40 (2012), no. 4, 437–451.
[9] Sourav Chatterjee and Amir Dembo, Nonlinear large deviations, Advances in Mathematics 299 (2016), 396–450.
[10] Sourav Chatterjee and SR Srinivasa Varadhan, The large deviation principle for the Erdős–Rényi random graph, European Journal of Combinatorics 32 (2011), no. 7, 1000–1017.
[11] Nicholas Cook and Amir Dembo, Large deviations of subgraph counts for sparse Erdős–Rényi graphs, Advances in Mathematics 373 (2020), 107289.
[12] Nicholas A Cook, Amir Dembo, and Huy Tuan Pham, Regularity method and large deviation principles for the Erdős–Rényi hypergraph, arXiv preprint arXiv:2102.09100 (2021).
[13] Robert DeMarco and Jeff Kahn, Tight upper tail bounds for cliques, Random Structures & Algorithms 41 (2012), no. 4, 469–487.
[14] Robert DeMarco and Jeff Kahn, Upper tails for triangles, Random Structures & Algorithms 40 (2012), no. 4, 452–459. MR 2925307
[15] Amir Dembo and Ofer Zeitouni, Large deviations techniques and applications, Stochastic Modelling and Applied Probability, vol. 38, Springer-Verlag, Berlin, 2010, Corrected reprint of the second (1998) edition. MR 2571413
[16] Ronen Eldan, Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations, Geometric and Functional Analysis 28 (2018), no. 6, 1548–1596.
[17] Paul Erdős, On the number of complete subgraphs contained in certain graphs, Magyar Tud. Akad. Mat. Kutató Int. Közl 7 (1962), no. 3, 459–464.
[18] Ehud Friedgut and Jeff Kahn, On the number of copies of one hypergraph in another, Israel J. Math. 105 (1998), 251–256. MR 1639767
[19] Dániel Gerbner, Dániel T. Nagy, Balázs Patkós, and Máté Vizer, On the maximum number of copies of $H$ in graphs with given size and order, J. Graph Theory 96 (2021), no. 1, 34–43. MR 4191113
[20] Benjamin Gunby, Upper tails of subgraph counts in sparse regular graphs, arXiv preprint arXiv:2010.00658 (2020).
[21] Matan Harel, Frank Mousset, and Wojciech Samotij, Upper tails via high moments and entropic stability, arXiv:1904.08212 (2019).
[22] Svante Janson, Krzysztof Oleszkiewicz, and Andrzej Ruciński, Upper tails for subgraph counts in random graphs, Israel J. Math. 142 (2004), 61–92. MR 2085711
[23] Svante Janson and Andrzej Ruciński, The infamous upper tail, Random Structures & Algorithms 20 (2002), no. 3, 317–342.
[24] Jeong Han Kim and Van H Vu, Divide and conquer martingales and the number of triangles in a random graph, Random Structures & Algorithms 24 (2004), no. 2, 166–174.
[25] Yang P Liu and Yufei Zhao, On the upper tail problem for random hypergraphs, Random Structures & Algorithms 58 (2021), no. 2, 179–220.
[26] Eyal Lubetzky and Yufei Zhao, On replica symmetry of large deviations in random graphs, Random Structures & Algorithms 47 (2015), no. 1, 109–146.
[27] Eyal Lubetzky and Yufei Zhao, On the variational problem for upper tails in sparse random graphs, Random Structures & Algorithms 50 (2017), no. 3, 420–436.
[28] Andrzej Ruciński, When are small subgraphs of a random graph normally distributed?, Probab. Theory Related Fields 78 (1988), no. 1, 1–10. MR 940863
[29] Van H. Vu, A large deviation result on the number of small subgraphs of a random graph, Combin. Probab. Comput. 10 (2001), no. 1, 79–94. MR 1827810

Appendix A Translating graphons’ language into graphs’ language

The aim of this Appendix is to give proofs, in the language of graphs instead of graphons, of two lemmas from Section 7: the lemma with conditions (1)–(3) and the lemma with items (i) and (ii). Both were proven by Lubetzky–Zhao [27] and Bhattacharya, Ganguly, Lubetzky and Zhao [5] in the language of graphons.

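We start with a discussion of how one would prove the lemma with conditions (1)–(3); then we state some auxiliary lemmas and an extension of the lemma with items (i) and (ii), namely Lemma A.4. Since the proof of the former uses the latter, Lemma A.4 is proven first. Recall the lemma with conditions (1)–(3):

Lemma.

Suppose $\varepsilon,\delta$ are positive reals with $\varepsilon$ sufficiently small. Suppose also that $\sqrt{\log(n)}/n\ll p\ll n^{-1/2}$ . Then, for any sequence $\bar{q}$ satisfying:

(1)

$q_{i}=p+u_{i}$ for some $u_{i}\geq 0$ ,
(2)

$\sum_{i\in\binom{[n]}{2}}I_{p}(q_{i})\leq\sqrt{2\delta}n^{2}p^{2}\log(1/p)$ , and
(3)

$\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]\geq(1+\delta-\varepsilon)\mathbb{E}[X]$ ,

we have $\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]\geq(\delta-2\varepsilon)\mathbb{E}[X]$ provided $n$ is large enough.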

We start by analysing the term $\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]$ . Suppose $\bar{q}\in[0,1]^{\binom{n}{2}}$ satisfies conditions (1)–(3) of the lemma recalled above. Note that

\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]={\sum_{(x,y,z,w)\in\mathcal{C}_{4}}q_{xy}q_{yz}q_{zw}q_{wx}(1-q_{xz})(1-q_{yw})}.

Since $q_{i}=p+u_{i}\geq u_{i}$ we have the following for large enough $n$ :

	$\displaystyle\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]$	$\displaystyle\leq{\sum_{(x,y,z,w)\in\mathcal{C}_{4}}q_{xy}q_{yz}q_{zw}q_{wx}(1-u_{xz})(1-u_{yw})}$
		$\displaystyle={\sum_{(x,y,z,w)\in\mathcal{C}_{4}}(p+u_{xy})(p+u_{yz})(p+u_{zw})(p+u_{wx})(1-u_{xz})(1-u_{yw})}$
		$\displaystyle\leq\mathbb{E}[N(C_{4},G_{n,p})]+\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]+\sum_{\emptyset\not=H\subsetneq^{*}C_{4}}\mathbb{E}[N(H,G_{n,\bar{u}})]p^{4-e_{H}}$
		$\displaystyle=\mathbb{E}[X]/(1-p)^{2}+\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]+\sum_{\emptyset\not=H\subsetneq^{*}C_{4}}\mathbb{E}[N(H,G_{n,\bar{u}})]p^{4-e_{H}}$
		$\displaystyle\leq(1+\varepsilon/2)\mathbb{E}[X]+\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]+\sum_{\emptyset\not=H\subsetneq^{*}C_{4}}\mathbb{E}[N(H,G_{n,\bar{u}})]p^{4-e_{H}},$

where $\subsetneq^{*}$ stands for a spanning subgraph which is not equal to the host graph. Recalling that $\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{q}})]-\mathbb{E}[X]\geq(\delta-\varepsilon)\mathbb{E}[X]$ , we obtain,

	$\displaystyle(\delta-3\varepsilon/2)\mathbb{E}[X]\leq$	$\displaystyle\,\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]+\mathbb{E}[N(P_{4},G_{n,\bar{u}})]p+\mathbb{E}[N(M_{2},G_{n,\bar{u}})]p^{2}$
		$\displaystyle\,+\mathbb{E}[N(K_{1,2}\sqcup K_{1},G_{n,\bar{u}})]p^{2}+\mathbb{E}[N(K_{2}\sqcup 2K_{1},G_{n,\bar{u}})]p^{3}$
	$\displaystyle\leq$	$\displaystyle\,\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]+\mathbb{E}[N(P_{4},G_{n,\bar{u}})]p+\mathbb{E}[N(M_{2},G_{n,\bar{u}})]p^{2}$
		$\displaystyle\,+\mathbb{E}[N(K_{1,2},G_{n,\bar{u}})]np^{2}+\mathbb{E}[N(K_{2},G_{n,\bar{u}})]n^{2}p^{3},$

where $P_{4}$ is the path with three edges, $M_{2}$ is the matching of size two, $K_{1,2}\sqcup K_{1}$ is the complete bipartite graph with sides of size $1$ and $2$ and an extra isolated vertex and $K_{2}\sqcup 2K_{1}$ is the disjoint union of an edge and an independent set of size two. Therefore, to prove the lemma, it is enough to show that all of the terms in the right-hand side of the above inequality except for $\mathbb{E}[N_{ind}(C_{4},G_{n,\bar{u}})]$ are negligible in comparison to $n^{4}p^{4}$ .

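For this we recall the lemma on the asymptotics of $I_{p}(p+x)$ and cite two useful lemmas from [27] and [5]:

Lemma ([27]).

The following holds:

I_{p}(p+x)=\begin{cases}(1+o(1))\frac{x^{2}}{2p},&0\leq x\ll p,\\ (1+o(1))x\log(x/p),&p\ll x\leq 1-p.\end{cases}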

Lemma A.1 ([27]).

There exists $p_{0}>0$ such that for all $0<p\leq p_{0}$ and $0\leq x\leq b\leq 1-p-1/\log(1/p)$ we have,

I_{p}(p+x)\geq\frac{x^{2}I_{p}(p+b)}{b^{2}}.
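The inequality of Lemma A.1 says that $I_{p}(p+x)/x^{2}$ is at least its value at $x=b$ on the whole interval $(0,b]$. The Python sketch below spot-checks this on a grid for one fixed value of $p$. Since the threshold $p_{0}$ is not specified above, the choice $p=10^{-4}$ is an assumption of ours, and a grid check is of course no substitute for the proof in [27]; the function names are hypothetical.

```python
import math


def rate(p: float, x: float) -> float:
    """I_p(p + x) for 0 <= x <= 1 - p, with the convention 0 * log(0) = 0."""
    q = p + x
    first = q * math.log(q / p) if q > 0.0 else 0.0
    second = (1.0 - q) * math.log((1.0 - q) / (1.0 - p)) if q < 1.0 else 0.0
    return first + second


def lemma_a1_violations(p: float, steps: int = 50, rel_tol: float = 1e-9):
    """Grid points (x, b) at which I_p(p+x) >= x^2 I_p(p+b) / b^2 fails numerically.

    b ranges over (0, 1 - p - 1/log(1/p)] and x over [0, b]; a small relative
    tolerance absorbs floating-point rounding at x = b.
    """
    b_max = 1.0 - p - 1.0 / math.log(1.0 / p)
    bad = []
    for j in range(1, steps + 1):
        b = b_max * j / steps
        rate_b = rate(p, b)
        for k in range(steps + 1):
            x = b * k / steps
            if rate(p, x) < (1.0 - rel_tol) * x * x * rate_b / (b * b):
                bad.append((x, b))
    return bad


if __name__ == "__main__":
    # An empty list means no violation was found on the grid.
    print(lemma_a1_violations(1e-4))
```

The restriction $b\leq 1-p-1/\log(1/p)$ matters because $I_{p}(p+x)/x^{2}$ eventually stops decreasing as $p+x$ approaches $1$.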

Corollary A.2 ([27]).

There exists $p_{0}>0$ such that for all $0<p\leq p_{0}$ and all $0\leq x\leq 1-p$ we have,

I_{p}(p+x)\geq x^{2}I_{p}(1-1/\log(1/p))=(1+o(1))x^{2}I_{p}(1)

where the $o(1)$ -term goes to zero as $p\to 0$ .

Another useful tool we use is the generalized Hölder inequality:

Lemma A.3 (Generalized Hölder inequality [5]).

Let $\mu_{1},\mu_{2},\ldots,\mu_{n}$ be probability measures on $\Omega_{1},\Omega_{2},\ldots,\Omega_{n}$ respectively, and let $\mu=\prod_{i=1}^{n}\mu_{i}$ . Let $A_{1},A_{2},\ldots,A_{m}$ be non-empty subsets of $[n]$ and for $A\subseteq[n]$ put $\mu_{A}=\prod_{j\in A}\mu_{j}$ and $\Omega_{A}=\prod_{j\in A}\Omega_{j}$ . Let $f_{i}\in L^{p_{i}}(\Omega_{A_{i}},\mu_{A_{i}})$ for each $i\in[m]$ , and suppose that $\sum_{i:A_{i}\ni j}\frac{1}{p_{i}}\leq 1$ for all $j\in[n]$ . Then,

\int\prod_{i=1}^{m}|f_{i}|d\mu\leq\prod_{i=1}^{m}\left(\int|f_{i}|^{p_{i}}d\mu_{A_{i}}\right)^{1/p_{i}}.

In particular, if every element of $[n]$ is contained in at most two sets $A_{j}$ , then we can take all $p_{i}=2$ and obtain:

\int\prod_{i=1}^{m}f_{i}d\mu\leq\prod_{i=1}^{m}\left(\int|f_{i}|^{2}d\mu_{A_{i}}\right)^{1/2}.

Next, we recreate a proof of Lubetzky–Zhao [27] and Bhattacharya, Ganguly, Lubetzky and Zhao [5] of an extension of the lemma with items (i) and (ii) from Section 7. Then, we will derive the required bounds.

Lemma A.4.

Suppose $\varepsilon,\delta,C$ are positive reals and $\sqrt{\log(n)}/n\ll p\ll n^{-1/2}$ . Suppose also that $\bar{u}\in[0,1-p]^{\binom{n}{2}}$ satisfies $\sum_{i\in\binom{[n]}{2}}I_{p}(p+u_{i})\leq Cn^{2}p^{2}\log(1/p)$ . Let $b=b(n)$ be such that $\max\{np^{2},\sqrt{p\log(1/p)}\}\ll b\leq 1-\varepsilon$ . Then there is a constant $D>0$ such that, provided $n$ is large enough, the following holds:

(i)

$\sum_{x\in[n]}\left(\sum_{y\neq x}u_{xy}\right)^{2}\leq Dn^{3}p^{2}b,$
(ii)

$\sum_{i\in\binom{[n]}{2}}u_{i}\leq Dn^{2}p^{3/2}\sqrt{\log(1/p)}$ , and
(iii)

$\sum_{i\in\binom{[n]}{2}}u_{i}^{2}\leq Dn^{2}p^{2}.$

Proof.

We first show that the set $B=\{x\in[n]:\sum_{y\neq x}u_{xy}\geq bn\}$ is empty, provided $n$ is large enough. If this were not true, then, as $I_{p}(p+x)$ is a convex function of $x$ , we have the following by Jensen’s inequality

	$\displaystyle\sum_{x,y}I_{p}(p+u_{xy})$	$\displaystyle\geq(n-1)\sum_{x}I_{p}\left(p+\frac{\sum_{y\neq x}u_{xy}}{n-1}\right)\geq(n-1)\sum_{x\in B}I_{p}\left(p+\frac{\sum_{y\neq x}u_{xy}}{n-1}\right)$
		$\displaystyle\geq(n-1)\sum_{x\in B}I_{p}(p+b)\geq{(1+o(1))(n-1)|B|b\log(b/p)}$

where the last inequality follows from the asymptotics of $I_{p}(p+x)$ in the regime $x\gg p$ . Since $b\gg\max\{np^{2},\sqrt{p\log(1/p)}\}$ and $\sum_{i\in\binom{[n]}{2}}I_{p}(p+u_{i})\leq Cp^{2}n^{2}\log(1/p)$ we obtain the following:

|B|\leq(1+o(1))\frac{Cp^{2}n^{2}\log(1/p)}{(n-1)b\log(\sqrt{\log(1/p)/p})}\ll 1,

and therefore, $B$ is empty. To prove (i), we use the convexity of $I_{p}(p+x)$ and Lemma A.1,

	$\displaystyle\sum_{x,y}I_{p}(p+u_{xy})$	$\displaystyle\geq(n-1)\sum_{x}I_{p}\left(p+\frac{\sum_{y\neq x}u_{xy}}{n-1}\right)=(n-1)\sum_{x\not\in B}I_{p}\left(p+\frac{\sum_{y\neq x}u_{xy}}{n-1}\right)$
		$\displaystyle\geq(n-1)I_{p}(p+b)\sum_{x\not\in B}\left(\frac{\left(\sum_{y\neq x}u_{xy}\right)^{2}}{b^{2}n^{2}}\right).$

Since, $\sum_{i\in\binom{[n]}{2}}I_{p}(p+u_{i})\leq Cp^{2}n^{2}\log(1/p)$ we obtain:

\sum_{x}\left(\sum_{y\neq x}u_{xy}\right)^{2}\leq\frac{Cp^{2}n^{4}b^{2}\log(1/p)}{(n-1)I_{p}(p+b)}.

As $b\gg\sqrt{p\log(1/p)}\gg p$ we may use the asymptotics of $I_{p}(p+x)$ in the regime $x\gg p$ and we obtain:

\sum_{x\in[n]}\left(\sum_{y\neq x}u_{xy}\right)^{2}\leq\frac{Cn^{4}p^{2}b^{2}\log(1/p)}{(1+o(1))(n-1)b\log(b/p)}\leq 2Cn^{3}p^{2}b.

Now we prove (ii). By convexity we have,

\binom{n}{2}I_{p}\left(p+\frac{\sum_{i\in\binom{[n]}{2}}u_{i}}{\binom{n}{2}}\right)\leq\sum_{x,y}I_{p}(p+u_{xy})\leq Cn^{2}p^{2}\log(1/p).

Since $p\gg\sqrt{p^{3}\log(1/p)}$ we may use the asymptotics of $I_{p}(p+x)$ in the regime $x\ll p$ , and obtain the following for large enough $n$ :

I_{p}\left(p+\sqrt{12Cp^{3}\log(1/p)}\right)\geq\frac{12Cp^{3}\log(1/p)}{3p}=4Cp^{2}\log(1/p).

This implies the following provided $n$ is large enough,

I_{p}\left(p+\frac{\sum_{i\in\binom{[n]}{2}}u_{i}}{\binom{n}{2}}\right)\leq 4Cp^{2}\log(1/p)\leq I_{p}\left(p+\sqrt{12Cp^{3}\log(1/p)}\right).

By the monotonicity of $I_{p}(p+x)$ for $x>0$ we obtain (ii). That is

\sum_{i\in\binom{[n]}{2}}u_{i}\leq\sqrt{3Cn^{4}p^{3}\log(1/p)}.

Lastly, we prove (iii). By Corollary A.2,

\sum_{i\in\binom{[n]}{2}}I_{p}(p+u_{i})\geq(1+o(1))\sum_{i\in\binom{[n]}{2}}u_{i}^{2}\log(1/p).

This and the assumption of the lemma implies (iii). ∎

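Now we can finish the proof of the lemma with conditions (1)–(3), recalled at the beginning of this Appendix, by showing that the remaining terms are negligible.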

Lemma A.5.

Suppose $\delta,C$ are positive reals and $\sqrt{\log(n)}/n\ll p\ll n^{-1/2}$ . Suppose also that $\bar{u}$ is a sequence of $\binom{n}{2}$ reals between $0$ and $1$ such that $\sum_{i\in\binom{[n]}{2}}I_{p}(p+u_{i})\leq Cp^{2}n^{2}\log(1/p)$ . Then,

\mathbb{E}[N(M_{2},G_{n,\bar{u}})]p^{2}=o(n^{4}p^{4}),

\mathbb{E}[N(K_{2},G_{n,\bar{u}})]n^{2}p^{3}=o(n^{4}p^{4}),

\mathbb{E}[N(P_{4},G_{n,\bar{u}})]p=o(n^{4}p^{4}),

\mathbb{E}[N(K_{1,2},G_{n,\bar{u}})]np^{2}=o(n^{4}p^{4}).

Proof.

By item (ii) in Lemma A.4 we have $\mathbb{E}[e(G_{n,\bar{u}})]=O\left(n^{2}p^{3/2}\sqrt{\log(1/p)}\right)=o(n^{2}p)$ , and therefore,

\mathbb{E}[N_{ind}(M_{2},G_{n,\bar{u}})]p^{2}\leq\mathbb{E}[N(M_{2},G_{n,\bar{u}})]p^{2}\leq\mathbb{E}[e(G_{n,\bar{u}})]^{2}p^{2}=o(n^{4}p^{4}),

\mathbb{E}[N_{ind}(K_{2},G_{n,\bar{u}})]n^{2}p^{3}=\mathbb{E}[e(G_{n,\bar{u}})]n^{2}p^{3}=o(n^{4}p^{4}),

where $N(M_{2},G_{n,\bar{u}})$ is the number of $M_{2}$ in $G_{n,\bar{u}}$ . Let $\max\{np^{2},\sqrt{p\log(1/p)}\}\ll b=b(n)\ll 1$ . Note that each $P_{4}\subseteq K_{n}$ with vertices $a_{0},a_{1},a_{2},a_{3}$ can be represented in exactly two ways as a tuple $(a_{0},a_{1},a_{2},a_{3})$ where edges are consecutive vertices in this tuple. Let $\mathcal{P}_{4}$ be a collection of exactly one such representative for each $P_{4}$ in $K_{n}$ . Observe the following:

\displaystyle\mathbb{E}[N(P_{4},G_{n,\bar{u}})]

\displaystyle=\sum_{(x,y,z,w)\in\mathcal{P}_{4}}u_{xy}u_{yz}u_{zw}\leq\sum_{y,z,w\in[n]}\left(\sum_{x\neq y}u_{xy}\right)u_{yz}u_{zw},

where $N(P_{4},G_{n,\bar{u}})$ is the number of $P_{4}$ in $G_{n,\bar{u}}$ . By the generalized Hölder inequality we have:

\sum_{y,z,w\in[n]}\left(\sum_{x\neq y}u_{xy}\right)u_{yz}u_{zw}\leq\left(\sum_{y}\mathbb{E}[\deg(y)]^{2}\right)^{1/2}\left(\sum_{x,y}u_{xy}^{2}\right).

Applying items (i) and (iii) in Lemma A.4 we obtain

\mathbb{E}[N(P_{4},G_{n,\bar{u}})]p\leq D^{3/2}n^{7/2}p^{4}\sqrt{b}=o(n^{4}p^{4}).

This establishes the third assertion of the lemma. The fourth assertion of the lemma follows immediately from Lemma A.4 item (i) and the definition of $b$ :

\mathbb{E}[N(K_{1,2},G_{n,\bar{u}})]\leq\sum_{x,y,z\in[n]}u_{xy}u_{yz}\leq\sum_{y\in[n]}\left(\sum_{x\neq y}u_{xy}\right)^{2}\leq Dn^{3}p^{2}b=o(n^{3}p^{2}).\qed