This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

The strong component structure of the barely subcritical directed configuration model

Matthew Coulson Department of Combinatorics & Optimization, University of Waterloo, Canada. Email: [email protected].
Abstract

We study the behaviour of the largest components of the directed configuration model in the barely subcritical regime. We show that with high probability all strongly connected components in this regime are either cycles or isolated vertices and give an asymptotic distribution of the size of the kkth largest cycle. This gives a configuration model analogue of a result of Łuczak and Seierstad for the binomial random digraph.

1 Introduction

1.1 The directed configuration model

Let 𝐝n=(𝐝n,𝐝n+)=((d1,d1+),,(dn,dn+))\mathbf{d}_{n}=(\mathbf{d}_{n}^{-},\mathbf{d}_{n}^{+})=((d_{1}^{-},d_{1}^{+}),\ldots,(d_{n}^{-},d_{n}^{+})) be a directed degree sequence on nn vertices and let mn=di=di+m_{n}=\sum d_{i}^{-}=\sum d_{i}^{+}. The directed configuration model on 𝐝n\mathbf{d}_{n}, introduced by Cooper and Frieze [3], is the random directed multigraph on [n][n] obtained by associating with vertex ii did_{i}^{-} in-stubs and di+d_{i}^{+} in-stubs, and then choosing a perfect matching of in- and out- stubs uniformly at random. We will denote this model by G(𝐝n)G(\mathbf{d}_{n}). This is a directed generalisation of the configuration model of Bollobás [1] which since its introduction has become one of the most widely used random graph models.

A strongly connected component in a digraph is a maximal sub-digraph such that there exists a directed path between each ordered pair of vertices. In this paper we will consider the size of the largest strongly connected component in the barely subcritical regime. Note that these components are much smaller than the connected components which are found in the undirected case. This is due to the fact that the minimal strongly connected components are directed cycles as opposed to the trees in the directed case and so we require one additional edge on the same vertex set. We shall call any strongly connected digraph which is not a cycle complex. Note that any such digraph has strictly more edges than vertices.

Let nk,n_{k,\ell} be the number of copies of (k,)(k,\ell) in 𝐝n\mathbf{d}_{n} and let Δn=max(di,di+)\Delta_{n}=\max(d_{i}^{-},d_{i}^{+}) be the maximum degree of 𝐝n\mathbf{d}_{n}. Also, define the following parameters of the degree distribution. These parameters govern the behaviour of the size of the largest strongly connected component.

Definition 1.1.
Qn:=1mni=1ndidi+1\displaystyle Q_{n}:=\frac{1}{m_{n}}\sum_{i=1}^{n}d_{i}^{-}d_{i}^{+}-1 Rn:=1mni=1ndidi+(di1)\displaystyle R_{n}^{-}:=\frac{1}{m_{n}}\sum_{i=1}^{n}d_{i}^{-}d_{i}^{+}(d_{i}^{-}-1) Rn+:=1mni=1ndidi+(di+1)\displaystyle R_{n}^{+}:=\frac{1}{m_{n}}\sum_{i=1}^{n}d_{i}^{-}d_{i}^{+}(d_{i}^{+}-1)

We shall assume the following conditions which ensure that the degree sequence is suitably “well behaved”.

Condition 1.2.

For each nn, (di,di+)i=1n=((di,di+)(n))i=1n(d_{i}^{-},d_{i}^{+})_{i=1}^{n}=((d_{i}^{-},d_{i}^{+})^{(n)})_{i=1}^{n} is a sequence of ordered pairs of non-negative integers such that i=1ndi=i=1ndi+\sum_{i=1}^{n}d_{i}^{-}=\sum_{i=1}^{n}d_{i}^{+}. Furthermore, (pi,j)i,j=1(p_{i,j})_{i,j=1}^{\infty} is a probability distribution such that for some ε,ζ>0\varepsilon,\zeta>0,

  1. i)

    ni,j/npi,jn_{i,j}/n\to p_{i,j} as nn\to\infty for each i,j0i,j\geq 0,

  2. ii)

    mn/n=μ(n)μ=i,j=1ipi,j=i,j=1jpi,jm_{n}/n=\mu^{(n)}\to\mu=\sum_{i,j=1}^{\infty}ip_{i,j}=\sum_{i,j=1}^{\infty}jp_{i,j},

  3. iii)

    n0,0=0n_{0,0}=0,

  4. iv)

    i=1n0,i+ni,0(1ε)n\sum_{i=1}^{\infty}n_{0,i}+n_{i,0}\leq(1-\varepsilon)n,

  5. v)

    n1,1(1ε)nn_{1,1}\leq(1-\varepsilon)n,

  6. vi)

    Δnn1/6log1/4(n)\Delta_{n}\leq n^{1/6}\log^{-1/4}(n),

  7. vii)

    Rn,Rn+ζR_{n}^{-},R_{n}^{+}\geq\zeta.

It was shown by Cooper and Frieze [3] that Qn=0Q_{n}=0 is the threshold for the existence of a giant strongly connected component under some mild conditions on the degree sequence similar to those observed in 1.2.

For the remainder of the paper, we shall omit the subscript nn for reasons of clarity - for example we write 𝐝=𝐝n\mathbf{d}=\mathbf{d}_{n} etc. Let 𝒞k(𝐝)\mathcal{C}_{k}(\mathbf{d}) be the size of the kkth largest strongly connected component of the directed configuration model with degree sequence 𝐝\mathbf{d}. We shall write 𝒞k\mathcal{C}_{k} for this quantity when the degree sequence is clear.

Our main result is the following,

Theorem 1.3.

Suppose that G(𝐝)G(\mathbf{d}) is a configuration model random digraph and suppose that nQ3(RR+)1nQ^{3}(R^{-}R^{+})^{-1}\to-\infty. With high probability, there are no complex components or cycles of length ω(1/|Q|)\omega(1/|Q|). Furthermore,

(|𝒞k|α|Q|)=1i=0k1ξαii!eξα+o(1).\mathbb{P}\bigg{(}|\mathcal{C}_{k}|\geq\frac{\alpha}{|Q|}\bigg{)}=1-\sum_{i=0}^{k-1}\frac{\xi_{\alpha}^{i}}{i!}e^{-\xi_{\alpha}}+o(1).

where

ξα=αexx𝑑x.\xi_{\alpha}=\int_{\alpha}^{\infty}\frac{e^{-x}}{x}dx.

The condition nQ3(RR+)1nQ^{3}(R^{-}R^{+})^{-1}\to-\infty is likely best possible as the analogous condition in the undirected case discriminates the barely subcritical case from the critical window. Moreover, substituting Q,RQ,R^{-} and R+R^{+} obtained from a typical degree sequence of D(n,p)D(n,p) also enters the critical window when we have nQ3(RR+)1cnQ^{3}(R^{-}R^{+})^{-1}\to c for some constant cc.

In prior work, Łuczak and Seierstad [17] considered the model D(n,p)D(n,p) which is a random digraph model formed by including each possible arc with probability pp independently. They showed an analogous result to Theorem 1.3 for p=(1ε)/np=(1-\varepsilon)/n with ε=o(1)\varepsilon=o(1) and ε3n\varepsilon^{3}n\to\infty in this model.

1.2 Previous work

The study of the giant component in random graph models was initiated by the seminal paper of Erdős and Rényi [7] regarding the giant component in G(n,p)G(n,p). Since then the appearance of a giant component in various models has remained an active topic of study in the area.

When working with directed graphs, there are a number of types of connected component which are of interest. In this paper we concern ourselves with the strongly connected components. The study of strongly connected components in random digraphs began in the model D(n,p)D(n,p) where we include each possible edge with probability pp independently. Karp [14] and Łuczak [16] independently showed that when p=c/np=c/n, then D(n,p)D(n,p) has all strongly connected components of size O(1)O(1) if c<1c<1. If instead c>1c>1 they showed that there exists a unique strongly connected component of linear order with all other components of size O(1)O(1). The case p=(1+ε)/np=(1+\varepsilon)/n with ε=o(1)\varepsilon=o(1) and |ε|3n|\varepsilon|^{3}n\to\infty was studied by Łuczak and Seierstad [17]. They showed that if ε3n\varepsilon^{3}n\to\infty, there is a unique strongly connected component of size 4ε2n4\varepsilon^{2}n and other components all of size O(1/ε)O(1/\varepsilon). When ε3n\varepsilon^{3}n\to-\infty they showed an analogue of Theorem 1.3:

Theorem 1.4.

Let np=1εnp=1-\varepsilon where ε0\varepsilon\to 0 but ε3n\varepsilon^{3}n\to-\infty. Assume α>0\alpha>0 is a constant and let XkX_{k} denote the size of the kkth largest strongly connected component. Then, asymptotically almost surely, D(n,p)D(n,p) contains no complex components and

limn(Xkα/ε)=1i=0k1ξαii!eξα,\lim_{n\to\infty}\mathbb{P}(X_{k}\geq\alpha/\varepsilon)=1-\sum_{i=0}^{k-1}\frac{\xi_{\alpha}^{i}}{i!}e^{-\xi_{\alpha}},

where ξα=αexx𝑑x\xi_{\alpha}=\int_{\alpha}^{\infty}\frac{e^{-x}}{x}dx.

The so-called critical window, when p=(1+λn1/3)/np=(1+\lambda n^{-1/3})/n has also been the subject of some study. In [5] the author showed bounds on the size of the largest strongly connected component in this regime which are akin to bounds obtained by Nachmias and Peres for the undirected case [18]. Moreover, Goldschmidt and Stephenson [10] gave a scaling limit result for the largest strongly connected components in the critical window.

The directed configuration model has also been studied previously. It was first studied by Cooper and Frieze who showed that provided the maximum degree Δ<n1/12/log(n)\Delta<n^{1/12}/\log(n) then if Qn<0Q_{n}<0, there is no all strongly connected components are small and if Qn>0Q_{n}>0 there is a giant strongly connected component of linear size. The assumptions on the degree sequence have subsequently been relaxed, Graf [11] showed Δ<n1/4\Delta<n^{1/4} is enough to draw the same conclusion and Cai and Perarnau [2] improved this further to only require bounded second moments. A scaling limit result has been recently obtained at the exact point of criticality, Qn=0Q_{n}=0 by Donderwinkel and Xie [6] in a very closely related model where vertices’ degrees are sampled from a limiting degree sequence.

1.3 Organization

The remainder of this paper is arranged as follows, in section 2 we prove some auxiliary results on subgraph counting within the directed configuration model. Then, in section 3 we enumerate certain types of strongly connected directed graphs of maximum degree 44. In section 4 we prove Theorem 1.3 which we break down into a few steps: first that there are no long cycles, next that there are no complex components, and finally we compute the probability the kkth largest component has size at least α/|Q|\alpha/|Q| via Poisson approximation. We conclude the paper in section 5 with some open questions and future work.

2 Subgraph Bounds

We calculate probabilities of given subgraphs in the configuration model.

Suppose that G(𝐝)G(\mathbf{d}) is a configuration model random graph and let D=(D,D+)D=(D_{-},D_{+}) be the degree of a uniformly random vertex of G(𝐝)G(\mathbf{d}). We call DD the degree distribution of G(𝐝)G(\mathbf{d}) (or of 𝐝\mathbf{d}). Define

μi,j=𝔼(DiD+j),\displaystyle\mu_{i,j}=\mathbb{E}(D_{-}^{i}D_{+}^{j}), and ρi,j=𝔼(k=0i1(Dk)=0j1(D+)).\displaystyle\rho_{i,j}=\mathbb{E}\bigg{(}\prod_{k=0}^{i-1}(D_{-}-k)\prod_{\ell=0}^{j-1}(D_{+}-\ell)\bigg{)}.

So that the μi,j\mu_{i,j} and ρi,j\rho_{i,j} are the moments and factorial moments of (D,D+)(D_{-},D_{+}) respectively. We also define μ=μ1,0=μ0,1\mu=\mu_{1,0}=\mu_{0,1} which is the average degree. Furthermore, observe that μ1,1=mn(1+Q)\mu_{1,1}=\frac{m}{n}(1+Q) which is a fact we utilise in subsequent sections. We now state a general upper bound on the probability of finding certain subgraphs in the configuration model.

Lemma 2.1.

Let G(𝐝)G(\mathbf{d}) be a configuration model random digraph with degree distribution (D,D+)(D_{-},D_{+}). Suppose further that G(𝐝)G(\mathbf{d}) has nn vertices and mm edges. Let HH be any digraph with hh vertices, kk edges and let hi,jh_{i,j} be the number of vertices of HH with in-degree ii and out degree jj. Then the probability that a uniformly random injective map ϕ:V(H)V(G(𝐝))\phi:V(H)\to V(G(\mathbf{d})) is a homomorphism is bounded above by

p+(H,G(𝐝)):=nh(n)h(m)ki,jρi,jhi,jp^{+}(H,G(\mathbf{d})):=\frac{n^{h}}{(n)_{h}(m)_{k}}\prod_{i,j\in\mathbb{N}}\rho_{i,j}^{h_{i,j}} (1)

Furthermore, the expected number of copies of HH in G(𝐝)G(\mathbf{d}) is bounded above by

n+(H,G(𝐝)):=h!aut(H)(nh)p+(H,G(𝐝)).n^{+}(H,G(\mathbf{d})):=\frac{h!}{\text{aut}(H)}\binom{n}{h}p^{+}(H,G(\mathbf{d})). (2)
Proof.

Note that (2) follows immediately from (1). Thus we shall focus on the proof of (1). Let ψ:V(H)V(G(𝐝))\psi:V(H)\to V(G(\mathbf{d})) be a fixed injective map. Arbitrarily order the edges of HH as E1,E2,,EkE_{1},E_{2},\ldots,E_{k} and for each edge EiE_{i} define an event i:={ψ(Ei)E(G(𝐝))}\mathcal{E}_{i}:=\{\psi(E_{i})\in E(G(\mathbf{d}))\}. Then, ψ\psi is a homomorphism if and only if every event 1,,k\mathcal{E}_{1},\ldots,\mathcal{E}_{k} occurs. To simplify notation, let 1=1\mathcal{F}_{1}=\mathcal{E}_{1} and i=i|j=1i1j\mathcal{F}_{i}=\mathcal{E}_{i}|\cap_{j=1}^{i-1}\mathcal{E}_{j} for i2i\geq 2 and note that

i=1ki=i=1ki.\bigcap_{i=1}^{k}\mathcal{E}_{i}=\bigcap_{i=1}^{k}\mathcal{F}_{i}.

Suppose that Ei=aibiE_{i}=a_{i}b_{i} for some ai,biV(H)a_{i},b_{i}\in V(H). Define si=|{j<i:ai=aj}|s_{i}=|\{j<i:a_{i}=a_{j}\}| and ti=|{j<i:bi=bj}|t_{i}=|\{j<i:b_{i}=b_{j}\}|. That is, sis_{i} and tit_{i} are the number of times that aia_{i} (resp. bib_{i}) has previously appeared as the initial (terminal) vertex of an edge of HH. Also, suppose that ψ(Ei)=aibi\psi(E_{i})=a_{i}^{\prime}b_{i}^{\prime}.

Claim 2.2.
(i)min(d+(ai)si,0)min(d(bi)ti,0)m+1i\mathbb{P}(\mathcal{F}_{i})\leq\frac{\min(d^{+}(a_{i}^{\prime})-s_{i},0)\min(d^{-}(b_{i}^{\prime})-t_{i},0)}{m+1-i}
Proof.

To see this, note that if d+(ai)sid^{+}(a_{i}^{\prime})\leq s_{i}, then (i)=0\mathbb{P}(\mathcal{F}_{i})=0 as there are not enough stubs at aia_{i}^{\prime} to create such a copy of HH. Similarly if d(bi)tid^{-}(b_{i}^{\prime})\leq t_{i}, (i)=0\mathbb{P}(\mathcal{F}_{i})=0.

So, without loss of generality we may assume that (d+(ai)si),(d(bi)ti)>0(d^{+}(a_{i}^{\prime})-s_{i}),(d^{-}(b_{i}^{\prime})-t_{i})>0. Now, by definition of i\mathcal{F}_{i} we have already chosen i1i-1 edges of the configuration model and we have precisely d+(ai)sid^{+}(a_{i}^{\prime})-s_{i} out-stubs remaining at aia_{i}^{\prime} and d(bi)tid^{-}(b_{i}^{\prime})-t_{i} at bib_{i}^{\prime}. Now, consider the random matching on the 2(m+1i)2(m+1-i) remaining stubs. The probability that a specified pair of stubs is matched is (m+1i)1(m+1-i)^{-1}. There are (d+(ai)si)(d(bi)ti)(d^{+}(a_{i}^{\prime})-s_{i})(d^{-}(b_{i}^{\prime})-t_{i}) such pairs which produce aibia_{i}^{\prime}b_{i}^{\prime}. Thus the expected number of edges from aia_{i}^{\prime} to bib_{i}^{\prime} is (d+(ai)si)(d(bi)ti)/(m+1i)(d^{+}(a_{i}^{\prime})-s_{i})(d^{-}(b_{i}^{\prime})-t_{i})/(m+1-i). The claim then follows by Markov’s inequality. ∎

Next we compute the probability that all of the i\mathcal{F}_{i} occur simultaneously. Due to the way in which we defined these events,

(ψ is a homomorphism)=(i=1ki)=i=1k(i).\mathbb{P}(\psi\text{ is a homomorphism})=\mathbb{P}\bigg{(}\bigcap_{i=1}^{k}\mathcal{F}_{i}\bigg{)}=\prod_{i=1}^{k}\mathbb{P}(\mathcal{F}_{i}).

To write down this probability succinctly, we will use the functions

fa,b(x,y):=i=1aj=1b(x+1i)(y+1j)f_{a,b}(x,y):=\prod_{i=1}^{a}\prod_{j=1}^{b}(x+1-i)(y+1-j)

It is a simple computation to check that

(i=1ki)((m)k)1vV(H)fdH(v),dH+(v)(dG(𝐝)(ψ(v)),dG(𝐝)+(ψ(v)))\mathbb{P}\bigg{(}\bigcap_{i=1}^{k}\mathcal{F}_{i}\bigg{)}\leq((m)_{k})^{-1}\prod_{v\in V(H)}f_{d_{H}^{-}(v),d_{H}^{+}(v)}(d_{G(\mathbf{d})}^{-}(\psi(v)),d_{G(\mathbf{d})}^{+}(\psi(v))) (3)

Note that here we were able to remove the min\min function as if there are any negative contributions to the product, then there is also a contribution of value 0 and furthermore, it is impossible for ψ\psi to be a homomorphism in this case so that (3) reduces to 000\leq 0 in this case (which is clearly true).

To complete the proof, we extend the right hand side of equation (3) to allow any function ψ:V(H)V(G(𝐝))\psi:V(H)\to V(G(\mathbf{d})). Choosing an uniformly random function in this way gives that the probability that a uniformly random injective function is a homomorphism is bounded above by

1(n)h(m)kψ:V(H)V(G(𝐝))vV(H)fdH(v),dH+(v)(dG(𝐝)(ψ(v)),dG(𝐝)+(ψ(v)))\displaystyle\;\frac{1}{(n)_{h}(m)_{k}}\sum_{\psi:V(H)\hookrightarrow V(G(\mathbf{d}))}\prod_{v\in V(H)}f_{d_{H}^{-}(v),d_{H}^{+}(v)}(d_{G(\mathbf{d})}^{-}(\psi(v)),d_{G(\mathbf{d})}^{+}(\psi(v))) (4)
\displaystyle\leq 1(n)h(m)kψ:V(H)V(G(𝐝))vV(H)fdH(v),dH+(v)(dG(𝐝)(ψ(v)),dG(𝐝)+(ψ(v)))\displaystyle\;\frac{1}{(n)_{h}(m)_{k}}\sum_{\psi:V(H)\to V(G(\mathbf{d}))}\prod_{v\in V(H)}f_{d_{H}^{-}(v),d_{H}^{+}(v)}(d_{G(\mathbf{d})}^{-}(\psi(v)),d_{G(\mathbf{d})}^{+}(\psi(v))) (5)

In moving from injective functions to arbitrary functions between (4) and (5), we move from an intractable space of functions to a product space which naturally splits over the vertices of HH. This allows us to rewrite (5) as

1(n)h(m)kψ:V(H)V(G(𝐝))vV(H)fdH(v),dH+(v)(dG(𝐝)(ψ(v)),dG(𝐝)+(ψ(v)))\displaystyle\frac{1}{(n)_{h}(m)_{k}}\sum_{\psi:V(H)\to V(G(\mathbf{d}))}\prod_{v\in V(H)}f_{d_{H}^{-}(v),d_{H}^{+}(v)}(d_{G(\mathbf{d})}^{-}(\psi(v)),d_{G(\mathbf{d})}^{+}(\psi(v)))
=\displaystyle= 1(n)h(m)kvV(H)wV(G(𝐝))fdH(v),dH+(v)(dG(𝐝)(w),dG(𝐝)+(w))\displaystyle\frac{1}{(n)_{h}(m)_{k}}\prod_{v\in V(H)}\sum_{w\in V(G(\mathbf{d}))}f_{d_{H}^{-}(v),d_{H}^{+}(v)}(d_{G(\mathbf{d})}^{-}(w),d_{G(\mathbf{d})}^{+}(w))
=\displaystyle= nh(n)h(m)kvV(H)ρdH(v),dH+(v)=nh(n)h(m)ki,jρi,jhi,j,\displaystyle\frac{n^{h}}{(n)_{h}(m)_{k}}\prod_{v\in V(H)}\rho_{d_{H}^{-}(v),d_{H}^{+}(v)}=\frac{n^{h}}{(n)_{h}(m)_{k}}\prod_{i,j\in\mathbb{N}}\rho_{i,j}^{h_{i,j}},

which is the claimed upper bound, p+(H,G(𝐝))p^{+}(H,G(\mathbf{d})). ∎

We also need a lower bound in the case that HH is a cycle.

Lemma 2.3.

Let G(𝐝)G(\mathbf{d}) be a configuration model random digraph whose degree sequence 𝐝\mathbf{d} satisfies 1.2. Suppose further that G(𝐝)G(\mathbf{d}) has nn vertices and mm edges. Let HH be a directed cycle with hh vertices. Then the probability that a uniformly random injective map ϕ:V(H)V(G(𝐝))\phi:V(H)\to V(G(\mathbf{d})) is a homomorphism is bounded from below by

pc(H,G(𝐝)):=(1+Q)h(n)h(12h2Δ2εn)(1hΔ22m),p_{c}^{-}(H,G(\mathbf{d})):=\frac{(1+Q)^{h}}{(n)_{h}}\left(1-\frac{2h^{2}\Delta^{2}}{\varepsilon n}\right)\left(1-\frac{h\Delta^{2}}{2m}\right), (6)

where the ε\varepsilon in (6) comes from 1.2

Proof.

First, let us consider the probability of finding at least one edge from vertex uu of out-degree aa to vertex vv with in-degree bb. Let k\mathcal{H}_{k} be our knowledge of G(𝐝)G(\mathbf{d}) up until now which consists only of a set of kk edges which are present. k\mathcal{H}_{k} satisfies that we have not observed any edges that affect either the out-degree of uu or the in-degree of vv and that G(𝐝)G(\mathbf{d}) has mm edges. So let XX be the number of edges between uu and vv. Then,

(X=0|k)=\displaystyle\mathbb{P}(X=0|\mathcal{H}_{k})= (1amk)(1amk1)(1am+1kb)\displaystyle\,\bigg{(}1-\frac{a}{m-k}\bigg{)}\bigg{(}1-\frac{a}{m-k-1}\bigg{)}\ldots\bigg{(}1-\frac{a}{m+1-k-b}\bigg{)}
\displaystyle\leq (1am)b\displaystyle\,\bigg{(}1-\frac{a}{m}\bigg{)}^{b}
\displaystyle\leq  1abm+a2b22m2.\displaystyle\,1-\frac{ab}{m}+\frac{a^{2}b^{2}}{2m^{2}}.

where the final line follows by the inequality, (1+x)n1nx+(n2)x2(1+x)^{n}\leq 1-nx+\binom{n}{2}x^{2} which is valid for nn\in\mathbb{N} and x1x\geq-1. This allows us to deduce that the probability of seeing an edge between uu and vv is at least abma2b22m2\frac{ab}{m}-\frac{a^{2}b^{2}}{2m^{2}}. Now, looking at each edge of HH in turn allows us to deduce that the probability ϕ\phi is a homomorphism is at least

1(n)hϕ:[h][n]i=1h(dϕ(i)dϕ(i+1)+m(dϕ(i)dϕ(i+1)+)22m2),\frac{1}{(n)_{h}}\sum_{\phi:[h]\hookrightarrow[n]}\prod_{i=1}^{h}\bigg{(}\frac{d_{\phi(i)}^{-}d_{\phi(i+1)}^{+}}{m}-\frac{(d_{\phi(i)}^{-}d_{\phi(i+1)}^{+})^{2}}{2m^{2}}\bigg{)}, (7)

where we consider the argument of ϕ\phi modulo hh in (7). Also, note that one can factorise the linear term in (7) and bound the second occurrence of dϕ(i)dϕ(i+1)+d_{\phi(i)}^{-}d_{\phi(i+1)}^{+} by Δ2\Delta^{2} to get the lower bound

1(n)h(1Δ22m)ϕ:[h][n]i=1hdϕ(i)dϕ(i)+m\frac{1}{(n)_{h}}\bigg{(}1-\frac{\Delta^{2}}{2m}\bigg{)}\sum_{\phi:[h]\hookrightarrow[n]}\prod_{i=1}^{h}\frac{d_{\phi(i)}^{-}d_{\phi(i)}^{+}}{m} (8)

The idea is to argue that we can swap the order of the product and sum in (8) without changing the result very much. To this end, let Φi\Phi_{i} be the set of functions ϕ:[h][n]\phi:[h]\to[n] such that

  1. i)

    |ϕ([n])|=hi|\phi([n])|=h-i,

  2. ii)

    dϕ(j)+0d_{\phi(j)}^{+}\neq 0 for each j[h]j\in[h],

  3. iii)

    dϕ(j)0d_{\phi(j)}^{-}\neq 0 for each j[h]j\in[h].

Note that for any function ϕi=0hΦi\phi\not\in\bigcup_{i=0}^{h}\Phi_{i} then i=1hdϕ(i)dϕ(i)+=0\prod_{i=1}^{h}d_{\phi(i)}^{-}d_{\phi(i)}^{+}=0. As a result of this we observe that

ϕ:[h][n]i=1hdϕ(i)dϕ(i)+=ϕΦ0i=1hdϕ(i)dϕ(i)+.\sum_{\phi:[h]\hookrightarrow[n]}\prod_{i=1}^{h}d_{\phi(i)}^{-}d_{\phi(i)}^{+}=\sum_{\phi\in\Phi_{0}}\prod_{i=1}^{h}d_{\phi(i)}^{-}d_{\phi(i)}^{+}.

As well as the fact that

j=0hϕΦji=1hdϕ(i)dϕ(i)+=ϕ:[h][n]i=1hdϕ(i)dϕ(i)+=(i=1ndidi+)h=mh(1+Q)h.\sum_{j=0}^{h}\sum_{\phi\in\Phi_{j}}\prod_{i=1}^{h}d_{\phi(i)}^{-}d_{\phi(i)}^{+}=\sum_{\phi:[h]\rightarrow[n]}\prod_{i=1}^{h}d_{\phi(i)}^{-}d_{\phi(i)}^{+}=\left(\sum_{i=1}^{n}d_{i}^{-}d_{i}^{+}\right)^{h}=m^{h}(1+Q)^{h}. (9)

For a given function ϕ:[h][n]\phi:[h]\to[n] we define its weight as w(ϕ):=i=1hdϕ(i)dϕ(i)+w(\phi):=\prod_{i=1}^{h}d_{\phi(i)}^{-}d_{\phi(i)}^{+}. We also define the weight of a set SS of functions in the natural way as w(S)=ϕSw(ϕ)w(S)=\sum_{\phi\in S}w(\phi). Next, we shall apply switching arguments to bound the weights of the sets Φi\Phi_{i} relative to one another.

Consider the auxiliary bipartite graphs, 𝒢i\mathcal{G}_{i} with parts Φi\Phi_{i} and Φi+1\Phi_{i+1}. We connect ϕiΦi\phi_{i}\in\Phi_{i} to ϕi+1Φi+1\phi_{i+1}\in\Phi_{i+1} by an edge in 𝒢i\mathcal{G}_{i} if ϕi\phi_{i} and ϕi+1\phi_{i+1} differ in precisely one coordinate. For each ϕiΦi\phi_{i}\in\Phi_{i} there are at most h2h^{2} ways we can change one coordinate and decrease the size of the image. In particular we may pick any of the hh coordinates of ϕi\phi_{i} and change it to ϕi(j)\phi_{i}(j) for some j[h]j\in[h]. So, Δ𝒢i(Φi)h2\Delta_{\mathcal{G}_{i}}(\Phi_{i})\leq h^{2}. For each ϕi+1Φi+1\phi_{i+1}\in\Phi_{i+1}, we pick any of the at least ii coordinates at which ϕi+1\phi_{i+1} is not injective and choose a new image for this coordinate. By 1.2 there are at least εn/2\varepsilon n/2 ways to choose the new image, δ𝒢i(Φi+1)iεn/2\delta_{\mathcal{G}_{i}}(\Phi_{i+1})\geq i\varepsilon n/2.

Combining these two results allows us to deduce that iεn/2|Φi+1|e(𝒢i)h2|Φi|i\varepsilon n/2|\Phi_{i+1}|\leq e(\mathcal{G}_{i})\leq h^{2}|\Phi_{i}|. Upon rearrangement we find |Φi|εn2h2|Φi+1||\Phi_{i}|\geq\frac{\varepsilon n}{2h^{2}}|\Phi_{i+1}|. Note that two functions ϕ\phi and ψ\psi which differ in one coordinate must also satisfy w(ϕ)Δ2w(ψ)w(\phi)\leq\Delta^{2}w(\psi) and vice versa. Hence w(Φi)εn2h2Δ2w(Φi+1)w(\Phi_{i})\geq\frac{\varepsilon n}{2h^{2}\Delta^{2}}w(\Phi_{i+1}). This allows us to apply induction to deduce that

j=0hϕΦji=1hdϕ(i)dϕ(i)+=j=0hw(Φj)w(Φ0)j=0h(2h2Δ2εn)jw(Φ0)12h2Δ2εn.\sum_{j=0}^{h}\sum_{\phi\in\Phi_{j}}\prod_{i=1}^{h}d_{\phi(i)}^{-}d_{\phi(i)}^{+}=\sum_{j=0}^{h}w(\Phi_{j})\leq w(\Phi_{0})\sum_{j=0}^{h}\left(\frac{2h^{2}\Delta^{2}}{\varepsilon n}\right)^{j}\leq\frac{w(\Phi_{0})}{1-\frac{2h^{2}\Delta^{2}}{\varepsilon n}}.

So by (9), w(Φ0)(12h2Δ2εn)mh(1+Q)hw(\Phi_{0})\geq(1-\frac{2h^{2}\Delta^{2}}{\varepsilon n})m^{h}(1+Q)^{h}. Combining this with (8) allows us to deduce the statement of the lemma, that the following is a lower bound on the probability of finding a cycle at a specified position in G(𝐝)G(\mathbf{d}):

(1+Q)h(n)h(12h2Δ2εn)(1hΔ22m).\frac{(1+Q)^{h}}{(n)_{h}}\left(1-\frac{2h^{2}\Delta^{2}}{\varepsilon n}\right)\left(1-\frac{h\Delta^{2}}{2m}\right).

3 Digraph Counting

In this section we will prove the following bound on the number of strongly connected multi-digraphs with maximum total degree 44. This is similar to [5, Lemma 2.3] although a weaker bound suffices here allowing us to simplify the proof somewhat.

Lemma 3.1.

Suppose that n,m,a,bn,m,a,b\in\mathbb{N} such that n+a+b=mn+a+b=m. Let N(n,a,b)N(n,a,b) be the number of labelled strongly connected multi-digraphs with nn vertices and degree distribution given by

in-degree out-degree quantity
11 11 n2abn-2a-b
11 22 a
22 11 a
22 22 b

Then, we have the following bound,

N(n,a,b)(3a+2b)(m1)!(na,a,b)N(n,a,b)\leq(3a+2b)(m-1)!\binom{n}{a,a,b}

To prove this bound we will use the preheart configuration model of Pérez-Giménez and Wormald [19] which we shall define as follows.

A preheart is a multi-digraph with minimum semi-degree at least 11 and no cycle components. The heart of a preheart DD is the multidigraph H(D)H(D) formed by suppressing all vertices of DD which have in and out degree precisely 11. For a degree sequence 𝐝\mathbf{d} on vertex set VV, define

T=T(𝐝)={vV:d+(v)+d(v)3}.T=T(\mathbf{d})=\{v\in V:d^{+}(v)+d^{-}(v)\geq 3\}.

To form the preheart configuration model, first we apply the configuration model to TT to produce a heart HH. Given a heart configuration HH, we construct a preheart configuration QQ by assigning VTV\setminus T to E(H)E(H) such that the vertices assigned to each arc of HH are given a linear order. Denote this assignment including the orderings by qq. Then the preheart configuration model, 𝒬(𝐝)\mathcal{Q}(\mathbf{d}) is the probability space of random preheart configurations formed by choosing HH and qq uniformly at random.

It is easy to see that every strongly connected digraph is produced by the preheart configuration model. Thus counting the number of possible outcomes from the preheart configuration model gives an upper bound for the number of strongly connected digraphs with the same degree sequence. We now count the number of preheart configurations using [5, Lemma 2.4].

Lemma 3.2.

In the preheart configuration model with nn vertices, mm edges and degree sequence 𝐝\mathbf{d}, let n=|T(𝐝)|n^{\prime}=|T(\mathbf{d})| be the number of vertices of the heart. Then there are a total of

n+mnmm!\frac{n^{\prime}+m-n}{m}m!

preheart configurations.

From this lemma, we may prove Lemma 3.1.

Proof of Lemma 3.1.

First, we choose the degree sequence. So note that there are at most (na,a,b)\binom{n}{a,a,b} ways in which we can give aa vertices in-degree 11 and out-degree 22, aa vertices in-degree 22 and out-degree 11, bb vertices in-degree 22 and out-degree 22 and the remainder in-degree 11 and out-degree 11. Having fixed this degree sequence, by Lemma 3.2 as the heart contains 2a+b2a+b vertices and there are a+ba+b more edges than vertices, the number of strongly connected digraphs with this degree sequence is bounded above by 3a+2bmm!\frac{3a+2b}{m}m!. Hence the number of strongly connected digraphs with degree distribution as in the statement of the lemma is at most

(3a+2b)(m1)!(na,a,b)(3a+2b)(m-1)!\binom{n}{a,a,b}

as claimed. ∎

4 Proof of Theorem 1.3

In this section we prove Theorem 1.3 and show that every strongly connected component is a cycle and that these cycles are not particularly large. In particular this provides a configuration model analogue of [17, Theorem 7]. Working with the configuration model introduces several additional difficulties with the proof of this, foremost of these being that we do not have enough control of the subgraph counts to compute the factorial moments and show that they converge to those of a Poisson distribution. We will instead use the Chen-Stein method for Poisson approximation which only requires good control of the first moment and an upper bound on the second.

The proof of this theorem splits naturally into four parts. Choose functions f(n)g(n)f(n)\gg g(n) which are defined such that f(n)=ω(m/|Q|)f(n)=\omega(\sqrt{m/|Q|}), f(n)=o(m|Q|/R)f(n)=o(m|Q|/R^{-}) and g(n)=ω(1/|Q|)g(n)=\omega(1/|Q|). Moreover for this section we shall assume that RR+R^{-}\geq R^{+} and if this is not the case, we swap the orientations of all edges to get an equivalent digraph with RR+R^{-}\geq R^{+} as desired. We will say that a cycle CC is

  • Long if |C|f(n)|C|\geq f(n),

  • Medium if g(n)<|C|<f(n)g(n)<|C|<f(n),

  • Short if |C|g(n)|C|\leq g(n).

First we will show that there are no long or medium cycles in the directed configuration model. Next, we show that there are no complex components and finally we show the result on the distribution of the length of the kkth longest cycle.

4.1 Long Cycles

Lemma 4.1.

The configuration model, G(𝐝)G(\mathbf{d}) has no long cycles.

To show that there are no long cycles, it suffices to show that the out-component of an arbitrary vertex is bounded above by f(n)f(n). Certainly, the longest cycle in a directed graph is at most the size of the largest out-component and so the lemma follows.

Thus, consider the following version of the branching process of Hatami and Molloy [12] for the out-component in a digraph. For a vertex vv we explore its out-component in the random digraph G(𝐝)G(\mathbf{d}) as follows. We will have a partial subdigraph CtC_{t} at time tt consisting of the vertices explored thus far. CtC_{t} will consist of all in- and out-stubs of some vertices of G(𝐝)G(\mathbf{d}) together with a matching of some of the stubs. If there are unmatched out-stubs in CtC_{t} we will pick one at random and match it to some in-stub which yields an edge of CtC_{t}. We define YtY_{t} as the number of unmatched out-stubs in CtC_{t}. Thus, Yt=0Y_{t}=0 indicates that we have explored an out-component in its entirety. Formally we define the exploration process as follows.

  • Choose a vertex vv and initialise C0={v}C_{0}=\{v\} and Y0=d+(v)Y_{0}=d^{+}(v).

  • While Yt>0Y_{t}>0, choose an arbitrary unmatched out-stub of any vertex vCtv\in C_{t}. Pick a uniformly random unmatched in-stub and let uu be the vertex to which this in-stub belongs. Match these two stubs forming an edge of GG.

    • If uCtu\not\in C_{t} we add it so Ct+1=Ct{u}C_{t+1}=C_{t}\cup\{u\} and Yt+1=Yt+d+(u)1Y_{t+1}=Y_{t}+d^{+}(u)-1.

    • Otherwise, Ct+1=CtC_{t+1}=C_{t}, Yt+1=Yt1Y_{t+1}=Y_{t}-1.

Note that YtY_{t} does not depend on how we have exposed CtC_{t} so CtC_{t} and YtY_{t} are Markov processes. We define the following quantities.

  • Dt:=Yt+uCtd+(u)D_{t}:=Y_{t}+\sum_{u\not\in C_{t}}d^{+}(u), the number of unmatched out-stubs at time tt. Note this is also the number of unmatched in-stubs at time tt.

  • vt:=v_{t}:=\emptyset if Ct1C_{t-1} and CtC_{t} have the same vertex set. Otherwise it is the unique vertex in CtCt1C_{t}\setminus C_{t-1}.

  • Qt:=uCtd(u)d+(u)Dt1Q_{t}:=\frac{\sum_{u\not\in C_{t}}d^{-}(u)d^{+}(u)}{D_{t}}-1.

Note that initially Qt=QQ_{t}=Q. Also, for unvisited vertices uCtu\not\in C_{t}, the probability that we explore uu next is (vt+1=u)=d(u)Dt\mathbb{P}(v_{t+1}=u)=\frac{d^{-}(u)}{D_{t}}. Hence, provided that Yt>0Y_{t}>0, the expected change in YtY_{t} is

𝔼(Yt+1Yt|Ct)=uCt(vt+1=u)(d+(u)1)=uCtd(u)d+(u)Dt1=Qt.\mathbb{E}(Y_{t+1}-Y_{t}|C_{t})=\sum_{u\not\in C_{t}}\mathbb{P}(v_{t+1}=u)(d^{+}(u)-1)=\frac{\sum_{u\not\in C_{t}}d^{-}(u)d^{+}(u)}{D_{t}}-1=Q_{t}. (10)

As long as QtQ_{t} remains close to QQ we expect that YtY_{t} is a random walk with drift approximately QQ. So in particular, for our setting of Q<0Q<0, we expect that the random walk will quickly return to 0. Thus, we shall start by showing that the drift parameter QtQ_{t} is indeed close to QQ with high probability. For this we shall use the following formulation of the Azuma-Hoeffding inequality (see [13, Theorem 2.25]).

Theorem 4.2 (Azuma-Hoeffding Inequality).

Let (Xk)k=0n(X_{k})_{k=0}^{n} be a martingale with X=XnX=X_{n}, X0=𝔼(X)X_{0}=\mathbb{E}(X). Suppose there exist constants, ck>0c_{k}>0 such that

|XkXk1|ck for all kn.|X_{k}-X_{k-1}|\leq c_{k}\;\text{ for all }\;k\leq n.

Then for any λ0\lambda\geq 0,

(|XnX0|λ)2exp(λ22k=1nck2).\mathbb{P}(|X_{n}-X_{0}|\geq\lambda)\leq 2\exp\left(-\frac{\lambda^{2}}{2\sum_{k=1}^{n}c_{k}^{2}}\right).

For t1t\geq 1 define Wt:=QtQt1𝔼(QtQt1|Ct1)W_{t}:=Q_{t}-Q_{t-1}-\mathbb{E}(Q_{t}-Q_{t-1}|C_{t-1}). Also, we define X0=QX_{0}=Q and for t1t\geq 1 let

Xt:=X0+i=1tWi=Qti=1t𝔼(QiQi1|Ci1).X_{t}:=X_{0}+\sum_{i=1}^{t}W_{i}=Q_{t}-\sum_{i=1}^{t}\mathbb{E}(Q_{i}-Q_{i-1}|C_{i-1}). (11)

It is a simple check that the XtX_{t} forms a martingale. Furthermore, |QtQ||QtXt|+|XtQ||Q_{t}-Q|\leq|Q_{t}-X_{t}|+|X_{t}-Q| so to bound the probability that |QtQ||Q_{t}-Q| is large, we show that |QtXt||Q_{t}-X_{t}| is small and bound the probability that |XtQ||X_{t}-Q| is large.

For the second of these, consider the auxiliary random variables

Q~t:=uCtd(u)d+(u)Dt11.\widetilde{Q}_{t}:=\frac{\sum_{u\not\in C_{t}}d^{-}(u)d^{+}(u)}{D_{t-1}}-1.

That is we change QtQ_{t} to have the same denominator as Qt1Q_{t-1}. As we assume that tm/2t\leq m/2, Dtm/2D_{t}\geq m/2 so combining this with the fact that |DtDt1|1|D_{t}-D_{t-1}|\leq 1 we deduce that

|QtQ~t|=|DtDt1|uCtd(u)d+(u)DtDt14m.|Q_{t}-\widetilde{Q}_{t}|=\frac{|D_{t}-D_{t-1}|\sum_{u\not\in C_{t}}d^{-}(u)d^{+}(u)}{D_{t}D_{t-1}}\leq\frac{4}{m}. (12)

Also, as there is at most one vertex whose contributions are removed in moving from Qt1Q_{t-1} to QtQ_{t}, then

|Qt1Q~t|4Δ2m.|Q_{t-1}-\widetilde{Q}_{t}|\leq\frac{4\Delta^{2}}{m}. (13)

Combining equations (12) and (13) gives an upper bound on |QtQt1||Q_{t}-Q_{t-1}| which we may then use to bound the martingale differences almost surely,

|XtXt1|8Δ2+8m16Δ2m.|X_{t}-X_{t-1}|\leq\frac{8\Delta^{2}+8}{m}\leq\frac{16\Delta^{2}}{m}. (14)

Next, we will bound the terms 𝔼(QtQt1|Ct1)\mathbb{E}(Q_{t}-Q_{t-1}|C_{t-1}). It will be convenient to do this in two stages utilising the auxiliary random variables Q~t\widetilde{Q}_{t}. So, first note that

0𝔼(Qt1Q~t|Ct1)\displaystyle 0\leq\mathbb{E}(Q_{t-1}-\widetilde{Q}_{t}|C_{t-1}) =uCt1(vt=u)d(u)d+(u)Dt1uV(G)(d(u))2d+(u)Dt12\displaystyle=\sum_{u\not\in C_{t-1}}\mathbb{P}(v_{t}=u)\frac{d^{-}(u)d^{+}(u)}{D_{t-1}}\leq\sum_{u\in V(G)}\frac{(d^{-}(u))^{2}d^{+}(u)}{D_{t-1}^{2}}
uV(G)4(d(u))2d+(u)m24R+4m.\displaystyle\leq\sum_{u\in V(G)}\frac{4(d^{-}(u))^{2}d^{+}(u)}{m^{2}}\leq\frac{4R^{-}+4}{m}. (15)

We can combine (15) with (12) and apply 1.2 to deduce that

|𝔼(QtQt1|Ct1)|4R+8m12Rζm.|\mathbb{E}(Q_{t}-Q_{t-1}|C_{t-1})|\leq\frac{4R^{-}+8}{m}\leq\frac{12R^{-}}{\zeta m}. (16)

This leaves us in a situation in which we can compare XtX_{t} and QtQ_{t},

|XtQt|i=1t|𝔼(QiQi1|Ci1)|12Rtζm.|X_{t}-Q_{t}|\leq\sum_{i=1}^{t}|\mathbb{E}(Q_{i}-Q_{i-1}|C_{i-1})|\leq\frac{12R^{-}t}{\zeta m}. (17)

So provided that tmζ|Q|48Rt\leq\frac{m\zeta|Q|}{48R^{-}} we have |XtQt||Q|/4|X_{t}-Q_{t}|\leq|Q|/4 and in particular, for any such tt,

(QtQ|Q|2)=(QtXt+XtQ|Q|2)\displaystyle\mathbb{P}\left(Q_{t}-Q\geq\frac{|Q|}{2}\right)=\mathbb{P}\left(Q_{t}-X_{t}+X_{t}-Q\geq\frac{|Q|}{2}\right)
\displaystyle\leq\; (XtQ|Q|4)(|XtQ||Q|4)\displaystyle\mathbb{P}\left(X_{t}-Q\geq\frac{|Q|}{4}\right)\leq\mathbb{P}\left(|X_{t}-Q|\geq\frac{|Q|}{4}\right)

Whereupon we can apply the Azuma-Hoeffding inequality as X0=QX_{0}=Q. We can use the bound from (14) for the ckc_{k}. Substituting into Theorem 4.2 yields,

(|XtQ||Q|4)exp(|Q|2m28192tΔ4)exp(C|Q|mn2/3log(n)),\mathbb{P}\left(|X_{t}-Q|\geq\frac{|Q|}{4}\right)\leq\exp\left(-\frac{|Q|^{2}m^{2}}{8192t\Delta^{4}}\right)\leq\exp\left(-C|Q|mn^{-2/3}\log(n)\right), (18)

where the second inequality in (18) comes from tmζ|Q|48Rt\leq\frac{m\zeta|Q|}{48R^{-}}, Δn1/6log1/4(n)\Delta\leq n^{1/6}\log^{-1/4}(n) and RζR^{-}\geq\zeta. We could improve the dependence on Δ\Delta by using Freedman’s inequality [9] in place of the Azuma-Hoeffding inequality here however there are other points where we require Δn1/6\Delta\leq n^{1/6} and so this would only remove the log1/4(n)\log^{-1/4}(n) term in 1.2. Note mn2/3m1/3/2mn^{-2/3}\geq m^{1/3}/2 hence |Q|mn2/3|Q|mn^{-2/3}\to\infty and so for any large enough nn,

(QtQ|Q|2)=(Qt|Q|2)n2.\mathbb{P}\left(Q_{t}-Q\geq\frac{|Q|}{2}\right)=\mathbb{P}\left(Q_{t}\geq-\frac{|Q|}{2}\right)\leq n^{-2}. (19)

Now that we have shown that QtQ_{t} is concentrated around QQ, we can proceed to show that G(𝐝)G(\mathbf{d}) has no large components with high probability via a stopping time argument. We will use the following version of Doob’s optional stopping theorem [20, Theorem 10.10]

Theorem 4.3 (Optional Stopping Theorem).

Let XX be a supermartingale and let τ\tau be a stopping time. Then XτX_{\tau} is integrable and furthermore,

𝔼(Xτ)𝔼(X0)\mathbb{E}(X_{\tau})\leq\mathbb{E}(X_{0})

whenever τ\tau is bounded.

So, now let us show that there is no component of size larger than f(n)f(n). For each vV(G)v\in V(G) we shall consider the exploration process started at vv. Recall that f(n)=o(n|Q|/R)f(n)=o(n|Q|/R^{-}) and so for sufficiently large nn, f(n)mζ|Q|48Rf(n)\leq\frac{m\zeta|Q|}{48R^{-}}. Hence we have Qt|Q|/2Q_{t}\leq-|Q|/2 with high probability for each tf(n)t\leq f(n). Define the stopping time

τ:=min{t0|Yt=0 or Qt|Q|/2 or t=f(n)}\tau:=\min\{t\geq 0|Y_{t}=0\text{ or }Q_{t}\geq-|Q|/2\text{ or }t=f(n)\}

Recall that for tmζ|Q|48Rt\leq\frac{m\zeta|Q|}{48R^{-}} we have 𝔼(YtYt1)=Qt1|Q|/2\mathbb{E}(Y_{t}-Y_{t-1})=Q_{t-1}\leq-|Q|/2. Thus, Ymin(t,τ)+|Q|min(t,τ)/2Y_{\min(t,\tau)}+|Q|\min(t,\tau)/2 is a supermartingale. Clearly, τ\tau is bounded by f(n)f(n) so we may apply Theorem 4.3 to Ymin(t,τ)+|Q|min(t,τ)/2Y_{\min(t,\tau)}+|Q|\min(t,\tau)/2 from which we deduce that

𝔼(Yτ+|Q|τ2)Y0=d+(v).\mathbb{E}\left(Y_{\tau}+\frac{|Q|\tau}{2}\right)\leq Y_{0}=d^{+}(v).

Upon rearrangement this yields,

𝔼(τ)2d+(v)𝔼(Yτ)|Q|2d+(v)|Q|.\mathbb{E}(\tau)\leq 2\frac{d^{+}(v)-\mathbb{E}(Y_{\tau})}{|Q|}\leq\frac{2d^{+}(v)}{|Q|}.

By Markov’s inequality we can deduce

(τ=f(n))2d+(v)|Q|f(n)\mathbb{P}(\tau=f(n))\leq\frac{2d^{+}(v)}{|Q|f(n)}

The only other way in which we could have Yτ0Y_{\tau}\neq 0 is if for some ii we have Qi|Q|/2Q_{i}\geq-|Q|/2. A union bound allows us to deduce that this occurs with probability at most f(n)n2n1f(n)n^{-2}\leq n^{-1}. So for any large enough nn,

(Yτ0)2d+(v)|Q|f(n)+1n.\mathbb{P}(Y_{\tau}\neq 0)\leq\frac{2d^{+}(v)}{|Q|f(n)}+\frac{1}{n}.

Define ZZ as the number of vertices of GG which lie in cycles of size at least f(n)f(n). Note that any such vertex must have out component of size at least f(n)f(n). Thus,

(|𝒞1|f(n))\displaystyle\mathbb{P}(|\mathcal{C}_{1}|\geq f(n)) =(Zf(n))𝔼(Z)f(n)1f(n)vV(G)(|C+(v)|f(n))\displaystyle=\mathbb{P}(Z\geq f(n))\leq\frac{\mathbb{E}(Z)}{f(n)}\leq\frac{1}{f(n)}\sum_{v\in V(G)}\mathbb{P}(|C^{+}(v)|\geq f(n))
1f(n)vV(G)(2d+(v)|Q|f(n)+1n)=m|Q|f(n)2+1f(n)=o(1)\displaystyle\leq\frac{1}{f(n)}\sum_{v\in V(G)}\left(\frac{2d^{+}(v)}{|Q|f(n)}+\frac{1}{n}\right)=\frac{m}{|Q|f(n)^{2}}+\frac{1}{f(n)}=o(1)

as f(n)m/|Q|f(n)\gg\sqrt{m/|Q|}. Thus there are no long cycles in G(𝐝)G(\mathbf{d}).

4.2 Medium Cycles

Lemma 4.4.

The configuration model, G(𝐝)G(\mathbf{d}) has no medium cycles.

Our next step is to apply Lemma 2.1 to show there are no medium cycles. By Lemma 2.1, the probability that G(𝐝)G(\mathbf{d}) has a cycle of length hh in any particular location is at most nhμ1,1h(n)h(m)h\frac{n^{h}\mu_{1,1}^{h}}{(n)_{h}(m)_{h}}. Thus the expected number of cycles of length hh in G(𝐝)G(\mathbf{d}) is at most

h!|aut(Ch)|(nh)nhμ1,1h(n)h(m)h\displaystyle\frac{h!}{|\text{aut}(\overrightarrow{C_{h}})|}\binom{n}{h}\frac{n^{h}\mu_{1,1}^{h}}{(n)_{h}(m)_{h}} =mh(m)h(1+Q)hh(1+hmh)h(1+Q)hh\displaystyle=\frac{m^{h}}{(m)_{h}}\frac{(1+Q)^{h}}{h}\leq\bigg{(}1+\frac{h}{m-h}\bigg{)}^{h}\frac{(1+Q)^{h}}{h}
(1+2hm)h(1+Q)hhehQ+2h2mh.\displaystyle\leq\bigg{(}1+\frac{2h}{m}\bigg{)}^{h}\frac{(1+Q)^{h}}{h}\leq\frac{e^{hQ+\frac{2h^{2}}{m}}}{h}. (20)

For any g(n)hf(n)g(n)\leq h\leq f(n), we have 2h2/m4h2/nh|Q|/22h^{2}/m\leq 4h^{2}/n\leq h|Q|/2 as f(n)=o(n|Q|)f(n)=o(n|Q|). Then, the expected number of cycles of length between g(n)g(n) and f(n)f(n) is at most

h=g(n)f(n)ehQ+2h2mhh=g(n)f(n)ehQ2hg(n)ehQ2h𝑑h=Qg(n)2eλλ𝑑λ=E1(Qg(n)2),\sum_{h=g(n)}^{f(n)}\frac{e^{hQ+\frac{2h^{2}}{m}}}{h}\leq\sum_{h=g(n)}^{f(n)}\frac{e^{\frac{hQ}{2}}}{h}\leq\int_{g(n)}^{\infty}\frac{e^{\frac{hQ}{2}}}{h}dh=\int_{-\frac{Qg(n)}{2}}^{\infty}\frac{e^{-\lambda}}{\lambda}d\lambda=E_{1}\bigg{(}-\frac{Qg(n)}{2}\bigg{)}, (21)

where E1(x)E_{1}(x) is the exponential integral function and the first equality follows by making the substitution λ=Qh/2\lambda=-Qh/2 (recall that Q<0Q<0 and so this substitution preserves positivity). It is straightforward to bound 0E1(x)ex/x0\leq E_{1}(x)\leq e^{-x}/x which allows us to conclude that E1(x)0E_{1}(x)\to 0 as xx\to\infty. Note that g(n)=ω(1/|Q|)g(n)=\omega(1/|Q|) and so Qg(n)-Qg(n)\to\infty. Thus the expected number of cycles in G(𝐝)G(\mathbf{d}) of length at least g(n)g(n) is o(1)o(1). So by Markov’s inequality, there are no such cycles with high probability.

4.3 Complex Components

Lemma 4.5.

The configuration model, G(𝐝)G(\mathbf{d}) has no complex components.

We begin by defining digraphs S(a,b,c)S(a,b,c) and T(a,b)T(a,b) for a,b,ca,b,c\in\mathbb{N}. Let S(a,b,c)S(a,b,c) be the digraph with a+b+c1a+b+c-1 vertices consisting of vertices uu, vv and three internally disjoint paths. One of length aa from uu to vv, one of length bb from uu to vv and one of length cc from vv to uu. Let T(a,b)T(a,b) be the digraph with a+b1a+b-1 vertices consisting of two cycles, one of length aa, one of length bb which intersect at a single vertex, uu. We can use the ear decomposition of a strongly connected digraph to deduce that if G(𝐝)G(\mathbf{d}) contains any complex components, then it contains a subgraph which is either a copy of S(a,b,c)S(a,b,c) or a copy of T(a,b)T(a,b). Note that both of these are the union of two cycles and and by the results of the previous two sections, there are no cycles with more than g(n)g(n) vertices with high probability. Thus we only need to show there are none of these motifs on at most 2g(n)2g(n) vertices to deduce that there are none in G(𝐝)G(\mathbf{d}).

Unlike Łuczak and Seierstad [17], we must treat these cases separately as in the first case, we have two vertices of total degree 33 and the rest of total degree 22 and in the second there is one vertex of total degree 44 in place of the total degree 33 vertices which changes the result of applying Lemma 2.1.

First let us consider S(a,b,c)S(a,b,c). There are at most h2h!h^{2}h! ways of finding such subgraphs on hh vertices (h2\leq h^{2} ways of choosing path lengths connecting the two degree 33 vertices and assuming the associated automorphism groups are all trivial gives this bound). Thus, we may apply Lemma 2.1 to deduce that the expected number of such subgraphs in the configuration model with parameters as in the statement of Theorem 1.3 is at most

h=12g(n)nh(m)h+1h2h!μ1,1h2ρ1,2ρ2,1=RR+mh=12g(n)mh+1(m)h+1h2(1+Q)h2RR+m02g(n)x2exQ2𝑑x,\sum_{h=1}^{2g(n)}\frac{n^{h}}{(m)_{h+1}}h^{2}h!\mu_{1,1}^{h-2}\rho_{1,2}\rho_{2,1}=\frac{R^{-}R^{+}}{m}\sum_{h=1}^{2g(n)}\frac{m^{h+1}}{(m)_{h+1}}h^{2}(1+Q)^{h-2}\leq\frac{R^{-}R^{+}}{m}\int_{0}^{2g(n)}x^{2}e^{\frac{xQ}{2}}dx, (22)

where we eliminate the term mk+1(m)k+1\frac{m^{k+1}}{(m)_{k+1}} in the above in the same way as in (20). An integral of the form seen in (22) can be evaluated by integrating by parts twice to deduce the following (where t>0t>0),

0yx2etx=2t3ety(2t3+2yt2+2y2t)2t3.\int_{0}^{y}x^{2}e^{-tx}=\frac{2}{t^{3}}-e^{-ty}\bigg{(}\frac{2}{t^{3}}+\frac{2y}{t^{2}}+\frac{2y^{2}}{t}\bigg{)}\leq\frac{2}{t^{3}}.

Thus the expected number of these subgraphs in G(𝐝)G(\mathbf{d}) can be bounded above by 16RR+m|Q|30\frac{16R^{-}R^{+}}{m|Q|^{3}}\to 0.

The second case is T(a,b)T(a,b) where we have a degree 44 vertex. In this case there are at most hh!hh! such subgraphs on hh vertices (h\leq h choices of the two cycle lengths and assuming the associated automorphism groups are all trivial gives this bound). Again we apply Lemma 2.1 to compute the expected number of such subgraphs of size at most 2g(n)2g(n) which this time is at most

h=12g(n)nh(m)h+1hμ1,1h1ρ2,2R+Δmh=12g(n)mh+1(m)h+1hehQR+Δm02g(n)xexQ2𝑑x\sum_{h=1}^{2g(n)}\frac{n^{h}}{(m)_{h+1}}h\mu_{1,1}^{h-1}\rho_{2,2}\leq\frac{R^{+}\Delta}{m}\sum_{h=1}^{2g(n)}\frac{m^{h+1}}{(m)_{h+1}}he^{-hQ}\leq\frac{R^{+}\Delta}{m}\int_{0}^{2g(n)}xe^{-\frac{xQ}{2}}dx (23)

Note we may pick either ρ2,2,mnRΔ\rho_{2,2,}\leq\frac{m}{n}R^{-}\Delta or mnR+Δ\frac{m}{n}R^{+}\Delta here by selecting which part of the product di(di1)di+(di+1)d_{i}^{-}(d_{i}^{-}-1)d_{i}^{+}(d_{i}^{+}-1) to bound by Δ\Delta in computing ρ2,2\rho_{2,2} and so we pick the smaller of the two. We may proceed similarly to before, integrating by parts which allows us to bound integrals of the form found in (23) as

0yxetx=1t2ety(1t2+yt)1t2.\int_{0}^{y}xe^{-tx}=\frac{1}{t^{2}}-e^{-ty}\bigg{(}\frac{1}{t^{2}}+\frac{y}{t}\bigg{)}\leq\frac{1}{t^{2}}.

So, we can bound (23) above by 4R+Δm|Q|24(R+R)2/3m2/3ζ1/3|Q|2Δm1/30\frac{4R^{+}\Delta}{m|Q|^{2}}\leq\frac{4(R^{+}R^{-})^{2/3}}{m^{2/3}\zeta^{1/3}|Q|^{2}}\frac{\Delta}{m^{1/3}}\to 0. Thus by Markov’s inequality there are no copies of S(a,b,c)S(a,b,c) or T(a,b)T(a,b) on at most 2g(n)2g(n) vertices in G(𝐝)G(\mathbf{d}) with high probability. Combining Lemma 4.1 and Lemma 4.4 we deduce there are no cycles of length at least g(n)g(n) with high probability. As any complex strongly connected digraph contains a copy of at least one of these, we deduce that G(𝐝)G(\mathbf{d}) contains no complex components with high probability.

4.4 Length of the kth largest cycle

Lemma 4.6.

The kthkth longest cycle in the configuration model, G(𝐝)G(\mathbf{d}) follows the distribution given in Theorem 1.3

The idea now will be to apply a local coupling version of the Chen-Stein method (see [8, Theorem 2.8]) to deduce that the number of cycles in G(𝐝)G(\mathbf{d}) of length between α/|Q|\alpha/|Q| and g(n)g(n) converges to a Poisson distribution of mean ξα\xi_{\alpha}. Let us start by stating the version of the Chen-Stein method which we will apply,

Theorem 4.7.

Let W=iΓXiW=\sum_{i\in\Gamma}X_{i} be a sum of indicator variables and let pi:=𝔼(Xi)p_{i}:=\mathbb{E}(X_{i}). For each iΓi\in\Gamma, divide Γ{i}\Gamma\setminus\{i\} into two sets, Γis\Gamma_{i}^{s} and Γiw\Gamma_{i}^{w}. Define

Zi:=jΓisXj and Wi:=jΓiwXj.Z_{i}:=\sum_{j\in\Gamma_{i}^{s}}X_{j}\;\text{ and }\;W_{i}:=\sum_{j\in\Gamma_{i}^{w}}X_{j}.

Suppose that there exist random variables, Wi1W_{i}^{1} and W~i1\widetilde{W}_{i}^{1} defined on the same probability space such that

(W~i1)=(Wi|Xi=1) and (Wi1)=(Wi).\mathcal{L}(\widetilde{W}_{i}^{1})=\mathcal{L}(W_{i}|X_{i}=1)\;\text{ and }\;\mathcal{L}(W_{i}^{1})=\mathcal{L}(W_{i}).

Then,

dTV(W,Po(𝔼(W)))min(1,𝔼(W)1)iΓ(pi𝔼(Xi+Zi)+𝔼(XiZi)+pi𝔼|W~i1Wi1|)d_{\text{TV}}(W,Po(\mathbb{E}(W)))\leq\min(1,\mathbb{E}(W)^{-1})\sum_{i\in\Gamma}\left(p_{i}\mathbb{E}(X_{i}+Z_{i})+\mathbb{E}(X_{i}Z_{i})+p_{i}\mathbb{E}|\widetilde{W}_{i}^{1}-W_{i}^{1}|\right) (24)

Note that this lemma requires us to have a copy of Wi|Xi=1W_{i}|X_{i}=1. To create such a copy, we will use the following lemma to couple the configuration model with itself conditioned on the containment of a given subgraph.

Lemma 4.8.

Let G=(AB,E)G=(A\cup B,E) be a balanced bipartite complete graph and let \mathcal{M} be a uniformly chosen random perfect matching of GG. Suppose that a1,a2,,aka_{1},a_{2},\ldots,a_{k} are distinct elements of AA and b1,b2,,bkb_{1},b_{2},\ldots,b_{k} are distinct elements of BB. Then, the following procedure gives a copy of |(a1b1,a2b2,,akbk)\mathcal{M}|(a_{1}b_{1},a_{2}b_{2},\ldots,a_{k}b_{k}\in\mathcal{M}):

  1. 1.

    Sample an element MM from \mathcal{M} and set M0=MM_{0}=M.

  2. 2.

    For each iki\leq k:

    • If aibia_{i}b_{i} is an edge of Mi1M_{i-1} set Mi=Mi1M_{i}=M_{i-1}.

    • Otherwise, let aia_{i}^{\prime} be the unique neighbour of bib_{i} and bib_{i}^{\prime} be the unique neighbour of aia_{i} in Mi1M_{i-1}. Let Mi=(Mi1{aibi,aibi}){aibi,aibi}M_{i}=(M_{i-1}\setminus\{a_{i}b_{i}^{\prime},a_{i}^{\prime}b_{i}\})\cup\{a_{i}b_{i},a_{i}^{\prime}b_{i}^{\prime}\}.

That is MkM_{k} is sampled uniformly from |(a1b1,a2b2,,akbk)\mathcal{M}|(a_{1}b_{1},a_{2}b_{2},\ldots,a_{k}b_{k}\in\mathcal{M}).

Proof.

Note that it is sufficient to prove the lemma for k=1k=1 as if =|(a1b1,a2b2,,ak1bk1)\mathcal{B}=\mathcal{M}|(a_{1}b_{1},a_{2}b_{2},\ldots,a_{k-1}b_{k-1}\in\mathcal{M}) then clearly, |(akbk)=|(a1b1,a2b2,,akbk)\mathcal{B}|(a_{k}b_{k}\in\mathcal{B})=\mathcal{M}|(a_{1}b_{1},a_{2}b_{2},\ldots,a_{k}b_{k}\in\mathcal{M}). So, the general case follows by induction on the k=1k=1 case.

Now, we prove the lemma for k=1k=1. To do so, we will show that each of the (n1)!(n-1)! atoms of |(a1b1)\mathcal{M}|(a_{1}b_{1}\in\mathcal{M}) comes from precisely nn atoms of \mathcal{M} via this switching approach. So let MM be an atom of |(a1b1)\mathcal{M}|(a_{1}b_{1}\in\mathcal{M}) and let e=cde=cd be an edge of MM with cA,dBc\in A,d\in B. Now, consider the inverse switching, M(M{ab,cd}){ad,bc}M\to(M\setminus\{ab,cd\})\cup\{ad,bc\} (note if we chose the edge abab, then this is simply the identity MMM\to M). Thus, for each atom MM of |(a1b1)\mathcal{M}|(a_{1}b_{1}\in\mathcal{M}) there are nn ways to get to an atom of \mathcal{M}. Similarly we can show that each atom of \mathcal{M} is mapped to a unique element of |(a1b1)\mathcal{M}|(a_{1}b_{1}\in\mathcal{M}). Thus, as \mathcal{M} has the uniform distribution, so does the random variable obtained by our procedure above. ∎

Note that this lemma does not allow us to generate the configuration model conditioned on the existence of a subgraph specified in the usual way by the locations of its vertices unless all of the involved vertices have in- and out-degrees at most 11. This is due to the fact that Lemma 4.8 allows us to condition on which pairs of stubs are connected rather than which pairs of vertices. Instead we shall condition on the existence of principal subgraphs which we shall define as follows.

Definition 4.9.

Let G(𝐝)G(\mathbf{d}) be a configuration model random digraph with degree distribution 𝐝\mathbf{d}. Let M=M(G(𝐝))M=M(G(\mathbf{d})) be the associated perfect matching of in- and out-stubs. A principal subgraph of G(𝐝)G(\mathbf{d}) is an event of the form MMM^{\prime}\subseteq M where MM^{\prime} is a partial matching of in- and out-stubs.

A similar notion was seen in [15] in which their canonical events are precisely the same as our principal subgraphs. Furthermore, note that the event corresponding to a subgraph in G(𝐝)G(\mathbf{d}) may be written as a union of events corresponding to principal subgraphs. In particular, the existence of the cycle v1v2vkv_{1}v_{2}\ldots v_{k} in G(𝐝)G(\mathbf{d}) can be written as the union of i=1kdG(vi)dG+(vi)\prod_{i=1}^{k}d_{G}^{-}(v_{i})d_{G}^{+}(v_{i}) events corresponding to principal cycles. For this reason, in places where there may be some ambiguity as to whether we are dealing with principal subgraphs or not, we shall refer to subgraphs in the usual sense as union subgraphs.

This leaves us in a setting to which we can apply Theorem 4.7. We shall show that the number of cycles in G(𝐝)G(\mathbf{d}) of lengths between α/|Q|\alpha/|Q| and g(n)g(n) is Poisson distributed with mean ξα\xi_{\alpha}. Let Γ\Gamma be the set of all principal cycles which have lengths between α/|Q|\alpha/|Q| and g(n)g(n). For each CΓC\in\Gamma and digraph JJ on the same vertex set and stubs as G(𝐝)G(\mathbf{d}) let XC(J)X_{C}(J) be the indicator function that CC is a principal subgraph of JJ. Also, we split Γ{C}\Gamma\setminus\{C\} into a set strongly dependent on CC and a set weakly dependent on CC. Define the strongly dependent set ΓCs\Gamma_{C}^{s} to be the set of all principal cycles which share at least one vertex with CC. The weakly dependent set ΓCw\Gamma_{C}^{w} contains all of the other principal cycles, it is the set of principal cycles which are vertex disjoint from CC. Finally, we define WC1W_{C}^{1} to be WCW_{C} for an independent copy GG^{\prime} of G(𝐝)G(\mathbf{d}) and G~C\widetilde{G}_{C} to be obtained from GG^{\prime} by applying a 44-cycle switching to each edge of CC in GCG_{C} in turn and define W~C1\widetilde{W}_{C}^{1} in the obvious way to be WCW_{C} for the digraph G~C\widetilde{G}_{C}. That is,

WC=CΓCwXC(G)W~C1=CΓCwXC(G~C)W_{C}=\sum_{C^{\prime}\in\Gamma_{C}^{w}}X_{C^{\prime}}(G^{\prime})\qquad\qquad\qquad\widetilde{W}_{C}^{1}=\sum_{C^{\prime}\in\Gamma_{C}^{w}}X_{C^{\prime}}(\widetilde{G}_{C})

These variables clearly satisfy the assumptions of Theorem 4.7 and so we must bound the expectations in the statement of the theorem to compute an upper bound on dTV(W,Po(𝔼(W)))d_{\text{TV}}(W,Po(\mathbb{E}(W))). The idea now is to reduce the whole problem to one of bounding the expected numbers of certain subgraphs being contained in G(𝐝)G(\mathbf{d}), GG^{\prime} and G~C\widetilde{G}_{C}. For the remainder of this section, we write XCX_{C} for XC(G)X_{C}(G) and pC=𝔼(XC)p_{C}=\mathbb{E}(X_{C}) unless specified otherwise.

We shall bound the three terms from (24) one by one. First let us consider the term

η:=CΓpC𝔼(XC+ZC)=CΓCΓCs{C}pCpC\eta:=\sum_{C\in\Gamma}p_{C}\mathbb{E}(X_{C}+Z_{C})=\sum_{C\in\Gamma}\sum_{C^{\prime}\in\Gamma_{C}^{s}\cup\{C\}}p_{C}p_{C}^{\prime} (25)

Note that the set, {CΓCs{C}}\{C^{\prime}\in\Gamma_{C}^{s}\cup\{C\}\} is the set of all cycles which share at least one vertex with CC. Thus, the union of CC and CC^{\prime} is a strongly connected digraph (here we allow multiple edges). In the computation of η\eta we will use the following generalisation of [5, Lemma 3.4]. No changes are required to the proof of the lemma as presented in [5] to generalise from digraphs to multi-digraphs.

Lemma 4.10.

Each strongly connected multi-digraph DD with excess kk may be formed in at most 27k27^{k} ways as the union of a pair of directed cycles C1C_{1} and C2C_{2}.

Define Ξ\Xi to be the set of all strongly connected multi-digraphs with vertices a subset of V(G(𝐝))V(G(\mathbf{d})) such that all vertices have degrees d+(v)=d(v)=1d^{+}(v)=d^{-}(v)=1 or d+(v)=d(v)=2d^{+}(v)=d^{-}(v)=2 with at least one vertex which has d+(v)=d(v)=2d^{+}(v)=d^{-}(v)=2. Note that Ξ\Xi is precisely the set of multi-digraphs which can be formed as the edge disjoint union of the two cycles CΓC\in\Gamma and CΓCsC^{\prime}\in\Gamma_{C}^{s}. Furthermore, note that the excess of a multi-digraph in Ξ\Xi is precisely the number of vertices which have d+(v)=d(v)=2d^{+}(v)=d^{-}(v)=2. We let Ξk\Xi_{k} be the set Ξkh:={FΞ||F|=h and excess(F)=k}\Xi_{k}^{h}:=\{F\in\Xi||F|=h\text{ and }\text{excess}(F)=k\}. Moreover, for each FΞF\in\Xi define

t(F):=vV(F)dG(v)dF(v)dG+(v)dF+(v)t(F):=\prod_{v\in V(F)}d^{-}_{G}(v)^{d^{-}_{F}(v)}d^{+}_{G}(v)^{d^{+}_{F}(v)} (26)

Observe that for a given C,CC,C^{\prime} whose union is FF, t(F)t(F) is the number of pairs of principal cycles C~,C~\widetilde{C},\widetilde{C}^{\prime} which are copies of the same cycles as C,CC,C^{\prime}. Thus, combining (26) with Lemma 4.10 allows us to bound η\eta as follows,

η=CΓCΓCs{C}pCpCh=α|Q|2g(n)k=1hFΞkh27kt(F)(mhk)h+k\eta=\sum_{C\in\Gamma}\sum_{C^{\prime}\in\Gamma_{C}^{s}\cup\{C\}}p_{C}p_{C^{\prime}}\leq\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{k=1}^{h}\sum_{F\in\Xi_{k}^{h}}\frac{27^{k}t(F)}{(m-h-k)^{h+k}} (27)

Now, let Λkh\Lambda_{k}^{h} be the set of all strongly connected labelled multi-digraphs with hh vertices such that all vertices have degrees d+(v)=d(v)=1d^{+}(v)=d^{-}(v)=1 or d+(v)=d(v)=2d^{+}(v)=d^{-}(v)=2 with precisely kk vertices vv such that d+(v)=d(v)=2d^{+}(v)=d^{-}(v)=2. This allows us to write

FΞkht(F)=1h!FΛkhϕ:V(F)V(G)vV(F)dG(v)dF(v)dG+(v)dF+(v),\sum_{F\in\Xi_{k}^{h}}t(F)=\frac{1}{h!}\sum_{F\in\Lambda_{k}^{h}}\sum_{\phi:V(F)\hookrightarrow V(G)}\prod_{v\in V(F)}d^{-}_{G}(v)^{d^{-}_{F}(v)}d^{+}_{G}(v)^{d^{+}_{F}(v)}, (28)

where the division by h!h! comes from the fact that Λkh\Lambda_{k}^{h} distinguishes between digraphs obtained from one-another by a permutation of the vertex set. Arguing similarly to Lemma 2.1, noting that we know the degree sequence of FF and using μ1,1=mn(1+Q)mn\mu_{1,1}=\frac{m}{n}(1+Q)\leq\frac{m}{n}, we deduce that

ϕ:V(F)V(G)vV(F)dG(v)dF(v)dG+(v)dF+(v)nhμ1,1hkμ2,2kmhμ2,2kμ1,1k.\sum_{\phi:V(F)\hookrightarrow V(G)}\prod_{v\in V(F)}d^{-}_{G}(v)^{d^{-}_{F}(v)}d^{+}_{G}(v)^{d^{+}_{F}(v)}\leq n^{h}\mu_{1,1}^{h-k}\mu_{2,2}^{k}\leq m^{h}\mu_{2,2}^{k}\mu_{1,1}^{-k}.

Substituting into (28) and applying Lemma 3.1 we find

FΞkht(F)kh+k(h+k)!h!(hk)mhμ2,2kμ1,1k\displaystyle\sum_{F\in\Xi_{k}^{h}}t(F)\leq\frac{k}{h+k}\frac{(h+k)!}{h!}\binom{h}{k}m^{h}\mu_{2,2}^{k}\mu_{1,1}^{-k} =kh+k(h+k2k)(2k)!k!mhμ2,2kμ1,1k\displaystyle=\frac{k}{h+k}\binom{h+k}{2k}\frac{(2k)!}{k!}m^{h}\mu_{2,2}^{k}\mu_{1,1}^{-k}
(h+k)2k1(k1)!mhμ2,2kμ1,1k.\displaystyle\leq\frac{(h+k)^{2k-1}}{(k-1)!}m^{h}\mu_{2,2}^{k}\mu_{1,1}^{-k}. (29)

To finish, we note that ((mhk)h+k)=(1+o(1))mh+k((m-h-k)^{h+k})=(1+o(1))m^{h+k} as h+k=o(m1/2)h+k=o(m^{1/2}), this allows us to substitute the bound found in (29) into (27) where we deduce

η\displaystyle\eta (1+o(1))h=α|Q|2g(n)k=1h27kmh+k(h+k)2k1(k1)!mhμ2,2kμ1,1k\displaystyle\leq(1+o(1))\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{k=1}^{h}\frac{27^{k}}{m^{h+k}}\frac{(h+k)^{2k-1}}{(k-1)!}m^{h}\mu_{2,2}^{k}\mu_{1,1}^{-k}
(1+o(1))216g(n)μ2,2mh=α|Q|2g(n)k=0h108kh2kμ2,2kmkk!μ1,1k.\displaystyle\leq(1+o(1))\frac{216g(n)\mu_{2,2}}{m}\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{k=0}^{h}\frac{108^{k}h^{2k}\mu_{2,2}^{k}}{m^{k}k!\mu_{1,1}^{k}}. (30)

Part of the expression in (30) is in the form of an exponential sum. Evaluating this sum allows us to deduce that η=o(1)\eta=o(1).

η(1+o(1))216g(n)μ2,2mμ1,1h=α|Q|2g(n)e108h2μ2,2mμ1,1(1+o(1))432g(n)2μ2,2mμ1,1e432g(n)2μ2,2mμ1,1=o(1),\eta\leq(1+o(1))\frac{216g(n)\mu_{2,2}}{m\mu_{1,1}}\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}e^{\frac{108h^{2}\mu_{2,2}}{m\mu_{1,1}}}\leq(1+o(1))\frac{432g(n)^{2}\mu_{2,2}}{m\mu_{1,1}}e^{\frac{432g(n)^{2}\mu_{2,2}}{m\mu_{1,1}}}=o(1), (31)

where we deduce that this is o(1)o(1) in the same way as when we bound (23).

Next we shall consider the term,

θ:=CΓ𝔼(XCZC)=CΓCΓCs{C}𝔼(XCXC)\theta:=\sum_{C\in\Gamma}\mathbb{E}(X_{C}Z_{C})=\sum_{C\in\Gamma}\sum_{C^{\prime}\in\Gamma_{C}^{s}\cup\{C\}}\mathbb{E}(X_{C}X_{C}^{\prime}) (32)

So note that 𝔼(XCXC)\mathbb{E}(X_{C}X_{C^{\prime}}) is the probability that both CC and CC^{\prime} are simultaneously present. Furthermore, note that CCC\cup C^{\prime} is a strongly connected (not necessarily simple) digraph with maximum total degree at most 44. Thus we can use a similar strategy to the one used to bound η\eta. So define Θ\Theta to be the set of all strongly connected multi-digraphs FF with V(F)V(G)V(F)\subseteq V(G) and Δ(F)4\Delta(F)\leq 4. Furthermore for a,ba,b\in\mathbb{N}, define

Θa,bh:={FΘ||F|=h,n1,2(F)=n2,1(F)=a,n2,2(F)=b},\Theta_{a,b}^{h}:=\{F\in\Theta||F|=h,n_{1,2}(F)=n_{2,1}(F)=a,n_{2,2}(F)=b\},

where we define Θ0,0h=\Theta_{0,0}^{h}=\emptyset. Define t(F)t(F) for FΘF\in\Theta in the same way as (26). Then, for a given construction, F=CCF=C\cup C^{\prime} from a pair of principal cycles, t(F)t(F) is again the number of pairs of principal cycles C~,C~\widetilde{C},\widetilde{C}^{\prime} which are copies of the same cycles as C,CC,C^{\prime}. Thus, if we apply Lemma 4.10 we get the following bound on θ\theta,

θ=CΓCΓCs𝔼(XCXC)h=α|Q|2g(n)a=0hb=0hFΘa,bh27kt(F)(mhab)h+a+b\theta=\sum_{C\in\Gamma}\sum_{C^{\prime}\in\Gamma_{C}^{s}}\mathbb{E}(X_{C}X_{C^{\prime}})\leq\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{a=0}^{h}\sum_{b=0}^{h}\sum_{F\in\Theta_{a,b}^{h}}\frac{27^{k}t(F)}{(m-h-a-b)^{h+a+b}} (33)

Now, we define Λa,bh\Lambda_{a,b}^{h} to be the set of all strongly connected labelled multi-digraphs FF with hh vertices such that n1,1(F)=h2abn_{1,1}(F)=h-2a-b, n1,2(F)=n2,1(F)=an_{1,2}(F)=n_{2,1}(F)=a and n2,2(F)=bn_{2,2}(F)=b. (Note Λ0,kh=Λkh\Lambda_{0,k}^{h}=\Lambda_{k}^{h}). Thus, arguing as previously,

FΘa,bht(F)=1h!FΛa,bhϕ:V(F)V(G)vV(F)dG(v)dF(v)dG+(v)dF+(v)\sum_{F\in\Theta_{a,b}^{h}}t(F)=\frac{1}{h!}\sum_{F\in\Lambda_{a,b}^{h}}\sum_{\phi:V(F)\hookrightarrow V(G)}\prod_{v\in V(F)}d^{-}_{G}(v)^{d^{-}_{F}(v)}d^{+}_{G}(v)^{d^{+}_{F}(v)} (34)

Again, we may argue similarly to Lemma 2.1 to deduce that for FΛa,bhF\in\Lambda_{a,b}^{h},

ϕ:V(F)V(G)vV(F)dG(v)dF(v)dG+(v)dF+(v)mhμ1,2aμ2,1aμ2,2bμ1,1(2a+b).\sum_{\phi:V(F)\hookrightarrow V(G)}\prod_{v\in V(F)}d^{-}_{G}(v)^{d^{-}_{F}(v)}d^{+}_{G}(v)^{d^{+}_{F}(v)}\leq m^{h}\mu_{1,2}^{a}\mu_{2,1}^{a}\mu_{2,2}^{b}\mu_{1,1}^{-(2a+b)}.

We can substitute this into (33) and apply Lemma 3.1 to find

FΘa,bht(F)\displaystyle\sum_{F\in\Theta_{a,b}^{h}}t(F) 3a+2bh+a+b(h+a+b)!h!(ha,a,b)mhμ1,2aμ2,1aμ2,2bμ1,1(2a+b)\displaystyle\leq\frac{3a+2b}{h+a+b}\frac{(h+a+b)!}{h!}\binom{h}{a,a,b}m^{h}\mu_{1,2}^{a}\mu_{2,1}^{a}\mu_{2,2}^{b}\mu_{1,1}^{-(2a+b)}
3a+2bh2a+bh3a+2ba!a!b!mhμ1,2aμ2,1aμ2,2bμ1,1(2a+b)\displaystyle\leq\frac{3a+2b}{h}2^{a+b}\frac{h^{3a+2b}}{a!a!b!}m^{h}\mu_{1,2}^{a}\mu_{2,1}^{a}\mu_{2,2}^{b}\mu_{1,1}^{-(2a+b)}
=3a+2bh1a!a!b!(2h3μ1,2μ2,1μ1,12)a(2h2μ2,2μ1,1)bmh\displaystyle=\frac{3a+2b}{h}\frac{1}{a!a!b!}\bigg{(}\frac{2h^{3}\mu_{1,2}\mu_{2,1}}{\mu_{1,1}^{2}}\bigg{)}^{a}\bigg{(}\frac{2h^{2}\mu_{2,2}}{\mu_{1,1}}\bigg{)}^{b}m^{h} (35)

Note that this bound is a sum of two terms due to the factor 3a+2b3a+2b in (35). So we will split this bound into ta(F)+tb(F)t_{a}(F)+t_{b}(F) in the obvious way. This allows us to bound θθa+θb\theta\leq\theta_{a}+\theta_{b} by only considering the ta(F)t_{a}(F) or tb(F)t_{b}(F) terms which will be convenient for us. We substitute the bound from (35) into (33) and use that (mhab)h+a+b=(1+o(1))mh+a+b(m-h-a-b)^{h+a+b}=(1+o(1))m^{h+a+b} to deduce

θa\displaystyle\theta_{a} (1+o(1))h=α|Q|2g(n)a=0hb=0h3ah1a!a!b!(2272h3μ1,2μ2,1mμ1,12)a(54h2μ2,2mμ1,1)b\displaystyle\leq(1+o(1))\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{a=0}^{h}\sum_{b=0}^{h}\frac{3a}{h}\frac{1}{a!a!b!}\bigg{(}\frac{2\cdot 27^{2}h^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}\bigg{)}^{a}\bigg{(}\frac{54h^{2}\mu_{2,2}}{m\mu_{1,1}}\bigg{)}^{b} (36)
θb\displaystyle\theta_{b} (1+o(1))h=α|Q|2g(n)a=0hb=0h2bh1a!a!b!(2272h3μ1,2μ2,1mμ1,12)a(54h2μ2,2mμ1,1)b\displaystyle\leq(1+o(1))\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{a=0}^{h}\sum_{b=0}^{h}\frac{2b}{h}\frac{1}{a!a!b!}\bigg{(}\frac{2\cdot 27^{2}h^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}\bigg{)}^{a}\bigg{(}\frac{54h^{2}\mu_{2,2}}{m\mu_{1,1}}\bigg{)}^{b} (37)

Note that in both cases we can evaluate the two inner sums in terms of exponential functions and modified Bessel functions of the first kind. However the latter of these is a little difficult to work with, hence we will replace the a!a!a!a! with (2a)!(2a)! allowing us to use hyperbolic trigonometric functions instead. In particular, we have

1a!a!4a(2a)! and 1a!(a1)!4a(2a1)!.\frac{1}{a!a!}\leq\frac{4^{a}}{(2a)!}\;\text{ and }\;\frac{1}{a!(a-1)!}\leq\frac{4^{a}}{(2a-1)!}.

Looking first at θa\theta_{a} we get the bound

θa\displaystyle\theta_{a} (1+o(1))h=α|Q|2g(n)a=0hb=0h3h1(2a1)!b!(8272h3μ1,2μ2,1mμ1,12)a(54h2μ2,2mμ1,1)b\displaystyle\leq(1+o(1))\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{a=0}^{h}\sum_{b=0}^{h}\frac{3}{h}\frac{1}{(2a-1)!b!}\bigg{(}\frac{8\cdot 27^{2}h^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}\bigg{)}^{a}\bigg{(}\frac{54h^{2}\mu_{2,2}}{m\mu_{1,1}}\bigg{)}^{b}
=(1+o(1))1622h=α|Q|2g(n)hμ1,2μ2,1mμ1,12sinh(542h3μ1,2μ2,1mμ1,12)e54h2μ2,2mμ1,1\displaystyle=(1+o(1))162\sqrt{2}\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sqrt{\frac{h\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}}\sinh\left(54\sqrt{\frac{2h^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}}\right)e^{\frac{54h^{2}\mu_{2,2}}{m\mu_{1,1}}}
(1+o(1))648g(n)3μ1,2μ2,1mμ1,12sinh(216g(n)3μ1,2μ2,1mμ1,12)e216g(n)2μ2,2mμ1,1,\displaystyle\leq(1+o(1))648\sqrt{\frac{g(n)^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}}\sinh\left(216\sqrt{\frac{g(n)^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}}\right)e^{\frac{216g(n)^{2}\mu_{2,2}}{m\mu_{1,1}}}, (38)

where the final inequality follows from the fact that x\sqrt{x}, sinh(x)\sinh(x) and exe^{x} are all increasing for x>0x>0. Bounding θb\theta_{b} is similar,

θb\displaystyle\theta_{b} (1+o(1))h=α|Q|2g(n)a=0hb=1h2h1(2a)!(b1)!(8272h3μ1,2μ2,1mμ1,12)a(54h2μ2,2mμ1,1)b\displaystyle\leq(1+o(1))\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\sum_{a=0}^{h}\sum_{b=1}^{h}\frac{2}{h}\frac{1}{(2a)!(b-1)!}\bigg{(}\frac{8\cdot 27^{2}h^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}\bigg{)}^{a}\bigg{(}\frac{54h^{2}\mu_{2,2}}{m\mu_{1,1}}\bigg{)}^{b}
=(1+o(1))108h=α|Q|2g(n)hμ2,2mμ1,1cosh(542h3μ1,2μ2,1mμ1,12)e54h2μ2,2mμ1,1\displaystyle=(1+o(1))108\sum_{h=\frac{\alpha}{|Q|}}^{2g(n)}\frac{h\mu_{2,2}}{m\mu_{1,1}}\cosh\left(54\sqrt{\frac{2h^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}}\right)e^{\frac{54h^{2}\mu_{2,2}}{m\mu_{1,1}}}
(1+o(1))432g(n)2μ2,2mμ1,1cosh(216g(n)3μ1,2μ2,1mμ1,12)e216g(n)2μ2,2mμ1,1\displaystyle\leq(1+o(1))432\frac{g(n)^{2}\mu_{2,2}}{m\mu_{1,1}}\cosh\left(216\sqrt{\frac{g(n)^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}}\right)e^{\frac{216g(n)^{2}\mu_{2,2}}{m\mu_{1,1}}} (39)

Both (38) and (39) are o(1)o(1) which follows from the facts that

g(n)2μ2,2mμ1,10 and g(n)3μ1,2μ2,1mμ1,120.\frac{g(n)^{2}\mu_{2,2}}{m\mu_{1,1}}\to 0\qquad\text{ and }\qquad\frac{g(n)^{3}\mu_{1,2}\mu_{2,1}}{m\mu_{1,1}^{2}}\to 0.

The first of which we showed in (31) and the latter follows directly from the assumption that g(n)3=o(mRR+)g(n)^{3}=o(\frac{m}{R^{-}R^{+}}). Hence θ=o(1)\theta=o(1). Finally, we will bound the terms,

κC:=𝔼|W~C1WC1|.\kappa_{C}:=\mathbb{E}|\widetilde{W}_{C}^{1}-W_{C}^{1}|. (40)

We note that unlike in bounding η\eta and θ\theta we will not need to look at the sum over CΓC\in\Gamma in computing the bound. This is due to the fact that κC\kappa_{C} depends on a much more global structure than the terms pC𝔼(XC+ZC)p_{C}\mathbb{E}(X_{C}+Z_{C}) and 𝔼(XCZC)\mathbb{E}(X_{C}Z_{C}). For the bounding of κC\kappa_{C}, the first thing to observe is that due to the choice of coupling and ΓCs\Gamma_{C}^{s} we have W~C1WC1\widetilde{W}_{C}^{1}\geq W_{C}^{1}. This is because all of the edges which are removed by the switchings are incident to a vertex of CC and therefore none of the principal cycles containing these edges are in ΓCw\Gamma_{C}^{w} which are the cycles contributing to W~C1WC1\widetilde{W}_{C}^{1}\geq W_{C}^{1}. This enables us to bound κC\kappa_{C} by computing the number of expected cycles from ΓCw\Gamma_{C}^{w} which are added by the switchings.

In order to add a cycle CC^{\prime} with the switchings over 44-cycles which produce G~C\widetilde{G}_{C} from GG^{\prime} the following must be true for some kk,

  • All but kk edges of CC^{\prime} must be present in GG^{\prime}.

  • kk of the edges of CC must add these edges after applying the switching.

Now let us compute the probability that this event comes to pass. So let us fix kk edges missing from CC^{\prime} which we match up with kk of the edges of CC which will be used to add them when we apply the switching. So if one such edge is e=uve=uv and this is matched with e=uve^{\prime}=u^{\prime}v^{\prime}, in order for the switching to add ee, before switching we must have edges uvuv^{\prime} and uvu^{\prime}v present in GG^{\prime} (note that because the edges are directed we do not need to consider the alternate switching using uuuu^{\prime} and vvvv^{\prime} like we would if working with graphs). Hence in total there are |C|+k|C^{\prime}|+k edges which we must find in GG^{\prime} before switching in order that CC is a cycle of G~C\widetilde{G}_{C}. The probability we find these edges is ((m)|C|+k)1((m)_{|C^{\prime}|+k})^{-1}.

Now, let us count how many such structures there are which produce the same union cycle as CC^{\prime}. As CC is a principal cycle, we know exactly which stubs we use for it and so there is no contribution to the number of copies from vertices of CC. However there are d+(u)d(v)d^{+}(u)d^{-}(v) switchings which add the union edge uvuv and so the number of ways in which the switching structure with a union cycle which is that of CC^{\prime} can be found is

vV(C)d(v)d+(v).\prod_{v\in V(C^{\prime})}d^{-}(v)d^{+}(v).

Thus for any fixed set of missing edges for the union cycle with the same edges as CC^{\prime} and choice of edges from CC with which we add these edges, the expected number of additional principal cycles after the switchings is

1(m)|C|+kvV(C)d(v)d+(v).\frac{1}{(m)_{|C^{\prime}|+k}}\prod_{v\in V(C^{\prime})}d^{-}(v)d^{+}(v). (41)

There are (|C|k)\binom{|C|}{k} ways to pick the edges of CC which we switch over to add the missing edges of CC^{\prime}. Also there are (|C|k)\binom{|C^{\prime}|}{k} ways to pick the missing edges of CC^{\prime} and k!k! ways to assign edges of CC to missing edges of CC^{\prime}. Thus, when we sum over principal cycles CC^{\prime}, missing edges of CC and CC^{\prime} and matchings of the missing edges, we find the expected number of such structures is at most

𝔼|W~C1WC1|h=α|Q|g(n)k=1min(|C|,h)1hCG;|C|=h(|C|k)(hk)k!(m)h+kvV(C)d(v)d+(v).\mathbb{E}|\widetilde{W}_{C}^{1}-W_{C}^{1}|\leq\sum_{h=\frac{\alpha}{|Q|}}^{g(n)}\sum_{k=1}^{\min(|C|,h)}\frac{1}{h}\sum_{C^{\prime}\subseteq G;|C^{\prime}|=h}\binom{|C|}{k}\binom{h}{k}\frac{k!}{(m)_{h+k}}\prod_{v\in V(C^{\prime})}d^{-}(v)d^{+}(v). (42)

Applying standard bounds on binomial coefficients and falling factorials and taking vertices from V(G(𝐝))V(G(\mathbf{d})) for |C||C^{\prime}| with replacement rather than without allows us to bound (42) as follows

𝔼|W~C1WC1|\displaystyle\mathbb{E}|\widetilde{W}_{C}^{1}-W_{C}^{1}| (1+o(1))h=α|Q|g(n)k=1|C|khk1k!mk(1+Q)h\displaystyle\leq(1+o(1))\sum_{h=\frac{\alpha}{|Q|}}^{g(n)}\sum_{k=1}^{\infty}\frac{|C|^{k}h^{k-1}}{k!m^{k}}(1+Q)^{h}
(1+o(1))|C|=α|Q|g(n)e|C|hm1|C|\displaystyle\leq(1+o(1))\sum_{|C^{\prime}|=\frac{\alpha}{|Q|}}^{g(n)}\frac{e^{\frac{|C|h}{m}}-1}{|C^{\prime}|}
g(n)|Q|α(eg(n)2m1)2g(n)3|Q|αm=o(1),\displaystyle\leq\frac{g(n)|Q|}{\alpha}\left(e^{\frac{g(n)^{2}}{m}}-1\right)\leq\frac{2g(n)^{3}|Q|}{\alpha m}=o(1), (43)

where we note that g(n)2/m<1g(n)^{2}/m<1 allows us to use the bound ex12xe^{x}-1\leq 2x in (43). So κCκ=o(1)\kappa_{C}\leq\kappa=o(1) for all CC where κ=2g(n)3|Q|αm\kappa=\frac{2g(n)^{3}|Q|}{\alpha m}. Finally, note that this implies CΓpCκCκ𝔼(W)\sum_{C\in\Gamma}p_{C}\kappa_{C}\leq\kappa\mathbb{E}(W). We subsequently show that 𝔼(W)ξα+o(1)\mathbb{E}(W)\leq\xi_{\alpha}+o(1) from which it follows that CΓpCκC=o(1)\sum_{C\in\Gamma}p_{C}\kappa_{C}=o(1) as ξα\xi_{\alpha} is a constant independent of nn. Using this and the fact that η,θ=o(1)\eta,\theta=o(1), we conclude that

dTV(W,Po(𝔼(W)))=o(1).d_{\text{TV}}(W,Po(\mathbb{E}(W)))=o(1).

Note that WW is the number of cycles of G(𝐝)G(\mathbf{d}) of length between α/|Q|\alpha/|Q| and g(n)g(n). By Lemma 4.1 and Lemma 4.4, the probability that there are any longer cycles is o(1)o(1). Thus, if WW^{\prime} is the number of cycles of G(𝐝)G(\mathbf{d}) of length at least α/|Q|\alpha/|Q|, dTV(W,Po(𝔼(W)))=dTV(W,Po(𝔼(W)))+o(1)=o(1)d_{\text{TV}}(W^{\prime},Po(\mathbb{E}(W)))=d_{\text{TV}}(W,Po(\mathbb{E}(W)))+o(1)=o(1). To show that the mean of the corresponding Poisson distribution can be taken to be ξα\xi_{\alpha} note that dTV(Po(λ),Po(μ))|λμ|d_{\text{TV}}(Po(\lambda),Po(\mu))\leq|\lambda-\mu| holds for all λ,μ0\lambda,\mu\geq 0. This follows from coupling Bin(n,λ/n)\text{Bin}(n,\lambda/n) and Bin(n,μ/n)\text{Bin}(n,\mu/n) by coupling their constituent Bernoulli trials and noting that dTV(Bin(n,λ/n),Po(λ))=on(1)d_{\text{TV}}(\text{Bin}(n,\lambda/n),Po(\lambda))=o_{n}(1). Thus it suffices to show that 𝔼(W)=ξα+o(1)\mathbb{E}(W)=\xi_{\alpha}+o(1). Upper bounding 𝔼(W)\mathbb{E}(W) may be done in the same way as the proof of Lemma 4.4. However, we can be more precise than we are in (21) as in this case 2h2/m=o(1)2h^{2}/m=o(1), so this line can be replaced by

h=α|Q|g(n)ehQ+2h2mhh=α|Q|g(n)ehQ+o(1)h(1+o(1))α|Q|ehQh𝑑h=(1+o(1))αeλλ𝑑λ=ξα+o(1).\sum_{h=\frac{\alpha}{|Q|}}^{g(n)}\frac{e^{hQ+\frac{2h^{2}}{m}}}{h}\leq\sum_{h=\frac{\alpha}{|Q|}}^{g(n)}\frac{e^{hQ+o(1)}}{h}\leq(1+o(1))\int_{\frac{\alpha}{|Q|}}^{\infty}\frac{e^{hQ}}{h}dh=(1+o(1))\int_{\alpha}^{\infty}\frac{e^{-\lambda}}{\lambda}d\lambda=\xi_{\alpha}+o(1). (44)

Next, we lower bound 𝔼(W)\mathbb{E}(W) by ξαo(1)\xi_{\alpha}-o(1). In order to do this we will use Lemma 2.3 in combination with the inequality

1xexx2 for x12.1-x\geq e^{-x-x^{2}}\;\text{ for }\;x\leq\frac{1}{2}.

This allows us to deduce that the probability of finding a cycle of length between α/|Q|\alpha/|Q| and g(n)g(n) is at least

h=α|Q|g(n)h!h(nh)(1+Q)h(n)h(12h2Δ2εn)(1hΔ22m)h=α|Q|g(n)(1+o(1))heh(QQ22)\sum_{h=\frac{\alpha}{|Q|}}^{g(n)}\frac{h!}{h}\binom{n}{h}\frac{(1+Q)^{h}}{(n)_{h}}\left(1-\frac{2h^{2}\Delta^{2}}{\varepsilon n}\right)\left(1-\frac{h\Delta^{2}}{2m}\right)\geq\sum_{h=\frac{\alpha}{|Q|}}^{g(n)}\frac{(1+o(1))}{h}e^{h(Q-\frac{Q^{2}}{2})} (45)

Now, that the exponent in the above is equal to hQo(1)hQ-o(1) follows from the assumption g(n)=o(1/|Q|2)g(n)=o(1/|Q|^{2}). Thus in analogy with (44), we deduce that (45) is at most

(1+o(1))h=α|Q|g(n)ehQo(1)h(1o(1))α|Q|ehQh𝑑h=(1o(1))αeλλ𝑑λ=ξαo(1).(1+o(1))\sum_{h=\frac{\alpha}{|Q|}}^{g(n)}\frac{e^{hQ-o(1)}}{h}\geq(1-o(1))\int_{\frac{\alpha}{|Q|}}^{\infty}\frac{e^{hQ}}{h}dh=(1-o(1))\int_{\alpha}^{\infty}\frac{e^{-\lambda}}{\lambda}d\lambda=\xi_{\alpha}-o(1). (46)

Combining (44) and (46) we deduce that 𝔼(W)=ξα+o(1)\mathbb{E}(W)=\xi_{\alpha}+o(1) as required.

4.5 Putting it all together

We can deduce the statement of the theorem from the previous sections as follows. First we apply Lemma 4.1Lemma 4.4 and Lemma 4.5 to deduce there are no cycles with length at least ω(1/|Q|)\omega(1/|Q|) or complex components. Then, we can apply Lemma 4.6 to deduce that the distribution of the number of cycles with at least α/|Q|\alpha/|Q| vertices converges to a Poisson distribution with mean ξα\xi_{\alpha} from which the statement that

(|𝒞k|α|Q|)=1i=0k1ξαii!eξα+o(1)\mathbb{P}\left(|\mathcal{C}_{k}|\geq\frac{\alpha}{|Q|}\right)=1-\sum_{i=0}^{k-1}\frac{\xi_{\alpha}^{i}}{i!}e^{-\xi_{\alpha}}+o(1)

follows immediately as the above is simply the probability that a Poisson(ξα\xi_{\alpha}) distribution is at least kk (plus the o(1)o(1) term).

5 Concluding Remarks

In this paper, we showed that the largest component of the directed configuration model is of order |Q|1|Q|^{-1} when nQ3(RR+)nQ^{3}(R^{-}R^{+})\to-\infty and found the distribution of the size of the kkth largest component for any kk. In a subsequent paper [4] we shall show that under similar conditions to those in this paper that for degree sequences such that nQ3(RR+)1nQ^{3}(R^{-}R^{+})^{-1}\to\infty the largest component is of order nQ2(RR+)1nQ^{2}(R^{-}R^{+})^{-1}. This quantity matches the 1/|Q|1/|Q| we find in this paper if n|Q|3(RR+)1cn|Q|^{3}(R^{-}R^{+})^{-1}\to c for some constant cc and suggests that one may find a critical window phenomenon for degree sequences with such parameters. Note that in particular this is precisely the critical window of D(n,p)D(n,p) (looking at typical degree sequences from the model D(n,p)D(n,p)). Moreover, the recent result of Donderwinkel and Xie certainly seems to indicate that the point Q=0Q=0 lies inside a critical window.

An interesting question for further study would be to ask about the joint distribution of the largest strongly connected components in the directed configuration model with parameters as in this paper. The fact that we find

(|𝒞k|α|Q|)=1i=0k1ξαii!eξα+o(1)\mathbb{P}\bigg{(}|\mathcal{C}_{k}|\geq\frac{\alpha}{|Q|}\bigg{)}=1-\sum_{i=0}^{k-1}\frac{\xi_{\alpha}^{i}}{i!}e^{-\xi_{\alpha}}+o(1)

seems rather suggestive of an underlying Poisson process. As such we make the following conjecture.

Conjecture 5.1.

Let GG be a directed configuration model with parameters as in Theorem 1.3 and suppose the sizes of its components in descending order are given by the random variables Z1Z2Z_{1}\geq Z_{2}\geq\ldots. Suppose further that X1X2X_{1}\leq X_{2}\leq\ldots are the points of a Poisson process of rate 11 and Y1Y2Y_{1}\geq Y_{2}\geq\ldots are the unique positive solutions to

Xi=Yiexx𝑑x.X_{i}=\int_{Y_{i}}^{\infty}\frac{e^{-x}}{x}dx.

Then we have that

(|Q|Z1,|Q|Z2,)(Y1,Y2,) as n.(|Q|Z_{1},|Q|Z_{2},\ldots)\to(Y_{1},Y_{2},\ldots)\text{ as }n\to\infty.

Note that the above is also an open question for D(n,p)D(n,p) with p=(1ε)/np=(1-\varepsilon)/n and ε0\varepsilon\to 0, ε3n\varepsilon^{3}n\to\infty.

References

  • [1] B. Bollobás. A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. European Journal of Combinatorics, 1(4):311–316, 1980.
  • [2] X. S. Cai and G. Perarnau. The giant component of the directed configuration model revisited. arXiv preprint arXiv:2004.04998, 2020.
  • [3] C. Cooper and A. Frieze. The size of the largest strongly connected component of a random digraph with a given degree sequence. Combinatorics, Probability and Computing, 13(3):319–337, 2004.
  • [4] M. Coulson. The strong component structure of the barely supercritical directed configuration model. Manuscript.
  • [5] M. Coulson. The critical window in random digraphs. arXiv preprint arXiv:1905.00624, 2019.
  • [6] S. Donderwinkel and Z. Xie. Universality for the directed configuration model with random degrees: metric space convergence of the strongly connected components at criticality. arXiv preprint arXiv:2105.11434, 2021.
  • [7] P. Erdos and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci, 5(1):17–60, 1960.
  • [8] T. Erhardsson. Steins method for Poisson and compound Poisson approximation. An introduction to Stein’s method, 4:61–114, 2005.
  • [9] D. A. Freedman. On tail probabilities for martingales. the Annals of Probability, pages 100–118, 1975.
  • [10] C. Goldschmidt and R. Stephenson. The scaling limit of a critical random directed graph. arXiv preprint arXiv:1905.05397, 2019.
  • [11] A. Graf. On the strongly connected components of random directed graphs with given degree sequences. Master’s thesis, University of Waterloo, 2016.
  • [12] H. Hatami and M. Molloy. The scaling window for a random graph with a given degree sequence. Random Structures & Algorithms, 41(1):99–123, 2012.
  • [13] S. Janson, T. Łuczak, and A. Ruciński. Random graphs, volume 45. John Wiley & Sons, 2011.
  • [14] R. M. Karp. The transitive closure of a random digraph. Random Structures & Algorithms, 1(1):73–93, 1990.
  • [15] L. Lu and L. Székely. Using Lovász Local Lemma in the space of random injections. the electronic journal of combinatorics, page R63, 2007.
  • [16] T. Łuczak. The phase transition in the evolution of random digraphs. Journal of graph theory, 14(2):217–223, 1990.
  • [17] T. Łuczak and T. G. Seierstad. The critical behavior of random digraphs. Random Structures & Algorithms, 35(3):271–293, 2009.
  • [18] A. Nachmias and Y. Peres. Critical percolation on random regular graphs. Random Structures & Algorithms, 36(2):111–148, 2010.
  • [19] X. Pérez-Giménez and N. Wormald. Asymptotic enumeration of strongly connected digraphs by vertices and edges. Random Structures & Algorithms, 43(1):80–114, 2013.
  • [20] D. Williams. Probability with martingales. Cambridge university press, 1991.