The strong component structure of the barely subcritical directed configuration model
Abstract
We study the behaviour of the largest components of the directed configuration model in the barely subcritical regime. We show that with high probability all strongly connected components in this regime are either cycles or isolated vertices and give an asymptotic distribution of the size of the th largest cycle. This gives a configuration model analogue of a result of Łuczak and Seierstad for the binomial random digraph.
1 Introduction
1.1 The directed configuration model
Let be a directed degree sequence on vertices and let . The directed configuration model on , introduced by Cooper and Frieze [3], is the random directed multigraph on obtained by associating with vertex in-stubs and in-stubs, and then choosing a perfect matching of in- and out- stubs uniformly at random. We will denote this model by . This is a directed generalisation of the configuration model of Bollobás [1] which since its introduction has become one of the most widely used random graph models.
A strongly connected component in a digraph is a maximal sub-digraph such that there exists a directed path between each ordered pair of vertices. In this paper we will consider the size of the largest strongly connected component in the barely subcritical regime. Note that these components are much smaller than the connected components which are found in the undirected case. This is due to the fact that the minimal strongly connected components are directed cycles as opposed to the trees in the directed case and so we require one additional edge on the same vertex set. We shall call any strongly connected digraph which is not a cycle complex. Note that any such digraph has strictly more edges than vertices.
Let be the number of copies of in and let be the maximum degree of . Also, define the following parameters of the degree distribution. These parameters govern the behaviour of the size of the largest strongly connected component.
Definition 1.1.
We shall assume the following conditions which ensure that the degree sequence is suitably “well behaved”.
Condition 1.2.
For each , is a sequence of ordered pairs of non-negative integers such that . Furthermore, is a probability distribution such that for some ,
-
i)
as for each ,
-
ii)
,
-
iii)
,
-
iv)
,
-
v)
,
-
vi)
,
-
vii)
.
It was shown by Cooper and Frieze [3] that is the threshold for the existence of a giant strongly connected component under some mild conditions on the degree sequence similar to those observed in 1.2.
For the remainder of the paper, we shall omit the subscript for reasons of clarity - for example we write etc. Let be the size of the th largest strongly connected component of the directed configuration model with degree sequence . We shall write for this quantity when the degree sequence is clear.
Our main result is the following,
Theorem 1.3.
Suppose that is a configuration model random digraph and suppose that . With high probability, there are no complex components or cycles of length . Furthermore,
where
The condition is likely best possible as the analogous condition in the undirected case discriminates the barely subcritical case from the critical window. Moreover, substituting and obtained from a typical degree sequence of also enters the critical window when we have for some constant .
In prior work, Łuczak and Seierstad [17] considered the model which is a random digraph model formed by including each possible arc with probability independently. They showed an analogous result to Theorem 1.3 for with and in this model.
1.2 Previous work
The study of the giant component in random graph models was initiated by the seminal paper of Erdős and Rényi [7] regarding the giant component in . Since then the appearance of a giant component in various models has remained an active topic of study in the area.
When working with directed graphs, there are a number of types of connected component which are of interest. In this paper we concern ourselves with the strongly connected components. The study of strongly connected components in random digraphs began in the model where we include each possible edge with probability independently. Karp [14] and Łuczak [16] independently showed that when , then has all strongly connected components of size if . If instead they showed that there exists a unique strongly connected component of linear order with all other components of size . The case with and was studied by Łuczak and Seierstad [17]. They showed that if , there is a unique strongly connected component of size and other components all of size . When they showed an analogue of Theorem 1.3:
Theorem 1.4.
Let where but . Assume is a constant and let denote the size of the th largest strongly connected component. Then, asymptotically almost surely, contains no complex components and
where .
The so-called critical window, when has also been the subject of some study. In [5] the author showed bounds on the size of the largest strongly connected component in this regime which are akin to bounds obtained by Nachmias and Peres for the undirected case [18]. Moreover, Goldschmidt and Stephenson [10] gave a scaling limit result for the largest strongly connected components in the critical window.
The directed configuration model has also been studied previously. It was first studied by Cooper and Frieze who showed that provided the maximum degree then if , there is no all strongly connected components are small and if there is a giant strongly connected component of linear size. The assumptions on the degree sequence have subsequently been relaxed, Graf [11] showed is enough to draw the same conclusion and Cai and Perarnau [2] improved this further to only require bounded second moments. A scaling limit result has been recently obtained at the exact point of criticality, by Donderwinkel and Xie [6] in a very closely related model where vertices’ degrees are sampled from a limiting degree sequence.
1.3 Organization
The remainder of this paper is arranged as follows, in section 2 we prove some auxiliary results on subgraph counting within the directed configuration model. Then, in section 3 we enumerate certain types of strongly connected directed graphs of maximum degree . In section 4 we prove Theorem 1.3 which we break down into a few steps: first that there are no long cycles, next that there are no complex components, and finally we compute the probability the th largest component has size at least via Poisson approximation. We conclude the paper in section 5 with some open questions and future work.
2 Subgraph Bounds
We calculate probabilities of given subgraphs in the configuration model.
Suppose that is a configuration model random graph and let be the degree of a uniformly random vertex of . We call the degree distribution of (or of ). Define
and |
So that the and are the moments and factorial moments of respectively. We also define which is the average degree. Furthermore, observe that which is a fact we utilise in subsequent sections. We now state a general upper bound on the probability of finding certain subgraphs in the configuration model.
Lemma 2.1.
Let be a configuration model random digraph with degree distribution . Suppose further that has vertices and edges. Let be any digraph with vertices, edges and let be the number of vertices of with in-degree and out degree . Then the probability that a uniformly random injective map is a homomorphism is bounded above by
(1) |
Furthermore, the expected number of copies of in is bounded above by
(2) |
Proof.
Note that (2) follows immediately from (1). Thus we shall focus on the proof of (1). Let be a fixed injective map. Arbitrarily order the edges of as and for each edge define an event . Then, is a homomorphism if and only if every event occurs. To simplify notation, let and for and note that
Suppose that for some . Define and . That is, and are the number of times that (resp. ) has previously appeared as the initial (terminal) vertex of an edge of . Also, suppose that .
Claim 2.2.
Proof.
To see this, note that if , then as there are not enough stubs at to create such a copy of . Similarly if , .
So, without loss of generality we may assume that . Now, by definition of we have already chosen edges of the configuration model and we have precisely out-stubs remaining at and at . Now, consider the random matching on the remaining stubs. The probability that a specified pair of stubs is matched is . There are such pairs which produce . Thus the expected number of edges from to is . The claim then follows by Markov’s inequality. ∎
Next we compute the probability that all of the occur simultaneously. Due to the way in which we defined these events,
To write down this probability succinctly, we will use the functions
It is a simple computation to check that
(3) |
Note that here we were able to remove the function as if there are any negative contributions to the product, then there is also a contribution of value and furthermore, it is impossible for to be a homomorphism in this case so that (3) reduces to in this case (which is clearly true).
To complete the proof, we extend the right hand side of equation (3) to allow any function . Choosing an uniformly random function in this way gives that the probability that a uniformly random injective function is a homomorphism is bounded above by
(4) | ||||
(5) |
In moving from injective functions to arbitrary functions between (4) and (5), we move from an intractable space of functions to a product space which naturally splits over the vertices of . This allows us to rewrite (5) as
which is the claimed upper bound, . ∎
We also need a lower bound in the case that is a cycle.
Lemma 2.3.
Let be a configuration model random digraph whose degree sequence satisfies 1.2. Suppose further that has vertices and edges. Let be a directed cycle with vertices. Then the probability that a uniformly random injective map is a homomorphism is bounded from below by
(6) |
Proof.
First, let us consider the probability of finding at least one edge from vertex of out-degree to vertex with in-degree . Let be our knowledge of up until now which consists only of a set of edges which are present. satisfies that we have not observed any edges that affect either the out-degree of or the in-degree of and that has edges. So let be the number of edges between and . Then,
where the final line follows by the inequality, which is valid for and . This allows us to deduce that the probability of seeing an edge between and is at least . Now, looking at each edge of in turn allows us to deduce that the probability is a homomorphism is at least
(7) |
where we consider the argument of modulo in (7). Also, note that one can factorise the linear term in (7) and bound the second occurrence of by to get the lower bound
(8) |
The idea is to argue that we can swap the order of the product and sum in (8) without changing the result very much. To this end, let be the set of functions such that
-
i)
,
-
ii)
for each ,
-
iii)
for each .
Note that for any function then . As a result of this we observe that
As well as the fact that
(9) |
For a given function we define its weight as . We also define the weight of a set of functions in the natural way as . Next, we shall apply switching arguments to bound the weights of the sets relative to one another.
Consider the auxiliary bipartite graphs, with parts and . We connect to by an edge in if and differ in precisely one coordinate. For each there are at most ways we can change one coordinate and decrease the size of the image. In particular we may pick any of the coordinates of and change it to for some . So, . For each , we pick any of the at least coordinates at which is not injective and choose a new image for this coordinate. By 1.2 there are at least ways to choose the new image, .
Combining these two results allows us to deduce that . Upon rearrangement we find . Note that two functions and which differ in one coordinate must also satisfy and vice versa. Hence . This allows us to apply induction to deduce that
So by (9), . Combining this with (8) allows us to deduce the statement of the lemma, that the following is a lower bound on the probability of finding a cycle at a specified position in :
∎
3 Digraph Counting
In this section we will prove the following bound on the number of strongly connected multi-digraphs with maximum total degree . This is similar to [5, Lemma 2.3] although a weaker bound suffices here allowing us to simplify the proof somewhat.
Lemma 3.1.
Suppose that such that . Let be the number of labelled strongly connected multi-digraphs with vertices and degree distribution given by
in-degree | out-degree | quantity |
---|---|---|
a | ||
a | ||
b |
Then, we have the following bound,
To prove this bound we will use the preheart configuration model of Pérez-Giménez and Wormald [19] which we shall define as follows.
A preheart is a multi-digraph with minimum semi-degree at least and no cycle components. The heart of a preheart is the multidigraph formed by suppressing all vertices of which have in and out degree precisely . For a degree sequence on vertex set , define
To form the preheart configuration model, first we apply the configuration model to to produce a heart . Given a heart configuration , we construct a preheart configuration by assigning to such that the vertices assigned to each arc of are given a linear order. Denote this assignment including the orderings by . Then the preheart configuration model, is the probability space of random preheart configurations formed by choosing and uniformly at random.
It is easy to see that every strongly connected digraph is produced by the preheart configuration model. Thus counting the number of possible outcomes from the preheart configuration model gives an upper bound for the number of strongly connected digraphs with the same degree sequence. We now count the number of preheart configurations using [5, Lemma 2.4].
Lemma 3.2.
In the preheart configuration model with vertices, edges and degree sequence , let be the number of vertices of the heart. Then there are a total of
preheart configurations.
From this lemma, we may prove Lemma 3.1.
Proof of Lemma 3.1.
First, we choose the degree sequence. So note that there are at most ways in which we can give vertices in-degree and out-degree , vertices in-degree and out-degree , vertices in-degree and out-degree and the remainder in-degree and out-degree . Having fixed this degree sequence, by Lemma 3.2 as the heart contains vertices and there are more edges than vertices, the number of strongly connected digraphs with this degree sequence is bounded above by . Hence the number of strongly connected digraphs with degree distribution as in the statement of the lemma is at most
as claimed. ∎
4 Proof of Theorem 1.3
In this section we prove Theorem 1.3 and show that every strongly connected component is a cycle and that these cycles are not particularly large. In particular this provides a configuration model analogue of [17, Theorem 7]. Working with the configuration model introduces several additional difficulties with the proof of this, foremost of these being that we do not have enough control of the subgraph counts to compute the factorial moments and show that they converge to those of a Poisson distribution. We will instead use the Chen-Stein method for Poisson approximation which only requires good control of the first moment and an upper bound on the second.
The proof of this theorem splits naturally into four parts. Choose functions which are defined such that , and . Moreover for this section we shall assume that and if this is not the case, we swap the orientations of all edges to get an equivalent digraph with as desired. We will say that a cycle is
-
•
Long if ,
-
•
Medium if ,
-
•
Short if .
First we will show that there are no long or medium cycles in the directed configuration model. Next, we show that there are no complex components and finally we show the result on the distribution of the length of the th longest cycle.
4.1 Long Cycles
Lemma 4.1.
The configuration model, has no long cycles.
To show that there are no long cycles, it suffices to show that the out-component of an arbitrary vertex is bounded above by . Certainly, the longest cycle in a directed graph is at most the size of the largest out-component and so the lemma follows.
Thus, consider the following version of the branching process of Hatami and Molloy [12] for the out-component in a digraph. For a vertex we explore its out-component in the random digraph as follows. We will have a partial subdigraph at time consisting of the vertices explored thus far. will consist of all in- and out-stubs of some vertices of together with a matching of some of the stubs. If there are unmatched out-stubs in we will pick one at random and match it to some in-stub which yields an edge of . We define as the number of unmatched out-stubs in . Thus, indicates that we have explored an out-component in its entirety. Formally we define the exploration process as follows.
-
•
Choose a vertex and initialise and .
-
•
While , choose an arbitrary unmatched out-stub of any vertex . Pick a uniformly random unmatched in-stub and let be the vertex to which this in-stub belongs. Match these two stubs forming an edge of .
-
–
If we add it so and .
-
–
Otherwise, , .
-
–
Note that does not depend on how we have exposed so and are Markov processes. We define the following quantities.
-
•
, the number of unmatched out-stubs at time . Note this is also the number of unmatched in-stubs at time .
-
•
if and have the same vertex set. Otherwise it is the unique vertex in .
-
•
.
Note that initially . Also, for unvisited vertices , the probability that we explore next is . Hence, provided that , the expected change in is
(10) |
As long as remains close to we expect that is a random walk with drift approximately . So in particular, for our setting of , we expect that the random walk will quickly return to 0. Thus, we shall start by showing that the drift parameter is indeed close to with high probability. For this we shall use the following formulation of the Azuma-Hoeffding inequality (see [13, Theorem 2.25]).
Theorem 4.2 (Azuma-Hoeffding Inequality).
Let be a martingale with , . Suppose there exist constants, such that
Then for any ,
For define . Also, we define and for let
(11) |
It is a simple check that the forms a martingale. Furthermore, so to bound the probability that is large, we show that is small and bound the probability that is large.
For the second of these, consider the auxiliary random variables
That is we change to have the same denominator as . As we assume that , so combining this with the fact that we deduce that
(12) |
Also, as there is at most one vertex whose contributions are removed in moving from to , then
(13) |
Combining equations (12) and (13) gives an upper bound on which we may then use to bound the martingale differences almost surely,
(14) |
Next, we will bound the terms . It will be convenient to do this in two stages utilising the auxiliary random variables . So, first note that
(15) |
We can combine (15) with (12) and apply 1.2 to deduce that
(16) |
This leaves us in a situation in which we can compare and ,
(17) |
So provided that we have and in particular, for any such ,
Whereupon we can apply the Azuma-Hoeffding inequality as . We can use the bound from (14) for the . Substituting into Theorem 4.2 yields,
(18) |
where the second inequality in (18) comes from , and . We could improve the dependence on by using Freedman’s inequality [9] in place of the Azuma-Hoeffding inequality here however there are other points where we require and so this would only remove the term in 1.2. Note hence and so for any large enough ,
(19) |
Now that we have shown that is concentrated around , we can proceed to show that has no large components with high probability via a stopping time argument. We will use the following version of Doob’s optional stopping theorem [20, Theorem 10.10]
Theorem 4.3 (Optional Stopping Theorem).
Let be a supermartingale and let be a stopping time. Then is integrable and furthermore,
whenever is bounded.
So, now let us show that there is no component of size larger than . For each we shall consider the exploration process started at . Recall that and so for sufficiently large , . Hence we have with high probability for each . Define the stopping time
Recall that for we have . Thus, is a supermartingale. Clearly, is bounded by so we may apply Theorem 4.3 to from which we deduce that
Upon rearrangement this yields,
By Markov’s inequality we can deduce
The only other way in which we could have is if for some we have . A union bound allows us to deduce that this occurs with probability at most . So for any large enough ,
Define as the number of vertices of which lie in cycles of size at least . Note that any such vertex must have out component of size at least . Thus,
as . Thus there are no long cycles in .
4.2 Medium Cycles
Lemma 4.4.
The configuration model, has no medium cycles.
Our next step is to apply Lemma 2.1 to show there are no medium cycles. By Lemma 2.1, the probability that has a cycle of length in any particular location is at most . Thus the expected number of cycles of length in is at most
(20) |
For any , we have as . Then, the expected number of cycles of length between and is at most
(21) |
where is the exponential integral function and the first equality follows by making the substitution (recall that and so this substitution preserves positivity). It is straightforward to bound which allows us to conclude that as . Note that and so . Thus the expected number of cycles in of length at least is . So by Markov’s inequality, there are no such cycles with high probability.
4.3 Complex Components
Lemma 4.5.
The configuration model, has no complex components.
We begin by defining digraphs and for . Let be the digraph with vertices consisting of vertices , and three internally disjoint paths. One of length from to , one of length from to and one of length from to . Let be the digraph with vertices consisting of two cycles, one of length , one of length which intersect at a single vertex, . We can use the ear decomposition of a strongly connected digraph to deduce that if contains any complex components, then it contains a subgraph which is either a copy of or a copy of . Note that both of these are the union of two cycles and and by the results of the previous two sections, there are no cycles with more than vertices with high probability. Thus we only need to show there are none of these motifs on at most vertices to deduce that there are none in .
Unlike Łuczak and Seierstad [17], we must treat these cases separately as in the first case, we have two vertices of total degree and the rest of total degree and in the second there is one vertex of total degree in place of the total degree vertices which changes the result of applying Lemma 2.1.
First let us consider . There are at most ways of finding such subgraphs on vertices ( ways of choosing path lengths connecting the two degree vertices and assuming the associated automorphism groups are all trivial gives this bound). Thus, we may apply Lemma 2.1 to deduce that the expected number of such subgraphs in the configuration model with parameters as in the statement of Theorem 1.3 is at most
(22) |
where we eliminate the term in the above in the same way as in (20). An integral of the form seen in (22) can be evaluated by integrating by parts twice to deduce the following (where ),
Thus the expected number of these subgraphs in can be bounded above by .
The second case is where we have a degree vertex. In this case there are at most such subgraphs on vertices ( choices of the two cycle lengths and assuming the associated automorphism groups are all trivial gives this bound). Again we apply Lemma 2.1 to compute the expected number of such subgraphs of size at most which this time is at most
(23) |
Note we may pick either or here by selecting which part of the product to bound by in computing and so we pick the smaller of the two. We may proceed similarly to before, integrating by parts which allows us to bound integrals of the form found in (23) as
So, we can bound (23) above by . Thus by Markov’s inequality there are no copies of or on at most vertices in with high probability. Combining Lemma 4.1 and Lemma 4.4 we deduce there are no cycles of length at least with high probability. As any complex strongly connected digraph contains a copy of at least one of these, we deduce that contains no complex components with high probability.
4.4 Length of the kth largest cycle
Lemma 4.6.
The longest cycle in the configuration model, follows the distribution given in Theorem 1.3
The idea now will be to apply a local coupling version of the Chen-Stein method (see [8, Theorem 2.8]) to deduce that the number of cycles in of length between and converges to a Poisson distribution of mean . Let us start by stating the version of the Chen-Stein method which we will apply,
Theorem 4.7.
Let be a sum of indicator variables and let . For each , divide into two sets, and . Define
Suppose that there exist random variables, and defined on the same probability space such that
Then,
(24) |
Note that this lemma requires us to have a copy of . To create such a copy, we will use the following lemma to couple the configuration model with itself conditioned on the containment of a given subgraph.
Lemma 4.8.
Let be a balanced bipartite complete graph and let be a uniformly chosen random perfect matching of . Suppose that are distinct elements of and are distinct elements of . Then, the following procedure gives a copy of :
-
1.
Sample an element from and set .
-
2.
For each :
-
•
If is an edge of set .
-
•
Otherwise, let be the unique neighbour of and be the unique neighbour of in . Let .
-
•
That is is sampled uniformly from .
Proof.
Note that it is sufficient to prove the lemma for as if then clearly, . So, the general case follows by induction on the case.
Now, we prove the lemma for . To do so, we will show that each of the atoms of comes from precisely atoms of via this switching approach. So let be an atom of and let be an edge of with . Now, consider the inverse switching, (note if we chose the edge , then this is simply the identity ). Thus, for each atom of there are ways to get to an atom of . Similarly we can show that each atom of is mapped to a unique element of . Thus, as has the uniform distribution, so does the random variable obtained by our procedure above. ∎
Note that this lemma does not allow us to generate the configuration model conditioned on the existence of a subgraph specified in the usual way by the locations of its vertices unless all of the involved vertices have in- and out-degrees at most . This is due to the fact that Lemma 4.8 allows us to condition on which pairs of stubs are connected rather than which pairs of vertices. Instead we shall condition on the existence of principal subgraphs which we shall define as follows.
Definition 4.9.
Let be a configuration model random digraph with degree distribution . Let be the associated perfect matching of in- and out-stubs. A principal subgraph of is an event of the form where is a partial matching of in- and out-stubs.
A similar notion was seen in [15] in which their canonical events are precisely the same as our principal subgraphs. Furthermore, note that the event corresponding to a subgraph in may be written as a union of events corresponding to principal subgraphs. In particular, the existence of the cycle in can be written as the union of events corresponding to principal cycles. For this reason, in places where there may be some ambiguity as to whether we are dealing with principal subgraphs or not, we shall refer to subgraphs in the usual sense as union subgraphs.
This leaves us in a setting to which we can apply Theorem 4.7. We shall show that the number of cycles in of lengths between and is Poisson distributed with mean . Let be the set of all principal cycles which have lengths between and . For each and digraph on the same vertex set and stubs as let be the indicator function that is a principal subgraph of . Also, we split into a set strongly dependent on and a set weakly dependent on . Define the strongly dependent set to be the set of all principal cycles which share at least one vertex with . The weakly dependent set contains all of the other principal cycles, it is the set of principal cycles which are vertex disjoint from . Finally, we define to be for an independent copy of and to be obtained from by applying a -cycle switching to each edge of in in turn and define in the obvious way to be for the digraph . That is,
These variables clearly satisfy the assumptions of Theorem 4.7 and so we must bound the expectations in the statement of the theorem to compute an upper bound on . The idea now is to reduce the whole problem to one of bounding the expected numbers of certain subgraphs being contained in , and . For the remainder of this section, we write for and unless specified otherwise.
We shall bound the three terms from (24) one by one. First let us consider the term
(25) |
Note that the set, is the set of all cycles which share at least one vertex with . Thus, the union of and is a strongly connected digraph (here we allow multiple edges). In the computation of we will use the following generalisation of [5, Lemma 3.4]. No changes are required to the proof of the lemma as presented in [5] to generalise from digraphs to multi-digraphs.
Lemma 4.10.
Each strongly connected multi-digraph with excess may be formed in at most ways as the union of a pair of directed cycles and .
Define to be the set of all strongly connected multi-digraphs with vertices a subset of such that all vertices have degrees or with at least one vertex which has . Note that is precisely the set of multi-digraphs which can be formed as the edge disjoint union of the two cycles and . Furthermore, note that the excess of a multi-digraph in is precisely the number of vertices which have . We let be the set . Moreover, for each define
(26) |
Observe that for a given whose union is , is the number of pairs of principal cycles which are copies of the same cycles as . Thus, combining (26) with Lemma 4.10 allows us to bound as follows,
(27) |
Now, let be the set of all strongly connected labelled multi-digraphs with vertices such that all vertices have degrees or with precisely vertices such that . This allows us to write
(28) |
where the division by comes from the fact that distinguishes between digraphs obtained from one-another by a permutation of the vertex set. Arguing similarly to Lemma 2.1, noting that we know the degree sequence of and using , we deduce that
Substituting into (28) and applying Lemma 3.1 we find
(29) |
To finish, we note that as , this allows us to substitute the bound found in (29) into (27) where we deduce
(30) |
Part of the expression in (30) is in the form of an exponential sum. Evaluating this sum allows us to deduce that .
(31) |
where we deduce that this is in the same way as when we bound (23).
Next we shall consider the term,
(32) |
So note that is the probability that both and are simultaneously present. Furthermore, note that is a strongly connected (not necessarily simple) digraph with maximum total degree at most . Thus we can use a similar strategy to the one used to bound . So define to be the set of all strongly connected multi-digraphs with and . Furthermore for , define
where we define . Define for in the same way as (26). Then, for a given construction, from a pair of principal cycles, is again the number of pairs of principal cycles which are copies of the same cycles as . Thus, if we apply Lemma 4.10 we get the following bound on ,
(33) |
Now, we define to be the set of all strongly connected labelled multi-digraphs with vertices such that , and . (Note ). Thus, arguing as previously,
(34) |
Again, we may argue similarly to Lemma 2.1 to deduce that for ,
We can substitute this into (33) and apply Lemma 3.1 to find
(35) |
Note that this bound is a sum of two terms due to the factor in (35). So we will split this bound into in the obvious way. This allows us to bound by only considering the or terms which will be convenient for us. We substitute the bound from (35) into (33) and use that to deduce
(36) | ||||
(37) |
Note that in both cases we can evaluate the two inner sums in terms of exponential functions and modified Bessel functions of the first kind. However the latter of these is a little difficult to work with, hence we will replace the with allowing us to use hyperbolic trigonometric functions instead. In particular, we have
Looking first at we get the bound
(38) |
where the final inequality follows from the fact that , and are all increasing for . Bounding is similar,
(39) |
Both (38) and (39) are which follows from the facts that
The first of which we showed in (31) and the latter follows directly from the assumption that . Hence . Finally, we will bound the terms,
(40) |
We note that unlike in bounding and we will not need to look at the sum over in computing the bound. This is due to the fact that depends on a much more global structure than the terms and . For the bounding of , the first thing to observe is that due to the choice of coupling and we have . This is because all of the edges which are removed by the switchings are incident to a vertex of and therefore none of the principal cycles containing these edges are in which are the cycles contributing to . This enables us to bound by computing the number of expected cycles from which are added by the switchings.
In order to add a cycle with the switchings over -cycles which produce from the following must be true for some ,
-
•
All but edges of must be present in .
-
•
of the edges of must add these edges after applying the switching.
Now let us compute the probability that this event comes to pass. So let us fix edges missing from which we match up with of the edges of which will be used to add them when we apply the switching. So if one such edge is and this is matched with , in order for the switching to add , before switching we must have edges and present in (note that because the edges are directed we do not need to consider the alternate switching using and like we would if working with graphs). Hence in total there are edges which we must find in before switching in order that is a cycle of . The probability we find these edges is .
Now, let us count how many such structures there are which produce the same union cycle as . As is a principal cycle, we know exactly which stubs we use for it and so there is no contribution to the number of copies from vertices of . However there are switchings which add the union edge and so the number of ways in which the switching structure with a union cycle which is that of can be found is
Thus for any fixed set of missing edges for the union cycle with the same edges as and choice of edges from with which we add these edges, the expected number of additional principal cycles after the switchings is
(41) |
There are ways to pick the edges of which we switch over to add the missing edges of . Also there are ways to pick the missing edges of and ways to assign edges of to missing edges of . Thus, when we sum over principal cycles , missing edges of and and matchings of the missing edges, we find the expected number of such structures is at most
(42) |
Applying standard bounds on binomial coefficients and falling factorials and taking vertices from for with replacement rather than without allows us to bound (42) as follows
(43) |
where we note that allows us to use the bound in (43). So for all where . Finally, note that this implies . We subsequently show that from which it follows that as is a constant independent of . Using this and the fact that , we conclude that
Note that is the number of cycles of of length between and . By Lemma 4.1 and Lemma 4.4, the probability that there are any longer cycles is . Thus, if is the number of cycles of of length at least , . To show that the mean of the corresponding Poisson distribution can be taken to be note that holds for all . This follows from coupling and by coupling their constituent Bernoulli trials and noting that . Thus it suffices to show that . Upper bounding may be done in the same way as the proof of Lemma 4.4. However, we can be more precise than we are in (21) as in this case , so this line can be replaced by
(44) |
Next, we lower bound by . In order to do this we will use Lemma 2.3 in combination with the inequality
This allows us to deduce that the probability of finding a cycle of length between and is at least
(45) |
Now, that the exponent in the above is equal to follows from the assumption . Thus in analogy with (44), we deduce that (45) is at most
(46) |
4.5 Putting it all together
We can deduce the statement of the theorem from the previous sections as follows. First we apply Lemma 4.1, Lemma 4.4 and Lemma 4.5 to deduce there are no cycles with length at least or complex components. Then, we can apply Lemma 4.6 to deduce that the distribution of the number of cycles with at least vertices converges to a Poisson distribution with mean from which the statement that
follows immediately as the above is simply the probability that a Poisson() distribution is at least (plus the term).
5 Concluding Remarks
In this paper, we showed that the largest component of the directed configuration model is of order when and found the distribution of the size of the th largest component for any . In a subsequent paper [4] we shall show that under similar conditions to those in this paper that for degree sequences such that the largest component is of order . This quantity matches the we find in this paper if for some constant and suggests that one may find a critical window phenomenon for degree sequences with such parameters. Note that in particular this is precisely the critical window of (looking at typical degree sequences from the model ). Moreover, the recent result of Donderwinkel and Xie certainly seems to indicate that the point lies inside a critical window.
An interesting question for further study would be to ask about the joint distribution of the largest strongly connected components in the directed configuration model with parameters as in this paper. The fact that we find
seems rather suggestive of an underlying Poisson process. As such we make the following conjecture.
Conjecture 5.1.
Let be a directed configuration model with parameters as in Theorem 1.3 and suppose the sizes of its components in descending order are given by the random variables . Suppose further that are the points of a Poisson process of rate and are the unique positive solutions to
Then we have that
Note that the above is also an open question for with and , .
References
- [1] B. Bollobás. A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. European Journal of Combinatorics, 1(4):311–316, 1980.
- [2] X. S. Cai and G. Perarnau. The giant component of the directed configuration model revisited. arXiv preprint arXiv:2004.04998, 2020.
- [3] C. Cooper and A. Frieze. The size of the largest strongly connected component of a random digraph with a given degree sequence. Combinatorics, Probability and Computing, 13(3):319–337, 2004.
- [4] M. Coulson. The strong component structure of the barely supercritical directed configuration model. Manuscript.
- [5] M. Coulson. The critical window in random digraphs. arXiv preprint arXiv:1905.00624, 2019.
- [6] S. Donderwinkel and Z. Xie. Universality for the directed configuration model with random degrees: metric space convergence of the strongly connected components at criticality. arXiv preprint arXiv:2105.11434, 2021.
- [7] P. Erdos and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci, 5(1):17–60, 1960.
- [8] T. Erhardsson. Steins method for Poisson and compound Poisson approximation. An introduction to Stein’s method, 4:61–114, 2005.
- [9] D. A. Freedman. On tail probabilities for martingales. the Annals of Probability, pages 100–118, 1975.
- [10] C. Goldschmidt and R. Stephenson. The scaling limit of a critical random directed graph. arXiv preprint arXiv:1905.05397, 2019.
- [11] A. Graf. On the strongly connected components of random directed graphs with given degree sequences. Master’s thesis, University of Waterloo, 2016.
- [12] H. Hatami and M. Molloy. The scaling window for a random graph with a given degree sequence. Random Structures & Algorithms, 41(1):99–123, 2012.
- [13] S. Janson, T. Łuczak, and A. Ruciński. Random graphs, volume 45. John Wiley & Sons, 2011.
- [14] R. M. Karp. The transitive closure of a random digraph. Random Structures & Algorithms, 1(1):73–93, 1990.
- [15] L. Lu and L. Székely. Using Lovász Local Lemma in the space of random injections. the electronic journal of combinatorics, page R63, 2007.
- [16] T. Łuczak. The phase transition in the evolution of random digraphs. Journal of graph theory, 14(2):217–223, 1990.
- [17] T. Łuczak and T. G. Seierstad. The critical behavior of random digraphs. Random Structures & Algorithms, 35(3):271–293, 2009.
- [18] A. Nachmias and Y. Peres. Critical percolation on random regular graphs. Random Structures & Algorithms, 36(2):111–148, 2010.
- [19] X. Pérez-Giménez and N. Wormald. Asymptotic enumeration of strongly connected digraphs by vertices and edges. Random Structures & Algorithms, 43(1):80–114, 2013.
- [20] D. Williams. Probability with martingales. Cambridge university press, 1991.