
Failure behavior in a connected configuration model
under a critical loading mechanism

Fiona Sloothaak    Lorenzo Federico
Abstract

We study a cascading edge failure mechanism on a connected random graph with a prescribed degree sequence, sampled using the configuration model. This mechanism prescribes that every edge failure puts an additional strain on the remaining network, possibly triggering more failures. We show that under a critical loading mechanism that depends on the global structure of the network, the number of edge failures exhibits scale-free behavior (up to a certain threshold). Our result is a consequence of the failure mechanism and the graph topology. More specifically, the critical loading mechanism leads to scale-free failure sizes for any network where no disconnections take place. The disintegration of the configuration model ensures that the dominant contribution to the failure size comes from edge failures in the giant component, for which we show that the scale-free property prevails. We prove this rigorously for sublinear thresholds, and we explain intuitively why the analysis carries through for linear thresholds. Moreover, our result holds for other graph structures as well, which we validate with simulation experiments.

1 Introduction

Cascading failures is a term used to describe the phenomenon where an initial disturbance triggers subsequent failures in a system. A typical way to describe cascading failure models is through a network where every node/edge failure creates an additional strain on the surviving network, possibly leading to knock-on failure effects. The phenomenon appears in many different application areas, such as communication networks [28, 27, 23], road systems [8, 41], earthquakes [2, 3, 1], material science [30], epidemiology [40, 26], power transmission systems [7, 31, 6, 37], and more.

A macroscopic characteristic that is observed in different natural and engineered systems is the emergence of scale-free behavior [5, 36, 4, 9, 32]. Although this heavy-tailed property sometimes relates to the nodal degree distribution, there are many physical networks that do not share this feature (e.g. power grids [39]). An intriguing question is why failure sizes still display scale-free behavior for these types of networks.

In this paper, we introduce a cascading mechanism on a complex network whose nodal degree distribution does not (necessarily) exhibit scale-free behavior. The cascade is initialized by a small additional loading of the network, and an edge fails whenever its load capacity is exceeded. An intrinsic feature of this work is that the propagation of failures occurs non-locally and depends on the global network structure, which continually changes as the failure process advances. This is a novel feature with respect to the existing literature: in order to obtain an analytically tractable model, the cascading mechanism is often described through local relations, or assumed to be independent of the global network structure throughout the cascade. Well-known examples that satisfy at least one of these properties are epidemiology models (where an initially infected node may taint each of its neighbors with a fixed probability), or percolation models (where each edge/node in the network fails with a fixed probability). In this work, we introduce a critical load function that leads to a power-law distributed tail for the failure size.

1.1 Model description

Let $G=(V,E)$ denote a graph, where $V$ denotes the vertex set with $|V|=n$, and $E$ denotes the edge set with $|E|=m$. Typically, we consider graphs that can be scaled in the number of vertices/edges. Suppose that each edge in the network is subjected to a load demand that is initially exceeded by its edge capacity. The difference between the initial load and the edge capacity is called the surplus capacity of the edge, and we assume the surplus capacities to be independently and uniformly distributed on $[0,1]$ at the various edges. The failure process is triggered by an initial disturbance: all edges are additionally loaded with $\theta/m$ for some constant $\theta>0$. If the total load increase surpasses the surplus capacity of one or more edges, these edges fail and, in turn, cause an additional load increase on all surviving edges that are in the same component upon failure. We call these load increments the load surges. We make two assumptions: edge failures occur sequentially, and all edges in the same component upon failure experience the same load surge. Edges whose surplus capacity is insufficient to handle the load surges continue to fail until no such edges remain. We are interested in the number of edge failures, also referred to as the failure size.

We are interested in the setting where the failure size exhibits scale-free behavior. To this purpose, we define the critical load surge function $l_{j}^{m}(i)$ at edge $j$ after experiencing $i$ load surges to be

$$\begin{cases} l_{j}^{m}(1)=\theta/m, \\ l_{j}^{m}(i+1)=l_{j}^{m}(i)+\dfrac{1-l_{j}^{m}(i)}{|E_{j}^{m}(i-1)|}, & i=1,\ldots,m-1, \end{cases} \tag{3}$$

where $|E_{j}^{m}(i)|$ is the number of edges in the component that contains edge $j$ after perceiving $i$ edge failures in that component.

We observe that as long as two edges remain in the same component during the cascade, this recursive relation implies that the load surges are the same at both edges. Moreover, as long as all edges remain in a single component, it holds that $|E_{j}^{m}(i)|=m-i$ at every surviving edge, and hence (3) is solved by

$$l_{j}^{m}(i)=\frac{\theta}{m}+(i-1)\cdot\frac{1-\theta/m}{m}=\frac{\theta+i-1}{m}(1+o(1)),\qquad i=1,\ldots,m. \tag{4}$$

In particular, applying (3) to a star topology with $n+1$ nodes and $m=n$ edges yields load surge function (4) for all surviving edges.
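To make the recursion concrete, the following minimal Python sketch (function and variable names are ours, not from the paper) computes the load surges from (3) for a given component-size process, and checks that it reproduces the closed form (4) when no disconnections occur:

```python
def load_surges(theta, sizes):
    """Load surges l_j^m(1), l_j^m(2), ... from recursion (3).
    sizes[i] plays the role of |E_j^m(i)|, the number of edges in the
    component of edge j after i failures; sizes[0] = m when the graph
    starts connected."""
    m = sizes[0]
    surges = [theta / m]                                  # l_j^m(1) = theta/m
    for i in range(1, len(sizes)):
        # l_j^m(i+1) = l_j^m(i) + (1 - l_j^m(i)) / |E_j^m(i-1)|
        surges.append(surges[-1] + (1 - surges[-1]) / sizes[i - 1])
    return surges

# Without disconnections, |E_j^m(i)| = m - i and (3) reduces to (4):
m, theta = 1000, 0.5
surges = load_surges(theta, [m - i for i in range(m)])
i = 11
assert abs(surges[i - 1] - (theta / m + (i - 1) * (1 - theta / m) / m)) < 1e-12
```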

We point out that the load surges defined in (3) are typically non-deterministic, and are only well-defined as long as edge $j$ has not yet failed during the cascade. Edge failures may cause the network to disintegrate into multiple components, which affects $|E_{j}^{m}(i)|$ in a probabilistic way that depends on the structure of the graph. In contrast to processes in epidemiology or percolation models, the propagation of failures thus occurs non-locally and depends on the global structure of the graph throughout the cascade.

To provide an intuitive understanding of why (3) gives rise to scale-free behavior for the failure size, consider the following. For a scale-free failure size to appear, the cascade propagation should occur in some form of criticality. This happens when the load surges are close to the expected values of the ordered surplus capacities. More specifically, since the surplus capacities are independently and uniformly distributed on $[0,1]$, the mean of the $i$-th smallest surplus capacity is $i/(m+1)$. If no disconnections occurred during the cascade, this would imply that the additional load surge at every edge should be close to $1/(m+1)$ after every edge failure. This explains intuitively why (4) leads to scale-free behavior for the star topology, as is shown rigorously in [33]. Yet, whenever the network disintegrates into different components, the consecutive load surges need to be of a size such that the load surge is close to the smallest expected surplus capacity of the remaining edges in order for the cascade to remain in the window of criticality. More specifically, suppose that the first disconnection occurs after $k$ edge failures and splits the graph into two components with $l$ and $m-k-l$ edges, respectively. Due to the properties of uniformly distributed random variables, we note that the expectation of the smallest surplus capacity in the first component equals $k/(m+1)+(1-k/(m+1))/l$. In other words, in order to stay in a critical failure regime, the additional load surges should be close to $(1-k/(m+1))/l$ after every edge failure until the cascade stops in that component, or another disconnection occurs. This process can be iterated, and gives rise to load surge function (3).

In this paper, our main focus is to apply failure mechanism (3) to the (connected) configuration model $CM_n(\mathbf{d})$ on $n$ vertices with a prescribed degree sequence $\mathbf{d}=(d_1,\ldots,d_n)$. The configuration model is constructed by assigning $d_v$ half-edges to each vertex $v\in[n]:=\{1,\ldots,n\}$, after which the half-edges are paired randomly: first we pick two half-edges at random and create an edge out of them, then we pick two half-edges at random from the set of remaining half-edges and pair them into an edge, and we continue this process until all half-edges have been paired. For consistency, we assume that the total degree $\sum_{i=1}^{n}d_i$ is even. The construction can give rise to self-loops and multiple edges between vertices, but these events are relatively rare when $n$ is large [16, 19, 20]. We point out that the number of edges is thus a function of $n$ and $\mathbf{d}$, i.e. $m:=m_n(\mathbf{d}):=\sum_{i=1}^{n}d_i/2$, where we suppress the dependency on $n$ and $\mathbf{d}$ for the sake of exposition.
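The half-edge pairing just described is straightforward to simulate. The sketch below (a minimal illustration; the function name is ours) constructs one sample of $CM_n(\mathbf{d})$, allowing self-loops and multiple edges as in the text:

```python
import random

def configuration_model(degrees):
    """Sample CM_n(d): pair all half-edges uniformly at random.
    Returns the edge list of a multigraph (self-loops and multiple
    edges are allowed)."""
    assert sum(degrees) % 2 == 0          # the total degree must be even
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    random.shuffle(half_edges)            # a uniform shuffle induces a uniform pairing
    return list(zip(half_edges[::2], half_edges[1::2]))

edges = configuration_model([2] * 50 + [3] * 10)   # n = 60 vertices, m = 65 edges
```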

Define $n_i$ as the number of vertices of degree $i$, and let $D_n$ denote the degree of a vertex chosen uniformly at random from $[n]$. We assume the following condition on the degree sequence.

Condition 1.1 (Regularity conditions).

Unless stated otherwise, we assume that $CM_n(\mathbf{d})$ satisfies the following conditions:

  • There exists a limiting degree variable $D$ such that $D_n$ converges in distribution to $D$ as $n\rightarrow\infty$;

  • $n_0=n_1=0$;

  • $p_2:=\lim_{n\rightarrow\infty}n_2/n\in(0,1)$;

  • $n_j=0$ for all $j\geq n^{1/4-\varepsilon}$ for some $\varepsilon>0$;

  • $d:=\lim_{n\rightarrow\infty}\mathbb{E}[D_n]<\infty$.

Under these conditions, we can write $p_i:=\lim_{n\rightarrow\infty}n_i/n$ for the limiting fraction of degree-$i$ vertices in the network. Moreover, under these conditions it is known that there is a positive probability for $CM_n(\mathbf{d})$ to be connected [13, 24]. Our starting point will be such a configuration model conditioned to be connected, denoted by $\overline{CM}_n(\mathbf{d})$, on which we apply the edge failure mechanism (3). We are interested in quantifying the resilience of the connected configuration model under (3), measured by the number of edge failures $A_{n,\mathbf{d}}$.

Remark 1.2.

In Condition 1.1, we assume $n_0=0$ to ensure that the resulting graph has a positive probability to be connected. Moreover, as explained in [13], it suffices to impose the condition $n_1=o(\sqrt{n})$ for a positive probability of having a connected configuration model. We point out that our results do not change when allowing $n_1=o(\sqrt{n})$, but for the sake of exposition, we omit the details in this paper.

1.2 Notation

In Appendix A, we provide an overview of the quantities that are commonly used throughout the paper. Unless stated otherwise, all limits are taken as $n\rightarrow\infty$, or equivalently by Condition 1.1, as $m\rightarrow\infty$. A sequence of events $(A_n)_{n\in\mathbb{N}}$ happens with high probability (w.h.p.) if $\mathbb{P}(A_n)\rightarrow 1$. For random variables $(X_n)_{n\in\mathbb{N}}$, we write $X_n\overset{d}{\rightarrow}X$ and $X_n\overset{\mathbb{P}}{\rightarrow}X$ to denote convergence in distribution and in probability, respectively. For real-valued sequences $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$, we write $a_n=o(b_n)$ if $a_n/b_n$ tends to zero, and $a_n=O(b_n)$ if $a_n/b_n$ remains bounded. Similarly, we write $a_n=\omega(b_n)$ if $b_n/a_n$ tends to zero, and $a_n=\Omega(b_n)$ if $b_n/a_n$ remains bounded. We write $a_n=\Theta(b_n)$ if both $a_n=O(b_n)$ and $a_n=\Omega(b_n)$ hold. We adopt analogous notation for random variables, e.g. for sequences of random variables $(X_n)_{n\in\mathbb{N}}$ and $(Y_n)_{n\in\mathbb{N}}$, we denote $X_n=o_{\mathbb{P}}(Y_n)$ if $X_n/Y_n\overset{\mathbb{P}}{\rightarrow}0$. For convenience of notation, we denote $a_n\ll b_n$ if $a_n=o(b_n)$. Finally, $\textrm{Poi}(\lambda)$ always denotes a Poisson distributed random variable with mean $\lambda$, $\textrm{Exp}(\lambda)$ denotes an exponentially distributed random variable with parameter $\lambda$, and $\textrm{Bin}(n,p)$ denotes a binomially distributed random variable with parameters $n$ and $p$.

1.3 Main result

We are interested in the probability that the failure size exceeds a threshold $k$. In this paper, we mainly focus on thresholds satisfying $1\ll k\ll m^{1-\delta}$ for some $\delta\in(0,1)$, which we refer to as the sublinear case. Our main result shows that the failure size has a power-law distribution.

Theorem 1.3.

Consider the connected configuration model $\overline{CM}_n(\mathbf{d})$, and suppose $k:=k_m$ is such that $1\ll k\ll m^{1-\delta}$ for some $\delta\in(0,1)$. Then,

$$\mathbb{P}\left(A_{n,\mathbf{d}}\geq k\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}. \tag{5}$$

To see why (5) holds, we need to understand the typical behavior of the failure process as the cascade continues. A first result we show is that the number of edge failures required for the network to become disconnected is likely to be of order $\Theta(\sqrt{m})$. This suggests that as long as the threshold satisfies $k=o(\sqrt{m})$, the tail is the same as in the case of a star topology. The latter case is known to satisfy (5), as is shown in [33].
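In the star case, the cascade admits a transparent simulation: with load surges (4) and ordered uniform surplus capacities, the failure size is the first index at which an ordered capacity exceeds its load surge. The following Monte Carlo sketch (parameters chosen purely for illustration; names are ours) can be used to check that the empirical tail is close to the right-hand side of (5):

```python
import math
import random

def star_failure_size(m, theta):
    """Failure size on a star with m edges: the cascade stops at the first i
    whose i-th smallest surplus capacity U_(i) exceeds the load surge l(i)
    from (4); all earlier edges fail."""
    u = sorted(random.random() for _ in range(m))
    for i in range(m):
        if u[i] > theta / m + i * (1 - theta / m) / m:   # U_(i+1) > l(i+1)
            return i                                      # i edges have failed
    return m

m, theta, k, runs = 2000, 1.0, 50, 5000
tail = sum(star_failure_size(m, theta) >= k for _ in range(runs)) / runs
print(tail, 2 * theta / math.sqrt(2 * math.pi) * k ** -0.5)   # both close to 0.11
```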

As long as the cascade continues, we show that it typically disconnects small components. This suggests that up to a certain point, the cascading failure mechanism creates a network with a single large component that contains almost all vertices and edges, referred to as the giant component, and some small disconnected components with few edges. It turns out that the total number of edges outside the giant component is sufficiently small, and hence the dominant contribution to the failure size comes from the number of edges that are contained in the giant component upon failure. Moreover, due to the small sizes of the components outside the giant component, the load surge function for the edges in the giant as prescribed by (3) is relatively close to (4). We show that as long as the threshold satisfies $k\ll m^{1-\delta}$ for some $\delta\in(0,1)$, the perturbations are sufficiently small such that the failure size behavior in the giant component satisfies (5).

We heuristically explain in Section 5 that the power-law tail property prevails beyond the sublinear case up to a certain critical point. However, the prefactor is affected in this case, i.e. the failure size tail differs from that of the star topology. Moreover, this notion holds for a broader set of graphs, a claim we validate by extensive simulation experiments in Section 5.

We would like to remark that the disintegration of the network by the cascading failure mechanism is closely related to percolation, a process where each edge is failed/removed with a corresponding removal probability. In fact, percolation results will be crucial in our analysis to prove Theorem 1.3, combined with first-passage theory of random walks over moving boundaries. Before we lay out a more detailed road map to prove Theorem 1.3, we provide an outline of the remainder of this paper.

1.4 Outline

The proof of Theorem 1.3 requires many different steps, and therefore we provide a road map of the proof in Section 2. We explain that we need to derive novel percolation results for the sublinear case, which we show in Section 3. We rigorously identify the impact of the disintegration of the network on the cascading failure process in Section 4, where we use first-passage theory for random walks over moving boundaries to conclude our main result. Finally, we study the failure size behavior beyond the sublinear case in Section 5, as well as the failure size behavior for other graphs.

2 Proof strategy for Theorem 1.3

Our proof of Theorem 1.3 requires several steps. In this section, we provide a high-level road map of the proof.

2.1 Relation of failure process and sequential removal process

There are two elements of randomness involved in the cascading failure process: the sampling of the surplus capacities of the edges, and the way the network disintegrates as edge failures occur. The second aspect determines the values of the load surges, and only when the surplus capacity of an edge is insufficient to handle the load surge does the cascading failure process continue. Recall that as long as edges remain in the same component, they experience the same load surge. Since the surplus capacities are i.i.d., it follows that every edge in the same component is equally likely to be the next edge to fail as long as the failure process continues in that component. This is a crucial observation, as it provides a relation with the sequential edge-removal process. That is, suppose that we sequentially remove edges uniformly at random from a graph. Given that a new edge removal occurs in a certain component, each edge in that component is equally likely to be removed next, just as in the cascading failure process. Consequently, this observation gives rise to a coupling between the disintegration of the network under the cascading failure process and the disintegration caused by sequentially removing edges uniformly at random.

Figure 1: Sample of sequentially removing five edges uniformly at random from a connected graph. We remove the red edges in successive order, leaving two disconnected components. The first component is connected by the dotted lines, whereas the second component is connected by the dashed lines.

More specifically, suppose that sequentially removing edges uniformly at random yields the permutation $\{e_{(1)},\ldots,e_{(m)}\}$. For the cascading failure process, sample $m$ uniformly distributed random variables, order them so that $U_{(1)}^m\leq\ldots\leq U_{(m)}^m$, and assign to edge $e_{(j)}$ surplus capacity $U_{(j)}^m$. In particular, this implies that if the cascading failure process would always continue until all edges have failed, then the edges would fail in the same order as in the sequential edge-removal process. Moreover, it is possible to exactly compute the load surge function from the order $\{e_{(1)},\ldots,e_{(m)}\}$ over the edge set. We illustrate this claim by the example in Figure 1. In this example, we observe that $m=11$, and we consider the sequential removal process up to five edge removals. We see that the first three edge failures do not cause the graph to disconnect. It holds for every dotted edge $e_j\in[m]$,

$$|E_{j}^{m}(i)|=\begin{cases} 11-i & \textrm{if }i\in\{0,1,2,3\},\\ 4 & \textrm{if }i=4,\\ 3 & \textrm{if }i=5, \end{cases}$$

and for every dashed edge $e_j\in[m]$,

$$|E_{j}^{m}(i)|=\begin{cases} 11-i & \textrm{if }i\in\{0,1,2,3\},\\ 3 & \textrm{if }i=4. \end{cases}$$

Recursion (3) yields the corresponding load surge values.

This example illustrates that using our coupling, the sequential removal process gives rise to the load surge values as prescribed in (3). We point out that due to our coupling, if after step $j-1$ an edge $e_{(j)}$ for some $j\in[m]$ has sufficient surplus capacity to deal with the load surge, then so do all the other edges in that component. In other words, the cascade stops in that component. To determine the failure size of the cascade, one needs to subsequently compare the load surge values to the surplus capacities in all components until the surplus capacities at all the surviving edges are sufficient to deal with the load surges.

In summary, sequentially removing edges uniformly at random gives rise to the load surge values in every component. This idea decouples the two sources of randomness: first, one needs to understand the disintegration of the network through the sequential removal process, leading to the load surge values through (3). Then, we can determine the failure size by comparing the surplus capacities to the load surges up to the point where the cascade stops in every component.
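The coupling is easy to make concrete: given a removal order, the component-size processes $|E_j^m(i)|$ are determined, and feeding them into recursion (3) (as in the sketch following (4)) yields the load surges. Below is a quadratic-time illustration (helper names are ours) that tracks, for each edge, the sizes of its successive components, exactly as in the Figure 1 example:

```python
from collections import defaultdict

def component_of(j, edges, alive):
    """Ids of the surviving edges in the component containing edge j (BFS)."""
    adj = defaultdict(list)
    for k in alive:
        u, v = edges[k]
        adj[u].append((v, k)); adj[v].append((u, k))
    seen, comp, stack = {edges[j][0]}, set(), [edges[j][0]]
    while stack:
        u = stack.pop()
        for v, k in adj[u]:
            comp.add(k)
            if v not in seen:
                seen.add(v); stack.append(v)
    return comp

def perceived_component_sizes(edges, order):
    """For each edge j, the list [|E_j^m(0)|, |E_j^m(1)|, ...] while j survives:
    when an edge is removed, every surviving edge of its former component
    perceives one more failure and records the size of its new component."""
    alive = set(range(len(edges)))
    sizes = {j: [len(edges)] for j in alive}     # |E_j^m(0)| = m (connected start)
    for r in order:                              # r = next edge removed
        affected = component_of(r, edges, alive) - {r}
        alive.discard(r)
        for j in affected:
            sizes[j].append(len(component_of(j, edges, alive)))
    return sizes
```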

2.2 Disintegration of the network through sequential removal

The next step is to determine the typical behavior of the sequential removal process on the connected configuration model $\overline{CM}_n(\mathbf{d})$. A first question that needs to be answered is how many edges need to fail, or equivalently, need to be removed uniformly at random, for the network $\overline{CM}_n(\mathbf{d})$ to become disconnected. It turns out that this is likely to occur when $\Theta(\sqrt{m})$ edges are removed.

Theorem 2.1.

Suppose that we sequentially remove edges uniformly at random from $\overline{CM}_n(\mathbf{d})$. Define $T_{n,\mathbf{d}}$ as the smallest number of edges that need to be removed for the network to be disconnected. Then,

$$m^{-1/2}T_{n,\mathbf{d}}\overset{d}{\to}T(D),$$

where $T(D)$ has a Rayleigh distribution with density function

$$f_{T(D)}(x)=\frac{4p_2 x}{d-2p_2}\,\mathrm{e}^{-\frac{2x^2 p_2}{d-2p_2}}. \tag{6}$$
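Theorem 2.1 lends itself to a direct Monte Carlo check. The sketch below (all names ours; the degree sequence is an arbitrary choice satisfying Condition 1.1) rejection-samples a connected configuration model, removes edges uniformly at random until the graph disconnects, and compares the empirical mean of $m^{-1/2}T_{n,\mathbf{d}}$ with the mean $\sigma\sqrt{\pi/2}$ of the Rayleigh law (6), whose scale parameter is $\sigma^2=(d-2p_2)/(4p_2)$:

```python
import math
import random

def is_connected(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v); adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w); stack.append(w)
    return len(seen) == n

def connected_cm(degrees):
    """Rejection-sample CM_n(d) conditioned on connectivity; this terminates
    quickly since connectivity has non-vanishing probability."""
    n = len(degrees)
    while True:
        h = [v for v, d in enumerate(degrees) for _ in range(d)]
        random.shuffle(h)
        edges = list(zip(h[::2], h[1::2]))
        if is_connected(n, edges):
            return edges

def disconnection_time(degrees):
    """T_{n,d}: number of uniform edge removals until disconnection."""
    edges = connected_cm(degrees)
    random.shuffle(edges)                 # removal order = prefix of a shuffle
    for t in range(1, len(edges) + 1):
        if not is_connected(len(degrees), edges[t:]):
            return t

degrees = [2] * 500 + [3] * 500           # p2 = 1/2, d = 5/2, m = 1250
m = sum(degrees) // 2
sigma = math.sqrt((2.5 - 2 * 0.5) / (4 * 0.5))
mean = sum(disconnection_time(degrees) for _ in range(200)) / 200 / math.sqrt(m)
print(mean, sigma * math.sqrt(math.pi / 2))   # both should be close to 1.09
```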

After this point, more disconnections start to occur as more edges are removed uniformly at random from the graph. In Section 3.3, we focus on the typical network structure after $\sqrt{m}\ll i\ll m^{1-\delta}$ edges have been removed uniformly at random for some $\delta\in(0,1/2)$. Typically, there is a giant component that contains almost all edges and vertices. The components that detach from the giant component are isolated nodes, line components, and possibly isolated cycles. We show that the total number of edges contained in these components is likely to be of order $\Theta(i^2/m)$, while the number of edges in more complex component structures is negligible in comparison. This leads to the following result, which is proved in Section 3.6.

Theorem 2.2.

Suppose we remove $i:=i_m$ edges uniformly at random from the connected configuration model $\overline{CM}_n(\mathbf{d})$ with $\sqrt{m}\ll i\ll m^{1-\delta}$ for some $\delta>0$. Write $\hat{E}_m(i)$ for the set of edges in the largest component of this graph, and let $|\hat{E}_m(i)|$ denote its cardinality. Then,

$$(m-i-|\hat{E}_m(i)|)\frac{m}{i^2}\overset{\mathbb{P}}{\to}\frac{4p_2^2}{(d-2p_2)^2}. \tag{7}$$

We stress that determining the typical behavior of the network disintegration is not enough to prove our main result. In addition, for each of our results, we need to show that it is extremely unlikely to be far from its typical behavior, which we consider in Section 3.4. Moreover, we combine these large deviations results to show the following result in Section 3.6.

Theorem 2.3.

Suppose $i:=i_m$ edges have been removed uniformly at random from the connected configuration model $\overline{CM}_n(\mathbf{d})$, where $\sqrt{m}\ll i\ll m^{\alpha}$ for some $\alpha\in(1/2,1)$. Then,

$$\mathbb{P}\left(m-i-|\hat{E}_m(i)|>i^{\alpha}\right)=O(m^{-3}). \tag{8}$$

Moreover, for every $k:=k_m$ for which $k=o(m^{\alpha})$ for some $\alpha\in(0,1)$,

$$\mathbb{P}\left(m-i-|\hat{E}_m(i)|>i^{\alpha}\textrm{ for some }i\leq k\right)=o(m^{-1/2}). \tag{9}$$

2.3 Impact of disintegration on the failure process

The sequential removal process gives rise to the load surge function at every edge, and we need to compare these values to the surplus capacities in every component until the cascade stops. Keeping track of the cascading failure process in every component may seem cumbersome at first glance. However, due to the way the connected configuration model is likely to disintegrate, it turns out that only what happens in the giant component matters.

Intuitively, this can be understood as follows. By Theorems 2.1 and 2.2, removing any sublinear number $i_m$ of edges always leaves $o(i_m)$ edges outside the giant. Therefore, if $k:=k_m\ll m$ edges are sequentially removed, then only $o(k)$ of these edges were contained outside the giant upon removal. Moreover, even if the cascading failure process struck every edge of the components outside the giant, the contribution to the failure size would be at most $o(k)$. This is negligible with respect to the $k-o(k)$ failures that occur in the giant. The failure size $A_{n,\mathbf{d}}$ should therefore be well-approximated by the failure size in the giant. We formalize these ideas in Section 4.

More specifically, write

$$\begin{aligned} \hat{A}_{n,\mathbf{d}} &= \#\textrm{ edges contained in the giant upon failure during the cascade},\\ \tilde{A}_{n,\mathbf{d}} &= \#\textrm{ edges contained outside the giant upon failure during the cascade}, \end{aligned}$$

and hence $A_{n,\mathbf{d}}=\hat{A}_{n,\mathbf{d}}+\tilde{A}_{n,\mathbf{d}}$. Moreover, define

$$\begin{aligned} |\tilde{E}_m(i)| &= \#\textrm{ remaining edges outside the giant when $i$ edges are removed uniformly at random},\\ \kappa(i) &= \#\textrm{ edges removed from the giant when $i$ edges are removed uniformly at random}. \end{aligned}$$

The main idea is that since the number of edges outside the giant is likely to be $o(i)$ when a sublinear number $i$ of edges are removed uniformly at random, the contribution of edge failures that occur outside the giant is asymptotically negligible. That is, we bound the probability of the event $\{A_{n,\mathbf{d}}\geq k\}$ under the cascading failure process by the probability of the same event in two related processes. For the upper bound, we consider the scenario where all edges in the small components that disconnect from the giant immediately fail upon disconnection. For the lower bound, we consider the scenario where none of the edges in the small components that disconnect from the giant fail.

To make this rigorous, we first consider the probability that the number of edge failures in the giant exceeds $\kappa(k)$. That is, if we sequentially remove $k$ edges uniformly at random, a set of $\kappa(k)$ edges were contained in the giant upon failure. We are interested in the probability that the coupled surplus capacities of all these edges are insufficient to deal with the corresponding load surges. By translating this setting to a first-passage problem of a random walk bridge over a moving boundary, we show the following result in Section 4.3.

Proposition 2.4.

If $k=o(m^{\alpha})$ for some $\alpha\in(0,1)$, then as $n\rightarrow\infty$,

$$\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(k)\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.$$

Second, we use this result to derive an upper bound for the failure size tail. Trivially, the failure size is bounded by the number of edges that are contained in the giant component upon failure plus all the edges that exist outside the giant after the cascade has stopped. We introduce the stopping time

$$\upsilon(k)=\min\{j\in\mathbb{N}:j+|\tilde{E}_m(j)|\geq k\}, \tag{10}$$

the minimum number of edges that need to be removed uniformly at random for the number of edges outside the giant to exceed $k-\upsilon(k)$. In other words, after we have removed $\upsilon(k)$ edges uniformly at random, the sum of the number of edges outside the giant and the number of removed edges exceeds $k$. The number of edge removals in the giant is given by $\kappa(\upsilon(k))$, and hence

$$\left\{A_{n,\mathbf{d}}\geq k\right\}\subseteq\left\{\hat{A}_{n,\mathbf{d}}\geq\kappa(\upsilon(k))\right\}.$$

We prove that $\upsilon(k)=k-o(k)$ with sufficiently high probability, and hence

$$\mathbb{P}\left(A_{n,\mathbf{d}}\geq k\right)\leq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\upsilon(k))\right)\sim\frac{2\theta}{\sqrt{2\pi}}\left(k-o(k)\right)^{-1/2}\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.$$

Third, we derive a lower bound. We note that $\hat{A}_{n,\mathbf{d}}\leq A_{n,\mathbf{d}}$, and hence

$$\left\{A_{n,\mathbf{d}}\geq k\right\}\supseteq\left\{\hat{A}_{n,\mathbf{d}}\geq k\right\}=\left\{\hat{A}_{n,\mathbf{d}}\geq\kappa(\varrho(k))\right\},$$

where

$$\varrho(k)=\min\{j\in\mathbb{N}:\kappa(j)\geq k\}. \tag{11}$$

That is, $\varrho(k)$ is the number of edges that need to be removed uniformly at random for $k$ failures to have occurred in the giant. We show that $\varrho(k)=k+o(k)$ with sufficiently high probability, and hence

$$\mathbb{P}\left(A_{n,\mathbf{d}}\geq k\right)\geq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\varrho(k))\right)\sim\frac{2\theta}{\sqrt{2\pi}}\left(k+o(k)\right)^{-1/2}\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.$$

Since the upper and lower bounds coincide, this yields Theorem 1.3. We formalize this in Section 4.4.

3 Disintegration of the network

As explained in the previous section, the cascading failure process can be decoupled into two elements of randomness. In this section, we study the first element of randomness: the disintegration of the network as we sequentially remove edges uniformly at random. In view of Theorem 1.3, our main focus is the case where we remove only $O(m^{1-\delta})$ edges with $\delta\in(0,1)$. In particular, our goal is to prove Theorems 2.1-2.3.

This section is structured as follows. First, we show in Section 3.1 that the sequential removal process is well-approximated by a process where we remove each edge independently with a certain probability, also known as percolation. This is a well-studied process in the literature, particularly in the case of the configuration model [18]. In Section 3.2, we provide an alternative way to construct a percolated configuration model by means of an algorithm developed in [18]. This alternative construction allows for simpler analysis, and is used in Section 3.3 to show that for the percolated (connected) configuration model, typically, the components outside the giant component are either isolated nodes, line components or possibly cycle components. In Section 3.4, we derive large deviations bounds on the number of edges outside the giant. We prove Theorem 2.1 in Section 3.5, and in addition, we provide a large deviations bound for the number of edges that need to be removed for the connected configuration model to become disconnected. We prove Theorems 2.2 and 2.3 in Section 3.6. Although these sections focus on the (connected) configuration model under a sublinear number of edge removals, we briefly recall results known in the literature on the typical behavior of the random graph beyond the sublinear window in Section 3.7. This will be important to obtain intuition about what happens to the failure size behavior beyond the sublinear case, the topic of interest in Section 5.

3.1 Percolation on the connected configuration model

To prove our results regarding the sequential removal process, we relate this process to another one where each edge is removed independently with a certain removal probability, also known as percolation. This is illustrated by the following lemma.

Lemma 3.1.

Let $G=(V,E)$ be a graph, and write $E'(G(q))$ for the set of edges that have been removed by percolation with parameter $q\in[0,1]$, and $\tilde{E}'(i)$ for the set of removed edges when $i$ edges are removed uniformly at random. It holds for every $i\in[m]$, $q\in[0,1]$, and $B\subseteq E$ such that $|B|=i$,

$$\mathbb{P}(\tilde{E}'(i)=B)=\mathbb{P}\left(E'(G(q))=B\;\big|\;|E'(G(q))|=i\right). \tag{12}$$

Moreover, if $q=q_m=\omega(m^{-1})$, then as $m\rightarrow\infty$,

$$\frac{|E'(G(q))|}{qm}\overset{\mathbb{P}}{\rightarrow}1. \tag{13}$$

Lemma 3.1 is a direct consequence of the concentration of the binomial random variable, and the proof is given in Appendix B. This lemma establishes that sequentially removing $i:=i_m=\omega(1)$ edges uniformly at random is well-approximated by a percolation process with removal probability $q=i/m$. In particular, this holds for the (connected) configuration model. This allows us to study many questions involving the connected configuration model subject to uniform edge removals in the setting of percolation. The study of percolation processes on finite (deterministic or random) graphs is a well-established research field, which dates back to the work of Gilbert [14] and is still very active these days. In particular, percolation on the configuration model is a fairly well-understood process, as established by Janson in [18].

It is known that there exists a critical parameter $q_c:=1-\mathbb{E}[D]/\mathbb{E}[D(D-1)]$ such that if $q<q_c$, then the largest component of the percolated (connected) configuration model contains a non-vanishing proportion of the vertices and edges [25, 18]. In particular, the formula for $q_c$ implies that in order for a phase transition to appear, it is necessary that $\mathbb{E}[D(D-1)]/\mathbb{E}[D]\in(1,\infty)$. If $\mathbb{E}[D(D-1)]/\mathbb{E}[D]\leq 1$, then the largest component already has a sublinear size in $n$ even before percolation, while if $\mathbb{E}[D(D-1)]/\mathbb{E}[D]=\infty$, then there exists a connected component of linear size in the percolated graph for every $q\in(0,1)$. Typically, the (w.h.p. unique) component of linear size is referred to as the giant component. All other components are likely to be much smaller, i.e. of size $O_{\mathbb{P}}(\log n)$ under some regularity conditions on the limiting degree sequence [25]. If $q\geq q_c$, there is no giant component in the percolated graph and all the components have sublinear size [18].

In order to derive Theorem 1.3, these results in the literature are not sufficient. In particular, in view of Theorems 2.1 and 2.2, a more detailed description of the network structure is needed in the case that $q=o(1)$. A critical tool to derive the results in this setting is the explosion algorithm, which we describe next.

3.2 Explosion algorithm

Key to proving Theorems 2.1 and 2.2 is the explosion algorithm, designed by Janson in [18]. This algorithm prescribes that instead of applying percolation on $CM_n(\mathbf{d})$ with removal probability $q$, we can run the procedure described in Algorithm 1.

Input: A set of $n$ vertices, such that for every $j\in[n]$ the vertex $v_j$ has $d_j$ half-edges attached to it according to the degree sequence $\mathbf{d}$.
Output: Graph $CM_n(\mathbf{d},q)$.

  1. Remove each half-edge independently with probability $1-(1-q)^{1/2}$. Let $R_n$ be the number of removed half-edges.

  2. Add $R_n$ degree-one vertices to the vertex set. Define the new degree sequence as $\mathbf{d}'$, with $N=n+R_n$ vertices.

  3. Pair $CM_N(\mathbf{d}')$.

  4. Remove $R_n$ vertices of degree 1 from $CM_N(\mathbf{d}')$ uniformly at random.

Algorithm 1: Explosion algorithm [18].

Janson proved in [18] that the algorithm produces a random graph that is statistically indistinguishable from $CM_n(\mathbf{d},q)$, the graph obtained by removing every edge in the (not necessarily connected) $CM_n(\mathbf{d})$ with probability $q\in[0,1]$. Yet, the graph obtained by the explosion method is significantly easier to study, as it is simply a configuration model with a new degree sequence from which a number of degree-one vertices have been removed, where the latter operation does not significantly change the structure of the graph. Since the graphs obtained from the percolation process and the explosion method are identically distributed, we use the denomination $CM_n(\mathbf{d},q)$ both for the configuration model after percolation and for the graph we obtain via the explosion algorithm.
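For concreteness, here is a short Python sketch of steps 1-2 of Algorithm 1 (the function name is ours); steps 3-4 then amount to pairing the exploded degree sequence as a configuration model and deleting $R_n$ uniformly chosen degree-one vertices:

```python
import math
import random

def explode_degrees(degrees, q):
    """Steps 1-2 of the explosion algorithm: delete each half-edge
    independently with probability 1 - (1 - q)^{1/2}, then add one
    degree-one vertex per deleted half-edge. Returns (d', R_n)."""
    p_remove = 1 - math.sqrt(1 - q)
    exploded, R_n = [], 0
    for d in degrees:
        kept = sum(random.random() >= p_remove for _ in range(d))
        R_n += d - kept
        exploded.append(kept)
    return exploded + [1] * R_n, R_n

# Pairing CM_N(d') uniformly and removing R_n random degree-one vertices
# then yields a graph distributed as CM_n(d, q).
```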

Remark 3.2.

The observant reader may notice that the explosion algorithm is designed for the configuration model that is not necessarily initially connected. Fortunately, as established in [13], connectivity has a non-vanishing probability under Condition 1.1 as $n\to\infty$. This is a crucial observation, since it implies that certain events that happen w.h.p. on $CM_n(\mathbf{d})$ also happen w.h.p. on $\overline{CM}_n(\mathbf{d})$. More specifically, for a sequence of events $(A_n)_{n\in\mathbb{N}}$,

$$\mathbb{P}(A_n\mid CM_n(\mathbf{d})\text{ is connected})\leq\frac{\mathbb{P}(A_n)}{\mathbb{P}(CM_n(\mathbf{d})\text{ is connected})}, \tag{14}$$

where under Condition 1.1 [13],

$$\liminf_{n\to\infty}\mathbb{P}(CM_n(\mathbf{d})\text{ is connected})>0.$$

In particular, this implies that if for some sequence of random variables $(X_n)_{n\in\mathbb{N}}$ it holds that $X_n\overset{\mathbb{P}}{\to}c$ for some constant $c\in\mathbb{R}$ in $CM_n(\mathbf{d})$, then the same statement holds for the graph $\overline{CM}_n(\mathbf{d})$. Similarly, if $X_n=o_{\mathbb{P}}(a_n)$ for some sequence $(a_n)_{n\in\mathbb{N}}$ in $CM_n(\mathbf{d})$, then this is also true for $\overline{CM}_n(\mathbf{d})$.

Next, we point out some properties we use extensively in this section. First, we observe that if $q=o(1)$, then by Taylor expansion, the removal probability in the explosion algorithm satisfies $1-(1-q)^{1/2}=(q/2)(1+o(1))$. Therefore, the probability that a vertex of degree $l$ retains $j$ half-edges satisfies

$$p_{l,j}=\binom{l}{j}\left((1-q)^{1/2}\right)^{j}\left(1-(1-q)^{1/2}\right)^{l-j}=\binom{l}{j}(q/2)^{l-j}(1+o(1)) \tag{15}$$

if $q=o(1)$ and $j\leq l$. Moreover, let $n_{l,j}$ represent the number of vertices of degree $l$ that retain $j$ half-edges. That is, the $n_{l,j}$ are random variables with distribution $n_{l,j}\overset{d}{=}\textrm{Bin}(n_l,p_{l,j})$. Due to Markov’s inequality, it holds that

$$\mathbb{P}(n_{l,j}>0)\leq n_l p_{l,j}. \tag{16}$$

Moreover,

$$\begin{array}{ll} n_{l,j}\overset{d}{\rightarrow}\textrm{Poi}(a) & \textrm{if }n_l p_{l,j}\rightarrow a\in(0,\infty),\\ n_{l,j}=n_l p_{l,j}(1+o_{\mathbb{P}}(1)) & \textrm{if }n_l p_{l,j}\rightarrow\infty, \end{array} \tag{19}$$

due to the Poisson limit theorem and the law of large numbers, respectively. Note that

$$\mathbb{E}\left(\sum_{l=h}^{\infty}n_{l,0}\right)=\sum_{l=h}^{\infty}n_l p_{l,0}\leq n\frac{(q/2)^h}{1-q/2}(1+o(1))=O(nq^h),\qquad h\geq 2. \tag{20}$$

By Markov’s inequality, it holds for every $\epsilon>0$ that

$$\mathbb{P}\left(\sum_{l=h}^{\infty}n_{l,0}>\epsilon\right)=O(nq^h),\qquad h\geq 2. \tag{21}$$

Similarly, it follows that for all $q=o(1)$,

$$\mathbb{E}\left(\sum_{l=h}^{\infty}n_{l,1}\right)\leq n\sum_{l=h}^{\infty}l(q/2)^{l-1}(1+o(1))=O(nq^{h-1}),\qquad h\geq 2, \tag{22}$$

and hence for every $\epsilon>0$,

$$\mathbb{P}\left(\sum_{l=h}^{\infty}n_{l,1}>\epsilon\right)=O(nq^{h-1}),\qquad h\geq 2. \tag{23}$$

Finally, we observe that for every $1/m\ll q\ll 1$, by the law of large numbers,

$$R_n=2m(1-(1-q)^{1/2})(1+o_{\mathbb{P}}(1))=\frac{nd}{2}q(1+o_{\mathbb{P}}(1)). \tag{24}$$

3.3 Typical structure of the percolated configuration model

We recall that our focus is on the case where the number of edges that are removed from the giant is of order $o(m)$. In view of Lemma 3.1, this number is well-approximated by the number of edge removals in a percolation process with removal probability $q=o(1)$. In particular, in this regime there is a unique giant component w.h.p. and other components are likely to be much smaller, i.e. w.h.p. the number of vertices and edges outside the giant is of order $o(m)$. Even more can be said about the structure of these components. In this section, we show that these components are typically isolated nodes, line components or possibly isolated cycles. More complex structures are relatively rare.

Remark 3.3.

Remark 3.2 implies that it often suffices to consider $CM_n(\mathbf{d})$ to prove an analogous result for $\overline{CM}_n(\mathbf{d})$. Moreover, the explosion algorithm is used to construct $CM_n(\mathbf{d},q)$ from $CM_N(\mathbf{d}')$ by the removal of $R_n$ degree-one vertices, and hence we often also focus on $CM_N(\mathbf{d}')$ in this section. We point out that the operation of removing degree-one vertices does not affect the connectivity of a component. Moreover, the probability that the giant component in $CM_N(\mathbf{d}')$ is not unique is exponentially small [25]. If $q=o(1)$, then the probability that $R_n$ is not of sublinear size is also exponentially small. Therefore, the giant component in $CM_N(\mathbf{d}')$ remains the giant component in $CM_n(\mathbf{d},q)$ with extremely high probability, i.e. the probability of the complementary event decays exponentially. Therefore, the number of edges outside the giant in $CM_N(\mathbf{d}')$ provides an upper bound (with extremely high probability) for the number of edges outside the giant in $CM_n(\mathbf{d},q)$. Moreover, since line components outside the giant in $CM_N(\mathbf{d}')$ either remain line components or become isolated nodes after the removal of degree-one vertices, the number of edges in $CM_n(\mathbf{d},q)$ outside the giant that are not contained in line components is bounded from above (with extremely high probability) by the same quantity in $CM_N(\mathbf{d}')$.

First, we explore the degree sequence $\mathbf{d}'$ of $CM_N(\mathbf{d}')$ in the explosion method from Algorithm 1. Analogously to the notation for the original degree sequence $\mathbf{d}$, we write $n_i'$ for the number of vertices of degree $i$ in $\mathbf{d}'$ and $p_i':=\lim_{n\rightarrow\infty}n_i'/n$ for the limiting fraction.

Lemma 3.4.

Consider the explosion algorithm from Algorithm 1 with initial graph $CM_n(\mathbf{d})$ satisfying Condition 1.1 and $q=i/m$ with $1\ll i\ll m$. The degree sequence $\mathbf{d}'$ after explosion satisfies the following properties. For the number of vertices of degree zero,

$$\begin{array}{ll} \mathbb{P}(n_0'\neq 0)=O(i^2/m), & \textrm{ if }i\ll m^{1/2},\\ n_0'\overset{d}{\to}\textrm{Poi}\Big(\frac{c^2 p_2}{2d}\Big), & \textrm{ if }i=c\sqrt{m},\\ n_0'=\frac{p_2}{2d}\frac{i^2}{m}(1+o_{\mathbb{P}}(1)), & \textrm{ if }\sqrt{m}\ll i\ll m. \end{array} \tag{28}$$

The number of degree-one vertices in $CM_N(\mathbf{d}')$ satisfies

$$n_1'=\left(i+\frac{2p_2}{d}i\right)(1+o_{\mathbb{P}}(1)). \tag{29}$$

Finally, the fraction of vertices of degree $k\geq 2$ in $CM_N(\mathbf{d}')$ converges to the limiting fraction of degree-$k$ vertices in $CM_n(\mathbf{d})$, i.e.

$$p_k'=p_k,\qquad\textrm{ for all }k\geq 2. \tag{30}$$

Proof.

Recall that if $q=o(1)$, then

$$\begin{aligned} 1-(1-q)^{1/2} &=(q/2)(1+o(1))=(i/(2m))(1+o(1)),\\ p_{l,j} &=\binom{l}{j}(q/2)^{l-j}(1+o(1)), \end{aligned}$$

and $n_{l,j}\overset{d}{=}\textrm{Bin}(n_l,p_{l,j})$. Using (19), we obtain

$$n_{2,0}=\frac{p_2 i^2}{2dm}(1+o_{\mathbb{P}}(1))\ \text{ if }m^{1/2}\ll i\ll m,\qquad n_{2,0}\overset{d}{\to}\textrm{Poi}\Big(\frac{c^2 p_2}{2d}\Big)\ \text{ if }i=c\sqrt{m}.$$

By (20) we know that in both cases, these are the only leading-order contributions to $n_0'$, since $n_0'=\sum_{l=2}^{\infty}n_{l,0}$ with $\sum_{l=3}^{\infty}n_{l,0}=O_{\mathbb{P}}(ni^3/m^3)=o_{\mathbb{P}}(i^2/m)$. From (21), we obtain that

$$\mathbb{P}\left(n_0'\neq 0\right)=\mathbb{P}\left(\sum_{l=2}^{\infty}n_{l,0}\neq 0\right)=O(i^2/m),$$

if $i\ll\sqrt{m}$. Similarly, it follows from (19), (23) and (24) that

$$n_1'=(R_n+n_{2,1})(1+o_{\mathbb{P}}(1))=\left(i+\frac{2p_2}{d}i\right)(1+o_{\mathbb{P}}(1)).$$

Finally, we note that, since every removal of a half-edge in the first step of the explosion algorithm changes the degree of one vertex,

$$n_l-R_n\leq n_l'\leq n_l+R_n,$$

with $R_n=O_{\mathbb{P}}(i)=o_{\mathbb{P}}(n)$, and hence $p_l'\overset{\mathbb{P}}{\to}p_l$ for all $l\geq 2$.

We use the degree sequence $\mathbf{d}'$ to study the structure of the components outside the giant in $\overline{CM}_n(\mathbf{d},q)$. Again, we will show that these components are either isolated nodes, line components or possibly isolated cycles. More complex structures are rather unlikely to appear as the network disintegrates.

We begin by proving a bound on the number of edges belonging to components that have a more complex structure, i.e. that contain a vertex of degree at least three, when $q\ll m^{-\delta}$ for some $\delta>0$. We show that this is not the leading-order term for the number of edges outside the giant, since the number of lines and isolated vertices is much larger. For a graph $G$, define $d_{\max}(G)$ as the largest degree of any vertex in $G$ and $E(G)$ as the edge set of $G$. For every edge $e$ and vertex $v$, write $\mathcal{C}(e)$ and $\mathcal{C}(v)$ for the connected component that contains $e$ or $v$, respectively. Finally, let $\mathcal{C}_{\rm max}$ denote the largest component in a graph $G$.

Proposition 3.5.

Consider the graphs $\overline{CM}_n(\mathbf{d},q)$, $CM_n(\mathbf{d},q)$ and $CM_N(\mathbf{d}')$ with $q=i/m$, where $m^{\delta}\ll i\ll m^{1-\delta}$ for some $\delta>0$. For all three graphs, if $i=o(\sqrt{m})$, then

$$\mathbb{P}(\#\{e\notin\mathcal{C}_{\rm max}\colon d_{\max}(\mathcal{C}(e))\geq 3\}\neq 0)=o(i^2/m). \tag{31}$$

For all three graphs, if $i\gg\sqrt{m}$, then

$$\#\{e\notin\mathcal{C}_{\rm max}\colon d_{\max}(\mathcal{C}(e))\geq 3\}=o_{\mathbb{P}}\left(i^2/m\right). \tag{32}$$

Proof.

We prove the result using the explosion algorithm. First, recall Remark 3.2. Applying (14) to the events

$$\left\{\#\{e\notin\mathcal{C}_{\rm max}\colon d_{\max}(\mathcal{C}(e))\geq 3\}\neq 0\right\},\qquad \left\{\frac{m}{i^2}\#\{e\notin\mathcal{C}_{\rm max}\colon d_{\max}(\mathcal{C}(e))\geq 3\}>\epsilon\right\},\quad\epsilon>0,$$

implies that to prove (31) and (32) for $\overline{CM}_n(\mathbf{d},q)$, it suffices to show that (31) and (32) hold for $CM_n(\mathbf{d},q)$. In addition, recall Remark 3.3, implying that the number of edges in $CM_n(\mathbf{d},q)\setminus\mathcal{C}_{\rm max}$ in components containing a node of degree at least three is bounded by the same quantity in $CM_N(\mathbf{d}')\setminus\mathcal{C}_{\rm max}$ with sufficiently high probability. In other words, it suffices to prove that (31) and (32) hold for $CM_N(\mathbf{d}')$.

Recall the degree distribution of $CM_N(\mathbf{d}')$ from Lemma 3.4. It follows from the proof of [25, Lemma 11] that for every supercritical degree sequence $\mathbf{d}'$ (i.e. $\mathbb{E}[D_n'(D_n'-1)]/\mathbb{E}[D_n']>1$) and any $\gamma\in(0,\infty)$, there exists $c=c(\mathbf{d}')<1$ such that in $CM_N(\mathbf{d}')$, $\mathbb{P}(\exists\,\mathcal{C}\neq\mathcal{C}_{\rm max}:|E(\mathcal{C})|>\gamma\log n)\leq n^2 c^{\gamma\log n}$. Therefore, for $\gamma$ large enough,

$$\mathbb{P}(\exists\,\mathcal{C}\neq\mathcal{C}_{\rm max}:|E(\mathcal{C})|>\gamma\log n)=o(n^{-1}). \tag{33}$$

Consequently, since the number of edges in the giant component is much larger than $\gamma\log n$, it suffices to prove the claims (31) and (32) for

$$\#\left\{e\in CM_N(\mathbf{d}'):|E(\mathcal{C}(e))|\leq\gamma\log n,\,d_{\max}(\mathcal{C}(e))\geq 3\right\}. \tag{34}$$

For this purpose, we use the standard exploration algorithm of $CM_N(\mathbf{d}')$ used in the literature (see e.g. [13, 11] for some similar formulations). At each time $t\in\mathbb{N}$, we define the sets of half-edges $\{\mathcal{A}_t,\mathcal{D}_t,\mathcal{N}_t\}$ as the active, dead and neutral sets, and explore them in the following way:

  1. At step $t=0$, pick a vertex $v\in[n]$ with $d_v\geq 3$ uniformly at random and set all its half-edges as active. All other half-edges are set as neutral, and $\mathcal{D}_0=\emptyset$.

  2. At each step $t$, pick a half-edge $e_1(t)$ in $\mathcal{A}_t$ uniformly at random, and pair it with another half-edge $e_2(t)$ chosen uniformly at random in $\mathcal{A}_t\cup\mathcal{N}_t$. Set $e_1(t),e_2(t)$ as dead. If $e_2(t)\in\mathcal{N}_t$, then find the vertex $v(e_2(t))$ incident to $e_2(t)$ and activate all its other half-edges.

  3. Terminate the process when $\mathcal{A}_t=\emptyset$.
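A direct implementation of this exploration (a sketch under the same conventions as our earlier snippets; names are ours) pairs half-edges on the fly and returns the number of steps until the active set empties, i.e. the number of edges in $\mathcal{C}(v)$:

```python
import random

def explore_component_edges(degrees, v0):
    """Explore C(v0) in the configuration model: repeatedly pair a uniform
    active half-edge with a uniform partner among the active and neutral
    half-edges, activating newly reached vertices. Assumes even total degree."""
    active = [(v0, c) for c in range(degrees[v0])]        # v0's half-edges start active
    neutral = [(v, c) for v in range(len(degrees)) if v != v0
                      for c in range(degrees[v])]
    steps = 0
    while active:
        active.pop(random.randrange(len(active)))           # e1(t), set as dead
        idx = random.randrange(len(active) + len(neutral))  # e2(t), uniform partner
        if idx < len(active):
            active.pop(idx)                               # partner was active: both die
        else:
            v, _ = neutral.pop(idx - len(active))         # partner was neutral:
            active += [h for h in neutral if h[0] == v]   # activate v's other half-edges
            neutral = [h for h in neutral if h[0] != v]
        steps += 1                                        # one edge paired per step
    return steps
```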

We observe that when $\mathcal{A}_t=\emptyset$, we have exhausted the exploration of the connected component $\mathcal{C}(v)$, and the number of steps performed by the exploration algorithm is the number of edges in $\mathcal{C}(v)$. In order to prove the claim, we thus need to prove that with sufficiently high probability there is no $t\leq\gamma\log n$ such that $\mathcal{A}_t=\emptyset$. Let $(Z_t^{(v)})_{t\geq 0}$ count the number of active half-edges starting from a vertex $v$ with $d_v\geq 3$. Note that at step $t$ the process can only go down if i) $e_2(t)\in\mathcal{N}_t$ and its incident vertex has degree one, causing $Z_t^{(v)}=Z_{t-1}^{(v)}-1$, or ii) $e_2(t)\in\mathcal{A}_t$, causing $Z_t^{(v)}=Z_{t-1}^{(v)}-2$. We denote these events by $A(t)$ and $B(t)$, respectively. Since $Z_0^{(v)}\geq 3$, this counting process needs to decrease by at least three in total for the exploration process to die out. Moreover, the value of the counting process is small at the time steps where the process decreases. More specifically, $\left\{\mathcal{A}_{\gamma\log n}=\emptyset\right\}\subseteq F_1\cup F_2\cup F_3$, where

$$\begin{aligned} F_1 &=\bigcup_{s_1,s_2,s_3\leq\gamma\log n} A(s_1)\cap A(s_2)\cap A(s_3)\cap\{Z_{s_1}^{(v)},Z_{s_2}^{(v)},Z_{s_3}^{(v)}\leq 3\},\\ F_2 &=\bigcup_{s_1,s_2\leq\gamma\log n} A(s_1)\cap B(s_2)\cap\{Z_{s_1}^{(v)},Z_{s_2}^{(v)}\leq 3\},\\ F_3 &=\bigcup_{s_1,s_2\leq\gamma\log n} B(s_1)\cap B(s_2)\cap\{Z_{s_1}^{(v)},Z_{s_2}^{(v)}\leq 4\}. \end{aligned}$$

We can bound the probabilities of these events by

$$\begin{aligned} \mathbb{P}(F_1) &\leq\left(\gamma\log n\right)^3\left(1+\frac{2p_2}{d}\right)^3\frac{i^3}{m^3}(1+o(1))=O\left(\frac{i^3\log^3 m}{m^3}\right),\\ \mathbb{P}(F_2) &\leq\left(\gamma\log n\right)^2\left(1+\frac{2p_2}{d}\right)\frac{i}{m}(1+o(1))\cdot\frac{3}{m-2\gamma\log n}=O\left(\frac{i\log^2 m}{m^2}\right),\\ \mathbb{P}(F_3) &\leq\left(\gamma\log n\right)^2\left(\frac{4}{m-2\gamma\log n}\right)^2=O\left(\frac{\log^2 m}{m^2}\right). \end{aligned}$$

Consequently, using the union bound, we obtain that for every $i$ that satisfies $m^{\epsilon}\ll i\ll m^{1-\epsilon}$ for some $\epsilon>0$,

$$\mathbb{E}\left[\#\left\{e:|E(\mathcal{C}(e))|\leq\gamma\log n,\,d_{\max}(\mathcal{C}(e))\geq 3\right\}\right]\leq n\gamma\log n\left(\mathbb{P}(F_1)+\mathbb{P}(F_2)+\mathbb{P}(F_3)\right)=o\left(\frac{i^2}{m}\right). \tag{35}$$

By Markov’s inequality, it follows that

$$\mathbb{P}\left(\#\left\{e:|E(\mathcal{C}(e))|\leq\gamma\log n,\,d_{\max}(\mathcal{C}(e))\geq 3\right\}\neq 0\right)=o(i^2/m),$$

and

$$\#\left\{e:|E(\mathcal{C}(e))|\leq\gamma\log n,\,d_{\max}(\mathcal{C}(e))\geq 3\right\}=o_{\mathbb{P}}\left(i^2/m\right).$$

The following proposition specifies the number of vertices and edges in lines and the number of isolated nodes that are disconnected from the giant by a percolation process, which constitute the leading-order term for $CM_n(\mathbf{d},q)\setminus\mathcal{C}_{\rm max}$.

Proposition 3.6.

Consider $CM_n(\mathbf{d},q)$ with $q=i/m$ and $\sqrt{m}\ll i\ll m$. Define $L_k(n)$ as the number of isolated lines of length $k$ and $N_0(n)$ as the number of isolated vertices. Then,

$$\frac{m}{i^2}\Big(N_0(n)+\sum_{k=2}^{\infty}kL_k(n)\Big)\overset{\mathbb{P}}{\to}\frac{2dp_2}{(d-2p_2)^2}, \tag{36}$$

$$\frac{m}{i^2}\Big(\sum_{k=2}^{\infty}(k-1)L_k(n)\Big)\overset{\mathbb{P}}{\to}\frac{4p_2^2}{(d-2p_2)^2}. \tag{37}$$

Instead, if $i=o(\sqrt{m})$, then

$$\mathbb{P}\Big(N_0(n)+\sum_{k=2}^{\infty}kL_k(n)\neq 0\Big)=O(i^2/m). \tag{38}$$

Moreover, (36)-(38) also hold for $\overline{CM}_n(\mathbf{d},q)$.

Before moving to the proof of Proposition 3.6, we first consider the higher moments of $L_k'(n)$, $k\geq 1$, the number of isolated lines of length $k$ in $CM_N(\mathbf{d}')$.

Lemma 3.7.

For any sequence $\mathbf{r}=\{r_2,\ldots,r_k\}$ with $k\geq 2$ of positive integer values, it holds as $n\rightarrow\infty$,

$$\mathbb{E}[L_2'(n)^{r_2}\cdots L_k'(n)^{r_k}]\Big(\frac{m}{i^2}\Big)^{r_2+\ldots+r_k}\to\prod_{j=2}^k\left(\frac{1}{4}\Big(1+\frac{2p_2}{d}\Big)^2\Big(\frac{2p_2}{d}\Big)^{j-2}\right)^{r_j}. \tag{39}$$

    From Lemma 3.7, we can bound the number of edges in isolated line components in CMN(𝐝)CM_{N}(\mathbf{d}^{\prime}).

    Corollary 3.8.

    For every j1j\geq 1, as nn\rightarrow\infty,

    𝔼[(mi2k=2kLk(n))j]((dp2)(d+2p2)22d(d2p2)2)j.\displaystyle\mathbb{E}\Big{[}\Big{(}\frac{m}{i^{2}}\sum_{k=2}^{\infty}kL_{k}^{\prime}(n)\Big{)}^{j}\Big{]}\to\Big{(}\frac{(d-p_{2})(d+2p_{2})^{2}}{2d(d-2p_{2})^{2}}\Big{)}^{j}. (40)
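As a consistency check on the constant in (40) for j=1, summing the limiting means from Lemma 3.7 against the geometric series \sum_{k\geq 2}kx^{k-2}=(2-x)/(1-x)^2 with x=2p_2/d gives

\sum_{k=2}^{\infty}k\cdot\frac{1}{4}\Big(1+\frac{2p_2}{d}\Big)^2\Big(\frac{2p_2}{d}\Big)^{k-2} = \frac{(d+2p_2)^2}{4d^2}\cdot\frac{2d(d-p_2)}{(d-2p_2)^2} = \frac{(d-p_2)(d+2p_2)^2}{2d(d-2p_2)^2},

which is exactly the right-hand side of (40).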

The proofs of Lemma 3.7 and Corollary 3.8 are given in Appendix B. Next, we prove Proposition 3.6.

Proof of Proposition 3.6.

Again, note that it follows from Corollary 3.8 that

\mathbb{E}\Big[\sum_{k=2}^{\infty}kL_k'(n)\Big] = O\left(\frac{i^2}{m}\right).

By Markov’s inequality and Lemma 3.4, if i=o(\sqrt{m}),

\mathbb{P}\left(n_0'+\sum_{k=2}^{\infty}kL_k'(n)\neq 0\right) = O\left(\frac{i^2}{m}\right).

Recall that in the final step of the explosion algorithm, we only remove vertices of degree one. Therefore, the only way for the number of vertices in line components and isolated vertices to increase is when a component in CM_N(\mathbf{d}') with a vertex of degree at least three turns into a line or an isolated node. By Proposition 3.5, such components appear in CM_N(\mathbf{d}') with probability o(i^2/m), and we conclude that (38) holds.

    In order to prove (36) and (37), we also need second moments. By Corollary 3.8,

\mathbb{E}\Big[\Big(\frac{m}{i^2}\sum_{k=2}^{\infty}kL_k'(n)\Big)^2\Big] = \mathbb{E}\Big[\frac{m}{i^2}\sum_{k=2}^{\infty}kL_k'(n)\Big]^2(1+o(1)),

and thus, by the second moment method,

\frac{m}{i^2}\sum_{k=2}^{\infty}kL_k'(n) \overset{\mathbb{P}}{\to} \frac{(d-p_2)(d+2p_2)^2}{2d(d-2p_2)^2}.

Since n_0'=\frac{p_2i^2}{2dm}(1+o_{\mathbb{P}}(1)) by Lemma 3.4, we obtain that

\frac{m}{i^2}\Big(n_0'+\sum_{k=2}^{\infty}kL_k'(n)\Big) \overset{\mathbb{P}}{\to} \frac{p_2}{2d}+\frac{(d-p_2)(d+2p_2)^2}{2d(d-2p_2)^2}.

The same arguments can also be used to prove concentration of \sum_{k=2}^{\infty}(k-1)L_k'(n), since it is dominated by \sum_{k=2}^{\infty}kL_k'(n). That is,

\frac{m}{i^2}\sum_{k=2}^{\infty}(k-1)L_k'(n) \overset{\mathbb{P}}{\to} \frac{(d+2p_2)^2}{4(d-2p_2)^2}.

We computed the number of vertices and edges that are contained in isolated node components or in line components in CM_N(\mathbf{d}'). To obtain the corresponding value for CM_n(\mathbf{d},q), we need to subtract the number of degree-one vertices that are part of line components and are removed in the last step of the explosion algorithm, and add the number of vertices whose components turn into a line or an isolated vertex by the removal of some degree-one vertices. By Proposition 3.5, the number of components that can turn into a line or isolated vertex is bounded by o_{\mathbb{P}}(i^2/m), and hence the contribution of these types of events is negligible. Therefore, it suffices to consider the number of vertices and edges that are removed in the final step of the explosion algorithm from the line components in CM_N(\mathbf{d}'). We observe that there are in total

\sum_{k=2}^{\infty}2L_k'(n) = \frac{(d+2p_2)^2}{2d(d-2p_2)}\frac{i^2}{m}(1+o_{\mathbb{P}}(1)) (41)

vertices of degree one in the line components, out of the i(1+2p_2/d)(1+o_{\mathbb{P}}(1)) degree-one vertices that exist in \mathbf{d}'. We define L_R(n) as the number of vertices removed from line components in the last step of the explosion algorithm. We remove i(1+o_{\mathbb{P}}(1)) vertices of degree one uniformly at random, so the number of degree-one vertices removed from lines is given by a hypergeometric variable with

\mathbb{E}[L_R(n)] = \frac{(d+2p_2)^2}{2d(d-2p_2)}\frac{d}{d+2p_2}\frac{i^2}{m}(1+o(1)) = \frac{d+2p_2}{2(d-2p_2)}\frac{i^2}{m}(1+o(1)). (42)

    A hypergeometric random variable with diverging mean concentrates around the mean, and hence by the law of large numbers,

\frac{m}{i^2}\Big(N_0(n)+\sum_{k=2}^{\infty}kL_k(n)\Big) \overset{\mathbb{P}}{\to} \frac{p_2}{2d}+\frac{(d-p_2)(d+2p_2)^2}{2d(d-2p_2)^2}-\frac{d+2p_2}{2(d-2p_2)} = \frac{2dp_2}{(d-2p_2)^2},

    as claimed.
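To verify the last equality in the display above, put the three terms over the common denominator 2d(d-2p_2)^2 and expand the numerator:

p_2(d-2p_2)^2+(d-p_2)(d+2p_2)^2-d(d+2p_2)(d-2p_2) = 4d^2p_2,

so the limit equals 4d^2p_2/(2d(d-2p_2)^2)=2dp_2/(d-2p_2)^2.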

We do the same computation for the number of edges, this time accounting for the fact that if both vertices of a line of length 2 are removed, only one edge is removed, while all the other vertex and edge removals are in bijection. The number of lines of length 2 that are fully removed is given by

L_2'(n)\left(1+\frac{2p_2}{d}\right)^{-2} = \frac{i^2}{m}\frac{1}{4}\Big(1+\frac{2p_2}{d}\Big)^2\Big(1+\frac{2p_2}{d}\Big)^{-2}(1+o_{\mathbb{P}}(1)) = \frac{i^2}{4m}(1+o_{\mathbb{P}}(1)). (43)

We conclude that

\frac{m}{i^2}\sum_{k=2}^{\infty}(k-1)L_k(n) \overset{\mathbb{P}}{\to} \frac{(d+2p_2)^2}{4(d-2p_2)^2}-\frac{d+2p_2}{2(d-2p_2)}+\frac{1}{4} = \frac{4p_2^2}{(d-2p_2)^2}. (44)
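The final identity in (44) follows since, over the common denominator 4(d-2p_2)^2, the numerator is the perfect square

(d+2p_2)^2-2(d+2p_2)(d-2p_2)+(d-2p_2)^2 = \big((d+2p_2)-(d-2p_2)\big)^2 = 16p_2^2.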

Finally, we conclude that the claims also hold for \overline{CM}_n(\mathbf{d},q) by Remark 3.2, i.e. by applying (14) to the events

\Big\{\Big|\frac{m}{i^2}\Big(N_0(n)+\sum_{k=2}^{\infty}kL_k(n)\Big)-\frac{2dp_2}{(d-2p_2)^2}\Big|\geq\varepsilon\Big\},
\Big\{\Big|\frac{m}{i^2}\Big(\sum_{k=2}^{\infty}(k-1)L_k(n)\Big)-\frac{4p_2^2}{(d-2p_2)^2}\Big|\geq\varepsilon\Big\},
\Big\{N_0(n)+\sum_{k=2}^{\infty}kL_k(n)\neq 0\Big\}.

Proposition 3.6 indicates that the typical number of vertices in isolated nodes and line components outside the giant component is of order \Theta(i^2/m). Naturally, the isolated nodes do not contribute to the number of edges outside the giant component, and hence we are mostly interested in the total number of edges in the line components, which is likely to be of order \Theta(i^2/m) due to (37).
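The constants in Proposition 3.6 lend themselves to a quick numerical check. The following Monte Carlo sketch is our illustration and not part of the proofs: it assumes the networkx package, a degree sequence supported on \{2,3\} with a fraction p_2 of degree-two vertices, and it does not condition the initial graph on being connected.

import random
import networkx as nx

def scaled_line_and_node_count(n=20000, p2=0.3):
    # Degree sequence: fraction p2 of degree-2 vertices, the rest degree 3.
    deg = [2] * int(p2 * n) + [3] * (n - int(p2 * n))
    if sum(deg) % 2 == 1:
        deg[-1] += 1
    G = nx.configuration_model(deg)
    m = G.number_of_edges()
    i = int(m ** 0.75)  # sqrt(m) << i << m
    # Percolation: remove every edge independently with probability q = i/m.
    G.remove_edges_from([e for e in G.edges(keys=True) if random.random() < i / m])
    comps = sorted(nx.connected_components(G), key=len)
    count = 0
    for c in comps[:-1]:  # all components except the giant
        degs = sorted(d for _, d in G.subgraph(c).degree())
        if degs == [0]:  # isolated node
            count += 1
        elif degs[:2] == [1, 1] and all(d == 2 for d in degs[2:]):
            count += len(c)  # vertices in an isolated line
    d = sum(deg) / n
    return count * m / i ** 2, 2 * d * p2 / (d - 2 * p2) ** 2

print(scaled_line_and_node_count())  # empirical value vs. the limit in (36)

For moderate n the two printed values should already be of comparable size, in line with (36).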

Finally, we would like to comment on the number of isolated cycles in CM_N(\mathbf{d}'). Let C_k'(n), k\geq 1, denote the number of isolated cycles with k edges. In view of Lemma 3.4, if q=o(1), then the degree distribution \mathbf{d}' satisfies all conditions in [13] (with extremely high probability) except one, namely n_1'\neq O(\sqrt{m}). However, the proof of Theorem 3.3 in [13] does not use this condition to prove a bound on the number of isolated cycles: it was only needed to bound the number of line components. Therefore, it follows from [13, (5.18)] that

\lim_{n\to\infty}\mathbb{E}\left[\sum_{k\geq 1}kC_k'(n)\right] < \infty. (45)

In other words, the expected number of edges outside the giant that are contained in cycle components is finite. Since CM_n(\mathbf{d},q) is created from CM_N(\mathbf{d}') by removing R_n vertices of degree one, we observe that all isolated cycles that exist in CM_N(\mathbf{d}') remain isolated cycles in CM_n(\mathbf{d},q). Moreover, additional isolated cycles can be formed from more complex components in CM_N(\mathbf{d}'). Using Proposition 3.6 and (45) together with Markov’s inequality, we observe that if q=i/m with \sqrt{m}\ll i\ll m^{1-\delta} for some \delta\in(0,1/2), then the number of edges in CM_n(\mathbf{d},q) outside the giant that are contained in cycle components is o_{\mathbb{P}}(i^2/m). In view of Remark 3.2, the same statement holds for \overline{CM}_n(\mathbf{d},q).

    3.4 Probabilistic bounds on component sizes outside the giant

In this section, we provide large deviation bounds on the number of edges outside the giant for the percolated connected configuration model with removal probability q=i/m, where \sqrt{m}\ll i\ll m^{1-\delta} for some \delta\in(0,1/2). Again, in view of Remarks 3.2 and 3.3, it suffices to consider CM_N(\mathbf{d}') only.

First, we provide a sharper bound on the probability that the number of edges in complex components (i.e. components containing a vertex of degree at least three) outside the giant is of an even higher order of magnitude than O(i^2/m).

Proposition 3.9.

Consider CM_N(\mathbf{d}') obtained by removal probability q=i/m with \sqrt{m}\ll i\ll m^{1-\delta} for some \delta\in(0,1/2). For every \alpha>0,

\mathbb{P}\left(\frac{m}{i^2}\#\{e\notin\mathcal{C}_{\rm max}:d_{\max}(\mathcal{C}(e))\geq 3\}\geq m^{\alpha}\right) = O(m^{-3}). (46)

    Proof.

We note that, by the proof of [25, Lemma 11], for \gamma>0 sufficiently large,

\mathbb{P}(\exists\,\mathcal{C}\neq\mathcal{C}_{\rm max}:|E(\mathcal{C})|\geq\gamma\log n) = o(m^{-3}). (47)

We are left to bound the contribution from components that contain at most \gamma\log n edges. We use the method of moments. We observe that

(\#\{\mathcal{C}_l\ \text{s.t.}\ |E(\mathcal{C}_l)|\leq\gamma\log n,\ d_{\max}(\mathcal{C}_l)\geq 3\})^j \leq (\#\{v\in[n]:d_v\geq 3,\,|E(\mathcal{C}(v))|\leq\gamma\log n\})^j
= \sum_{v_1,\dots,v_j\in[n]}\mathbbm{1}_{\{|E(\mathcal{C}(v_1))|\leq\gamma\log n,\,d_{v_1}\geq 3\}}\cdots\mathbbm{1}_{\{|E(\mathcal{C}(v_j))|\leq\gamma\log n,\,d_{v_j}\geq 3\}}.

We stress that the vertices v_1,\dots,v_j in the summation are not necessarily mutually distinct. For the purpose of exposition, write for a vertex v\in[n],

\mathcal{I}(v) = \{d_v\geq 3,\,|E(\mathcal{C}(v))|\leq\gamma\log n\}.

    Applying the tower property, we obtain

\mathbb{E}\left[(\#\{\mathcal{C}_l\ \text{s.t.}\ |E(\mathcal{C}_l)|\leq\gamma\log n,\ d_{\max}(\mathcal{C}_l)\geq 3\})^j\right] \leq \mathbb{E}\left[\sum_{v_1,\dots,v_j\in[n]}\mathbbm{1}_{\mathcal{I}(v_1)}\mathbbm{1}_{\mathcal{I}(v_2)}\cdots\mathbbm{1}_{\mathcal{I}(v_j)}\right]
= \mathbb{E}\left[\sum_{v_1,\dots,v_{j-1}\in[n]}\mathbbm{1}_{\mathcal{I}(v_1)}\cdots\mathbbm{1}_{\mathcal{I}(v_{j-1})}\,\mathbb{E}\left[\sum_{v_j\in[n]}\mathbbm{1}_{\mathcal{I}(v_j)}\,\Big|\,\mathcal{I}(v_1),\dots,\mathcal{I}(v_{j-1})\right]\right].

For a graph (or component) G, write V(G) for the vertex set of that graph. Then,

\mathbb{E}\left[\sum_{v_j\in[n]}\mathbbm{1}_{\mathcal{I}(v_j)}\,\Big|\,\mathcal{I}(v_1),\dots,\mathcal{I}(v_{j-1})\right] = \mathbb{E}\left[\sum_{v_j\in\bigcup_{h=1}^{j-1}V(\mathcal{C}(v_h))}\mathbbm{1}_{\mathcal{I}(v_j)}\,\Big|\,\mathcal{I}(v_1),\dots,\mathcal{I}(v_{j-1})\right]
+ \mathbb{E}\left[\sum_{v_j\notin\bigcup_{h=1}^{j-1}V(\mathcal{C}(v_h))}\mathbbm{1}_{\mathcal{I}(v_j)}\,\Big|\,\mathcal{I}(v_1),\dots,\mathcal{I}(v_{j-1})\right].

The first term is trivially bounded by

\mathbb{E}\left[\sum_{v_j\in\bigcup_{h=1}^{j-1}V(\mathcal{C}(v_h))}\mathbbm{1}_{\mathcal{I}(v_j)}\,\Big|\,\mathcal{I}(v_1),\dots,\mathcal{I}(v_{j-1})\right] \leq (j-1)\gamma\log n.

For the second term, we note that we count the number of vertices outside the giant component that have degree at least three, while disregarding the vertex set \bigcup_{h=1}^{j-1}V(\mathcal{C}(v_h)). We note that if we remove the components \bigcup_{h=1}^{j-1}\mathcal{C}(v_h) from CM_N(\mathbf{d}'), the remaining graph is again a configuration model, but with a modified degree sequence. In other words, we remove components that have a total of at most O(\log n) edges. This number is too small to change the degree sequence \mathbf{d}' much. That is, it follows from (the proof of) Proposition 3.5 that

\mathbb{E}\left[\sum_{v_j\notin\bigcup_{h=1}^{j-1}V(\mathcal{C}(v_h))}\mathbbm{1}_{\mathcal{I}(v_j)}\,\Big|\,\mathcal{I}(v_1),\dots,\mathcal{I}(v_{j-1})\right] = o\left(\frac{i^2}{m}\right).

    Iterating the argument, we obtain

\mathbb{E}\left[(\#\{\mathcal{C}_l\ \text{s.t.}\ |E(\mathcal{C}_l)|\leq\gamma\log n,\ d_{\max}(\mathcal{C}_l)\geq 3\})^j\right] \leq \left((j-1)\gamma\log n+o\left(\frac{i^2}{m}\right)\right)^j,

and hence

\mathbb{E}\big[(\#\{e:|E(\mathcal{C}(e))|\leq\gamma\log n,\,d_{\max}(\mathcal{C}(e))\geq 3\})^j\big] \leq \Big((j-1)(\gamma\log n)^2+o\Big(\frac{i^2\log n}{m}\Big)\Big)^j.

Finally, using Markov’s inequality, it holds for every j\in\mathbb{N} that

\mathbb{P}\Big(\frac{m\#\{e:|E(\mathcal{C}(e))|\leq\gamma\log n,\,d_{\max}(\mathcal{C}(e))\geq 3\}}{i^2}\geq m^{\alpha}\Big) \leq \Big((j-1)\gamma^2(\log n)^2+o\Big(\frac{i^2\log n}{m}\Big)\Big)^j\frac{m^{j(1-\alpha)}}{i^{2j}}
= O\left(\Big((\log n)^2+\frac{i^2\log n}{m}\Big)\frac{m^{1-\alpha}}{i^2}\right)^j = o\left((\log n)^2m^{-\alpha}\right)^j.

Choosing j\geq(3+\varepsilon)/\alpha for some \varepsilon>0, the claim follows.

Next, we provide a result showing that the probability that the number of edges in line components in CM_N(\mathbf{d}') is of a higher order of magnitude than its expectation decays fast.

Lemma 3.10.

Consider CM_N(\mathbf{d}') obtained by removal probability q=i/m, where \sqrt{m}\ll i\ll m^{1-\delta} for some \delta\in(0,1/2). For every \alpha>0,

\mathbb{P}\Big(\frac{m}{i^2}\sum_{k=2}^{\infty}(k-1)L_k'(n)\geq m^{\alpha}\Big) = O(m^{-3}). (48)

    Proof.

For every j\in\mathbb{N}, by Markov’s inequality,

\mathbb{P}\Big(\frac{m}{i^2}\sum_{k=2}^{\infty}(k-1)L_k'(n)\geq m^{\alpha}\Big) \leq \mathbb{E}\Big[\Big(\frac{m}{i^2}\sum_{k=2}^{\infty}kL_k'(n)\Big)^j\Big]m^{-\alpha j}.

By Corollary 3.8,

\mathbb{E}\Big[\Big(\frac{m}{i^2}\sum_{k=2}^{\infty}kL_k'(n)\Big)^j\Big]m^{-\alpha j} = O(m^{-\alpha j}).

Choosing an integer j\geq 3/\alpha concludes the result.

    3.5 First disconnections

In this section, we consider the question of how many edges need to be removed to cause the (initially connected) configuration model to become disconnected. That is, we would like to prove Theorem 2.1, and show that the most likely moment for the first disconnection to occur is after \Theta_{\mathbb{P}}(\sqrt{m}) edges have been removed.

To prove Theorem 2.1, we use the equivalence between sequential edge removal and percolation as established in Lemma 3.1. More specifically, it follows from Lemma 3.1 that it suffices to show that a percolation process with removal probability of order q=\Theta(m^{-1/2}) leads to a positive limiting probability for the configuration model to remain connected.

Proposition 3.11.

Consider the graphs CM_n(\mathbf{d},q) and \overline{CM}_n(\mathbf{d},q). If q=c/\sqrt{m} with c\in(0,\infty), then

\mathbb{P}(CM_n(\mathbf{d},q)\text{ is connected}) \to \left(\frac{d-2p_2}{d}\right)^{1/2}\exp\Big\{-\frac{2c^2p_2}{d-2p_2}\Big\}, (49)

and

\mathbb{P}(\overline{CM}_n(\mathbf{d},q)\text{ is connected}) \to \exp\Big\{-\frac{2c^2p_2}{d-2p_2}\Big\}. (50)

    Proof.

We build CM_n(\mathbf{d},q) using the explosion algorithm from Algorithm 1. To obtain the limiting probability that CM_n(\mathbf{d},q) is connected, we first require more detailed information on the degree sequence \mathbf{d}'. Observe that

q = \frac{c}{\sqrt{m}} = \frac{c}{\sqrt{d/2}}\frac{1}{\sqrt{n}}.

Recalling (21), (23), (24) and (19), we obtain

n_0' = \sum_{j=2}^{\infty}n_{j,0} = n_{2,0}+o_{\mathbb{P}}(1) \overset{d}{\to} \mathrm{Poi}\left(\frac{c^2p_2}{2d}\right), (51)
n_1' = R_n+\sum_{j=2}^{\infty}n_{j,1} = \sqrt{n}\left(\frac{c\sqrt{d}}{\sqrt{2}}+\frac{cp_2}{\sqrt{d/2}}\right)(1+o_{\mathbb{P}}(1)). (52)

In addition, we observe that n_l-R_n\leq n_l'\leq n_l+R_n for every l\geq 2, and hence with high probability,

p_l' = \lim_{n\to\infty}n_l'/n = p_l,\qquad l\geq 2.

Moreover, we observe that the average degree of \mathbf{d}' converges in probability to d.

Secondly, we note that removing vertices of degree one cannot disconnect a component. Therefore, there are only two ways for CM_n(\mathbf{d},q) to be connected after the explosion algorithm: either CM_N(\mathbf{d}') is already connected, or CM_N(\mathbf{d}') consists of one (giant) component and other components that are lines of length two (the only type of component made entirely of degree-one vertices), where all vertices of these line components are removed in the final step of the explosion algorithm. In either case, we require that n_0'=0, which occurs with probability

\mathbb{P}\left(n_0'=0\right) = \mathbb{P}\Big(\mathrm{Poi}\Big(\frac{c^2p_2}{2d}\Big)=0\Big)(1+o(1)) = \exp\left\{-\frac{c^2p_2}{2d}\right\}(1+o(1)).

Theorem 2.2 of [13] implies that if n_0'=0, then with high probability the graph CM_N(\mathbf{d}') consists of a giant component, possibly components that are lines, possibly components that are cycles, but no other type of components. Recall that L_k'(n) represents the number of components that are lines of length k\geq 2 in CM_N(\mathbf{d}'), and C_k'(n) the number of components that are cycles of length k\geq 1. We call a component with a single vertex of degree two with a self-loop a cycle component of length one, and a component with two vertices with multi-edges between them a cycle of length two. Theorem 2.2 of [13] implies that

L_k'(n) \overset{d}{\to} \mathrm{Poi}\left(c^2\left(\frac{d}{2}+p_2\right)^2\frac{(2p_2)^{k-2}}{d^k}\right),\qquad k\geq 2,
C_k'(n) \overset{d}{\to} \mathrm{Poi}\left(\frac{(2p_2)^k}{2kd^k}\right),\qquad k\geq 1,

and all limits are independent random variables. For CM_n(\mathbf{d},q) to be connected after the explosion algorithm, no cycles should appear in CM_N(\mathbf{d}'), which occurs with probability

\lim_{n\to\infty}\mathbb{P}\left(\text{no cycle components in }CM_N(\mathbf{d}')\right) = \prod_{k=1}^{\infty}\lim_{n\to\infty}\mathbb{P}\left(C_k'(n)=0\right)
= \prod_{k=1}^{\infty}\exp\left\{-\frac{(2p_2)^k}{2kd^k}\right\} = \left(\frac{d-2p_2}{d}\right)^{1/2}.
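The last equality uses the logarithmic series \sum_{k\geq 1}x^k/(2k)=-\frac{1}{2}\log(1-x) with x=2p_2/d, so that

\exp\Big\{-\sum_{k=1}^{\infty}\frac{(2p_2)^k}{2kd^k}\Big\} = \exp\Big\{\frac{1}{2}\log\Big(1-\frac{2p_2}{d}\Big)\Big\} = \Big(\frac{d-2p_2}{d}\Big)^{1/2}.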

Also, no lines of length at least three should appear, which occurs with probability

\lim_{n\to\infty}\mathbb{P}\left(L_k'(n)=0\ \ \forall k\geq 3\right) = \prod_{k=3}^{\infty}\lim_{n\to\infty}\mathbb{P}\left(L_k'(n)=0\right)
= \prod_{k=3}^{\infty}\exp\left\{-c^2\Big(\frac{d}{2}+p_2\Big)^2\frac{(2p_2)^{k-2}}{d^k}\right\} = \exp\left\{-\frac{c^2}{d^2}\left(\frac{d}{2}+p_2\right)^2\frac{2p_2}{d-2p_2}\right\}.

Finally, we require that all vertices in the line components are removed in the final step of the explosion algorithm. That is, a set of 2L_2'(n) vertices needs to be contained in the R_n randomly chosen vertices of degree one. Note that the number of vertices that are removed from line components in the last step of the explosion algorithm is hypergeometrically distributed: we remove R_n vertices uniformly at random from n_1' vertices, of which 2L_2'(n) are in line components. Therefore, conditionally on n_1', R_n and L_2'(n), the probability that all line-component vertices are removed in the final step of the explosion algorithm is given by

\binom{n_1'-2L_2'(n)}{R_n-2L_2'(n)}\Big/\binom{n_1'}{R_n} = \frac{R_n(R_n-1)\cdots(R_n-2L_2'(n)+1)}{n_1'(n_1'-1)\cdots(n_1'-2L_2'(n)+1)}.

In particular, if L_2'(n)=0, then with probability one all vertices in line components are removed in the last step of the algorithm. Using the tower property, (24), (52) and the observation that L_2'(n) converges in distribution to a Poisson random variable, we observe that

\mathbb{P}\left(\text{all line components in }CM_N(\mathbf{d}')\text{ are removed in }CM_n(\mathbf{d},q)\right)
= \mathbb{E}\left[\mathbb{E}\left[\mathbbm{1}_{\{\text{all line components in }CM_N(\mathbf{d}')\text{ are removed in }CM_n(\mathbf{d},q)\}}\mid n_1',R_n,L_2'(n)\right]\right]
= \sum_{k=0}^{\infty}\mathbb{P}(L_2'(n)=k)\,\mathbb{E}\left[\binom{n_1'-2k}{R_n-2k}\Big/\binom{n_1'}{R_n}\,\Big|\,n_1',R_n,L_2'(n)=k\right]
= \sum_{k=0}^{\infty}\frac{1}{k!}\left(\frac{c^2}{d^2}\left(\frac{d}{2}+p_2\right)^2\right)^k\exp\left\{-\frac{c^2}{d^2}\left(\frac{d}{2}+p_2\right)^2\right\}\left(\frac{d}{d+2p_2}\right)^{2k}(1+o(1))
= \exp\left\{-\frac{c^2}{d^2}\left(\frac{d}{2}+p_2\right)^2\left(1-\left(\frac{d}{d+2p_2}\right)^2\right)\right\}(1+o(1)) = \exp\left\{-\frac{c^2p_2(d+p_2)}{d^2}\right\}(1+o(1)).
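The penultimate step is the Poisson generating function \sum_{k\geq 0}\lambda^kx^{2k}e^{-\lambda}/k!=e^{-\lambda(1-x^2)} with \lambda=\frac{c^2}{d^2}(\frac{d}{2}+p_2)^2 and x=d/(d+2p_2); since (\frac{d}{2}+p_2)^2=(d+2p_2)^2/4, the exponent reduces to

\lambda(1-x^2) = \frac{c^2(d+2p_2)^2}{4d^2}\cdot\frac{(d+2p_2)^2-d^2}{(d+2p_2)^2} = \frac{c^2p_2(d+p_2)}{d^2}.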

    We conclude that

\mathbb{P}(CM_n(\mathbf{d},q)\text{ is connected}) = (1+o(1))\sqrt{\frac{d-2p_2}{d}}\exp\left\{-\frac{c^2p_2}{2d}-\frac{c^2}{d^2}\left(\frac{d}{2}+p_2\right)^2\frac{2p_2}{d-2p_2}-\frac{c^2p_2(d+p_2)}{d^2}\right\}
= \left(\frac{d-2p_2}{d}\right)^{1/2}\exp\left\{-\frac{2c^2p_2}{d-2p_2}\right\}(1+o(1)).
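To verify the final simplification, factor out c^2p_2 and clear denominators over 2d^2(d-2p_2); the three exponents then combine as

d(d-2p_2)+(d+2p_2)^2+2(d+p_2)(d-2p_2) = 4d^2,

so the total exponent equals c^2p_2\cdot 4d^2/(2d^2(d-2p_2))=2c^2p_2/(d-2p_2).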

    For the second statement of the proposition, it is known from [24] that

\mathbb{P}(CM_n(\mathbf{d})\text{ is connected}) \to \left(\frac{d-2p_2}{d}\right)^{1/2},

and since \{CM_n(\mathbf{d},q)\text{ is connected}\}\subseteq\{CM_n(\mathbf{d})\text{ is connected}\}, we have \mathbb{P}(\overline{CM}_n(\mathbf{d},q)\text{ is connected})=\mathbb{P}(CM_n(\mathbf{d},q)\text{ is connected})/\mathbb{P}(CM_n(\mathbf{d})\text{ is connected}), and the statement follows by taking the ratio of the two limits.
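The limit (49) is also easy to probe by simulation. The sketch below is our illustration, assuming degrees in \{2,3\} with a fraction p_2 of degree-two vertices; it samples CM_n(\mathbf{d}) without conditioning on connectivity, which is exactly the setting of (49).

import math
import random
import networkx as nx

def connectivity_vs_limit(n=4000, p2=0.3, c=1.0, trials=300):
    deg = [2] * int(p2 * n) + [3] * (n - int(p2 * n))
    if sum(deg) % 2 == 1:
        deg[-1] += 1
    m = sum(deg) // 2
    q = c / math.sqrt(m)  # removal probability of order m^{-1/2}
    hits = 0
    for _ in range(trials):
        G = nx.configuration_model(deg)
        G.remove_edges_from([e for e in G.edges(keys=True) if random.random() < q])
        hits += nx.is_connected(G)
    d = sum(deg) / n
    limit = math.sqrt((d - 2 * p2) / d) * math.exp(-2 * c * c * p2 / (d - 2 * p2))
    return hits / trials, limit

print(connectivity_vs_limit())  # empirical frequency vs. right-hand side of (49)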

    Using the partial result of Proposition 3.11, we can prove Theorem 2.1.

    See 2.1

    Proof.

First, connectivity is a monotone property, and thus, once the graph is disconnected it stays disconnected for the rest of the process. From Lemma 3.1 it follows that the sequential removal of c\sqrt{m} edges from \overline{CM}_n(\mathbf{d}) is well-approximated by a percolation process with removal probability q=cm^{-1/2}(1+o_{\mathbb{P}}(1)). Consequently, due to Proposition 3.11,

\mathbb{P}(T_{n,\boldsymbol{d}}\geq x\sqrt{m}) = \mathbb{P}\left(\overline{CM}_n(\mathbf{d},x/\sqrt{m})\text{ is connected}\right)+o(1) \to \exp\Big\{-\frac{2x^2p_2}{d-2p_2}\Big\}. (53)

In other words, m^{-1/2}T_{n,\boldsymbol{d}} converges in distribution to a random variable T whose complementary cumulative distribution function is given by the expression on the right-hand side of (53).

Proposition 3.11 implies that the percolated graph \overline{CM}_n(\mathbf{d},q) with removal probability q=o(m^{-1/2}) is disconnected with probability o(1). More detailed bounds can be provided for this range, as is illustrated by the next result.

    Proposition 3.12.

Consider \overline{CM}_n(\mathbf{d},q) with q=O(m^{-\alpha}) for some 1/2<\alpha<1. Then,

\mathbb{P}(\overline{CM}_n(\mathbf{d},q)\text{ is disconnected}) = O(m^{1-2\alpha}). (54)

Consequently, if we remove i:=i_m edges uniformly at random from \overline{CM}_n(\mathbf{d}) with i=o(m^{\beta}) for some \beta\in(0,1/2), then the probability of the resulting graph being disconnected is of order O(m^{2\beta-1}).

    Proof.

If \overline{CM}_n(\mathbf{d},q) is disconnected, then there is at least one component outside the giant that is either an isolated node, a line, a cycle, or a component containing a vertex of degree at least three. To show the result, it therefore suffices to prove that each of these events occurs with probability O(m^{1-2\alpha}).

First, it follows from Proposition 3.6 that in \overline{CM}_n(\mathbf{d},q),

\mathbb{P}\Big(N_0(n)+\sum_{k=2}^{\infty}kL_k(n)\neq 0\Big) = O(m^{1-2\alpha}).

In other words, the event that there is an isolated node or a line component (outside the giant) occurs with probability O(m^{1-2\alpha}). Moreover, we can apply Proposition 3.5 to obtain that in \overline{CM}_n(\mathbf{d},q),

\mathbb{P}(\exists\,\mathcal{C}\neq\mathcal{C}_{\rm max}:d_{\max}(\mathcal{C})\geq 3) = o(m^{1-2\alpha}).

Finally, we show that with probability 1-O(m^{1-2\alpha}), percolation on \overline{CM}_n(\mathbf{d},q) does not create cycles. We observe that

\mathbb{P}\left(\sum_{k=1}^{\infty}kC_k(n)\neq 0\,\Big|\,CM_n(\mathbf{d})\text{ is connected}\right) = \frac{\mathbb{P}\left(\sum_{k=1}^{\infty}k\tilde{C}_k(n)\neq 0,\,CM_n(\mathbf{d})\text{ is connected}\right)}{\mathbb{P}\left(CM_n(\mathbf{d})\text{ is connected}\right)}
\leq \frac{\mathbb{P}\left(\sum_{k=1}^{\infty}k\tilde{C}_k(n)\neq 0\right)}{\mathbb{P}\left(CM_n(\mathbf{d})\text{ is connected}\right)},

where \tilde{C}_k(n) denotes the number of newly formed isolated cycles in CM_n(\mathbf{d},q) that do not exist in the initial graph CM_n(\mathbf{d}). Since CM_n(\mathbf{d}) is connected with non-vanishing probability, it is sufficient to show that

\mathbb{P}\left(\sum_{k=1}^{\infty}k\tilde{C}_k(n)\neq 0\right) = O(m^{1-2\alpha}).

Again, we use the explosion method. We stress that running this algorithm on a sampled CM_n(\mathbf{d}) results in a graph that is not the same as the graph that would have been obtained if percolation had been applied to the same sampled CM_n(\mathbf{d}). Instead, sampling a CM_n(\mathbf{d}) and running the explosion algorithm results in a graph that is statistically indistinguishable from one that is obtained by sampling another CM_n(\mathbf{d}) and applying percolation to it. Therefore, we need to consider the question of how to bound the number of newly formed cycles in CM_n(\mathbf{d},q). The crucial observation to answer this question is that such cycles contain at least one node whose degree in \mathbf{d} was at least three.

More specifically, the number of newly formed isolated cycles in CM_n(\mathbf{d},q) is at most the number of cycles in CM_n(\mathbf{d},q) that contain at least one node whose degree in \mathbf{d} is at least three. Write Z_n for the number of vertices whose degree in \mathbf{d}' is two, but whose degree in \mathbf{d} is at least three, i.e.

Z_n = \sum_{l=3}^{\infty}n_{l,2}.

For every isolated cycle in CM_n(\mathbf{d},q) that contains at least one node of original degree at least three in \mathbf{d}, either that cycle already exists in CM_N(\mathbf{d}'), or it was formed from a component in CM_N(\mathbf{d}') by the removal of degree-one vertices. In the second case, this implies that there exists a component in CM_N(\mathbf{d}') outside the giant with maximum degree at least three, which occurs with probability o(m^{1-2\alpha}) by Proposition 3.5. What remains is to analyze the first case.

Write \tilde{C}_k'(n) for the number of cycles that exist in CM_N(\mathbf{d}') and contain at least one node whose original degree in \mathbf{d} is at least three. For any k\geq 1, a set of (distinct) degree-two vertices \{v_1,\dots,v_k\} in \mathbf{d}' forms a cycle in CM_N(\mathbf{d}') with probability

\frac{2(k-1)}{2m-1}\cdot\frac{2(k-2)}{2m-3}\cdots\frac{2}{2m-2k+3}\cdot\frac{1}{2m-2k+1}

if k\geq 2, and 1/(2m-1) if k=1. The number of sets of k vertices in \mathbf{d}' that all have degree two, of which at least one vertex has degree at least three in \mathbf{d}, is bounded by Z_n\binom{n_2'}{k-1}. Therefore,

\mathbb{E}[\tilde{C}_k'(n)] \leq \mathbb{E}\left[Z_n\binom{n_2'}{k-1}\frac{(2(k-1))!!\,(2m-2k-1)!!}{(2m-1)!!}\right].

We observe that n_2'\leq n_2+R_n and Z_n\leq R_n, where R_n is a binomially distributed random variable with parameters 2m and 1-\sqrt{1-q}=i/(2m)(1+o(1)). Therefore, for every \epsilon>0, the probability of the event R_n\geq(1+\epsilon)i decays exponentially fast. On the other hand, since n_2/n\to p_2, it holds for every \epsilon>0 and all n sufficiently large that

\mathbb{E}[\tilde{C}_k'(n)\mid R_n\leq(1+\epsilon)i] \leq i(1+\epsilon)\binom{n_2+(1+\epsilon)i}{k-1}\frac{(2(k-1))!!\,(2m-2k-1)!!}{(2m-1)!!} \leq (1+\epsilon)\frac{i}{2m}\left(\frac{2p_2+\epsilon}{d-\epsilon}\right)^{k-1},

and hence this sequence converges to zero exponentially fast in k. Applying dominated convergence, we obtain

\mathbb{E}\left[\sum_{k=1}^{\infty}k\tilde{C}_k'(n)\,\Big|\,R_n\leq(1+\epsilon)i\right] = O\left(\frac{i}{m}\right) = o(m^{1-2\alpha}).

    By Markov’s inequality, we conclude that

\mathbb{P}\left(\sum_{k=1}^{\infty}k\tilde{C}_k(n)\neq 0\right) \leq \mathbb{P}\left(\sum_{k=1}^{\infty}k\tilde{C}_k'(n)\neq 0\,\Big|\,R_n\leq(1+\epsilon)i\right)+o(m^{1-2\alpha}) = o(m^{1-2\alpha}).

    This proves the first statement of the proposition.

To prove the second statement of the proposition, we need to relate the percolation process to the process of removing edges uniformly at random from \overline{CM}_n(\mathbf{d}). To each edge, attach an independent uniformly distributed random variable; an alternative description of the percolation process with removal probability q is then to remove all edges whose random variable takes a value below q. Let U_{(1)}^m\leq\dots\leq U_{(m)}^m denote the order statistics of m uniform random variables. Then, the probability that i edge removals disconnect \overline{CM}_n(\mathbf{d}) is given by \mathbb{P}(\overline{CM}_n(\mathbf{d},U_{(i)}^m)\text{ is disconnected}). We note that

\mathbb{P}\left(\overline{CM}_n\left(\mathbf{d},U_{(i)}^m\right)\text{ is disconnected}\right) \leq \mathbb{P}\left(\overline{CM}_n\left(\mathbf{d},m^{\beta-1}\right)\text{ is disconnected}\right)+\mathbb{P}\left(U_{(i)}^m>m^{\beta-1}\right).

    The above proof shows that

\mathbb{P}\left(\overline{CM}_n\left(\mathbf{d},m^{\beta-1}\right)\text{ is disconnected}\right) = O(m^{2\beta-1}).

The second term is negligible, since by the Chernoff bound,

\mathbb{P}\left(U_{(i)}^m>m^{\beta-1}\right) \leq (1+o(1))\exp\left\{-\frac{m^{\beta}}{2}\right\} = O(m^{2\beta-1}).

    3.6 Number of edges outside the giant component

First, we prove Theorem 2.2 by putting together the results proved in Section 3.3.

    See 2.2

    Proof of Theorem 2.2.

By Lemma 3.1, the number of edges outside the largest component in \overline{CM}_n(\mathbf{d}) after i failures can be derived by considering percolation on this graph with removal probability q=i/m. The edges outside the giant can be divided into edges in line components, edges in cyclic components and edges in more complex components (i.e. components that contain a vertex of degree at least three). From Proposition 3.6 it follows that

\frac{m}{i^2}\Big(\sum_{k=2}^{\infty}(k-1)L_k(n)\Big) \overset{\mathbb{P}}{\to} \frac{4p_2^2}{(d-2p_2)^2}.

Next, we bound the number of edges outside the giant that are contained in components that are cycles or in more complex components. In view of Remarks 3.2 and 3.3, we point out that this is bounded by the same quantity in CM_N(\mathbf{d}'). Therefore, to conclude the proof, it suffices to show that the total number of edges contained in cycle components or more complex components outside the giant is of order o_{\mathbb{P}}(i^2/m) for CM_N(\mathbf{d}').

By Proposition 3.5, the number of edges in CM_N(\mathbf{d}') outside the giant that are contained in a more complex component is of order o_{\mathbb{P}}(i^2/m). To count the number of edges in cycle components, recall that C_k'(n) denotes the number of cyclic components with k edges in CM_N(\mathbf{d}'), and that it satisfies (45), i.e. \lim_{n\to\infty}\mathbb{E}[\sum_{k\geq 1}kC_k'(n)]<\infty. Using Markov’s inequality, this implies that

\frac{m}{i^2}\sum_{k\geq 1}kC_k'(n) = o_{\mathbb{P}}(1).

Consequently, the number of edges in CM_N(\mathbf{d}') outside the giant that are not contained in line components is indeed of order o_{\mathbb{P}}(i^2/m).

Theorem 2.2 prescribes the likely number of edges outside the giant component as an effect of sequentially removing edges uniformly at random. As more edges are removed, the initially connected configuration model is likely to disintegrate into a unique giant component and several small components, the majority of which are either lines or isolated nodes, as long as the number of edge failures is sublinear.

Next, we show that the number of edges outside the giant component in CM_n(\mathbf{d},q) is unlikely to be of a higher order of magnitude than its typical value during the cascade. We stress that this result concerns the percolation process, which can in turn be used to derive a large deviations bound for the sequential removal process.

    Theorem 3.13.

Consider CM_n(\mathbf{d},q) and \overline{CM}_n(\mathbf{d},q) with q=i/m, where \sqrt{m}\ll i\ll m^{1-\delta} for some \delta>0. Then, for every \alpha>0,

\mathbb{P}\Big(|E(CM_n(\mathbf{d},q)\setminus\mathcal{C}_{\rm max})|\frac{m}{i^2}\geq m^{\alpha}\Big) = O(m^{-3}), (55)

and

\mathbb{P}\Big(|E(\overline{CM}_n(\mathbf{d},q)\setminus\mathcal{C}_{\rm max})|\frac{m}{i^2}\geq m^{\alpha}\Big) = O(m^{-3}). (56)

    Proof.

    First, we show (55) by using the explosion algorithm. Again, in view of Remarks 3.2 and 3.3, it suffices to show

\mathbb{P}\Big(|E(CM_N(\mathbf{d}')\setminus\mathcal{C}_{\rm max})|\frac{m}{i^2}\geq m^{\alpha}\Big) = O(m^{-3}). (57)

We partition the contributions to the total number of edges outside the giant of CM_N(\mathbf{d}') as follows: edges can be contained in a line, a cycle or a more complex component. Due to Lemma 3.10, the number of edges in line components in CM_N(\mathbf{d}') satisfies

\mathbb{P}\Big(\sum_{k=2}^{\infty}(k-1)L_k'(n)\frac{m}{i^2}\geq m^{\alpha}\Big) = O(m^{-3}). (58)

To bound the edges in cycles, we point out that it follows from [13, (3.7)] that all moments of the number of cycles in CM_N(\mathbf{d}') converge to a finite constant. That is, convergence of the first j factorial moments is shown, which implies convergence of the first j moments. Again, this result is shown in [13] under the condition that the number of vertices of degree one is of order O(\sqrt{m}), but this assumption is not used for the results on the isolated cycles. Consequently, for every j, as long as p_2\in(0,1),

\lim_{n\to\infty}\mathbb{E}\Big[\Big(\sum_{k=1}^{\infty}kC_k'(n)\Big)^j\Big] \leq \mathbb{E}\Big[\Big(\sum_{k=1}^{\infty}k\,\mathrm{Poi}\Big(\frac{(2p_2)^k}{2kd^k}\Big)\Big)^j\Big] < \infty,

where the Poisson variables in the second term are all independent. Therefore, for every \alpha>0, by applying Markov’s inequality, we obtain

\mathbb{P}\Big(\frac{m}{i^2}\sum_{k=1}^{\infty}kC_k'(n)\geq m^{\alpha}\Big) \leq \mathbb{P}\Big(\sum_{k=1}^{\infty}kC_k'(n)\geq m^{\alpha}\Big) \leq \mathbb{E}\Big[\Big(\sum_{k=1}^{\infty}kC_k'(n)\Big)^{3/\alpha}\Big]m^{-3} = O(m^{-3}).

To bound the number of edges in other components we use Proposition 3.9 to obtain

\mathbb{P}\Big(\frac{m\#\{e\notin\mathcal{C}_{\rm max}:d_{\max}(\mathcal{C}(e))\geq 3\}}{i^2}\geq m^{\alpha}\Big) = O(m^{-3}).

    Thus we obtain (57) by summing the three different contributions.

Theorem 3.13 reveals that it is very unlikely for the number of edges outside the giant to exceed its typical order \Theta(i^2/m) under the percolation process. We can use this result to show that the same holds under the sequential removal process. That is, we can prove Theorem 2.3, which we recall next.

    See 2.3

    Proof of Theorem 2.3.

First, we prove (8) for i such that m^{1/2+\epsilon}\ll i\ll m^{\alpha} for some \epsilon\in(0,\alpha-1/2) sufficiently small. We use this partial result to show (9). The proof of (9) in turn implies that (8) also holds for all i\ll m^{1/2+\epsilon} with \epsilon\in(0,\alpha-1/2) sufficiently small, which concludes the proof of Theorem 2.3.

Let U_{(1)}^m\leq\dots\leq U_{(m)}^m denote the order statistics of m uniform random variables, and note that

\mathbb{P}\left(|\hat{E}_m(i)|\leq m-i-i^{\alpha}\right) = \mathbb{P}\left(\big|E(\overline{CM}_n(\mathbf{d},U_{(i)}^m)\backslash\mathcal{C}_{\rm max})\big|>i^{\alpha}\right).

Let \epsilon>0 be sufficiently small, and note that by the Chernoff bound,

\mathbb{P}\left(U_{(i)}^m\not\in\left[\frac{i}{m}m^{-\epsilon},\frac{i}{m}m^{\epsilon}\right]\right) = O(m^{-3}).

Therefore,

\mathbb{P}\left(\big|E(\overline{CM}_n(\mathbf{d},U_{(i)}^m)\backslash\mathcal{C}_{\rm max})\big|>i^{\alpha}\right) \leq \max_{q\in\left[im^{-(1+\epsilon)},im^{-(1-\epsilon)}\right]}\mathbb{P}\left(\big|E(\overline{CM}_n(\mathbf{d},q)\backslash\mathcal{C}_{\rm max})\big|>i^{\alpha}\right)+O(m^{-3}).

We observe that for every q\in\left[im^{-(1+\epsilon)},im^{-(1-\epsilon)}\right] with i=o(m^{\alpha}),

\frac{q^2m}{i^{\alpha}} \leq \frac{i^{2-\alpha}}{m^{1-2\epsilon}} = o(m^{-(1-\alpha)^2+2\epsilon}) = o(1)

for all \epsilon>0 sufficiently small; in other words, i^{\alpha}\geq q^2m\cdot m^{(1-\alpha)^2-2\epsilon} asymptotically. By Theorem 3.13, we conclude that

\mathbb{P}\left(|\hat{E}_m(i)|\leq m-i-i^{\alpha}\right) \leq \max_{q\in\left[im^{-(1+\epsilon)},im^{-(1-\epsilon)}\right]}\mathbb{P}\left(\big|E(\overline{CM}_n(\mathbf{d},q)\backslash\mathcal{C}_{\rm max})\big|\geq q^2m\cdot m^{(1-\alpha)^2-2\epsilon}\right)+O(m^{-3}) = O(m^{-3}).

    To prove (9), note that by Proposition 3.12,

\mathbb{P}\left(m-i-|\hat{E}_m(i)|>i^{\alpha}\text{ for some }1\leq i\leq m^{1/8}\right)
\leq \mathbb{P}\left(m-i-|\hat{E}_m(i)|\neq 0\text{ for some }1\leq i\leq m^{1/8}\right) = o(m^{-1/2}).

Moreover, if k\gg m^{1/2+\epsilon} for some \epsilon\in(0,1/2), using that k\leq m, it follows directly from (8) that

\mathbb{P}\left(m-i-|\hat{E}_m(i)|>i^{\alpha}\text{ for some }m^{1/2+\epsilon}\leq i\leq k\right) = O(km^{-3}) = o(m^{-1/2}).

Therefore, to conclude (9), it suffices to show that for some \epsilon\in(0,1/2) sufficiently small,

\mathbb{P}\left(m-i-|\hat{E}_m(i)|>i^{1/8}\text{ for some }m^{1/8}\leq i\leq m^{1/2+\epsilon}\right) = o(m^{-1/2}).

For convenience, write |\tilde{E}_m(i)|=m-i-|\hat{E}_m(i)|, and consider values of i such that m^{1/8}\leq i\leq m^{1/2+\epsilon} for some \epsilon\in(0,1/2). Note that

\mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8}\right) = \mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8},\,|\tilde{E}_m(m^{1/2+\epsilon})|<|\tilde{E}_m(i)|/2\right)
+ \mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8},\,|\tilde{E}_m(m^{1/2+\epsilon})|\geq|\tilde{E}_m(i)|/2\right).

We observe that |\tilde{E}_m(i)| is the number of edges outside the giant when removing i edges uniformly at random. Write \xi(i,j) for the number of edges that are removed out of this set of |\tilde{E}_m(i)| edges if we remove another j-i edges uniformly at random. Then, |\tilde{E}_m(m^{1/2+\epsilon})|\geq|\tilde{E}_m(i)|-\xi(i,m^{1/2+\epsilon}), and hence

\mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8},\,|\tilde{E}_m(m^{1/2+\epsilon})|<|\tilde{E}_m(i)|/2\right) \leq \mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8},\,\xi(i,m^{1/2+\epsilon})>|\tilde{E}_m(i)|/2\right).

Since \xi(i,j)\leq j for every j\geq i, it follows from the first claim of the theorem that with probability 1-O(m^{-3}),

|\tilde{E}_m(i)| \leq |\tilde{E}_m(m^{1/2+\epsilon})|+\xi(i,m^{1/2+\epsilon}) = o(m).

In other words, with probability 1-O(m^{-3}), the probability that an edge out of the set of |\tilde{E}_m(i)| is chosen to be removed is bounded by |\tilde{E}_m(i)|/(m-m^{1/2+\epsilon})=o(1). In that case, the probability that more than half of the |\tilde{E}_m(i)|>i^{1/8} edges are removed is exponentially small. In particular, this implies that

\mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8},\,|\tilde{E}_m(m^{1/2+\epsilon})|<|\tilde{E}_m(i)|/2\right) \leq \mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8},\,\xi(i,m^{1/2+\epsilon})>|\tilde{E}_m(i)|/2\right) = O(m^{-3}).

For the other term, we observe that for every m^{1/8}\leq i\leq m^{1/2+\epsilon},

\mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8},\,|\tilde{E}_m(m^{1/2+\epsilon})|\geq\frac{|\tilde{E}_m(i)|}{2}\right) \leq \mathbb{P}\left(|\tilde{E}_m(m^{1/2+\epsilon})|>\frac{i^{1/8}}{2}\right) \leq \mathbb{P}\left(|\tilde{E}_m(m^{1/2+\epsilon})|>\frac{m^{1/64}}{2}\right).

As in the proof of the first claim of the theorem, we observe that

\mathbb{P}\left(U_{(m^{1/2+\epsilon})}^m\not\in\left[m^{-1/2-2\epsilon},m^{-1/2+2\epsilon}\right]\right) = O(m^{-3}),

    and hence

\mathbb{P}\left(|\tilde{E}_m(m^{1/2+\epsilon})|>\frac{m^{1/64}}{2}\right) \leq \mathbb{P}\left(U_{(m^{1/2+\epsilon})}^m\not\in\left[m^{-1/2-2\epsilon},m^{-1/2+2\epsilon}\right]\right)
+ \max_{q\in[m^{-1/2-2\epsilon},m^{-1/2+2\epsilon}]}\mathbb{P}\left(\big|E(\overline{CM}_n(\mathbf{d},q)\backslash\mathcal{C}_{\rm max})\big|>\frac{m^{1/64}}{2}\right).

It follows from Theorem 3.13 that the second term is also of order O(m^{-3}) for every \epsilon<1/256. We conclude that for every m^{1/8}\leq i\leq m^{1/2+\epsilon} with \epsilon<1/256,

\mathbb{P}\left(|\tilde{E}_m(i)|>i^{1/8}\right) = O(m^{-3}),

from which (9) follows by the union bound. Moreover, it implies that (8) holds for all \sqrt{m}\ll i\ll m^{1/2+\epsilon} for some \epsilon\in(0,\alpha-1/2) as well.

    3.7 Linear number of edge removals

For completeness, we provide a brief overview of known results about what CM_n(\mathbf{d},q) looks like when the removal probability is a fixed value q\in(0,q_c), as studied in [18]. It is shown that for any fixed q\in(0,q_c), CM_n(\mathbf{d},q) has a unique giant component and many small components. However, in this phase the giant no longer contains n-o(n) vertices, as it does in the case q\to 0.

From [18], it is known that there exists a function \xi_{\mathbf{d}}(q), defined for q<q_c, such that in CM_n(\mathbf{d},q),

\frac{|E(\mathcal{C}_{\rm max})|}{(1-q)m} \overset{\mathbb{P}}{\to} \xi_{\mathbf{d}}(q), (59)

i.e., the proportion of edges in the giant component concentrates for every q<q_c. The exact formula for \xi_{\mathbf{d}}(q) comes from [18, Theorem 3.9]:

\xi_{\mathbf{d}}(q) = \frac{1-\rho}{\sqrt{1-q}}+\frac{(1-\rho)^2}{2}, (60)

where \rho is defined as the solution of the equation

(1-q)^{1/2}G_D'\big(1-(1-q)^{1/2}-\rho(1-q)^{1/2}\big)+(1-(1-q)^{1/2})d = \rho d, (61)

where G_D is the probability generating function of D. The same concentration result, with a different limit function, also holds for the number of vertices in the largest component.

It is worth mentioning that in this case, lines and isolated vertices no longer constitute the vast majority of the small components; a positive density of more complex small components, mainly trees, appears as well.

If q>q_c, instead, there is no giant component that contains a non-vanishing proportion of the edges or vertices, and usually the high-degree vertices determine the size of the largest components, i.e. |\mathcal{C}_{\rm max}|=\Theta_{\mathbb{P}}(d_{\max}) [17].

    4 Cascading failure process

The results in the previous section explain the way the connected configuration model is likely to disintegrate as edge failures occur sequentially and uniformly at random. This yields the load surge values, and hence what remains to be done for proving Theorem 1.3 is to compare the load surge values to the corresponding surplus capacities at the edges.

To prove our main results as stated in Theorem 1.3, we follow the proof strategy as laid out in Section 2.3. Recall that the connected configuration model disintegrates into a giant component and a sublinear number of edges outside the giant as long as no more than o(m) edges are removed. We point out that, intuitively, this implies that the only dominant contribution to the total failure size comes from the number of edges that are contained in the giant component upon failure. We make this statement rigorous in this section.

This section is structured as follows. In Section 4.1, we consider the setting where no disconnections take place yet. Since it follows from Theorem 2.1 that the connected configuration model is unlikely to become disconnected before \Theta(\sqrt{m}) edges have failed, we can show that Theorem 1.3 holds if k=o(\sqrt{m}). For larger thresholds, we consider the failure size tail of the giant component. To derive this tail behavior, we translate this problem to a first-passage time problem over a random moving boundary in Section 4.2. In Section 4.3, we derive the behavior of this first-passage time. We conclude Theorem 1.3 in Section 4.4 by using the strategy laid out in Section 2.3.

    4.1 No edge disconnections

Before we move to the tail of the failure size, we first consider the scenario where no disconnections have occurred yet during the cascade, or alternatively, the setting where the failure mechanism is applied to a graph with a star topology. As long as edge failures do not cause (edge) disconnections in the graph, the load surge function remains the same at every surviving edge. Recall that in this case it holds that |E_j^m(i)|=m-i for all surviving edges e_j\in[m], and hence recursion (3) is solved by

l_j^m(i) = \frac{\theta}{m}+(i-1)\cdot\frac{1-\theta/m}{m}. (62)

In other words, given that no disconnections have occurred after k edge failures, the load surge function behaves deterministically until step k (at every surviving edge). In this case, the problem reduces to a first-passage time problem, i.e. the event \{A_{n,\mathbf{d}}\geq k\} corresponds to the event that the smallest k uniform order statistics lie below the linear load surge function. The following result follows by applying Theorem 1 of [33].

    Proposition 4.1.

Define A_{n+1}^{\star} as the number of edge failures in a star network with n+1 nodes and m=n edges, and load surge function given by (62). For every k:=k_m satisfying k\to\infty and m-k\to\infty as m\to\infty,

\mathbb{P}\left(A_{n+1}^{\star}\geq k\right) \sim \frac{2\theta}{\sqrt{2\pi}}\sqrt{\frac{m-k}{m}}\,k^{-1/2}.

In particular, if 1\ll k\ll m, it holds that

\mathbb{P}\left(A_{n+1}^{\star}\geq k\right) \sim \frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.

A crucial argument used in the proof of Proposition 4.1 is that, as n=m\to\infty,

\mathbb{P}\left(A_{n+1}^{\star}\geq k\right) \sim \mathbb{P}\left(U_{(i)}^m\leq\frac{\theta+i-1}{m},\ i=1,\dots,k\right).

    The asymptotic behavior of the latter expression is obtained by observing that the edge failure distribution is quasi-binomial for this particular load surge function, and exploiting the analytic expression for the probability distribution function to derive the tail behavior. Alternatively, this result can be derived by relating this problem to an equivalent setting where one is interested in the first-passage time of a random walk bridge, as is done in [35]. We use such a relation in the more involved case where disconnections occur as well. Before moving to the general case, we briefly recall the translation to the equivalent random walk problem in the much simpler setting where no (edge) disconnections occur yet.


    Translation to random walk setting:
    Note that

\left(U_{(1)}^m,U_{(2)}^m,\dots,U_{(m)}^m\right) \overset{d}{=} \left(\frac{\mathrm{Exp}_1(1)}{m},\frac{\sum_{j=1}^2\mathrm{Exp}_j(1)}{m},\dots,\frac{\sum_{j=1}^m\mathrm{Exp}_j(1)}{m}\,\Bigg|\,\sum_{j=1}^{m+1}\mathrm{Exp}_j(1)=m\right), (63)

    where the exponentially distributed random variables are independent. Define the random walk

S_i = \sum_{j=1}^i\left(1-\mathrm{Exp}_j(1)\right),\qquad i\geq 1, (64)

where S_0=0. Then,

\mathbb{P}\left(A_{n+1}^{\star}\geq k\right) \sim \mathbb{P}\left(U_{(i)}^m\leq\frac{\theta+i-1}{m},\ i=1,\dots,k\right) = \mathbb{P}\left(S_i\geq 1-\theta,\ i=1,\dots,k\,\big|\,S_{m+1}=1\right).

Define for every x\in\mathbb{R},

\tau_x^* := \min\{i\geq 1:S_i\leq x\}. (65)

    Then, the objective can be written as

    (An+1k)(τ1θ>k|Sm+1=1)(τ1θ>k|Sm=0).\displaystyle\mathbb{P}\left(A^{\star}_{n+1}\geq k\right)\sim\mathbb{P}\left(\tau^{*}_{1-\theta}>k\big{|}S_{m+1}=1\right)\sim\mathbb{P}\left(\tau^{*}_{1-\theta}>k\big{|}S_{m}=0\right).

    Using the main result in [35], we observe that for k=o(m)k=o(m),

    (An+1k)(τ1θ>k|Sm=0)\displaystyle\mathbb{P}\left(A^{\star}_{n+1}\geq k\right)\sim\mathbb{P}\left(\tau^{*}_{1-\theta}>k\big{|}S_{m}=0\right) 2π𝔼(Sτ1θ)k1/2=2θ2πk1/2,\displaystyle\sim\sqrt{\frac{2}{\pi}}\mathbb{E}\left(-S_{\tau_{1-\theta}}\right)k^{-1/2}=\frac{2\theta}{\sqrt{2\pi}}k^{-1/2},

    where the latter equality follows from the memoryless property of exponentials: the undershoot of the walk below the level 1θ1-\theta is again unit-rate exponentially distributed, so that 𝔼(Sτ1θ)=1(1θ)=θ\mathbb{E}(-S_{\tau_{1-\theta}})=1-(1-\theta)=\theta.
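    This constant is easy to verify by direct simulation of the unconditioned walk (a sketch with illustrative parameters; the bridge conditioning is dropped here, which, as made precise in Section 4.3.2, does not alter the tail):

        import numpy as np

        rng = np.random.default_rng(2)
        theta, k, trials = 0.5, 400, 100000

        hits = 0
        for _ in range(trials):
            S = np.cumsum(1.0 - rng.exponential(1.0, size=k))   # the random walk (64)
            hits += S.min() > 1.0 - theta                       # event {tau*_{1-theta} > k}

        print(hits / trials, 2 * theta / np.sqrt(2 * np.pi) * k ** -0.5)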

    A similar random walk construction can be done for the case where disconnections do take place. Before we consider this process, we first establish the result of Theorem 1.3 in the phase where disconnections are unlikely to have occurred. That is, by Theorem 2.1, removing k=o(m)k=o(\sqrt{m}) edges uniformly at random is unlikely to cause any disconnection in the connected configuration model. Due to the coupling between the cascading failure process and the sequential edge-removal process, Proposition 4.1 then prescribes exactly the asymptotic behavior of the edge failure size.

    Theorem 4.2.

    The probability that the cascading failure process continues until the network disconnects satisfies

    (An,𝐝Tn,𝐝)2θ2π(2p2d2p2)1/4Γ(34)m1/4,\displaystyle\mathbb{P}(A_{n,\mathbf{d}}\geq T_{n,\mathbf{d}})\sim\frac{2\theta}{\sqrt{2\pi}}\left(\frac{2p_{2}}{d-2p_{2}}\right)^{1/4}\Gamma\left(\frac{3}{4}\right)m^{-1/4}, (66)

    where Γ()\Gamma(\cdot) denotes the gamma function. Consequently, for any threshold 1km1\ll k\ll\sqrt{m},

    (An,𝐝k)2θ2πk1/2.\displaystyle\mathbb{P}\left(A_{n,\mathbf{d}}\geq k\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}. (67)

    Proof.

    Note that

    (An,𝐝Tn,𝐝)\displaystyle\mathbb{P}(A_{n,\mathbf{d}}\geq T_{n,\mathbf{d}}) =0(mTn,𝐝dt)(Am+1tm)𝑑t2θ2πm1/4𝔼[(mTn,𝐝)1/2]\displaystyle=\int_{0}^{\infty}\mathbb{P}\left(\sqrt{m}T_{n,\mathbf{d}}\in dt\right)\mathbb{P}(A^{\star}_{m+1}\geq t\sqrt{m})\,dt\sim\frac{2\theta}{\sqrt{2\pi}}m^{-1/4}\mathbb{E}\left[(\sqrt{m}T_{n,\mathbf{d}})^{-1/2}\right]
    2θ2πm1/404p2d2p2t1/2e2t2p2d2p2𝑑t=2θ2π(2p2d2p2)1/4Γ(34)m1/4,\displaystyle\sim\frac{2\theta}{\sqrt{2\pi}}m^{-1/4}\int_{0}^{\infty}\frac{4p_{2}}{d-2p_{2}}t^{1/2}e^{-\frac{2t^{2}p_{2}}{d-2p_{2}}}\,dt=\frac{2\theta}{\sqrt{2\pi}}\left(\frac{2p_{2}}{d-2p_{2}}\right)^{1/4}\Gamma\left(\frac{3}{4}\right)m^{-1/4},

    where the second step follows from the uniform convergence result for Am+1A^{\star}_{m+1} (see Theorem 1 in [34]), and the third step follows from Theorem 2.1. For the second claim of the theorem, note that

    (An,𝐝k)\displaystyle\mathbb{P}\left(A_{n,\mathbf{d}}\geq k\right) =(An,𝐝k,An,𝐝<Tn,𝐝)+(An,𝐝k,An,𝐝Tn,𝐝),\displaystyle=\mathbb{P}\left(A_{n,\mathbf{d}}\geq k,A_{n,\mathbf{d}}<T_{n,\mathbf{d}}\right)+\mathbb{P}\left(A_{n,\mathbf{d}}\geq k,A_{n,\mathbf{d}}\geq T_{n,\mathbf{d}}\right),

    where due to Proposition 4.1,

    (An,𝐝k,An,𝐝<Tn,𝐝)\displaystyle\mathbb{P}\left(A_{n,\mathbf{d}}\geq k,A_{n,\mathbf{d}}<T_{n,\mathbf{d}}\right) =(An,𝐝k|An,𝐝<Tn,𝐝)(An,𝐝<Tn,𝐝)\displaystyle=\mathbb{P}\left(A_{n,\mathbf{d}}\geq k\big{|}A_{n,\mathbf{d}}<T_{n,\mathbf{d}}\right)\mathbb{P}\left(A_{n,\mathbf{d}}<T_{n,\mathbf{d}}\right)
    =(Am+1k)(An,𝐝<Tn,𝐝)=1o(1)2θ2πk1/2,\displaystyle=\mathbb{P}\left(A^{\star}_{m+1}\geq k\right)\underbrace{\mathbb{P}\left(A_{n,\mathbf{d}}<T_{n,\mathbf{d}}\right)}_{=1-o(1)}\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2},

    and

    (An,𝐝k,An,𝐝Tn,𝐝)(An,𝐝Tn,𝐝)=O(m1/4)=o(k1/2).\displaystyle\mathbb{P}\left(A_{n,\mathbf{d}}\geq k,A_{n,\mathbf{d}}\geq T_{n,\mathbf{d}}\right)\leq\mathbb{P}\left(A_{n,\mathbf{d}}\geq T_{n,\mathbf{d}}\right)=O(m^{-1/4})=o(k^{-1/2}).
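    The closed form in (66) arises from a Gamma integral: substituting u = b t² with b = 2p₂/(d−2p₂) gives ∫₀^∞ 2b t^{1/2} e^{−bt²} dt = b^{1/4}Γ(3/4). A short numerical check (the values of d, p₂, θ and m are hypothetical; scipy is used for the quadrature):

        from math import exp, gamma, inf, pi, sqrt

        from scipy.integrate import quad

        d, p2, theta, m = 3.0, 0.2, 0.5, 1e6      # hypothetical values of d, p_2, theta, m
        b = 2 * p2 / (d - 2 * p2)

        integral, _ = quad(lambda t: 2 * b * sqrt(t) * exp(-b * t * t), 0, inf)
        closed_form = b ** 0.25 * gamma(0.75)     # (2p_2/(d-2p_2))^{1/4} Gamma(3/4)
        print(integral, closed_form)              # agree up to quadrature error
        print(2 * theta / sqrt(2 * pi) * closed_form * m ** -0.25)   # right-hand side of (66)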

    4.2 Random walk formulation

    This section is devoted to introducing a related random walk, and to showing that the number of edge failures in the giant asymptotically behaves the same as the first-passage time of a random walk bridge.

    Recall that |E^m(i)||\hat{E}_{m}(i)| denotes the number of edges in the giant when ii edges have been removed uniformly at random, and let eie_{i} denote the edge corresponding to the ii’th order statistic of the surplus capacities. Define the sequence of processes {Li,m:1im+1,m1}\{L_{i,m}:1\leq i\leq m+1,m\geq 1\}, where L1,m=1L_{1,m}=1, and for 2im+12\leq i\leq m+1,

    Li,m={m+1j=1i1Lj,m|E^m(i2)|if ei1𝒞max,0if ei1𝒞max.\displaystyle L_{i,m}=\left\{\begin{array}[]{ll}\frac{m+1-\sum_{j=1}^{i-1}L_{j,m}}{|\hat{E}_{m}(i-2)|}&\textrm{if }e_{i-1}\in\mathcal{C}_{\max},\\ 0&\textrm{if }e_{i-1}\not\in\mathcal{C}_{\max}.\end{array}\right. (70)

    Note that this corresponds to the load surge increments in the giant if θ=1\theta=1, rescaled by a factor m+1m+1. We consider a sequence of random walks (Si,m)m1,1im+1(S_{i,m})_{m\geq 1,1\leq i\leq m+1} defined as

    Si,m=j=1iXj,m\displaystyle S_{i,m}=\sum_{j=1}^{i}X_{j,m} (71)

    with increments

    Xi,m=Li,mExpi,m(1),\displaystyle X_{i,m}=L_{i,m}-\textrm{Exp}_{i,m}(1), (72)

    where Expi,m(1)\textrm{Exp}_{i,m}(1) are independent unit-rate exponential random variables. We note that em𝒞maxe_{m}\in\mathcal{C}_{\rm max} and |E^m(m1)|=1|\hat{E}_{m}(m-1)|=1 by definition: removing m1m-1 edges leaves only isolated nodes and one component consisting of two nodes connected by a single edge, which is then trivially the component containing the largest number of edges. Therefore,

    j=1m+1Lj,m=j=1mLj,m+m+1j=1mLj,m1=m+1,\displaystyle\sum_{j=1}^{m+1}L_{j,m}=\sum_{j=1}^{m}L_{j,m}+\frac{m+1-\sum_{j=1}^{m}L_{j,m}}{1}=m+1,

    and hence the random walk satisfies the property

    Sm+1,m=j=1m+1Xj,m=m+1j=1m+1Expj,m(1).\displaystyle S_{m+1,m}=\sum_{j=1}^{m+1}X_{j,m}=m+1-\sum_{j=1}^{m+1}\textrm{Exp}_{j,m}(1). (73)

    Finally, for all m1m\geq 1 define the stopping times

    τm=min{1im:Si,m<1θ},\displaystyle\tau_{m}=\min\{1\leq i\leq m:S_{i,m}<1-\theta\}, (74)

    and τm=m+1\tau_{m}=m+1 whenever Si,m1θS_{i,m}\geq 1-\theta for all i=1,,mi=1,...,m.

    In case of the star topology, i.e. when no edge disconnections occur, it holds that |E^m(i)|=mi|\hat{E}_{m}(i)|=m-i, so that Li,m=1L_{i,m}=1 for all 1im+11\leq i\leq m+1. In that case we observed that the failure size tail could be written as the first-passage tail of a random walk bridge. The above random walk formulation is a generalization that accounts for edges that may no longer be contained in the giant as edge failures occur.
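    For intuition, the increments (70) can be generated numerically. The following sketch uses networkx (a small erased configuration model is used, and the conditioning on connectivity is approximated by keeping the largest component; all sizes are illustrative), and illustrates the telescoping identity established above:

        import networkx as nx
        import numpy as np

        rng = np.random.default_rng(3)

        # a small erased configuration model; keep the largest component as a stand-in
        # for conditioning on connectivity
        n = 400
        G = nx.Graph(nx.configuration_model([2] * (n // 2) + [3] * (n // 2), seed=7))
        G.remove_edges_from(nx.selfloop_edges(G))
        G = G.subgraph(max(nx.connected_components(G), key=len)).copy()

        m = G.number_of_edges()
        edges = list(G.edges())
        rng.shuffle(edges)                     # uniform removal order e_1, e_2, ..., e_m

        H = G.copy()
        giant = max(nx.connected_components(H), key=len)
        giant_edges = m                        # |E^_m(0)| = m
        L = [1.0]                              # L_{1,m} = 1
        for i in range(2, m + 2):              # L_{i,m} for i = 2, ..., m+1, following (70)
            e = edges[i - 2]                   # e_{i-1}, the (i-1)'th failed edge
            in_giant = e[0] in giant and e[1] in giant
            prev = giant_edges                 # |E^_m(i-2)|
            H.remove_edge(*e)
            giant = max(nx.connected_components(H), key=len)
            giant_edges = H.subgraph(giant).number_of_edges()
            L.append((m + 1 - sum(L)) / prev if in_giant else 0.0)

        print(sum(L), m + 1)                   # the telescoping identity holds exactly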

    Proposition 4.3.

    Suppose mδkm1δm^{\delta}\ll k\ll m^{1-\delta} for some δ(0,1/2)\delta\in(0,1/2). If

    (τmk|Sm+1,m=0)2θ2πk1/2,\displaystyle\mathbb{P}\left(\tau_{m}\geq k\bigg{|}\,S_{m+1,m}=0\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}, (75)

    then

    (A^n,𝐝κ(k))(τmk|Sm+1,m=0).\displaystyle\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(k)\right)\sim\mathbb{P}\left(\tau_{m}\geq k\bigg{|}\,S_{m+1,m}=0\right).

    Proof of Proposition 4.3.

    Write l^()\hat{l}(\cdot) for the load surge function corresponding to the edges in the giant. Note that by construction,

    (A^n,𝐝κ(k))\displaystyle\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(k)\right) =(U(i)m𝟙{ei𝒞max}l^(κ(i)),i=1,,k).\displaystyle=\mathbb{P}\left(U_{(i)}^{m}\mathbbm{1}_{\{e_{i}\in\mathcal{C}_{\max}\}}\leq\hat{l}(\kappa(i)),\;\;\;i=1,...,k\right).

    In other words, whenever an edge is contained in the giant, one checks whether this edge has sufficient capacity to deal with the load surge function. Instead of looking only at those instances, we would like to compare all order statistics to an appropriately chosen function. For this purpose, we introduce the function l()l^{*}(\cdot) defined as

    l(i)=l^(κ(i1)+1),i=1,,m.\displaystyle l^{*}(i)=\hat{l}(\kappa(i-1)+1),\hskip 28.45274pti=1,...,m.

    We note this function satisfies two important properties:

    • (p1)

      l(i)=l^(κ(i))l^{*}(i)=\hat{l}(\kappa(i)) if ei𝒞maxe_{i}\in\mathcal{C}_{\max};

    • (p2)

      l(i)=l(i1)l^{*}(i)=l^{*}(i-1) if ei1𝒞maxe_{i-1}\not\in\mathcal{C}_{\max}.

    Moreover, since l^()\hat{l}(\cdot) is non-decreasing, l(i)l^{*}(i) is non-decreasing in i2i\geq 2 as well. We define the two stopping times

    T^=min{1im:U(i)m𝟙{ei𝒞max}>l^(κ(i))}\displaystyle\hat{T}=\min\{1\leq i\leq m:U_{(i)}^{m}\mathbbm{1}_{\{e_{i}\in\mathcal{C}_{\max}\}}>\hat{l}(\kappa(i))\}

    and

    T=min{1im:U(i)m>l(i)}.\displaystyle T^{*}=\min\{1\leq i\leq m:U_{(i)}^{m}>l^{*}(i)\}.

    We observe that the first property (p1) implies that TT^T^{*}\leq\hat{T}, and T=T^T^{*}=\hat{T} if eT𝒞maxe_{T^{*}}\in\mathcal{C}_{\max}. Together with the second property (p2) and the observation that U(T^)mU(T)mU_{(\hat{T})}^{m}\geq U_{(T^{*})}^{m}, this implies

    T^=min{jT:ej𝒞max}.\displaystyle\hat{T}=\min\{j\geq T^{*}:e_{j}\in\mathcal{C}_{\max}\}. (76)

    Therefore,

    (A^n,𝐝κ(k))=(T^>k)=(T>k)+(Tk;T^>k).\displaystyle\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(k)\right)=\mathbb{P}\left(\hat{T}>k\right)=\mathbb{P}\left({T}^{*}>k\right)+\mathbb{P}\left({T}^{*}\leq k;\hat{T}>k\right). (77)

    To conclude the proof, we relate the random walk to the stopping time TT^{\star}, and show that the second contribution in (77) is negligible. For the first claim, we consider the perturbed increments {Li,m(θ);1im+1,m1}\{L_{i,m}(\theta);1\leq i\leq m+1,m\geq 1\} with L1,m(θ)=θL_{1,m}(\theta)=\theta and

    Li,m(θ)={m+1j=1i1Lj,m(θ)|E^m(i2)|if ei1𝒞max,0if ei1𝒞max.\displaystyle L_{i,m}(\theta)=\left\{\begin{array}[]{ll}\frac{m+1-\sum_{j=1}^{i-1}L_{j,m}(\theta)}{|\hat{E}_{m}(i-2)|}&\textrm{if }e_{i-1}\in\mathcal{C}_{\max},\\ 0&\textrm{if }e_{i-1}\not\in\mathcal{C}_{\max}.\end{array}\right. (80)

    Note that this corresponds to the load surge increments rescaled by a factor m+1m+1. In particular, we observe L,=L,(1)L_{\cdot,\cdot}=L_{\cdot,\cdot}(1), and

    (m+1)l(i)=θ(m+1m1)+j=1iLj,m(θ)=θm+j=1iLj,m(θ).\displaystyle(m+1)l^{*}(i)=\theta\left(\frac{m+1}{m}-1\right)+\sum_{j=1}^{i}L_{j,m}(\theta)=\frac{\theta}{m}+\sum_{j=1}^{i}L_{j,m}(\theta).

    Note that θ/m=O(1/m)\theta/m=O(1/m), and

    maxi=1,,k{|j=1i(Lj,m(θ)Lj,m)(θ1)|}maxi=1,,k|1θ||E^m(i)|,\displaystyle\max_{i=1,...,k}\left\{\left|\sum_{j=1}^{i}\left(L_{j,m}(\theta)-L_{j,m}\right)-(\theta-1)\right|\right\}\leq\max_{i=1,...,k}\frac{|1-\theta|}{|\hat{E}_{m}(i)|},

    which is of order O(1/m)O(1/m) with probability 1o(m2)1-o(m^{-2}) by Theorem 2.3. Since

    (T>k)\displaystyle\mathbb{P}\left({T}^{*}>k\right) =((m+1)U(i)m(m+1)l(i),i=1,,k)\displaystyle=\mathbb{P}\left((m+1)U_{(i)}^{m}\leq(m+1)l^{*}(i),\;\;\;i=1,...,k\right)
    =(j=1iExpj(1)θm+j=1iLj,m(θ),i=1,,k|j=1m+1Expj(1)=m+1)\displaystyle=\mathbb{P}\left(\sum_{j=1}^{i}\textrm{Exp}_{j}(1)\leq\frac{\theta}{m}+\sum_{j=1}^{i}L_{j,m}(\theta),\;\;\;i=1,...,k\,\big{|}\,\sum_{j=1}^{m+1}\textrm{Exp}_{j}(1)=m+1\right)
    \displaystyle=\mathbb{P}\left(\sum_{j=1}^{i}X_{j,m}\geq-\frac{\theta}{m}+\sum_{j=1}^{i}(L_{j,m}-L_{j,m}(\theta)),\;\;\;i=1,...,k\,\big{|}\,\sum_{j=1}^{m+1}\textrm{Exp}_{j}(1)=m+1\right),

    it follows that

    \displaystyle\mathbb{P} (T>k)=(j=1iXj,m1θ+o(1),i=1,,k|j=1m+1Expj(1)=m+1)+o(m2).\displaystyle\left({T}^{*}>k\right)=\mathbb{P}\left(\sum_{j=1}^{i}X_{j,m}\geq 1-\theta+o(1),\;\;\;i=1,...,k\,\big{|}\,\sum_{j=1}^{m+1}\textrm{Exp}_{j}(1)=m+1\right)+o(m^{-2}).

    Due to our hypothesis (75), it follows that as mm\rightarrow\infty,

    (T>k)\displaystyle\mathbb{P}\left({T}^{*}>k\right) (j=1iXj,m1θ,i=1,,k|j=1m+1Expj(1)=m+1)=(τmk|Sm+1,m=0).\displaystyle\sim\mathbb{P}\left(\sum_{j=1}^{i}X_{j,m}\geq 1-\theta,\;\;\;i=1,...,k\,\big{|}\,\sum_{j=1}^{m+1}\textrm{Exp}_{j}(1)=m+1\right)=\mathbb{P}\left(\tau_{m}\geq k\bigg{|}\,S_{m+1,m}=0\right).

    To conclude the result, it remains to be shown that the second term in (77) is of order o(k1/2)o(k^{-1/2}). Since we assumed that mδkm1δm^{\delta}\ll k\ll m^{1-\delta} for some δ(0,1/2)\delta\in(0,1/2), there exists an α(0,1)\alpha\in(0,1) such that k2/mmαkk^{2}/m\ll m^{\alpha}\ll k. For any such α(0,1)\alpha\in(0,1), it holds that

    (Tk;T^>k)=(T[kmα,k];T^>k)+(T<kmα;T^>k).\displaystyle\mathbb{P}\left({T}^{*}\leq k;\hat{T}>k\right)=\mathbb{P}\left({T}^{*}\in[k-m^{\alpha},k];\hat{T}>k\right)+\mathbb{P}\left({T}^{*}<k-m^{\alpha};\hat{T}>k\right).

    We note that by our hypothesis and our previous result,

    (T[kmα,k];T^>k)\displaystyle\mathbb{P}\left({T}^{*}\in[k-m^{\alpha},k];\hat{T}>k\right) (T>kmα)(T>k;T^>k)\displaystyle\leq\mathbb{P}\left({T}^{*}>k-m^{\alpha}\right)-\mathbb{P}\left({T}^{*}>k;\hat{T}>k\right)
    2θ2π(kmα)1/22θ2πk1/2=o(k1/2).\displaystyle\sim\frac{2\theta}{\sqrt{2\pi}}(k-m^{\alpha})^{-1/2}-\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}=o(k^{-1/2}).

    Finally, we observe that by (76),

    (T<kmα;T^>k)\displaystyle\mathbb{P}\left({T}^{*}<k-m^{\alpha};\hat{T}>k\right) j=1kmα(T=j;T^T>mα)\displaystyle\leq\sum_{j=1}^{k-m^{\alpha}}\mathbb{P}\left({T}^{*}=j;\hat{T}-T^{*}>m^{\alpha}\right)
    j=1kmαr=0m(mj+1rmk)mα(|E^m(j1)|=r)\displaystyle\leq\sum_{j=1}^{k-m^{\alpha}}\sum_{r=0}^{m}\left(\frac{m-j+1-r}{m-k}\right)^{m^{\alpha}}\mathbb{P}\left(|\hat{E}_{m}(j-1)|=r\right)
    o(m1/2)+j=1kmα(jαmk)mα=o(m1/2)+O(k(kαmk)mα)=o(m1/2),\displaystyle\leq o(m^{-1/2})+\sum_{j=1}^{k-m^{\alpha}}\left(\frac{j^{\alpha}}{m-k}\right)^{m^{\alpha}}=o(m^{-1/2})+O\left(k\left(\frac{k^{\alpha}}{m-k}\right)^{m^{\alpha}}\right)=o(m^{-1/2}),

    where the third assertion follows from Theorem 3.13.

    Proposition 4.3 thus implies that, in order to derive the asymptotic probability of the event {A^n,𝐝κ(k)}\{\hat{A}_{n,\mathbf{d}}\geq\kappa(k)\} for km1δk\ll m^{1-\delta} with δ(0,1)\delta\in(0,1), it suffices to show that the first-passage time of the defined random walk exhibits the asymptotic behavior (75).

    4.3 Behavior of the number of edge failures in the giant

    We start the analysis by showing that if k=o(mα)k=o(m^{\alpha}) for some α(0,1)\alpha\in(0,1), then Proposition 2.4 holds for the number of failures in the giant. We recap this proposition next.

    Proposition 2.4 (restated).

    For k=o(m)k=o(\sqrt{m}), this result is already proven in Theorem 4.2. For the remainder of the proofs in this section, we therefore assume k=Ω(m)k=\Omega(\sqrt{m}).

    To prove Proposition 2.4, we will extensively use the random walk

    Si=j=1i(1Expj(1)),i1,\displaystyle S_{i}=\sum_{j=1}^{i}\left(1-\textrm{Exp}_{j}(1)\right),\hskip 14.22636pti\geq 1,

    where S0=0S_{0}=0. This is related to τm\tau_{m} as defined in (74) through the relation

    τm=min{1im:Si<1θ+j=1i(1Lj,m)},\displaystyle\tau_{m}=\min\{1\leq i\leq m:S_{i}<1-\theta+\sum_{j=1}^{i}\left(1-L_{j,m}\right)\}, (81)

    and τm=m+1\tau_{m}=m+1 if Si1θ+j=1i(1Lj,m)S_{i}\geq 1-\theta+\sum_{j=1}^{i}\left(1-L_{j,m}\right) for all 1im1\leq i\leq m. Moreover, for a sequence g={gi}ig=\{g_{i}\}_{i\in\mathbb{N}}, let TgT_{g} correspond to the first-passage time of the random walk SiS_{i} over this sequence, i.e.,

    Tg=min{i:Si<gi}.\displaystyle T_{g}=\min\{i\in\mathbb{N}:S_{i}<g_{i}\}.

    We use the following strategy to prove Proposition 2.4. First, we show that for a particular class of (deterministic) boundary sequences, it holds that

    (Tg>k)2θ2πk1/2\displaystyle\mathbb{P}(T_{g}>k)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}

    as kk\rightarrow\infty. Next, we show that the boundary as given in (81) falls in this class of boundary sequences with sufficiently high probability, and hence

    (τm>k)2θ2πk1/2\displaystyle\mathbb{P}(\tau_{m}>k)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}

    as kk\rightarrow\infty. Finally, we show that conditioning on the event that the random walk returns to zero at time m+1m+1 does not affect the tail behavior, i.e. for all k=o(mα)k=o(m^{\alpha}) for some α(0,1)\alpha\in(0,1), it holds that as mm\rightarrow\infty,

    (A^n,𝐝κ(k))(τm>k|Sm+1=0)(τm>k)2θ2πk1/2.\displaystyle\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(k)\right)\sim\mathbb{P}\left(\tau_{m}>k\big{|}S_{m+1}=0\right)\sim\mathbb{P}(\tau_{m}>k)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.

    4.3.1 First-passage time for moving boundaries initially constant for sufficient time

    Before moving to the proof of Proposition 2.4, we consider the first-passage time behavior of (Si)i(S_{i})_{i\in\mathbb{N}} for a particular class of moving boundaries. The next lemma shows that the first-passage time over a boundary that is monotone non-decreasing, grows slower than i\sqrt{i}, and is initially constant for a sufficiently large time, behaves the same as the first-passage time over the constant boundary.

    Lemma 4.4.

    Suppose l:=lkl:=l_{k} is such that kαlkk^{\alpha}\ll l\ll k as kk\rightarrow\infty for some α(0,1)\alpha\in(0,1). Define the boundary sequence

    gi,l+={1θif il,iγif i>l,\displaystyle g_{i,l}^{+}=\left\{\begin{array}[]{ll}1-\theta&\textrm{if }i\leq l,\\ i^{\gamma}&\textrm{if }i>l,\end{array}\right.

    with γ(0,1/2)\gamma\in(0,1/2). Then, as kk\rightarrow\infty,

    (Tg+>k)(T1θ>k).\displaystyle\mathbb{P}\left(T_{g^{+}}>k\right)\sim\mathbb{P}\left(T_{1-\theta}>k\right).

    Remark 4.5.

    We point out that the class of boundary sequences described in Lemma 4.4 is not covered by the literature. That is, almost all related literature considers moving boundaries that can be described by sequences of the form (gi)i(g_{i})_{i\in\mathbb{N}}, i.e. boundaries that do not depend on kk. The exception is [12], but that paper restricts to constant boundaries only.

    Still, the literature offers some partial results. First, since {Tg+>k}{T1θ>k}\{T_{g^{+}}>k\}\subseteq\{T_{1-\theta}>k\},

    (Tg+>k)(T1θ>k).\displaystyle\mathbb{P}\left(T_{g^{+}}>k\right)\leq\mathbb{P}\left(T_{1-\theta}>k\right).

    For a lower bound, we would like to remark that gi,l+g_{i,l}^{+} is a non-decreasing sequence in i1i\geq 1, where gi,l+iγg_{i,l}^{+}\leq i^{\gamma} for all ii\in\mathbb{N}. Therefore, due to Proposition 1 in [38] (or Theorem 2 in [15]),

    (Tg+>k)(Tiγ>k)cγ(T0>k)cγ(T1θ>k),\displaystyle\mathbb{P}\left(T_{g^{+}}>k\right)\geq\mathbb{P}\left(T_{i^{\gamma}}>k\right)\sim c_{\gamma}\mathbb{P}\left(T_{0}>k\right)\sim c_{\gamma}^{\prime}\mathbb{P}\left(T_{1-\theta}>k\right),

    where cγ,cγ[0,)c_{\gamma},c_{\gamma}^{\prime}\in[0,\infty). Moreover, since

    i=1iγi3/2<,\displaystyle\sum_{i=1}^{\infty}\frac{i^{\gamma}}{i^{3/2}}<\infty,

    it holds that cγ,cγ>0c_{\gamma},c_{\gamma}^{\prime}>0 [15]. In order to prove Lemma 4.4, we therefore need to show that cγ=1c_{\gamma}^{\prime}=1.

    Proof.

    First, recall that {Tg+>k}{T1θ>k}\{T_{g^{+}}>k\}\subseteq\{T_{1-\theta}>k\}, and hence

    (Tg+>k)(T1θ>k).\displaystyle\mathbb{P}\left(T_{g^{+}}>k\right)\leq\mathbb{P}\left(T_{1-\theta}>k\right).

    Therefore, it suffices to show that the reversed inequality holds asymptotically.

    We bound the moving boundary g+g^{+} by an appropriate piecewise constant boundary with finitely many jumps. On each of the constant intervals, we use the results in [12] to show that the surviving trajectories of the random walk are asymptotically indistinguishable under the boundary g+g^{+} and under the piecewise constant one. Finally, we glue the intervals together to conclude the result.

    First, note that since γ(0,1/2)\gamma\in(0,1/2), we can assume without loss of generality that α>0\alpha>0 is such that kα(1+η)lkα((2γ)1η)k^{\alpha(1+\eta)}\ll l\ll k^{\alpha((2\gamma)^{-1}-\eta)} with η=((2γ)11)/4>0\eta=((2\gamma)^{-1}-1)/4>0. To define the piecewise constant boundary, let

    r:=min{j:α(2γ)j>1},\displaystyle r:=\min\{j\in\mathbb{N}\colon\alpha(2\gamma)^{-j}>1\},

    and note that 1r<1\leq r<\infty since 2γ(0,1)2\gamma\in(0,1) and α(0,1)\alpha\in(0,1). Choose a fixed ϵ>0\epsilon>0 sufficiently small such that

    • ϵ<αη\epsilon<\alpha\eta, which implies that l=o(kα(2γ)1ϵ)l=o\left(k^{\alpha(2\gamma)^{-1}-\epsilon}\right);

    • α/(2γ)ϵ<α/(2γ)22ϵ<<α/(2γ)rrϵ\alpha/(2\gamma)-\epsilon<\alpha/(2\gamma)^{2}-2\epsilon<...<\alpha/(2\gamma)^{r}-r\epsilon;

    • ϵ<(1α(2γ)r)/r\epsilon<(1-\alpha(2\gamma)^{-r})/r.

    Define tj,kϵ,j0t_{j,k}^{\epsilon},j\geq 0 with t0,kϵ=lt_{0,k}^{\epsilon}=l and

    tj,kϵ=kα(2γ)jjϵ,1jr.\displaystyle t_{j,k}^{\epsilon}=k^{\alpha(2\gamma)^{-j}-j\epsilon},\hskip 14.22636pt1\leq j\leq r.

    We point out that rr corresponds to the number of jumps of the piecewise constant boundary, and the values tj,kϵt_{j,k}^{\epsilon}, 0jr10\leq j\leq r-1, to the times at which these jumps occur. Since ϵ>0\epsilon>0 is chosen sufficiently small, l=t0,kϵt1,kϵtr1,kϵktr,kϵl=t_{0,k}^{\epsilon}\ll t_{1,k}^{\epsilon}\ll...\ll t_{r-1,k}^{\epsilon}\ll k\ll t_{r,k}^{\epsilon} as kk\rightarrow\infty. Write

    h(j)={1θif j=0,kα(2γ)(j1)/2jϵ/2if 1jr,\displaystyle h^{(j)}=\left\{\begin{array}[]{ll}1-\theta&\textrm{if }j=0,\\ k^{\alpha(2\gamma)^{-(j-1)}/2-j\epsilon/2}&\textrm{if }1\leq j\leq r,\end{array}\right.

    and define the boundary sequence as

    hi,kϵ={h(0)=1θif it0,kϵ=l,h(j)if tj1,kϵ<itj,kϵ, 1jr1,h(r)if i>tr1,kϵ.\displaystyle h_{i,k}^{\epsilon}=\left\{\begin{array}[]{ll}h^{(0)}=1-\theta&\textrm{if }i\leq t_{0,k}^{\epsilon}=l,\\ h^{(j)}&\textrm{if }t_{j-1,k}^{\epsilon}<i\leq t_{j,k}^{\epsilon},\;1\leq j\leq r-1,\\ h^{(r)}&\textrm{if }i>t_{r-1,k}^{\epsilon}.\end{array}\right.

    We point out that by construction,

    h(j)/tj1,kϵ=kϵ/2h(j)=o(tj1,kϵ),1jr,\displaystyle h^{(j)}/\sqrt{t_{j-1,k}^{\epsilon}}=k^{-\epsilon/2}\Longrightarrow h^{(j)}=o\left(\sqrt{t_{j-1,k}^{\epsilon}}\right),\hskip 14.22636pt1\leq j\leq r,

    and hence hi,kϵ=o(i)h_{i,k}^{\epsilon}=o(\sqrt{i}) for all iki\leq k as kk\rightarrow\infty. Moreover,

    h(j)/(tj,kϵ)γ=kjϵ(γ1/2)h(j)(tj,kϵ)γ1jr.\displaystyle h^{(j)}/(t_{j,k}^{\epsilon})^{\gamma}=k^{j\epsilon(\gamma-1/2)}\Longrightarrow h^{(j)}\geq(t_{j,k}^{\epsilon})^{\gamma}\hskip 14.22636pt1\leq j\leq r.

    Consequently,

    hi,kϵgi,l+,1ik,\displaystyle h_{i,k}^{\epsilon}\geq g_{i,l}^{+},\hskip 14.22636pt1\leq i\leq k,

    and therefore we obtain the lower bound

    (Tg+>k)(Thϵ>k).\displaystyle\mathbb{P}\left(T_{g^{+}}>k\right)\geq\mathbb{P}\left(T_{h^{\epsilon}}>k\right).

    Next, we provide a lower bound for the tail behavior of ThϵT_{h^{\epsilon}}. Fix δ>0\delta>0, and note that

    (Thϵ>k)(Thϵ>k;Stj,kϵ(δtj,kϵ,1/δtj,kϵ) 0jr1).\displaystyle\mathbb{P}\left(T_{h^{\epsilon}}>k\right)\geq\mathbb{P}\left(T_{h^{\epsilon}}>k;S_{t_{j,k}^{\epsilon}}\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},1/\delta\sqrt{t_{j,k}^{\epsilon}}\right)\;\forall\,0\leq j\leq r-1\right).

    Conditioning on the position of the random walk at the times tj,kϵt_{j,k}^{\epsilon}, 0jr10\leq j\leq r-1 yields

    (Thϵ>k)\displaystyle\mathbb{P}\left(T_{h^{\epsilon}}>k\right)\geq u0=δt0,kϵ1/δt0,kϵur1=δtr1,kϵ1/δtr1,kϵ(Th(r)>ktr1,kϵ|S0=ur1)\displaystyle\int_{u_{0}=\delta\sqrt{t_{0,k}^{\epsilon}}}^{1/\delta\sqrt{t_{0,k}^{\epsilon}}}\cdots\int_{u_{r-1}=\delta\sqrt{t_{r-1,k}^{\epsilon}}}^{1/\delta\sqrt{t_{r-1,k}^{\epsilon}}}\mathbb{P}\left(T_{h^{(r)}}>k-t_{r-1,k}^{\epsilon}\big{|}S_{0}=u_{r-1}\right)
    j=0r1(Stj,kϵtj1,kϵduj;Th(j)>tj,kϵtj1,kϵ|S0=uj1),\displaystyle\hskip 85.35826pt\cdot\prod_{j=0}^{r-1}\mathbb{P}\left(S_{t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon}}\in du_{j};T_{h^{(j)}}>t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon}\big{|}S_{0}=u_{j-1}\right),

    where we write t1=0t_{-1}=0 and u1=0u_{-1}=0 for convenience. In other words, we partition the trajectory of the random walk into intervals on which the boundary is constant. Recall that for every 0jr10\leq j\leq r-1, it holds that tj,kϵtj1,kϵ=tj,kϵ(1+o(1))t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon}=t_{j,k}^{\epsilon}(1+o(1)), and h(j)=o(tj1,kϵ)h^{(j)}=o(\sqrt{t_{j-1,k}^{\epsilon}}) for every 1jr1\leq j\leq r. Applying Proposition 18 in [12], we obtain, uniformly in uj=Θ(tj,kϵ)u_{j}=\Theta(\sqrt{t_{j,k}^{\epsilon}}), 1jr11\leq j\leq r-1,

    (Stj,kϵtj1,kϵduj;Th(j)>tj,kϵtj1,kϵ|S0=uj1)duj\displaystyle\frac{\mathbb{P}\left(S_{t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon}}\in du_{j};T_{h^{(j)}}>t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon}\big{|}S_{0}=u_{j-1}\right)}{du_{j}}
    2πV(uj1h(j))tj,kϵtj1,kϵujh(j)tj,kϵtj1,kϵexp{(ujh(j))22(tj,kϵtj1,kϵ)},\displaystyle\hskip 85.35826pt\sim\sqrt{\frac{2}{\pi}}\frac{V(u_{j-1}-h^{(j)})}{\sqrt{t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon}}}\frac{u_{j}-h^{(j)}}{t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon}}\textrm{exp}\left\{-\frac{\left(u_{j}-h^{(j)}\right)^{2}}{2(t_{j,k}^{\epsilon}-t_{j-1,k}^{\epsilon})}\right\},

    where V()V(\cdot) denotes the renewal function corresponding to the decreasing ladder height process of the random walk. The behavior of this function is well-understood: it is non-decreasing and V(t)t/𝔼(ST0)=tV(t)\sim t/\mathbb{E}(-S_{T_{0}})=t as tt\rightarrow\infty. Since by construction, h(j)=o(uj1)=o(uj)h^{(j)}=o(u_{j-1})=o(u_{j}) for every 1jr11\leq j\leq r-1, we obtain

    (Stjtj1duj;Th(j)>tjtj1|S0=uj1)duj\displaystyle\frac{\mathbb{P}\left(S_{t_{j}-t_{j-1}}\in du_{j};T_{h^{(j)}}>t_{j}-t_{j-1}\big{|}S_{0}=u_{j-1}\right)}{du_{j}}
    =(1+o(1))2πV(uj1(1θ))tjtj1uj(1θ)tjtj1exp{(uj(1θ))22(tjtj1)}\displaystyle\hskip 28.45274pt=(1+o(1))\sqrt{\frac{2}{\pi}}\frac{V(u_{j-1}-(1-\theta))}{\sqrt{t_{j}-t_{j-1}}}\frac{u_{j}-(1-\theta)}{t_{j}-t_{j-1}}\textrm{exp}\left\{-\frac{\left(u_{j}-(1-\theta)\right)^{2}}{2(t_{j}-t_{j-1})}\right\}
    =(1+o(1))(Stjtj1duj;T1θ>tjtj1|S0=uj1)duj.\displaystyle\hskip 28.45274pt=(1+o(1))\frac{\mathbb{P}\left(S_{t_{j}-t_{j-1}}\in du_{j};T_{1-\theta}>t_{j}-t_{j-1}\big{|}S_{0}=u_{j-1}\right)}{du_{j}}.

    Similarly, it holds uniformly in ur1=Θ(tr1,kϵ)u_{r-1}=\Theta(\sqrt{t_{r-1,k}^{\epsilon}}) [12, Proposition 18],

    (Th(r)>ktr1,kϵ|S0=ur1)(T1θ>ktr1,kϵ|S0=ur1).\displaystyle\mathbb{P}\left(T_{h^{(r)}}>k-t_{r-1,k}^{\epsilon}\big{|}S_{0}=u_{r-1}\right)\sim\mathbb{P}\left(T_{1-\theta}>k-t_{r-1,k}^{\epsilon}\big{|}S_{0}=u_{r-1}\right).

    Then,

    (Thϵ>k)\displaystyle\mathbb{P}\left(T_{h^{\epsilon}}>k\right) (1+o(1))(T1θ>k;Stj,kϵ(δtj,kϵ,tj,kϵδ) 0jr1).\displaystyle\geq(1+o(1))\mathbb{P}\left(T_{1-\theta}>k;S_{t_{j,k}^{\epsilon}}\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},\frac{\sqrt{t_{j,k}^{\epsilon}}}{\delta}\right)\;\forall\,0\leq j\leq r-1\right).

    Conditioning on staying above the constant boundary and applying the union bound yields

    (Thϵ>k)\displaystyle\mathbb{P}\left(T_{h^{\epsilon}}>k\right) (1+o(1))(T1θ>k)(1j=0r1(Stj,kϵ(δtj,kϵ,1/δtj,kϵ)|T1θ>k))\displaystyle\geq(1+o(1))\mathbb{P}\left(T_{1-\theta}>k\right)\left(1-\sum_{j=0}^{r-1}\mathbb{P}\left(S_{t_{j,k}^{\epsilon}}\not\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},1/\delta\sqrt{t_{j,k}^{\epsilon}}\right)\big{|}T_{1-\theta}>k\right)\right)
    =(1+o(1))(1r(1eδ22)re12δ2)(T1θ>k).\displaystyle=(1+o(1))\left(1-r\left(1-e^{-\frac{\delta^{2}}{2}}\right)-re^{-\frac{1}{2\delta^{2}}}\right)\mathbb{P}\left(T_{1-\theta}>k\right).

    Letting δ0\delta\downarrow 0, we find that for every ϵ>0\epsilon>0 sufficiently small,

    lim infk(Tg+>k)(T1θ>k)lim infk(Thϵ>k)(T1θ>k)=1,\displaystyle\liminf_{k\rightarrow\infty}\frac{\mathbb{P}\left(T_{g^{+}}>k\right)}{\mathbb{P}\left(T_{1-\theta}>k\right)}\geq\liminf_{k\rightarrow\infty}\frac{\mathbb{P}\left(T_{h^{\epsilon}}>k\right)}{\mathbb{P}\left(T_{1-\theta}>k\right)}=1,

    from which we conclude that the result holds.
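    The lemma can also be checked by direct simulation (a sketch with illustrative parameters; the convergence in k is slow, so at moderate k the ratio is only approximately one):

        import numpy as np

        rng = np.random.default_rng(4)
        theta, gam, k, trials = 0.5, 0.3, 2000, 50000
        l = int(k ** 0.45)                   # l_k with k^alpha << l_k << k

        g_plus = np.full(k, 1.0 - theta)     # the boundary g^+ of Lemma 4.4
        idx = np.arange(1, k + 1)
        g_plus[l:] = idx[l:] ** gam

        surv_plus = surv_const = 0
        for _ in range(trials):
            S = np.cumsum(1.0 - rng.exponential(1.0, size=k))
            surv_const += (S >= 1.0 - theta).all()
            surv_plus += (S >= g_plus).all()

        print(surv_plus / max(surv_const, 1))   # tends to 1 as k grows, per Lemma 4.4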

    The next lemma shows a result similar to Lemma 4.4, yet with a boundary that is monotone non-increasing and initially constant for a sufficiently long time. The proof is similar to that of Lemma 4.4, and is therefore given in Appendix C.

    Lemma 4.6.

    Suppose l:=lkl:=l_{k} is such that kαlkk^{\alpha}\ll l\ll k for some α(0,1)\alpha\in(0,1) as kk\rightarrow\infty. Define the boundary sequence

    gi,l={1θif il,iγif i>l,\displaystyle g_{i,l}^{-}=\left\{\begin{array}[]{ll}1-\theta&\textrm{if }i\leq l,\\ -i^{\gamma}&\textrm{if }i>l,\end{array}\right.

    with γ(0,1/2)\gamma\in(0,1/2). Then, as kk\rightarrow\infty,

    (Tg>k)(T1θ>k).\displaystyle\mathbb{P}\left(T_{g^{-}}>k\right)\sim\mathbb{P}\left(T_{1-\theta}>k\right).

    From Lemmas 4.4 and 4.6, the following corollary follows directly:

    Corollary 4.7.

    Suppose l:=lkl:=l_{k} is such that kαlkk^{\alpha}\ll l\ll k for some α(0,1)\alpha\in(0,1), and suppose the boundary sequence satisfies

    gi,l={1θif il,o(iγ)if i>l,\displaystyle g_{i,l}=\left\{\begin{array}[]{ll}1-\theta&\textrm{if }i\leq l,\\ o(i^{\gamma})&\textrm{if }i>l,\end{array}\right.

    for some θ>0\theta>0 and γ(0,1/2)\gamma\in(0,1/2). Then, as kk\rightarrow\infty,

    (Tg>k)(T1θ>k)2θ2πk1/2.\displaystyle\mathbb{P}\left(T_{g}>k\right)\sim\mathbb{P}\left(T_{1-\theta}>k\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.

    4.3.2 Proof of Proposition 2.4

    To prove Proposition 2.4, we first show that the tail of τm\tau_{m} behaves the same as that of T1θT_{1-\theta}, after which we use the relation in Proposition 4.3 to derive the tail of An,𝐝A_{n,\mathbf{d}}. In view of (81), we need to understand the behavior of the random walk

    Yi,m=j=1i(1Lj,m),1im+1,\displaystyle Y_{i,m}=\sum_{j=1}^{i}(1-L_{j,m}),\hskip 28.45274pt1\leq i\leq m+1,

    where Y0,m=0Y_{0,m}=0. In order to apply Corollary 4.7, we therefore need to show that the random walk is likely to be close to zero for a sufficiently long time ll, and within [iγ,iγ][-i^{\gamma},i^{\gamma}] for all likl\leq i\leq k for some 0<γ<1/20<\gamma<1/2.

    Proposition 4.8.

    Suppose k=o(mα)k=o(m^{\alpha}) for some α(0,1)\alpha\in(0,1) and γ(α/2,1/2)\gamma\in(\alpha/2,1/2). Then, as mm\rightarrow\infty,

    (|j=1i(1Lj,m)|>iγ for some 1ik)=o(m1/2).\displaystyle\mathbb{P}\left(\left|\sum_{j=1}^{i}\left(1-L_{j,m}\right)\right|>i^{\gamma}\textrm{ for some }1\leq i\leq k\right)=o(m^{-1/2}).

    Proof.

    From Proposition 3.12, it follows that the probability that any disconnection occurs upon removing o(m1/4ϵ)o(m^{1/4-\epsilon}) edges, with ϵ(0,1/4)\epsilon\in(0,1/4), is of order o(m1/2)o(m^{-1/2}). Therefore, for every l=o(m1/4ϵ)l=o(m^{1/4-\epsilon}) with ϵ(0,1/4)\epsilon\in(0,1/4), it is likely that Li,m=1L_{i,m}=1 for every ili\leq l, and hence

    \displaystyle\mathbb{P} (|j=1i(1Lj,m)|>iγ for some 1il)(|j=1i(1Lj,m)|0 for some 1il)=o(m1/2).\displaystyle\left(\left|\sum_{j=1}^{i}\left(1-L_{j,m}\right)\right|>i^{\gamma}\textrm{ for some }1\leq i\leq l\right)\leq\mathbb{P}\left(\left|\sum_{j=1}^{i}\left(1-L_{j,m}\right)\right|\neq 0\textrm{ for some }1\leq i\leq l\right)=o(m^{-1/2}).

    Therefore, to prove the proposition, it suffices to show that for every kk for which m1/4ϵkmαm^{1/4-\epsilon}\ll k\ll m^{\alpha} for some α(0,1)\alpha\in(0,1) and ϵ(0,1/4)\epsilon\in(0,1/4),

    (|j=li(1Lj,m)|>iγ for some lik)=o(m1/2),\displaystyle\mathbb{P}\left(\left|\sum_{j=l}^{i}\left(1-L_{j,m}\right)\right|>i^{\gamma}\textrm{ for some }l\leq i\leq k\right)=o(m^{-1/2}),

    where e.g. l=m(1/4ϵ)/2l=m^{(1/4-\epsilon)/2}. Write π1=1\pi_{1}=1, and

    πi=|E^m(i2)|mi+2,2im+1,\displaystyle\pi_{i}=\frac{|\hat{E}_{m}(i-2)|}{m-i+2},\hskip 28.45274pt2\leq i\leq m+1,

    a random variable representing the probability that edge ei1e_{i-1} is in the giant. Let Ber(π){Ber}(\pi) denote a Bernoulli distributed random variable with success probability π\pi, and note that

    \displaystyle L_{i,m}=\left(\frac{1}{\pi_{i}}+\frac{Y_{i-1,m}}{|\hat{E}_{m}(i-2)|}\right)\textrm{Ber}(\pi_{i})\geq 0. (82)

    In view of Theorem 2.3, we observe that πi\pi_{i} is likely to satisfy

    πimi+2(i2)αmi+21iα1.\displaystyle\pi_{i}\geq\frac{m-i+2-(i-2)^{\alpha}}{m-i+2}\geq 1-i^{\alpha-1}.

    More precisely, let ={πi=1i<l,πi1iα1,lik}\mathcal{E}=\{\pi_{i}=1\;\forall i<l,\pi_{i}\geq 1-i^{\alpha-1},\;\forall l\leq i\leq k\}. Then,

    \displaystyle\mathbb{P} (|j=1i(1Lj,m)|>iγ for some lik)(|j=1i(1Lj,m)|>iγ for some lik|)+(c),\displaystyle\left(\left|\sum_{j=1}^{i}\left(1-L_{j,m}\right)\right|>i^{\gamma}\textrm{ for some }l\leq i\leq k\right)\leq\mathbb{P}\left(\left|\sum_{j=1}^{i}\left(1-L_{j,m}\right)\right|>i^{\gamma}\textrm{ for some }l\leq i\leq k\,\bigg{|}\,\mathcal{E}\right)+\mathbb{P}(\mathcal{E}^{c}),

    where due to Theorem 2.3, it holds that (c)=o(m1/2)\mathbb{P}(\mathcal{E}^{c})=o(m^{-1/2}). Next, we show that the summed probabilities have an exponentially decaying tail. Define the stopping time

    \displaystyle\sigma_{i}=\sup\left\{j\in\mathbb{N}:j\leq i,\,Y_{j,m}\geq 0\right\}.

    We remark that σil\sigma_{i}\geq l. Due to (82), it holds for every likl\leq i\leq k,

    \displaystyle\mathbb{P}\left(Y_{i,m}<-i^{\gamma}\bigg{|}\mathcal{E}\right)\leq\sum_{r=l}^{i-1}\mathbb{P}\left(\sum_{j=1}^{i}L_{j,m}>i+i^{\gamma};\sigma_{i}=r\bigg{|}\mathcal{E}\right)
    \displaystyle\leq\sum_{r=l}^{i-1}\mathbb{P}\left(-Y_{r,m}+\sum_{j=r+1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})>(i-r)+i^{\gamma};\sigma_{i}=r\bigg{|}\mathcal{E}\right)(1+o(1))
    \displaystyle\leq\sum_{r=l}^{i-1}\;\mathbb{P}\left(\sum_{j=r+1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})>(i-r)+i^{\gamma}\bigg{|}\mathcal{E}\right)(1+o(1)).

    Applying Chernoff’s bound, we obtain for every t>0t>0

    \displaystyle\mathbb{P}\left(\sum_{j=r+1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})>(i-r)+i^{\gamma}\bigg{|}\mathcal{E}\right)\leq e^{-t(i-r+i^{\gamma})}\mathbb{E}\left[\exp\left\{t\sum_{j=r+1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})\right\}\bigg{|}\mathcal{E}\right].

    Although the random variables π1,,πi\pi_{1},...,\pi_{i} are not independent, they are conditioned to be close to one and satisfy a Markovian property. Let p=1iα1p=1-i^{\alpha-1}, and note that the conditional event \mathcal{E} implies that πjp\pi_{j}\geq p for all 1ji1\leq j\leq i. Define i\mathcal{F}_{i} as the filtration generated by removing ii edges uniformly at random. Applying the law of total expectation and noting that 1+x(et/x1)1+x(e^{t/x}-1) is (strictly) decreasing in xx for every t>0t>0, we observe that

    𝔼[exp{tj=1i1πjBer(πj)}|]\displaystyle\mathbb{E}\left[\exp\left\{t\sum_{j=1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})\right\}\bigg{|}\mathcal{E}\right] =𝔼[𝔼[exp{tj=1i1πjBer(πj)}|i1;]]\displaystyle=\mathbb{E}\left[\mathbb{E}\left[\exp\left\{t\sum_{j=1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})\right\}\bigg{|}\mathcal{F}_{i-1};\mathcal{E}\right]\right]
    =𝔼[exp{tj=1i11πjBer(πj)}𝔼[1+πi(et/πi1)|i1;]]\displaystyle=\mathbb{E}\left[\exp\left\{t\sum_{j=1}^{i-1}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})\right\}\mathbb{E}\left[1+\pi_{i}(e^{t/\pi_{i}}-1)\bigg{|}\mathcal{F}_{i-1};\mathcal{E}\right]\right]
    (1+p(et/p1))𝔼[exp{tj=1i11πjBer(πj)}|].\displaystyle\leq\left(1+p(e^{t/p}-1)\right)\mathbb{E}\left[\exp\left\{t\sum_{j=1}^{i-1}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})\right\}\bigg{|}\mathcal{E}\right].

    Applying the same argument recursively yields the bound

    \displaystyle\mathbb{P}\left(\sum_{j=r+1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})>(i-r)+i^{\gamma}\bigg{|}\mathcal{E}\right)\leq e^{-t(i-r+i^{\gamma})}\left(1+p(e^{t/p}-1)\right)^{i-r}

    for every t0t\geq 0. We point out that the right-hand side corresponds exactly to the Chernoff bound that would be obtained for a sum of iri-r independent random variables distributed as p1Ber(p)p^{-1}\textrm{Ber}(p). It follows that

    \displaystyle\mathbb{P}\left(\sum_{j=r+1}^{i}\frac{1}{\pi_{j}}\textrm{Ber}(\pi_{j})>(i-r)+i^{\gamma}\bigg{|}\mathcal{E}\right)\leq\exp\left\{-\frac{i^{2\gamma}}{2(i-r)p(1-p)}\right\}=\exp\left\{-\frac{1}{2}i^{2\gamma-\alpha}\right\}(1+o(1)).

    We conclude that

    (Yi,m<iγ|)iexp{12i2γα}(1+o(1)).\displaystyle\mathbb{P}\left(Y_{i,m}<-i^{\gamma}\bigg{|}\mathcal{E}\right)\leq i\exp\left\{-\frac{1}{2}i^{2\gamma-\alpha}\right\}(1+o(1)).

    On the other hand, we can use analogous arguments to bound (Yi,m>iγ|)\mathbb{P}\left(Y_{i,m}>i^{\gamma}\big{|}\mathcal{E}\right). This would yield

    (Yi,m>iγ|)iexp{14i2γα}(1+o(1)).\displaystyle\mathbb{P}\left(Y_{i,m}>i^{\gamma}\bigg{|}\mathcal{E}\right)\leq i\exp\left\{-\frac{1}{4}i^{2\gamma-\alpha}\right\}(1+o(1)).

    We conclude that

    (|j=li(1Lj,m)|>iγ for some lik)\displaystyle\mathbb{P}\left(\left|\sum_{j=l}^{i}\left(1-L_{j,m}\right)\right|>i^{\gamma}\textrm{ for some }l\leq i\leq k\right) i=lk(|j=1i(1Lj,m)|>iγ|)+o(m1/2)\displaystyle\leq\sum_{i=l}^{k}\mathbb{P}\left(\left|\sum_{j=1}^{i}\left(1-L_{j,m}\right)\right|>i^{\gamma}\bigg{|}\mathcal{E}\right)+o(m^{-1/2})
    i=lk2iexp{18i2γα}+o(m1/2)=o(m1/2).\displaystyle\leq\sum_{i=l}^{k}2i\exp\left\{-\frac{1}{8}i^{2\gamma-\alpha}\right\}+o(m^{-1/2})=o(m^{-1/2}).

    The tail behavior of τm\tau_{m} follows directly by combining Proposition 4.8 and Corollary 4.7.

    Corollary 4.9.

    If k=o(mα)k=o(m^{\alpha}) for some α(0,1)\alpha\in(0,1), then as mm\rightarrow\infty,

    (τm>k)(T1θ>k)2θ2πk1/2.\displaystyle\mathbb{P}\left(\tau_{m}>k\right)\sim\mathbb{P}\left(T_{1-\theta}>k\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.

    Proposition 2.4 follows from Corollary 4.9 if we show that conditioning on the event that Sm+1,m=Sm+1=0S_{m+1,m}=S_{m+1}=0 does not change the tail of the stopping time τm\tau_{m}. Indeed, this turns out to be the case.

    Proof of Proposition 2.4.

    We show that asymptotically the behavior of the conditioned stopping time τm|Sm+1=0\tau_{m}|S_{m+1}=0 is determined solely by what happens for the increments until time kk. Note that by Proposition 4.3

    (A^n,𝐝κ(k))\displaystyle\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(k)\right) (Si1θ+j=1i(1Lj,m),i=1,,k|Sm+1=0)=(τm>k|Sm+1=0).\displaystyle\sim\mathbb{P}\left(S_{i}\geq 1-\theta+\sum_{j=1}^{i}\left(1-L_{j,m}\right),\;\;\;i=1,...,k\bigg{|}\,S_{m+1}=0\right)=\mathbb{P}\left(\tau_{m}>k\big{|}\,S_{m+1}=0\right).

    Fix ϵ(0,1)\epsilon\in(0,1). We bound the probability from above and below, and show that the two bounds asymptotically coincide as ϵ0\epsilon\downarrow 0. Denote by fi()f_{i}(\cdot) the density of the random walk SiS_{i} for i1i\geq 1. Since this density is bounded, it holds that [29]

    limmsupx|mfm(mx)ϕ(x)|=0,\displaystyle\lim_{m\rightarrow\infty}\sup_{x\in\mathbb{R}}\left|\sqrt{m}f_{m}(\sqrt{m}x)-\phi(x)\right|=0, (83)

    where ϕ()\phi(\cdot) denotes the standard normal density function. Note that

    (τm>k|Sm+1=0)\displaystyle\mathbb{P}\left(\tau_{m}>k\big{|}\,S_{m+1}=0\right) =(τm>k;Skϵm|Sm+1=0)+(τm>k;Sk>ϵm|Sm+1=0).\displaystyle=\mathbb{P}\left(\tau_{m}>k;S_{k}\leq\epsilon\sqrt{m}\big{|}\,S_{m+1}=0\right)+\mathbb{P}\left(\tau_{m}>k;S_{k}>\epsilon\sqrt{m}\big{|}\,S_{m+1}=0\right).

    For the first term, we observe

    \displaystyle\mathbb{P}\left(\tau_{m}>k;S_{k}\leq\epsilon\sqrt{m}\big{|}\,S_{m+1}=0\right)=\frac{1}{f_{m+1}(0)}\int_{-\infty}^{\epsilon\sqrt{m}}\mathbb{P}\left(\tau_{m}>k;S_{k}\in du\right)f_{m+1-k}(-u)
    \displaystyle\leq\frac{1}{f_{m+1}(0)}\mathbb{P}\left(\tau_{m}>k\right)\sup_{u\in[1-\theta+\sum_{j=1}^{k}\left(1-L_{j,m}\right),\,\epsilon\sqrt{m}]}f_{m+1-k}(-u).

    Due to (83),

    fm+1(0)=(1+o(1))2πm\displaystyle f_{m+1}(0)=\frac{(1+o(1))}{\sqrt{2\pi m}}

    and

    supxfi(ix)1+o(1)2πi\displaystyle\sup_{x\in\mathbb{R}}f_{i}(\sqrt{i}x)\leq\frac{1+o(1)}{\sqrt{2\pi i}}

    as ii\rightarrow\infty. This yields the upper bound

    lim supm(τm>k;Skϵm|Sm+1=0)(τm>k)lim supm2πm2π(m+1k)=1.\displaystyle\limsup_{m\rightarrow\infty}\frac{\mathbb{P}\left(\tau_{m}>k;S_{k}\leq\epsilon\sqrt{m}\big{|}\,S_{m+1}=0\right)}{\mathbb{P}\left(\tau_{m}>k\right)}\leq\limsup_{m\rightarrow\infty}\frac{\sqrt{2\pi m}}{\sqrt{2\pi(m+1-k)}}=1.

    For the second term, we show it is negligible. Note that

    (τm>k;Sk>ϵm|Sm+1=0)\displaystyle\mathbb{P}\left(\tau_{m}>k;S_{k}>\epsilon\sqrt{m}\big{|}\,S_{m+1}=0\right) (Sk>ϵm)fm+1(0)=(1+o(1))2πm(Sk>ϵm).\displaystyle\leq\frac{\mathbb{P}\left(S_{k}>\epsilon\sqrt{m}\right)}{f_{m+1}(0)}=(1+o(1))\sqrt{2\pi m}\,\mathbb{P}\left(S_{k}>\epsilon\sqrt{m}\right).

    Applying Chernoff’s bound, it holds for every t0t\geq 0,

    (Sk>ϵm)exp{tϵm+kt+klog(11+t)}.\displaystyle\mathbb{P}\left(S_{k}>\epsilon\sqrt{m}\right)\leq\textrm{exp}\left\{-t\epsilon\sqrt{m}+kt+k\log\left(\frac{1}{1+t}\right)\right\}.

    In particular, this holds for t=ϵm/(kϵm)>0t=\epsilon\sqrt{m}/(k-\epsilon\sqrt{m})>0 (for mm large enough). Using this choice of tt and applying series expansions, we derive

    (Sk>ϵm)\displaystyle\mathbb{P}\left(S_{k}>\epsilon\sqrt{m}\right) exp{(ϵ2mk+ϵm)11ϵm/k+klog(1ϵmk)}\displaystyle\leq\textrm{exp}\left\{-\left(\frac{\epsilon^{2}m}{k}+\epsilon\sqrt{m}\right)\frac{1}{1-\epsilon\sqrt{m}/k}+k\log\left(1-\frac{\epsilon\sqrt{m}}{k}\right)\right\}
    =exp{ϵ2m+ϵmkk(1+ϵmk+O(mk2))ϵmϵ2m2kO(m3/2k2)}\displaystyle=\textrm{exp}\left\{-\frac{\epsilon^{2}m+\epsilon\sqrt{m}k}{k}\left(1+\frac{\epsilon\sqrt{m}}{k}+O\left(\frac{m}{k^{2}}\right)\right)-\epsilon\sqrt{m}-\frac{\epsilon^{2}m}{2k}-O\left(\frac{m^{3/2}}{k^{2}}\right)\right\}
    =exp{ϵ2mk+o(1)}.\displaystyle=\textrm{exp}\left\{-\frac{\epsilon^{2}m}{k}+o(1)\right\}.

    Due to Corollary 4.9,

    (τm>k)=Θ(k1/2),\displaystyle\mathbb{P}\left(\tau_{m}>k\right)=\Theta\left(k^{-1/2}\right),

    and hence

    (τm>k;Sk>ϵm|Sm+1=0)(τm>k)=O(kmexp{ϵ2mk+o(1)})=o(1).\displaystyle\frac{\mathbb{P}\left(\tau_{m}>k;S_{k}>\epsilon\sqrt{m}\big{|}\,S_{m+1}=0\right)}{\mathbb{P}\left(\tau_{m}>k\right)}=O\left(\sqrt{km}\;\textrm{exp}\left\{-\frac{\epsilon^{2}m}{k}+o(1)\right\}\right)=o(1).

    We conclude the upper bound

    lim supm(τm>k|Sm+1=0)(τm>k)1.\displaystyle\limsup_{m\rightarrow\infty}\frac{\mathbb{P}\left(\tau_{m}>k\big{|}\,S_{m+1}=0\right)}{\mathbb{P}\left(\tau_{m}>k\right)}\leq 1.

    For a lower bound, we observe

    (τm>k|Sm+1=0)\displaystyle\mathbb{P}\left(\tau_{m}>k\big{|}\,S_{m+1}=0\right) (τm>k;Skϵm|Sm+1=0)\displaystyle\geq\mathbb{P}\left(\tau_{m}>k;S_{k}\leq\epsilon\sqrt{m}\big{|}\,S_{m+1}=0\right)
    \displaystyle\geq(1+o(1))\sqrt{2\pi m}\,\mathbb{P}\left(\tau_{m}>k\right)\inf_{u\in[1-\theta+\sum_{j=1}^{k}\left(1-L_{j,m}\right),\,\epsilon\sqrt{m}]}f_{m+1-k}(-u).

    Due to Proposition 4.8, it holds with probability 1o(m1/2)1-o(m^{-1/2}) that

    |j=1k(1Lj,m)|=o(k).\displaystyle\bigg{|}\sum_{j=1}^{k}\left(1-L_{j,m}\right)\bigg{|}=o(\sqrt{k}).

    Combining this observation with (83) yields

    \displaystyle\inf_{u\in[1-\theta+\sum_{j=1}^{k}\left(1-L_{j,m}\right),\,\epsilon\sqrt{m}]}f_{m+1-k}(-u)=(1+o(1))\frac{1}{\sqrt{2\pi m}}e^{-\frac{\epsilon^{2}}{2}}.

    We conclude that

    lim infm(τm>k|Sm+1=0)(τm>k)eϵ22.\displaystyle\liminf_{m\rightarrow\infty}\frac{\mathbb{P}\left(\tau_{m}>k\big{|}\,S_{m+1}=0\right)}{\mathbb{P}\left(\tau_{m}>k\right)}\geq e^{-\frac{\epsilon^{2}}{2}}.

    As ϵ0\epsilon\downarrow 0, the lower bound tends to one as well. We conclude that as mm\rightarrow\infty,

    (τm>k|Sm+1=0)(τm>k)2θ2πk1/2\displaystyle\mathbb{P}\left(\tau_{m}>k\big{|}\,S_{m+1}=0\right)\sim\mathbb{P}\left(\tau_{m}>k\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}

    due to Corollary 4.9.

    4.4 Proof of main result

    As laid out in the proof strategy of Section 2.3, it only remains to be shown that the stopping times υ(k)\upsilon(k) and ϱ(k)\varrho(k), as defined in (10) and (11) respectively, are close to kk. This follows from the highly likely events that only a few components disconnect from the giant, and that such components are relatively small. In particular, it is likely that υ(k)=ko(k)\upsilon(k)=k-o(k) and ϱ(k)=k+o(k)\varrho(k)=k+o(k).

    Lemma 4.10.

    Suppose k=o(mα)k=o(m^{\alpha}) for some α(0,1)\alpha\in(0,1). Then,

    (υ(k)kkα)=o(m1/2).\displaystyle\mathbb{P}\left(\upsilon(k)\leq k-k^{\alpha}\right)=o(m^{-1/2}).

    Lemma 4.11.

    Suppose k=o(mα)k=o(m^{\alpha}) for some α(0,1)\alpha\in(0,1). Then,

    (ϱ(k)>k+kα+12)=o(m1/2).\displaystyle\mathbb{P}\left(\varrho(k)>k+k^{\frac{\alpha+1}{2}}\right)=o(m^{-1/2}).

    The proofs of these two lemmas are fairly straightforward, and are given in Appendix C for completeness. Using these lemmas, we can prove our main result.

    Proof of Theorem 1.3.

    Using the proof strategy as laid out in Section 2.3, we have the upper bound

    (An,𝐝k)\displaystyle\mathbb{P}\left(A_{n,\mathbf{d}}\geq k\right) (A^n,𝐝κ(υ(k)))(A^n,𝐝κ(υ(k))|υ(k)kkα)+(υ(k)kkα),\displaystyle\leq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\upsilon(k))\right)\leq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\upsilon(k))\;\big{|}\;\upsilon(k)\geq k-k^{\alpha}\right)+\mathbb{P}\left(\upsilon(k)\leq k-k^{\alpha}\right),

    where, due to Proposition 2.4,

    (A^n,𝐝κ(υ(k))|υ(k)kkα)\displaystyle\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\upsilon(k))\;\big{|}\;\upsilon(k)\geq k-k^{\alpha}\right) (A^n,𝐝κ(kkα))2θ2π(kkα)1/22θ2πk1/2.\displaystyle\leq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(k-k^{\alpha})\right)\sim\frac{2\theta}{\sqrt{2\pi}}(k-k^{\alpha})^{-1/2}\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}.

    Due to Lemma 4.10,

    (υ(k)kkα)=o(m1/2)=o(k1/2).\displaystyle\mathbb{P}\left(\upsilon(k)\leq k-k^{\alpha}\right)=o(m^{-1/2})=o(k^{-1/2}).

    For the lower bound, we observe

    \displaystyle\mathbb{P} (An,𝐝k)(A^n,𝐝κ(ϱ(k)))(A^n,𝐝κ(ϱ(k))|ϱ(k)k+k(α+1)/2)(ϱ(k)k+k(α+1)/2).\displaystyle\left(A_{n,\mathbf{d}}\geq k\right)\geq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\varrho(k))\right)\geq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\varrho(k))\;\big{|}\;\varrho(k)\leq k+k^{(\alpha+1)/2}\right)\mathbb{P}\left(\varrho(k)\leq k+k^{(\alpha+1)/2}\right).

    By Proposition 2.4,

    (A^n,𝐝κ(ϱ(k))|ϱ(k)k+k(α+1)/2)\displaystyle\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa(\varrho(k))\;\big{|}\;\varrho(k)\leq k+k^{(\alpha+1)/2}\right) (A^n,𝐝κ(k+k(α+1)/2))\displaystyle\geq\mathbb{P}\left(\hat{A}_{n,\mathbf{d}}\geq\kappa\left(k+k^{(\alpha+1)/2}\right)\right)
    2θ2π(k+k(α+1)/2)1/22θ2πk1/2,\displaystyle\sim\frac{2\theta}{\sqrt{2\pi}}\left(k+k^{(\alpha+1)/2}\right)^{-1/2}\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2},

    and due to Lemma 4.11,

    (ϱ(k)k+k(α+1)/2)=1o(m1/2).\displaystyle\mathbb{P}\left(\varrho(k)\leq k+k^{(\alpha+1)/2}\right)=1-o(m^{-1/2}).

    5 Universality principle

    Theorem 2.2 describes the tail behavior of the failure size in the case of sublinear thresholds. In this section, we consider whether the scale-free behavior prevails if the threshold is of linear size, i.e. of the same order as the number of vertices/edges. We conjecture that the scale-free behavior prevails up to a critical point. We also explain why the scale-free behavior can extend to a wide class of other graphs. We stress that the arguments provided in this section are intuitive in nature, and rigorous proofs of the claims remain to be established.

    The proof of Theorem 2.2 relies on the translation to a first-passage time of the random walk bridge Si,mS_{i,m} over a (moving) boundary that is close to constant 1θ1-\theta. We prove that the difference Si,mSiS_{i,m}-S_{i} is small (Proposition 4.8) if kk is sublinear, causing the random walk Si,mS_{i,m} to be asymptotically indistinguishable from SiS_{i} up to time kk. That is, the random walk bridge is asymptotically indistinguishable from the one corresponding to the star topology. Since SiS_{i} is a random walk with independent identically distributed increments with zero mean and finite variance, it is well-known by Donsker’s theorem that appropriately scaling the random walk bridge SiS_{i} yields convergence to a Brownian bridge. Therefore, the probability that An,𝐝A_{n,\mathbf{d}} exceeds kk asymptotically behaves the same as the probability that a Brownian bridge stays above zero until time kk, multiplied by a constant that relates to the translation of the boundary to any other (constant) boundary. We recall that in case of Si=j=1i(1Expj(1))S_{i}=\sum_{j=1}^{i}(1-\textrm{Exp}_{j}(1)) and a boundary (close to) 1θ1-\theta, this constant is given by θ\theta.
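    By (63), the conditioned walk can in fact be simulated exactly from uniform order statistics via S_i = i − m U_(i)^m, giving direct access to the quantity k^{1/2} P(failure size ≥ k) that is plotted in the figures below (a sketch with illustrative parameters):

        import numpy as np

        rng = np.random.default_rng(5)
        m, theta, trials = 4000, 0.5, 20000
        ks = np.array([50, 100, 200, 400, 800])

        surv = np.zeros(len(ks))
        for _ in range(trials):
            U = np.sort(rng.random(m))
            S = np.arange(1, m + 1) - m * U               # the random walk bridge, via (63)
            ok = S >= 1.0 - theta
            first = np.argmin(ok) if not ok.all() else m  # survival time above 1 - theta
            surv += first >= ks

        # approaches 2*theta/sqrt(2*pi) (about 0.199 for theta = 0.5) when k << m;
        # finite-m corrections of order sqrt(1 - k/m) remain at the larger k
        print(np.sqrt(ks) * surv / trials)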

    When the threshold k:=kmk:=k_{m} is of the same order as mm, this analysis does not follow through. The (maximum) difference between the two random walks up to time kk is likely to become of order Θ(k)=Θ(m)\Theta(\sqrt{k})=\Theta(\sqrt{m}), an order of magnitude that affects the asymptotic behavior. Moreover, the number of edges outside the giant is also no longer likely to be of size o(m)o(m), and hence we need to understand what the failure behavior typically is in these components as well. The natural question that comes to mind is whether the scale-free behavior in the failure size tail prevails or not for linear-sized thresholds. Next, we argue heuristically why this type of behavior prevails up to a certain critical point.

    If we remove k=βmk=\beta m edges uniformly at random, where β(0,qc)\beta\in(0,q_{c}), then there is a non-vanishing proportion of edges outside the largest (giant) component [25]. Nevertheless, the components outside the giant are relatively small, i.e. the number of edges in such a component is at most O(logm)O_{\mathbb{P}}(\log m). It turns out that, as a consequence, whenever a component detaches from the giant, the cascade is likely to stop in the small component. More specifically, suppose that the edge with the ii’th smallest surplus capacity is contained in the giant upon failure, and that its failure detaches a component from the giant. Due to the way the total load surge is defined, we conjecture that the total load surge is likely to be close to i/m+Θ(1/m)i/m+\Theta(1/\sqrt{m}) upon failure. Since the number of edges in the smaller component is likely to be O(logm)O(\log m), all the surplus capacities of these edges are likely to be at least i/m+Ω(1/logm)i/m+\Omega(1/\log m), which is much larger than the total load surge. That is, all edges are likely to have sufficient capacity to deal with the load, and no more failures occur in the smaller detached component. Moreover, even if the cascade continues, it would contribute at most a logarithmic number of edges to the total failure size. These observations lead to the claim that the dominant contribution to the failure size comes from the edges that are contained in the giant upon failure.

    To track the failure behavior in the giant component, one can use the sequence of random walks Si,mS_{i,m}. That is, the translation of the failure size to the first-passage time of the random walk bridge over the constant boundary 1θ1-\theta remains (likely to be) true if kβmk\leq\beta m with β(0,qc)\beta\in(0,q_{c}). Although the random walk is no longer close to SiS_{i}, the increments of Si,mS_{i,m} do have zero mean and a variance that is likely to be finite (and non-constant), and hence satisfy a martingale property. Therefore, we conjecture that the probability that the failure size in the giant component exceeds κ(k)\kappa(k) behaves the same as the probability that a Brownian bridge (with non-constant variance) stays above zero until time kk, multiplied by a constant.

    Since we argued that it should hold that An,dA^n,dA_{n,\textbf{d}}\approx\hat{A}_{n,\textbf{d}}, this leads to the following conclusion. Write k=αmk=\alpha m with α(0,1)\alpha\in(0,1). In view of (59), we observe that for all i:=imi:=i_{m} with i/mi/m sufficiently smaller than the critical point qcq_{c}, it holds that κ(i)m0i/mξ𝐝(q)dq\kappa(i)\approx m\int_{0}^{i/m}\xi_{\mathbf{d}}(q)\mathrm{d}q. Write

    \displaystyle\beta_{\alpha}:=\min\left\{x\in(0,1)\colon\int_{0}^{x}\xi_{\mathbf{d}}(q)\,\mathrm{d}q=\alpha\right\}. (84)

    Then ϱ(k)βαm\varrho(k)\approx\beta_{\alpha}m, since

    κ(ϱ(k))=k=αmκ(βαm).\displaystyle\kappa(\varrho(k))=k=\alpha m\approx\kappa(\beta_{\alpha}m).

    Therefore,

    (An,dk)(A^n,dk)(A^n,dκ(βαm)).\displaystyle\mathbb{P}(A_{n,\textbf{d}}\geq k)\sim\mathbb{P}(\hat{A}_{n,\textbf{d}}\geq k)\sim\mathbb{P}(\hat{A}_{n,\textbf{d}}\geq\kappa(\beta_{\alpha}m)).

    Summarizing, we have the following conjecture for CM¯n(d,q)\overline{CM}_{n}(d,q). Suppose k=αmk=\alpha m with α(0,1)\alpha\in(0,1) such that βα<qc\beta_{\alpha}<q_{c}. Then, for some constant f(α)(0,)f(\alpha)\in(0,\infty),

    (An,dk)f(α)k1/2.\displaystyle\mathbb{P}(A_{n,\textbf{d}}\geq k)\sim f(\alpha)k^{-1/2}.
    Figure 2: Erased configuration model.

    To support this conjecture, we performed a Monte-Carlo simulation experiment. In particular, we tested the conjecture on the erased configuration model. That is, we create the graph according to the configuration model mechanism with a prescribed degree sequence, merge multiple edges and erase self-loops. Moreover, after sampling such a graph, we remove any smaller components, so that the final graph is a simple connected graph. Due to the properties of the configuration model, the number of self-loops and multiple edges is very small (if any), and only a finite number of vertices and edges lie outside the giant. As a result, this graph and the configuration model conditioned on connectivity are asymptotically indistinguishable, and lead to the same asymptotic result for the number of edge failures. In our simulations we choose n{500,1000,1500,2000}n\in\{500,1000,1500,2000\}, and a degree sequence with n1=n1/3n_{1}=\lceil n^{1/3}\rceil vertices of degree one, n2=n3=n/2n1/3n_{2}=n_{3}=n/2-\lceil n^{1/3}\rceil vertices of degree two and of degree three, respectively, and n4=n1/3n_{4}=\lceil n^{1/3}\rceil vertices of degree four. Therefore, the number of edges is (close to) m=5n/4m=5n/4. The results are displayed in Figure 2. Indeed, our conjecture appears to hold in this case.
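    The graph generation step of this experiment can be sketched as follows (using networkx; the cascade itself additionally requires the surplus capacities and the load surge bookkeeping of Section 4.2):

        import numpy as np
        import networkx as nx

        def erased_cm(n, seed=0):
            n13 = int(np.ceil(n ** (1 / 3)))
            deg = [1] * n13 + [2] * (n // 2 - n13) + [3] * (n // 2 - n13) + [4] * n13
            G = nx.Graph(nx.configuration_model(deg, seed=seed))   # merge multiple edges
            G.remove_edges_from(nx.selfloop_edges(G))              # erase self-loops
            # remove the smaller components, as described above
            return G.subgraph(max(nx.connected_components(G), key=len)).copy()

        for n in (500, 1000, 1500, 2000):
            G = erased_cm(n)
            print(n, G.number_of_nodes(), G.number_of_edges(), 5 * n / 4)   # m close to 5n/4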

    Not only do we believe that this conjecture holds for the connected configuration model, but we also argue that the scale-free behavior may hold for a wider range of graphs. In particular, the relevant properties of the configuration model that we used in the analysis are the following. First, it is likely that no (significant) disconnections occur at the beginning of the cascading failure process. For example, in the case of the configuration model, we showed that the first disconnection is likely to occur after Θ(m)\Theta(\sqrt{m}) edge failures. Secondly, whenever the cascading failure process causes disconnections to occur, a giant component remains and disconnections only create relatively small components. It is well-known that this property is satisfied up to a certain critical threshold qcq_{c} for the configuration model, but it holds in fact for many more types of random graphs. In other words, for our result to prevail in other graph topologies, the graph should satisfy the following two properties:

    • The first disconnection (if any) is only likely to occur after Ω(mδ)\Omega(m^{\delta}) failures, where δ>0\delta>0;

    • There exists a critical parameter qcq_{c} such that if q<qcq<q_{c}, the largest component of the percolated graph is unique w.h.p. and contains a non-vanishing proportion of the vertices and edges. Moreover, all other components are likely to be relatively small for q<qcq<q_{c}, e.g. the second largest component contains at most O((logm)c)O_{\mathbb{P}}((\log m)^{c}) edges for some c<c<\infty.

    Whenever a graph G=(V,E)G=(V,E) satisfies these two properties, we conjecture that the number of edge failures AGA_{G} exhibits scale-free behavior. That is, for a range of thresholds k:=kmk:=k_{m}, it holds that

    (AGk)fG(α)k1/2,\displaystyle\mathbb{P}\left(A_{G}\geq k\right)\sim f_{G}(\alpha)k^{-1/2}, (85)

    where fG()>0f_{G}(\cdot)>0 and α=limmk/m\alpha=\lim_{m\rightarrow\infty}k/m. In particular, fG(0)=2θ/2πf_{G}(0)=2\theta/\sqrt{2\pi}. The function fGf_{G} depends on the specifics of the graph, such as average degree and more detailed connectivity properties. The range of values for kk for which (85) holds depends on the critical threshold qcq_{c} and the typical number of edges that are outside the giant component.

    Figure 3: n×n\lceil\sqrt{n}\rceil\times\lceil\sqrt{n}\rceil square lattice graph.

    To test conjecture (85), we first consider an n×n\lceil\sqrt{n}\rceil\times\lceil\sqrt{n}\rceil square lattice graph where the opposite boundaries are not connected, with n{500,1000,1500,2000}n\in\{500,1000,1500,2000\}. On the square lattice it is known that there is a phase transition for the existence of a giant component at qc=1/2q_{c}=1/2 [21]. Moreover, significant disconnections occur only after a substantial number of failures. Indeed, to disconnect an edge ee (not on the boundary) from the giant we need to remove at least six edges (the ones that share an end-vertex with ee). Since there are roughly 2n2n edges in the graph, this suggests that the first time the process produces an edge disconnected from the giant is of order Θ(n5/6)\Theta(n^{5/6}). Moreover, it is known that in this regime the second largest component in a box of volume nn is polylogarithmic in nn [22]. Thus, the lattice satisfies the conditions under which we conjecture a graph to be in the same mean-field universality class as the configuration model. In Figure 3, we observe that indeed the first significant disconnections happen after a much longer time than in CMn(𝐝)CM_{n}(\mathbf{d}), and the limiting function for k1/2(ALatticek)k^{1/2}\mathbb{P}\left(A_{Lattice}\geq k\right) remains very close to the setting without disconnections for a relatively long time.
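    The Θ(n^{5/6}) heuristic can be probed with a single percolation run (a sketch using networkx; one sample of the first time an edge falls outside the giant):

        import numpy as np
        import networkx as nx

        rng = np.random.default_rng(6)
        n = 2000
        side = int(np.ceil(np.sqrt(n)))
        G = nx.grid_2d_graph(side, side)      # opposite boundaries are not connected

        edges = list(G.edges())
        rng.shuffle(edges)
        for i, e in enumerate(edges, start=1):
            G.remove_edge(*e)
            giant = max(nx.connected_components(G), key=len)
            if G.number_of_edges() > G.subgraph(giant).number_of_edges():
                print(i, "removals until an edge leaves the giant; n^(5/6) =", round(n ** (5 / 6)))
                break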

Figure 4: Giant component of the Erdös-Rényi random graph with edge retention probability 2/n, where m_{ER}=\lambda(1-\eta_{\lambda}^{2})n/2.

Secondly, we consider the giant component of the Erdös-Rényi random graph. That is, every pair of the n vertices is connected by an edge independently with probability \lambda/n, and we consider the cascading failure process on the giant component. The giant component is asymptotically well-defined: it is well-known that for every \lambda>1 there exists a unique giant component \mathcal{C}_{\rm max} for which [16]

\displaystyle\frac{|\{v:v\in\mathcal{C}_{\max}\}|}{n}\overset{\mathbb{P}}{\longrightarrow}\zeta_{\lambda},

where \zeta_{\lambda}=1-\eta_{\lambda}>0 and \eta_{\lambda} satisfies the fixed-point equation

\displaystyle\eta_{\lambda}=\mathbb{E}\left(\eta_{\lambda}^{\textrm{Pois}(\lambda)}\right)=e^{-\lambda(1-\eta_{\lambda})}.
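
As a quick numerical illustration (a sketch of ours, not the paper's simulation code), the fixed point can be computed by plain iteration, since the map \eta\mapsto e^{-\lambda(1-\eta)} is a contraction near the relevant solution for \lambda>1; for \lambda=2 this reproduces the vertex and edge counts used in the experiment below.

    import math

    lam = 2.0
    eta = 0.5
    for _ in range(100):                 # fixed-point iteration eta -> E[eta^{Pois(lam)}]
        eta = math.exp(-lam * (1.0 - eta))

    zeta = 1.0 - eta
    print(f"eta_2  = {eta:.4f}")         # ~ 0.2032
    print(f"zeta_2 = {zeta:.4f}")        # ~ 0.7968: fraction of vertices in the giant
    print(f"edges  = {lam * (1 - eta**2) / 2:.4f} n")  # ~ 0.9587 n, i.e. m_ER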

In our simulation experiment, we take n\in\{500,1000,1500,2000\} vertices and \lambda=2. The graph on which we perform the cascading failure process is therefore likely to have around \zeta_{\lambda}n vertices and \lambda(1-\eta_{\lambda}^{2})n/2 edges (with \Theta(\sqrt{n}) fluctuations). From the definition of the Erdös-Rényi random graph it is clear that running a percolation process on it yields again an Erdös-Rényi random graph, but with a smaller \lambda. It is known that for every \lambda>1 all components outside the giant have size O(\log n). Therefore, the disintegration of the network is similar to that of the configuration model, yet with critical edge-removal probability q_{c}=(\lambda-1)/\lambda in this case.

However, the first disconnection is likely to occur after finitely many edge failures, since the number of degree-one vertices in the giant of an Erdös-Rényi random graph is likely to be of order \Theta(n). In other words, this graph violates the condition that the first disconnection should occur only after \Omega(m^{\delta}) edge failures for some \delta>0. Nevertheless, our simulation results suggest that (85) still prevails, see Figure 4. In other words, the condition that no early disconnections occur can possibly be relaxed. The analysis would require significant changes, though, particularly to the results on the distribution of first-passage times over moving boundaries such as Lemmas 4.4 and 4.6, as the load surge is no longer almost deterministic at the beginning of the process.

Figure 5: Graph G=CS(n,4).

In contrast, we consider a graph G=CS(n,4) consisting of n star components with 4 edges each, connected by a single closed path of edges through all the components. This graph thus consists of 5n vertices and m=5n edges. We observe that as soon as one of the n edges on this closed path fails, the remaining graph is a (connected) tree. Therefore, with high probability, the graph disconnects into two components, both of order \Theta(m), after the removal of only a fixed number of edges. The same effect is likely to recur within both components as further edges are removed uniformly at random. This violates both properties we needed to prove our result for the configuration model. Indeed, we observe in Figure 5 that k^{1/2}\mathbb{P}\left(A_{CS}\geq k\right) does not appear to converge to a single function as n\rightarrow\infty.
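
A minimal construction of CS(n,4) along these lines (our reading of the definition: n star centers joined in a closed path, each center carrying four leaves) makes the instability easy to observe:

    import networkx as nx

    def make_CS(n, s=4):
        # n stars with s edges each; consecutive star centers joined by a
        # closed path (cycle), so the graph has 5n vertices and 5n edges.
        G = nx.Graph()
        for c in range(n):
            for leaf in range(s):
                G.add_edge(("center", c), ("leaf", c, leaf))      # star edges
            G.add_edge(("center", c), ("center", (c + 1) % n))    # path edge
        return G

    G = make_CS(200)
    print(G.number_of_nodes(), G.number_of_edges())               # 1000 1000
    G.remove_edge(("center", 0), ("center", 1))                   # -> spanning tree
    print(nx.is_connected(G))                                     # True
    G.remove_edge(("center", 100), ("center", 101))               # next path edge
    print(sorted(len(c) for c in nx.connected_components(G)))     # two Theta(m) parts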

    References

    • [1] P. Bak and C. Tang. Earthquakes as a self-organized critical phenomenon. Journal of Geophysical Research: Solid Earth, 94(B11):15635–15637, 1989.
    • [2] P. Bak, C. Tang, and K. Wiesenfeld. Self-organized criticality: An explanation of the 1/f noise. Physical Review Letters, 59:381–384, 1987.
    • [3] P. Bak, C. Tang, and K. Wiesenfeld. Self-organized criticality. Physical Review A, 38:364–374, 1988.
    • [4] A. Barabási. The origin of bursts and heavy tails in human dynamics. Nature, 435(7039):207, 2005.
    • [5] A. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
    • [6] D. Bienstock. Electrical transmission system cascades and vulnerability - an operations research viewpoint, volume 22 of MOS-SIAM Series on Optimization. SIAM, 2016.
    • [7] B. A. Carreras, V. E. Lynch, I. Dobson, and D. E. Newman. Complex dynamics of blackouts in power transmission systems. Chaos: An Interdisciplinary Journal of Nonlinear Science, 14(3):643–652, 2004.
    • [8] B. K. Chakrabarti. A fiber bundle model of traffic jams. Physica A: Statistical Mechanics and its Applications, 372(1):162–166, 2006.
    • [9] A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data. SIAM review, 51(4):661–703, 2009.
• [10] D. Denisov, A. Sakhanenko, and V. Wachtel. First-passage times for random walks with nonidentically distributed increments. The Annals of Probability, 46(6):3313–3350, 2018.
• [11] S. Dhara, R. van der Hofstad, J. S. H. van Leeuwaarden, and S. Sen. Critical window for the configuration model: finite third moment degrees. Electronic Journal of Probability, 22:Paper No. 16, 33 pp., 2017.
• [12] R. A. Doney. Local behaviour of first passage probabilities. Probability Theory and Related Fields, 152(3):559–588, 2012.
• [13] L. Federico and R. van der Hofstad. Critical window for connectivity in the configuration model. Combinatorics, Probability and Computing, 26(5):660–680, 2017.
• [14] E. N. Gilbert. Random graphs. Annals of Mathematical Statistics, 30:1141–1144, 1959.
• [15] P. E. Greenwood and A. A. Novikov. One-sided boundary crossing for processes with independent increments. Theory of Probability & Its Applications, 31(2):221–232, 1987.
• [16] R. van der Hofstad. Random graphs and complex networks. Volume 1. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2017.
• [17] S. Janson. The largest component in a subcritical random graph with a power law degree distribution. The Annals of Applied Probability, 18(4):1651–1668, 2008.
• [18] S. Janson. On percolation in random graphs with given vertex degrees. Electronic Journal of Probability, 14:no. 5, 87–118, 2009.
• [19] S. Janson. The probability that a random multigraph is simple. Combinatorics, Probability and Computing, 18(1-2):205–225, 2009.
• [20] S. Janson. The probability that a random multigraph is simple. II. Journal of Applied Probability, 51A(Celebrating 50 Years of The Applied Probability Trust):123–137, 2014.
• [21] H. Kesten. The critical probability of bond percolation on the square lattice equals 1/2. Communications in Mathematical Physics, 74(1):41–59, 1980.
• [22] H. Kesten and Y. Zhang. The probability of a large finite cluster in supercritical Bernoulli percolation. The Annals of Probability, 18(2):537–555, 1990.
• [23] M. Korkali, J. G. Veneman, B. Tivnan, and P. Hines. Reducing cascading failure risk by increasing infrastructure network interdependency. Scientific Reports, 7, 2017.
• [24] T. Łuczak. Sparse random graphs with a given degree sequence. In Random graphs, Vol. 2 (Poznań, 1989), Wiley-Intersci. Publ., pages 165–182. Wiley, New York, 1992.
• [25] M. Molloy and B. Reed. A critical point for random graphs with a given degree sequence. In Proceedings of the Sixth International Seminar on Random Graphs and Probabilistic Methods in Combinatorics and Computer Science, “Random Graphs ’93” (Poznań, 1993), volume 6, pages 161–179, 1995.
    • [26] F. Morone and H. A. Makse. Influence maximization in complex networks through optimal percolation. Nature, 524(7563):65, 2015.
    • [27] A. E. Motter. Cascade control and defense in complex networks. Physical Review Letters, 93:098701, Aug 2004.
    • [28] A. E. Motter and Y. C. Lai. Cascade-based attacks on complex networks. Physical Review E, 66(065102), 2002.
    • [29] V. V. Petrov. Sums of Independent Random Variables. Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer Berlin Heidelberg, 1975.
    • [30] S. Pradhan, A. Hansen, and B.K. Chakrabarti. Failure processes in elastic fiber bundles. Reviews of Modern Physics, 82(1):499–555, 2010.
    • [31] J. Qi, I. Dobson, and S. Mei. Towards estimating the statistics of simulated cascades of outages with branching processes. IEEE Transactions on Power Systems, 28(3):3410–3419, Aug 2013.
• [32] H. A. Simon. On a class of skew distribution functions. Biometrika, 42(3-4):425–440, 1955.
    • [33] F. Sloothaak, S. C. Borst, and B. Zwart. Robustness of power-law behavior in cascading line failure models. Stochastic Models, 34(1):45–72, 2018.
    • [34] F. Sloothaak, S. C. Borst, and B. Zwart. The impact of a network split on cascading failure processes. Stochastic Systems, 9(4):392–416, 2019.
    • [35] F. Sloothaak, V. Wachtel, and B. Zwart. First-passage time asymptotics over moving boundaries for random walk bridges. Journal of Applied Probability, 55(2):627–651, 2018.
    • [36] B. Suki, A. Barabási, Z. Hantos, F. Peták, and H. E. Stanley. Avalanches and power-law behaviour in lung inflation. Nature, 368(6472):615, 1994.
    • [37] K. Sun, Y. Hou, W. Sun, and J. Qi. Power System Control Under Cascading Failures: Understanding, Mitigation, and System Restoration. Wiley-IEEE Press, 2019.
    • [38] V. Wachtel and D. Denisov. An exact asymptotics for the moment of crossing a curved boundary by an asymptotically stable random walk. Theory of Probability & Its Applications, 60(3):481–500, 2016.
    • [39] Z. Wang, A. Scaglione, and R. J. Thomas. Generating statistically correct random topologies for testing smart grid communication and control networks. IEEE Transactions on Smart Grid, 1(1):28–39, June 2010.
    • [40] D.J. Watts. A simple model of global cascades on random networks. Proceedings of the National Academy of Sciences of the United States of America, 99(9):5766, 2002.
    • [41] J. Zheng, Z. Gao, X. Zhao, and F. Bai-Bai. Extended fiber bundle model for traffic jams on scale-free networks. International Journal of Modern Physics C (IJMPC), 19(11):1727–1735, 2008.

    Appendix A Lists of variables

n : # vertices
\mathbf{d} : (Fixed) degree sequence
m=m_{n}(\mathbf{d}) : # edges in a configuration model with degree sequence \mathbf{d}
CM_{n}(\mathbf{d}) : Configuration model with degree sequence \mathbf{d}
\overline{CM}_{n}(\mathbf{d}) : Configuration model with degree sequence \mathbf{d}, conditioned to be connected
n_{i} : # vertices with degree i
p_{i} : Fraction of vertices with degree i
D_{n} : Degree of a vertex chosen uniformly at random from [n]
D : Limiting degree variable of D_{n}
d : Average degree of \mathbf{d}
\theta : Disturbance constant
l_{j}^{m}(i) : Total load surge at edge j after experiencing i load surges in a graph with m edges
|E_{j}^{m}(i)| : # edges in the component containing edge j after experiencing i load surges
A_{n,\mathbf{d}} : # edge failures after the cascading failure process
Table 1: List of variables commonly used throughout this paper.
q : Removal probability in the percolation process
CM_{n}(\mathbf{d},q) : Percolated configuration model with removal probability q
\mathcal{C} : Component of a graph, sometimes also denoted \mathcal{C}(x) when referring to the component that contains vertex or edge x
\mathcal{C}_{\rm max} : Largest component of a graph
T_{n,\mathbf{d}} : # edges that are sequentially removed uniformly at random for the first disconnection to occur in the connected configuration model
T : Limiting variable of m^{-1/2}T_{n,\mathbf{d}}
|\hat{E}_{m}(i)| : # edges in the giant (largest) component after i edges have been removed uniformly at random
|\tilde{E}_{m}(i)| : # edges outside the giant (largest) component after i edges have been removed uniformly at random
Table 2: List of variables commonly used in the percolation/sequential edge-removal process.
R_{n} : # removed half-edges in the first step of the explosion algorithm
\mathbf{d}^{\prime} : Degree sequence of the configuration graph in step three of the explosion method (\mathbf{d}^{\prime}\in\mathbb{N}^{n+R_{n}})
CM_{N}(\mathbf{d}^{\prime}) : Configuration model in step three of the explosion method with degree sequence \mathbf{d}^{\prime}
CM_{n}(\mathbf{d},q) : Resulting configuration model in step four of the explosion method, indistinguishable from the percolated configuration model with removal probability q
n_{i}^{\prime} : # vertices of degree i in CM_{N}(\mathbf{d}^{\prime})
p_{i}^{\prime} : \lim_{n\rightarrow\infty}n_{i}^{\prime}/n
n_{l,j} : # vertices of degree l in \mathbf{d} that have degree j in \mathbf{d}^{\prime}
p_{l,j} : Probability for a vertex of degree l to retain j half-edges after the first step of the explosion algorithm
L_{k}^{\prime}(n) : # components that are lines of length k\geq 2 in CM_{N}(\mathbf{d}^{\prime})
C_{k}^{\prime}(n) : # components that are cycles of length k\geq 1 in CM_{N}(\mathbf{d}^{\prime})
Table 3: List of variables commonly used in the explosion algorithm.
\hat{A}_{n,\mathbf{d}} : # edges that were contained in the largest component upon failure during the cascade
\tilde{A}_{n,\mathbf{d}} : # edges that were contained outside the largest component upon failure during the cascade
A_{n+1}^{*} : # edge failures in the cascading failure process on a star topology with n+1 nodes and m=n edges
\kappa(i) : # edges that are contained in the largest component upon removal when i edges are sequentially removed uniformly at random
\upsilon(i) : Minimum # edges that need to be removed uniformly at random for the sum of \upsilon(i) and the # edges outside the giant to exceed i
\varrho(i) : # edges that need to be removed uniformly at random such that i edges were contained in the giant component upon failure
S_{i} : \sum_{j=1}^{i}\left(1-\textrm{Exp}_{j}(1)\right)
L_{i,m} : Scaled perturbed load surge, formally defined as in (80)
S_{i,m} : Random walk defined as \sum_{j=1}^{i}\left(L_{j,m}-\textrm{Exp}_{j,m}(1)\right)
Y_{i,m} : Random walk \sum_{j=1}^{i}(1-L_{j,m})
\tau_{m} : First-passage time of the random walk S_{i,m} below 1-\theta
T_{g} : First-passage time of the random walk S_{i} below a boundary sequence (g_{i})_{i\in\mathbb{N}}
Table 4: List of variables commonly used in the failure process.

    Appendix B Proofs of results on the disintegration of the graph

    In this appendix, we present several proofs of results given in Section 3.

    Proof of Lemma 3.1.

Due to the i.i.d. property of the surplus capacities, \tilde{E}^{\prime}(i) has the distribution of i edges chosen uniformly at random without replacement from [m], so \mathbb{P}(\tilde{E}^{\prime}(i)=B)=\binom{m}{i}^{-1} for every B\subseteq[m] with |B|=i.

Since all edges in CM_{n}(\mathbf{d},q) are removed independently with probability q, it holds for all sets B\subseteq E with |B|=i that

\displaystyle\mathbb{P}(E^{\prime}(G(q))=B)=q^{i}(1-q)^{m-i}, (86)

    while

\displaystyle\mathbb{P}(|E^{\prime}(G(q))|=i)=\binom{m}{i}q^{i}(1-q)^{m-i}. (87)

Since \{E^{\prime}(G(q))=B\}\subseteq\{|E^{\prime}(G(q))|=i\}, we obtain

\displaystyle\mathbb{P}(E^{\prime}(G(q))=B\mid|E^{\prime}(G(q))|=i)=\binom{m}{i}^{-1}, (88)

so (12) holds. From (87) we obtain (13) by concentration of the \textrm{Bin}(m,q) distribution whenever qm\to\infty.
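
The identities (86)–(88) are easy to confirm numerically. The following toy check is ours (a tiny edge set, enumerated exhaustively); it compares the empirical conditional frequencies against \binom{m}{i}^{-1}, and the empirical conditioning probability against (87).

    import random
    from itertools import combinations
    from math import comb

    m, i, q, trials = 6, 2, 0.4, 200_000
    rng = random.Random(0)
    counts = {frozenset(B): 0 for B in combinations(range(m), i)}
    hits = 0
    for _ in range(trials):
        removed = frozenset(e for e in range(m) if rng.random() < q)
        if len(removed) == i:                    # condition on |E'(G(q))| = i
            hits += 1
            counts[removed] += 1

    print(hits / trials, comb(m, i) * q**i * (1 - q)**(m - i))   # check (87)
    print(max(counts.values()) / hits, 1 / comb(m, i))           # check (88)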

    Proof of Lemma 3.7.

    We use the explosion algorithm from Algorithm 1. To illustrate the type of arguments we use to prove this statement, we first consider the first moment only.

Define \mathcal{V}_{j} as the set of all vertices of degree j in \mathbf{d}^{\prime}. Recalling the degree sequence \mathbf{d}^{\prime} from Lemma 3.4, we define

\displaystyle\mathcal{L}_{k}=\{\{v_{1},v_{2},\ldots,v_{k}\}:v_{1},v_{k}\in\mathcal{V}_{1};v_{2},\ldots,v_{k-1}\in\mathcal{V}_{2}\},

the set of all collections of k vertices that could form a line. Note that

\displaystyle L_{k}^{\prime}(n)=\sum_{l\in\mathcal{L}_{k}}\mathbbm{1}_{\{l\text{ forms a line}\}}.

    Due to Lemma 3.4, we observe that

\displaystyle\mathbb{E}[L_{k}^{\prime}(n)]=\mathbb{E}\left[\sum_{l\in\mathcal{L}_{k}}\mathbbm{1}_{\{l\text{ forms a line}\}}\right]
=\mathbb{E}\left[\binom{n_{1}^{\prime}}{2}\binom{n_{2}^{\prime}}{k-2}\frac{2k-4}{2m-1}\frac{2k-6}{2m-3}\cdots\frac{2}{2m-2k+5}\frac{1}{2m-2k+3}\right]
=\mathbb{E}\left[\frac{{n_{1}^{\prime}}^{2}{n_{2}^{\prime}}^{k-2}}{2(k-2)!}\frac{(2k-4)!!}{(2m)^{k-1}}\right](1+o(1))=\frac{i^{2}}{m}\frac{1}{4}\Big{(}1+\frac{2p_{2}}{d}\Big{)}^{2}\Big{(}\frac{2p_{2}}{d}\Big{)}^{k-2}(1+o(1)).

Next, we generalize these arguments to higher and mixed moments of the variables (L_{k}^{\prime}(n))_{k\geq 2}. We follow the same approach as in [13], where convergence of the number of lines to a sequence of Poisson variables was proved in the critical window for connectivity. Using the method of moments, in this case we need to prove concentration of the number of vertices and edges in line components.

We prove (39) by induction. With a slight abuse of notation, we start the induction at k=1, or alternatively, at k=2 with r_{2}=0. Then both sides of (39) are equal to one, and hence the induction hypothesis is satisfied.

    Next, we show how to advance the induction hypothesis. We define

\displaystyle W_{k}(\mathbf{r})=\left\{\bigcup_{j=2}^{k}\{l_{j}(1),\ldots,l_{j}(r_{j})\}:l_{j}(h)\in\mathcal{L}_{j}\textrm{ for all }1\leq h\leq r_{j},\,2\leq j\leq k\right\},

the collection of sets of \sum_{j=2}^{k}r_{j} possible lines. Moreover, for a set w_{k}(\mathbf{r})\in W_{k}(\mathbf{r}), we define \mathcal{E}(w_{k}(\mathbf{r})) as the event that all elements of w_{k}(\mathbf{r}) form a line component in CM_{N}(\mathbf{d}^{\prime}). Then, using the tower property, we can rewrite

\displaystyle\mathbb{E}[L_{2}^{\prime}(n)^{r_{2}}\cdots L_{k}^{\prime}(n)^{r_{k}}]=\mathbb{E}\left[\sum_{w_{k}(\mathbf{r})\in W_{k}(\mathbf{r})}\mathbbm{1}_{\mathcal{E}(w_{k}(\mathbf{r}))}\right]
=\mathbb{E}\left[\sum_{w_{k-1}(\mathbf{r})}\mathbbm{1}_{\mathcal{E}(w_{k-1}(\mathbf{r}))}\mathbb{E}\left[\sum_{l_{k}(1),\ldots,l_{k}(r_{k})\in\mathcal{L}_{k}}\mathbbm{1}_{l_{k}(1)}\mathbbm{1}_{l_{k}(2)}\cdots\mathbbm{1}_{l_{k}(r_{k})}\mid\mathcal{E}(w_{k-1}(\mathbf{r}))\right]\right],

where \mathbbm{1}_{l_{k}(h)} denotes the indicator of the event that the set l_{k}(h) forms a line. Next, we show that for every w_{k-1}(\mathbf{r})\in W_{k-1}(\mathbf{r}),

\displaystyle\Big{(}\frac{m}{i^{2}}\Big{)}^{r_{k}}\sum_{l_{k}(1),\ldots,l_{k}(r_{k})\in\mathcal{L}_{k}}\mathbb{E}[\mathbbm{1}_{l_{k}(1)}\mathbbm{1}_{l_{k}(2)}\cdots\mathbbm{1}_{l_{k}(r_{k})}\mid\mathcal{E}(w_{k-1}(\mathbf{r}))]=\left(\frac{1}{4}\Big{(}1+\frac{2p_{2}}{d}\Big{)}^{2}\Big{(}\frac{2p_{2}}{d}\Big{)}^{k-2}\right)^{r_{k}}(1+o(1)).

    Note that by induction, this suffices to conclude (39).

First, note that if any of the l_{k}(j), 1\leq j\leq r_{k}, contains any of the vertices used in w_{k-1}(\mathbf{r}), then it cannot form a line of length k, as these vertices are already contained in line components of smaller length. In that case, \mathbb{E}[\mathbbm{1}_{l_{k}(1)}\mathbbm{1}_{l_{k}(2)}\cdots\mathbbm{1}_{l_{k}(r_{k})}\mid\mathcal{E}(w_{k-1}(\mathbf{r}))]=0. Therefore, we can restrict to the part of the graph excluding the line components in w_{k-1}(\mathbf{r}). This part of the graph is again a configuration model, but with a slightly altered degree sequence. Note that we exclude finitely many line components of finite length, and hence only finitely many vertices and edges. With a slight abuse of notation, write r^{\prime}=r^{\prime}(l_{k}(1),\ldots,l_{k}(r_{k})) for the number of mutually distinct lines. Then the probability that r^{\prime} mutually distinct lines are formed in the graph excluding the line components in w_{k-1}(\mathbf{r}) is given by

\displaystyle\mathbb{E}[\mathbbm{1}_{l_{k}(1)}\mathbbm{1}_{l_{k}(2)}\cdots\mathbbm{1}_{l_{k}(r_{k})}\mid\mathcal{E}(w_{k-1}(\mathbf{r}))]=\frac{(2k-4)!!^{r^{\prime}}}{(2m)^{r^{\prime}(k-1)}}(1+o(1)).

We sum these contributions over the number of possible sets that exist in the subgraph. Write C(r_{k},r^{\prime}) for the number of distinct sets of r_{k} lines that contain the same set of r^{\prime} mutually distinct lines of length k. Importantly, C(r_{k},r_{k})=1, and C(r_{k},r^{\prime}) is a finite integer for 1\leq r^{\prime}\leq r_{k}-1. Recalling Lemma 3.4, we obtain

\displaystyle\Big{(}\frac{m}{i^{2}}\Big{)}^{r_{k}}\sum_{l_{k}(1),\ldots,l_{k}(r_{k})\in\mathcal{L}_{k}}\mathbb{E}[\mathbbm{1}_{l_{k}(1)}\mathbbm{1}_{l_{k}(2)}\cdots\mathbbm{1}_{l_{k}(r_{k})}\mid\mathcal{E}(w_{k-1}(\mathbf{r}))]
=\Big{(}\frac{m}{i^{2}}\Big{)}^{r_{k}}\mathbb{E}\left[\sum_{r^{\prime}=1}^{r_{k}}C(r_{k},r^{\prime})\left(\frac{{n_{1}^{\prime}}^{2}{n_{2}^{\prime}}^{k-2}}{2(k-2)!}\frac{(2k-4)!!}{(2m)^{k-1}}\right)^{r^{\prime}}\right](1+o(1))
=\left(\frac{1}{4}\Big{(}1+\frac{2p_{2}}{d}\Big{)}^{2}\Big{(}\frac{2p_{2}}{d}\Big{)}^{k-2}\right)^{r_{k}}(1+o(1)),

    concluding the proof.

Proof of Corollary 3.8.

Note that it follows from Lemma 3.7 that

\displaystyle\mathbb{E}[L_{k}^{\prime}(n)]=\frac{i^{2}}{m}\frac{1}{4}\Big{(}1+\frac{2p_{2}}{d}\Big{)}^{2}\Big{(}\frac{2p_{2}}{d}\Big{)}^{k-2}(1+o(1)),\qquad k\geq 2.

Consequently, for every \varepsilon>0 there exists an N>0 such that for all n\geq N,

\displaystyle\frac{m}{i^{2}}\mathbb{E}[L_{k}^{\prime}(n)]\leq\frac{1}{4}\Big{(}1+\frac{2p_{2}+\varepsilon}{d-\varepsilon}\Big{)}^{2}\Big{(}\frac{2p_{2}+\varepsilon}{d-\varepsilon}\Big{)}^{k-2},\qquad k\geq 2.

In particular, for \varepsilon small enough, \frac{2p_{2}+\varepsilon}{d-\varepsilon}<1, so that this bound decays to zero exponentially fast in k. We apply dominated convergence to obtain

\displaystyle\mathbb{E}\Big{[}\frac{m}{i^{2}}\sum_{k=2}^{\infty}kL_{k}^{\prime}(n)\Big{]}\to\frac{(d-p_{2})(d+2p_{2})^{2}}{2d(d-2p_{2})^{2}}, (89)
\displaystyle\mathbb{E}\Big{[}\frac{m}{i^{2}}\sum_{k=2}^{\infty}(k-1)L_{k}^{\prime}(n)\Big{]}\to\frac{(d+2p_{2})^{2}}{4(d-2p_{2})^{2}}. (90)
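
For completeness, these limits follow from the geometric-series identities (writing x:=2p_{2}/d<1)

\sum_{k=2}^{\infty}kx^{k-2}=\frac{2-x}{(1-x)^{2}},\qquad\sum_{k=2}^{\infty}(k-1)x^{k-2}=\frac{1}{(1-x)^{2}},

combined with the prefactor \frac{1}{4}(1+x)^{2} from the display above; substituting x=2p_{2}/d yields the two right-hand sides of (89) and (90).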

In other words, we have derived the expected number of vertices and edges in line components of CM_{N}(\mathbf{d}^{\prime}). Next, we prove (40) for every j\geq 2. For a sequence \mathbf{r} of positive integers, we define |\mathbf{r}|_{1}=\sum_{h=1}^{k}r_{h}, i.e. its \ell_{1}-norm. We write

\displaystyle\mathbb{E}\Big{[}\Big{(}\frac{m}{i^{2}}\sum_{k=2}^{\infty}kL_{k}^{\prime}(n)\Big{)}^{j}\Big{]}=\frac{m^{j}}{i^{2j}}\sum_{\mathbf{r}:|\mathbf{r}|_{1}=j}\prod_{h\geq 2}\mathbb{E}[(hL_{h}^{\prime}(n))^{r_{h}}].

For every \varepsilon\geq 0, it holds for all n sufficiently large that

\displaystyle\Big{(}\frac{m}{i^{2}}\Big{)}^{|\mathbf{r}|_{1}}\prod_{h\geq 2}\mathbb{E}[(L_{h}^{\prime}(n))^{r_{h}}]\leq\frac{1}{4^{|\mathbf{r}|_{1}}}\Big{(}1+\frac{2p_{2}+\varepsilon}{d-\varepsilon}\Big{)}^{2|\mathbf{r}|_{1}}\Big{(}\frac{2p_{2}+\varepsilon}{d-\varepsilon}\Big{)}^{\sum_{h\geq 2}(h-2)r_{h}}. (91)

If \varepsilon is small enough, then \frac{2p_{2}+\varepsilon}{d-\varepsilon}<1, and thus the bound decreases exponentially fast in \sum_{h\geq 2}hr_{h}. Applying dominated convergence thus yields (40).

    Appendix C Proofs of results on the cascading failure process

    Proof of Lemma 4.6.

    First, we observe that

\displaystyle\mathbb{P}\left(T_{g^{-}}>k\right)\geq\mathbb{P}\left(T_{1-\theta}>k\right),

and hence it suffices to show that the reversed inequality holds asymptotically. Our proof is similar to that of Lemma 4.4, but adapted to provide an upper bound.

Fix \epsilon>0 small as in Lemma 4.4, and define the piecewise constant boundary

\displaystyle\hat{h}_{i,k}^{\epsilon}=\left\{\begin{array}{ll}h^{(0)}=1-\theta&\textrm{if }i\leq t_{0,k}^{\epsilon}=l,\\ -h^{(j)}&\textrm{if }t_{j-1,k}^{\epsilon}<i\leq t_{j,k}^{\epsilon},\;1\leq j\leq r-1,\\ -h^{(r)}&\textrm{if }i>t_{r-1,k}^{\epsilon},\end{array}\right.

for r, times t_{j,k}^{\epsilon} and levels h^{(j)} defined as in the proof of Lemma 4.4. We note that for every fixed \delta>0,

\displaystyle\mathbb{P}\left(T_{g^{-}}>k\right)\leq\mathbb{P}\left(T_{\hat{h}^{\epsilon}}>k;\,S_{t_{j,k}^{\epsilon}}\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},\,\delta^{-1}\sqrt{t_{j,k}^{\epsilon}}\right),\;\forall\,0\leq j\leq r-1\right)
+\sum_{j=0}^{r}\mathbb{P}\left(T_{g^{-}}>k;\,S_{t_{j,k}^{\epsilon}}\not\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},\,\delta^{-1}\sqrt{t_{j,k}^{\epsilon}}\right)\right).

Using arguments analogous to those in the proof of Lemma 4.4, we obtain

\displaystyle\mathbb{P}\left(T_{\hat{h}^{\epsilon}}>k;\,S_{t_{j,k}^{\epsilon}}\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},\,\delta^{-1}\sqrt{t_{j,k}^{\epsilon}}\right),\;\forall\,0\leq j\leq r-1\right)\leq(1+o(1))\,\mathbb{P}\left(T_{1-\theta}>k\right).

For the other terms, define the sequence (\tilde{h}_{i})_{i\in\mathbb{N}} with \tilde{h}_{i}=\min\{1-\theta,-i^{\gamma}\}. Then, due to Theorem 1 of [10],

\displaystyle\sum_{j=0}^{r}\mathbb{P}\left(T_{g^{-}}>k;\,S_{t_{j,k}^{\epsilon}}\not\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},\,\delta^{-1}\sqrt{t_{j,k}^{\epsilon}}\right)\right)\leq\sum_{j=0}^{r-1}\mathbb{P}\left(S_{t_{j,k}^{\epsilon}}\not\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},\,\delta^{-1}\sqrt{t_{j,k}^{\epsilon}}\right)\,\big{|}\,T_{\tilde{h}}>k\right)\mathbb{P}\left(T_{\tilde{h}}>k\right)
\leq(1+o(1))\,r\left(1-e^{-\frac{\delta^{2}}{2}}+e^{-\frac{1}{2\delta^{2}}}\right)\mathbb{P}\left(T_{\tilde{h}}>k\right).

Letting \delta\downarrow 0 yields [10]

\displaystyle\sum_{j=0}^{r}\mathbb{P}\left(T_{g^{-}}>k;\,S_{t_{j,k}^{\epsilon}}\not\in\left(\delta\sqrt{t_{j,k}^{\epsilon}},\,\delta^{-1}\sqrt{t_{j,k}^{\epsilon}}\right)\right)=o\left(\mathbb{P}\left(T_{\tilde{h}}>k\right)\right)=o\left(k^{-1/2}\right).

    Since

\displaystyle\mathbb{P}\left(T_{1-\theta}>k\right)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2},

    we conclude that

\displaystyle\limsup_{k\rightarrow\infty}\frac{\mathbb{P}\left(T_{g^{-}}>k\right)}{\mathbb{P}\left(T_{1-\theta}>k\right)}\leq 1.
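
A crude Monte Carlo experiment can be used to sanity-check the asymptotics \mathbb{P}(T_{1-\theta}>k)\sim\frac{2\theta}{\sqrt{2\pi}}k^{-1/2}. The sketch below is ours, not the paper's code; it assumes the convention T_{1-\theta}=\min\{i\geq 1:S_{i}<1-\theta\} for the random walk S_{i}=\sum_{j=1}^{i}(1-\textrm{Exp}_{j}(1)) of Table 4, with \theta>1 so that the barrier 1-\theta lies strictly below the walk's starting point; both assumptions should be checked against the formal definitions in Section 4.

    import numpy as np

    rng = np.random.default_rng(1)
    theta, k, trials = 1.5, 400, 100_000   # theta > 1 is our assumption here
    barrier = 1.0 - theta                  # negative barrier below S_0 = 0

    survived = 0
    for _ in range(trials):
        s = np.cumsum(1.0 - rng.exponential(1.0, size=k))
        if s.min() >= barrier:             # event {T_{1-theta} > k}
            survived += 1

    print(np.sqrt(k) * survived / trials)  # empirical sqrt(k) * P(T > k)
    print(2 * theta / np.sqrt(2 * np.pi))  # predicted constant, ~ 1.197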

    Proof of Lemma 4.10.

    This statement is a consequence of Theorem 2.3. Recall that by definition,

\displaystyle\upsilon(k)+|\tilde{E}_{m}(\upsilon(k))|\geq k,

and \upsilon(k)\leq k. Moreover,

\displaystyle|\tilde{E}_{m}(\upsilon(k))|=m-\upsilon(k)-|\hat{E}_{m}(\upsilon(k))|\leq\max_{1\leq j\leq k}\left\{m-j-|\hat{E}_{m}(j)|\right\}.

    It follows that

\displaystyle\mathbb{P}\left(|\tilde{E}_{m}(\upsilon(k))|\geq k^{\alpha}\right)\leq\mathbb{P}\left(\max_{1\leq j\leq k}\left\{m-j-|\hat{E}_{m}(j)|\right\}\geq k^{\alpha}\right)
\leq\mathbb{P}\left(m-j-|\hat{E}_{m}(j)|\geq j^{\alpha}\textrm{ for some }1\leq j\leq k\right)=o(m^{-1/2})

    by Theorem 2.3. We conclude that

\displaystyle\mathbb{P}\left(\upsilon(k)\leq k-k^{\alpha}\right)\leq\mathbb{P}\left(|\tilde{E}_{m}(\upsilon(k))|\geq k^{\alpha}\right)=o(m^{-1/2}).

    Proof of Lemma 4.11.

Again, this is a consequence of Theorem 2.3. We note that for every 1\leq l\leq m,

\displaystyle\kappa(l)=\sum_{i=1}^{l}\textrm{Ber}(\pi_{i+1}),

    where

\displaystyle\pi_{i}=\frac{|\hat{E}_{m}(i-2)|}{m-i+2}

    is a random variable. Due to Theorem 2.3,

\displaystyle\mathbb{P}\left(\pi_{i}\leq 1-\frac{i^{\alpha}}{m-i+2}\textrm{ for some }2\leq i\leq k+1\right)=o(m^{-1/2}).

Since for every \alpha\in(0,1) the function i^{\alpha}/(m-i+2) is increasing in i, and at i=k+1 it is bounded by k^{\alpha-1} whenever k\leq(m+1)/2, it follows that

\displaystyle\mathbb{P}\left(\pi_{i}\leq 1-k^{\alpha-1}\textrm{ for some }2\leq i\leq k+1\right)=o(m^{-1/2}).

    We derive the bound

\displaystyle\mathbb{P}\left(\varrho(k)>k+k^{(\alpha+1)/2}\right)\leq\mathbb{P}\left(\kappa\left(k+k^{(\alpha+1)/2}\right)\leq k\right)\leq\mathbb{P}\left(\sum_{i=1}^{k+k^{(\alpha+1)/2}}\textrm{Ber}(\pi_{i+1})\leq k\right)
\leq\mathbb{P}\left(\sum_{i=1}^{k+k^{(\alpha+1)/2}}\textrm{Ber}(k^{\alpha-1})>k^{(\alpha+1)/2}\right)
\leq\exp\left\{-\frac{1}{2}k^{(1-\alpha)/2}(1+o(1))\right\}=o(m^{-1/2}),

where the last inequality is due to the Chernoff bound, and where the third step implicitly restricts to the high-probability event of the previous display, whose complement contributes only o(m^{-1/2}).