On the Robustness, Connectivity and Giant Component Size of Random K-out Graphs
Abstract
Random K-out graphs are garnering interest in designing distributed systems including secure sensor networks, anonymous crypto-currency networks, and differentially-private decentralized learning. In these security-critical applications, it is important to model and analyze the resilience of the network to node failures and adversarial captures. Motivated by this, we analyze how the connectivity properties of random K-out graphs vary with the network parameters , the number of nodes (), and the number of nodes that get failed or compromised (). In particular, we study the conditions for achieving connectivity with high probability and for the existence of a giant component with formal guarantees on the size of the largest connected component in terms of the parameters , and . Next, we analyze the property of -robustness which is a stronger property than connectivity and leads to resilient consensus in the presence of malicious nodes. We derive conditions on and under which the random K-out graph achieves r-robustness with high probability. We also provide extensive numerical simulations and compare our results on random K-out graphs with known results on Erdős-Rényi (ER) graphs.
Index Terms:
Connectivity, giant component, robustness, r-robustness, random graphs, random K-out graphs, security, privacyI Introduction
I-A Motivation and Background
In recent years, the rapid proliferation of affordable sensing and computing devices has led to rapid growth in technologies powered by the IoT (Internet of Things). A key challenge in this space is to develop network models for generating a securely connected ad-hoc network in a distributed fashion while minimizing operational costs.
With its unique connectivity properties, a class of random graph models known as the random -out graphs has found many applications in the design of ad-hoc networks. A random K-out graph [3, 4, 5], denoted as , is an undirected graph with nodes where each node forms an edge with distinct nodes chosen uniformly at random. Random K-out graphs are known to achieve connectivity easily, i.e., with far fewer edges as compared to classical random graph models including Erdős-Rényi (ER) graphs [6, 3], random geometric graphs [7], and random key graphs [8], which all require edges for connectivity. In particular, it is known [5, 4] that random K-out graphs are connected with high probability (whp) when . This had led to their deployment in several applications including random key predistribution schemes for secure communication in sensor networks [9, 10, 11], differentially-private distributed averaging algorithms [12], anonymity preserving cryptocurrency networks [13], and distributed secure mapping of network addresses in next-generation internet architectures [14].
In the context of sensor networks, random K-out graphs have been used [5, 11, 15] to analyze the performance of the random pairwise key predistribution scheme and its heterogeneous variants [16, 17]. The random pairwise scheme works as follows. Before deployment, each sensor chooses others uniformly at random. A unique pairwise key is given to each node pair where at least one of them selects the other. After deployment, two sensors can securely communicate if they share a pairwise key. The topology of the sensor network can thus be represented by a random K-out graph; each edge of the random K-out represents a secure communication link between two sensors. Consequently, random K-out graphs have been analyzed to answer key questions on the values of the parameters and needed to achieve certain desired properties, including connectivity at the time of deployment [4, 5], connectivity under link removals [11, 15], and unassailability [18]. Despite many prior works on random K-out graphs, very little is known about its connectivity properties when some of its nodes are removed. This is an increasingly relevant problem since many IoT networks are deployed in remote and hostile environments where nodes may be captured by an adversary, or fail due to harsh conditions.
Another application of random K-out graphs is in distributed learning, where a key goal is to perform computations on user data without compromising the privacy of the users. Random K-out graphs have recently been used to construct the communication graph in a differentially-private federated averaging scheme called the GOPA (GOssip Noise for Private Averaging) protocol [12, Algorithm 1]. According to the GOPA protocol, a random K-out graph is constructed on a set of nodes, of which an unknown subset is dishonest. It was shown in [12, Theorem 3] that the privacy-utility trade-offs achieved by the GOPA protocol are tightly dependent on the subgraph on honest nodes being connected. When the subgraph on honest nodes is not connected, it was shown that the performance of GOPA is tied to the size of the connected components of honest nodes. Since dishonest users can be modeled as randomly deleted nodes, analyzing the connectivity and giant component size of random K-out graphs under node deletions is the key in understanding the performance of the GOPA protocol.
I-B Properties of Interest for Random K-out Graphs in Distributed Computing Applications
In the context of applications discussed in the previous section, we can identify several key properties of random K-out graphs that need to be well-understood for performance evaluation and efficient design of the underlying systems. We believe that the graph properties discussed here can also be useful in facilitating new applications of random K-out graphs in different fields, akin to our recent work [19] paving the way to new applications by establishing connectivity guarantees in the finite node regime.
A key metric in quantifying the utility of a network is connectivity which is defined as the existence of a path of edges between every node pair [20]. Connectivity ensures that all agents in the network can communicate with one another and no node is isolated from the network. In practice, resource constraints limit the number of links that can be established in the network. Thus, a key goal is to design a resiliently connected network while keeping the number of links to be established within operational constraints. Depending on the resource constraints and mission requirements of the application at hand, it may suffice to ensure a weaker notion of connectivity or in cases where agents may routinely fail or compromise, we may even need a stronger notion of connectivity.
In resource-constrained environments, preserving connectivity despite node failures may not be feasible and the goal instead might be to ensure that there is a large enough, connected sub-network of users, also known as a giant component. For example, it may suffice to aggregate the temperature readings of the majority of sensors deployed in a field to get an estimate of the true temperature. Another example is the power grid network where it is essential to ensure supply to the majority of the users in the event of failures.
In addition to ensuring that the network remains resiliently connected in the event of node failures, it is often desirable to ensure that consensus can be achieved even in the presence of adversarial agents. In [21], it was shown that network connectivity is not sufficient to characterize consensus when nodes use a certain class of local filtering rules. In particular, it was shown that consensus can be reached in graphs that have the property of being sufficiently robust. This is formally quantified by the property of -robustness, which was introduced in [21]. A graph is said to be -robust if, for every disjoint subset pair that partitions the graph, at least one node in one of these subsets is adjacent to at least nodes in the other set.
The -robustness property is especially useful in applications of consensus dynamics, where parameters of several agents get aligned after a sufficiently long period of local interactions. In another example, it was shown [22] that if the network is -robust (for some non-negative integer ), then the nodes in the network can reach consensus even when there are up to malicious nodes in the neighborhood of every correctly-behaving node. Thus, -robustness is particularly important for applications based on consensus dynamics in adversarial environments. Moreover, -robustness is known [22] to be a stronger property than -connectivity and thus can provide guarantees on the connectivity of the graph when up to nodes in the graph are removed. Due to this importance, -robustness has been studied in various graph models in the random graph literature, such as the ER graph and the Barabási-Albert model [23]. -robustness has been studied in [24] in our prior work; and the results presented here improve upon those results by providing a sharper threshold for -robustness.
I-C Main Contributions
With these motivations in mind, this paper aims to fill the gaps in the literature on the connectivity and robustness properties of random K-out graphs. We provide a comprehensive set of results on the connectivity and size of the giant component of the random K-out graph when some of its nodes are dishonest, have failed, or have been captured. We further analyze the conditions required for ensuring -robustness of the random K-out graph. Our main contributions are summarized below:
-
1.
Let denote the random graph obtained after removing nodes, selected uniformly at random, from the random K-out graph . We provide a set of conditions for , , and under which is connected with high probability (whp). This is done for both cases where and , respectively. Our result for (see Theorem III.1) significantly improves a prior result [25] on the same problem and leads to a sharp zero-one law for the connectivity of the random K-out graph under node deletions. Our result for the case (see Theorem III.2) expands the existing threshold of required for connectivity by showing that the graph is connected whp for even when nodes are deleted.
-
2.
We derive conditions on , , that lead to a giant component in whp and provide an upper bound on the number of nodes not contained in the giant component. This is also done for both cases and ; see Theorem III.3 and Theorem III.4, respectively. An important consequence of this result is to establish as a sufficient condition to ensure whp the existence of a giant component in the random K-out graph despite the removal of nodes in the network.
-
3.
Using a novel proof technique, we show that when is sufficient to ensure that the random K-out graph is -robust whp (see Theorem III.5). Since it is already known that is not -robust whp when , this result is tight up to at most a multiplicative factor of two (and it is is much tighter than the condition established in [24]).
-
4.
We also provide a comparison of random K-out graphs with ER graphs. We determine that random K-out graphs are much more robust in terms of -robustness property, and also attain connectivity and admit a giant component with fewer edges compared to ER graphs.
-
5.
Combining our theoretical results with numerical simulations, we highlight the usefulness of random K-out graphs as a topology design tool for efficient design of secure, resilient and robust distributed networks.
I-D Organization of the Paper
The rest of this article is organized as follows. In Section II, we introduce the notation used across this article and the network model, the random K-out graph, and extend this model to account for node deletions. In Section III, we present the main results along with the simulation results and provide a detailed discussion. In Section IV, we provide the proof of all Theorems presented in Section III. Conclusions are provided in Section V.
II Notations and Definitions
All random variables are defined on the same probability space and probabilistic statements are given with respect to the probability measure . The complement of an event is denoted by . The cardinality of a discrete set is denoted by . The intersection of events and is denoted by. We refer to any mapping as a scaling if it satisfies the condition . All limits are understood with going to infinity. If the probability of an event tends to one as , we say that it occurs with high probability (whp). The statements , , , , and , used when comparing the asymptotic behavior of sequences , have their meaning in the standard Landau notation. The asymptotic equivalence is used to denote the fact that . Finally, we let denote the mean node degree of a graph.
Definition II.1 (Random K-out Graph)
The random K-out graph is defined on the vertex set as follows. Let denote the set vertex labels. For each , let denote the set of labels, selected uniformly at random, corresponding to the nodes selected by . It is assumed that are mutually independent. Distinct nodes and are adjacent, denoted by if at least one of them picks the other. Namely,
(1) |
Definition II.2 (Cut)
[27, Definition 6.3] For a graph defined on the node set , a cut is a non-empty subset of nodes isolated from the rest of the graph. Namely, is a cut if there is no edge between and . If is a cut, then so is .
Definition II.3 (Connected Components)
A pair of nodes in a graph are said to be connected if there exists a path of edges connecting them. A connected component of is a subgraph in which any two vertices are connected to each other, and no vertex is connected to a node outside of .
Definition II.4 (Giant Component)
For a graph with nodes, a giant component exists if its largest connected component has size . In that case, the largest connected component is referred to as the giant component of the graph.
Definition II.5 (Connectivity)
A graph is connected if there exists a path of edges between every pair of its vertices.
Definition II.6 (-connectivity)
A graph is -connected if it remains connected after the removal of any set of (or, fewer) nodes or edges.
Definition II.7 (-reachable Set)
[21, Definition 6] For a graph and a subset of nodes , we say is -reachable if i, where . In other words, is an -reachable set if it contains a node that has at least neighbors outside .

Definition II.8 (-robust Graph)
[21, Definition 6] A graph is -robust if for every pair of nonempty, disjoint subsets of , at least one of these subset pairs is -reachable, where .
It was shown in [22] that if a graph is -robust, it is at least -connected. Thus, -robustness is a stronger property than -connectivity. It is also easy to see that when , the properties of -robustness and -connectivity are equivalent.
A main goal of this paper is to study the connectivity and giant component size of random K-out graphs when some of its nodes are failed, captured, or dishonest. To this end, we also consider the following model of random K-out graphs under random removal of nodes. We first let denote the number of removed nodes and assume, for simplicity, that they are selected uniformly at random among all nodes in . Further, we let denote the set of deleted nodes. We then define on the vertex set and the corresponding set of labels , such that distinct vertices and (both in ) are adjacent if they were adjacent in ; i.e., if . For each , the set of labels adjacent to node in is denoted by .
III Main Results
Our main results are presented in Theorems III.1- III.5 below. Each Theorem addresses a design question as to how the parameter should be chosen to satisfy the desired property on robustness, connectivity or the size of the giant component; see Table I for a summary of the main results. The results on connectivity and the size of the giant component are for , i.e., the random K-out graph when nodes are deleted, while the result on -robustness is given for the original graph (without any node deletion). We provide the proofs of all results in Section IV.
III-A Results on Connectivity
Let .
Theorem III.1
Let with in , and consider a scaling such that with we have
(2) |
is the threshold function. Then, we have
(3) |
The proof of the one-law in (3), i.e., that if , is given in Section IV. The zero-law of (3), i.e., that if , was established previously in [25, Corollary 3.3]. There, a one-law was also provided: under (2), it was shown that if , leaving a gap between the thresholds of the zero-law and the one-law. Theorem III.1 presented here fills this gap by establishing a tighter one-law, and constitutes a sharp zero-one law; e.g., when , the one-law in [25] is given with , while we show that it suffices to have .
Theorem III.2
Consider a scaling .
a) If , then we have
(4) |
b) If and , and if for some sequence , it holds that
is the threshold function, then we have
(5) |
We remind that random K-out graph is known [4, 5] to be connected whp when . Part of Theorem III.2 extends this result by showing that having is sufficient for the random K-out graph to remain connected whp even when of its nodes (selected randomly) are deleted. We believe that this result will further facilitate the application of random K-out graphs in a wide range of applications where connectivity despite node failures is crucial.
III-B Results on the Size of the Giant Component
Let denote the set of nodes in the largest connected component of and let . Namely, is the probability that less than nodes are outside the largest component of .
Theorem III.3
Let , and . Consider a scaling and let
be the threshold function. Then, we have
We remark that if with , then . This shows that when , it suffices to have for to have a giant component containing nodes for arbitrary . Put differently, by choosing , we ensure that even when nodes are removed, the rest of the network contains a connected component whose fractional size is arbitrarily close to 1.
Theorem III.4
Let with in , and . Consider a scaling and let
be the threshold function. Then, we have
It can be seen from this result that needs to scale as for a random K-out graph to have a giant component of size when of its nodes are removed (or, if each node is independently removed with probability ). We also remark that the threshold is finite when . This shows that even when a positive fraction of the nodes of the random K-out graph are removed, a finite is still sufficient to have a giant component of size in the graph. This result can be useful in applications where it is required to maintain a giant component as efficiently (i.e., with as fewest edges) as possible even when large scale node failures take place.
III-C Result on Robustness
Theorem III.5
Define
Then, we have
In Theorem III.5, we establish a threshold for one-law of -robustness in random K-out graphs, and find that . In other words, we find that with high probability, a random K-out graph is -robust when , and . This threshold is much smaller than the previously established threshold [24] of which scales with . Hence, Theorem III.5 constitutes a sharper one-law for -robustness.
Desired | Minimum needed to | Theorem |
Property | achieve the property | |
Connectivity | Thm. III.1 | |
Connectivity | Thm. III.2(a) | |
Connectivity , | Thm. III.2(b) | |
Giant Component, , | Thm. III.3 | |
Giant Component, , | Thm. III.3 | |
Giant Component, , | Thm. III.4 | |
-robustness | Thm. III.5 |
III-D Discussion
In Theorem III.1, we improve the results given in [25], and with this, we close the gap between the zero law and the one law, and hence establish a sharp zero-one law for connectivity when nodes are deleted from . In Theorem III.2, we establish that the graph with is connected whp when ; and when , is sufficient for connectivity. The latter result is especially important, since is the previously established threshold for connectivity [4]. We improve this result by showing that the graph is still connected with even after nodes (selected randomly) are deleted.
Since most distributed systems require connectivity in the event of node failures, our results can be useful in many applications of distributed systems, particularly when the resources on each node is limited and it is critical to achieve desired connectivity and robustness properties using as few edges as possible. For example, in wireless sensor networks, knowing the minimum conditions needed for connectivity or giant component size under such failures is crucial as it enables designing them with fewest edges possible per node [18, 28], which reduces the communication overhead and potentially the cost of the hardware on each node.
We also note that Theorems III.3 - III.4 constitute the first results concerning the giant component size of random K-out graphs under randomly deleted nodes. In particular, these results help choose the value of for any anticipated level of node failure and for any given giant component size required, enabling the designs of distributed systems to compromise between efficiency, robustness, and the required giant component size. Thus, we expect these results to be useful in applications where connectivity is not a stringent condition under node failures, and instead having a certain giant component size is sufficient to continue the operation of the system.
In Theorem III.5, we establish that a random K-out graph is -robust whp when for any . This is a much sharper one-law than the previous result given in [24] where it was shown that needs to scale as for -robustness. This tighter result was made possible through several novel steps introduced here. While the proofs in prior work [29, 24] also rely on finding upper bounds on the probability of having at least one subset that is not -reachable, they tend to utilize standard upper bounds for the binomial coefficients and a union bound to establish them. Instead, our proof uses extensively the Beta function and its properties to obtain tighter upper bounds on such probabilities, which then enables us to establish a much sharper one-law for -robustness of random K-out graphs. We believe this result will pave the way for further applications of random K-out graphs in distributed computing applications such as the design of consensus networks in the presence of adversaries.
It is also of interest to compare the threshold of -robustness and -connectivity in random K-out graphs. For Erdős-Rényi graphs, the threshold for -connectivity and -robustness have been shown [29] to coincide with each other. For random K-out graphs, we know from [4] that is -connected whp whenever , and it is not -connected whp if . This leaves a factor of 2 difference between the condition we established for -robustness here and the threshold of -connectivity. Put differently, we know from [4] and Theorem III.5 that for any
(6) |
For , it is instead known that if only if . Since the currently established conditions for the zero-law and one-law of -robustness are not the same for random K-out graphs (unlike ER graphs where the two thresholds coincide), there is a question as to whether our threshold of is the tightest possible for -robustness. This is currently an open problem and would be an interesting direction for future work, e.g., by establishing a tighter zero-law for -robustness that coincides with the one-law of Theorem III.5. Comparison
To put all these results in perspective, we provide comparisons of our results with the results from an Erdős-Rényi graph , which is one of the most commonly used random graph models. Firstly, in terms of -robustness, it was shown in [29] that ER graph G(n; p) is -robust whp if , which translates to a mean node degree of . Since the mean node degree required for random K-out graphs scales as , we can conclude that for large , random K-out graphs can be ensured to be -robust whp at a mean node degree significantly smaller than the mean node degree required for an ER graph. In terms of connectivity, an ER graph becomes connected whp if [30], and this translates to having a mean node degree of . Similarly, when nodes are removed, the mean node degree required for connectivity scales with [6]. The required for the random K-out graph to be connected whp is much lower, with when nodes are removed, and when nodes are removed.


Next, since we are not aware of any theoretical results on the giant component size of ER graphs under node removals, we compare the size of the giant component in random K-out graphs and ER graphs under node removals through simulations in the finite node regime. We examine the maximum number of nodes outside the giant component out of 1000 experiments of a random K-out graph and an ER graph with the same mean node degree when nodes are removed from both graphs. To ensure that both graphs have the same mean node degree, in the ER graph is selected as . The results are given in Fig. 2 for , on (Left), and , on (Right). As can be seen, the random K-out graph tends to have fewer nodes outside of the giant component than the ER graph and this difference is more pronounced when is smaller.
In conclusion, we see that when both graphs have the same mean node degree, random K-out graphs are more robust than ER graphs in terms of connectivity and giant component size under random node removal, and also in terms of the -robustness property. This reinforces the efficiency of the K-out construction in many distributed computing applications where connectivity in the event of node failures or adversarial capture of nodes is crucial. Similarly, the fact that random K-out graphs tend to achieve -robustness with fewer edges per node than ER graphs (for any ), makes it more suitable in applications based on distributed consensus.
III-E Simulation Results
Since our results are asymptotic in nature, i.e., they have been established in the limit , an important question is whether they can also be useful in practical settings where the number of nodes is finite. We check the usefulness to validate Theorems III.1 - III.4 under practical settings, To answer this, we examine the probability of connectivity and the number of nodes outside the giant component for the graph (random K-out graph with deleted nodes) through computer simulations in two different setups111Determining whether a graph is -robust is a co-NP-complete problem [29] making it not feasible to check the usefulness of Theorem III.5 through computer simulations..
In the first setup, we consider the case where the number of deleted nodes, , with in . We generate instantiations of the random graph with , varying in the interval and consider several values in the interval . Then, we record the empirical probability of connectivity of the graph and from 1000 independent experiments for each pair. The results of this experiment are shown in Fig. 3 and Fig. 4. Fig. 3 (Left) depicts the empirical probability of connectivity of . The vertical lines stand for the critical threshold of connectivity obtained from Theorem III.1. In each curve, exhibits a threshold behaviour as increases, and the transition from to takes place around , the threshold established in (2), reinforcing the usefulness of Theorem III.1 under practical settings.
In Fig. 4, the maximum number of nodes outside the giant component in 1000 experiments is plotted for each parameter pair. For comparison, we also plot the upper bound on obtained from Theorem III.4 by taking the maximum value that gives a threshold less than or equal to the value tested in the simulation. As can be seen, for any and value, the experimental maximum number of nodes outside the giant component is smaller than the upper bound obtained from Theorem III.4, validating the usefulness of this result in the finite node regime.






The goal of the second experimental setup is to examine the case where the number of deleted nodes is . As before, we generate instantiations of the random graph , with , varying in , and varying in . For each pair, the maximum number of nodes outside the giant component in 1000 experiments is recorded; if no node is outside the giant component, then it is understood that the graph is connected. In Fig. 3 (Right), the maximum number of nodes outside the giant component observed in experiments is depicted as a function of . The plots for and are considered to represent the case in Theorem III.2(a). We see that there is at most one node outside the giant component for and , even when . This shows that the asymptotic behavior given in Theorem III.2(a), i.e., that random K-out graph remains connected if nodes are deleted, already appears when . The plots for and are used to check the case and in Theorem III.2(b). The thresholds on for these values, obtained using Theorem III.2(b) are and , rounded to two digits after decimal (the term in Theorem III.2(b) is ignored due to having a finite value in the simulations). It is clear from the plot that when , the graph with is connected, while suffices to ensure connectivity when . Thus, selecting above the theoretical thresholds given in III.2(b) is seen to ensure the connectivity of the graph in the finite node regime as well, supporting the usefulness of Theorem III.2(b) in practical cases.
In Fig. 5, the maximum number of nodes outside the giant component in 1000 experiments is plotted as a function of . For comparison, we also plot the upper bound on obtained from Theorem III.4. In particular, for each Theorem, the maximum value that gives a threshold less than or equal to the value tested in the simulation is found. Then, the lowest of these maximum values is used as the theoretical value. As can be seen, for any and value, the experimental maximum number of nodes outside the giant component is smaller than the upper bounds obtained from Theorem III.4, reinforcing the usefulness of our results in the finite node regime.
IV Proofs of Main Results
In this section, we provide the proof of all Theorems presented in Section III.
IV-A Preliminary Steps for Proving Theorems III.1 - III.4
Since the preliminary steps related to the proofs of III.1 - III.4 are the same, in this Section we present these steps. First, recall from Section II that the metrics connectivity and the size of the giant component under node removals are defined for the graph , where the set of nodes is removed from the graph ; also recall that . Let denote the set of labels of the vertex set of and let denote the event that is a cut in as per Definition II.2. The event occurs if no nodes in pick neighbors in , and no nodes in pick neighbors in . Note that nodes in or can still pick neighbors in the set . Thus, we have
Let denote the event that has no cut with size where is a sequence such that . In other words, is the event that there are no cuts in whose size falls in the range .
Lemma IV.1
[17, Lemma 4.3] For any sequence such that for all , we have
(7) |
Lemma IV.1 states that if the event holds, then the size of the largest connected component of is greater than ; i.e., there are less than nodes outside of the giant component of . Also note that is connected if takes place with , since a graph is connected if no node is outside the giant component. In order to establish the Theorems III.1-III.4., we need to show that with , and values as stated in each Theorem. From the definition of , we have
where is the collection of all non-empty subsets of . Complementing both sides and using the union bound, we get
(8) |
where denotes the collection of all subsets of with exactly elements. For each , we can simplify the notation by denoting . From the exchangeability of the node labels and associated random variables, we have
, since there are subsets of with r elements. Thus, we have
Substituting this into (8), we obtain
(9) |
Remember that is the event that the nodes in and nodes in do not pick each other, but they can pick nodes from the set . Thus, we have:
(10) |
Abbreviating as , we get from (9) that
(11) |
Using the upper bound on binomials (14) again, we have
(12) |
In order to establish the Theorems, we need to show that (12) goes to zero in the limit of large n for , and values as specified in each Theorem.
Since they will be referred to frequently throughout the proofs, we also include here the following standard bounds.
(13) |
(14) |
IV-B A Proof of Theorem III.1
Recall that in Theorem III.1, we have with and that we need for connectivity. Using (13) in (12), we have
We will show that the right side of the above expression goes to zero as goes to infinity. Let
We write
and show that both and go to zero as . We start with the first summation .
Next, assume as in the statement of Theorem III.1 that
(15) |
for some sequence such that with . Also define
where we substituted via (15) and used the fact that . Taking the limit as and recalling that , we see that . Hence, for large , we have
(16) |
where the geometric sum converges by virtue of . Using this, it is clear that .
Now, consider the second summation .
(18) | ||||
(20) |
Next, we define
(21) | ||||
(22) |
where we substituted for via (15). Taking the limit as we see that upon noting that and . With arguments similar to those used in the case of , we can show that when is large, , leading to converging to zero as gets large. With , and both and converging to zero when is large, we establish the fact that converges to zero as goes to infinity. This result also yields the desired conclusion in Theorem III.1 since .
IV-C A Proof of Theorem III.2
We will first start with part (a) of Theorem III.2.
Part a) Recall that in part (a), and we need for connectivity. Using this and (13) in (12), we get
We will show that the right side of the above expression goes to zero as goes to infinity. Let
We write
and show that both and go to zero as . We start with the first summation .
Next, assume as in the statement of Theorem III.2(a) that
. Also define
Taking the limit as and recalling that , we see that . Hence, for large , we have
(23) |
where the geometric sum converges by virtue of . Using this once again, it is clear from the last expression that .
Now, consider the second summation .
Again assume as in the statement of Theorem III.2(a) that . Next, we define=
Taking the limit as and recalling that , we see that . Hence, for large , we have
(24) |
where the geometric sum converges by virtue of . Using this once again, it is clear from the last expression that . With , and both and converging to zero when is large, we establish the fact that converges to zero as goes to infinity. This result also yields the desired conclusion in Theorem III.2(a) since .
Part b) We now continue with the proof of Theorem III.2(b). Recall that we had and . Using this and (13) in (12), we get
(26) | ||||
(28) |
Next, assume as in the statement of Theorem III.2(b) that
(29) |
Since , and noting that in (28), we have
(30) |
Using this, we get
(32) |
Next, define
(33) |
Recall that , so we have . Using this, and substituting via (29), we get
(34) |
Hence, for large , we have
(35) |
where the geometric sum converges by virtue of . Using this, it is clear from the last expression that . This result also yields the desired conclusion in Theorem III.2(b) since . This result, combined with the proof of part a, concludes the proof of Theorem III.2.
IV-D A Proof of Theorem III.3
Recall that in Theorem III.3, we have and . Using (13) in (12), we have
Next, assume as in the statement of Theorem III.3 that
(36) |
Since , we have
We will show that the right side of the above expression goes to zero as goes to infinity. Let
Recall that , so we have . Using this, and substituting for via (36), we get
(37) |
Taking the limit , we have
(38) |
Hence, for large , we have
(39) |
where the geometric sum converges by virtue of and . Using this, it is clear from the last expression that . This result also yields the desired conclusion in Theorem III.3 since . This concludes the proof of Theorem III.3.
IV-E A Proof of Theorem III.4
Recall that in Theorem III.4, we have with in , and . Using in (12), we get
(41) |
Next, assume as in the statement of Theorem III.4 that
Since , we have
(43) |
Also define
where we substituted via (IV-E). Taking the limit as , we see that . Hence, for large n, we have
(44) |
where the geometric sum converges by virtue of and . Using this, it is clear from the last expression that . This result also yields the desired conclusion in Theorem III.4 since . This concludes the proof of Theorem III.4.
IV-F Preliminaries Needed in the Proof of Theorem III.5
We start with a few definitions and properties that will be useful throughout the rest of the proof. First, let denote the beta function, denote the incomplete beta function, and denote the regularized incomplete beta function, where and are non-negative integers. These functions are defined as follows [31]:
(45) |
Using these definitions, it can easily be shown that
(46) |
Proof: where we divided the integral into two parts since the function is symmetric around 1/2. Using the fact that , we can conclude that .
The cumulative distribution function of a Binomial random variable can be expressed using the regularized incomplete beta function as:
(47) |
Lemma IV.2
[31, Eq. 8.17.4]: For ,
(48) |
Lemma IV.3
[31, Eq. 8.17.20]: For ,
(49) |
Lemma IV.4
The equation has only one solution when , , and .
Proof: First, is a solution to this equation since when , both and are zero. Also, when , is a solution of the equation since . The derivative of both terms with respect to is:
(50) |
It can be seen that the derivative of is a constant, and when and is monotone increasing in the range . For the case where ; and is a solution to the equation, hence for some that satisfies , it must hold that when , and when . This is because if such such that does not exist, can’t be a solution to the equation . Now, considering the case for arbitrary , since is monotone increasing, there can only be one such that . This means that is increasing faster than in the region , hence there can’t be a solution to in this region. Further, is increasing faster than in the region , hence there can be at most one solution to the equation in the region . Now, consider the fact that , and when . Combining this with the fact that both functions are continuous, there must be at least one solution to the equation for in the range . Combining this with previous statement (that there can be at most one solution in this range), it can be concluded that there is only one solution to the equation in the range where .
IV-G Proof of Theorem III.5
To prove Theorem III.5, we need to show that for any , the random K-out graph is -robust whp if . To do this, similar to the proof given in [29] for Erdős-Rényi graphs, we will first find an upper bound on the probability of a subset of given size being not -reachable, and then use this result to show that the probability of not being -robust goes to zero when and . Different from the prior work which relied on the commonly used upper bounds for the binomial coefficients and the union bound [29, 24] to bound the probability of a subset of given size being not -reachable, our proof uses the Beta function and its properties described in the previous Section to achieve tighter bounds. This in turn enables us to establish a tighter threshold for the -robustness of random K-out graphs than what was previously possible; e.g., see [24].
First, let denote the event that is an -reachable set as per Definition II.7. The event occurs if there exists at least one node in that is adjacent to at least nodes in , the subset comprised of nodes outside the subset . Thus, we have
with , denoting the set of labels of the vertices in and , respectively, and denoting the indicator function. We are also interested in the complement of this event, denoted as , which occurs if all nodes in are adjacent to less than r nodes in . This can be written as
Note that at least one subset in every disjoint subset pairs needs to be -reachable per the definition of -robustness, hence one of the events or need to hold with high probability for every disjoint subset pairs of . Now, let denote the event that both subsets in at least one of the disjoint subset pairs are -reachable. Thus, we have
where , is the collection of all non-empty subsets of and since for each we check the -reachability of both and , the condition is used to prevent counting each subset twice. From this defintion, it can be seen that the graph is -robust if the event does not occur. Using union bound, we get
(51) |
where denotes the collection of all subsets of with exactly elements, and let be a subset of the vertex set with size , i.e. and . Further, is abbreviated as . From the exchangeability of the node labels and associated random variables, we have
(52) |
since , as there are subsets of with elements. Substituting this into (51), we obtain
Before evaluating this expression, we will start with evaluating the probability that the set is not -reachable, abbreviated as . Since a node can have neighbors in if it forms an edge with nodes in or if nodes in forms edges with node , let denote the probability that all nodes form an edge with less than nodes in , and let denote the probability that for each node , nodes in form less than edges with them. Evidently, . Further, let denote the probability that a node forms an edge with less than nodes in , and let denote the probability that nodes in form less than edges with the node .
Lemma IV.5
The probability that the node chooses less than nodes in the set , denoted as , can be upper bounded by the cumulative distribution function of a binomial random variable with trials and success probability .
Proof: For node , after making one selection, the number of nodes available to choose from decreases so the probability of choosing a node in changes at each selection. For example, the probability of choosing a node in in the first selection is and the probability of choosing a node in in the second selection is if a node in was selected in the first selection and it is otherwise. Based on this, the probability of selecting a node in at the selection out of selections can be expressed as , , where denotes the number of nodes already chosen from the set before the selection. Since we are considering the case of choosing less than nodes in , we have that , and with this constraint the lowest possible value of occurs when and , and hence it is . This gives a lower bound on the probability of selecting a node in in one of the selections and at the same time is an upper bound on the probability of not selecting a node in in one of the selections, and hence it is an upper bound for choosing less than nodes.
Next, using this upper bound, we plug in and to (47), then we have
(53) |
The selections of each node in are independent, hence we can use .
In order to find , a node in forming an edge with the node can be modeled as a Bernoulli trial with probability so the event that nodes in forming less than edges with the node can be represented by a Binomial model with trials and . Hence,
(54) |
Since the nodes in forming edges with nodes in are not independent of the other nodes in , we cannot write . To find , we will decompose it into the following conditional probabilities.
(55) |
where represent all the nodes in , and is used to denote the number of nodes in that form an edge with the node . To find an upper bound on , we consider the worst case. In the worst case, all the preceding nodes are selected by nodes in exactly times. This reduces the number of available edges in that can make connections with the remaining nodes in , hence increases the probability of nodes in forming less than edges with the remaining nodes in . Hence, we can write:
(56) |
Consider the general case for where . Assume that nodes in formed an edge with only one node among the nodes . Similarly, assume nodes in formed an edge with only two nodes, and so on ( nodes in formed edges with nodes in ). Also define . It can be seen that . Here, we have nodes that did not use any of its selections yet, so their probability of choosing the node is . Similarly, we have nodes that used one of its selections, so their probability is (for nodes ).
Considering the selection of each node in as a Bernoulli trial with different probabilities, the collection of all such trials defines a Poisson Binomial distribution . The mean of this distribution is:
(57) |
Assuming and using , we have:
(58) |
Since the mode of the Poisson Binomial distribution satisfies [32], we have for any if . Since for any distribution, the cumulative distribution function is equal to 1/2 at the mode of distribution, we have that . Hence,
(59) |
Now, using the fact that , we have:
Let . Then, we have:
(60) |
We will divide the summation into three parts as follows:
(61) |
where is the solution to the equation in the range . (The purpose of defining this way will be given later in the proof.) Start with the first summation and use (14) along with , then we have:
For , since and are finite values, we have by virtue of . Using this, we can express the summation as:
(62) |
where the geometric sum converges by virtue of , leading to converging to zero for large .
Now, similarly consider the second summation . Using , we have
(63) |
where . Assume that . It can be shown that as a consequence of the property (49). Hence, we have
(64) |
Assume that is the solution of the equation in the range . From Lemma IV.4, we know that this equation has only one solution in this range, and for . Plugging in , we have , which leads to when . Denoting , we have
(65) |
where the geometric sum converges by virtue of , leading to converging to zero as gets large.
Now, similarly consider the third summation . Assuming , we have
(66) |
From the previous summation and the proof of Lemma IV.4, we know that in the range ; is decreasing, is increasing, and the overall expression is also increasing. Hence, takes its maximum value when . Since is an upper bound of , the expression also takes its maximum value when . Denoting , and using Stirling’s approximation as a finer upper bound, we have
(67) |
From this, we have:
(68) |
where the sum converges by virtue of since in this case, leading to converging to zero as gets large. Since , , and all converge to zero as gets large, also converges to zero as gets large. This concludes the proof of Theorem III.5.
V Conclusion
In this paper, we provide a comprehensive set of results on the -robustness of the random K-out graph , and the connectivity and giant component size of , i.e., random K-out graph with (randomly selected) nodes deleted. In addition to providing proofs of our results, we include computer simulations to validate our results in the finite node regime. To demonstrate the usefulness of the random K-out graphs, we compare our results on the random K-out graphs with results from ER graphs under similar settings, and determine that random K-out graphs attain -robustness, connectivity or the occurrence of a giant component of a given size at a significantly lower mean node degree value compared to ER graphs. These results reinforce the usefulness of random K-out graphs in applications that require a certain degree of robustness or tolerance to nodes failing, being captured, or being dishonest; such as federated learning, consensus dynamics, distributed averaging and wireless sensor networks.
References
- [1] E. C. Elumar, M. Sood, and O. Yağan, “On the connectivity and giant component size of random k-out graphs under randomly deleted nodes,” in 2021 IEEE International Symposium on Information Theory (ISIT), 2021, pp. 2572–2577.
- [2] E. C. Elumar and O. Yağan, “Analyzing r-robustness of random k-out graphs for the design of robust networks,” in ICC 2023 - IEEE International Conference on Communications, 2023, pp. 5946–5952.
- [3] B. Bollobás, Random graphs. Cambridge university press, 2001.
- [4] T. I. Fenner and A. M. Frieze, “On the connectivity of random -orientable graphs and digraphs,” Combinatorica, vol. 2, no. 4, pp. 347–359, Dec 1982.
- [5] O. Yağan and A. M. Makowski, “On the connectivity of sensor networks under random pairwise key predistribution,” IEEE Transactions on Information Theory, vol. 59, no. 9, pp. 5754–5762, Sept 2013.
- [6] P. Erdős and A. Rényi, “On the strength of connectedness of random graphs,” Acta Math. Acad. Sci. Hungar, pp. 261–267, 1961.
- [7] M. Penrose, Random geometric graphs. OUP Oxford, 2003, vol. 5.
- [8] O. Yağan and A. M. Makowski, “Zero–one laws for connectivity in random key graphs,” IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 2983–2999, 2012.
- [9] M. Sood and O. Yağan, “k-connectivity in random graphs induced by pairwise key predistribution schemes,” in 2020 IEEE International Symposium on Information Theory (ISIT), 2020, pp. 1343–1348.
- [10] M. Sood and O. Yağan, “Towards k-connectivity in heterogeneous sensor networks under pairwise key predistribution,” in 2019 IEEE Global Communications Conference (GLOBECOM), 2019, pp. 1–6.
- [11] O. Yağan and A. M. Makowski, “Modeling the pairwise key predistribution scheme in the presence of unreliable links,” IEEE Transactions on Information Theory, vol. 59, no. 3, pp. 1740–1760, March 2013.
- [12] C. Sabater, A. Bellet, and J. Ramon, “Distributed differentially private averaging with improved utility and robustness to malicious parties,” 2020.
- [13] G. Fanti, S. B. Venkatakrishnan, S. Bakshi, B. Denby, S. Bhargava, A. Miller, and P. Viswanath, “Dandelion++: Lightweight cryptocurrency networking with formal anonymity guarantees,” Proc. ACM Meas. Anal. Comput. Syst., vol. 2, no. 2, pp. 29:1–29:35, Jun. 2018.
- [14] S. Sridhara, F. Wirz, J. de Ruiter, C. Schutijser, M. Legner, and A. Perrig, “Global distributed secure mapping of network addresses,” in Proceedings of the ACM SIGCOMM Workshop on Technologies, Applications, and Uses of a Responsible Internet (TAURIN), 2021.
- [15] F. Yavuz, J. Zhao, O. Yağan, and V. Gligor, “Designing secure and reliable wireless sensor networks under a pairwise key predistribution scheme,” in 2015 IEEE International Conference on Communications (ICC). IEEE, 2015, pp. 6277–6283.
- [16] R. Eletreby and O. Yağan, “Connectivity of inhomogeneous random K-out graphs,” IEEE Transactions on Information Theory, vol. 66, no. 11, pp. 7067–7080, 2020.
- [17] M. Sood and O. Yağan, “On the size of the giant component in inhomogeneous random k-out graphs,” in 2020 59th IEEE Conference on Decision and Control (CDC), 2020, pp. 5592–5597.
- [18] O. Yağan and A. M. Makowski, “Wireless sensor networks under the random pairwise key predistribution scheme: Can resiliency be achieved with small key rings?” IEEE/ACM Transactions on Networking, vol. 24, no. 6, pp. 3383–3396, 2016.
- [19] M. Sood and O. Yağan, “Tight bounds for the probability of connectivity in random k-out graphs,” in ICC 2021-IEEE International Conference on Communications. IEEE, 2021, pp. 1–6.
- [20] R. Merris, Graph theory, ser. Wiley-Interscience series in discrete mathematics and optimization. New York: John Wiley, 2001.
- [21] H. Zhang and S. Sundaram, “Robustness of information diffusion algorithms to locally bounded adversaries,” in 2012 American Control Conference (ACC), 2012, pp. 5855–5861.
- [22] H. J. LeBlanc, H. Zhang, X. Koutsoukos, and S. Sundaram, “Resilient asymptotic consensus in robust networks,” IEEE Journal on Selected Areas in Communications, vol. 31, no. 4, pp. 766–781, 2013.
- [23] H. Zhang and S. Sundaram, “Robustness of complex networks with implications for consensus and contagion,” in Decision and Control (CDC), 2012 IEEE 51st Annual Conference on. IEEE, 2012, pp. 3426–3432.
- [24] E. C. Elumar and O. Yağan, “Robustness of random k-out graphs,” in 2021 60th IEEE Conference on Decision and Control (CDC), 2021, pp. 5526–5531.
- [25] O. Yağan and A. M. Makowski, “On the scalability of the random pairwise key predistribution scheme: Gradual deployment and key ring sizes,” Performance Evaluation, vol. 70, no. 7, pp. 493 – 512, 2013.
- [26] A. Frieze and M. Karoński, Introduction to random graphs. Cambridge University Press, 2016.
- [27] A. Mei, A. Panconesi, and J. Radhakrishnan, “Unassailable sensor networks,” in Proc. of the 4th International Conference on Security and Privacy in Communication Netowrks, ser. SecureComm ’08. New York, NY, USA: ACM, 2008.
- [28] R. Di Pietro, L. V. Mancini, A. Mei, A. Panconesi, and J. Radhakrishnan, “Redoubtable sensor networks,” ACM Transactions on Information and System Security (TISSEC), vol. 11, no. 3, pp. 1–22, 2008.
- [29] H. Zhang, E. Fata, and S. Sundaram, “A notion of robustness in complex networks,” IEEE Transactions on Control of Network Systems, vol. 2, no. 3, pp. 310–320, 2015.
- [30] P. Erdős and A. Rényi, “On the evolution of random graphs,” Publ. Math. Inst. Hung. Acad. Sci, vol. 5, no. 1, pp. 17–60, 1960.
- [31] “NIST Digital Library of Mathematical Functions,” http://dlmf.nist.gov/8.17, Release 1.1.2 of 2021-06-15, f. W. J. Olver, A. B. Olde Daalhuis, D. W. Lozier, B. I. Schneider, R. F. Boisvert, C. W. Clark, B. R. Miller, B. V. Saunders, H. S. Cohl, and M. A. McClain, eds. [Online]. Available: http://dlmf.nist.gov/
- [32] W. Hoeffding, “On the Distribution of the Number of Successes in Independent Trials,” The Annals of Mathematical Statistics, vol. 27, no. 3, pp. 713 – 721, 1956. [Online]. Available: https://doi.org/10.1214/aoms/1177728178