Maximal persistence in random clique complexes
Abstract.
We study the persistent homology of an Erdős–Rényi random clique complex filtration on vertices. Here, each edge appears at a time chosen uniform randomly in the interval, and the persistence of a cycle is defined as , where and are the birth and death times of the cycle respectively. We show that for fixed , with high probability the maximal persistence of a -cycle is of order roughly . These results are in sharp contrast with the random geometric setting where earlier work by Bobrowski, Kahle, and Skraba shows that for random Čech and Vietoris–Rips filtrations, the maximal persistence of a -cycle is much smaller, of order .
1. Introduction
Recently, the topology of random simplicial complexes has been an active area of study — see, for example, the surveys [14] and [3]. This study has had applications in topological data analysis, including in neuroscience [11]. We will assume that the reader is familiar with the notions of persistent homology and a persistence diagram [9]. In topological inference, one sometimes considers points far from the diagonal in the persistence diagram to be representing “signal” and points near the diagonal as representing “noise”.
With this in mind, Bobrowski, Kahle, and Skraba studied maximally persistent cycles in random geometric complexes in [4]. They showed that with high probability the maximal persistence of a -dimensional cycle in a random geometric complex in is . We write if and grow at the same rate in the sense that there exist constants such that for all large enough . Both the Vietoris–Rips and Čech filtrations have an underlying parameter . Persistence of a cycle is measured multiplicatively as where and are the birth and death radius.
Our main result is that for fixed , the maximal persistence of -cycles in a random clique complex filtration is of order roughly . A precise statement is given in Theorem 4.1. Here we measure the persistence of a cycle as , where and are the birth and death edge probabilities, respectively.
The comparison between the Erdős–Rényi and random geometric settings may be more apparent if we renormalize so that the persistence of the associated filtrations can be measured on the same scale. One natural way to try to do this is to reconsider maximal persistence in random geometric complexes, using instead birth and death edge probability rather than radius.
The edge probability in a random geometric complex is of order . So if
then
As long as is fixed, is still much smaller than the maximal persistence of cycles in the random clique complex. However, this parameterization makes it clear that if grows, we expect cycles to persist for longer. It is known that random geometric graphs in dimension growing quickly enough converge in total variation distance to Erdős-Rényi random graphs, and this connection has been further explored and quantified in a number of recent papers — see, for example [7, 6, 18]. From this point of view, our main result can be seen as a “curse of dimensionality” for topological inference—as the ambient dimension gets bigger, noisy cycles persist for much longer.
2. Topology of random clique complexes
In this section, we review the definition of the random clique complex, and briefly survey the literature on topology of random clique complexes.
The following random graph is sometimes called the Erdős–Rényi model.
Definition 2.1.
For and , is the probability space of all simplicial graphs on vertices where every edge is included with probability , jointly independently.
We use the notation to indicate that a graph is chosen according to this distribution. We say that has a given property with high probability (w.h.p.) if the probability that has the property tends to one as .
We define to be the clique complex of the Erdős–Rényi random graph . We write to indicate that is a random simplicial complex chosen according to this distribution.
In [12], Kahle studied the topology of random clique complexes, and the following theorem roughly identified the threshold for homology to appear.
Theorem 2.2.
Let be the random clique complex, and assume is fixed.
-
(1)
If with , then w.h.p. . On the other hand,
-
(2)
If then w.h.p. .
We use the notation to indicate . Theorem 2.2 shows that the threshold for the appearance of th homology is roughly . The following is the main result of a later paper, [13], showing that the threshold for the vanishing of th homology is roughly .
Theorem 2.3.
Let and be fixed, and . If
then w.h.p. .
Theorem 2.3 describes a sharp threshold for cohomology to vanish, in the same spirit as in Linial and Meshulam’s work [15]. These theorems all describe high-dimensional cohomological generalizations of the Erdős–Rényi theorem on the threshold for connectivity of [10]. By universal coefficients, since we are working over a field, the results hold for homology with coefficients as well as cohomology.
Theorem 2.4 (Malen, 2019).
Let be fixed and . If with , then w.h.p. collapses onto a subcomplex of dimension at most .
This implies, in particular, that is torsion-free, so this represents an important step toward the “bouquet-of-spheres conjecture” described in [12] and [13].
Newman recently refined Malen’s collapsing argument to give a probabilistic refinement [17].
Theorem 2.5 (Newman, 2021).
Let be fixed and . If
then w.h.p. collapses onto a subcomplex of dimension at most .
In summary, earlier results show that there is one threshold where homology is born for the first time, when , and another where homology dies for the last time, when . Our main result is that there exist cycles that persist for nearly the entire interval of nontrivial homology.
3. The second moment method
We briefly review the second moment method, i.e. the use of Chebyshev’s inequality, which is our main probabilistic tool.
Theorem 3.1 (Chebyshev’s Inequality).
For any ,
Where is the expectation and is the variance.
The variance is defined by
If can be written as a sum of indicator random variables , then the following is easy to derive and its proof appears, for example, in Chapter 4 of Alon and Spencer’s book [1].
It follows from Chebyshev’s inequality 3.1 that if and
then w.h.p. In fact, w.h.p., meaning that in probability.
Finally, we note that if are are indicator random variables for events , we have that
Here denotes the event that both and occur.
4. Main result and proof
We consider the random graph as a stochastic process, as follows. Consider the random filtration of the complete graph where each edge appears at time , chosen uniform randomly in the interval . Similarly, the random clique complex is a random filtration of the simplex on vertices .
We assume the reader is familiar with persistent diagrams [9, 8]. A point in the persistence diagram for with coefficients and represents a -dimensional cycle with birth time and death time . We measure the persistence of that cycle multiplicatively, as . Define
where the maximum is taken over all points in the persistence diagram for homology in degree .
An equivalent definition is the following. Consider the natural inclusion map , where . For every , there is an induced map on homology . Then
where the maximum is taken as and range over all values with .
Our main result is the following.
Theorem 4.1.
For fixed and ,
with high probability.
Equivalently, if
then converges in probability to
Our results are actually slightly sharper then this. A slightly sharper statement is the following.
Suppose that
and
Then w.h.p.
So then our results are sharp, up to a small power of .
Since Theorem 2.3 is only known to hold for coefficients (or any field of characteristic zero), our upper bounds on maximal persistence only apply for persistent homology with coefficients. For lower bounds, we use the second moment method to prove the existence of cycles that persist nearly as long as possible, and these lower bounds hold for arbitrary field coefficients.
Proof.
First we prove an upper bound on .
Suppose that . By Theorem 2.5, we have that w.h.p. Now let , and suppose that
By Theorem 2.3, we have that . So we have that
Set
and let be any function such that . We have showed that
Most of our work is in proving a lower bound for . We focus our attention on a particular type of nontrivial -cycle, namely simplicial spheres which are combinatorially isomorphic to cross-polytope boundaries.
In the following, let and denote distinct subsets of vertices. That is, we write and suppose that with . A notation we can use for this is
Suppose that , where . Recall that every vertex is an element of , so they come with a natural ordering. We use and to denote adjacency and non-adjacency of vertices and . For any choice of , we say that is a special persistent cycle in the random clique complex filtration if
-
(1)
, , and for every at time ,
-
(2)
for every at time , and
-
(3)
have no common neighbors outside of vertex set at time .
Condition (1) implies that spans a -dimensional cycle at time , namely a cycle that is combinatorially equivalent to the boundary of -dimensional cross-polytope. Conditions (2) and (3) imply that is still not a boundary of anything, even at time . So then it is not only a nontrivial cycle at time , but it persists at least until time . Note that condition (2) already implies that have no common neighbor within vertex set . So condition (3) implies that they have no common neighbor at all, and then is a maximal -dimensional face.
Let be the number of special persistent cycles. We want to show that , which in turn will imply that with high probability. In the following, we will assume whenever necessary that
In particular, we assume that and .
Let be the event that the set of vertices in form a special persistent cycle, and let be its indicator random variable for this event. Then we can write
where the sum is taken over all subsets of size .
By edge independence, the probability of condition (1) is , and the probability of condition (2) is , the probability of condition (3) is . Moreover, these events are independent since they involve disjoint sets of edges. So we have
By linearity of expectation,
since . Since we also assume that , we have . By Chebyshev’s inequality, if we show that , then w.h.p.
We have the standard inequality
We recall that
We always have
and we note the simpler estimate
since is fixed and .
Let . In estimating , we consider cases depending on the value of .
Case I:
First, consider . It might be tempting to believe that in the case that , and are independent sets, so the covariance is zero, but unfortunately this is not the case. Conditions (1) and (2) for a special persistent cycle only depend on adjacency between vertices within the -set, but condition (3) depends on connections with the rest of the graph and these are not independent.
Nevertheless, we still have in this case
as follows.
The term is the probability of condition (1) holding for both vertex sets and . So this is also an upper bound on the probability of conditions (1), (2), and (3) holding for both vertex sets. For a lower bound on , we consider a slightly smaller event, slightly simpler but whose probability is of the same order of magnitude.
Let , where , as before. Similarly, let , where .
The event is defined as follows.
-
(1)
We have , , and , , , and for every at time . That is, condition (1) holds for both and . Some edges may be listed more than once if and overlap. This does not happen when but these are the cases we consider below.
-
(2)
Besides the edges that appear in the previous condition other edges occur between vertices in vertex set , at time . This happens with probability .
-
(3)
Neither nor has any mutual neighbors outside of vertex set , at time . The probability of this condition being satisfied can be bounded below by a union bound by , which is again since .
It is clear that imples . Indeed, condition (1) is the same, condition (2) of implies condition (2) of , and conditions (2) and (3) of together imply condition (3) of . So putting it all together, we have that .
Then
and
as desired.
So then
Since the number of pairs is bounded by we have that the total contribution to the variance, , is bounded by
Comparing this to
we see that
Case II:
An essentially identical calculation shows that when , we have
So in this case we have again
Hence, the total contribution to the variance, , is
and then,
and in particular .
Case III:
When , we consider two sub-cases. The first subcase is that events and are not compatible in the sense that they cannot both occur due to the ways in which and overlap. This happens if for a certain pair of vertices , are required to be adjacent in one of and non-adjacent in the other. In this subcase, we have
so
The second subcase is that the events and are compatible, in the sense that they could possibly both happen. In this case, let denote the number of pairs in that are forced to be non-adjacent in . Then the same argument as in Case I shows that
So
Since as , we get
So,
The total contribution of a pair of events and with to the variance is then bounded by
Comparing this to
we get
We have
We are assuming that . Since , we have . Then
, and .
Summing the inequalities from the different cases, we conclude that
since for each and is fixed.
We conclude that as long as
then with high probability.
It follows that if then w.h.p. , as desired.
∎
5. Future directions
Recall that we earlier defined
We believe that the is likely of order roughly , in the following sense.
Let be any function that tends to infinity with . We showed in the proof of Theorem 4.1 that
We believe that an analogous lower bound should hold.
Conjecture 5.1.
Let denote the maximal persistence over all -dimensional cycles in . Then
with high probability.
The following kind of limit theorem would provide precise answers to questions like, “Given a prior of this kind of distribution, what is the probability that there exists a cycle of persistence greater than ?”
Conjecture 5.2.
Let denote the maximal persistence over all -dimensional cycles in . Then
converges in law to a limiting distribution supported on an interval for some .




See Figures 1 and 2 for some numerical experiments illustrating these conjectures. These experiments were computed with the aid of Ulrich Bauer’s software Ripser [2].
It also seems natural to study more about the “rank invariant” of a random clique complex filtration. That is, given , , and , how large do we expect the rank of the map to be?
Conjecture 5.3.
Suppose that is fixed, and
If is the inclusion map, and
is the induced map on homology, then
In [12], it is shown that
so this conjecture is that almost all of the homology persists for as long as possible.
Bobrowski and Skraba study limiting distributions for maximal persistence in their recent preprint [5]. They describe experimental evidence that there is a universal distribution for maximal persistence over a wide class of models, including random Čech and Vietoris–Rips complexes. It is not clear whether we should expect the random clique complex filtration studied here to be in the same conjectural universality class.
References
- [1] Noga Alon and Joel H. Spencer. The probabilistic method. Wiley Series in Discrete Mathematics and Optimization. John Wiley & Sons, Inc., Hoboken, NJ, fourth edition, 2016.
- [2] Ulrich Bauer. Ripser: efficient computation of Vietoris-Rips persistence barcodes. J. Appl. Comput. Topol., 5(3):391–423, 2021.
- [3] Omer Bobrowski and Matthew Kahle. Topology of random geometric complexes: a survey. J. Appl. Comput. Topol., 1(3-4):331–364, 2018.
- [4] Omer Bobrowski, Matthew Kahle, and Primoz Skraba. Maximally persistent cycles in random geometric complexes. Ann. Appl. Probab., 27(4):2032–2060, 2017.
- [5] Omer Bobrowski and Primoz Skraba. On the universality of random persistence diagrams. (submitted), arXiv:2207.03926, 2022.
- [6] Matthew Brennan, Guy Bresler, and Dheeraj Nagaraj. Phase transitions for detecting latent geometry in random graphs. Probab. Theory Related Fields, 178(3-4):1215–1289, 2020.
- [7] Sébastien Bubeck, Jian Ding, Ronen Eldan, and Miklós Z. Rácz. Testing for high-dimensional geometry in random graphs. Random Structures Algorithms, 49(3):503–532, 2016.
- [8] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Stability of persistence diagrams. Discrete Comput. Geom., 37(1):103–120, 2007.
- [9] Herbert Edelsbrunner, David Letscher, and Afra Zomorodian. Topological persistence and simplification. Discrete Comput. Geom., 28(4):511–533, 2002.
- [10] P. Erdős and A. Rényi. On random graphs. I. Publ. Math. Debrecen, 6:290–297, 1959.
- [11] Chad Giusti, Eva Pastalkova, Carina Curto, and Vladimir Itskov. Clique topology reveals intrinsic geometric structure in neural correlations. Proceedings of the National Academy of Sciences, 112(44):13455–13460, 2015.
- [12] Matthew Kahle. Topology of random clique complexes. Discrete Math., 309(6):1658–1671, 2009.
- [13] Matthew Kahle. Sharp vanishing thresholds for cohomology of random flag complexes. Ann. of Math. (2), 179(3):1085–1107, 2014.
- [14] Matthew Kahle. Random simplicial complexes. In Handbook of Discrete and Computational Geometry, pages 581–603. Chapman and Hall/CRC, 2017.
- [15] Nathan Linial and Roy Meshulam. Homological connectivity of random 2-complexes. Combinatorica, 26(4):475–487, 2006.
- [16] Greg Malen. Collapsibility of random clique complexes. arXiv:1903.05055, 2019.
- [17] Andrew Newman. One-sided sharp thresholds for homology of random flag complexes. arXiv:2108.04299, 2021.
- [18] Elliot Paquette and Andrew V. Werf. Random geometric graphs and the spherical Wishart matrix. arXiv:2110.10785, 2021.