
Random Simplicial Complexes:
Models and Phenomena

Omer Bobrowski and Dmitri Krioukov
Abstract

We review a collection of models of random simplicial complexes together with some of the most exciting phenomena related to them. We do not attempt to cover all existing models, but try to focus on those for which many important results have been recently established rigorously in mathematics, especially in the context of algebraic topology. In application to real-world systems, the reviewed models are typically used as null models, so that we take a statistical stance, emphasizing, where applicable, the entropic properties of the reviewed models. We also review a collection of phenomena and features observed in these models, and split the presented results into two classes: phase transitions and distributional limits. We conclude with an outline of interesting future research directions.

1 Introduction

Simplicial complexes serve as a powerful tool in algebraic topology, a field of mathematics fathered by Poincaré at the end of the 19th century. This tool was built—or discovered, depending on one’s philosophical view—by topologists in order to study topology. Much more recently, over the past decade or so, this tool was also discovered by network and data scientists who study complex real-world systems in which interactions are not necessarily dyadic. The complexity of interactions in these systems is amplified by their stochasticity, making them difficult or impossible to predict, and suggesting that these intricate systems should be modeled by random objects. In other words, the combination of stochasticity and high-order interactions in real-world complex systems suggests that models of random simplicial complexes may be useful models of these systems.

From the mathematical perspective, the study of random simplicial complexes combines combinatorics and probability with geometry and topology. As a consequence, the history of random simplicial complexes is quite dramatic. The drama is that this history is super-short, compared to the histories of probability and topology taken separately. For example, the first and simplest model of random simplicial complexes, the Linial-Meshulam model [1], appeared only in 2006. The main reason behind this dramatic delay is that probability and topology had been historically at nearly opposite extremes in terms of methods, skills, intuition, and esthetic (dis)likes among mathematicians. Fortunately, the wall is now dismantled, and over the last 15 years or so, the field of random topology has been growing explosively, as our review attempts to convey through the lens of random simplicial complexes.

We do not attempt to review all existing models and results related to random simplicial complexes, which is a mission impossible. Instead, we try to focus on those models for which some exciting phenomena—which we review as well—have been rigorously established in mathematics, especially in topology. It is not entirely coincidental that a majority of these models are particularly attractive not only from the topological and probabilistic perspectives, but also from the statistical and information-theoretic perspectives. This is because these models tend to be statistically unbiased, in the sense that they are canonical models satisfying the maximum entropy principle. As a consequence, they can be used as the correct null models of real-world complexes exhibiting certain structural features.

We take this statistical stance in our review of models in Section 2. This review is then followed by the review of some of the most exciting phenomena related to these models, which we split between phase transitions (Section 3) and distributional limit theorems (Section 4). Many of the presented results are higher-dimensional analogues of the well-known phenomena in random graphs related to connectivity, giant component, cycles, etc., therefore we preamble, where possible, the higher-dimensional statements with brief recollections of their one-dimensional counterparts. The focus on mathematics taken in this review precludes us unfortunately from covering many exciting subjects, such as models of growing complexes [2, 3, 4, 5, 6, 7], their applications, and phenomena in them including percolation [8, 9, 10, 11, 12, 13]. Yet in the concluding Section 6 that outlines our view on interesting future directions, we also comment briefly on some applications and their implications for different models of random simplicial complexes.

2 Review of models

In application to real-world systems, the models of random simplicial complexes reviewed here are typically used as null models reproducing a particular property of interest. In other words, these models are not intended to be “correct models” of real-world systems, but they are intended to be correct null models of these systems. By “correct” we mean here a model that is statistically unbiased, and by “statistically unbiased” we mean a model that maximizes entropy across all models that have a desired property. Therefore, we begin this section with a brief recollection of basic facts behind the maximum entropy principle and canonical ensembles, which we call canonical models in this review. We then observe that, with a few exceptions, the reviewed models are higher-dimensional generalizations of 1-dimensional random simplicial complexes, which are random graphs. Therefore, we preamble, where possible, the definition of a higher-dimensional model with its 1-dimensional counterpart. We finish the section with a short summary of the maximum entropy properties of the reviewed models.

2.1 Maximum entropy principle and canonical models

The maximum entropy principle [14] formalizes the concept of statistical unbiasedness of a null model. Indeed, Shannon entropy is the unique measure of information satisfying the basic axioms of continuity, monotonicity, and system and subset independence [15], and the maximum entropy principle follows directly from these axioms coupled with the additional consistency axioms of uniqueness and representation invariance [16, 17, 18]. For these reasons, a maximum-entropy model is the unique model that contains all the information that the model is asked to model, and, more importantly, that does not contain any other junk information.

Formally, let 𝒳\mathcal{X} be a space of graphs or simplicial complexes, and \mathbb{P} a probability distribution on 𝒳\mathcal{X} corresponding to a model of 𝒳\mathcal{X}: (X)\mathbb{P}(X) is the probability with which the model generates X𝒳X\in\mathcal{X}. The Shannon entropy of \mathbb{P} is S()=X𝒳(X)log(X)S(\mathbb{P})=-\sum_{X\in\mathcal{X}}\mathbb{P}(X)\log\mathbb{P}(X). Let xq:𝒳x_{q}:\mathcal{X}\to\mathbb{R}, q=1,2,,Qq=1,2,\ldots,Q, be a finite collection of functions that we call properties of XX, and let yqy_{q}\in\mathbb{R} be a set of numbers that we will associate with values of properties xqx_{q}. In general, properties xqx_{q} can take values in spaces that are more sophisticated than \mathbb{R}, but \mathbb{R} suffices for us here. Denote 𝐱={xq}q=1Q\mathbf{x}=\left\{x_{q}\right\}_{q=1}^{Q}, 𝐲={yq}q=1Q\mathbf{y}=\left\{y_{q}\right\}_{q=1}^{Q}, and let ρ\rho be a probability distribution on Q\mathbb{R}^{Q}.

Given a pair of properties 𝐱\mathbf{x} and their values 𝐲\mathbf{y}, the microcanonical model is then the one that maximizes entropy under the sharp constraints that the values of the properties 𝐱\mathbf{x} of XX must be equal to 𝐲\mathbf{y} exactly: mic(𝐱,𝐲)=argmax{S():𝐱(X)=𝐲}\mathbb{P}_{\mathrm{mic}}(\mathbf{x},\mathbf{y})=\operatorname*{argmax}_{\mathbb{P}}\left\{S(\mathbb{P}):\mathbf{x}(X)=\mathbf{y}\right\}. This means that mic(𝐱,𝐲)\mathbb{P}_{\mathrm{mic}}(\mathbf{x},\mathbf{y}) is the uniform distribution over all such XXs:

mic[𝐱,𝐲](X)=1𝒩(𝐱,𝐲),\mathbb{P}_{\mathrm{mic}}[\mathbf{x},\mathbf{y}](X)=\frac{1}{\mathcal{N}(\mathbf{x},\mathbf{y})},

where 𝒩(𝐱,𝐲)=|{X𝒳:𝐱(X)=𝐲}|\mathcal{N}(\mathbf{x},\mathbf{y})=\left|\left\{X\in\mathcal{X}:\mathbf{x}(X)=\mathbf{y}\right\}\right|.

The canonical model is the one that maximizes entropy under the soft constraints that the values of the properties 𝐱\mathbf{x} are equal to 𝐲\mathbf{y} in expectation: can(𝐱,𝐲)=argmax{S():𝔼𝐱=𝐲}\mathbb{P}_{\mathrm{can}}(\mathbf{x},\mathbf{y})=\operatorname*{argmax}_{\mathbb{P}}\left\{S(\mathbb{P}):\mathbb{E}\,\mathbf{x}=\mathbf{y}\right\}. If the properties 𝐱\mathbf{x} are sufficiently nice (e.g., satisfy certain convexity assumptions [19, 20]), then the measure can(𝐱,𝐲)\mathbb{P}_{\mathrm{can}}(\mathbf{x},\mathbf{y}) is a Gibbs measure

can[𝐱,𝐲](X)=exp[𝝀(𝐲)𝐱(X)]/Z,\mathbb{P}_{\mathrm{can}}[\mathbf{x},\mathbf{y}](X)=\exp[-\boldsymbol{\lambda}(\mathbf{y})\cdot\mathbf{x}(X)]/Z,

where Z=X𝒳exp[𝝀(𝐲)𝐱(X)]Z=\sum_{X\in\mathcal{X}}\exp[-\boldsymbol{\lambda}(\mathbf{y})\cdot\mathbf{x}(X)] and the parameters 𝝀(𝐲)\boldsymbol{\lambda}(\mathbf{y}) solve the system of equations 𝔼𝐱=logZ/𝝀=𝐲\mathbb{E}\,\mathbf{x}=-\partial\log Z/\partial\boldsymbol{\lambda}=\mathbf{y}. The solution exists and is unique under the same niceness assumptions [19, 20]. Since Gibbs measures are known as exponential families in statistics, canonical models of random graphs are called exponential random graph models there. Consequently, canonical models of simplicial complexes are exponential random simplicial complexes [21].
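
To make this concrete, the following minimal Python sketch (illustrative only; the variable names and the tiny graph space are our own choices) computes the canonical model on labeled graphs with a single sufficient statistic, the number of edges M, by solving E[M] = y numerically for the Lagrange multiplier λ.

    import math
    from scipy.optimize import brentq

    n = 4                                               # a tiny graph space, for illustration
    n_pairs = math.comb(n, 2)                           # number of possible edges
    # Group graphs by their sufficient statistic M; there are C(n_pairs, M)
    # graphs with exactly M edges, all receiving the same Gibbs weight.
    counts = [math.comb(n_pairs, M) for M in range(n_pairs + 1)]

    def expected_edges(lam):
        """Expected M under the Gibbs measure P(X) proportional to exp(-lam * M(X))."""
        w = [c * math.exp(-lam * M) for M, c in enumerate(counts)]
        return sum(M * wM for M, wM in enumerate(w)) / sum(w)

    y = 2.5                                             # target expected number of edges
    lam = brentq(lambda l: expected_edges(l) - y, -50, 50)
    p = 1.0 / (math.exp(lam) + 1.0)                     # per-edge probability implied by lam
    print(lam, p, n_pairs * p)                          # the last value reproduces y

In this example the Gibbs measure factorizes over the possible edges, each present independently with probability p = 1/(e^λ+1), which is exactly the G(n,p) model reviewed next.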

Finally, the hypercanonical model hyp(𝐱,ρ)\mathbb{P}_{\mathrm{hyp}}(\mathbf{x},\rho) is the canonical model with random 𝐲ρ\mathbf{y}\sim\rho. The measure hyp(𝐱,ρ)\mathbb{P}_{\mathrm{hyp}}(\mathbf{x},\rho) is thus the ρ\rho-mixture of the Gibbs measures,

hyp[𝐱,ρ](X)=can[𝐱,𝐲](X)𝑑ρ(𝐲),\mathbb{P}_{\mathrm{hyp}}[\mathbf{x},\rho](X)=\int\mathbb{P}_{\mathrm{can}}[\mathbf{x},\mathbf{y}](X)\,d\rho(\mathbf{y}),

reproducing a desired distribution ρ\rho of the values 𝐲\mathbf{y} of the properties 𝐱\mathbf{x} of XX.

It is important to notice that can[𝐱,𝐲](X)\mathbb{P}_{\mathrm{can}}[\mathbf{x},\mathbf{y}](X) depends on XX only via 𝐱(X)\mathbf{x}(X). For this reason, the properties 𝐱\mathbf{x} are called sufficient statistics [22]: it suffices to know 𝐱(X)\mathbf{x}(X) to know can[𝐱,𝐲](X)\mathbb{P}_{\mathrm{can}}[\mathbf{x},\mathbf{y}](X); no further details about XX are needed. It follows that all XX’s with the same value of 𝐱(X)\mathbf{x}(X) are equally likely in the canonical model, so that it is a probabilistic mixture of the corresponding microcanonical models, while the hypercanonical model is a probabilistic mixture of the canonical models.

Canonical models of random graphs and simplicial complexes tend to be more tractable than their microcanonical counterparts, explaining why the best-studied models of random simplicial complexes are based on canonical models of random graphs, versus microcanonical ones. We begin with the simplest models based on the Erdős-Rényi graphs.

2.2 Complexes based on homogeneous random graphs

The Erdős-Rényi graphs G(n,p)G(n,p) and G(n,M)G(n,M).

Perhaps the best-studied random graph model is the G(n,M)G(n,M) model of random graphs with nn nodes and MM edges. All such graphs are equiprobable in the model, so that the model is microcanonical. It was introduced in 1951 by Solomonoff and Rapoport [23], and then studied by Erdős and Rényi in 1959 [24]. Its canonical counterpart, introduced by Gilbert in 1959 [25], is the G(n,p)G(n,p) ensemble of random graphs in which every possible edge between the (n2)n\choose 2 pairs of nodes exists independently with probability pp. The sufficient statistic in this canonical model is the number of edges MM. If (n2)p/M1{n\choose 2}p/M\to 1 in the large graph limit nn\to\infty, then the microcanonical G(n,M)G(n,M) and canonical G(n,p)G(n,p) models are asymptotically equivalent according to all definitions of such equivalence [26, 27, 28].
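
For reference, both models are trivial to sample directly from their definitions; the sketch below (illustrative only, using the Python standard library) does exactly that.

    import itertools, random

    def gnp(n, p):
        """Canonical G(n, p): each of the C(n, 2) possible edges exists independently w.p. p."""
        return {e for e in itertools.combinations(range(n), 2) if random.random() < p}

    def gnm(n, M):
        """Microcanonical G(n, M): a uniformly random set of exactly M edges."""
        return set(random.sample(list(itertools.combinations(range(n), 2)), M))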

The Linial-Meshulam complex Y2(n,p)Y_{2}(n,p).

The constructive definition of the G(n,p)G(n,p) model can be rephrased as follows: take a complete 0-complex, which is a set of nn vertices, and then add 1-simplexes (edges) to it at all possible (n2)n\choose 2 locations independently with probability pp. The result is a random 1-dimensional complex Y1(n,p)=G(n,p)Y_{1}(n,p)=G(n,p).

The Linial-Meshulam model [1] is a straightforward 2-dimensional analogue of G(n,p)G(n,p): take a complete 1-complex, which is the complete graph of size nn, and then add 2-simplexes (filled triangles) to it at all possible (n3)n\choose 3 locations independently with probability pp. The result is a random 2-dimensional complex Y2(n,p)Y_{2}(n,p).

The dd-dimensional Linial-Meshulam complexes Yd(n,p)Y_{d}(n,p) and Yd(n,M)Y_{d}(n,M).

The Linial-Meshulam complex admits the natural generalization to any dimension d=1,2,,n1d=1,2,\ldots,n-1 [29] in the following way. Take a complete (d1)(d-1)-complex, and then add dd-simplexes to it at all possible (nd+1)n\choose d+1 locations independently with probability pp. We denote this random complex by Yd(n,p)Y_{d}(n,p).
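
A minimal sketch of this construction (illustrative only) is shown below; since the (d-1)-skeleton is complete by definition, it suffices to record which d-simplexes were added.

    import itertools, random

    def linial_meshulam(n, d, p):
        """Y_d(n, p): add each of the C(n, d+1) possible d-simplexes independently w.p. p."""
        return [s for s in itertools.combinations(range(n), d + 1) if random.random() < p]

    # Example: linial_meshulam(100, 2, 0.01) returns the filled triangles on top of K_100.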

The microcanonical version Yd(n,M)Y_{d}(n,M) of the canonical Yd(n,p)Y_{d}(n,p) is also well defined. Here, we take a complete (d1)(d-1)-complex and add to it exactly MM dd-simplexes chosen uniformly at random out of all the ((nd+1)M){{n\choose d+1}\choose M} possibilities. Note that Y1(n,M)=G(n,M)Y_{1}(n,M)=G(n,M). The microcanonical Yd(n,M)Y_{d}(n,M) model has received less attention than the canonical Yd(n,p)Y_{d}(n,p) one, but some of its aspects were studied in [30, 31]. It should be easy to show that Yd(n,M)Y_{d}(n,M) is asymptotically equivalent to Yd(n,p)Y_{d}(n,p), but this has not been done.

The random flag complexes X(n,p)X(n,p) and X(n,M)X(n,M).

The Linial-Meshulam complexes are just one way to generalize the Erdős-Rényi graphs to higher dimensions. Another straightforward generalization is to extend G(n,p)G(n,p) into a random flag complex X(n,p)X(n,p) [32, 33]. Recall that the flag complex of a graph GG is the simplicial complex XX obtained by filling all the (k+1)(k+1)-cliques in GG with kk-simplexes, for all k=1,2,,n1k=1,2,\ldots,n-1. We can similarly define X(n,M)X(n,M) as the flag complex of G(n,M)G(n,M). To the best of our knowledge, the X(n,M)X(n,M) model has not been considered in the past.

The multi-parameter complexes X(n,𝐩)X(n,{\bf{p}}) and X(n,𝐌)X(n,\mathbf{M}).

A further generalization of G(n,p)G(n,p), subsuming both the Linial-Meshulam and flag complexes, is the multi-parameter complex X(n,𝐩)X(n,{\bf{p}}) considered in [34, 35, 36, 37]. Let 𝐩=(p1,p2,,pn1){\bf{p}}=(p_{1},p_{2},\ldots,p_{n-1}) where pk[0,1]p_{k}\in[0,1], k=1,2,,n1k=1,2,\ldots,n-1, is the probability of the existence of simplexes of dimension kk in X(n,𝐩)X(n,{\bf{p}}). Given this vector of probabilities, the complex X(n,𝐩)X(n,{\bf{p}}) is then defined as follows. First, take nn vertices and add edges to them independently with probability p1p_{1}, i.e., generate a G(n,p1)G(n,p_{1}) graph. Second, considering this graph as a 1-skeleton of a 2-complex, go over all the triangles (i.e. 3-cliques) in it, and fill each of them with a 2-simplex independently with probability p2p_{2}. Do not stop here, but continue in this fashion—given the (k1)(k-1)-skeleton, go over all the (k+1)(k+1)-cliques in it, and fill each of them with a kk-simplex independently with probability pkp_{k}—until k=n1k=n-1.
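
The following minimal sketch (illustrative only; feasible for small n) follows this dimension-by-dimension construction literally: given the (k-1)-skeleton, every k-shell is filled with a k-simplex independently with probability p_k.

    import itertools, random

    def multiparameter_complex(n, p):
        """X(n, p), with p = (p_1, ..., p_K); p[k-1] is the filling probability in dimension k."""
        filled = {0: [(v,) for v in range(n)]}            # all vertices are always present
        for k, pk in enumerate(p, start=1):
            prev = set(filled[k - 1])
            shells = (s for s in itertools.combinations(range(n), k + 1)
                      if all(f in prev for f in itertools.combinations(s, k)))
            filled[k] = [s for s in shells if random.random() < pk]
        return filled                                     # filled[k] lists the k-simplexes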

Among the G(n,p)G(n,p)-based complexes discussed thus far, the X(n,𝐩)X(n,{\bf{p}}) complex is the most general since it subsumes both the flag complex,

X(n,p)=X(n,(p,1,1,,1)),X(n,p)=X(n,(p,1,1,\ldots,1)),

and the Linial-Meshulam complex,

Yd(n,p)=X(n,(1,1,,1d1,p,0,0,,0nd1)).Y_{d}(n,p)=X(n,(\underbrace{1,1,\ldots,1}_{d-1},p,\underbrace{0,0,\ldots,0}_{n-d-1})).

The X(n,𝐩)X(n,{\bf{p}}) model is canonical, and to specify its sufficient statistics, we define a kk-shell σ\partial\sigma to be the complete (k1)(k-1)-dimensional boundary of a (potential) kk-simplex σ\sigma in a complex XX. That is, XX may or may not contain σ\sigma (σ\sigma can be filled or empty), but if σ\partial\sigma is a kk-shell, then all its simplexes are filled, so that σ\partial\sigma can be thought of as a “pre-kk-simplex”, in the sense that it is ready to be filled with σ\sigma, but might eventually be left empty.

As shown in [21] (see also Section 2.6), the sufficient statistics in X(n,𝐩)X(n,{\bf{p}}) are not only the numbers Ms,kM_{s,k} of kk-simplexes of each dimension kk, but also the numbers Mc,kM_{c,k} of kk-shells. Therefore, the microcanonical counterpart of X(n,𝐩)X(n,{\bf{p}}) is X(n,𝐌)X(n,\mathbf{M}), where 𝐌=((Mc,1,Ms,1),(Mc,2,Ms,2),,(Mc,n1,Ms,n1))\mathbf{M}=((M_{c,1},M_{s,1}),(M_{c,2},M_{s,2}),\ldots,(M_{c,n-1},M_{s,n-1})), Mc,1=(n2)M_{c,1}={n\choose 2}, and Ms,1=MM_{s,1}=M. This model has not been considered in the past, but it is another natural higher-dimensional generalization of G(n,M)G(n,M), while X(n,M)X(n,M) and Yd(n,M)Y_{d}(n,M) are special cases of X(n,𝐌)X(n,\mathbf{M}).
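
As a concrete illustration (a sketch under our own conventions, reusing the `filled` dictionary produced by the sketch above), the two sufficient statistics in each dimension can be counted as follows.

    import itertools

    def shell_and_simplex_counts(n, filled, k):
        """Return (M_{c,k}, M_{s,k}): the numbers of k-shells and of filled k-simplexes."""
        prev = set(filled[k - 1])
        shells = [s for s in itertools.combinations(range(n), k + 1)
                  if all(f in prev for f in itertools.combinations(s, k))]
        return len(shells), len(filled.get(k, []))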

All other relationships between the discussed complexes are shown in Fig. 1.


Figure 1: The relations between the considered models of random simplicial complexes. The vertical solid lines connect more general complexes to their special cases. The horizontal dashed lines connect higher-dimensional complexes to their 1-dimensional cases or 1-skeletons. The vertical dotted lines connect probabilistic mixtures (e.g., canonical models) to their constituents (e.g., microcanonical models). The shaded models have not been considered before. The review of models in Section 2 proceeds roughly against the arrow directions, from the least general G(n,M)G(n,M) to the most general X(n,𝐫^)X(n,\hat{\mathbf{r}}).

2.3 Complexes based on inhomogeneous random graphs

2.3.1 General complexes

The inhomogeneous random graph G(n,p^)G(n,\hat{p}).

The edge existence probability in the random Erdős-Rényi graph G(n,p)G(n,p) does not have to be the same pp for all edges. Each edge i,ji,j can have a different existence probability pijp_{ij}, i,j=1,2,,ni,j=1,2,\ldots,n. Such random graphs are known as inhomogeneous or generalized random graphs [38, 39, 40, 41]. We denote them by G(n,p^)G(n,\hat{p}), where p^\hat{p} is the matrix of edge existence probabilities, p^={pij}\hat{p}=\{p_{ij}\}.

The multi-parameter complex X(n,𝐩^)X(n,\hat{{\bf{p}}}).

A straightforward generalization of G(n,p^)G(n,\hat{p}) to higher dimensions was introduced in [21]. It is also a generalization of X(n,𝐩)X(n,{\bf{p}}), and it is defined as follows. Let 𝐩^=(p^1,p^2,,p^n1)\hat{{\bf{p}}}=(\hat{p}_{1},\hat{p}_{2},\ldots,\hat{p}_{n-1}) be a vector of simplex existence probabilities for all possible simplexes of all possible dimensions k=1,2,,n1k=1,2,\ldots,n-1: p^1={pij}\hat{p}_{1}=\{p_{ij}\}, p^2={pijl}\hat{p}_{2}=\{p_{ijl}\}, and p^k={pσk}\hat{p}_{k}=\{p_{\sigma_{k}}\} collects the (nk+1)n\choose k+1 existence probabilities for all possible (nk+1)n\choose k+1 simplexes σk\sigma_{k} of dimension kk. Given such a vector of tensors 𝐩^\hat{{\bf{p}}}, the random complex X(n,𝐩^)X(n,\hat{{\bf{p}}}) is then generated similarly to X(n,𝐩)X(n,{\bf{p}}): starting with an inhomogeneous random graph G(n,p^1)G(n,\hat{p}_{1}), go over all the 3-cliques (2-shells) in this graph, and fill them with 2-simplexes σ2\sigma_{2} independently with probability pσ2p_{\sigma_{2}}. Proceed to higher dimensions k=3,4,,n1k=3,4,\ldots,n-1 in a similar fashion, filling kk-shells with kk-simplexes independently with probabilities pσkp_{\sigma_{k}}. The complex X(n,𝐩)X(n,{\bf{p}}) is a special case of X(n,𝐩^)X(n,\hat{{\bf{p}}})—the one with p^kpk\hat{p}_{k}\equiv p_{k}.

2.3.2 Configuration models

The configuration models SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k}) and CM(n,𝐤)\mathrm{CM}(n,\mathbf{k}).

The inhomogeneous random graphs G(n,p^)G(n,\hat{p}) are quite general, subsuming many other important models of random graphs. In particular, they encompass the soft configuration model SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k}) [38, 39, 40, 41, 42, 43, 44], which is the canonical model of random graphs with a given sequence of expected degrees 𝐤=(k1,k2,,kn)\mathbf{k}=(k_{1},k_{2},\ldots,k_{n}), ki0k_{i}\geq 0, i=1,2,,ni=1,2,\ldots,n. The SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k}) model is the G(n,p^)G(n,\hat{p}) with p^\hat{p} given by

pij=1eλi+λj+1,p_{ij}=\frac{1}{e^{\lambda_{i}+\lambda_{j}}+1}, (2.1)

where the parameters λi\lambda_{i}’s (known as Lagrange multipliers) solve the system of nn equations given by

jpij=ki.\sum_{j}p_{ij}=k_{i}. (2.2)

This equation guarantees the expected degree of node ii to be kik_{i}.
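
Numerically, Eqs. (2.1)-(2.2) are a standard root-finding problem; the sketch below (illustrative only, with an arbitrary solver choice) computes the edge probabilities p_ij from a target expected degree sequence.

    import numpy as np
    from scipy.optimize import fsolve

    def scm_edge_probabilities(k):
        """Solve Eqs. (2.1)-(2.2) for the Lagrange multipliers and return the matrix p_ij."""
        k = np.asarray(k, dtype=float)

        def probs(lam):
            p = 1.0 / (np.exp(lam[:, None] + lam[None, :]) + 1.0)
            np.fill_diagonal(p, 0.0)                      # no self-loops
            return p

        lam = fsolve(lambda l: probs(l).sum(axis=1) - k, np.zeros(len(k)))
        return probs(lam)

    # p = scm_edge_probabilities([3, 2, 2, 1, 1, 1]); include edge (i, j) w.p. p[i, j].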

The sufficient statistics in SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k}) are the degrees of all nodes in a graph. Therefore, SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k})’s microcanonical counterpart, the configuration model CM(n,𝐤)\mathrm{CM}(n,\mathbf{k}) [45, 46], is the uniform distribution over all the graphs with the degree sequence 𝐤\mathbf{k}. Note that 𝐤\mathbf{k} can be any sequence of nonnegative real numbers in SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k}), but in CM(n,𝐤)\mathrm{CM}(n,\mathbf{k}), 𝐤\mathbf{k} is a graphical sequence of nonnegative integers. A sequence 𝐤\mathbf{k} is called graphical, if there exists a graph whose degree sequence is 𝐤\mathbf{k}. The necessary and sufficient conditions for 𝐤\mathbf{k} to be graphical are the Erdős-Gallai conditions [47].

An important special case of CM(n,𝐤)\mathrm{CM}(n,\mathbf{k}) is the random kk-regular graph G(n,k)=CM(n,(k,k,,k))G(n,k)=\mathrm{CM}(n,(k,k,\ldots,k)).

The dd-dimensional configuration models SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) and CMd(n,𝐤)\mathrm{CM}_{d}(n,\mathbf{k}).

Recall that the degree kik_{i} of a vertex (0-simplex) ii is the number of edges (1-simplexes) that contain ii. In a similar vein, the degree kσk_{\sigma} of a dd-simplex σ\sigma is defined to be the number of (d+1)(d+1)-simplexes that contain σ\sigma.

The dd-dimensional soft configuration model SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) is a generalization of SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k}), where now 𝐤={kτ}\mathbf{k}=\{k_{\tau}\} is a sequence of (nd)n\choose d expected degrees kτ0k_{\tau}\geq 0 of all (d1)(d-1)-simplexes τ\tau. To construct SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) we take the complete (d1)(d-1)-skeleton on nn vertices, and add all possible dd-simplexes σ\sigma independently with probability

pσ=1eτλτ𝟙{τ<σ}+1,p_{\sigma}=\frac{1}{e^{\sum_{\tau}\lambda_{\tau}\boldsymbol{\mathbbm{1}}\left\{\tau<\sigma\right\}}+1}, (2.3)

where the summation is over all (d1)(d-1)-faces τ\tau of σ\sigma (as implied by the standard τ<σ\tau<\sigma notation), and the parameters λτ\lambda_{\tau} solve the system of (nd)n\choose d equations

σpσ𝟙{τ<σ}=kτ.\sum_{\sigma}p_{\sigma}\boldsymbol{\mathbbm{1}}\left\{\tau<\sigma\right\}=k_{\tau}. (2.4)

Note that SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) is a special case of X(n,𝐩^)X(n,\hat{{\bf{p}}}):

SCMd(n,𝐤)=X(n,(1,1,,1d1,p^d,0,0,,0nd1)),\mathrm{SCM}_{d}(n,\mathbf{k})=X(n,(\underbrace{1,1,\ldots,1}_{d-1},\hat{p}_{d},\underbrace{0,0,\ldots,0}_{n-d-1})),

where p^d={pσ}\hat{p}_{d}=\{p_{\sigma}\} is given by (2.3). If d=1d=1, then Eqs. (2.3,2.4) reduce to Eqs. (2.1,2.2), respectively, and we have SCM1(n,𝐤)SCM(n,𝐤)\mathrm{SCM}_{1}(n,\mathbf{k})\equiv\mathrm{SCM}(n,\mathbf{k}).

The SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) is the canonical model of random complexes whose sufficient statistics are the degrees of all the (d1)(d-1)-simplexes in a complex. Therefore, SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k})’s microcanonical counterpart, the configuration model CMd(n,𝐤)\mathrm{CM}_{d}(n,\mathbf{k}), is the uniform distribution over complexes with the complete (d1)(d-1)-skeleton and the degree sequence of (d1)(d-1)-simplexes equal to 𝐤\mathbf{k}. In CMd(n,𝐤)\mathrm{CM}_{d}(n,\mathbf{k}), 𝐤\mathbf{k} is a realizable sequence of (nd)n\choose d nonnegative integers, versus any sequence of (nd)n\choose d nonnegative real numbers in SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}). The conditions for 𝐤\mathbf{k} to be realizable, analogous to the Erdős-Gallai graphicality conditions in the d=1d=1 case, are at present unknown.

An important special case of the configuration model is the random kk-regular complex Xd(n,k)=CMd(n,(k,k,,k))X_{d}(n,k)=\mathrm{CM}_{d}(n,(k,k,\ldots,k)) in which all (d1)(d-1)-simplexes have degree kk [48]. For k=1k=1, such a complex is also known as an (n,d)(n,d)-Steiner system, whose randomized construction is due to [49].

With the exception of Xd(n,k)X_{d}(n,k), the dd-dimensional configuration models CMd(n,𝐤)\mathrm{CM}_{d}(n,\mathbf{k}) and SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) have not been considered in the past. The configuration models defined and studied in [50, 51] are very different and unrelated to any other model considered above. We review them in Section 2.5.2.

2.4 Complexes based on inhomogeneous random graphs with random connection probabilities

2.4.1 General complexes

The inhomogeneous random graph with random connection probabilities G(n,r^)G(n,\hat{r}).

Note that the edge probabilities pijp_{ij} in inhomogeneous random graphs G(n,p^)G(n,\hat{p}) can themselves be random [52, 53, 40, 41]. In that case, we replace p^\hat{p} with r^={rij}\hat{r}=\{r_{ij}\}, a random matrix with entries in [0,1][0,1], and the corresponding inhomogeneous random graph is denoted G(n,r^)G(n,\hat{r}). This class of random graphs is extremely general and subsumes a great variety of important random graph models, for example G(n,p^)G(n,\hat{p}).

Of particular interest is the case where

rij=p(Xi,Xj),r_{ij}=p(X_{i},X_{j}),

where {X1,,Xn}\{X_{1},\ldots,X_{n}\} is a set of random variables in some measure space SS, and p:S×S[0,1]p:S\times S\to[0,1] is a fixed function called the connection probability function. This class of models is quite general and subsumes, for instance, the stochastic block models [54], hidden variable models [52, 53], latent space models [55, 56, 57], random geometric graphs [58, 59], and many other popular models. If the XiX_{i}’s are independent random variables uniformly distributed in [0,1][0,1], and pp is a symmetric integrable function, then pp is also known as a graphon in the theory of graph limits [60, 61].
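
A minimal sketch of this construction (illustrative only; the particular connection probability function is an arbitrary choice) samples the latent variables X_i and then the edges.

    import itertools, random

    def latent_space_graph(n, p=lambda x, y: x * y):
        """G(n, r^) with r_ij = p(X_i, X_j) and X_i i.i.d. uniform on [0, 1] (a graphon)."""
        X = [random.random() for _ in range(n)]
        return {(i, j) for i, j in itertools.combinations(range(n), 2)
                if random.random() < p(X[i], X[j])}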

The multi-parameter complex X(n,𝐫^)X(n,\hat{\mathbf{r}}).

The most general class of models considered in this chapter is a version of the multi-parameter complex X(n,𝐩^)X(n,\hat{{\bf{p}}}) with random simplex existence probabilities 𝐩^\hat{{\bf{p}}}. To emphasize their randomness, we denote this complex by X(n,𝐫^)X(n,\hat{\mathbf{r}}), where 𝐫^\hat{\mathbf{r}} is random 𝐩^\hat{{\bf{p}}}. For maximum generality, the probabilities of existence of different simplexes, including simplexes of different dimensions, do not have to be independent random variables, so that the space of X(n,𝐫^)X(n,\hat{\mathbf{r}}) is parameterized by joint probability distributions over the vector of tensors 𝐫^\hat{\mathbf{r}}, equivalent to all nonempty subsets of {1,,n}\left\{1,\ldots,n\right\}. As evident from Fig. 1, all other models of random complexes and graphs are special cases of X(n,𝐫^)X(n,\hat{\mathbf{r}}). In particular, X(n,𝐩^)X(n,\hat{{\bf{p}}}) is a special degenerate case of X(n,𝐫^)X(n,\hat{\mathbf{r}}). At this level of generality, the complex X(n,𝐫^)X(n,\hat{\mathbf{r}}) has not been considered or defined before, yet it is simply a higher-dimensional version of the well-studied G(n,r^)G(n,\hat{r}).

2.4.2 Hypersoft configuration model

The hypersoft configuration model HSCM(n,ρ)\mathrm{HSCM}(n,\rho).

The hypersoft configuration model is the hypercanonical model of random graphs with a given expected degree distribution ρ\rho [52, 53, 40, 41, 44, 62]. It is defined as the SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k}) in which the expected degree sequence 𝐤\mathbf{k} is not fixed but random: the expected degree kik_{i} of every vertex ii is an independent random variable with distribution ρ\rho, kiρk_{i}\sim\rho. In other words, to generate an HSCM(n,ρ)\mathrm{HSCM}(n,\rho) graph, one first samples kik_{i}s independently from the distribution ρ\rho, then solves the system of equations (2.2) to find all λi\lambda_{i}s, and finally creates edges with probabilities pijp_{ij} (2.1). The HSCM(n,ρ)\mathrm{HSCM}(n,\rho) is thus a probabilistic mixture of SCM(n,𝐤)\mathrm{SCM}(n,\mathbf{k})’s with random 𝐤ρn\mathbf{k}\sim\rho^{n}. The degree distribution in HSCM(n,ρ)\mathrm{HSCM}(n,\rho) is the mixed Poisson distribution with mixing ρ\rho [40, 41].

An equivalent definition of HSCM(n,ρ)\mathrm{HSCM}(n,\rho) is based on the observation that the distribution ρ\rho defines the joint distribution Ψ\Psi of Λ={λi}\Lambda=\{\lambda_{i}\} via (2.2). This means that an equivalent procedure to generate an HSCM(n,ρ)\mathrm{HSCM}(n,\rho) graph is to sample Λ\Lambda directly from Ψ\Psi first, and then create edges with probabilities pijp_{ij} (2.1). This definition makes it explicit that the HSCM(n,ρ)\mathrm{HSCM}(n,\rho) is a special case of G(n,r^)G(n,\hat{r}).

For a given ρ\rho, it may be difficult to find the explicit form of the distribution Ψ\Psi. However, in many cases with important applications—including the Pareto ρ\rho with a finite mean, for instance—it has been shown that the canonical connection probability (2.1) and its classical limit approximation

p_{ij}^{\mathrm{CL}}=\min\left(1,e^{-\lambda_{i}}e^{-\lambda_{j}}\right)=\min\left(1,\frac{k_{i}k_{j}}{\bar{k}n}\right), (2.5)

where k¯\bar{k} is the mean of ρ\rho, lead to asymptotically equivalent random graphs [28]. In such cases, the random Lagrange multipliers λi\lambda_{i}’s are asymptotically independent, Ψ=ψn\Psi=\psi^{n}, λiψ\lambda_{i}\sim\psi, with the distribution ψ\psi defined by the distribution ρ\rho via the change of variables λi=ln(k¯n/ki)\lambda_{i}=\ln\left(\sqrt{\bar{k}n}/k_{i}\right) where kiρk_{i}\sim\rho. Generating such a graph is extremely simple: first sample either the kik_{i}’s or λi\lambda_{i}’s independently from the distribution ρ\rho or ψ\psi, respectively, and then generate edges with probabilities pijCLp_{ij}^{\mathrm{CL}} (2.5).

The dd-dimensional hypersoft configuration model HSCMd(n,ρ)\mathrm{HSCM}_{d}(n,\rho).

A straightforward generalization of HSCM(n,ρ)\mathrm{HSCM}(n,\rho) to dimension dd is achieved by defining the HSCMd(n,ρ)\mathrm{HSCM}_{d}(n,\rho) as the probabilistic mixture of SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) with random 𝐤ρ(nd)\mathbf{k}\sim\rho^{n\choose d}: the expected degree kτk_{\tau} of every (d1)(d-1)-simplex τ\tau is an independent random variable with distribution ρ\rho. In other words, to generate a random HSCMd(n,ρ)\mathrm{HSCM}_{d}(n,\rho) complex, one first prepares the complete (d1)(d-1)-complex of size nn, then samples the (nd)n\choose d expected degrees kτk_{\tau} of all the (d1)(d-1)-simplexes τ\tau independently from the distribution ρ\rho, then solves the system of equations (2.4) to find all the λτ\lambda_{\tau}’s, and finally creates the dd-simplexes σ\sigma independently with probability pσp_{\sigma} (2.3). For the same reasons as in the d=1d=1 case, the HSCMd(n,ρ)\mathrm{HSCM}_{d}(n,\rho) model is a special case of X(n,𝐫^)X(n,\hat{\mathbf{r}}) for any d1d\geq 1. The model has not been previously considered, except the d=1d=1 case.

2.4.3 Geometric complexes

An important class of random complexes is that of geometric complexes, which are a special case of X(n,𝐫^)X(n,\hat{\mathbf{r}}) whose 1-skeletons are random geometric graphs. Their randomness comes from random locations of vertices in a space, and this randomness is often modeled by either binomial or Poisson point processes, whose definitions we recall next.

Let SS be a metric space with a probability distribution PP on it. The definitions below apply to any sufficiently nice pair of SS and PP. However, for the sake of simplicity, in the subsequent sections of this chapter, the space SS will always be a flat dd-dimensional torus S=𝕋d=[0,1]d/{01}S=\mathbb{T}^{d}=[0,1]^{d}/\left\{0\sim 1\right\}, while PP will be its Lebesgue measure λ\lambda, i.e., P(A)=λ(A)/λ(𝕋d)=λ(A)P(A)=\lambda(A)/\lambda(\mathbb{T}^{d})=\lambda(A), where λ(A)\lambda(A) is the Euclidean volume of A𝕋dA\subseteq\mathbb{T}^{d}.

The binomial point process

is simply a set of nn random points 𝒳n={X1,,Xn}S\mathcal{X}_{n}=\left\{X_{1},\ldots,X_{n}\right\}\subset S sampled independently from PP. In the simplest case S=𝕋1S=\mathbb{T}^{1}, we simply sample XiX_{i}’s from the uniform distribution, Xi𝒰(0,1)X_{i}\sim\mathcal{U}(0,1), viewed as a circle. The distribution of the number of points in any measurable subset ASA\subseteq S is binomial with mean nP(A)nP(A), giving the process its name.

The Poisson point process

is the binomial process with a random number of points NN sampled from the Poisson distribution with mean nn, i.e. 𝒫n=𝒳N\mathcal{P}_{n}=\mathcal{X}_{N}, where NPois(n)N\sim\mathrm{Pois}\left({n}\right). The distribution of the number of points in any measurable ASA\subseteq S is Poisson with mean nP(A)nP(A), where nn is the rate of the process. Another key property of the Poisson process is spatial independence: the numbers of points in any pair of disjoint subsets of SS are independent random variables. This property, absent in the binomial process, makes the Poisson process favorable in probabilistic analysis.

In the nn\to\infty limit, the two processes are asymptotically identical—the binomial process converges to the Poisson process in a proper sense [63]. For this reason, and given that the Poisson process is easier to deal with, we will restrict ourselves in this chapter to the Poisson process 𝒫n\mathcal{P}_{n} on the torus 𝕋d\mathbb{T}^{d}. These settings turn out to provide the most elegant results, while not losing much in terms of generality.
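
Sampling the Poisson process on the torus is a two-line affair; the sketch below (illustrative only) also defines the periodic distance used by the geometric constructions that follow.

    import numpy as np

    def poisson_process_on_torus(n, d, rng=np.random.default_rng()):
        """A Poisson process with rate n on T^d = [0, 1]^d: N ~ Pois(n) i.i.d. uniform points."""
        return rng.random((rng.poisson(n), d))

    def torus_distance(x, y):
        """Distance on the flat torus: take the shorter way around in each coordinate."""
        diff = np.abs(x - y)
        return np.linalg.norm(np.minimum(diff, 1.0 - diff))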

The random geometric graph G(𝒫n,r)G(\mathcal{P}_{n},r).

We start with G(𝒳n,r)G(\mathcal{X}_{n},r), which is a special case of G(n,r^)G(n,\hat{r}), in which

rij=p(Xi,Xj)=𝟙{d(Xi,Xj)r},r_{ij}=p(X_{i},X_{j})=\boldsymbol{\mathbbm{1}}\left\{d(X_{i},X_{j})\leq r\right\},

where d(Xi,Xj)d(X_{i},X_{j}) is the distance between Xi,Xj𝒳nX_{i},X_{j}\in\mathcal{X}_{n} in 𝕋d\mathbb{T}^{d}, and r>0r>0 is the connectivity radius parameter [58, 59]. In other words, the vertices of this random graph are a realization of the binomial process, and two vertices are connected if the distance between them in 𝕋d\mathbb{T}^{d} is at most rr. To generate the random geometric graph G(𝒫n,r)G(\mathcal{P}_{n},r) we first sample NPois(n)N\sim\mathrm{Pois}\left({n}\right), and then generate G(𝒳N,r)G(\mathcal{X}_{N},r) as described above.

The random Vietoris-Rips complex R(𝒫n,r)R(\mathcal{P}_{n},r).

This is the flag complex over the random geometric graph G(𝒫n,r)G(\mathcal{P}_{n},r). It is thus a straightforward higher-dimensional generalization of G(𝒫n,r)G(\mathcal{P}_{n},r). Note that we can consider R(𝒳n,r)R(\mathcal{X}_{n},r) for the binomial process as well, which is a special case of X(n,𝐫^)X(n,\hat{\mathbf{r}}). Then R(𝒫n,r)=R(𝒳N,r)R(\mathcal{P}_{n},r)=R(\mathcal{X}_{N},r) with NPois(n)N\sim\mathrm{Pois}\left({n}\right).
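
Concretely, building R(P_n, r) amounts to building the geometric graph and then filling its cliques; a minimal sketch (illustrative only, truncated at a chosen maximal dimension and feasible only for small samples) is given below, reusing the torus utilities defined above.

    import itertools

    def rips_complex(points, r, max_dim=2):
        """The Vietoris-Rips complex of `points` on the torus: the flag complex of G(points, r)."""
        N = len(points)
        edges = {(i, j) for i, j in itertools.combinations(range(N), 2)
                 if torus_distance(points[i], points[j]) <= r}
        simplexes = {1: sorted(edges)}
        for k in range(2, max_dim + 1):                   # a k-simplex = k+1 points, pairwise within r
            simplexes[k] = [s for s in itertools.combinations(range(N), k + 1)
                            if all(e in edges for e in itertools.combinations(s, 2))]
        return simplexes

    # points = poisson_process_on_torus(200, 2); rips = rips_complex(points, 0.1)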

The random Čech complex C(𝒫n,r)C(\mathcal{P}_{n},r).

This is a different higher-dimensional generalization of G(𝒫n,r)G(\mathcal{P}_{n},r). The C(𝒫n,r)C(\mathcal{P}_{n},r) rule is: draw balls Br/2(Xi)B_{r/2}(X_{i}) of radius r/2r/2 around each of the points Xi𝒫nX_{i}\in\mathcal{P}_{n}, then look for all the intersections of these balls, and for every (k+1)(k+1)-fold intersection of the balls, add the corresponding kk-simplex to the complex. Note that the binomial version C(𝒳n,r)C(\mathcal{X}_{n},r) can still be considered as a special case of X(n,𝐫^)X(n,\hat{\mathbf{r}}) with a different rule for the creation of higher-dimensional simplexes, compared to R(𝒳n,r)R(\mathcal{X}_{n},r).

While the 1-skeleton of both R(𝒫n,r)R(\mathcal{P}_{n},r) and C(𝒫n,r)C(\mathcal{P}_{n},r) is the same random geometric graph G(𝒫n,r)G(\mathcal{P}_{n},r), higher-dimensional simplexes in these complexes are in general different, as illustrated in Fig. 2. The following useful relation was proved in [64] for any αd+12d\alpha\leq\sqrt{\frac{d+1}{2d}}:

R(𝒫n,αr)C(𝒫n,r)R(𝒫n,r).R(\mathcal{P}_{n},\alpha r)\subset C(\mathcal{P}_{n},r)\subset R(\mathcal{P}_{n},r).
Figure 2: The Čech and Rips complexes. We draw balls of a fixed radius rr around the points, and consider their intersection. The edges in both cases correspond to the geometric graph. While both triangles belong to the Rips complex, the left triangle is not included in the Čech complex.

A powerful property of the Čech complex is given by the Nerve Lemma [65], which states that under mild conditions we have the following isomorphism between the homology groups HkH_{k}:

Hk(C(𝒫n,r))Hk(B(𝒫n,r)),H_{k}(C(\mathcal{P}_{n},r))\cong H_{k}(B(\mathcal{P}_{n},r)),

where B(𝒫n,r)=Xi𝒫nBr/2(Xi)B(\mathcal{P}_{n},r)=\bigcup_{X_{i}\in\mathcal{P}_{n}}B_{r/2}(X_{i}). This property, absent in the Rips complex, is very useful in the probabilistic analysis of the Čech complex, allowing one to switch back-and-forth between combinatorics and stochastic geometry.

2.5 Complexes based on random hypergraphs

2.5.1 General complexes

The main reason why the Linial-Meshulam model Yd(n,p)Y_{d}(n,p) is defined on top of the complete (d1)(d-1)-skeleton is simplicity. The complete (d1)(d-1)-complex underlying the dd-complex Yd(n,p)Y_{d}(n,p) simplifies its topological analysis drastically. Much less simple, but also much more general, is the X(n,𝐩^)X(n,\hat{{\bf{p}}}) complex in which the probabilities of existence of simplexes of different dimensions are all different, but this complex is still relatively simple from the probabilistic perspective, since simplexes of higher dimensions are created in a conditionally independent manner, with the conditions being the existence of required simplexes of lower dimensions. Very recently, this simplicity was sacrificed even further for the sake of even greater generality in a class of models based on random hypergraphs [66, 67], which we briefly review next.

The inhomogeneous random hypergraph H(n,𝐩^)H(n,\hat{{\bf{p}}}).

This is a straightforward generalization of the inhomogeneous random graph G(n,p^)G(n,\hat{p}) to hypergraphs: every possible hyperedge σ\sigma (which is a nonempty subset of {1,,n}\left\{1,\ldots,n\right\}) in HH(n,𝐩^)H\sim H(n,\hat{{\bf{p}}}) exists independently with probability pσ𝐩^p_{\sigma}\in\hat{{\bf{p}}}. Note that HH is not a simplicial complex with high probability, unless 𝐩^\hat{{\bf{p}}} is specially designed for HH to be a complex.

The lower and upper complexes Z¯(n,𝐩^)\underline{Z}(n,\hat{{\bf{p}}}) and Z¯(n,𝐩^)\overline{Z}(n,\hat{{\bf{p}}}).

Introduced in [68, 66], these are both based on the H(n,𝐩^)H(n,\hat{{\bf{p}}}). The lower complex Z¯Z¯(n,𝐩^)\underline{Z}\sim\underline{Z}(n,\hat{{\bf{p}}}) is the largest simplicial complex that the random HH(n,𝐩^)H\sim H(n,\hat{{\bf{p}}}) contains, while the upper complex Z¯Z¯(n,𝐩^)\overline{Z}\sim\overline{Z}(n,\hat{{\bf{p}}}) is the smallest simplicial complex that contains the random HH(n,𝐩^)H\sim H(n,\hat{{\bf{p}}}), so that Z¯HZ¯\underline{Z}\leq H\leq\overline{Z}. In other words, a simplex σ\sigma is included in the lower complex Z¯\underline{Z} if and only if σH\sigma\in H and also all its faces are hyperedges of HH. The other way around, every hyperedge σH\sigma\in H is included in the upper complex Z¯\overline{Z} as a simplex together with all its faces, even if some of these faces do not happen to be in HH.
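
The two constructions are easy to state in code; the sketch below (illustrative only, for a hypergraph given as a collection of vertex sets) returns the lower and upper complexes as sets of frozensets.

    import itertools

    def lower_complex(H):
        """Keep a hyperedge only if every nonempty proper subset of it is also a hyperedge."""
        H = set(map(frozenset, H))
        return {s for s in H
                if all(frozenset(f) in H
                       for k in range(1, len(s))
                       for f in itertools.combinations(s, k))}

    def upper_complex(H):
        """The downward closure of H: every hyperedge together with all its faces."""
        return {frozenset(f) for s in H
                for k in range(1, len(s) + 1)
                for f in itertools.combinations(s, k)}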

Compared to the graph-based models reviewed in the previous sections, the hypergraph-based models are much more difficult to analyze, primarily because the distributions of the skeletons are highly nontrivial and carry intricate dependency structure. However, an exciting fact about these two models is that they are dual in some strict topological sense (Alexander duality), and therefore statements about one of the models can be translated with ease to corresponding statements about the other [66, 67].

At this point, it may be tempting to proceed similarly to the previous XX-sections, and let the hyperedge existence probabilities 𝐩^\hat{{\bf{p}}} be random 𝐫^\hat{\mathbf{r}}, leading to H(n,𝐫^)H(n,\hat{\mathbf{r}}), Z¯(n,𝐫^)\underline{Z}(n,\hat{\mathbf{r}}), and Z¯(n,𝐫^)\overline{Z}(n,\hat{\mathbf{r}}). However, nothing new is gained by doing so, since the X(n,𝐫^)X(n,\hat{\mathbf{r}}) model is already general enough as it includes any possible joint distribution of simplex existence probabilities, including the distributions describing Z¯(n,𝐩^)\underline{Z}(n,\hat{{\bf{p}}}) and Z¯(n,𝐩^)\overline{Z}(n,\hat{{\bf{p}}}). That is, the lower and upper models Z¯(n,𝐩^)\underline{Z}(n,\hat{{\bf{p}}}) and Z¯(n,𝐩^)\overline{Z}(n,\hat{{\bf{p}}}) with nonrandom 𝐩^\hat{{\bf{p}}} are special cases of the more general X(n,𝐫^)X(n,\hat{\mathbf{r}}) model with the joint distribution of 𝐫^\hat{\mathbf{r}} set equal to the joint distribution of simplex existence probabilities in Z¯(n,𝐩^)\underline{Z}(n,\hat{{\bf{p}}}) and Z¯(n,𝐩^)\overline{Z}(n,\hat{{\bf{p}}}), while the lower and upper models Z¯(n,𝐫^)\underline{Z}(n,\hat{\mathbf{r}}) and Z¯(n,𝐫^)\overline{Z}(n,\hat{\mathbf{r}}) with random 𝐫^\hat{\mathbf{r}} are equivalent to X(n,𝐫^)X(n,\hat{\mathbf{r}}_{*}) with matching joint distributions of 𝐫^\hat{\mathbf{r}}_{*}.

2.5.2 ZZ-configuration models

One interesting special case of the ZZ-complexes from the previous section is the ZZ-version of the dd-dimensional (soft) configuration model considered in [50].

The ZZ-configuration models Z\mathrm{Z}-CMd(n,𝐤)\mathrm{CM}_{d}(n,\mathbf{k}) and Z\mathrm{Z}-SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}).

The dd-degree kσ(d)k_{\sigma}(d) of a dd^{\prime}-simplex σ\sigma is defined to be the number of simplexes of dimension d>dd>d^{\prime} that contain σ\sigma [2], and the ZZ-configuration models are defined by a vector 𝐤={ki(d)}\mathbf{k}=\{k_{i}(d)\} of a given sequence of (expected) dd-degrees of vertices i=1,2,,ni=1,2,\ldots,n, as opposed to the vector of (expected) dd-degrees of (d1)(d-1)-simplexes in the (S)CMd(n,𝐤)\mathrm{(S)CM}_{d}(n,\mathbf{k}) in Section 2.3.2. No low-dimensional skeleton is formed in the Z-SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}), while the probability of dd-simplex σ\sigma is

pσ=1eiλi𝟙{i<σ}+1,p_{\sigma}=\frac{1}{e^{\sum_{i}\lambda_{i}\boldsymbol{\mathbbm{1}}\left\{i<\sigma\right\}}+1}, (2.6)

where the nn parameters λi\lambda_{i} of vertices ii solve the system of nn equations

σpσ𝟙{i<σ}=ki(d).\sum_{\sigma}p_{\sigma}\boldsymbol{\mathbbm{1}}\left\{i<\sigma\right\}=k_{i}(d). (2.7)

If simplex σ\sigma is added to the complex, then so are all its lower-dimensional faces. The Z-SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) is a special case of the Z¯(n,𝐩^)\bar{Z}(n,\hat{{\bf{p}}}) with 𝐩^=(0,,0,p^d,0,,0)\hat{{\bf{p}}}=(0,\ldots,0,\hat{p}_{d},0,\ldots,0) and p^d={pσ}\hat{p}_{d}=\{p_{\sigma}\} given by (2.6).

The model is canonical with the dd-degree of all vertices being its sufficient statistics, so that the corresponding microcanonical model Z-CMd(n,𝐤)\mathrm{CM}_{d}(n,\mathbf{k}) is the uniform distribution over all the dd-complexes with the dd-degree sequence of the vertices equal to 𝐤\mathbf{k}. Note that these complexes satisfy the additional constraint: they do not contain any simplex of dimension d<dd^{\prime}<d that is not a face of a dd-simplex. The hypercanonical model Z-HSCMd(n,ρ)\mathrm{HSCM}_{d}(n,\rho) is the Z-SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) with random ki(d)ρk_{i}(d)\sim\rho.

2.6 Random simplicial complexes as canonical models

Table 1 summarizes what is known concerning the reviewed complexes as canonical models. To convey this knowledge succinctly, it suffices to focus only on the canonical models per se, since their microcanonical constituents and hypercanonical mixtures are coupled to them as discussed in Section 2.1.

Models Spaces 𝒮\mathcal{S} Properties 𝐱\mathbf{x} Refs Comments
G(n,p)G(n,p), G(n,M)G(n,M) 𝒢n\mathcal{G}_{n} MM [39] (1)
X(n,p)X(n,p), X(n,M)X(n,M) n\mathcal{F}_{n} MM [21] (1)
Yd(n,p)Y_{d}(n,p), Yd(n,M)Y_{d}(n,M) 𝒞n,d\mathcal{C}_{n,d} Ms,dM_{s,d} [21] (1)
X(n,𝐩)X(n,{\bf{p}}), X(n,𝐌)X(n,\mathbf{M}) 𝒞n\mathcal{C}_{n} 𝐌={Mc,d,Ms,d}d=1n1\mathbf{M}=\left\{M_{c,d},M_{s,d}\right\}_{d=1}^{n-1} [21] (1,2)
G(n,p^)G(n,\hat{p}), G(n,r^)G(n,\hat{r}) 𝒢n\mathcal{G}_{n} σ^1={σ1}\hat{\sigma}_{1}=\left\{\sigma_{1}\right\} [39] (1)
X(n,𝐩^)X(n,\hat{{\bf{p}}}), X(n,𝐫^)X(n,\hat{\mathbf{r}}) 𝒞n\mathcal{C}_{n} 𝝈^={{σd},{σd}}d=1n1\hat{\boldsymbol{\sigma}}=\left\{\left\{\partial\sigma_{d}\right\},\left\{\sigma_{d}\right\}\right\}_{d=1}^{n-1} [21] (1,2)
((H)S)CM(n,)\mathrm{((H)S)CM}(n,\cdot) 𝒢n\mathcal{G}_{n} 𝐤={ki}\mathbf{k}=\left\{k_{i}\right\} [39] (3)
((H)S)CMd(n,)\mathrm{((H)S)CM}_{d}(n,\cdot) 𝒞n,d\mathcal{C}_{n,d} 𝐤={kσd1}\mathbf{k}=\left\{k_{\sigma_{d-1}}\right\} here (4)
Z-((H)S)CMd(n,)\mathrm{((H)S)CM}_{d}(n,\cdot) 𝒵n,d\mathcal{Z}_{n,d} 𝐤={ki(d)}\mathbf{k}=\left\{k_{i}(d)\right\} [50] (5)
Table 1: Canonical models of random simplicial complexes. The first column lists the canonical models from Section 2, along with their micro- and hyper-canonical counterparts (if any), whose entropy maximization properties have been established. The second and third columns document the constraints (space of complexes and their properties) under which the model entropy is maximized. The fourth column contains references to where this maximization has been established and to further details. The last column refers to pertinent comments in the text.

To define a canonical model, one needs to specify not only the sufficient statistics 𝐱\mathbf{x} (whose expectations can take any admissible values 𝐲\mathbf{y} in a particular canonical model), but also the space 𝒮\mathcal{S} of allowed complexes over which the entropy of a canonical model is maximized. Table 1 uses the following notations for such spaces 𝒮\mathcal{S} of labeled graphs and complexes:

  • 𝒢n\mathcal{G}_{n}: all graphs of size nn;

  • n\mathcal{F}_{n}: flag complexes of size nn;

  • 𝒞n\mathcal{C}_{n}: all complexes of size nn;

  • 𝒞n,d\mathcal{C}_{n,d}: dd-complexes of size nn with a complete (d1)(d-1)-skeleton;

  • 𝒵n,d\mathcal{Z}_{n,d}: dd-complexes of size nn all of whose dd^{\prime}-simplexes, for all d<dd^{\prime}<d, are faces of dd-simplexes.

For the sufficient statistics, Table 1 uses the following notations that rely on the definition of a dd-shell, which can be found at the end of Section 2.2:

  • Mc,dM_{c,d} and Ms,dM_{s,d}: numbers of dd-shells and dd-simplexes (Ms,1=MM_{s,1}=M, the number of edges);

  • σd\partial\sigma_{d} and σd\sigma_{d}: dd-shell and dd-simplex (σ0=i\sigma_{0}=i, a vertex; σ1={i,j}\sigma_{1}=\left\{i,j\right\}, an edge);

  • kσdk_{\sigma_{d}} and kσd(d)k_{\sigma_{d}}(d^{\prime}): (d+1)(d+1)-degree and generalized dd^{\prime}-degree of σd\sigma_{d} (d<d<nd<d^{\prime}<n).

Finally, the comments referred to in Table 1 are as follows:

  1. 1.

    The values 𝐲\mathbf{y} of properties 𝐱\mathbf{x} are not shown, but they are assumed to match properly the model parameters. That is, in a general canonical model we have that 𝔼𝐱=𝐲\mathbb{E}\,\mathbf{x}=\mathbf{y}, so that the G(n,p)G(n,p) model, for example, is the canonical model with 𝐱=M\mathbf{x}=M and 𝐲=𝔼M=(n2)p\mathbf{y}=\mathbb{E}\,M={n\choose 2}p. Similarly, the X(n,𝐩^)X(n,\hat{{\bf{p}}}) model is the canonical model with 𝔼σd=pσd𝔼σd\mathbb{E}\,\sigma_{d}=p_{\sigma_{d}}\mathbb{E}\,\partial\sigma_{d} and 𝔼σd=τd1<σd𝔼τd1\mathbb{E}\,\partial\sigma_{d}=\prod_{\tau_{d-1}<\sigma_{d}}\mathbb{E}\,\tau_{d-1}, giving the expected values of the sufficient statistics as functions of the model parameters pσdp_{\sigma_{d}}.

  2. 2.

    The presence of the constraints on the existence of the boundaries of dd-simplexes, their dd-shells, in addition to the constraints on the existence of dd-simplexes themselves, may appear surprising at first. These stem from the conditional nature of the definition of these complexes. The models with these additional constraints removed, e.g. the canonical model over 𝒞n\mathcal{C}_{n} with 𝐱={Ms,d}d=1n1\mathbf{x}=\left\{M_{s,d}\right\}_{d=1}^{n-1}, are still entirely unknown. This is not surprising, since such models appear to be combinatorially intractable [21].

  3. 3.

    Note that the constraints under which entropy is maximized in a hypercanonical model are generally intractable. The proofs that these constraints are the degree distributions in the dense and sparse HSCM graphs appear in [44] and [62], respectively. That is, the HSCM is the unbiased maximum-entropy model of random graphs with a given degree distribution, versus an (expected) degree sequence in the (S)CM.

  4. 4.

    The dd-dimensional ((H)S)CM models have not been considered before. The proof that SCMd(n,𝐤)\mathrm{SCM}_{d}(n,\mathbf{k}) maximizes entropy subject to the expected (d1,d)(d-1,d)-degree sequence constraints is a straightforward adjustment of notations in the corresponding proof for a general canonical model [14], so that it is omitted here for brevity.

  5. 5.

    The hypercanonical version of the model (Z-HSCMd(n,ρ)\mathrm{HSCM}_{d}(n,\rho)) has not been considered before. Efficient algorithms to sample from a generalized version of Z-CMd(n,𝐤)\mathrm{CM}_{d}(n,\mathbf{k}) are considered in [51].

3 Phase transitions

Phase transitions, abrupt changes in the global behavior of a random system in response to a small change in a parameter, are among the most interesting and important phenomena occurring in random structures in statistical physics. In this section we briefly recall the most fundamental types of phase transitions that have been studied in random graphs, and then describe their higher-dimensional analogues in simplicial complexes.

3.1 Homological Connectivity

The very first theorem proved for random graphs in [24] established the phase transition at which the G(n,M)G(n,M) random graph becomes connected. The following is the analogous result for the G(n,p)G(n,p) random graph.

Theorem 3.1.

Let GG(n,p)G\sim G(n,p). Then

limn(G is connected)={1np=logn+w(n),0np=lognw(n),\lim_{n\to\infty}\mathbb{P}\left(G\text{ is connected}\right)=\begin{cases}1&np=\log n+w(n),\\ 0&np=\log n-w(n),\end{cases}

where w(n)w(n) is any function satisfying w(n)w(n)\to\infty. In addition, if np=logn+cnp=\log n+c for c(0,)c\in(0,\infty), and denoting by NcompN_{\mathrm{comp}} the number of connected components in GG, then

(Ncomp1)𝐷Pois(ec),(N_{\mathrm{comp}}-1)\xrightarrow{D}\mathrm{Pois}\left({e^{-c}}\right),

where 𝐷\xrightarrow{D} denotes convergence in distribution. This convergence implies that

limn(G is connected)=eec.\lim_{n\to\infty}\mathbb{P}\left(G\text{ is connected}\right)=e^{-e^{-c}}.

The main idea behind the proof of Theorem 3.1 is to show that around np=lognnp=\log n, the random graph consists only of a giant connected component and isolated vertices. In that case, the phase transition for connectivity can be established by analyzing the number of isolated vertices.
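
The limiting probability in Theorem 3.1 is easy to probe numerically; the following sketch (illustrative only; networkx is used purely for convenience) compares a Monte Carlo estimate at np = log n + c with e^{-e^{-c}}.

    import math
    import networkx as nx

    n, c, trials = 500, 1.0, 500
    p = (math.log(n) + c) / n
    hits = sum(nx.is_connected(nx.gnp_random_graph(n, p)) for _ in range(trials))
    print(hits / trials, math.exp(-math.exp(-c)))         # the two numbers agree for large n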

Note that in the language of homology, the event ‘GG is connected’ can be phrased as ‘the 0th homology group H0(G)H_{0}(G) is trivial’, and that NcompN_{\mathrm{comp}} is equal to the 0th Betti number β0(G)\beta_{0}(G). Thus, it is tempting to try to generalize this phase transition for higher degrees of homology, and search for the point (value of pp or npnp) where the kkth homology group HkH_{k} becomes trivial.

3.1.1 The random dd-complex

We start with the Linial-Meshulam random dd-complex Yd(n,p)Y_{d}(n,p). Recall that G(n,p)=Y1(n,p)G(n,p)=Y_{1}(n,p), and that Theorem 3.1 studies H0H_{0}. Similarly, the following result studies Hd1H_{d-1} in Yd(n,p)Y_{d}(n,p).

Theorem 3.2.

Let YYd(n,p)Y\sim Y_{d}(n,p). Then

limn(Hd1(Y)=0)={1np=dlogn+w(n),0np=dlognw(n),\lim_{n\to\infty}\mathbb{P}\left(H_{d-1}(Y)=0\right)=\begin{cases}1&np=d\log n+w(n),\\ 0&np=d\log n-w(n),\end{cases}

where w(n)w(n)\to\infty.

In addition, if np=dlogn+cnp=d\log n+c, c(0,)c\in(0,\infty), then

βd1(Y)𝐷Pois(ecd!),\beta_{d-1}(Y)\xrightarrow{D}\mathrm{Pois}\left({\frac{e^{-c}}{d!}}\right),

which implies that

limn(Hd1(Y)=0)=eecd!.\lim_{n\to\infty}\mathbb{P}\left(H_{d-1}(Y)=0\right)=e^{-\frac{e^{-c}}{d!}}.

Note that taking d=1d=1 in Theorem 3.2 nicely recovers the graph result in Theorem 3.1. Thus, Linial and Meshulam named this phase transition ‘homological connectivity’. The phase transition itself was first proved for d=2d=2 and homology with 2\mathbb{Z}_{2} coefficients in [1], and then for any dd and homology with m\mathbb{Z}_{m} coefficients in [29]. The Poisson limit was proved a decade later [30].

Aside from the analogy between the statements of Theorems 3.2 and 3.1, the general idea behind the proof is also similar. To prove Theorem 3.2 one can show that around np=dlognnp=d\log n, the only possible cycles in Hd1H_{d-1} (more accurately, the proof looks at cocycles in Hd1H^{d-1}) are those generated by isolated (d1)(d-1)-simplexes. An ‘isolated’ (d1)(d-1)-simplex is one that is not contained in any dd-simplex. The fact that isolated (d1)(d-1)-simplexes yield nontrivial Hd1H_{d-1} is relatively easy to prove. The much harder part here is to show that these are the only possible cycles.

3.1.2 The random flag complex

The random flag complex X(n,p)X(n,p) differs from the random dd-complex Yd(n,p)Y_{d}(n,p) in two important ways. Firstly, X(n,p)X(n,p) has random homology in all possible degrees kk, rather than just k=dk=d and k=d1k=d-1 as in Yd(n,p)Y_{d}(n,p). Secondly, note that in Yd(n,p)Y_{d}(n,p) both Hd1H_{d-1} and HdH_{d} are monotone: Hd1H_{d-1} is decreasing in pp, while HdH_{d} is increasing. This is not the case for the flag complex, where, except for H0H_{0}, none of the homology groups is monotone. The following result was proved by Kahle [33].

Theorem 3.3.

Let XX(n,p)X\sim X(n,p). Then,

limn(Hk(X)=0)={1npk+1=(k2+1+ϵ)logn,0npk+1=(k2+1ϵ)logn.\lim_{n\to\infty}\mathbb{P}\left(H_{k}(X)=0\right)=\begin{cases}1&np^{k+1}=\left(\frac{k}{2}+1+\epsilon\right)\log n,\\ 0&np^{k+1}=\left(\frac{k}{2}+1-\epsilon\right)\log n.\end{cases}

Note that here as well, taking k=0k=0 agrees with the phase transition in Theorem 3.1. The phase transition here is also a consequence of the vanishing of the isolated kk-simplexes. However, as opposed to the proofs in [1, 29], which mainly consist of combinatorial arguments, the proof in [33] proceeds differently, employing Garland’s method. Briefly, Garland’s method [69] is a powerful tool that allows one to “break” the computation of homology into local pieces (the links of the faces) and to invoke spectral graph arguments.

3.1.3 The multi-parameter random complex

Phase transitions are usually described as a rapid change of system behavior, in response to a small change in a system parameter value (e.g. pp or rr above). Since the multi-parameter complex X(n,𝐩)X(n,{\bf{p}}) has an infinite sequence of parameters, we cannot describe a suitable phase transition per se. Nevertheless, interesting results were presented in [70], showing that there are different regions within this high-dimensional parameter space, where in each region there is a single dominant homology group HkH_{k}.

To describe the result we need a few definitions. Let d>0d>0 and let α=(α1,,αd)+d\alpha=(\alpha_{1},\ldots,\alpha_{d})\in\mathbb{R}^{d}_{+}. Define

ψk(α)=i=1d(ki)αi,k=1,,d.\psi_{k}(\alpha)=\sum_{i=1}^{d}\binom{k}{i}\alpha_{i},\quad k=1,\ldots,d.

These functions are then used to define different domains

𝒟k={α+d:ψk(α)<1<ψk+1(α)},\mathcal{D}_{k}=\left\{\alpha\in\mathbb{R}^{d}_{+}:\psi_{k}(\alpha)<1<\psi_{k+1}(\alpha)\right\},

for k=0,,d1k=0,\ldots,d-1, and where ψ00\psi_{0}\equiv 0. We also define 𝒟d={α:ψd(α)<1}\mathcal{D}_{d}=\left\{\alpha:\psi_{d}(\alpha)<1\right\}. Finally, define

τk(α)=i=1k(1ψi(α)),e(α)=min0id(1ψi(α)).\tau_{k}(\alpha)=\sum_{i=1}^{k}(1-\psi_{i}(\alpha)),\qquad e(\alpha)=\min_{0\leq i\leq d}(1-\psi_{i}(\alpha)).
Theorem 3.4 ([70]).

Let 𝐩=(p1,p2,,pd,0,0,){\bf{p}}=(p_{1},p_{2},\ldots,p_{d},0,0,\ldots), be such that pi=nαip_{i}=n^{-\alpha_{i}}, where α=(α1,,αd)d\alpha=(\alpha_{1},\ldots,\alpha_{d})\in\mathbb{R}^{d} may be a function of nn, and limnα(n)=α\lim_{n\to\infty}\alpha(n)=\alpha_{*}. Let XX(n,𝐩)X\sim X(n,{\bf{p}}), and suppose that α𝒟k\alpha_{*}\in\mathcal{D}_{k}. Then, with high probability

βk(X)nτk(α)(k+1)!,\beta_{k}(X)\approx n^{\tau_{k}(\alpha)}{(k+1)!},

and for all jkj\neq k we have

βj(X)=O(ne(α)βk(X)).\beta_{j}(X)=O(n^{-e(\alpha)}\beta_{k}(X)).

In other words, in each 𝒟k\mathcal{D}_{k} the kk-th Betti number is quite large, and all other Betti numbers are negligible compared to it. One can think of this result as a “multi-parameter phase transition”, in which the system behavior changes as one moves from one region 𝒟k\mathcal{D}_{k} to another. The proofs in [37] rely mainly on counting faces and using Morse inequalities.
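
A short sketch (illustrative only; the example vector alpha is a hypothetical choice) locates the dominant degree k and the exponent tau_k(alpha) for a given alpha, following the definitions preceding Theorem 3.4.

    from math import comb

    def psi(k, alpha):
        """psi_k(alpha) = sum_i C(k, i) * alpha_i (equals 0 for k = 0)."""
        return sum(comb(k, i) * a for i, a in enumerate(alpha, start=1))

    def dominant_degree(alpha):
        """Return k such that alpha lies in D_k, or None if alpha sits on a boundary."""
        d = len(alpha)
        for k in range(d):
            if psi(k, alpha) < 1 < psi(k + 1, alpha):
                return k
        return d if psi(d, alpha) < 1 else None

    alpha = (0.6, 0.2)                                    # d = 2, i.e., p_1 = n^{-0.6}, p_2 = n^{-0.2}
    k = dominant_degree(alpha)                            # here k = 1
    tau_k = sum(1 - psi(i, alpha) for i in range(1, k + 1))
    print(k, tau_k)                                       # Theorem 3.4: beta_k grows like n^{tau_k}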

3.1.4 Random geometric complexes

With respect to connectivity, the random geometric graph G(𝒫n,r)G(\mathcal{P}_{n},r) behaves quite similarly to the G(n,p)G(n,p) random graph. Denoting by Λ=ωdnrd\Lambda=\omega_{d}nr^{d} the expected degree, equal to the expected number of points in a ball of radius rr, with ωd\omega_{d} standing for the volume of the unit ball in d\mathbb{R}^{d}, the following was shown in [71].

Theorem 3.5.

Let G\sim G(\mathcal{P}_{n},r). Then

\lim_{n\to\infty}\mathbb{P}\left(G\text{ is connected}\right)=\begin{cases}1&\Lambda=\log n+w(n),\\ 0&\Lambda=\log n-w(n),\end{cases}

where w(n) is any function satisfying w(n)\to\infty.

In other words, in both models connectivity is achieved once the expected degree is larger than \log n.
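The threshold in Theorem 3.5 is easy to probe numerically. The sketch below (Python with numpy/scipy; the parameter choices are ours) simulates G(\mathcal{P}_{n},r) on the flat torus for expected degrees below, at, and above \log n, and estimates the probability of connectivity. It is a toy experiment, not part of any proof.

import numpy as np
from math import gamma, pi, log
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def connectivity_probability(n, Lam, d=2, trials=50, seed=0):
    # Estimate P(G(P_n, r) is connected) on the d-dimensional unit torus.
    rng = np.random.default_rng(seed)
    omega_d = pi ** (d / 2) / gamma(d / 2 + 1)  # volume of the unit ball in R^d
    r = (Lam / (omega_d * n)) ** (1 / d)        # solve Lam = omega_d * n * r^d for r
    hits = 0
    for _ in range(trials):
        m = rng.poisson(n)                      # Poisson point process with intensity n
        pts = rng.random((m, d))                # uniform points on [0,1)^d, i.e. the flat torus
        pairs = cKDTree(pts, boxsize=1.0).query_pairs(r, output_type='ndarray')
        A = coo_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(m, m))
        hits += connected_components(A, directed=False)[0] == 1
    return hits / trials

n = 2000
for Lam in (0.5 * log(n), log(n), 2.0 * log(n)):  # below, at, and above the threshold
    print(round(Lam, 1), connectivity_probability(n, Lam))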

Moving to higher dimensions, the geometric models start to exhibit behavior different from that of the combinatorial ones. The main difference is the following. In Y_{d}(n,p) and X(n,p), when p\to 1, homological connectivity describes the stage where homology becomes trivial. This is due to the fact that there is no structure underlying the complex. In the geometric complexes, the vertices are sampled over a metric space S, which might have its own intrinsic homology. Thus, for r large enough, one should expect the homology of the random complex (C(\mathcal{P}_{n},r) or R(\mathcal{P}_{n},r)) to “converge” to the homology of S, rather than to vanish. To account for this, the event we refer to here as the k-th homological connectivity is defined in [72] as

\mathcal{H}_{k,r}:=\left\{\forall s\geq r:H_{k}(C(\mathcal{P}_{n},s))\cong H_{k}(S)\right\}

for the Čech complex, and similarly for the Rips complex, with ‘C’ replaced by ‘R’. Note that \mathcal{H}_{k,r} is a monotone event (i.e., \mathcal{H}_{k,r_{1}}\subset\mathcal{H}_{k,r_{2}} for all r_{1}\leq r_{2}).

In the Čech complex, the phase transition in k,r\mathcal{H}_{k,r} is by now fully understood. The following was proved in [72].

Theorem 3.6.

Let S=\mathbb{T}^{d}. Then for 1\leq k\leq d-2,

\lim_{n\to\infty}\mathbb{P}\left(\mathcal{H}_{k,r}\right)=\begin{cases}1&\Lambda=2^{d}(\log n+(k-1)\log\log n+w(n)),\\ 0&\Lambda=2^{d}(\log n+(k-1)\log\log n-w(n)),\end{cases}

for any w(n)\to\infty. In addition,

\lim_{n\to\infty}\mathbb{P}\left(\mathcal{H}_{d-1,r}\right)=\lim_{n\to\infty}\mathbb{P}\left(\mathcal{H}_{d,r}\right)=\begin{cases}1&\Lambda=2^{d}(\log n+(d-1)\log\log n+w(n)),\\ 0&\Lambda=2^{d}(\log n+(d-1)\log\log n-w(n)).\end{cases}

While the proof there considers only the flat torus case, the results in [73] indicate that similar results could be achieved for any compact Riemannian manifold. The key idea behind the proof of Theorem 3.6 was to consider the evolution of the complex C(\mathcal{P}_{n},r) as r is increased, and to look for critical faces. By that we mean simplexes that facilitate changes in H_{k}(C(\mathcal{P}_{n},r)) when they first enter the complex. The analysis of critical faces employs Morse theory [74, 75]. The proof then consists of the following arguments:

  1. If \Lambda is large enough (\Lambda\gg\log n), we have coverage, i.e., B(\mathcal{P}_{n},r)=\mathbb{T}^{d}. Then, by the Nerve Lemma [65], we have that H_{k}(C(\mathcal{P}_{n},r))\cong H_{k}(\mathbb{T}^{d}).

  2. If \Lambda=2^{d}(\log n+(k-1)\log\log n+w(n)), we can show that w.h.p. no more critical faces that modify H_{k} enter the complex. In other words, H_{k} has reached its limit. By the previous argument the limit is H_{k}(\mathbb{T}^{d}), and thus we conclude that \mathcal{H}_{k,r} holds.

  3. If \Lambda=2^{d}(\log n+(k-1)\log\log n-w(n)), critical faces still enter the complex and make changes in H_{k}. Thus \mathcal{H}_{k,r} does not hold (w.h.p.).

Similarly to Theorems 3.1 and 3.2, the results in [72] also include a Poisson limit for the number of critical faces when \Lambda=2^{d}(\log n+(k-1)\log\log n+c). This result is the analogue of the isolated vertices in the random graph, since (a) critical faces are the ‘obstructions’ to homological connectivity, in the same way that isolated vertices are for connectivity, and (b) it can be shown that critical faces are indeed isolated when they first enter the complex. Since the exact definition of the critical faces requires more details, we will not present the exact theorem here, but refer the reader to [72, 76].

Homological connectivity for the Rips complex R(\mathcal{P}_{n},r) remains an open problem to date. The most recent result in this direction appeared in [77]. Using Discrete Morse Theory [78], Kahle was able to prove the following.

Theorem 3.7.

Let S be a compact and convex subset of \mathbb{R}^{d}. Then

\mathbb{E}\left\{\beta_{k}(R(\mathcal{P}_{n},r))\right\}=O\left(n\Lambda^{k}e^{-c\Lambda}\right).

In particular, if \Lambda=\frac{1+\epsilon}{c}\log n then \beta_{k}=0 (w.h.p.).

Note that this result is for compact and convex sets S, for which H_{k}(S)=0 for all k\geq 1. While this result is a significant step, it is still incomplete since (a) it does not provide a sharp phase transition, and in particular there is no reason to believe that the constant c provided by the proof is the optimal one, and (b) it does not apply to general manifolds.

3.2 Emergence of homology

Another fundamental result for random graphs concerns the appearance of cycles. The following was proved in [47] (for the G(n,M) model, but the statement for G(n,p) is equivalent).

Theorem 3.8.

Let G\sim G(n,p). Then

\lim_{n\to\infty}\mathbb{P}\left(G\text{ contains cycles}\right)=\begin{cases}1&np=c\geq 1,\\ \gamma(c)&np=c\in(0,1),\\ 0&np=o(1),\end{cases}

where \gamma(c)=1-\sqrt{1-c}\exp\left(\frac{c}{2}+\frac{c^{2}}{4}\right).

Note that the event ‘GG contains cycles’ can be phrased as H1(G)0H_{1}(G)\neq 0. In other words, Theorem 3.8 describes a phase transition for the emergence of H1H_{1} in G(n,p)=X1(n,p)G(n,p)=X_{1}(n,p).
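A quick Monte Carlo check of the subcritical case of Theorem 3.8 (Python with numpy; the experiment and its parameters are ours): sample G(n,c/n), detect cycles with a union-find pass over the edges, and compare the empirical frequency with \gamma(c). For moderate n the agreement is only approximate, since the theorem is asymptotic.

import numpy as np

def has_cycle(n, edges):
    # Union-find: the graph has a cycle iff some edge joins two already-connected vertices.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return True
        parent[ru] = rv
    return False

def cycle_probability(n, c, trials=500, seed=0):
    rng = np.random.default_rng(seed)
    u, v = np.triu_indices(n, k=1)          # all vertex pairs
    count = 0
    for _ in range(trials):
        keep = rng.random(len(u)) < c / n   # each pair becomes an edge with probability c/n
        count += has_cycle(n, zip(u[keep], v[keep]))
    return count / trials

c, n = 0.5, 300
gamma_c = 1 - np.sqrt(1 - c) * np.exp(c / 2 + c ** 2 / 4)
print(cycle_probability(n, c), "vs the limit", round(gamma_c, 4))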

In the following, we will review the results known to date about the emergence of kk-cycles for the various random simplicial complex models.

3.2.1 The random dd-complex

For the random dd-complex, the following is an aggregate of a collection of works.

Theorem 3.9.

Let Y\sim Y_{d}(n,p). There exists c_{d}^{*}>0 such that

\lim_{n\to\infty}\mathbb{P}\left(H_{d}(Y)\neq 0\right)=\begin{cases}1&np=c>c^{*}_{d},\\ \gamma_{d}(c)&np=c\in(0,c^{*}_{d}),\\ 0&np=o(1),\end{cases}

where \gamma_{d}(c)=1-\exp\left(-\frac{c^{d+2}}{(d+2)!}\right). In addition, for c\in(0,c_{d}^{*}) the only d-cycles are those generated by empty (d+1)-shells.

The first case was proved in [79], the middle case in [80], and the last case in [81]. An equation defining the critical values can be found in [79], while numerical approximations [80] yield c_{2}^{*}\approx 2.754, c_{3}^{*}\approx 3.907, c_{4}^{*}\approx 4.962, \ldots, c_{1000}^{*}\approx 1001. Roughly, it appears that c_{d}^{*}\approx d+1.
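Since Y_{d}(n,p) has a complete (d-1)-skeleton and no faces above dimension d, its top homology is controlled by a single boundary matrix: H_{d}(Y)\neq 0 exactly when the number of d-faces exceeds the rank of \partial_{d}. The following brute-force sketch (Python with numpy, GF(2) coefficients, names and parameters ours) estimates \mathbb{P}(H_{d}\neq 0) for small n around the numerical value c_{2}^{*}; it is far too slow for serious experiments, and at such sizes the asymptotics are only barely visible.

import numpy as np
from itertools import combinations

def rank_gf2(M):
    # Rank of a 0/1 matrix over GF(2) via Gaussian elimination.
    M = M.copy()
    r = 0
    for c in range(M.shape[1]):
        piv = np.nonzero(M[r:, c])[0]
        if piv.size == 0:
            continue
        M[[r, r + piv[0]]] = M[[r + piv[0], r]]
        others = np.setdiff1d(np.nonzero(M[:, c])[0], [r])
        M[others] ^= M[r]
        r += 1
        if r == M.shape[0]:
            break
    return r

def top_homology_nontrivial(n, d, p, rng):
    # Sample Y_d(n,p): complete (d-1)-skeleton, each d-face kept independently with probability p.
    d_faces = [f for f in combinations(range(n), d + 1) if rng.random() < p]
    rows = {f: i for i, f in enumerate(combinations(range(n), d))}  # all (d-1)-faces
    B = np.zeros((len(rows), len(d_faces)), dtype=np.uint8)
    for j, f in enumerate(d_faces):
        for v in f:
            B[rows[tuple(x for x in f if x != v)], j] = 1
    return len(d_faces) > rank_gf2(B)  # H_d != 0 iff the boundary map has a nontrivial kernel

rng = np.random.default_rng(1)
n, d = 30, 2
for c in (2.0, 2.754, 3.5):  # around the numerical value c_2* ~ 2.754
    freq = np.mean([top_homology_nontrivial(n, d, c / n, rng) for _ in range(20)])
    print(c, freq)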

Another closely related phase transition is for collapsibility. Briefly, a d-complex is collapsible if we can iteratively erase pairs of simplexes in dimensions d and d-1 without changing the topology of the complex. In [82] a phase transition for collapsibility was shown at critical values c_{d}^{\mathrm{col}}<c_{d}^{*}.

3.2.2 The random flag complex

Recall that the homology Hk(X(n,p))H_{k}(X(n,p)) is non-monotone (in pp). In Section 3.1.2 we saw that the largest value of pp such that Hk(X(n,p))H_{k}(X(n,p)) is non-trivial is when npk+1=Θ(logn)np^{k+1}=\Theta(\log n). In this section we look for the smallest possible value of pp.

Theorem 3.10.

Let X\sim X(n,p), and suppose that \lim_{n\to\infty}np^{k}<\infty. Then,

\lim_{n\to\infty}\mathbb{P}\left(H_{k}(X)\neq 0\right)=\begin{cases}1&np^{k}\geq k+1+\epsilon,\\ 0&np^{k}\leq n^{-\epsilon},\end{cases}

for any \epsilon>0.

The upper case was proved in [33], while the lower case was proved in [32]. While this result is not as sharp as the previous ones, it still implies a phase transition when npk=Θ(1)np^{k}=\Theta(1). Together with Theorem 3.3, we have that Hk(X)0H_{k}(X)\neq 0 for pp in the interval

\left[\left(\frac{k+1+\epsilon}{n}\right)^{\frac{1}{k}},\left(\frac{\left(\frac{k}{2}+1-\epsilon\right)\log n}{n}\right)^{\frac{1}{k+1}}\right].

3.2.3 Random geometric complexes

The behavior of random geometric complexes is similar in many ways to that of the random flag complex. In particular, for each kk we have two phase transitions – one for the emergence of the first kk-cycles, and another for homological connectivity (Section 3.1.4). Here we present the former phase transition.

For the Čech complex, combining the results from [77, 83], we have the following.

Theorem 3.11.

For 1\leq k\leq d-1,

\lim_{n\to\infty}\mathbb{P}\left(H_{k}(C_{r}(\mathcal{P}_{n}))\neq 0\right)=\begin{cases}1&n\Lambda^{k+1}=\omega(1),\\ \gamma_{k}(c)&n\Lambda^{k+1}=c\in(0,\infty),\\ 0&n\Lambda^{k+1}=o(1),\end{cases}

where \gamma_{k}(c)=1-\exp(-cA_{k}), for some constant A_{k}>0.

Note that the transition occurs when Λ=Θ(n1k+1)0\Lambda=\Theta(n^{-\frac{1}{k+1}})\to 0. In other words, the first cycles appear in a sparse regime where the expected degree is small. Consequently, it was shown in [77] that in this regime the only kk-cycles that appear are the smallest possible ones – i.e., (k+1)(k+1)-shells, which consist of (k+2)(k+2) vertices. The phase transition in Theorem 3.11 then essentially describes the appearance of these empty (k+1)(k+1)-shells.

A similar result was proved for the Rips complex.

Theorem 3.12.

For any k\geq 1,

\lim_{n\to\infty}\mathbb{P}\left(H_{k}(R_{r}(\mathcal{P}_{n}))\neq 0\right)=\begin{cases}1&n\Lambda^{2k+1}=\omega(1),\\ \tilde{\gamma}_{k}(c)&n\Lambda^{2k+1}=c\in(0,\infty),\\ 0&n\Lambda^{2k+1}=o(1),\end{cases}

where \tilde{\gamma}_{k}(c)=1-\exp(-c\tilde{A}_{k}), for some constant \tilde{A}_{k}>0.

There are two main differences between Theorems 3.11 and 3.12. Firstly, due to the Nerve Lemma [65], the Čech complex on 𝕋d\mathbb{T}^{d} (as well as any subset of d\mathbb{R}^{d}) cannot have any kk-cycles for k>dk>d. The Rips complex, on the other hand, may have kk-cycles for any kk. Secondly, the transition here occurs when Λ=Θ(n12k+1)\Lambda=\Theta\left(n^{-\frac{1}{2k+1}}\right), as opposed to Λ=Θ(n1k+1)\Lambda=\Theta\left(n^{-\frac{1}{k+1}}\right) in Theorem 3.11. This difference stems from the fact that empty shells cannot appear in the Rips complex. Thus, it can be shown that the smallest possible kk-cycle is achieved by generating an empty cross-polytope, which consists of 2k+22k+2 vertices.

3.3 Percolation-related phenomena

Percolation theory studies the emergence of an infinite or “giant” connected component in various random media. The study of higher-dimensional analogues is at a preliminary stage, and we describe here the most recent results.

In the case of random graphs, giant components are commonly defined in terms of the number of vertices. Specifically, for both the G(n,p) and G(\mathcal{P}_{n},r) models, by a “giant component” we refer to a component that consists of \Theta(n) many vertices. As opposed to connectivity, which we discussed earlier, this notion does not have a simple analogue in higher dimensions. In this section we present two different notions – one for the random d-complex and another for the random Čech complex.

3.3.1 Random dd-complex

Starting with the G(n,p)G(n,p) graph, the following was proved in [47].

Theorem 3.13.

Let G\sim G(n,p), and np=c\in(0,\infty). Denote by L_{n} the largest connected component in G. Then with high probability

L_{n}=\begin{cases}\Theta(n)&c>1,\\ \Theta(n^{2/3})&c=1,\\ \Theta(\log n)&c<1.\end{cases}

In other words, a giant component appears for c>1c>1. Further, it can be shown that in this case, the giant component is unique.

In order to generalize the emergence of the giant component to higher dimension, the notion of shadow was introduced in [84]. Suppose that XX is a dd-dimensional simplicial complex with a complete (d1)(d-1)-skeleton. The shadow of XX, denoted SH(X)\mathrm{SH}(X), is the set of all dd-dimensional faces σ\sigma such that (a) σX\sigma\not\in X, and (b) adding σ\sigma to XX generates a new dd-cycle.

In the case of the graph (i.e., d=1), the shadow is the set of all edges not in the graph whose addition generates a new cycle. In that case, it is easy to show that when c>1 we have |\mathrm{SH}(X)|=\Theta\left(\binom{n}{2}\right), so that the shadow has a positive density, and we say that the graph has a giant shadow. In [80], this phenomenon was generalized to higher dimensions.

Theorem 3.14.

Let Y\sim Y_{d}(n,p), and np=c\in(0,\infty). Then with high probability,

|\mathrm{SH}(Y)|=\begin{cases}\Theta\big{(}\binom{n}{d+1}\big{)}&c>c_{d}^{*},\\ \Theta(n)&c<c_{d}^{*},\end{cases}

where cdc_{d}^{*} is the critical value defined in Section 3.2.

In other words, when c>cdc>c_{d}^{*} a giant shadow emerges in the random dd-complex, similarly to what happens in the graph case.
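For d=1 the shadow can be computed directly: it consists of the non-edges whose endpoints already lie in the same connected component. The sketch below (Python with numpy/scipy; a toy illustration of ours, not code from [84] or [80]) estimates the shadow density |\mathrm{SH}(G)|/\binom{n}{2} of G(n,c/n) below and above c=1.

import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def shadow_density(n, c, seed=0):
    # Fraction of vertex pairs that are non-edges with both endpoints in the same component.
    rng = np.random.default_rng(seed)
    u, v = np.triu_indices(n, k=1)
    edge = rng.random(len(u)) < c / n
    A = coo_matrix((np.ones(edge.sum()), (u[edge], v[edge])), shape=(n, n))
    labels = connected_components(A, directed=False)[1]
    shadow = (labels[u] == labels[v]) & ~edge  # adding such a pair closes a new cycle
    return shadow.sum() / len(u)

for c in (0.5, 1.5, 3.0):
    print(c, round(shadow_density(2000, c), 4))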

3.3.2 Random geometric complexes

Similarly to the G(n,p)G(n,p) model, the random geometric graph G(𝒫n,r)G(\mathcal{P}_{n},r) exhibits a phase transition for the emergence of a giant component. This phase transition occurs when the expected degree Λ\Lambda is constant, and the corresponding critical value is denoted as λc\lambda_{c}.

In the context of geometric complexes, a different generalization of percolation was studied recently [85, 86]. Consider the random Čech complex C(\mathcal{P}_{n},r) defined in Section 2, where \mathcal{P}_{n} is a homogeneous Poisson process on the d-torus \mathbb{T}^{d}. One of the nice facts about the torus is that it has non-trivial homology for all 0\leq k\leq d, and in particular \beta_{k}(\mathbb{T}^{d})=\binom{d}{k}. In [85] the notion of a “giant k-cycle” was introduced, by which we mean any k-cycle in C(\mathcal{P}_{n},r) that corresponds to one of the nontrivial cycles in H_{k}(\mathbb{T}^{d}).

To define this more rigorously, we consider the union of balls B(𝒫n,r)B(\mathcal{P}_{n},r), and the fact that Hk(B(𝒫n,r))Hk(C(𝒫n,r))H_{k}(B(\mathcal{P}_{n},r))\cong H_{k}(C(\mathcal{P}_{n},r)). We then take the inclusion map i:B(𝒫n,r)𝕋di:B(\mathcal{P}_{n},r)\hookrightarrow\mathbb{T}^{d}, and its induced map i:Hk(B(𝒫n,r))Hk(𝕋d)i_{*}:H_{k}(B(\mathcal{P}_{n},r))\to H_{k}(\mathbb{T}^{d}). By a giant kk-cycle we refer to all nontrivial elements in the image Im(i)\operatorname{Im}(i_{*}). Finally, we define the events

\mathcal{E}_{k}:=\left\{\operatorname{Im}(i_{*})\neq 0\right\},\quad\mathcal{A}_{k}:=\left\{\operatorname{Im}(i_{*})=H_{k}(\mathbb{T}^{d})\right\}.

In other words, k\mathcal{E}_{k} is the event that some giant cycles exist, while 𝒜k\mathcal{A}_{k} is the event that all of them are present in B(𝒫n,r)B(\mathcal{P}_{n},r) (and correspondingly in C(𝒫n,r)C(\mathcal{P}_{n},r)). The following was proved in [85].

Theorem 3.15.

Suppose that \Lambda=\omega_{d}nr^{d}=\lambda\in(0,\infty).

  1. There exist \lambda_{1}^{0}\leq\lambda_{2}^{0}\leq\ldots\leq\lambda_{d-1}^{0}, such that if \lambda<\lambda_{k}^{0} then

     \mathbb{P}\left(\mathcal{A}_{k}\right)\leq\mathbb{P}\left(\mathcal{E}_{k}\right)\leq e^{-C_{k}^{0}n^{1/d}},

     for some C_{k}^{0}>0.

  2. There exist \lambda_{1}^{1}\leq\lambda_{2}^{1}\leq\ldots\leq\lambda_{d-1}^{1}, such that if \lambda>\lambda_{k}^{1} then

     \mathbb{P}\left(\mathcal{E}_{k}\right)\geq\mathbb{P}\left(\mathcal{A}_{k}\right)\geq 1-e^{-C_{k}^{1}n^{1/d}},

     for some C_{k}^{1}>0.

In addition, we have \lambda_{k}^{0}\leq\lambda_{k}^{1} for all k, and \lambda_{1}^{0}=\lambda_{1}^{1}=\lambda_{c}.

In other words, Theorem 3.15 suggests that there is a phase transition describing the emergence of the giant kk-cycles. However in order to complete the proof, one needs to show that λk0=λk1\lambda_{k}^{0}=\lambda_{k}^{1} for all kk.

At this stage the giant shadow and the giant cycle phenomena are incomparable. However, it should be noted that both occur in the regime where the expected degree (np or \Lambda) is finite, similarly to the giant component. It is an interesting question whether these are merely two ways to view the same phenomenon, or whether they describe completely different structures. Finally, we should note that other ideas for extending percolation-type phenomena to higher dimensions exist in the literature.

3.4 The fundamental group

Given a space XX, the fundamental group π1(X)\pi_{1}(X) represents the space of equivalence classes of loops in XX, where the equivalence is based on homotopy (a continuous transformation of one loop into the other). The space XX is simply connected if and only if π1(X)\pi_{1}(X) is trivial.

Note that the first homology group H_{1}(X) also represents loops, but under a different equivalence relation (being a boundary). In fact, it can be shown that H_{1}(X) is the abelianization of \pi_{1}(X). Thus, it is possible that H_{1}(X) is trivial while \pi_{1}(X) is not. Consequently, the vanishing threshold for \pi_{1} is larger than the threshold for H_{1}, as the following statements show.
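Explicitly, for a path-connected space X the Hurewicz theorem gives

H_{1}(X;\mathbb{Z})\cong\pi_{1}(X)/[\pi_{1}(X),\pi_{1}(X)],

so H_{1} retains only the abelian part of \pi_{1}, and killing H_{1} does not force \pi_{1} to vanish.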

We start with the random dd-complex. Here the only relevant dimension is d=2d=2, since otherwise the fundamental group is trivial. The following was proved in [87, 88].

Theorem 3.16.

Let Y\sim Y_{2}(n,p). Then,

\lim_{n\to\infty}\mathbb{P}\left(\pi_{1}(Y)=0\right)=\begin{cases}1&np=cn^{\frac{1}{2}},\\ 0&np=n^{\frac{1}{2}-\epsilon},\end{cases}

where c is any constant bigger than \sqrt{3^{3}/4^{4}}.

Further, when np=n1/2ϵnp=n^{1/2-\epsilon} it is shown in [87] that the fundamental group is hyperbolic. It is also conjectured in [88] that in fact c=33/44c=\sqrt{3^{3}/4^{4}} is the exact sharp threshold for simple connectivity.

For the random flag complex, the existing results [89] are even less sharp.

Theorem 3.17.

Let X\sim X(n,p). Then for any

n^{\frac{1}{2}+\epsilon}\leq np\leq n^{\frac{2}{3}-\epsilon},

with high probability π1(X)\pi_{1}(X) is nontrivial and hyperbolic.

Note that the algebraic structure of the fundamental group is more intricate than that of H_{1}(X), and thus several other of its properties have been studied in the literature. For example: its geometric and cohomological dimension [90], torsion [91], property T [33], freeness [92], and more.

4 Limit theorems

The results in the previous section focused on a rather qualitative analysis of phase transitions. Another direction of study is to consider some numerical marginals of homology (e.g., the Betti numbers) and find their limiting distribution. In this section we will briefly review some of the results in this direction.

We will not discuss the proofs here; however, we note that in most cases they rely on Stein’s method [93, 94]. Briefly, this method allows one to prove central limit theorems and convergence to the normal and Poisson distributions in cases that involve sums of random variables that are not independent.

Notation.

In this section we will use anbna_{n}\approx b_{n} to denote that an/bn1a_{n}/b_{n}\to 1 as nn\to\infty, and anbna_{n}\ll b_{n} to denote that an/bn0a_{n}/b_{n}\to 0. In addition, 𝐷\xrightarrow{D} stands for convergence in distribution, and N(μ,σ2)N(\mu,\sigma^{2}) for the normal distribution with mean μ\mu and variance σ2\sigma^{2}.

4.1 Betti numbers

As we have seen in Section 3.2, the kk-th homology of X(n,p)X(n,p) is nontrivial roughly when pp is between n1/kn^{-1/k} and n1/(k+1)n^{-1/(k+1)} (up to constants). It was shown in [32] that when n1/kpn1/(k+1)n^{-1/k}\ll p\ll n^{-1/(k+1)} then

\mathbb{E}\left\{\beta_{k}(X(n,p))\right\}\approx\frac{n^{k}}{(k+1)!}p^{\binom{k+1}{2}}.

In addition, the following central limit theorem was proved in [83].

Theorem 4.1.

Let X\sim X(n,p), and let n^{-1/k}\ll p\ll n^{-1/(k+1)}. Then

\frac{\beta_{k}(X)-\mathbb{E}\left\{\beta_{k}(X)\right\}}{\sqrt{\mathrm{Var}\left(\beta_{k}(X)\right)}}\xrightarrow{D}N(0,1).

In the geometric complexes case, the results usually split according to the expected degree Λ\Lambda. In the case where Λ0\Lambda\to 0 (sometimes called the sparse regime), the homology is dominated by empty shells, and thus it is relatively easy to compute the Betti numbers. For example, it was proved in [77] that

\mathbb{E}\left\{\beta_{k}(C(\mathcal{P}_{n},r))\right\}\approx A_{k}n\Lambda^{k+1},

where Ak>0A_{k}>0 is a known constant [77]. Note that the limit of the right hand side can be zero, finite, or infinite. Consequently we have three possible limits. The following was proved in [83].

Theorem 4.2.

Let CC(𝒫n,r)C\sim C(\mathcal{P}_{n},r), and suppose that Λ0\Lambda\to 0.

  1. If n\Lambda^{k+1}\to 0 then \beta_{k}(C)=0 with high probability.

  2. If n\Lambda^{k+1}=\alpha\in(0,\infty) then \beta_{k}(C)\xrightarrow{D}\mathrm{Pois}\left(A_{k}\alpha\right).

  3. If n\Lambda^{k+1}\to\infty then \frac{\beta_{k}(C)-\mathbb{E}\left\{\beta_{k}(C)\right\}}{\sqrt{\mathrm{Var}\left(\beta_{k}(C)\right)}}\xrightarrow{D}N(0,1).

Similar results hold for the Rips complex as well, just with nΛ2k+1n\Lambda^{2k+1} replacing nΛk+1n\Lambda^{k+1} [83].

The regime where Λ\Lambda is finite and non-vanishing is sometimes referred to as the thermodynamic limit. It can be shown that most of the cycles in either the Čech or Rips complexes are generated in this regime. However, exact counting here is much more difficult. It was shown in [77] that 𝔼{βk(C(𝒫n,r))}=Θ(n)\mathbb{E}\left\{{\beta_{k}(C(\mathcal{P}_{n},r))}\right\}=\Theta(n) (and the same for the Rips). However the exact expectation is not known.

Nevertheless, one can prove laws of large numbers as well as central limit theorems for the Betti number functionals. Results of this type were first introduced in [95], and have been improved over the years [96, 97, 98]. The main tools used are stabilization methods for random point processes (e.g., [99]). The result is as follows.

Theorem 4.3.

Let C\sim C(\mathcal{P}_{n},r), and suppose that \Lambda=\lambda\in(0,\infty). Then

\frac{\beta_{k}(C)-\mathbb{E}\left\{\beta_{k}(C)\right\}}{\sqrt{n}}\xrightarrow{D}N(0,\sigma^{2}(\lambda)),

for some \sigma^{2}(\lambda)>0.

A similar result applies to the Rips complex as well. In addition, the binomial process \mathcal{X}_{n} was studied in [98, 100], and similar limit theorems were proved.

In the dense regime, i.e., \Lambda\to\infty, there are very few results. In particular, even the scale of the expected Betti numbers is not known. The only case analyzed is the Poisson limit when \Lambda\approx\log n described earlier, as part of the homological connectivity phenomenon.

4.2 Topological types

In [101] a completely different type of limit theorem was proved. There, the goal was to study the distribution of the different topological types that may appear in a random Čech complex. Let XX be a simplicial complex, and let 𝒞(X)\mathcal{C}(X) be the set of all connected components of XX (so that β0(X)=|𝒞(X)|\beta_{0}(X)=|\mathcal{C}(X)|). Define the empirical measure

\mu_{X}:=\frac{1}{\beta_{0}(X)}\sum_{C\in\mathcal{C}(X)}\delta_{[C]},

where δ\delta stands for the Dirac delta measure, and [C][C] stands for the equivalence class of all components homotopy equivalent to CC (i.e. so that one component can be “continuously transformed” into the other, see [102]).

Theorem 4.4.

Let CC(𝒫n,r)C\sim C(\mathcal{P}_{n},r), and denote μ^n:=μC\hat{\mu}_{n}:=\mu_{C}. Then the random measure μ^n\hat{\mu}_{n} converges in probability to a universal probability measure. The support of the distribution is the set of all connected (finite) Čech complexes in d\mathbb{R}^{d}.

Note that universality here means that the limit is the same regardless of the underlying manifold SS.

4.3 Persistent homology

In the context of Topological Data Analysis (TDA), one is often more interested in studying persistent homology rather than just the (static) homology. In this section we consider the persistent homology over filtrations of either Čech or Rips complexes, namely {C(𝒫n,r)}r=0\left\{C(\mathcal{P}_{n},r)\right\}_{r=0}^{\infty} or {R(𝒫n,r)}r=0\left\{R(\mathcal{P}_{n},r)\right\}_{r=0}^{\infty}, respectively.

4.3.1 Limit theorems

One useful quantity that can be extracted from persistent homology is the persistent Betti numbers. Briefly, for any s\leq t, we denote by \beta_{k}^{(s,t)} the number of k-cycles born before radius s that die after radius t (see [96] for a formal definition). In [96] the following central limit theorem was proved.
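Concretely, if a degree-k persistence diagram is stored as an array of (birth, death) pairs, then \beta_{k}^{(s,t)} is simply the number of points born by s that are still alive at t. A minimal sketch (Python with numpy; the sample diagram is made up for illustration, and conventions on strict versus weak inequalities vary):

import numpy as np

def persistent_betti(diagram, s, t):
    # Number of points with birth <= s and death > t, i.e. cycles alive on all of [s, t].
    birth, death = diagram[:, 0], diagram[:, 1]
    return int(np.sum((birth <= s) & (death > t)))

# A made-up degree-k diagram: each row is a (birth, death) pair of radii.
diagram = np.array([[0.10, 0.12], [0.15, 0.90], [0.20, 0.35], [0.40, 0.55]])
print(persistent_betti(diagram, s=0.25, t=0.50))  # -> 1, only the (0.15, 0.90) cycle counts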

Theorem 4.5.

Let C\sim C(\mathcal{P}_{n},r), and suppose that ns^{d}=\alpha and nt^{d}=\beta, for some \alpha\leq\beta\in(0,\infty). Then

\frac{\beta_{k}^{(s,t)}(C)-\mathbb{E}\left\{\beta_{k}^{(s,t)}(C)\right\}}{\sqrt{n}}\xrightarrow{D}N(0,\sigma^{2}(\alpha,\beta)),

for some \sigma^{2}(\alpha,\beta)>0.

Note that this theorem describes the persistent Betti numbers in the thermodynamic limit, where most cycles are born and die. A similar theorem applies for R(\mathcal{P}_{n},r), and in fact [96] presents this central limit theorem for an even larger class of geometric complexes. In [97] Theorem 4.5 was further extended to a multidimensional central limit theorem (i.e., for a sequence (\alpha_{1},\beta_{1}),\ldots,(\alpha_{m},\beta_{m})). Finally, note that Theorem 4.3 is in fact a special case of Theorem 4.5 in the case where \alpha=\beta.

Recall that a common representation for persistent homology is via a persistence diagram, which is essentially a finite subset of \Delta=\left\{(x,y):y\geq x\right\}\subset\mathbb{R}^{2}, where the coordinates of each point correspond to the birth and death times of a persistence interval. Thus, persistence diagrams attached to random finite simplicial complexes can be thought of as random subsets of \Delta, or equivalently random discrete measures on \Delta. One can then employ the theory of random sets and random measures to study the limit of such diagrams.

Consider the random Čech filtration {C(𝒫n,r)}r=0\left\{C(\mathcal{P}_{n},r)\right\}_{r=0}^{\infty}, and let μk(n)\mu_{k}(n) be the discrete measure representing the persistence diagram in degree kk. The main result in [96] shows that there exists a limiting Radon measure on Δ\Delta,

\mu_{k}=\lim_{n\to\infty}\frac{\mu_{k}(n)}{n}.

The same is true for the Rips filtration as well. This statement is essentially a law of large numbers for the persistence diagram as a Radon measure in 2\mathbb{R}^{2}.

A different approach was taken in [103], where persistence diagrams were studied from the point of view of set-topology. In that case it was shown that each persistence diagram can be roughly split into three different areas. The limit of the persistence diagram in the sparse region is empty. In the intermediate region it has a limiting Poisson process distribution. Finally, in the dense region, the persistence diagram converges to a solid (nonrandom) two-dimensional region.

4.3.2 Maximal cycles

Another interesting and significant type of limit for persistence diagrams was presented in [104]. There, it was assumed that SS is a unit box (though the results should not depend on that). The quantity considered was the maximal death/birth ratio among all persistent cycles of degree kk, denoted Πk\Pi_{k}.

Theorem 4.6.

Let \Pi_{k} be the maximal death/birth ratio over the k-cycles of either the Rips or Čech complex. Then there exist A_{k}<B_{k} such that with high probability

A_{k}\left(\frac{\log n}{\log\log n}\right)^{1/k}\leq\Pi_{k}\leq B_{k}\left(\frac{\log n}{\log\log n}\right)^{1/k}.

Theorem 4.6 provides the scaling for the multiplicative persistence value in a random geometric complex. Since the unit box has no intrinsic homology, this result can be thought of as describing the persistence of noisy cycles, i.e., cycles that are not part of the underlying topology. It can be shown that the same result applies for other compact spaces S, as long as we do not count cycles that belong to H_{k}(S). Recalling Section 3.3.2, such “signal” cycles (those in H_{k}(S)) are born when \Lambda is constant, implying that the birth time is r\propto n^{-1/d}. On the other hand, the signal cycles die when r is a constant (depending on S but not on n). Therefore, for the signal cycles we have (\mathrm{death}/\mathrm{birth})\propto n^{1/d}. In other words, taking death/birth as a measure of persistence, Theorem 4.6 shows that asymptotically noisy cycles are significantly less persistent than the signal ones, and one should be able to differentiate between them, assuming a large enough sample.
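In practice, the separation suggested by Theorem 4.6 can be inspected directly on a diagram by comparing the death/birth ratios with the (\log n/\log\log n)^{1/k} scale. The sketch below (Python with numpy) does exactly that; the thresholding rule and its constant are ours and purely heuristic, not a statistical procedure from [104].

import numpy as np

def persistence_ratios(diagram):
    # Multiplicative persistence death/birth for each point of a (birth, death) diagram.
    return diagram[:, 1] / diagram[:, 0]

def flag_candidate_signal(diagram, n, k, const=1.0):
    # Heuristically keep points whose ratio exceeds const * (log n / log log n)^(1/k).
    scale = (np.log(n) / np.log(np.log(n))) ** (1.0 / k)
    return diagram[persistence_ratios(diagram) > const * scale]

# Made-up degree-1 diagram: one long-lived cycle and a few short-lived ones.
n, k = 10_000, 1
diagram = np.array([[0.02, 0.40], [0.03, 0.05], [0.05, 0.08], [0.04, 0.09]])
print(flag_candidate_signal(diagram, n, k))  # only the (0.02, 0.40) point is kept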

5 Other directions

In Sections 3 and 4 we presented some of the most fundamental results proved in the mathematical literature on random simplicial complexes. There are numerous important studies that we omit for space reasons, but we would like to mention them briefly here.

  • Spectrum and expansion: Similarly to graphs, one can define adjacency and Laplacian operators for simplicial complexes, as well as different types of expansion properties. These were studied mainly in the context of the random d-complex Y_{d}(n,p) [105, 106, 48, 107].

  • Functional limit theorems: One can examine functionals such as the Betti numbers and the Euler characteristic in a dynamic setting, and seek a limit in the form of a stochastic (Gaussian) process. In [103, 108], geometric complexes were studied, and the dynamics was the growing connectivity radius of the complex. The results show that the limiting process is indeed Gaussian. In [109], the random flag complex was studied. Here, the edges used to construct the complex turn on and off, following stationary Markov dynamics. The limiting processes are shown to be Ornstein-Uhlenbeck (Gaussian) processes. These results were further extended to the multiparameter complex [110].

  • Spanning acycles: Similarly to the study of spanning trees in graphs, one can consider spanning acycles in simplicial complexes. Suppose that K is a complete (d-1)-skeleton on n vertices. A set of d-faces S is considered a spanning acycle if \beta_{d-1}(K\cup S)=\beta_{d}(K\cup S)=0. In other words, adding S to K kills all the existing (d-1)-cycles (hence “spanning”), while not generating any new d-cycles (hence “acycle”). This definition invites the study of spanning acycles in the random d-complex. In [111, 112] the weights of faces in the minimal spanning acycle and their connection to the persistence diagram were studied.

6 Future directions

We hope our review illustrates well that the recent progress in the fundamental study of random simplicial complexes in mathematics has been extremely rapid. As far as applications to real-world systems in network/data science are concerned, the progress has been as rapid, but the field is definitely less mature than its 1-dimensional counterpart dealing with applications of random graph models. There, a collection of important and ubiquitous properties of real-world networks abstracted as graphs have been long identified and well studied. Such properties include sparsity, scale-free degree distributions, degree correlations, clustering, small-worldness, community structure, self-similarity, and so on [113, 114]. Consequently, there has been an impressive body of work on null models that reproduce these properties in a statistically unbiased manner, and on growing network models that attempt to shed light on possible mechanisms that might lead to the formation of these properties in real-world systems [113, 114].

When the same systems are modeled as simplicial complexes or hypergraphs, the list of important properties one should be looking at is much less understood. Consequently, the world of models dealing with such properties is much less explored, as it is not entirely clear what properties one should be most concerned with in the first place.

In addition to its relative infancy, the other reason why the world of random complex models has not yet been navigated as much is that models of even the simplest higher-dimensional properties tend to be much more complicated, not necessarily at the technical level as much as at the conceptual level.

Consider, for example, the null models of the degree distribution, i.e., the configuration models. Since, as opposed to graphs, there are not one but {n\choose 2} notions of degree in a simplicial complex of size n (d-degrees of d^{\prime}-simplexes for any pair of dimensions d,d^{\prime}, 0\leq d^{\prime}<d\leq n-1), it may not be immediately clear what degrees to focus on. The degrees that the \mathrm{(S)CM}_{d}(n,\mathbf{k}) and Z-\mathrm{(S)CM}_{d}(n,\mathbf{k}) models reproduce in Sections 2.3.2 and 2.5.2 are the d-degrees of (d-1)- and 0-dimensional simplexes, respectively, with very different constraints imposed on the underlying simplicial structure.
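To make the notion of d-degrees of d^{\prime}-simplexes concrete, the following sketch (Python; a toy helper of ours, not an implementation of the (S)CM models) computes, for a complex given by a list of its maximal faces, the number of d-faces containing each d^{\prime}-face.

from itertools import combinations
from collections import Counter

def generalized_degrees(maximal_faces, d, d_prime):
    # For every d'-face contained in some d-face, count the d-faces containing it.
    faces_d = set()
    for f in maximal_faces:  # the d-faces are the (d+1)-subsets of the maximal faces
        faces_d.update(combinations(sorted(f), d + 1))
    deg = Counter()
    for f in faces_d:
        for tau in combinations(f, d_prime + 1):  # every d'-face of this d-face
            deg[tau] += 1
    return deg  # d'-faces not contained in any d-face (degree 0) are omitted

# Toy complex on vertices 0..4, given by its maximal faces.
complex_faces = [(0, 1, 2, 3), (1, 2, 4), (3, 4)]
print(generalized_degrees(complex_faces, d=2, d_prime=0))  # 2-degrees of vertices
print(generalized_degrees(complex_faces, d=2, d_prime=1))  # 2-degrees of edges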

In that regard, these two models are the two simplest extremal versions of a great variety of configuration models we can think of. Such models can reproduce sequences of dd-degrees of dd^{\prime}-simplexes for any pair of dimensions d,dd,d^{\prime}, in which case the defining equations (2.3, 2.4) remain the same, except that simplexes τ\tau in those equations are now understood to be of dimension dd^{\prime}. We may also want to jointly reproduce different combinations of degree sequences of different dimensions d,dd,d^{\prime} satisfying structural constraints. Such models, based on generalized degree sequences observed in real data, could serve as more adequate and more realistic null models of the data. The first step in this direction has been made in [51].

Another example is homology, which we discussed at length in this chapter. Homology is known to be a powerful tool to capture information about the qualitative structure of simplicial complexes, and its theory in the random setting is constantly growing. However, as a tool to analyze high-dimensional networks, it is not yet as clear what part of the information provided by homology is useful and for what purposes. For example, while H0H_{0} (connected components) and H1H_{1} (cycles) are quite intuitive, the role of H2H_{2} as a network descriptor is not as clear. Neither is it entirely clear what useful information is contained in the number of kk-cycles, their physical size, their persistent lifetime, etc. So while the theoretical homological analysis is fruitful and highly interesting, it tends to be a challenge to see how and when to use it in application to real-world networks.

The degree distribution and homology are certainly just two out of many properties that the future work may find interesting and important to model in more realistic models of random simplicial complexes. For now, the list of such models is quite rudimentary for the reasons above, so that it is not surprising that rigorous mathematical results are available only for very basic models. Yet we have seen that even for these simplest models, the spectrum of available results is extremely rich and complex.

We have also seen that these results mostly address direct problems, whereas in dealing with real-world data one often faces inverse problems. For example, the results presented in Section 4.3 tell us facts about the persistence diagrams in “laboratory-controlled” settings, i.e. for given complexes in given spaces with known topology—a direct problem. An inverse problem would be to make any statistical inferences about the space topology from a persistence diagram coming from real-world data. Some progress in this direction has been made over the past decade [115, 116, 117, 118, 119], but this is still an active and important area of research to pursue.

Finally, we comment on one potentially very interesting direction of future research in mathematics of random complexes. One of the most fundamentally important recent achievements in mathematics of random graphs is the development and essential completion of the theory of limits of dense graphs known as graphons [60, 61]. One of the main results in that theory is that, even though there are many very different notions of graph convergence, they all are equivalent for dense graphs, and if a family of dense graphs converge, they converge to a (random) graphon [120]. As mentioned in Section 2.4.1, a graphon is an integrable connection probability function W:[0,1]2[0,1]W:[0,1]^{2}\to[0,1] modulo a certain equivalence relation, and a graphon-based random graph of size nn—known as a WW-random graph in the graphon theory—is GG(n,r^)G\sim G(n,\hat{r}), where rij=W(Xi,Xj)r_{ij}=W(X_{i},X_{j}) and XiU(0,1)X_{i}\sim U(0,1). Can an analogous theory be constructed—or discovered, depending on one’s philosophical view—for the limits of dense complexes, e.g. complexons? Complexons may probably be related to hypergraphons [121, 122, 123, 124], the limits of dense hypergraphs, via the lower and upper complexes Z¯(n,𝐩^)\underline{Z}(n,\hat{{\bf{p}}}) and Z¯(n,𝐩^)\overline{Z}(n,\hat{{\bf{p}}}) discussed in Section 2.5.1.

References

  • Linial and Meshulam [2006] N. Linial and R. Meshulam, Homological connectivity of random 2-complexes, Combinatorica 26, 475 (2006).
  • Bianconi and Rahmede [2015] G. Bianconi and C. Rahmede, Complex Quantum Network Manifolds in Dimension d>2d>2 are Scale-Free, Sci Rep 5, 13979 (2015).
  • Bianconi et al. [2015] G. Bianconi, C. Rahmede,  and Z. Wu, Complex quantum network geometries: Evolution and phase transitions, Phys Rev E 92, 022815 (2015).
  • Bianconi and Rahmede [2016] G. Bianconi and C. Rahmede, Network geometry with flavor: From complexity to quantum geometry, Phys Rev E 93, 53 (2016).
  • Courtney and Bianconi [2017] O. T. Courtney and G. Bianconi, Weighted growing simplicial complexes, Phys Rev E 95, 062301 (2017).
  • Fountoulakis et al. [2019] N. Fountoulakis, T. Iyer, C. Mailler,  and H. Sulzbach, Dynamical Models for Random Simplicial Complexes,  (2019), arXiv:1910.12715 .
  • Kovalenko et al. [2021] K. Kovalenko, I. Sendiña-Nadal, N. Khalil, A. Dainiak, D. Musatov, A. M. Raigorodskii, K. Alfaro-Bittner, B. Barzel,  and S. Boccaletti, Growing scale-free simplices, Commun Phys 4, 43 (2021).
  • Bianconi et al. [2019] G. Bianconi, I. Kryven,  and R. M. Ziff, Percolation on branching simplicial and cell complexes and its relation to interdependent percolation, Phys Rev E 100, 062311 (2019).
  • Iacopini et al. [2019] I. Iacopini, G. Petri, A. Barrat,  and V. Latora, Simplicial models of social contagion, Nat Commun 10, 2485 (2019).
  • Schaub et al. [2020] M. T. Schaub, A. R. Benson, P. Horn, G. Lippner,  and A. Jadbabaie, Random Walks on Simplicial Complexes and the Normalized Hodge 1-Laplacian, SIAM Rev 62, 353 (2020).
  • Gambuzza et al. [2021] L. V. Gambuzza, F. Di Patti, L. Gallo, S. Lepri, M. Romance, R. Criado, M. Frasca, V. Latora,  and S. Boccaletti, Stability of synchronization in simplicial complexes, Nat Commun 12, 1255 (2021).
  • Lee et al. [2021] Y. Lee, J. Lee, S. M. Oh, D. Lee,  and B. Kahng, Homological percolation transitions in growing simplicial complexes, Chaos An Interdiscip J Nonlinear Sci 31, 041102 (2021).
  • Battiston et al. [2020] F. Battiston, G. Cencetti, I. Iacopini, V. Latora, M. Lucas, A. Patania, J.-g. Young,  and G. Petri, Networks beyond pairwise interactions: Structure and dynamics, Phys Rep 874, 1 (2020).
  • Jaynes [1957] E. T. Jaynes, Information Theory and Statistical Mechanics, Phys Rev 106, 620 (1957).
  • Shannon [1948] C. E. Shannon, A Mathematical Theory of Communication, Bell Syst Tech J 27, 379 (1948).
  • Shore and Johnson [1980] J. Shore and R. Johnson, Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy, IEEE Trans Inf Theory 26, 26 (1980).
  • Tikochinsky et al. [1984] Y. Tikochinsky, N. Z. Tishby,  and R. D. Levine, Consistent Inference of Probabilities for Reproducible Experiments, Phys Rev Lett 52, 1357 (1984).
  • Skilling [1988] J. Skilling, The Axioms of Maximum Entropy, in Maximum-Entropy and Bayesian Methods in Science and Engineering (Springer Netherlands, Dordrecht, 1988) pp. 173–187.
  • Rassoul-Agha and Seppäläinen [2015] F. Rassoul-Agha and T. Seppäläinen, A Course on Large Deviations with an Introduction to Gibbs Measures, Graduate Studies in Mathematics, Vol. 162 (American Mathematical Society, Providence, Rhode Island, 2015).
  • Chatterjee [2017] S. Chatterjee, Large Deviations for Random Graphs, Lecture Notes in Mathematics, Vol. 2197 (Springer International Publishing, Cham, 2017).
  • Zuev et al. [2015] K. Zuev, O. Eisenberg,  and D. Krioukov, Exponential random simplicial complexes, Journal of Physics A: Mathematical and Theoretical 48, 465002 (2015).
  • Cover and Thomas [2005] T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley, Hoboken, NJ, 2005).
  • Solomonoff and Rapoport [1951] R. Solomonoff and A. Rapoport, Connectivity of random nets, B Math Biophys 13, 107 (1951).
  • Erdős and Rényi [1959] P. Erdős and A. Rényi, On Random Graphs, Publ Math 6, 290 (1959).
  • Gilbert [1959] E. N. Gilbert, Random Graphs, Ann Math Stat 30, 1141 (1959).
  • Anand and Bianconi [2009] K. Anand and G. Bianconi, Entropy measures for networks: Toward an information theory of complex topologies, Phys Rev E 80, 045102 (2009).
  • Squartini et al. [2015] T. Squartini, J. de Mol, F. den Hollander,  and D. Garlaschelli, Breaking of Ensemble Equivalence in Networks, Phys Rev Lett 115, 268701 (2015).
  • Janson [2010] S. Janson, Asymptotic equivalence and contiguity of some random graphs, Random Struct Algor 36, 26 (2010).
  • Meshulam and Wallach [2009] R. Meshulam and N. Wallach, Homological connectivity of random k-dimensional complexes, Random Structures & Algorithms 34, 408 (2009).
  • Kahle and Pittel [2016] M. Kahle and B. Pittel, Inside the critical window for cohomology of random k-complexes, Random Structures & Algorithms 48, 102 (2016).
  • Luczak and Peled [2018] T. Luczak and Y. Peled, Integral homology of random simplicial complexes, Discrete & Computational Geometry 59, 131 (2018).
  • Kahle [2009] M. Kahle, Topology of random clique complexes, Discrete Mathematics 309, 1658 (2009).
  • Kahle [2014a] M. Kahle, Sharp vanishing thresholds for cohomology of random flag complexes, Annals of Mathematics 179, 1085 (2014a).
  • Kahle [2014b] M. Kahle, Topology of random simplicial complexes: a survey, in AMS Contemp Vol Math, Vol. 620 (2014) pp. 201–221.
  • Costa and Farber [2015a] A. Costa and M. Farber, in Proc IMA Conf Math Robot, September (Institute of Mathematics and its Applications, 2015) pp. 1–8.
  • Costa and Farber [2016a] A. Costa and M. Farber, Random Simplicial Complexes, in Config Spaces Geom Topol Represent Theory (2016) pp. 129–153.
  • Costa and Farber [2016b] A. Costa and M. Farber, Large random simplicial complexes, I, J Topol Anal 08, 399 (2016b).
  • Söderberg [2002] B. Söderberg, General formalism for inhomogeneous random graphs, Phys Rev E 66, 066121 (2002).
  • Park and Newman [2004] J. Park and M. E. J. Newman, Statistical mechanics of networks, Phys Rev E 70, 066117 (2004).
  • Britton et al. [2006] T. Britton, M. Deijfen,  and A. Martin-Löf, Generating Simple Random Graphs with Prescribed Degree Distribution, J Stat Phys 124, 1377 (2006).
  • Bollobás et al. [2007] B. Bollobás, S. Janson,  and O. Riordan, The phase transition in inhomogeneous random graphs, Random Struct Algor 31, 3 (2007).
  • Bianconi [2008] G. Bianconi, The entropy of randomized network ensembles, EPL 81, 28005 (2008).
  • Garlaschelli and Loffredo [2008] D. Garlaschelli and M. Loffredo, Maximum likelihood: Extracting unbiased information from complex networks, Phys Rev E 78, 015101 (2008).
  • Chatterjee et al. [2011] S. Chatterjee, P. Diaconis,  and A. Sly, Random graphs with a given degree sequence, Ann Appl Probab 21, 1400 (2011).
  • Bender and Canfield [1978] E. A. Bender and E. R. Canfield, The asymptotic number of labeled graphs with given degree sequences, J Comb Theory, Ser A 24, 296 (1978).
  • Molloy and Reed [1995] M. Molloy and B. Reed, A critical point for random graphs with a given degree sequence, Random Struct Algor 6, 161 (1995).
  • Erdős and Rényi [1960] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci 5, 17 (1960).
  • Lubotzky et al. [2019] A. Lubotzky, Z. Luria,  and R. Rosenthal, Random Steiner systems and bounded degree coboundary expanders of every dimension, Discrete & Computational Geometry 62, 813 (2019), publisher: Springer.
  • Keevash [2014] P. Keevash, The existence of designs, arXiv preprint arXiv:1401.3665  (2014).
  • Courtney and Bianconi [2016] O. T. Courtney and G. Bianconi, Generalized network structures: The configuration model and the canonical ensemble of simplicial complexes, Phys Rev E 93, 062311 (2016).
  • Young et al. [2017] J.-G. Young, G. Petri, F. Vaccarino,  and A. Patania, Construction of and efficient sampling from the simplicial configuration model, Phys Rev E 96, 032312 (2017).
  • Caldarelli et al. [2002] G. Caldarelli, A. Capocci, P. De Los Rios,  and M. A. Muñoz, Scale-Free Networks from Varying Vertex Intrinsic Fitness, Phys Rev Lett 89, 258702 (2002).
  • Boguñá and Pastor-Satorras [2003] M. Boguñá and R. Pastor-Satorras, Class of correlated random networks with hidden variables, Phys Rev E 68, 036112 (2003).
  • Holland et al. [1983] P. W. Holland, K. B. Laskey,  and S. Leinhardt, Stochastic blockmodels: First steps, Soc Networks 5, 109 (1983).
  • Sorokin [1927] A. P. Sorokin, Social Mobility (Harper, New York, 1927).
  • McFarland and Brown [1973] D. D. McFarland and D. J. Brown, Social distance as a metric: A systematic introduction to smallest space analysis, in Bonds of Pluralism: The Form and Substance of Urban Social Networks (John Wiley, New York, 1973) pp. 213–252.
  • Hoff et al. [2002] P. D. Hoff, A. E. Raftery,  and M. S. Handcock, Latent Space Approaches to Social Network Analysis, J Am Stat Assoc 97, 1090 (2002).
  • Gilbert [1961] E. N. Gilbert, Random Plane Networks, J Soc Ind Appl Math 9, 533 (1961).
  • Penrose [2003] M. Penrose, Random Geometric Graphs (Oxford University Press, Oxford, 2003).
  • Lovász [2012] L. Lovász, Large Networks and Graph Limits (American Mathematical Society, Providence, RI, 2012).
  • Janson [2013] S. Janson, Graphons, cut norm and distance, couplings and rearrangements, NYJM Monogr 4 (2013).
  • van der Hoorn et al. [2018] P. van der Hoorn, G. Lippner,  and D. Krioukov, Sparse Maximum-Entropy Random Graphs with a Given Power-Law Degree Distribution, J Stat Phys 173, 806 (2018).
  • Last and Penrose [2017] G. Last and M. Penrose, Lectures on the Poisson Process, October (Cambridge University Press, Cambridge, 2017).
  • De Silva and Ghrist [2007] V. De Silva and R. Ghrist, Coverage in sensor networks via persistent homology, Algebraic & Geometric Topology 7, 339 (2007).
  • Borsuk [1948] K. Borsuk, On the imbedding of systems of compacta in simplicial complexes, Fundamenta Mathematicae 35, 217 (1948).
  • Farber et al. [2019] M. Farber, L. Mead,  and T. Nowik, Random simplicial complexes, duality and the critical dimension, J Topol Anal , 1 (2019).
  • Farber and Mead [2020] M. Farber and L. Mead, Random simplicial complexes in the medial regime, Topol Appl 272, 107065 (2020).
  • Cooley et al. [2020] O. Cooley, N. D. Giudice, M. Kang,  and P. Sprüssel, Vanishing of cohomology groups of random simplicial complexes, Random Structures & Algorithms 56, 461 (2020).
  • Garland [1973] H. Garland, p-adic curvature and the cohomology of discrete subgroups of p-adic groups, Annals of Mathematics , 375 (1973).
  • Costa and Farber [2017] A. Costa and M. Farber, Large random simplicial complexes, III; the critical dimension, J Knot Theory Its Ramifications 26, 1740010 (2017).
  • Penrose [1997] M. D. Penrose, The longest edge of the random minimal spanning tree, The Annals of Applied Probability , 340 (1997).
  • Bobrowski [2019] O. Bobrowski, Homological Connectivity in Random Čech Complexes, arXiv:1906.04861 [math]  (2019), arXiv: 1906.04861.
  • Bobrowski and Oliveira [2019] O. Bobrowski and G. Oliveira, Random Čech Complexes on Riemannian Manifolds, Random Structures & Algorithms 54, 373 (2019).
  • Milnor [1963] J. W. Milnor, Morse theory (Princeton university press, 1963).
  • Gershkovich and Rubinstein [1997] V. Gershkovich and H. Rubinstein, Morse theory for Min-type functions, The Asian Journal of Mathematics 1, 696 (1997).
  • Bobrowski et al. [2021] O. Bobrowski, M. Schulte,  and D. Yogeshwaran, Poisson process approximation under stabilization and Palm coupling, arXiv preprint arXiv:2104.13261  (2021).
  • Kahle [2011] M. Kahle, Random geometric complexes, Discrete & Computational Geometry 45, 553 (2011).
  • Forman [2002] R. Forman, A user’s guide to discrete Morse theory, Sém. Lothar. Combin 48, 35pp (2002).
  • Aronshtam et al. [2013] L. Aronshtam, N. Linial, T. Luczak,  and R. Meshulam, Collapsibility and vanishing of top homology in random simplicial complexes, Discrete & Computational Geometry 49, 317 (2013).
  • Linial and Peled [2016] N. Linial and Y. Peled, On the phase transition in random simplicial complexes, Annals of Mathematics. Second Series 184, 745 (2016).
  • Kozlov [2010] D. Kozlov, The threshold function for vanishing of the top homology group of random d-complexes, Proceedings of the American Mathematical Society 138, 4517 (2010).
  • Aronshtam and Linial [2015] L. Aronshtam and N. Linial, The threshold for d-collapsibility in random complexes*, Random Structures & Algorithms  (2015).
  • Kahle and Meckes [2013] M. Kahle and E. Meckes, Limit the theorems for Betti numbers of random simplicial complexes, Homology, Homotopy and Applications 15, 343 (2013).
  • Linial et al. [2019] N. Linial, I. Newman, Y. Peled,  and Y. Rabinovich, Extremal hypercuts and shadows of simplicial complexes, Israel Journal of Mathematics 229, 133 (2019).
  • Bobrowski and Skraba [2020a] O. Bobrowski and P. Skraba, Homological Percolation: The Formation of Giant k-Cycles, International Mathematics Research Notices  (2020a).
  • Bobrowski and Skraba [2020b] O. Bobrowski and P. Skraba, Homological percolation and the Euler characteristic, Physical Review E 101, 032304 (2020b).
  • Babson et al. [2011] E. Babson, C. Hoffman,  and M. Kahle, The fundamental group of random 2-complexes, Journal of the American Mathematical Society 24, 1 (2011).
  • Luria and Peled [2018] Z. Luria and Y. Peled, On simple connectivity of random 2-complexes, arXiv preprint arXiv:1806.03351  (2018).
  • Babson [2012] E. Babson, Fundamental groups of random clique complexes, arXiv preprint arXiv:1207.5028  (2012).
  • Costa et al. [2015] A. Costa, M. Farber,  and D. Horak, Fundamental groups of clique complexes of random graphs, Trans London Math Soc 2, 1 (2015).
  • Costa and Farber [2015b] A. E. Costa and M. Farber, Geometry and topology of random 2-complexes, Isr J Math 209, 883 (2015b).
  • Newman [2020] A. Newman, Freeness of the random fundamental group, Journal of Topology and Analysis 12, 29 (2020).
  • Stein [1986] C. Stein, Approximate Computation of Expectations, IMS Lecture Notes, Monograph Series, Vol. 7 (IMS, 1986).
  • Ross et al. [2011] N. Ross et al., Fundamentals of Stein’s method, Probability Surveys 8, 210 (2011).
  • Yogeshwaran et al. [2016] D. Yogeshwaran, E. Subag,  and R. J. Adler, Random geometric complexes in the thermodynamic regime, Probability Theory and Related Fields , 1 (2016).
  • Hiraoka et al. [2018] Y. Hiraoka, T. Shirai,  and K. D. Trinh, Limit theorems for persistence diagrams, The Annals of Applied Probability 28, 2740 (2018).
  • Krebs and Polonik [2019] J. T. Krebs and W. Polonik, On the asymptotic normality of persistent Betti numbers, arXiv preprint arXiv:1903.03280  (2019).
  • Trinh et al. [2019] K. D. Trinh et al., On central limit theorems in stochastic geometry for add-one cost stabilizing functionals, Electronic Communications in Probability 24 (2019).
  • Penrose and Yukich [2001] M. D. Penrose and J. E. Yukich, Central limit theorems for some graphs in computational geometry, Annals of Applied probability , 1005 (2001).
  • Goel et al. [2019] A. Goel, K. D. Trinh,  and K. Tsunoda, Strong law of large numbers for Betti numbers in the thermodynamic regime, Journal of Statistical Physics 174, 865 (2019).
  • Auffinger et al. [2020] A. Auffinger, A. Lerario,  and E. Lundberg, Topologies of Random Geometric Complexes on Riemannian Manifolds in the Thermodynamic Limit, International Mathematics Research Notices  (2020).
  • Hatcher [2002] A. Hatcher, Algebraic topology (Cambridge University Press, Cambridge, 2002).
  • Owada et al. [2020] T. Owada, O. Bobrowski, et al., Convergence of persistence diagrams for topological crackle, Bernoulli 26, 2275 (2020).
  • Bobrowski et al. [2017a] O. Bobrowski, M. Kahle,  and P. Skraba, Maximally persistent cycles in random geometric complexes, The Annals of Applied Probability 27, 2032 (2017a).
  • Gundert and Wagner [2016] A. Gundert and U. Wagner, On eigenvalues of random complexes, Israel Journal of Mathematics 216, 545 (2016).
  • Dotterrer and Kahle [2012] D. Dotterrer and M. Kahle, Coboundary expanders, Journal of Topology and Analysis 4, 499 (2012).
  • Knowles and Rosenthal [2017] A. Knowles and R. Rosenthal, Eigenvalue confinement and spectral gap for random simplicial complexes, Random Structures & Algorithms 51, 506 (2017).
  • Thomas and Owada [2021] A. M. Thomas and T. Owada, Functional limit theorems for the Euler characteristic process in the critical regime, Advances in Applied Probability 53, 57 (2021).
  • Thoppe et al. [2016] G. C. Thoppe, D. Yogeshwaran, R. J. Adler, et al., On the evolution of topology in dynamic clique complexes, Advances in Applied Probability 48, 989 (2016).
  • Owada et al. [2021] T. Owada, G. Samorodnitsky,  and G. Thoppe, Limit theorems for topological invariants of the dynamic multi-parameter simplicial complex, Stochastic Processes and their Applications  (2021).
  • Skraba et al. [2017] P. Skraba, G. Thoppe,  and D. Yogeshwaran, Randomly Weighted dd-complexes: Minimal Spanning Acycles and Persistence Diagrams, arXiv:1701.00239 [math]  (2017), arXiv: 1701.00239.
  • Hiraoka and Shirai [2017] Y. Hiraoka and T. Shirai, Minimum spanning acycle and lifetime of persistent homology in the Linial–Meshulam process, Random Structures & Algorithms 51, 315 (2017).
  • Barabási [2016] A.-L. Barabási, Network science (Cambridge University Press, Cambridge, UK, 2016).
  • Newman [2018] M. E. J. Newman, Networks (Oxford University Press, Oxford, 2018).
  • Niyogi et al. [2011] P. Niyogi, S. Smale,  and S. Weinberger, A topological view of unsupervised learning from noisy data, SIAM Journal on Computing 40, 646 (2011).
  • Fasy et al. [2014] B. T. Fasy, F. Lecci, A. Rinaldo, L. Wasserman, S. Balakrishnan,  and A. Singh, Confidence sets for persistence diagrams, Annals of Statistics 42, 2301 (2014).
  • Chazal et al. [2017] F. Chazal, B. Fasy, F. Lecci, B. Michel, A. Rinaldo,  and L. Wasserman, Robust topological inference: Distance to a measure and kernel distance, The Journal of Machine Learning Research 18, 5845 (2017).
  • Bobrowski et al. [2017b] O. Bobrowski, S. Mukherjee,  and J. E. Taylor, Topological consistency via kernel estimation, Bernoulli 23, 288 (2017b).
  • Reani and Bobrowski [2021] Y. Reani and O. Bobrowski, Cycle Registration in Persistent Homology with Applications in Topological Bootstrap, arXiv preprint arXiv:2101.00698  (2021).
  • Diaconis and Janson [2008] P. Diaconis and S. Janson, Graph Limits and Exchangeable Random Graphs, Rendiconti di Matematica 28, 33 (2008).
  • Gowers [2007] W. Gowers, Hypergraph regularity and the multidimensional Szemerédi theorem, Ann Math 166, 897 (2007).
  • Elek and Szegedy [2012] G. Elek and B. Szegedy, A measure-theoretic approach to the theory of dense hypergraphs, Adv Math 231, 1731 (2012).
  • Zhao [2015] Y. Zhao, Hypergraph limits: A regularity approach, Random Struct Algor 47, 205 (2015).
  • Balasubramanian et al. [2021] K. Balasubramanian, D. Gitelman,  and H. Liu, Nonparametric Modeling of Higher-Order Interactions via Hypergraphons, J Mach Learn Res 22, 1 (2021).