
Convergence of persistence diagrams
for discrete time stationary processes

Andrew M. Thomas Department of Statistics and Actuarial Science, University of Iowa [email protected]
Abstract.

In this article we establish two fundamental results for the sublevel set persistent homology for stationary processes indexed by the positive integers. The first is a strong law of large numbers for the persistence diagram (treated as a measure “above the diagonal” in the extended plane) evaluated on a large class of sets and functions—more than just continuous functions with compact support. We prove this result subject to only minor conditions that the sequence is ergodic and the tails of the marginals are not too heavy. The second result is a central limit theorem for the persistence diagram evaluated on the class of all step functions; this result holds as long as a $\rho$-mixing criterion is satisfied and the distributions of the partial maxima do not decay too slowly. Our results greatly expand those extant in the literature to allow for more fruitful use in statistical applications, beyond idealized settings. Examples of distributions and functions for which the limit theory holds are provided throughout.

A portion of this work was completed while the author was a postdoctoral associate at Cornell University. This work was funded in part by NSF grants DMS-2114143 and OAC-1940124.

1. Introduction

Understanding the persistent homology of large samples from various probability distributions is of increasing utility in goodness-of-fit testing (Biscio et al., 2020; Krebs and Hirsch, 2022). For goodness-of-fit testing in the “geometric” setting there are a number of results to choose from, as much attention has been focused on the limiting stochastic behavior of Čech and Vietoris-Rips persistent homology of (Euclidean) point clouds (ibid. as well as Hiraoka et al., 2018; Divol and Polonik, 2019; Krebs and Polonik, 2019; Owada and Bobrowski, 2020; Krebs, 2021; Owada, 2022; Bobrowski and Skraba, 2024). However, less attention has been focused on the asymptotics of the entire sublevel (or superlevel) set persistent homology of stochastic processes and random fields—with a few notable exceptions (Chazal and Divol, 2018; Baryshnikov, 2019; Miyanaga, 2023; Perez, 2023; Kanazawa et al., 2024).

In recent years, summaries of sublevel set persistent homology of time series—such as those we establish limit theory for below—have been applied to the problems of heart rate variability analysis (Chung et al., 2021; Graff et al., 2021), eating behavior detection (Chung et al., 2022), and sleep stage scoring using respiratory signals (Chung et al., 2024). Thus, a comprehensive treatment of the asymptotic properties of sublevel set persistent homology of stochastic processes is needed for rigorous statistical approaches to the aforementioned problems. In this article we greatly extend the existing limit theory for persistence diagrams derived from sublevel set filtrations of discrete time stochastic processes. As a result, we understand the behavior of certain real-valued summaries of these random persistence diagrams—so-called persistence statistics—that are particularly relevant to machine learning and goodness-of-fit testing.

Work pertaining to the topology of sub/superlevel sets of random functions has its most prominent originator in Rice (1944). Current work in the area of establishing results about the sublevel set ($0^{th}$) persistent homology of stochastic processes has focused on almost surely continuous processes, such as investigations into the expected persistence diagrams of Brownian motion (Chazal and Divol, 2018); expected persistence diagrams of Brownian motion with drift (Baryshnikov, 2019); and expectations for the number of barcodes and persistent Betti numbers $\beta_0^{s,t}$ of continuous semimartingales (Perez, 2023). The formulas in Perez (2023), save for that for the expected number of barcodes with lifetime greater than $\ell$, are asymptotic formulas as $\ell$ tends to $0$ or $\infty$.

Though not overlapping entirely with our setting, some results for cubical persistent homology are applicable here. Notable results include the strong law of large numbers for persistence diagrams of random cubical sets (Kanazawa et al., 2024) (with the quality of the strong law being vague convergence) and central limit theorems for persistent Betti numbers of sublevel sets of i.i.d. sequences found in Miyanaga (2023). In this article, we establish the most general strong law of large numbers yet for functionals of persistence diagrams. We do so by normalizing the persistence diagrams so they become probability measures and by leveraging the tools of weak convergence. We also prove a central limit theorem for persistence diagrams evaluated on step functions using recent results for weakly dependent and potentially nonstationary triangular arrays, subject to standard dependence decay conditions on the underlying stationary sequence.

The quality of most strong laws of large numbers for persistence diagrams has been vague convergence, with Hiraoka et al. (2018), Krebs (2021), and Owada (2022) tackling the geometric (i.e. Čech and Vietoris-Rips persistent homology) setting, and Kanazawa et al. (2024) addressing the cubical setting. Recently however, the authors of Bobrowski and Skraba (2024) have employed the weak convergence ideas that we use here to prove a strong law of large numbers for the probability measure defined by death/birth ratios in a persistence diagram, for the geometric setting. In Divol and Polonik (2019)—again in the geometric setting—the authors extend the set functions for which the strong law of Hiraoka et al. (2018) holds to a class of unbounded functions.

In Section 3.1, we accomplish this extension as well in the setting of sublevel set persistent homology. We extend the strong law of large numbers (SLLN) of Kanazawa et al. (2024) (the part pertaining to the 1-dimensional setting) from continuous functions with compact support to a large class of unbounded functions. We achieve this based solely on minor conditions such as ergodicity and restrictions on the heaviness of the tails of the marginal distributions of our underlying stochastic process. We also remove the need for any local dependence condition, such as that of Kanazawa et al. (2024). In doing so, we answer an open question of Chung et al. (2021) about the limiting empirical distribution of persistence diagram lifetimes for sublevel sets of discrete time stationary processes. For this specific setting, we also derive an explicit representation of the strong limit of our sublevel set persistent Betti numbers in Proposition 3.3, answering a query set forth in the conclusion to Hiraoka and Tsunoda (2018). Finally, we extend the current state-of-the-art central limit theorem (CLT) for persistent Betti numbers of sublevel set filtrations of 1-dimensional processes (Theorem 1.2.3 in Miyanaga, 2023) to finite-dimensional convergence and beyond the realm of i.i.d. observations.

This article proceeds in Section 2 with a treatment of persistent homology specialized to our setting, as well as details of our probabilistic setup. In Section 3 the strong law of large numbers is stated and proved (Theorems 3.1 and 3.8) and examples for which it holds are given for specific unbounded functionals of persistence diagrams in Corollary 3.10. Beyond this, we derive some satisfying results in the case of i.i.d. stochastic processes in Corollary 3.5 and state a Glivenko-Cantelli theorem for persistence lifetimes in Corollary 3.7. Finally in Section 4, we state the setting and results of our central limit theorem for persistence diagrams (Theorem 4.6). We conclude with a brief discussion about the potential improvements and extensions of this work in Section 5. The proof of the central limit theorem is deferred to Section 6.

2. Background

We begin by discussing the necessary notions in topological data analysis—specifically zero-dimensional sublevel set persistent homology. From there, we detail crucial results for the representation of zero-dimensional sublevel set persistent homology for stochastic processes.

Before continuing, let us make a brief note about notation. For real numbers $x, y$ we define $x \wedge y := \min\{x,y\}$, $x \vee y := \max\{x,y\}$, and $(x)_+ := x \vee 0 = \max\{x,0\}$. We set $\bar{\mathbb{R}} := [-\infty,\infty]$ and $\mathbb{R}_+ := [0,\infty)$. If $R$ is a set in some topological space, we denote by $R^{\circ}$ the interior (i.e. largest open subset) of $R$ and by $\partial R$ its boundary. We denote by $B(z,\epsilon)$ the open Euclidean ball of radius $\epsilon > 0$ centered at $z$. If for a real sequence $(a_n)_{n\geq 1}$ and a positive sequence $(b_n)_{n\geq 1}$ we have $a_n/b_n \to 0$ as $n\to\infty$, we write $a_n = o(b_n)$; if there exists a $C > 0$ such that $|a_n| \leq C b_n$ for $n$ large enough, we write $a_n = O(b_n)$.

2.1. Homology

Recall that an (abstract) simplicial complex $K$ is a collection of subsets of a set $A$ with the property that it is closed under inclusion. Let $K$ be the graph (i.e. a special case of a simplicial complex) with vertex set $V = \{v_0, v_1, v_2, \dots\}$ and edge set

$$\{v_0v_1,\, v_1v_2,\, v_2v_3,\, v_3v_4,\, \dots\}.$$

For a fixed function $f : K \to \mathbb{R}$ that satisfies $\tau \subset \sigma \Rightarrow f(\tau) \leq f(\sigma)$, we define $K(t) := \{\sigma \in K : f(\sigma) \leq t\}$. It is clear that for $s \leq t$ we have $K(s) \subset K(t)$ and thus $K = \big(K(t)\big)_{t\in\mathbb{R}}$ defines a filtration of graphs. For any $t \in \mathbb{R}$ we can assess the connectivity information of $K(t)$ by calculating its 0-dimensional homology group $H_0(K(t))$. We do so by initially forming two vector spaces $C_0$ and $C_1$ of all formal linear combinations of the vertices and of the edges, respectively:

$$C_0(K(t)) := \Bigg\{\sum_{i :\, v_i \in K(t)} a_i v_i : a_i \in \mathbb{Z}_2\Bigg\}$$

and

$$C_1(K(t)) := \Bigg\{\sum_{i :\, v_i v_{i+1} \in K(t)} a_i v_i v_{i+1} : a_i \in \mathbb{Z}_2\Bigg\},$$

where $\mathbb{Z}_2$ is the field of two elements $\{0,1\}$. The elements of $C_0(K(t))$ and $C_1(K(t))$ are called 0-chains and 1-chains, respectively. Addition of $i$-chains in $C_i(K(t))$ is done componentwise. To calculate $H_0(K(t))$ we need to specify the boundary map $\partial_1 : C_1(K(t)) \to C_0(K(t))$, which is defined by

$$\partial_1\big(v_i v_{i+1}\big) = v_i + v_{i+1}.$$

We can extend this to an arbitrary $c \in C_1(K(t))$ by

$$\partial_1(c) = \sum_{i :\, v_i v_{i+1}\in K(t)} a_i\, \partial_1(v_i v_{i+1}).$$

By analogy to the construction above, each vertex in $C_0(K(t))$ gets sent to $0$ by $\partial_0$, so $Z_0(K(t)) := \ker\partial_0 = C_0(K(t))$. Defining $B_0(K(t)) := \partial_1\big(C_1(K(t))\big)$ (the image of $\partial_1$), we define the $0^{th}$ homology group as the quotient vector space,

$$H_0(K(t)) := Z_0(K(t))/B_0(K(t)).$$

A more general setup of homology with $\mathbb{Z}_2$ coefficients can be seen in Chapter 4 of Edelsbrunner and Harer (2010).

2.2. Persistent homology and representations

The vector spaces (conventionally called groups, as coefficients may lie in $\mathbb{Z}$, for example) $H_0(K(t))$ capture intuitive connectivity information—the elements of $H_0(K(t))$ are the equivalence classes of vertices that satisfy $v + v' \in B_0(K(t))$. More simply put, elements of $H_0(K(t))$ are vertices connected by a chain of edges. The information in $H_0(K(t))$ gives us useful information on the function $f$, but being able to assess how connected components (i.e. elements of $H_0(K(t))$) appear and merge as we vary $t$ would be better. We can do so by introducing the notion of persistent homology.

Figure 1. The sublevel set filtrations $K(t)$ of a sample of 100 points from an 8-dependent stationary Gaussian process along with its $0^{th}$ persistence diagram $PD_0$ (upper right).

Given the inclusion maps $\iota_{s,t} : K(s) \to K(t)$ for $s \leq t$, there exist linear maps between all homology groups

$$f_0^{s,t} : H_0(K(s)) \to H_0(K(t)),$$

which are induced by $\iota_{s,t}$. The persistent homology groups of the filtration $(K(t))_{t\in\mathbb{R}}$ are the quotient vector spaces

$$H_0^{s,t}(K) := \mathrm{im}\, f_0^{s,t} \cong Z_0(K(s))\big/\big(B_0(K(t))\cap Z_0(K(s))\big),$$

whose elements represent the cycles that are “born” in $K(s)$ or before and that “die” after $K(s)$. The dimensions of these vector spaces are the persistent Betti numbers $\beta_0^{s,t}$. Heuristically, a connected component $\gamma \in H_0(K(s))$ is born at $K(s)$ if it appears for the first time in $H_0(K(s))$—formally, $\gamma \not\in H_0(K(r))$ for $r < s$. The component $\gamma \in H_0(K(s))$ dies entering $K(t)$ if it merges with an older class (born before $s$) entering $H_0(K(t))$. The $0^{th}$ persistent homology of the filtration $K$, denoted $PH_0$, is the collection of homology groups $H_0(K(t))$ and maps $f_0^{s,t}$, for $-\infty < s \leq t \leq \infty$. All of the information in the persistent homology groups is contained in a multiset in $\mathbb{R}^2$ called the persistence diagram (Edelsbrunner and Harer, 2010). The $0^{th}$ persistence diagram of $(K(t))_{t\in\mathbb{R}}$, denoted $PD_0$, consists of the points $(b,d)$ with multiplicity equal to the number of the classes that are born at $K(b)$ and die entering $K(d)$. Often, the diagonal $y = x$ is added to this diagram, but we need not consider this here. Formally, we have

$$PD_0 = \big\{(b,d) : \text{there exists } \gamma \in PH_0 \text{ born at } b \text{ that dies entering } d\big\},$$

where $PD_0$ is a multiset. Each point $(b,d)$ in $PD_0$ can also be represented as a barcode, or interval $[b,d) \subset \mathbb{R}$ (cf. Carlsson and Vejdemo-Johansson, 2021). As such, we may represent $PD_0$ as a measure

$$\xi_0 = \sum_{(b,d)\in PD_0} \delta_{(b,d)},$$

on $\Delta := \{(x,y) \in \bar{\mathbb{R}}^2 : -\infty < x < y \leq \infty\}$. See Figure 1 for an illustration of a persistence diagram associated to a sublevel set filtration of a given stochastic process.
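For concreteness, $PD_0$ of such a path filtration can be computed by the “elder rule”: add the vertices in increasing order of their value, and whenever a new vertex joins two existing connected components, the younger component (the one with the larger birth value) dies at the value of the new vertex. The following is a minimal Python sketch (ours, not part of the original paper; the helper name `sublevel_pd0` is hypothetical), assuming distinct values.

```python
import numpy as np

def sublevel_pd0(x):
    """0-dimensional sublevel set persistence diagram of a sequence x,
    computed by the elder rule with a union-find over added vertices."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    parent = np.full(n, -1)      # -1 marks a vertex not yet in the filtration
    birth = np.empty(n)          # birth value of the component rooted at each index

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    diagram = []
    for i in np.argsort(x):      # add vertices by increasing value
        parent[i], birth[i] = i, x[i]
        for j in (i - 1, i + 1): # edges to neighbours already present
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri != rj:
                    old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                    diagram.append((birth[young], x[i]))  # younger class dies here
                    parent[young] = old
    diagram.append((x.min(), np.inf))        # the oldest class never dies
    return diagram

rng = np.random.default_rng(0)
pd0 = sublevel_pd0(rng.standard_normal(100))
print(len(pd0), pd0[:3])
```

In this sketch the diagram contains one point per local minimum of the sequence, with the global minimum paired with an infinite death time.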

2.3. Probability and persistence

Throughout the paper, let us fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. For random variables $X, X_1, X_2, \dots$ we write $X_n \Rightarrow X$ to convey that $X_n$ converges weakly to $X$, i.e. $\mathbb{E}[f(X_n)] \to \mathbb{E}[f(X)]$ for all bounded, continuous $f$. We write $X_n \overset{P}{\to} X$ to convey that $X_n$ converges in probability to $X$. We say an event $A \in \mathcal{F}$ occurs “a.s.” (almost surely) if $\mathbb{P}(A) = 1$. We use the term stationary throughout this work to refer to strict stationarity, i.e. invariance of the finite-dimensional distributions under shifts. A stationary sequence $X_1, X_2, \dots$ of random variables is said to be ergodic if any a.s. shift-invariant event $E$ satisfies either $\mathbb{P}(E) = 0$ or $\mathbb{P}(E) = 1$.

As we are interested in studying the stochastic behavior of persistence diagrams, we want to associate to each vertex $v_i$ a random variable $X_i$ for each $i = 0, 1, 2, \dots$. Consider a stationary sequence of random variables $X_1, X_2, \dots$ and define $X_0 \equiv \infty$. We then define for $t \in \mathbb{R}$ the filtration

$$K_n(t) := \big\{\sigma \in K : \max_{v_i\in\sigma} X_{i,n} \leq t\big\},$$

where $X_{0,n} = X_{k,n} = \infty$ for $k > n$ and $X_{k,n} = X_k$ otherwise. Furthermore, set $K_n = \big(K_n(t)\big)_{t\in\mathbb{R}}$. Crucially, we can show that

$$\beta_{0,n}^{s,t} = \sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \mathbf{1}\bigg\{\bigvee_{k=j}^{j+i-1} X_{k,n} \leq t,\ \bigwedge_{k=j}^{j+i-1} X_{k,n} \leq s\bigg\}\,\mathbf{1}\big\{X_{j-1,n} \wedge X_{j+i,n} > t\big\}. \tag{1}$$

We now formalize (1) into a proposition and present a proof.

Proposition 2.1.

The formula (1) holds.

Proof.

Take two vertices $v_i, v_j \in Z_0(K_n(s))$. These vertices are equivalent if and only if

$$v_i + v_j \in B_0(K_n(t)),$$

i.e. if they can be connected by edges lying in $K_n(t)$. Hence, $v_i$ and $v_j$ must lie in the same connected component in $K_n(t)$. Thus there is a one-to-one correspondence between the number of connected components in $K_n(t)$ (which contain a vertex from $K_n(s)$) and the number of equivalence classes present in $H_0^{s,t}(K_n)$. Hence, these same classes form a spanning set. Let $[c]$ denote the equivalence class of a chain $c$. Now take the vertices $[v_{i_1}], \dots, [v_{i_\ell}]$ that constitute $H_0^{s,t}(K_n)$ (note that $\ell \leq n+1$). Then,

$$a_{i_1}[v_{i_1}] + \cdots + a_{i_\ell}[v_{i_\ell}] = [a_{i_1}v_{i_1} + \cdots + a_{i_\ell}v_{i_\ell}] = 0$$

if and only if

$$a_{i_1}v_{i_1} + \cdots + a_{i_\ell}v_{i_\ell} \in B_0(K_n(t)),$$

where the $a$ terms lie in $\mathbb{Z}_2$. Suppose without loss of generality that

$$i_1 < \cdots < i_\ell.$$

As $v_{i_1}$ lies in a different connected component from the rest of the vertices, any 1-chain of edges in $K_n(t)$ including an edge that $v_{i_1}$ is a part of must have a boundary containing a point not equal to $v_{i_1}$ and also not equal to $v_{i_2}, \dots, v_{i_\ell}$. Hence $a_{i_1} = 0$, and induction furnishes the other cases. Hence, (1) holds. ∎
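As a sanity check on formula (1), the double sum can be evaluated directly by brute force for small $n$. The sketch below is ours ($O(n^2)$, for illustration only); it pads the sequence with $X_0 = X_{n+1} = \infty$ and counts the runs appearing in (1).

```python
import numpy as np

def persistent_betti0(x, s, t):
    """Brute-force evaluation of formula (1) for the sublevel filtration of x,
    with the convention X_0 = X_{n+1} = +infinity."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xp = np.concatenate(([np.inf], x, [np.inf]))   # xp[k] = X_{k,n}, k = 0,...,n+1
    count = 0
    for i in range(1, n + 1):                      # run length
        for j in range(1, n - i + 2):              # run start
            block = xp[j:j + i]
            if (block.max() <= t and block.min() <= s
                    and min(xp[j - 1], xp[j + i]) > t):
                count += 1
    return count

# beta_{0,n}^{s,t} equals the number of maximal runs of values <= t
# that contain at least one value <= s.
rng = np.random.default_rng(1)
print(persistent_betti0(rng.standard_normal(12), s=-0.5, t=0.5))
```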

Having brought forth the representation of persistent Betti numbers that will prove crucial to the results herein, we turn our attention to persistence diagrams. Let $\xi_{0,n}$ be the measure on $\Delta$ associated to the $0^{th}$ persistence diagram $PD_0$ of the filtration $K_n = \big(K_n(t)\big)_{t\in\mathbb{R}}$. Note that

$$\beta_{0,n}^{s,t} = \xi_{0,n}\big((-\infty,s]\times(t,\infty]\big).$$

If we let

$$R = (s_1,s_2]\times(t_1,t_2],$$

for $-\infty < s_1 < s_2 \leq t_1 < t_2 \leq \infty$, then

$$\xi_{0,n}(R) = \beta_{0,n}^{s_2,t_1} - \beta_{0,n}^{s_2,t_2} - \beta_{0,n}^{s_1,t_1} + \beta_{0,n}^{s_1,t_2}, \tag{2}$$

due to the so-called “Fundamental Lemma of Persistent Homology” (Edelsbrunner and Harer, 2010). If $R$ has the above representation, we will say that $s_1, s_2, t_1, t_2$ are the coordinates of $R$. We define the class $\mathcal{R}$ of sets by

$$\mathcal{R} := \big\{(s_1,s_2]\times(t_1,t_2] : -\infty < s_1 < s_2 \leq t_1 < t_2 \leq \infty\big\}.$$

An important result holds for the class $\mathcal{R}$.

Lemma 2.2.

$\mathcal{R}$ is a convergence-determining class for weak convergence on $\Delta$ equipped with the Borel $\sigma$-algebra, $\mathcal{B}(\Delta)$. Namely, if $(\mu_n)_n$ and $\mu$ are probability measures on $\Delta$ and

$$\mu_n(R) \to \mu(R), \quad n \to \infty,$$

for all $R \in \mathcal{R}$ such that $\mu(\partial R) = 0$, then

$$\mu_n \Rightarrow \mu, \quad n \to \infty.$$

Furthermore, for each probability measure $\mu$ on $\Delta$ there is a countable convergence-determining class $\mathcal{R}_\mu \subset \mathcal{R}$ for $\mu$.

Proof.

We will adapt the proof of Theorem A.2 from Hiraoka et al. (2018). First, it is clear that $\mathcal{R}$ is closed under finite intersections, so we have satisfied the first condition of Theorem 2.4 in Billingsley (1999) (i.e. that $\mathcal{R}$ is a $\pi$-system). It is also evident that $\Delta$ is separable. Now, for any $z \in \Delta$, if we denote

$$\mathcal{R}_{z,\epsilon} := \{R \in \mathcal{R} : z \in R^{\circ} \subset R \subset B(z,\epsilon)\},$$

then the class of boundaries of sets in $\mathcal{R}_{z,\epsilon}$ contains uncountably many disjoint sets, regardless of whether $z = (s,\infty)$ or $z = (s,t)$ with $t < \infty$ (in the former case $R^{\circ} = (s_1,s_2)\times(t_1,\infty]$). Thus $\mathcal{R}$ is a convergence-determining class by Theorem 2.4 of Billingsley (1999).

For the final part of the proof, let us fix a probability measure $\mu$ and choose an open set $U \subset \Delta$. Note that for every $z \in U$, there is an $\epsilon > 0$ such that $B(z,\epsilon) \subset U$. By the first part of this proof, for each of these $B(z,\epsilon)$ there exists a set $R_z \equiv R_z^U \in \mathcal{R}_{z,\epsilon}$ such that $\mu(\partial R_z) = 0$ and hence we have

$$U = \bigcup_{z\in U} R_z = \bigcup_{z\in U} R_z^{\circ},$$

and $U$ is the union of sets with $\mu$-null boundaries. As $\Delta$ is separable, there exists a countable subcover $\{R^U_{z_i}\}_{i=1}^{\infty}$ of $U$. Also, there exists a countable basis $\{U_j\}_{j=1}^{\infty}$ of $\Delta$. Hence, if we denote $R_{i,j} := R^{U_j}_{z_i}$ then

$$U_j = \bigcup_{i=1}^{\infty} R_{i,j} = \bigcup_{i=1}^{\infty} R_{i,j}^{\circ}.$$

Let $\mathcal{R}_\mu$ be the class of finite intersections of the sets $R_{i,j}$. As the boundary of an intersection is a subset of the union of the boundaries, each element of $\mathcal{R}_\mu$ has a $\mu$-null boundary. Furthermore, every open set in $\Delta$ is the countable union of elements of $\mathcal{R}_\mu$. Hence, we apply Theorem 2.2 in Billingsley (1999) and the result holds. ∎

An important result holds for the measure $\xi_{0,n}$: namely, the value $\xi_{0,n}(\Delta)$ is equal to the number of local minima of $X_{0,n}, X_{1,n}, \dots, X_{n,n}, X_{n+1,n}$.

Proposition 2.3.

Suppose that $X_1, X_2, \dots$ is a stationary sequence of random variables with $\mathbb{P}(X_1 = X_2) = 0$. Then

$$\xi_{0,n}(\Delta) = \sum_{i=1}^{n} \mathbf{1}\big\{X_{i,n} < X_{i-1,n} \wedge X_{i+1,n}\big\}.$$
Proof.

The case when $n = 1$ is trivial, so suppose that $n \geq 2$. As the underlying stochastic process is stationary and $\mathbb{P}(X_1 = X_2) = 0$, every value $X_1, X_2, \dots$ is distinct with probability 1. Let $a_i \equiv X_{(i),n}$ be the order statistics of $X_{1,n}, \dots, X_{n,n}$—which are distinct with probability 1—and let $v_{(i)}$ be the associated vertices (see above). If we define

$$K_i := K_n(a_i), \quad i = 1, \dots, n,$$

with $K_0 = \emptyset$, then $K_0 \subset K_1 \subset \cdots \subset K_n$ and $K_{i+1}$ contains all the simplices of $K_i$ along with the 0-simplex $v_{(i+1)}$ and any edges containing it. If $m > \ell$ then there are $\alpha$ points at $(a_\ell, a_m)$ in $\xi_{0,n}$ if and only if $\xi_{0,n}\big((a_{\ell-1},a_\ell]\times(a_{m-1},a_m]\big) = \alpha$—see p. 152 in Edelsbrunner and Harer (2010). By Proposition 2.1, we have that

$$\begin{aligned}
\xi_{0,n}\big((a_{\ell-1},a_\ell]\times(a_{m-1},a_m]\big) &= \beta_{0,n}^{a_\ell,a_{m-1}} - \beta_{0,n}^{a_\ell,a_m} - \beta_{0,n}^{a_{\ell-1},a_{m-1}} + \beta_{0,n}^{a_{\ell-1},a_m} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \mathbf{1}\bigg\{\bigwedge_{k=j}^{j+i-1} X_{k,n} = a_\ell\bigg\} \\
&\qquad \times \Bigg[\mathbf{1}\bigg\{\bigvee_{k=j}^{j+i-1} X_{k,n} \leq a_{m-1},\, X_{j-1,n}\wedge X_{j+i,n} > a_{m-1}\bigg\} - \mathbf{1}\bigg\{\bigvee_{k=j}^{j+i-1} X_{k,n} \leq a_m,\, X_{j-1,n}\wedge X_{j+i,n} > a_m\bigg\}\Bigg].
\end{aligned}$$

Now, $\xi_{0,n}(\Delta) = \sum_{\ell=1}^{n-1}\sum_{m=\ell+1}^{n} \xi_{0,n}\big((a_{\ell-1},a_\ell]\times(a_{m-1},a_m]\big)$ so by cancelling sums—and the fact that $n \geq 2$ implies that $X_{j-1,n}\wedge X_{j+i,n} > a_n$ cannot happen—we have that

$$\begin{aligned}
&\sum_{i=1}^{n}\sum_{j=1}^{n-i+1}\sum_{\ell=1}^{n-1} \mathbf{1}\bigg\{\bigwedge_{k=j}^{j+i-1} X_{k,n} = a_\ell\bigg\}\,\mathbf{1}\bigg\{\bigvee_{k=j}^{j+i-1} X_{k,n} \leq a_\ell,\, X_{j-1,n}\wedge X_{j+i,n} > a_\ell\bigg\} \\
&\qquad = \sum_{j=1}^{n}\sum_{\ell=1}^{n-1} \mathbf{1}\big\{X_{j,n} = a_\ell,\, X_{j-1,n}\wedge X_{j+1,n} > a_\ell\big\},
\end{aligned} \tag{3}$$

because the only way the maximum and minimum of a collection of $i$ random variables are identical is if they are constant—which is only possible if $i = 1$, as the $X_{i,n}$, $i = 1, \dots, n$, are almost surely distinct. The desired formula follows from applying this same uniqueness to (3). ∎

To finish this section, we must introduce the restricted measure on the set $\tilde{\Delta} := \Delta \cap \mathbb{R}^2$—equipped with the usual Borel sub-$\sigma$-algebra $\mathcal{B}(\tilde{\Delta})$—defined by

$$\tilde{\xi}_{0,n}(A) := \xi_{0,n}(A), \quad A \in \mathcal{B}(\tilde{\Delta}).$$

Note that as $\Delta \cap \mathbb{R}^2$ is a Borel subset of $\Delta$, we have $\mathcal{B}(\tilde{\Delta}) \subset \mathcal{B}(\Delta)$. To reduce notational clutter, we will mostly write $\tilde{\xi}_{0,n}(\Delta)$ in place of $\tilde{\xi}_{0,n}(\tilde{\Delta})$ from here on out, unless otherwise noted.

3. Strong law of large numbers

In this section we establish our strong law of large numbers for sublevel set persistence diagrams for a very broad class of sets and functions. We do this for the class of bounded, continuous functions initially via a weak convergence argument, and proceed to extend our result to a class of unbounded functions which are of great practical use in topological data analysis. Along the way, we give an explicit representation for the limiting persistent Betti number for i.i.d. sequences.

Theorem 3.1.

Consider a stationary and ergodic sequence $\mathcal{X} = (X_1, X_2, \dots)$ where each $X_i$ has distribution $F$ and density $f$ such that $\mathbb{P}(X_1 = X_2) = 0$. For the random probability measure $\xi_{0,n}/\xi_{0,n}(\Delta)$ induced by $\mathcal{X}$ there exists a probability measure $\xi_0$ on $\Delta$ such that

$$\frac{\xi_{0,n}}{\xi_{0,n}(\Delta)} \Rightarrow \xi_0 \ \ \mathrm{a.s.}, \quad n \to \infty.$$

Additionally, if we define $\tilde{\xi}_0 \equiv \xi_0$ on $\mathcal{B}(\tilde{\Delta})$ then

$$\frac{\tilde{\xi}_{0,n}}{\tilde{\xi}_{0,n}(\Delta)} \Rightarrow \tilde{\xi}_0 \ \ \mathrm{a.s.}, \quad n \to \infty.$$
Proof.

We will begin by establishing the almost sure convergence of the persistent Betti numbers $\beta_{0,n}^{s,t}/n$ for $-\infty < s \leq t \leq \infty$. Recall that

$$\begin{aligned}
\frac{\beta_{0,n}^{s,t}}{n} &= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n-i+1} \mathbf{1}\bigg\{\bigvee_{k=j}^{j+i-1} X_{k,n}\leq t,\ \bigwedge_{k=j}^{j+i-1} X_{k,n}\leq s\bigg\}\,\mathbf{1}\big\{X_{j-1,n}\wedge X_{j+i,n}>t\big\} \\
&= \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{n-j+1} \mathbf{1}\bigg\{\bigvee_{k=j}^{j+i-1} X_{k,n}\leq t,\ \bigwedge_{k=j}^{j+i-1} X_{k,n}\leq s\bigg\}\,\mathbf{1}\big\{X_{j-1,n}\wedge X_{j+i,n}>t\big\}.
\end{aligned}$$

Define for $m \in \mathbb{N}\cup\{\infty\}$ the indicator random variable

$$Y_{j,n}^{m}(s,t) := \sum_{i=1}^{m} \mathbf{1}\bigg\{\bigvee_{k=j}^{j+i-1} X_{k,n}\leq t,\ \bigwedge_{k=j}^{j+i-1} X_{k,n}\leq s\bigg\}\,\mathbf{1}\big\{X_{j-1,n}\wedge X_{j+i,n}>t\big\}, \tag{4}$$

with the indicators $Y_j^m(s,t)$ defined as $Y_{j,n}^m(s,t)$ with the second subscript $n$ dropped. If we fix $m$, we have for $n \geq m$ that

$$\beta_{0,n}^{s,t} = \sum_{j=1}^{n} Y_{j,n}^{n-j+1}(s,t) \geq \sum_{j=1}^{n-m+1} Y_{j,n}^{n-j+1}(s,t) \geq \sum_{j=1}^{n-m+1} Y_j^m(s,t),$$

which yields

$$\beta_{0,n}^{s,t} \geq \sum_{j=2}^{n+1} Y_j^m(s,t) - (m+1).$$

Similarly, we see that

$$\beta_{0,n}^{s,t} \leq 1 + \sum_{j=1}^{n} Y_{j,n}^{n-j}(s,t) \leq 2 + \sum_{j=2}^{n+1} Y_j^{\infty}(s,t),$$

because

$$\sum_{j=1}^{n} \mathbf{1}\bigg\{\bigvee_{k=j}^{n} X_{k,n}\leq t,\ \bigwedge_{k=j}^{n} X_{k,n}\leq s\bigg\}\,\mathbf{1}\big\{X_{j-1,n}>t\big\} \in \{0,1\}.$$

It is readily observed for fixed $t \geq s$ that $Y_2^m(s,t), Y_3^m(s,t), \dots$ are indicator random variables and form a stationary and ergodic sequence, owing to Theorem 7.1.3 in Durrett (2010), for example. Thus, Birkhoff's ergodic theorem implies that for any $m \in \mathbb{N}$ we have

$$\mathbb{E}[Y_2^m(s,t)] \leq \liminf_{n\to\infty}\frac{\beta_{0,n}^{s,t}}{n} \leq \limsup_{n\to\infty}\frac{\beta_{0,n}^{s,t}}{n} \leq \mathbb{E}[Y_2^{\infty}(s,t)], \quad \mathrm{a.s.}$$

The monotone convergence theorem then implies that

$$n^{-1}\beta_{0,n}^{s,t} \overset{\mathrm{a.s.}}{\to} \mathbb{E}[Y_2^{\infty}(s,t)], \quad n \to \infty.$$

To establish the convergence of $\xi_{0,n}(\Delta)/n$, it suffices to recall from Proposition 2.3 that the total number of points in the persistence diagram $\xi_{0,n}(\Delta)$ is equal to the number of local minima of $\mathcal{X}$. Therefore, the ergodic theorem once again implies that $\xi_{0,n}(\Delta)/n$ converges a.s. to $\mathbb{P}(X_2 < X_1 \wedge X_3)$ and

$$\frac{\xi_{0,n}\big((-\infty,s]\times(t,\infty]\big)}{\xi_{0,n}(\Delta)} \to \frac{\mathbb{E}[Y_2^{\infty}(s,t)]}{\mathbb{P}(X_2 < X_1 \wedge X_3)}, \ \ \mathrm{a.s.}, \quad n \to \infty. \tag{5}$$

(By our assumptions we must have that $\mathbb{P}(X_2 < X_1 \wedge X_3) > 0$.) Define a set function $\bar{\xi}_0$ by

$$\bar{\xi}_0\big((-\infty,s]\times(t,\infty]\big) := \frac{\mathbb{E}[Y_2^{\infty}(s,t)]}{\mathbb{P}(X_2 < X_1 \wedge X_3)},$$

which can likewise be defined on $\mathcal{R}$ in a straightforward manner, by (2). It is clear that the convergence in (5) holds for any set in $\mathcal{R}$ as well. As $\mathcal{R}$ is a semiring which generates the Borel $\sigma$-algebra $\mathcal{B}(\Delta)$ on $\Delta$ (as $\Delta$ is separable), $\bar{\xi}_0$ extends uniquely to a probability measure $\xi_0$ on $\mathcal{B}(\Delta)$, provided that $\bar{\xi}_0$ is countably additive on $\mathcal{R}$. By Lemma 2.2, there is a countable convergence-determining class $\mathcal{R}_0$ for $\xi_0$. We have shown thus far that

$$\mathbb{P}\bigg(\lim_{n\to\infty}\frac{\xi_{0,n}(R)}{\xi_{0,n}(\Delta)} = \xi_0(R), \text{ for any } R \in \mathcal{R}_0\bigg) = 1,$$

so convergence for all sets in $\mathcal{B}(\Delta)$ with $\xi_0$-null boundary follows (with probability 1). It remains to demonstrate that $\bar{\xi}_0$ is countably additive on $\mathcal{R}$. Let

$$(s_1,s_2]\times(t_1,t_2] = \bigcup_{i=1}^{\infty}(s_{1,i},s_{2,i}]\times(t_{1,i},t_{2,i}],$$

where the $(s_{1,i},s_{2,i}]\times(t_{1,i},t_{2,i}]$ are disjoint. Then, almost surely,

$$\begin{aligned}
\bar{\xi}_0\big((s_1,s_2]\times(t_1,t_2]\big) &= \lim_{n\to\infty}\frac{\xi_{0,n}\big((s_1,s_2]\times(t_1,t_2]\big)}{\xi_{0,n}(\Delta)} \\
&= \lim_{n\to\infty}\sum_{i=1}^{\infty}\frac{\xi_{0,n}\big((s_{1,i},s_{2,i}]\times(t_{1,i},t_{2,i}]\big)}{\xi_{0,n}(\Delta)} \\
&= \sum_{i=1}^{\infty}\bar{\xi}_0\big((s_{1,i},s_{2,i}]\times(t_{1,i},t_{2,i}]\big),
\end{aligned}$$

by the monotone convergence theorem.

To finish the proof, note that $\tilde{\xi}_{0,n}(\tilde{\Delta}) \sim \xi_{0,n}(\Delta)$—as they both tend to infinity and differ by 1 (this fact implies that $\xi_0$ is supported on $\tilde{\Delta}$). Also, we have that for any set $A \in \mathcal{B}(\tilde{\Delta})$—which is also a Borel subset of $\Delta$—if $\xi_0(\partial A) = 0$, then almost surely

$$\frac{\tilde{\xi}_{0,n}(A)}{\tilde{\xi}_{0,n}(\Delta)} \sim \frac{\xi_{0,n}(A)}{\xi_{0,n}(\Delta)} \to \xi_0(A), \quad n \to \infty.$$

As $\xi_0(A) = \tilde{\xi}_0(A)$ for $A \in \mathcal{B}(\tilde{\Delta})$, the proof is finished. ∎

Remark 3.2.

In Theorem 3.1 we assumed that $\mathbb{P}(X_1 = X_2) = 0$ in our stationary sequence, to ensure consecutive points are distinct, as stated in Proposition 2.3. It seems straightforward to generalize this result to the situation where consecutive points can be identical, by accounting for this in the proof of Proposition 2.3, and ensuring that the number of points in $\xi_{0,n}$ tends to infinity.

Before seeing an example of the strong law in action, we will establish a result that will provide us an explicit representation of the limiting measure. Let us define the quantity

$$p_i(s,t) := \mathbb{P}\Big(\bigcup_{k=1}^{i}\big\{X_1 \leq t, \dots, X_k \leq s, \dots, X_i \leq t\big\}\Big),$$

which represents the probability that there is some index $k$ such that $X_k \leq s$ and all other random variables are less than or equal to $t$. In the setup with the $X_i$ all i.i.d. with distribution function $F$ we have

$$p_i(s,t) = F(t)^i - \big(F(t) - F(s)\big)^i$$

and

$$\begin{aligned}
\mathbb{E}[\beta_{0,n}^{s,t}] &= \sum_{i=1}^{n}\sum_{j=1}^{n-i+1}\mathbb{P}\bigg(\bigvee_{k=j}^{j+i-1}X_{k,n}\leq t,\ \bigwedge_{k=j}^{j+i-1}X_{k,n}\leq s, \text{ and } X_{j-1,n}\wedge X_{j+i,n}>t\bigg) \\
&= p_n(s,t) + 2p_{n-1}(s,t)(1-F(t)) + \sum_{i=1}^{n-2}\Big(2p_i(s,t)(1-F(t)) + (n-i-1)p_i(s,t)(1-F(t))^2\Big).
\end{aligned} \tag{6}$$

We will assume that $0 < F(s) < 1$, as if $F(s) = 0$ then $\beta_{0,n}^{s,t} \equiv 0$ and if $F(s) = 1$ then $\beta_{0,n}^{s,t} \equiv 1$. Dividing (6) by $n$ we can see that

$$\begin{aligned}
\frac{\mathbb{E}[\beta_{0,n}^{s,t}]}{n} &\sim \frac{(1-F(t))^2}{n}\sum_{i=1}^{n}(n-i+1)\,p_i(s,t) \\
&= \frac{(1-F(t))^2}{n}\sum_{i=1}^{n}(n-i+1)\big[F(t)^i - (F(t)-F(s))^i\big]
\end{aligned}$$

as the other terms are finite or tend to zero upon dividing by $n$. Let us make the substitution $i = n - j + 1$ and consider a general $a \in (0,1]$ with $b = a^{-1}$. Thus,

$$\begin{aligned}
\sum_{i=1}^{n}(n-i+1)a^i &= a^n\sum_{j=1}^{n} j\,b^{j-1} \\
&= a^n\Bigg[\frac{n b^{n+1} - (n+1)b^n + 1}{(b-1)^2}\Bigg] \\
&= \frac{nb - (n+1) + a^n}{(b-1)^2}
\end{aligned} \tag{7}$$

by differentiating $\sum_{i=1}^{n} x^i = (x^{n+1} - x)/(x-1)$ with respect to $x$. We have the following pleasing result for the limiting expectation of the persistent Betti number in this simplified i.i.d. case.

Proposition 3.3.

For $X_i$ i.i.d. having distribution $F$, we have that

$$\frac{\mathbb{E}[\beta_{0,n}^{s,t}]}{n} \to \frac{(1-F(t))F(s)}{1-F(t)+F(s)},$$

for any $-\infty < s \leq t \leq \infty$ with $F(s) \in (0,1)$, and $0$ otherwise.

Proof.

Dividing by $n$ and taking the limit in (7) for the two cases $a = F(t)$ and $a = F(t) - F(s)$ gives

$$\frac{(1-F(t))^2}{1/F(t) - 1} = (1-F(t))F(t),$$

and

$$\frac{(1-F(t))^2}{1/[F(t)-F(s)] - 1} = \frac{(1-F(t))^2[F(t)-F(s)]}{1-F(t)+F(s)}.$$

Taking the difference of the two expressions above and simplifying yields the ultimate result. ∎
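Proposition 3.3 is easy to verify by simulation. The sketch below is ours; it uses the fact that, for this path filtration, $\beta_{0,n}^{s,t}$ equals the number of maximal runs of consecutive values at most $t$ containing at least one value at most $s$, and it compares a Monte Carlo estimate of $\mathbb{E}[\beta_{0,n}^{s,t}]/n$ for i.i.d. uniform noise with the stated limit.

```python
import numpy as np

def betti0_st(x, s, t):
    """beta_{0,n}^{s,t}: count maximal runs of values <= t that contain a value <= s."""
    count, in_run, has_s = 0, False, False
    for v in x:
        if v <= t:
            in_run, has_s = True, has_s or (v <= s)
        else:
            count += int(in_run and has_s)
            in_run, has_s = False, False
    return count + int(in_run and has_s)

rng = np.random.default_rng(2)
s, t, n, reps = 0.3, 0.6, 2000, 50
est = np.mean([betti0_st(rng.uniform(size=n), s, t) / n for _ in range(reps)])
print(est, (1 - t) * s / (1 - t + s))   # with F(x) = x the limit is about 0.1714
```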

Example 3.4.

If the stationary and ergodic sequence in Theorem 3.1 is i.i.d., Proposition 3.3 shows we can characterize the limiting probability measure $\xi_0$ quite nicely. We note that

$$\xi_0\big((-\infty,s]\times(t,\infty]\big) = \frac{3(1-F(t))F(s)}{1-F(t)+F(s)}$$

for all $-\infty < s \leq t \leq \infty$, as $\mathbb{P}(X_2 < X_1 \wedge X_3) = 1/3$. Therefore, $\xi_0$ admits a probability density

$$-\frac{\partial^2}{\partial x\,\partial y}\Bigg[\frac{3(1-F(y))F(x)}{1-F(y)+F(x)}\Bigg] = \frac{6 f(x) f(y) (1-F(y)) F(x)}{(1-F(y)+F(x))^3}.$$

This density facilitates the simulation of random variables according to the limiting persistence distribution $\xi_0^{\text{NULL}}$ in the case that $\mathcal{X}$ corresponds to i.i.d. noise. After a Monte Carlo random sample is generated from this distribution, we may test for “significant” points $(b,d)$ in the diagram $\xi_{0,n}$, based on what we would expect from $\xi_0^{\text{NULL}}$.

Of particular importance to us is the partial derivative

$$\frac{\partial}{\partial x}\Bigg[\frac{3(1-F(y))F(x)}{1-F(y)+F(x)}\Bigg] = \frac{3 f(x)(1-F(y))^2}{(1-F(y)+F(x))^2}. \tag{8}$$

If we set $y = x + \ell$, then (8) evaluates to

$$3\Bigg(\frac{1-F(x+\ell)}{1-F(x+\ell)+F(x)}\Bigg)^2 f(x).$$

Define $\Delta_\ell := \{(x,y) \in \Delta : y - x > \ell\}$ for $\ell \geq 0$. As a result of the above discussion, we have the following corollary.

Corollary 3.5.

For $X_1, X_2, \dots$ i.i.d. with distribution function $F$ satisfying the conditions of Theorem 3.1, we have that

$$\xi_0(\Delta_\ell) = 3\,\mathbb{E}\Bigg[\bigg(\frac{1-F(X+\ell)}{1-F(X+\ell)+F(X)}\bigg)^2\Bigg],$$

where $X \overset{d}{=} X_1$.

Example 3.6.

Corollary 3.5 implies that for $F$ the uniform distribution function on $[0,1]$ we have for $0 < \ell < 1$ that

$$\xi_0(\Delta_\ell) = 3\int_0^{1-\ell}\bigg(\frac{1-\ell-x}{1-\ell}\bigg)^2\,\mathrm{d}x = 1-\ell.$$

This is rather interesting, given that there is no a priori reason that uniform noise should also produce asymptotically uniformly distributed persistence lifetimes.
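Example 3.6 can also be checked numerically: for i.i.d. uniform noise the empirical fraction of diagram lifetimes exceeding $\ell$ should approach $1-\ell$. Below is a self-contained sketch (ours; it recomputes the finite lifetimes with the elder rule, in compact form).

```python
import numpy as np

def lifetimes(x):
    """Finite lifetimes d - b of the 0-dim sublevel persistence diagram of x (elder rule)."""
    parent, birth, out = {}, {}, []
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in np.argsort(x):
        parent[i], birth[i] = i, x[i]
        for j in (i - 1, i + 1):
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                    out.append(x[i] - birth[young])   # death minus birth
                    parent[young] = old
    return np.array(out)

rng = np.random.default_rng(3)
lt = lifetimes(rng.uniform(size=20000))
for ell in (0.1, 0.3, 0.5, 0.7):
    print(ell, np.mean(lt > ell), 1 - ell)   # empirical fraction vs. 1 - ell
```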

Before addressing strong laws for unbounded functions, we conclude with a corollary of Theorem 3.1, establishing a Glivenko-Cantelli result for persistence lifetimes. We omit the proof of Corollary 3.7 as it is proved in exactly the same manner as the Glivenko-Cantelli theorem—see Theorem 1.3 in Dudley (2014).

Corollary 3.7.

Suppose the conditions on the sequence $\mathcal{X}$ stated in Theorem 3.1 hold. Then we have

$$\sup_{\ell\in[0,\infty)}\Bigg|\frac{\xi_{0,n}(\Delta_\ell)}{\xi_{0,n}(\Delta)} - \xi_0(\Delta_\ell)\Bigg| \to 0\ \ \mathrm{a.s.}, \quad n \to \infty.$$

3.1. SLLN for unbounded functions

At this point, we have established almost surely that

$$\tilde{\xi}_{0,n}(f)/\tilde{\xi}_{0,n}(\Delta) \to \tilde{\xi}_0(f),$$

for any bounded, continuous real-valued function $f$ on $\tilde{\Delta}$, when $\tilde{\xi}_{0,n}$ is induced by a stationary and ergodic sequence of random variables (similarly for $\xi_{0,n}$). In general, if $f$ is a continuous, nonnegative function and $f \wedge M$ is the function that equals $M$ when $f \geq M$, then almost surely

$$\tilde{\xi}_{0,n}(f\wedge M)/\tilde{\xi}_{0,n}(\Delta) = \frac{\sum_{(b,d)\in\tilde{\xi}_{0,n}} f(b,d)\wedge M}{\sum_{(b,d)\in\tilde{\xi}_{0,n}} 1} \to \int_{\tilde{\Delta}} f(x,y)\wedge M\ \tilde{\xi}_0(\mathrm{d}x,\mathrm{d}y), \quad n\to\infty,$$

for all $M > 0$. Following this line of inquiry, we establish a result which yields convergence results for a large class of persistence statistics often seen in practice, including many of the functions for which convergence holds for geometric complexes in Divol and Polonik (2019), though we make no requirements on the behavior near the diagonal nor do we require polynomial growth. Prior to stating the result, it is necessary to define the notion of largely nondecreasing. We say that an unbounded function $g : \mathbb{R}_+ \to \mathbb{R}_+$ is largely nondecreasing if there exists an $M > 0$ such that $\{x : g(x) \geq M\}$ is non-empty and $g$ is nondecreasing on $[g^{\leftarrow}(M), \infty)$, where $g^{\leftarrow}(M) = \inf\{x : g(x) \geq M\}$. Furthermore, recall that the function $g$ is coercive if $g(x) \to \infty$ as $x \to \infty$.

Theorem 3.8.

Assume the conditions of Theorem 3.1 and suppose that $f(b,d) = g(d-b)$ and $g : \mathbb{R}_+ \to \mathbb{R}_+$ is a continuous, coercive, and largely nondecreasing function with $\mathbb{E}\big[g(2|X_1|)^{1+\epsilon}\big] < \infty$ for some $\epsilon > 0$. If $\tilde{\xi}_0(f) < \infty$, then

$$\tilde{\xi}_{0,n}(f)/\tilde{\xi}_{0,n}(\Delta) \to \tilde{\xi}_0(f), \ \ \mathrm{a.s.}, \quad n \to \infty.$$
Proof.

Before beginning, fix any $M > 0$ such that $g$ is nondecreasing on $[g^{\leftarrow}(M),\infty)$. We will focus our proof on the case where the marginal distribution $F$ can take negative and positive values, but the proofs follow from a simplified version of the argument below when the support of $F$ is restricted to a half-line. To show that

$$\tilde{\xi}_{0,n}(f)/\tilde{\xi}_{0,n}(\Delta) \to \tilde{\xi}_0(f),$$

for ff as in the statement of the theorem, it will suffice to first bound the quantity

$$\frac{\tilde{\xi}_{0,n}(f)}{\tilde{\xi}_{0,n}(\Delta)} - \frac{\tilde{\xi}_{0,n}(f\wedge M)}{\tilde{\xi}_{0,n}(\Delta)} = \frac{\tilde{\xi}_{0,n}\big((f-M)_+\big)}{\tilde{\xi}_{0,n}(\Delta)} = \tilde{\xi}_{0,n}(\Delta)^{-1}\sum_{\substack{(b,d)\in\xi_{0,n},\\ f(b,d)\geq M}} f(b,d). \tag{9}$$

Recall that $f(b,d) = g(d-b)$. In this situation, we have that the unnormalized form of (9) equals

$$\begin{aligned}
\sum_{d-b\geq g^{\leftarrow}(M)} g(d-b) &= \sum_{\substack{d-b\geq g^{\leftarrow}(M),\\ b\geq 0}} g(d-b) + \sum_{\substack{d-b\geq g^{\leftarrow}(M),\\ b<0,\ d<0}} g(d-b) + \sum_{\substack{d-b\geq g^{\leftarrow}(M),\\ b<0,\ d\geq 0}} g(d-b) \\
&\leq \sum_{d\geq g^{\leftarrow}(M)} g(d) + \sum_{-b\geq g^{\leftarrow}(M)} g(-b) + \sum_{\substack{d-b\geq g^{\leftarrow}(M),\\ b<0,\ d\geq 0}} \big(g(2d)+g(-2b)\big) \\
&\leq \sum_{d\geq g^{\leftarrow}(M)} g(d) + \sum_{-b\geq g^{\leftarrow}(M)} g(-b) + \sum_{\substack{2\max\{d,-b\}\geq g^{\leftarrow}(M),\\ b<0,\ d\geq 0}} \big(g(2d)+g(-2b)\big),
\end{aligned} \tag{10}$$

because of the fact that $g(d-b) \leq g\big(2\max\{d,-b\}\big) \leq g(2d) + g(-2b)$ when $b < 0$, $d \geq 0$ and we have $d - b \geq g^{\leftarrow}(M)$. Furthermore,

$$\begin{aligned}
\sum_{\substack{2\max\{d,-b\}\geq g^{\leftarrow}(M),\\ b<0,\ d\geq 0}} g(2d) &= \sum_{(b,d)\in\xi_{0,n}} g(2d)\,\mathbf{1}\big\{2\max\{d,-b\}\geq g^{\leftarrow}(M)\big\}\big(\mathbf{1}\{d>-b\} + \mathbf{1}\{d\leq -b\}\big) \\
&= \sum_{(b,d)\in\xi_{0,n}} g(2d)\,\mathbf{1}\big\{2d\geq g^{\leftarrow}(M)\big\}\mathbf{1}\{d>-b\} + \sum_{(b,d)\in\xi_{0,n}} g(2d)\,\mathbf{1}\big\{-2b\geq g^{\leftarrow}(M)\big\}\mathbf{1}\{d\leq -b\} \\
&\leq \sum_{2d\geq g^{\leftarrow}(M)} g(2d) + \sum_{-2b\geq g^{\leftarrow}(M)} g(-2b).
\end{aligned}$$

This occurs as $g(x) \leq g(y)$ if $y \geq g^{\leftarrow}(M) \vee x$. With a similar argument for the $g(-2b)$ term, we can see that (10) is bounded above by

$$\sum_{2d\geq g^{\leftarrow}(M)} 3g(2d) + \sum_{-2b\geq g^{\leftarrow}(M)} 3g(-2b).$$

By a similar argument to Proposition 2.3, a death occurs at $d = X_i$ if and only if $X_i$ is a local maximum. Birkhoff's ergodic theorem then implies that

$$\sum_{2d\geq g^{\leftarrow}(M)} g(2d)/n \to \mathbb{E}\big[g(2X_2)\,\mathbf{1}\{X_2 > X_1 \vee X_3\}\,\mathbf{1}\{2X_2 \geq g^{\leftarrow}(M)\}\big], \quad \text{a.s.},$$

as $n \to \infty$. Hölder's inequality then implies that for $p > 1$ and $q = p/(p-1)$,

$$\mathbb{E}\big[g(2X_2)\,\mathbf{1}\{X_2 > X_1 \vee X_3\}\,\mathbf{1}\{2X_2 \geq g^{\leftarrow}(M)\}\big] \leq \Big(\mathbb{E}[g(2|X_2|)^p]\Big)^{1/p}\Big(\mathbb{P}\big(2|X_2| \geq g^{\leftarrow}(M)\big)\Big)^{1/q}.$$

By assumption, $\mathbb{E}[g(2|X_2|)^p] < \infty$ for some $p > 1$, so that coercivity of $g$ entails we may choose $M > 0$ large enough such that

$$\mathbb{E}\big[g(2X_2)\,\mathbf{1}\{X_2 > X_1 \vee X_3\}\,\mathbf{1}\{2X_2 \geq g^{\leftarrow}(M)\}\big] < \epsilon\,\mathbb{P}(X_2 < X_1 \wedge X_3)/18.$$

Therefore, for such an $M$ we have

$$\limsup_{n\to\infty}\ \sum_{2d\geq g^{\leftarrow}(M)} 3g(2d)\Big/\tilde{\xi}_{0,n}(\Delta) < \epsilon/6, \ \ \text{a.s.}$$

A similar argument holds for the term

$$\sum_{-2b\geq g^{\leftarrow}(M)} 3g(-2b),$$

so the subadditivity of $\limsup$ furnishes that

$$\limsup_{n\to\infty}\ \sum_{d-b\geq g^{\leftarrow}(M)} g(d-b)\Big/\tilde{\xi}_{0,n}(\Delta) < \epsilon/3, \ \ \text{a.s.}$$

By Theorem 3.1 and the triangle inequality, it remains to show that

$$\tilde{\xi}_0\big((f-M')_+\big) < \epsilon/3$$

for some $M' \geq M$, which follows from $\tilde{\xi}_0(f) < \infty$. ∎

If all $X_i$ are nonnegative, we have an easy corollary to Theorem 3.8. We omit the proof as it follows directly from the one above.

Corollary 3.9.

If $X_i \geq 0$ for all $i = 1, 2, \dots$ then Theorem 3.8 holds for $f(b,d) = g(d+b)$.

The utility of Theorem 3.8 can be seen in the following section.

3.2. Strong law of large numbers: two examples

Strong laws of large numbers can be established from Theorem 3.8 for various quantities used in topological data science called persistence statistics. For instance, we have a strong law of large numbers for degree-$p$ total persistence (see Cohen-Steiner et al. (2010) for a definition and Divol and Polonik (2019) for the geometric complex result), provided that

$$\mathbb{E}\big[|X_1|^{p+\epsilon}\big] < \infty.$$

A more difficult example is persistent entropy (Merelli et al., 2015; Atienza et al., 2020). Persistent entropy has been used as part of a suite of statistics in the studies of Chung et al. (2021, 2022, 2024) and Thomas et al. (2024), as well as to detect activation in the immune system (Rucco et al., 2016), and to detect structure in nanoparticle images (Thomas et al., 2023; Crozier et al., 2024). The definition (excluding the longest barcode) is

$$E(X_1,\dots,X_n) \equiv E_n := -\sum_{(b,d)\in\tilde{\xi}_{0,n}} \frac{d-b}{L_n}\log\Bigg(\frac{d-b}{L_n}\Bigg),$$

where $L_n := \sum_{(b,d)\in\tilde{\xi}_{0,n}} (d-b)$. We may represent $E_n$ as

$$-L_n^{-1}\sum_{(b,d)\in\tilde{\xi}_{0,n}} (d-b)\log(d-b) + \log L_n.$$

Another nontrivial statistic of interest is the ALPS statistic, defined in Thomas et al. (2023) and utilized in Thomas et al. (2023), Crozier et al. (2024), and Thomas et al. (2024). Its representation is

$$A(X_1,\dots,X_n) \equiv A_n := \int_0^{\infty}\log\xi_{0,n}(\Delta_\ell)\,\mathrm{d}\ell,$$

and we define a truncation of the ALPS statistic as $A_n^L := \int_0^L \log\xi_{0,n}(\Delta_\ell)\,\mathrm{d}\ell$. Before continuing, let us define $f_e(b,d) = (d-b)\log(d-b)$ and $f_I(b,d) = d-b$. Both $f_e + 1$ and $f_I$ are continuous, coercive, and largely nondecreasing in $d-b$.
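Both statistics are simple to evaluate from a diagram. The sketch below (ours, not taken from the cited works) computes $E_n$ from the finite lifetimes and computes $A_n^L$ exactly, using the fact that $\ell \mapsto \xi_{0,n}(\Delta_\ell)$ is a step function that drops at each lifetime.

```python
import numpy as np

def persistent_entropy(diagram):
    """E_n computed from the finite lifetimes (the infinite barcode is excluded)."""
    lt = np.array([d - b for b, d in diagram if np.isfinite(d)])
    p = lt / lt.sum()
    return float(-np.sum(p * np.log(p)))

def alps(diagram, L):
    """Truncated ALPS statistic A_n^L; log xi_{0,n}(Delta_ell) is piecewise constant."""
    lt = np.sort([d - b for b, d in diagram])               # infinite lifetimes sort last
    grid = np.concatenate(([0.0], lt[(lt > 0) & (lt < L)], [L]))
    counts = [np.sum(lt > ell) for ell in grid[:-1]]        # xi_{0,n}(Delta_ell) on each piece
    return float(np.sum(np.diff(grid) * np.log(counts)))

toy = [(0.1, 0.4), (0.2, 0.9), (0.05, np.inf)]
print(persistent_entropy(toy), alps(toy, L=1.0))
```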

Corollary 3.10.

Assuming the conditions of Theorems 3.1 and 3.8, we have that

$$E_n - \log\tilde{\xi}_{0,n}(\Delta) \to -\frac{\tilde{\xi}_0(f_e)}{\tilde{\xi}_0(f_I)} + \log\tilde{\xi}_0(f_I), \ \ \mathrm{a.s.},$$

and for any $L > 0$ with $\xi_0(\Delta_L) > 0$ we have

$$L\log\xi_{0,n}(\Delta) - A_n^L \to -\int_0^L \log\xi_0(\Delta_\ell)\,\mathrm{d}\ell, \ \ \mathrm{a.s.},$$

as $n \to \infty$. That is, the sublevel set persistent entropy and the ALPS statistic of a stationary and ergodic process converge almost surely.

Proof.

The proof follows fairly simply from Theorem 3.8. We know that

$$E_n = \frac{-\tilde{\xi}_{0,n}(f_e+1) + \tilde{\xi}_{0,n}(\Delta)}{\tilde{\xi}_{0,n}(f_I)} + \log\tilde{\xi}_{0,n}(f_I).$$

Subtracting $\log\tilde{\xi}_{0,n}(\Delta)$ and applying Theorem 3.8 yields a limit of

$$\frac{-\tilde{\xi}_0(f_e+1) + 1}{\tilde{\xi}_0(f_I)} + \log\tilde{\xi}_0(f_I),$$

which finishes the proof, as $\tilde{\xi}_0$ is a probability measure. For the ALPS statistic, we see that

$$L\log\xi_{0,n}(\Delta) - A_n^L = \int_0^L \log\bigg(\frac{\xi_{0,n}(\Delta)}{\xi_{0,n}(\Delta_\ell)}\bigg)\,\mathrm{d}\ell.$$

If we fix a positive $\epsilon < \xi_0(\Delta_L)$, Corollary 3.7 implies that for $n \geq N(\omega)$ ($N$ depending on the sample point $\omega \in \Omega$), we have

$$\log\bigg(\frac{\xi_{0,n}(\Delta)}{\xi_{0,n}(\Delta_\ell)}\bigg) \leq -\log\big(\xi_0(\Delta_\ell) - \epsilon\big) \leq -\log\big(\xi_0(\Delta_L) - \epsilon\big),$$

for all $\ell \in [0,L]$. Therefore, the hypotheses of the bounded convergence theorem hold for all $\omega \in \Omega$ such that convergence holds. Hence, our result follows almost surely. ∎

Having demonstrated our strong law of large numbers for persistence diagrams, and its ramifications, we now turn our attention to the central limit theorem.

4. Central limit theorem

In this section, we prove a central limit theorem for the integral $\xi_{0,n}(f)$, where $f$ is a step function. This follows from proving a CLT for linear combinations of persistent Betti numbers $\beta_{0,n}^{s,t}$ using the Lindeberg method for weakly dependent triangular arrays given in Neumann (2013). The desired result will follow as a consequence of demonstrating that

$$n^{-1/2}\sum_{l=1}^{m} a_l\Big(\beta_{0,n}^{s_l,t_l} - \mathbb{E}[\beta_{0,n}^{s_l,t_l}]\Big)$$

obeys a central limit theorem when $\mathcal{X}_n$ obeys weak dependence conditions (to be specified below) and $a_1, \dots, a_m$ are arbitrary real numbers. The reason for this is that if $R_l = (s_1,s_2]\times(t_1,t_2]$ then

$$\mathbf{1}_{R_l} = \mathbf{1}_{(-\infty,s_2]\times(t_1,\infty]} - \mathbf{1}_{(-\infty,s_2]\times(t_2,\infty]} - \mathbf{1}_{(-\infty,s_1]\times(t_1,\infty]} + \mathbf{1}_{(-\infty,s_1]\times(t_2,\infty]}.$$

The Cramér-Wold device also provides us with finite-dimensional weak convergence as an added benefit.

As for the aforementioned notions of weak dependence, the one we employ is that of $\rho$-mixing. To begin, note that for any two sub-$\sigma$-algebras $\mathcal{A}, \mathcal{B} \subset \mathcal{F}$ we define

$$\rho(\mathcal{A},\mathcal{B}) := \sup_{X\in L^2(\mathcal{A}),\, Y\in L^2(\mathcal{B})} \big|\mathrm{Corr}(X,Y)\big|,$$

where $L^2(\mathcal{A})$ (resp. $L^2(\mathcal{B})$) is the space of square-integrable $\mathcal{A}$-measurable (resp. $\mathcal{B}$-measurable) random variables (for random variables $X, Y$ the value $\mathrm{Corr}(X,Y) = \mathrm{Cov}(X,Y)/\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}$). Furthermore, we define

$$\rho_{\mathcal{X}}(k) := \sup_{m\in\mathbb{N}} \rho\big(\sigma(X_1,\dots,X_m),\, \sigma(X_{m+k},X_{m+k+1},\dots)\big),$$

so that the stochastic process $\mathcal{X} = (X_1, X_2, \dots)$ is said to be $\rho$-mixing if $\rho_{\mathcal{X}}(k) \to 0$ as $k \to \infty$. For our limit theorems, we will require that $\sum_{k=1}^{\infty}\rho_{\mathcal{X}}(k) < \infty$, which implies $\rho$-mixing. More details on $\rho$-mixing and other mixing conditions can be seen in Bradley (2005). Another particularly important condition for our proofs is that our stationary process satisfies a condition requiring the distributions of the partial maxima to decay sufficiently quickly. This serves to limit any percolation-esque phenomena that would preclude a central limit theorem.

Definition 4.1.

A stationary stochastic process $\mathcal{X} = (X_1, X_2, \dots)$ with marginal distribution function $F$ is said to be max-root summable if for all $t$ with $F(t) < 1$ we have

$$\sum_{i=1}^{\infty} i\sqrt{\mathbb{P}(X_1 \leq t, \dots, X_i \leq t)} < \infty.$$
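For instance, if $\mathcal{X}$ is i.i.d. then $\mathbb{P}(X_1 \leq t, \dots, X_i \leq t) = F(t)^i$, so the series above becomes $\sum_{i=1}^{\infty} i\,F(t)^{i/2}$, which is finite whenever $F(t) < 1$; thus every i.i.d. sequence is max-root summable.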

Before stating our main theorem, we will establish conditions on the stochastic process that guarantee max-root summability.

Proposition 4.2.

Suppose that 𝒳\mathcal{X} is a stationary stochastic process. If there is some ϵ>0\epsilon>0 such that

(X1t,,Xnt)=O(n4ϵ),\mathbb{P}(X_{1}\leq t,\dots,X_{n}\leq t)=O(n^{-4-\epsilon}),

for all tt with F(t)<1F(t)<1, then 𝒳\mathcal{X} is max-root summable.

Proof.

If the condition above holds, then there is some CtC_{t} such that

n(X1t,,Xnt)Ctn1ϵ/2,n\sqrt{\mathbb{P}(X_{1}\leq t,\dots,X_{n}\leq t)}\leq\sqrt{C_{t}}n^{-1-\epsilon/2},

the right-hand side of which is clearly summable. ∎

Example 4.3.

Suppose that 𝒳\mathcal{X} is a (stationary) Markov chain with transition kernel PP such that for every tt with F(t)<1F(t)<1 there is some ηt>0\eta_{t}>0 that satisfies

supxtP(x,(,t])1ηt.\sup_{x\leq t}P\big{(}x,(-\infty,t]\big{)}\leq 1-\eta_{t}.

By Theorem 3.4.1 in Meyn and Tweedie (2009), we have that

(X1t,,Xnt)\displaystyle\mathbb{P}(X_{1}\leq t,\dots,X_{n}\leq t) =x1txn1tF(dx1)P(x1,dx2)P(xn2,dxn1)P(xn1,(,t])\displaystyle=\int_{x_{1}\leq t}\cdots\int_{x_{n-1}\leq t}F(\operatorname{d\!}x_{1})P(x_{1},\operatorname{d\!}x_{2})\cdots P(x_{n-2},\operatorname{d\!}x_{n-1})P\big{(}x_{n-1},(-\infty,t]\big{)}
x1txn1tF(dx1)P(x1,dx2)P(xn2,dxn1)(1ηt).\displaystyle\leq\int_{x_{1}\leq t}\cdots\int_{x_{n-1}\leq t}F(\operatorname{d\!}x_{1})P(x_{1},\operatorname{d\!}x_{2})\cdots P(x_{n-2},\operatorname{d\!}x_{n-1})(1-\eta_{t}).

Therefore, induction furnishes that

(X1t,,Xnt)F(t)(1ηt)n1,\mathbb{P}(X_{1}\leq t,\dots,X_{n}\leq t)\leq F(t)(1-\eta_{t})^{n-1},

and the condition in Proposition 4.2 is readily verified.
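As a numerical illustration of this example (a sketch only; the chain and the threshold below are toy choices of ours), for a finite-state stationary chain one can compute (X1t,,Xnt)\mathbb{P}(X_{1}\leq t,\dots,X_{n}\leq t) exactly by restricting the transition matrix to the states at or below tt, mirroring the displayed integral, and compare it with the geometric bound F(t)(1ηt)n1F(t)(1-\eta_{t})^{n-1}.

```python
import numpy as np

# A toy stationary Markov chain on the ordered states {0, 1, 2}.
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])
evals, evecs = np.linalg.eig(P.T)                     # stationary distribution: pi P = pi
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()

low = [0, 1]                                          # the event {X <= t} means X in {0, 1}
F_t = pi[low].sum()                                   # marginal F(t), here < 1
eta_t = 1.0 - P[np.ix_(low, low)].sum(axis=1).max()   # so sup_{x <= t} P(x, (-inf, t]) = 1 - eta_t
Q = P[np.ix_(low, low)]                               # kernel restricted to the low states

for n in (5, 10, 20, 40):
    # P(X_1 <= t, ..., X_n <= t) as a restricted matrix product, as in the displayed integral
    exact = pi[low] @ np.linalg.matrix_power(Q, n - 1) @ np.ones(len(low))
    bound = F_t * (1.0 - eta_t) ** (n - 1)
    print(n, exact, bound, exact <= bound)
```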

Example 4.4.

Suppose that 𝒳\mathcal{X} is stationary and mm-dependent, i.e., ψ𝒳(k)=0\psi_{\mathcal{X}}(k)=0 for all km+1k\geq m+1. Then we have

(X1t,,Xnt)\displaystyle\mathbb{P}(X_{1}\leq t,\dots,X_{n}\leq t) (X1t,Xm+2t,,Xn1m+1(m+1)+1t)\displaystyle\leq\mathbb{P}(X_{1}\leq t,X_{m+2}\leq t,\dots,X_{\lfloor\frac{n-1}{m+1}\rfloor(m+1)+1}\leq t)
=F(t)n1m+1+1.\displaystyle=F(t)^{\lfloor\frac{n-1}{m+1}\rfloor+1}.

Because F(t)=0F(t)=0 establishes max-root summability trivially, we take 0<F(t)<10<F(t)<1. Then, since (n1m+1+1)log[1/F(t)]klogn(\lfloor\frac{n-1}{m+1}\rfloor+1)\log\big{[}1/F(t)\big{]}\geq k\log n for any k>0k>0 and all nn large enough, the condition in Proposition 4.2 is established.
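For instance (a brief numerical sketch, with an illustrative choice of marginal that is ours), in the i.i.d. case, i.e. m=0m=0 above, we have (X1t,,Xit)=F(t)i\mathbb{P}(X_{1}\leq t,\dots,X_{i}\leq t)=F(t)^{i}, so the terms i\sqrt{F(t)^{i}} decay geometrically and their partial sums stabilize quickly.

```python
import numpy as np
from math import erf, sqrt

# i.i.d. standard normal sequence (the m = 0 case above):
# P(X_1 <= t, ..., X_i <= t) = F(t)^i, so i * sqrt(F(t)^i) decays geometrically.
t = 1.5
F_t = 0.5 * (1.0 + erf(t / sqrt(2.0)))        # standard normal CDF at t
i = np.arange(1, 2001)
partial_sums = np.cumsum(i * np.sqrt(F_t ** i))
print(partial_sums[[9, 99, 999, 1999]])       # the partial sums settle at a finite limit
```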

To establish our CLT (Theorem 4.6 below), we first need to assess the limiting behavior of the covariance.

Proposition 4.5.

Let 𝒳\mathcal{X} be a stationary stochastic process that is max-root summable and satisfies k=1ρ𝒳(k)<\sum_{k=1}^{\infty}\rho_{\mathcal{X}}(k)<\infty. Assume further that the marginal distribution FF of XiX_{i} is continuous. Suppose that <siti-\infty<s_{i}\leq t_{i}\leq\infty for i=1,2i=1,2 with F(s1s2)>0F(s_{1}\wedge s_{2})>0 and F(t1t2)<1F(t_{1}\vee t_{2})<1. Then

limnn1Cov(β0,ns1,t1,β0,ns2,t2)\displaystyle\lim_{n\to\infty}n^{-1}\mathrm{Cov}\Big{(}\beta_{0,n}^{s_{1},t_{1}},\beta_{0,n}^{s_{2},t_{2}}\Big{)}
=Cov(Y2(s1,t1),Y2(s2,t2))\displaystyle\qquad\qquad=\mathrm{Cov}\big{(}Y_{2}^{\infty}(s_{1},t_{1}),Y_{2}^{\infty}(s_{2},t_{2})\big{)}
+k=1[Cov(Y2(s1,t1),Y2+k(s2,t2))+Cov(Y2+k(s1,t1),Y2(s2,t2))].\displaystyle\qquad\qquad+\sum_{k=1}^{\infty}\bigg{[}\mathrm{Cov}\big{(}Y_{2}^{\infty}(s_{1},t_{1}),Y_{2+k}^{\infty}(s_{2},t_{2})\big{)}+\mathrm{Cov}\big{(}Y_{2+k}^{\infty}(s_{1},t_{1}),Y_{2}^{\infty}(s_{2},t_{2})\big{)}\bigg{]}.

where the terms Yj(s,t)Y_{j}^{\infty}(s,t) are defined at (4).

With all of this at hand, we may finally state the central limit theorem.

Theorem 4.6.

Let 𝒳\mathcal{X} be a stationary stochastic process that is max-root summable and satisfies k=1ρ𝒳(k)<\sum_{k=1}^{\infty}\rho_{\mathcal{X}}(k)<\infty. Assume further that the marginal distribution FF of XiX_{i} is continuous. Then for any function f=l=1mal𝟏Rlf=\sum_{l=1}^{m}a_{l}\mathbf{1}_{R_{l}} with ala_{l}\in\mathbb{R} and RlR_{l}\in\mathcal{R}, l=1,,ml=1,\dots,m, if the corners (s,t)(s,t) of the rectangles satisfy F(s)>0F(s)>0 and F(t)<1F(t)<1, we have

n1/2(ξ0,n(f)𝔼[ξ0,n(f)])N(0,If),n^{-1/2}\big{(}\xi_{0,n}(f)-\mathbb{E}[\xi_{0,n}(f)]\big{)}\Rightarrow N(0,I_{f}),

and if all of the coordinates of RlR_{l} lie in \mathbb{R} for l=1,,ml=1,\dots,m, then

n1/2(ξ~0,n(f)𝔼[ξ~0,n(f)])N(0,If),n^{-1/2}\big{(}\tilde{\xi}_{0,n}(f)-\mathbb{E}[\tilde{\xi}_{0,n}(f)]\big{)}\Rightarrow N(0,I_{f}),

as nn\to\infty, where IfI_{f} is a nonnegative constant depending on ff.

We defer the proof to Section 6.
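Before turning to the discussion, we note that Theorem 4.6 can be probed empirically. The following simulation sketch (all choices of process, rectangle, and sample sizes are illustrative and ours; the helpers repeat those in the sketch following the Cramér-Wold reduction above) uses a 1-dependent Gaussian moving average, which is max-root summable by Example 4.4, has ρ𝒳(k)=0\rho_{\mathcal{X}}(k)=0 for k2k\geq 2, and has a continuous marginal, so the hypotheses of Theorem 4.6 hold. It then inspects the standardized values n^{-1/2}(\xi_{0,n}(\mathbf{1}_{R})-\text{mean}) across independent replications for approximate normality.

```python
import numpy as np

def beta0(x, s, t):
    """beta_{0,n}^{s,t}: maximal runs with entries <= t containing an entry <= s."""
    count, in_run, has_s = 0, False, False
    for v in x:
        if v <= t:
            if not in_run:
                in_run, has_s = True, False
            has_s = has_s or (v <= s)
        else:
            count += in_run and has_s
            in_run = False
    return int(count + (in_run and has_s))

def xi0_rect(x, s1, s2, t1, t2):
    """xi_{0,n}((s1,s2] x (t1,t2]) by inclusion-exclusion over persistent Betti numbers."""
    return beta0(x, s2, t1) - beta0(x, s2, t2) - beta0(x, s1, t1) + beta0(x, s1, t2)

rng = np.random.default_rng(1)
n, reps = 2000, 500
rect = (-1.0, 0.0, 0.5, 1.5)                    # R = (-1, 0] x (0.5, 1.5]

vals = np.empty(reps)
for r in range(reps):
    z = rng.standard_normal(n + 1)
    x = z[:-1] + 0.5 * z[1:]                    # 1-dependent stationary moving average
    vals[r] = xi0_rect(x, *rect)

standardized = (vals - vals.mean()) / np.sqrt(n)
print("sample std (estimate of sqrt(I_f)):", standardized.std(ddof=1))
print("sample skewness (near 0 for a Gaussian limit):",
      np.mean(standardized ** 3) / standardized.std(ddof=0) ** 3)
```

A histogram or quantile comparison of the standardized values against a fitted normal distribution provides a more complete visual check of the same conclusion.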

5. Discussion

In this paper, we have demonstrated a strong law of large numbers for a large class of integrals with respect to the random measure induced by the 0th0^{th} sublevel set persistent homology of general stationary and ergodic processes. We also proved a central limit theorem for the same random measure for a large class of step functions. As the SLLNs (by consideration of the negated process X1,X2,-X_{1},-X_{2},\dots) also pertain to superlevel sets, it would be interesting to consider the limiting behavior of the persistent homology of the extremes of a stationary stochastic process, owing to the natural connection between the superlevel set value β0,nun(τ),un(τ)\beta_{0,n}^{u_{n}(\tau),u_{n}(\tau)} (the number of connected components above the levels un(τ)u_{n}(\tau), τ0\tau\geq 0) and the clusters of exceedances seen in the extreme value theory literature (see chapter 6 of Kulik and Soulier, 2020).

Two potential improvements for this paper seem to lie in the weakening of conditions and the augmentation of the class of functions for which the central limit theorem holds (Theorem 4.6). There are likely only improvements to be made in the latter case, as the k=1ρ𝒳(k)<\sum_{k=1}^{\infty}\rho_{\mathcal{X}}(k)<\infty condition is only slightly stronger than the slowest mixing rate of k=1k1ρ𝒳(k)<\sum_{k=1}^{\infty}k^{-1}\rho_{\mathcal{X}}(k)<\infty for a conventional CLT to hold for a stationary sequence (Bradley, 1987). Improving the latter seemingly depends on a more precise treatment of the covariance in Proposition 4.5, which is rather tedious as it stands. Nonetheless, such improvements would be useful, as the class of functions of persistence diagrams used in practice is large, which is what motivated Section 3.1 (and this paper) to begin with. Expanding the CLT results to a functional CLT for the persistent Betti numbers (as in Krebs and Hirsch, 2022) may yield some progress towards this end, but we leave all the pursuits mentioned in these last two paragraphs for future work.

6. Central limit theorem proof

For the proof of our central limit theorem, we will employ Theorem 2.1 from Neumann (2013), which establishes a CLT for potentially nonstationary weakly dependent triangular arrays. As mentioned at the beginning of Section 4, it is sufficient to show that

(11) n1/2l=1mal(β0,nsl,tl𝔼[β0,nsl,tl]),n^{-1/2}\sum_{l=1}^{m}a_{l}\bigg{(}\beta_{0,n}^{s_{l},t_{l}}-\mathbb{E}[\beta_{0,n}^{s_{l},t_{l}}]\bigg{)},

converges to a Gaussian distribution for every choice of a1,,ama_{1},\dots,a_{m}\in\mathbb{R}. Recall that at (4) we defined the indicator (Bernoulli) random variable Yj,nm(s,t)Y^{m}_{j,n}(s,t) and on the following line we noticed that

β0,ns,t=j=1nYj,nnj+1(s,t),\beta_{0,n}^{s,t}=\sum_{j=1}^{n}Y^{n-j+1}_{j,n}(s,t),

so that (11) is equal to

n1/2j=1nl=1mal(Yj,nnj+1(sl,tl)𝔼[Yj,nnj+1(sl,tl)]).n^{-1/2}\sum_{j=1}^{n}\sum_{l=1}^{m}a_{l}\bigg{(}Y_{j,n}^{n-j+1}(s_{l},t_{l})-\mathbb{E}[Y_{j,n}^{n-j+1}(s_{l},t_{l})]\bigg{)}.

For the proof of the CLT it is convenient to first establish a CLT for a truncated version of the persistent Betti numbers, as was done in the proof of the Betti number CLT for the critical regime in the geometric setting, in Theorem 4.1 of Owada and Thomas (2020). Define first

\beta_{0,n,K}^{s,t}=\sum_{j=1}^{n}Y^{(n-j+1)\wedge K}_{j,n}(s,t).

Therefore, if we define

Wj,n:=n1/2l=1mal(Yj,n(nj+1)K(sl,tl)𝔼[Yj,n(nj+1)K(sl,tl)]),W_{j,n}\mathrel{\mathop{\mathchar 58\relax}}=n^{-1/2}\sum_{l=1}^{m}a_{l}\bigg{(}Y_{j,n}^{(n-j+1)\wedge K}(s_{l},t_{l})-\mathbb{E}[Y_{j,n}^{(n-j+1)\wedge K}(s_{l},t_{l})]\bigg{)},

establishing Theorem 4.6 amounts to establishing a CLT for j=1nWj,n\sum_{j=1}^{n}W_{j,n} for each KK and then showing that the difference between β0,ns,t\beta_{0,n}^{s,t} and β0,n,Ks,t\beta_{0,n,K}^{s,t} vanishes in probability. We now quote the theorem that we will use to establish this.

Theorem 6.1 (Theorem 2.1 in Neumann, 2013).

Suppose that (Wj,n)j=1n(W_{j,n})_{j=1}^{n} with nn\in\mathbb{N} is a triangular array of random variables with 𝔼[Wj,n]=0\mathbb{E}[W_{j,n}]=0 for all j,nj,n and supnj=1n𝔼[Wj,n2]M\sup_{n}\sum_{j=1}^{n}\mathbb{E}[W_{j,n}^{2}]\leq M for some M<M<\infty. Suppose further that

(12) Var(j=1nWj,n)σ2,n,\mathrm{Var}\bigg{(}\sum_{j=1}^{n}W_{j,n}\bigg{)}\to\sigma^{2},\quad n\to\infty,

for some σ20\sigma^{2}\geq 0, and that for every ϵ>0\epsilon>0 we have

(13) \sum_{j=1}^{n}\mathbb{E}[W_{j,n}^{2}\mathbf{1}\big{\{}|W_{j,n}|>\epsilon\big{\}}]\to 0,\quad n\to\infty.

Furthermore, assume that there exists a summable sequence of θr\theta_{r}, rr\in\mathbb{N}, such that for all q\in\mathbb{N} and indices 1u1<u2<<uq+r=v1v2n1\leq u_{1}<u_{2}<\cdots<u_{q}+r=v_{1}\leq v_{2}\leq n, the following upper bounds for covariances hold true:

(14) |Cov(g(Wu1,n,,Wuq,n)Wuq,n,Wv1,n)|θr(𝔼[Wuq,n2]+𝔼[Wv1,n2]+n1)\Big{|}\mathrm{Cov}\big{(}g(W_{u_{1},n},\dots,W_{u_{q},n})W_{u_{q},n},W_{v_{1},n}\big{)}\Big{|}\leq\theta_{r}\big{(}\mathbb{E}[W_{u_{q},n}^{2}]+\mathbb{E}[W_{v_{1},n}^{2}]+n^{-1}\big{)}

and

(15) |Cov(g(Wu1,n,,Wuq,n),Wv1,nWv2,n)|θr(𝔼[Wv1,n2]+𝔼[Wv2,n2]+n1)\Big{|}\mathrm{Cov}\big{(}g(W_{u_{1},n},\dots,W_{u_{q},n}),W_{v_{1},n}W_{v_{2},n}\big{)}\Big{|}\leq\theta_{r}\big{(}\mathbb{E}[W_{v_{1},n}^{2}]+\mathbb{E}[W_{v_{2},n}^{2}]+n^{-1}\big{)}

for all measurable g:qg\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}^{q}\to\mathbb{R} with supxq|g(x)|1\sup_{x\in\mathbb{R}^{q}}|g(x)|\leq 1. Then

j=1nWj,nN(0,σ2),n.\sum_{j=1}^{n}W_{j,n}\Rightarrow N(0,\sigma^{2}),\quad n\to\infty.
Proof of Theorem 4.6.

The finite-dimensional CLT proof for β0,n,Ks,t\beta_{0,n,K}^{s,t} follows by checking that the conditions of Theorem 6.1 hold for our setup. First, we notice that

Wj,n2=n1l1=1ml2=1mal1al2(Yj,n(nj+1)K(sl1,tl1)𝔼[Yj,n(nj+1)K(sl1,tl1)])\displaystyle W_{j,n}^{2}=n^{-1}\sum_{l_{1}=1}^{m}\sum_{l_{2}=1}^{m}a_{l_{1}}a_{l_{2}}\bigg{(}Y_{j,n}^{(n-j+1)\wedge K}(s_{l_{1}},t_{l_{1}})-\mathbb{E}[Y_{j,n}^{(n-j+1)\wedge K}(s_{l_{1}},t_{l_{1}})]\bigg{)}
×(Yj,n(nj+1)K(sl2,tl2)𝔼[Yj,n(nj+1)K(sl2,tl2)]),\displaystyle\phantom{W_{j,n}^{2}=n^{-1}\sum_{l_{1}=1}^{m}\sum_{l_{2}=1}^{m}a_{l_{1}}a_{l_{2}}}\times\bigg{(}Y_{j,n}^{(n-j+1)\wedge K}(s_{l_{2}},t_{l_{2}})-\mathbb{E}[Y_{j,n}^{(n-j+1)\wedge K}(s_{l_{2}},t_{l_{2}})]\bigg{)},

so that

𝔼[Wj,n2]=n1l1=1ml2=1mal1al2Cov(Yj,n(nj+1)K(sl1,tl1),Yj,n(nj+1)K(sl2,tl2)),\mathbb{E}[W_{j,n}^{2}]=n^{-1}\sum_{l_{1}=1}^{m}\sum_{l_{2}=1}^{m}a_{l_{1}}a_{l_{2}}\text{Cov}\Big{(}Y_{j,n}^{(n-j+1)\wedge K}(s_{l_{1}},t_{l_{1}}),Y_{j,n}^{(n-j+1)\wedge K}(s_{l_{2}},t_{l_{2}})\Big{)},

which is bounded above by

n1(l=1m|al|Var(Yj,n(nj+1)K(sl,tl)))2Mn1,n^{-1}\bigg{(}\sum_{l=1}^{m}|a_{l}|\sqrt{\mathrm{Var}(Y_{j,n}^{(n-j+1)\wedge K}(s_{l},t_{l})})\bigg{)}^{2}\leq Mn^{-1},

for M:=(l|al|)2M\mathrel{\mathop{\mathchar 58\relax}}=(\sum_{l}|a_{l}|)^{2} by the inequalities |Cov(X,Y)|Var(X)Var(Y)|\text{Cov}(X,Y)|\leq\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)} and Var(𝟏A)(A)1\mathrm{Var}(\mathbf{1}_{A})\leq\mathbb{P}(A)\leq 1. Thus, supnj=1n𝔼[Wj,n2]M<\sup_{n}\sum_{j=1}^{n}\mathbb{E}[W_{j,n}^{2}]\leq M<\infty. If we note that

Var(j=1nWj,n)=n1l1=1ml2=1mal1al2Cov(β0,n,Ksl1,tl1,β0,n,Ksl2,tl2),\mathrm{Var}\bigg{(}\sum_{j=1}^{n}W_{j,n}\bigg{)}=n^{-1}\sum_{l_{1}=1}^{m}\sum_{l_{2}=1}^{m}a_{l_{1}}a_{l_{2}}\text{Cov}\Big{(}\beta_{0,n,K}^{s_{l_{1}},t_{l_{1}}},\beta_{0,n,K}^{s_{l_{2}},t_{l_{2}}}\Big{)},

then Var(j=1nWj,n)\mathrm{Var}\big{(}\sum_{j=1}^{n}W_{j,n}\big{)} converges to some limit σK2\sigma_{K}^{2} via arguments analogous to, and much simpler than, those of Proposition 4.5. Thus, (12) is satisfied. Using the triangle inequality, we see that for each jj

|W_{j,n}|\leq 2n^{-1/2}\sum_{l=1}^{m}|a_{l}|=2(M/n)^{1/2},

using the trivial indicator random variable bound 𝟏A1\mathbf{1}_{A}\leq 1. Therefore, when n\geq 4M/\epsilon^{2} we have that 𝟏{|Wj,n|>ϵ}=0\mathbf{1}\big{\{}|W_{j,n}|>\epsilon\big{\}}=0, so that (13) holds as well. To finish the proof, we must show that (14) and (15) hold in Theorem 6.1 above. For both conditions, we can ignore the case rK+1r\leq K+1, as we may take θr\theta_{r} to be a sufficiently large constant for these finitely many values of rr without affecting summability. Therefore, suppose that r>K+1r>K+1, so that Wuq,nW_{u_{q},n} only depends on indices up to uq+Ku_{q}+K and Wv1,nW_{v_{1},n} only depends on indices starting at v11=uq+r1>uq+Kv_{1}-1=u_{q}+r-1>u_{q}+K.

We will only demonstrate (14), as (15) follows by a similar, simpler argument. For a fixed set of indices u1,,uqu_{1},\dots,u_{q} and fixed nn let us denote G:=g(Wu1,n,,Wuq,n)G\mathrel{\mathop{\mathchar 58\relax}}=g(W_{u_{1},n},\dots,W_{u_{q},n}). By the bilinearity of covariance, it will suffice to establish the required bounds in (14) for a single summand in

n1l1=1ml2=1mal1al2Cov(G{Yuq,n(nuq+1)K(sl1,tl1)𝔼[Yuq,n(nuq+1)K(sl1,tl1)]},\displaystyle n^{-1}\sum_{l_{1}=1}^{m}\sum_{l_{2}=1}^{m}a_{l_{1}}a_{l_{2}}\text{Cov}\Big{(}G\big{\{}Y_{u_{q},n}^{(n-u_{q}+1)\wedge K}(s_{l_{1}},t_{l_{1}})-\mathbb{E}[Y_{u_{q},n}^{(n-u_{q}+1)\wedge K}(s_{l_{1}},t_{l_{1}})]\big{\}},
(16) \displaystyle\phantom{n^{-1}\sum_{l_{1}=1}^{m}\sum_{l_{2}=1}^{m}a_{l_{1}}a_{l_{2}}\text{Cov}\Big{(}}Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l_{2}},t_{l_{2}})-\mathbb{E}[Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l_{2}},t_{l_{2}})]\Big{)},

provided that such a bound is uniform in l1,l2l_{1},l_{2}. It can be shown that the covariance term in (16) is equal to

Cov(GYuq,n(nuq+1)K(sl1,tl1),Yv1,n(nv1+1)K(sl2,tl2))\displaystyle\text{Cov}\Big{(}GY_{u_{q},n}^{(n-u_{q}+1)\wedge K}(s_{l_{1}},t_{l_{1}}),Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l_{2}},t_{l_{2}})\Big{)}
(17) 𝔼[Yuq,n(nuq+1)K(sl1,tl1)]Cov(G,Yv1,n(nv1+1)K(sl2,tl2)),\displaystyle\qquad\qquad\qquad-\mathbb{E}[Y_{u_{q},n}^{(n-u_{q}+1)\wedge K}(s_{l_{1}},t_{l_{1}})]\text{Cov}(G,Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l_{2}},t_{l_{2}})),

and the absolute value of (17) can be bounded above by

|Cov(GYuq,n(nuq+1)K(sl1,tl1),Yv1,n(nv1+1)K(sl2,tl2))|+|Cov(G,Yv1,n(nv1+1)K(sl2,tl2))|.|\text{Cov}\Big{(}GY_{u_{q},n}^{(n-u_{q}+1)\wedge K}(s_{l_{1}},t_{l_{1}}),Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l_{2}},t_{l_{2}})\Big{)}|+|\text{Cov}(G,Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l_{2}},t_{l_{2}}))|.

Because GYuq,n(nuq+1)K(sl1,tl1)=g(Wu1,n,,Wuq,n)GY_{u_{q},n}^{(n-u_{q}+1)\wedge K}(s_{l_{1}},t_{l_{1}})=g^{*}(W_{u_{1},n},\dots,W_{u_{q},n}), for gg^{*} measurable and supx|g(x)|1\sup_{x}|g^{*}(x)|\leq 1, the required bound will follow provided we find a suitable bound for the quantity |Cov(G,Yv1,n(nv1+1)K(sl,tl))||\text{Cov}(G,Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l},t_{l}))|. By definition of ρ\rho-mixing and the trivial bound Var(X)E[X2]\mathrm{Var}(X)\leq E[X^{2}] we have

|Cov(G,Yv1,n(nv1+1)K(sl,tl))|ρ𝒳(r1K)Var(G)Var(Yv1,n(nv1+1)K(sl,tl))ρ𝒳(r1K).|\text{Cov}(G,Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l},t_{l}))|\leq\rho_{\mathcal{X}}(r-1-K)\sqrt{\mathrm{Var}(G)}\sqrt{\mathrm{Var}(Y_{v_{1},n}^{(n-v_{1}+1)\wedge K}(s_{l},t_{l}))}\leq\rho_{\mathcal{X}}(r-1-K).

By assumption, r>K+1ρ𝒳(r1K)<\sum_{r>K+1}\rho_{\mathcal{X}}(r-1-K)<\infty so that (14) is established. As alluded to earlier, the proof for (15) follows in exactly the same way, hence

(18) n1/2l=1mal(β0,n,Ksl,tl𝔼[β0,n,Ksl,tl])N(0,σK2),nn^{-1/2}\sum_{l=1}^{m}a_{l}\bigg{(}\beta_{0,n,K}^{s_{l},t_{l}}-\mathbb{E}[\beta_{0,n,K}^{s_{l},t_{l}}]\bigg{)}\Rightarrow N(0,\sigma_{K}^{2}),\quad n\to\infty

for all KK\in\mathbb{N}. As the dominated convergence assumption holds true in Proposition 4.5, it is straightforward to see that σK2σ2\sigma_{K}^{2}\to\sigma^{2} as KK\to\infty, where σ2\sigma^{2} is the limiting variance of (11). Hence N(0,σK2)N(0,σ2)N(0,\sigma_{K}^{2})\Rightarrow N(0,\sigma^{2}) as KK\to\infty as well (using Lévy’s continuity theorem, for example). Theorem 3.2 in Billingsley (1999) will yield the rest if we can show that

limKlim supn(|Zn,KZn|ϵ)=0,\lim_{K\to\infty}\limsup_{n\to\infty}\mathbb{P}(|Z_{n,K}-Z_{n}|\geq\epsilon)=0,

where ZnZ_{n} is the sum of persistent Betti numbers in (11) and Zn,KZ_{n,K} is the KK-truncated version on the left-hand side of (18). An application of Chebyshev’s inequality and the covariance inequality yields

(|ZnZn,K|ϵ)\displaystyle\mathbb{P}(|Z_{n}-Z_{n,K}|\geq\epsilon) 𝔼|ZnZn,K|2ϵ2\displaystyle\leq\frac{\mathbb{E}\big{|}Z_{n}-Z_{n,K}\big{|}^{2}}{\epsilon^{2}}
=1ϵ2nVar(l=1mal[β0,nsl,tlβ0,n,Ksl,tl])\displaystyle=\frac{1}{\epsilon^{2}n}\mathrm{Var}\bigg{(}\sum_{l=1}^{m}a_{l}\Big{[}\beta_{0,n}^{s_{l},t_{l}}-\beta_{0,n,K}^{s_{l},t_{l}}\Big{]}\bigg{)}
(19) \displaystyle\leq\frac{1}{\epsilon^{2}}\bigg{(}\sum_{l=1}^{m}|a_{l}|\sqrt{n^{-1}\mathrm{Var}\big{(}\beta_{0,n}^{s_{l},t_{l}}-\beta_{0,n,K}^{s_{l},t_{l}}\big{)}}\bigg{)}^{2}.

The quantity n1Var(β0,nsl,tlβ0,n,Ksl,tl)n^{-1}\mathrm{Var}\big{(}\beta_{0,n}^{s_{l},t_{l}}-\beta_{0,n,K}^{s_{l},t_{l}}\big{)} converges to a limit defined by the terms (24), (25), and (26) below, with the restriction that i1,i2>Ki_{1},i_{2}>K. As each of the sums in (24), (25), and (26) is absolutely convergent, their restrictions to i1,i2>Ki_{1},i_{2}>K tend to 0 as KK\to\infty, and the CLT follows.

Finally, for any A(Δ~)A\in\mathcal{B}(\tilde{\Delta})

ξ0,n(A)=ξ~0,n(A)\xi_{0,n}(A)=\tilde{\xi}_{0,n}(A)

so that if each coordinate of RlR_{l} is in \mathbb{R}, then Rl(Δ~)R_{l}\in\mathcal{B}(\tilde{\Delta}) and the result is proved for the restricted persistence diagram as well. ∎

We finish this section with a proof of the limiting covariance formula in Proposition 4.5, which we break into a few lemmas.

Proof of Proposition 4.5. Let us define

Ci,jn(s,t):=𝟏{k=jj+i1Xk,nt,k=jj+i1Xk,ns}𝟏{Xj1,nXj+i,n>t},C^{n}_{i,j}(s,t)\mathrel{\mathop{\mathchar 58\relax}}=\mathbf{1}\bigg{\{}\bigvee_{k=j}^{j+i-1}X_{k,n}\leq t,\bigwedge_{k=j}^{j+i-1}X_{k,n}\leq s\bigg{\}}\mathbf{1}\big{\{}X_{j-1,n}\wedge X_{j+i,n}>t\big{\}},

where Ci,j(s,t)Ci,j(s,t)C_{i,j}(s,t)\equiv C^{\infty}_{i,j}(s,t) is analogously defined for the entire sequence 𝒳\mathcal{X}. Therefore we have

β0,ns,t=j=1ni=1nj+1Ci,jn(s,t)=i=1nj=1ni+1Ci,jn(s,t).\beta_{0,n}^{s,t}=\sum_{j=1}^{n}\sum_{i=1}^{n-j+1}C^{n}_{i,j}(s,t)=\sum_{i=1}^{n}\sum_{j=1}^{n-i+1}C^{n}_{i,j}(s,t).\\

Thus, it follows that

Cov(β0,ns1,t1,β0,ns2,t2)\displaystyle\mathrm{Cov}\Big{(}\beta_{0,n}^{s_{1},t_{1}},\beta_{0,n}^{s_{2},t_{2}}\Big{)}
=𝔼[β0,ns1,t1β0,ns2,t2]𝔼[β0,ns1,t1]𝔼[β0,ns2,t2]\displaystyle\qquad=\mathbb{E}\big{[}\beta_{0,n}^{s_{1},t_{1}}\beta_{0,n}^{s_{2},t_{2}}\big{]}-\mathbb{E}[\beta_{0,n}^{s_{1},t_{1}}]\mathbb{E}[\beta_{0,n}^{s_{2},t_{2}}]
(20) =i1,i2j1,j2𝔼[Ci1,j1n(s1,t1)Ci2,j2n(s2,t2)]𝔼[Ci1,j1n(s1,t1)]𝔼[Ci2,j2n(s2,t2)],\displaystyle\qquad=\sum_{i_{1},i_{2}}\sum_{j_{1},j_{2}}\mathbb{E}[C^{n}_{i_{1},j_{1}}(s_{1},t_{1})C^{n}_{i_{2},j_{2}}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},j_{1}}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},j_{2}}(s_{2},t_{2})],

where i1,i2=1,,ni_{1},i_{2}=1,\dots,n with j1=1,,ni1+1j_{1}=1,\dots,n-i_{1}+1, and j2=1,,ni2+1j_{2}=1,\dots,n-i_{2}+1. We may then break (20) into

i1,i2j=1ni1i2+1𝔼[Ci1,jn(s1,t1)Ci2,jn(s2,t2)]𝔼[Ci1,jn(s1,t1)]𝔼[Ci2,jn(s2,t2)]\displaystyle\sum_{i_{1},i_{2}}\sum_{j=1}^{n-i_{1}\vee i_{2}+1}\mathbb{E}[C^{n}_{i_{1},j}(s_{1},t_{1})C^{n}_{i_{2},j}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},j}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},j}(s_{2},t_{2})]
+i1,i2k=1ni2j=1ni1(i2+k)+1𝔼[Ci1,jn(s1,t1)Ci2,j+kn(s2,t2)]𝔼[Ci1,jn(s1,t1)]𝔼[Ci2,j+kn(s2,t2)]\displaystyle\qquad+\sum_{i_{1},i_{2}}\sum_{k=1}^{n-i_{2}}\sum_{j=1}^{n-i_{1}\vee(i_{2}+k)+1}\mathbb{E}[C^{n}_{i_{1},j}(s_{1},t_{1})C^{n}_{i_{2},j+k}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},j}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},j+k}(s_{2},t_{2})]
(21) +i1,i2k=1ni1j=1ni2(i1+k)+1𝔼[Ci1,j+kn(s1,t1)Ci2,jn(s2,t2)]𝔼[Ci1,j+kn(s1,t1)]𝔼[Ci2,jn(s2,t2)].\displaystyle\qquad+\sum_{i_{1},i_{2}}\sum_{k=1}^{n-i_{1}}\sum_{j=1}^{n-i_{2}\vee(i_{1}+k)+1}\mathbb{E}[C^{n}_{i_{1},j+k}(s_{1},t_{1})C^{n}_{i_{2},j}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},j+k}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},j}(s_{2},t_{2})].

For now, we will exclude from each sum the boundary terms, namely those involving X0,nX_{0,n} and Xn+1,nX_{n+1,n}; these will be treated later. The nonboundary terms of the expression (21) can thus be simplified, based on the assumed stationarity of 𝒳n\mathcal{X}_{n}, to

i1,i2(ni1i21)(𝔼[Ci1,2n(s1,t1)Ci2,2n(s2,t2)]𝔼[Ci1,2n(s1,t1)]𝔼[Ci2,2n(s2,t2)])\displaystyle\sum_{i_{1},i_{2}}(n-i_{1}\vee i_{2}-1)\Big{(}\mathbb{E}[C^{n}_{i_{1},2}(s_{1},t_{1})C^{n}_{i_{2},2}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},2}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},2}(s_{2},t_{2})]\Big{)}
+i1,i2k=1ni2(ni1(i2+k)1)(𝔼[Ci1,2n(s1,t1)Ci2,2+kn(s2,t2)]𝔼[Ci1,2n(s1,t1)]𝔼[Ci2,2+kn(s2,t2)])\displaystyle\ +\sum_{i_{1},i_{2}}\sum_{k=1}^{n-i_{2}}(n-i_{1}\vee(i_{2}+k)-1)\Big{(}\mathbb{E}[C^{n}_{i_{1},2}(s_{1},t_{1})C^{n}_{i_{2},2+k}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},2}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},2+k}(s_{2},t_{2})]\Big{)}
(22) +i1,i2k=1ni1(ni2(i1+k)1)(𝔼[Ci1,2+kn(s1,t1)Ci2,2n(s2,t2)]𝔼[Ci1,2+kn(s1,t1)]𝔼[Ci2,2n(s2,t2)]).\displaystyle\ +\sum_{i_{1},i_{2}}\sum_{k=1}^{n-i_{1}}(n-i_{2}\vee(i_{1}+k)-1)\Big{(}\mathbb{E}[C^{n}_{i_{1},2+k}(s_{1},t_{1})C^{n}_{i_{2},2}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},2+k}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},2}(s_{2},t_{2})]\Big{)}.

Dividing by nn, we may express the first term in (22) as

(23) i1=1i2=1(1(i1i21)/n)+(𝔼[Ci1,2n(s1,t1)Ci2,2n(s2,t2)]𝔼[Ci1,2n(s1,t1)]𝔼[Ci2,2n(s2,t2)]).\sum_{i_{1}=1}^{\infty}\sum_{i_{2}=1}^{\infty}(1-(i_{1}\vee i_{2}-1)/n)_{+}\Big{(}\mathbb{E}[C^{n}_{i_{1},2}(s_{1},t_{1})C^{n}_{i_{2},2}(s_{2},t_{2})]-\mathbb{E}[C^{n}_{i_{1},2}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},2}(s_{2},t_{2})]\Big{)}.

Assuming we can show that

i1=1i2=1|𝔼[Ci1,2(s1,t1)Ci2,2(s2,t2)]𝔼[Ci1,2(s1,t1)]𝔼[Ci2,2(s2,t2)]|<,\sum_{i_{1}=1}^{\infty}\sum_{i_{2}=1}^{\infty}\Big{|}\mathbb{E}[C_{i_{1},2}(s_{1},t_{1})C_{i_{2},2}(s_{2},t_{2})]-\mathbb{E}[C_{i_{1},2}(s_{1},t_{1})]\mathbb{E}[C_{i_{2},2}(s_{2},t_{2})]\Big{|}<\infty,

where we drop the superscript nn as mentioned at the start of Section 4, (23) will converge to

(24) i1=1i2=1𝔼[Ci1,2(s1,t1)Ci2,2(s2,t2)]𝔼[Ci1,2(s1,t1)]𝔼[Ci2,2(s2,t2)].\sum_{i_{1}=1}^{\infty}\sum_{i_{2}=1}^{\infty}\mathbb{E}[C_{i_{1},2}(s_{1},t_{1})C_{i_{2},2}(s_{2},t_{2})]-\mathbb{E}[C_{i_{1},2}(s_{1},t_{1})]\mathbb{E}[C_{i_{2},2}(s_{2},t_{2})].

Similarly, we will get limits of

(25) i1=1i2=1k=1𝔼[Ci1,2(s1,t1)Ci2,2+k(s2,t2)]𝔼[Ci1,2(s1,t1)]𝔼[Ci2,2+k(s2,t2)],\sum_{i_{1}=1}^{\infty}\sum_{i_{2}=1}^{\infty}\sum_{k=1}^{\infty}\mathbb{E}[C_{i_{1},2}(s_{1},t_{1})C_{i_{2},2+k}(s_{2},t_{2})]-\mathbb{E}[C_{i_{1},2}(s_{1},t_{1})]\mathbb{E}[C_{i_{2},2+k}(s_{2},t_{2})],

and

(26) i1=1i2=1k=1𝔼[Ci1,2+k(s1,t1)Ci2,2(s2,t2)]𝔼[Ci1,2+k(s1,t1)]𝔼[Ci2,2(s2,t2)],\sum_{i_{1}=1}^{\infty}\sum_{i_{2}=1}^{\infty}\sum_{k=1}^{\infty}\mathbb{E}[C_{i_{1},2+k}(s_{1},t_{1})C_{i_{2},2}(s_{2},t_{2})]-\mathbb{E}[C_{i_{1},2+k}(s_{1},t_{1})]\mathbb{E}[C_{i_{2},2}(s_{2},t_{2})],

for the second and third terms in (22), provided the dominated convergence assumption holds for each of these cases. In fact, these three sums comprise the limit of the covariance. However, to establish that, we must ensure that the “boundary terms” vanish, which we do in Lemma 6.3. A useful fact will aid in the proof of the covariance limit above and the lemma below.

Lemma 6.2.

Fix k1k\geq 1. If i2+k>i1i_{2}+k>i_{1} and ki1k\leq i_{1}, then for any values of t1,t2t_{1},t_{2} we have

Ci1,jn(s1,t1)Ci2,j+kn(s2,t2)=0.C^{n}_{i_{1},j}(s_{1},t_{1})C^{n}_{i_{2},j+k}(s_{2},t_{2})=0.

Analogously, if i1+k>i2i_{1}+k>i_{2} and ki2k\leq i_{2}, then for any values of t1,t2t_{1},t_{2} we have

Ci1,j+kn(s1,t1)Ci2,jn(s2,t2)=0.C^{n}_{i_{1},j+k}(s_{1},t_{1})C^{n}_{i_{2},j}(s_{2},t_{2})=0.
Proof.

Note that if i2+k>i1i_{2}+k>i_{1} and ki1k\leq i_{1}, then there exist indices l,ll,l^{\prime} (namely l=j+k-1 and l^{\prime}=j+i_{1}) such that Ci1,j(s1,t1)Ci2,j+k(s2,t2)=1C_{i_{1},j}(s_{1},t_{1})C_{i_{2},j+k}(s_{2},t_{2})=1 would imply

X_{l}\leq t_{1},\ X_{l}>t_{2}\quad\text{ and }\quad X_{l^{\prime}}>t_{1},\ X_{l^{\prime}}\leq t_{2},

a contradiction, as t1>t2t_{1}>t_{2} and t2>t1t_{2}>t_{1} cannot simultaneously hold (and if t1=t2t_{1}=t_{2}, the displayed conditions are already incompatible). The proof for the second case follows by the same argument. ∎

Lemma 6.3.

If 𝒳\mathcal{X} is a ρ\rho-mixing stationary stochastic process that is max-root summable, then the boundary terms in (21) are o(n)o(n) as nn\to\infty.

Proof.

The boundary terms in (21) comprise those terms in the first sum that satisfy j=1j=1 or j+(i1i2)=n+1j+(i_{1}\vee i_{2})=n+1, the terms in the second sum satisfying j=1j=1 or j+i1(i2+k)=n+1j+i_{1}\vee(i_{2}+k)=n+1, and the terms in the third sum satisfying j=1j=1 or j+(i1+k)i2=n+1j+(i_{1}+k)\vee i_{2}=n+1. Thus, the boundary terms can be represented as

i1,i2Cov(Ci1,1n(s1,t1),Ci2,1n(s2,t2))+Cov(Ci1,ni1i2+1n(s1,t1),Ci2,ni1i2+1n(s2,t2))\displaystyle\sum_{i_{1},i_{2}}\text{Cov}\big{(}C^{n}_{i_{1},1}(s_{1},t_{1}),C^{n}_{i_{2},1}(s_{2},t_{2})\big{)}+\text{Cov}\big{(}C^{n}_{i_{1},n-i_{1}\vee i_{2}+1}(s_{1},t_{1}),C^{n}_{i_{2},n-i_{1}\vee i_{2}+1}(s_{2},t_{2})\big{)}
+i1,i2k=1ni2Cov(Ci1,1n(s1,t1),Ci2,1+kn(s2,t2))\displaystyle+\sum_{i_{1},i_{2}}\sum_{k=1}^{n-i_{2}}\text{Cov}\big{(}C^{n}_{i_{1},1}(s_{1},t_{1}),C^{n}_{i_{2},1+k}(s_{2},t_{2})\big{)}
+Cov(Ci1,ni1(i2+k)+1n(s1,t1),Ci2,n(i1k)i2+1n(s2,t2))\displaystyle\qquad\qquad\qquad+\text{Cov}\big{(}C^{n}_{i_{1},n-i_{1}\vee(i_{2}+k)+1}(s_{1},t_{1}),C^{n}_{i_{2},n-(i_{1}-k)\vee i_{2}+1}(s_{2},t_{2})\big{)}
+i1,i2k=1ni1Cov(Ci1,1+kn(s1,t1),Ci2,1n(s2,t2))\displaystyle+\sum_{i_{1},i_{2}}\sum_{k=1}^{n-i_{1}}\text{Cov}\big{(}C^{n}_{i_{1},1+k}(s_{1},t_{1}),C^{n}_{i_{2},1}(s_{2},t_{2})\big{)}
(27) +Cov(Ci1,ni1(i2k)+1n(s1,t1),Ci2,n(i1+k)i2+1n(s2,t2)).\displaystyle\qquad\qquad\qquad+\text{Cov}\big{(}C^{n}_{i_{1},n-i_{1}\vee(i_{2}-k)+1}(s_{1},t_{1}),C^{n}_{i_{2},n-(i_{1}+k)\vee i_{2}+1}(s_{2},t_{2})\big{)}.

We may bound the absolute value of the first sum in (27) by

i1,i2Var(Ci1,1n(s1,t1))Var(Ci2,1n(s2,t2))\displaystyle\sum_{i_{1},i_{2}}\sqrt{\mathrm{Var}(C^{n}_{i_{1},1}(s_{1},t_{1}))}\sqrt{\mathrm{Var}(C^{n}_{i_{2},1}(s_{2},t_{2}))}
+Var(Ci1,n(i1i2)+1n(s1,t1))Var(Ci2,ni1i2+1n(s2,t2))\displaystyle\phantom{\sum_{i_{1},i_{2}}}\qquad\qquad+\sqrt{\mathrm{Var}(C^{n}_{i_{1},n-(i_{1}\vee i_{2})+1}(s_{1},t_{1}))}\sqrt{\mathrm{Var}(C^{n}_{i_{2},n-i_{1}\vee i_{2}+1}(s_{2},t_{2}))}
2i1,i2(X1t1,,Xi1t1)(X1t2,,Xi2t2)<,\displaystyle\leq 2\sum_{i_{1},i_{2}}\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}<\infty,

and thus o(n)o(n), where we use the inequalities |Cov(X,Y)|Var(X)Var(Y)|\text{Cov}(X,Y)|\leq\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}, Var(𝟏A)(A)\mathrm{Var}(\mathbf{1}_{A})\leq\mathbb{P}(A), and the fact that 𝒳\mathcal{X} is max-root summable. We will now finish the proof by showing that the second sum in (27) is o(n)o(n) as well; that the third sum in (27) is o(n)o(n) follows by an essentially symmetric argument. We may bound the absolute value of the second sum in (27) by

2\displaystyle 2 i1,i2k=i1+2ni2ρ𝒳(ki11)Var(Ci1,1n(s1,t1))Var(Ci2,1n(s2,t2))\displaystyle\sum_{i_{1},i_{2}}\sum_{k=i_{1}+2}^{n-i_{2}}\rho_{\mathcal{X}}(k-i_{1}-1)\sqrt{\mathrm{Var}(C^{n}_{i_{1},1}(s_{1},t_{1}))}\sqrt{\mathrm{Var}(C^{n}_{i_{2},1}(s_{2},t_{2}))}
+\displaystyle+ i1,i2k=1i1+1|Cov(Ci1,1n(s1,t1),Ci2,1+kn(s2,t2))|\displaystyle\sum_{i_{1},i_{2}}\sum_{k=1}^{i_{1}+1}\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},1}(s_{1},t_{1}),C^{n}_{i_{2},1+k}(s_{2},t_{2})\big{)}\Big{|}
(28) +|Cov(Ci1,ni1(i2+k)+1n(s1,t1),Ci2,n(i1k)i2+1n(s2,t2))|.\displaystyle\qquad\qquad\qquad+\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},n-i_{1}\vee(i_{2}+k)+1}(s_{1},t_{1}),C^{n}_{i_{2},n-(i_{1}-k)\vee i_{2}+1}(s_{2},t_{2})\big{)}\Big{|}.

The bound given by the first sum in (28) follows from the definition of ρ\rho-mixing and the fact that F(si)>0F(s_{i})>0 and F(ti)<1F(t_{i})<1. Dividing this first sum by nn, we see that

2n1i1,i2k=i1+2ni2ρ𝒳(ki11)Var(Ci1,1n(s1,t1))Var(Ci2,1n(s2,t2))\displaystyle 2n^{-1}\sum_{i_{1},i_{2}}\sum_{k=i_{1}+2}^{n-i_{2}}\rho_{\mathcal{X}}(k-i_{1}-1)\sqrt{\mathrm{Var}(C^{n}_{i_{1},1}(s_{1},t_{1}))}\sqrt{\mathrm{Var}(C^{n}_{i_{2},1}(s_{2},t_{2}))}
2n1k=1nρ𝒳(k)i1,i2(X1t1,,Xi1t1)(X1t2,,Xi2t2)\displaystyle\qquad\leq 2n^{-1}\sum_{k=1}^{n}\rho_{\mathcal{X}}(k)\sum_{i_{1},i_{2}}\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}

which tends to 0 as nn\to\infty by max-root summability and the fact that ρ𝒳(k)0\rho_{\mathcal{X}}(k)\to 0. The second sum in (28) is a little more delicate. Before continuing, note that \big{|}\{k\in\mathbb{N}\mathrel{\mathop{\mathchar 58\relax}}k>i_{1}-i_{2},\,k\leq i_{1}\}\big{|}=i_{1}\wedge i_{2}\leq i_{1}i_{2}, since i_{1},i_{2}\geq 1. Hence, Lemma 6.2 implies that the second sum in (28) equals

i2=1ni1=1nk=1+(i1i2)+i1𝔼[Ci1,1n(s1,t1)]𝔼[Ci2,1+kn(s2,t2)]\displaystyle\sum_{i_{2}=1}^{n}\sum_{i_{1}=1}^{n}\sum_{k=1+(i_{1}-i_{2})_{+}}^{i_{1}}\mathbb{E}[C^{n}_{i_{1},1}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},1+k}(s_{2},t_{2})]
+𝔼[Ci1,ni1(i2+k)+1n(s1,t1)]𝔼[Ci2,n(i1k)i2+1n(s2,t2)]\displaystyle\phantom{\quad+\sum_{i_{2}=1}^{n}\sum_{i_{1}=i_{2}+1}^{n}\sum_{k=1}^{i_{1}-i_{2}}}\qquad+\mathbb{E}[C^{n}_{i_{1},n-i_{1}\vee(i_{2}+k)+1}(s_{1},t_{1})]\mathbb{E}[C^{n}_{i_{2},n-(i_{1}-k)\vee i_{2}+1}(s_{2},t_{2})]
+i2=1ni1=1ni21|Cov(Ci1,1n(s1,t1),Ci2,i1+2n(s2,t2))|+|Cov(Ci1,n(i1+i2)n(s1,t1),Ci2,ni2+1n(s2,t2))|\displaystyle\quad+\sum_{i_{2}=1}^{n}\sum_{i_{1}=1}^{n-i_{2}-1}\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},1}(s_{1},t_{1}),C^{n}_{i_{2},i_{1}+2}(s_{2},t_{2})\big{)}\Big{|}+\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},n-(i_{1}+i_{2})}(s_{1},t_{1}),C^{n}_{i_{2},n-i_{2}+1}(s_{2},t_{2})\big{)}\Big{|}
(29) \displaystyle\quad+\sum_{i_{2}=1}^{n}\sum_{i_{1}=i_{2}+1}^{n}\sum_{k=1}^{i_{1}-i_{2}}\bigg{(}\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},1}(s_{1},t_{1}),C^{n}_{i_{2},1+k}(s_{2},t_{2})\big{)}\Big{|}
+|Cov(Ci1,ni1+1n(s1,t1),Ci2,ni1+k+1n(s2,t2))|).\displaystyle\phantom{\quad+\sum_{i_{2}=1}^{n}\sum_{i_{1}=i_{2}+1}^{n}\sum_{k=1}^{i_{1}-i_{2}}}\qquad+\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},n-i_{1}+1}(s_{1},t_{1}),C^{n}_{i_{2},n-i_{1}+k+1}(s_{2},t_{2})\big{)}\Big{|}\bigg{)}.

We may bound the first term in (29) by

2i2=1ni1=1ni1i2(X1t1,,Xi1t1)(X1t2,,Xi2t2)\displaystyle 2\sum_{i_{2}=1}^{n}\sum_{i_{1}=1}^{n}i_{1}i_{2}\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})
=2i1=1ni1(X1t1,,Xi1t1)i2=1ni2(X1t2,,Xi2t2)\displaystyle\qquad=2\sum_{i_{1}=1}^{n}i_{1}\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})\sum_{i_{2}=1}^{n}i_{2}\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})
=o(n)\displaystyle\qquad=o(n)

by the max-root summability condition. Furthermore, we can bound the second sum in (29) by

2i1=1n(X1t1,,Xi1t1)i2=1n(X1t2,,Xi2t2)=o(n),2\sum_{i_{1}=1}^{n}\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sum_{i_{2}=1}^{n}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}=o(n),

using the covariance inequality Cov(X,Y)Var(X)Var(Y)\text{Cov}(X,Y)\leq\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}, and again using the max-root summability condition. Finally, we bound the third sum in (29) by

2i2=1ni1=i2+1n(i1i2)(X1t1,,Xi1t1)(X1t2,,Xi2t2)\displaystyle 2\sum_{i_{2}=1}^{n}\sum_{i_{1}=i_{2}+1}^{n}(i_{1}-i_{2})\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}
2i1=1ni1(X1t1,,Xi1t1)i2=1n(X1t2,,Xi2t2)\displaystyle\leq 2\sum_{i_{1}=1}^{n}i_{1}\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sum_{i_{2}=1}^{n}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}
=o(n),\displaystyle=o(n),

by a final application of the max-root summability condition. ∎

Having shown that the boundary terms vanish under our conditions, it suffices to verify the dominated convergence condition for the terms in (22) divided by nn, which will then tend to the sums (24), (25), and (26), respectively. First, we divide each term by nn and see that the first sum, with its summands replaced by their absolute values, is bounded above (using again the usual covariance inequalities) by

i1,i2(X1t1,,Xi1t1)(X1t2,,Xi2t2)\displaystyle\sum_{i_{1},i_{2}}\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}
=i1(X1t1,,Xi1t1)i2(X1t2,,Xi2t2)<,\displaystyle\qquad=\sum_{i_{1}}\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sum_{i_{2}}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}<\infty,

by applying max-root summability to each sum. We now verify the dominated convergence assumption for the second sum (divided by nn) in (22); the third sum is handled by an analogous argument. This procedure yields an upper bound of

\displaystyle\sum_{i_{1},i_{2}}\sum_{k=1}^{n-i_{2}}\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},2}(s_{1},t_{1}),C^{n}_{i_{2},2+k}(s_{2},t_{2})\big{)}\Big{|}
\displaystyle\qquad\leq\sum_{i_{1},i_{2}}\sum_{k=1}^{i_{1}+1}\Big{|}\text{Cov}\big{(}C^{n}_{i_{1},2}(s_{1},t_{1}),C^{n}_{i_{2},2+k}(s_{2},t_{2})\big{)}\Big{|}
(30) \displaystyle\qquad+\sum_{i_{1},i_{2}}\sum_{k=i_{1}+2}^{n-i_{2}}\rho_{\mathcal{X}}(k-i_{1}-1)\sqrt{\mathrm{Var}(C^{n}_{i_{1},2}(s_{1},t_{1}))}\sqrt{\mathrm{Var}(C^{n}_{i_{2},2}(s_{2},t_{2}))}.

The first sum in (30) we may bound by

i1,i2(i1+1)(X1t1,,Xi1t1)(X1t2,,Xi2t2)<,\sum_{i_{1},i_{2}}(i_{1}+1)\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}<\infty,

by max-root summability of 𝒳\mathcal{X}. The second sum in (30) is bounded above by

\displaystyle\sum_{k=1}^{n}\rho_{\mathcal{X}}(k)\sum_{i_{1},i_{2}}\sqrt{\mathrm{Var}(C^{n}_{i_{1},2}(s_{1},t_{1}))}\sqrt{\mathrm{Var}(C^{n}_{i_{2},2}(s_{2},t_{2}))}
\displaystyle\leq\sum_{k=1}^{\infty}\rho_{\mathcal{X}}(k)\sum_{i_{1},i_{2}}\sqrt{\mathbb{P}(X_{1}\leq t_{1},\dots,X_{i_{1}}\leq t_{1})}\sqrt{\mathbb{P}(X_{1}\leq t_{2},\dots,X_{i_{2}}\leq t_{2})}<\infty,

by assumption.

As for the representation of limnn1Cov(β0,ns1,t1,β0,ns2,t2)\lim_{n\to\infty}n^{-1}\mathrm{Cov}\Big{(}\beta_{0,n}^{s_{1},t_{1}},\beta_{0,n}^{s_{2},t_{2}}\Big{)}, we note that the sums (24), (25), and (26) are all absolutely convergent, hence we may split the sums and apply the monotone convergence theorem to each, and recombine to get the stated representation. \square

References

  • Atienza et al. (2020) Nieves Atienza, Rocio Gonzalez-Díaz, and Manuel Soriano-Trigueros. On the stability of persistent entropy and new summary functions for topological data analysis. Pattern Recognition, 107:107509, 2020.
  • Baryshnikov (2019) Yuliy Baryshnikov. Time series, persistent homology and chirality. arXiv preprint arXiv:1909.09846, 2019.
  • Billingsley (1999) Patrick Billingsley. Convergence of probability measures. John Wiley & Sons, Inc., 2nd edition, 1999. ISBN 0-471-19745-9.
  • Biscio et al. (2020) Christophe A. N. Biscio, Nicolas Chenavier, Christian Hirsch, and Anne Marie Svane. Testing goodness of fit for point processes via topological data analysis. Electronic Journal of Statistics, 14(1):1024–1074, 2020. ISSN 1935-7524. doi: 10.1214/20-EJS1683.
  • Bobrowski and Skraba (2024) Omer Bobrowski and Primoz Skraba. Weak universality in random persistent homology and scale-invariant functionals. arXiv preprint arXiv:2406.05553, 2024.
  • Bradley (1987) Richard C. Bradley. The central limit question under ρ\rho-mixing. The Rocky Mountain Journal of Mathematics, pages 95–114, 1987.
  • Bradley (2005) Richard C. Bradley. Basic Properties of Strong Mixing Conditions. A Survey and Some Open Questions. Probability Surveys, 2(none):107 – 144, 2005. doi: 10.1214/154957805100000104.
  • Carlsson and Vejdemo-Johansson (2021) Gunnar Carlsson and Mikael Vejdemo-Johansson. Topological Data Analysis with Applications. Cambridge University Press, 2021.
  • Chazal and Divol (2018) Frédéric Chazal and Vincent Divol. The density of expected persistence diagrams and its kernel based estimation. In 34th International Symposium on Computational Geometry (SoCG 2018). Schloss-Dagstuhl-Leibniz Zentrum für Informatik, 2018.
  • Chung et al. (2021) Yu-Min Chung, Chuan-Shen Hu, Yu-Lun Lo, and Hau-Tieng Wu. A persistent homology approach to heart rate variability analysis with an application to sleep-wake classification. Frontiers in Physiology, 12:637684, 2021.
  • Chung et al. (2022) Yu-Min Chung, Amir Nikooienejad, and Bo Zhang. Automatic eating behavior detection from wrist motion sensor using Bayesian, gradient boosting, and topological persistence methods. In 2022 IEEE International Conference on Big Data (Big Data), pages 1809–1815, 2022. doi: 10.1109/BigData55660.2022.10021031.
  • Chung et al. (2024) Yu-Min Chung, Whitney K. Huang, and Hau-Tieng Wu. Topological data analysis assisted automated sleep stage scoring using airflow signals. Biomedical Signal Processing and Control, 89:105760, 2024. ISSN 1746-8094. doi: 10.1016/j.bspc.2023.105760.
  • Cohen-Steiner et al. (2010) David Cohen-Steiner, Herbert Edelsbrunner, John Harer, and Yuriy Mileyko. Lipschitz functions have LpL_{p}-stable persistence. Foundations of computational mathematics, 10(2):127–139, 2010.
  • Crozier et al. (2024) Peter A. Crozier, Matan Leibovich, Piyush Haluai, Mai Tai, Andrew M. Thomas, Joshua Vincent, David M. Matteson, Yifan Wang, and Carlos Fernandez-Granda. Atomic resolution observations of nanoparticle surface dynamics and instabilities enabled by artificial intelligence. 2024. Submitted.
  • Divol and Polonik (2019) Vincent Divol and Wolfgang Polonik. On the choice of weight functions for linear representations of persistence diagrams. Journal of Applied and Computational Topology, 3(3):249–283, September 2019. ISSN 2367-1734. doi: https://doi.org/10.1007/s41468-019-00032-z.
  • Dudley (2014) Richard M. Dudley. Uniform central limit theorems, volume 142. Cambridge University Press, 2014.
  • Durrett (2010) Rick Durrett. Probability: theory and examples. Cambridge University Press, 4th edition, 2010.
  • Edelsbrunner and Harer (2010) Herbert Edelsbrunner and John Harer. Computational Topology: An Introduction. American Mathematical Soc., 2010.
  • Graff et al. (2021) Grzegorz Graff, Beata Graff, Paweł Pilarczyk, Grzegorz Jabłoński, Dariusz Gąsecki, and Krzysztof Narkiewicz. Persistent homology as a new method of the assessment of heart rate variability. PLoS ONE, 16(7):e0253851, 2021.
  • Hiraoka and Tsunoda (2018) Yasuaki Hiraoka and Kenkichi Tsunoda. Limit theorems for random cubical homology. Discrete & Computational Geometry, 60:665–687, 2018.
  • Hiraoka et al. (2018) Yasuaki Hiraoka, Tomoyuki Shirai, and Khanh Duy Trinh. Limit theorems for persistence diagrams. The Annals of Applied Probability, 28(5):2740–2780, 2018.
  • Kanazawa et al. (2024) Shu Kanazawa, Yasuaki Hiraoka, Jun Miyanaga, and Kenkichi Tsunoda. Large deviation principle for persistence diagrams of random cubical filtrations. Journal of Applied and Computational Topology, pages 1–52, 2024.
  • Krebs (2021) Johannes Krebs. On limit theorems for persistent Betti numbers from dependent data. Stochastic Processes and their Applications, 139:139–174, 2021. ISSN 0304-4149. doi: https://doi.org/10.1016/j.spa.2021.04.013.
  • Krebs and Hirsch (2022) Johannes Krebs and Christian Hirsch. Functional central limit theorems for persistent Betti numbers on cylindrical networks. Scandinavian Journal of Statistics, 49(1):427–454, 2022.
  • Krebs and Polonik (2019) Johannes Krebs and Wolfgang Polonik. On the asymptotic normality of persistent Betti numbers. arXiv preprint arXiv:1903.03280, 2019.
  • Kulik and Soulier (2020) Rafal Kulik and Philippe Soulier. Heavy-tailed time series. Springer, 2020.
  • Merelli et al. (2015) Emanuela Merelli, Matteo Rucco, Peter Sloot, and Luca Tesei. Topological characterization of complex systems: Using persistent entropy. Entropy, 17(10):6872–6892, 2015.
  • Meyn and Tweedie (2009) S. Meyn and R.L. Tweedie. Markov Chains and Stochastic Stability. Cambridge Mathematical Library. Cambridge University Press, 2009. ISBN 9780521731829.
  • Miyanaga (2023) Jun Miyanaga. Limit theorems of persistence diagrams for random cubical filtrations. Phd thesis, Kyoto University, 2023.
  • Neumann (2013) Michael H Neumann. A central limit theorem for triangular arrays of weakly dependent random variables, with applications in statistics. ESAIM: Probability and Statistics, 17:120–134, 2013.
  • Owada (2022) Takashi Owada. Convergence of persistence diagram in the sparse regime. The Annals of Applied Probability, 32(6):4706–4736, 2022.
  • Owada and Bobrowski (2020) Takashi Owada and Omer Bobrowski. Convergence of persistence diagrams for topological crackle. Bernoulli, 26(3):2275–2310, aug 2020. ISSN 1350-7265. doi: 10.3150/20-BEJ1193.
  • Owada and Thomas (2020) Takashi Owada and Andrew M. Thomas. Limit theorems for process-level Betti numbers for sparse and critical regimes. Advances in Applied Probability, 52(1):1–31, 2020.
  • Perez (2023) Daniel Perez. On the persistent homology of almost surely C0\textit{C}^{0} stochastic processes. Journal of Applied and Computational Topology, 7(4):879–906, 2023.
  • Rice (1944) S. O. Rice. Mathematical analysis of random noise. The Bell System Technical Journal, 23(3):282–332, 1944. doi: 10.1002/j.1538-7305.1944.tb00874.x.
  • Rucco et al. (2016) Matteo Rucco, Filippo Castiglione, Emanuela Merelli, and Marco Pettini. Characterisation of the idiotypic immune network through persistent entropy. In Proceedings of ECCS 2014: European Conference on Complex Systems, pages 117–128. Springer, 2016.
  • Thomas et al. (2023) Andrew M. Thomas, Peter A. Crozier, Yuchen Xu, and David S. Matteson. Feature detection and hypothesis testing for extremely noisy nanoparticle images using topological data analysis. Technometrics, 65(4):590–603, 2023. doi: 10.1080/00401706.2023.2203744.
  • Thomas et al. (2024) Andrew M. Thomas, Michael Jauch, and David S. Matteson. Bayesian changepoint detection via logistic regression and the topological analysis of image series. arXiv preprint arXiv:2401.02917, 2024.