Persistent Homology of Fractional Gaussian Noise

H. Masoomy Department of Physics, Shahid Beheshti University, 1983969411, Tehran, Iran B. Askari Department of Physics, Shahid Beheshti University, 1983969411, Tehran, Iran M. N. Najafi [email protected] Department of Physics, University of Mohaghegh Ardabili, P.O. Box 179, Ardabil, Iran S. M. S. Movahed [email protected] Department of Physics, Shahid Beheshti University, 1983969411, Tehran, Iran

Abstract

In this paper, we employ the persistent homology (PH) technique to examine the topological properties of fractional Gaussian noise (fGn). We develop the weighted natural visibility graph algorithm, and the associated simplicial complexes through the filtration process are quantified by PH. The evolution of the homology group dimension represented by Betti numbers demonstrates a strong dependency on the Hurst exponent ( $H$ ). The coefficients of the birth and death curve of the $k$ -dimensional topological holes ( $k$ -holes) at a given threshold depend on $H$ which is almost not affected by finite sample size. We show that the distribution function of a lifetime for $k$ -holes decays exponentially and the corresponding slope is an increasing function versus $H$ , and more interestingly, the sample size effect completely disappears in this quantity. The persistence entropy logarithmically grows with the size of the visibility graph of a system with almost $H$ -dependent prefactors. On the contrary, the local statistical features are not able to determine the corresponding Hurst exponent of fGn data, while the moments of eigenvalue distribution ( $M_{n}$ ) for $n\geq 1$ reveal a dependency on $H$ , containing the sample size effect. Finally, the PH shows the correlated behavior of electroencephalography for both healthy and schizophrenic samples.

Topological Data Analysis, Persistent Homology, Fractional Gaussian Noise, Weighted Natural Visibility Graph, Topological Persistence, Persistence Entropy

I Introduction

A powerful approach to study different types of data sets ranging from point cloud data and scalar fields to a complex network (graph), particularly high-dimensional data, is called topological data analysis (TDA) Edelsbrunner et al. (2001); Zomorodian and Carlsson (2005); Zomorodian (2005); Carlsson (2009); Zomorodian (2012); Wasserman (2018). TDA as an application of algebraic topology Nakahara (2003); Munkres (2018); Hatcher (2005) and a branch of computational topology Edelsbrunner and Harer (2010); it analyzes the shape of high-dimensional complex data in terms of global features (number of connected components, loops, voids, etc.) of topological space underlying the data set. In the persistent homology (PH) technique, as the main part of TDA, the topological approximation of phase space of any type of data set which is called the simplicial complex is assigned to the underlying data, and then topological invariants are computed.

The PH aims to capture topological evolution of data set by varying scale (parameter), and extracts topological invariants of the data set in each scale summarizing them in different representations, e.g., persistence barcode (PB) Ghrist (2008); Carlsson et al. (2005), persistence diagram (PD) Patel (2018), persistence landscape (PL) Bubenik and Dłotko (2017), persistence image (PI) Adams et al. (2017), persistence surface (PS), and $\beta$ -curve, which reveal topological information of the data set. Being robust to noise, PH can clarify the essential features of the systems with high internal degrees of freedom and is capable of classifying data sets Chazal and Michel (2017); Munch (2017). The PH technique has attracted much attention due to its vast applications in analyzing complex networks Serrano et al. (2020); Saggar et al. (2018); Horak et al. (2009). Also it has been used in various systems (see, e.g., Topaz et al. (2015); Sizemore et al. (2019); Pranav et al. (2017); Jaquette and Schweinhart (2020); Speidel et al. (2018); Salnikov et al. (2018); Bobrowski and Skraba (2020); Tran et al. (2021); Olsthoorn et al. (2020) and references therein).

A mathematical model containing correlation tuned by one parameter (the Hurst exponent Hurst (1951)) is the fractional Brownian motion (fBm) and its increment is known as fractional Gaussian noise (fGn) Mandelbrot and Van Ness (1968). It is also used to model self-similar phenomena of various types ranging from meteorology, engineering, econophysics, and astronomy to biology (see, e.g., Eghdami et al. (2018); Jiang et al. (2019) and references therein). To quantify the properties of a given self-similar data set whose power spectrum behaves as a power-law in the frequency (wavelength) domain, many methods have been proposed. A well-studied method is multi-fractal detrended fluctuation analysis (MFDFA) Peng et al. (1994, 1995); Kantelhardt et al. (2002). Taking into account the higher-order detrended covariance, the multi-fractal detrended cross-correlation analysis (MFDXA) has been introduced Zhou et al. (2008). Despite many advantages brought by the mentioned methods, the impact of more complicated trends and finite-size effects have not been diminished completely in many previous approaches Hu et al. (2001); Chen et al. (2002); Kantelhardt et al. (2001); Nagarajan and Kavasseri (2005a). On the other hand, the finite size of time-series affects the accurate estimation of the Hurst exponent by some of previous methods Xu et al. (2005).

Although different methods can be found in the literature to determine the scaling exponent of fBm or fGn series, very little attention has been paid to deal with the topological properties and the probable capability of TDA for estimation of the Hurst exponent. In this regard, knowing a given time series belongs to an fGn class, some relevant questions can be raised: (i) Does the topological aspect of time series depend on the Hurst exponent? (ii) What are the effects of sample size, trends, and irregularity of fGn signal on the PH of topological motifs generated from the data set? (iii) How is the multifractality expressed by PH? Motivated by the mentioned questions, in this paper, we concentrate on the topological properties (homology group) of fGn.

A way to construct the higher dimensional manifold from a typical time-series in order to evaluate the evolution of $k$ -dimensional holes through the filtration process is assigning a weighted graph to the underlying time-series. There are various methods to assign a network to a typical time-series, e.g., the proximity network, cycle network, visibility graph, correlation network, recurrence network, and transition network (see, e.g., Zou et al. (2019) and references therein). The idea of the visibility graph (VG) is a complex network constructed by considering the visibility algorithm, proposed by Lacasa et al as a novel way to analyze time-series in terms of complex network language Lacasa et al. (2008) proposed as an alternative method to estimate $H$ for fBm and fGn series Lacasa et al. (2009).

In this paper, we rely on statistical and topological properties (homology group) of natural VGs (NVGs) associated with such signals. First, we propose a weighted version of natural VGs (WNVGs) by defining a well-defined weight function to quantify the quality of visibility between data points of the signal for extracting nonsensitive (robust) features in the presence of possible noises in the signal. Then, we apply the filtration process on the weighted clique simplicial complex corresponding to WNVG by considering the weight (visibility value) of links as a threshold parameter and higher-order connections ( $k$ -cliques ; $k>2$ ) as building blocks of such higher-order structure. Accordingly, we figure out the evolution of robust topological features explaining the global structure of WNVG. Considering other techniques for making networks from time-series and applying PH could provide interesting results, but is time-consuming from computational points of view; the possibility of distinguishing the non-stationary and stationary series, the sensitivity of corresponding results to the value of the Hurst exponent, and the robustness of results with sample size are some of our criteria for taking VG in its weighted version.

Our research presented in this paper has the following advantages.

(1) Inspired by network science and a self-similar process characterized by a scaling exponent called the ”Hurst exponent”, we implement the weighted natural visibility graph (WNVG) to make a network from fractional Gaussian noise (fGn). In this approach, the topological motifs survive; consequently, their evolution concerning self-similar exponents can be examined. Our results approve that such evolution is a robust feature for determining the Hurst exponent.

(2) We will demonstrate that the statistical analysis of WNVGs such as probability distribution functions of eigenvector, betweenness, and closeness centralities are almost $H$ -independent, and therefore, they can not recognize the type of correlation of fGn series.

(3) We rely on TDA and implement the PH, which is a robust method to examine the evolution of global properties during the filtration process. Almost all relevant results are sensitive enough to the Hurst exponent of the underlying data set and it is also possible to check whether the data are stationary or not. More precisely, irrespective of either stationarity or non-stationarity of underlying times series, to capture the homology generators, the constructed network should be weighted. It is worth noting that the non-stationary behavior may be produced due to the various types of trends superimposed on the intrinsic fluctuation; therefore, in the presence of any prior information regarding trends the pre-processing procedure including at least one of the following algorithms must be employed to produce clean series, e.g., the singular value decomposition algorithm (Nagarajan and Kavasseri, 2005b), the adaptive detrending algorithm (Hu et al., 2009), and empirical mode decomposition (Wu et al., 2007).

(4) Finally, we emphasize that the behavior of the local (statistical) observables depend weakly on $H$ , whereas the coefficients of global (topological) observables are almost strongly $H$ -dependent and even for a part of measures, the size effect is almost diminished. We notify that TDA provides a new type of measure to quantify the Hurst exponent, and therefore, the scaling exponents of the correlation function, power spectrum, and fractal dimension can be specified with reliable approaches.

The rest of paper is organized as follows: In the next section, our methodology is introduced briefly. The construction of VGs for a time-series and the key idea of topological network analysis are clarified in Sec. II. The numerical results of synthetic fGn time series, which are covered in two subsections, local statistical properties and topological properties, are presented in Sec. III. The implementation of realistic data is described in Sec. IV. We give some ideas for applying the proposed method to various problems in Sec. V, and close the paper with a conclusion. More details on synthetic data generation, statistical network analysis, algebraic topology, and persistent homology are explained in appendices.

II Methodology

Refer to caption — Figure 1: The schematic representation of making a network for a typical data set using VG algorithm. (a) The HVG and (b) NVG of a synthetic fGn with $H=0.5$ (white-noise). To make more sense, we took the sample size equal to $N=32$ .

In this research, we aim to analyze the complex network of the visibility graph constructed from a fractional Gaussian noise (fGn) (see Appendix A for more details of generating a synthetic fGn series), with an emphasis on the topological aspects. Besides this, we also compute some conventional statistical properties of the mentioned signal.

II.1 Visibility Graph Method

Among growing applicability of complex networks in many fields and interdisciplinary branches in science Costa et al. (2011), a technique has been suggested which converts a time-series to a network, the so-called visibility graph method Lacasa et al. (2008). Generally, suppose $\{x\}:\{x(t_{i}),i=1,...,N\}$ represents a real-valued time-series. One can construct a network, the so-called visibility graph, denoted by $G=(V,E,w)$ , $V\equiv\{v_{i}\}_{i=1}^{N}$ is the node (vertex) set, and $E$ is link (edge) set and $w$ is a map from $E$ to real numbers. The VG is defined by using the bijection as follows,

f:~{}V\equiv\Bigr{\{}v_{i}\Bigr{\}}_{i=1}^{N}\leftrightarrow\quad T\equiv\Bigr{(}t_{i}\Bigr{)}_{i=1}^{N}\quad;\quad f(v_{i})=t_{i}

(1)

and the connections are constructed according to the visibility condition between the nodes, i.e., the nodes $v_{i}$ and $v_{j}$ are connected if the node $v_{j}$ is visible from the node $v_{i}$ and vice versa, and therefore the resulting graph is undirected (for more details on the properties of VGs, see Lacasa et al. (2008)). In general, there are two ways to construct a network (graph) from a time-series: the horizontal visibility graph (HVG) Gonçalves et al. (2016); Gao et al. (2016); Xie and Zhou (2011) and the natural visibility graph (NVG) Zheng et al. (2021); Yang et al. (2009); Ahmadlou et al. (2010); the former is more sparse than the latter case and in this work we focus on the NVG. In Fig. 1, we show how an HVG [panel (a)] and an NVG [panel (b)] for a synthetic fGn series can be constructed. In a binary setup, the corresponding visibility graph chooses the range of the weights from a binary set, $w^{(B)}:E\rightarrow\mathbb{Z}_{2}\equiv\{0,1\}$ ; e.g., for a binary NVG (BNVG) the weight function can be written according to following relation,

\displaystyle w_{ij}^{(BN)}\equiv\left\{\begin{array}[]{l l}1\qquad\qquad\qquad\qquad;\qquad\Bigr{|}f(v_{i})-f(v_{j})\Bigr{|}=1\\ \displaystyle\prod_{k=i+1}^{j-1}\Theta\Bigr{(}s_{ij}-s_{ik}\Bigr{)}~{};\qquad\Bigr{|}f(v_{i})-f(v_{j})\Bigr{|}>1\end{array}\right.

(4)

where $\Theta$ is the step function, and $s_{ij}\equiv\frac{x(f(v_{j}))-x(f(v_{i}))}{f(v_{j})-f(v_{i})}$ . The argument of the $\Theta$ function being positive guarantees that the node $v_{j}$ is visible from the node $v_{i}$ and vice versa. Since the edge in a BNVG has the weight $0$ or $1$ , it is unsuitable for continuous filtering. To take into account the quality of visibility between nodes, we suggest the weighted version of the natural visibility graph (WNVG), by considering the weight function as follows,

w_{ij}^{(WN)}\equiv\left\{\begin{array}[]{l l}\Bigr{|}s_{ij}\Bigr{|}\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad;\qquad\Bigr{|}f(v_{i})-f(v_{j})\Bigr{|}=1\\ \Bigr{(}\displaystyle\prod_{k=i+1}^{j-1}\Theta\Bigr{(}s_{ij}-s_{ik}\Bigr{)}\Bigr{[}s_{ij}-s_{ik}\Bigr{]}\Bigr{)}^{1/(j-i-1)}\quad;\qquad\Bigr{|}f(v_{i})-f(v_{j})\Bigr{|}>1\\ \end{array}\right.

(5)

There are two factors inside the product. In the second branch of Eq. (5), one is the step function just like the binary graph, and the other is the weight which is proportional to ”how visible is the site $j$ from $i$ and vice versa”; i.e., the more distinguishable the data points are in the original time-series, the higher the corresponding weight is in the constructed network. The term $\frac{1}{j-i-1}$ is necessary to make the weights reasonable numbers for comparison reasons. In the absence of this exponent, the more the distance between the nodes is, the higher the corresponding weights are. For both statistical and topological analysis, we use this weight function which admits continuous filtering.

II.2 Topological Network Analysis

A conventional approach to quantify the properties of a typical network is determining degree, the number of nodes straightly connected with the underlying node by non-zero weight, and centrality measures such as betweenness centrality, the number of shortest paths between any pair of nodes going through a distinct node, closeness centrality, eigenvector centrality, etc. (see Appendix B for more details). Besides the mentioned approach, other formalisms have been proposed in which not only the pairwise connections (links) are examined but also a systematic pipeline is considered to incorporate higher-order connections ( $k$ -cliques), including the simplicial complex and hypergraph Bianconi and Rahmede (2016); Kovalenko et al. (2021); Young et al. (2020).

To analyze such higher-order structures, one can generalize the well-known statistical quantities based on higher-order connections. On the other hand, for global analysis of these structures newly born topological tools are proposed. By the term ”global”, we mean the essential features of the object which are not affected by geometric transformations. The well-known topological quantities describing global properties of the space (or any object mapped to a topological space) are Betti numbers. The $k$ th Betti number of an object ( $\beta_{k}$ ) is a topological invariant. The dimension of the $k$ -homology group of the topological space corresponding to the object counts the number of $k$ -dimensional topological holes ( $k$ -holes) of the object. Intuitively, $\beta_{0}$ indicates the number of connected components, $\beta_{1}$ is the number of topological (unshrinkable) loops, and $\beta_{2}$ counts the number of topological (unshrinkable) voids (see the Appendix C for more details). According to the mentioned properties of the topological measures and weight-based analysis of weighted complex networks, the applied algebraic topology tool, known as the persistent homology (PH) technique, is useful to study global structures of the weighted complex network in higher-dimensions Edelsbrunner and Harer (2010); Rote and Vegter (2006); Zomorodian (2009).

In the PH-based analysis of a weighted complex network, the weighted network is mapped to a weighted clique simplicial complex as a higher-order structure containing any $(k+1)$ -clique of the network as the $k$ -simplex. Therefore, PH creates a nested sequence of the clique simplicial complex, so-called filtration, by considering the weight as the parameter, such that any link (1-simplex) presents in the complex at a distinct weight if the weight of the link is less than the distinct weight. Accordingly, the $k$ -holes of the complex appear and disappear when the weight (threshold) varies. To summarize the evolution of these $k$ -dimensional topological features of the system, PH assigns an ordered tuple, the so-called persistence pair (PP), to this global descriptor to visualize the topological variation of the system in the birth-death space. This kind of visualization for the evolution of $k$ -holes is called the persistence diagram (PD) for the $k$ -homology group. PD for the $k$ -homology group includes hidden topological information of the $k$ -holes in the weighted complex. For example, one can compute the population of birth, death, and lifetime of $k$ -holes using the distribution of PPs in PD for the $k$ -homology group. Persistence entropy (PE) for the $k$ -homology group is another quantity defined by the Shannon entropy of topological persistence (lifetime) of PPs of the $k$ -homology group (see Appendix D for more details).

III Results

To evaluate the statistical and topological features for the constructed VG from a series, at first, we focus on the synthetic fractional Gaussian noise (fGn) as a self-similar series. There are some exact and approximate algorithms for generating a self-similar time-series. The Hosking Hosking (1984), Cholesky Dieker and Mandjes (2003), and Davies-Harte methods Davies and Harte (1987) are the examples of exact approach to generate fBm and fGn series. Throughout this paper, we implement the Davies-Harte method to generate fGn series with different self-similar exponents and sizes. To reduce any bias in the nominal Hurst exponent as much as possible, all relevant results are determined by doing an ensemble average over $10^{4}$ realizations generated by Davies-Harte method for each $H$ . To ensure the correctness of the Hurst exponent value associated with each simulated time-series, we have computed again the value of $H$ by our code based on the detrended fluctuation analysis (DFA) method Kantelhardt et al. (2002) and compare the expected and computed Hurst exponents. Our results show that both Hurst exponents are in agreement with each other.

Now we turn to the statistical and topological properties of the VGs constructed from the fGn time-series and investigate their behavior concerning the Hurst exponent. The networks of sizes $N=2^{7},2^{8},2^{9},2^{10},2^{11}$ , and $2^{12}$ (for which a desktop with $128$ GB memory is capable of performing matrix operations) are considered. The Python toolbox ”NetworkX”¹¹1https://networkx.org/ is employed for the matrix operations on the graphs. In the topological analysis, we especially focus on the Betti- $0$ (represented by the $\beta_{0}$ defined as the number of connected components of the network) and Betti- $1$ (represented by $\beta_{1}$ defined as the number of loops) features, which are extracted by using the ”Dionysus” Python package Dmitriy (sus2). The persistence statistics, containing the lifetime (the interval between birth and death) of the topological features, and its Shannon entropy are also analyzed.

Each exponent has been estimated by Bayesian statistics accordingly; the $\{{\mathcal{D}}\}$ and $\{\Upsilon\}$ reveal the data and model free parameters, respectively. The posterior function is defined by

{\mathcal{P}}(\Upsilon|{\mathcal{D}})=\frac{{\mathcal{L}}({\mathcal{D}}|\Upsilon){\mathcal{P}}(\Upsilon)}{\int{\mathcal{L}}({\mathcal{D}}|\Upsilon){\mathcal{P}}(\Upsilon)d\Upsilon}

(6)

where $\mathcal{L}$ is the likelihood and $\mathcal{P}(\Upsilon)$ is the prior probability function containing all information concerning model parameters. Here we adopt the top-hat function for the prior function whose window’s size depends on the expected range of the corresponding exponent. Taking into account the central limit theorem, the functional form of likelihood becomes multivariate Gaussian, i.e., ${\mathcal{L}}({\mathcal{D}}|\Upsilon)\sim\exp(-\chi^{2}/2)$ . The $\chi^{2}$ for determining the best-fit value for the scaling exponent reads as

\chi^{2}(\Upsilon)\equiv\Delta^{{\dagger}}.{C}^{-1}.\Delta

(7)

where $\Delta$ is a column vector whose elements are determined by difference between computed value and theoretical form for each measure, and ${C}$ is the corresponding covariance matrix. Finally, the best fit value of the considered exponent is computed by maximizing the likelihood probability distribution and the associated errorbar is given by

68.3\%=\int_{-\sigma_{\Upsilon}}^{+\sigma_{\Upsilon}}{\mathcal{L}}(\mathcal{D}|\Upsilon)d\Upsilon

(8)

Subsequently, we report the best value of the scaling exponent at a $1\sigma$ confidence interval as $\Upsilon_{-\sigma_{\Upsilon}}^{+\sigma_{\Upsilon}}$ .

III.1 Local Statistical Properties

By local properties, we mean the properties which are node-dependent and are not necessarily globally defined. It has been confirmed that the distribution function of the node degree of VGs of the fBms and fGns is power-law $p(k)\propto k^{-\gamma}$ with the exponent $\gamma(H)=3-2H$ and $\gamma(H)=5-2H$ , respectively Lacasa et al. (2008, 2009). In this subsection, we perform our computation for the WNVG, introduced in this paper.

The probability distribution function of eigenvector ( $c^{E}$ ) and betweenness ( $c^{B}$ ) centralities are indicated in panels (a) and (b) of Fig. 2 in log-log scale, in terms of $H$ for WNVGs. The scaling behavior of mentioned distributions has been checked by the Kolmogorov-Smirnov test (KS test) and the results confirm that the power-law behavior of the computed $p(c^{E})$ and $p(c^{B})$ is highly $H$ - and size-dependent [see panels (a) and (b) of Fig. 2]. The probability distribution of closeness centrality and corresponding moments $M_{n}$ for $n=2$ and $n=3$ are illustrated in panels (c) and (d) of Fig. 2, respectively. This part depicts that not only the Hurst dependency is not confirmed but also the overall shape of probability distribution functions of statistical measures depends on the size of the underlying fGn series. Subsequently, such distributions can not discriminate between different fGn series.

The full spectrum of the eigenvalues [Eq. (12)] is illustrated in panel (a) of Fig. 3. We see that the impact of $H$ is changing the range of the spectrum, and by increasing the Hurst exponent, the range of the spectrum for WNVGs becomes tight. This phenomenon can be understood by recalling that correlations (obtained by increasing $H$ ) smooth the underlying time-series, causing the corresponding network to have more links at low weight. For more smoothed time-series, the typical slopes for the associated WNVG become down, leading to lower weights according to Eq. (5), and equivalently making a shorter range for the distribution of $\lambda$ s. Panel (b) of Fig. 3 indicates different moments for $p(\lambda)$ . The solid and dashed lines correspond to $M_{2}$ and $M_{3}$ , respectively. The $n$ th moment of this distribution behaves as an $H$ -dependent quantity. The higher the order of moments, the stronger the dependency on sample size is.

III.2 Topological Properties

For any given value of Hurst exponent, the associated BNVG of fGn is a topological tree; therefore, the BNVGs are topologically equivalent to each other. In other words, various BNVGs are homeomorphic for different $H$ according to the homeomorphism theorem. The looplessness of BNVGs indicates that these networks do not contain some classes of network motifs corresponding to topological loops, so-called topological motifs Xie and Zhou (2011). To make more sense, suppose 4 successive data points as $\{x_{j}\}_{j=i}^{i+3},(j=1,...,N-4)$ from a given time-series $\{x_{i}\}_{i=1}^{N}$ to make a minimal topological loop. From a network perspective, the necessary (not sufficient) condition to construct a topological loop is that $w_{i,i+1}=w_{i,i+3}=w_{i+1,i+2}=w_{i+2,i+3}=1$ and the sufficient condition is $w_{i,i+2}=w_{i+1,i+3}=0$ . The weights here are determined by Eq. (4). Geometrically, the necessary condition is equivalent to the case where data points have upward concavity. The visibility condition does not satisfy the sufficient condition, such that either $w_{i,i+2}=1$ (creating two 2-simplices $[v_{i},v_{i+1},v_{i+2}]$ and $[v_{i},v_{i+2},v_{i+3}]$ ) or $w_{i+1,i+3}=1$ (making two 2-simplices $[v_{i},v_{i+1},v_{i+3}]$ and $[v_{i},v_{i+2},v_{i+3}]$ ) or even $w_{i,i+2}=w_{i+1,i+3}=1$ (constructing a 3-simplex $[v_{i},v_{i+1},v_{i+2},v_{i+3}]$ ) killing the topological loop and leads the binary network to be topologically tree. Therefore, the BNVG does not contain a certain class of network motifs equivalent to topological holes, whereas, in the WNVG case these local patterns can be captured as topological loops. Also mentioned loops are extracted through the filtration of the corresponding weighted simplicial complex. According to the PH language, the sufficient condition can be treated as $w_{i,i+1},w_{i+1,i+2},w_{i+2,i+3},w_{i,i+3}<w_{i,i+2},w_{i+1,i+3}$ ; therefore, the associated topological motifs now remain. Subsequently, we only focus on the topological aspects of WNVGs. Using this terminology, we can analyze local behaviors of a generic fGn using the statistics of corresponding patterns (topological motifs) in WNVGs which are presented as PPs in PD.

Now, we are going to evaluate topological measures for WNVG constructed from a generic time series. Since in this paper, we are interested in examining the capability of the PH technique in determining the Hurst exponent of a typical self-similar process, therefore we take fGn which is an increment of fBm. Classifying the stationary and non-stationary series is in principle possible when the weighted network is constructed from time series in our approach. Intuitively, by increasing the value of the Hurst exponent, the underlying series becomes more smooth and the statistics of weights grow for lower values of weight. For an fBm data set which is the profile set of fGn Taqqu et al. (1995); Kantelhardt et al. (2002), the maximum value of distribution function of weights occurs at lower weight values. Subsequently, the PH of WNVG which is represented by topological curves can recognize the stationary and non-stationary series. The constructed networks according to the visibility graph possess the footprint of Hurst exponent empirically demonstrated in Lacasa et al. (2009). The higher value of $H$ causes making the denser network. Also, the number of $k$ -simplices at lower value of threshold increase in the weighted version of the visibility graph; on the other hand, such simplices influence the evolution of $k$ -holes in associated simplicial complexes. Therefore, the coefficients of topological curves depend on Hurst exponent. Hereafter, we consider fGn as our case study. Panel (a) of Fig. 4 indicates the $\beta_{0}$ as a function of filtration parameter (threshold) for clique complexes of WNVG produced for various time-series with different values of Hurst exponent. To reduce the effect of sample size, we divide the Betti numbers by the sample size and call these normalized Betti numbers. By increasing the value of the Hurst exponent, the normalized $\beta_{0}$ versus threshold decreases. Namely, the number of connected components in the network decreases with increasing $H$ (the graphs become steeper). Interestingly, the slope of normalized $\beta_{0}$ as a function of $w$ is different for different $H$ ; consequently, the WNVG of fGn sets reaches to the connected regime (path-connected), $\beta_{0}=1$ , [in panel (a) of Fig. 4, we take $N=2^{10}$ ] by different rates and at different thresholds, $w_{0}$ . Panel (b) of Fig. 4 represents the value of $w_{0}$ as a function of Hurst exponent for different network sizes. As depicted in the mentioned panel, the value of $w_{0}$ behaves as a sample size-dependent quantity and by increasing $H$ , such dependency becomes negligible.

We define the $\beta_{k}^{\text{(birth)}}(w)$ and $\beta_{k}^{\text{(death)}}(w)$ as the number of $k$ -dimensional holes that are born and die at a given threshold, $w$ , respectively. It turns out that $\beta_{0}^{\text{(birth)}}=0$ for $w>0$ , since at $w=0$ the underlying network has $N$ connected components; therefore, all connected components are born at $w=0$ . Panel (c) of Fig. 4 illustrates the $\beta_{0}^{\text{(death)}}/N$ as a function of $w^{2}$ . As shown in this plot, one of the proper fitting functions to describe the normalized $\beta_{0}^{\text{(death)}}$ is given by $\beta_{0}^{\text{(death)}}(w)\sim\exp\left[-\alpha_{0}^{\text{(death)}}w^{2}\right]$ for $2\lesssim w^{2}\lesssim w^{2}_{\rm max}$ , where the value of $w_{\rm max}$ depends on the $H$ value. The $\alpha_{0}^{\rm(death)}$ depends on the Hurst exponent as increasing function and it behaves as an almost independent function on size of series [panel (d) of Fig. 4].

Panel (a) of Fig. 5 is devoted to $\beta_{1}/N$ for various Hurst exponents. As we expect, for a trivial threshold, $w=0$ , we have $N$ connected components ( $\beta_{0}=N$ ) and therefore the number of loops is identically zero ( $\beta_{1}=0$ ). By increasing the threshold, the higher value of the Hurst exponent leads to a more rapidly increasing rate of $\beta_{1}$ . On the other hand, for a high enough value of the threshold, again the underlying data set behaves like a topological tree without any topological loops. Therefore, the normalized $\beta_{1}$ goes asymptotically to zero, and such descending is more rapid for higher $H$ . We also determine the lowest non-trivial threshold for which there is no loop in the underlying network denoted by $w_{1}$ , and depict this threshold versus $H$ for different samples size in panel (b) of Fig. 5. The $w_{1}$ is also a size-dependent quantity. Comparing the $\beta_{0}/N$ and $\beta_{1}/N$ demonstrate that by increasing threshold value, the WNVGs of the fGn series reach the loopless regime ( $\beta_{1}=0$ ) before appearing in the connected regime (for which $\beta_{0}=1$ ), irrespective of the Hurst exponent, i.e., $w_{0}(H)>w_{1}(H)$ . The quantities $\beta_{1}^{\text{(birth)}}\propto\exp\left[-\alpha_{1}^{\text{(birth)}}w\right]$ , $\beta_{1}^{\text{(death)}}\propto\exp\left[-\alpha_{1}^{\text{(death)}}w\right]$ versus thresholds and corresponding coefficients are illustrated in panels (c)-(f) of Fig. 5, respectively. The $\alpha_{1}^{\rm(birth)}$ and $\alpha_{1}^{\rm(death)}$ are almost size-independent and they grow by increasing $H$ . The local upward concavity behavior of time series satisfying the loop condition and associated topological motifs (1-holes) appears and disappears earlier as the Hurst exponent of the time-series increases.

Another interesting property to assess is the probability distribution of lifetime for 1-holes which is the difference between the death and birth thresholds of a typical measure in a 1-homology class. Panel (a) of Fig. 6 shows the probability distribution of topological 1-dimensional hole lifetime, $\ell_{1}$ , for various synthetic data sets with different values for the Hurst exponent when the vertical axis is plotted in log-scale. Our results confirm that $p(\ell_{1})\propto\exp\left[-\alpha_{1}^{\text{(lifetime)}}\ell_{1}\right]$ . The $H$ -dependency of $\alpha_{1}^{\rm(lifetime)}$ for various systems sizes is depicted in panel (b) of Fig. 6. This result confirms that $\alpha_{1}^{\rm(lifetime)}$ can be considered as a robust measure for determining the Hurst exponent of the fGn signal which is not affected by sample size even compared to $\alpha_{0}^{\rm(death)}$ , $\alpha_{1}^{\rm(birth)}$ , and $\alpha_{1}^{\rm(death)}$ . The increasing behavior of the lifetime coefficient $\alpha_{1}^{\rm(lifetime)}$ versus the Hurst exponent also indicates that the anti-correlated signals ( $H<0.5$ ) contain more persistent topological motifs (1-holes). This behavior also demonstrates a considerable difference between the amount of visibility between the inner and the outer nodes which are corresponding to the local upward concavity behaving data points in these signals.

Figure 7 indicates the persistence diagram (PD) and persistence barcode (PB) for three types of fGn signals for (a) $H=0.2$ (anticorrelated), (b) $H=0.5$ (uncorrelated) and (c) $H=0.8$ (correlated). The open circle and triangle symbols correspond, respectively to the $0$ - and $1$ -homology groups in a persistence diagram for WNVGs of fGn. Each symbol depicts a pair $(w_{\rm birth},w_{\rm death})$ of a $k$ -dimensional hole. As expected, all 0-holes are born in $w_{\rm birth}=0$ , and also always $w_{\rm death}>w_{\rm birth}$ . In the barcode representation (inset plot), the horizontal lines (blue lines for $k=0$ and orange lines for $k=1$ ) start and end on the threshold values on which $k$ -dimensional holes are born and die, respectively.

The associated persistence entropies (PEs) defined by Eq. (30) are obtained using the persistence diagram. Panels (a) and (b) of Fig. 8 depict the ${PE}_{0}$ and ${PE}_{1}$ , as a function of sample size, respectively. We compute persistence entropy for all available Hurst exponent values represented by different symbols. Our results demonstrate that ${PE}_{k}=\mathcal{A}_{k}(H)\log_{10}N$ for $k=0,1$ . The behavior of prefactor $\mathcal{A}_{k}$ versus $H$ is represented in panel (c) of Fig. 8. The $\mathcal{A}_{0}$ is almost an increasing function versus Hurst exponent, while the $\mathcal{A}_{1}$ becomes constant.

IV Implementation on Realistic Data

To illustrate the applicability of PH to characterize the self-similar nature of real data, in this section, we use the physiological data set studied in Olejarczyk and Jernajczyk (2017). We implement the DFA algorithm to verify the scaling behavior of their correlation functions. Then, we apply our approach to the channel F8 of the standard 10–20 electroencephalography (EEG) montage recorded from 14 patients (7 males: 27.9 ± 3.3 years, 7 females: 28.3 ± 4.1 years) with paranoid schizophrenia and 14 healthy controls (7 males: 26.8 ± 2.9, 7 females: 28.7 ± 3.4 years). The EEG data were recorded in all subjects during an eyes-closed resting-state condition for 15 minutes with the sampling frequency of 250 Hz. We construct the WNVG from the EEG signals and the evolution of topological motifs through the filtration process is examined as depicted in Fig. 9. To infer the statistical significance of the results, we carry out the KS test and we obtain the exponential behavior of results indicated in Fig. 9. Then, according to the likelihood approach, we estimate the topological coefficients and corresponding errorbars. The value of coefficients $\alpha_{1}^{(\rm birth)}$ , $\alpha_{1}^{(\rm death)}$ , $\alpha_{1}^{(\rm lifetime)}$ and associated Hurst exponents for both healthy and schizophrenic cases are reported in Table 1. Comparing the topological coefficients with our results for estimation of the Hurst exponent (Figs. 5 and 6), enables us to compute the associated Hurst exponent. The value of the Hurst exponent for each coefficient is reported in Table 1. Averaging on three kinds of Hurst exponents for EEG signals leads to $H=0.86\pm 0.02$ and $H=0.69\pm 0.03$ at the $1\sigma$ confidence interval for healthy and schizophrenic cases, respectively. Our results also reveal a correlated behavior in both data sets. It is worth noting that, the PH can classify different types of data.

Measure	Healthy	Schizophrenic
$\alpha_{1}^{(\rm birth)}$	$4.85\pm 0.04$	$3.52\pm 0.05$
$H_{(\rm birth)}$	$0.84\pm 0.02$	$0.66\pm 0.03$
$\alpha_{1}^{(\rm death)}$	$3.96\pm 0.04$	$2.83\pm 0.04$
$H_{(\rm death)}$	$0.85\pm 0.02$	$0.68\pm 0.03$
$\alpha_{1}^{(\rm lifetime)}$	$7.03\pm 0.03$	$5.00\pm 0.03$
$H_{(\rm lifetime)}$	$0.89\pm 0.02$	$0.73\pm 0.03$

Table 1: The value of topological coefficients and corresponding Hurst exponents for healthy and schizophrenic cases at

1\sigma

level of confidence.

V Summary and Concluding Remarks

Applying the various tools developed in network science on a network constructed from a typical time series reveals non-trivial information. Particularly, some of the features of the network for a generic fractional Gaussian noise (fGn), characterized by Hurst exponent $H$ , exhibit a promising relation to the corresponding Hurst exponent. In this paper, we developed a method to generate a WNVG from the fGn (an increment of fBm) series which is a more general network compared to a binary network class. According to the filtration process, the evolution of topological holes in the associated WNVG can be captured. To this end, we used the PH technique, to examine the topological motifs.

The statistical properties of the WNVG were analyzed using the standard network measures. Our main results are as follows:

(1) The probability distribution functions of eigenvector, betweenness and closeness centralities computed for WNVGs constructed from fGn series [panel (b) of Fig. 1] do not depend on $H$ and even the overall shape of mentioned distributions is highly size-dependent. The main message is that such kinds of statistical measures are not proper measures for determining the Hurst exponent of the fGn series (Fig. 2).

(2) Increasing the amount of correlation results in obtaining more dense networks and shifts the weights of the links to the lower values and, consequently, the width of the distribution function of the eigenvalues to be tighter. The $n$ th moment of this distribution behaves as an $H$ -dependent quantity. The higher the order of moments, the stronger the dependency on sample size (Fig. 3).

In the second part, we considered the topological properties of synthetic fGn. Our main results are summarized below:

(3) The corresponding BNVG of a typical fGn is a topological tree which means that all topological motifs have identically vanished while taking into account a weight function to construct the so-called weighted NVG (WNVG) leads to the survival of topological motifs and therefore their evolution with respect to self-similar exponents can be examined. Such evolution reveals a robust feature for determining the Hurst exponent reliably.

(4) The Betti-0 is a representative of the number of connected components of the weighted clique simplicial complex associated with the WNVG with respect to the threshold (weight). The decreasing rate of Betti-0 increases by increasing $H$ . The threshold value for which the underlying WNVG reaches the connected regime (path-connected) indicates an $H$ -dependency [panels (a) and (b) of Fig. 4].

(5) Our result also confirms that the coefficient corresponding to the exponentially decaying function of the number of connected components dying (merging), $\alpha_{0}^{(\rm death)}$ , as a function of threshold, depends on the Hurst exponent and interestingly it is almost size-independent [panels (c) and (d) of Fig. 4].

(6) The statistics of 1-holes (topological loops) show that the vanishing threshold for the number of topological loops, $w_{1}$ , is $H$ -dependent and it contains the sample size effect. The coefficients corresponding to the exponentially decaying function of the number of topological loops appearing ( $\alpha_{1}^{({\rm birth})}$ ) and disappearing ( $\alpha_{1}^{({\rm death})}$ ) as a function of threshold reveal the proper criteria for measuring the Hurst exponent (Fig. 5). These coefficients also indicate that the topological motifs (upward concavity behavior data points of time-series satisfying loop condition) evolve (appear and disappear) in low weights for correlated signals.

(7) The probability distribution of the lifetime of 1-holes ( $\ell_{1}$ ) confirms that the corresponding coefficient ( $\alpha_{1}^{({\rm lifetime})}$ ) is an increasing function versus $H$ emphasizing that the sample size effect is completely diminished in this quantity. Consequently, for a self-similar time-series in the absence of trends, this coefficient can be a reliable measure for estimating the Hurst exponent even for a small sample size irrespective of the value of the $H$ exponent (Fig. 6). This coefficient also indicates that the fGn signals living in negatively correlated regimes incorporate more robust topological motifs (1-holes) which shows a significant difference between the quality of visibility of inner and outer nodes corresponding to the local upward concavity behaving data points in these signals.

(8) The persistence pairs (PPs) in the persistence diagram (PD) and the persistence barcode (PB) for the weighted clique complex of WNVGs which are indicators of persistent homology have been computed and with increasing the value of the Hurst exponent shrink to the origin of the coordinate (Fig. 7). We also computed persistence entropies (PEs) for 0-homology and 1-homology groups. Both quantities depend on sample size as expected and the corresponding slopes in semi-log scale were almost $H$ -dependent (Fig. 8). Implementing the PH method on real data and computing the topological parameters, we could determine the corresponding Hurst exponent. Interestingly, PH admitted its capability for the classification of various data sets.

(9) Employing PH on the constructed weighted network from the realistic series confirmed the correlated behavior of electroencephalography for both healthy and schizophrenic samples.

Finally, we emphasize that the behavior of the local (statistical) observables depends weakly on $H$ , whereas the exponents of global (topological) observables are almost strongly $H$ -dependent and even for the $\alpha_{1}^{({\rm lifetime})}$ , the size effect is completely diminished. A take-home message is that the TDA provides a new type of measure to quantify the Hurst exponent and, therefore, the scaling exponents of the correlation function, power spectrum, and fractal dimension can be specified with reliable approaches.

In this paper, we have not verified whether the persistent homology technique is capable of recognizing that the underlying time-series is a self-similar set or not. Indeed, this purpose is beyond the scope of this paper. To address the capability of our method to discriminate between stationary and non-stationary times series, more simulations for different types of non-stationary cases should be performed; this is left for our future study. Also, it would be interesting to examine the effect of trends and irregularity which may occur in a wide range of events in nature in the context of the TDA and more precisely via the PH approach, and we leave this too for future research. The above analysis can be done on different phenomena ranging from cosmology, astrophysics, and economics to biology Akrami et al. (2020); Eghdami et al. (2018); Vafaei Sadr et al. (2018) with different approaches to make graphs from time-series.

Acknowledgment

The authors are very grateful to Marco Piangerelli for his notice on TDA as a starting point in this research. In addition, we thank to the anonymous referees, who helped us to improve the paper, considerably. The numerical simulations were carried out on the computing cluster of the University of Mohaghegh Ardabili.

Appendices

Appendix A: Synthetic Fractional Gaussian Noise

To model the stochastic fractal processes, Mandelbrot and Van Ness introduced the fractional Brownian motion (fBm) and fractional Gaussian noise (fGn) Mandelbrot and Van Ness (1968). The theory of fBm is a mathematical generalization of the classical random walk and Brownian motion Mandelbrot and Van Ness (1968); Kahane and Kahane (1993). A 1-dimensional fBm is represented by $B\equiv\{B(t):t\geq 0\}$ , with power-law variance, for which $Var(B(at))\triangleq Var(a^{H}B(t))=a^{2H}Var(B(t))$ , where $H\in(0,1)$ is called the Hurst exponent. For this random force, the Markov property and the stationarity are violated (note that when we have domain Markov property, stationarity, and continuity for a time series, then it should be proportional to a 1-dimensional Brownian motion). A model for generating fBm (denoted by $B_{H}$ to emphasize on its Hurst exponent $H$ ) is a generalization of the Brownian motion which is non-stationary and non-Markovian Reed et al. (1995), and is given by the Holmgren-Riemann-Liouville fractional integral,

B_{H}(t)=\frac{1}{{\Gamma(H+\frac{1}{2})}}\int\limits_{0}^{t}{{{(t-s)}^{H-\frac{1}{2}}}}\text{d}B(s)

(9)

where $\Gamma$ is the Gamma function, $\text{d}B(s)\equiv B(s+\text{d}s)-B(s)$ is the increment of 1-dimensional Brownian motion, and it has the following covariance:

\langle{{B_{H}}(t){B_{H}}(s)}\rangle=\frac{\sigma^{2}}{2}\left({{{\left|t\right|}^{2H}}+{{\left|s\right|}^{2H}}-{{\left|{t-s}\right|}^{2H}}}\right)

(10)

where $\sigma^{2}\equiv\langle B(0)\rangle$ and also $\langle B_{H}\rangle=0$ . The increments, $x_{H}(t)\equiv\delta_{t}\left({B_{H}}(t+\delta t)-{B_{H}}(t)\right)$ , are known as fractional Gaussian noise (fGn). The power spectrum of fBm and fGn behaves as $S(f)\sim f^{-\xi}$ , where $\xi(H)=2H+1$ and $\xi(H)=2H-1$ for fBm and fGn, respectively. For $H>\frac{1}{2}$ ( $H<\frac{1}{2}$ ) the corresponding fGn is positively (negatively) correlated. According to the results provided by detrended fluctuation analysis (DFA), the relation between the scaling exponent derived by the DFA method and associated Hurst exponent of fBm is $H+1$ , while constructing the fGn series by making the increment of fBm confirmed that the scaling exponent of fluctuation functions computed by DFA is directly related to the Hurst exponent Taqqu et al. (1995); Kantelhardt et al. (2002).

Appendix B: Statistical Network Analysis

Suppose that a network (graph) is represented by $G=(V,E,w)$ . Here, $V\equiv\{v_{i}\}_{i=1}^{N}$ is a node (vertex) set, $E=V\times V\equiv\Bigr{\{}e_{ij}=(v_{i},v_{j})\Bigr{|}v_{i},v_{j}\in V\Bigr{\}}_{i,j=1}^{N}$ is a link (edge) set, and $w:E\rightarrow\mathbb{R}$ is a weight function (threshold). Subsequently, the degree of $i$ th node ( $v_{i}$ ) is the number of nodes straightly connected with the underlying node by non-zero weight, and it is denoted by $k_{i}\equiv\sum_{j}(1-\delta_{0,w_{ij}})$ . The degree centrality ( $c_{i}^{D}$ ) is defined by $\frac{k_{i}}{N-1}$ , which is apparently related directly to how important the underlying node is, since it is the number of agents that have a connection with it. An important function concerning this quantity is the degree distribution showing the probability distribution function for degree of all nodes in the network:

p(k)=\displaystyle\frac{1}{N}\displaystyle\sum_{i=1}^{N}\delta_{k,k_{i}}

(11)

The eigenvector centrality ( $c_{i}^{E}$ ) is also defined via the eigenvalue equation

\lambda c_{i}^{E}=\sum_{j=1}^{N}w_{ij}c_{j}^{E}

(12)

where $\lambda$ is the eigenvalue. The maximum value of the $\lambda$ spectrum, i.e., $\lambda_{\rm max}$ plays the dominant role in the network properties, the corresponding eigenvectors of which are denoted by $c_{i,{\rm max}}^{E}$ revealing the importance of the nodes. Let us denote the shortest distance between nodes $v_{i}$ and $v_{j}$ by $d_{ij}$ which is assumed to be $N$ when there is no path connecting them (disconnected graphs). Then the closeness centrality is defined by

c_{i}^{C}=\frac{N-1}{\sum_{j=1}^{N}d_{ij}},

(13)

and the betweenness centrality is as follows:

c_{i}^{B}=\frac{2}{(N-1)(N-2)}{\sum_{j=1,j\neq i}^{N}}{\sum^{N}_{k=1,k\neq i,j}}\frac{n_{jk}(i)}{n_{jk}},

(14)

where $n_{jk}$ is the number of geodesics from $v_{j}$ to $v_{k}$ , and $n_{jk}(i)$ is the number of geodesics from $v_{j}$ to $v_{k}$ which pass through node $v_{i}$ .

Appendix C: Algebraic Topology

Topology generally refers to the global features in contrast to the geometrical invariants of underlying objects or sets. We have two spaces represented by $X$ and $Y$ , and they have the same local (geometric) features if any relevant features are invariant under congruence, while the mentioned spaces are topologically equivalent if the associated features are invariant under homeomorphisms. In other words, they are homeomorphic. Homology theory plays a crucial role in the mathematical description of the relevant building block of a typical topological space and reveals the connectedness of underlying space Edelsbrunner and Harer (2010); Rote and Vegter (2006); Zomorodian (2009). Based on such properties, for the sake of clarity, we will give a brief review on the building blocks of algebraic topology which are useful to set up homology groups.

Simplex: A $k$ -simplex ( $\sigma_{k}$ ) is a convex-hull of any geometrically independent subset, accordingly $\sigma_{k}\equiv[v_{0},v_{1},...,v_{k}]\subseteq\mathbb{R}^{D}$ . By this definition, a 0-simplex is a point, a 1-simplex is a segment of a line, a 2-simplex is a filled triangle, a 3-simplex is a filled tetrahedron, and so on (Fig. 10).

Face: An $l$ -simplex which is denoted by $\sigma_{l}$ is a subset of the $k$ -simplex ( $\sigma_{l}\subseteq\sigma_{k}$ ) and it is called the $l$ -face of the $k$ -simplex.

Simplicial complex: A simplicial complex ( $\psi$ ) is a collection of simplices such that: any $l$ -face of any $k$ -simplex of a typical complex ( $0<l<k$ ), is a member of the complex. In addition, the non-empty intersection of any two simplices, $\sigma_{k}$ , and $\sigma_{m}$ , from the complex is an $l$ -face of both simplices. The dimension of a complex is the maximum dimension of all simplices of the complex. According to the definition of the complex, one can define its $k$ -ordered subcollection of complexes $\psi$ as follows:

\Sigma_{k}(\psi)\equiv\Bigr{\{}\sigma\in\psi~{}\Bigr{|}~{}dim(\sigma)=k\Bigr{\}}

(15)

Chain: For a given simplicial complex, a $k$ -chain ( $k$ -dimensional chain) is a linear combination of $k$ -simplices of $\psi$ , defined by

c_{k}\equiv\displaystyle\sum_{i=1}^{|\Sigma_{k}(\psi)|}a_{i}~{}\sigma_{k}^{(i)}\quad;\quad\sigma_{k}^{(i)}\in\Sigma_{k}(\psi)

(16)

where $|\Sigma_{k}(\psi)|$ corresponds to the cardinality of the $k$ -ordered subcollection of complex and the coefficients, $a_{i}$ s, belong to a field, which is usually considered as $\mathbb{F}=\mathbb{Z}_{2}\equiv\{0,1\}$ . The collection of all possible $k$ -chains in the simplicial complex is called the $k$ -chain group as

C_{k}(\psi)\equiv\Bigr{\{}c_{k}~{}\Bigr{|}~{}c_{k}=\displaystyle\sum_{i=1}^{|\Sigma_{k}(\psi)|}a_{i}\sigma_{k}^{(i)};\sigma_{k}^{(i)}\in\Sigma_{k}(\psi),a_{i}\in\mathbb{F}\Bigr{\}}

(17)

Boundary operator: For the simplices in any dimension, the boundary operator $\partial_{k}$ is an operator mapping $\sigma_{k}$ to its boundary according to

\partial_{k}(\sigma_{k})\equiv\displaystyle\sum_{j=0}^{k}(-1)^{j}~{}[x_{0},x_{1},...,x_{j-1},x_{j+1},...,x_{k}]~{}~{}\subseteq~{}\sigma_{k}

(18)

Boundary: A $k$ -chain which is the boundary of a $(k+1)$ -chain, is called the $k$ -boundary, denoted by $b_{k}$ . The $k$ -boundary group is the collection of all $k$ -boundaries in complex $\psi$ ,

	$\displaystyle B_{k}(\psi)$	$\displaystyle\equiv$	$\displaystyle\Bigr{\{}c_{k}\in C_{k}(\psi)\Bigr{\|}\exists c_{k+1}\in C_{k+1}(\psi);\partial_{k+1}(c_{k+1})=c_{k}\Bigr{\}}$		(19)
		$\displaystyle\equiv$	$\displaystyle\Bigr{\{}b_{k}^{(i)}\Bigr{\}}_{i}^{\|B_{k}(\psi)\|}\subseteq C_{k}(\psi)$		(19)

Cycle: A $k$ -chain that has no boundary, is called a $k$ -cycle denoted by $z_{k}$ as

\partial_{k}(z_{k})=\oslash

(20)

The $k$ -cycle group is defined as the collection of all $k$ -cycles in complex $\psi$ ,

	$\displaystyle Z_{k}(\psi)$	$\displaystyle\equiv$	$\displaystyle\Bigr{\{}c_{k}\in C_{k}(\psi)~{}\Bigr{\|}~{}\partial_{k}(c_{k})=\oslash\Bigr{\}}$		(21)
		$\displaystyle\equiv$	$\displaystyle\Bigr{\{}z_{k}^{(i)}\Bigr{\}}_{i}^{\|Z_{k}(\psi)\|}\subseteq C_{k}(\psi)$		(21)

Since ”boundaries have no boundary”, therefore, we have

\partial_{k}(b_{k})=\partial_{k}(\partial_{k+1}(c_{k+1}))=\oslash

(22)

hence

B_{k}(\psi)\subseteq Z_{k}(\psi)\subseteq C_{k}(\psi)

(23)

Homology group: The $k$ -homology group is defined by the quotient group of the $k$ -cycles group by the $k$ -boundary group,

H_{k}(\psi)\equiv Z_{k}(\psi)/B_{k}(\psi).

(24)

The $k$ th Betti number of a simplicial complex, denoted by $\beta_{k}(\psi)$ , is a topological invariant which counts the number of $k$ -homology classes corresponding to the number of $k$ -dimensional holes of complex $\psi$ (Fig. 11),

\beta_{k}(\psi)\equiv dim(H_{k}(\psi))

(25)

Clique (flag) simplicial complex of an unweighted (binary) network, $G=(V,E,w\in\{0,1\}$ ), is a simplicial complex, denoted by $\psi^{(G)}$ , such that any $k$ -simplex of each dimension in the complex corresponds to a $(k+1)$ -clique in the network and vice versa (Fig. 12).

A binary network is a topological tree, if and only if, it is topologically holeless in all dimensions. Namely, the associated clique simplicial complex has the following property:

\beta_{k}(\psi^{(G)})=min(\beta_{k})=\left\{\begin{array}[]{l l}1\qquad;\qquad k=0\\ 0\qquad;\qquad k>0\\ \end{array}\right.

(26)

Appendix D: Persistent Homology

In the context of topological data analysis (TDA), we are interested in statistical analysis of the structure in data. Generally, in TDA-based analysis, data of any type (point cloud data, scalar field, time-series, network, etc.) when mapped to a weighted simplicial complex are worked out topologically in terms of the parameters present inherently in the original data, e.g. the weight of links in a weighted network, or the pairwise distance between data points in point cloud data. More precisely, TDA maps parameter-dependent data, $X(w)$ , (where $w$ is a typical parameter, like the threshold value for any weighted network) to a weighted simplicial complex, $\psi^{X}(w)$ . Such approach produces the chain group, the cycle group, the boundary group, the homology group and particularly, the $k$ th Betti number Nakahara (2003); Edelsbrunner and Harer (2010).

Persistent homology (PH) as a powerful tool of TDA examines the creation (birth) and destruction (death) of topological invariants associated with homology classes during a mathematical process called filtration Edelsbrunner and Harer (2010). Filtration, $\phi$ , is a nested sequence of weighted complex $\psi(w)$ in which any complex with a distinct weight is a subcollection of any complex with higher weight,

\phi(\psi(w))\equiv\left(\psi(w)~{}\Bigr{|}~{}\forall w^{\prime}<w^{\prime\prime}:\psi(w^{\prime})\subseteq\psi(w^{\prime\prime})\right)_{w_{\rm min}}^{w_{\rm max}}

(27)

More precisely, the PH technique enumerates the $k$ th Betti number of any subcomplex in $\phi$ and assigns an ordered tuple $w^{(h_{k})}\equiv(w_{{\rm birth}}^{(h_{k})},w_{{\rm death}}^{(h_{k})})$ to the existing $k$ -dimensional topological hole. Here $w_{{\rm birth}}^{(h_{k})}$ and $w_{{\rm death}}^{(h_{k})}$ are the thresholds for which $h_{k}$ appears (birth) and disappears (death), respectively. Since $w_{{\rm birth}}^{(h_{k})}<w_{{\rm death}}^{(h_{k})}$ , we can define the positive-value quantity $\ell^{(h_{k})}\equiv w_{{\rm death}}^{(h_{k})}-w_{{\rm birth}}^{(h_{k})}$ as persistency (lifetime) of a $k$ -dimensional hole. The persistence barcode (PB) and equivalently the persistence diagram (PD) are the famous representations of PH. As an illustration, the $k$ -dimensional persistence diagram of weighted complex $\psi(w)$ is a multiset $PD_{k}\Bigr{(}\phi(\psi(w))\Bigr{)}\equiv(\mathcal{M},\mathcal{N})$ , where

\mathcal{M}~{}\equiv~{}\Bigr{\{}w^{(h_{k})}~{}\Bigr{|}~{}h_{k}\in H_{k}(\psi(w))~{},~{}\psi(w)\in\phi\Bigr{\}}

(28)

and $\mathcal{N}:\mathcal{M}\rightarrow\mathbb{N}$ is the count function. Inspired by Shannon entropy for a typical state probability, one can define the persistence entropy (PE) of the $k$ th PD (PB). To this end, we construct the probability for lifetime of homology classes as

\displaystyle p(\ell^{(h_{k})})\equiv\frac{\ell^{(h_{k})}}{\mathcal{L}}\quad;\quad\mathcal{L}~{}\equiv\displaystyle\sum_{w^{(h_{k})}\in\mathcal{M}({PD}_{k})}\ell^{(h_{k})}

(29)

Therefore, the PE for the $k$ -dimensional topological hole is defined by Merelli et al. (2015); Atienza et al. (2019)

\displaystyle PE_{k}=-\displaystyle\sum_{w^{(h_{k})}\in\mathcal{M}({PD}_{k})}p(\ell^{(h_{k})})~{}\log_{10}p(\ell^{(h_{k})})

(30)

References

Edelsbrunner et al. (2001) H. Edelsbrunner, J. Harer, and A. Zomorodian, in Proceedings of the Seventeenth Annual Symposium on Computational Geometry, SCG ’01 (Association for Computing Machinery, New York, NY, USA, 2001) pp. 70–79.
Zomorodian and Carlsson (2005) A. Zomorodian and G. Carlsson, Discrete & Computational Geometry 33, 249 (2005).
Zomorodian (2005) A. J. Zomorodian, Topology for computing, Vol. 16 (Cambridge university press, 2005).
Carlsson (2009) G. Carlsson, Bulletin of the American Mathematical Society 46, 255 (2009).
Zomorodian (2012) A. Zomorodian, Advances in applied and computational topology 70, 1 (2012).
Wasserman (2018) L. Wasserman, Annual Review of Statistics and Its Application 5, 501 (2018).
Nakahara (2003) M. Nakahara, Geometry, topology and physics (CRC Press, 2003).
Munkres (2018) J. R. Munkres, Elements of algebraic topology (CRC Press, 2018).
Hatcher (2005) A. Hatcher, Algebraic topology (Cambridge University Press, 2005).
Edelsbrunner and Harer (2010) H. Edelsbrunner and J. Harer, Computational topology: an introduction (American Mathematical Soc., 2010).
Ghrist (2008) R. Ghrist, Bulletin of the American Mathematical Society 45, 61 (2008).
Carlsson et al. (2005) G. Carlsson, A. Zomorodian, A. Collins, and L. J. Guibas, International Journal of Shape Modeling 11, 149 (2005).
Patel (2018) A. Patel, Journal of Applied and Computational Topology 1, 397 (2018).
Bubenik and Dłotko (2017) P. Bubenik and P. Dłotko, Journal of Symbolic Computation 78, 91 (2017).
Adams et al. (2017) H. Adams, T. Emerson, M. Kirby, R. Neville, C. Peterson, P. Shipman, S. Chepushtanova, E. Hanson, F. Motta, and L. Ziegelmeier, The Journal of Machine Learning Research 18, 218 (2017).
Chazal and Michel (2017) F. Chazal and B. Michel, arXiv preprint arXiv:1710.04019 (2017).
Munch (2017) E. Munch, Journal of Learning Analytics 4, 47 (2017).
Serrano et al. (2020) D. H. Serrano, J. Hernández-Serrano, and D. S. Gómez, Chaos, Solitons & Fractals 137, 109839 (2020).
Saggar et al. (2018) M. Saggar, O. Sporns, J. Gonzalez-Castillo, P. A. Bandettini, G. Carlsson, G. Glover, and A. L. Reiss, Nature communications 9, 1 (2018).
Horak et al. (2009) D. Horak, S. Maletić, and M. Rajković, Journal of Statistical Mechanics: Theory and Experiment 2009, P03034 (2009).
Topaz et al. (2015) C. M. Topaz, L. Ziegelmeier, and T. Halverson, PloS one 10, e0126383 (2015).
Sizemore et al. (2019) A. E. Sizemore, J. E. Phillips-Cremins, R. Ghrist, and D. S. Bassett, Network Neuroscience 3, 656 (2019).
Pranav et al. (2017) P. Pranav, H. Edelsbrunner, R. Van de Weygaert, G. Vegter, M. Kerber, B. J. Jones, and M. Wintraecken, Monthly Notices of the Royal Astronomical Society 465, 4281 (2017).
Jaquette and Schweinhart (2020) J. Jaquette and B. Schweinhart, Communications in Nonlinear Science and Numerical Simulation 84, 105163 (2020).
Speidel et al. (2018) L. Speidel, H. A. Harrington, S. J. Chapman, and M. A. Porter, Physical Review E 98, 012318 (2018).
Salnikov et al. (2018) V. Salnikov, D. Cassese, and R. Lambiotte, European Journal of Physics 40, 014001 (2018).
Bobrowski and Skraba (2020) O. Bobrowski and P. Skraba, Physical Review E 101, 032304 (2020).
Tran et al. (2021) Q. H. Tran, M. Chen, and Y. Hasegawa, Physical Review E 103, 052127 (2021).
Olsthoorn et al. (2020) B. Olsthoorn, J. Hellsvik, and A. V. Balatsky, Physical Review Research 2, 043308 (2020).
Hurst (1951) H. E. Hurst, Transactions of the American society of civil engineers 116, 770 (1951).
Mandelbrot and Van Ness (1968) B. B. Mandelbrot and J. W. Van Ness, SIAM review 10, 422 (1968).
Eghdami et al. (2018) I. Eghdami, H. Panahi, and S. Movahed, The Astrophysical Journal 864, 162 (2018).
Jiang et al. (2019) Z.-Q. Jiang, W.-J. Xie, W.-X. Zhou, and D. Sornette, Reports on Progress in Physics 82, 125901 (2019).
Peng et al. (1994) C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. L. Goldberger, Physical review e 49, 1685 (1994).
Peng et al. (1995) C.-K. Peng, S. Havlin, H. E. Stanley, and A. L. Goldberger, Chaos: an interdisciplinary journal of nonlinear science 5, 82 (1995).
Kantelhardt et al. (2002) J. W. Kantelhardt, S. A. Zschiegner, E. Koscielny-Bunde, S. Havlin, A. Bunde, and H. E. Stanley, Physica A: Statistical Mechanics and its Applications 316, 87 (2002).
Zhou et al. (2008) W.-X. Zhou et al., Physical Review E 77, 066211 (2008).
Hu et al. (2001) K. Hu, P. C. Ivanov, Z. Chen, P. Carpena, and H. E. Stanley, Physical Review E 64, 011114 (2001).
Chen et al. (2002) Z. Chen, P. C. Ivanov, K. Hu, and H. E. Stanley, Physical review E 65, 041107 (2002).
Kantelhardt et al. (2001) J. W. Kantelhardt, E. Koscielny-Bunde, H. H. Rego, S. Havlin, and A. Bunde, Physica A: Statistical Mechanics and its Applications 295, 441 (2001).
Nagarajan and Kavasseri (2005a) R. Nagarajan and R. G. Kavasseri, Chaos, Solitons & Fractals 26, 777 (2005a).
Xu et al. (2005) L. Xu, P. C. Ivanov, K. Hu, Z. Chen, A. Carbone, and H. E. Stanley, Physical Review E 71, 051101 (2005).
Zou et al. (2019) Y. Zou, R. V. Donner, N. Marwan, J. F. Donges, and J. Kurths, Physics Reports 787, 1 (2019).
Lacasa et al. (2008) L. Lacasa, B. Luque, F. Ballesteros, J. Luque, and J. C. Nuno, Proceedings of the National Academy of Sciences 105, 4972 (2008).
Lacasa et al. (2009) L. Lacasa, B. Luque, J. Luque, and J. C. Nuno, EPL (Europhysics Letters) 86, 30001 (2009).
Nagarajan and Kavasseri (2005b) R. Nagarajan and R. G. Kavasseri, Physica A: Statistical Mechanics and its Applications 354, 182 (2005b).
Hu et al. (2009) J. Hu, J. Gao, and X. Wang, Journal of Statistical Mechanics: Theory and Experiment 2009, P02066 (2009).
Wu et al. (2007) Z. Wu, N. E. Huang, S. R. Long, and C.-K. Peng, Proceedings of the National Academy of Sciences 104, 14889 (2007).
Costa et al. (2011) L. d. F. Costa, O. N. Oliveira Jr, G. Travieso, F. A. Rodrigues, P. R. Villas Boas, L. Antiqueira, M. P. Viana, and L. E. Correa Rocha, Advances in Physics 60, 329 (2011).
Gonçalves et al. (2016) B. A. Gonçalves, L. Carpi, O. A. Rosso, and M. G. Ravetti, Physica A: Statistical Mechanics and its Applications 464, 93 (2016).
Gao et al. (2016) Z.-K. Gao, Q. Cai, Y.-X. Yang, W.-D. Dang, and S.-S. Zhang, Scientific reports 6, 35622 (2016).
Xie and Zhou (2011) W.-J. Xie and W.-X. Zhou, Physica A: Statistical Mechanics and its Applications 390, 3592 (2011).
Zheng et al. (2021) M. Zheng, S. Domanskyi, C. Piermarocchi, and G. I. Mias, Scientific Reports 11, 5623 (2021).
Yang et al. (2009) Y. Yang, J. Wang, H. Yang, and J. Mang, Physica A: Statistical Mechanics and its Applications 388, 4431 (2009).
Ahmadlou et al. (2010) M. Ahmadlou, H. Adeli, and A. Adeli, Journal of neural transmission 117, 1099 (2010).
Bianconi and Rahmede (2016) G. Bianconi and C. Rahmede, Physical Review E 93, 032315 (2016).
Kovalenko et al. (2021) K. Kovalenko, I. Sendiña-Nadal, N. Khalil, A. Dainiak, D. Musatov, A. M. Raigorodskii, K. Alfaro-Bittner, B. Barzel, and S. Boccaletti, Communications Physics 4, 1 (2021).
Young et al. (2020) J.-G. Young, G. Petri, and T. P. Peixoto, arXiv preprint arXiv:2008.04948 (2020).
Rote and Vegter (2006) G. Rote and G. Vegter, in Effective Computational Geometry for curves and surfaces (Springer, 2006) pp. 277–312.
Zomorodian (2009) A. Zomorodian, Algorithms and theory of computation handbook 2 (2009).
Hosking (1984) J. R. Hosking, Water resources research 20, 1898 (1984).
Dieker and Mandjes (2003) A. B. Dieker and M. Mandjes, On spectral simulation of fractional Brownian motion (Centrum voor Wiskunde en Informatica, 2003).
Davies and Harte (1987) R. B. Davies and D. Harte, Biometrika 74, 95 (1987).
Note (1) https://networkx.org/.
Dmitriy (sus2) Dmitriy, (https://mrzv.org/software/dionysus2/).
Taqqu et al. (1995) M. S. Taqqu, V. Teverovsky, and W. Willinger, Fractals 3, 785 (1995).
Olejarczyk and Jernajczyk (2017) E. Olejarczyk and W. Jernajczyk, PLoS One 12, e0188629 (2017).
Akrami et al. (2020) Y. Akrami, M. Ashdown, J. Aumont, C. Baccigalupi, M. Ballardini, A. Banday, R. Barreiro, N. Bartolo, S. Basak, K. Benabed, et al., Astronomy & Astrophysics 641, A4 (2020).
Vafaei Sadr et al. (2018) A. Vafaei Sadr, M. Farhang, S. Movahed, B. Bassett, and M. Kunz, Monthly Notices of the Royal Astronomical Society 478, 1132 (2018).
Kahane and Kahane (1993) C. Kahane and J.-P. Kahane, Some random series of functions, Vol. 5 (Cambridge University Press, 1993).
Reed et al. (1995) I. S. Reed, P. Lee, and T.-K. Truong, IEEE Transactions on Information Theory 41, 1439 (1995).
Merelli et al. (2015) E. Merelli, M. Rucco, P. Sloot, and L. Tesei, Entropy 17, 6872 (2015).
Atienza et al. (2019) N. Atienza, R. Gonzalez-Diaz, and M. Rucco, Journal of Intelligent Information Systems 52, 637 (2019).