This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Stable Volumes for Persistent Homology

Ippei Obayashi
Abstract

This paper proposes a stable volume and a stable volume variant, referred to as a stable sub-volume, for more reliable data analysis using persistent homology. In prior research, an optimal cycle and similar ideas have been proposed to identify the homological structure corresponding to each birth-death pair in a persistence diagram. While this is helpful for data analysis using persistent homology, the results are sensitive to noise. In this paper, stable volumes and stable sub-volumes are proposed to solve this problem. For a special case, we prove that a stable volume is the robust part of an optimal volume against noise. We implemented stable volumes and sub-volumes on HomCloud, a data analysis software package based on persistent homology, and show examples of stable volumes and sub-volumes.

1 Introduction

Topological data analysis (TDA)[13, 5] is a data analysis method utilizing the mathematical concept of topology. In recent years, persistent homology (PH)[14, 33] has become one of the most important tools of TDA. PH is mathematically formalized by using homology on a filtration. We can characterize the geometric information by encoding the information regarding the scale of the data on the filtration. PH has developed rapidly over the most recent decade and has been applied in a variety of areas, including natural image analysis[6], biology [7], geology [31], and materials science [18, 28, 26, 19].

PH information can be described by a persistence diagram (PD) or a persistence barcode111 A persistence diagram and a persistence barcode contain the same information. The difference is how the information is visualized.. A PD is a scatter plot on the XY plane, where each point (called a birth-death pair) on the plot corresponds to a homological structure in the input data.

1.1 Persistent homology and volume-optimal cycles

Figure 1 can be used to intuitively explain how a PD is determined from data. The input data in the example constitute a pointcloud with five points, as shown in Fig. 1(a). The points appear as a regular triangle and a square. Let aa be the length of edges of the regular triangle and the square. Since the five points themselves do not carry any interesting topological information, we construct a topological structure by putting circles with values of radii rr as indicated in Fig. 1(b)(c)(d)(e).

Refer to caption
Figure 1: How to compute a Persistence diagram for 5 points (a) Input pointcloud (b) Pointcloud and circles with radii r<a/2r<a/2 (c) r=a/2r=a/2 (d) r=a/3r=a/\sqrt{3} (e) r=a/2r=a/\sqrt{2} (f) Persistence diagram

We now face the problem of how to determine the proper size of the circles. If the circles are too small, as in Fig. 1(b), the topology is simply equivalent to the five points. On the other hand, if the circles are too large, as in Fig. 1(e), the shape becomes acyclic, which makes it impossible to uncover any interesting topological information. PH solves this problem by considering the appearance and disappearance of homology generators associated with radii changes; that is, PH considers the increasing sequence from (b) to (e).

In the Fig. 1 example, two holes (homology generators in the 1st homology) appear at (c), one of which disappears at (d). The other hole disappears at (e). The pair of radii222The squares of radii are also often used in the literature of PH. of the appearance and disappearance of each homology generator is called a birth-death pair, and the multiset of all birth-death pairs is called a persistence diagram. In this example, the PD for the 1st homology is {(a/2,a/3),(a/2,a/2)}\{(a/2,a/\sqrt{3}),(a/2,a/\sqrt{2})\}. The persistence diagram is visualized by a scatter plot or a 2D histogram (Fig. 1(f)). The two pairs correspond to a regular triangle and a square. We can extract the geometric information of the pointcloud in Fig. 1(a) using the PD.

It is beneficial in data analysis using PH to detect the original homological structures corresponding to each birth-death pair. This is sometimes called “inverse analysis on PH”. Some applications of PH[18, 19] already use inverse analysis on PH. However, detection is not an easy task, as the representative cycle corresponding to a homology generator is not unique.

To solve this problem, various methods involving the solution of an optimization problem in homology algebra has been proposed[9, 17, 12, 9, 16, 11, 24, 32, 29] in various settings. For PH, optimal cycles[17], volume-optimal cycles[24], and persistent 1-cycles[12] have been proposed. These studies give formalizations suitable to PH. The “tightest” or “minimal” cycle is considered the best to interpret, and solving the optimization problem in homology algebra produces the tightest cycle. As an example, consider the simplicial complex in Fig. 2. The 1st homology group of the simplicial complex H1H_{1} is isomorphic to \mathbb{Z}. z1z_{1} and z2z_{2} give the same generator of H1H_{1}, but z1z_{1} is superior since the z1z_{1} surrounds the hole more tightly than z2z_{2}. The above methods find z1z_{1} by using the prescribed optimization technique.

It should be noted that there are two choices of functions to be optimized. One is the size of the cycle; the other is the internal volume of the cycle. In Fig. 2, z1z_{1} is tighter than z2z_{2} in both senses; however, it is possible for the solutions to the two optimization problems to differ. The better choice depends on the problem. This paper mainly uses the internal volume as an optimization function since it is suitable to define stable volumes in Section 3.

Refer to caption
Figure 2: A simplicial complex with one hole

1.2 Problems of homology optimization

The existing methods have the following two problems:

  • The methods do not always give the minimal building blocks of the data

  • The result is unstable to noise

The example in Fig. 1 can be used to demonstrate these problems.

The pointcloud in Fig. 1 has two minimal building blocks, a triangle and a square. However, the optimization technique sometimes fails to find these minimal building blocks when a small noise is added. Figure 3(a) shows the pointcloud in Fig. 1(a) when a small noise is added. This figure also shows circles, and, as can be seen, a pentagon appears before either a triangle or a square appears. By applying the homology optimization technique to the data in Fig. 3(a), we produce a triangle and a pentagon, as shown in Fig. 3(b). On the other hand, consider the pointcloud in Fig. 3(c). The pointcloud here is also very close to that in Fig. 1(a); however, while a square appears first, the homology optimization technique gives a square and a triangle, as shown in Fig. 3(d).

Refer to caption
Figure 3: Two types of optimization results

The example demonstrates the following problems: (1) A small noise can change the result, and (2) The optimization technique sometimes fails to give minimal building blocks of the data. The first problem is related to the reliability of the analysis; the second problem is related to the interpretability of the result.

Previous studies [10, 8, 2, 22] have proved the stability theorem for persistence diagrams. The literature indicates that a PD is continuously changed by a small noise in the input data if we consider reasonable metrics. The stability of other PH outputs has also been studied in the context of machine learning and PH[4, 21, 1] or pointcloud summary method[20, 30]. Although these types of stability play an important role in the study of PH, such a stability theorem does not hold for the solution of homology optimization problems.

Both in Fig. 3(a) and Fig. 3(c), the minimal building blocks are a triangle and a square, and it is more natural that an inverse analysis technique gives a triangle and a square for both data. Therefore, we would hope to detect such structures even if a small noise is added.

These problems are practically significant, especially when we analyze crystalline structures using PH. We can demonstrate the problems using the synthetic crystalline data in Fig. 4. Here, the pointcloud consists of 27 points, arranged in a 3x3x3 cubical lattice (Fig. 4(a)). The distance between two vertices on the cube is 11. A small noise is added to the pointcloud. Figure 4(b) shows the 1st PD of the pointcloud. As can be seen here, some birth-death pairs are concentrated around (1/4,1/2)(0.5,0.7)(1/4,1/\sqrt{2})\approx(0.5,0.7). These pairs correspond to 1x1 squares in the lattice. By applying volume-optimal cycles[24], we found some loops corresponding to the pairs in Fig. 4(c) and (d). Some of the cycles shown in (c) are squares, as expected; however, others in (d) are larger structures resembling chairs. These larger structures were detected with the same mechanism in Fig. 3.

Refer to caption
Figure 4: Optimal volumes for 3x3x3 cubical lattice. (a) The pointcloud, (b) 1st PD, (c) Optimal volumes of two pairs, (d) Optimal volumes of other two pairs

The purpose of this paper is to present a new method to solve these problems. That is, we propose a method to find minimal building blocks that is robust to noise.

1.3 Statistical approach in previous research

Bendich, Bubenik, and Wagner [3] proposed a statistical approach for the problem, which can be outlined as follows:

  1. 1.

    Computing a PD from the input data and choosing a birth-death pair to analyze

  2. 2.

    Adding a small noise to the input data, computing a PD, and applying inverse analysis

  3. 3.

    Repeating (2) multiple times

  4. 4.

    Computing an average of the results of the inverse analysis

We can flexibly change the method of inverse analysis and the way in which the average of the results is computed. We can combine the statistical approach with optimal or volume-optimal cycles are as follows:

Algorithm 1 Computing persistence trees by the merge-tree algorithm.
  1. 1.

    Computing a PD from the input pointcloud and choosing a birth-death pair to analyze

  2. 2.

    Adding a small noise to the input data and computing the PD and optimal cycles or volume-optimal cycles of the corresponding birth-death pair

  3. 3.

    Repeating (2) multiple times

  4. 4.

    For each point in the pointcloud, computing the frequency of (the point)(optimal cycle)(\text{the point})\in(\text{optimal cycle})

By applying this method to the birth-death pair (1/2,1/2)(1/2,1/\sqrt{2}) in Fig. 1, we produce the result shown in Fig. 5. The result indicates that the four points on the square are robust to noise and that the leftmost point is less robust. This result is consistent with the fact that the pair (1/2,1/2)(1/2,1/\sqrt{2}) corresponds to a square.

Refer to caption
Figure 5: Result of applying the statistical approach to the birth-death pair (1/2,1/2)(1/2,1/\sqrt{2})

The stability of the method in a probabilistic sense was also proved by Bendich, Bubenik, and Wagner. Notably, the method provides a more reliable inverse analysis. The idea is quite clever, simple, and easy to implement. However, the method has a high computation cost since it requires the user to compute PDs and optimal cycles a large number of times. In [3], the authors repeat the computation of the generators 1,000 or 10,000 times. We suspect that fewer repetitions may be sufficient, but ten or a hundred trials will be necessary. Multiple trials and errors are typically needed to tune the noise bandwidth in order to apply the method, and thus the cost is not ignorable.

1.4 Reconstructed shortest cycles in Ripserer.jl

The “reconstructed shortest cycles” functionality of Ripserer.jl[34]333https://github.com/mtsch/Ripserer.jl by Čufar gives another solution of the problem. The functionality reconstructs the tightest 1-cycle using the shortest path algorithm and the representative of persistent cohomology. The functionality accepts a noise bandwidth parameter and computes a tighter loop.

The advantage of reconstructed shortest cycles is its efficiency. Since it uses the shortest path algorithm, the computational complexity is small. However, reconstructed shortest cycles can be applied only to 1st PH since it uses the shortest path algorithm. Another disadvantage is the lack of mathematical justification of the functionality. Now444Dec. 4, 2021 the functionality is declared as experimental and gives no mathematical documentation. We explain the algorithm in Appendix A.

1.5 Results

In this paper, we propose stable volumes and a variant of stable volumes, called stable sub-volumes. The proposed method is based on the volume-optimal cycles and optimal volumes shown in [24]. Stable volumes produce minimal building blocks with a lower computation cost than the statistical approach. Below, we provide an outline of stable volumes. The exact formalization will be introduced in Section 2 and further developed in subsequent sections.

Let XX be a simplicial complex and X(k)X^{(k)} the set of all kk-simplices of XX. We define a level function as follows.

Definition 1.

r^:X\hat{r}:X\to\mathbb{R} is a level function if σσ\sigma\subsetneq\sigma^{\prime} implies r^(σ)r^(σ)\hat{r}(\sigma)\leq\hat{r}(\sigma^{\prime}).

Here, for simplicity, we assume that a level function r^\hat{r} can distinguish every simplex; that is, we assume the following:

σσ implies r^(σ)r^(σ).\sigma\not=\sigma^{\prime}\text{ implies }\hat{r}(\sigma)\not=\hat{r}(\sigma^{\prime}). (1)

From the definition of the level function, Xr^,t={σr^(σ)<t}X_{\hat{r},t}=\{\sigma\mid\hat{r}(\sigma)<t\} is a subcomplex of XX and t<tt<t^{\prime} implies Xr^,tXr^,tX_{\hat{r},t}\subseteq X_{\hat{r},t^{\prime}}. This means that {Xr^,t}t\{X_{\hat{r},t}\}_{t\in\mathbb{R}} is a filtration of simplicial complexes. From the filtration, the kkth PD, PDk({Xr^,t})\mathrm{PD}_{k}(\{X_{\hat{r},t}\}), is defined. Any birth-death pair in the diagram can be written as (b,d)=(r^(τb),r^(ωd))(b,d)=(\hat{r}(\tau_{b}),\hat{r}(\omega_{d})), where τb\tau_{b} is a kk-simplex and ωd\omega_{d} is a (k+1)(k+1)-simplex due to the theory of PH.

The optimal volume of the pair (b,d)(b,d) is defined by the solution to the following minimization problem:

minimize z0, s. t.\displaystyle\text{minimize }\|z\|_{0},\text{ s. t. } (2)
z\displaystyle z =ωd+ωk+1αωωCk+1(X;𝕜),\displaystyle=\omega_{d}+\sum_{\omega\in\mathcal{F}_{k+1}}\alpha_{\omega}\omega\in C_{k+1}(X;\Bbbk),
τ(z)\displaystyle\tau^{*}(\partial z) =0 for every τk,\displaystyle=0\text{ for every }\tau\in\mathcal{F}_{k},
τb(z)\displaystyle\tau_{b}^{*}(\partial z) 0,\displaystyle\not=0,

where Ck+1(X;𝕜)C_{k+1}(X;\Bbbk) is a (k+1)(k+1)th chain complex whose coefficient field is 𝕜\Bbbk, k={σX(k)r^(τd)<r^(σ)<r^(σd)}\mathcal{F}_{k}=\{\sigma\in X^{(k)}\mid\hat{r}(\tau_{d})<\hat{r}(\sigma)<\hat{r}(\sigma_{d})\}, τ\tau^{*} is an element of the dual basis of the cochain complex Ck(X;𝕜)C^{k}(X;\Bbbk), and z0={ωαω0}\|z\|_{0}=\{\omega\mid\alpha_{\omega}\not=0\} is the 0\ell^{0} norm of zz555The 0\ell^{0} norm is, in fact, a norm since the triangle inequality does not hold. However, this is often called the 0\ell^{0} norm in the context of machine learning and sparse modeling.. The optimal volume minimizes the internal volume surrounded by the loop. The optimal volume is defined on 𝕜=/2\Bbbk=\mathbb{Z}/2\mathbb{Z}; however, in practice, optimal volumes are determined by using linear programming with a change of coefficient field from /2\mathbb{Z}/2\mathbb{Z} to \mathbb{R} and an approximation of the optimization of the 0\ell^{0} norm by the 1\ell^{1} norm. For optimal volume zz, z\partial z is called a volume-optimal cycle. Reference [24] shows that optimal volumes and volume-optimal cycles have good mathematical properties and provide good representations of a birth-death pair on a PD.

The stable volume for the pair (b,d)(b,d) with a noise bandwidth parameter ϵ\epsilon is defined by the solution to the following minimization problem:

minimize z0, s. t.\displaystyle\text{minimize }\|z\|_{0},\text{ s. t. } (3)
z\displaystyle z =ωd+ωϵ,k+1αωσCk+1(X;𝕜),\displaystyle=\omega_{d}+\sum_{\omega\in\mathcal{F}_{\epsilon,k+1}}\alpha_{\omega}\sigma\in C_{k+1}(X;\Bbbk),
τ(z)\displaystyle\tau^{*}(\partial z) =0 for every τϵ,k,\displaystyle=0\text{ for every }\tau\in\mathcal{F}_{\epsilon,k},

where ϵ,k={σX(k)r^(τd)+ϵr^(σ)<r^(ωd)}\mathcal{F}_{\epsilon,k}=\{\sigma\in X^{(k)}\mid\hat{r}(\tau_{d})+\epsilon\leq\hat{r}(\sigma)<\hat{r}(\omega_{d})\}. This is quite similar to the definition of optimal volumes. The following theorem states that the stable volume is considered to be the “robust part” against noise.

Theorem 2.

Let XX be an nn-dimensional simplicial complex, and let r^\hat{r} be a level function. Assume that XX is embedded in n\mathbb{R}^{n}. We consider 𝕜=/2\Bbbk=\mathbb{Z}/2\mathbb{Z} as a coefficient field. For any (b,d)PDn1({Xr^,t}t)(b,d)\in\mathrm{PD}_{n-1}(\{X_{\hat{r},t}\}_{t\in\mathbb{R}}), OV(r^,b,d)\mathrm{OV}(\hat{r},b,d) and SVϵ(r^,b,d)\mathrm{SV}_{\epsilon}(\hat{r},b,d) are defined as follows:

OV(r^,b,d)=\displaystyle\mathrm{OV}(\hat{r},b,d)= {ωX(n)ω(z)0, z is an optimal volume of (b,d)},\displaystyle\{\omega\in X^{(n)}\mid\omega^{*}(z)\not=0,\text{ $z$ is an optimal volume of $(b,d)$}\}, (4)
SVϵ(r^,b,d)=\displaystyle\mathrm{SV}_{\epsilon}(\hat{r},b,d)= {ωX(n)ω(z)0,z is a stable volume of (b,d)\displaystyle\{\omega\in X^{(n)}\mid\omega^{*}(z)\not=0,\ z\text{ is a stable volume of }(b,d)
with noise bandwidth parameter ϵ}.\displaystyle\text{ with noise bandwidth parameter }\epsilon\}.

Then the following holds for sufficiently small ϵ>0\epsilon>0.

SVϵ(r^,b,d)=q^ϵOV(q^,bq^,ωd,q^(ωd)),\displaystyle\mathrm{SV}_{\epsilon}(\hat{r},b,d)=\cap_{\hat{q}\in\mathcal{R}_{\epsilon}}\mathrm{OV}(\hat{q},b_{\hat{q},\omega_{d}},\hat{q}(\omega_{d})), (5)

where ωd\omega_{d}, ϵ\mathcal{R}_{\epsilon} and bq^,ωdb_{\hat{q},\omega_{d}} are defined as follows:

d\displaystyle d =r^(ωd),\displaystyle=\hat{r}(\omega_{d}), (6)
ϵ\displaystyle\mathcal{R}_{\epsilon} ={q^:a level function satisfying the condition (1)\displaystyle=\{\hat{q}:\text{a level function satisfying the condition \eqref{eq:distinguishability}}\mid
q^r^<ϵ/2, and\displaystyle\ \ \|\hat{q}-\hat{r}\|_{\infty}<\epsilon/2,\text{ and }
r^(σ)<r^(ωd) implies q^(σ)<q^(ωd)\displaystyle\ \ \hat{r}(\sigma)<\hat{r}(\omega_{d})\text{ implies }\hat{q}(\sigma)<\hat{q}(\omega_{d})
},\displaystyle\}, (7)
bq^,ωd\displaystyle b_{\hat{q},\omega_{d}} : a birth time paired with q^(σd) in PDn1({Xq^,t}t}.\displaystyle:\text{ a birth time paired with }\hat{q}(\sigma_{d})\text{ in }PD_{n-1}(\{X_{\hat{q},t}\}_{t\in\mathbb{R}}\}. (8)

We can regard ϵ\mathcal{R}_{\epsilon} as the set of all possible level functions with small noise ϵ/2\epsilon/2. Then the theorem states that the stable volume is the common area of all possible optimal volumes under small noise. This theorem does not hold for the PH of another dimension, but it suggests that stable volumes give a better result than optimal volumes.

For practical applications, assumption (1) is overly strict. Vietoris-Rips, Čech, and alpha filtrations do not satisfy the condition. Therefore, we introduce an order with levels in Section 2.2 and formalize all of them using the order with levels.

As the stable volume method has already been implemented in HomCloud [25], stable volumes can now be used to analyze the data.

We can demonstrate stable volumes by using the same input data as in Fig. 4. Figure 6 shows the optimal volumes (Fig. 6(a)) and stable volumes (Fig. 6(b)) of the same two birth-death pairs. The birth-death pairs correspond to 1x1 squares in the lattice points, but Fig. 6(a) shows larger loops than expected. In contrast Fig. 6(b) shows the expected 1x1 squares. This example suggests that stable volumes give better results and help promote a better intuitive understanding of PDs.

Refer to caption
Figure 6: Stable volumes computed for 3x3x3 cubical lattice

Figure 7 shows another example of stable volume. Here, we consider 2D lattice points with defects (Fig. 7 (a)). The configuration of the points is somewhat perturbed from the complete lattice, and some points are removed randomly. Figure 7(b) shows the 1st PD of the points, and Fig. 7(c) shows the optimal volume of (0.496, 4.371). Figure 7(d) shows the stable volume of the same birth-death pair. The optimal volume in Fig. 7(d) seems to surround the holes in the pointcloud more naturally than is the case in Fig. 7(c).

Refer to caption
Figure 7: An optimal volume and a stable volume for a 2D lattice with defects. (a) The input pointcloud, (b) 1st PD, (c) The optimal volumes of (0.496, 4.371), (d) The stable volume of the same birth-death pair

Additional demonstrations of stable volumes using synthetic and real data are given in Section 6.

1.6 Organization of the paper

The remainder of the paper is organized as follows: Section 2 introduces the concept of PH, optimal volumes, and volume-optimal cycles. Section 3 defines stable volumes for the (n1)(n-1)th PH embedded in n\mathbb{R}^{n}. In this section we also discuss the limitation of stable volumes. Section 4 shows an alternative formalization of stable volumes using mathematical optimization. Section 4.1 provides a proof that the two formalizations are consistent for the (n1)(n-1)th PH when the simplicial complex is embedded in n\mathbb{R}^{n}. Section 4.2 extends the formalization to other cases. Section 5 discusses the software implementation of stable volumes. Section 6 gives examples using synthetic and real data. This section also provies the comparison with previous methods, the statistical method and the reconstructed shortest cycles. Section 7 discusses the way in which the noise bandwidth parameter can be tuned. Section 8 compares stable volumes and sub-volumes. Section 9 summarizes the paper and offers concluding remarks. In this section we also discuss the differences between the proposed method and the methods previous works.

2 Persistent homology, optimal volumes, and volume-optimal cycles

In this section, we introduce the mathematical concepts of PH, persistence diagrams, optimal volumes, and volume-optimal cycles, which provide the foundation for our discussion.

2.1 Persistent homology

PH is defined on a filtration of topological spaces. Let XX be a finite simplicial complex666To simplify the explanation in this paper, we use a simplicial complex. However, the content of the paper can be easily extended to cubical or cell complexes. and

𝕏:=X0X1XN=X\mathbb{X}:\emptyset=X_{0}\subsetneq X_{1}\subsetneq\cdots\subsetneq X_{N}=X (9)

be a filtration of subcomplexes of XX. The kkth persistent homology Hk(𝕏;𝕜)H_{k}(\mathbb{X};\Bbbk) with a coefficient field 𝕜\Bbbk is defined as

Hk(𝕏;𝕜):Hk(X0;𝕜)Hk(X1;𝕜)Hk(XN;𝕜),H_{k}(\mathbb{X};\Bbbk):H_{k}(X_{0};\Bbbk)\to H_{k}(X_{1};\Bbbk)\to\cdots\to H_{k}(X_{N};\Bbbk), (10)

where Hk(Xi;𝕜)H_{k}(X_{i};\Bbbk) is the kkth homology 𝕜\Bbbk-vector space of XiX_{i} and \to is the linear map induced by the inclusion maps. The fundamental theorem of PH is as follows.

Theorem A.

There exists a unique decomposition of Hk(𝕏;𝕜)H_{k}(\mathbb{X};\Bbbk) ,

Hk(𝕏;𝕜)m=1LI(bm,dm),H_{k}(\mathbb{X};\Bbbk)\simeq\bigoplus_{m=1}^{L}I(b_{m},d_{m}), (11)

where 0<bm<dm0<b_{m}<d_{m}\leq\infty, 𝕜𝕜\Bbbk\to\Bbbk is an identity map, 0𝕜,𝕜00\to\Bbbk,\Bbbk\to 0 are zero maps, and I(b,d)I(b,d) are as shown below:

I(b,d)=\displaystyle I(b,d)= 00𝕜bth𝕜(d1)th0dth0, if dm,\displaystyle 0\to\cdots\to 0\to\overbrace{\Bbbk}^{b\text{th}}\to\cdots\to\overbrace{\Bbbk}^{(d-1)\text{th}}\to\overbrace{0}^{d\text{th}}\to\cdots\to 0,\ \ \text{ if }d_{m}\not=\infty, (12)
I(b,d)=\displaystyle I(b,d)= 00𝕜bth𝕜, if dm=.\displaystyle 0\to\cdots\to 0\to\overbrace{\Bbbk}^{b\text{th}}\to\cdots\to\Bbbk,\ \ \text{ if }d_{m}=\infty. (13)

On an intuitive level, the map 0𝕜0\to\Bbbk indicates the appearance of a homology generator, 𝕜𝕜\Bbbk\to\Bbbk indicates the persistence of the generator, and 𝕜0\Bbbk\to 0 indicates the disappearance of the generator.

We note that we should use a field as a coefficient ring of the homology module since the theorem holds only when 𝕜\Bbbk is a field.

2.2 Order with level

We earlier introduced the idea of a level function. We now introduce the concept of total order consistent with the level function. Total order is used to rigorously describe the algorithms; the level function is needed to rigorously describe Theorem 2. Simplices with the same level often appear in an alpha filtration or a Vietoris-Rips filtration; thus, the concept of total order is required to deal with such filtrations. The pair consisting of the total order and the level function is called an order with level.

Let \prec be a total order on the set of all simplices of XX. We assume that

σσ implies σσ.\sigma\subsetneq\sigma^{\prime}\text{ implies }\sigma\prec\sigma^{\prime}. (14)

We can number all simplices in XX by this ordering as follows:

σ1σ2σN.\sigma_{1}\prec\sigma_{2}\prec\cdots\prec\sigma_{N}. (15)

From the condition (14), Xi={σ1,,σi}X_{i}=\{\sigma_{1},\ldots,\sigma_{i}\} is always a subcomplex of XX and

=X0X1XN=X\emptyset=X_{0}\subsetneq X_{1}\subsetneq\ldots\subsetneq X_{N}=X (16)

is a filtration of a simplicial complex. In this filtration, the number of simplices is increased one by one, which simplifies the mathematical description.

To include additional information in the filtration, we consider an order with level.

Definition 3.

A pair r=(r^,r)r=(\hat{r},\preceq_{r}) of a real-valued map on XX and a total order on XX satisfying the following conditions is called an order with level.

  1. 1.

    σrσ\sigma\preceq_{r}\sigma^{\prime} implies r^(σ)r^(σ)\hat{r}(\sigma)\leq\hat{r}(\sigma^{\prime})

  2. 2.

    The order r\preceq_{r} satisfies (14)

From the total order r\preceq_{r}, a filtration 𝕏r\mathbb{X}_{r} is defined by (15) and (16). Theorem A gives the unique decomposition Hk(𝕏r;𝕜)m=1LI(bm,dm)H_{k}(\mathbb{X}_{r};\Bbbk)\simeq\bigoplus_{m=1}^{L}I(b_{m},d_{m}). For each (bm,dm)(b_{m},d_{m}), the simplex σbm\sigma_{b_{m}} is called a birth simplex, σdm\sigma_{d_{m}} is called a death simplex, and the pair (σbm,σdm)(\sigma_{b_{m}},\sigma_{d_{m}}) is called a birth-death simplices pair. When dm=d_{m}=\infty, we write the pair as (σbm,)(\sigma_{b_{m}},\star) using the special symbol \star. Dk(𝕏r)D_{k}(\mathbb{X}_{r}) denotes the set of all birth-death simplices pairs. Moreover, r^(σbm)\hat{r}(\sigma_{b_{m}}) is called a birth time, r^(σdm)\hat{r}(\sigma_{d_{m}}) is called a death time, and the pair of the birth time and death time, (r^(σbm),r^(σdm))(\hat{r}(\sigma_{b_{m}}),\hat{r}(\sigma_{d_{m}})), is called a birth-death pair. When σdm=\sigma_{d_{m}}=\star, r^()\hat{r}(\star) is defined as ++\infty. The multiset of birth-death pairs is called a persistence diagram:

{(r^(σbm),r^(σdm))m=1,,L;r^(σbm)r^(σdm)}.\{(\hat{r}(\sigma_{b_{m}}),\hat{r}(\sigma_{d_{m}}))\mid m=1,\ldots,L;\hat{r}(\sigma_{b_{m}})\not=\hat{r}(\sigma_{d_{m}})\}. (17)

We write the persistence diagram as PDk(X,r)\mathrm{PD}_{k}(X,r).

We can readily show the following fact.

Remark 4.

For any (τ0,ω0)Dk(𝕏r)(\tau_{0},\omega_{0})\in D_{k}(\mathbb{X}_{r}), τ0\tau_{0} is a kk-simplex and σ0\sigma_{0} is a (k+1)(k+1)-simplex.

By using the level information, we can describe the stability theorem of persistence diagrams [10, 8, 2, 22]. Let PD1,PD2\mathrm{PD}_{1},\mathrm{PD}_{2} be two persistence diagrams. The bottleneck distance between two diagrams dB(PD1,PD2)d_{B}(\mathrm{PD}_{1},\mathrm{PD}_{2}) is defined as

dB(PD1,PD2)=infγsupxPD1Δxγ(x),d_{B}(\mathrm{PD}_{1},\mathrm{PD}_{2})=\inf_{\gamma}\sup_{x\in\mathrm{PD}_{1}\cup\Delta}\|x-\gamma(x)\|_{\infty}, (18)

where γ\gamma ranges over all bijections from PD1Δ\mathrm{PD}_{1}\cup\Delta to PD2Δ\mathrm{PD}_{2}\cup\Delta, Δ\Delta is the diagonal, and \|\cdot\|_{\infty} is the \ell^{\infty} norm. In our setting, the stability theorem of PH is described as below.

Theorem B (Stability theorem of PH).

Let XX be a finite simplicial complex. For any two orders with levels q=(q^,q)q=(\hat{q},\preceq_{q}) and r=(r^,r)r=(\hat{r},\preceq_{r}) and any k0k\geq 0, the following inequality holds.

dB(PDk(X,q),PDk(X,r))q^r^:=maxσX|q^(σ)r^(σ)|d_{B}(\mathrm{PD}_{k}(X,q),\mathrm{PD}_{k}(X,r))\leq\|\hat{q}-\hat{r}\|_{\infty}:=\max_{\sigma\in X}|\hat{q}(\sigma)-\hat{r}(\sigma)| (19)

The theorem ensures the robustness of PDs to noise.

2.3 Optimal volumes

Optimal volume is proposed in [24] as a way to extract homological structures corresponding to a birth-death pair. For an order with levels r=(r^,r)r=(\hat{r},\preceq_{r}) and a birth-death simplices pair (τ0,ω0)Dk(𝕏r)(\tau_{0},\omega_{0})\in D_{k}(\mathbb{X}_{r}) with ω0\omega_{0}\not=\star, the optimal volume for the pair is formalized as the solution to the following minimization problem:

minimize z0, subject to\displaystyle\text{minimize }\|z\|_{0},\text{ subject to } (20)
z\displaystyle z =ω0+ωk+1αωωCk+1(X;𝕜),\displaystyle=\omega_{0}+\sum_{\omega\in\mathcal{F}_{k+1}}\alpha_{\omega}\omega\in C_{k+1}(X;\Bbbk),
τ(z)\displaystyle\tau^{*}(\partial z) =0 for every τk,\displaystyle=0\text{ for every }\tau\in\mathcal{F}_{k},
τ0(z)\displaystyle\tau_{0}^{*}(\partial z) 0,\displaystyle\not=0,

where Ck+1(X;𝕜)C_{k+1}(X;\Bbbk) is a chain complex whose coefficient field is 𝕜\Bbbk, k={σX(k)τ0rσrω0}\mathcal{F}_{k}=\{\sigma\in X^{(k)}\mid\tau_{0}\prec_{r}\sigma\prec_{r}\omega_{0}\}, τ\tau^{*} is an element of the dual basis of cochain complex Ck(X;𝕜)C^{k}(X;\Bbbk), and z0={ωαω0}\|z\|_{0}=\{\omega\mid\alpha_{\omega}\not=0\} is the 0\ell^{0} norm of zz. For solution zz, z\partial z is called a volume-optimal cycle.

Reference [24] shows the following fact, which indicates that the volume-optimal cycle is suitable for representing a birth-death pair.

Fact 5.

Let (τ0,ω0)Dk(𝕏r)(\tau_{0},\omega_{0})\in D_{k}(\mathbb{X}_{r}) be a birth-death simplices pair and zz an optimal volume of that pair. Then the following relations hold:

z\displaystyle\partial z Zk({σXσrτ0}),\displaystyle\not\in Z_{k}(\{\sigma\in X\mid\sigma\prec_{r}\tau_{0}\}), (21)
z\displaystyle\partial z Zk({σXσrτ0}),\displaystyle\in Z_{k}(\{\sigma\in X\mid\sigma\preceq_{r}\tau_{0}\}), (22)
z\displaystyle\partial z Bk({σXσrσ0}),\displaystyle\not\in B_{k}(\{\sigma\in X\mid\sigma\prec_{r}\sigma_{0}\}), (23)
z\displaystyle\partial z Bk({σXσrσ0}),\displaystyle\in B_{k}(\{\sigma\in X\mid\sigma\preceq_{r}\sigma_{0}\}), (24)

where Zk()Z_{k}(\cdot) represents the cycles and Bk()B_{k}(\cdot) represents the boundaries.

2.4 Optimal volumes for the (n1)(n-1)th PH and persistence trees

In this section, we consider a triangulation of a convex set in n\mathbb{R}^{n} and its (n1)(n-1)th PH with n2n\geq 2. More precisely, we assume the following:

Condition 6.

A simplicial complex XX in n\mathbb{R}^{n} satisfies two conditions.

  • Any kk-simplex (k<nk<n) is a face of an nn-simplex

  • |X|:=σXσ|X|:=\bigcup_{\sigma\in X}\sigma is convex in n\mathbb{R}^{n}

Schweinhart [29] pointed out that the (n1)(n-1)th PH was isomorphic to the 0th persistent cohomology of the dual filtration by Alexander duality. Zeroth cohomology is deeply related to the connected components in the dual filtration. This gives rise to another formalization of (n1)(n-1)th PH. Reference [24] generalizes the idea shown in [29].

We now examine an order with levels rr on XX. Consider the filtration 𝕏r\mathbb{X}_{r} in (15) and (16) given by r\preceq_{r}. For simplicity, we will use 2\mathbb{Z}_{2} as the coefficient field. Then the following proposition and theorems hold [29, 24].

Proposition C.

For any birth-death pair in PDn1(X,r)\mathrm{PD}_{n-1}(X,r), the death time is not infinity.

Theorem D.

The optimal volume for any birth-death pair is uniquely determined.

Theorem E.

If zz and zz^{\prime} are the optimal volumes for two different birth-death pairs, one of the following holds:

  • zz=z\cap z^{\prime}=\emptyset

  • zzz\subsetneq z^{\prime}

  • zzz\supsetneq z^{\prime}

Note that we can naturally regard any z=ωX(n)kωσCn(X)z=\sum_{\omega\in X^{(n)}}k_{\omega}\sigma\in C_{n}(X) as a subset of nn-simplices of XX, {ωX(n)kω0}\{\omega\in X^{(n)}\mid k_{\omega}\not=0\}, since we use 2\mathbb{Z}_{2} as the homology coefficient field.

From Theorem E, we know that Dn1(𝕏r)D_{n-1}(\mathbb{X}_{r}) can be regarded as a forest (i.e., the union of distinct trees) by the inclusion relation. In [29], the forest is called persistence trees. Moreover, we can compute the persistence trees by the merge-tree algorithm (Algorithm 2).

To describe the algorithm, consider the one-point compactification space n{}Sn\mathbb{R}^{n}\cup\{\infty\}\simeq S^{n}. The following facts are well known.

Fact 7.

X{ω}X\cup\{\omega_{\infty}\} is a cell decomposition of n{}\mathbb{R}^{n}\cup\{\infty\} where ω=n{}\|X|\omega_{\infty}=\mathbb{R}^{n}\cup\{\infty\}\backslash|X|.

Fact 8.

For any τX(n1)\tau\in X^{(n-1)}, the proper cofaces777 If τ\tau is a face of ω\omega, ω\omega is called a coface of τ\tau. If the difference of the dimensions is one, ω\omega is a proper coface of τ\tau. of τ\tau are just two nn-cells in X{ω}X\cup\{\omega_{\infty}\}.

We can extend the order with levels rr onto X{ω}X\cup\{\omega_{\infty}\} by regarding ω\omega_{\infty} as the maximum element and r^(ω)=+\hat{r}(\omega_{\infty})=+\infty. To describe the algorithm, we consider a directed graph G¯\bar{G} whose nodes are nn-cells in X{ω}X\cup\{\omega_{\infty}\}. An edge has extra data in X(n1)X^{(n-1)}, and we can write the edge from ω\omega to ω\omega^{\prime} with extra data τ\tau as (ω𝜏ω)(\omega\xrightarrow{\tau}\omega^{\prime}). The directed graph G¯\bar{G} is increasingly updated in Algorithm 2.

Since the graph is always a forest throughout the algorithm[24], we can find a root of a tree that contains an nn-cell ω\omega in the graph G¯\bar{G} by recursively following the edges from ω\omega. We call this procedure Root(ω,G¯\omega,\bar{G}).

Algorithm 2 Computing persistence trees by the merge-tree algorithm.
procedure Compute-Tree(𝕏r\mathbb{X}_{r})
     initialize G¯={ω}\bar{G}=\{\omega_{\infty}\}
     for σX\sigma\in X in (r)(\preceq_{r})-descending order do \triangleright (LOOP)
         if σ\sigma is an nn-simplex then
              add σ\sigma to G¯\bar{G} as a vertex
         else if σ\sigma is an (n1)(n-1)-simplex then
              let ω1\omega_{1} and ω2\omega_{2} be two cofaces of σ\sigma
              ω1Root(ω1,G¯)\omega_{1}^{\prime}\leftarrow\textsc{Root}(\omega_{1},\bar{G})
              ω2Root(ω2,G¯)\omega_{2}^{\prime}\leftarrow\textsc{Root}(\omega_{2},\bar{G})
              if ω1=ω2\omega_{1}^{\prime}=\omega_{2}^{\prime} then
                  continue
              else if ω1rω2\omega_{1}^{\prime}\succ_{r}\omega_{2}^{\prime} then
                  Add (ω2𝜎ω1)(\omega_{2}^{\prime}\xrightarrow{\sigma}\omega_{1}^{\prime}) to G¯\bar{G} as an edge
              else
                  Add (ω1𝜎ω2)(\omega_{1}^{\prime}\xrightarrow{\sigma}\omega_{2}^{\prime}) to G¯\bar{G} as an edge                             return G¯\bar{G}

The following theorem gives the interpretation of the result of the algorithm as persistence information.

Theorem F.

Let G¯\bar{G}_{*} be the result of Algorithm 2. Then the following holds:

  1. 1.

    G¯\bar{G}_{*} is a tree whose root is ω\omega_{\infty}

  2. 2.

    Dn1(𝕏r)={(τ,ω)(ω𝜏) is an edge of G¯}D_{n-1}(\mathbb{X}_{r})=\{(\tau,\omega)\mid(\omega\xrightarrow{\tau}*)\text{ is an edge of }\bar{G}_{*}\}. Here * means another vertex

  3. 3.

    If there is an edge ω𝜏ω\omega^{\prime}\xrightarrow{\tau}\omega is in G¯\bar{G}_{*}, we have ωrω\omega^{\prime}\prec_{r}\omega

  4. 4.

    The optimal volume for (τ,ω)Dn1(𝕏r)(\tau,\omega)\in D_{n-1}(\mathbb{X}_{r}) is dec(ω,G¯)\mathrm{dec}(\omega,\bar{G}_{*}), where dec(ω,G¯)\mathrm{dec}(\omega,\bar{G}_{*}) the set of all descendant nodes of ω\omega in G¯\bar{G}_{*} including ω\omega itself

  5. 5.

    G¯\bar{G}_{*} gives the persistence trees. That is, (τ,ω)(\tau,\omega) is the parent of (τ,ω)(\tau^{\prime},\omega^{\prime}) in the persistence trees if and only if there are edges ωτω\omega^{\prime}\xrightarrow{\tau^{\prime}}\omega

We can prove the above theorems using Alexander duality. Appendix A in [23]888 This paper is the preprint version of [24]. The contents of Appendix A in [23] are omitted from [24]; thus, we sometimes refer to the preprint version. provides a detailed discussion.

3 Stable volumes for the (n-1)th PH

We can now describe stable volume for (n1)(n-1)th PH using persistence trees under Condition 6. The parameter ϵ\epsilon in the following definition is called a bandwidth parameter.

Definition 9.

Let XX be a simplicial complex and rr an order with levels on XX, and 𝕏r\mathbb{X}_{r} be the filtration of XX given by (15) and (16). Let G¯\bar{G}_{*} be persistence trees of the filtration 𝕏r\mathbb{X}_{r}. For (τ0,ω0)Dn1(𝕏r)(\tau_{0},\omega_{0})\in D_{n-1}(\mathbb{X}_{r}) and a positive number ϵ\epsilon, the stable volume of the birth-death simplices pair with a noise bandwidth parameter ϵ\epsilon, SVϵ(r,τ0,ω0)\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0}), is defined as follows:

SVϵ(r,τ0,ω0)={ω0}(ωCϵ(τ0,ω0)dec(ω,G¯)),\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0})=\{\omega_{0}\}\cup\left(\bigcup_{\omega\in C_{\epsilon}(\tau_{0},\omega_{0})}\mathrm{dec}(\omega,\bar{G}_{*})\right), (25)

where Cϵ(τ0,ω0)C_{\epsilon}(\tau_{0},\omega_{0}) is given as

Cϵ(τ0,ω0)\displaystyle C_{\epsilon}(\tau_{0},\omega_{0}) ={ωVω𝜏ω0, and r^(τ)r^(τ0)+ϵ}.\displaystyle=\{\omega\in V\mid\omega\xrightarrow{\tau}\omega_{0},\text{ and }\hat{r}(\tau)\geq\hat{r}(\tau_{0})+\epsilon\}. (26)

We specify the following (r,ω0)(r,\omega_{0})-order condition in order to describe the main theorem.

Definition 10.

Let rr and qq be two orders with levels on a simplicial complex XX and ω0\omega_{0} be an nn-simplex. Then qq satisfies an (r,ω0)(r,\omega_{0})-order condition if σrω0\sigma\prec_{r}\omega_{0} implies σqω0\sigma\prec_{q}\omega_{0} for all σX\sigma\in X.

We also define the symbol τq,ω0\tau_{q,\omega_{0}}. For an order with levels qq and an nn-simplex ω0\omega_{0}, τq,ω0\tau_{q,\omega_{0}} is the birth simplex paired with ω0\omega_{0} in Dn1(𝕏q)D_{n-1}(\mathbb{X}_{q}). That is, (τq,ω0,ω0)Dn1(𝕏q)(\tau_{q,\omega_{0}},\omega_{0})\in D_{n-1}(\mathbb{X}_{q}).

The following theorem is the first main theorem of the paper. It states that the stable volume is an invariant part of optimal volumes in the presence of small noise.

Theorem 11.
SVϵ(r,τ0,ω0)=qϵOV(q,ω0),\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0})=\bigcap_{q\in\mathcal{R}_{\epsilon}}\mathrm{OV}(q,\omega_{0}), (27)

where

ϵ={\displaystyle\mathcal{R}_{\epsilon}=\{ q=(q^,q): an order with levels \displaystyle q=(\hat{q},\preceq_{q}):\text{ an order with levels }\mid (28)
q^r^<ϵ/2 and q satisfies (r,ω0)-order condition},\displaystyle\|\hat{q}-\hat{r}\|_{\infty}<\epsilon/2\text{ and }q\text{ satisfies $(r,\omega_{0})$-order condition}\},

and

OV(q,ω0)\displaystyle\mathrm{OV}(q,\omega_{0}) =the optimal volume of (τq,ω0,ω0)Dn1(𝕏q).\displaystyle=\text{the optimal volume of }(\tau_{q,\omega_{0}},\omega_{0})\in D_{n-1}(\mathbb{X}_{q}). (29)

The theorem treats two different filtrations, 𝕏r\mathbb{X}_{r} and 𝕏q\mathbb{X}_{q}, on the same simplicial complex XX. It should be noted that the reader needs to treat these filtrations carefully.

The following two claims immediately imply Theorem 11 and are discussed in the next three subsections.

Claim 12.

For any qϵq\in\mathcal{R}_{\epsilon}, the following relation holds:

SVϵ(r,τ0,ω0)OV(q,ω0).\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0})\subseteq\mathrm{OV}(q,\omega_{0}). (30)
Claim 13.

For any ω~SVϵ(r,τ0,ω0)\tilde{\omega}\not\in\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0}), there is an order with levels qϵq\in\mathcal{R}_{\epsilon} satisfying ω~OV(q,ω0)\tilde{\omega}\not\in\mathrm{OV}(q,\omega_{0}).

3.1 Dual graph and its subgraphs

To show the theorem, we introduce a dual graph of XX and its subgraphs.

Definition 14.

VXV_{X} and EXE_{X} are defined as follows:

VX\displaystyle V_{X} =X(n){ω},\displaystyle=X^{(n)}\cup\{\omega_{\infty}\}, (31)
EX\displaystyle E_{X} =X(n1).\displaystyle=X^{(n-1)}.

GX=VXEXG_{X}=V_{X}\cup E_{X} is an undirected graph when the two endpoints of τEX\tau\in E_{X} are defined as the two cofaces of τ\tau.

We call the graph GXG_{X} the dual graph of XX. We define the subgraphs of GXG_{X}, GX(rσ)G_{X}(\succeq_{r}\sigma) and GX(r^s)G_{X}(\hat{r}\geq s), as

GX(rσ)\displaystyle G_{X}(\succeq_{r}\sigma) =VX(rσ)EX(rσ)\displaystyle=V_{X}(\succeq_{r}\sigma)\cup E_{X}(\succeq_{r}\sigma) (32)
VX(rσ)\displaystyle V_{X}(\succeq_{r}\sigma) ={ωVXωrσ},\displaystyle=\{\omega\in V_{X}\mid\omega\succeq_{r}\sigma\},
EX(rσ)\displaystyle E_{X}(\succeq_{r}\sigma) ={τEXτrσ},\displaystyle=\{\tau\in E_{X}\mid\tau\succeq_{r}\sigma\},
GX(r^s)\displaystyle G_{X}(\hat{r}\geq s) =VX(r^s)EX(r^s),\displaystyle=V_{X}(\hat{r}\geq s)\cup E_{X}(\hat{r}\geq s),
VX(r^s)\displaystyle V_{X}(\hat{r}\geq s) ={ωVXr^(ω)s},\displaystyle=\{\omega\in V_{X}\mid\hat{r}(\omega)\geq s\},
EX(r^s)\displaystyle E_{X}(\hat{r}\geq s) ={τEXr^(τ)s},\displaystyle=\{\tau\in E_{X}\mid\hat{r}(\tau)\geq s\},

where σ\sigma is a simplex in XX and ss is a real number. The condition (14) ensures that GX(rσ)G_{X}(\succeq_{r}\sigma) and GX(r^s)G_{X}(\hat{r}\geq s) are subgraphs of GXG_{X}. We also define GX(rσ)G_{X}(\succ_{r}\sigma) by replacing r\succeq_{r} with r\succ_{r}.

We introduce the notation G¯σ\bar{G}_{\sigma} as G¯\bar{G} in Algorithm 2 when the inside of the (LOOP) is finished at σ\sigma. The following propositions are essential for the proof of Theorem 11. The facts are shown as Fact 17 and Fact 18 in [23].

Fact 15.

The topological connectivity of vertices in G¯σ\bar{G}_{\sigma} is the same as GX(rσ)G_{X}(\succeq_{r}\sigma). That is, ω1,,ωkVX(rσ)\omega_{1},\ldots,\omega_{k}\in V_{X}(\succeq_{r}\sigma) are all vertices of a connected component in GX(rσ)G_{X}(\succeq_{r}\sigma) if and only if there is a tree in G¯σ\bar{G}_{\sigma} whose vertices are ω1,,ωk\omega_{1},\ldots,\omega_{k}.

Fact 16.

For each tree in G¯σ\bar{G}_{\sigma}, the root of the tree is the (r)(\preceq_{r})-maximum vertex in the tree.

The next proposition describes the role of the birth and death simplices. K(G,ω)K(G,\omega) denotes the connected component of a graph GG which contains a vertex ω\omega. KV(G,ω)K_{\mathrm{V}}(G,\omega) and KE(G,ω)K_{\mathrm{E}}(G,\omega) denote all the vertices and all the edges of K(G,ω)K(G,\omega).

Proposition 17.

Let (τ0,ω0)Dn1(𝕏r)(\tau_{0},\omega_{0})\in D_{n-1}(\mathbb{X}_{r}). Then ω0\omega_{0} is the (r)(\preceq_{r})-maximum nn-simplex in K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}) and the optimal volume of (τ0,ω0)(\tau_{0},\omega_{0}) is K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}). Furthermore, there exists ω1VX\omega_{1}\in V_{X} satisfying the following conditions:

  • ω0τ0ω1\omega_{0}\xrightarrow{\tau_{0}}\omega_{1} in G¯\bar{G}_{*}

  • ω1rω0\omega_{1}\succ_{r}\omega_{0}

  • K(GX(rτ0),ω0)=K(GX(rτ0),ω0)K(GX(rτ0),ω1){τ0}K(G_{X}(\succeq_{r}\tau_{0}),\omega_{0})=K(G_{X}(\succ_{r}\tau_{0}),\omega_{0})\cup K(G_{X}(\succ_{r}\tau_{0}),\omega_{1})\cup\{\tau_{0}\}. That is, τ0\tau_{0} is the edge between K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}) and K(GX(rτ0),ω1)K(G_{X}(\succ_{r}\tau_{0}),\omega_{1})

We can easily prove the proposition by considering Algorithm 2 and Facts 15 and 16.

3.2 Proof of Claim 12

The following equality holds from the definition of stable volume (25), Theorem F, Facts 15 and 16, and Proposition 17.

SVϵ(r,τ0,ω0)=KV(GX(r^r^(τ0)+ϵ),ω0).\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0})=K_{\mathrm{V}}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0}). (33)

The following equality also holds from Proposition 17.

OV(q,ω0)=KV(GX(qτq,ω0),ω0).\mathrm{OV}(q,\omega_{0})=K_{\mathrm{V}}(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{0}). (34)

Therefore, we can show the following relationship to prove Claim 12.

KV(GX(r^r^(τ0)+ϵ),ω0)KV(GX(qτq,ω0),ω0).K_{\mathrm{V}}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0})\subseteq K_{\mathrm{V}}(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{0}). (35)

The following lemma is essential to prove Claim 12.

Lemma 18.

Let (τ0,ω0)Dn1(𝕏r)(\tau_{0},\omega_{0})\in D_{n-1}(\mathbb{X}_{r}) be a birth-death simplices pair and qϵq\in\mathcal{R}_{\epsilon}. Then the following inequality holds:

q^(τq,ω0)ϵ/2<r^(τ0).\hat{q}(\tau_{q,\omega_{0}})-\epsilon/2<\hat{r}(\tau_{0}). (36)
Proof.

We assume that q^(τq,ω0)ϵ/2r^(τ0)\hat{q}(\tau_{q,\omega_{0}})-\epsilon/2\geq\hat{r}(\tau_{0}) and will show a contradiction. Since q^r^<ϵ/2\|\hat{q}-\hat{r}\|_{\infty}<\epsilon/2, for any σGX\sigma\in G_{X} with τq,ω0qσ\tau_{q,\omega_{0}}\preceq_{q}\sigma, we have

r^(τ0)q^(τq,ω0)ϵ/2q^(σ)ϵ/2<(r^(σ)+ϵ/2)ϵ/2=r^(σ).\hat{r}(\tau_{0})\leq\hat{q}(\tau_{q,\omega_{0}})-\epsilon/2\leq\hat{q}(\sigma)-\epsilon/2<(\hat{r}(\sigma)+\epsilon/2)-\epsilon/2=\hat{r}(\sigma). (37)

From the definition of order with levels, the inequality immediately implies τ0rσ\tau_{0}\prec_{r}\sigma and GX(qτq,ω0)G_{X}(\succeq_{q}\tau_{q,\omega_{0}}) is a subgraph of GX(rτ0)G_{X}(\succ_{r}\tau_{0}).

Since (τq,ω0,ω0)Dn1(𝕏q)(\tau_{q,\omega_{0}},\omega_{0})\in D_{n-1}(\mathbb{X}_{q}), Proposition 17 gives ω1VX\omega_{1}\in V_{X} satisfying

ω1\displaystyle\omega_{1} qω0,\displaystyle\succ_{q}\omega_{0}, (38)
K(GX(qτq,ω0),ω0)\displaystyle K(G_{X}(\succeq_{q}\tau_{q,\omega_{0}}),\omega_{0}) =K(GX(qτq,ω0),ω0)K(GX(qτq,ω0),ω1){τq,ω0}.\displaystyle=K(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{0})\cup K(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{1})\cup\{\tau_{q,\omega_{0}}\}. (39)

(38) leads to ω1rω0\omega_{1}\succ_{r}\omega_{0} from the (r,ω0)(r,\omega_{0})-order condition. At the same time, (39) leads to ω1K(GX(rτ0),ω0)\omega_{1}\in K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}) since GX(qτq,ω0)G_{X}(\succeq_{q}\tau_{q,\omega_{0}}) is a subgraph of GX(rτ0)G_{X}(\succ_{r}\tau_{0}). These facts lead to a contradiction since ω0\omega_{0} is the (r)(\preceq_{r})-maximum vertex in K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}), but K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}) contains ω1\omega_{1} and ω1rω0\omega_{1}\succ_{r}\omega_{0}. ∎

Using Lemma 18, we prove that GX(r^r^(τ0)+ϵ)G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon) is a subgraph of GX(qτq,ω0)G_{X}(\succ_{q}\tau_{q,\omega_{0}}) to show (35). For any σGX(r^r^(τ0)+ϵ)\sigma\in G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon), we have

q^(σ)>r^(σ)ϵ/2(r^(τ0)+ϵ)ϵ/2=r^(τ0)+ϵ/2>q^(τq,ω0),\hat{q}(\sigma)>\hat{r}(\sigma)-\epsilon/2\geq(\hat{r}(\tau_{0})+\epsilon)-\epsilon/2=\hat{r}(\tau_{0})+\epsilon/2>\hat{q}(\tau_{q,\omega_{0}}), (40)

and hence σqτq,ω0\sigma\succ_{q}\tau_{q,\omega_{0}}. This means that GX(r^r^(τ0)+ϵ)G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon) is a subgraph of GX(qτq,ω0)G_{X}(\succ_{q}\tau_{q,\omega_{0}}).

3.3 Proof of Claim 13

If ω~OV(r,ω0)\tilde{\omega}\not\in OV(r,\omega_{0}), the conclusion of Claim 13 is trivial. Therefore, we assume ω~OV(r,ω0)\tilde{\omega}\in OV(r,\omega_{0}).

Since

ω~\displaystyle\tilde{\omega} OV(r,ω0)=KV(GX(rτ0),ω0),\displaystyle\in\mathrm{OV}(r,\omega_{0})=K_{\mathrm{V}}(G_{X}(\succ_{r}\tau_{0}),\omega_{0}),
ω~\displaystyle\tilde{\omega} SVϵ(r,τ0,ω0)=KV(GX(r^r^(τ0)+ϵ),ω0),\displaystyle\not\in\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0})=K_{\mathrm{V}}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0}),

and GX(r^>r^(τ0)+ϵ)G_{X}(\hat{r}>\hat{r}(\tau_{0})+\epsilon) is a subgraph of GX(rτ)G_{X}(\succ_{r}\tau), there is τ~EX\tilde{\tau}\in E_{X} satisfying the following conditions:

τ0\displaystyle\tau_{0} rτ~,\displaystyle\prec_{r}\tilde{\tau}, (41)
r^(τ0)\displaystyle\hat{r}(\tau_{0}) r^(τ~)<r^(τ0)+ϵ,\displaystyle\leq\hat{r}(\tilde{\tau})<\hat{r}(\tau_{0})+\epsilon,
ω~\displaystyle\tilde{\omega} KV(GX(rτ~),ω0),\displaystyle\not\in K_{\mathrm{V}}(G_{X}(\succ_{r}\tilde{\tau}),\omega_{0}),
ω~\displaystyle\tilde{\omega} KV(GX(rτ~),ω0).\displaystyle\in K_{\mathrm{V}}(G_{X}(\succeq_{r}\tilde{\tau}),\omega_{0}).

Let ϵ=r^(τ~)r^(τ0)\epsilon_{*}=\hat{r}(\tilde{\tau})-\hat{r}(\tau_{0}) and η=(ϵ+ϵ)/4\eta=(\epsilon_{*}+\epsilon)/4. From (41), we have

ϵ/2\displaystyle\epsilon/2 >η>ϵ/20.\displaystyle>\eta>\epsilon_{*}/2\geq 0. (42)

Proposition 17 gives ω1VX\omega_{1}\in V_{X} satisfying

ω1\displaystyle\omega_{1} rω0, and\displaystyle\succ_{r}\omega_{0},\text{ and } (43)
K(GX(rτ),ω0)\displaystyle K(G_{X}(\succeq_{r}\tau),\omega_{0}) =K(GX(rτ0),ω0)K(GX(rτ0),ω1){τ0}.\displaystyle=K(G_{X}(\succ_{r}\tau_{0}),\omega_{0})\cup K(G_{X}(\succ_{r}\tau_{0}),\omega_{1})\cup\{\tau_{0}\}. (44)

Let ωe\omega_{e} be the endpoint of τ0\tau_{0} contained in K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}). A path from ω0\omega_{0} to ωe\omega_{e} exists in K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}). We consider the following two cases regarding the path.

Case 1

There is a path PP from ω0\omega_{0} to ωe\omega_{e} in GX(rτ0)G_{X}(\succ_{r}\tau_{0}) without passing through K(GX(rτ~),ω~)K(G_{X}(\succ_{r}\tilde{\tau}),\tilde{\omega}).

Case 2

Any path ω0\omega_{0} to ωe\omega_{e} in K(GX(rτ0)K(G_{X}(\succ_{r}\tau_{0}) passes through K(GX(rτ~),ω~)K(G_{X}(\succ_{r}\tilde{\tau}),\tilde{\omega}).

We will divide the proof into the above two cases.

3.3.1 Case 1

Figure 8 shows the relationship between connected components and the path PP in Case 1. We can construct an order with levels qq satisfying ω~KV(GX(qτq,ω0),ω0)\tilde{\omega}\not\in K_{\mathrm{V}}(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{0}) based on this relationship.

Refer to caption
Figure 8: The relationship between connected components and path PP in Case 1. Circles represent vertices, rectangles by dotted lines represent connected components, solid lines represent edges between connected components, and dashed curves represent paths.

Using η\eta, we define a function q^\hat{q} on XX as follows:

q^(σ)={r^(σ)+ηif σX(n),r^(σ)+ηif σQE,r^(σ)ηif σX(n1)\QE,r^(σ)ηif σX(0)X(n2),\hat{q}(\sigma)=\left\{\begin{array}[]{ll}\hat{r}(\sigma)+\eta&\text{if }\sigma\in X^{(n)},\\ \hat{r}(\sigma)+\eta&\text{if }\sigma\in Q_{\mathrm{E}},\\ \hat{r}(\sigma)-\eta&\text{if }\sigma\in X^{(n-1)}\backslash Q_{\mathrm{E}},\\ \hat{r}(\sigma)-\eta&\text{if }\sigma\in X^{(0)}\cup\cdots\cup X^{(n-2)},\end{array}\right. (45)

where Q=P{τ0}K(GX(rτ),ω1)Q=P\cup\{\tau_{0}\}\cup K(G_{X}(\succ_{r}\tau),\omega_{1}), and QEQ_{\mathrm{E}} is the set of all edges in QQ. We can easily show that q^\hat{q} is a level function from the fact that r^\hat{r} is a level function. We define the order on X{ω}X\cup\{\omega_{\infty}\}, q\preceq_{q}, as follows:

σqσ if and only if\displaystyle\sigma\prec_{q}\sigma^{\prime}\text{ if and only if} (46)
q^(σ)<q^(σ), or\displaystyle\hat{q}(\sigma)<\hat{q}(\sigma^{\prime}),\text{ or }
q^(σ)=q^(σ) and σσ, or\displaystyle\hat{q}(\sigma)=\hat{q}(\sigma^{\prime})\text{ and }\sigma\subsetneq\sigma^{\prime},\text{ or }
q^(σ)=q^(σ) and σσ= and σrσ.\displaystyle\hat{q}(\sigma)=\hat{q}(\sigma^{\prime})\text{ and }\sigma\cap\sigma^{\prime}=\emptyset\text{ and }\sigma\prec_{r}\sigma^{\prime}.

This is a kind of lexicographic order by the total preorder (σ,σ)q^(σ)q^(σ)(\sigma,\sigma^{\prime})\mapsto\hat{q}(\sigma)\leq\hat{q}(\sigma^{\prime}), the partial order \subseteq, and the total order r\preceq_{r}. Therefore q\preceq_{q} is a total order.

The following two facts are easily shown.

Fact 19.

q=(q,q^)q=(\preceq_{q},\hat{q}) is an order with levels.

Fact 20.

The order q\preceq_{q} coincides with r\preceq_{r} on VXV_{X}. The same is true on QEQ_{\mathrm{E}} or X(n1)\QEX^{(n-1)}\backslash Q_{\mathrm{E}}.

Fact 21.

qϵq\in\mathcal{R}_{\epsilon}.

Proof.

From the definition, we have q^r^η<ϵ/2\|\hat{q}-\hat{r}\|\leq\eta<\epsilon/2. To show the (r,ω0)(r,\omega_{0})-order condition, we assume σrω0\sigma\prec_{r}\omega_{0} and show σrω0\sigma\prec_{r}\omega_{0}. From the assumption, we have r^(σ)r^(ω0)\hat{r}(\sigma)\leq\hat{r}(\omega_{0}). Since ω0\omega_{0} is nn-simplex, q^(ω0)=r^(ω0)+η\hat{q}(\omega_{0})=\hat{r}(\omega_{0})+\eta, and so q^(ω0)q^(σ)\hat{q}(\omega_{0})\geq\hat{q}(\sigma). Therefore, we consider the following cases:

  1. 1.

    q^(ω0)>q^(σ)\hat{q}(\omega_{0})>\hat{q}(\sigma). In this case, we can see ω0qσ\omega_{0}\succ_{q}\sigma from the definition of order with levels.

  2. 2.

    q^(ω0)=q^(σ)\hat{q}(\omega_{0})=\hat{q}(\sigma) and ω0σ\omega_{0}\supsetneq\sigma. In this case, we have ω0qσ\omega_{0}\succ_{q}\sigma from the definition of q\preceq_{q}.

  3. 3.

    q^(ω0)=q^(σ)\hat{q}(\omega_{0})=\hat{q}(\sigma) and ω0σ\omega_{0}\subsetneq\sigma. In this case, the condition ω0σ\omega_{0}\subsetneq\sigma leads to a contradiction since r\prec_{r} satisfies (14) but σrω0\sigma\prec_{r}\omega_{0}.

  4. 4.

    q^(ω0)=q^(σ)\hat{q}(\omega_{0})=\hat{q}(\sigma) and ω0σ=\omega_{0}\cap\sigma=\emptyset. In this case, we have ω0qσ\omega_{0}\succ_{q}\sigma from the assumption of σrω0\sigma\prec_{r}\omega_{0}.

In all cases except 3, we have σrω0\sigma\prec_{r}\omega_{0}. ∎

The following fact comes from Fact 20.

Fact 22.

τ0\tau_{0} is the (q)(\preceq_{q})-minimum element in QQ.

The above fact leads to the following.

Fact 23.

QQ is a subgraph of K(GX(qτ0),ω0)K(G_{X}(\succeq_{q}\tau_{0}),\omega_{0}).

Fact 24.

K(GX(qτ0),ω0)K(G_{X}(\succ_{q}\tau_{0}),\omega_{0}) and K(GX(qτ0),ω1)K(G_{X}(\succ_{q}\tau_{0}),\omega_{1}) are different connected components in GX(qτ0)G_{X}(\succ_{q}\tau_{0}).

Proof.

We assume that K(GX(qτ0),ω0)=K(GX(qτ0),ω1)K(G_{X}(\succ_{q}\tau_{0}),\omega_{0})=K(G_{X}(\succ_{q}\tau_{0}),\omega_{1}) and this will lead to a contradiction. From the assumption, there is a path QQ^{\prime} from ω0\omega_{0} to ω1\omega_{1} in GX(qτ0)G_{X}(\succ_{q}\tau_{0}). Any edge τ\tau on QQ^{\prime} satisfies τqτ0\tau\succ_{q}\tau_{0} and so q^(τ)q^(τ0)\hat{q}(\tau)\geq\hat{q}(\tau_{0}). Therefore, since |r^(τ)q^(τ)|η|\hat{r}(\tau)-\hat{q}(\tau)|\leq\eta and q^(τ0)=r^(τ0)+η\hat{q}(\tau_{0})=\hat{r}(\tau_{0})+\eta, the inequality r^(τ)r^(τ0)\hat{r}(\tau)\geq\hat{r}(\tau_{0}) holds for any τQE\tau\in Q_{\mathrm{E}}^{\prime}, where QEQ_{\mathrm{E}}^{\prime} is the set of all edges on QQ^{\prime}.

We conclude that τrτ0\tau\succ_{r}\tau_{0} for any τQE\tau\in Q_{\mathrm{E}}^{\prime} from (46), τqτ0\tau\succ_{q}\tau_{0}, and r^(τ)r^(τ0)\hat{r}(\tau)\geq\hat{r}(\tau_{0}). This means that QQ^{\prime} is a subgraph of GX(rτ0)G_{X}(\succ_{r}\tau_{0}). However, this contradicts the fact that ω0\omega_{0} and ω1\omega_{1} are contained in different connected components of GX(rτ0)G_{X}(\succ_{r}\tau_{0}). ∎

From Facts 20, 23, and 24, and (43), we have the following fact.

Fact 25.

τq,ω0=τ0\tau_{q,\omega_{0}}=\tau_{0}

Next, we examine τ~\tilde{\tau}.

Fact 26.

τ0qτ~\tau_{0}\succ_{q}\tilde{\tau}.

Proof.

Since τ0QE\tau_{0}\in Q_{\mathrm{E}} and τ~QE\tilde{\tau}\not\in Q_{\mathrm{E}}, we have

q^(τ0)q^(τ~)\displaystyle\hat{q}(\tau_{0})-\hat{q}(\tilde{\tau}) =(r^(τ0)+η)(r^(τ~)η)\displaystyle=(\hat{r}(\tau_{0})+\eta)-(\hat{r}(\tilde{\tau})-\eta)
=r^(τ0)r^(τ~)2η\displaystyle=\hat{r}(\tau_{0})-\hat{r}(\tilde{\tau})-2\eta
=ϵ2η>0.\displaystyle=\epsilon_{*}-2\eta>0.

The inequality leads to τ0qτ~\tau_{0}\succ_{q}\tilde{\tau} since qq is an order with levels. ∎

We can prove the following fact in a similar way to Fact 24.

Fact 27.

ω~K(GX(qτ~),ω0)\tilde{\omega}\not\in K(G_{X}(\succ_{q}\tilde{\tau}),\omega_{0}).

From Facts 25, 26, and 27, the following fact is easily established.

Fact 28.

ω~K(GX(qτq,ω0),ω0)\tilde{\omega}\not\in K(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{0}).

Finally, since OV(q,ω0)=KV(GX(qτq,ω0),ω0)\mathrm{OV}(q,\omega_{0})=K_{\mathrm{V}}(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{0}), we conclude that ω~OV(q,ω0)\tilde{\omega}\not\in\mathrm{OV}(q,\omega_{0}).

3.3.2 Case 2

In Case 2, there is a path TT from K(GX(rτ~),ω~)K(G_{X}(\succ_{r}\tilde{\tau}),\tilde{\omega}) to ωe\omega_{e} without passing through K(GX(rτ~),ω0)K(G_{X}(\succ_{r}\tilde{\tau}),\omega_{0}). Figure 9 shows the relationship between connected components and paths between the components in this case.

Refer to caption
Figure 9: The relationship between connected components and path TT in Case 2. Circles represent vertices, rectangles by dotted lines represent connected components, solid lines represent edges between connected components, and dashed curves represent paths.

We now define a function s^\hat{s} on XX as follows:

s^(σ)={r^(σ)+ηif σX(n),r^(σ)+ηif σSE,r^(σ)ηif σX(n1)\SE,r^(σ)ηif σX(0)X(n2),\hat{s}(\sigma)=\left\{\begin{array}[]{ll}\hat{r}(\sigma)+\eta&\text{if }\sigma\in X^{(n)},\\ \hat{r}(\sigma)+\eta&\text{if }\sigma\in S_{\mathrm{E}},\\ \hat{r}(\sigma)-\eta&\text{if }\sigma\in X^{(n-1)}\backslash S_{\mathrm{E}},\\ \hat{r}(\sigma)-\eta&\text{if }\sigma\in X^{(0)}\cup\cdots\cup X^{(n-2)},\end{array}\right. (47)

where S=K(GX(rτ~),ω~)T{τ0}K(GX(rτ0),ω1)S=K(G_{X}(\succ_{r}\tilde{\tau}),\tilde{\omega})\cup T\cup\{\tau_{0}\}\cup K(G_{X}(\succ_{r}\tau_{0}),\omega_{1}) and SES_{\mathrm{E}} is the set of all edges of SS.

We also define the total order on XX, s\preceq_{s}, as follows:

σsσ if and only if\displaystyle\sigma\prec_{s}\sigma^{\prime}\text{ if and only if} (48)
s^(σ)<s^(σ), or\displaystyle\hat{s}(\sigma)<\hat{s}(\sigma^{\prime}),\text{ or }
s^(σ)=s^(σ) and σσ, or\displaystyle\hat{s}(\sigma)=\hat{s}(\sigma^{\prime})\text{ and }\sigma\subsetneq\sigma^{\prime},\text{ or }
s^(σ)=s^(σ) and σσ= and σrσ.\displaystyle\hat{s}(\sigma)=\hat{s}(\sigma^{\prime})\text{ and }\sigma\cap\sigma^{\prime}=\emptyset\text{ and }\sigma\prec_{r}\sigma^{\prime}.

We can prove the following fact in a similar way to Facts 19 and 21.

Fact 29.

s=(s,s^)s=(\preceq_{s},\hat{s}) is an order with levels on XX, and sϵs\in\mathcal{R}_{\epsilon}.

The following fact arises directly from the definition of s^\hat{s} and s\preceq_{s}.

Fact 30.

The order s\preceq_{s} coincides with r\preceq_{r} on VXV_{X}. The same is true on QEQ_{\mathrm{E}} or X(n1)\QEX^{(n-1)}\backslash Q_{\mathrm{E}}.

The following fact can be shown in a similar way to Facts 22 and 23.

Fact 31.

SS is a subgraph of K(GX(sτ0),ω~)K(G_{X}(\succeq_{s}\tau_{0}),\tilde{\omega}).

We can show the following fact in the same way as Fact 26. From the following fact, we have K(GX(sτ0),ω~)K(GX(sτ~),ω~)K(G_{X}(\succeq_{s}\tau_{0}),\tilde{\omega})\subseteq K(G_{X}(\succ_{s}\tilde{\tau}),\tilde{\omega}).

Fact 32.

τ0sτ~\tau_{0}\succ_{s}\tilde{\tau}.

We can prove the following in a similar way to Facts 24 and 25 using Facts 30, 31, and 32.

Fact 33.

K(GX(sτ~),ω~)K(G_{X}(\succ_{s}\tilde{\tau}),\tilde{\omega}) and K(GX(sτ~),ω0)K(G_{X}(\succ_{s}\tilde{\tau}),\omega_{0}) are different connected components in GX(sτ~)G_{X}(\succ_{s}\tilde{\tau}) and τ~\tilde{\tau} connects the two connected components. Therefore, τs,ω0=τ~\tau_{s,\omega_{0}}=\tilde{\tau}.

From the above facts, we have the following.

Fact 34.

ω~K(GX(sτs,ω0),ω0)\tilde{\omega}\not\in K(G_{X}(\succ_{s}\tau_{s,\omega_{0}}),\omega_{0}).

Finally, since OV(q,ω0)=KV(GX(qτq,ω0),ω0)\mathrm{OV}(q,\omega_{0})=K_{\mathrm{V}}(G_{X}(\succ_{q}\tau_{q,\omega_{0}}),\omega_{0}), we conclude that ω~OV(q,ω0)\tilde{\omega}\not\in\mathrm{OV}(q,\omega_{0}).

3.4 Limitation of stable volumes

In this subsection, we discuss some of the limitations of stable volumes.

The first relates to the (r,ω0)(r,\omega_{0})-order condition. Theorem 11 requires the (r,ω0)(r,\omega_{0})-order condition; however, this condition seems somewhat curious. To clarify, we offer a brief discussion of the role of the condition.

First, we examine what happens when orders with levels qq and rr satisfy q^r^ϵ/2\|\hat{q}-\hat{r}\|\leq\epsilon/2 but not the (r,ω0)(r,\omega_{0})-order condition. Figure 10 shows an example. In this example, we assume the following conditions:

Y\displaystyle Y =Y0{τ1,τ2,ω1,ω2} is a simplicial complex,\displaystyle=Y_{0}\cup\{\tau_{1},\tau_{2},\omega_{1},\omega_{2}\}\text{ is a simplicial complex}, (49)
r^(σ)\displaystyle\hat{r}(\sigma) q^(σ) for any σY,\displaystyle\approx\hat{q}(\sigma)\text{ for any }\sigma\in Y, (50)
r^(ω1)\displaystyle\hat{r}(\omega_{1}) r^(ω2),q^(ω1)q^(ω2),\displaystyle\approx\hat{r}(\omega_{2}),\hat{q}(\omega_{1})\approx\hat{q}(\omega_{2}), (51)
r^(σ)\displaystyle\hat{r}(\sigma) <r^(τ1) for any σY0,\displaystyle<\hat{r}(\tau_{1})\text{ for any }\sigma\in Y_{0}, (52)
q^(σ)\displaystyle\hat{q}(\sigma) <q^(τ1) for any σY0,\displaystyle<\hat{q}(\tau_{1})\text{ for any }\sigma\in Y_{0}, (53)
r^(τ1)\displaystyle\hat{r}(\tau_{1}) <r^(τ2)<r^(ω1)<r^(ω2) for any σY0,\displaystyle<\hat{r}(\tau_{2})<\hat{r}(\omega_{1})<\hat{r}(\omega_{2})\text{ for any }\sigma\in Y_{0}, (54)
q^(τ1)\displaystyle\hat{q}(\tau_{1}) <q^(τ2)<q^(ω2)<q^(ω1) for any σY0.\displaystyle<\hat{q}(\tau_{2})<\hat{q}(\omega_{2})<\hat{q}(\omega_{1})\text{ for any }\sigma\in Y_{0}. (55)
Refer to caption
Figure 10: Two filtrations on the same simplicial complex

Here, we have

(r^(τ1),r^(ω2)),(r^(τ2),r^(ω1))\displaystyle(\hat{r}(\tau_{1}),\hat{r}(\omega_{2})),(\hat{r}(\tau_{2}),\hat{r}(\omega_{1})) PD1(X,r),\displaystyle\in\mathrm{PD}_{1}(X,r),
(q^(τ1),q^(ω1)),(q^(τ2),q^(ω2))\displaystyle(\hat{q}(\tau_{1}),\hat{q}(\omega_{1})),(\hat{q}(\tau_{2}),\hat{q}(\omega_{2})) PD1(X,q),\displaystyle\in\mathrm{PD}_{1}(X,q),
(r^(τ1),r^(ω2))\displaystyle(\hat{r}(\tau_{1}),\hat{r}(\omega_{2})) (q^(τ1),q^(ω1)),\displaystyle\approx(\hat{q}(\tau_{1}),\hat{q}(\omega_{1})),
(r^(τ2),r^(ω1))\displaystyle(\hat{r}(\tau_{2}),\hat{r}(\omega_{1})) (q^(τ2),q^(ω2)).\displaystyle\approx(\hat{q}(\tau_{2}),\hat{q}(\omega_{2})).

The optimal volumes are

{ω1,ω2}\displaystyle\{\omega_{1},\omega_{2}\} =OV(r^,r^(τ1),r^(ω2))=OV(q^,q^(τ1),q^(ω1)),\displaystyle=\mathrm{OV}(\hat{r},\hat{r}(\tau_{1}),\hat{r}(\omega_{2}))=\mathrm{OV}(\hat{q},\hat{q}(\tau_{1}),\hat{q}(\omega_{1})), (56)
{ω1}\displaystyle\{\omega_{1}\} =OV(r^,r^(τ2),r^(ω1)),\displaystyle=\mathrm{OV}(\hat{r},\hat{r}(\tau_{2}),\hat{r}(\omega_{1})), (57)
{ω2}\displaystyle\{\omega_{2}\} =OV(q^,q^(τ2),q^(ω2)).\displaystyle=\mathrm{OV}(\hat{q},\hat{q}(\tau_{2}),\hat{q}(\omega_{2})). (58)

Therefore, (r^(τ2),r^(ω1))(q^(τ2),q^(ω2))(\hat{r}(\tau_{2}),\hat{r}(\omega_{1}))\approx(\hat{q}(\tau_{2}),\hat{q}(\omega_{2})); however, OV(r^,r^(τ2),r^(ω1))\mathrm{OV}(\hat{r},\hat{r}(\tau_{2}),\hat{r}(\omega_{1})) does not intersect with OV(q^,q^(τ2),q^(ω2))\mathrm{OV}(\hat{q},\hat{q}(\tau_{2}),\hat{q}(\omega_{2})). Thus, the example shows that OV(r^,r^(τ2),r^(ω1))\mathrm{OV}(\hat{r},\hat{r}(\tau_{2}),\hat{r}(\omega_{1})) does not have a robust part to small noise if we do not assume the (r,ω0)(r,\omega_{0})-order condition.

In contrast, (56) indicates that OV(r^,r^(τ1),r^(ω2))\mathrm{OV}(\hat{r},\hat{r}(\tau_{1}),\hat{r}(\omega_{2})) has a stable part even if we do not assume the (r,ω0)(r,\omega_{0})-order condition. We expect that if an optimal volume is large in some sense, then the optimal volume has a stable part even if the (r,ω0)(r,\omega_{0})-order condition is not satisfied. Mathematically clarifying what is meant by “large in some sense” is a subject for future research.

Next, we focus on the construction of simplicial filtrations. An alpha complex [15] is often used to construct a filtration from a pointcloud. Alpha complexes have some good properties. If a pointcloud is in d\mathbb{R}^{d} and satisfies the general position condition, then the alpha shape is embedded in d\mathbb{R}^{d}. An alpha shape has the same information as the union-balls model. More precisely, an alpha shape is homotopy equivalent to the union of balls. However, the alpha shape may vary discontinuously with noise. Theorem 11 assumes that the simplicial complex XX does not change with noise; however, the assumption does not hold if the noise is large. On the other hand, if the noise is small, the assumption holds and the theorem works perfectly. In Section 6, we show that a large noise bandwidth parameter gives unexpected results. Therefore, it is reasonable to assume that the noise is small; however, care should be taken.

4 Another formalization of a stable volume as the solution to an optimization problem

The formalization of stable volumes is based on persistence trees. Importantly, however, persistence trees are available only for the (n1)(n-1)th persistence homology, and we cannot apply the concept directly to another degree. Another version of a stable volume can be defined as the solution to an optimization problem.

Definition 35.

Let rr be an order with levels and (τ0,ω0)Dk(𝕏r)(\tau_{0},\omega_{0})\in D_{k}(\mathbb{X}_{r}). The stable volume by optimization of (τ0,ω0)(\tau_{0},\omega_{0}) is defined by the solution to the following minimization problem.

minimize z0, s. t.\displaystyle\|z\|_{0},\text{ s. t. }
z\displaystyle z =ω0+ωϵ,k+1αωωCk+1(X;𝕜),\displaystyle=\omega_{0}+\sum_{\omega\in\mathcal{F}_{\epsilon,k+1}}\alpha_{\omega}\omega\in C_{k+1}(X;\Bbbk), (59)
τ(z)\displaystyle\tau^{*}(\partial z) =0 for any τϵ,k,\displaystyle=0\text{ for any }\tau\in\mathcal{F}_{\epsilon,k}, (60)

where 𝕜\Bbbk is the coefficient field and ϵ,k={σX(k)r^(τ0)+ϵr^(σ) and σrω0}\mathcal{F}_{\epsilon,k}=\{\sigma\in X^{(k)}\mid\hat{r}(\tau_{0})+\epsilon\leq\hat{r}(\sigma)\text{ and }\sigma\prec_{r}\omega_{0}\}. SV~ϵ(r,τ0,ω0)\tilde{\mathrm{SV}}_{\epsilon}(r,\tau_{0},\omega_{0}) denotes the stable volume by optimization.

4.1 Another formalization of a stable volume: A special case

For (n1)(n-1)th PH, we will prove the following second main theorem, which guarantees the validity of the definition.

Theorem 36.

If XX satisfies Condition 6, 𝕜=/2\Bbbk=\mathbb{Z}/2\mathbb{Z}, and (τ0,ω0)Dn1(𝕏r)(\tau_{0},\omega_{0})\in D_{n-1}(\mathbb{X}_{r}), SV~ϵ(r,τ0,ω0)\tilde{\mathrm{SV}}_{\epsilon}(r,\tau_{0},\omega_{0}) coincides with SVϵ(r,τ0,ω0)\mathrm{SV}_{\epsilon}(r,\tau_{0},\omega_{0}).

Recall the remark associated with Theorem E that SV~ϵ(r,τ0,ω0)\tilde{\mathrm{SV}}_{\epsilon}(r,\tau_{0},\omega_{0}) can be regarded as a subset of X(n)X^{(n)}.

For z=ω0+ωϵ,nαωωCn(X;/2)z=\omega_{0}+\sum_{\omega\in\mathcal{F}_{\epsilon,n}}\alpha_{\omega}\omega\in C_{n}(X;\mathbb{Z}/2\mathbb{Z}), we define VzV_{z} and EzE_{z} as follows.

Vz\displaystyle V_{z} ={ωϵ,nαω=1}{ω0},\displaystyle=\{\omega\in\mathcal{F}_{\epsilon,n}\mid\alpha_{\omega}=1\}\cup\{\omega_{0}\}, (61)
Ez\displaystyle E_{z} ={τϵ,n1ωVzτ(ω)=1}.\displaystyle=\{\tau\in\mathcal{F}_{\epsilon,n-1}\mid\exists\omega\in V_{z}\ \tau^{*}(\partial\omega)=1\}.

To prove the theorem, we show the following lemma.

Lemma 37.

If zz satisfies (59) and (60), Gz:=VzEzG_{z}:=V_{z}\cup E_{z} is a subgraph of GXG_{X} and GzG_{z} is a disjoint union of some connected components of GX(r^r^(τ0)+ϵ)G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon).

We now assume that zz satisfies (59) and (60). To show the above lemma, we first show the following fact.

Fact 38.

For any τEz\tau\in E_{z}, τ\tau is not a face of ω\omega_{\infty}.

Proof.

Assume τ\tau is a face of ω\omega_{\infty}. Then there is a unique nn-simplex ω1X(n)\omega_{1}\in X^{(n)} which is the coface of τ\tau. From the definition of EzE_{z} and ωVz\omega_{\infty}\not\in V_{z}, VzV_{z} should contain ω1\omega_{1}. However,

τ(z)=τ((ωVzω))=ωVzτ(ω)=τ(ω1)=10,\tau^{*}(\partial z)=\tau^{*}(\partial(\sum_{\omega\in V_{z}}\omega))=\sum_{\omega\in V_{z}}\tau^{*}(\partial\omega)=\tau^{*}(\partial\omega_{1})=1\not=0, (62)

which contradicts (60). ∎

The following fact guarantees that GzG_{z} is a subgraph of GXG_{X}.

Fact 39.

For any τEz\tau\in E_{z}, both cofaces of τ\tau, ω1\omega_{1} and ω2\omega_{2}, are contained in VzV_{z}.

Proof.

From the definition EzE_{z}, VzV_{z} contains either ω1\omega_{1} or ω2\omega_{2} and one of the following holds: αω1=1\alpha_{\omega_{1}}=1 or αω1=1\alpha_{\omega_{1}}=1. From the condition (60),

τ(z)=ωϵ,nαωτ((ω))=αω1+αω2=0.\tau^{*}(\partial z)=\sum_{\omega\in\mathcal{F}_{\epsilon,n}}\alpha_{\omega}\tau^{*}(\partial(\omega))=\alpha_{\omega_{1}}+\alpha_{\omega_{2}}=0. (63)

Therefore, αω1=αω2=1\alpha_{\omega_{1}}=\alpha_{\omega_{2}}=1 , which means that both ω1\omega_{1} and ω2\omega_{2} are contained in VzV_{z}. ∎

Fact 40.

For ω1VX\omega_{1}\in V_{X}, Aω1A_{\omega_{1}} denotes the set of all (n1)(n-1)-dimensional faces of ω1\omega_{1}. If ω1Vz\omega_{1}\in V_{z}, then the following holds:

Aω1EX(r^r^(τ0)+ϵ)=Aω1Ez.A_{\omega_{1}}\cap E_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon)=A_{\omega_{1}}\cap E_{z}. (64)
Proof.

Since Ezϵ,n1EX(r^r^(τ0)+ϵ)E_{z}\subseteq\mathcal{F}_{\epsilon,n-1}\subseteq E_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon), Aω1EX(r^r^(τ0)+ϵ)Aω1EzA_{\omega_{1}}\cap E_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon)\supseteq A_{\omega_{1}}\cap E_{z} is trivial. Therefore, we assume τAω1EX(r^r^(τ0)+ϵ)\tau\in A_{\omega_{1}}\cap E_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon) and will show τAω1Ez\tau\in A_{\omega_{1}}\cap E_{z}. Since τrω1\tau\prec_{r}\omega_{1} and ω1Vzϵ,n{ω0}\omega_{1}\in V_{z}\subset\mathcal{F}_{\epsilon,n}\cup\{\omega_{0}\}, we have τrω0\tau\prec_{r}\omega_{0}. Therefore, τϵ,n1\tau\in\mathcal{F}_{\epsilon,n-1} since τEX(r^r^(τ0)+ϵ)\tau\in E_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon). We also have τ(ω1)=1\tau^{*}(\partial\omega_{1})=1 because τAω1\tau\in A_{\omega_{1}} and we conclude τEz\tau\in E_{z}. ∎

By repeatedly using Fact 40, we can see that any path in GX(r^r^(τ0)+ϵ)G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon) starting from ω1Vz\omega_{1}\in V_{z} is also a path in GzG_{z}. This means that GzG_{z} is a disjoint union of some connected components of GX(r^r^(τ0)+ϵ)G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon).

Lemma 41.

z1=ωKV(GX(r^r^(τ0)+ϵ),ω0)ωz_{1}=\sum_{\omega\in K_{\mathrm{V}}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0})}\omega satisfies (59) and (60).

Proof.

First, we show (59). The condition (59) is equivalent to the following inclusion relationship:

KV(GX(r^r^(τ0)+ϵ),ω0)ϵ,n{ω0}.K_{\mathrm{V}}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0})\subset\mathcal{F}_{\epsilon,n}\cup\{\omega_{0}\}. (65)

Since GX(r^r^(τ0)+ϵ)G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon) is a subgraph of GX(rτ0)G_{X}(\succ_{r}\tau_{0}) and ω0\omega_{0} is (r)(\succeq_{r})-maximum element in K(GX(rτ0),ω0)K(G_{X}(\succ_{r}\tau_{0}),\omega_{0}), ωrω0\omega\preceq_{r}\omega_{0} holds for any ωKV(GX(r^r^(τ0)+ϵ),ω0)\omega\in K_{\mathrm{V}}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0}). This equates to (65).

Next we show (60). We can show the following relationship in a similar way to (65).

KE(GX(r^r^(τ0)+ϵ),ω0)ϵ,n1.K_{E}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0})\subseteq\mathcal{F}_{\epsilon,n-1}. (66)

Let τ\tau be an element of ϵ,n1\mathcal{F}_{\epsilon,n-1}. We consider the following two cases to show τ(z)=0\tau^{*}(\partial z)=0.

  1. 1.

    τKE(GX(r^r^(τ0)+ϵ),ω0)\tau\in K_{E}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0}). In this case, KE(GX(r^r^(τ0)+ϵ),ω0)K_{E}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0}) contains both cofaces ω1,ω2\omega_{1},\omega_{2} of τ\tau, and

    τ(z1)=τ(ω1)+τ(ω2)=1+1=0.\tau^{*}(\partial z_{1})=\tau^{*}(\partial\omega_{1})+\tau^{*}(\partial\omega_{2})=1+1=0. (67)
  2. 2.

    τϵ,n1\KE(GX(r^r^(τ0)+ϵ),ω0)\tau\in\mathcal{F}_{\epsilon,n-1}\backslash K_{E}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0}). In this case, KE(GX(r^r^(τ0)+ϵ),ω0)K_{E}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0}) does not contain both cofaces of τ\tau, and τ(z1)=0\tau^{*}(\partial z_{1})=0.

In both cases, we have τ(z1)=0\tau^{*}(\partial z_{1})=0. ∎

From Lemma 37 and Lemma 41, we can show that z1z_{1} in Lemma 41 is the solution to the optimization problem in Definition 35. This means that KV(GX(r^r^(τ0)+ϵ),ω0)=SV~ϵ(r^,τ0,ω0)K_{\mathrm{V}}(G_{X}(\hat{r}\geq\hat{r}(\tau_{0})+\epsilon),\omega_{0})=\tilde{\mathrm{SV}}_{\epsilon}(\hat{r},\tau_{0},\omega_{0}). From (33), this set is equal to SV(r,τ0,ω0)\mathrm{SV}(r,\tau_{0},\omega_{0}) , which completes the proof of the theorem.

4.2 Another formalization of a stable volume: general case

We can apply Definition 35 to cases other than the above. In such cases, Theorem 36 does not hold in general, and it is difficult to mathematically ensure the good property shown in Theorem 11. Empirically, however, the stable volumes often work well. (We examine the property using computer experiments in Section 6.)

The reason for this is likely that an optimal volume is often included in a lower-dimensional structure (a submanifold or a lower-dimensional simplicial complex) and the solution of the stable volume is also included in the structure.

4.3 Stable sub-volume

In the previous subsection, we expressed our belief that a lower-dimensional structure may make the stable volume work well. We can develop a variant of stable volume using this idea. If we find such a lower-dimensional structure, we can consider the stable volume constrained on the structure. An optimal volume is one possible lower-dimensional structure. We define a stable sub-volume as follows.

Definition 42.

Let rr be an order with levels and (τ0,ω0)Dk(𝕏r)(\tau_{0},\omega_{0})\in D_{k}(\mathbb{X}_{r}). The stable sub-volume of (τ0,ω0)(\tau_{0},\omega_{0}) is defined by the solution to the following minimization problem:

minimize z0, s. t.\displaystyle\|z\|_{0},\text{ s. t. }
z\displaystyle z =ω0+ωϵ,k+1OV(r,τ0,ω0)αωωCk+1(X;𝕜),\displaystyle=\omega_{0}+\sum_{\omega\in\mathcal{F}_{\epsilon,k+1}\cap\mathrm{OV}(r,\tau_{0},\omega_{0})}\alpha_{\omega}\omega\in C_{k+1}(X;\Bbbk), (68)
τ(z)\displaystyle\tau^{*}(\partial z) =0 for any τϵ,k,\displaystyle=0\text{ for any }\tau\in\mathcal{F}_{\epsilon,k}, (69)

where 𝕜\Bbbk is the coefficient field and ϵ,k={σX(k)r^(τ0)+ϵr^(σ) and σrω0}\mathcal{F}_{\epsilon,k}=\{\sigma\in X^{(k)}\mid\hat{r}(\tau_{0})+\epsilon\leq\hat{r}(\sigma)\text{ and }\sigma\prec_{r}\omega_{0}\}.

For stable sub-volumes, the parameter ϵ\epsilon is also called a noise bandwidth parameter.

The following proposition holds for the (n1)(n-1)th PH in n\mathbb{R}^{n} since the stable volume is a subset of an optimal volume.

Proposition 43.

Let XX be a simplicial complex embedded in n\mathbb{R}^{n} satisfying Condition 6. Let rr be an order with levels. For (τ0,ω0)Dn1(𝕏r)(\tau_{0},\omega_{0})\in D_{n-1}(\mathbb{X}_{r}) and ϵ>0\epsilon>0, the stable volume of (τ0,ω0)(\tau_{0},\omega_{0}) with bandwidth parameter ϵ\epsilon coincides with the stable sub-volume of (τ0,ω0)(\tau_{0},\omega_{0}) with bandwidth parameter ϵ\epsilon.

The advantages and disadvantages of a stable sub-volume will be discussed in Section 8.

5 Implementation

In this section, we discuss how to produce the stable volume using a computer.

While we can directly implement stable volumes by persistence trees (Definition 9) using the persistence trees constructed by Algorithm 2, implementing stable volumes as an optimization problem is more difficult. As discussed in [9, 12], these kinds of optimization problems are NP-hard in general, which makes them difficult to solve on a computer. To resolve this problem, we can apply the following techniques:

  • Use \mathbb{R} as a coefficient field instead of /2\mathbb{Z}/2\mathbb{Z}

  • Change 0\ell^{0} norm to 1\ell^{1} norm

The second technique is often used in sparse modeling. Following the above approximations, the optimization problem can be formulated as a linear programming problem.

We first need to translate the problem into an acceptable form for a linear programming solver. That is, we need to specify the variables, the objective function (the function to be minimized), and the constraints in the following form:

  • The objective function should be
    (a linear combination of variables)+(constant)(\mbox{a linear combination of variables})+(\mbox{constant})

  • Each constraint should be
    (a linear combination of variables)+(constant)OP(another constant)(\mbox{a linear combination of variables})+(\mbox{constant})\ \textrm{OP}\ (\mbox{another constant}), where OP is any of the relational operators =,,=,\geq,\leq

We can translate this in the same way as in [24].

variables: αω,α¯ω for all ωϵ,k+1\displaystyle\mbox{variables: }\alpha_{\omega},\bar{\alpha}_{\omega}\mbox{ for all }\omega\in\mathcal{F}_{\epsilon,k+1}
minimize ωϵ,k+1α¯ω, subject to\displaystyle\mbox{minimize }\sum_{\omega\in\mathcal{F}_{\epsilon,k+1}}\bar{\alpha}_{\omega},\mbox{ subject to}
α¯ωαω0 for each ωϵ,k+1,\displaystyle\ \ \bar{\alpha}_{\omega}-\alpha_{\omega}\geq 0\mbox{ for each }\omega\in\mathcal{F}_{\epsilon,k+1},
α¯ω+αω0 for each ωϵ,k+1,\displaystyle\ \ \bar{\alpha}_{\omega}+\alpha_{\omega}\geq 0\mbox{ for each }\omega\in\mathcal{F}_{\epsilon,k+1},
cω0,τ+ωϵ,k+1cω,ταω=0 for each τϵ,k,\displaystyle\ \ c_{\omega_{0},\tau}+\sum_{\omega\in\mathcal{F}_{\epsilon,k+1}}c_{\omega,\tau}\ \alpha_{\omega}=0\mbox{ for each }\tau\in\mathcal{F}_{\epsilon,k},
where
cω,τ=τ(ω).\displaystyle\ \ c_{\omega,\tau}=\tau^{*}(\partial\omega).

It should be noted that \mathbb{R} is used as the coefficient field and that we need to consider the sign of cω,τc_{\omega,\tau}.

Section 4.2 in [24] introduced the idea of accelerating the computation of optimal volume using locality. The same idea is applicable to stable volumes.

The program is already in HomCloud[25] and was used for the examples in Section 6.

6 Examples

In this section, we give examples of stable volumes and stable sub-volumes. Alpha filtration[15] is used to compute the persistence diagrams.

6.1 3x3x3 lattice points

For the example shown in Fig. 4, we first produced 27 points in three-dimensional space on a lattice. Specifically, the pointcloud here is {(0,0,0),(0,0,1),,(2,2,2)}\{(0,0,0),(0,0,1),\ldots,(2,2,2)\}. We added a small noise to each point sampled from a uniform distribution on (0.05,0.05)3(-0.05,0.05)^{3}, and a persistence diagram was computed from the pointcloud (Fig. 4(b)). The diagram includes 28 birth-death pairs near (1/4,1/2)(0.5,0.7)(1/4,1/\sqrt{2})\approx(0.5,0.7), each corresponding to 1x1 squares.

Figure 4(c) shows two optimal cycles; ; Fig. 4(d) shows two other optimal cycles. Figure 6(b) shows two stable volumes of the same pairs as in Fig. 4(d) with a bandwidth parameter ϵ=0.05\epsilon=0.05.

Table 1 shows the numbers of volumes having different numbers of vertices among all optimal volumes and stable volumes . As indicated, twenty of the optimal volumes are squares, while eight are not. In contrast, all the stable volumes are squares.

Table 1: Number of volumes having a number of vertices among all optimal volumes and stable volumes
4 vertices 6 vertices 10 vertices
number of optimal volumes 20 7 1
number of stable volumes 28 0 0

We also computed stable sub-volumes in this setting and confirmed that the stable sub-volumes coincided with the stable volumes.

6.2 2D lattice with random defects

For the example shown in Fig. 7, we prepared 30x30 lattice points in a 2D space. The distance between vertices is 1. The points were randomly removed with a probability of 0.5. The points on the perimeter were not removed. We added a small noise to each point sampled from a uniform distribution on (0.1,0.1)2(-0.1,0.1)^{2}. A 1st PD was computed from the pointcloud.

Figure 7(c) shows the optimal volume of (0.496, 4.371). Figure 7(d) shows the stable volume of the same birth-death pair with bandwidth parameter ϵ=0.12\epsilon=0.12.

We also examined the effect of changing the bandwidth parameter, computing stable volumes for bandwidth parameter ϵ=0.0,0.01,0.02,,0.40\epsilon=0.0,0.01,0.02,\ldots,0.40. Figure 12 shows the graph of the bandwidth parameter versus the size of the stable volumes. Size was measured as the number of simplices in the stable volume.

Refer to caption
Figure 11: Bandwidth parameter versus size of stable volume

From the optimal volume (ϵ=0\epsilon=0), the size of the stable volume rapidly decreases. A wide plateau appears at ϵ=0.04\epsilon=0.04 and is completely flat from ϵ=0.06\epsilon=0.06 to ϵ=0.27\epsilon=0.27, meaning that the stable volume is stable to changes in bandwidth parameter over the range [0.06,0.27][0.06,0.27]. This suggests that ϵ\epsilon should be somewhere within this range. It should be noted that the scale of this range coincides with the scale of the noise. This is consistent with the fact that the bandwidth parameter indicates the strength of the virtual noise.

6.3 Comparison with previous studies

We apply the statistical method and reconstructed shortest cycles shortest using the same data as in Figure 7 to compare with our method.

Figure 7 shows the results of statistical method. We use Algorithm 1 to compute these results with a random uniform perturbation to each point. The strenght of the noise is [0.03,0.03]2[-0.03,0.03]^{2} and [0.06,0.06]2[-0.06,0.06]^{2}. We put large black dots in the figure to indicate points whose frequencies are more than 70% or 90%. The result is consistent with Fig 7 and has richer information. However the result looks more difficult to interpret than Fig 7.

Refer to caption
Figure 12: Result of the statistical method. Upper two figures: Additive noise from uniform distribution on [0.03,0.03]2[-0.03,0.03]^{2}. Lower two figures: Additive noise from uniform distribution on [0.06,0.06]2[-0.06,0.06]^{2}. The left two figures: The points with frequencies > 70% is marked. The right two figures: The points with frequencies > 90% is marked.

Figure 13 shows the reconstructed shortest cycles with different noise bandwidth parameters, 0.1 (left) and 0.3 (right). As shown in the figure, the result for 0.3 is consistent with other results but the result for 0.1 is not consistent. This means that the shortest loop criterion sometimes gives a result inconsistent with the structure of persistent homology.

Refer to caption
Figure 13: Result of reconstructed shortest cycles. Left: Noise bandwidth parameter is 0.1. Right: Noise bandwidth parameter is 0.3.

In this comparison, we also remark that the meaning of noise bandwidth parameters used in those three methods are different, so direct comparison of the results with the same parameter is meaningless. We should focus on the changes of the results by the change of noise bandwidth parameters.

6.4 Atomic configuration of amorphous silica

The proposed approach was applied to more realistic data. In this case, we used the atomic configuration of amorphous silica. The data are from ISAACS[27], generated by reverse Monte Carlo simulations guided by synchrotron X-ray radiation data. The data are available at http://isaacs.sourceforge.net/ex.html.

The PDs were computed from the atomic configuration. Two types of atoms, silicons and oxygens, were mixed in a 1:2 ratio in the data. The atomic type was ignored in this example, and only the positions of the atoms were used. Figure 14 shows the 1st PD.

The PD has birth-death pairs on the vertical line with (birth time)0.7(\text{birth time})\approx 0.7. The birth-death pairs on the vertical line correspond to rings formed by the chemical bonds between oxygen and silicon. Oxygen and silicon atoms appear alternatively on these rings. Previous studies [18, 26] have found that the existence of the vertical line on a PD implies network structures.

Refer to caption
Figure 14: 1st PD for amorphous silica
Refer to caption
Figure 15: (a) Optimal volume and stable volume of (0.677, 5.007). The orange ring is the boundary of the optimal volume; the green ring is the boundary of the stable volume. The red and blue points are oxygen atoms and silicon atoms, respectively. (b) The plot of the bandwidth parameter ϵ\epsilon versus the size of stable volumes and stable sub-volumes.

Figure 15(a) shows the boundary of the optimal volume (orange) and the boundary of the stable volume (green) with ϵ=0.2\epsilon=0.2 of (0.677, 5.007). The red and blue points are oxygen atoms and silicon atoms, respectively. In this case, the stable volume and stable sub-volume are identical. The optimal volume is a large twisted ring, while the stable volume is a simpler ring. Figure 15(b) shows the ϵ\epsilon versus volume plot. The plot indicates that the size of the stable volume quickly decreases from ϵ=0\epsilon=0 to ϵ=0.1\epsilon=0.1, meaning that the optimal volume is sensitive to noise. Therefore, we can consider that the stable volume is the essential part for the birth-death pair.

It should be noted that the stable volume with ϵ[1.1,1.6]\epsilon\in[1.1,1.6] on the second plateau is not a reasonable representation of the birth-death pair since only oxygen atoms exist on the boundary of the volume. The large ϵ\epsilon causes the removal of the information of the silicon atoms.

Figure 16(a) shows the optimal volume (orange) of (0.669, 5.053) and the stable volume (green) of the same pair with ϵ=0.2\epsilon=0.2. In this case, the optimal volume and the stable sub-volume are identical. However, the stable volume and sub-volume are not identical since the stable volume is not included in the optimal volume. Figure 16(b) shows the plot of ϵ\epsilon against the volume.

Refer to caption
Figure 16: (a) Optimal volume and stable volume of (0.669, 5.053). The orange ring is the boundary of the optimal volume; the green ring is the boundary of the stable volume. The red and blue points are oxygen atoms and silicon atoms, respectively. (b) Plot of the bandwidth parameter ϵ\epsilon versus the size of the stable volumes and stable sub-volumes. (c) Schematic of the optimal volume and the stable volume. The orange area is the optimal volume, the dark orange ring is its boundary, and the green ring is the boundary of the stable volume.

In Fig. 16(a), the stable volume is tighter than the stable sub-volume. Figure 16(c) shows the schematic of the optimal volume and the stable volume. The stable volume and sub-volume are different since path ZZ in the figure is not included in the optimal volume. In our opinion, the stable volume appears superior since it surrounds the tunnel in the pointcloud more tightly. However, we do not have a theoretical guarantee. In general, the stable volume is tighter than the stable sub-volume since the optimization for a stable volume is more aggressive than that for a stable sub-volume.

Refer to caption
Figure 17: (a) Optimal volume and stable volume of (0.671, 5.063). The orange ring is the boundary of the optimal volume, the green ring is the boundary of the stable volume, and the black ring is the boundary of the stable sub-volume. The atoms are omitted for the sake of clarity. (b) Plot of the bandwidth parameter ϵ\epsilon versus the size of stable volumes and stable sub-volumes. (c, d, e) Optimal volume, stable sub-volume, and stable volume.

Figure 17(a) shows the boundaries of the optimal volume (orange), stable volume (green), and stable sub-volume (black) of (0.671, 5.063). The noise bandwidth parameter ϵ=0.2\epsilon=0.2. In this example, the stable volume shows a very tight ring, while the stable sub-volume gives a very conservative result. The implication is that a stable volume gives a better result than a stable sub-volume when the optimal volume is extremely large.

7 Tuning of noise bandwidth parameter

In applying stable volumes, properly tuning the noise bandwidth parameter ϵ\epsilon is essential. The examples shown in the previous section suggest two possible approaches to such tuning:

  • The scale of ϵ\epsilon should be the same scale as the noise of the data

  • A value on the plateau shown in the plot of ϵ\epsilon against volume size (Section 6.2 and Fig. 12)

should be used.

In applying the first approach, we can use the domain knowledge about the data to tune the parameter. Here, the scale of the system noise gives an estimate of the parameter. For example, the scale of thermal fluctuation can be used as the parameter if the data consist of the atomic configuration, and thermal fluctuation is the dominant noise.

To apply the second approach, we need to plot the figure. As noted in Section 6.4, when there are multiple plateaus, we need additional criteria to determine which is better.

If we use stable volumes by persistence trees, such a plot is easy to produce. From Definition 9, the size of a stable volume is computable as follows:

1+ωCϵ(τ0,ω0)(size of dec(ω,V,E)).1+\sum_{\omega\in C_{\epsilon}(\tau_{0},\omega_{0})}(\text{size of }\mathrm{dec}(\omega,V,E)). (70)

We can compute the size of dec(ω,V,E)\mathrm{dec}(\omega,V,E) by counting all the descendant elements of ω\omega. It is also easy to judge whether ωCϵ(τ0,ω0)\omega\in C_{\epsilon}(\tau_{0},\omega_{0}) by comparing r^(τ0)\hat{r}(\tau_{0}) and r^(τ)\hat{r}(\tau) in (26). HomCloud already has this functionality.

When we cannot use stable volumes by persistence trees and we want to plot ϵ\epsilon against volume size, a stable sub-volume is more useful than a stable volume.

8 Comparison between stable volumes and stable sub-volumes

The question of whether stable volumes or stable sub-volumes are better requires further discussion. The differences between stable volumes and stable sub-volumes can be described as follows:

  • If we want to compute stable volumes or stable sub-volumes using multiple bandwidth parameters, the computation cost of stable sub-volumes is smaller than that of stable volumes. Thus, stable sub-volumes are desirable for constructing ϵ\epsilon versus volume plots.

  • If the size of the optimal volume is large, a stable volume often gives a more aggressive and likely better result than a stable sub-volume.

  • A stable sub-volume gives a more conservative result than a stable volume.

The first listed characteristic shows the advantage of using a stable sub-volume, while the second and third characteristics imply that the choice between stable volumes and stable sub-volumes depends on the problem.

In the author’s opinion, stable sub-volumes are easy to handle in general. However, users should compare both options and make a decision as to which is better for their data.

9 Concluding remarks

We have proposed the concept of stable volumes and stable sub-volumes as a means for identifying good geometric realization of homology generators that, unlike many of the methods proposed in prior research, is robust to noise. The idea of stable volumes and stable sub-volumes is based on optimal volumes [24].

The statistical approach taken by Bendich, Bubenik, and Wagner [3] offers another solution to our problem. However, one advantage of stable volumes and stable sub-volumes is that they do not require a large number of repeated computations. Moreover, stable volumes and sub-volumes are easier to visualize than the output of the statistical method since they give a deterministic rather than a probabilistic description. On the other hand, the advantage of the statistical approach is its flexibility. For first PH, we can only apply the statistical approach when we want to minimize the length of a loop rather than minimize the volume. The statistical approach is also applicable to the spatial distribution of points. Another advantage of the statistical approach is that it gives richer probabilistic information than stable volumes and sub-volumes.

Reconstruct shortest cycles in Ripserer.jl by Čufar[34] offers the other solution. The approach uses the representative of persistent cohomology and the shortest path algorithm. Our proposed method has two advantages over reconstructed shortest cycles. The first is its mathematical background. We prove Theorem 11 to show the good property of stable volumes and stable sub-volumes. Reconstruct shortest cycles do not have such mathematical justification. The second is the difference in the scope of application. Stable volumes and sub-volumes can apply to the PH of any degree; however, reconstructed shortest cycles are only applicable to 1st PH. On the other hand, the advantage of reconstructed shortest cycles is computation efficiency. The algorithm of reconstructed shortest cycles uses the shortest path algorithm and is faster than stable volumes and sub-volumes in general.

In this paper, we presented algorithms only for the filtration of a simplicial complex; however, the concept can be applied to other filtrations, such as cubical and cell filtrations. The proposed algorithms should be helpful in the study of two-dimensional, three-dimensional, or higher-dimensional digital images.

Although we used the number of simplices as the volume size in this paper, it is possible to use the real volume by changing the objective function from ωϵ,k+1α¯ω\sum_{\omega\in\mathcal{F}_{\epsilon,k+1}}\bar{\alpha}_{\omega} to ωϵ,k+1vol(ω)α¯ω\sum_{\omega\in\mathcal{F}_{\epsilon,k+1}}\mathrm{vol}(\omega)\bar{\alpha}_{\omega}, where vol(ω)\mathrm{vol}(\omega) is the volume of the simplex ω\omega. This idea may improve the result.

Acknowledgments

This research is partially supported by JSPS KAKENHI JP19H00834, JP20H05884, JST Presto JPMJPR1923, JST CREST JPMJCR15D3, and JST-Mirai Program JPMJMI18G3.

References

  • [1] Henry Adams, Tegan Emerson, Michael Kirby, Rachel Neville, Chris Peterson, Patrick Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, and Lori Ziegelmeier. Persistence images: A stable vector representation of persistent homology. Journal of Machine Learning Research, 18(8):1–35, 2017.
  • [2] Ulrich Bauer and Michael Lesnick. Induced matchings of barcodes and the algebraic stability of persistence. In Proceedings of the Thirtieth Annual Symposium on Computational Geometry, SOCG’14, page 355–364, New York, NY, USA, 2014. Association for Computing Machinery.
  • [3] P. Bendich, P. Bubenik, and A. Wagner. Stabilizing the unstable output of persistent homology computations. J Appl. and Comput. Topology, 4:309–338, 2020.
  • [4] Peter Bubenik. Statistical topological data analysis using persistence landscapes. Journal of Machine Learning Research, 16(3):77–102, 2015.
  • [5] Gunnar Carlsson. Topology and data. Bull. Amer. Math. Soc., 46:255–308, January 2009.
  • [6] Gunnar Carlsson, Tigran Ishkhanov, Vin de Silva, and Afra Zomorodian. On the local behavior of spaces of natural images. International Journal of Computer Vision, 76(1):1–12, Jan 2008.
  • [7] Joseph Minhow Chan, Gunnar Carlsson, and Raul Rabadan. Topology of viral evolution. Proceedings of the National Academy of Sciences, 110(46):18566–18571, 2013.
  • [8] Frédéric Chazal, David Cohen-Steiner, Marc Glisse, Leonidas J. Guibas, and Steve Y. Oudot. Proximity of persistence modules and their diagrams. In Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry, SCG ’09, page 237–246, New York, NY, USA, 2009. Association for Computing Machinery.
  • [9] Chao Chen and Daniel Freedman. Hardness results for homology localization. Discrete & Computational Geometry, 45(3):425–448, Apr 2011.
  • [10] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Stability of persistence diagrams. Discrete & Computational Geometry, 37(1):103–120, Jan 2007.
  • [11] Tamal K. Dey, Anil N. Hirani, and Bala Krishnamoorthy. Optimal homologous cycles, total unimodularity, and linear programming. SIAM J. Comput., 40(4):1026–1044, July 2011.
  • [12] Tamal K. Dey, Tao Hou, and Sayan Mandal. Persistent 1-cycles: Definition, computation, and its application. In Rebeca Marfil, Mariletty Calderón, Fernando Díaz del Río, Pedro Real, and Antonio Bandera, editors, Computational Topology in Image Context, pages 123–136, Cham, 2019. Springer International Publishing.
  • [13] Herbert Edelsbrunner and John Harer. Computational topology: an introduction. American Mathematical Soc., 2010.
  • [14] Herbert Edelsbrunner, David Letscher, and Afra Zomorodian. Topological persistence and simplification. Discrete & Computational Geometry, 28(4):511–533, Nov 2002.
  • [15] Herbert Edelsbrunner and Ernst P. Mücke. Three-dimensional alpha shapes. ACM Trans. Graph., 13(1):43–72, January 1994.
  • [16] Jeff Erickson and Kim Whittlesey. Greedy optimal homotopy and homology generators. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’05, pages 1038–1046, Philadelphia, PA, USA, 2005. Society for Industrial and Applied Mathematics.
  • [17] Emerson G. Escolar and Yasuaki Hiraoka. Optimal Cycles for Persistent Homology Via Linear Programming, pages 79–96. Springer Japan, Tokyo, 2016.
  • [18] Yasuaki Hiraoka, Takenobu Nakamura, Akihiko Hirata, Emerson G. Escolar, Kaname Matsue, and Yasumasa Nishiura. Hierarchical structures of amorphous solids characterized by persistent homology. Proceedings of the National Academy of Sciences, 113(26):7035–7040, 2016.
  • [19] Akihiko Hirata, Tomohide Wada, Ippei Obayashi, and Yasuaki Hiraoka. Structural changes during glass formation extracted by computational homology with machine learning. Communications Materials, 1(1):98, Dec 2020.
  • [20] V. Kurlin. A one-dimensional homologically persistent skeleton of an unstructured point cloud in any metric space. Computer Graphics Forum, 34(5):253–262, 2015.
  • [21] Genki Kusano, Kenji Fukumizu, and Yasuaki Hiraoka. Kernel method for persistence diagrams via kernel embedding and weight factor. Journal of Machine Learning Research, 18(189):1–41, 2018.
  • [22] Michael Lesnick. The theory of the interleaving distance on multidimensional persistence modules. Foundations of Computational Mathematics, 15(3):613–650, Jun 2015.
  • [23] Ippei Obayashi. Volume optimal cycle: Tightest representative cycle of a generator on persistent homology, 2017. Preprint version of [24], https://arxiv.org/abs/1712.05103.
  • [24] Ippei Obayashi. Volume-optimal cycle: Tightest representative cycle of a generator in persistent homology. SIAM Journal on Applied Algebra and Geometry, 2(4):508–534, 2018.
  • [25] Ippei Obayashi and HomCloud developer team. Homcloud.
  • [26] Yohei ONODERA, Shinji KOHARA, Shuta TAHARA, Atsunobu MASUNO, Hiroyuki INOUE, Motoki SHIGA, Akihiko HIRATA, Koichi TSUCHIYA, Yasuaki HIRAOKA, Ippei OBAYASHI, Koji OHARA, Akitoshi MIZUNO, and Osami SAKATA. Understanding diffraction patterns of glassy, liquid and amorphous materials via persistent homology analyses. Journal of the Ceramic Society of Japan, 127(12):853–863, 2019.
  • [27] Sébastien Le Rouxa and Valeri Petkova. Isaacs – interactive structure analysis of amorphous and crystalline systems. Journal of applied crystallography, 43(1):181–185, 2010.
  • [28] M. Saadatfar, H. Takeuchi, V. Robins, N. Francois, and Y. Hiraoka. Pore configuration landscape of granular crystallization. Nature Communications, 8:15082, 2017.
  • [29] B. Schweinhart. Statistical Topology of Embedded Graphs. PhD thesis, Princeton University, August 2015. https://web.math.princeton.edu/~bschwein/.
  • [30] Philip Smith and Vitaliy Kurlin. Skeletonisation algorithms with theoretical guarantees for unorganised point clouds with high levels of noise. volume 115, 2021.
  • [31] Anna Suzuki, Miyuki Miyazawa, James M. Minto, Takeshi Tsuji, Ippei Obayashi, Yasuaki Hiraoka, and Takatoshi Ito. Flow estimation solely from image data through persistent homology analysis. Scientific Reports, 11(1):17948, Sep 2021.
  • [32] A. Tahbaz-Salehi and A. Jadbabaie. Distributed coverage verification in sensor networks without location information. In 2008 47th IEEE Conference on Decision and Control, pages 4170–4176, Dec 2008.
  • [33] Afra Zomorodian and Gunnar Carlsson. Computing persistent homology. Discrete & Computational Geometry, 33(2):249–274, Feb 2005.
  • [34] Matija Čufar. Ripserer.jl: flexible and efficient persistent homology computation in julia. Journal of Open Source Software, 5(54):2614, 2020.

Appendix A Reconstructed shortest cycles in Ripserer.jl

We illustrate the mechanism of reconstructed shortest cycles in Ripserer.jl using an example.

Refer to caption
Figure 18: Example for reconstructed shortest cycles

Figure 18 shows the example filtration, and we consider /2\mathbb{Z}/2\mathbb{Z}-persistent homology. In this filtration, a homology generator appearing at X2X_{2} disappears at X7X_{7}. That is, (2,7)(2,7) is a birth-death pair in the first persistence diagram. The representative cycle corresponding to the pair is σX2(1)σ\sum_{\sigma\in X_{2}^{(1)}}\sigma. This representative cycle is the loop appearing at X2X_{2}.

Now we consider persistent cohomology. The persistence diagram defined by persistent cohomology is identical to the diagram by persistent homology since we use a field as a coefficient, but the representative cocycle is, of course, different. The representative cocycle corresponding to (2,7)(2,7) is σ1+σ2+σ3\sigma_{1}^{*}+\sigma_{2}^{*}+\sigma_{3}^{*}.

Refer to caption
Figure 19: Representative cocycle

From the observation of the example, we find that the representative cocycle is the “cut” of the loop. This means that any representative cycle corresponding to the birth-death pair disappears if we remove all 1-simplices in the representative cocycle from the 1-skeleton of X7X_{7} as shown in Fig. 19(a).

Using this fact, we can compute representative cycles using the shortest path algorithm. Let CC be {σ1,σ2,σ3}\{\sigma_{1},\sigma_{2},\sigma_{3}\}, the set of all 1-simplices in the representative cocycle. Now X2(1)X_{2}^{(1)} is divided into two parts, one is X2(1)\CX_{2}^{(1)}\backslash C and another is X2(1)C={σ1}X_{2}^{(1)}\cap C=\{\sigma_{1}\}. Then we compute a shortest path in X4(1)\CX_{4}^{(1)}\backslash C whose two endpoints are the endpoints of σ1\sigma_{1}. The result is shown in Fig. 19(b).

We can extend the idea to computer tighter cycles. We can similarly divide X4(1)X_{4}^{(1)} into X4(1)\CX_{4}^{(1)}\backslash C and X4(1)C={σ1,σ2}X_{4}^{(1)}\cap C=\{\sigma_{1},\sigma_{2}\}. For each σX4(1)C\sigma\in X_{4}^{(1)}\cap C, we compute the shortest path in X4(1)\CX_{4}^{(1)}\backslash C whose two endpoints are the endpoints of σ\sigma. The results are (b) and (c) in Fig. 19, and we choose (c) as the shortest loop in these loops. Using X5X_{5} instead of X4X_{4}, we compute an even tighter loop (d). This is the mechanism of the reconstructed shortest cycles.

The algorithm is summarized as follows.

  1. 1.

    The 1st Persistence diagram PD1(𝕏)\mathrm{PD}_{1}(\mathbb{X}) for a filtration 𝕏:X1XN\mathbb{X}:X_{1}\subset\cdots\subset X_{N} is computed using persistent cohomology. The representative cocycles are also computed

  2. 2.

    We choose a birth-death pair (b,d)(b,d), and take the corresponding representative cocycle σCσ\sum_{\sigma\in C}\sigma^{*}

  3. 3.

    We also choose kk between bb and dd

  4. 4.

    For each σXk(1)C\sigma\in X_{k}^{(1)}\cap C, we compute the shortest path in Xk(1)\CX_{k}^{(1)}\backslash C, P(σ)P(\sigma), whose two endpoints are the endpoints of σ\sigma

  5. 5.

    We search the shortest one from {P(σ)}σC\{P(\sigma)\}_{\sigma\in C}

The mathematical justification of the algorithm is not yet given, but empirically it works well. A deeper analysis of the algorithm is beyond the scope of this paper.