
Graph recovery from graph wave equation

Yuuya Takayama
Nikon Corporation, Japan
[email protected]
Abstract

We propose a method by which to recover an underlying graph from a set of multivariate wave signals discretely sampled from a solution of the graph wave equation. Here, the graph wave equation is defined via the graph Laplacian, and its solution is given explicitly as a mode expansion in the Laplacian eigenvalues and eigenfunctions. For graph recovery, our idea is to extract the modes corresponding to the square roots of the eigenvalues from the discrete wave signals using dynamic mode decomposition (DMD), and then to reconstruct the graph (Laplacian) from the eigenfunctions obtained as the amplitudes of these modes. Moreover, in order to estimate the modes more precisely, we modify the DMD method under the assumption that only stationary modes exist, which graph wave functions always satisfy.

Finally, we demonstrate the proposed method on wave signals over a path graph. Since our graph recovery procedure can also be applied to non-wave signals, we further evaluate its performance on human joint sensor time-series data.

1 Introduction

Graphs have been studied in mathematics for over a century and have recently become important in science and applied fields such as physics, chemistry, finance, signal processing, and machine learning. Once a graph is given, mathematical theory provides powerful and useful results about its Laplacian eigenvalues, connectivity, hitting times, and so on. For practical data, however, the graph structure is sometimes unknown, so reconstructing the underlying graph from a given dataset is a fundamental problem. Gaussian graphical models (GGMs), which assume that the data are generated from a multivariate Gaussian distribution, are a well-known solution in machine learning [25]. In the present paper, we instead assume that multivariate time-series data are generated from the graph wave equation, and provide a procedure to recover the weighted graph structure over the variables from the observed dataset. Moreover, based on this procedure, we propose an algorithm that constructs a reasonable graph from any multivariate time-series data, without assuming that the data follow the graph wave equation.

The graph wave equation was first studied by Friedman and Tillich [11] as a discrete analogue of the continuous wave equation. Its general solutions, i.e., graph wave functions, are given as

\displaystyle U(\mathrm{x},t)=\sum_{k}\left\{a_{k}\cos\left(\sqrt{\rho_{k}}\,t\right)+\frac{b_{k}}{\sqrt{\rho_{k}}}\sin\left(\sqrt{\rho_{k}}\,t\right)\right\}v_{k}(\mathrm{x}),

by the eigenfunctions \{v_{k}\} of the graph Laplacian, where \mathrm{x} indicates a node of the graph. In fact, a graph wave function for a path graph is viewed as a spatial discretization of a conventional wave function over an interval. In the present paper, as a discrete multivariate time-series model, we use the set of observed values of a graph wave function at integer points t\in\mathbb{Z}, namely, \{U(\mathrm{x},t)\mid\forall\mathrm{x},t\in\mathbb{Z}\}. Note that this discretization in the time direction loses no information about the waves if the Nyquist condition \sqrt{\rho_{k}}<\pi is satisfied for every k. When the underlying graph is unknown, so are the frequencies \{\sqrt{\rho_{k}}\}. In contrast, it is easier to estimate the eigenfunctions \{v_{k}\} when the frequencies are known, by the curve-fitting (i.e., least squares) method [10] [15, §5]. Hence, in order to recover the graph, it is necessary to estimate the frequencies from observed values of a graph wave function.

To this end, we use the dynamic mode decomposition (DMD) proposed by Schmid [20] and studied by Tu et al. [24]. The essence of DMD is to describe the (T+1)-th observed set as a linear combination of the previous T observed sets

\displaystyle U(\mathrm{x},T+1)=c_{1}U(\mathrm{x},T)+c_{2}U(\mathrm{x},T-1)+\cdots+c_{T}U(\mathrm{x},1)

with c_{j}\in\mathbb{R}. Then, the modes \{z_{j}\} are calculated as the solutions of the polynomial equation Z^{T}-c_{1}Z^{T-1}-\cdots-c_{T-1}Z-c_{T}=0, just as in Prony's method [19], but are shared by all (spatial) variables \{U(\mathrm{x},\cdot)\}. Moreover, each mode z_{j} is directly related to a frequency \sqrt{\rho_{k}} as z_{j}=e^{\pm\mathrm{i}\sqrt{\rho_{k}}} for some j and k. Here, it is worth emphasizing that the mode set \{z_{j}\} satisfies a stationary condition |z_{j}|=1 and a conjugate condition \bar{z}_{j}=z_{l} (for some l), so the coefficients \{c_{j}\} satisfy a symmetric condition [16]. In fact, this idea makes the frequency calculation robust against numerical error and reduces the number of observation points needed along the time axis. As a result, this modified DMD method enables us to extract common frequencies more efficiently from multivariate time-series data, although other approaches, such as [15, 17, 21], may be applicable to frequency estimation for the time series of each variable.

From the calculated frequencies, we can estimate the graph Laplacian and the graph weights, in addition to the eigenfunctions \{v_{k}\}, as explained above. To be more precise, the graph weights are given as

\displaystyle w_{\mathrm{x}\mathrm{y}}=-\sum_{k}v_{k}(\mathrm{x})\rho_{k}v_{k}(\mathrm{y}).

For any multivariate time-series data, we extend this form by replacing \{v_{k}\} with estimated eigenfunction-like functions. The constructed frequency-based weights have the following remarkable properties:

  • the weights are independent of the mean of each variable,

  • when two variables U(\mathrm{x},\cdot) and U(\mathrm{y},\cdot) have no common frequency, then w_{\mathrm{x}\mathrm{y}}=0, and

  • when two variables U(\mathrm{x},\cdot) and U(\mathrm{y},\cdot) are anti-synchronized, then w_{\mathrm{x}\mathrm{y}}>0.
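The weight identity above can be sanity-checked numerically. The following minimal sketch, on a hypothetical 3-node weighted graph, computes the eigen-pairs of the graph Laplacian with numpy and reconstructs the weight matrix from them:

```python
import numpy as np

# Hypothetical 3-node weighted graph; W is the symmetric weight matrix.
W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 2.0],
              [0.5, 2.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W           # graph Laplacian: degrees minus weights

rho, V = np.linalg.eigh(L)               # eigenvalues rho_k, eigenfunctions v_k (columns)

# w_xy = -sum_k v_k(x) rho_k v_k(y) for x != y
W_rec = -(V * rho) @ V.T
np.fill_diagonal(W_rec, 0.0)
print(np.allclose(W_rec, W))             # True
```

This works because the off-diagonal part of the Laplacian is exactly the negated weight matrix.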

In contrast to the frequency-based graph, Gaussian graphical models (GGMs) define graph weights by the correlation between two variables after conditioning on all other datasets [25, §8] and are widely applied in biology [26], psychology [8], and finance [12]. Another perspective suggests defining distances for time series, such as dynamic time warping (DTW) [5] and feature-based distances [9, 18]. These distances define graph weights through distance kernels for various time-series analysis or machine learning tasks [2]. In particular, for the human joint time-series data we deal with, a graph structure is often constructed based on the Joint Relative Distance, defined by the pairwise Euclidean distances between two 3D (or 2D) joint locations taken by a physical sensor. Namely, graph weights are given by its variance [23], its relative variance [14], or its DTW [3]. Compared with these studies, the proposed method has the advantage that it directly extracts temporal-order information, in particular the frequencies of the time series, and moreover requires a smaller amount of data.

The remainder of the present paper is organized as follows. In §2, we review ways to extract modes from equispaced-sampled (multivariate) time-series data, known as Prony's method and the DMD. We also explain how to impose a stationary mode condition on these methods. In §3, we first recall the definitions of the graph Laplacian and the graph wave equation. Next, we see that a solution of the graph wave equation (a graph wave function) for a path graph naturally corresponds to a continuous wave over an interval, which is used for a demonstration in §4. Then, we solve the graph recovery problem for an observed dataset of graph wave functions using the modified DMD method. This procedure is summarized and extended in §4. Finally, we evaluate its performance through three examples: graph wave signals over a path graph, discretely sampled continuous wave signals over an interval, and human joint sensor time-series data.

2 Mode extraction techniques

In this section, we review mode extraction methods for observed multivariate time-series datasets. We then explain how to modify these methods for a mode set satisfying special conditions in §2.2. In §2.3, we also mention amplitude estimation methods, which are combined into mode expansion techniques. Finally, we briefly explain examples of the mode expansion results in §2.4.

2.1 Mode extraction problem setting

Let V be a finite set \{1,2,\cdots,n\} and let \mathcal{A}_{V} denote the set of functions \{f\colon V\rightarrow\mathbb{R}\}, which is an \mathbb{R}-vector space. When n=1, \mathcal{A}_{V} is identified with \mathbb{R}. We define two constant functions 0_{V}(\mathrm{x}):=0 and 1_{V}(\mathrm{x}):=1 for any \mathrm{x}\in V.

In addition, let \{z_{1},\cdots,z_{M}\} be the roots of a polynomial equation

\displaystyle Z^{M}-c_{1}Z^{M-1}-c_{2}Z^{M-2}-\cdots-c_{M-1}Z-c_{M}=0. (2.1)

Needless to say, each coefficient is represented by c_{j}=(-1)^{j+1}e_{j}(z_{1},\cdots,z_{M}), where e_{j} denotes the j-th elementary symmetric polynomial. We assume that the roots \{z_{j}\} are distinct, and then define a time-series model F as the time evolution of the roots

\displaystyle F(\mathrm{x},t):=\sum_{j=1}^{M}\alpha_{j}(\mathrm{x})z_{j}^{t},\text{ for }\mathrm{x}\in V,\ t\in\mathbb{Z}, (2.2)

where \alpha_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\setminus 0_{V}. We could define F for t\in\mathbb{R} by fixing \arg z_{j}, although this definition is not used herein. In this context, we refer to the roots \{z_{j}\} and the coefficients \{\alpha_{j}\} as the modes and amplitudes, respectively, of F. Now, we assume that the amplitudes satisfy the non-degeneracy condition

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[\alpha_{1},\cdots,\alpha_{M}]=\min(M,n). (2.3)

Note that this condition can always be arranged by taking a subset V^{\prime}\subset V and considering all of the above over V^{\prime} if necessary. Then, because of (2.1), the time-series model F satisfies

\displaystyle F(\mathrm{x},t+M)-c_{1}F(\mathrm{x},t+M-1)-\cdots-c_{M-1}F(\mathrm{x},t+1)-c_{M}F(\mathrm{x},t)=0 (2.4)

for any \mathrm{x}\in V and any t\in\mathbb{Z}. This relation is regarded as a linear equation for the coefficients \{c_{j}\}. Hence, the modes \{z_{j}\} of F can be determined from the temporally observed sets \{F(\cdot,t)\}.

Proposition 2.5.

Set L=\max(M-n,0)+1. The mode set \{z_{j}\} in the model F (2.2) is determined from observed values at L+M integer points \{F(\cdot,t)\in\mathcal{A}_{V}\otimes\mathbb{C}\mid t=1,2,\cdots,L+M\} if condition (2.3) is satisfied.

This result is viewed as Prony's method [19] when n=1, and as the dynamic mode decomposition [20] when n\geq M.
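For n=1, the proposition reduces to a small linear solve followed by a root-finding step. A minimal numpy sketch, using a hypothetical damped mode pair rather than data from the paper:

```python
import numpy as np

# Hypothetical n = 1 signal with one damped mode pair (M = 2, so L = M - n + 1 = 2).
z_true = 0.9 * np.exp(0.5j)
ts = np.arange(1, 5)                     # L + M = 4 samples suffice
F = 2.0 * ((1.0 + 0.5j) * z_true ** ts).real

# Linear system (2.6): shifted windows against the unknowns (c_2, c_1).
A = np.array([[F[0], F[1]],
              [F[1], F[2]]])
c2, c1 = np.linalg.solve(A, np.array([F[2], F[3]]))

# The modes are the roots of Z^2 - c_1 Z - c_2 = 0, cf. (2.1).
modes = np.roots([1.0, -c1, -c2])
print(np.sort_complex(modes))            # the conjugate pair 0.9 e^{∓0.5i}
```

With exact data the recovered pair matches the true modes up to floating-point error; the sensitivity of this step to noise is the motivation for the modification in §2.2.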

Proof..

Let us consider the following linear equation, according to (2.4):

\displaystyle\left[\begin{smallmatrix}F(1,1)&F(1,2)&\cdots&F(1,M)\\ \vdots&\vdots&&\vdots\\ F(n,1)&F(n,2)&\cdots&F(n,M)\\ F(1,2)&F(1,3)&\cdots&F(1,M+1)\\ \vdots&\vdots&&\vdots\\ F(n,2)&F(n,3)&\cdots&F(n,M+1)\\ \vdots&\vdots&&\vdots\\ F(1,L)&F(1,L+1)&\cdots&F(1,L+M-1)\\ \vdots&\vdots&&\vdots\\ F(n,L)&F(n,L+1)&\cdots&F(n,L+M-1)\\ \end{smallmatrix}\right]\left[\begin{smallmatrix}c_{M}\\ c_{M-1}\\ \vdots\\ c_{1}\end{smallmatrix}\right]=\left[\begin{smallmatrix}F(1,M+1)\\ \vdots\\ F(n,M+1)\\ F(1,M+2)\\ \vdots\\ F(n,M+2)\\ \vdots\\ F(1,L+M)\\ \vdots\\ F(n,L+M)\end{smallmatrix}\right]. (2.6)

Here, our goal is to show that the (nL,M) matrix on the left has rank M. It is rewritten as

\displaystyle\left[\begin{smallmatrix}\alpha_{1}(1)&\alpha_{2}(1)&\cdots&\alpha_{M}(1)\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)&\alpha_{2}(n)&\cdots&\alpha_{M}(n)\\ \alpha_{1}(1)z_{1}&\alpha_{2}(1)z_{2}&\cdots&\alpha_{M}(1)z_{M}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)z_{1}&\alpha_{2}(n)z_{2}&\cdots&\alpha_{M}(n)z_{M}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(1)z_{1}^{L-1}&\alpha_{2}(1)z_{2}^{L-1}&\cdots&\alpha_{M}(1)z_{M}^{L-1}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)z_{1}^{L-1}&\alpha_{2}(n)z_{2}^{L-1}&\cdots&\alpha_{M}(n)z_{M}^{L-1}\end{smallmatrix}\right]\left[\begin{smallmatrix}z_{1}&z_{1}^{2}&\cdots&z_{1}^{M}\\ z_{2}&z_{2}^{2}&\cdots&z_{2}^{M}\\ \vdots&\vdots&&\vdots\\ z_{M}&z_{M}^{2}&\cdots&z_{M}^{M}\end{smallmatrix}\right]

by applying the definition (2.2) of F. Since the \{z_{j}\} are assumed to be distinct, the Vandermonde matrix on the right has full rank. We claim that the matrix on the left also has rank M. When n\geq M, that is, L=1, this follows from condition (2.3). Suppose n<M, that is, L=M-n+1. Take the M-dimensional row vectors

\displaystyle{\mathbf{v}}_{j,k}=\left(\alpha_{1}(j)z_{1}^{k},\alpha_{2}(j)z_{2}^{k},\cdots,\alpha_{M}(j)z_{M}^{k}\right)

for j=1,2,\cdots,n and k=0,1,\cdots,L-1. The non-degeneracy condition (2.3) means that the n vectors \{{\mathbf{v}}_{j,0}\} are linearly independent. If necessary, replace {\mathbf{v}}_{1,0} by a linear combination of \{{\mathbf{v}}_{j,0}\} so that all entries of {\mathbf{v}}_{1,0} are non-zero, which is possible because \alpha_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\setminus 0_{V}. Hence, we first obtain

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1}]=L

because

\displaystyle\left[\begin{smallmatrix}\alpha_{1}(1)&\alpha_{2}(1)&\cdots&\alpha_{M}(1)\\ \alpha_{1}(1)z_{1}&\alpha_{2}(1)z_{2}&\cdots&\alpha_{M}(1)z_{M}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(1)z_{1}^{L-1}&\alpha_{2}(1)z_{2}^{L-1}&\cdots&\alpha_{M}(1)z_{M}^{L-1}\end{smallmatrix}\right]=\left[\begin{smallmatrix}1&1&\cdots&1\\ z_{1}&z_{2}&\cdots&z_{M}\\ \vdots&\vdots&&\vdots\\ z_{1}^{L-1}&z_{2}^{L-1}&\cdots&z_{M}^{L-1}\end{smallmatrix}\right]\left[\begin{smallmatrix}\alpha_{1}(1)&0&\cdots&0\\ 0&\alpha_{2}(1)&&0\\ \vdots&&\ddots&\\ 0&0&\cdots&\alpha_{M}(1)\end{smallmatrix}\right].

Next, if {\mathbf{v}}_{2,0}\in\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1}], then we can write {\mathbf{v}}_{2,0}=\sum_{l=0}^{L-1}\beta_{l}{\mathbf{v}}_{1,l} for some \beta_{l}. This also gives {\mathbf{v}}_{2,L-1}=\sum_{l=0}^{L-1}\beta_{l}{\mathbf{v}}_{1,L+l-1}. Hence, we obtain

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1},{\mathbf{v}}_{2,0},{\mathbf{v}}_{2,1},\cdots,{\mathbf{v}}_{2,L-1}]\geq L+1,

because condition (2.3) means {\mathbf{v}}_{2,L-1}\neq\beta_{0}{\mathbf{v}}_{1,L-1}. Otherwise, \mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1},{\mathbf{v}}_{2,0}] is again an (L+1)-dimensional vector space. By induction, we conclude

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},\cdots,{\mathbf{v}}_{1,L-1},{\mathbf{v}}_{2,0},\cdots,{\mathbf{v}}_{2,L-1},\cdots,{\mathbf{v}}_{n,0},\cdots,{\mathbf{v}}_{n,L-1}]\geq L+n-1=M.

Therefore, the coefficients \{c_{j}\} are obtained as the unique solution of the linear equation (2.6), and the modes \{z_{j}\} are determined as the solutions of (2.1). ∎

Remark 2.7.

In practice, it is often the case that the number of modes is also unknown. In that case, one can compute the number of modes as the rank of the matrix on the left of (2.6) for sufficiently large M and L.
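A minimal illustration of this rank-based count, on a hypothetical n=1 signal with three modes (including Z=1) and deliberately oversized M and L:

```python
import numpy as np

# Hypothetical n = 1 signal whose mode count (here 3, including Z = 1) is "unknown".
ts = np.arange(1, 21)
F = 2.0 + 2.0 * np.cos(0.3 * ts) + 2.0 * np.sin(0.3 * ts)

# Stacked matrix of (2.6) with deliberately large M and L; its rank is the mode count.
M_big, L_big = 8, 10
H = np.array([[F[t + j] for j in range(M_big)] for t in range(L_big)])
print(np.linalg.matrix_rank(H))          # 3
```

In floating-point arithmetic the rank is read off from the singular values, so a tolerance appropriate to the noise level should be passed to `matrix_rank` for real data.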

2.2 Special conditions for mode set

Now, we consider a special case of (2.2). In practice, the time-series model F often contains a constant term, which is represented as a mode Z=1. In addition, it is usual to suppose that the other modes are stationary and that F is \mathbb{R}-valued.

Figure 2.1: [left] General mode set in (2.2). [center] Particular mode set in (2.8), lying on the unit circle, vertically symmetric and including Z=1, like the DFT modes but not equispaced. [right] DFT mode set, lying on the unit circle at equal intervals.

These conditions indicate a particular representation

\displaystyle F(\mathrm{x},t)=\alpha_{0}(\mathrm{x})+\sum_{j=1}^{N}\left\{\alpha_{j}(\mathrm{x})z_{j}^{t}+\bar{\alpha}_{j}(\mathrm{x})\bar{z}_{j}^{t}\right\},\text{ for }t\in\mathbb{Z}, (2.8)

where \alpha_{0}\in\mathcal{A}_{V}\setminus 0_{V}, \alpha_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\setminus 0_{V}, and the modes \{z_{j}\mid 1\leq j\leq N\} are distinct and satisfy \mathop{\mathrm{Im}}\nolimits z_{j}>0 and |z_{j}|=1. This form is also viewed as a generalization of the inverse discrete Fourier transform (DFT), whose modes are given by \{z_{j}=\exp(2\pi\mathrm{i}j/N)\}. See the difference between the mode sets in Figure 2.1. As above, we assume that the amplitudes satisfy the non-degeneracy condition

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[\alpha_{0},\alpha_{1},\cdots,\alpha_{N}]=\min(N+1,n). (2.9)

Then, as a special case of (2.4), we have the following lemma:

Lemma 2.10.

Take d_{j}=(-1)^{j+1}e_{j}(z_{1},\cdots,z_{N},\bar{z}_{1},\cdots,\bar{z}_{N}) for 1\leq j\leq 2N. Then, G(\mathrm{x},t;k):=F(\mathrm{x},t+N+1+k)-F(\mathrm{x},t+N+k)+F(\mathrm{x},t+N+1-k)-F(\mathrm{x},t+N-k)\ (0\leq k\leq N) satisfies

\displaystyle G(\mathrm{x},t;N)-d_{1}G(\mathrm{x},t;N-1)-\cdots-d_{N-1}G(\mathrm{x},t;1)-\frac{1}{2}d_{N}G(\mathrm{x},t;0)=0. (2.11)
Proof..

For simplicity, set z_{0}=1. Then, all Z=z_{0},z_{1},\cdots,z_{N},\bar{z}_{1},\cdots,\bar{z}_{N} satisfy

\displaystyle(Z-1)\left(Z^{2N}-d_{1}Z^{2N-1}-d_{2}Z^{2N-2}-\cdots-d_{2N-1}Z-d_{2N}\right)=0.

Now, we have d_{2N}=-1 and d_{2N-j}=d_{j} because of the conditions on the mode set \{z_{j}\}, as used in [16]. Hence, the above equation can be rewritten as

\displaystyle Z^{2N+1}-Z^{2N}+Z-1-d_{1}(Z^{2N}-Z^{2N-1}+Z^{2}-Z)-\cdots-d_{N-1}(Z^{N+2}-Z^{N+1}+Z^{N}-Z^{N-1})-d_{N}(Z^{N+1}-Z^{N})=0.

Therefore, by replacing Z^{k} by F(\mathrm{x},t+k), we obtain the assertion. ∎

Remark 2.12.

For any mode set \{z_{j}\in\mathbb{C}\mid|z_{j}|=1,1\leq j\leq N\}, the coefficients \{d_{j}\} given in Lemma 2.10 satisfy d_{2N}=-1 and d_{2N-j}=d_{j}, as above. However, the converse is not always true. Namely, for the polynomial equation

\displaystyle Z^{2N}-d_{1}Z^{2N-1}-\cdots-d_{N-1}Z^{N+1}-d_{N}Z^{N}-d_{N-1}Z^{N-1}-\cdots-d_{1}Z+1=0,

when Z=z is a root, \bar{z} and 1/z are roots as well, but these two roots do not always coincide, as might be expected. For example, (Z-2\mathrm{i})(Z+2\mathrm{i})(Z-\frac{1}{2}\mathrm{i})(Z+\frac{1}{2}\mathrm{i})=Z^{4}+\frac{17}{4}Z^{2}+1 gives a counterexample for N=2.
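The counterexample can be checked numerically; the polynomial is palindromic (d_1 = 0, d_2 = -17/4), yet none of its four roots lies on the unit circle:

```python
import numpy as np

# Z^4 + (17/4) Z^2 + 1 from Remark 2.12: symmetric coefficients, no unit-circle roots.
roots = np.roots([1.0, 0.0, 17.0 / 4.0, 0.0, 1.0])
print(np.sort(np.abs(roots)))            # [0.5, 0.5, 2.0, 2.0]
```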

The above lemma helps us to determine the modes from fewer observation points than Proposition 2.5, which needs L+M=2N+2+\max(2N+1-n,0) observation points for model (2.8).

Proposition 2.13.

Set L=\max(N-n,0)+1. The mode set in model F of (2.8) is determined from observed values at L+2N+1 integer points \{F(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,L+2N+1\} if condition (2.9) is satisfied.

Proof..

In the linear equation (2.6) for the coefficients \{c_{j}\}, take M=2N+1 and L as asserted. Then, the (nL,2N+1) matrix has rank N because of the non-degeneracy condition (2.9). By using the argument in the proof of Lemma 2.10, we can transform that linear equation into the following equation for the coefficients \{d_{j}\}:

\displaystyle\left[\begin{smallmatrix}\frac{1}{2}G(1,1;0)&G(1,1;1)&\cdots&G(1,1;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(n,1;0)&G(n,1;1)&\cdots&G(n,1;N-1)\\ \frac{1}{2}G(1,2;0)&G(1,2;1)&\cdots&G(1,2;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(n,2;0)&G(n,2;1)&\cdots&G(n,2;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(1,L;0)&G(1,L;1)&\cdots&G(1,L;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(n,L;0)&G(n,L;1)&\cdots&G(n,L;N-1)\end{smallmatrix}\right]\left[\begin{smallmatrix}d_{N}\\ d_{N-1}\\ \vdots\\ d_{1}\end{smallmatrix}\right]=\left[\begin{smallmatrix}G(1,1;N)\\ \vdots\\ G(n,1;N)\\ G(1,2;N)\\ \vdots\\ G(n,2;N)\\ \vdots\\ G(1,L;N)\\ \vdots\\ G(n,L;N)\end{smallmatrix}\right]. (2.14)

Here, the (nL,N) matrix on the left still has rank N, and then the coefficients \{d_{j}\} and the modes \{z_{j}\} are determined. ∎
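The following sketch implements the stationary-constrained estimation of Proposition 2.13 for n=1, on a hypothetical signal of the form (2.8) with N=2 modes: it forms the combinations G(t;k) of Lemma 2.10, solves (2.14) for the coefficients d_j, and reads the frequencies off the roots of the palindromic polynomial of Remark 2.12.

```python
import numpy as np

# Hypothetical real signal of the form (2.8): a constant plus N = 2 stationary modes.
w = np.array([0.4, 1.1])                          # true frequencies, |z_j| = 1
N, L = 2, 2                                       # n = 1, so L = max(N - n, 0) + 1 = 2
ts = np.arange(1, L + 2 * N + 2)                  # only L + 2N + 1 = 7 samples needed
F = 3 + 2 * np.cos(w[0] * ts) + np.cos(w[1] * ts + 0.3)

def G(t, k):                                      # G(t; k) of Lemma 2.10, 1-based t
    f = lambda s: F[s - 1]
    return f(t + N + 1 + k) - f(t + N + k) + f(t + N + 1 - k) - f(t + N - k)

# Linear system (2.14) for the unknowns (d_N, ..., d_1).
A = np.array([[0.5 * G(t, 0)] + [G(t, k) for k in range(1, N)]
              for t in range(1, L + 1)])
rhs = np.array([G(t, N) for t in range(1, L + 1)])
d = np.linalg.solve(A, rhs)[::-1]                 # reorder to (d_1, ..., d_N)

# Palindromic polynomial of Remark 2.12; its roots are {z_j, conj(z_j)}.
coeffs = np.concatenate([[1.0], -d, -d[:-1][::-1], [1.0]])
roots = np.roots(coeffs)
freqs = np.sort(np.angle(roots[roots.imag > 1e-9]))
print(freqs)                                      # ≈ [0.4, 1.1]
```

Because the stationarity and conjugacy of the modes are built into the polynomial, the estimated modes cannot drift off the unit circle, which is the robustness advantage discussed in §2.4.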

2.3 Amplitude estimation

Once the modes \{z_{j}\} in the general model (2.2) or the particular model (2.8) are obtained, the corresponding amplitudes \{\alpha_{j}\} are calculated by the curve-fitting method. For model (2.2), we have

\displaystyle\left[\begin{smallmatrix}F(1,1)&F(1,2)&\cdots&F(1,T)\\ \vdots&\vdots&&\vdots\\ F(n,1)&F(n,2)&\cdots&F(n,T)\end{smallmatrix}\right]=\left[\begin{smallmatrix}\alpha_{1}(1)&\alpha_{2}(1)&\cdots&\alpha_{M}(1)\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)&\alpha_{2}(n)&\cdots&\alpha_{M}(n)\end{smallmatrix}\right]\left[\begin{smallmatrix}z_{1}&z_{1}^{2}&\cdots&z_{1}^{T}\\ z_{2}&z_{2}^{2}&\cdots&z_{2}^{T}\\ \vdots&\vdots&&\vdots\\ z_{M}&z_{M}^{2}&\cdots&z_{M}^{T}\end{smallmatrix}\right], (2.15)

for any T\in\mathbb{Z}_{>0}. Since the Vandermonde matrix on the right has rank M when T\geq M, observed values at M integer points \{F(\cdot,t)\in\mathcal{A}_{V}\otimes\mathbb{C}\mid t=1,2,\cdots,M\} determine all of the amplitudes \{\alpha_{j}\}.

Likewise, for model (2.8), we have

\displaystyle\left[\begin{smallmatrix}F(1,1)&F(1,2)&\cdots&F(1,T)\\ \vdots&\vdots&&\vdots\\ F(n,1)&F(n,2)&\cdots&F(n,T)\end{smallmatrix}\right]=\left[\begin{smallmatrix}\alpha_{0}(1)&\alpha_{1}(1)&\bar{\alpha}_{1}(1)&\cdots&\bar{\alpha}_{N}(1)\\ \vdots&\vdots&\vdots&&\vdots\\ \alpha_{0}(n)&\alpha_{1}(n)&\bar{\alpha}_{1}(n)&\cdots&\bar{\alpha}_{N}(n)\end{smallmatrix}\right]\left[\begin{smallmatrix}1&1&\cdots&1\\ z_{1}&z_{1}^{2}&\cdots&z_{1}^{T}\\ \bar{z}_{1}&\bar{z}_{1}^{2}&\cdots&\bar{z}_{1}^{T}\\ \vdots&\vdots&&\vdots\\ \bar{z}_{N}&\bar{z}_{N}^{2}&\cdots&\bar{z}_{N}^{T}\end{smallmatrix}\right], (2.16)

for any T\in\mathbb{Z}_{>0}. Hence, the observed values at 2N+1 integer points \{F(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,2N+1\} determine all of the amplitudes \{\alpha_{j}\}. More generally, we can obtain this expansion even for a dataset not generated from model (2.8). In other words, the following lemma holds.

Lemma 2.17.

Any 2N+1 real values \{F_{t}\in\mathbb{R}\mid t=1,2,\cdots,2N+1\} are represented as

\displaystyle F_{t}=a_{0}z_{0}+\sum_{j=1}^{N}\{a_{j}z_{j}^{t}+a_{j+N}\bar{z}_{j}^{t}\},\text{ for }t=1,2,\cdots,2N+1,

by distinct modes z_{0}=1 and \{z_{j},\bar{z}_{j}\mid 1\leq j\leq N\}, and a_{j}\in\mathbb{C}. Moreover, a_{0}\in\mathbb{R} and a_{j+N}=\bar{a}_{j} hold.

Proof..

Take the (2N+1)-dimensional vectors

\displaystyle{\mathbf{v}}_{0}=(1,1,\cdots,1),\ {\mathbf{v}}_{j}=(z_{j},z_{j}^{2},\cdots,z_{j}^{2N+1}),\ {\mathbf{v}}_{j+N}=(\bar{z}_{j},\bar{z}_{j}^{2},\cdots,\bar{z}_{j}^{2N+1})

for 1\leq j\leq N. These vectors form a basis of \mathbb{C}^{2N+1} because the modes are distinct. Then, we have their dual vectors \{{\mathbf{v}}_{j}^{*}\mid 0\leq j\leq 2N\} such that \langle{\mathbf{v}}_{j},{\mathbf{v}}^{*}_{l}\rangle=\delta_{j,l}, which are given by the inverse of the matrix formed by \{{\mathbf{v}}_{j}\}. Therefore, a_{j} is defined by \langle{\mathbf{F}},{\mathbf{v}}^{*}_{j}\rangle for {\mathbf{F}}=(F_{1},\cdots,F_{2N+1}). Moreover, for 1\leq j\leq N, we have \bar{a}_{j}=\langle{\mathbf{F}},\bar{{\mathbf{v}}}^{*}_{j}\rangle=\langle{\mathbf{F}},{\mathbf{v}}^{*}_{j+N}\rangle=a_{j+N} because \bar{{\mathbf{v}}}^{*}_{j} is the dual vector of \bar{{\mathbf{v}}}_{j}={\mathbf{v}}_{j+N}. ∎
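A small numerical check of Lemma 2.17, with hypothetical distinct unit modes and arbitrary real samples: solving the Vandermonde system recovers amplitudes with the stated symmetry.

```python
import numpy as np

# Lemma 2.17 for N = 2: hypothetical distinct unit modes, arbitrary real samples.
N = 2
z = np.exp(1j * np.array([0.7, 2.0]))
Ft = np.array([1.0, -0.3, 2.2, 0.5, 1.7])          # F_t for t = 1, ..., 2N + 1

modes = np.concatenate([[1.0 + 0j], z, z.conj()])  # z_0 = 1, z_j, conj(z_j)
t = np.arange(1, 2 * N + 2)
Vand = modes[None, :] ** t[:, None]                # entry (t, j): z_j^t
a = np.linalg.solve(Vand, Ft)                      # amplitudes a_0, ..., a_{2N}

print(abs(a[0].imag) < 1e-9)                       # True: a_0 is real
print(np.allclose(a[N+1:], a[1:N+1].conj()))       # True: a_{j+N} = conj(a_j)
```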

In either case, the number of observation points needed to determine the amplitudes is less than that needed to determine the modes, as shown in Propositions 2.5 and 2.13. Therefore, the mode expansion techniques are summarized as follows:

Theorem 2.18.

(i) The general model (2.2) is determined from n(L+M) complex values \{F(\mathrm{x},t)\in\mathbb{C}\mid\mathrm{x}\in V,t=1,2,\cdots,L+M\} for L=\max(M-n,0)+1, as in Proposition 2.5. When n=1 or M, this number coincides with the complex dimension M(n+1) of the model parameter.

(ii) The particular model (2.8) is determined from n(L+2N+1) real values \{F(\mathrm{x},t)\in\mathbb{R}\mid\mathrm{x}\in V,t=1,2,\cdots,L+2N+1\} for L=\max(N-n,0)+1, as in Proposition 2.13. When n=1 or N, this number coincides with the real dimension (2N+1)n+N of the model parameter.

We can use the estimated mode expansion for forecasting unobserved data.

Corollary 2.19.

When \{\alpha_{j},z_{j}\} are determined from the given observation points as in Theorem 2.18, further values \{F(\mathrm{x},t)\} can be computed for any t\in\mathbb{Z}.

2.4 Example

Here, as an example, let us consider the following time-series data with n=1:

Figure 2.2: Example data in (2.20).
\displaystyle F(t)=2\sin(0.016\pi t)+3\cos(0.04\pi t)+\frac{1}{2}\sin(0.6\pi t)+3, (2.20)

for t=1,2,\cdots,100. See the plots in Figure 2.2.

Figure 2.3: [left] True mode set (red) and mode set estimated by Proposition 2.5 (blue). [right] The blue plot indicates a forecast after t=14 using the estimated modes and amplitudes, overlaid on the red true plot of Figure 2.2.

By applying Proposition 2.5 with M=7 and L=7 to \{F(t)\in\mathbb{R}\mid t=1,2,\cdots,14\}, we estimate the modes and amplitudes, and then forecast the remaining values for t>14. As is well known, Prony's method is too sensitive to extract precise modes from numerical values, given as 64-bit floating-point numbers in our case. Accordingly, the results in Figure 2.3 show that the estimated modes differ slightly from the true values, and this error accumulates to give a poor forecast.

Figure 2.4: [left] True mode set (red) and mode set estimated by Proposition 2.13 (blue). [right] The blue forecast plot after t=10 lies exactly on the red true plot of Figure 2.2.

On the other hand, Proposition 2.13 with N=3 and L=3 gives better results from a smaller number of values \{F(t)\in\mathbb{R}\mid t=1,2,\cdots,10\}, as in Figure 2.4. This is because the absolute values of the extracted modes are constrained to be 1, as explained in Figure 2.1.

In §4, we explain a more detailed algorithm with a graph recovery procedure.

3 Graph wave equation and graph recovery

In this section, we begin with a review of the graph Laplacian and the graph wave equation in §3.1 through §3.3. In order to see an explicit solution of the graph wave equation, we calculate the eigenfunctions for a path graph in §3.2, which is used in §4. Finally, we solve a graph recovery problem from graph wave functions in §3.4.

3.1 Graph Laplacian

Let \{w_{\mathrm{x}\mathrm{y}}\} be graph weights that satisfy w_{\mathrm{y}\mathrm{x}}=w_{\mathrm{x}\mathrm{y}}\geq 0 and w_{\mathrm{x}\mathrm{x}}=0 for \mathrm{x}\neq\mathrm{y}\in V. In the present paper, we regard (V,\{w_{\mathrm{x}\mathrm{y}}\}) as a weighted undirected graph that has an edge \mathrm{x}-\mathrm{y} if and only if w_{\mathrm{x}\mathrm{y}}>0.

For a function f\in\mathcal{A}_{V}, we define the graph Laplacian \mathcal{L}\colon\mathcal{A}_{V}\rightarrow\mathcal{A}_{V} as

\displaystyle\mathcal{L}f(\mathrm{x}):=\sum_{\mathrm{y}\in V}w_{\mathrm{x}\mathrm{y}}(f(\mathrm{x})-f(\mathrm{y}))=\deg(\mathrm{x})f(\mathrm{x})-\sum_{\mathrm{y}\in V}w_{\mathrm{x}\mathrm{y}}f(\mathrm{y}), (3.1)

where \deg(\mathrm{x}):=\sum_{\mathrm{y}\in V}w_{\mathrm{x}\mathrm{y}}. Note that the graph weights \{w_{\mathrm{x}\mathrm{y}}\} are defined as a metric on 1-forms over the discrete set V, and the graph Laplacian is given as the Laplace-Beltrami operator with respect to the inner product \langle f,g\rangle_{V}:=\sum_{\mathrm{x}\in V}f(\mathrm{x})g(\mathrm{x}) [22]. Since the Laplacian \mathcal{L} is self-adjoint for this inner product, we can take eigenfunctions \{v_{k}\in\mathcal{A}_{V}\mid 0\leq k\leq n-1\} as follows:

\displaystyle\mathcal{L}v_{k}=\rho_{k}v_{k},\ 0=\rho_{0}\leq\rho_{1}\leq\cdots\leq\rho_{n-1},\ \langle v_{k},v_{l}\rangle_{V}=\delta_{k,l}. (3.2)

Here, we see v_{0}=n^{-1/2}1_{V} and v_{k}\perp 1_{V} for k\geq 1. Although the eigenfunctions \{v_{k}\} have sign ambiguity even when the eigenvalues \{\rho_{k}\} are distinct, we assume that the eigenfunctions are fixed in a certain manner. Moreover, in the present paper, we assume that the weighted graph (V,\{w_{\mathrm{x}\mathrm{y}}\}) is always connected, so that \rho_{1}>0. The upper and lower bounds of the eigenvalues are well studied; for example, we have

\displaystyle\rho_{n-1}\leq 2\max_{\mathrm{x}\in V}\deg(\mathrm{x}). (3.3)

The proof and other bounds are given in [6].
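A quick numerical check of the bound (3.3) on a hypothetical random weighted graph:

```python
import numpy as np

# Hypothetical random weighted graph; check the eigenvalue bound (3.3).
rng = np.random.default_rng(0)
n = 6
W = np.triu(rng.uniform(0.1, 1.0, (n, n)), 1)
W = W + W.T                                       # symmetric weights, zero diagonal
Lap = np.diag(W.sum(axis=1)) - W

rho = np.linalg.eigvalsh(Lap)                     # rho_0 <= ... <= rho_{n-1}
print(rho[-1] <= 2 * W.sum(axis=1).max())         # True: rho_{n-1} <= 2 max deg
```

Since all the weights here are strictly positive, the graph is connected and the second-smallest eigenvalue is positive, as assumed in the text.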

3.2 Graph Fourier transform

Set \mathcal{F}[f]_{k}:=\langle f,v_{k}\rangle_{V}\in\mathbb{R} for f\in\mathcal{A}_{V}. Then, the eigenfunction expansion is given as

\displaystyle f=\sum_{k=0}^{n-1}\langle f,v_{k}\rangle_{V}v_{k}=\sum_{k=0}^{n-1}\mathcal{F}[f]_{k}v_{k}. (3.4)

When the eigenvalues \{\rho_{k}\} are distinct, this expansion is unique. In signal processing, the transformation \mathcal{F}\colon\mathcal{A}_{V}\ni f\mapsto\mathcal{F}[f]\in\mathbb{R}^{n} is called the graph Fourier transform [13]. For the support function \chi_{\mathrm{x}} at \mathrm{x}\in V, i.e., \chi_{\mathrm{x}}(\mathrm{y})=\delta_{\mathrm{x},\mathrm{y}}, we have \mathcal{F}[\chi_{\mathrm{x}}]_{k}=\langle\chi_{\mathrm{x}},v_{k}\rangle_{V}=v_{k}(\mathrm{x}). This calculation formally rewrites the eigenfunction decomposition (3.2) of the graph Laplacian \mathcal{L} as the following identity for the graph weights:

wxy\displaystyle-w_{\mathrm{x}\mathrm{y}} =χx,χyV\displaystyle=\langle\chi_{\mathrm{x}},\mathcal{L}\chi_{\mathrm{y}}\rangle_{V}
=k=0n1[χx]kvk,ρk[χy]kvkV=k=0n1vk(x)ρkvk(y).\displaystyle=\sum_{k=0}^{n-1}\langle\mathcal{F}[\chi_{\mathrm{x}}]_{k}v_{k},\rho_{k}\mathcal{F}[\chi_{\mathrm{y}}]_{k}v_{k}\rangle_{V}=\sum_{k=0}^{n-1}v_{k}(\mathrm{x})\rho_{k}v_{k}(\mathrm{y}). (3.5)
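The transform and the identity (3.5) can be checked numerically; the symmetric weight matrix below is an arbitrary illustrative choice:

```python
import numpy as np

# Arbitrary symmetric weights on three nodes (illustrative values).
W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 2.0],
              [0.5, 2.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W
rho, V = np.linalg.eigh(L)                  # columns of V are the eigenfunctions v_k

f = np.array([1.0, -2.0, 0.5])
F = V.T @ f                                 # graph Fourier transform F[f]_k = <f, v_k>
assert np.allclose(V @ F, f)                # eigenfunction expansion (3.4)

# Identity (3.5): -w_xy = sum_k v_k(x) rho_k v_k(y) for x != y.
W_rec = -(V * rho) @ V.T                    # -(V diag(rho) V^T) = -L
np.fill_diagonal(W_rec, 0.0)                # (3.5) concerns off-diagonal entries only
assert np.allclose(W_rec, W)
```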
Example 3.6 (Path graph).

Let us consider the path graph PnP_{n} in Figure 3.1, which is defined by the weights

wxy={1for x,yV such that y=x±10otherwise.\displaystyle w_{\mathrm{x}\mathrm{y}}=\begin{cases}1&\text{for $\mathrm{x},\mathrm{y}\in V$ such that $\mathrm{y}=\mathrm{x}\pm 1$}\\ 0&\text{otherwise.}\end{cases} (3.7)
Refer to caption
Figure 3.1: Path graph PnP_{n} in Example 3.6.

The eigenvalues and eigenfunctions are explicitly calculated as

ρk=22cosπkn,vk(x)=2ncosπk(2x1)2n,\displaystyle\rho_{k}=2-2\cos\frac{\pi k}{n},\ v_{k}(\mathrm{x})=\sqrt{\frac{2}{n}}\cos\frac{\pi k(2\mathrm{x}-1)}{2n}, (3.8)

for 1kn11\leq k\leq n-1. This is directly checked via the identity

expπik(2x+1)2n+expπik(2x3)2n=2cosπknexpπik(2x1)2n\displaystyle\exp\frac{\pi\mathrm{i}k(2\mathrm{x}+1)}{2n}+\exp\frac{\pi\mathrm{i}k(2\mathrm{x}-3)}{2n}=2\cos\frac{\pi k}{n}\exp\frac{\pi\mathrm{i}k(2\mathrm{x}-1)}{2n} (3.9)

together with similar identities at the boundary. In line with the upper bound (3.3), the largest eigenvalue satisfies ρn1<4\rho_{n-1}<4. See also Figure 3.2 for n=21n=21. The corresponding graph Fourier transform is known as the discrete cosine transform in image processing [4].

Similarly, the graph Fourier transform for a cycle graph is known as the discrete Fourier transform (DFT) in signal processing and image processing [4].
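The closed-form eigenpairs (3.8) can be verified directly against a numerically assembled path-graph Laplacian, e.g. for n = 6:

```python
import numpy as np

n = 6
W = np.zeros((n, n))                        # path graph P_n: w_{x, x+1} = 1
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

x = np.arange(1, n + 1)
for k in range(1, n):
    rho_k = 2 - 2 * np.cos(np.pi * k / n)
    v_k = np.sqrt(2 / n) * np.cos(np.pi * k * (2 * x - 1) / (2 * n))
    assert np.allclose(L @ v_k, rho_k * v_k)   # (3.8) gives exact eigenpairs
    assert np.isclose(v_k @ v_k, 1.0)          # each v_k is normalized
```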

3.3 Graph wave equation

By use of the graph Laplacian, we can define the following graph wave equation [11] for U𝒜V𝒞2(;)U\in\mathcal{A}_{V}\otimes\mathcal{C}^{2}(\mathbb{R};\mathbb{R}):

d2dt2U(x,t)=cU(x,t),U(x,0)=f(x),ddt|t=0U(x,t)=g(x),\displaystyle\frac{d^{2}}{dt^{2}}U(\mathrm{x},t)=-c\mathcal{L}U(\mathrm{x},t),\ U(\mathrm{x},0)=f(\mathrm{x}),\ \frac{d}{dt}\Big{|}_{t=0}U(\mathrm{x},t)=g(\mathrm{x}), (3.10)

for a certain c>0c>0 and given f,g𝒜Vf,g\in\mathcal{A}_{V}. For a solution U(x,t)U(\mathrm{x},t), the graph Fourier coefficient [U(,t)]\mathcal{F}[U(\cdot,t)] becomes the harmonic oscillator

d2dt2[U(,t)]k=cρk[U(,t)]k,[U(,0)]k=[f]k,ddt|t=0[U(,t)]k=[g]k,\displaystyle\frac{d^{2}}{dt^{2}}\mathcal{F}[U(\cdot,t)]_{k}=-c\rho_{k}\mathcal{F}[U(\cdot,t)]_{k},\ \mathcal{F}[U(\cdot,0)]_{k}=\mathcal{F}[f]_{k},\ \frac{d}{dt}\Big{|}_{t=0}\mathcal{F}[U(\cdot,t)]_{k}=\mathcal{F}[g]_{k},

for 0kn10\leq k\leq n-1; hence, we can describe U(x,t)U(\mathrm{x},t) as

U(x,t)=[f]0v0(x)+k=1n1{[f]kcoscρkt+[g]kcρksincρkt}vk(x).\displaystyle U(\mathrm{x},t)=\mathcal{F}[f]_{0}v_{0}(\mathrm{x})+\sum_{k=1}^{n-1}\left\{\mathcal{F}[f]_{k}\cos{\sqrt{c\rho_{k}}t}+\frac{\mathcal{F}[g]_{k}}{\sqrt{c\rho_{k}}}\sin{\sqrt{c\rho_{k}}t}\right\}v_{k}(\mathrm{x}). (3.11)

Therefore, the solution of graph wave equation (3.10) is given by (3.11), and in particular remains bounded, if and only if g1Vg\perp 1_{V}; otherwise the additional drift term [g]0tv0(x)\mathcal{F}[g]_{0}tv_{0}(\mathrm{x}) appears. For simplicity, we call this U(x,t)U(\mathrm{x},t) a graph wave function. In the present paper, we also assume that the Nyquist condition is satisfied:

cρk<π, for any k.\displaystyle\sqrt{c\rho_{k}}<\pi,\text{ for any }k. (3.12)

If necessary, this can always be achieved by re-parameterizing tt with a sampling rate higher than 11, which also makes cc smaller.
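The solution (3.11) can be simulated by a short mode-expansion solver; the path graph, the value √c = 1.5, and the random initial displacement below are illustrative choices, and the second time derivative is checked against −cLU by a central difference in t:

```python
import numpy as np

n = 8
W = np.zeros((n, n))                        # path graph P_n (illustrative choice)
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
rho, V = np.linalg.eigh(L)
rho = np.clip(rho, 0.0, None)               # guard against tiny negative round-off

c = 1.5 ** 2                                # sqrt(c rho_k) < 1.5 * 2 < pi: (3.12) holds
rng = np.random.default_rng(0)
f = rng.standard_normal(n)                  # initial displacement; initial velocity g = 0

def U(t):
    """Graph wave function (3.11) for g = 0: only the cosine terms survive."""
    omega = np.sqrt(c * rho)
    return V @ ((V.T @ f) * np.cos(omega * t))

assert np.allclose(U(0.0), f)               # initial condition U(., 0) = f
# Verify d^2/dt^2 U = -c L U via a central difference in t.
h, t0 = 1e-4, 0.7
Udd = (U(t0 + h) - 2 * U(t0) + U(t0 - h)) / h ** 2
assert np.allclose(Udd, -c * (L @ U(t0)), atol=1e-5)
```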

Here, it is worth noting that the above graph wave equation is naturally regarded as a discretization of the (continuous) wave equation

2t2W(x,t)=ci=1d2xi2W(x,t)\displaystyle\frac{\partial^{2}}{\partial t^{2}}W(x,t)=c\sum_{i=1}^{d}\frac{\partial^{2}}{\partial x_{i}^{2}}W(x,t)

for W𝒞2(d×;)W\in\mathcal{C}^{2}(\mathbb{R}^{d}\times\mathbb{R};\mathbb{R}), which is easily checked for the path graph PnP_{n}. Namely, for the weights (3.7), the graph Laplacian gives the second-order central difference in the spatial domain

f(x)=f(x+1)+f(x1)2f(x)2x2f(x).\displaystyle-\mathcal{L}f(\mathrm{x})=f(\mathrm{x}+1)+f(\mathrm{x}-1)-2f(\mathrm{x})\sim\frac{\partial^{2}}{\partial x^{2}}f(x).
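This observation is immediate to verify numerically at the interior nodes of a path graph:

```python
import numpy as np

n = 10
W = np.zeros((n, n))                        # path graph P_n
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

f = np.sin(0.3 * np.arange(n))              # an arbitrary function on the nodes
d2 = f[2:] + f[:-2] - 2 * f[1:-1]           # second-order central difference
assert np.allclose((-L @ f)[1:-1], d2)      # -L f equals it at the interior nodes
```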

To examine the boundary conditions, we compare the continuous and graph wave functions directly. The wave equation over [0,n][0,n] with Neumann boundary conditions is written as

2t2W(x,t)=c2x2W(x,t),x|x=0W(x,t)=0,x|x=nW(x,t)=0,\displaystyle\frac{\partial^{2}}{\partial t^{2}}W(x,t)=c\frac{\partial^{2}}{\partial x^{2}}W(x,t),\ \frac{\partial}{\partial x}\Big{|}_{x=0}W(x,t)=0,\ \frac{\partial}{\partial x}\Big{|}_{x=n}W(x,t)=0,

and its solution is explicitly described with some {ak,bk}\{a_{k},b_{k}\} by

W(x,t)=a0+k=1{akcoscπktn+bksincπktn}cosπkxn.\displaystyle W(x,t)=a_{0}+\sum_{k=1}^{\infty}\left\{a_{k}\cos\frac{\sqrt{c}\pi kt}{n}+b_{k}\sin\frac{\sqrt{c}\pi kt}{n}\right\}\cos\frac{\pi kx}{n}. (3.13)
Refer to caption
Refer to caption
Figure 3.2: For n=21n=21, eigenfunctions and eigenvalues are plotted. [left] Correspondence among five graph eigenfunctions (red dots) and the five continuous eigenfunctions (blue curves). [right] Correspondence among the graph eigenvalues (red dots) and the square of the continuous frequencies (blue curves).

This is viewed as a continuous analogue of the solution (3.11) for the path graph PnP_{n}, because the eigenfunctions {cosπk(2x1)2n}\{\cos\frac{\pi k(2\mathrm{x}-1)}{2n}\} in (3.8) are obtained from {cosπkxn}\{\cos\frac{\pi kx}{n}\} by discretization, and the eigenvalues {ρk}\{\rho_{k}\} in (3.8) are computed by

ρk=22cosπkn=2sinπk2nπkn\displaystyle\sqrt{\rho_{k}}=\sqrt{2-2\cos\frac{\pi k}{n}}=2\sin\frac{\pi k}{2n}\sim\frac{\pi k}{n}

for knk\ll n. These correspondences are illustrated in Figure 3.2.
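The half-angle identity behind this computation, and the quality of the approximation for small k, can be checked as follows:

```python
import numpy as np

n = 21
k = np.arange(1, 6)
sqrt_rho = np.sqrt(2 - 2 * np.cos(np.pi * k / n))
# Half-angle identity: sqrt(2 - 2 cos(pi k / n)) = 2 sin(pi k / (2n)), exactly.
assert np.allclose(sqrt_rho, 2 * np.sin(np.pi * k / (2 * n)))
# For k << n this is close to the continuous frequency pi k / n.
assert np.allclose(sqrt_rho, np.pi * k / n, rtol=0.05)
```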

Remark 3.14.

Even when a continuous wave function W(x,t)W(x,t) in (3.13) is evaluated at discrete points, such as x=1/2,3/2,,(2n1)/2x=1/2,3/2,\cdots,(2n-1)/2, the sampled values generally do not satisfy the graph wave equation (3.10) over a path graph. We revisit this point in §4.3.

3.4 Graph recovery from graph wave function

The target in the present paper is to give a procedure to recover the graph weights {wxy}\{w_{\mathrm{x}\mathrm{y}}\} from finite observed values at integer points {U(,t)𝒜Vt=1,2,,T}\{U(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,T\}. To state our main theorem, we simplify the representation of a graph wave function UU in (3.11) as

U(x,t)=β0v0(x)+k=1n1{βkzkt+β¯kz¯kt}vk(x)\displaystyle U(\mathrm{x},t)=\beta_{0}v_{0}(\mathrm{x})+\sum_{k=1}^{n-1}\left\{\beta_{k}z_{k}^{t}+\bar{\beta}_{k}\bar{z}_{k}^{t}\right\}v_{k}(\mathrm{x}) (3.15)

where β0\beta_{0}\in\mathbb{R}, βk\beta_{k}\in\mathbb{C}, and zk=eicρkz_{k}=e^{\mathrm{i}\sqrt{c\rho_{k}}}.

Theorem 3.16.

(i) Assume that the modes {zk}\{z_{k}\} in (3.15) are distinct and unknown. If βk0\beta_{k}\neq 0 for all kk, then the modes {zk}\{z_{k}\}, the eigenfunctions {vk}\{v_{k}\} and the amplitudes {βk}\{\beta_{k}\} are determined from the observed values at 2n2n integer points {U(,t)𝒜Vt=1,2,,2n}\{U(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,2n\}.

(ii) The graph Laplacian \mathcal{L} and the graph weights {wxy}\{w_{\mathrm{x}\mathrm{y}}\} are recovered from the given modes {zk}\{z_{k}\} and the given eigenfunctions {vk}\{v_{k}\} up to constant cc.

Proof.

The modes and amplitudes are determined by Theorem 2.18, because (3.15) is a special case of the time-series model (2.8) for N=n1N=n-1 and αk=βkvk\alpha_{k}=\beta_{k}v_{k}, where non-degenerate condition (2.9) is satisfied. Then, from amplitudes {βkvk}\{\beta_{k}v_{k}\}, the eigenfunctions {vk}\{v_{k}\} are determined by vk=1\|v_{k}\|=1, as asserted in (i).

For (ii), note that Nyquist condition (3.12) distinguishes zkz_{k} and z¯k\bar{z}_{k} by the signs of their imaginary parts. Then, cρk=argzk\sqrt{c\rho_{k}}=\arg z_{k} is defined uniquely. Hence, the assertion follows from the identity (3.5). ∎

In theory, the underlying graph is determined by Theorem 3.16. In numerical computation, on the other hand, it is often difficult to compute the underlying graph exactly because of numerical error, even though we do not assume any observation noise. Therefore, in the following section, we organize all of the above procedures into an algorithm.

4 Numerical experiments

In this section, we summarize the previous arguments into an algorithm, and then evaluate its performance on numerical datasets. In §4.1, we provide a six-step algorithm originating from Theorems 2.18 and 3.16. Then, in §4.2, §4.3, and §4.4, we apply the algorithm to graph wave signals, continuous wave signals, and human joint sensor time-series data, respectively.

4.1 Graph recovery algorithm

For given multivariate wave signals, the number nn of variables is obviously known. Then, N=n1N=n-1, L=1L=1, and T=2nT=2n in Proposition 2.13 and (2.16) should be optimal parameters to recover the underlying graph, as in the proof of Theorem 3.16. In addition, each calculated amplitude αk𝒜V\alpha_{k}\in\mathcal{A}_{V}\otimes\mathbb{C} is supposed to be decomposed into βk\beta_{k}\in\mathbb{C} and vk𝒜Vv_{k}\in\mathcal{A}_{V} as αk=βkvk\alpha_{k}=\beta_{k}v_{k}. However, in practice, numerical error can cause unexpected results despite Theorem 3.16. Therefore, we propose the following algorithm.

Algorithm 4.1.

Execute the following procedure:

Step 1

Set N,L>0N,L\in\mathbb{Z}_{>0} such that nLNnL\geq N, and take n(L+2N+1)n(L+2N+1) real values {Fx,t1xn,t=1,2,,L+2N+1}\{F_{\mathrm{x},t}\in\mathbb{R}\mid 1\leq\mathrm{x}\leq n,t=1,2,\cdots,L+2N+1\}.

Step 2

Define G(x,t;j)G(\mathrm{x},t;j) as in Lemma 2.10, and get {d^j1jN}\{\hat{d}_{j}\in\mathbb{R}\mid 1\leq j\leq N\} by solving linear equation (2.14).

Step 3

Obtain {z^j1j2N}\{\hat{z}_{j}\in\mathbb{C}\mid 1\leq j\leq 2N\} by solving the polynomial equation

\displaystyle Z^{2N}-\hat{d}_{1}Z^{2N-1}-\cdots-\hat{d}_{N-1}Z^{N+1}-\hat{d}_{N}Z^{N}-\hat{d}_{N-1}Z^{N-1}-\cdots-\hat{d}_{1}Z+1=0 (4.2)

with the above {d^j}\{\hat{d}_{j}\}, and compute θ^j:=argz^j\hat{\theta}_{j}:=\arg\hat{z}_{j} so that π<θ^j<π-\pi<\hat{\theta}_{j}<\pi for 1j2N1\leq j\leq 2N. Moreover, set z^0:=1\hat{z}_{0}:=1 and θ^0:=0\hat{\theta}_{0}:=0.

Step 4

Get {α^j𝒜V0j2N}\{\hat{\alpha}_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\mid 0\leq j\leq 2N\} by solving linear equation (2.15) for T=L+2N+1T=L+2N+1 with the above {z^j}\{\hat{z}_{j}\}.

Step 5

If the reconstruction F^(x,t):=j=02Nα^j(x)z^jt\hat{F}(\mathrm{x},t):=\sum_{j=0}^{2N}\hat{\alpha}_{j}(\mathrm{x})\hat{z}_{j}^{t} deviates substantially from a validation dataset, then go back to Step 1 and retake larger NN and LL.

Step 6

If α^j\|\hat{\alpha}_{j}\| is tiny, then set v^j:=0V\hat{v}_{j}:=0_{V}, otherwise v^j:=α^j/α^j\hat{v}_{j}:=\hat{\alpha}_{j}/\|\hat{\alpha}_{j}\|. Then, define

w^xy:=12j=12NRe(v^j(x)θ^j2v^¯j(y))\displaystyle\hat{w}_{\mathrm{x}\mathrm{y}}:=-\frac{1}{2}\sum_{j=1}^{2N}\mathop{\mathrm{Re}}\nolimits\left(\hat{v}_{j}(\mathrm{x})\hat{\theta}_{j}^{2}\bar{\hat{v}}_{j}(\mathrm{y})\right) (4.3)

for any xyV\mathrm{x}\neq\mathrm{y}\in V from the above {θ^j}\{\hat{\theta}_{j}\}. Optionally, we can retake this weight as max(w^xy,0)\max(\hat{w}_{\mathrm{x}\mathrm{y}},0) instead if the graph weights are assumed to be non-negative.
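The six steps can be sketched end-to-end in numpy. Since the objects G(x,t;j) and equations (2.14)–(2.16) are defined in §2 and not reproduced here, the sketch below substitutes a plain least-squares Prony fit for Steps 2 and 3 (fit a linear recurrence of order 2N+1, then take the roots of its characteristic polynomial); Steps 4 and 6 follow the text, and the validation of Step 5 is replaced by the final assertion. The data are a noiseless graph wave on the path graph P_5 whose graph Fourier coefficients are all set to 1, so every mode is excited:

```python
import numpy as np

# Ground truth: path graph P_5 and a noiseless graph wave sampled at t = 1, ..., T.
n, T, c = 5, 40, 1.5 ** 2
W_true = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L_true = np.diag(W_true.sum(axis=1)) - W_true
rho, V = np.linalg.eigh(L_true)
rho = np.clip(rho, 0.0, None)                      # guard tiny negative round-off
t = np.arange(1, T + 1)
# Graph wave (3.11) with g = 0 and all graph Fourier coefficients of f equal to 1.
F = V @ np.cos(np.sqrt(c * rho)[:, None] * t)      # shape (n, T)

# Steps 2-3 (substituted): least-squares fit of a linear recurrence of order
# M = 2(n-1)+1 over all nodes, then roots of its characteristic polynomial.
M = 2 * (n - 1) + 1
rows = np.vstack([[F[x, s - M:s][::-1] for s in range(M, T)] for x in range(n)])
rhs = np.hstack([F[x, M:] for x in range(n)])
d = np.linalg.lstsq(rows, rhs, rcond=None)[0]
z = np.roots(np.concatenate(([1.0], -d)))          # modes: conjugate pairs plus z = 1

# Step 4: amplitudes by least squares against the Vandermonde matrix z_j^t.
Z = z[None, :] ** t[:, None]                       # shape (T, M)
alpha = np.linalg.lstsq(Z, F.T.astype(complex), rcond=None)[0].T   # shape (n, M)

# Step 6: normalize each amplitude and assemble the weights (4.3), rescaled by 1/c.
theta = np.angle(z)
W_hat = np.zeros((n, n))
for j in range(M):
    if abs(theta[j]) < 1e-6:                       # skip the constant mode z = 1
        continue
    v = alpha[:, j] / np.linalg.norm(alpha[:, j])
    W_hat -= 0.5 * theta[j] ** 2 * np.real(np.outer(v, v.conj())) / c
np.fill_diagonal(W_hat, 0.0)

assert np.max(np.abs(W_hat - W_true)) < 1e-4       # graph recovered up to constant c
```

On this noiseless example the recovered weights agree with the path-graph weights to high accuracy, matching the behavior reported in §4.2.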

Here, we present some remarks on this algorithm. For the graph recovery in Theorem 3.16, NN is supposed to be larger than n1n-1, as in the examples of §4.2 and §4.3. On the other hand, for a practical application, such as §4.4, a smaller NN may work better to extract a significant graph. The condition nLNnL\geq N means that linear equation (2.14) is an overdetermined system. Hence, {d^j1jN}\{\hat{d}_{j}\in\mathbb{R}\mid 1\leq j\leq N\} are computed by the least squares method or the pseudo-inverse matrix of the rectangular Vandermonde matrix. Then, the modes {z^j}\{\hat{z}_{j}\} are calculated as solutions of polynomial equation (4.2). For example, this part is practically computed via the eigenvalues of the companion matrix of the coefficients {d^j}\{\hat{d}_{j}\}. As an expression of graph wave function (3.11), θ^j2=(argz^j)2\hat{\theta}_{j}^{2}=(\arg\hat{z}_{j})^{2} corresponds to the eigenvalue ρ^j\hat{\rho}_{j}. If constant cc in graph wave equation (3.10) is known, then θ^\hat{\theta} can be rescaled accordingly. Since the absolute value of z^j\hat{z}_{j} is not always 11, as explained in Remark 2.12, z^j\hat{z}_{j} is sometimes retaken as z^j:=z^j/|z^j|\hat{z}^{\prime}_{j}:=\hat{z}_{j}/|\hat{z}_{j}|. In either case, for each z^j\hat{z}_{j}, we have z^l=z^¯j\hat{z}_{l}=\bar{\hat{z}}_{j} for a certain ll. Then linear equation (2.15) becomes (2.16). Hence, by Lemma 2.17, the given values {Fx,t}\{F_{\mathrm{x},t}\} reconstruct a function F^\hat{F} as

F^(x,t):=j=02Nα^j(x)z^jt=α^0(x)+12j=12N{α^j(x)zjt+α^¯j(x)z¯jt}\displaystyle\hat{F}(\mathrm{x},t):=\sum_{j=0}^{2N}\hat{\alpha}_{j}(\mathrm{x})\hat{z}_{j}^{t}=\hat{\alpha}_{0}(\mathrm{x})+\frac{1}{2}\sum_{j=1}^{2N}\{\hat{\alpha}_{j}(\mathrm{x})z_{j}^{t}+\bar{\hat{\alpha}}_{j}(\mathrm{x})\bar{z}_{j}^{t}\} (4.4)

for any tt\in\mathbb{Z}. In an ideal case, α^j\hat{\alpha}_{j} is decomposed as β^jv^j\hat{\beta}_{j}\hat{v}_{j} by β^j\hat{\beta}_{j}\in\mathbb{C}, and then the weights are estimated from identity (3.5). Instead, we set α^j=:α^jv^j\hat{\alpha}_{j}=:\|\hat{\alpha}_{j}\|\hat{v}_{j} so that v^j=1\|\hat{v}_{j}\|=1. Since v^j\hat{v}_{j} is in 𝒜V\mathcal{A}_{V}\otimes\mathbb{C} in general, we replace vj(x)ρjvj(y)v_{j}(\mathrm{x})\rho_{j}{v}_{j}(\mathrm{y}) by Rev^j(x)ρ^jv^¯j(y)\mathop{\mathrm{Re}}\nolimits\hat{v}_{j}(\mathrm{x})\hat{\rho}_{j}\bar{\hat{v}}_{j}(\mathrm{y}) as in (4.3), in order to eliminate the ambiguity of a unit complex factor on v^j\hat{v}_{j}. Basically, this Rev^j(x)ρ^jv^¯j(y)\mathop{\mathrm{Re}}\nolimits\hat{v}_{j}(\mathrm{x})\hat{\rho}_{j}\bar{\hat{v}}_{j}(\mathrm{y}) depends only on α^j(x)\hat{\alpha}_{j}(\mathrm{x}) and α^j(y)\hat{\alpha}_{j}(\mathrm{y}) up to the constant α^j\|\hat{\alpha}_{j}\|; thus, partially observed values along VV are supposed to give a subgraph. We check this in §4.2.
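As noted above, the roots of polynomial equation (4.2) are practically computed as the eigenvalues of a companion matrix (this is also what `numpy.roots` does internally). A minimal illustration with arbitrary palindromic coefficients, which also exhibits the z ↔ 1/z symmetry of (4.2):

```python
import numpy as np

# Palindromic polynomial (4.2) for N = 2 with illustrative coefficients d1, d2.
d1, d2 = 0.5, 0.3
coeffs = np.array([1.0, -d1, -d2, -d1, 1.0])      # Z^4 - d1 Z^3 - d2 Z^2 - d1 Z + 1

# Companion matrix of a monic polynomial: first row holds the negated coefficients.
m = len(coeffs) - 1
C = np.zeros((m, m))
C[0, :] = -coeffs[1:]
C[1:, :-1] = np.eye(m - 1)
z = np.linalg.eigvals(C)                           # roots as eigenvalues

assert np.allclose(np.sort_complex(z), np.sort_complex(np.roots(coeffs)))
# Palindromic symmetry: if z is a root, so is 1/z.
assert np.allclose(np.sort_complex(z), np.sort_complex(1 / z))
```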

Furthermore, we consider the meaning of the weight (4.3).

Refer to caption
Refer to caption
Figure 4.1: [left] Sample multivariate function given in (4.5). [right] Estimated weights for the sample.

Since θ^0=0\hat{\theta}_{0}=0, the constant term α^0\hat{\alpha}_{0} in (4.4) does not affect the weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\}. The other term α^j(x)\hat{\alpha}_{j}(\mathrm{x}) indicates the existence of the mode zjz_{j} at node x\mathrm{x}. Hence, when FxF_{\mathrm{x}} and FyF_{\mathrm{y}} have no common mode, they have no connection, i.e., w^xy=0\hat{w}_{\mathrm{x}\mathrm{y}}=0. Moreover, when they are synchronized, w^xy\hat{w}_{\mathrm{x}\mathrm{y}} becomes negative, and when they are anti-synchronized, w^xy\hat{w}_{\mathrm{x}\mathrm{y}} becomes positive. To examine this property, let us consider the following example:

F(x,t)\displaystyle F(\mathrm{x},t) =52x+cos(π3x+π5t)\displaystyle=-\frac{5}{2}\mathrm{x}+\cos\left(\frac{\pi}{3}\mathrm{x}+\frac{\pi}{5}t\right)
=52x+12exp(iπ3x)exp(iπ5t)+12exp(iπ3x)exp(iπ5t)\displaystyle=-\frac{5}{2}\mathrm{x}+\frac{1}{2}\exp\left(\mathrm{i}\frac{\pi}{3}\mathrm{x}\right)\exp\left(\mathrm{i}\frac{\pi}{5}t\right)+\frac{1}{2}\exp\left(-\mathrm{i}\frac{\pi}{3}\mathrm{x}\right)\exp\left(-\mathrm{i}\frac{\pi}{5}t\right) (4.5)

for 1x71\leq\mathrm{x}\leq 7. We have θ^1=π/5\hat{\theta}_{1}=\pi/5 and v^1(x)=exp(iπx/3)/7\hat{v}_{1}(\mathrm{x})=\exp(\mathrm{i}\pi\mathrm{x}/3)/\sqrt{7}, and then the positivized weights {max(w^xy,0)}\{\max(\hat{w}_{\mathrm{x}\mathrm{y}},0)\} are calculated as in Figure 4.1. There, the maximum value of the weights is (π/5)2/70.0564(\pi/5)^{2}/7\fallingdotseq 0.0564.
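This example can be reproduced in a few lines; the single mode and its conjugate partner are inserted by hand into the weight formula (4.3):

```python
import numpy as np

# Single-mode example (4.5): theta_1 = pi/5 and v_1(x) = exp(i pi x / 3) / sqrt(7).
x = np.arange(1, 8)
theta = np.pi / 5
v = np.exp(1j * np.pi * x / 3) / np.sqrt(7)

w = np.zeros((7, 7))
for vj in (v, v.conj()):                    # the mode and its conjugate partner
    w -= 0.5 * theta ** 2 * np.real(np.outer(vj, vj.conj()))
np.fill_diagonal(w, 0.0)
w_plus = np.maximum(w, 0.0)                 # positivized weights max(w, 0)

# Anti-synchronized pairs (|x - y| = 3) attain the maximum (pi/5)^2 / 7 = 0.0564...
assert np.isclose(w_plus.max(), (np.pi / 5) ** 2 / 7)
assert np.isclose(w_plus[0, 3], (np.pi / 5) ** 2 / 7)
```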

4.2 Graph wave over an interval

We hereinafter demonstrate Algorithm 4.1 in order to verify its effectiveness. The first example is a graph wave function on an interval, as explained in §3.3. For n=21n=21 and V={1x21}V=\{1\leq\mathrm{x}\leq 21\}, let us take an initial function ff in the graph wave equation (3.10) over the path graph P21P_{21} as

f~(x)=x115+(small random value), for xV\displaystyle\tilde{f}(\mathrm{x})=\frac{\mathrm{x}-11}{5}+\text{(small random value)},\text{ for }\mathrm{x}\in V (4.6)

and g~=0V\tilde{g}=0_{V}.

Refer to caption
Refer to caption
Figure 4.2: [left] Initial function given in (4.6) for the graph wave equation. [right] Corresponding graph Fourier transform.

Note that {[f~]k}\{\mathcal{F}[\tilde{f}]_{k}\} satisfy the assumption in Theorem 3.16, as in Figure 4.2. Moreover, we take c=1.5\sqrt{c}=1.5 so that even the largest eigenvalue satisfies Nyquist condition (3.12):

cρk<1.5×2<π,\displaystyle\sqrt{c\rho_{k}}<1.5\times 2<\pi,

where upper bound (3.3) is used. Then, by applying the expression (3.11) to these conditions, we can see that the wave behaves like water sloshing in a bottle, as in Figure 4.3. Although the function is continuous in the time variable tt, as on the left-hand side, only integer time points, shown as the heat map on the right-hand side, are used in the algorithm.

Refer to caption Refer to caption
Figure 4.3: [left] Graph wave function given from initial conditions (4.6) and g~=0V\tilde{g}=0_{V} over the path graph P21P_{21}. [right] Heatmap representation by discretization.

To this dataset, we apply Algorithm 4.1.

Refer to caption
Refer to caption
Figure 4.4: [left] Forty-one true modes (red) and 53 estimated modes (blue). [right] Recovered graph structure (10x), which is supposed to be the path graph P21P_{21} in (3.7).

First, let us take L=3L=3 and N=26N=26 in Step 1. Then, Steps 2 and 3 yield 5353 modes {z^j}\{\hat{z}_{j}\} from 5656 observation points along the time axis, as on the left-hand side in Figure 4.4. Note that although 5353 is larger than 4141 (the number of actual modes), such redundancy gives better results in numerical computation. Then, we obtain the corresponding amplitudes {α^j}\{\hat{\alpha}_{j}\} and weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\} from Steps 4 and 6. Since cc is given as above, θ^j=argz^j/c\hat{\theta}_{j}=\arg\hat{z}_{j}/\sqrt{c} can be used in Step 6 to calculate the weights instead of θ^j\hat{\theta}_{j}. On the right-hand side in Figure 4.4, we multiply the recovered weights by 1010 for visualization; the result is quite close to the original weights of the path graph (3.7). Finally, we check the mean squared error at each tt, as in Figure 4.5, to verify Step 5.

Refer to caption
Figure 4.5: Mean squared error between exact data and reconstructed data.

Since the error is tiny, even for test data, we conclude that the proposed graph recovery algorithm works well for this exact graph wave setting.

By modifying this dataset, let us consider another case in which only partially observed values with respect to the spatial variables are available.

Refer to caption
Figure 4.6: Six observation points along the xx-axis are missing from Figure 4.3. The expected graph on 15 black points is shown on the left-hand side.

For V={1x21}V=\{1\leq\mathrm{x}\leq 21\}, assume the values on x=3,7,10,12,16\mathrm{x}=3,7,10,12,16, and 1717 are missing. Then, the recovered graph should be the subgraph of the path graph shown in Figure 4.6.

To this dataset, we apply the same procedure as above, namely, take L=3L=3 and N=26N=26, and then obtain 5353 modes {z^j}\{\hat{z}_{j}\}. As the result on the left-hand side of Figure 4.7 shows, the extracted modes are basically the same as in the previous case, i.e., Figure 4.4. Then, the corresponding amplitudes {α^j}\{\hat{\alpha}_{j}\} and weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\} are also estimated, as on the right-hand side in Figure 4.7, which is close to what we expected in Figure 4.6.

Refer to caption
Refer to caption
Figure 4.7: [left] Forty-one true modes (red) and 53 estimated modes (blue). [right] Recovered graph structure (10x), which is the same as the subgraph of the path graph in Figure 4.6 up to a constant multiple.

These examples show that Algorithm 4.1 can correctly recover the underlying graph and subgraph from an observed graph wave function as Theorem 3.16 guarantees. In the following subsections, we consider other cases in which multivariate signals are not explicitly assumed to have a graph structure under variables.

4.3 Continuous wave over an interval

Unlike in §4.2, here we consider a continuous wave function on the interval, as explained in (3.13). Then, we apply Algorithm 4.1 to its observed values in the expectation that the algorithm extracts a graph in the shape of an interval, even though the observed values do not satisfy the graph wave equation over a path graph.

Refer to caption Refer to caption
Figure 4.8: [left] From f~\tilde{f} (red), we define an initial function W~(x,0)\widetilde{W}(x,0) (blue) by taking the leading 1111 terms of the graph Fourier series. [right] Continuous wave function given from the initial functions.

Take n=21n=21 in (3.13) as above. For an initial function W(x,0)W(x,0), let us use a smoothed version of f~\tilde{f} in (4.6). By using the identities

\displaystyle a_{0}=\frac{1}{n}\int_{0}^{n}W(x,0)dx,\ a_{k}=\frac{2}{n}\int_{0}^{n}W(x,0)\cos\frac{\pi kx}{n}dx\ \text{ for }k>0,

we define

a~0=121xVf~(x),a~k=221xVf~(x)cosπk(2x1)42 for k>0.\displaystyle\tilde{a}_{0}=\frac{1}{21}\sum_{\mathrm{x}\in V}\tilde{f}(\mathrm{x}),\ \tilde{a}_{k}=\frac{2}{21}\sum_{\mathrm{x}\in V}\tilde{f}(\mathrm{x})\cos\frac{\pi k(2\mathrm{x}-1)}{42}\ \text{ for }k>0.

In fact, these coefficients are nothing but the graph Fourier coefficients of f~\tilde{f}, namely, 21/2a~k=[f~]k\sqrt{21/2}\tilde{a}_{k}=\mathcal{F}[\tilde{f}]_{k} for 1kn11\leq k\leq n-1. Then, by taking the first 1111 coefficients and assuming t|t=0W~(x,t)=0\frac{\partial}{\partial t}|_{t=0}\widetilde{W}(x,t)=0, i.e., bk=0b_{k}=0 for any kk, we obtain a continuous wave function

W~(x,t)=a~0+k=110a~kcoscπkt21cosπkx21\displaystyle\widetilde{W}(x,t)=\tilde{a}_{0}+\sum_{k=1}^{10}\tilde{a}_{k}\cos\frac{\sqrt{c}\pi kt}{21}\cos\frac{\pi kx}{21} (4.7)

for c=1.5\sqrt{c}=1.5, which is shown in Figure 4.8. Here, we emphasize that this wave function is continuously defined for x[0,21]x\in[0,21] and tt\in\mathbb{R}, and thus there exist various ways to discretize it by evaluation. Moreover, the number 1111 is taken so that each frequency cπk/21\sqrt{c}\pi k/21 satisfies Nyquist condition (3.12), although this choice also serves another purpose, as explained below.

First of all, we evaluate the above W~(x,t)\widetilde{W}(x,t) at 2121 points along the xx-axis, x=x0.5x=\mathrm{x}-0.5 for x=1,2,,21\mathrm{x}=1,2,\cdots,21, and integer points along the time axis.

Refer to caption
Refer to caption
Figure 4.9: [left] Twenty-one true modes (red) and 25 estimated modes (blue). [right] Estimated graph structure (10x) over the 21 observation points.

For these observed values, we take L=1L=1 and N=12N=12 in Step 1. Then, we obtain 2525 modes {z^j}\{\hat{z}_{j}\} from 2626 observation points from Steps 2 and 3, as on the left-hand side in Figure 4.9, and the corresponding amplitudes {α^j}\{\hat{\alpha}_{j}\} and weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\} from Steps 4 and 6, as on the right-hand side.

Refer to caption
Figure 4.10: Upper estimated graph having the lower path graph as a subgraph.

The estimated weights indicate the path-like graph at the top of Figure 4.10, which includes the path graph below. Thus, the same procedure as for the partially observed dataset in Figure 4.6 can be applied to obtain the path graph.

Refer to caption
Refer to caption
Figure 4.11: [left] Eleven observation points reduced from the 21 points. [right] Estimated graph structure (10x) over the 11 observation points.

Namely, we try another discretization x=2x1.5x=2\mathrm{x}-1.5 for x=1,2,,11\mathrm{x}=1,2,\cdots,11, given on the left-hand side in Figure 4.11. Then, by using the same parameters, we successfully obtain weights close to those of the path graph, as on the right-hand side in Figure 4.11, which is expected from the bottom path graph in Figure 4.10.

This phenomenon can be explained in another way. From 21×2621\times 26 observed values {W~(x,t)x=0.5,1.5,,20.5,t=1,2,,26}\{\widetilde{W}(x,t)\in\mathbb{R}\mid x=0.5,1.5,\cdots,20.5,t=1,2,\cdots,26\}, a (21,12)(21,12) matrix formed by {G(x,t;j)}\{G(\mathrm{x},t;j)\} is constructed as in linear equation (2.14) in Step 2.

Refer to caption
Figure 4.12: Twelve singular values of the (21,12)(21,12) matrix.

Here, we can see that the (21,12)(21,12) matrix has rank 1010, which means that its singular values consist of 1010 major values and 22 minor values, as in Figure 4.12. This is consistent with the definition of W~(x,t)\widetilde{W}(x,t), where we take the 1010 oscillatory terms in the summation of (4.7) other than the constant term. Hence, it is sufficient to use more than 1010 observation points, such as the 11×2611\times 26 observed values {W~(x,t)x=0.5,2.5,,20.5,t=1,2,,26}\{\widetilde{W}(x,t)\in\mathbb{R}\mid x=0.5,2.5,\cdots,20.5,t=1,2,\cdots,26\} as above, in order to extract an outline of the underlying graph.
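Since G(x,t;j) is defined in §2 and not reproduced here, a rough proxy for this rank argument is the raw snapshot matrix of the sampled wave, whose numerical rank is 11 (the 10 oscillatory terms plus the constant); the coefficients below are illustrative nonzero stand-ins for the ã_k:

```python
import numpy as np

n, c = 21, 1.5 ** 2
rng = np.random.default_rng(2)
a = 0.5 + rng.random(11)                    # stand-ins for a_0, ..., a_10, all nonzero
x = np.arange(1, n + 1) - 0.5               # spatial samples x - 0.5
t = np.arange(1, 27)                        # 26 time samples

# Snapshot matrix of the truncated wave (4.7): constant term plus 10 oscillations.
S = a[0] + sum(a[k] * np.outer(np.cos(np.pi * k * x / n),
                               np.cos(np.sqrt(c) * np.pi * k * t / n))
               for k in range(1, 11))

assert S.shape == (21, 26)
assert np.linalg.matrix_rank(S) == 11       # 10 oscillatory modes + the constant
```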

4.4 Pose model

As a more practical case, we consider a graph estimation problem for human joint tracking data. To this end, we use the open-source dataset PKU-MMD, which consists of RGB, skeleton, depth, and IR sequential data for 51 action categories [7]. The skeleton data used herein were obtained by Kinect v2, whose joints are described in Figure 4.13 [1]. More precisely, we use the 2,100th to 2,800th frames from the "0007-M" sample in the dataset.

Refer to caption Refer to caption
Figure 4.13: [left] Meanings of 25 joints. [right] Corresponding joint numbers.

Although the skeleton data provide the three-dimensional coordinates xx, yy, and zz of each joint location in space, we only use the xx and yy coordinates in this experiment. Then, we try to extract relations among 2525 moving x+iyx+\mathrm{i}y joint locations as graphs, which are supposed to characterize individual actions.

Refer to caption
Figure 4.14: [top] Time-series data of the 25 xx-coordinates of human joints. [middle] Time-series data of the 25 yy-coordinates of human joints. [bottom] Four actions starting from t=65,265,425t=65,265,425, and 585585 with the 25 x+iyx+\mathrm{i}y-coordinates of human joints.

Let Fx(x,t)F_{x}(\mathrm{x},t) and Fy(x,t)F_{y}(\mathrm{x},t) denote functions of the xx and yy locations in an RGB image, respectively, for the joint nodes 1x251\leq\mathrm{x}\leq 25 and the times t=0,1,,600t=0,1,\cdots,600. First, regarding the data as n=2×25=50n=2\times 25=50 time series, we apply Steps 1 to 5 of Algorithm 4.1 with L=4L=4 and N=5N=5 to a sequence of length 15 (=L+2N+1) starting from the t1t_{1}-th frame. Then, we obtain mode decompositions

F^x(x,t;t1)=α^x,0,t1(x)+j=110α^x,j,t1(x)z^jt,F^y(x,t;t1)=α^y,0,t1(x)+j=110α^y,j,t1(x)z^jt\displaystyle\hat{F}_{x}(\mathrm{x},t;t_{1})=\hat{\alpha}_{x,0,t_{1}}(\mathrm{x})+\sum_{j=1}^{10}\hat{\alpha}_{x,j,t_{1}}(\mathrm{x})\hat{z}_{j}^{t},\ \ \hat{F}_{y}(\mathrm{x},t;t_{1})=\hat{\alpha}_{y,0,t_{1}}(\mathrm{x})+\sum_{j=1}^{10}\hat{\alpha}_{y,j,t_{1}}(\mathrm{x})\hat{z}_{j}^{t} (4.8)

for any node x\mathrm{x} so as to coincide with Fx(x,t)F_{x}(\mathrm{x},t) and Fy(x,t)F_{y}(\mathrm{x},t) on the sequence t1t<t1+15t_{1}\leq t<t_{1}+15. See also Lemma 2.17. Here, we can see that F^x(x,t;t1)+iF^y(x,t;t1)\hat{F}_{x}(\mathrm{x},t;t_{1})+\mathrm{i}\hat{F}_{y}(\mathrm{x},t;t_{1}) represents the time evolution of the x+iyx+\mathrm{i}y location in an RGB image. Finally, we calculate (4.3) for α^j,t1=α^x,j,t1+iα^y,j,t1\hat{\alpha}_{j,t_{1}}=\hat{\alpha}_{x,j,t_{1}}+\mathrm{i}\hat{\alpha}_{y,j,t_{1}} in order to estimate weights among joint locations at t=t1t=t_{1}. Note that although Fx(x,t)F_{x}(\mathrm{x},t) and Fy(x,t)F_{y}(\mathrm{x},t) no longer satisfy the graph wave equation for the estimated graph, we suppose that the graph weights extract meaningful features from the decomposition (4.8).

For example, we pick out four sequences representing "clapping" from the 2,165th to 2,180th frames (t=65t=65), "cross hands in front" from the 2,365th to 2,380th frames (t=265t=265), "hand waving" from the 2,525th to 2,540th frames (t=425t=425), and "jump up" from the 2,685th to 2,700th frames (t=585t=585). The corresponding data are shown in Figure 4.14. Then, we draw the resulting graphs in Figure 4.15, where each edge indicates that the estimated weight is larger than 0.20.2.

Refer to caption
Figure 4.15: Graphs of the results. From left to right, "clapping", "cross hands in front", "hand waving", and "jump up".

As we can see, the edges in each case connect the nodes and pairs that characterize the action. In particular, since we pay attention only to positive weights, Algorithm 4.1 ignores synchronized pairs and extracts anti-synchronized pairs. Hence, even in the "jump up" case, only the leg motion is highlighted against all upward-moving nodes. Judging from these cases, we expect that the proposed graph recovery algorithm works well even for non-wave signals.

5 Conclusion

In the present paper, we introduced a graph recovery method from a graph wave function based on a modified DMD algorithm. Then, we showed its effectiveness by using three examples, i.e., a graph wave function, a continuous wave function, and human joint tracking data, ranging from theoretical demonstration to practical application. In the present study, we did not assume the existence of noise in the time-series model. Hence, further study is needed in order to ensure that the proposed algorithm is robust against noise, such as white noise, with the help of known techniques in signal processing, time-series analysis, and statistics.

Acknowledgements

The author would like to thank Yosuke Otsubo for discussing and suggesting DMD technology for further study. The author is also grateful to Satoshi Takahashi, Tetsuya Koike, Chikara Nakamura, Bausan Yuan, Ping-Wei Chang, and Pranav Gundewar for their constant encouragement.

References

  • [1] Kinect for Windows v2 reference, JointType enumeration. https://docs.microsoft.com/en-us/previous-versions/windows/kinect/dn758662(v=ieb.10).
  • [2] Amaia Abanda, Usue Mori, and Jose A Lozano. A review on distance based time series classification. Data Mining and Knowledge Discovery, 33(2):378–412, 2019.
  • [3] Faisal Ahmed, Padma Polash Paul, and Marina L Gavrilova. Dtw-based kernel and rank-level fusion for 3d gait recognition using kinect. The Visual Computer, 31(6):915–924, 2015.
  • [4] Rosa A Asmara and Reza Agustina. Comparison of discrete cosine transforms (dct), discrete fourier transforms (dft), and discrete wavelet transforms (dwt) in digital image watermarking. International Journal of Advanced Computer Science and Applications, 8(2):245–249, 2017.
  • [5] Donald J Berndt and James Clifford. Using dynamic time warping to find patterns in time series. In KDD workshop, volume 10, pages 359–370. Seattle, WA, USA:, 1994.
  • [6] Fan R. K. Chung. Spectral graph theory. Number 92 in CBMS Regional Conference Series in Mathematics. American Mathematical Society, 1997.
  • [7] Liu Chunhui, Hu Yueyu, Li Yanghao, Song Sijie, and Liu Jiaying. Pku-mmd: A large scale benchmark for continuous multi-modal human action understanding. arXiv preprint arXiv:1703.07475, 2017.
  • [8] Sacha Epskamp, Lourens J Waldorp, René Mõttus, and Denny Borsboom. The gaussian graphical model in cross-sectional and time-series data. Multivariate behavioral research, 53(4):453–480, 2018.
  • [9] Christos Faloutsos, Mudumbai Ranganathan, and Yannis Manolopoulos. Fast subsequence matching in time-series databases. ACM Sigmod Record, 23(2):419–429, 1994.
  • [10] George E Forsythe. Generation and use of orthogonal polynomials for data-fitting with a digital computer. Journal of the Society for Industrial and Applied Mathematics, 5(2):74–88, 1957.
  • [11] Joel Friedman and Jean-Pierre Tillich. Wave equations for graphs and the edge-based laplacian. Pacific Journal of Mathematics, 216(2):229–266, 2004.
  • [12] Paolo Giudici and Alessandro Spelta. Graphical network models for international financial flows. Journal of Business & Economic Statistics, 34(1):128–138, 2016.
  • [13] David K Hammond, Pierre Vandergheynst, and Rémi Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011.
  • [14] Meng Li and Howard Leung. Graph-based approach for 3d human skeletal action recognition. Pattern Recognition Letters, 87:195–202, 2017.
  • [15] Ta-Hsin Li. Time series with mixed spectra. CRC Press, 2013.
  • [16] Ta-Hsin Li and Benjamin Kedem. Asymptotic analysis of a multiple frequency estimation method. Journal of multivariate analysis, 46(2):214–236, 1993.
  • [17] Gerlind Plonka and Manfred Tasche. Prony methods for recovery of structured functions. GAMM-Mitteilungen, 37(2):239–258, 2014.
  • [18] Ivan Popivanov and Renee J Miller. Similarity search over time-series data using wavelets. In Proceedings 18th international conference on data engineering, pages 212–221. IEEE, 2002.
  • [19] R Prony. Essai experimental et analytique. Journal de L’Ecole Polytechnique, 1(2):24–76, 1795.
  • [20] Peter J Schmid. Dynamic mode decomposition of numerical and experimental data. Journal of fluid mechanics, 656:5–28, 2010.
  • [21] Kilian Stampfer and Gerlind Plonka. The generalized operator based prony method. Constructive Approximation, 52(2):247–282, 2020.
  • [22] Yuuya Takayama. Geometric formulation for discrete points and its applications. arXiv preprint arXiv:2002.03767, 2020.
  • [23] Jeff KT Tang and Howard Leung. Retrieval of logically relevant 3d human motions by adaptive feature selection with graded relevance feedback. Pattern Recognition Letters, 33(4):420–430, 2012.
  • [24] Jonathan H Tu. Dynamic mode decomposition: Theory and applications. PhD thesis, Princeton University, 2013.
  • [25] Caroline Uhler. Gaussian graphical models: An algebraic and geometric perspective. arXiv preprint arXiv:1707.04345, 2017.
  • [26] Ting Wang, Zhao Ren, Ying Ding, Zhou Fang, Zhe Sun, Matthew L MacDonald, Robert A Sweet, Jieru Wang, and Wei Chen. Fastggm: an efficient algorithm for the inference of gaussian graphical model in biological networks. PLoS computational biology, 12(2):e1004755, 2016.