
Graph recovery from graph wave equation

Yuuya Takayama
Nikon Corporation, Japan
[email protected]
Abstract

We propose a method by which to recover an underlying graph from a set of multivariate wave signals discretely sampled from a solution of the graph wave equation. Here, the graph wave equation is defined via the graph Laplacian, and its solution is given explicitly as a mode expansion in the Laplacian eigenvalues and eigenfunctions. For graph recovery, our idea is to extract the modes corresponding to the square roots of the eigenvalues from the discrete wave signals using dynamic mode decomposition (DMD), and then to reconstruct the graph (Laplacian) from the eigenfunctions obtained as the amplitudes of these modes. Moreover, in order to estimate the modes more precisely, we modify the DMD method under the assumption that only stationary modes exist, which graph wave functions always satisfy.

Finally, we demonstrate the proposed method on wave signals over a path graph. Since our graph recovery procedure can also be applied to non-wave signals, we further evaluate its performance on human joint sensor time-series data.

1 Introduction

Graphs have been studied in mathematics for over a century and have recently become important in science and applied fields such as physics, chemistry, finance, signal processing, and machine learning. Once a graph is given, mathematical theory provides powerful and useful results about its Laplacian eigenvalues, connectivity, hitting times, and so on. For practical data, however, the graph structure is sometimes unknown, so reconstructing the underlying graph from a given dataset is a fundamental problem. Gaussian graphical models (GGMs), which assume that the data are generated from a multivariate Gaussian distribution, are a well-known solution in machine learning [25]. In the present paper, we instead assume that multivariate time-series data are generated from the graph wave equation, and provide a procedure to recover the weighted graph structure over the variables from the observed dataset. Moreover, based on this procedure, we propose an algorithm that constructs a reasonable graph from any multivariate time-series data, without assuming that the data follow the graph wave equation.

The graph wave equation was first studied by Friedman and Tillich [11] as a discrete analogue of the continuous wave equation. Its general solutions, i.e., graph wave functions, are given as

\displaystyle U(\mathrm{x},t)=\sum_{k}\left\{a_{k}\cos\left(\sqrt{\rho_{k}}\,t\right)+\frac{b_{k}}{\sqrt{\rho_{k}}}\sin\left(\sqrt{\rho_{k}}\,t\right)\right\}v_{k}(\mathrm{x}),

by the eigenfunctions \{v_{k}\} of the graph Laplacian, where \mathrm{x} indicates a node of the graph. In fact, a graph wave function for a path graph is viewed as a spatial discretization of a conventional wave function over an interval. In the present paper, as a discrete multivariate time-series model, we use the set of observed values of a graph wave function at integer points t\in\mathbb{Z}, namely, \{U(\mathrm{x},t)\mid\forall\mathrm{x},t\in\mathbb{Z}\}. Note that this discretization in the time direction loses no information about the waves if the Nyquist condition \sqrt{\rho_{k}}<\pi is satisfied for every k. When the underlying graph is unknown, so are the frequencies \{\sqrt{\rho_{k}}\}. In contrast, it is easier to estimate the eigenfunctions \{v_{k}\} when the frequencies are known, by the curve-fitting (i.e., least squares) method [10] [15, §5]. Hence, in order to recover the graph, it is necessary to estimate the frequencies from observed values of a graph wave function.

To this end, we use the dynamic mode decomposition (DMD) proposed by Schmid [20] and studied by Tu et al. [24]. The essence of DMD is to describe the (T+1)-th observed set as a linear combination of the previous T observed sets

\displaystyle U(\mathrm{x},T+1)=c_{1}U(\mathrm{x},T)+c_{2}U(\mathrm{x},T-1)+\cdots+c_{T}U(\mathrm{x},1)

with c_{j}\in\mathbb{R}. Then, the modes \{z_{j}\} are calculated as the solutions of the polynomial equation Z^{T}-c_{1}Z^{T-1}-\cdots-c_{T-1}Z-c_{T}=0, just as in Prony's method [19], but are shared by all (spatial) variables \{U(\mathrm{x},\cdot)\}. Moreover, each mode z_{j} is directly related to a frequency \sqrt{\rho_{k}} as z_{j}=e^{\pm\mathrm{i}\sqrt{\rho_{k}}} for some j and k. Here, it is worth emphasizing that the mode set \{z_{j}\} satisfies a stationary condition |z_{j}|=1 and a conjugate condition \bar{z}_{j}=z_{l} (for some l), so the coefficients \{c_{j}\} satisfy a symmetric condition [16]. In fact, this idea makes the frequency calculation robust against numerical error and reduces the number of observation points needed along the time axis. As a result, this modified DMD method enables us to extract common frequencies more efficiently from multivariate time-series data, although other approaches, such as [15, 17, 21], may be applicable to frequency estimation for the time series of each variable.

From the calculated frequencies, we can estimate the graph Laplacian and the graph weights, in addition to the eigenfunctions \{v_{k}\}, as explained above. To be more precise, the graph weights are given as

\displaystyle w_{\mathrm{x}\mathrm{y}}=-\sum_{k}v_{k}(\mathrm{x})\rho_{k}v_{k}(\mathrm{y}).

For any multivariate time-series data, we extend this form by replacing \{v_{k}\} with estimated eigenfunction-like functions. The constructed frequency-based weights have the following remarkable properties:

  • the weights are independent of the mean of each variable,

  • when two variables U(\mathrm{x},\cdot) and U(\mathrm{y},\cdot) have no common frequency, then w_{\mathrm{x}\mathrm{y}}=0, and

  • when two variables U(\mathrm{x},\cdot) and U(\mathrm{y},\cdot) are anti-synchronized, then w_{\mathrm{x}\mathrm{y}}>0.
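The weight identity above can be sanity-checked numerically. The following minimal sketch, on a hypothetical 3-node weighted graph, computes the eigen-pairs of the graph Laplacian with numpy and reconstructs the weight matrix from them:

```python
import numpy as np

# Hypothetical 3-node weighted graph; W is the symmetric weight matrix.
W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 2.0],
              [0.5, 2.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W           # graph Laplacian: degrees minus weights

rho, V = np.linalg.eigh(L)               # eigenvalues rho_k, eigenfunctions v_k (columns)

# w_xy = -sum_k v_k(x) rho_k v_k(y) for x != y
W_rec = -(V * rho) @ V.T
np.fill_diagonal(W_rec, 0.0)
print(np.allclose(W_rec, W))             # True
```

This works because the off-diagonal part of the Laplacian is exactly the negated weight matrix.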

In contrast to the frequency-based graph, Gaussian graphical models (GGMs) define graph weights by the correlation between two variables after conditioning on all other datasets [25, §8] and are widely applied in biology [26], psychology [8], and finance [12]. Another perspective suggests defining distances for time series, such as dynamic time warping (DTW) [5] and feature-based distances [9, 18]. These distances define graph weights through distance kernels for various time-series analysis or machine learning tasks [2]. In particular, for the human joint time-series data we deal with, a graph structure is often constructed based on the Joint Relative Distance, defined by the pairwise Euclidean distances between two 3D (or 2D) joint locations taken by a physical sensor. Namely, graph weights are given by its variance [23], its relative variance [14], or its DTW [3]. Compared with these studies, the proposed method has the advantage that it directly extracts temporal-order information, in particular the frequencies of the time series, and moreover requires a smaller amount of data.

The remainder of the present paper is organized as follows. In §2, we review ways to extract modes from equispaced-sampled (multivariate) time-series data, known as Prony's method and the DMD. We also explain how to impose a stationary mode condition on these methods. In §3, we first recall the definitions of the graph Laplacian and the graph wave equation. Next, we see that a solution of the graph wave equation (a graph wave function) for a path graph naturally corresponds to a continuous wave over an interval, which is used for a demonstration in §4. Then, we solve the graph recovery problem for an observed dataset of graph wave functions using the modified DMD method. This procedure is summarized and extended in §4. Finally, we evaluate its performance through three examples: graph wave signals over a path graph, discretely sampled continuous wave signals over an interval, and human joint sensor time-series data.

2 Mode extraction techniques

In this section, we review mode extraction methods for observed multivariate time-series datasets. We then explain how to modify these methods for a mode set satisfying special conditions in §2.2. In §2.3, we also mention amplitude estimation methods, which are combined into mode expansion techniques. Finally, we briefly explain examples of the mode expansion results in §2.4.

2.1 Mode extraction problem setting

Let V be a finite set \{1,2,\cdots,n\} and let \mathcal{A}_{V} denote the set of functions \{f\colon V\rightarrow\mathbb{R}\}, which is an \mathbb{R}-vector space. When n=1, \mathcal{A}_{V} is identified with \mathbb{R}. We define two constant functions 0_{V}(\mathrm{x}):=0 and 1_{V}(\mathrm{x}):=1 for any \mathrm{x}\in V.

In addition, let \{z_{1},\cdots,z_{M}\} be the roots of a polynomial equation

\displaystyle Z^{M}-c_{1}Z^{M-1}-c_{2}Z^{M-2}-\cdots-c_{M-1}Z-c_{M}=0. (2.1)

Needless to say, each coefficient is represented by c_{j}=(-1)^{j+1}e_{j}(z_{1},\cdots,z_{M}), where e_{j} denotes the j-th elementary symmetric polynomial. We assume that the roots \{z_{j}\} are distinct, and then define a time-series model F as the time evolution of the roots

\displaystyle F(\mathrm{x},t):=\sum_{j=1}^{M}\alpha_{j}(\mathrm{x})z_{j}^{t},\text{ for }\mathrm{x}\in V,\ t\in\mathbb{Z}, (2.2)

where \alpha_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\setminus 0_{V}. We could define F for t\in\mathbb{R} by fixing \arg z_{j}, although this definition is not used herein. In this context, we refer to the roots \{z_{j}\} and the coefficients \{\alpha_{j}\} as the modes and amplitudes, respectively, of F. Now, we assume that the amplitudes satisfy the non-degeneracy condition

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[\alpha_{1},\cdots,\alpha_{M}]=\min(M,n). (2.3)

Note that this condition can always be arranged by taking a subset V^{\prime}\subset V and considering all of the above over V^{\prime} if necessary. Then, because of (2.1), the time-series model F satisfies

\displaystyle F(\mathrm{x},t+M)-c_{1}F(\mathrm{x},t+M-1)-\cdots-c_{M-1}F(\mathrm{x},t+1)-c_{M}F(\mathrm{x},t)=0 (2.4)

for any \mathrm{x}\in V and any t\in\mathbb{Z}. This relation is regarded as a linear equation for the coefficients \{c_{j}\}. Hence, the modes \{z_{j}\} of F can be determined from the temporally observed sets \{F(\cdot,t)\}.

Proposition 2.5.

Set L=\max(M-n,0)+1. The mode set \{z_{j}\} in the model F (2.2) is determined from observed values at L+M integer points \{F(\cdot,t)\in\mathcal{A}_{V}\otimes\mathbb{C}\mid t=1,2,\cdots,L+M\} if condition (2.3) is satisfied.

This result is viewed as Prony's method [19] when n=1, and as the dynamic mode decomposition [20] when n\geq M.
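For n=1, the proposition reduces to a small linear solve followed by a root-finding step. A minimal numpy sketch, using a hypothetical damped mode pair rather than data from the paper:

```python
import numpy as np

# Hypothetical n = 1 signal with one damped mode pair (M = 2, so L = M - n + 1 = 2).
z_true = 0.9 * np.exp(0.5j)
ts = np.arange(1, 5)                     # L + M = 4 samples suffice
F = 2.0 * ((1.0 + 0.5j) * z_true ** ts).real

# Linear system (2.6): shifted windows against the unknowns (c_2, c_1).
A = np.array([[F[0], F[1]],
              [F[1], F[2]]])
c2, c1 = np.linalg.solve(A, np.array([F[2], F[3]]))

# The modes are the roots of Z^2 - c_1 Z - c_2 = 0, cf. (2.1).
modes = np.roots([1.0, -c1, -c2])
print(np.sort_complex(modes))            # the conjugate pair 0.9 e^{∓0.5i}
```

With exact data the recovered pair matches the true modes up to floating-point error; the sensitivity of this step to noise is the motivation for the modification in §2.2.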

Proof..

Let us consider the following linear equation, according to (2.4):

\displaystyle\left[\begin{smallmatrix}F(1,1)&F(1,2)&\cdots&F(1,M)\\ \vdots&\vdots&&\vdots\\ F(n,1)&F(n,2)&\cdots&F(n,M)\\ F(1,2)&F(1,3)&\cdots&F(1,M+1)\\ \vdots&\vdots&&\vdots\\ F(n,2)&F(n,3)&\cdots&F(n,M+1)\\ \vdots&\vdots&&\vdots\\ F(1,L)&F(1,L+1)&\cdots&F(1,L+M-1)\\ \vdots&\vdots&&\vdots\\ F(n,L)&F(n,L+1)&\cdots&F(n,L+M-1)\\ \end{smallmatrix}\right]\left[\begin{smallmatrix}c_{M}\\ c_{M-1}\\ \vdots\\ c_{1}\end{smallmatrix}\right]=\left[\begin{smallmatrix}F(1,M+1)\\ \vdots\\ F(n,M+1)\\ F(1,M+2)\\ \vdots\\ F(n,M+2)\\ \vdots\\ F(1,L+M)\\ \vdots\\ F(n,L+M)\end{smallmatrix}\right]. (2.6)

Here, our goal is to show that the (nL,M) matrix on the left has rank M. It is rewritten as

\displaystyle\left[\begin{smallmatrix}\alpha_{1}(1)&\alpha_{2}(1)&\cdots&\alpha_{M}(1)\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)&\alpha_{2}(n)&\cdots&\alpha_{M}(n)\\ \alpha_{1}(1)z_{1}&\alpha_{2}(1)z_{2}&\cdots&\alpha_{M}(1)z_{M}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)z_{1}&\alpha_{2}(n)z_{2}&\cdots&\alpha_{M}(n)z_{M}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(1)z_{1}^{L-1}&\alpha_{2}(1)z_{2}^{L-1}&\cdots&\alpha_{M}(1)z_{M}^{L-1}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)z_{1}^{L-1}&\alpha_{2}(n)z_{2}^{L-1}&\cdots&\alpha_{M}(n)z_{M}^{L-1}\end{smallmatrix}\right]\left[\begin{smallmatrix}z_{1}&z_{1}^{2}&\cdots&z_{1}^{M}\\ z_{2}&z_{2}^{2}&\cdots&z_{2}^{M}\\ \vdots&\vdots&&\vdots\\ z_{M}&z_{M}^{2}&\cdots&z_{M}^{M}\end{smallmatrix}\right]

by applying the definition (2.2) of F. Since the \{z_{j}\} are assumed to be distinct, the Vandermonde matrix on the right has full rank. We claim that the matrix on the left also has rank M. When n\geq M, that is, L=1, this follows from condition (2.3). Suppose n<M, that is, L=M-n+1. Take the M-dimensional row vectors

\displaystyle{\mathbf{v}}_{j,k}=\left(\alpha_{1}(j)z_{1}^{k},\alpha_{2}(j)z_{2}^{k},\cdots,\alpha_{M}(j)z_{M}^{k}\right)

for j=1,2,\cdots,n and k=0,1,\cdots,L-1. The non-degeneracy condition (2.3) means that the n vectors \{{\mathbf{v}}_{j,0}\} are linearly independent. If necessary, replace {\mathbf{v}}_{1,0} by a linear combination of \{{\mathbf{v}}_{j,0}\} so that all entries of {\mathbf{v}}_{1,0} are non-zero, which is possible because \alpha_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\setminus 0_{V}. Hence, we first obtain

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1}]=L

because

\displaystyle\left[\begin{smallmatrix}\alpha_{1}(1)&\alpha_{2}(1)&\cdots&\alpha_{M}(1)\\ \alpha_{1}(1)z_{1}&\alpha_{2}(1)z_{2}&\cdots&\alpha_{M}(1)z_{M}\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(1)z_{1}^{L-1}&\alpha_{2}(1)z_{2}^{L-1}&\cdots&\alpha_{M}(1)z_{M}^{L-1}\end{smallmatrix}\right]=\left[\begin{smallmatrix}1&1&\cdots&1\\ z_{1}&z_{2}&\cdots&z_{M}\\ \vdots&\vdots&&\vdots\\ z_{1}^{L-1}&z_{2}^{L-1}&\cdots&z_{M}^{L-1}\end{smallmatrix}\right]\left[\begin{smallmatrix}\alpha_{1}(1)&0&\cdots&0\\ 0&\alpha_{2}(1)&&0\\ \vdots&&\ddots&\\ 0&0&\cdots&\alpha_{M}(1)\end{smallmatrix}\right].

Next, if {\mathbf{v}}_{2,0}\in\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1}], then we can write {\mathbf{v}}_{2,0}=\sum_{l=0}^{L-1}\beta_{l}{\mathbf{v}}_{1,l} for some \beta_{l}. This also gives {\mathbf{v}}_{2,L-1}=\sum_{l=0}^{L-1}\beta_{l}{\mathbf{v}}_{1,L+l-1}. Hence, we obtain

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1},{\mathbf{v}}_{2,0},{\mathbf{v}}_{2,1},\cdots,{\mathbf{v}}_{2,L-1}]\geq L+1,

because condition (2.3) means {\mathbf{v}}_{2,L-1}\neq\beta_{0}{\mathbf{v}}_{1,L-1}. Otherwise, \mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},{\mathbf{v}}_{1,1},\cdots,{\mathbf{v}}_{1,L-1},{\mathbf{v}}_{2,0}] is again an (L+1)-dimensional vector space. By induction, we conclude

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[{\mathbf{v}}_{1,0},\cdots,{\mathbf{v}}_{1,L-1},{\mathbf{v}}_{2,0},\cdots,{\mathbf{v}}_{2,L-1},\cdots,{\mathbf{v}}_{n,0},\cdots,{\mathbf{v}}_{n,L-1}]\geq L+n-1=M.

Therefore, the coefficients \{c_{j}\} are obtained as the unique solution of the linear equation (2.6), and the modes \{z_{j}\} are determined as the solutions of (2.1). ∎

Remark 2.7.

In practice, it is often the case that the number of modes is also unknown. In that case, one can compute the number of modes as the rank of the matrix on the left of (2.6) for sufficiently large M and L.
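A minimal illustration of this rank-based count, on a hypothetical n=1 signal with three modes (including Z=1) and deliberately oversized M and L:

```python
import numpy as np

# Hypothetical n = 1 signal whose mode count (here 3, including Z = 1) is "unknown".
ts = np.arange(1, 21)
F = 2.0 + 2.0 * np.cos(0.3 * ts) + 2.0 * np.sin(0.3 * ts)

# Stacked matrix of (2.6) with deliberately large M and L; its rank is the mode count.
M_big, L_big = 8, 10
H = np.array([[F[t + j] for j in range(M_big)] for t in range(L_big)])
print(np.linalg.matrix_rank(H))          # 3
```

In floating-point arithmetic the rank is read off from the singular values, so a tolerance appropriate to the noise level should be passed to `matrix_rank` for real data.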

2.2 Special conditions for mode set

Now, we consider a special case of (2.2). In practice, the time-series model F often contains a constant term, which is represented as a mode Z=1. In addition, it is usual to suppose that the other modes are stationary and that F is \mathbb{R}-valued.

Figure 2.1: [left] General mode set in (2.2). [center] Particular mode set in (2.8), lying on the unit circle, vertically symmetric and including Z=1, like the DFT modes but not equispaced. [right] DFT mode set, lying on the unit circle at equal intervals.

These conditions indicate a particular representation

\displaystyle F(\mathrm{x},t)=\alpha_{0}(\mathrm{x})+\sum_{j=1}^{N}\left\{\alpha_{j}(\mathrm{x})z_{j}^{t}+\bar{\alpha}_{j}(\mathrm{x})\bar{z}_{j}^{t}\right\},\text{ for }t\in\mathbb{Z}, (2.8)

where \alpha_{0}\in\mathcal{A}_{V}\setminus 0_{V}, \alpha_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\setminus 0_{V}, and the modes \{z_{j}\mid 1\leq j\leq N\} are distinct and satisfy \mathop{\mathrm{Im}}\nolimits z_{j}>0 and |z_{j}|=1. This form is also viewed as a generalization of the inverse discrete Fourier transform (DFT), whose modes are given by \{z_{j}=\exp(2\pi\mathrm{i}j/N)\}. See the difference between the mode sets in Figure 2.1. As above, we assume that the amplitudes satisfy the non-degeneracy condition

\displaystyle\mathop{\mathrm{dim}}\nolimits_{\mathbb{C}}\mathop{\mathrm{Span}}\nolimits[\alpha_{0},\alpha_{1},\cdots,\alpha_{N}]=\min(N+1,n). (2.9)

Then, as a special case of (2.4), we have the following lemma:

Lemma 2.10.

Take d_{j}=(-1)^{j+1}e_{j}(z_{1},\cdots,z_{N},\bar{z}_{1},\cdots,\bar{z}_{N}) for 1\leq j\leq 2N. Then, G(\mathrm{x},t;k):=F(\mathrm{x},t+N+1+k)-F(\mathrm{x},t+N+k)+F(\mathrm{x},t+N+1-k)-F(\mathrm{x},t+N-k)\ (0\leq k\leq N) satisfies

\displaystyle G(\mathrm{x},t;N)-d_{1}G(\mathrm{x},t;N-1)-\cdots-d_{N-1}G(\mathrm{x},t;1)-\frac{1}{2}d_{N}G(\mathrm{x},t;0)=0. (2.11)
Proof..

For simplicity, set z_{0}=1. Then, all Z=z_{0},z_{1},\cdots,z_{N},\bar{z}_{1},\cdots,\bar{z}_{N} satisfy

\displaystyle(Z-1)\left(Z^{2N}-d_{1}Z^{2N-1}-d_{2}Z^{2N-2}-\cdots-d_{2N-1}Z-d_{2N}\right)=0.

Now, we have d_{2N}=-1 and d_{2N-j}=d_{j} because of the conditions on the mode set \{z_{j}\}, as used in [16]. Hence, the above equation can be rewritten as

\displaystyle Z^{2N+1}-Z^{2N}+Z-1-d_{1}(Z^{2N}-Z^{2N-1}+Z^{2}-Z)-\cdots-d_{N-1}(Z^{N+2}-Z^{N+1}+Z^{N}-Z^{N-1})-d_{N}(Z^{N+1}-Z^{N})=0.

Therefore, by replacing Z^{k} by F(\mathrm{x},t+k), we obtain the assertion. ∎

Remark 2.12.

For any mode set \{z_{j}\in\mathbb{C}\mid|z_{j}|=1,1\leq j\leq N\}, the coefficients \{d_{j}\} given in Lemma 2.10 satisfy d_{2N}=-1 and d_{2N-j}=d_{j}, as above. However, the converse is not always true. Namely, for the polynomial equation

\displaystyle Z^{2N}-d_{1}Z^{2N-1}-\cdots-d_{N-1}Z^{N+1}-d_{N}Z^{N}-d_{N-1}Z^{N-1}-\cdots-d_{1}Z+1=0,

when Z=z is a root, \bar{z} and 1/z are roots as well, but these two roots do not always coincide, as might be expected. For example, (Z-2\mathrm{i})(Z+2\mathrm{i})(Z-\frac{1}{2}\mathrm{i})(Z+\frac{1}{2}\mathrm{i})=Z^{4}+\frac{17}{4}Z^{2}+1 gives a counterexample for N=2.
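The counterexample can be checked numerically; the polynomial is palindromic (d_1 = 0, d_2 = -17/4), yet none of its four roots lies on the unit circle:

```python
import numpy as np

# Z^4 + (17/4) Z^2 + 1 from Remark 2.12: symmetric coefficients, no unit-circle roots.
roots = np.roots([1.0, 0.0, 17.0 / 4.0, 0.0, 1.0])
print(np.sort(np.abs(roots)))            # [0.5, 0.5, 2.0, 2.0]
```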

The above lemma helps us to determine the modes from fewer observation points than Proposition 2.5, which needs L+M=2N+2+\max(2N+1-n,0) observation points for model (2.8).

Proposition 2.13.

Set L=\max(N-n,0)+1. The mode set in model F of (2.8) is determined from observed values at L+2N+1 integer points \{F(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,L+2N+1\} if condition (2.9) is satisfied.

Proof..

In the linear equation (2.6) for the coefficients \{c_{j}\}, take M=2N+1 and L as asserted. Then, the (nL,2N+1) matrix has rank N because of the non-degeneracy condition (2.9). By using the argument in the proof of Lemma 2.10, we can transform that linear equation into the following equation for the coefficients \{d_{j}\}:

\displaystyle\left[\begin{smallmatrix}\frac{1}{2}G(1,1;0)&G(1,1;1)&\cdots&G(1,1;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(n,1;0)&G(n,1;1)&\cdots&G(n,1;N-1)\\ \frac{1}{2}G(1,2;0)&G(1,2;1)&\cdots&G(1,2;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(n,2;0)&G(n,2;1)&\cdots&G(n,2;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(1,L;0)&G(1,L;1)&\cdots&G(1,L;N-1)\\ \vdots&\vdots&&\vdots\\ \frac{1}{2}G(n,L;0)&G(n,L;1)&\cdots&G(n,L;N-1)\end{smallmatrix}\right]\left[\begin{smallmatrix}d_{N}\\ d_{N-1}\\ \vdots\\ d_{1}\end{smallmatrix}\right]=\left[\begin{smallmatrix}G(1,1;N)\\ \vdots\\ G(n,1;N)\\ G(1,2;N)\\ \vdots\\ G(n,2;N)\\ \vdots\\ G(1,L;N)\\ \vdots\\ G(n,L;N)\end{smallmatrix}\right]. (2.14)

Here, the (nL,N) matrix on the left still has rank N, and then the coefficients \{d_{j}\} and the modes \{z_{j}\} are determined. ∎
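The following sketch implements the stationary-constrained estimation of Proposition 2.13 for n=1, on a hypothetical signal of the form (2.8) with N=2 modes: it forms the combinations G(t;k) of Lemma 2.10, solves (2.14) for the coefficients d_j, and reads the frequencies off the roots of the palindromic polynomial of Remark 2.12.

```python
import numpy as np

# Hypothetical real signal of the form (2.8): a constant plus N = 2 stationary modes.
w = np.array([0.4, 1.1])                          # true frequencies, |z_j| = 1
N, L = 2, 2                                       # n = 1, so L = max(N - n, 0) + 1 = 2
ts = np.arange(1, L + 2 * N + 2)                  # only L + 2N + 1 = 7 samples needed
F = 3 + 2 * np.cos(w[0] * ts) + np.cos(w[1] * ts + 0.3)

def G(t, k):                                      # G(t; k) of Lemma 2.10, 1-based t
    f = lambda s: F[s - 1]
    return f(t + N + 1 + k) - f(t + N + k) + f(t + N + 1 - k) - f(t + N - k)

# Linear system (2.14) for the unknowns (d_N, ..., d_1).
A = np.array([[0.5 * G(t, 0)] + [G(t, k) for k in range(1, N)]
              for t in range(1, L + 1)])
rhs = np.array([G(t, N) for t in range(1, L + 1)])
d = np.linalg.solve(A, rhs)[::-1]                 # reorder to (d_1, ..., d_N)

# Palindromic polynomial of Remark 2.12; its roots are {z_j, conj(z_j)}.
coeffs = np.concatenate([[1.0], -d, -d[:-1][::-1], [1.0]])
roots = np.roots(coeffs)
freqs = np.sort(np.angle(roots[roots.imag > 1e-9]))
print(freqs)                                      # ≈ [0.4, 1.1]
```

Because the stationarity and conjugacy of the modes are built into the polynomial, the estimated modes cannot drift off the unit circle, which is the robustness advantage discussed in §2.4.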

2.3 Amplitude estimation

Once the modes \{z_{j}\} in the general model (2.2) or the particular model (2.8) are obtained, the corresponding amplitudes \{\alpha_{j}\} are calculated by the curve-fitting method. For model (2.2), we have

\displaystyle\left[\begin{smallmatrix}F(1,1)&F(1,2)&\cdots&F(1,T)\\ \vdots&\vdots&&\vdots\\ F(n,1)&F(n,2)&\cdots&F(n,T)\end{smallmatrix}\right]=\left[\begin{smallmatrix}\alpha_{1}(1)&\alpha_{2}(1)&\cdots&\alpha_{M}(1)\\ \vdots&\vdots&&\vdots\\ \alpha_{1}(n)&\alpha_{2}(n)&\cdots&\alpha_{M}(n)\end{smallmatrix}\right]\left[\begin{smallmatrix}z_{1}&z_{1}^{2}&\cdots&z_{1}^{T}\\ z_{2}&z_{2}^{2}&\cdots&z_{2}^{T}\\ \vdots&\vdots&&\vdots\\ z_{M}&z_{M}^{2}&\cdots&z_{M}^{T}\end{smallmatrix}\right], (2.15)

for any T\in\mathbb{Z}_{>0}. Since the Vandermonde matrix on the right has rank M when T\geq M, observed values at M integer points \{F(\cdot,t)\in\mathcal{A}_{V}\otimes\mathbb{C}\mid t=1,2,\cdots,M\} determine all of the amplitudes \{\alpha_{j}\}.

Likewise, for model (2.8), we have

\displaystyle\left[\begin{smallmatrix}F(1,1)&F(1,2)&\cdots&F(1,T)\\ \vdots&\vdots&&\vdots\\ F(n,1)&F(n,2)&\cdots&F(n,T)\end{smallmatrix}\right]=\left[\begin{smallmatrix}\alpha_{0}(1)&\alpha_{1}(1)&\bar{\alpha}_{1}(1)&\cdots&\bar{\alpha}_{N}(1)\\ \vdots&\vdots&\vdots&&\vdots\\ \alpha_{0}(n)&\alpha_{1}(n)&\bar{\alpha}_{1}(n)&\cdots&\bar{\alpha}_{N}(n)\end{smallmatrix}\right]\left[\begin{smallmatrix}1&1&\cdots&1\\ z_{1}&z_{1}^{2}&\cdots&z_{1}^{T}\\ \bar{z}_{1}&\bar{z}_{1}^{2}&\cdots&\bar{z}_{1}^{T}\\ \vdots&\vdots&&\vdots\\ \bar{z}_{N}&\bar{z}_{N}^{2}&\cdots&\bar{z}_{N}^{T}\end{smallmatrix}\right], (2.16)

for any T\in\mathbb{Z}_{>0}. Hence, the observed values at 2N+1 integer points \{F(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,2N+1\} determine all of the amplitudes \{\alpha_{j}\}. More generally, we can obtain this expansion even for a dataset not generated from model (2.8). In other words, the following lemma holds.

Lemma 2.17.

Any 2N+1 real values \{F_{t}\in\mathbb{R}\mid t=1,2,\cdots,2N+1\} are represented as

\displaystyle F_{t}=a_{0}z_{0}+\sum_{j=1}^{N}\{a_{j}z_{j}^{t}+a_{j+N}\bar{z}_{j}^{t}\},\text{ for }t=1,2,\cdots,2N+1,

by distinct modes z_{0}=1 and \{z_{j},\bar{z}_{j}\mid 1\leq j\leq N\}, and a_{j}\in\mathbb{C}. Moreover, a_{0}\in\mathbb{R} and a_{j+N}=\bar{a}_{j} hold.

Proof..

Take the (2N+1)-dimensional vectors

\displaystyle{\mathbf{v}}_{0}=(1,1,\cdots,1),\ {\mathbf{v}}_{j}=(z_{j},z_{j}^{2},\cdots,z_{j}^{2N+1}),\ {\mathbf{v}}_{j+N}=(\bar{z}_{j},\bar{z}_{j}^{2},\cdots,\bar{z}_{j}^{2N+1})

for 1\leq j\leq N. These vectors form a basis of \mathbb{C}^{2N+1} because the modes are distinct. Then, we have their dual vectors \{{\mathbf{v}}_{j}^{*}\mid 0\leq j\leq 2N\} such that \langle{\mathbf{v}}_{j},{\mathbf{v}}^{*}_{l}\rangle=\delta_{j,l}, which are given by the inverse of the matrix formed by \{{\mathbf{v}}_{j}\}. Therefore, a_{j} is defined by \langle{\mathbf{F}},{\mathbf{v}}^{*}_{j}\rangle for {\mathbf{F}}=(F_{1},\cdots,F_{2N+1}). Moreover, for 1\leq j\leq N, we have \bar{a}_{j}=\langle{\mathbf{F}},\bar{{\mathbf{v}}}^{*}_{j}\rangle=\langle{\mathbf{F}},{\mathbf{v}}^{*}_{j+N}\rangle=a_{j+N} because \bar{{\mathbf{v}}}^{*}_{j} is the dual vector of \bar{{\mathbf{v}}}_{j}={\mathbf{v}}_{j+N}. ∎
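A small numerical check of Lemma 2.17, with hypothetical distinct unit modes and arbitrary real samples: solving the Vandermonde system recovers amplitudes with the stated symmetry.

```python
import numpy as np

# Lemma 2.17 for N = 2: hypothetical distinct unit modes, arbitrary real samples.
N = 2
z = np.exp(1j * np.array([0.7, 2.0]))
Ft = np.array([1.0, -0.3, 2.2, 0.5, 1.7])          # F_t for t = 1, ..., 2N + 1

modes = np.concatenate([[1.0 + 0j], z, z.conj()])  # z_0 = 1, z_j, conj(z_j)
t = np.arange(1, 2 * N + 2)
Vand = modes[None, :] ** t[:, None]                # entry (t, j): z_j^t
a = np.linalg.solve(Vand, Ft)                      # amplitudes a_0, ..., a_{2N}

print(abs(a[0].imag) < 1e-9)                       # True: a_0 is real
print(np.allclose(a[N+1:], a[1:N+1].conj()))       # True: a_{j+N} = conj(a_j)
```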

In either case, the number of observation points needed to determine the amplitudes is less than that needed to determine the modes, as shown in Propositions 2.5 and 2.13. Therefore, the mode expansion techniques are summarized as follows:

Theorem 2.18.

(i) The general model (2.2) is determined from n(L+M) complex values \{F(\mathrm{x},t)\in\mathbb{C}\mid\mathrm{x}\in V,t=1,2,\cdots,L+M\} for L=\max(M-n,0)+1, as in Proposition 2.5. When n=1 or M, this number coincides with the complex dimension M(n+1) of the model parameter.

(ii) The particular model (2.8) is determined from n(L+2N+1) real values \{F(\mathrm{x},t)\in\mathbb{R}\mid\mathrm{x}\in V,t=1,2,\cdots,L+2N+1\} for L=\max(N-n,0)+1, as in Proposition 2.13. When n=1 or N, this number coincides with the real dimension (2N+1)n+N of the model parameter.

We can use the estimated mode expansion for forecasting unobserved data.

Corollary 2.19.

When \{\alpha_{j},z_{j}\} are determined from the given observation points as in Theorem 2.18, further values \{F(\mathrm{x},t)\} can be computed for any t\in\mathbb{Z}.

2.4 Example

Here, as an example, let us consider the following time-series data with n=1:

Figure 2.2: Example data in (2.20).
\displaystyle F(t)=2\sin(0.016\pi t)+3\cos(0.04\pi t)+\frac{1}{2}\sin(0.6\pi t)+3, (2.20)

for t=1,2,\cdots,100. See the plots in Figure 2.2.

Figure 2.3: [left] True mode set (red) and mode set estimated by Proposition 2.5 (blue). [right] The blue plot indicates a forecast after t=14 using the estimated modes and amplitudes, overlaid on the red true plot of Figure 2.2.

By applying Proposition 2.5 with M=7 and L=7 to \{F(t)\in\mathbb{R}\mid t=1,2,\cdots,14\}, we estimate the modes and amplitudes, and then forecast the remaining values for t>14. As is well known, Prony's method is too sensitive to extract precise modes from numerical values, given as 64-bit floating-point numbers in our case. Accordingly, the results in Figure 2.3 show that the estimated modes differ slightly from the true values, and this error accumulates to give a poor forecast.

Figure 2.4: [left] True mode set (red) and mode set estimated by Proposition 2.13 (blue). [right] The blue forecast plot after t=10 lies exactly on the red true plot of Figure 2.2.

On the other hand, Proposition 2.13 with N=3 and L=3 gives better results from a smaller number of values \{F(t)\in\mathbb{R}\mid t=1,2,\cdots,10\}, as in Figure 2.4. This is because the absolute values of the extracted modes are constrained to be 1, as explained in Figure 2.1.

In §4, we explain a more detailed algorithm with a graph recovery procedure.

3 Graph wave equation and graph recovery

In this section, we begin with a review of the graph Laplacian and the graph wave equation in §3.1 through §3.3. In order to see an explicit solution of the graph wave equation, we calculate the eigenfunctions for a path graph in §3.2, which is used in §4. Finally, we solve a graph recovery problem from graph wave functions in §3.4.

3.1 Graph Laplacian

Let \{w_{\mathrm{x}\mathrm{y}}\} be graph weights that satisfy w_{\mathrm{y}\mathrm{x}}=w_{\mathrm{x}\mathrm{y}}\geq 0 and w_{\mathrm{x}\mathrm{x}}=0 for \mathrm{x}\neq\mathrm{y}\in V. In the present paper, we regard (V,\{w_{\mathrm{x}\mathrm{y}}\}) as a weighted undirected graph that has an edge \mathrm{x}-\mathrm{y} if and only if w_{\mathrm{x}\mathrm{y}}>0.

For a function f\in\mathcal{A}_{V}, we define the graph Laplacian \mathcal{L}\colon\mathcal{A}_{V}\rightarrow\mathcal{A}_{V} as

\displaystyle\mathcal{L}f(\mathrm{x}):=\sum_{\mathrm{y}\in V}w_{\mathrm{x}\mathrm{y}}(f(\mathrm{x})-f(\mathrm{y}))=\deg(\mathrm{x})f(\mathrm{x})-\sum_{\mathrm{y}\in V}w_{\mathrm{x}\mathrm{y}}f(\mathrm{y}), (3.1)

where \deg(\mathrm{x}):=\sum_{\mathrm{y}\in V}w_{\mathrm{x}\mathrm{y}}. Note that the graph weights \{w_{\mathrm{x}\mathrm{y}}\} are defined as a metric on 1-forms over the discrete set V, and the graph Laplacian is given as the Laplace-Beltrami operator with respect to the inner product \langle f,g\rangle_{V}:=\sum_{\mathrm{x}\in V}f(\mathrm{x})g(\mathrm{x}) [22]. Since the Laplacian \mathcal{L} is self-adjoint for this inner product, we can take eigenfunctions \{v_{k}\in\mathcal{A}_{V}\mid 0\leq k\leq n-1\} as follows:

\displaystyle\mathcal{L}v_{k}=\rho_{k}v_{k},\ 0=\rho_{0}\leq\rho_{1}\leq\cdots\leq\rho_{n-1},\ \langle v_{k},v_{l}\rangle_{V}=\delta_{k,l}. (3.2)

Here, we see v_{0}=n^{-1/2}1_{V} and v_{k}\perp 1_{V} for k\geq 1. Although the eigenfunctions \{v_{k}\} have sign ambiguity even when the eigenvalues \{\rho_{k}\} are distinct, we assume that the eigenfunctions are fixed in a certain manner. Moreover, in the present paper, we assume that the weighted graph (V,\{w_{\mathrm{x}\mathrm{y}}\}) is always connected, so that \rho_{1}>0. The upper and lower bounds of the eigenvalues are well studied; for example, we have

\displaystyle\rho_{n-1}\leq 2\max_{\mathrm{x}\in V}\deg(\mathrm{x}). (3.3)

The proof and other bounds are given in [6].
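A quick numerical check of the bound (3.3) on a hypothetical random weighted graph:

```python
import numpy as np

# Hypothetical random weighted graph; check the eigenvalue bound (3.3).
rng = np.random.default_rng(0)
n = 6
W = np.triu(rng.uniform(0.1, 1.0, (n, n)), 1)
W = W + W.T                                       # symmetric weights, zero diagonal
Lap = np.diag(W.sum(axis=1)) - W

rho = np.linalg.eigvalsh(Lap)                     # rho_0 <= ... <= rho_{n-1}
print(rho[-1] <= 2 * W.sum(axis=1).max())         # True: rho_{n-1} <= 2 max deg
```

Since all the weights here are strictly positive, the graph is connected and the second-smallest eigenvalue is positive, as assumed in the text.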

3.2 Graph Fourier transform

Set \mathcal{F}[f]_{k}:=\langle f,v_{k}\rangle_{V}\in\mathbb{R} for f\in\mathcal{A}_{V}. Then, the eigenfunction expansion is given as

\displaystyle f=\sum_{k=0}^{n-1}\langle f,v_{k}\rangle_{V}v_{k}=\sum_{k=0}^{n-1}\mathcal{F}[f]_{k}v_{k}. (3.4)

When the eigenvalues \{\rho_{k}\} are distinct, this expansion is unique. In signal processing, the transformation \mathcal{F}\colon\mathcal{A}_{V}\ni f\mapsto\mathcal{F}[f]\in\mathbb{R}^{n} is called the graph Fourier transform [13]. For the support function \chi_{\mathrm{x}} at \mathrm{x}\in V, i.e., \chi_{\mathrm{x}}(\mathrm{y})=\delta_{\mathrm{x},\mathrm{y}}, we have \mathcal{F}[\chi_{\mathrm{x}}]_{k}=\langle\chi_{\mathrm{x}},v_{k}\rangle_{V}=v_{k}(\mathrm{x}). This calculation formally rewrites the eigenfunction decomposition (3.2) of the graph Laplacian \mathcal{L} as the following identity for the graph weights:

wxy\displaystyle-w_{\mathrm{x}\mathrm{y}} =χx,χyV\displaystyle=\langle\chi_{\mathrm{x}},\mathcal{L}\chi_{\mathrm{y}}\rangle_{V}
=k=0n1[χx]kvk,ρk[χy]kvkV=k=0n1vk(x)ρkvk(y).\displaystyle=\sum_{k=0}^{n-1}\langle\mathcal{F}[\chi_{\mathrm{x}}]_{k}v_{k},\rho_{k}\mathcal{F}[\chi_{\mathrm{y}}]_{k}v_{k}\rangle_{V}=\sum_{k=0}^{n-1}v_{k}(\mathrm{x})\rho_{k}v_{k}(\mathrm{y}). (3.5)
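The transform and the identity (3.5) can be checked numerically; the symmetric weight matrix below is an arbitrary illustrative choice:

```python
import numpy as np

# Arbitrary symmetric weights on three nodes (illustrative values).
W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 2.0],
              [0.5, 2.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W
rho, V = np.linalg.eigh(L)                  # columns of V are the eigenfunctions v_k

f = np.array([1.0, -2.0, 0.5])
F = V.T @ f                                 # graph Fourier transform F[f]_k = <f, v_k>
assert np.allclose(V @ F, f)                # eigenfunction expansion (3.4)

# Identity (3.5): -w_xy = sum_k v_k(x) rho_k v_k(y) for x != y.
W_rec = -(V * rho) @ V.T                    # -(V diag(rho) V^T) = -L
np.fill_diagonal(W_rec, 0.0)                # (3.5) concerns off-diagonal entries only
assert np.allclose(W_rec, W)
```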
Example 3.6 (Path graph).

Let us consider the path graph PnP_{n} in Figure 3.1, which is defined by the weights

wxy={1for x,yV such that y=x±10otherwise.\displaystyle w_{\mathrm{x}\mathrm{y}}=\begin{cases}1&\text{for $\mathrm{x},\mathrm{y}\in V$ such that $\mathrm{y}=\mathrm{x}\pm 1$}\\ 0&\text{otherwise.}\end{cases} (3.7)
Refer to caption
Figure 3.1: Path graph PnP_{n} in Example 3.6.

The eigenvalues and eigenfunctions are explicitly calculated as

ρk=22cosπkn,vk(x)=2ncosπk(2x1)2n,\displaystyle\rho_{k}=2-2\cos\frac{\pi k}{n},\ v_{k}(\mathrm{x})=\sqrt{\frac{2}{n}}\cos\frac{\pi k(2\mathrm{x}-1)}{2n}, (3.8)

for 1kn11\leq k\leq n-1. This is directly checked via the identity

expπik(2x+1)2n+expπik(2x3)2n=2cosπknexpπik(2x1)2n\displaystyle\exp\frac{\pi\mathrm{i}k(2\mathrm{x}+1)}{2n}+\exp\frac{\pi\mathrm{i}k(2\mathrm{x}-3)}{2n}=2\cos\frac{\pi k}{n}\exp\frac{\pi\mathrm{i}k(2\mathrm{x}-1)}{2n} (3.9)

together with similar identities at the boundary. In line with the upper bound (3.3), the largest eigenvalue satisfies ρn1<4\rho_{n-1}<4. See also Figure 3.2 for n=21n=21. The corresponding graph Fourier transform is known as the discrete cosine transform in image processing [4].

Similarly, the graph Fourier transform for a cycle graph is known as the discrete Fourier transform (DFT) in signal processing and image processing [4].
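The closed-form eigenpairs (3.8) can be verified directly against a numerically assembled path-graph Laplacian, e.g. for n = 6:

```python
import numpy as np

n = 6
W = np.zeros((n, n))                        # path graph P_n: w_{x, x+1} = 1
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

x = np.arange(1, n + 1)
for k in range(1, n):
    rho_k = 2 - 2 * np.cos(np.pi * k / n)
    v_k = np.sqrt(2 / n) * np.cos(np.pi * k * (2 * x - 1) / (2 * n))
    assert np.allclose(L @ v_k, rho_k * v_k)   # (3.8) gives exact eigenpairs
    assert np.isclose(v_k @ v_k, 1.0)          # each v_k is normalized
```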

3.3 Graph wave equation

By use of the graph Laplacian, we can define the following graph wave equation [11] for U𝒜V𝒞2(;)U\in\mathcal{A}_{V}\otimes\mathcal{C}^{2}(\mathbb{R};\mathbb{R}):

d2dt2U(x,t)=cU(x,t),U(x,0)=f(x),ddt|t=0U(x,t)=g(x),\displaystyle\frac{d^{2}}{dt^{2}}U(\mathrm{x},t)=-c\mathcal{L}U(\mathrm{x},t),\ U(\mathrm{x},0)=f(\mathrm{x}),\ \frac{d}{dt}\Big{|}_{t=0}U(\mathrm{x},t)=g(\mathrm{x}), (3.10)

for a certain c>0c>0 and given f,g𝒜Vf,g\in\mathcal{A}_{V}. For a solution U(x,t)U(\mathrm{x},t), the graph Fourier coefficient [U(,t)]\mathcal{F}[U(\cdot,t)] becomes the harmonic oscillator

d2dt2[U(,t)]k=cρk[U(,t)]k,[U(,0)]k=[f]k,ddt|t=0[U(,t)]k=[g]k,\displaystyle\frac{d^{2}}{dt^{2}}\mathcal{F}[U(\cdot,t)]_{k}=-c\rho_{k}\mathcal{F}[U(\cdot,t)]_{k},\ \mathcal{F}[U(\cdot,0)]_{k}=\mathcal{F}[f]_{k},\ \frac{d}{dt}\Big{|}_{t=0}\mathcal{F}[U(\cdot,t)]_{k}=\mathcal{F}[g]_{k},

for 0kn10\leq k\leq n-1; hence, we can describe U(x,t)U(\mathrm{x},t) as

U(x,t)=[f]0v0(x)+k=1n1{[f]kcoscρkt+[g]kcρksincρkt}vk(x).\displaystyle U(\mathrm{x},t)=\mathcal{F}[f]_{0}v_{0}(\mathrm{x})+\sum_{k=1}^{n-1}\left\{\mathcal{F}[f]_{k}\cos{\sqrt{c\rho_{k}}t}+\frac{\mathcal{F}[g]_{k}}{\sqrt{c\rho_{k}}}\sin{\sqrt{c\rho_{k}}t}\right\}v_{k}(\mathrm{x}). (3.11)

Therefore, the solution of graph wave equation (3.10) is given by (3.11), and in particular remains bounded, if and only if g1Vg\perp 1_{V}; otherwise the additional drift term [g]0tv0(x)\mathcal{F}[g]_{0}tv_{0}(\mathrm{x}) appears. For simplicity, we call this U(x,t)U(\mathrm{x},t) a graph wave function. In the present paper, we also assume that the Nyquist condition is satisfied:

cρk<π, for any k.\displaystyle\sqrt{c\rho_{k}}<\pi,\text{ for any }k. (3.12)

If necessary, this can always be achieved by re-parameterizing tt with a sampling rate higher than 11, which also makes cc smaller.
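The solution (3.11) can be simulated by a short mode-expansion solver; the path graph, the value √c = 1.5, and the random initial displacement below are illustrative choices, and the second time derivative is checked against −cLU by a central difference in t:

```python
import numpy as np

n = 8
W = np.zeros((n, n))                        # path graph P_n (illustrative choice)
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
rho, V = np.linalg.eigh(L)
rho = np.clip(rho, 0.0, None)               # guard against tiny negative round-off

c = 1.5 ** 2                                # sqrt(c rho_k) < 1.5 * 2 < pi: (3.12) holds
rng = np.random.default_rng(0)
f = rng.standard_normal(n)                  # initial displacement; initial velocity g = 0

def U(t):
    """Graph wave function (3.11) for g = 0: only the cosine terms survive."""
    omega = np.sqrt(c * rho)
    return V @ ((V.T @ f) * np.cos(omega * t))

assert np.allclose(U(0.0), f)               # initial condition U(., 0) = f
# Verify d^2/dt^2 U = -c L U via a central difference in t.
h, t0 = 1e-4, 0.7
Udd = (U(t0 + h) - 2 * U(t0) + U(t0 - h)) / h ** 2
assert np.allclose(Udd, -c * (L @ U(t0)), atol=1e-5)
```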

Here, it is worth noting that the above graph wave equation is naturally regarded as a discretization of the (continuous) wave equation

2t2W(x,t)=ci=1d2xi2W(x,t)\displaystyle\frac{\partial^{2}}{\partial t^{2}}W(x,t)=c\sum_{i=1}^{d}\frac{\partial^{2}}{\partial x_{i}^{2}}W(x,t)

for W𝒞2(d×;)W\in\mathcal{C}^{2}(\mathbb{R}^{d}\times\mathbb{R};\mathbb{R}), which is easily checked for the path graph PnP_{n}. Namely, for the weights (3.7), the graph Laplacian gives the second-order central difference in the spatial domain

f(x)=f(x+1)+f(x1)2f(x)2x2f(x).\displaystyle-\mathcal{L}f(\mathrm{x})=f(\mathrm{x}+1)+f(\mathrm{x}-1)-2f(\mathrm{x})\sim\frac{\partial^{2}}{\partial x^{2}}f(x).
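This observation is immediate to verify numerically at the interior nodes of a path graph:

```python
import numpy as np

n = 10
W = np.zeros((n, n))                        # path graph P_n
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

f = np.sin(0.3 * np.arange(n))              # an arbitrary function on the nodes
d2 = f[2:] + f[:-2] - 2 * f[1:-1]           # second-order central difference
assert np.allclose((-L @ f)[1:-1], d2)      # -L f equals it at the interior nodes
```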

To examine the boundary conditions, we compare the continuous and graph wave functions directly. The wave equation over [0,n][0,n] with Neumann boundary conditions is written as

2t2W(x,t)=c2x2W(x,t),x|x=0W(x,t)=0,x|x=nW(x,t)=0,\displaystyle\frac{\partial^{2}}{\partial t^{2}}W(x,t)=c\frac{\partial^{2}}{\partial x^{2}}W(x,t),\ \frac{\partial}{\partial x}\Big{|}_{x=0}W(x,t)=0,\ \frac{\partial}{\partial x}\Big{|}_{x=n}W(x,t)=0,

and its solution is explicitly described with some {ak,bk}\{a_{k},b_{k}\} by

W(x,t)=a0+k=1{akcoscπktn+bksincπktn}cosπkxn.\displaystyle W(x,t)=a_{0}+\sum_{k=1}^{\infty}\left\{a_{k}\cos\frac{\sqrt{c}\pi kt}{n}+b_{k}\sin\frac{\sqrt{c}\pi kt}{n}\right\}\cos\frac{\pi kx}{n}. (3.13)
Refer to caption
Refer to caption
Figure 3.2: For n=21n=21, eigenfunctions and eigenvalues are plotted. [left] Correspondence among five graph eigenfunctions (red dots) and the five continuous eigenfunctions (blue curves). [right] Correspondence among the graph eigenvalues (red dots) and the square of the continuous frequencies (blue curves).

This is viewed as a continuous analogue of the solution (3.11) for the path graph PnP_{n}, because the eigenfunctions {cosπk(2x1)2n}\{\cos\frac{\pi k(2\mathrm{x}-1)}{2n}\} in (3.8) are obtained from {cosπkxn}\{\cos\frac{\pi kx}{n}\} by discretization, and the eigenvalues {ρk}\{\rho_{k}\} in (3.8) are computed by

ρk=22cosπkn=2sinπk2nπkn\displaystyle\sqrt{\rho_{k}}=\sqrt{2-2\cos\frac{\pi k}{n}}=2\sin\frac{\pi k}{2n}\sim\frac{\pi k}{n}

for knk\ll n. These correspondences are illustrated in Figure 3.2.
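The half-angle identity behind this computation, and the quality of the approximation for small k, can be checked as follows:

```python
import numpy as np

n = 21
k = np.arange(1, 6)
sqrt_rho = np.sqrt(2 - 2 * np.cos(np.pi * k / n))
# Half-angle identity: sqrt(2 - 2 cos(pi k / n)) = 2 sin(pi k / (2n)), exactly.
assert np.allclose(sqrt_rho, 2 * np.sin(np.pi * k / (2 * n)))
# For k << n this is close to the continuous frequency pi k / n.
assert np.allclose(sqrt_rho, np.pi * k / n, rtol=0.05)
```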

Remark 3.14.

Even when a continuous wave function W(x,t)W(x,t) in (3.13) is evaluated at discrete points, such as x=1/2,3/2,,(2n1)/2x=1/2,3/2,\cdots,(2n-1)/2, the sampled values generally do not satisfy the graph wave equation (3.10) over a path graph. We revisit this point in §4.3.

3.4 Graph recovery from graph wave function

The target in the present paper is to give a procedure to recover the graph weights {wxy}\{w_{\mathrm{x}\mathrm{y}}\} from finite observed values at integer points {U(,t)𝒜Vt=1,2,,T}\{U(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,T\}. To state our main theorem, we simplify the representation of a graph wave function UU in (3.11) as

U(x,t)=β0v0(x)+k=1n1{βkzkt+β¯kz¯kt}vk(x)\displaystyle U(\mathrm{x},t)=\beta_{0}v_{0}(\mathrm{x})+\sum_{k=1}^{n-1}\left\{\beta_{k}z_{k}^{t}+\bar{\beta}_{k}\bar{z}_{k}^{t}\right\}v_{k}(\mathrm{x}) (3.15)

where β0\beta_{0}\in\mathbb{R}, βk\beta_{k}\in\mathbb{C}, and zk=eicρkz_{k}=e^{\mathrm{i}\sqrt{c\rho_{k}}}.

Theorem 3.16.

(i) Assume that the modes {zk}\{z_{k}\} in (3.15) are distinct and unknown. If βk0\beta_{k}\neq 0 for all kk, then the modes {zk}\{z_{k}\}, the eigenfunctions {vk}\{v_{k}\} and the amplitudes {βk}\{\beta_{k}\} are determined from the observed values at 2n2n integer points {U(,t)𝒜Vt=1,2,,2n}\{U(\cdot,t)\in\mathcal{A}_{V}\mid t=1,2,\cdots,2n\}.

(ii) The graph Laplacian \mathcal{L} and the graph weights {wxy}\{w_{\mathrm{x}\mathrm{y}}\} are recovered from the given modes {zk}\{z_{k}\} and the given eigenfunctions {vk}\{v_{k}\} up to constant cc.

Proof.

The modes and amplitudes are determined by Theorem 2.18, because (3.15) is a special case of the time-series model (2.8) for N=n1N=n-1 and αk=βkvk\alpha_{k}=\beta_{k}v_{k}, where non-degenerate condition (2.9) is satisfied. Then, from amplitudes {βkvk}\{\beta_{k}v_{k}\}, the eigenfunctions {vk}\{v_{k}\} are determined by vk=1\|v_{k}\|=1, as asserted in (i).

For (ii), note that Nyquist condition (3.12) distinguishes zkz_{k} and z¯k\bar{z}_{k} by the signs of their imaginary parts. Then, cρk=argzk\sqrt{c\rho_{k}}=\arg z_{k} is defined uniquely. Hence, the assertion follows from the identity (3.5). ∎

In theory, the underlying graph is determined by Theorem 3.16. In numerical computation, on the other hand, it is often difficult to compute the underlying graph exactly because of numerical error, even though we do not assume any observation noise. Therefore, in the following section, we organize all of the above procedures into an algorithm.

4 Numerical experiments

In this section, we summarize the previous arguments into an algorithm, and then evaluate its performance on numerical datasets. In §4.1, we provide a six-step algorithm originating from Theorems 2.18 and 3.16. Then, in §4.2, §4.3, and §4.4, we apply the algorithm to graph wave signals, continuous wave signals, and human joint sensor time-series data, respectively.

4.1 Graph recovery algorithm

For given multivariate wave signals, the number nn of variables is obviously known. Then, N=n1N=n-1, L=1L=1, and T=2nT=2n in Proposition 2.13 and (2.16) should be optimal parameters to recover the underlying graph, as in the proof of Theorem 3.16. In addition, each calculated amplitude αk𝒜V\alpha_{k}\in\mathcal{A}_{V}\otimes\mathbb{C} is supposed to be decomposed into βk\beta_{k}\in\mathbb{C} and vk𝒜Vv_{k}\in\mathcal{A}_{V} as αk=βkvk\alpha_{k}=\beta_{k}v_{k}. However, in practice, numerical error can cause unexpected results despite Theorem 3.16. Therefore, we propose the following algorithm.

Algorithm 4.1.

Execute the following procedure:

Step 1

Set N,L>0N,L\in\mathbb{Z}_{>0} such that nLNnL\geq N, and take n(L+2N+1)n(L+2N+1) real values {Fx,t1xn,t=1,2,,L+2N+1}\{F_{\mathrm{x},t}\in\mathbb{R}\mid 1\leq\mathrm{x}\leq n,t=1,2,\cdots,L+2N+1\}.

Step 2

Define G(x,t;j)G(\mathrm{x},t;j) as in Lemma 2.10, and get {d^j1jN}\{\hat{d}_{j}\in\mathbb{R}\mid 1\leq j\leq N\} by solving linear equation (2.14).

Step 3

Obtain {z^j1j2N}\{\hat{z}_{j}\in\mathbb{C}\mid 1\leq j\leq 2N\} by solving the polynomial equation

\displaystyle Z^{2N}-\hat{d}_{1}Z^{2N-1}-\cdots-\hat{d}_{N-1}Z^{N+1}-\hat{d}_{N}Z^{N}-\hat{d}_{N-1}Z^{N-1}-\cdots-\hat{d}_{1}Z+1=0 (4.2)

with the above {d^j}\{\hat{d}_{j}\}, and compute θ^j:=argz^j\hat{\theta}_{j}:=\arg\hat{z}_{j} so that π<θ^j<π-\pi<\hat{\theta}_{j}<\pi for 1j2N1\leq j\leq 2N. Moreover, set z^0:=1\hat{z}_{0}:=1 and θ^0:=0\hat{\theta}_{0}:=0.

Step 4

Get {α^j𝒜V0j2N}\{\hat{\alpha}_{j}\in\mathcal{A}_{V}\otimes\mathbb{C}\mid 0\leq j\leq 2N\} by solving linear equation (2.15) for T=L+2N+1T=L+2N+1 with the above {z^j}\{\hat{z}_{j}\}.

Step 5

If the reconstruction F^(x,t):=j=02Nα^j(x)z^jt\hat{F}(\mathrm{x},t):=\sum_{j=0}^{2N}\hat{\alpha}_{j}(\mathrm{x})\hat{z}_{j}^{t} deviates substantially from a validation dataset, then go back to Step 1 and retake larger NN and LL.

Step 6

If α^j\|\hat{\alpha}_{j}\| is tiny, then set v^j:=0V\hat{v}_{j}:=0_{V}, otherwise v^j:=α^j/α^j\hat{v}_{j}:=\hat{\alpha}_{j}/\|\hat{\alpha}_{j}\|. Then, define

w^xy:=12j=12NRe(v^j(x)θ^j2v^¯j(y))\displaystyle\hat{w}_{\mathrm{x}\mathrm{y}}:=-\frac{1}{2}\sum_{j=1}^{2N}\mathop{\mathrm{Re}}\nolimits\left(\hat{v}_{j}(\mathrm{x})\hat{\theta}_{j}^{2}\bar{\hat{v}}_{j}(\mathrm{y})\right) (4.3)

for any xyV\mathrm{x}\neq\mathrm{y}\in V from the above {θ^j}\{\hat{\theta}_{j}\}. Optionally, we can retake this weight as max(w^xy,0)\max(\hat{w}_{\mathrm{x}\mathrm{y}},0) instead if the graph weights are assumed to be non-negative.
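The six steps can be sketched end-to-end in numpy. Since the objects G(x,t;j) and equations (2.14)–(2.16) are defined in §2 and not reproduced here, the sketch below substitutes a plain least-squares Prony fit for Steps 2 and 3 (fit a linear recurrence of order 2N+1, then take the roots of its characteristic polynomial); Steps 4 and 6 follow the text, and the validation of Step 5 is replaced by the final assertion. The data are a noiseless graph wave on the path graph P_5 whose graph Fourier coefficients are all set to 1, so every mode is excited:

```python
import numpy as np

# Ground truth: path graph P_5 and a noiseless graph wave sampled at t = 1, ..., T.
n, T, c = 5, 40, 1.5 ** 2
W_true = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L_true = np.diag(W_true.sum(axis=1)) - W_true
rho, V = np.linalg.eigh(L_true)
rho = np.clip(rho, 0.0, None)                      # guard tiny negative round-off
t = np.arange(1, T + 1)
# Graph wave (3.11) with g = 0 and all graph Fourier coefficients of f equal to 1.
F = V @ np.cos(np.sqrt(c * rho)[:, None] * t)      # shape (n, T)

# Steps 2-3 (substituted): least-squares fit of a linear recurrence of order
# M = 2(n-1)+1 over all nodes, then roots of its characteristic polynomial.
M = 2 * (n - 1) + 1
rows = np.vstack([[F[x, s - M:s][::-1] for s in range(M, T)] for x in range(n)])
rhs = np.hstack([F[x, M:] for x in range(n)])
d = np.linalg.lstsq(rows, rhs, rcond=None)[0]
z = np.roots(np.concatenate(([1.0], -d)))          # modes: conjugate pairs plus z = 1

# Step 4: amplitudes by least squares against the Vandermonde matrix z_j^t.
Z = z[None, :] ** t[:, None]                       # shape (T, M)
alpha = np.linalg.lstsq(Z, F.T.astype(complex), rcond=None)[0].T   # shape (n, M)

# Step 6: normalize each amplitude and assemble the weights (4.3), rescaled by 1/c.
theta = np.angle(z)
W_hat = np.zeros((n, n))
for j in range(M):
    if abs(theta[j]) < 1e-6:                       # skip the constant mode z = 1
        continue
    v = alpha[:, j] / np.linalg.norm(alpha[:, j])
    W_hat -= 0.5 * theta[j] ** 2 * np.real(np.outer(v, v.conj())) / c
np.fill_diagonal(W_hat, 0.0)

assert np.max(np.abs(W_hat - W_true)) < 1e-4       # graph recovered up to constant c
```

On this noiseless example the recovered weights agree with the path-graph weights to high accuracy, matching the behavior reported in §4.2.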

Here, we present some remarks on this algorithm. For the graph recovery in Theorem 3.16, NN is supposed to be larger than n1n-1, as in the examples of §4.2 and §4.3. On the other hand, for a practical application, such as §4.4, a smaller NN may work better to extract a significant graph. The condition nLNnL\geq N means that linear equation (2.14) is an overdetermined system. Hence, {d^j1jN}\{\hat{d}_{j}\in\mathbb{R}\mid 1\leq j\leq N\} are computed by the least squares method or the pseudo-inverse matrix of the rectangular Vandermonde matrix. Then, the modes {z^j}\{\hat{z}_{j}\} are calculated as solutions of polynomial equation (4.2). For example, this part is practically computed via the eigenvalues of the companion matrix of the coefficients {d^j}\{\hat{d}_{j}\}. As an expression of graph wave function (3.11), θ^j2=(argz^j)2\hat{\theta}_{j}^{2}=(\arg\hat{z}_{j})^{2} corresponds to the eigenvalue ρ^j\hat{\rho}_{j}. If constant cc in graph wave equation (3.10) is known, then θ^\hat{\theta} can be rescaled accordingly. Since the absolute value of z^j\hat{z}_{j} is not always 11, as explained in Remark 2.12, z^j\hat{z}_{j} is sometimes retaken as z^j:=z^j/|z^j|\hat{z}^{\prime}_{j}:=\hat{z}_{j}/|\hat{z}_{j}|. In either case, for each z^j\hat{z}_{j}, we have z^l=z^¯j\hat{z}_{l}=\bar{\hat{z}}_{j} for a certain ll. Then linear equation (2.15) becomes (2.16). Hence, by Lemma 2.17, the given values {Fx,t}\{F_{\mathrm{x},t}\} reconstruct a function F^\hat{F} as

F^(x,t):=j=02Nα^j(x)z^jt=α^0(x)+12j=12N{α^j(x)zjt+α^¯j(x)z¯jt}\displaystyle\hat{F}(\mathrm{x},t):=\sum_{j=0}^{2N}\hat{\alpha}_{j}(\mathrm{x})\hat{z}_{j}^{t}=\hat{\alpha}_{0}(\mathrm{x})+\frac{1}{2}\sum_{j=1}^{2N}\{\hat{\alpha}_{j}(\mathrm{x})z_{j}^{t}+\bar{\hat{\alpha}}_{j}(\mathrm{x})\bar{z}_{j}^{t}\} (4.4)

for any tt\in\mathbb{Z}. In an ideal case, α^j\hat{\alpha}_{j} is decomposed as β^jv^j\hat{\beta}_{j}\hat{v}_{j} by β^j\hat{\beta}_{j}\in\mathbb{C}, and then the weights are estimated from identity (3.5). Instead, we set α^j=:α^jv^j\hat{\alpha}_{j}=:\|\hat{\alpha}_{j}\|\hat{v}_{j} so that v^j=1\|\hat{v}_{j}\|=1. Since v^j\hat{v}_{j} is in 𝒜V\mathcal{A}_{V}\otimes\mathbb{C} in general, we replace vj(x)ρjvj(y)v_{j}(\mathrm{x})\rho_{j}{v}_{j}(\mathrm{y}) by Rev^j(x)ρ^jv^¯j(y)\mathop{\mathrm{Re}}\nolimits\hat{v}_{j}(\mathrm{x})\hat{\rho}_{j}\bar{\hat{v}}_{j}(\mathrm{y}) as in (4.3), in order to eliminate the ambiguity of a unit complex factor on v^j\hat{v}_{j}. Basically, this Rev^j(x)ρ^jv^¯j(y)\mathop{\mathrm{Re}}\nolimits\hat{v}_{j}(\mathrm{x})\hat{\rho}_{j}\bar{\hat{v}}_{j}(\mathrm{y}) depends only on α^j(x)\hat{\alpha}_{j}(\mathrm{x}) and α^j(y)\hat{\alpha}_{j}(\mathrm{y}) up to the constant α^j\|\hat{\alpha}_{j}\|; thus, partially observed values along VV are supposed to give a subgraph. We check this in §4.2.
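As noted above, the roots of polynomial equation (4.2) are practically computed as the eigenvalues of a companion matrix (this is also what `numpy.roots` does internally). A minimal illustration with arbitrary palindromic coefficients, which also exhibits the z ↔ 1/z symmetry of (4.2):

```python
import numpy as np

# Palindromic polynomial (4.2) for N = 2 with illustrative coefficients d1, d2.
d1, d2 = 0.5, 0.3
coeffs = np.array([1.0, -d1, -d2, -d1, 1.0])      # Z^4 - d1 Z^3 - d2 Z^2 - d1 Z + 1

# Companion matrix of a monic polynomial: first row holds the negated coefficients.
m = len(coeffs) - 1
C = np.zeros((m, m))
C[0, :] = -coeffs[1:]
C[1:, :-1] = np.eye(m - 1)
z = np.linalg.eigvals(C)                           # roots as eigenvalues

assert np.allclose(np.sort_complex(z), np.sort_complex(np.roots(coeffs)))
# Palindromic symmetry: if z is a root, so is 1/z.
assert np.allclose(np.sort_complex(z), np.sort_complex(1 / z))
```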

Furthermore, we consider the meaning of the weight (4.3).

Refer to caption
Refer to caption
Figure 4.1: [left] Sample multivariate function given in (4.5). [right] Estimated weights for the sample.

Since θ^0=0\hat{\theta}_{0}=0, the constant term α^0\hat{\alpha}_{0} in (4.4) does not affect the weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\}. The other term α^j(x)\hat{\alpha}_{j}(\mathrm{x}) indicates the existence of the mode zjz_{j} at node x\mathrm{x}. Hence, when FxF_{\mathrm{x}} and FyF_{\mathrm{y}} have no common mode, they have no connection, i.e., w^xy=0\hat{w}_{\mathrm{x}\mathrm{y}}=0. Moreover, when they are synchronized, w^xy\hat{w}_{\mathrm{x}\mathrm{y}} becomes negative, and when they are anti-synchronized, w^xy\hat{w}_{\mathrm{x}\mathrm{y}} becomes positive. To examine this property, let us consider the following example:

F(x,t)\displaystyle F(\mathrm{x},t) =52x+cos(π3x+π5t)\displaystyle=-\frac{5}{2}\mathrm{x}+\cos\left(\frac{\pi}{3}\mathrm{x}+\frac{\pi}{5}t\right)
=52x+12exp(iπ3x)exp(iπ5t)+12exp(iπ3x)exp(iπ5t)\displaystyle=-\frac{5}{2}\mathrm{x}+\frac{1}{2}\exp\left(\mathrm{i}\frac{\pi}{3}\mathrm{x}\right)\exp\left(\mathrm{i}\frac{\pi}{5}t\right)+\frac{1}{2}\exp\left(-\mathrm{i}\frac{\pi}{3}\mathrm{x}\right)\exp\left(-\mathrm{i}\frac{\pi}{5}t\right) (4.5)

for 1x71\leq\mathrm{x}\leq 7. We have θ^1=π/5\hat{\theta}_{1}=\pi/5 and v^1(x)=exp(iπx/3)/7\hat{v}_{1}(\mathrm{x})=\exp(\mathrm{i}\pi\mathrm{x}/3)/\sqrt{7}, and then the positivized weights {max(w^xy,0)}\{\max(\hat{w}_{\mathrm{x}\mathrm{y}},0)\} are calculated as in Figure 4.1. There, the maximum value of the weights is (π/5)2/70.0564(\pi/5)^{2}/7\fallingdotseq 0.0564.
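This example can be reproduced in a few lines; the single mode and its conjugate partner are inserted by hand into the weight formula (4.3):

```python
import numpy as np

# Single-mode example (4.5): theta_1 = pi/5 and v_1(x) = exp(i pi x / 3) / sqrt(7).
x = np.arange(1, 8)
theta = np.pi / 5
v = np.exp(1j * np.pi * x / 3) / np.sqrt(7)

w = np.zeros((7, 7))
for vj in (v, v.conj()):                    # the mode and its conjugate partner
    w -= 0.5 * theta ** 2 * np.real(np.outer(vj, vj.conj()))
np.fill_diagonal(w, 0.0)
w_plus = np.maximum(w, 0.0)                 # positivized weights max(w, 0)

# Anti-synchronized pairs (|x - y| = 3) attain the maximum (pi/5)^2 / 7 = 0.0564...
assert np.isclose(w_plus.max(), (np.pi / 5) ** 2 / 7)
assert np.isclose(w_plus[0, 3], (np.pi / 5) ** 2 / 7)
```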

4.2 Graph wave over an interval

We hereinafter demonstrate Algorithm 4.1 in order to verify its effectiveness. The first example is a graph wave function on an interval, as explained in §3.3. For n=21n=21 and V={1x21}V=\{1\leq\mathrm{x}\leq 21\}, let us take an initial function ff in the graph wave equation (3.10) over the path graph P21P_{21} as

f~(x)=x115+(small random value), for xV\displaystyle\tilde{f}(\mathrm{x})=\frac{\mathrm{x}-11}{5}+\text{(small random value)},\text{ for }\mathrm{x}\in V (4.6)

and g~=0V\tilde{g}=0_{V}.

Refer to caption
Refer to caption
Figure 4.2: [left] Initial function given in (4.6) for the graph wave equation. [right] Corresponding graph Fourier transform.

Note that {[f~]k}\{\mathcal{F}[\tilde{f}]_{k}\} satisfy the assumption in Theorem 3.16, as in Figure 4.2. Moreover, we take c=1.5\sqrt{c}=1.5 so that even the largest eigenvalue satisfies Nyquist condition (3.12):

cρk<1.5×2<π,\displaystyle\sqrt{c\rho_{k}}<1.5\times 2<\pi,

where upper bound (3.3) is used. Then, by applying the expression (3.11) to these conditions, we can see that the wave behaves like water sloshing in a bottle, as in Figure 4.3. Although the function is continuous in the time variable tt, as on the left-hand side, only integer time points, shown as the heat map on the right-hand side, are used in the algorithm.

Refer to caption Refer to caption
Figure 4.3: [left] Graph wave function given from initial conditions (4.6) and g~=0V\tilde{g}=0_{V} over the path graph P21P_{21}. [right] Heatmap representation by discretization.

To this dataset, we apply Algorithm 4.1.

Refer to caption
Refer to caption
Figure 4.4: [left] Forty-one true modes (red) and 53 estimated modes (blue). [right] Recovered graph structure (10x), which is supposed to be the path graph P21P_{21} in (3.7).

First, let us take L=3L=3 and N=26N=26 in Step 1. Then, Steps 2 and 3 yield 5353 modes {z^j}\{\hat{z}_{j}\} from 5656 observation points along the time axis, as on the left-hand side in Figure 4.4. Note that although 5353 is larger than 4141 (the number of actual modes), such redundancy gives better results in numerical computation. Then, we obtain the corresponding amplitudes {α^j}\{\hat{\alpha}_{j}\} and weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\} from Steps 4 and 6. Since cc is given as above, θ^j=argz^j/c\hat{\theta}_{j}=\arg\hat{z}_{j}/\sqrt{c} can be used in Step 6 to calculate the weights instead of θ^j\hat{\theta}_{j}. On the right-hand side in Figure 4.4, we multiply the recovered weights by 1010 for visualization; the result is quite close to the original weights of the path graph (3.7). Finally, we check the mean squared error at each tt, as in Figure 4.5, to verify Step 5.

Refer to caption
Figure 4.5: Mean squared error between exact data and reconstructed data.

Since the error is tiny, even for test data, we conclude that the proposed graph recovery algorithm works well for this exact graph wave setting.

By modifying this dataset, let us consider another case in which only partially observed values with respect to the spatial variables are available.

Refer to caption
Figure 4.6: Six observation points along the xx-axis are missing from Figure 4.3. The expected graph on 15 black points is shown on the left-hand side.

For V={1x21}V=\{1\leq\mathrm{x}\leq 21\}, assume the values on x=3,7,10,12,16\mathrm{x}=3,7,10,12,16, and 1717 are missing. Then, the recovered graph should be the subgraph of the path graph shown in Figure 4.6.

To this dataset, we apply the same procedure as above, namely, take L=3L=3 and N=26N=26, and then obtain 5353 modes {z^j}\{\hat{z}_{j}\}. As the result on the left-hand side of Figure 4.7 shows, the extracted modes are basically the same as in the previous case, i.e., Figure 4.4. Then, the corresponding amplitudes {α^j}\{\hat{\alpha}_{j}\} and weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\} are also estimated, as on the right-hand side in Figure 4.7, which is close to what we expected in Figure 4.6.

Refer to caption
Refer to caption
Figure 4.7: [left] Forty-one true modes (red) and 53 estimated modes (blue). [right] Recovered graph structure (10x), which is the same as the subgraph of the path graph in Figure 4.6 up to a constant multiple.

These examples show that Algorithm 4.1 can correctly recover the underlying graph and subgraph from an observed graph wave function as Theorem 3.16 guarantees. In the following subsections, we consider other cases in which multivariate signals are not explicitly assumed to have a graph structure under variables.

4.3 Continuous wave over an interval

Unlike in §4.2, here we consider a continuous wave function on the interval, as explained in (3.13). Then, we apply Algorithm 4.1 to its observed values in the expectation that the algorithm extracts a graph in the shape of an interval, even though the observed values do not satisfy the graph wave equation over a path graph.

Refer to caption Refer to caption
Figure 4.8: [left] From f~\tilde{f} (red), we define an initial function W~(x,0)\widetilde{W}(x,0) (blue) by taking the leading 1111 terms of the graph Fourier series. [right] Continuous wave function given from the initial functions.

Take n=21n=21 in (3.13) as above. For an initial function W(x,0)W(x,0), let us use a smoothed version of f~\tilde{f} in (4.6). By using the identities

\displaystyle a_{0}=\frac{1}{n}\int_{0}^{n}W(x,0)dx,\ a_{k}=\frac{2}{n}\int_{0}^{n}W(x,0)\cos\frac{\pi kx}{n}dx\ \text{ for }k>0,

we define

a~0=121xVf~(x),a~k=221xVf~(x)cosπk(2x1)42 for k>0.\displaystyle\tilde{a}_{0}=\frac{1}{21}\sum_{\mathrm{x}\in V}\tilde{f}(\mathrm{x}),\ \tilde{a}_{k}=\frac{2}{21}\sum_{\mathrm{x}\in V}\tilde{f}(\mathrm{x})\cos\frac{\pi k(2\mathrm{x}-1)}{42}\ \text{ for }k>0.

In fact, these coefficients are nothing but the graph Fourier coefficients of f~\tilde{f}, namely, 21/2a~k=[f~]k\sqrt{21/2}\tilde{a}_{k}=\mathcal{F}[\tilde{f}]_{k} for 1kn11\leq k\leq n-1. Then, by taking the first 1111 coefficients and assuming t|t=0W~(x,t)=0\frac{\partial}{\partial t}|_{t=0}\widetilde{W}(x,t)=0, i.e., bk=0b_{k}=0 for any kk, we obtain a continuous wave function

W~(x,t)=a~0+k=110a~kcoscπkt21cosπkx21\displaystyle\widetilde{W}(x,t)=\tilde{a}_{0}+\sum_{k=1}^{10}\tilde{a}_{k}\cos\frac{\sqrt{c}\pi kt}{21}\cos\frac{\pi kx}{21} (4.7)

for c=1.5\sqrt{c}=1.5, which is shown in Figure 4.8. Here, we emphasize that this wave function is continuously defined for x[0,21]x\in[0,21] and tt\in\mathbb{R}, and thus there exist various ways to discretize it by evaluation. Moreover, the number 1111 is taken so that each frequency cπk/21\sqrt{c}\pi k/21 satisfies Nyquist condition (3.12), although this choice also serves another purpose, as explained below.

First of all, we evaluate the above W~(x,t)\widetilde{W}(x,t) at 2121 points along the xx-axis, x=x0.5x=\mathrm{x}-0.5 for x=1,2,,21\mathrm{x}=1,2,\cdots,21, and integer points along the time axis.

Refer to caption
Refer to caption
Figure 4.9: [left] Twenty-one true modes (red) and 25 estimated modes (blue). [right] Estimated graph structure (10x) over the 21 observation points.

For these observed values, we take L=1L=1 and N=12N=12 in Step 1. Then, we obtain 2525 modes {z^j}\{\hat{z}_{j}\} from 2626 observation points from Steps 2 and 3, as on the left-hand side in Figure 4.9, and the corresponding amplitudes {α^j}\{\hat{\alpha}_{j}\} and weights {w^xy}\{\hat{w}_{\mathrm{x}\mathrm{y}}\} from Steps 4 and 6, as on the right-hand side.

Refer to caption
Figure 4.10: Upper estimated graph having the lower path graph as a subgraph.

The estimated weights indicate the path-like graph at the top of Figure 4.10, which includes the path graph below. Thus, the same procedure as for the partially observed dataset in Figure 4.6 can be applied to obtain the path graph.

Refer to caption
Refer to caption
Figure 4.11: [left] Eleven observation points reduced from the 21 points. [right] Estimated graph structure (10x) over the 11 observation points.

Namely, we try another discretization x=2x1.5x=2\mathrm{x}-1.5 for x=1,2,,11\mathrm{x}=1,2,\cdots,11, given on the left-hand side in Figure 4.11. Then, by using the same parameters, we successfully obtain weights close to those of the path graph, as on the right-hand side in Figure 4.11, which is expected from the bottom path graph in Figure 4.10.

This phenomenon can be explained in another way. From 21×2621\times 26 observed values {W~(x,t)x=0.5,1.5,,20.5,t=1,2,,26}\{\widetilde{W}(x,t)\in\mathbb{R}\mid x=0.5,1.5,\cdots,20.5,t=1,2,\cdots,26\}, a (21,12)(21,12) matrix formed by {G(x,t;j)}\{G(\mathrm{x},t;j)\} is constructed as in linear equation (2.14) in Step 2.

Refer to caption
Figure 4.12: Twelve singular values of the (21,12)(21,12) matrix.

Here, we can see that the (21,12)(21,12) matrix has rank 1010, which means that its singular values consist of 1010 major values and 22 minor values, as in Figure 4.12. This is consistent with the definition of W~(x,t)\widetilde{W}(x,t), where we take the 1010 oscillatory terms in the summation of (4.7) other than the constant term. Hence, it is sufficient to use more than 1010 observation points, such as the 11×2611\times 26 observed values {W~(x,t)x=0.5,2.5,,20.5,t=1,2,,26}\{\widetilde{W}(x,t)\in\mathbb{R}\mid x=0.5,2.5,\cdots,20.5,t=1,2,\cdots,26\} as above, in order to extract an outline of the underlying graph.
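Since G(x,t;j) is defined in §2 and not reproduced here, a rough proxy for this rank argument is the raw snapshot matrix of the sampled wave, whose numerical rank is 11 (the 10 oscillatory terms plus the constant); the coefficients below are illustrative nonzero stand-ins for the ã_k:

```python
import numpy as np

n, c = 21, 1.5 ** 2
rng = np.random.default_rng(2)
a = 0.5 + rng.random(11)                    # stand-ins for a_0, ..., a_10, all nonzero
x = np.arange(1, n + 1) - 0.5               # spatial samples x - 0.5
t = np.arange(1, 27)                        # 26 time samples

# Snapshot matrix of the truncated wave (4.7): constant term plus 10 oscillations.
S = a[0] + sum(a[k] * np.outer(np.cos(np.pi * k * x / n),
                               np.cos(np.sqrt(c) * np.pi * k * t / n))
               for k in range(1, 11))

assert S.shape == (21, 26)
assert np.linalg.matrix_rank(S) == 11       # 10 oscillatory modes + the constant
```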

4.4 Pose model

As a more practical case, we consider a graph estimation problem for human joint tracking data. To this end, we use the open-source dataset PKU-MMD, which consists of RGB, skeleton, depth, and IR sequential data for 51 action categories [7]. The skeleton data used herein were obtained by Kinect v2, whose joints are described in Figure 4.13 [1]. More precisely, we use the 2,100th to 2,800th frames from the "0007-M" sample in the dataset.

Refer to caption Refer to caption
Figure 4.13: [left] Meanings of 25 joints. [right] Corresponding joint numbers.

Although the skeleton data provide the three-dimensional coordinates xx, yy, and zz of each joint location in space, we only use the xx and yy coordinates in this experiment. Then, we try to extract relations among 2525 moving x+iyx+\mathrm{i}y joint locations as graphs, which are supposed to characterize individual actions.

Refer to caption
Figure 4.14: [top] Time-series data of the 25 xx-coordinates of human joints. [middle] Time-series data of the 25 yy-coordinates of human joints. [bottom] Four actions starting from t=65,265,425t=65,265,425, and 585585 with the 25 x+iyx+\mathrm{i}y-coordinates of human joints.

Let Fx(x,t)F_{x}(\mathrm{x},t) and Fy(x,t)F_{y}(\mathrm{x},t) denote functions of the xx and yy locations in an RGB image, respectively, for the joint nodes 1x251\leq\mathrm{x}\leq 25 and the times t=0,1,,600t=0,1,\cdots,600. First, regarding the data as n=2×25=50n=2\times 25=50 time series, we apply Steps 1 to 5 of Algorithm 4.1 with L=4L=4 and N=5N=5 to a sequence of length 15 (=L+2N+1) starting from the t1t_{1}-th frame. Then, we obtain mode decompositions

F^x(x,t;t1)=α^x,0,t1(x)+j=110α^x,j,t1(x)z^jt,F^y(x,t;t1)=α^y,0,t1(x)+j=110α^y,j,t1(x)z^jt\displaystyle\hat{F}_{x}(\mathrm{x},t;t_{1})=\hat{\alpha}_{x,0,t_{1}}(\mathrm{x})+\sum_{j=1}^{10}\hat{\alpha}_{x,j,t_{1}}(\mathrm{x})\hat{z}_{j}^{t},\ \ \hat{F}_{y}(\mathrm{x},t;t_{1})=\hat{\alpha}_{y,0,t_{1}}(\mathrm{x})+\sum_{j=1}^{10}\hat{\alpha}_{y,j,t_{1}}(\mathrm{x})\hat{z}_{j}^{t} (4.8)

for any node x\mathrm{x} so as to coincide with Fx(x,t)F_{x}(\mathrm{x},t) and Fy(x,t)F_{y}(\mathrm{x},t) on the sequence t1t<t1+15t_{1}\leq t<t_{1}+15. See also Lemma 2.17. Here, we can see that F^x(x,t;t1)+iF^y(x,t;t1)\hat{F}_{x}(\mathrm{x},t;t_{1})+\mathrm{i}\hat{F}_{y}(\mathrm{x},t;t_{1}) represents the time evolution of the x+iyx+\mathrm{i}y location in an RGB image. Finally, we calculate (4.3) for α^j,t1=α^x,j,t1+iα^y,j,t1\hat{\alpha}_{j,t_{1}}=\hat{\alpha}_{x,j,t_{1}}+\mathrm{i}\hat{\alpha}_{y,j,t_{1}} in order to estimate weights among joint locations at t=t1t=t_{1}. Note that although Fx(x,t)F_{x}(\mathrm{x},t) and Fy(x,t)F_{y}(\mathrm{x},t) no longer satisfy the graph wave equation for the estimated graph, we suppose that the graph weights extract meaningful features from the decomposition (4.8).

For example, we pick out four sequences representing "clapping" from the 2,165th to 2,180th frames (t=65t=65), "cross hands in front" from the 2,365th to 2,380th frames (t=265t=265), "hand waving" from the 2,525th to 2,540th frames (t=425t=425), and "jump up" from the 2,685th to 2,700th frames (t=585t=585). The corresponding data are shown in Figure 4.14. Then, we draw the resulting graphs in Figure 4.15, where each edge indicates that the estimated weight is larger than 0.20.2.

Refer to caption
Figure 4.15: Graphs of the results. From left to right, "clapping", "cross hands in front", "hand waving", and "jump up".

As we can see, the edges in each case connect the nodes and pairs that characterize the action. In particular, since we pay attention only to positive weights, Algorithm 4.1 ignores synchronized pairs and extracts anti-synchronized pairs. Hence, even in the "jump up" case, only the leg motion is highlighted against all upward-moving nodes. Judging from these cases, we expect that the proposed graph recovery algorithm works well even for non-wave signals.

5 Conclusion

In the present paper, we introduced a graph recovery method from a graph wave function based on a modified DMD algorithm. Then, we showed its effectiveness by using three examples, i.e., a graph wave function, a continuous wave function, and human joint tracking data, ranging from theoretical demonstration to practical application. In the present study, we did not assume the existence of noise in the time-series model. Hence, further study is needed in order to ensure that the proposed algorithm is robust against noise, such as white noise, with the help of known techniques in signal processing, time-series analysis, and statistics.

Acknowledgements

The author would like to thank Yosuke Otsubo for discussing and suggesting DMD technology for further study. The author is also grateful to Satoshi Takahashi, Tetsuya Koike, Chikara Nakamura, Bausan Yuan, Ping-Wei Chang, and Pranav Gundewar for their constant encouragement.

References

  • [1] Kinect for Windows v2 reference, JointType enumeration. https://docs.microsoft.com/en-us/previous-versions/windows/kinect/dn758662(v=ieb.10).
  • [2] Amaia Abanda, Usue Mori, and Jose A Lozano. A review on distance based time series classification. Data Mining and Knowledge Discovery, 33(2):378–412, 2019.
  • [3] Faisal Ahmed, Padma Polash Paul, and Marina L Gavrilova. Dtw-based kernel and rank-level fusion for 3d gait recognition using kinect. The Visual Computer, 31(6):915–924, 2015.
  • [4] Rosa A Asmara and Reza Agustina. Comparison of discrete cosine transforms (dct), discrete fourier transforms (dft), and discrete wavelet transforms (dwt) in digital image watermarking. International Journal of Advanced Computer Science and Applications, 8(2):245–249, 2017.
  • [5] Donald J Berndt and James Clifford. Using dynamic time warping to find patterns in time series. In KDD workshop, volume 10, pages 359–370. Seattle, WA, USA:, 1994.
  • [6] Fan R. K. Chung. Spectral graph theory. Number 92 in CBMS Regional Conference Series in Mathematics. American Mathematical Society, 1997.
  • [7] Liu Chunhui, Hu Yueyu, Li Yanghao, Song Sijie, and Liu Jiaying. Pku-mmd: A large scale benchmark for continuous multi-modal human action understanding. arXiv preprint arXiv:1703.07475, 2017.
  • [8] Sacha Epskamp, Lourens J Waldorp, René Mõttus, and Denny Borsboom. The gaussian graphical model in cross-sectional and time-series data. Multivariate behavioral research, 53(4):453–480, 2018.
  • [9] Christos Faloutsos, Mudumbai Ranganathan, and Yannis Manolopoulos. Fast subsequence matching in time-series databases. ACM Sigmod Record, 23(2):419–429, 1994.
  • [10] George E Forsythe. Generation and use of orthogonal polynomials for data-fitting with a digital computer. Journal of the Society for Industrial and Applied Mathematics, 5(2):74–88, 1957.
  • [11] Joel Friedman and Jean-Pierre Tillich. Wave equations for graphs and the edge-based laplacian. Pacific Journal of Mathematics, 216(2):229–266, 2004.
  • [12] Paolo Giudici and Alessandro Spelta. Graphical network models for international financial flows. Journal of Business & Economic Statistics, 34(1):128–138, 2016.
  • [13] David K Hammond, Pierre Vandergheynst, and Rémi Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011.
  • [14] Meng Li and Howard Leung. Graph-based approach for 3d human skeletal action recognition. Pattern Recognition Letters, 87:195–202, 2017.
  • [15] Ta-Hsin Li. Time series with mixed spectra. CRC Press, 2013.
  • [16] Ta-Hsin Li and Benjamin Kedem. Asymptotic analysis of a multiple frequency estimation method. Journal of multivariate analysis, 46(2):214–236, 1993.
  • [17] Gerlind Plonka and Manfred Tasche. Prony methods for recovery of structured functions. GAMM-Mitteilungen, 37(2):239–258, 2014.
  • [18] Ivan Popivanov and Renee J Miller. Similarity search over time-series data using wavelets. In Proceedings 18th international conference on data engineering, pages 212–221. IEEE, 2002.
  • [19] R Prony. Essai experimental et analytique. Journal de L’Ecole Polytechnique, 1(2):24–76, 1795.
  • [20] Peter J Schmid. Dynamic mode decomposition of numerical and experimental data. Journal of fluid mechanics, 656:5–28, 2010.
  • [21] Kilian Stampfer and Gerlind Plonka. The generalized operator based prony method. Constructive Approximation, 52(2):247–282, 2020.
  • [22] Yuuya Takayama. Geometric formulation for discrete points and its applications. arXiv preprint arXiv:2002.03767, 2020.
  • [23] Jeff KT Tang and Howard Leung. Retrieval of logically relevant 3d human motions by adaptive feature selection with graded relevance feedback. Pattern Recognition Letters, 33(4):420–430, 2012.
  • [24] Jonathan H Tu. Dynamic mode decomposition: Theory and applications. PhD thesis, Princeton University, 2013.
  • [25] Caroline Uhler. Gaussian graphical models: An algebraic and geometric perspective. arXiv preprint arXiv:1707.04345, 2017.
  • [26] Ting Wang, Zhao Ren, Ying Ding, Zhou Fang, Zhe Sun, Matthew L MacDonald, Robert A Sweet, Jieru Wang, and Wei Chen. Fastggm: an efficient algorithm for the inference of gaussian graphical model in biological networks. PLoS computational biology, 12(2):e1004755, 2016.