This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Linear Simple Cycle Reservoirs at the edge of stability perform Fourier decomposition of the input driving signals

Robert Simon Fong School of Computer Science, University of Birmingham, Birmingham, B15 2TT, UK [email protected] Boyu Li Department of Mathematical Sciences, New Mexico State University, Las Cruces, New Mexico, 88003, USA [email protected]  and  Peter Tiňo School of Computer Science, University of Birmingham, Birmingham, B15 2TT, UK [email protected]
Abstract.

This paper explores the representational structure of linear Simple Cycle Reservoirs (SCR) operating at the edge of stability. We view SCR as providing in their state space feature representations of the input-driving time series. By endowing the state space with the canonical dot-product, we “reverse engineer” the corresponding kernel (inner product) operating in the original time series space. The action of this time-series kernel is fully characterized by the eigenspace of the corresponding metric tensor. We demonstrate that when linear SCRs are constructed at the edge of stability, the eigenvectors of the time-series kernel align with the Fourier basis. This theoretical insight is supported by numerical experiments.

Recurrent Neural Networks (RNNs) are machine learning methods for modeling temporal dependencies in sequential data, but their training can be computationally demanding. Reservoir Computing (RC), a simplified subset of RNNs, circumvents this problem by fixing the internal dynamics of the network (the reservoir) and focusing training on the readout layer. Simple Cycle Reservoir (SCR) is a type RC model that stands out for its minimalistic design and proven capability to universally approximate a wide class of processes operating on time series data (namely time-invariant fading memory filters) even in the linear dynamics regime (and non-linear static readouts). We show that, interestingly, when linear SCR is constructed at the edge of stability, it implicitly represents the time series according to a well known and widely used technique of Fourier signal decomposition. This insight demonstrates that deep connections can exist between recurrent neural networks, classical signal processing techniques and statistics, paving the way for their enhanced understanding and innovative applications.

1. Introduction

Recurrent Neural Networks (RNNs) are input-driven parametric state-space machine learning models designed to capture temporal dependencies in sequential input data streams. They encode time series data into a latent state space, dynamically storing temporal information within state-space vectors.

Reservoir Computing (RC) models is a subset of RNNs that operate with a fixed, non-trainable input-driven dynamical system (known as the reservoir) and a static trainable readout layer producing model responses based on the reservoir activations. This design uniquely simplifies the training process by concentrating adjustments solely on the readout layer (thus avoiding back-propagating the error information backwards through time), leading to enhanced computational efficiency. The simplest implementations of RC models include Echo State Networks (ESNs) [Jae01, MNM02, TD01, LJ09].

Simple Cycle Reservoirs (SCR) represent a specialized class of RC models characterized by a single degree of freedom in the reservoir construction (modulo the state space dimensionality), structured through uniform ring connectivity and binary input weights with an aperiodic sign pattern. Recently, SCRs were shown to be universal approximators of time-invariant dynamic filters with fading memory over \mathbb{C} and \mathbb{R} respectively in [LFT24, FLT24], making them highly suitable for integration in photonic circuits for high-performance, low-latency processing [BVdSBV22, LBFM+17, HVK+20].

Understanding the intricacies of SCRs in depth is essential. In this work, we employ the kernel view of linear ESNs introduced in [Tin20], in which the state-space ‘reservoir’ representation of (potentially left-infinite) input sequences is treated as a feature map corresponding to the given reservoir model through the associated reservoir kernel. For linear reservoirs, the canonical dot product of two input sequences’ feature representations is analytically expressible as the (semi-)inner product of the sequences themselves. The corresponding metric tensor reveals the representational structure imposed by the reservoir on the input sequences, in particular in the metric tensor’s eigenspace containing dominant projection axes (time series ’motifs’) and scaling (‘importance’ factors).

To assess the “richness” of linear SCR state-space representations, [Tin20] proposed analyzing the relative area of motifs under the Fast Fourier Transform (FFT). It was observed that the richness of these representations collapses at the edge of stability when the spectral radius ρ\rho of the dynamic coupling matrix equals to 1.

In this paper we theoretically analyze the collapse of motif richness at the edge of stability and show that when ρ=1\rho=1 the SCR kernel motifs correspond to Fourier basis. We begin by reviewing the notion of SCRs [RT10], kernel view of ESNs [Tin20], and Reservoir Motif Machines [TFL24] in Section 2. The contribution of this paper, in the subsequent sections, are outlined as follows:

  1. (1)

    In Section 4, we show in \mathbb{C} that motifs of linear SCR at the edge of stability are harmonic functions.

  2. (2)

    In Section 5, we show in \mathbb{R} that nn dimensional linear SCR has n2\lceil\frac{n}{2}\rceil symmetric motifs and n2\lfloor\frac{n}{2}\rfloor skew-symmetric motifs.

  3. (3)

    In Section 6, we combine the results of the previous two sections and demonstrate numerically that in \mathbb{R}, the motifs of linear SCR at edge of stability are exactly the columns of real Fourier basis matrix.

  4. (4)

    Finally in Section 7, we conclude the paper with numerical experiments supporting our findings.

2. Simple Cycle Reservoir and its temporal kernel

Let 𝕂=\mathbb{K}=\mathbb{R} or \mathbb{C} be a field. We first formally define the principal object of our study - parametrized linear driven dynamical system with a (possibly) non-linear readout.

Definition 2.1.

A linear reservoir system over 𝕂\mathbb{K} is is the triplet R:=(W,V,h)R:=(\mbox{\bf{W}},\mbox{\bf{V}},h) where the dynamic coupling W𝕄n×n(𝕂)\mbox{\bf{W}}\in\mathbb{M}_{n\times n}\left(\mathbb{K}\right) is an n×nn\times n matrix over 𝕂\mathbb{K}, the input-to-state coupling V𝕄n×m(𝕂)\mbox{\bf{V}}\in\mathbb{M}_{n\times m}\left(\mathbb{K}\right) is an n×mn\times m matrix, and the state-to-output mapping (readout) h:𝕂n𝕂dh:\mathbb{K}^{n}\to\mathbb{K}^{d} is a (trainable) continuous function.

The corresponding linear dynamical system is given by:

(2.1) {xt=Wxt1+Vctyt=h(xt)\begin{cases}\mbox{\bf{x}}_{t}&=\mbox{\bf{W}}\mbox{\bf{x}}_{t-1}+\mbox{\bf{V}}\mbox{\bf{c}}_{t}\\ \mbox{\bf{y}}_{t}&=h(\mbox{\bf{x}}_{t})\end{cases}

where {ct}t𝕂m\{\mbox{\bf{c}}_{t}\}_{t\in\mathbb{Z}_{-}}\subset\mathbb{K}^{m}, {xt}t𝕂n\{\mbox{\bf{x}}_{t}\}_{t\in\mathbb{Z}_{-}}\subset\mathbb{K}^{n}, and {yt}t𝕂d\{\mbox{\bf{y}}_{t}\}_{t\in\mathbb{Z}_{-}}\subset\mathbb{K}^{d} are the external inputs, states and outputs, respectively. We abbreviate the dimensions of RR by (n,m,d)(n,m,d).

We make the following assumptions for the system:

  1. (1)

    WW is assumed to be strictly contractive. In other words, its operator norm W<1\left\lVert W\right\rVert<1. The system (2.1) thus satisfies the fading memory property (FMP) [LFT24].

  2. (2)

    We assume the input stream is {ct}t\{\mbox{\bf{c}}_{t}\}_{t\in\mathbb{Z}_{-}} is uniformly bounded. In other words, there exists a constant MM such that ctM\left\lVert\mbox{\bf{c}}_{t}\right\rVert\leq M for all tt\in\mathbb{Z}_{-}.

The contractiveness of WW and the uniform boundedness of input stream imply that the images x𝕂n\mbox{\bf{x}}\in\mathbb{K}^{n} of the inputs c(𝕂m)\mbox{\bf{c}}\in(\mathbb{K}^{m})^{\mathbb{Z}_{-}} under the linear reservoir system live in a compact space X𝕂nX\subset{\mathbb{K}}^{n}. With slight abuse of mathematical terminology we call XX a state space.

Definition 2.2.

Let C=[cij]{\mbox{\bf{C}}}=[c_{ij}] be an n×nn\times n matrix. We say C is a permutation matrix if there exists a permutation σ\sigma in the symmetric group SnS_{n} such that

cij={1, if σ(i)=j,0, if otherwise.c_{ij}=\begin{cases}1,&\text{ if }\sigma(i)=j,\\ 0,&\text{ if otherwise.}\end{cases}

We say a permutation matrix C is a full-cycle permutation111Also called left circular shift or cyclic permutation in the literature. if its corresponding permutation σSn\sigma\in S_{n} is a cycle permutation of length nn. Finally, a matrix W=ρC\mbox{\bf{W}}=\rho\cdot{\mbox{\bf{C}}} is called a contractive full-cycle permutation if ρ(0,1)\rho\in(0,1) and C is a full-cycle permutation.

The idea of simple cycle reservoir was presented in [RT10] as a reservoir system with a very small number of design degrees of freedom, yet retaining performance capabilities of more complex or (unnecessarily) randomized constructions. In fact, it can be shown that even with such a drastically reduced design complexity, SCR models are universal approximators of fading memory filters [LFT24, FLT24].

Definition 2.3.

A linear reservoir system R=(W,w,h)R=\left(\mbox{\bf{W}},\mbox{\bf{w}},h\right) with dimensions (n,m,d)(n,m,d) is called a Simple Cycle Reservoir (SCR) 222We note that the assumption on the aperiodicity of the sign pattern in VV is not required for this study if

  1. (1)

    W is a contractive full-cycle permutation, and

  2. (2)

    w𝕄n×m({1,1})\mbox{\bf{w}}\in\mathbb{M}_{n\times m}\left(\left\{-1,1\right\}\right).

One possibility to understand inner representations of the input-driving time series forming inside the reservoir systems is to view the reservoir state space as a temporal feature space of the associated reservoir kernel [Tin20, GGO22] . Consider a linear reservoir system R=(W,w,h)R=(\mbox{\bf{W}},\mbox{\bf{w}},h) over 𝕂\mathbb{K} with dimensions (n,1,d)(n,1,d) operating on univariate input.

Let τ>n\tau>n denote the length of the look back window and consider two sufficiently long time series of length τ>n\tau>n,

u =(u(τ+1),u(τ+2),,u(1),u(0))\displaystyle=\left(u\left(-\tau+1\right),u\left(-\tau+2\right),\ldots,u\left(-1\right),u\left(0\right)\right)
=:(u1,u2,,uτ)𝕂τ\displaystyle=:\left(u_{1},u_{2},\ldots,u_{\tau}\right)\in\mathbb{K}^{\tau}

and

v =(v(τ+1),v(τ+2),,v(1),v(0))\displaystyle=\left(v\left(-\tau+1\right),v\left(-\tau+2\right),\ldots,v\left(-1\right),v\left(0\right)\right)
=:(v1,v2,,vτ)𝕂τ\displaystyle=:\left(v_{1},v_{2},\ldots,v_{\tau}\right)\in\mathbb{K}^{\tau}

we consider the reservoir states reached upon reading them (with zero initial state) their feature space representations [Tin20]:

ϕ(u)=j=1τujWτjw,ϕ(v)=j=1τvjWτjw.\phi(\mbox{\bf{u}})=\sum_{j=1}^{\tau}u_{j}\mbox{\bf{W}}^{\tau-j}\mbox{\bf{w}},\ \ \ \phi(\mbox{\bf{v}})=\sum_{j=1}^{\tau}v_{j}\mbox{\bf{W}}^{\tau-j}\mbox{\bf{w}}.

The canonical dot product (reservoir kernel)

K(u,v)=ϕ(u),ϕ(v)K(\mbox{\bf{u}},\mbox{\bf{v}})=\langle\phi(\mbox{\bf{u}}),\phi(\mbox{\bf{v}})\rangle

can be written in the original time series space as a semi-inner product u,vQ=uQv\langle\mbox{\bf{u}},\mbox{\bf{v}}\rangle_{Q}=\mbox{\bf{u}}^{\top}\mbox{\bf{Q}}\mbox{\bf{v}}, where

(2.2) Qi,j=w(W)i1Wj1w.Q_{i,j}=\mbox{\bf{w}}^{\top}\left({\mbox{\bf{W}}}^{\top}{}\right)^{i-1}\mbox{\bf{W}}^{j-1}\mbox{\bf{w}}.

Since the (semi-)metric tensor Q𝕄τ×τ(𝕂)\mbox{\bf{Q}}\in\mathbb{M}_{\tau\times\tau}(\mathbb{K}) is symmetric and positive semi-definite, it admits the following eigen-decomposition:

(2.3) Q=MΛQM,\displaystyle\mbox{\bf{Q}}=\mbox{\bf{M}}\Lambda_{Q}\mbox{\bf{M}}^{\top},

where ΛQ:=diag(λ1,λ2,,λNm)\Lambda_{Q}:=\operatorname{diag}\left(\lambda_{1},\lambda_{2},\ldots,\lambda_{N_{m}}\right) is a diagonal matrix consisting of non-negative eigenvalues of Q with the corresponding eigenvectors m1,m2,,mNm𝕂τ\mbox{\bf{m}}_{1},\mbox{\bf{m}}_{2},\ldots,\mbox{\bf{m}}_{N_{m}}\in\mathbb{K}^{\tau} (columns of M). The Nm:=rank(Q)NτN_{m}:=\operatorname{rank}\left(Q\right)\leq N\leq\tau eigenvectors of M with positive eigenvalues are called the motifs of RR. We have:

K(u,v)=(ΛQ12Mu)(ΛQ12Mv).K(\mbox{\bf{u}},\mbox{\bf{v}})=\left(\Lambda_{Q}^{\frac{1}{2}}\mbox{\bf{M}}^{\top}\mbox{\bf{u}}\right)^{\top}\left(\Lambda_{Q}^{\frac{1}{2}}\mbox{\bf{M}}^{\top}\mbox{\bf{v}}\right).

In particular, the reservoir kernel is a canonical dot product of time series projected onto the motif space spanned by {mi}i=1Nm\left\{\mbox{\bf{m}}_{i}\right\}_{i=1}^{N_{m}}:

K(u,v)=~u,~v,K(\mbox{\bf{u}},\mbox{\bf{v}})=\langle\tilde{}\mbox{\bf{u}},\tilde{}\mbox{\bf{v}}\rangle,

where

(2.4) ~u\displaystyle\tilde{}\mbox{\bf{u}} =ΛQ12Mu\displaystyle=\Lambda_{Q}^{\frac{1}{2}}\mbox{\bf{M}}^{\top}\mbox{\bf{u}}
=[λ112m1,uλNm12mNm,u]\displaystyle=\begin{bmatrix}\lambda^{\frac{1}{2}}_{1}\cdot\langle\mbox{\bf{m}}_{1},\mbox{\bf{u}}\rangle\\ \vdots\\ \lambda^{\frac{1}{2}}_{N_{m}}\cdot\langle\mbox{\bf{m}}_{N_{m}},\mbox{\bf{u}}\rangle\end{bmatrix}
=(λi12mi,u)i=1Nm𝕄Nm(𝕂).\displaystyle=\left(\lambda_{i}^{\frac{1}{2}}\cdot\langle\mbox{\bf{m}}_{i},\mbox{\bf{u}}\rangle\right)_{i=1}^{N_{m}}\in\mathbb{M}_{N_{m}}(\mathbb{K}).

Reservoir Motif Machine (RMM)[TFL24] is a predictive model motivated by the kernel view of linear Echo State Networks described above. By projecting the τ\tau-blocks of input time series onto the reservoir motif space given by span({mi}i=1Nm)\operatorname{span}(\{\mbox{\bf{m}}_{i}\}_{i=1}^{N_{m}}), RMM captures temporal and structural dynamics in a computationally efficient feature map.

In particular, rather than relying on the motif weights λi12\lambda_{i}^{\frac{1}{2}} determined by the reservoir RR in Equation (2.4), RMM introduces a set of adaptable motif coefficients, denoted as C:={ci}i=1NmC:=\{c_{i}\in\mathbb{R}\}_{i=1}^{N_{m}}, to define its feature map as follows:

φ(u;C)\displaystyle\varphi\left(\mbox{\bf{u}};C\right) =[c1m1,ucNmmNm,u]\displaystyle=\begin{bmatrix}c_{1}\cdot\langle\mbox{\bf{m}}_{1},\mbox{\bf{u}}\rangle\\ \vdots\\ c_{N_{m}}\cdot\langle\mbox{\bf{m}}_{N_{m}},\mbox{\bf{u}}\rangle\end{bmatrix}
(2.5) =(cimi,u)i=1NmNm.\displaystyle=\left(c_{i}\cdot\langle\mbox{\bf{m}}_{i},\mbox{\bf{u}}\rangle\right)_{i=1}^{N_{m}}\in\mathbb{R}^{N_{m}}.

This feature map is used to train a predictive model, such as linear regression or kernel-based methods, directly in the motif space.

Remark 2.4.

To streamline the theoretical analysis of SCR kernels, in line with [Tin20], we assume that the length of the look-back window (past horizon) τ\tau is an integer multiple of the dimension of the state space nn, i.e. there exists k+k\in\mathbb{N}_{+} such that τ=kn\tau=k\cdot n. In particular, denoting by m a SCR motif calculated with τ=n\tau=n, it is shown in [Tin20] that when τ=kn\tau=k\cdot n, the corresponding motif is a concatenation of kk copies of m, scaled by ρln\rho^{l\cdot n}, l=0,1,2,,k1l=0,1,2,...,k-1. Recall that ρ\rho is the spectral radius of the dynamic coupling WW. Hence, to study SCR motifs with past horizon τ=kn\tau=k\cdot n, for any k+k\in\mathbb{N}_{+}, it is sufficient to study only the base case τ=n\tau=n.

3. On the edge of stability of Simple Cycle Reservoirs

Having defined the reservoir temporal kernel, one can ask how “representationally rich” is the associated feature space (span of the reservoir motifs). To quantify the ‘richness’ of the reservoir feature space of a linear reservoir system over \mathbb{R}, [Tin20] proposed the following procedure:

Consider a linear reservoir system R=(W,w,h)R=(\mbox{\bf{W}},\mbox{\bf{w}},h) with dimensions (n,1,d)(n,1,d) over \mathbb{R}. Suppose W has spectral radius ρ\rho. Recall from Remark 2.4 that: to study the SCR motif structure of, it is sufficient to consider the past horizon τ=n\tau=n.The motif matrix M𝕄n×n()\mbox{\bf{M}}\in\mathbb{M}_{n\times n}\left(\mathbb{R}\right) is constructed according to Equation (2.3) from the matrix Q (metric tensor of the inner product of the reservoir kernel).

First, Fast Fourier Transform (FFT) is applied to the kernel motifs (columns of M), considering only those with motif weights upto a threshold of 10210^{-2} of the highest motif weight. This yields an n×nn\times n^{\prime} matrix of Fourier coefficients over \mathbb{C} with nnn^{\prime}\leq n. These Fourier coefficients are then collected in a (multi)set Z:={zk}k=1nnZ:=\left\{z_{k}\right\}_{k=1}^{n\cdot n^{\prime}}.

To evaluate the diversity and spread of the Fourier coefficients in the complex plane, [Tin20] proposed calculating the coarse-grained area occupied by ZZ\subseteq\mathbb{C}. In particular, the box [7,7]2\left[-7,7\right]^{2} in the complex plane is partitioned into a grid of cells (following [Tin20] we use side length 0.050.05). The relative area covered by ZZ\subseteq\mathbb{C} is defined as the ratio of the number of cells visited by the coefficients zkZz_{k}\in Z to the total number of cells in the grid. An example of the distribution of Fourier coefficients ZZ of linear SCR with n=97n=97 at ρ=0.9813,0.999,1\rho=0.9813,0.999,1 and w being the first nn digits of binary expansion of π\pi is presented in Figure 1 333In particular, ρ=0.9812798473475446\rho=0.9812798473475446 is where the relative area peaks at Figure 2..

Refer to caption
(a) ρ=0.9812798473475446\rho=0.9812798473475446
Refer to caption
(b) ρ=0.999\rho=0.999
Refer to caption
(c) ρ=1\rho=1
Figure 1. Example of Fourier coefficient of linear SCR at ρ=0.9812798473475446,0.999,1\rho=0.9812798473475446,0.999,1 respectively. 0.98127984734754460.9812798473475446 in particular is where the relative area peaks in Figure 2.
Refer to caption
Figure 2. Relative area of Linear SCR and Randomly generated Reservoir with respect to the spectral radius

Replicating the experiment in [Tin20] in Figure 2, we observe that the “richness” of the motif space of SCR, as measured by relative area under the current setup, increases as the spectral radius approaches approximately ρ0.9813\rho\approx 0.9813. Beyond this point, the measure sharply declines, aligning with the results for a randomly generated reservoir. This decline is also observed in is also shown in Figure 2, as ρ\rho increases from approximately 0.980.98 (the peak of Figure 2) to the edge of stability at ρ=1\rho=1.

In this paper, we aim to investigate this collapse of representational richness of SCR models at the edge of stability ρ=1\rho=1. In particular, we will show that at the edge of stability, the motifs of linear SCR are (sampled) harmonic functions.

The next two section will be a two-part exposition of the properties of the motifs of linear SCR:

  1. (1)

    In Section 4, we show in \mathbb{C} that the motifs are harmonic functions. Following the approach of [LFT24], we begin with unitary dynamical coupling and progress to cyclic permutations.

  2. (2)

    In Section 5, we demonstrate in \mathbb{R} that the number of symmetric motifs is n2\lceil\frac{n}{2}\rceil, while the number of skew-symmetric motifs is n2\lfloor\frac{n}{2}\rfloor. In line with [FLT24], we start with orthogonal coupling and then move onto cyclic permutation.

Combining these two results, we conclude that at ρ=1\rho=1, the motifs alternate between real and imaginary components of the first Fourier basis, which correspond to cosines and sines, respectively. We supplement our theoretical findings with numerical simulations of the motif space of SCR in the next section, which then lead to the numerical experiments in the final section.

4. Unit spectral radius SCR implies harmonic motifs in Complex Domain

We first show that, in the complex domain \mathbb{C}, the motifs of SCR can be derived explicitly. In particular, in this section we set 𝕂=\mathbb{K}=\mathbb{C} and show that when the spectral radius ρ=1\rho=1, the motifs of linear SCR are harmonic, i.e. they are precisely the Fourier basis (columns of the Fourier matrix).

Consider a linear reservoir system R=(W,w,h)R=(\mbox{\bf{W}},\mbox{\bf{w}},h) over \mathbb{C} with dimensions (n,1,d)(n,1,d) and W of spectral radius ρ\rho. Let Qρ\mbox{\bf{Q}}_{\rho} denote the metric tensor of the reservoir kernel (Eq. (2.2).

In the spirit of [LFT24], we begin by considering a more general setting of W=ρU\mbox{\bf{W}}=\rho\mbox{\bf{U}}, where U is a unitary matrix in 𝕄n×n()\mathbb{M}_{n\times n}(\mathbb{C}) (i.e. UU=UU=I\mbox{\bf{U}}\mbox{\bf{U}}^{*}=\mbox{\bf{U}}^{*}\mbox{\bf{U}}={\mbox{\bf{I}}}) and ρ(0,1]\rho\in(0,1]. We then move to the special case where U=C\mbox{\bf{U}}=\mbox{\bf{C}} is a cyclic permutation. Since U is unitary, its eigenvalues all have norm 11. We let its eigenvalues be {sj:1jn}\{s_{j}\in\mathbb{C}:1\leq j\leq n\} with corresponding eigenvectors {ξj:1jn}\{\xi_{j}:1\leq j\leq n\}, and we know each |sj|=1|s_{j}|=1.

By construction:

Qρ=[Qρ(i,j)]1i,jτ=[𝐰W(i1)Wj1𝐰]1i,jτ.\mbox{\bf{Q}}_{\rho}=[Q_{\rho\ (i,j)}]_{1\leq i,j\leq\tau}=[\mathbf{w}^{*}\mbox{\bf{W}}^{*(i-1)}\mbox{\bf{W}}^{j-1}\mathbf{w}]_{1\leq i,j\leq\tau}.

The matrix Qρ\mbox{\bf{Q}}_{\rho} is a τ×τ\tau\times\tau matrix. Denote

Xρ=[W(i1)Wj1]1i,jτ,\mbox{\bf{X}}_{\rho}=[\mbox{\bf{W}}^{*(i-1)}\mbox{\bf{W}}^{j-1}]_{1\leq i,j\leq\tau},

and,

𝐰^=[𝐰𝐰𝐰].\hat{\mathbf{w}}=\begin{bmatrix}\mathbf{w}&&&\\ &\mathbf{w}&&\\ &&\ddots&\\ &&&\mathbf{w}\end{bmatrix}.

Notice that Xρ\mbox{\bf{X}}_{\rho} is an τn×τn\tau n\times\tau n matrix while 𝐰^\hat{\mathbf{w}} is an τn×τ\tau n\times\tau matrix. Then by construction, Qρ=𝐰^Xρ𝐰^\mbox{\bf{Q}}_{\rho}=\hat{\mathbf{w}}^{*}\mbox{\bf{X}}_{\rho}\hat{\mathbf{w}}. Notice that since W=ρU\mbox{\bf{W}}=\rho\mbox{\bf{U}} and U is unitary, we can rewrite Xρ\mbox{\bf{X}}_{\rho} as follows:

Xρ=[ρi+j2Uji]1i,jτ=[IρUρ2U2ρτ1Uτ1ρUρ2Iρ3Uρ2U2ρ3Uρ4Iρτ1U(τ1)ρ2(τ1)I]\mbox{\bf{X}}_{\rho}=[\rho^{i+j-2}\mbox{\bf{U}}^{j-i}]_{1\leq i,j\leq\tau}=\begin{bmatrix}{\mbox{\bf{I}}}&\rho\mbox{\bf{U}}&\rho^{2}\mbox{\bf{U}}^{2}&\cdots&\rho^{\tau-1}\mbox{\bf{U}}^{\tau-1}\\ \rho\mbox{\bf{U}}^{*}&\rho^{2}{\mbox{\bf{I}}}&\rho^{3}\mbox{\bf{U}}&&\vdots\\ \rho^{2}\mbox{\bf{U}}^{*2}&\rho^{3}\mbox{\bf{U}}^{*}&\rho^{4}{\mbox{\bf{I}}}&&\\ \vdots&&&\ddots&\\ \rho^{\tau-1}\mbox{\bf{U}}^{*(\tau-1)}&&&&\rho^{2(\tau-1)}{\mbox{\bf{I}}}\end{bmatrix}

Finally, we let λ=1+ρ2+ρ4++ρ2(τ1)={τ, if ρ=11ρ2τ1ρ2, if ρ<1.\lambda=1+\rho^{2}+\rho^{4}+\cdots+\rho^{2(\tau-1)}=\begin{cases}\tau,&\text{ if }\rho=1\\ \frac{1-\rho^{2\tau}}{1-\rho^{2}},&\text{ if }\rho<1\end{cases}.

With the ultimate goal of computing the eigen-decomposition of Qρ\mbox{\bf{Q}}_{\rho}, we first characterize the eigenvalues and eigenvectors of Xρ\mbox{\bf{X}}_{\rho}. For the eigenvalues we first observe the following:

Lemma 4.1.

The matrix Xρ\mbox{\bf{X}}_{\rho} satisfies Xρ2=λXρ\mbox{\bf{X}}_{\rho}^{2}=\lambda\mbox{\bf{X}}_{\rho}.

Proof.

Multiplying the ii-th row of Xρ\mbox{\bf{X}}_{\rho} with its jj-th column, we obtain:

k=0τ1ρi+k1U(ik1)ρj+k1Ujk1=(k=0τ1ρ2k)ρi+j2Uji=λρi+j2Uji.\sum_{k=0}^{\tau-1}\rho^{i+k-1}\mbox{\bf{U}}^{-(i-k-1)}\cdot\rho^{j+k-1}\mbox{\bf{U}}^{j-k-1}=\left(\sum_{k=0}^{\tau-1}\rho^{2k}\right)\rho^{i+j-2}\mbox{\bf{U}}^{j-i}=\lambda\rho^{i+j-2}\mbox{\bf{U}}^{j-i}.

This proves the desired equality. ∎

As a result, we can now fully characterize the eigenvalues of Xρ\mbox{\bf{X}}_{\rho}.

Corollary 4.2.

The eigenvalues of Xρ\mbox{\bf{X}}_{\rho} are either 0 or λ\lambda. Moreover, the multiplicity of the eigenvalue λ\lambda is nn.

Proof.

Let ξ\xi be an eigenvector of Xρ\mbox{\bf{X}}_{\rho} with eigenvalue κ\kappa. Then Xρξ=κξ\mbox{\bf{X}}_{\rho}\xi=\kappa\xi, and thus

Xρ2ξ=κ2ξ=(λXρ)ξ=κλξ.\mbox{\bf{X}}_{\rho}^{2}\xi=\kappa^{2}\xi=(\lambda\mbox{\bf{X}}_{\rho})\xi=\kappa\lambda\xi.

This implies that either κ=0\kappa=0 or λ\lambda.

Moreover, Tr(Xρ)=n(1+ρ2++ρ2(τ1))=nλTr(\mbox{\bf{X}}_{\rho})=n(1+\rho^{2}+\cdots+\rho^{2(\tau-1)})=n\lambda. But the trace also equals the sum of all the eigenvalues. Since the eigenvalues of Xρ\mbox{\bf{X}}_{\rho} can only be 0 or λ\lambda, we conclude that the multiplicity of the eigenvalue λ\lambda is precisely nn. ∎

We now turn to characterize the non-zero eigenvectors of Xρ\mbox{\bf{X}}_{\rho}. We will then use these vectors to compute the motif matrix of Qρ\mbox{\bf{Q}}_{\rho}. Recall ξj\xi_{j} are the eigenvectors of U with the corresponding eigenvalues sjs_{j}. For each jj, define

ξ^j=[ξjρsj1ξjρτ1sj(τ1)ξj]nτ\hat{\xi}_{j}=\begin{bmatrix}\xi_{j}\\ \rho s_{j}^{-1}\xi_{j}\\ \vdots\\ \rho^{\tau-1}s_{j}^{-(\tau-1)}\xi_{j}\end{bmatrix}\in\mathbb{C}^{n\tau}
Lemma 4.3.

Each ξ^j\hat{\xi}_{j} is an eigenvector of Xρ\mbox{\bf{X}}_{\rho} with eigenvalue λ\lambda. Moreover, {1λξ^j}j=1n\left\{\frac{1}{\sqrt{\lambda}}\hat{\xi}_{j}\right\}_{j=1}^{n} is an orthonormal set.

Proof.

First, since U is unitary, we have Uξj=sjξj\mbox{\bf{U}}\xi_{j}=s_{j}\xi_{j} and Uξj=sj1ξj\mbox{\bf{U}}^{*}\xi_{j}=s_{j}^{-1}\xi_{j}. Therefore, Ukξj=sjkξj\mbox{\bf{U}}^{k}\xi_{j}=s_{j}^{k}\xi_{j} and Ukξj=sjkξj{\mbox{\bf{U}}^{*}}^{k}\xi_{j}=s_{j}^{-k}\xi_{j} for all non-negative integers kk, or, stated more compactly, Ukξj=sjkξj\mbox{\bf{U}}^{k}\xi_{j}=s_{j}^{k}\xi_{j} for all integers kk.

Expanding Xρξ^j\mbox{\bf{X}}_{\rho}\hat{\xi}_{j}, we observe that its ii-th block entry equals:

k=0n1ρi+k1U(ik1)ρksjkξj=k=0n1ρi+2k1sji+1ξj=(k=0n1ρ2k)ρi1sj(i1)ξj=λρi1sj(i1)ξj.\sum_{k=0}^{n-1}\rho^{i+k-1}\mbox{\bf{U}}^{-(i-k-1)}\rho^{k}s_{j}^{-k}\xi_{j}=\sum_{k=0}^{n-1}\rho^{i+2k-1}s_{j}^{-i+1}\xi_{j}=\left(\sum_{k=0}^{n-1}\rho^{2k}\right)\rho^{i-1}s_{j}^{-(i-1)}\xi_{j}=\lambda\rho^{i-1}s_{j}^{-(i-1)}\xi_{j}.

This is precisely λ\lambda times the ii-th block entry of ξ^j\hat{\xi}_{j}. This proves that Xρξ^j=λξ^j\mbox{\bf{X}}_{\rho}\hat{\xi}_{j}=\lambda\hat{\xi}_{j}.

Since each ξj\xi_{j} is a unit vector and each |sj|=1|s_{j}|=1, we have that ξ^j2=k=0n1ρ2k=λ\|\hat{\xi}_{j}\|^{2}=\sum_{k=0}^{n-1}\rho^{2k}=\lambda. Hence, 1λξ^j\frac{1}{\sqrt{\lambda}}\hat{\xi}_{j} is a vector of norm 11. Finally, eigenvectors {ξj}\{\xi_{j}\} of any unitary matrix form an orthonormal basis, so ξi,ξj=δij\langle\xi_{i},\xi_{j}\rangle=\delta_{ij}. This implies:

ξ^i,ξ^j=k=0n1ρ2ksik(sj¯)kξi,ξj,\langle\hat{\xi}_{i},\hat{\xi}_{j}\rangle=\sum_{k=0}^{n-1}\rho^{2k}s_{i}^{-k}(\overline{s_{j}})^{-k}\langle\xi_{i},\xi_{j}\rangle,

which is 0 when iji\neq j. This proves that {ξ^j}\left\{\hat{\xi}_{j}\right\}, j=1,,nj=1,\ldots,n, are pairwise orthogonal. ∎

Since all the other eigenvalues of Xρ\mbox{\bf{X}}_{\rho} are 0, we immediately have the desired eigen-decomposition of the matrix Xρ\mbox{\bf{X}}_{\rho} as:

Xρ=1λj=1nξ^jξ^j\mbox{\bf{X}}_{\rho}=\frac{1}{\lambda}\sum_{j=1}^{n}\hat{\xi}_{j}\hat{\xi}_{j}^{*}

Recall an n×nn\times n full-cycle permutation matrix is given by:

C:=[01100010000010]\mbox{\bf{C}}:=\begin{bmatrix}0&\cdots&\cdots&\cdots&1\\ 1&0&&\cdots&0\\ 0&1&0&\vdots&\vdots\\ \vdots&0&\ddots&0&\vdots\\ 0&\vdots&0&1&0\end{bmatrix}

For the rest of the section, we set ρ=1\rho=1 and W=C\mbox{\bf{W}}={\mbox{\bf{C}}} is a full-cycle permutation and we will now compute eigen-decomposition of the metric tensor Q=Q1\mbox{\bf{Q}}=\mbox{\bf{Q}}_{1} of the corresponding reservoir kernel.

From elementary matrix analysis, we know the eigenvalues of a full-cycle permutation C are precisely the nn-th root of unities {ωj=e2πijn,1jn}\{\omega_{j}=e^{\frac{2\pi ij}{n}},1\leq j\leq n\}. Its normalized eigenvectors for each eigenvalue ωj\omega_{j} is given by the Fourier basis:

(4.1) ξj=1n[1ωjωj2ωjn1]n,j=1,,n.\xi_{j}=\frac{1}{\sqrt{n}}\begin{bmatrix}1\\ \omega_{j}\\ \omega_{j}^{2}\\ \vdots\\ \omega_{j}^{n-1}\end{bmatrix}\in\mathbb{C}^{n},\quad j=1,\ldots,n.

We will now compute the eigenvalues and eigenvectors of Q explicitly. First, we see that

Q=𝐰^X1𝐰^,\mbox{\bf{Q}}=\hat{\mathbf{w}}^{*}\mbox{\bf{X}}_{1}\hat{\mathbf{w}},

where X1=1nj=1nξ^jξ^j\mbox{\bf{X}}_{1}=\frac{1}{n}\sum_{j=1}^{n}\hat{\xi}_{j}\hat{\xi}_{j}^{*}. Define ξj𝐰=𝐰,ξj=dj\xi_{j}^{*}\mathbf{w}=\langle\mathbf{w},\xi_{j}\rangle=d_{j}.

While, following Remark 2.4, we focused on the case where τ=n\tau=n, the results of this section up to this point (Lemma 4.1, Corollary 4.2, and Lemma 4.3) hold for general τ\tau.

Theorem 4.4.

Consider a linear SCR system R=(C,w,h)R=(\mbox{\bf{C}},\mbox{\bf{w}},h) over \mathbb{C} with dimensions (n,1,d)(n,1,d) and with the full cycle permutation matrix C as its dynamical coupling. Let Q denote metric tensor of the reservoir kernel under the past horizon τ=n\tau=n. Then, the normalized eigenvectors ξj\xi_{j}, j=1,2,,nj=1,2,...,n of C (eq. (4.1)) are also eigenvectors of Q with the corresponding eigenvalues equal to the squared projections of the input coupling vector w onto ξj\xi_{j}, |dj|2=|ξj𝐰|2|d_{j}|^{2}=|\xi_{j}^{*}\mathbf{w}|^{2}. In other words, RR has motifs ξj\xi_{j}, with motif weights |dj|=|ξj𝐰|=|𝐰,ξj||d_{j}|=|\xi_{j}^{*}\mathbf{w}|=|\langle\mathbf{w},\xi_{j}\rangle|, j=1,2,,nj=1,2,...,n.

Proof.

Define an n×τn\times\tau matrix,

A=1n[ξ^1ξ^2ξ^n]𝐰^{\mbox{\bf{A}}}=\frac{1}{\sqrt{n}}\begin{bmatrix}\hat{\xi}_{1}^{*}\\ \hat{\xi}_{2}^{*}\\ \vdots\\ \hat{\xi}_{n}^{*}\end{bmatrix}\hat{\mathbf{w}}

Then Q=AA\mbox{\bf{Q}}={\mbox{\bf{A}}}^{*}{\mbox{\bf{A}}}, and we have:

A=[d1ω11d1ω1(τ1)d1d2ω21d2ω2(τ1)d2dnωn1dnωn(τ1)dn]=DF{\mbox{\bf{A}}}=\begin{bmatrix}d_{1}&\omega_{1}^{-1}d_{1}&\cdots&\omega_{1}^{-(\tau-1)}d_{1}\\ d_{2}&\omega_{2}^{-1}d_{2}&\cdots&\omega_{2}^{-(\tau-1)}d_{2}\\ \vdots&\vdots&\vdots&\vdots\\ d_{n}&\omega_{n}^{-1}d_{n}&\cdots&\omega_{n}^{-(\tau-1)}d_{n}\\ \end{bmatrix}=\mbox{\bf{D}}\cdot\mbox{\bf{F}}

Here, by the assumption that τ=n\tau=n, we can define two n×nn\times n matrices D=diag{d1,d2,,dn}\mbox{\bf{D}}=\operatorname{diag}\{d_{1},d_{2},\cdots,d_{n}\} and F=[ωi(j1)]\mbox{\bf{F}}=[\omega_{i}^{-(j-1)}]. Note that F\mbox{\bf{F}}^{*} is the discrete Fourier transform matrix, so we know FF=FF=nI\mbox{\bf{F}}^{*}\mbox{\bf{F}}=\mbox{\bf{F}}\mbox{\bf{F}}^{*}=n\mbox{\bf{I}}. Therefore,

Q=(1nF)diag{|d1|2,|d2|2,,|dn|2}(1nF).\mbox{\bf{Q}}=\left(\frac{1}{\sqrt{n}}\mbox{\bf{F}}^{*}\right)\operatorname{diag}\{|d_{1}|^{2},|d_{2}|^{2},\cdots,|d_{n}|^{2}\}\left(\frac{1}{\sqrt{n}}\mbox{\bf{F}}\right).

Here, (1nF)(\frac{1}{\sqrt{n}}\mbox{\bf{F}}) is a unitary matrix. The jj-th column of (1nF)(\frac{1}{\sqrt{n}}\mbox{\bf{F}}^{*}), which is precisely ξj\xi_{j}, is the eigenvector for |dj|2|d_{j}|^{2}

Remark 4.5.

When τn\tau\neq n, we can still reach a similar decomposition as above, but the matrix F=[ωi(j1)]\mbox{\bf{F}}=[\omega_{i}^{-(j-1)}] is an n×τn\times\tau matrix. When τ\tau is not an integer multiple of nn in particular, its rows may not be orthogonal to each other in general, so it is hard to characterize the eigenvectors of Q. However, when τ\tau is an integer multiple of nn, the rows of F are orthogonal to each other. In this case, the eigenvectors of Q is given by the nn of these τ\tau dimensional vectors (ωij)j=0τ1(\omega_{i}^{-j})_{j=0}^{\tau-1}.

As a result, in the case when ρ=1\rho=1 and the matrix W=C\mbox{\bf{W}}={\mbox{\bf{C}}} is a full-cycle permutation, the motif matrix for Q consists of ξj\xi_{j}, which is precisely the discrete Fourier transform matrix. In other words, the columns of the motif matrix are precisely the Fourier basis.

Remark 4.6.

Notice that |dj|2=|dnj|2|d_{j}|^{2}=|d_{n-j}|^{2}, so that the eigenvalues and eigenvectors of the motif matrix come in pairs. Here, ξj\xi_{j} and ξnj=ξj¯\xi_{n-j}=\overline{\xi_{j}} share the same eigenvalue |dj|2=|dnj|2|d_{j}|^{2}=|d_{n-j}|^{2}.

5. Motifs of Orthogonal Dynamics with Unit Spectral Radius in Real Domain

In this section we return to the real domain, i.e. 𝕂=\mathbb{K}=\mathbb{R} and show that at unit spectral radius, the motifs of SCR consists of a fixed number of symmetric and skew symmetric vectors. As in [FLT24], we begin by deriving properties of the motif space of linear reservoir systems with orthogonal dynamical coupling and then move on to the special case of cyclic permutation.

Let W𝕄n×n()\mbox{\bf{W}}\in\mathbb{M}_{n\times n}\left(\mathbb{R}\right) be the dynamical coupling matrix of a reservoir system R=(W,w,h)R=(\mbox{\bf{W}},\mbox{\bf{w}},h) over \mathbb{R}. Suppose W is orthogonal with spectral radius ρ=1\rho=1. We now show that the matrix corresponding to the reservoir kernel Q:=Qρ=1\mbox{\bf{Q}}:=\mbox{\bf{Q}}_{\rho=1} is Toeplitz.

Q is Toeplitz if and only if Qij=Qi+1,j+1\mbox{\bf{Q}}_{ij}=\mbox{\bf{Q}}_{i+1,j+1} for all i,j=1,,n1i,j=1,\ldots,n-1. By construction of Q, this is satisfied if and only if:

w(W)i1Wj1w=w(W)iWjw.\mbox{\bf{w}}^{\top}\left(\mbox{\bf{W}}^{\top}\right)^{i-1}\mbox{\bf{W}}^{j-1}\mbox{\bf{w}}=\mbox{\bf{w}}^{\top}\left(\mbox{\bf{W}}^{\top}\right)^{i}\mbox{\bf{W}}^{j}\mbox{\bf{w}}.

Now, orthogonality of W implies WW=I\mbox{\bf{W}}^{\top}\mbox{\bf{W}}=\mbox{\bf{I}}. Without loss of generality assume that iji\geq j. Then

(W)i1Wj1=(W)ij=(W)iWj,\left(\mbox{\bf{W}}^{\top}\right)^{i-1}\mbox{\bf{W}}^{j-1}=\left(\mbox{\bf{W}}^{\top}\right)^{i-j}=\left(\mbox{\bf{W}}^{\top}\right)^{i}\mbox{\bf{W}}^{j},

showing that Q is indeed Toeplitz.

Let J denote the exchange matrix with 11 on the antidiagonal and 0 everywhere else:

J:=[0000100010001000100010000]\mbox{\bf{J}}:=\begin{bmatrix}0&0&0&\cdots&0&1\\ 0&0&0&\cdots&1&0\\ 0&0&\cdots&1&0&0\\ \vdots&\vdots&\iddots&\iddots&\vdots&\vdots\\ 0&1&0&\cdots&0&0\\ 1&0&0&\cdots&0&0\end{bmatrix}
Lemma 5.1.

Symmetric Toeplitz square matrices are centrosymmetric.

Proof.

Let T denote a symmetric Toeplitz τ×τ\tau\times\tau matrix. Let {tk}k=0τ1\left\{t_{k}\right\}_{k=0}^{\tau-1} denote the generating sequence of T such that T is expressed as:

T=[t0t1tτ2tτ1t1t0tτ3tτ2t2t1t0t1tτ1tτ2t1t0].\mbox{\bf{T}}=\begin{bmatrix}t_{0}&t_{1}&\cdots&t_{\tau-2}&t_{\tau-1}\\ t_{1}&t_{0}&\cdots&t_{\tau-3}&t_{\tau-2}\\ t_{2}&t_{1}&\ddots&\vdots&\vdots\\ \vdots&\vdots&\cdots&t_{0}&t_{1}\\ t_{\tau-1}&t_{\tau-2}&\cdots&t_{1}&t_{0}\end{bmatrix}.

Consider:

Tτ+1i,τ+1j\displaystyle\mbox{\bf{T}}_{\tau+1-i,\tau+1-j} =tτ+1iτ1+j\displaystyle=t_{\tau+1-i-\tau-1+j}
=tji=ti=j=Tij.\displaystyle=t_{j-i}=t_{i=j}=\mbox{\bf{T}}_{ij}.

The shows that T satisfies the definition of centro-symmetry. The second last equality is given by symmetry of T and the last equality is due to T being Toeplitz.

Corollary 5.2.

Let W be the dynamical coupling matrix of a reservoir system R=(W,w,h)R=(\mbox{\bf{W}},\mbox{\bf{w}},h). If W is orthogonal with spectral radius ρ=1\rho=1, then Q is symmetric centrosymmetric.

Proof.

Q is symmetric by construction. Moreover, when W is orthogonal with spectral radius 11, Q is Toeplitz and hence centrosymmetric by lemma 5.1. ∎

Definition 5.3.

Let 𝕂\mathbb{K} be an arbitrary field. A τ\tau-dimensional vector vv over 𝕂\mathbb{K} is symmetric if:

Jv=v.\mbox{\bf{J}}v=v.

Similarly, vv is skew-symmetric if:

Jv=v.\mbox{\bf{J}}v=-v.

By [CB76]: symmetric centrosymmetric matrices of order n admits an orthonormal basis of eigenvectors with:

  1. (1)

    n2\lceil\frac{n}{2}\rceil symmetric eigenvectors,

  2. (2)

    n2\lfloor\frac{n}{2}\rfloor skew-symmetric eigenvectors

5.1. Motif structure of cyclic permutation dynamics with unit spectral radius

Following Remark 2.4, we focus on the case where τ=n\tau=n below. In the special case when the dynamical coupling is a cyclic permutation, the corresponding matrix Q is circulant. A circulant matrix is a specific type of Toeplitz matrix in which each row is a cyclic shift of the previous row, with elements shifted one position to the right.

The cyclic permutation can equivalently be expressed as the linear map:

C:n\displaystyle\mbox{\bf{C}}:\mathbb{R}^{n} n\displaystyle\rightarrow\mathbb{R}^{n}
(5.1) (ak)k=1n\displaystyle\left(a_{k}\right)_{k=1}^{n} (bk)k=1n:=(ak1(modn))k=1n\displaystyle\mapsto\left(b_{k}\right)_{k=1}^{n}:=\left(a_{k-1\pmod{n}}\right)_{k=1}^{n}

Then by construction we have:

  1. C.1

    CαCβ=Cα+β(modn)\mbox{\bf{C}}^{\alpha}\cdot\mbox{\bf{C}}^{\beta}=\mbox{\bf{C}}^{\alpha+\beta\pmod{n}} .

  2. C.2

    (C)i=Cni(modn)\left(\mbox{\bf{C}}^{\top}\right)^{i}=\mbox{\bf{C}}^{n-i\pmod{n}}

Theorem 5.4.

Let W𝕄n×n()\mbox{\bf{W}}\in\mathbb{M}_{n\times n}\left(\mathbb{R}\right) be the dynamical coupling matrix of a reservoir system R=(W,w,h)R=(\mbox{\bf{W}},\mbox{\bf{w}},h). If W is a cyclic permutation with spectral radius ρ=1\rho=1. Then the metric tensor Q1:=Qρ=1\mbox{\bf{Q}}_{1}:=\mbox{\bf{Q}}_{\rho=1} of the reservoir kernel is circulant. Moreover, there exists an orthogonal basis such that RR has n2\lceil\frac{n}{2}\rceil symmetric motifs and n2\lfloor\frac{n}{2}\rfloor skew-symmetric motifs.

Proof.

Let rir_{i} denote the ithi^{th} row of the matrix Q:=Q1\mbox{\bf{Q}}:=\mbox{\bf{Q}}_{1}. It suffices to show that ri=Ci1r1r_{i}=\mbox{\bf{C}}^{i-1}\cdot r_{1}. By construction, for j=1,,nj=1,\ldots,n:

(r1)j\displaystyle\left(r_{1}\right)_{j} =wCj1w\displaystyle=\mbox{\bf{w}}^{\top}\mbox{\bf{C}}^{j-1}\mbox{\bf{w}}
(ri)j\displaystyle\left(r_{i}\right)_{j} =w(C)i1Cj1w\displaystyle=\mbox{\bf{w}}^{\top}\left(\mbox{\bf{C}}^{\top}\right)^{i-1}\mbox{\bf{C}}^{j-1}\mbox{\bf{w}}

By C.1 and C.2, the jthj^{th} component of rir_{i} can be rewritten as:

(ri)j\displaystyle\left(r_{i}\right)_{j} =w(C)i1Cj1w\displaystyle=\mbox{\bf{w}}^{\top}\left(\mbox{\bf{C}}^{\top}\right)^{i-1}\mbox{\bf{C}}^{j-1}\mbox{\bf{w}}
(5.2) =wCni+1Cj1w=wCji(modn)w\displaystyle=\mbox{\bf{w}}^{\top}\mbox{\bf{C}}^{n-i+1}\mbox{\bf{C}}^{j-1}\mbox{\bf{w}}=\mbox{\bf{w}}^{\top}\mbox{\bf{C}}^{j-i\pmod{n}}\mbox{\bf{w}}

By Equation 5.1 and C.1, the jthj^{th} component of Ci1r1\mbox{\bf{C}}^{i-1}\cdot r_{1} can be written as:

Ci1(r1)j\displaystyle\mbox{\bf{C}}^{i-1}\left(r_{1}\right)_{j} =wCj(i1)1(modn)w=wCji(modn)w\displaystyle=\mbox{\bf{w}}^{\top}\mbox{\bf{C}}^{j-(i-1)-1\pmod{n}}\mbox{\bf{w}}=\mbox{\bf{w}}^{\top}\mbox{\bf{C}}^{j-i\pmod{n}}\mbox{\bf{w}}

which coincides with Equation 5.1 and thus Q1\mbox{\bf{Q}}_{1} is circulant.

Circulant matrices are Toeplitz. Since Q1\mbox{\bf{Q}}_{1} is symmetric circulant, it is symmetric centrosymmetric of order nn. Thus by [CB76], Q1\mbox{\bf{Q}}_{1} admits an orthonormal basis spanning the space of space of motif consisting of n2\lceil\frac{n}{2}\rceil symmetric eigenvectors of Q1\mbox{\bf{Q}}_{1} and n2\lfloor\frac{n}{2}\rfloor skew-symmetric eigenvectors of Q1\mbox{\bf{Q}}_{1}. ∎

6. Linear SCR at unit spectral radius is Fourier transform

This section bridges the theoretical results and numerical experiments by explicitly constructing the Fourier basis matrix corresponding to a linear SCR over \mathbb{R}. We present numerical simulations to validate our theoretical findings and outline the components of the numerical experiments in the final section.

Combining the results of the previous sections, we conclude that: At unit spectral radius, the motifs of linear SCR over \mathbb{R} with nn neurons of look back window τ=n\tau=n are:

  1. R.1

    Harmonic, i.e. they correspond to the Fourier basis (Theorem 4.4).

  2. R.2

    There are n2\lceil\frac{n}{2}\rceil symmetric vectors (cosines) and n2\lfloor\frac{n}{2}\rfloor skew-symmetric vectors (sines) with positive eigenvalues. (Theorem 5.4).

We now demonstrate the explicit construction of the Fourier basis matrix corresponding to the motifs of a linear SCR at unit spectral radius.

For a linear reservoir system over \mathbb{R}, the motifs must also be real. By conditions R.1 and R.2, each motif must then correspond to either the real or imaginary parts of the Fourier basis. More precisely, for k=0,,n2k=0,\ldots,\lceil\frac{n}{2}\rceil and i=0,,τ1i=0,\ldots,\tau-1 (The ceiling function accounts for Theorem 5.4.), the real Fourier basis matrix F=F[i,j]𝕄τ×n()\mbox{\bf{F}}=\mbox{\bf{F}}[i,j]\in\mathbb{M}_{\tau\times n}(\mathbb{R}) is defined as

(6.1) F[i,2k]=2τcos2πkiτEven columnsF[i,2k+1]=2τsin2πkiτOdd columns\displaystyle\begin{split}\mbox{\bf{F}}[i,2k]&=\sqrt{\frac{2}{\tau}}\cos{\frac{2\pi ki}{\tau}}\quad\ldots\quad\text{Even columns}\\ \mbox{\bf{F}}[i,2k+1]&=\sqrt{\frac{2}{\tau}}\sin{\frac{2\pi ki}{\tau}}\quad\ldots\quad\text{Odd columns}\end{split}

We claim that the motifs of a linear SCR are exactly the first nn columns of F.

In Figure 3 and Figure 4, we present the Fourier analysis of motifs of the SCR at unit spectral radius when τ=n=97\tau=n=97.

In Figure 4, we present examples of 44 randomly chosen motifs and their corresponding Fourier basis in F. While some motifs and their corresponding Fourier basis are shifted by a fixed phase of π\pi or reflected over the xx-axis, their Fourier spectra align as illustrated in Figure 3. Furthermore, Figure 3, we observe that the eigenvalues come in pairs, as discussed in 4.6. The motifs may not be strictly harmonic in the classical sense due to the coarseness of the discretization grid (non-integer division of period) (See Remark 6.1 below).

Refer to caption
(a) Column-wise FFT of motifs of linear SCR with unit spectral radius.
Refer to caption
(b) Column-wise FFT of first nn columns of F.
Refer to caption
(c) Columns of Fig 3(a) rearranged according to largest Fourier spectra.
Refer to caption
(d) Columns of Fig 3(b) rearranged with same indices as Fig 3(c).
Figure 3. Column-wise FFT of motifs of linear SCR with ρ=1\rho=1 and the column-wise FFT of F. The first row shows the Fourier spectra in the original form and the second row has their columns rearranged with the same shuffling indices.
Refer to caption
Figure 4. Example of 4 randomly chosen motifs and their corresponding Fourier basis. Notice some are off by a phase of π\pi.
Remark 6.1.

In practice, motifs represent harmonic functions sampled at specific frequencies. This explains why certain motifs may not visually appear as harmonic functions (e.g., motif 93 in Figure 4), yet their Fourier spectra reveal characteristics consistent with harmonic functions (see Figure 3). The effect of sampling is further demonstrated in Figure 5, which compares the 93rd Fourier basis generated according to Equations 6.1 at two different sampling sizes.

On the first plot the sine curve is being sampled a high frequency with 500 sampling points; whereas the sampling points are reduced to 97=n=τ97=n=\tau on the second plot. On the third plot we see that they are generated by the exact same function but with different sampling frequency. Notice that the first plot appears to be harmonic but the second one does not.

Therefore, While this function may not visually resemble a harmonic function at the default sampling size of n=97n=97, increasing the sampling size to 500500 makes the function progressively align with the visual properties of a harmonic function.

The reason the Fourier spectra align with those of harmonic functions in the computational process is due to the matching of sampling frequencies within the FFT algorithm.

Refer to caption
Figure 5. Plots of Fourier basis number 9393 constructed under Equations 6.1 sampled under two different frequencies.

7. Illustrative Examples on Time Series Forecasting

We conclude the paper with numerical experiments illustrating our findings. We compare time series forecasting results using the so-called Reservoir Motif Machines (RMM) [TFL24] and the linear SCR. RMM is a simple time series forecasting method based on the feature space representation of time series derived from linear reservoirs and has demonstrated remarkable predictive performance, even surpassing that of more complex transformer models on univariate time series forecasting tasks. A brief exposition of RMM is presented in Section 2.

We use RMM because it allows us to explicitly define the feature space representation by imposing a set of motifs, rather than having the feature space representation implicitly defined as in classical ESNs, such as SCR. This provides an ideal platform to showcase our theoretical findings: that the feature space representation of the motif space of a linear SCR is the same as that defined by the Fourier basis matrix F in Equation 6.1.

For reproducibility of the experiments, all experiments are CPU-based and are performed on Apple M3 Max with 128GB of RAM. The source code and data of the numerical analysis is openly available at https://github.com/Lampertos/motif_Fourier.

We compare the prediction results on univariate time series forecasting across the following models:

  1. (1)

    Lin-RMM with SCR motifs and unit spectral radius,

  2. (2)

    Lin-RMM with Fourier basis motifs as discussed in Section 6, and

  3. (3)

    Linear SCR with unit spectral radius.

It is essential to emphasize that this experiment is not intended to showcase the predictive capability of the models, but rather to highlight the similarities in feature space representation between the linear SCR model over \mathbb{R} and the Fourier basis motifs introduced in Section 6. Consequently, hyperparameters are kept constant throughout the experiments to underscore this feature space comparison.

The fixed set of hyperparameters for all experiments are as follows: rin=0.05r_{\text{in}}=0.05 for input weights, n=97n=97 for the number of reservoir neurons, and τ=2n=194\tau=2n=194 for the look-back window length. All models are trained using ridge regression with a ridge coefficient of 10310^{-3}. The prediction horizon is set to 168168.

7.1. Datasets

To facilitate the comparison of our results with the state-of-the art, we have used the same datasets and the same experimental protocols used in the recent time series forecasting papers [ZZP+20]. Those are briefly described below for the sake of completeness.

ETT

The Electricity Transformer Temperature dataset444https://github.com/zhouhaoyi/ETDatasetconsists of measurements of oil temperature and six external power-load features from transformers in two regions of China. The data was recorded for two years, and measurements are provided either hourly (indicated by ’h’) or every 1515 minutes (indicated by ’m’). In this paper we used oil temperature of the ETTh1, ETTm1 dataset for univariate prediction with train/validation/test split being 1212/44/44 months.

ECL

The Electricity Consuming Load555https://archive.ics.uci.edu/dataset/321/
electricityloaddiagrams20112014
consists of hourly measurements of electricity consumption in kWh for 321 Portuguese clients during two years. In this paper we used client MT 320 for univariate prediction. The train/validation/test split is 15/3/4 months.

Weather

The Local Climatological Data (LCD) dataset666https://www.ncei.noaa.gov/data/local-climatological-data/ consists of hourly measurements of climatological observations for 1600 weather stations across the US during four years. The dataset was used for univariate prediction of the Wet Bulb Celcius variable.

7.2. Discussion

From Figure 6 and Figure 7 we see that there’s virtually no difference between Lin-RMM with unit spectral radius SCR motifs and Lin-RMM with Fourier basis. This confirms our observations in the previous sections. Furthermore Figure 6 also confirms that Lin-RMM with unit spectral radius SCR motifs (red bar) has superior performance against classical SCR with unit spectral radius, this affirms the studies in [TFL24].

Refer to caption
Figure 6. Statistics of MSE loss of Fourier RMM, unit SCR RMM and unit SCR across different data sets with fixed prediction horizon 168168.

Amongst RMM’s we compare the MSE loss in Figure 7. Notice that the difference between the MSE loss between RMM under Fourier motifs and SCR motifs are around 1e131e-13 which is negligible compared to the MSE loss of the models against standardized input signals.

Refer to caption
Figure 7. Comparison of MSE loss between Fourier RMM and unit SCR RMM in ETTm2.

8. Conclusion

Linear recurrent neural networks (RNN), such as Echo State Networks (ESN) can be thought of as providing feature representations of the input-driving time series in their state space [Tin20, GGO22]. By endowing the feature (state) space with the canonical dot-product, one can “reverse engineer” the corresponding inner product in the space of time series (time series kernel) that is defined through the the dot product in the RNN state space [Tin20]. This in turn helps to shed light on the inner representational schemes employed by the RNN to process the input-driving time series. In particular, the induced (semi-)inner product in the time series space can be theoretically analyzed through eigen-decomposition of the corresponding metric tensor. The eigenvectors (time series motifs) define the projection basis of the induced feature space and the (decay of) eigenvalues its dominant subspace and effective dimensionality. The induced time series kernels by the Simple Cycle Reservoir (SCR) models were shown to be superior (in terms of dimensionality, motif variability, and memory) to several alternative ESN constructions [Tin20].

In this paper we have shown a rather surprising result: When SCR is constructed at the edge of stability, the basis of its induced time series feature space correspond to the well known and widely used basis for signal decomposition - namely the Fourier basis.

This insight also explains the reduction in relative area covered by Fourier representations of SCR motifs observed by [Tin20] at the edge of stability. Our results imply that the feature space representation of a linear SCR at unit spectral radius effectively performs a weighted projection onto the Fourier basis.

This observation is supported by numerical experiments, in which we compared the time series forecasting accuracy of Lin-RMM with the motif space defined by a linear SCR at unit spectral radius and the Fourier basis, respectively.

References

  • [BVdSBV22] Ian Bauwens, Guy Van der Sande, Peter Bienstman, and Guy Verschaffelt. Using photonic reservoirs as preprocessors for deep neural networks. Frontiers in Physics, 10:1051941, 2022.
  • [CB76] A. Cantoni and P. Butler. Eigenvalues and eigenvectors of symmetric centrosymmetric matrices. Linear Algebra and its Applications, 13(3):275–288, 1976.
  • [FLT24] Robert Simon Fong, Boyu Li, and Peter Tiňo. Universality of real minimal complexity reservoir. arXiv preprint arXiv:2408.08071, 2024.
  • [GGO22] Lukas Gonon, Lyudmila Grigoryeva, and Juan-Pablo Ortega. Reservoir kernels and volterra series. arXiv preprint arXiv:2212.14641, 2022.
  • [HVK+20] Krishan Harkhoe, Guy Verschaffelt, Andrew Katumba, Peter Bienstman, and Guy Van der Sande. Demonstrating delay-based reservoir computing using a compact photonic integrated chip. Optics express, 28(3):3086–3096, 2020.
  • [Jae01] H. Jaeger. The ”echo state” approach to analysing and training recurrent neural networks. Technical report gmd report 148, German National Research Center for Information Technology, 2001.
  • [LBFM+17] Laurent Larger, Antonio Baylón-Fuentes, Romain Martinenghi, Vladimir S Udaltsov, Yanne K Chembo, and Maxime Jacquot. High-speed photonic reservoir computing using a time-delay-based architecture: Million words per second classification. Physical Review X, 7(1):011015, 2017.
  • [LFT24] Boyu Li, Robert Simon Fong, and Peter Tiňo. Simple Cycle Reservoirs are Universal. Journal of Machine Learning Research, 25(158):1–28, 2024.
  • [LJ09] M. Lukosevicius and H. Jaeger. Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3):127–149, 2009.
  • [MNM02] W. Maass, T. Natschlager, and H. Markram. Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Computation, 14(11):2531–2560, 2002.
  • [RT10] Ali Rodan and Peter Tiňo. Minimum complexity echo state network. IEEE transactions on neural networks, 22(1):131–144, 2010.
  • [TD01] P. Tiňo and G. Dorffner. Predicting the future of discrete sequences from fractal representations of the past. Machine Learning, 45(2):187–218, 2001.
  • [TFL24] Peter Tiňo, Robert Simon Fong, and Roberto Fabio Leonarduzzi. Predictive modeling in the reservoir kernel motif space. In 2024 International Joint Conference on Neural Networks (IJCNN), pages 1–8, 2024.
  • [Tin20] Peter Tino. Dynamical systems as temporal feature spaces. J. Mach. Learn. Res., 21:44–1, 2020.
  • [ZZP+20] Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wan Zhang. Informer: Beyond efficient transformer for long sequence time-series forecasting. In AAAI Conference on Artificial Intelligence, 2020.