Enhancing Dynamic-Mode Decomposition via Multi-Scale Analysis

Christopher W. Curtis

1 Introduction

In recent years, beginning with the now seminal papers [1, 2], a large body of literature has emerged around the modal decomposition technique known as Dynamic-Mode Decomposition (DMD); see [3] for a comprehensive review of the literature, and the recent reviews in [4] and [5] for the most recent summaries and extensions. The primary advantage of DMD is that it is a completely model free data processing tool, allowing for the ready determination of results from arbitrarily complicated data sets. However, this advantage comes at the cost of introducing an infinite dimensional framework by way of determining the spectral properties of the affiliated Koopman operator [6].

As shown in [7, 8], by making thoughtful choices of affiliated observables of the given data set which best explore the affiliated infinite dimensional spaces associated with the Koopman operator, one can get vastly superior results from the DMD approach. This then shows that even low dimensional data sets must be examined essentially from a Hilbert space perspective, thus making controlling for error and describing features of the DMD decomposition a difficult task. Some remedies to this can be found in the rigorous work presented in [9], which presents systematic ways to determine the spectra and eigenspaces of the Koopman operators. The methods developed are computationally intensive and thus restricted to relatively low dimensional dynamical systems.

Koopman Operators and Dynamic-Mode Decomposition

While finite-dimensional dynamical systems are certainly of interest in their own right, we often wish to study nonlinear partial-differential equations (PDEs) of the general form

\partial_{t}u=\mathcal{L}u+\mathcal{N}(u),~{}u(x,0)=u_{0}(x)\in\mathcal{H}

where $\mathcal{H}$ is some reasonably chosen Hilbert space. Frequently, the only recourse to solving the above initial-value problem is through numerical simulation, which is often facilitated by expanding the solution $u(x,t)$ along an orthonormal basis of $\mathcal{H}$ so that

u(x,t)\approx\sum_{j=1}^{N_{s}}\hat{u}_{j}(t)\hat{e}_{j}(x),~{}\left<\hat{e}_{j},\hat{e}_{k}\right>=\delta_{jk}.

This generates a system of equations describing the evolution of the coefficients $\hat{u}_{j}(t)$ .

We are therefore motivated to study nonlinear dynamical systems of the generic form

\frac{d}{dt}{\bf a}=f({\bf a},t),~{}{\bf a}(0)={\bf a}_{0}\in\mathbb{C}^{n},

where we suppose $N_{s}\gg 1$ . We denote the affiliated flow of this equation as ${\bf a}(t)=\varphi_{t}({\bf a}_{0})$ . We define the associated Hilbert space of observables, say $L_{2}\left(\mathbb{C}^{N_{s}},\mathbb{C},\mu\right)$ , so that $g\in L_{2}\left(\mathbb{C}^{N_{s}},\mathbb{C},\mu\right)$ if

\int_{\mathbb{C}^{N_{s}}}\left|g({\bf a})\right|^{2}d\mu\left({\bf a}\right)<\infty,

where $\mu$ is some appropriately chosen measure. A great deal of insight can be gained from looking at the affiliated linear representation of the problem given via the infinite-dimensional Koopman operator $\mathcal{K}^{t}$ , where

\mathcal{K}^{t}:L_{2}\left(\mathbb{C}^{N_{s}},\mathbb{C},\mu\right)\rightarrow L_{2}\left(\mathbb{C}^{N_{s}},\mathbb{C},\mu\right)

so that if $g\in L_{2}\left(\mathbb{C}^{N_{s}},\mathbb{C},\mu\right)$ , then

\mathcal{K}^{t}g({\bf a}_{0})=g\left(\varphi_{t}({\bf a}_{0})\right).

Note, throughout the remainder of the paper, we refer to the space of observables as $L_{2}\left(O\right)$ .

The power in this approach is that by moving to a linear-operator framework, the affiliated dynamics of the nonlinear system as measured via observables is captured via the eigenvalues of $\mathcal{K}^{t}$ . Thus, assuming for the moment that $\mathcal{K}^{t}$ has only discrete spectra, then if we can find a basis of $L_{2}\left(O\right)$ via the Koopman modes $h_{j}$ where

\mathcal{K}^{t}h_{j}=e^{t\lambda_{j}}h_{j}

then for any other observable $g$ , we should have

g=\sum_{j=1}^{\infty}c_{j}h_{j},~{}\mathcal{K}^{t}g=\sum_{j=1}^{\infty}e^{\lambda_{j}t}c_{j}h_{j}.

However, as one would imagine, determining the modes of the Koopman operator is in general impossible in an analytic way. The Dynamic-Mode Decomposition (DMD) method [2, 1, 7, 3] has been developed which allows for the determination of a finite number of the Koopman modes.

To wit, if we sample the flow ${\bf a}(t;{\bf a}_{0})$ at a discrete set of times $t_{j}$ , where $j=0,\cdots,N_{T}$ , thereby generating the data set ${\bf a}_{j}=\varphi_{t_{j}}({\bf a}_{0})$ , and if we select some set of observables ${\bf g}=\{g_{l}\}_{l=1}^{M}$ , then the DMD method approximates $\mathcal{K}^{t}$ by computing the spectra of the finite-dimensional operator $\tilde{\mathcal{K}}_{a}$ , which solves the the finite-dimensional least-squares problem

G_{+}=\tilde{\mathcal{K}}_{a}G_{-}

(1)

where $G_{-}$ and $G_{+}$ are the $M\times N_{T}$ matrices of the past and future states respectively, i.e.

G_{-}=\left\{{\bf g}\left({\bf a}_{0}\right)\cdots{\bf g}\left({\bf a}_{N_{T}-1}\right)\right\},~{}G_{+}=\left\{{\bf g}\left({\bf a}_{1}\right)\cdots{\bf g}\left({\bf a}_{N_{T}}\right)\right\}

For example, if we chose the observable $g_{l}({\bf a})$ where

g_{l}({\bf a}(t))=\sum_{j=1}^{N_{s}}a_{j}(t)\hat{e}_{j}(x_{l})\approx u(x_{l},t)

which just finds the approximation of the solution to the PDE of interest at the $l^{th}$ point of the computational mesh, then the Koopman operator provides a linear framework to describe the dynamics of a coordinate which evolves along an otherwise nonlinear flow. Note, this particular choice of observation function is called the canonical observation throughout the remainder of this paper.

The efficacy of this approach though is contingent on the choice of observables one makes, and to date no general strategy for observable choosing strategies has been made. Some success has been had by trying to use kernel based means allowing for the efficient exploration of more complicated, nonlinear observables; see [10]. However, as shown in [8], for infinite dimensional problems, or at least the relatively high-dimensional dynamical systems which approximate them, kernel based techniques can be of very limited utility. By using various reproducing kernel techniques from machine learning, [11], more robust methods for determining effective observables can be found. While effective, the methodology in [11] is computationally intensive requiring the manipulation of large matrices.

In this paper we propose a relatively stream-lined, and in some sense, more physically motivated approach to constructing nonlinear observables $g$ . We do this by making use of wavelet based, multi-scale approximations of various function space norms. By doing so, we are able to create a more robust implementation of the DMD algorithm without creating a significantly larger computational burden. We likewise show that the greater degrees of freedom introduced by our approach allow for more optimal decomposition strategies than would be available otherwise. Thus our work should be of use in helping to enhance current DMD based approaches as well as the next way of methodologies under current development [12, 13].

The outline of the paper is as follows. In Section 2 we explain our wavelet based multi-scale generation of observables. In Section 3 we present our results on a model test case. Section 4 collects the conclusions of our work.

2 Multi-Scale Analysis and Norm-Based Observables

To remedy this issue of effective observable selection, we make use multi-scale approximations by way of wavelets. Discrete wavelet analysis begins with a scaling function $\phi\in\mathcal{H}$ , such that relative to some length scale $\delta$ , the set of functions

\left\{\tau_{n}\phi\right\}_{n=-\infty}^{\infty},

forms an orthonormal set, where

\tau_{n}\phi(x)=\phi\left(x-n\right).

We define the subspace $V_{0}\subset\mathcal{H}$ so that

V_{0}=\mbox{Span}\left\{\left\{\tau_{n}\phi\right\}_{n=-\infty}^{\infty}\right\}.

By introducing the corresponding wavelet function $\psi$ , one is able to seperate the space $V_{0}$ such that

V_{0}=V_{1}\oplus W_{1},

where

V_{1}=\mbox{Span}\left\{\sqrt{2}\left\{\tau_{2n}\phi\right\}_{n=-\infty}^{\infty}\right\},~{}W_{1}=\mbox{Span}\left\{\sqrt{2}\left\{\tau_{2n}\psi\right\}_{n=-\infty}^{\infty}\right\},

so that $V_{1}$ represents the parts of functions in $V_{0}$ which vary on double the length scale, while $W_{1}$ represents the details of the functions in $V_{0}$ . In turn, one can then look at longer scales by separating $V_{1}=V_{2}\oplus W_{2}$ and so forth. This collection of separated spaces represents a multi-resolution approximation (MRA).

Thus, for a given function $u(x,t)$ , or some high-order numerically generated approximation to it, we can decompose $u(x,t)$ in an MRA consisting of $N_{lvl}$ levels such that

u(x,t)\approx\sum_{j=1}^{N_{lvl}}\sum_{n=-M_{f}}^{M_{f}}d_{j,n}(t)\psi_{j,n}(x)+\sum_{n=-M_{f}}^{M_{f}}a_{N_{lvl},n}(t)\phi_{N_{lvl},n}(x),

where

\psi_{j,n}(x)=2^{j/2}\tau_{n}\psi\left(2^{j}x\right),~{}\phi_{N_{lvl},n}(x)=2^{N_{v}/2}\tau_{n}\psi\left(2^{N_{lvl}x}\right),

and where $d_{j,n}$ denotes the detail coefficients at the $j^{th}$ scale while $a_{N_{lvl},n}$ denotes the approximation coefficients at the terminal scale $N_{lvl}$ . To use the MRA to build observables $g$ , we note that because of orthonormality, we can readily show that

\left|\left|u(\cdot,t)\right|\right|_{2}^{2}\approx\sum_{j=1}^{N_{v}}\sum_{n=-M_{f}}^{M_{f}}\left|d_{j,n}(t)\right|^{2}+\sum_{n=-M_{f}}^{M_{f}}\left|a_{N_{lvl},n}(t)\right|^{2}

Thus, we can define $N_{lvl}+1$ affiliated observables say $g_{2,l}$ where

g_{2,l}({\bf a}(t))=\left\{\begin{array}[]{rl}\displaystyle{\sum_{n=-M_{f}}^{M_{f}}}\left|d_{l,n}(t)\right|^{2},&l=1,\cdots,N_{lvl}\\ &\\ \displaystyle{\sum_{n=-M_{f}}^{M_{f}}}\left|a_{N_{v},n}(t)\right|^{2},&l=N_{lvl}+1\end{array}\right.

From an analytic point of view, one of the most significant applications of orthonormal wavelets is their ability to readily characterize complicated function spaces in terms of the detail and approximation coefficients described above. In particular, following [14], if we choose a function $\tilde{\varphi}\in\mathcal{S}$ , where $\mathcal{S}$ is the space of functions of rapid decay [15], such that for its Fourier transform $\hat{\tilde{\varphi}}$ , we have for some positive constant $c$ that

\mbox{supp}~{}\hat{\tilde{\varphi}}\subset\left\{k\in\mathbb{R}:\frac{1}{2}\leq|k|\leq 2\right\},~{}\left|\hat{\tilde{\varphi}}\right|>c>0,~{}\frac{3}{5}\leq|k|\leq\frac{5}{3}

then the Besov space $\dot{B}^{\alpha,q}_{p}$ , for $\alpha\geq 0$ , $1\leq p,q\leq\infty$ is characterized by the norm

\left|\left|f\right|\right|_{\dot{B}^{\alpha,q}_{p}}=\left(\sum_{n=-\infty}^{\infty}2^{n\alpha q}\left|\left|\tilde{\varphi}_{2^{-n}}\ast f\right|\right|^{q}_{p}\right)^{1/q},

where

\tilde{\varphi}_{2^{-n}}(x)=2^{n}\tilde{\varphi}\left(2^{n}x\right),

and $\left|\left|\cdot\right|\right|_{p}$ refers to the standard norm for the Lebesgue space $L_{p}$ , i.e.

\left|\left|f\right|\right|_{p}=\left(\int_{\mathbb{R}}\left|f(x)\right|^{p}dx\right)^{1/p}.

Here the parameter $\alpha$ acts as a regularity paramater measuring the degree of differentiability of the function $f$ . To wit, if we let $q=2$ , then for $p\geq 2$ , $B^{\alpha,2}_{p}\subset\dot{L}_{p,\alpha}$ , where $\dot{L}_{p,\alpha}$ is defined via the norm

\left|\left|f\right|\right|_{p,\alpha}=\left|\left|\frac{d^{\alpha}f}{dx^{\alpha}}\right|\right|_{p}.

Thus the homogeneous Besov spaces are in some sense $L_{p}$ spaces but with regularity. It is a nontrivial yet powerful result, see [14], that if $f$ has a wavelet expansion in the form

f(x)=\sum_{j=-\infty}^{\infty}\sum_{k=-\infty}^{\infty}d_{j,k}\psi_{j,k}(x),

then

\left|\left|f\right|\right|_{\dot{B}^{\alpha,q}_{p}}\sim\left(\sum_{n=-\infty}^{\infty}2^{nq(\alpha+1/2-1/p)}\left(\sum_{j=-\infty}^{\infty}|d_{n,j}|^{p}\right)^{q/p}\right)^{1/q}.

Thus, by using one wavelet decomposition, we can generate an arbitrary number of observables of the form $g^{\alpha,q}_{p,l}$ , where

g^{\alpha,q}_{p,l}({\bf a}(t))\sim 2^{nq(\alpha+1/2-1/p)}\left(\sum_{j=-\infty}^{\infty}|d_{n,j}|^{p}\right)^{q/p},

in the same way that we separated the $L_{2}$ norm into a sequence of observables reflecting the different wavelet scales.

3 Implementation and Results

Model Problem

We take as our prototypical example of a nonlinear PDE the Nonlinear Schrödinger Equation (NLS) which is

i\partial_{t}u+\partial_{x}^{2}u+\left|u\right|^{2}u=0.

While we cannot simulate this equation on all of $\mathbb{R}$ , by taking $x\in[-L,L]$ , $L\gg 1$ , we can generate a highly accurate numerical approximation to the above equation by using a standard periodic pseudo-spectral approximation in space coupled to a 4th-order Runge-Kutta scheme with integrating factor in time. Thus, we decompose $u(x,t)$ so that

u(x_{l},t)\approx\frac{1}{K_{T}}\sum_{m=0}^{K_{T}-1}\hat{u}_{m}(t)e^{ik_{m}x_{l}},~{}x_{l}=-L+l\delta_{x},~{}k_{m}=-\frac{\pi}{\delta_{x}}+m\delta_{k},

where $\delta_{x}=\frac{2L}{K_{T}}$ , $\delta_{k}=\frac{\pi}{L}$ , thereby generating a $K_{T}$ -dimensional dynamical system describing the evolution of $\hat{u}_{j}(t)$ . In general, we expect this system to be Hamiltonian, which implies the Koopman operator is unitary, and thus has strictly imaginary spectra $\lambda_{j}$ ; see [6, 9, 16].

As an initial condition, we take

u(x,0)=\sqrt{2}\left(\mbox{sech}(x)+\epsilon\mbox{sech}(x-x_{s})\right)+\frac{\epsilon}{2\pi}\int_{-\pi/\delta_{x}}^{\pi/\delta_{x}}e^{ikx}e^{-2k^{2}}e^{2i\pi\tilde{\theta}(k)}dk,

where $\tilde{\theta}(k)$ is a continuous random-walk generated from the uniform distribution $U(-\delta_{k}/2,\delta_{k}/2)$ , and $\epsilon\geq 0$ . Thus for nonzero $\epsilon$ , we look at an inital condition in which we have a central peak at $x=0$ and a secondary one at $x=x_{s}$ all on a background of normally distributed perturbations. In the limit as $\epsilon\rightarrow 0^{+}$ , we should have

u(x,t)\sim\sqrt{2}\mbox{sech}(x)e^{it},~{}\epsilon\sim 0^{+}.

Using the canonical observation in the $\epsilon\rightarrow 0^{+}$ limit, we see that if the time step in the numerical integrator is $\delta t$ then we have

G_{+}=D^{(s)}Oe^{i\delta t}D^{(t)},

where $D^{(s)}$ and $D^{(t)}$ are $K_{T}\times K_{T}$ and $N_{T}\times N_{T}$ diagonal matrices respectively such that

D^{(s)}_{ll}=\mbox{sech}(x_{l}),~{}D^{(t)}_{ll}=e^{il\delta t},

and $O$ is a $K_{T}\times N_{T}$ matrix of ones, i.e.

O=\begin{pmatrix}1&\cdots&1\\ \vdots&\ddots&\vdots\\ 1&\cdots&1\end{pmatrix}.

Thus, in the vanishing $\epsilon$ limit, we see that the approximation to the Koopman operator found via DMD given by $\tilde{K}_{a}$ is

\tilde{K}_{a}=e^{i\delta t}D^{(s)}O\left(D^{(s)}O\right)^{-P},

where $\left(D^{(s)}O\right)^{-P}$ denotes the Moore–Penrose pseudoinverse of the matrix $D^{(s)}O$ . If the Singular Value Decomposition (SVD) of $O$ is

O=U\Sigma V^{\dagger},

we readily see that

\tilde{K}_{a}=\frac{e^{i\delta t}}{K_{T}}D^{(s)}O_{K_{T}}\left(D^{(s)}\right)^{-1}

where $O_{K_{T}}$ is a $K_{T}\times K_{T}$ matrix of ones. Thus the eigenvalues of $\tilde{K}_{a}$ are $e^{i\delta t}$ and $0$ , repeated $K_{T}-1$ times, so ignoring the null-space of $\tilde{K}_{a}$ , we find the affiliated Koopman eigenvalue $\lambda=i$ . Thus in all future computations we anticipate that the computed spectra should reflect some degree of perturbation away from $e^{i\delta t}$ .

DMD: Details and Implementation

Following the approach in [2], the practical implementation of the DMD starts with a SVD of the past matrix $G_{-}$ so that Equation (1) becomes

G_{+}=\tilde{\mathcal{K}}_{a}U_{-}\Sigma_{-}V_{-}^{\dagger}.

Motivated by the previous example though, we should anticipate the possibility that $\Sigma_{-}$ could be of relatively small rank, which is to say that the conditioning inherent in trying to compute the Moore–Penrose pseudoinverse of $G_{-}$ could compromise the accuracy of any results one might try to compute. Thus, letting ${\bf d}_{\Sigma_{-}}$ be the descending ordered vector of the non-negative primary diagonal entries of $\Sigma_{-}$ , we define the index $D_{T_{l}}$ to be the largest index of ${\bf d}_{\Sigma_{-}}$ such that

\log_{10}\frac{{\bf d}_{\Sigma_{-},D_{T_{l}}}}{{\bf d}_{\Sigma_{-},1}}>-T_{l},~{}T_{l}\in\mathbb{N}\cup\left\{0\right\}.

Thus the non-negative integer tol sets the relative accuracy of any computations and allows for the omission of spurious modes computed by the DMD. This in turn defines a reduced SVD of $G_{-}$ so that

G_{-}\approx U^{T_{l}}_{-}\Sigma^{T_{l}}_{-}\left(V^{T_{l}}_{-}\right)^{\dagger},

and thus $\tilde{\mathcal{K}}_{a}$ becomes the $M\times M$ matrix

\tilde{\mathcal{K}}_{a}=G_{+}V^{T_{l}}_{-}\left(\Sigma^{T_{l}}_{-}\right)^{-P}\left(U^{T_{l}}_{-}\right)^{\dagger}.

Finding the affiliated eigenvalues and eigenvectors of $\tilde{\mathcal{K}}_{a}$ , i.e.

\tilde{\mathcal{K}}_{a}\Phi_{j}=\mu_{j}\Phi_{j}

then gives us the affiliated approximation formula at $t_{n}=n\delta t$

{\bf g}({\bf a}(t))\approx\sum^{M}_{j=1}b_{j}\Phi_{j}\mbox{exp}\left(\frac{t}{\delta t}\log\mu_{j}\right),

where the coefficients reflect the initial values of the observables ${\bf g}({\bf a}(0))$ .

With regards to our choice of Besov norms and corresponding multiscale decompositions, we choose to add to the list of canonical observables the decompositions of the quantities

\left|\left|u\right|\right|^{2}_{2},~{}\left|\left|u\right|\right|^{2}_{\dot{B}^{0,2}_{4}},~{}\left|\left|u\right|\right|^{2}_{\dot{B}^{1,2}_{2}}.

This corresponds to the first conserved quantity of the NLSE, i.e. the energy $\mathcal{E}(u)=\left|\left|u\right|\right|^{2}_{2}$ , while the next two quantities reflect the nonlinearity and dispersion of the NLSE as reflected in the Hamiltonian $\mathcal{H}(u)$ where

	$\displaystyle\mathcal{H}(u)=$	$\displaystyle\frac{1}{2}\left\|\left\|\partial_{x}u\right\|\right\|^{2}_{2}+\frac{1}{4}\left\|\left\|u\right\|\right\|^{4}_{4},$
	$\displaystyle\sim$	$\displaystyle\left\|\left\|u\right\|\right\|^{2}_{\dot{B}^{1,2}_{2}}+\left\|\left\|u\right\|\right\|^{2}_{\dot{B}^{0,2}_{4}}.$

This adds an overall $3(N_{lvl}+1)$ observables to the $K_{T}$ canonical observables corresponding to finding $u(x_{l},t)$ , which we collectively denote as $\tilde{{\bf g}}$ . Then the extended observables ${\bf g}$ are given by

{\bf g}=\begin{pmatrix}\tilde{{\bf g}}\\ {\bf g}_{2}\\ {\bf g}^{1,2}_{2}\\ {\bf g}^{0,2}_{4}\end{pmatrix}

We denote the Besov-norm enhanced, multiscale DMD as the MDMD. Note, while we have relied on knowing the model to make our decisions, one would very naturally choose these two Besov norms in so far as their affiliated variational derivatives naturally lead one to cubic nonlinearities and Laplacian based dispersion terms, which from a Ginzburg–Landau analysis are very natural leading order effects which emerge across a wide range of models.

At this point, with the choice of $T_{l}$ and the number of levels across which to perform a wavelet decomposition of the data, it is a nontrivial question to determine parameter choices for optimal performance of the DMD decomposition. To measure this performance, we use two criteria. The first is accuracy of the reconstruction of the given data. The second is that we know the underlying model is a Hamiltonian dynamical system, and thus the affiliated spectra $\mu_{j}$ should lie entirely on the imaginary unit circle. We thus define an affiliated error $E(T_{l},N_{v})$ where

E(T_{l},N_{v},\tilde{w})=E_{rc}(T_{l},N_{v})+\tilde{w}E_{sp}(T_{l},N_{v}),

where the reconstruction error $E_{rc}$ is given by

E_{rc}(T_{l},N_{v})=\left|\left|\tilde{{\bf g}}({\bf a}(t_{f}))-\tilde{{\bf g}}_{DMD}(t_{f})\right|\right|_{2}

and the spectral error $E_{sp}$ is given by

E_{sp}(T_{l},N_{v})=\left(\frac{1}{M}\sum_{j=1}^{M}\left(\left|\mu_{j}\right|-1\right)^{2}\right)^{1/2}.

where $\tilde{{\bf g}}({\bf a}(t_{f}))$ are the canonical observables not including those computed by Besov Norms measured at the final sampled time point, $\tilde{{\bf g}}_{DMD}(t_{f})$ is the affiliated DMD generated approximation of those same observables computed at the final sampled time point, and $\tilde{w}\in[0,2]$ is a relative weighting parameter between the two criteria. Thus for $0\leq\tilde{w}<1$ , we favor reconstruction accuracy, while for $1<\tilde{w}\leq 2$ , we favor adherance of the spectra $\mu_{j}$ to the unit circle.

Throughout the simulations below then, for given choice of $\tilde{w}$ , we use an optimization strategy of sweeping through different choices of $N_{lvl}$ and $T_{l}$ until a minimal error is found. The parameters are swept out over the intervals

2\leq T_{l}\leq 10,~{}1\leq N_{lvl}\leq\log_{2}K_{T}-1.

Since we examine the role of noise, we also do ensemble runs of where the number of members $N_{e}=16$ , which was the largest number of runs which allowed for practical run times of the simulations. Each simulation is run with $L=32$ and $K_{T}=256$ for a time step of $\delta t=.1$ and up to a time of $t_{f}=30$ so that $N_{T}>K_{T}+3(N_{lvl}+1)$ since in this case we have that $1\leq N_{lvl}\leq 7$ . The secondary peak in the initial condition is centered at $x_{s}=5$ .

Results

We first look at weighting the error norm in favor of approximation accuracy by choosing $\tilde{w}=.01$ for relatively weak perturbations, $\epsilon=.05$ , and stronger perturbations in which $\epsilon=.25$ . These results are seen respectively in Figures 1 and 2. For the relatively weak perturbation case, we see the reconstruction error is at times almost an order of magnitude better for the MDMD, while only making a modest sacrifice in the spectral error; see Figures 1 (a) and (b). Moreover, looking at Figures 1 (a) and (b), we see that the MDMD method is able to create a more uniform error in the approximation across the 16 different simulations. As we see in Figure 1 (c), this is done in part by allowing for a higher value of $T_{l}$ , thereby allowing for the use of more modes emerging from the DMD decomposition, and in part by using $N_{lvl}$ to adjust to varying noise profiles in an adaptive way as seen in Figure 1 (d).

For the larger perturbation when $\epsilon=.25$ , as seen in Figure 2 (a), the classic DMD approach often is incapable of maintaining reconstruction accuracy. While the multiscale approach also can suffer, it is far more reliable than its traditional counterpart. As seen in Figure 2 (b), this better reconstruction accuracy comes at a modest loss of spectral accuracy relative to the traditional DMD, though it should be noted that the reconstruction errors of the traditional DMD are $\mathcal{O}(1)$ , thus bringing into question the value of it maintaing spectral accuracy. Again, as seen in Figures 2 (b)-(d), the error profile using Besov-norms is slightly better and more uniform, the Besov approach uses more DMD modes, and $N_{lvl}$ is chosen in a highly adaptive manner.

Refer to caption — Figure 1: For $\tilde{w}=.01$ and $\epsilon=.05$ and across $N_{e}=16$ different choices of noise, we plot the reconstruction error (a), the spectral error (b), the optimal choice of $T_{l}$ (c), and the optimal choice of multiscale levels $N_{lvl}$ (d). Note, the traditional DMD approach is given by (–) while the MDMD approach is given by (- -) in the plots.

By instead looking at prioritizing the spectral accuracy of the DMD decomposition by setting $\tilde{w}=2$ , we see for both levels of perturbations that the Besov based approach provides strong improvements over standard DMD. While for both levels of the perturbation the MDMD approach provides both better and more uniform levels of reconstruction error, again for the largest perturbation, the standard DMD approach completely breaks down while the multiscale approach is markedly more robust; see Figures 3 (a) and 4 (a). Again this comes at a modest loss of spectral accuracy; see Figures 3 (b) and 4 (b). Thus, the MDMD based approach does not compromise approximation fidelity to the extent that the traditional DMD method does through the enhanced adaptability and better capturing of the dynamics facilitated by the Besov norms which help capture the key nonlinear features of the flow.

4 Conclusion

As we show, by introducing well chosen norms with ready multiscale decompositions to stand in for various nonlinear effects in a given model, we are able to produce a more robust implementation of the Dynamic-Mode Decomposition. This is done with only a modest increase in computational time and complexity, while the benefits can be an order of magnitude better than otherwise obtainable using standard approaches. Likewise, the added degrees of freedom allow for a greater degree of optimization in choosing a decomposition scheme. How our approach would enhance other DMD approaches currently under development is a question of future research.

References

[1] I. Mezić. Spectral properties of dynamical systems, model reduction, and decompositions. Nonlinear Dyn., 41:309–325, 2005.
[2] P. Schmid. Dynamic mode decomposition of numerical and experimental data. J. Fluid Mech., 656:5–28, 2010.
[3] J.N. Kutz, S.L. Brunton, B.W. Brunton, and J.L. Proctor. Dynamic Mode Decomposition: Data-driven modeling of complex systems. SIAM, Philadelphia, PA, 2016.
[4] K. Taira, S.L. Brunton, S.T.M. Dawson, C.W. Rowley, T. Colonius, B.J. McKeon, O.T. Schmidt, S. Gordeyev, V. Theofilis, and L.S. Ukeiley. Modal analysis of fluid flows: An overview. AIAA, 55:4013–4041, 2017.
[5] S. Le Clainche and J. M. Vega. Analyzing nonlinear dynamics via data-driven dynamic mode decomposition-like methods. Complexity, 2018:1–21, 2018.
[6] B.O. Koopman. Hamiltonian systems and transformation in Hilbert space. Proc. Nat. Acad. Sci., 17:315–318, 1931.
[7] M.O. Williams, I. G. Kevrekidis, and C. W. Rowley. A data-driven approximation of the Koopman operator: extending dynamic mode decomposition. J. Nonlin. Sci., 25:1307–1346, 2015.
[8] J.N. Kutz, J.L. Proctor, and S.L. Brunton. Applied Koopman theory for partial differential equations and data-driven modeling of spatio-temporal systems. Complexity, 2018:1–16, 2018.
[9] M. Korda and I. Mezić. On convergence of extended dynamic mode decomposition to the Koopman operator. J. Nonlinear Sci., 28:687–710, 2018.
[10] M.O. Williams, C. W. Rowley, and I. G. Kevrekidis. A kernel-based method for data driven Koopman spectral analysis. J. Comp. Dyn., 2:247–265, 2015.
[11] A. M. Degenarro and N. M. Urban. Scalable extended dynamic-mode decomposition using random kernel approximation. SIAM J. Sci. Comp., 41:1482–1499, 2019.
[12] O. Azencot, W. Yin, and A. Bertozzi. Consistent dynamic mode decomposition. SIAM J. Appl. Dyn. Sys., to appear, 2019.
[13] T. Askham and J.N. Kutz. Variable projection methods for an optimized dynamic mode decomposition. SIAM J. Appl. Dyn. Sys., 2018:380–416, 2018.
[14] M. Frazier, B. Jawerth, and G. Weiss. Littlewood–Paley Theory and the Study of Function Spaces. AMS, Providence, RI, 1991.
[15] G. Folland. Real Analysis: Modern Techniques and Applications. Wiley Interscience, New York, NY, 1999.
[16] C.W. Curtis, R. Carretero-Gonzalez, and M. Polimeno. Characterizing coherent structures in Bose-Einstein condensates through dynamic-mode decomposition. Phys. Rev. E, 99:062215, 2019.