Linear model reduction using SPOD modes
Abstract
The majority of model reduction approaches use an efficient representation of the state and then derive equations to temporally evolve the coefficients that encode the state in that representation. In this paper, we instead employ an efficient representation of the entire trajectory of the state over some time interval and solve for the coefficients that define the trajectory on the interval. We use spectral proper orthogonal decomposition (SPOD) modes, in particular, which possess properties that make them suitable for model reduction and are known to provide an accurate representation of trajectories. In fact, with the same number of total coefficients, the SPOD representation is substantially more accurate than any representation formed by specifying the coefficients in a spatial (e.g., POD) basis at each of the many time steps that make up the interval. We develop a method to solve for the SPOD coefficients that encode the trajectories in forced linear dynamical systems given the forcing and initial condition, thereby obtaining this accurate representation of the trajectory. We apply the method to two examples, a linearized Ginzburg-Landau problem and an advection-diffusion problem. In both, the error of the proposed method is orders of magnitude lower than that of both POD-Galerkin and balanced truncation applied to the same problem, and lower even than the error of the most accurate representation of the solution within the span of the POD modes. The method is also fast, with CPU time comparable to or lower than both benchmarks in the examples we present.
1 Introduction
The expense of many modern computational models can prohibit their use in applications where speed is required. In a design optimization problem, for example, many simulations at different boundary conditions or parameters must be performed. In control applications, simulations may need to be conducted in real time to inform actuation. Model reduction techniques strive to deliver the orders-of-magnitude speedup necessary to enable adequately fast simulation for these and other problems with only a mild accuracy sacrifice.
The great majority of model reduction methods employ the following two-step strategy: they (i) find an accurate compression of the state of the system at a particular time and (ii) find equations that evolve the coefficients that encode the state in this representation. The POD-Galerkin method (Aubry88; Noack03; Rowley04), perhaps the most widely used starting point for model reduction, is representative of this approach. The proper orthogonal decomposition (POD) modes are an efficient means of representing the state in that with relatively few POD coefficients, the state can often be represented to high accuracy. In a POD-Galerkin reduced-order model (ROM), these coefficients are then evolved by projecting the governing equations into the space of POD modes, yielding a much smaller dynamical system to evolve. Many alternative choices have been explored for both steps. Examples of the compression step include using balanced truncation modes (Moore81) and autoencoders (Lee20; gonzalez18), and examples of deriving the equations in the reduced space include using Petrov-Galerkin projections (Carlberg10; Carlberg13; Otto22) and learning the equations from data (Peherstorfer16; Padovan24). All of these approaches to model reduction, however, fall within the two-step strategy outlined above.
We have investigated a different approach in this work: instead of representing the state (at a particular time) in a reduced manner, we instead employ a reduced representation for the entire trajectory, i.e., the state's evolution over some time interval. Whereas POD modes are the most efficient (linear) representation of the state, they are far from the most efficient representation of trajectories. This is intuitive: to represent a trajectory with POD modes, one has to specify the POD coefficients for each time step along the trajectory, but from one time step to the next, the POD coefficients are highly correlated. The analog of POD for entire trajectories is space-time POD (Lumley67; Schmidt_Schmid19; Frame23). Space-time POD modes are themselves time- and space-dependent, so to represent a trajectory, they are weighted by static coefficients. These modes are the most efficient linear representation of trajectories in the sense that, to represent a trajectory to some desired accuracy, fewer degrees of freedom are needed if the trajectory is represented with space-time POD modes than with any other linear encoding scheme. Unfortunately, space-time POD modes have a number of characteristics that make them undesirable for model reduction: computing them requires much training data, storing them is memory intensive, and computing space-time inner products, which would be necessary in a space-time ROM method, is expensive.
Fortunately, an efficient space-time basis that does not share the undesirable properties of space-time POD modes exists. Spectral POD (SPOD) modes are most naturally formulated as a POD in the frequency domain. More precisely, at every temporal frequency, there exists a set of spatial modes that optimally represent the spatial structure at that frequency. These modes are the SPOD modes, and they may be thought of as space-time modes where each spatial mode at frequency $\omega$ has the time dependence $e^{i\omega t}$. Each mode is associated with an energy, and these energies may be compared across frequencies; for example, the second mode at one frequency may have more energy than the first mode at another frequency. The fact that motivates this work is that the most energetic SPOD modes are also an excellent basis for representing trajectories. In fact, SPOD modes converge to space-time POD modes as the time interval becomes long, so for long intervals, the representation of a trajectory with SPOD modes is nearly as accurate (on average) as the space-time POD representation, which is optimal among all linear representations (Frame23).
With this motivation, the goal of this work is to develop an algorithm to solve quickly for the SPOD coefficients that represent a trajectory in forced linear dynamical systems given the initial condition and forcing. If these coefficients can be obtained accurately, then the resulting error will be substantially lower than that of POD-Galerkin with the same number of modes. The method works as follows. The SPOD coefficients at a given frequency are related to the (temporal) Fourier transform of the state at the same frequency, which in turn is related to the forcing and initial condition. We derive these relations analytically and obtain an equation for the SPOD coefficients as a linear operation on the forcing and initial condition. We precompute the linear operators involved, leaving only small matrix-vector multiplications to be done online.
We demonstrate the method on two problems: a linearized Ginzburg-Landau problem with a relatively small spatial dimension, and a substantially larger advection-diffusion problem. We show that, indeed, we can solve for the SPOD coefficients accurately, resulting in two-orders-of-magnitude lower error than even the projection of the solution onto the same number of POD modes, which is itself a lower bound for the error in any time-domain Petrov-Galerkin method, such as balanced truncation (BT) (Moore81). We show that this accuracy improvement does not come with an increase in CPU time; the method is competitive with POD-Galerkin and balanced truncation in CPU time, as is predicted by scalings we derive, with a slight advantage in the two examples we present.
Though the space-time approach is uncommon, we are not the first to attempt it (Lin19; Choi19; Parish21; Choi21; Choi21b; Yano14; Towne21). Previous methods have formed both spatial and temporal bases with simulation data. These methods have been used to solve large linear problems as well as small nonlinear ones. For example, Ref. (Choi21) solves a large linear Boltzmann transport problem, and Ref. (Choi21b) solves an advection-diffusion problem. In the nonlinear case, much of the effort has focused on solving simple nonlinear PDEs over relatively short time intervals (Choi19; Parish21). We believe that the representational advantage of SPOD modes relative to previous choices of space-time basis, as well as their analytic time dependence, makes them a more compelling choice for model reduction.
Our approach may also be related to harmonic balance (Hall02; Hall13). In this technique, the governing equations for a temporally periodic system are Fourier-transformed in time, resulting in a set of nonlinear equations to be solved for the transformed fields. This has been applied to the periodic flows arising in turbomachinery (Ekici07), resulting in significant computational speedup due to the relatively small number of relevant harmonics. Harmonic balance does not employ a spatial reduction, and our method may be viewed as a spatially reduced harmonic balance method for linear problems. Another important difference is that our method is applicable to non-periodic systems as well as periodic ones.
The remainder of this paper is organized as follows. In Section 2, we discuss the properties of POD, space-time POD, and SPOD that are relevant to the method. We present the main approach of the method in Section 3. However, there is a subtle issue with transforming to the frequency domain to be accounted for. We describe this issue in Section 4 and then account for it in the method in Section 5. We then demonstrate the method on a linearized Ginzburg-Landau problem and a scalar transport problem in Section 6, and conclude the paper in Section 7.
2 Space-only, space-time, and spectral POD
We review the space-only, space-time, and spectral forms of POD here. The most significant point for the purposes of this paper is the fact that spectral POD modes approach space-time POD modes as the time interval becomes long, and thus are very efficient in representing trajectories.
2.1 Space-only POD
Space-only POD aims to reconstruct snapshots of the state by adding together prominent modes weighted by expansion coefficients. The first mode $\phi_1$ is defined to maximize $\lambda(\phi)$, the expected value of the energy captured by it,

$\lambda(\phi) = \dfrac{E\big[\,|\langle q, \phi \rangle|^2\,\big]}{\langle \phi, \phi \rangle}, \qquad (2.1a)$

$\phi_1 = \operatorname{arg\,max}_{\phi}\, \lambda(\phi). \qquad (2.1b)$
The subsequent modes are defined to maximize the energy captured under the constraint that they are orthogonal to all previous ones,

$\phi_j = \operatorname{arg\,max}_{\phi \perp \phi_1, \ldots, \phi_{j-1}}\, \lambda(\phi). \qquad (2.2)$
The space-only inner product $\langle \cdot, \cdot \rangle$, which defines the energy captured, is defined as an integral over the spatial domain $\Omega$,

$\langle q_1, q_2 \rangle = \int_\Omega q_2^*(x)\, W(x)\, q_1(x)\, dx, \qquad (2.3)$
where $W(x)$ is a weight matrix used to account for inter-variable importance or possibly to preference certain regions of the domain. One can show (Towne2018spectral; Lumley67) that the solution to the optimization problem (2.1b, 2.2) is that the modes are eigenfunctions of the space-only correlation tensor,

$\int_\Omega C(x, x')\, W(x')\, \phi_j(x')\, dx' = \lambda_j \phi_j(x), \qquad (2.4)$

where the eigenvalue is equal to the energy of the mode, i.e., $\lambda_j = E\big[\,|\langle q, \phi_j \rangle|^2\,\big]$, and the correlation tensor is

$C(x, x') = E\big[\, q(x, t)\, q^*(x', t) \,\big]. \qquad (2.5)$
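For concreteness, the discrete computation can be sketched in a few lines of NumPy. The snippet below computes space-only POD modes and energies from a snapshot matrix via a weighted SVD; the function name, the Cholesky-based factorization of the weight, and the normalization by the number of snapshots are our choices for illustration, not notation from this paper.

```python
import numpy as np

def space_only_pod(Q, W):
    """Space-only POD via a weighted SVD.

    Q : (N, M) snapshot matrix, one state snapshot per column.
    W : (N, N) symmetric positive-definite weight matrix from (2.3).
    Returns modes Phi (W-orthonormal columns) and energies lam.
    """
    M = Q.shape[1]
    Wh = np.linalg.cholesky(W)          # W = Wh @ Wh.T, Wh lower-triangular
    U, s, _ = np.linalg.svd(Wh.T @ Q, full_matrices=False)
    Phi = np.linalg.solve(Wh.T, U)      # undo the weighting: Phi = Wh^{-T} U
    lam = s**2 / M                      # eigenvalues of the correlation (2.4)
    return Phi, lam
```

One can verify that the returned modes satisfy the discrete analog of (2.4) with the sample-mean correlation `Q @ Q.conj().T / M`.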
2.2 Space-time POD
Whereas space-only POD modes optimally represent snapshots, space-time POD modes optimally represent trajectories over the time window $[0, T]$. The formulation is much the same as in space-only POD; the modes optimize the expected energy (2.1b, 2.2), but in space-time POD, the inner product is over time as well as space,

$\langle q_1, q_2 \rangle_{x,t} = \int_0^T \int_\Omega q_2^*(x, t)\, W(x)\, q_1(x, t)\, dx\, dt. \qquad (2.6)$
The space-time POD modes that solve this optimization are eigenfunctions of the space-time correlation $C(x, x', t, t')$, and the eigenvalues represent the energy (importance) of each mode. The most important property of space-time POD modes for the purpose of this paper is that the space-time POD reconstruction of a trajectory achieves lower error, on average, than the reconstruction with the same number of modes in any other space-time basis. More concretely, using the first $r$ space-time POD modes $\psi_j$ to reconstruct the trajectory,

$q(x, t) \approx \sum_{j=1}^{r} \langle q, \psi_j \rangle_{x,t}\, \psi_j(x, t), \qquad (2.7)$

yields lower expected error

$E\Big[\, \big\| q - \sum_{j=1}^{r} \langle q, \psi_j \rangle_{x,t}\, \psi_j \big\|_{x,t}^2 \,\Big] \qquad (2.8)$

than would any other space-time basis. Above, the error is measured over space and time using the norm induced by the inner product (2.6).
2.3 Spectral POD
Spectral POD is most easily understood as the frequency domain variant of space-only POD for statistically stationary systems. In other words, SPOD modes at a particular frequency optimally reconstruct (in the same sense as above) the state at that frequency, on average. The property that makes them attractive for model reduction, however, is that SPOD modes are also the long-time limit of space-time POD modes for statistically stationary systems. These ideas are made precise below, but for a more complete discussion, see (Towne2018spectral).
Spectral POD modes at frequency $\omega$ maximize

$\lambda(\psi; \omega) = \dfrac{E\big[\,|\langle \hat{q}_\omega, \psi \rangle|^2\,\big]}{\langle \psi, \psi \rangle}, \qquad (2.9)$

again, subject to the constraint that each mode is orthogonal to the previous ones at that frequency. The Fourier-transformed state is defined as

$\hat{q}_\omega(x) = \int_{-\infty}^{\infty} q(x, t)\, e^{-i\omega t}\, dt. \qquad (2.10)$

The solution to the optimization (2.9) is that the modes are eigenfunctions of the cross-spectral density $S(x, x', \omega) = E\big[\hat{q}_\omega(x)\, \hat{q}^*_\omega(x')\big]$,

$\int_\Omega S(x, x', \omega)\, W(x')\, \psi_j(x', \omega)\, dx' = \lambda_j(\omega)\, \psi_j(x, \omega). \qquad (2.11)$

The eigenvalue is again equal to the energy of the mode, i.e., $\lambda_j(\omega) = E\big[\,|\langle \hat{q}_\omega, \psi_j \rangle|^2\,\big]$. Modes at frequency $\omega$ have an implicit time dependence of $e^{i\omega t}$.
SPOD modes and their energies become identical to space-time POD modes and energies as the time interval on which the latter are defined becomes long (Lumley67; Towne2018spectral; Frame23). Thus, the SPOD modes with the largest energies among all frequencies are the dominant space-time POD modes (for long times) and are most efficient for reconstructing long-time trajectories. We denote by $\lambda_j$ the $j$-th largest SPOD eigenvalue among all frequencies, which may be compared to the space-time POD eigenvalues. The convergence in the energy of the trajectory they capture is relatively fast, so for time intervals beyond a few correlation times, the SPOD modes capture nearly as much energy as the space-time modes (Frame23).
If the simulation time of a reduced-order model is long enough for this convergence to hold, the ability of the SPOD modes to capture structures is not diminished relative to that of space-time POD modes. SPOD modes also have two properties that make them more suitable for model reduction than space-time modes: they have analytic time dependence, and they are separable in space and time. The former makes some analytic progress possible in writing the equations that govern the modes and enables Fourier theory to be applied. The latter means that storing the modes requires $N_t$ times less memory, where $N_t$ is the number of time steps in the simulation.
Figure 1 shows the convergence in the representational ability of space-time POD and SPOD as the time interval becomes long. Specifically, to represent trajectories to some desired level of accuracy, one needs roughly the same number of SPOD coefficients as space-time POD coefficients if the interval is long compared to the correlation time of the system. This convergence in representational ability occurs because the SPOD modes themselves converge to space-time POD modes in the limit of a long time interval. With space-only POD, one must specify the coefficients at every time step, which leads to a far less efficient encoding of the data because the coefficients are highly correlated from one time step to the next. That SPOD modes are near-optimal in representing trajectories, and that they are substantially more efficient than space-only POD modes, motivate this work. If one can efficiently solve for some number of the SPOD coefficients of a trajectory, then these coefficients will lead to substantially lower error than solving for the same number of space-only POD coefficients.
2.4 Discretization of modes
Upon discretization, the continuous tensors become discrete, and the modes become $N$-component vectors in $\mathbb{C}^N$. Integration over space becomes matrix-vector multiplication. For example, the discrete SPOD modes at the $k$-th frequency are defined as the eigenvectors of the (weighted) cross-spectral density matrix,

$S_k W \Psi_k = \Psi_k \Lambda_k. \qquad (2.12)$

Here, $S_k \in \mathbb{C}^{N \times N}$ is the cross-spectral density, $\Psi_k$ is the matrix with the discrete SPOD modes as its columns, and $\Lambda_k$ is the diagonal matrix of eigenvalues, which are the SPOD mode energies. It is important to note that in practice, the matrix $S_k$ is not formed; the SPOD modes are calculated by the method of snapshots (Sirovich87; Towne2018spectral). In particular, given $N_{traj}$ trajectories from which to obtain SPOD modes, each $N_t$ time steps in length, the discrete Fourier transform (DFT) of each trajectory is taken. This yields $N_{traj}$ realizations of the $k$-th frequency, for every frequency $k$. These can be formed into a data matrix

$\hat{Q}_k = \big[\hat{q}_k^{(1)}, \hat{q}_k^{(2)}, \ldots, \hat{q}_k^{(N_{traj})}\big], \qquad (2.13)$

where $\hat{q}_k^{(n)}$ is the $k$-th frequency of the DFT of the $n$-th trajectory. The SPOD modes at frequency $k$ and the associated energies may then be obtained by first taking the singular value decomposition $W^{1/2} \hat{Q}_k / \sqrt{N_{traj}} = U_k \Sigma_k V_k^*$. The SPOD modes are then given by $\Psi_k = W^{-1/2} U_k$ and the energies by $\Lambda_k = \Sigma_k^2$ (Towne2018spectral).
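A minimal NumPy sketch of this method-of-snapshots procedure is given below. The array layout, the function name, and the normalization of the energies by $N_{traj}$ are our assumptions for illustration.

```python
import numpy as np

def spod_modes(trajs, W):
    """SPOD by the method of snapshots, following (2.12)-(2.13).

    trajs : (n_traj, N, n_t) array of training trajectories.
    W     : (N, N) symmetric positive-definite weight matrix.
    Returns lists Psi[k], lam[k] of modes and energies for each frequency k.
    """
    n_traj, N, n_t = trajs.shape
    Qhat = np.fft.fft(trajs, axis=2)        # DFT of each trajectory in time
    Wh = np.linalg.cholesky(W)              # W = Wh @ Wh.T
    Psi, lam = [], []
    for k in range(n_t):
        Qk = Qhat[:, :, k].T                # (N, n_traj) realizations of freq k
        U, s, _ = np.linalg.svd(Wh.T @ Qk, full_matrices=False)
        Psi.append(np.linalg.solve(Wh.T, U))  # W-orthonormal modes
        lam.append(s**2 / n_traj)             # descending energies
    return Psi, lam
```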
A finite number of evenly spaced frequencies is retained. The lowest nonzero one, $\Delta\omega$, induces a time $T = 2\pi/\Delta\omega$, which determines the interval on which the modes are periodic. The trajectories themselves are, of course, not periodic on this interval, so $T$ is the longest horizon over which we may use SPOD modes for prediction, though, if a longer prediction is needed, the method may be repeated. Using the SPOD modes, the trajectory may be written
$q(t) = \dfrac{1}{N_t} \sum_{k=0}^{N_t - 1} \Psi_k a_k\, e^{i\omega_k t}, \qquad (2.14)$

where $a_k$ is the vector of expansion coefficients at the $k$-th frequency. The factor of $1/N_t$ makes (2.14) an (interpolated) inverse discrete Fourier transform with $\Psi_k a_k$ as the $k$-th coefficient in the DFT of $q$. The fully reduced representation of the trajectory is

$q^r(t) = \dfrac{1}{N_t} \sum_{k=0}^{N_t - 1} \Psi_k^r a_k^r\, e^{i\omega_k t}. \qquad (2.15)$
In this paper, an $r$ superscript indicates a reduced quantity. The mean number of modes retained at each frequency is $\bar{r} = r/N_t$, where $r$ is the total number of modes retained. The same number is not retained at all frequencies because the most energetic space-time POD modes are the most energetic SPOD modes over all frequencies; the number of modes retained at the $k$-th frequency is the number of modes at this frequency that are among the $r$ most energetic overall,

$r_k = \big|\big\{\, j : \lambda_j(\omega_k) \text{ is among the } r \text{ largest SPOD energies over all frequencies} \,\big\}\big|. \qquad (2.16)$
With these notations established, $\Psi_k^r \in \mathbb{C}^{N \times r_k}$ and $a_k^r \in \mathbb{C}^{r_k}$. Finally, the Fourier transformed trajectory is defined using the DFT as

$\hat{q}_k = \sum_{j=0}^{N_t - 1} q(t_j)\, e^{-i\omega_k t_j}. \qquad (2.17)$
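Both the mode-selection rule (2.16) and the reconstruction (2.14) are simple to implement. The sketch below assumes the energies at each frequency are sorted in descending order (as returned by the SVD); the function names and data layouts are our own illustration.

```python
import numpy as np

def select_mode_counts(lam, r):
    """Retain the r most energetic SPOD modes across all frequencies (2.16).

    lam : list over frequencies of descending eigenvalue arrays.
    Returns r_k, the number of modes retained at each frequency.
    """
    pairs = [(lam[k][j], k) for k in range(len(lam)) for j in range(len(lam[k]))]
    pairs.sort(reverse=True)                 # most energetic first
    r_k = np.zeros(len(lam), dtype=int)
    for _, k in pairs[:r]:
        r_k[k] += 1
    return r_k

def reconstruct(Psi, a, t, omega, n_t):
    """Evaluate (2.14): q(t) = (1/n_t) * sum_k Psi_k a_k exp(i w_k t)."""
    q = np.zeros(Psi[0].shape[0], dtype=complex)
    for k in range(len(Psi)):
        # retained modes are the leading columns at each frequency
        q += Psi[k][:, :len(a[k])] @ a[k] * np.exp(1j * omega[k] * t)
    return q / n_t
```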
3 SPOD Petrov-Galerkin method
Our goal is to derive a SPOD-based method to solve the linear ordinary differential equation

$\dot{q}(t) = A q(t) + B f(t), \qquad (3.1a)$

$q(0) = q_0, \qquad (3.1b)$

on the interval $[0, T]$, where $q(t) \in \mathbb{R}^N$ is the state, $f(t) \in \mathbb{R}^m$ is some known forcing, $B$ maps the forcing to the state evolution, and $y(t)$ is some linear observable of the state given by $y(t) = C q(t)$. Given the forcing and initial condition $q_0$, our goal is to find the retained SPOD coefficients $a_k^r$ for the trajectory $q(t)$, thereby obtaining the near-optimal rank-$r$ space-time representation of the trajectory. With these coefficients, $q$ (and hence the observable $y$) can easily be obtained by taking the inverse DFT of $\Psi_k^r a_k^r$.
3.1 Method
The starting point is the following equation for $a_k^r$ in terms of $\hat{q}_k$:

$a_k^r = \Psi_k^{r*} W \hat{q}_k. \qquad (3.2)$

This equation is exact and follows from the fact that the SPOD modes are orthonormal in the $W$-weighted inner product, $\Psi_k^* W \Psi_k = I$. The Fourier transformed state must be obtained from the known forcing and initial condition, and there is some subtlety here. For the sake of clarity in describing the model reduction approach, we ignore the subtlety in this section. The correct relation between the Fourier-transformed forcing and initial condition and the Fourier-transformed state is derived in Section 4 and is given in (4.14), and the model reduction method is adjusted for this correction in Section 5.
To write the equations in the frequency domain, we take the Fourier transform, replacing $q$ and $f$ with $\hat{q}_k$ and $\hat{f}_k$ and (naively, it turns out) replacing $\mathrm{d}/\mathrm{d}t$ with $i\omega_k$, yielding

$i\omega_k \hat{q}_k = A \hat{q}_k + B \hat{f}_k. \qquad (3.3)$

Then, assuming the stability of $A$, the state is related to the forcing by

$\hat{q}_k = R_k B \hat{f}_k, \qquad (3.4)$

where the resolvent operator is defined as

$R_k = (i\omega_k I - A)^{-1}. \qquad (3.5)$

Plugging (3.4) into (3.2), we have

$a_k^r = \Psi_k^{r*} W R_k B \hat{f}_k. \qquad (3.6)$
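In a small problem where the resolvent can be applied directly, the naive solve (3.2)-(3.6) amounts to one linear solve and one projection per frequency, as in the following NumPy sketch (function and variable names are ours).

```python
import numpy as np

def spod_coeffs_naive(A, B, Psi_r, W, fhat, omega):
    """Naive frequency-domain solve (3.6): a_k = Psi_k^{r*} W R_k B fhat_k.

    A, B, W : system, input, and weight matrices.
    Psi_r   : list of retained SPOD mode matrices, one per frequency.
    fhat    : list of DFT coefficients of the forcing, one per frequency.
    omega   : array of angular frequencies omega_k.
    """
    N = A.shape[0]
    a = []
    for k in range(len(omega)):
        # apply the resolvent (3.5) via a linear solve
        qhat_k = np.linalg.solve(1j * omega[k] * np.eye(N) - A, B @ fhat[k])
        a.append(Psi_r[k].conj().T @ W @ qhat_k)   # projection (3.2)
    return a
```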
This method may be derived as a Petrov-Galerkin method in the following way. Starting from the equations in the frequency domain, $L_k \hat{q}_k = B \hat{f}_k$, where $L_k = i\omega_k I - A$, and plugging in the representation $\hat{q}_k = \Psi_k^r a_k^r$ for the state, we have

$L_k \Psi_k^r a_k^r = B \hat{f}_k. \qquad (3.7)$

By left-multiplying the equations by $\Psi_k^{r*} W$ and inverting the resulting $r_k \times r_k$ matrices, the method would be Galerkin (Lin19; Towne21). Instead, (3.6) may be recovered by multiplying the equations on the left by the test basis $\Phi_k^{r*} = \Psi_k^{r*} W R_k$, yielding

$\Psi_k^{r*} W R_k L_k \Psi_k^r a_k^r = \Psi_k^{r*} W R_k B \hat{f}_k. \qquad (3.8)$

The operator $\Psi_k^{r*} W R_k L_k \Psi_k^r = \Psi_k^{r*} W \Psi_k^r$ is the identity, so (3.6) is recovered and the method is Petrov-Galerkin.
We denote the matrix that maps the forcing at the $k$-th frequency to the SPOD mode coefficients at that frequency as $M_k = \Psi_k^{r*} W R_k B$. For small problems (i.e., where $O(N^3)$ operations are not prohibitive), the resolvent, and thus $M_k$, may be computed directly. More care is needed in larger problems, and, indeed, significant research has been devoted to approximating the resolvent operator in fluid mechanics (Farghadan23; Martini21), where it is often used in the context of stability theory (Trefethen93; Mckeon10; McKeon17; Schmidt18). These techniques may be used here to approximate $M_k$ using some number of the dominant resolvent modes. Implementing these methods, however, may require substantial effort, and an alternative is possible due to the availability of data (the same data used to calculate the SPOD modes).
3.2 Approximating $M_k$
An accurate and simple-to-implement approximation of $M_k$ can be obtained by leveraging the availability of data as follows. Defining

$\hat{F}_k = \big[\hat{f}_k^{(1)}, \hat{f}_k^{(2)}, \ldots, \hat{f}_k^{(N_{traj})}\big], \qquad (3.9)$

where, again, $\hat{f}_k^{(n)}$ is the $n$-th realization of $\hat{f}_k$ in the training data, we have

$\hat{q}_k^{(n)} = R_k B \hat{f}_k^{(n)}. \qquad (3.10)$

Using the many realizations of the training data (the same ones used to generate the SPOD modes), we have

$\hat{Q}_k = R_k B \hat{F}_k, \qquad (3.11)$

where $\hat{Q}_k$ is the state data matrix (2.13) and $\hat{F}_k$ is its forcing counterpart (3.9). Multiplying this equation by its Hermitian transpose gives

$\hat{Q}_k \hat{Q}_k^* = R_k B \hat{F}_k \hat{F}_k^* B^* R_k^*. \qquad (3.12)$
Both $\hat{Q}_k \hat{Q}_k^*$ and $\hat{F}_k \hat{F}_k^*$ may be represented by their eigendecompositions, i.e., their POD decompositions, giving

$\Psi_k^d \Lambda_k^d \Psi_k^{d*} = R_k B\, \Xi_k \Gamma_k \Xi_k^* B^* R_k^*, \qquad (3.13)$

where $\Xi_k$ and $\Gamma_k$ come from the POD of $\hat{F}_k$ (the $d$ superscript is a reminder that $\Psi_k^d$ is not full-rank; the data provide at most $N_{traj}$ modes). By left-multiplying (3.11) by $\Psi_k^{r*} W$ and right-multiplying by the pseudoinverse $\hat{F}_k^+$, which is computable from the POD factors in (3.13), we have

$\Psi_k^{r*} W R_k B P_k = \Psi_k^{r*} W \hat{Q}_k \hat{F}_k^+, \qquad (3.14)$

where $P_k = \hat{F}_k \hat{F}_k^+ = \Xi_k \Xi_k^*$ is the projection onto the modes of $\hat{F}_k$. The right-hand side of (3.14) is an approximation of $M_k$, which we label

$\tilde{M}_k = \Psi_k^{r*} W \hat{Q}_k \hat{F}_k^+. \qquad (3.15)$
If $P_k$ were the identity, the approximation would be exact. The operator $\tilde{M}_k$ is used to operate on forcing vectors, so the approximation is accurate to the extent that the forcing is within the span of the realizations in $\hat{F}_k$. The forcing encountered online is likely to be near this span because it is closely related to the training forcings, as will become clear in Section 4.2.
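A sketch of (3.15) in NumPy follows. Here `np.linalg.pinv` computes the pseudoinverse from the SVD of the forcing data, which is equivalent to using its POD factors as above; the function name and tolerance are our choices.

```python
import numpy as np

def approx_Mk(Psi_r_k, W, Qhat_k, Fhat_k, tol=1e-10):
    """Data-driven approximation (3.15): M_k ~ Psi_k^{r*} W Qhat_k Fhat_k^+.

    Qhat_k, Fhat_k : (N, n_traj) and (m, n_traj) matrices whose columns are
    the training realizations of qhat_k and fhat_k at frequency k.
    """
    Fpinv = np.linalg.pinv(Fhat_k, rcond=tol)   # pseudoinverse via the SVD/POD
    return Psi_r_k.conj().T @ W @ Qhat_k @ Fpinv
```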
4 Full-order frequency domain equations
4.1 Problem with the naive equations
The $k$-th Fourier coefficient of the time derivative $\dot{q}$ is not $i\omega_k \hat{q}_k$. It is shown in the appendix that in fact,

$\hat{\dot{q}}_k = i\omega_k \hat{q}_k + \dfrac{\Delta q}{\Delta t}, \qquad (4.1)$

where $\Delta q = q(T) - q(0)$ is the change of the state over the time interval (Martini19; Nogueira20). If the state were periodic, this term would be zero, and the usual equation would hold, but the solution to the linear ODE has no reason to be periodic. How, then, does one solve in the frequency domain? Modifying (3.4) with this term to give

$\hat{q}_k = R_k \Big( B \hat{f}_k - \dfrac{\Delta q}{\Delta t} \Big) \qquad (4.2)$

is not useful, as $\Delta q$ is not known a priori (and, in some sense, is what we are solving for)!
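The failure of the naive rule is easy to reproduce numerically. The following self-contained snippet (our construction, scalar for simplicity) spectrally differentiates a decaying exponential on $[0, T)$ and shows a large discrepancy from the true derivative, which is precisely the boundary-term effect in (4.1):

```python
import numpy as np

# Naive spectral differentiation of a NON-periodic signal: the inverse DFT
# of (i w_k qhat_k) differs sharply from dq/dt, especially near t=0 and t=T.
n_t, T = 256, 2.0
t = (T / n_t) * np.arange(n_t)
q = np.exp(-t)                                  # not periodic on [0, T)
i_omega = 2j * np.pi * np.fft.fftfreq(n_t, d=T / n_t)
dq_spec = np.fft.ifft(i_omega * np.fft.fft(q)).real
print(np.max(np.abs(dq_spec - (-q))))           # large: the naive rule fails
```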
4.2 Solution: derivation of correction formula
To circumvent the problem described above, we use the analytic solution in time of (3.1). We then take the discrete Fourier transform (DFT) of this solution analytically to obtain $\hat{q}_k$ in terms of $\hat{f}_k$ and $q_0$. We use the DFT rather than the Fourier series because, from the exact DFT, one can easily obtain the exact solution at a set of points in the time domain by taking the inverse DFT. On the other hand, with the first $N_t$ Fourier series coefficients, point values cannot be recovered exactly.
The analytic solution of (3.1) is (Grigoriu21)

$q(t) = e^{At} q_0 + \int_0^t e^{A(t - t')} B f(t')\, dt'. \qquad (4.3)$

We want the (analytic) DFT of (4.3), where the $k$-th component of the DFT of $q$ is defined as

$\hat{q}_k = \sum_{j=0}^{N_t - 1} q(t_j)\, e^{-i\omega_k t_j}, \qquad (4.4)$

where $t_j = j\Delta t$ and $\omega_k = 2\pi k / T$. Plugging (4.3) into (4.4), we have

$\hat{q}_k = \sum_{j=0}^{N_t - 1} e^{A t_j} e^{-i\omega_k t_j} q_0 + \sum_{j=0}^{N_t - 1} e^{-i\omega_k t_j} \int_0^{t_j} e^{A(t_j - t')} B f(t')\, dt'. \qquad (4.5)$
For later convenience, we refer to the initial condition and forcing terms in (4.5) as $\hat{q}_k^{\,\mathrm{ic}}$ and $\hat{q}_k^{\,\mathrm{f}}$, respectively.
First, we evaluate $\hat{q}_k^{\,\mathrm{ic}}$. It may be rewritten as

$\hat{q}_k^{\,\mathrm{ic}} = \sum_{j=0}^{N_t - 1} G_k^j\, q_0, \qquad (4.6)$

where we have defined $G_k = e^{A\Delta t} e^{-i\omega_k \Delta t}$ and substituted $t_j = j\Delta t$. The key to evaluating (4.6) is to notice that it is a matrix geometric sum, i.e., each term is the previous one multiplied by $G_k$. The solution, entirely analogous to the scalar geometric sum, is

$\hat{q}_k^{\,\mathrm{ic}} = (I - G_k)^{-1} \big(I - G_k^{N_t}\big)\, q_0. \qquad (4.7)$

Because $e^{-i\omega_k N_t \Delta t} = e^{-i\omega_k T} = 1$, this simplifies to

$\hat{q}_k^{\,\mathrm{ic}} = \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big)\, q_0. \qquad (4.8)$
We view the first factor as a time-discrete resolvent operator; expanding the matrix exponential to first order in $\Delta t$, one recovers a multiple of the resolvent, $(I - e^{A\Delta t} e^{-i\omega_k \Delta t})^{-1} \approx R_k / \Delta t$. Note, however, that truncating the expansion to first order is always invalid for the higher frequencies because, for these frequencies, $\omega_k \Delta t$ is order unity.
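The geometric-sum identity (4.7)-(4.8) can be verified directly. The snippet below (our construction) checks it to machine precision for a random stable $A$, using the fact that $G_k^{N_t} = e^{AT}$:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
N, n_t, T = 5, 32, 4.0
dt = T / n_t
A = rng.standard_normal((N, N)) - 3.0 * np.eye(N)   # shifted to be stable
k = 3                                                # any frequency index
G = expm(A * dt) * np.exp(-2j * np.pi * k / n_t)     # G_k = e^{A dt} e^{-i w_k dt}
lhs = sum(np.linalg.matrix_power(G, j) for j in range(n_t))   # the sum (4.6)
rhs = np.linalg.solve(np.eye(N) - G, np.eye(N) - expm(A * T)) # (4.7)-(4.8)
print(np.abs(lhs - rhs).max())                       # ~ machine precision
```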
To evaluate $\hat{q}_k^{\,\mathrm{f}}$, we assume that $f$ can be written as

$f(t) = \dfrac{1}{N_t} \sum_{l=0}^{N_t - 1} \hat{f}_l\, e^{i\omega_l t}. \qquad (4.9)$

Though this is likely not exactly true in practice, it introduces minimal error (as we show in our numerical examples), and is necessary for evaluating the integral in (4.5). After pulling the constant-in-$t'$ matrix exponential out of the integral and inserting the Fourier interpolation (4.9) of $f$, the integral in (4.5) is

$\int_0^{t_j} e^{A(t_j - t')} B f(t')\, dt' = e^{A t_j}\, \dfrac{1}{N_t} \sum_{l=0}^{N_t - 1} \int_0^{t_j} e^{(i\omega_l I - A) t'}\, dt'\, B \hat{f}_l. \qquad (4.10)$
Integrating (and hence inverting the matrix exponent within the integral) brings out a resolvent, and the integral evaluates to

$\int_0^{t_j} e^{A(t_j - t')} B f(t')\, dt' = \dfrac{1}{N_t} \sum_{l=0}^{N_t - 1} R_l \big( e^{i\omega_l t_j} I - e^{A t_j} \big) B \hat{f}_l, \qquad (4.11)$

where, again, $R_l$ is the resolvent operator at the $l$-th frequency. Now, using this result, the forcing term in (4.5) is

$\hat{q}_k^{\,\mathrm{f}} = \dfrac{1}{N_t} \sum_{j=0}^{N_t - 1} \sum_{l=0}^{N_t - 1} \big( e^{i(\omega_l - \omega_k) t_j} I - e^{A t_j} e^{-i\omega_k t_j} \big) R_l B \hat{f}_l. \qquad (4.12)$

The frequency difference term evaluates to zero for $l \neq k$ (and to $N_t$ for $l = k$), and the other term may be evaluated by the same geometric sum argument as before. The entire forcing term in (4.5) becomes

$\hat{q}_k^{\,\mathrm{f}} = R_k B \hat{f}_k - \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big)\, \dfrac{1}{N_t} \sum_{l=0}^{N_t - 1} R_l B \hat{f}_l. \qquad (4.13)$
Adding $\hat{q}_k^{\,\mathrm{ic}}$ and $\hat{q}_k^{\,\mathrm{f}}$, we obtain the following equation for the $k$-th component of the DFT of the state:

$\hat{q}_k = R_k B \hat{f}_k + \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big) \Big( q_0 - \dfrac{1}{N_t} \sum_{l=0}^{N_t - 1} R_l B \hat{f}_l \Big). \qquad (4.14)$

The first term is the naive one. For a stable system, it may be shown that the correction term is inversely proportional to the length of the time interval. Therefore, while the error from ignoring it does decrease as the time interval increases in this case, it is not negligible unless the interval is extremely long. Intuitively, the correction term accounts for the transient in the solution, which decays exponentially at the rate prescribed by the slowest decaying eigenvector of $A$. The decay time is (of course) independent of the time interval on which we take the transform, so the fraction of the time interval occupied by this decay is inversely proportional to $T$. The term $\frac{1}{N_t} \sum_l R_l B \hat{f}_l$ is the initial condition that is 'in sync' with the forcing. If $q_0$ were equal to this, then there would be no transient, $q$ would be exactly periodic on the interval, and the naive equation (3.4) would be correct. For time intervals long relative to the decay time of the slowest decaying eigenvector of $A$, the $e^{AT}$ term may be ignored. We also note that summing the operator multiplying the initial condition and forcing sum over all frequencies can be shown to give

$\sum_{k=0}^{N_t - 1} \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big) = N_t I. \qquad (4.15)$
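As a check on (4.14), the following script (our construction) builds a forcing that is exactly of the band-limited form (4.9), computes the exact trajectory analytically, and confirms that the DFT of the sampled trajectory matches (4.14) to machine precision:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
N, n_t, T = 6, 32, 4.0
dt = T / n_t
t = dt * np.arange(n_t)
omega = 2 * np.pi * np.fft.fftfreq(n_t, d=dt)
A = rng.standard_normal((N, N)) - 3.0 * np.eye(N)     # stable system
B = rng.standard_normal((N, N))
q0 = rng.standard_normal(N)
fhat = rng.standard_normal((N, n_t)) + 1j * rng.standard_normal((N, n_t))

# Exact solution q(t) = e^{At}(q0 - q_p(0)) + q_p(t), where the particular
# (periodic) solution is q_p(t) = (1/n_t) sum_l R_l B fhat_l e^{i w_l t}.
R = [np.linalg.inv(1j * omega[l] * np.eye(N) - A) for l in range(n_t)]
RBf = np.stack([R[l] @ B @ fhat[:, l] for l in range(n_t)], axis=1)
qp = RBf @ np.exp(1j * np.outer(omega, t)) / n_t       # q_p at sample times
q = np.stack([expm(A * tj) @ (q0 - qp[:, 0]) for tj in t], axis=1) + qp
qhat = np.fft.fft(q, axis=1)                           # DFT per (4.4)

# Right-hand side of (4.14) at every frequency.
eAT = expm(A * T)
for k in range(n_t):
    D = np.linalg.inv(np.eye(N) - expm(A * dt) * np.exp(-1j * omega[k] * dt))
    rhs = RBf[:, k] + D @ (np.eye(N) - eAT) @ (q0 - qp[:, 0])
    assert np.allclose(qhat[:, k], rhs)
print("corrected equation (4.14) verified")
```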
Figure 2 shows the inadequacy of the uncorrected method for capturing trajectories in initial value problems. The results shown are for the Ginzburg-Landau system described in Section 6, with a stable (a) and unstable (b) value of the stability parameter. While the corrected frequency domain equations (4.14) are correct regardless of the stability of the system, we only recommend the method in the stable case; eigensystem methods are likely superior in the unstable case. The remaining error in the corrected method is due to the fact that the forcing is not periodic and is not composed solely of retained frequencies, meaning that its Fourier series representation (4.9) used above is approximate. However, this error is indeed small in both the stable and unstable cases.
5 Corrected method
The Petrov-Galerkin method is the full-order frequency domain equations (4.14) left-multiplied by $\Psi_k^{r*} W$,

$a_k^r = \Psi_k^{r*} W \Big[ R_k B \hat{f}_k + \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big) \Big( q_0 - \dfrac{1}{N_t} \sum_{l=0}^{N_t - 1} R_l B \hat{f}_l \Big) \Big]. \qquad (5.1)$
As in the uncorrected case, for small systems, the inverses and matrix exponentials above may be computed directly, but this is not possible for larger systems, where $O(N^3)$ operations are infeasible. The first term is the same as in the uncorrected case, so it may be approximated with $\tilde{M}_k$, as before. We approximate the latter terms at each frequency by using the basis of SPOD modes at that frequency, as described below.
5.1 Operator approximations
We define the number of SPOD modes that are to be used in approximating the operators as $d$, and we assume this is the same for each frequency, though this is not necessary. At a maximum, this number is the number of SPOD modes available in the data, but it can be fewer if desired. To approximate the operators accurately, it will likely need to be larger than the number of modes retained to represent the state, $d > r_k$ for all $k$. To accomplish the approximations, we use the available SPOD modes as follows. To approximate the inverse and matrix exponential terms, we first multiply by the identity in various places,

$\Psi_k^{r*} W \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big) = \Psi_k^{r*} W \big[\Psi_k \Psi_k^* W\big] \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big[\Psi_k \Psi_k^* W\big] \big(I - e^{AT}\big) \big[\Psi_k \Psi_k^* W\big]. \qquad (5.2)$

Note that, at this point, $\Psi_k$ is full rank, so $\Psi_k \Psi_k^* W$ is the identity. It is straightforward to show that (5.2) may be rewritten with the SPOD bases inside the inverse and matrix exponentials as

$\Psi_k^{r*} W \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big) = \Psi_k^{r*} W \Psi_k \big(I - e^{\Psi_k^* W A \Psi_k \Delta t}\, e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{\Psi_k^* W A \Psi_k T}\big) \Psi_k^* W. \qquad (5.3)$

Finally, truncating the operators in (5.3), i.e., replacing $\Psi_k$ with $\Psi_k^d$, and denoting $\tilde{A}_k = \Psi_k^{d*} W A \Psi_k^d$, the approximated term is

$\Psi_k^{r*} W \big(I - e^{A\Delta t} e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{AT}\big) \approx E_{r_k} \big(I - e^{\tilde{A}_k \Delta t}\, e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{\tilde{A}_k T}\big) \Psi_k^{d*} W. \qquad (5.4)$

Here, $E_{r_k} = \Psi_k^{r*} W \Psi_k^d$ selects the first $r_k$ rows of the matrix it multiplies, and $\Psi_k^d$ is the first $d$ columns of $\Psi_k$. As desired, all matrix exponentials and inverses are of size $d \times d$, which makes them tractable.
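A sketch of (5.4) in NumPy/SciPy follows; the function name and argument layout are ours. Note that only $d \times d$ exponentials and solves appear:

```python
import numpy as np
from scipy.linalg import expm

def reduced_correction_operator(A, W, Psi_d, omega_k, dt, T, r_k):
    """Approximate Psi_k^{r*} W (I - e^{A dt} e^{-i w_k dt})^{-1} (I - e^{AT})
    per (5.4), using only d x d exponentials and inverses.

    Psi_d : (N, d) leading SPOD modes at frequency k, with d >= r_k.
    """
    d = Psi_d.shape[1]
    A_t = Psi_d.conj().T @ W @ A @ Psi_d          # projected operator A~_k
    core = np.linalg.solve(
        np.eye(d) - expm(A_t * dt) * np.exp(-1j * omega_k * dt),
        np.eye(d) - expm(A_t * T))
    return (core @ (Psi_d.conj().T @ W))[:r_k]    # E_r keeps first r_k rows
```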
The forcing sum in (5.1), which is difficult to compute directly because it involves resolvents, must also be approximated. In the unreduced equations, this term is the same at each frequency, so the sum over frequencies only needs to be computed once. Any approximation of this term should likewise be the same for each frequency to avoid quadratic scaling in $N_t$. The natural choice for approximating the forcing sum is to approximate each term by

$R_l B \hat{f}_l \approx \Psi_l^d \Psi_l^{d*} W R_l B \hat{f}_l. \qquad (5.5)$

This approximation is accurate because the SPOD modes at the $l$-th frequency are the best basis for $\hat{q}_l$ and are thus a very good basis for $R_l B \hat{f}_l$, which is $\hat{q}_l$ without the correction. The operator $\Psi_l^{d*} W R_l B$ may again be approximated with $\tilde{M}_l^d = \Psi_l^{d*} W \hat{Q}_l \hat{F}_l^+$, the $d$-mode analog of (3.15).
The equations can, at this point, be written as

$a_k^r = \tilde{M}_k \hat{f}_k + D_k \Psi_k^{d*} W \Big( q_0 - \dfrac{1}{N_t} \sum_{l=0}^{N_t - 1} \Psi_l^d \tilde{M}_l^d \hat{f}_l \Big), \qquad (5.6)$

where $D_k = E_{r_k} \big(I - e^{\tilde{A}_k \Delta t}\, e^{-i\omega_k \Delta t}\big)^{-1} \big(I - e^{\tilde{A}_k T}\big)$ collects the operators from (5.4). The operators have been approximated, so far, to avoid large-matrix operations in the offline phase of the algorithm. To avoid cost scaling with $N$ in the online phase, one final approximation must be made. The term in parentheses in (5.6) is multiplied on the left by $\Psi_k^{d*} W$ for every frequency $k$, leading to online cost that scales with the full dimension $N$ at every frequency. This can be avoided by storing the term in parentheses in (5.6) in a rank-$p$ reduced basis $\Theta \in \mathbb{C}^{N \times p}$ and precomputing the product of this basis with each $\Psi_k^{d*} W$. This basis should represent the initial condition and forcing sum terms accurately, and in practice, we choose POD modes of the state. With this approximation, the corrected SPOD Petrov-Galerkin method becomes

$a_k^r = \tilde{M}_k \hat{f}_k + D_k T_k z, \qquad (5.7)$

where $T_k = \Psi_k^{d*} W \Theta$ is precomputed and $z = \Theta^* W \big( q_0 - \frac{1}{N_t} \sum_l \Psi_l^d \tilde{M}_l^d \hat{f}_l \big)$ is the coefficient vector of the parenthetical term in the reduced basis; the products $\Theta^* W \Psi_l^d$ are likewise precomputed so that $z$ is assembled from small matrix-vector products.
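The online phase then consists solely of small matrix-vector products. The sketch below assumes all operators have been precomputed offline as described; every name and shape here is our own illustration rather than notation from the paper:

```python
import numpy as np

def online_solve(M, D, Tk, theta_Wq0, theta_W_PsiM, fhat):
    """Online phase of the corrected method (5.7); all products are small.

    M            : list of precomputed maps M~_k from (3.15).
    D, Tk        : lists of the precomputed operators D_k and T_k in (5.7).
    theta_Wq0    : Theta^* W q_0, the initial condition in the reduced basis.
    theta_W_PsiM : list of precomputed products Theta^* W Psi_l^d M~_l^d.
    fhat         : list of DFT coefficients of the forcing.
    Returns the retained SPOD coefficients a_k^r for each frequency k.
    """
    n_t = len(M)
    # coefficients of the parenthetical term in (5.6) in the basis Theta
    z = theta_Wq0 - sum(theta_W_PsiM[l] @ fhat[l] for l in range(n_t)) / n_t
    return [M[k] @ fhat[k] + D[k] @ (Tk[k] @ z) for k in range(n_t)]
```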