
Non-Markovian Reduced Models to Unravel Transitions in Non-equilibrium Systems

Mickaël D. Chekroun [email protected] Department of Earth and Planetary Sciences, Weizmann Institute, Rehovot 76100, Israel Department of Atmospheric and Oceanic Sciences, University of California, Los Angeles, CA 90095-1565, USA    Honghu Liu Department of Mathematics, Virginia Tech, Blacksburg, VA 24061, USA    James C. McWilliams Department of Atmospheric and Oceanic Sciences and Institute of Geophysics and Planetary Physics, University of California, Los Angeles, CA 90095-1565, USA
Abstract

This work proposes a general framework for analyzing noise-driven transitions in spatially extended non-equilibrium systems and explains the emergence of coherent patterns beyond the instability onset. The framework relies on stochastic parameterization formulas to reduce the complexity of the original equations while preserving the essential dynamical effects of unresolved scales. The approach is flexible and operates for both Gaussian noise and non-Gaussian noise with jumps.

Our stochastic parameterization formulas offer two key advantages. First, they can approximate stochastic invariant manifolds when these manifolds exist. Second, even when such manifolds break down, our formulas can be adapted through a simple optimization of their constitutive parameters. This allows us to handle scenarios with weak time-scale separation, in which the system has undergone multiple transitions, resulting in large-amplitude solutions not captured by invariant manifolds or other time-scale separation methods.

The optimized stochastic parameterizations then capture how small-scale noise impacts larger scales through the system’s nonlinear interactions. This effect is achieved by the very fabric of our parameterizations, which incorporate non-Markovian (memory-dependent) coefficients into the reduced equation. These coefficients account for the noise’s past influence, not just its current value, using a finite memory length that is selected for optimal performance. The specific “memory” function, which determines how this past influence is weighted, depends on both the strength of the noise and how it interacts with the system’s nonlinearities.

Remarkably, training our theory-guided reduced models on a single noise path effectively learns the optimal memory length for out-of-sample predictions. Indeed, this approach retains good accuracy in predicting noise-induced transitions, including rare events, when tested against a large ensemble of different noise paths. This success stems from our “hybrid” approach, which combines analytical understanding with data-driven learning. This combination avoids a key limitation of purely data-driven methods: their struggle to generalize to unseen scenarios, also known as the “extrapolation problem.”

Spatially Extended Non-equilibrium Systems || Transition Paths || Optimization || Closure
pacs:
05.45.-a, 89.75.-k

I Introduction

Non-equilibrium systems are irreversible systems characterized by a continuous flow of energy that is driven by external forces or internal fluctuations. There are many different types of non-equilibrium systems, and they can be found in a wide variety of fields, including physics, chemistry, biology, and engineering. Non-equilibrium systems can exhibit complex behavior, including self-organization, pattern formation, and chaos. These complex behaviors arise from the interplay of nonlinear dynamics and statistical fluctuations. The emergence of coherent, complex patterns is ubiquitous in many spatially extended non-equilibrium systems; see e.g. [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. Mechanisms to explain the emergence of these patterns include the development of instability saturated by nonlinear effects whose calculations can be conducted typically near the onset of linear instability; see e.g. [11, 12]. Historically, hydrodynamic systems have been commonly used as prototypes to study instabilities out of equilibrium in spatially extended systems, from both an experimental and a theoretical point of view [13, 3].

To describe how physical instability develops in such systems with infinitely many degrees of freedom, we often focus on the temporal evolution of the amplitudes of specific normal modes. The latter typically correspond to those that are mildly unstable or only slightly damped in linear theory. When the number of these nearly marginal modes is finite, their amplitudes are governed by ordinary differential equations (ODEs) in which the growth rates of the linear theory have been renormalized by nonlinear terms [1, 11, 3, 12]. Intuitively, the reason for this reduction is a simple separation of time scales. Modes that have just crossed the imaginary axis have a small real part and evolve slowly on long time scales, while all the other, fast modes rapidly adapt themselves to these slow modes.

However, for non-equilibrium systems away from the onset of linear instability, such as when the Reynolds number for a fluid flow increases far beyond a laminar regime, the emergence of coherent patterns does not fit within this instability/nonlinear saturation theory, as the reduction principle of the fast modes onto the slow ones breaks down [11], calling for new reduction techniques to explain emergence. Furthermore, in the presence of random fluctuations, reduced equations in the form of deterministic ODEs are inherently incapable of capturing phenomena like noise-induced transitions. Noise can have unexpected outcomes such as altering the sequence of transitions, possibly suppressing instabilities [14] or the onset of turbulence [15] or, on the contrary, exciting transitions [16, 7]. Examples in which noise-induced transitions have an important role include the generation of convective rolls [17, 18], electroconvection in nematic liquid crystals [16, 19], certain oceanic flows [20, 21], and climate phenomena [22, 23, 24, 25].

This work aims to gain deeper insights into the efficient derivation of reduced models for such non-equilibrium systems subject to random fluctuations [3, 7]. In that respect, we seek reduced models able to capture the emergence of noise-driven spatiotemporal patterns and predict their transitions triggered by the subtle coupling between noise and the nonlinear dynamics. Our overall goal is to develop a unified framework for identifying the key variables and their interactions (parameterization) in complex systems undergoing noise-driven transitions. We specifically focus on systems that have experienced multiple branching points (bifurcations), leading to high-multiplicity regimes with numerous co-existing metastable states whose amplitude is relatively large (order one or more), resulting from a combination of strong noise, inherent nonlinear dynamics, or both. The purpose is thus to address limitations of existing approaches like amplitude equations and center manifold techniques in dealing with such regimes.

To address these limitations, the framework of optimal parameterizing manifolds (OPMs) introduced in [26, 27, 28] for forced-dissipative systems offers a powerful solution. This framework allows for the efficient derivation of parameterizations away from the onset of instability, achieved through continuous deformations of near-onset parameterizations that are optimized by data-driven minimization. Typically, the loss function involves a natural discrepancy metric measuring the defect of parameterization of the unresolved variables by the resolved ones. The parameterizations that are optimized are derived from the governing equations by means of backward-forward systems providing useful approximations of the unstable growth saturated by the nonlinear terms, even when the spectral gap between the stable and unstable modes is small [27, 29, 28]. These approximations, once optimized, indeed do not suffer from the restrictions of invariant manifolds (spectral gap condition [27, Eq. (2.18)]), allowing one to handle situations with, e.g., weak time-scale separation between the resolved and unresolved modes.

In this article, we carry over this variational approach to the case of spatially extended non-equilibrium systems within the framework of stochastic partial differential equations (SPDEs). In particular, we extend the parameterization formulas obtained in [30, 29, 31] for SPDEs driven by multiplicative noise (parameter noise [18, 7]), to SPDEs driven, beyond this onset, by more realistic, spatio-temporal noise either of Gaussian or of non-Gaussian nature with jumps. For this class of SPDEs, our framework allows for dealing with the important case of cutoff scales larger than the scales forced stochastically.

For this forcing scenario, the stochastic parameterizations derived in this work give rise to reduced systems taking the form of stochastic differential equations (SDEs) with non-Markovian, path-dependent coefficients depending on the noise’s past; see e.g. Section V.2 below. We mention that many time series records from real-world datasets are known to exhibit long-term memory. This is the case of long-range memory features assumed to represent the internal variability of the climate on time scales from years to centuries [32, 33, 34, 35]. There, the surface temperature is considered as a superposition of internal variability (the response to stochastic forcing) and a forced signal which is the linear response to external forcing. The background variability may be modeled using a stochastic process with memory, or a different process that incorporates non-Gaussianity if this is considered more appropriate [36]. As will become apparent at the end of this paper, our reduction approach is adaptable to PDEs under the influence of stochastic forcing with long-range dependence; see Section IX.

In parallel, and since Hasselmann’s seminal work [37], several climate models have been incorporating stochastic forcing to represent unresolved processes [38, 39, 40, 41, 42, 43]. Our approach shows that reduced models derived from these complex systems are expected to exhibit finite-range memory effects depending on the past history of the stochastic forcing, even if the latter is white in time. These exogenous memory effects, stemming from the forcing noise’s history, are distinct from endogenous memory effects encountered in the reduction of nonlinear, unforced systems [44, 45, 43]. The latter, predicted by the Mori-Zwanzig (MZ) theory, are functionals of the past of the resolved state variables. Endogenous memory effects often appear when the validity of the conditional expectation in the MZ expansion breaks down [46, 26, 47, 43]. The exogenous memory effects dealt with in this work have a different origin, arising as soon as the cutoff scale is larger than the scales forced stochastically.

It is worth mentioning that reduced models with non-Markovian coefficients responsible for such exogenous memory effects have been encountered in the reduction of stochastic systems driven by white noise, albeit near a change of stability. These are obtained from reduction approaches benefiting from timescale separation, such as stochastic normal forms [48, 49, 50], stochastic invariant manifold approximations [51, 30, 29, 31], or multiscale methods [52, 53, 54, 55]. Our reduction approach is not limited to timescale separation.

In that respect, our reduction framework is tested against a stochastic Allen-Cahn equation (sACE), a powerful tool for modeling non-equilibrium phase transitions in various scientific fields [1, 3, 56, 57]. The latter model is set in a parameter regime with weak timescale separation and far from the instability onset, in which the system exhibits multiple coexisting metastable states connected to the basic state through rare or typical stochastic transition paths.

We show that our resulting OPM reduced systems, when trained over a single path, are not only able to retain a remarkable predictive power to emulate ensemble statistics (correlations, power spectra, etc.) obtained as averages over a large ensemble of noise realizations, but also anticipate the system’s typical and rare metastable states and their statistics of occurrence. The OPM reduced system’s ability to accurately reproduce such transitions is rooted in its very structure. Its coefficients are nonlinear functionals of the aforementioned non-Markovian terms (see Eq. (85) below), allowing for an accurate representation of the genuine nonlinear interactions between noise and nonlinear terms in the original sACE, which drive fluctuations in the large-mode amplitudes.

We emphasize that, because the cutoff scale, defining the retained large-scale dynamics, is chosen here to be larger than the forcing scale, any deterministic reduction (e.g. Galerkin, invariant manifold, etc.) would filter out the underlying noise’s effects. This leaves them blind to the subtle fluctuations driving the system’s behavior and leading eventually to stochastic transitions. In contrast, our stochastic parameterization approach sees right through this filter. It tracks how the noise, acting in the “unseen” part of the system, interacts with the resolved modes through our reduction method’s non-Markovian coefficients.

To demonstrate the broad applicability of our reduction approach, we apply it in Section VI to another significant class of spatially extended non-equilibrium systems: jump-driven SPDEs. The unforced, nonlinear dynamics is here chosen to exhibit an S-shaped solution curve. Many systems sharing this attribute are indicative of multistability, tipping points, and hysteresis. Such behaviors are observed in various fields, including combustion theory [58, 59], plasma physics [60, 61], ecology [9], neuroscience [62], climate science [63, 64], and oceanography [65, 66]. By understanding the complex interactions between noise and nonlinearity into these phenomena, we can gain insights into critical transitions and tipping points, which are increasingly relevant to addressing global challenges like climate change [67, 68, 69, 70, 43].

Jump processes offer a powerful tool for modeling complex systems with non-smooth dynamics. They offer valuable insights into multistable behaviors [71, 72] and have been applied to various real-world phenomena, such as paleoclimate events [73, 36], chaotic transport [74], atmospheric dynamics [75, 76, 77, 78, 79, 80], and cloud physics [81, 82].

SPDEs driven by jump processes are gaining increasing attention in applications [83]. For instance, jump processes can replace (non-smooth) “if-then” conditions in such models, enabling more efficient simulations [80]. Nevertheless, efficient reduction techniques to disentangle the jump interactions from the other, smoother nonlinear components of the model are still under development [84, 85, 86]. Our stochastic parameterization framework provides a promising solution in this direction, as demonstrated in Section VI. By capturing the intricate interplay between jump noise and nonlinear dynamics within the reduced equations, our approach can significantly simplify the analysis of such complex systems.

II Invariance Equation and Approximations

II.1 Spatially extended non-equilibrium systems

This article is concerned with the efficient reduction of spatially extended non-equilibrium systems. To do so, we work within the framework of Stochastic Equations in Infinite Dimensions [87] and its Ergodic Theory [88]. Formally, these equations take the following form

\mathrm{d}u=(Au+G(u))\,\mathrm{d}t+\mathrm{d}\bm{\eta}_{t}, (1)

in which $\bm{\eta}_{t}$ is a stochastic process, either composed of Brownian motions or jump processes, whose exact structure is specified below. This formalism provides a convenient way to analyze stochastic partial differential equations (SPDEs) by means of semigroup theory [89, 90], in which the unknown $u$ typically evolves in a Hilbert space $H$.

The operator $A$ represents a linear differential operator while $G$ is a nonlinear operator that accounts for the nonlinear terms. Both of these operators may involve a loss of spatial regularity when applied to a function $u$ in $H$. A consistent existence theory of solutions and their stochastic invariant manifolds for SPDEs requires taking such loss-of-regularity effects into account [87, 88].

General assumptions encountered in applications are made on the linear operator $A$, following [89]. More specifically, we assume

A=-\mathcal{L}+\mathcal{B}, (2)

where $\mathcal{L}$ is sectorial with domain $D(\mathcal{L})\subset H$ compactly and densely embedded in $H$. We also assume that $-\mathcal{L}$ is stable, while $\mathcal{B}$ is a low-order perturbation of $\mathcal{L}$, i.e. $\mathcal{B}$ is a bounded linear operator such that $\mathcal{B}\colon D(\mathcal{L}^{\alpha})\rightarrow H$ for some $\alpha$ in $[0,1)$. We refer to [89, Sec. 1.4] for an intuitive presentation of fractional powers of an operator and a characterization of the so-called operator domain $D(\mathcal{L}^{\alpha})$. In practice, the choice of $\alpha$ should match the loss-of-regularity effects caused by the nonlinear terms so that

G\colon D(\mathcal{L}^{\alpha})\rightarrow H,

is a well-defined $C^{p}$-smooth mapping that satisfies $G(0)=0$, $DG(0)=0$ (tangency condition), and

G(u)=G_{k}(u,\cdots,u)+O(\|u\|^{k+1}_{\alpha}), (3)

with $p>k\geq 2$, and $G_{k}$ denoting the leading-order operator (on $D(\mathcal{L}^{\alpha})$) in the Taylor expansion of $G$ near the origin. A broad class of spatially extended stochastic equations from physics can be recast into this framework (see [91, 87, 30, 92, 31] for examples), as well as time-delay systems subject to stochastic disturbances [82, 93].

Throughout this article, we assume that the driving noise in Eq. (1) is an $H$-valued stochastic process that takes the form

{\bm{\eta}_{t}}(\omega)=\sum_{j=1}^{N}\sigma_{j}\eta_{t}^{j}(\omega)\,\bm{e}_{j},\quad t\in\mathbb{R},\;\sigma_{j}\geq 0,\;\omega\in\Omega, (4)

where the $\bm{e}_{j}$ are eigenmodes of $A$, and the $\eta_{t}^{j}$ denote either a finite family of mutually independent jump processes or mutually independent Brownian motions, $W_{t}^{j}$, over the relevant probability space $(\Omega,\mathbb{P},\mathcal{F})$ endowed with its canonical probability measure $\mathbb{P}$ and filtration $\mathcal{F}$; see e.g. [50, Appendix A.3].
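As a concrete illustration, the finite-sum forcing of Eq. (4) with Brownian components is straightforward to sample numerically. The sketch below (in Python, with purely illustrative parameter values) draws the mode-wise increments $\sigma_{j}\,\mathrm{d}W_{t}^{j}$; modes with $\sigma_{j}=0$ are unforced.

```python
import numpy as np

def noise_increments(sigmas, dt, n_steps, rng):
    """Brownian increments for the finite-sum forcing of Eq. (4):
    each retained eigenmode j receives an independent increment
    sigma_j * dW^j with dW^j ~ N(0, dt); sigma_j = 0 means unforced."""
    sigmas = np.asarray(sigmas, dtype=float)
    dW = rng.standard_normal((n_steps, sigmas.size)) * np.sqrt(dt)
    return sigmas * dW  # broadcast the amplitudes over the modes

rng = np.random.default_rng(0)
# Illustrative choice: first mode unforced, two stochastically forced modes
incr = noise_increments([0.0, 0.3, 0.5], dt=0.01, n_steps=1000, rng=rng)
```

Summing `incr` cumulatively along the time axis yields a discrete sample of the $N$-mode path $\bm{\eta}_{t}$ projected on its forced eigenmodes.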

Within this framework, the reduction of spatially extended non-equilibrium systems is organized in terms of resolved and unresolved spatial scales. The resolved scales are typically spanned by small wavenumbers (large spatial scales) and the unresolved ones by larger wavenumbers. In that respect, the eigenmodes of the operator $A$ play a central role throughout this paper to rank the spatial scales. Typically, we assume that the state space $H_{\mathfrak{c}}$ of resolved variables is spanned by the following modes

H_{\mathfrak{c}}=\mathrm{span}\{\bm{e}_{1},\cdots,\bm{e}_{m_{c}}\}, (5)

where the $\bm{e}_{j}$ correspond to large-scale modes up to a cutoff scale associated with some index $m_{c}$ (see Remark II.1). The subspace of unresolved modes is then the orthogonal complement of $H_{\mathfrak{c}}$ in $H$, namely

H_{\mathfrak{c}}\oplus H_{\mathfrak{f}}=H. (6)

To these subspaces, we associate their respective canonical projectors denoted by

\Pi_{\mathfrak{c}}\colon H\rightarrow H_{\mathfrak{c}},\ \text{ and }\ \Pi_{\mathfrak{f}}\colon H\rightarrow H_{\mathfrak{f}}. (7)
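In a Galerkin discretization, these projectors act simply by truncating the eigenmode expansion of $u$. A minimal sketch, assuming (for illustration only) that $u$ is represented by its coefficient vector in the eigenbasis of $A$:

```python
import numpy as np

def project(u_coeffs, m_c):
    """Split the eigenmode coefficients of u into its resolved part
    (first m_c modes, Pi_c u) and unresolved part (remaining modes,
    Pi_f u), mirroring the decomposition H = H_c + H_f of Eq. (6)."""
    u_c = np.zeros_like(u_coeffs)
    u_f = np.zeros_like(u_coeffs)
    u_c[:m_c] = u_coeffs[:m_c]   # resolved, large-scale modes
    u_f[m_c:] = u_coeffs[m_c:]   # unresolved modes
    return u_c, u_f

# Hypothetical 4-mode truncation with cutoff index m_c = 2
u = np.array([1.0, -0.5, 0.2, 0.05])
u_c, u_f = project(u, m_c=2)
```

By construction $u = \Pi_{\mathfrak{c}}u + \Pi_{\mathfrak{f}}u$, which the sketch reproduces exactly since the two coefficient blocks are disjoint.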

Throughout Section III below and the remainder of this section, we focus on the case of driving Brownian motions; the case of driving jump processes is dealt with in Section VI.

Our goal is to provide a general approach to derive reduced models that preserve the essential features of the large-scale dynamics without resolving the small scales.

We present hereafter and in Section III below, the formulas of the underlying small-scale parameterizations in the Gaussian noise case. The formalism is flexible and easily adaptable to the case of non-Gaussian noises with jumps such as discussed in Sections VI and VII, below.

Remark II.1

Within our working assumptions, the spectrum $\sigma(A)$ consists only of isolated eigenvalues with finite multiplicities. This, combined with the sectorial property of $A$, implies that there are at most finitely many eigenvalues with a given real part. The sectorial property of $A$ also implies that the real part of the spectrum, $\mathrm{Re}(\sigma(A))$, is bounded above (see also [94, Thm. II.4.18]). These two properties of $\mathrm{Re}(\sigma(A))$ allow us in turn to label the elements in $\sigma(A)$ according to the lexicographical order which we adopt throughout this article:

\sigma(A)=\{\lambda_{n}\;|\;n\in\mathbb{N}^{\ast}\}, (8)

such that for any $1\leq n<n^{\prime}$ we have either

\mathrm{Re}(\lambda_{n})>\mathrm{Re}(\lambda_{n^{\prime}}), (9)

or

\mathrm{Re}(\lambda_{n})=\mathrm{Re}(\lambda_{n^{\prime}}),\quad\text{ and }\quad\mathrm{Im}(\lambda_{n})\geq\mathrm{Im}(\lambda_{n^{\prime}}). (10)

This way, we can rely on a simple labeling of the eigenvalues/eigenmodes by positive integers to organize the resolved and unresolved scales and the corresponding parameterization formulas derived hereafter. In practice, when the spatial domain is 2D or 3D, it is usually physically more intuitive to label the eigenelements by wave vectors. The parameterization formulas presented below can be easily recast within this convention; see e.g. [95].
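The lexicographical order of Eqs. (9)-(10) is simple to implement; the following sketch sorts a hypothetical spectrum accordingly (decreasing real part, ties broken by decreasing imaginary part):

```python
def lexicographic_order(eigvals):
    """Sort eigenvalues by decreasing real part, breaking ties by
    decreasing imaginary part, as prescribed by Eqs. (9)-(10)."""
    return sorted(eigvals, key=lambda lam: (-lam.real, -lam.imag))

# Hypothetical spectrum of an operator A
lams = [-2.0 + 0j, 0.5 - 1j, 0.5 + 1j, -1.0 + 0j]
ordered = lexicographic_order(lams)
# -> [(0.5+1j), (0.5-1j), (-1+0j), (-2+0j)]
```

The first $m_{c}$ entries of `ordered` then index the resolved subspace $H_{\mathfrak{c}}$ of Eq. (5).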

II.2 Invariance equation and backward-forward systems

As mentioned in the Introduction, this work extends the parameterization formulas in [27, 28] (designed for constant or time-dependent forcing) to the stochastic setting. In both deterministic and stochastic contexts, deriving the relevant parameterizations and reduced systems relies on solving Backward-Forward (BF) systems arising in the theory of invariant manifolds. Similar to [27, 28], our BF framework allows us to overcome limitations of traditional invariant manifolds, particularly the spectral gap condition ([27, Eq. (2.18)]). This restrictive condition, which requires large gaps in the spectrum of the operator $A$, is bypassed through data-driven optimization (detailed in Section III.3) of the backward integration time over which the BF systems are integrated.

Before diving into the parameterization formulas retained to address these limitations (Section II.3), we recall the basics of reducing an SPDE to a Stochastic Invariant Manifold (SIM). In particular, this detour enables us to better appreciate the emergence of BF systems as a natural framework for parameterizations. For that purpose, adapting the approach from [30], we transform the SPDE reduction problem into the more tractable problem of reducing a PDE with random coefficients. This simplification makes the problem significantly more amenable to analysis than its original SPDE form.

To do so, consider the stationary solution $z(t,\omega)$ to the Langevin equation

\mathrm{d}z=Az\,\mathrm{d}t+\mathrm{d}\bm{W}_{t}, (11)

(Ornstein-Uhlenbeck (OU) process), where $\bm{W}_{t}$ is defined as in Eq. (4) with the corresponding mutually independent Brownian motions, $W_{t}^{j}$, in place of the $\eta_{t}^{j}$, i.e.:

{\bm{W}_{t}}(\omega)=\sum_{j=1}^{N}\sigma_{j}W_{t}^{j}(\omega)\,\bm{e}_{j},\quad t\in\mathbb{R},\;\omega\in\Omega. (12)
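Projected on a single eigenmode with eigenvalue $\lambda<0$, Eq. (11) reduces to a scalar OU process, which can be sampled exactly in law. The sketch below (illustrative parameter values) starts the path from its stationary distribution, so that the sample mimics the stationary solution $z(t,\omega)$:

```python
import numpy as np

def ou_path(lam, sigma, dt, n_steps, rng):
    """Exact-in-law sampling of the scalar OU mode
    dz = lam*z dt + sigma dW (lam < 0), i.e. Eq. (11) projected on a
    single stable eigenmode, started from its stationary distribution."""
    stat_std = sigma / np.sqrt(-2.0 * lam)       # stationary std dev
    phi = np.exp(lam * dt)                       # one-step decay factor
    step_std = stat_std * np.sqrt(1.0 - phi**2)  # exact noise scale
    z = np.empty(n_steps + 1)
    z[0] = stat_std * rng.standard_normal()      # stationary start
    xi = rng.standard_normal(n_steps)
    for n in range(n_steps):
        z[n + 1] = phi * z[n] + step_std * xi[n]
    return z

rng = np.random.default_rng(1)
z = ou_path(lam=-1.0, sigma=0.4, dt=0.01, n_steps=20000, rng=rng)
```

The exact one-step recursion avoids the discretization bias of an Euler scheme, which matters when the OU path later enters the parameterization formulas nonlinearly.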

Then, for each noise path $\omega$, the change of variable

v=u-z(t,\omega), (13)

transforms the SPDE,

\mathrm{d}u=(Au+G(u))\,\mathrm{d}t+\mathrm{d}\bm{W}_{t},

into the following PDE with random coefficients in the $v$-variable:

\frac{\mathrm{d}v}{\mathrm{d}t}=Av+G(v+z(t,\omega)). (14)

Under standard conditions involving the spectral gap between the resolved and unresolved modes [96], the path-dependent PDE (14) admits a SIM, and the underlying stochastic SIM mapping, $\phi(t,X)$ ($X\in H_{\mathfrak{c}}$), satisfies the stochastic invariance equation:

\partial_{t}\phi+\mathcal{L}_{A}[\phi](X)=-D\phi(t,X,\omega)\big(\Pi_{\mathfrak{c}}G(X+\phi+z(t,\omega))\big)+\Pi_{\mathfrak{f}}G(X+\phi+z(t,\omega)), (15)

where $\mathcal{L}_{A}$ denotes the operator acting on differentiable mappings $\psi$ from $H_{\mathfrak{c}}$ into $H_{\mathfrak{f}}$ as follows:

\mathcal{L}_{A}[\psi](X)=D\psi(X)A_{\mathfrak{c}}X-A_{\mathfrak{f}}\psi(X),\quad X\in H_{\mathfrak{c}}, (16)

with $A_{\mathfrak{c}}=\Pi_{\mathfrak{c}}A$ and $A_{\mathfrak{f}}=\Pi_{\mathfrak{f}}A$.

To simplify, assume that $H_{\mathfrak{c}}$ and $H_{\mathfrak{f}}$ are chosen such that $\Pi_{\mathfrak{f}}\bm{W}_{t}=0$. Then in particular $z_{\mathfrak{f}}=\Pi_{\mathfrak{f}}z=0$. Consider now the Lyapunov-Perron integral

\mathfrak{J}(t,X,\omega)=\int_{-\infty}^{t}e^{(t-s)A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}G_{k}\big(e^{(s-t)A_{\mathfrak{c}}}X+z_{\mathfrak{c}}(s,\omega)\big)\,\mathrm{d}s, (17)

where $z_{\mathfrak{c}}=\Pi_{\mathfrak{c}}z$. In what follows, we denote by $\langle\cdot,\cdot\rangle$ the Hermitian inner product on $H$ defined as $\langle f,g\rangle=\int f(x)\overline{g}(x)\,\mathrm{d}x$, while $G_{k}$ denotes the leading-order term of order $k$ in the Taylor expansion of $G(X)$ around $X=0$, and $\bm{e}_{n}^{\ast}$ denotes the $n$-th eigenmode of the adjoint operator of $A$, adopting the same labelling convention as for $A$; see Remark II.1. To the order $k$ we associate the set $K$ of indices $\{1,\cdots,k\}$, and denote by $K_{z}$ any (possibly empty) subset made of distinct elements of $K$.

We have then the following result.

Theorem II.1

Under the previous assumptions, assume furthermore that the following non-resonance condition holds for any $(j_{1},\cdots,j_{k})$ in $(1,\cdots,m_{c})^{k}$, any $n\geq m_{c}+1$, and any subset $K_{z}$ as defined above:

\left(G_{j_{1}\cdots j_{k}}^{n}\neq 0\ \text{ and }\ \prod_{q\in K_{z}}\sigma_{q}\neq 0\right)\implies\left(\mathrm{Re}\Big(\lambda_{n}-\sum_{p\in K\setminus K_{z}}\lambda_{j_{p}}\Big)<0\right), (18)

with $G_{j_{1}\cdots j_{k}}^{n}=\langle G_{k}(\bm{e}_{j_{1}},\cdots,\bm{e}_{j_{k}}),\bm{e}_{n}^{\ast}\rangle$.

Then, the Lyapunov-Perron integral $\mathfrak{J}$ (Eq. (17)) is well-defined almost surely, and is a solution to the following homological equation with random coefficients:

\partial_{t}\psi+\mathcal{L}_{A}[\psi](X)=\Pi_{\mathfrak{f}}G_{k}(X+z_{\mathfrak{c}}(t,\omega)). (19)

Moreover, we observe that the integral $\mathfrak{J}$ is (formally) obtained as the limit, when $\tau$ goes to infinity, of the $q$-solution to the following backward-forward (BF) auxiliary system:

\frac{\mathrm{d}p}{\mathrm{d}s}=A_{\mathfrak{c}}p,\quad s\in[t-\tau,t], (20a)
\frac{\mathrm{d}q}{\mathrm{d}s}=A_{\mathfrak{f}}q+\Pi_{\mathfrak{f}}G_{k}(p+z_{\mathfrak{c}}(s,\omega)),\quad s\in[t-\tau,t], (20b)
\text{with } p(s)|_{s=t}=X\in H_{\mathfrak{c}}, \text{ and } q(s)|_{s=t-\tau}=0. (20c)

That is

\lim_{\tau\rightarrow\infty}\|\mathfrak{J}(t,X,\omega)-q_{\tau}(t,X,\omega)\|_{H_{\mathfrak{f}}}=0, (21)

where $q_{\tau}(t,X,\omega)$ denotes the solution to Eq. (20b) at time $s=t$, when initialized with $q=0$ at $s=t-\tau$.

For a proof of this theorem, see Appendix A. This theorem extends Theorem III.1 of [28] to the stochastic context. To simplify notations, we omit below the $\omega$-dependence unless specified otherwise.

Theorem II.1 teaches us that solving the BF system (20) gives the solution to the homological equation (19), which is an approximation of the full invariance equation (15). In the deterministic setting, solutions to the homological equation are known to provide actual approximations of the invariant manifolds with rigorous error estimates; see [27, Theorem 1] and [28, Theorem III.1]. Such results extend to the case of SPDEs driven by multiplicative (parameter) noise; see [31, Theorem 2.1] and [30, Theorem 6.1 and Corollary 7.1]. It is not within the scope of this article to deal with rigorous error estimates for the approximation of stochastic invariant manifolds of SPDEs driven by additive noise, but the rationale stays the same: solutions to the homological equation (19), or equivalently to the BF system (20), provide actual approximations of the underlying stochastic invariant manifolds in the additive noise case as well.
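To make the backward-forward mechanism concrete, the sketch below integrates a scalar analogue of system (20): the resolved mode is integrated backward from $p(t)=X$, and the unresolved mode forward from $q(t-\tau)=0$. The quadratic interaction $b\,(p+z_{\mathfrak{c}})^{2}$, the frozen (deterministic) noise path, and all parameter values are hypothetical choices for illustration, not the paper's model.

```python
import numpy as np

def bf_parameterization(X, t, tau, lam_c, lam_f, b, z_c, dt):
    """One backward-forward pass (Eqs. (20a)-(20c)) for a two-mode toy
    model: the resolved mode p follows the linear flow backward from
    p(t) = X, and the unresolved mode q, driven by the illustrative
    quadratic term b*(p + z_c)^2, is integrated forward from
    q(t - tau) = 0.  `z_c` is a callable giving the noise path at s."""
    n = int(round(tau / dt))
    s_grid = t - tau + dt * np.arange(n + 1)
    p = X * np.exp(lam_c * (s_grid - t))   # backward solution of (20a)
    q = 0.0                                # q(t - tau) = 0, cf. (20c)
    for i in range(n):                     # forward Euler for (20b)
        forcing = b * (p[i] + z_c(s_grid[i]))**2
        q = q + dt * (lam_f * q + forcing)
    return q                               # q_tau(t, X)

z_c = lambda s: 0.1 * np.cos(s)            # frozen path, for illustration
q_tau = bf_parameterization(X=0.5, t=0.0, tau=20.0,
                            lam_c=0.2, lam_f=-2.0, b=1.0,
                            z_c=z_c, dt=1e-3)
```

Since $\lambda_{\mathfrak{f}}<0$ here, increasing $\tau$ changes the output only by exponentially small terms, numerically illustrating the limit (21).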

Going back to the SPDE variable $u$, the BF system (20) becomes

\mathrm{d}\widehat{p}=A_{\mathfrak{c}}\widehat{p}\,\mathrm{d}s+\Pi_{\mathfrak{c}}\,\mathrm{d}\bm{W}_{s},\quad s\in[t-\tau,t], (22a)
\mathrm{d}\widehat{q}=\big(A_{\mathfrak{f}}\widehat{q}+\Pi_{\mathfrak{f}}G_{k}(\widehat{p})\big)\,\mathrm{d}s,\quad s\in[t-\tau,t], (22b)
\text{with } \widehat{p}(s)|_{s=t}=X\in H_{\mathfrak{c}}, \text{ and } \widehat{q}(s)|_{s=t-\tau}=0. (22c)

Note that in Eq. (22), only the low-mode variable $\widehat{p}$ is forced stochastically. Thus, from what precedes, $\lim_{\tau\rightarrow\infty}\widehat{q}_{\tau}(t,X)$ provides a legitimate approximation of the SIM at time $t$ when, in the original SPDE, only the low modes are forced stochastically, i.e. $N=m_{c}$ in (12). In the next section, we consider the general case when the low and high modes are both stochastically forced.

II.3 Approximations of fully coupled backward-forward systems

Backward-forward systems have actually been proposed in the literature for the construction of SIMs [97], through a different route than the one presented above, i.e. without exploiting the invariance equation. The idea pursued in [97] is to envision SIMs as a fixed point of an integral form of the following fully coupled BF system

\mathrm{d}p=\big[A_{\mathfrak{c}}p+\Pi_{\mathfrak{c}}G(p+q)\big]\,\mathrm{d}s+\Pi_{\mathfrak{c}}\,\mathrm{d}\bm{W}_{s},\quad s\in I_{t,\tau}, (23a)
\mathrm{d}q=\big[A_{\mathfrak{f}}q+\Pi_{\mathfrak{f}}G(p+q)\big]\,\mathrm{d}s+\Pi_{\mathfrak{f}}\,\mathrm{d}\bm{W}_{s},\quad s\in I_{t,\tau}, (23b)
\text{with } p(s)|_{s=t}=X\in H_{\mathfrak{c}}, \text{ and } q(s)|_{s=t-\tau}=0, (23c)

where $I_{t,\tau}=[t-\tau,t]$. Note that in Eq. (23), we do not assume the noise term $\bm{W}$ to be a finite sum of independent Brownian motions. In the case of an infinite sum, one can thus cover the case of space-time white noise, thanks to [98].

In any case, it is known that this type of fully coupled nonlinear problem involving backward stochastic equations does not always have solutions in general, and we refer to [97, Proposition 3.1] for conditions on $A$ and $G$ ensuring existence of solutions and thus of a SIM. This (nonlinear) BF approach to SIMs is also subject to a spectral gap condition that requires the gap to be large enough, as in [96].

Denoting by $q_{\tau}(t,X)$ the solution to Eq. (23b) at $s=t$, Proposition 3.4 of [97] then ensures that $\lim_{\tau\rightarrow\infty}q_{\tau}(t,X)$ exists in $L^{2}(\Omega)$ and that this limit gives the sought SIM, i.e.

Ψsim(X,t)=limτqτ(t,X),XH𝔠.\Psi^{\text{sim}}(X,t)=\lim_{\tau\rightarrow\infty}q_{\tau}(t,X),\quad X\in H_{\mathfrak{c}}. (24)

The SIM is thus obtained, almost surely, as the asymptotic graph of the mapping Xqτ(t,X)X\mapsto q_{\tau}(t,X), as τ\tau is sent to infinity.

Instead of relying on an integral form of Eq. (23), one can nevertheless address the existence and construction of a SIM via a more direct approach, exploiting iteration schemes built directly from Eq. (23). It is not the purpose of this article to analyze the convergence of such iterative schemes (also subject to a spectral gap condition as in [97]) but rather to illustrate how informative such schemes can be in designing stochastic parameterizations in practice.

To solve (23), given an initial guess, (p(0),q(0))(p^{(0)},q^{(0)}), we propose the following iterative scheme:

dp()=[A𝔠p()+Π𝔠G(p(1)+q(1))]ds+Π𝔠d𝑾s,\displaystyle\mathrm{d}p^{(\ell)}=\Big{[}A_{\mathfrak{c}}p^{(\ell)}+\Pi_{\mathfrak{c}}G\big{(}p^{(\ell-1)}+q^{(\ell-1)}\big{)}\Big{]}\mathrm{d}s+\Pi_{\mathfrak{c}}\mathrm{d}\bm{W}_{s}, (25a)
dq()=[A𝔣q()+Π𝔣G(p(1)+q(1))]ds+Π𝔣d𝑾s,\displaystyle\mathrm{d}q^{(\ell)}=\Big{[}A_{\mathfrak{f}}q^{(\ell)}+\Pi_{\mathfrak{f}}G\big{(}p^{(\ell-1)}+q^{(\ell-1)}\big{)}\Big{]}\mathrm{d}s+\Pi_{\mathfrak{f}}\mathrm{d}\bm{W}_{s}, (25b)

where 1\ell\geq 1 and with p()(s)|s=t=Xp^{(\ell)}(s)|_{s=t}=X in H𝔠H_{\mathfrak{c}} and q()(s)|s=tτ=0q^{(\ell)}(s)|_{s=t-\tau}=0. Here again, the first equation (Eq. (25a)) is integrated backward over [tτ,t][t-\tau,t], followed by a forward integration of the second equation (Eq. (25b)) over the same interval. To simplify the notations, we will often omit the interval [tτ,t][t-\tau,t] in the BF systems below.

To help interpret the parameterization produced by such an iterative scheme, we momentarily restrict ourselves to the case of a nonlinearity GG that is quadratic (denoted by BB) and to noise terms scaled by a parameter ϵ\epsilon. This setting covers, in particular, the important case of the Kardar–Parisi–Zhang equation [99]. The inclusion of the scaling factor ϵ\epsilon allows us to group the terms constituting the stochastic parameterization according to powers of ϵ\epsilon, providing in particular useful small-noise expansions; see Eq. (30) below.

Under this working framework, if we start with (p(0),q(0))(0,0)(p^{(0)},q^{(0)})\equiv(0,0), we then get

p(1)(s)=e(st)A𝔠Xϵste(ss)A𝔠Π𝔠d𝑾s,\displaystyle p^{(1)}(s)=e^{(s-t)A_{\mathfrak{c}}}X-\epsilon\int_{s}^{t}e^{(s-s^{\prime})A_{\mathfrak{c}}}\Pi_{\mathfrak{c}}\,\mathrm{d}\bm{W}_{s^{\prime}}, (26)
q(1)(s)=ϵtτse(ss)A𝔣Π𝔣d𝑾s,\displaystyle q^{(1)}(s)=\epsilon\int_{t-\tau}^{s}e^{(s-s^{\prime})A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}\,\mathrm{d}\bm{W}_{s^{\prime}},

which leads to

p(2)(s)\displaystyle p^{(2)}(s) =e(st)A𝔠X\displaystyle=e^{(s-t)A_{\mathfrak{c}}}X
ste(ss)A𝔠Π𝔠B(p(1)(s)+q(1)(s))ds\displaystyle\quad-\int_{s}^{t}e^{(s-s^{\prime})A_{\mathfrak{c}}}\Pi_{\mathfrak{c}}B\big{(}p^{(1)}(s^{\prime})+q^{(1)}(s^{\prime})\big{)}\,\mathrm{d}s^{\prime}
ϵste(ss)A𝔠Π𝔠d𝑾s,\displaystyle\qquad-\epsilon\int_{s}^{t}e^{(s-s^{\prime})A_{\mathfrak{c}}}\Pi_{\mathfrak{c}}\,\mathrm{d}\bm{W}_{s^{\prime}},

and

q(2)(s)=tτse(ss)A𝔣Π𝔣B(p(1)(s)+q(1)(s))ds\displaystyle q^{(2)}(s)=\int_{t-\tau}^{s}e^{(s-s^{\prime})A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}B\big{(}p^{(1)}(s^{\prime})+q^{(1)}(s^{\prime})\big{)}\,\mathrm{d}s^{\prime} (27)
+ϵtτse(ss)A𝔣Π𝔣d𝑾s.\displaystyle\hskip 56.9055pt+\epsilon\int_{t-\tau}^{s}e^{(s-s^{\prime})A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}\,\mathrm{d}\bm{W}_{s^{\prime}}.

To make more explicit q(2)(s)q^{(2)}(s), which provides the stochastic parameterization we are seeking after two iterations, we introduce

fi(s)\displaystyle f_{i}(s) =ste(ss)λidWsi(ω),i=1,,mc,\displaystyle=-\int_{s}^{t}e^{(s-s^{\prime})\lambda_{i}}\,\mathrm{d}W^{i}_{s^{\prime}}(\omega),\quad i=1,\ldots,m_{c}, (28)
gn(s)\displaystyle g_{n}(s) =tτse(ss)λndWsn(ω),nmc+1.\displaystyle=\int_{t-\tau}^{s}e^{(s-s^{\prime})\lambda_{n}}\,\mathrm{d}W^{n}_{s^{\prime}}(\omega),\quad n\geq m_{c}+1.

We can then rewrite (p(1),q(1))(p^{(1)},q^{(1)}) given in Eq. (26) as follows:

p(1)(s)=i=1mc(e(st)λiXi+ϵfi(s))𝒆i,\displaystyle p^{(1)}(s)=\sum_{i=1}^{m_{c}}\big{(}e^{(s-t)\lambda_{i}}X_{i}+\epsilon f_{i}(s)\big{)}\bm{e}_{i}, (29)
q(1)(s)=ϵnmc+1gn(s)𝒆n.\displaystyle q^{(1)}(s)=\epsilon\sum_{n\geq m_{c}+1}g_{n}(s)\bm{e}_{n}.

By introducing additionally

Bijn=B(𝒆i,𝒆j),𝒆n,B_{ij}^{n}=\langle B(\bm{e}_{i},\bm{e}_{j}),\bm{e}^{\ast}_{n}\rangle,

we get for any nmc+1n\geq m_{c}+1 the following explicit expressions, where Πn\Pi_{n} denotes the projection onto the mode 𝒆n\bm{e}_{n}:

ΠnB(p(1)(s),p(1)(s))\displaystyle\Pi_{n}B(p^{(1)}(s),p^{(1)}(s))
=ϵ2i,j=1mcBijnfi(s)fj(s)+ϵi,j=1mcBijnfi(s)e(st)λjXj\displaystyle=\epsilon^{2}\sum_{i,j=1}^{m_{c}}B^{n}_{ij}f_{i}(s)f_{j}(s)+\epsilon\sum_{i,j=1}^{m_{c}}B^{n}_{ij}f_{i}(s)e^{(s-t)\lambda_{j}}X_{j}
+ϵi,j=1mcBijnfj(s)e(st)λiXi+i,j=1mcBijne(st)(λi+λj)XiXj,\displaystyle\;\;+\epsilon\sum_{i,j=1}^{m_{c}}\!B^{n}_{ij}f_{j}(s)e^{(s-t)\lambda_{i}}X_{i}+\sum_{i,j=1}^{m_{c}}\!B^{n}_{ij}e^{(s-t)(\lambda_{i}+\lambda_{j})}X_{i}X_{j},
ΠnB(p(1)(s),q(1)(s))\displaystyle\Pi_{n}B(p^{(1)}(s),q^{(1)}(s)) =ϵ2i=1mcn=mc+1Bi,nnfi(s)gn(s)\displaystyle=\epsilon^{2}\sum_{i=1}^{m_{c}}\sum_{n^{\prime}=m_{c}+1}^{\infty}B^{n}_{i,n^{\prime}}f_{i}(s)g_{n^{\prime}}(s)
+ϵi=1mcn=mc+1Binngn(s)e(st)λiXi,\displaystyle\;+\epsilon\sum_{i=1}^{m_{c}}\sum_{n^{\prime}=m_{c}+1}^{\infty}B^{n}_{in^{\prime}}g_{n^{\prime}}(s)e^{(s-t)\lambda_{i}}X_{i},

and

ΠnB(q(1)(s),q(1)(s))=ϵ2n,n′′mc+1Bnn′′ngn(s)gn′′(s).\Pi_{n}B(q^{(1)}(s),q^{(1)}(s))=\epsilon^{2}\sum_{n^{\prime},n^{\prime\prime}\geq m_{c}+1}B^{n}_{n^{\prime}n^{\prime\prime}}g_{n^{\prime}}(s)g_{n^{\prime\prime}}(s).

Recall that Πn\Pi_{n} denotes the projector onto the mode 𝒆n\bm{e}_{n}. Using the above identities in Eq. (27), and setting s=ts=t, we get for nmc+1n\geq m_{c}+1,

Πnq(2)(t)\displaystyle\Pi_{n}q^{(2)}(t) =ϵ2an(τ)+ϵi=1mcbin(τ)Xi\displaystyle=\epsilon^{2}a_{n}(\tau)+\epsilon\sum_{i=1}^{m_{c}}b_{in}(\tau)X_{i} (30)
+i,j=1mccijn(τ)BijnXiXj\displaystyle\quad+\sum_{i,j=1}^{m_{c}}c_{ij}^{n}(\tau)B_{ij}^{n}X_{i}X_{j}
+ϵtτte(ts)λndWsn(ω),\displaystyle\quad+\epsilon\int_{t-\tau}^{t}e^{(t-s^{\prime})\lambda_{n}}\,\mathrm{d}W^{n}_{s^{\prime}}(\omega),

where

an(τ)\displaystyle a_{n}(\tau) =tτte(ts)λni,j=1mcBijnfi(s)fj(s)ds(α)+tτte(ts)λn(i=1mcn=mc+1(Binn+Bnin)fi(s)gn(s))ds(β)\displaystyle=\underbrace{\int_{t-\tau}^{t}e^{(t-s^{\prime})\lambda_{n}}\sum_{i,j=1}^{m_{c}}B^{n}_{ij}f_{i}(s^{\prime})f_{j}(s^{\prime})\,\mathrm{d}s^{\prime}}_{(\alpha)}+\underbrace{\int_{t-\tau}^{t}e^{(t-s^{\prime})\lambda_{n}}\bigg{(}\sum_{i=1}^{m_{c}}\sum_{n^{\prime}=m_{c}+1}^{\infty}(B^{n}_{in^{\prime}}+B^{n}_{n^{\prime}i})f_{i}(s^{\prime})g_{n^{\prime}}(s^{\prime})\bigg{)}\,\mathrm{d}s^{\prime}}_{(\beta)} (31)
+tτte(ts)λn(n,n′′mc+1Bnn′′ngn(s)gn′′(s))ds(γ),\displaystyle\hskip 85.35826pt+\underbrace{\int_{t-\tau}^{t}e^{(t-s^{\prime})\lambda_{n}}\bigg{(}\sum_{n^{\prime},n^{\prime\prime}\geq m_{c}+1}B^{n}_{n^{\prime}n^{\prime\prime}}g_{n^{\prime}}(s^{\prime})g_{n^{\prime\prime}}(s^{\prime})\bigg{)}\,\mathrm{d}s^{\prime}}_{(\gamma)},
bin(τ)\displaystyle b_{in}(\tau) =tτte(ts)(λnλi)j=1mc(Bijn+Bjin)fj(s)ds\displaystyle=\int_{t-\tau}^{t}e^{(t-s^{\prime})(\lambda_{n}-\lambda_{i})}\sum_{j=1}^{m_{c}}(B^{n}_{ij}+B^{n}_{ji})f_{j}(s^{\prime})\,\mathrm{d}s^{\prime} (32)
+tτte(ts)(λnλi)n=mc+1(Binn+Bnin)gn(s)ds,\displaystyle+\int_{t-\tau}^{t}e^{(t-s^{\prime})(\lambda_{n}-\lambda_{i})}\sum_{n^{\prime}=m_{c}+1}^{\infty}(B^{n}_{in^{\prime}}+B^{n}_{n^{\prime}i})g_{n^{\prime}}(s^{\prime})\,\mathrm{d}s^{\prime},

and

cijn(τ)\displaystyle c_{ij}^{n}(\tau) =tτte(ts)(λn(λi+λj))ds\displaystyle=\int_{t-\tau}^{t}e^{(t-s^{\prime})(\lambda_{n}-(\lambda_{i}+\lambda_{j}))}\,\mathrm{d}s^{\prime}
={1exp(δijnτ)δijn,if δijn0,τ,otherwise,\displaystyle=\begin{cases}\frac{1-\exp(-\delta_{ij}^{n}\tau)}{\delta_{ij}^{n}},&\text{if $\delta_{ij}^{n}\neq 0$},\\ \tau,&\text{otherwise},\end{cases}

with δijn=λi+λjλn\delta_{ij}^{n}=\lambda_{i}+\lambda_{j}-\lambda_{n}.

In the classical approximation theory of SIMs, one is interested in conditions ensuring convergence of the integrals involved in the random coefficients ana_{n} and binb_{in} as τ\tau\rightarrow\infty, since q(2)(t)q^{(2)}(t) (and its higher-order analogues q()(t)q^{(\ell)}(t) with >2\ell>2) aims to approximate the SIM defined in Eq. (24). In our approach to stochastic parameterization, we do not restrict ourselves to such a limiting case but rather seek an optimal backward integration time τ\tau that minimizes a parameterization defect, as explained below in Section III.3.

However, computing this parameterization defect involves evaluating the random coefficients in the course of time (see Eq. (53) below). The challenge is that the structure of ana_{n} and binb_{in} involves repeated stochastic convolutions in time, and as such one wants to avoid a direct computation by quadrature. We propose below an alternative and efficient way to compute such random coefficients, for a simpler, cruder approximation than q(2)q^{(2)}. As shown in Sections V and VI, this other class of stochastic parameterizations turns out to be highly effective in the important case where the stochastically forced scales are exclusively part of the neglected scales.

This approximation consists of setting q(1)=0q^{(1)}=0 in Eq. (27), namely dealing with the stochastic parameterization

Φ(t,X,ω)\displaystyle\Phi(t,X,\omega) =tτte(ts)A𝔣Π𝔣G(e(st)A𝔠X)ds\displaystyle=\int_{t-\tau}^{t}e^{(t-s)A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}G\big{(}e^{(s-t)A_{\mathfrak{c}}}X\big{)}\,\mathrm{d}s (33)
+ϵtτte(ts)A𝔣Π𝔣d𝑾s(ω),\displaystyle\qquad+\epsilon\int_{t-\tau}^{t}e^{(t-s)A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}\,\mathrm{d}\bm{W}_{s}(\omega),

after restoring GG as a more general nonlinearity than BB.

Note also that this parameterization Φ(t,X,ω)\Phi(t,X,\omega) is exactly the solution q(s,X,ω)q(s,X,\omega) at s=ts=t of the following BF system

dp=A𝔠pds,sIt,τ,\displaystyle\!\,\mathrm{d}p=A_{\mathfrak{c}}p\,\mathrm{d}s,\hskip 110.96556pts\in I_{t,\tau}, (34a)
dq=(A𝔣q+Π𝔣G(p))ds+ϵΠ𝔣d𝑾s,sIt,τ,\displaystyle\!\,\mathrm{d}q=\left(A_{\mathfrak{f}}q+\Pi_{\mathfrak{f}}G(p)\right)\,\mathrm{d}s+\epsilon\Pi_{\mathfrak{f}}\,\mathrm{d}\bm{W}_{s},\;\;s\in I_{t,\tau}, (34b)
with p(s)|s=t=XH𝔠, and q(s)|s=tτ=0,\displaystyle\!\mbox{with }p(s)|_{s=t}=X\in H_{\mathfrak{c}},\mbox{ and }q(s)|_{s=t-\tau}=0, (34c)

where It,τ=[tτ,t]I_{t,\tau}=[t-\tau,t].
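This equivalence can be checked numerically. The following sketch (with illustrative scalar values not taken from the paper; one low mode, one high mode, and a quadratic nonlinearity standing in for GG) integrates the BF system (34) by Euler–Maruyama and compares the endpoint value of qq with a direct quadrature of Eq. (33):

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar toy version of the BF system (34): one low mode (lam_c), one high
# mode (lam_f), quadratic G(p) = p**2. All values are illustrative.
lam_c, lam_f = 0.5, -2.0
X, eps, tau, t = 0.8, 0.1, 1.0, 0.0
dt = 1e-3
s = np.arange(t - tau, t + dt / 2, dt)          # grid on [t - tau, t]
dW = rng.standard_normal(len(s) - 1) * np.sqrt(dt)
G = lambda p: p ** 2

# Backward step (34a): p is noise-free in (34), so p(s) = exp((s - t) lam_c) X.
p = np.exp((s - t) * lam_c) * X

# Forward Euler-Maruyama step (34b), from q(t - tau) = 0 up to s = t.
q = 0.0
for k in range(len(s) - 1):
    q += dt * (lam_f * q + G(p[k])) + eps * dW[k]

# Closed form (33): deterministic integral plus stochastic convolution,
# both evaluated by left-point quadrature on the same grid.
det = np.sum(np.exp((t - s[:-1]) * lam_f) * G(p[:-1])) * dt
sto = eps * np.sum(np.exp((t - s[:-1]) * lam_f) * dW)
phi = det + sto

print(abs(q - phi))   # small: both agree up to discretization error
```

The two routes use the same noise increments, so they differ only by the discretization error of the time-stepping and of the quadrature.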

Section III.2 details an efficient computation of the single stochastic convolution in Eq. (33) using auxiliary ODEs with random coefficients. It is worth noting that previous SPDE reduction approaches have, under specific assumptions (e.g., ΠcB(q,q)=0\Pi_{c}B(q,q)=0 as in [54, Eq. (2.2)] and [40, Eq. (2.10)]), deliberately avoided computing these convolutions directly via quadrature [100, 101, 102].

The stochastic convolution in Eq. (33) accounts for finite-range memory effects stemming from the stochastic forcing of the original equation. Such terms have been encountered in the approximation of low-mode amplitudes, albeit in their asymptotic form as τ\tau\rightarrow\infty; see e.g. [54, 103]. As shown in our examples below (Sections V and VI), stochastic convolution terms become crucial for efficient reduction when the cutoff scale, defining the retained large-scale dynamics, is larger than the forcing scale. There, we show furthermore that the finite-range memory content (measured by τ\tau) is a key factor for a skillful reduction when optimized properly. Also, as exemplified in applications, optimizing the nonlinear terms in Eq. (33) may turn out to be of utmost importance to reproduce accurately the average motion of the SPDE dynamics; see Section V.5 below.

The efficient computation of the repeated convolutions involved in the coefficients ana_{n} and binb_{in} is, however, more challenging and will be communicated elsewhere. This is not only a technical aspect, though, as these repeated convolutions in ana_{n} given by Eq. (31) characterize important triad interactions, reflecting how the noise interacts through the nonlinear terms within three groups: the low-low interactions in (α\alpha), the low-high interactions in (β\beta), and the high-high interactions in (γ\gamma). By using the simpler parameterization Φ\Phi defined in Eq. (33), we do not keep these interactions at the parameterization level; however, as Φ\Phi approximates the high-mode amplitude, it still allows us to account for triad interactions in the corresponding reduced models; see e.g. Eq. (85) below.

III Non-Markovian Parameterizations: Formulas and Optimization

The previous discussion leads us to consider, for each nmc+1n\geq m_{c}+1, the following scale-aware BF systems

dp=A𝔠pds,s[tτ,t],\displaystyle\mathrm{d}p=A_{\mathfrak{c}}p\,\mathrm{d}s,\hskip 105.2751pts\in[t-\tau,t], (35a)
dq=(λnq+ΠnG(p))ds+σndWsn,s[tτ,t],\displaystyle\mathrm{d}q=\Big{(}\lambda_{n}q+\Pi_{n}G\big{(}p\big{)}\Big{)}\,\mathrm{d}s+\sigma_{n}\,\mathrm{d}W_{s}^{n},s\in[t-\tau,t], (35b)
with p(s)|s=t=XH𝔠, and q(s)|s=tτ=Y,\displaystyle\mbox{with }p(s)|_{s=t}=X\in H_{\mathfrak{c}},\mbox{ and }q(s)|_{s=t-\tau}=Y\in\mathbb{R}, (35c)

where Πn\Pi_{n} denotes the projector onto the mode 𝒆n\bm{e}_{n}. Here, τ\tau is a free parameter to be adjusted, no longer condemned to approach \infty as in the approximation theory of SIMs discussed above. Note that, compared to Eq. (34), we break down the forcing mode by mode so that the free backward-integration time, τ\tau, can be adjusted per parameterized scale (scale-awareness). This strategy allows for a greater degree of freedom in calibrating useful parameterizations, as will become apparent in the applications dealt with in Sections V and VI.

Also, compared to Eq. (34), the initial condition for the forward integration in Eq. (35) is a scale-aware parameter YY in Eq. (35c). It is aimed at resolving the time-mean of the nn-th high-mode amplitude, u(t),𝒆n\langle u(t),\bm{e}_{n}^{\ast}\rangle, with uu solving Eq. (1). In many applications, however, it is enough to set Y=0Y=0.

In the following, we make explicit the stochastic parameterization obtained by integration of the BF system (35) for SPDEs with cubic nonlinear terms, for which the stochastically forced scales are part of the neglected scales.

III.1 Stochastic parameterizations for systems with cubic interactions

We consider

du=(Au+G2(u,u)+G3(u,u,u))dt+d𝑾t.\,\mathrm{d}u=(Au+G_{2}(u,u)+G_{3}(u,u,u))\,\mathrm{d}t+\,\mathrm{d}{\bm{W}_{t}}. (36)

Here, AA is a partial differential operator as defined in (2) for a suitable ambient Hilbert space HH; see also [29, Chap. 2]. The operators G2G_{2} and G3G_{3} are quadratic and cubic, respectively. Here again, we assume the eigenmodes of AA and its adjoint AA^{*} to form a bi-orthonormal basis of HH; namely, the set of eigenmodes {𝒆j}\{\bm{e}_{j}\} of AA and {𝒆j}\{\bm{e}^{*}_{j}\} of AA^{*} each forms a Hilbert basis of HH, and they satisfy the bi-orthogonality condition 𝒆j,𝒆kH=δj,k\langle\bm{e}_{j},\bm{e}^{*}_{k}\rangle_{H}=\delta_{j,k}, where δj,k\delta_{j,k} denotes the Kronecker delta. The noise term 𝑾t{\bm{W}_{t}} takes the form defined in Eq. (12). Stochastic equations such as Eq. (36) arise in many areas of physics, e.g., in the description of phase separation [104, 105, 56], pattern formation [3, 7], or transitions in geophysical fluid models [20, 106, 107, 108, 25, 21, 31].

We assume that Eq. (36) is well-posed in the sense of possessing for any u0u_{0} in HH a unique mild solution, i.e. there exists a stopping time T>0T>0 such that the integral equation,

u(t)=eAtu0+0teA(ts)(G2(u(s))+G3(u(s)))ds\displaystyle u(t)=e^{At}u_{0}+\int_{0}^{t}e^{A(t-s)}\Big{(}G_{2}(u(s))+G_{3}(u(s))\Big{)}\,\mathrm{d}s (37)
+0teA(ts)d𝑾s,t[0,T],\displaystyle\hskip 71.13188pt+\int_{0}^{t}e^{A(t-s)}\,\mathrm{d}\bm{W}_{s},\;\;t\in[0,T],

possesses, almost surely, a unique solution in C([0,T],H)C([0,T],H) (continuous paths in HH) for all stopping times tTt\leq T.

Here the stochastic integral can be represented as a series of one-dimensional Itô integrals

0teA(ts)d𝑾s=kσk0teλk(ts)dWsk𝒆k,\int_{0}^{t}e^{A(t-s)}\,\mathrm{d}\bm{W}_{s}=\sum_{k}\sigma_{k}\int_{0}^{t}e^{\lambda_{k}(t-s)}\,\mathrm{d}W_{s}^{k}\bm{e}_{k}, (38)

where the WskW_{s}^{k} are the independent Brownian motions in (12). It is known that Eq. (36) admits a unique mild solution in HH under suitable conditions on the coefficients of G2G_{2} and G3G_{3} (confining potential) [87]; see also [109, Prop. 3.4]. We refer to [88, 87] for background on the stochastic convolution term 0te(ts)Ad𝑾s\int_{0}^{t}e^{(t-s)A}\,\mathrm{d}\bm{W}_{s}.
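Each term of the series (38) is a scalar Ornstein–Uhlenbeck-type convolution that can be sampled exactly in distribution one time step at a time; for a stable mode it equilibrates to the variance sigma_k^2/(2|lambda_k|). A minimal sketch with illustrative values of lambda and sigma:

```python
import numpy as np

rng = np.random.default_rng(1)

# One mode of the series (38): x(t) = sigma * int_0^t exp(lam (t - s)) dW_s.
# Illustrative values; exact-in-distribution one-step update.
lam, sigma = -1.0, 1.0
dt, n_steps, n_paths = 0.01, 2000, 20000

decay = np.exp(lam * dt)
# Variance of the convolution over one step: sigma^2 (exp(2 lam dt) - 1)/(2 lam).
step_std = sigma * np.sqrt((np.exp(2 * lam * dt) - 1.0) / (2 * lam))

x = np.zeros(n_paths)
for _ in range(n_steps):
    x = decay * x + step_std * rng.standard_normal(n_paths)

# For lam < 0, the stationary variance is sigma^2 / (2 |lam|) = 0.5 here.
print(x.var())
```

The exact one-step update avoids any time-discretization bias, which matters when such convolutions feed the non-Markovian coefficients discussed below.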

Throughout this section we assume that only a subset of unresolved modes in the decomposition (6) of HH are stochastically forced according to Eq. (12), namely that the following condition holds:

  • (H)

    σj=0\sigma_{j}=0 for 1jmc1\leq j\leq m_{c} in (12), and there exists at least one index kk with mc+1kNm_{c}+1\leq k\leq N such that σk0\sigma_{k}\neq 0.

In this case, the BF system (35) becomes

dp=A𝔠pds,s[tτ,t],\displaystyle\mathrm{d}p=A_{\mathfrak{c}}p\,\mathrm{d}s,\qquad\qquad\qquad\qquad\;\;\;s\in[t-\tau,t], (39)
dqn=(λnqn+ΠnG2(p)+ΠnG3(p))ds\displaystyle\mathrm{d}q_{n}=\Big{(}\lambda_{n}q_{n}+\Pi_{n}G_{2}\big{(}p\big{)}+\Pi_{n}G_{3}\big{(}p\big{)}\Big{)}\,\mathrm{d}s
+σndWsn,s[tτ,t],\displaystyle\hskip 73.97733pt+\sigma_{n}\,\mathrm{d}W_{s}^{n},\;\;\;\;s\in[t-\tau,t],
with p(s)|s=t=XH𝔠, and qn(s)|s=tτ=Y.\displaystyle\mbox{with }p(s)|_{s=t}=X\in H_{\mathfrak{c}},\mbox{ and }q_{n}(s)|_{s=t-\tau}=Y.

The solution qnq_{n} to Eq. (39) at s=ts=t provides a parameterization Φn(τ,X,t)\Phi_{n}(\tau,X,t) aimed at approximating the nnth high-mode amplitude, un(t)=Πnu(t)=u(t),𝒆nu_{n}(t)=\Pi_{n}u(t)=\langle u(t),\bm{e}_{n}^{\ast}\rangle, of the SPDE solution uu when XX is equal to the low-mode amplitude u𝔠(t)=Π𝔠u(t)u_{\mathfrak{c}}(t)=\Pi_{\mathfrak{c}}u(t). The BF system (39) can be solved analytically. Its solution provides the stochastic parameterization, Φn\Phi_{n}, expressed, for nmc+1n\geq m_{c}+1, as the following stochastic nonlinear mapping of XX in H𝔠H_{\mathfrak{c}}:

Φn\displaystyle\Phi_{n} (τ,X,t;ω)\displaystyle(\tau,X,t;\omega) (40)
=qn(t,X,ω)\displaystyle=q_{n}(t,X,\omega)
=eλnτY+σn(Wtn(ω)eτλnWtτn(ω))\displaystyle=e^{\lambda_{n}\tau}Y+\sigma_{n}\big{(}W^{n}_{t}(\omega)-e^{\tau\lambda_{n}}W^{n}_{t-\tau}(\omega)\big{)}
+Zτn(t;ω)+i,j=1mc(Dijn(τ)BijnXiXj)\displaystyle\;\;+Z^{n}_{\tau}(t;\omega)+\sum_{i,j=1}^{m_{c}}\Bigl{(}D^{n}_{ij}(\tau)B_{ij}^{n}X_{i}X_{j}\Bigr{)}
+i,j,k=1mc(Eijkn(τ)CijknXiXjXk),\displaystyle\;\;+\sum_{i,j,k=1}^{m_{c}}\Big{(}E_{ijk}^{n}(\tau)C_{ijk}^{n}X_{i}X_{j}X_{k}\Big{)},

with X==1mcX𝒆.X=\sum_{\ell=1}^{m_{c}}X_{\ell}\bm{e}_{\ell}.

The stochastic term ZτnZ^{n}_{\tau} is given by

Zτn(t;ω)=σnλneλnttτteλnsWsn(ω)ds,Z^{n}_{\tau}(t;\omega)=\sigma_{n}\lambda_{n}e^{\lambda_{n}t}\int_{t-\tau}^{t}e^{-\lambda_{n}s}W_{s}^{n}(\omega)\,\mathrm{d}s, (41)

and as such, is dependent on the noise’s path and the “past” of the Wiener process forcing the nn-th scale: it conveys exogenous memory effects, i.e. it is non-Markovian in the sense of [110].

The coefficients BijnB_{ij}^{n} and CijknC_{ijk}^{n} in (40) are given respectively by

Bijn=G2(𝒆i,𝒆j),𝒆n,Cijkn=G3(𝒆i,𝒆j,𝒆k),𝒆n,\displaystyle B_{ij}^{n}=\langle G_{2}(\bm{e}_{i},\bm{e}_{j}),\bm{e}^{\ast}_{n}\rangle,\;C_{ijk}^{n}=\langle G_{3}(\bm{e}_{i},\bm{e}_{j},\bm{e}_{k}),\bm{e}_{n}^{\ast}\rangle, (42)

while the coefficients Dijn(τ)D^{n}_{ij}(\tau) and Eijkn(τ)E_{ijk}^{n}(\tau) are given by

Dijn(τ)={1exp(δijnτ)δijn,if δijn0,τ,otherwise,\displaystyle D_{ij}^{n}(\tau)=\begin{cases}\frac{1-\exp\big{(}-\delta_{ij}^{n}\tau\big{)}}{\delta_{ij}^{n}},&\text{if $\delta_{ij}^{n}\neq 0$},\\ \tau,&\text{otherwise},\end{cases} (43)

and

Eijkn(τ)={1exp(δijknτ)δijkn,if δijkn0,τ,otherwise,\displaystyle E_{ijk}^{n}(\tau)=\begin{cases}\frac{1-\exp\big{(}-\delta_{ijk}^{n}\tau\big{)}}{\delta_{ijk}^{n}},&\text{if $\delta_{ijk}^{n}\neq 0$},\\ \tau,&\text{otherwise},\end{cases} (44)

with δijn=λi+λjλn\delta_{ij}^{n}=\lambda_{i}+\lambda_{j}-\lambda_{n} and δijkn=λi+λj+λkλn\delta_{ijk}^{n}=\lambda_{i}+\lambda_{j}+\lambda_{k}-\lambda_{n}. From Eqns. (43) and (44), one observes that the parameter τ\tau (depending on nn) allows, in principle, for balancing the small denominators due to small spectral gaps, i.e. when the δijn\delta_{ij}^{n} or the δijkn\delta_{ijk}^{n} are small. This attribute is shared with the parameterization formulas obtained by the BF approach in the deterministic context; see [28, Remark III.1]. It plays an important role in the ability of our parameterizations to handle physically relevant situations with a weak time-scale separation; see Section V.5 below.
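The balancing effect can be seen concretely: both (43) and (44) are instances of the integral of exp(-delta u) over [0, tau], which remains bounded by tau for any spectral gap delta, whereas the tau-to-infinity value 1/delta blows up as delta shrinks. A minimal sketch (the helper name memory_coeff is ours):

```python
import numpy as np

def memory_coeff(delta, tau):
    """Common form of the coefficients in Eqs. (43)-(44):
    int_0^tau exp(-delta * u) du, with the delta = 0 case handled explicitly."""
    if delta == 0.0:
        return tau
    return (1.0 - np.exp(-delta * tau)) / delta

# Small spectral gap delta: the tau -> infinity value 1/delta = 1000 blows up,
# while a finite tau keeps the coefficient bounded (close to tau itself).
tau = 2.0
print(memory_coeff(1e-3, tau))   # close to tau = 2, far from 1/delta = 1000
print(memory_coeff(0.0, tau))    # continuous limit: exactly tau
```

This is the mechanism by which a finite, optimized tau tames small denominators in weak time-scale separation regimes.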

Remark III.1

In connection with the stochastic convolution ξt,τ=tτte(ts)A𝔣Π𝔣d𝐖s\xi_{t,\tau}=\int_{t-\tau}^{t}e^{(t-s)A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}\,\mathrm{d}\bm{W}_{s} in Eq. (33), we note that the stochastic terms in Eq. (40), involving the processes WtnW^{n}_{t}, WtτnW^{n}_{t-\tau}, and ZτnZ^{n}_{\tau}, can be rewritten as a scalar stochastic integral, more precisely:

tτteλn(ts)dWsn=WtneτλnWtτn+σn1Zτn(t).\int_{t-\tau}^{t}e^{\lambda_{n}(t-s)}\,\mathrm{d}W^{n}_{s}=W^{n}_{t}-e^{\tau\lambda_{n}}W^{n}_{t-\tau}+\sigma_{n}^{-1}Z^{n}_{\tau}(t). (45)

Note that the RHS in Eq. (45) is simply the projection of ξt,τ\xi_{t,\tau} onto mode 𝐞n\bm{e}_{n} (Eq. (38)).

Equation (45) is a direct consequence of Itô’s formula applied to the product eλsWsne^{-\lambda s}W_{s}^{n} followed by integration over [tτ,t][t-\tau,t]. Indeed, we have (dropping the nn-dependence),

d(eλsWs)\displaystyle\,\mathrm{d}\Big{(}e^{-\lambda s}W_{s}\Big{)} =λeλsWsds+eλsdWs+d(eλs)dWs\displaystyle=-\lambda e^{-\lambda s}W_{s}\,\mathrm{d}s+e^{-\lambda s}\,\mathrm{d}W_{s}+\,\mathrm{d}(e^{-\lambda s})\,\mathrm{d}W_{s}
=λeλsWsds+eλsdWs,\displaystyle=-\lambda e^{-\lambda s}W_{s}\,\mathrm{d}s+e^{-\lambda s}\,\mathrm{d}W_{s},

by treating dsdWs\,\mathrm{d}s\,\mathrm{d}W_{s} as zero due to Itô calculus [111].

By introducing now δW=WteλτWtτ\delta W=W_{t}-e^{\lambda\tau}W_{t-\tau} and denoting by JJ the stochastic convolution tτteλ(ts)dWs\int_{t-\tau}^{t}e^{\lambda(t-s)}\,\mathrm{d}W_{s}, we arrive, owing to Eq. (41) in which we set σn=1\sigma_{n}=1 to simplify, at

eλtδW+eλtZτ=eλtJ,e^{-\lambda t}\delta W+e^{-\lambda t}Z_{\tau}=e^{-\lambda t}J, (46)

which gives the relation J=δW+ZτJ=\delta W+Z_{\tau}, consistent with Eq. (45).

In the expression of Φn\Phi_{n} given by Eq. (40), we opt for the formulation involving the Riemann–Stieltjes integral ZτnZ^{n}_{\tau} for practical purposes. The latter can indeed be readily generalized to the case of jump noise (Eqns. (99)–(100) below) without relying on stochastic calculus involving a jump measure (Section VII).
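The identity (45) is also easy to check pathwise on a discretized Brownian path. The sketch below uses illustrative values of lambda, tau, and t, sets sigma_n = 1, and approximates the Ito integral and the Riemann–Stieltjes integral by left-point sums:

```python
import numpy as np

rng = np.random.default_rng(2)

# Pathwise check of Eq. (45) with sigma_n = 1 (illustrative values):
#   J := int_{t-tau}^t e^{lam (t-s)} dW_s = (W_t - e^{lam tau} W_{t-tau}) + Z_tau(t),
# where Z_tau(t) = lam e^{lam t} int_{t-tau}^t e^{-lam s} W_s ds, as in Eq. (41).
lam, tau, t, dt = -2.0, 1.0, 1.5, 1e-4
s = np.arange(t - tau, t + dt / 2, dt)
dW = rng.standard_normal(len(s) - 1) * np.sqrt(dt)
# Offset the path so that W_{t-tau} != 0: a constant shift cancels between
# delta_W and Z, which makes the check non-trivial.
W = 0.37 + np.concatenate(([0.0], np.cumsum(dW)))

J = np.sum(np.exp(lam * (t - s[:-1])) * dW)                          # Ito sum
delta_W = W[-1] - np.exp(lam * tau) * W[0]
Z = lam * np.exp(lam * t) * np.sum(np.exp(-lam * s[:-1]) * W[:-1]) * dt

print(abs(J - (delta_W + Z)))   # small: discretization error only
```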

In what follows we drop the dependence of Φn\Phi_{n} on ω\omega, and sometimes on tt, unless specified. The parameterization of the neglected scales then writes as

Φ𝝉(X,t)=nmc+1Φn(τn,X,t)𝒆n,\Phi_{\bm{\tau}}(X,t)=\sum_{n\geq m_{c}+1}\Phi_{n}(\tau_{n},X,t)\bm{e}_{n},\\ (47)

with 𝝉=(τn)nmc+1{\bm{\tau}}=(\tau_{n})_{n\geq m_{c}+1}, and Φn\Phi_{n} defined in Eq. (40).

We observe that letting the τn\tau_{n} approach infinity in Eq. (47) recovers Eq. (33) (in its τ\tau\rightarrow\infty limit) when G=G2+G3G=G_{2}+G_{3} and Y=0Y=0. This implies that the manifold constructed from Eq. (47) can be viewed as a homotopic deformation of the SIM approximation derived from Eq. (33), controlled by the parameters τn\tau_{n}.

This connection motivates utilizing the variational framework presented in [27, 28] to identify the “optimal” stochastic manifold within the family defined by Eq. (47). In Section III.3, we demonstrate that a simple data-driven minimization of a least-squares parameterization error, applied to a single training path, yields trained parameterizations (in terms of τ\tau) with remarkable predictive power. For instance, these parameterizations can be used to infer ensemble statistics of SPDEs for a large set of paths unseen during training.

However, unlike the deterministic case, the stochastic framework necessitates addressing the efficient simulation of the coefficients involved in Eq. (47) for optimization purposes. This challenge is tackled in the following subsection.

III.2 Non-Markovian path-dependent coefficients: Efficient simulation

Our stochastic parameterization given by (47) contains random coefficients Zτn(t;ω)Z^{n}_{\tau}(t;\omega), each of them involving an integral over the history of the Brownian motion, making them non-Markovian in the sense of [110], i.e. dependent on the noise path’s history. We present below an efficient means to compute these random coefficients by solving auxiliary ODEs with path-dependent coefficients, thereby avoiding the cumbersome computation of integrals over noise paths that would otherwise need to be updated at each time tt. This approach is used for the following purposes:

  • (i)

    To find the optimal vector 𝝉{\bm{\tau}}^{\ast} made of optimal backward-forward integration times and form in turn the stochastic optimal parameterization Φ𝝉\Phi_{{\bm{\tau}}^{\ast}}, and

  • (ii)

    To efficiently simulate the corresponding optimal reduced system built from Φ𝝉\Phi_{{\bm{\tau}}^{\ast}} along with its path-dependent coefficients.

According to (41), the computation of ZτnZ^{n}_{\tau} boils down to the computation of the random integral

eλntτ+tteλnsWsn(ω)ds,e^{\lambda_{n}t}\int_{-\tau+t}^{t}e^{-\lambda_{n}s}W_{s}^{n}(\omega)\,\mathrm{d}s,

which is of the following form:

Iτj(t,ω;κ)=eκtτ+tteκsWsj(ω)ds,I_{\tau}^{j}(t,\omega;\kappa)=e^{\kappa t}\int_{-\tau+t}^{t}e^{-\kappa s}W_{s}^{j}(\omega)\,\mathrm{d}s, (48)

where τ>0\tau>0, and jmc+1j\geq m_{c}+1 due to assumption (H). We only need to consider the case κ<0\kappa<0, since the unstable modes are here included within the space H𝔠H_{\mathfrak{c}} of resolved scales.

By taking the time derivative of both sides of (48), we obtain that Iτj(t,ω;κ)I_{\tau}^{j}(t,\omega;\kappa) satisfies the following ODE with path-dependent coefficients, also called a random differential equation (RDE):

dIdt=κI+Wtj(ω)eκτWtτj(ω).\frac{\,\mathrm{d}I}{\,\mathrm{d}t}=\kappa I+W_{t}^{j}(\omega)-e^{\kappa\tau}W_{t-\tau}^{j}(\omega). (49)

Since κ<0\kappa<0, the linear part κI\kappa I in (49) brings a stable contribution to (49). As a result, Iτj(t,ω;κ)I_{\tau}^{j}(t,\omega;\kappa) can be computed by integrating (49) forward in time starting at t=0t=0 with initial datum given by:

I(0,ω)=Iτj(0,ω;κ),I(0,\omega)=I_{\tau}^{j}(0,\omega;\kappa), (50)

where Iτj(0,ω;κ)I_{\tau}^{j}(0,\omega;\kappa) is computed using (48).

Now, to determine Zτn(t;ω)Z^{n}_{\tau}(t;\omega) in (41) it is sufficient to observe that

Zτn(t;ω)=σnλnIτn(t,ω;λn).Z^{n}_{\tau}(t;\omega)=\sigma_{n}\lambda_{n}I_{\tau}^{n}(t,\omega;\lambda_{n}). (51)

The random coefficient Zτn(t;ω)Z^{n}_{\tau}(t;\omega) is thus computed by using (51) after Iτn(t,ω;λn)I_{\tau}^{n}(t,\omega;\lambda_{n}) is computed by solving, forward in time, the corresponding RDE of the form (49) with initial datum

I(0,ω)=Iτn(0,ω;λn)=τ0eλnsWsn(ω)ds.I(0,\omega)=I_{\tau}^{n}(0,\omega;\lambda_{n})=\int_{-\tau}^{0}e^{-\lambda_{n}s}W_{s}^{n}(\omega)\,\mathrm{d}s. (52)

This way, we compute the integral (52) only once and then propagate it through Eq. (49) to evaluate Zτn(t;ω)Z^{n}_{\tau}(t;\omega), instead of computing for each tt the integral in the definition (41). The resulting procedure is thus much more efficient to handle numerically than a direct computation of Zτn(t;ω)Z^{n}_{\tau}(t;\omega) based on the integral formulation (41), which would involve a careful and computationally more expensive quadrature at each time-step. Note that a similar treatment has been adopted in [29, Chap. 5.3] for the computation of the time-dependent coefficients arising in stochastic parameterizations of SPDEs driven by linear multiplicative noise, and in [28, Sec. III.C.2] for the case of non-autonomous forcing.
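A minimal sketch of this RDE route (illustrative values of kappa and tau; W_at is a hypothetical grid-lookup helper): initialize with the quadrature (52), propagate with forward Euler on Eq. (49), and compare against a direct quadrature of (48) at a later time:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch of the RDE route (49)-(52) for I_tau(t; kappa), kappa < 0.
kappa, tau, dt, T = -1.5, 0.5, 1e-3, 2.0
grid = np.arange(-tau, T + dt / 2, dt)              # path needed on [-tau, T]
W = np.concatenate(([0.0],
                    np.cumsum(rng.standard_normal(len(grid) - 1) * np.sqrt(dt))))

def W_at(time):
    """Grid lookup of the Brownian path (hypothetical helper)."""
    return W[int(round((time + tau) / dt))]

def I_direct(time):
    """Direct left-point quadrature of the integral (48)."""
    ss = np.arange(time - tau, time, dt)
    return np.exp(kappa * time) * dt * sum(np.exp(-kappa * x) * W_at(x) for x in ss)

# Initialize with the quadrature (52), then propagate with forward Euler on (49).
I = I_direct(0.0)
t = 0.0
while t < T - dt / 2:
    I += dt * (kappa * I + W_at(t) - np.exp(kappa * tau) * W_at(t - tau))
    t += dt

print(abs(I - I_direct(T)))   # small: the quadrature is needed only once, at t = 0
```

The propagated value stays close to the direct quadrature while requiring only two path lookups per time step instead of a full integral.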

III.3 Data-informed optimization and training path

Thanks to Section III.2, we are now in a position to propose an efficient variational approach to optimize the stochastic parameterizations of Section III.1, in view of handling parameter regimes located away from the instability onset and thus a greater wealth of possible stochastic transitions.

Given a solution path u(t)u(t) to Eq. (36), available over an interval ITI_{T} of length TT, we aim at determining the optimal parameterization Φn\Phi_{n} (given by (40)) that minimizes, in the τ\tau-variable, the following parameterization defect

𝒬n(τ,T)=|un(t)Φn(τ,u𝔠(t),t)|2¯,\mathcal{Q}_{n}(\tau,T)=\overline{\big{|}u_{n}(t)-\Phi_{n}(\tau,u_{\mathfrak{c}}(t),t)\big{|}^{2}}, (53)

for each nmc+1n\geq m_{c}+1. Here ()¯\overline{(\cdot)} denotes the time-mean over ITI_{T} while un(t)u_{n}(t) and u𝔠(t)u_{\mathfrak{c}}(t) denote the projections of u(t)u(t) onto the high-mode 𝒆n\bm{e}_{n} and the reduced state space H𝔠H_{\mathfrak{c}}, respectively. Figure 1 provides a schematic of the stochastic optimal parameterizing manifold (OPM) found this way.

The objective is obviously not to optimize the parameterizations Φn\Phi_{n} for every solution path, but rather to optimize Φn\Phi_{n} on a single solution path, called the training path, and then to use the optimized parameterization for predicting dynamical behaviors for any other noise path, or at least in some statistical sense. To do so, the optimized τn\tau_{n}-values (denoted by τn\tau_{n}^{\ast} below) are used to build the resulting stochastic OPM given by (47), whose optimized non-Markovian coefficients are aimed at encoding efficiently the interactions between the nonlinear terms and the noise in the course of time.
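As a toy illustration of this training step (an entirely hypothetical, noise-free setting): take a slow resolved amplitude x(t) = sin(0.2 t) (mimicking a near-zero low-mode eigenvalue), let the unresolved amplitude y solve dy/dt = lam_n y + x^2, and parameterize y by Phi(tau, x) = D(tau) x^2 with D(tau) from Eq. (43); a grid search over tau on the single simulated path then minimizes the defect (53):

```python
import numpy as np

# Hypothetical noise-free toy: slow resolved amplitude x(t), unresolved
# amplitude y solving dy/dt = lam_n y + x(t)^2, with lam_n < 0.
lam_n, dt, T = -0.6, 1e-3, 200.0
ts = np.arange(0.0, T, dt)
x = np.sin(0.2 * ts)

y = np.empty_like(ts)
y[0] = 0.0
for k in range(len(ts) - 1):                 # the single "training path"
    y[k + 1] = y[k] + dt * (lam_n * y[k] + x[k] ** 2)

def defect(tau):
    """Parameterization defect (53) for Phi(tau, x) = D(tau) x^2,
    with D(tau) the delta = -lam_n case of Eq. (43)."""
    D = (1.0 - np.exp(lam_n * tau)) / (-lam_n)
    return np.mean((y - D * x ** 2) ** 2)

taus = np.arange(0.1, 20.0, 0.1)
tau_star = taus[np.argmin([defect(tau) for tau in taus])]
print(tau_star)   # a finite optimal tau is selected
```

A finite tau_star is selected here because x varies on a finite time scale: the tau-to-infinity (SIM-like) coefficient 1/|lam_n| overshoots, while a very small tau underestimates the fast-mode response.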

Sections V and VI below illustrate how this single-path training strategy can be used efficiently to predict, from the corresponding reduced systems, the statistical behavior as time and/or the noise path is varied. The next section delves into the justification of this variational approach and the theoretical characterization of the notion of stochastic OPM by relying on ergodic arguments. The reader interested in applications can jump directly to Sections V and VI.

Figure 1: Panel A: This panel illustrates a training solution path (black curve) for the stochastic optimal parameterization method. This path is typically transverse to the underlying stochastic manifolds (red curves), as commonly occurs in the reduction of stochastic systems using such objects (see [31, Fig. 2] for a concrete simple example). The optimal parameterization (denoted by Φn(τn,X,t)\Phi_{n}(\tau_{n}^{\ast},X,t)) aims to approximate the actual state (pink dots) of the system’s solution (represented by the black curve) at any given time (here labeled as t1t_{1} and t2t_{2}). A key strength of this parameterization is its ability to capture the solution’s average behavior through its expected value, 𝔼(Φn)\mathbb{E}(\Phi_{n}) (gray curve). Panel B: This panel schematically shows how the “goodness-of-fit” (parameterization defect, 𝒬n\mathcal{Q}_{n}, defined by Eq. (53)) changes with the parameter τ\tau. The red asterisk marks the value τn\tau_{n}^{\ast} that minimizes this defect.

IV Non-Markovian Optimal Parameterizations and Invariant Measures

The problem of ergodicity and mixing for dissipative PDEs subject to a stochastic external force has been an active research topic over the last two decades. It is rather well understood in the case when all deterministic modes are forced; see e.g. [112, 113]. The situation in which only finitely many modes are forced, as considered in this study, is much more subtle when it comes to proving unique ergodicity. The works [114, 109] provide answers in such a degenerate situation based on a theory of hypoellipticity for nonlinear SPDEs. In parallel, the work [115], generalizing the asymptotic coupling method introduced in [116], allows for covering a broad class of nonlinear stochastic evolution equations, including stochastic delay equations.

Building upon these results, we proceed under the assumption of a unique ergodic invariant measure, denoted by \mu. Given a reduced state space H_{\mathfrak{c}}, we demonstrate in this section the following:

  • (i)

    The path-dependence of the non-Markovian optimal reduced model arises from the random measure \rho_{\omega} obtained by disintegrating \mu over the underlying probability space \Omega.

  • (ii)

    The non-Markovian optimal reduced model provides an averaged system in H_{\mathfrak{c}} that is still path-dependent. For a given noise path \omega, it yields the reduced system in H_{\mathfrak{c}} that averages out the unresolved variables with respect to the disintegration of \rho_{\omega} over H_{\mathfrak{c}}. A detailed statement is provided in Theorem IV.2.

IV.1 Theoretical insights

Adopting the framework of random dynamical systems (RDSs) [50], recall that an RDS, S(t,\omega), is said to be white noise if the associated "past" and "future" \sigma-algebras \mathcal{F}^{-} (see (56)) and \mathcal{F}^{+} are independent. Given a white noise RDS, S(t,\omega), the relations between random invariant measures for S and invariant measures for the associated Markov semigroup P_{t} are well known [117]. We recall these relationships below and enrich the discussion with Lemma IV.1 from [118].

If Eq. (1) generates a white noise RDS, S(t,\omega), associated with a Markov semigroup P_{t} having \mu as an invariant measure, then the following limit, taken in the weak topology,

\rho_{\omega}=\lim_{t\rightarrow\infty}S(t,\theta_{-t}\omega)\mu, \qquad (54)

exists \mathbb{P}-a.s., and provides a random probability measure that is S-invariant in the sense that

S(t,\omega)\rho_{\omega}=\rho_{\theta_{t}\omega},\quad\mathbb{P}\mbox{-a.s.},\;\;t\geq 0. \qquad (55)

A random probability measure \rho_{\omega} satisfying (55) is called below a (random) statistical equilibrium. Recall that \theta_{t} denotes the standard ergodic transformation on the set of Wiener paths \Omega, defined through the helix identity [50, Def. 2.3.6], W_{s}(\theta_{t}\omega)=W_{t+s}(\omega)-W_{t}(\omega), for \omega in \Omega.
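The helix identity can be checked directly on a discretized Brownian path. The sketch below (grid sizes and the shift index are illustrative) restarts a sampled path at a grid time t_k and verifies that the shifted path agrees with the increments W_{t+s}-W_{t}:

```python
import numpy as np

# Numerical check of the helix identity W_s(theta_t w) = W_{t+s}(w) - W_t(w)
# on a Brownian path sampled on a uniform grid (all sizes are illustrative).
rng = np.random.default_rng(0)
dt, n_steps = 1e-3, 4000
W = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n_steps))])

def shift(W, k):
    """Path of the shifted noise theta_{t_k} w on the remaining grid points:
    the path is restarted at time t_k and re-anchored at zero."""
    return W[k:] - W[k]

k = 1000                       # shift by t = k*dt
W_shifted = shift(W, k)
s = np.arange(W_shifted.size)  # indices of the times s on the common grid
assert np.allclose(W_shifted, W[k + s] - W[k])  # helix identity
assert W_shifted[0] == 0.0     # the shifted path again starts at zero
```

The identity holds by construction here; the point is that the shift \theta_{t} acts on noise paths exactly through this re-anchoring of increments.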

A statistical equilibrium \rho_{\omega} defined as above is furthermore Markovian [119], in the sense that it is measurable with respect to the past \sigma-algebra

\mathcal{F}^{-}=\sigma\{\omega\mapsto S(\tau,\theta_{-t}\omega)\;:\;0\leq\tau\leq t\}; \qquad (56)

see [117, Prop. 4.2]; see also [50, Theorems 1.7.2 and 2.3.45].

Note that the limit (54) exists in the sense that for every bounded measurable function f:H\rightarrow\mathbb{R}, the real-valued stochastic process

(t,\omega)\mapsto\int_{H}f(x)\,\mathrm{d}\big(S(t,\theta_{-t}\omega)\mu\big)=\int_{H}f(S(t,\theta_{-t}\omega)x)\,\mathrm{d}\mu, \qquad (57)

is a bounded martingale, and therefore converges \mathbb{P}-a.s. by Doob's first martingale convergence theorem; see e.g. [111, Thm. C.5, p. 302]. This implies, in particular, the \mathbb{P}-almost sure convergence of the measure-valued random variable \omega\mapsto S(t,\theta_{-t}\omega)\mu in the topology of weak convergence of Pr_{\Omega}(H), which in turn implies convergence in the narrow topology of Pr_{\Omega}(H); see [120, Chap. 3].

Reciprocally, given a Markovian S-invariant random measure \rho_{\omega} of an RDS for which the \sigma-algebras \mathcal{F}^{-} and \mathcal{F}^{+} are independent, the probability measure \mu defined by

\mu=\mathbb{E}(\rho_{\bullet})=\int_{\Omega}\rho_{\omega}\,\mathrm{d}\mathbb{P}(\omega), \qquad (58)

is an invariant measure of the Markov semigroup P_{t}; see [121, Thm. 1.10.1].

In the case of a unique (and thus ergodic) measure \mu satisfying P_{t}\mu=\mu, this one-to-one correspondence implies that any random probability measure \nu_{\omega} different from the random measure \rho_{\omega} obtained from (54) is non-Markovian; in particular, \int_{\Omega}\nu_{\omega}\,\mathrm{d}\mathbb{P}\neq\mu, since otherwise we would have \nu_{\omega}=\rho_{\omega}, \mathbb{P}-a.s.

In case a random attractor \mathcal{A}(\omega) exists, since the latter is a forward invariant compact random set, the Markovian random measure \rho_{\omega} must be supported by \mathcal{A}(\omega); see [120, Thm. 6.17] (see also [122, Prop. 4.5]). Thus, in the case of a unique ergodic measure for P_{t}, we conclude that there exists a unique Markovian S-invariant random measure \rho_{\omega} supported by \mathcal{A}, and that all other S-invariant random measures (also necessarily supported by \mathcal{A}) are non-Markovian.

RDSs whose future and past \sigma-algebras are independent are generated by many stochastic systems in practice. This is the case, for instance, for a broad class of stochastic differential equations (SDEs); see [50, Sect. 2.3]. The generation of white noise RDSs in infinite dimensions is much less well understood, and it is beyond the scope of this article to address this question.

Instead, we point out below that there exist other ways to associate with an RDS, not necessarily of white-noise type, a meaningful random probability measure \rho_{\omega} that still satisfies, for instance, (58). This is indeed the case if the RDS satisfies a weak mixing property with respect to a (non-random) probability measure \mu on H, as the following lemma from [118] shows:

Lemma IV.1

Assume there exists \mu in Pr(H) satisfying the following weak mixing property

\frac{1}{t}\int_{0}^{t}\,\mathrm{d}s\int_{H}\mathbb{E}f(S(s,\theta_{-s}\bullet)u)\,\mathrm{d}\mu\underset{t\rightarrow\infty}{\longrightarrow}\int_{H}f\,\mathrm{d}\mu, \qquad (59)

for each bounded, continuous f:H\rightarrow\mathbb{R}.

Then there exists an \mathcal{F}_{0}-measurable random probability measure \rho_{\omega} given by

\rho_{\theta_{t}\omega}=\frac{1}{t}\int_{0}^{t}S(s,\theta_{-s}\omega)\mu\,\mathrm{d}s, \qquad (60)

that satisfies

\mathbb{E}(\rho_{\bullet})=\int_{\Omega}\rho_{\omega}\,\mathrm{d}\mathbb{P}(\omega)=\mu, \qquad (61)

and is S-invariant in the sense of (55), i.e. it is a statistical equilibrium.

For a proof of Lemma IV.1 see that of [118, Lemma 2.5].

Thus, whether built from an invariant measure of the Markov semigroup (in the case of a white-noise RDS) or from a weakly mixing measure in the sense of (59), a statistical equilibrium satisfying (58) can be naturally associated with Eq. (36) as long as the appropriate assumptions are satisfied. At this level of generality, we do not address the important question of sufficient conditions on the linear part A and the nonlinear terms that ensure, for an SPDE like Eq. (36), the generation of an RDS that is either of white-noise type or weakly mixing in the sense of (59); see [118] for examples in the latter case.

We also introduce the following notion of pullback parameterization defect.

Definition IV.1

Given a measurable parameterization \Phi:\Omega\times H_{\mathfrak{c}}\rightarrow H_{\mathfrak{f}} and a mild solution u of Eq. (36), the pullback parameterization defect associated with \Phi over the interval [0,T] and for a noise path \omega is defined as:

\mathcal{Q}_{T}(\omega,\Phi)=\frac{1}{T}\int_{0}^{T}\left\lVert u_{\mathfrak{f}}(t,\theta_{-t}\omega)-\Phi(\omega,u_{\mathfrak{c}}(t,\theta_{-t}\omega))\right\rVert^{2}\,\mathrm{d}t. \qquad (62)
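In practice, such a defect is evaluated from discrete trajectory data. The sketch below (the toy trajectories and the candidate maps Phi are illustrative stand-ins for the resolved and unresolved components of a solution) computes the time-averaged squared misfit and shows that a better parameterization yields a smaller defect:

```python
import numpy as np

# Minimal sketch of the parameterization defect (62) computed from discrete
# trajectory data; a discrete time average replaces the integral.
def parameterization_defect(u_c, u_f, Phi):
    """Discrete analogue of (1/T) * int ||u_f(t) - Phi(u_c(t))||^2 dt,
    for snapshot arrays u_c, u_f of shape (n_t, dim) and a vectorized Phi."""
    misfit = np.sum((u_f - Phi(u_c))**2, axis=1)
    return float(np.mean(misfit))

# Toy data: the "unresolved" part is roughly a quadratic map of the
# "resolved" part, plus small noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 2001)
u_c = np.column_stack([np.cos(t), np.sin(t)])
u_f = 0.5 * u_c**2 + 0.05 * rng.normal(size=u_c.shape)

Q_good = parameterization_defect(u_c, u_f, lambda X: 0.5 * X**2)
Q_zero = parameterization_defect(u_c, u_f, lambda X: 0.0 * X)
assert Q_good < Q_zero  # the better parameterization has a smaller defect
```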

Given the statistical equilibrium \rho_{\omega}, we denote by \mathfrak{m}_{\omega} the push-forward of \rho_{\omega} by the projector \Pi_{\mathfrak{c}} onto H_{\mathfrak{c}}, namely

\mathfrak{m}_{\omega}(B)=\rho_{\omega}(\Pi_{\mathfrak{c}}^{-1}(B)),\;\;B\in\mathcal{B}(H_{\mathfrak{c}}), \qquad (63)

where \mathcal{B}(H_{\mathfrak{c}}) denotes the family of Borel sets of H_{\mathfrak{c}}, i.e. the family of sets that can be formed from open sets (for the topology on H_{\mathfrak{c}} induced by the norm \|\cdot\|_{H_{\mathfrak{c}}}) through the operations of countable union, countable intersection, and relative complement.

The random measure \mathfrak{m}_{\omega} allows us to consider the following functional space of parameterizations, \mathcal{X}=L^{2}_{\mathfrak{m}}(\Omega\times H_{\mathfrak{c}};H_{\mathfrak{f}}), defined as

\mathcal{X}=\Big\{\Phi:\Omega\times H_{\mathfrak{c}}\rightarrow H_{\mathfrak{f}}\ \textrm{measurable}\;\Big|\;\int_{H_{\mathfrak{c}}}\|\Phi(\omega,X)\|^{2}\,\mathrm{d}\mathfrak{m}_{\omega}(X)<\infty\;\textrm{a.s.}\Big\}, \qquad (64)

namely the Hilbert space of H_{\mathfrak{f}}-valued random functions of the resolved variables X in H_{\mathfrak{c}} that are square-integrable with respect to \mathfrak{m}_{\omega}, almost surely.

We are now in a position to formulate the main result of this section, namely the following generalization to the stochastic context of Theorem 4 from [27].

Theorem IV.1

Assume that one of the following properties holds:

  • (i)

    The RDS generated by Eq. (36) is of white-noise type and the Markov semigroup associated with Eq. (1) admits a (unique) ergodic invariant measure \mu.

  • (ii)

    The RDS generated by Eq. (36) is weakly mixing in the sense of (59).

Let us denote by \rho_{\omega} the weak limit of \mu defined in (54) in case (i), and the statistical equilibrium ensured by Lemma IV.1 in case (ii). We denote by \rho_{\omega}^{X} the disintegration of \rho_{\omega} on the small-scale subspace H_{\mathfrak{f}}, conditioned on the coarse-scale variable X.

Assume that the small-scale variable Y has finite energy in the sense that

\int\int\left\lVert Y\right\rVert^{2}\,\mathrm{d}\rho_{\omega}\,\mathrm{d}\mathbb{P}<\infty. \qquad (65)

Then the minimization problem

\underset{\Phi\in\mathcal{X}}{\min}\int_{\Omega}\int_{(X,Y)\in H_{\mathfrak{c}}\times H_{\mathfrak{f}}}\left\lVert Y-\Phi(\omega,X)\right\rVert^{2}\,\mathrm{d}\rho_{\omega}(X,Y)\,\mathrm{d}\mathbb{P}(\omega), \qquad (66)

possesses a unique minimizer, given by

\Phi^{\ast}(\omega,X)=\int_{H_{\mathfrak{f}}}Y\,\mathrm{d}\rho_{\omega}^{X}(Y),\;\;X\in H_{\mathfrak{c}},\;\;\mathbb{P}\mbox{-a.s.} \qquad (67)

Furthermore, if the RDS possesses a pullback attractor \mathcal{A}(\omega) in H, and \rho_{\omega} is pullback mixing in the sense that for all u in H,

\underset{T\rightarrow\infty}{\lim}\frac{1}{T}\int_{0}^{T}f(\omega,S(t,\theta_{-t}\omega)u)\,\mathrm{d}t=\int_{\mathcal{A}(\omega)}f(\omega,v)\,\mathrm{d}\rho_{\omega}(v),\quad f(\omega,\cdot)\in C_{b}(H),\;\mathbb{P}\mbox{-a.s.}, \qquad (68)

then \mathbb{P}-almost surely

\underset{T\rightarrow\infty}{\lim}\mathcal{Q}_{T}(\omega,\Phi^{\ast})\leq\underset{T\rightarrow\infty}{\lim}\mathcal{Q}_{T}(\omega,\Phi),\;\;\Phi\in\mathcal{X}. \qquad (69)

Proof. The proof follows the same lines as that of [27, Theorem 4]. It consists of replacing:

  • (i)

    The probability measure \mu therein by the probability measure \rho on \Omega\times H that is naturally associated with the family of random measures \rho_{\omega} and whose marginal on \Omega is given by \mathbb{P}; see [117, Prop. 3.6],

  • (ii)

    The function space L^{2}_{\mu}(H_{\mathfrak{c}}\times H_{\mathfrak{f}};H_{\mathfrak{f}}) therein by the function space \mathcal{E}=L^{2}_{\rho}(\Omega\times H_{\mathfrak{c}}\times H_{\mathfrak{f}};H_{\mathfrak{f}}).

By applying to the ambient Hilbert space \mathcal{E} the standard projection theorem onto closed convex sets [123, Theorem 5.2], one defines (given \Pi_{\mathfrak{c}}) the conditional expectation \mathbb{E}_{\rho}[g|\Pi_{\mathfrak{c}}] of g as the unique function in \mathcal{E} that satisfies the inequality

\mathbb{E}_{\rho}[\|g-\mathbb{E}_{\rho}[g|\Pi_{\mathfrak{c}}]\|^{2}]\leq\mathbb{E}_{\rho}[\|g-\Psi\|^{2}],\quad\mbox{for all }\Psi\in\mathcal{E}. \qquad (70)

Now, by applying the general disintegration theorem of probability measures to \rho (see [27, Eq. (3.18)]), we obtain the following explicit representation of the random conditional expectation

\mathbb{E}_{\rho}[g|\Pi_{\mathfrak{c}}](\omega,X)=\int_{H_{\mathfrak{f}}}g(\omega,X,Y)\,\mathrm{d}\rho_{\omega,X}(Y), \qquad (71)

with \rho_{\omega,X} denoting the disintegrated measure of \rho over \Omega\times H_{\mathfrak{c}}. By noting that this disintegrated measure is the same as the disintegration \rho_{\omega}^{X} (over H_{\mathfrak{c}}) of the random measure \rho_{\omega}, we conclude, by taking g(\omega,X,Y)=Y as in the proof of [27, Theorem 4], that \Phi^{\ast} given by (67) solves the minimization problem (66).

The inequality (69) then results from the pullback mixing property (68) and the definition (62) of the pullback parameterization defect.
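The variational characterization (66)-(67) can be illustrated on synthetic data: among all maps of the resolved variable, the conditional expectation minimizes the mean-squared defect. In the sketch below, the noisy cubic data model and the binned estimator are purely illustrative:

```python
import numpy as np

# Synthetic illustration of the minimization (66): a binned estimate of the
# conditional expectation x -> E[Y | X = x] achieves a smaller mean-squared
# defect than, e.g., the best linear closure.
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, 20000)
Y = X**3 + 0.1 * rng.normal(size=X.size)

bins = np.linspace(-1.0, 1.0, 41)
idx = np.clip(np.digitize(X, bins) - 1, 0, len(bins) - 2)
cond_mean = np.array([Y[idx == b].mean() for b in range(len(bins) - 1)])

def defect(phi_vals):
    return float(np.mean((Y - phi_vals)**2))

Q_cond = defect(cond_mean[idx])                      # conditional expectation
Q_lin = defect(np.polyval(np.polyfit(X, Y, 1), X))   # best linear map
assert Q_cond <= Q_lin
```

In the theorem, the binned average is replaced by the disintegration \rho_{\omega}^{X}, which additionally depends on the noise path.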

IV.2 Non-Markovian optimal reduced model and conditional expectation

The mathematical framework of Section IV.1 allows us to provide a useful interpretation of the non-Markovian optimal reduced model for SPDEs of type (36). To simplify the presentation, we restrict ourselves here to the case G_{3}=0.

We then denote by G(v) the vector field in H formed by gathering the linear and nonlinear parts of Eq. (36) in this case, namely

G(v)=Av+G_{2}(v,v),\quad v\in H. \qquad (72)

The theorem formulated below characterizes the relationships between the non-Markovian optimal reduced model and the random conditional expectation associated with the projector \Pi_{\mathfrak{c}} onto the reduced state space H_{\mathfrak{c}}, i.e. the resolved modes. Its proof is almost identical to that of [27, Theorem 5], with only slight amendments, and the details are thus omitted.

Theorem IV.2

Consider the SPDE of type (36) with G_{3}=0. Let \rho_{\omega} be a (random) statistical equilibrium satisfying either (54) (in case (i) of Theorem IV.1) or ensured by Lemma IV.1 (in case (ii) of Theorem IV.1).

Then, under the conditions of Theorem IV.1, the random conditional expectation associated with \Pi_{\mathfrak{c}},

\mathbb{E}_{\rho_{\theta_{t}\omega}}[G|\Pi_{\mathfrak{c}}](X)=\int_{Y\in H_{\mathfrak{f}}}G(X+Y)\,\mathrm{d}\rho_{\theta_{t}\omega}^{X}(Y),\;X\in H_{\mathfrak{c}},

satisfies

\mathbb{E}_{\rho_{\theta_{t}\omega}}[G|\Pi_{\mathfrak{c}}](X)=A_{\mathfrak{c}}X+\Pi_{\mathfrak{c}}G_{2}(X,X)+2\Pi_{\mathfrak{c}}G_{2}(X,\Phi^{\ast}(\theta_{t}\omega,X))+\langle G_{2}(Y,Y)\rangle_{\rho_{\theta_{t}\omega}^{X}}, \qquad (73)

with

\langle G_{2}(Y,Y)\rangle_{\rho_{\theta_{t}\omega}^{X}}=\int_{Y\in H_{\mathfrak{f}}}\Pi_{\mathfrak{c}}G_{2}(Y,Y)\,\mathrm{d}\rho_{\theta_{t}\omega}^{X}(Y), \qquad (74)

in which \rho_{\omega}^{X} denotes the disintegration of \rho_{\omega} over H_{\mathfrak{c}}.

Then, given the reduced state space H_{\mathfrak{c}} and the statistical equilibrium \rho_{\omega}, if \langle G_{2}(Y,Y)\rangle_{\rho_{\theta_{t}\omega}^{X}}=0, the non-Markovian optimal reduced model of Eq. (36) with noise acting in the "orthogonal direction" of H_{\mathfrak{c}} (assumption (H) in Section III.1) is given by:

\,\mathrm{d}X=\mathbb{E}_{\rho_{\theta_{t}\omega}}[G|\Pi_{\mathfrak{c}}](X)\,\mathrm{d}t. \qquad (75)

Hereafter, we simply refer to Eq. (75) as the (random) conditional expectation.

IV.3 Practical implications

Thus, Theorem IV.2 teaches us that an approximation of the (actual) non-Markovian optimal parameterization \Phi^{\ast} given by (67) ultimately provides an approximation of the conditional expectation involved in Eq. (75).

The non-Markovian optimal parameterization involves, by its very definition, averaging with respect to the unknown probability measure \rho_{\omega}. As such, designing a practical approximation scheme with rigorous error estimates remains challenging. Near the instability onset, the non-Markovian optimal parameterization simplifies to the stochastic invariant manifold. In this regime, probabilistic error estimates have been established for a wide range of SPDEs, including those relevant to fluid dynamics [31]. However, obtaining such guarantees becomes significantly more difficult away from the instability onset, even for low-dimensional SDEs [124].

This is where the data-informed optimization approach of Section III.3 offers a practical solution within a relevant class of parameterizations. The variational inequality (69) provides a key insight: the parameterization defect serves as a good measure of a parameterization's quality. Lower defect values indicate a better parameterization, one expected to yield a reduced model closer to the (theoretical) non-Markovian optimal reduced model defined by Eq. (75).

Even with a very good approximation of the actual non-Markovian optimal parameterization \Phi^{\ast}, a key question remains: under what conditions does the theoretical conditional expectation (Eq. (75)) provide a sufficient closure of the system on its own? We address this question in Sections V and VI through concrete examples. Analyzing two universal models of non-equilibrium phase transitions, we demonstrate that the non-Markovian reduced models constructed using the formulas from Section III become particularly relevant for understanding and predicting the underlying stochastic transitions when the cutoff scale exceeds the scales forced stochastically (Sections V.3, V.4, V.5 and VI.4).

To deal with these non-equilibrium systems, we consider the class \mathcal{P} of parameterizations for approximating the true non-Markovian optimal parameterization \Phi^{\ast}. This class consists of continuous deformations, away from the instability onset, of parameterizations that are valid near this onset (as discussed in Section III.1).

Our numerical results demonstrate that this class \mathcal{P} effectively models and approximates the true non-Markovian terms in the optimal reduced model (Eq. (75)) across various scenarios. After learning an optimal parameterization within the class \mathcal{P} using a single noise realization, the resulting approximation of the optimal reduced model typically becomes an ODE system with coefficients dependent on the specific noise path. These path-dependent coefficients encode the interactions between noise and nonlinearities, as exemplified by the reduced model (85) and its analogue in Section VI below, with their ability to predict noise-induced transitions within the reduced state spaces.

V Predicting Stochastic Transitions with Non-Markovian Reduced Models

V.1 Noise-induced transitions in a stochastic Allen-Cahn model

We consider now the following stochastic Allen-Cahn Equation (sACE) [125, 126, 56]

\,\mathrm{d}u=(\partial_{x}^{2}u+u-u^{3})\,\mathrm{d}t+\mathrm{d}{\bm{W}_{t}},\quad 0<x<L,\;t>0, \qquad (76)

with homogeneous Dirichlet boundary conditions. The ambient Hilbert space is H=L^{2}(0,L) endowed with its natural inner product, denoted by \langle\cdot,\cdot\rangle.

This equation and its variants have a long history. The deterministic part of the equation is the gradient flow of the Ginzburg-Landau free energy functional [127]

\mathcal{E}(u)=\int_{0}^{L}\left(\frac{|\partial_{x}u(x)|^{2}}{2}+V(u(x))\right)\,\mathrm{d}x, \qquad (77)

with the potential V given by the standard double-well function V(u)=(u^{2}-1)^{2}/4. Indeed, one can readily check that -D\mathcal{E}(u)=\partial_{x}^{2}u+u-u^{3}, where D\mathcal{E} denotes the Fréchet derivative of \mathcal{E} at u. The sACE provides a general framework for studying pattern formation and interface dynamics in systems ranging from liquid crystals and ferromagnets to tumor growth and cell membranes [1, 57].
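The gradient-flow identity can be verified numerically. The sketch below (grid size and test profiles are illustrative) discretizes the energy (77) on a uniform grid with Dirichlet boundary conditions and checks that the directional derivative of the energy matches the pairing with DE(u) = -(u_xx + u - u^3):

```python
import numpy as np

# Finite-difference check of -DE(u) = u_xx + u - u^3: for perturbations v
# vanishing at the boundary, E(u + eps*v) - E(u - eps*v) ≈ 2*eps*<DE(u), v>.
L, n = 3.9 * np.pi, 512
x = np.linspace(0.0, L, n + 2)  # grid including the boundary points
h = x[1] - x[0]

def energy(u):
    grad_term = np.sum((u[1:] - u[:-1])**2) / (2.0 * h)  # int |u_x|^2 / 2
    V = 0.25 * (u**2 - 1.0)**2
    pot_term = h * (0.5 * V[0] + np.sum(V[1:-1]) + 0.5 * V[-1])
    return grad_term + pot_term

def DE(u):  # gradient of E at the interior points: -(u_xx + u - u^3)
    u_xx = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2
    return -(u_xx + u[1:-1] - u[1:-1]**3)

u = 0.7 * np.sin(np.pi * x / L) + 0.2 * np.sin(2.0 * np.pi * x / L)
v = np.sin(np.pi * x / L)
eps = 1e-6
lhs = (energy(u + eps * v) - energy(u - eps * v)) / (2.0 * eps)
rhs = h * np.sum(DE(u) * v[1:-1])
assert abs(lhs - rhs) < 1e-4
```

With these matched discretizations, the discrete gradient of the discrete energy coincides with the discrete Laplacian plus V'(u), so the two sides agree up to finite-difference noise.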

Its universality makes it relevant across diverse fields, contributing to a unified understanding of these complex phenomena. By incorporating the spatio-temporal noise term \,\mathrm{d}{\bm{W}_{t}} into the deterministic Allen-Cahn equation, the sACE accounts for the inherent fluctuations and uncertainty present in real systems [56]. This allows for a more realistic description of transitions between phases, providing valuable insights beyond what purely deterministic models can offer. Near critical points, where phases coexist, the sACE reveals the delicate balance between deterministic driving forces and stochastic fluctuations. Studying these transitions helps us understand critical phenomena in diverse systems, from superfluid transitions to magnetization reversal in magnets [1, 56].

Phase transitions for the sACE and its variants have indeed been analyzed both theoretically, using the large deviation principle [128], and numerically, through different rare-event algorithms. For instance, in [129] the expectation of the transition time between two metastable states is rigorously derived for a class of 1D parabolic SPDEs with bistable potential that include the sACE as a special case. The phase diagram and various transition paths have also been computed by either an adaptive multilevel splitting algorithm [130] or a minimum-action method [131]; the latter deals with a nongradient generalization of the sACE that includes a constant force as well as a nonlocal term.

Despite its importance in numerous physical applications, the reduction problem for the accurate reproduction of transition paths in the sACE has received limited attention, with existing works focusing primarily on scenarios near instability onset and low noise intensity (e.g., [132, 103, 102, 133, 134]).

In this study, we consider the sACE placed away from the instability onset, after several bifurcations have taken place and multiple steady states coexist; see Fig. 2. This corresponds to the parameter regime

L=3.9\pi,\;q=4,\;N=8, \qquad (78)

in which the parameters q and N indicate the modes forced stochastically in Eq. (76), according to

{\bm{W}_{t}}(x,\omega)=\sum_{j=q+1}^{N}\sigma_{j}W_{t}^{j}(\omega)\bm{e}_{j}(x),\;t\in\mathbb{R}, \qquad (79)

where we set \sigma_{j}=\sigma=0.2 for q+1\leq j\leq N. Note that, as in Section III.1, the \bm{e}_{j} denote the eigenmodes of the operator Au=\partial_{x}^{2}u+u (with homogeneous Dirichlet boundary conditions), given here by

\bm{e}_{j}(x)=\sqrt{\frac{2}{L}}\sin\left(\frac{j\pi x}{L}\right),\quad j\in\mathbb{N}, \qquad (80)

with corresponding eigenvalues \lambda_{j}=1-j^{2}\pi^{2}/L^{2}. We refer to Appendix B for the numerical details regarding the simulation of Eq. (76).
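The spectrum in the regime (78) is immediate to tabulate, and confirms that the cutoff q=4 separates the unstable modes from the damped, stochastically forced ones:

```python
import numpy as np

# Eigenvalues lambda_j = 1 - (j*pi/L)^2 of A = d^2/dx^2 + 1 (Dirichlet BCs)
# in the regime (78): L = 3.9*pi gives exactly three unstable modes, in line
# with the three successive bifurcations described in the text.
L = 3.9 * np.pi
lam = np.array([1.0 - (j * np.pi / L)**2 for j in range(1, 9)])
print(np.round(lam, 3))
assert np.all(lam[:3] > 0.0)  # e_1, e_2, e_3 are unstable
assert np.all(lam[3:] < 0.0)  # e_4, ..., e_8 are damped (and forced)
```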

More specifically, the domain size L is chosen so that when \sigma=0 the trivial steady state has experienced three successive bifurcations, leading to a total of seven steady states with two stable nodes (\phi_{1}^{+} and \phi_{1}^{-}) and five saddle points (0, \phi_{2}^{a}, \phi_{2}^{b}, \phi_{3}^{a}, and \phi_{3}^{b}). These steady states are characterized here by their zeros: the saddle states are of sign-changing type while the stable ones are of constant sign; see [126, 135]. Note also that the same conclusion holds with either Neumann or periodic boundary conditions; see e.g. [129, Figure 2]. In our notation, a and b denote the string of sign changes over (0,L), with the leftmost symbol in a being +. For instance, for \phi_{3}^{a} (resp. \phi_{3}^{b}), a=\{+,-,+\} (resp. b=\{-,+,-\}), while a=\{+,-\} (resp. b=\{-,+\}) for \phi_{2}^{a} (resp. \phi_{2}^{b}).

By analyzing the energy distribution of these steady states across the eigenmodes (see Table 1), one observes that \phi_{1}^{\pm} are nearly collinear to \bm{e}_{1}, while \phi_{2}^{a/b} are nearly collinear to \bm{e}_{2}, and \phi_{3}^{a/b} to \bm{e}_{3}.

Table 1: Energy distribution of the steady states across the first few eigenmodes

                  \bm{e}_1   \bm{e}_2   \bm{e}_3   \bm{e}_4   \bm{e}_5   \bm{e}_6   \bm{e}_7   \bm{e}_8
\phi_1^{\pm}      94.69%     0          4.79%      0          0.46%      0          0.05%      0
\phi_2^{a/b}      0          99.16%     0          0          0          0.83%      0          0
\phi_3^{a/b}      0          0          99.93%     0          0          0          0          0

A sketch of the bifurcation diagram as L is varied (for \sigma=0) is shown in Fig. 2A; the vertical dashed line marks the domain size L=3.9\pi considered in our numerical simulations. For this domain size, the system exhibits three unstable modes whose interplay with the nonlinear terms yields the deterministic flow structure depicted in Fig. 2B. There, the deterministic global attractor is made of the seven steady states mentioned above (black filled/empty circles) connected by heteroclinic orbits, shown as solid/dashed curves with arrows. On this attractor we superimpose a solution path of the stochastic model (76) emanating from u_{0}=0 (light grey "rough" curve). We adopt Arnol'd's convention for this representation: black filled circles indicate the stable steady states while the empty ones indicate the saddle/unstable steady states.

Figure 2: Panel A: Bifurcation diagram of Eq. (76) (without noise) as L is varied; see [135]. The dashed vertical line marks the parameter regime analyzed here. Panel B: Schematic of the deterministic flow structure for this parameter regime. Shown is the deterministic global attractor, on which a solution path of the sACE is superimposed.

These steady states organize the stochastic dynamics when the noise is turned on: a solution path of Eq. (76) typically transits from one metastable state to another, the latter corresponding to profiles in physical space that are random perturbations of an underlying steady state's profile; see the insets of Fig. 4 below. For the parameter setting (78), we observe though, over a large ensemble of noise paths, that the SPDE's dynamics exhibits transition paths connecting u_{0}=0 to neighborhoods of the steady states \phi_{1}^{\pm} and \phi_{2}^{a/b}, but does not meander near \phi_{3}^{a/b} over the interval [0,T] considered (T=40).

Figure 3: Typical and rare transition paths for Eq. (76). One path connects u_{0}=0 to a sign-changing metastable state (typical, left panel), and one path connects u_{0}=0 to a constant-sign metastable state (rare event, right panel). Also shown are the large-scale (u_{\mathfrak{c}}) and small-scale (u_{\mathfrak{f}}) contributions.

Noteworthy is the visual rendering of the sojourn times of the solution path shown in Fig. 2B, which are more substantial in the neighborhood of \phi_{2}^{a} than in that of \phi_{1}^{+}. This depicts a solution path of Eq. (76) over [0,T]: the trajectory spends most of its time near a sign-changing metastable state (\phi_{2}^{a}) rather than near the constant-sign one (\phi_{1}^{+}). Large-ensemble simulations of the sACE for the parameter regime (78) reveal that what is shown here for a single path is actually observed at the final time t=T as the noise path is varied: the sign-changing states \phi_{2}^{a/b} are the rule whereas the constant-sign states \phi_{1}^{\pm} are the exceptions, corresponding here to rare events. Figure 3 shows two solution paths in the space-time domain connecting u_{0}=0 to either a sign-changing metastable state (typical) or a constant-sign metastable state (rare event). In Section V.3 below we discuss the mechanisms behind these patterns and their transition phenomenology, and the extent to which the non-Markovian reduced models described below are able to reproduce them.

V.2 Non-Markovian optimal reduced models

Recall that our general goal is the derivation of reduced models able to reproduce the emergence of noise-driven spatiotemporal patterns and to predict their transitions, triggered by the subtle coupling between noise and the nonlinear dynamics.

For the sACE (76) at hand, this goal takes the following form: to derive reduced models able not only to accurately estimate ensemble statistics (averages across many realizations) of the states reached at a given time t=T, but also to anticipate the system's typical metastable states as well as those reached through the rarest transition paths pointed out above.

To address this issue, we place ourselves in the challenging situation in which the cutoff scale is larger than the scales forced stochastically, i.e. the case where noise acts only in the unresolved part of the system. We show below that the stochastic parameterization framework of Section III allows us to derive such an effective reduced model. This model is built from the stochastic parameterization (40) obtained from the BF system applied to Eqns. (76) and (79), and is optimized, on a single training path, following Section III.3. We describe below the details of this derivation.

Thus, we take the reduced state space to be spanned by the first q unforced eigenmodes, i.e.

H_{\mathfrak{c}}=\mathrm{span}\{\bm{e}_{1},\cdots,\bm{e}_{q}\}. \qquad (81)

In other words, the reduced state space is spanned by low modes that are not forced stochastically; i.e. q=m_{c} in (79). The subspace H_{\mathfrak{f}} is taken to be the orthogonal complement of H_{\mathfrak{c}} in L^{2}(0,L) and contains the stochastically forced modes; see (79).

Equation (76) fits into the stochastic parameterization framework of Section III (see Eq. (36)) with Au=\partial_{x}^{2}u+u subject to homogeneous Dirichlet boundary conditions, G_{2}(u)=0, and with G_{3} denoting the cubic Nemytskii operator defined as G_{3}(u,v,w)(x)=-u(x)v(x)w(x) for any u,v and w in L^{2}(0,L), and x in (0,L).

The resulting stochastic parameterization \Phi_{n} given by (40) (with Y=0) then takes the following form:

\Phi_{n}(\tau,X,t;\omega)=\sigma_{n}\big(W^{n}_{t}(\omega)-e^{\tau\lambda_{n}}W^{n}_{t-\tau}(\omega)\big)+Z^{n}_{\tau}(t;\omega)+\sum_{i,j,k=1}^{q}E_{ijk}^{n}(\tau)C_{ijk}^{n}X_{i}X_{j}X_{k}, \qquad (82)

for each n with q+1\leq n\leq N. Here E_{ijk}^{n} is given in (44), and Z^{n}_{\tau}(t;\omega) is given by (41).

Since the linear operator A is here self-adjoint, we have \bm{e}^{*}_{n}=\bm{e}_{n}, and the coefficients C_{ijk}^{n} are given by

C_{ijk}^{n}=-\langle\bm{e}_{i}\bm{e}_{j}\bm{e}_{k},\bm{e}_{n}\rangle=-\int_{0}^{L}\bm{e}_{i}(x)\bm{e}_{j}(x)\bm{e}_{k}(x)\bm{e}_{n}(x)\,\mathrm{d}x, \qquad (83)

where 1\leq i,j,k\leq q, and q+1\leq n\leq N.
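These coefficients are elementary integrals of products of four sine modes. A product-to-sum computation shows that C^n_{ijk} is nonzero only when \pm i\pm j\pm k\pm n=0 for some choice of signs; the sketch below (grid resolution is illustrative) checks this selection rule by quadrature:

```python
from itertools import product
import numpy as np

# Quadrature evaluation of the interaction coefficients (83) for the sine
# modes (80), together with the triad selection rule +-i +-j +-k +-n = 0.
L = 3.9 * np.pi
x = np.linspace(0.0, L, 20001)
h = x[1] - x[0]
mode = lambda j: np.sqrt(2.0 / L) * np.sin(j * np.pi * x / L)

def C(i, j, k, n):
    f = mode(i) * mode(j) * mode(k) * mode(n)
    return -h * np.sum(f)  # trapezoid rule (integrand vanishes at 0 and L)

def resonant(i, j, k, n):
    return any(i + s1 * j + s2 * k + s3 * n == 0
               for (s1, s2, s3) in product((-1, 1), repeat=3))

assert not resonant(1, 1, 1, 4) and abs(C(1, 1, 1, 4)) < 1e-8
assert resonant(1, 1, 2, 4) and abs(C(1, 1, 2, 4)) > 1e-3  # 1+1+2-4 = 0
```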

For each n with q+1\leq n\leq N, the free parameter \tau in (82) is optimized over a single, common training path u(t) solving the sACE, by minimizing \mathcal{Q}_{n}(\tau) in (53) with \Phi_{n} given by (82); see Section III.3. To do so, we follow the procedure described in Section III.2 to simulate efficiently the required random coefficients (here the Z^{n}-terms), solving in particular the random ODEs (49).
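The mechanics of this single-path selection step can be illustrated on a toy problem. The sketch below does not use the actual formulas (40)-(41): it scores a one-parameter family of finite-memory parameterizations, Phi_tau(t) = sigma * int_{t-tau}^{t} exp(lam*(t-s)) dW_s, against an "unresolved" Ornstein-Uhlenbeck path driven by the same noise, and selects the memory length by grid search on the defect:

```python
import numpy as np

# Toy single-path training: grid search of a memory length tau by minimizing
# a discrete analogue of the defect (53). All parameters are illustrative.
rng = np.random.default_rng(3)
lam, sigma, dt, n = -2.0, 0.4, 1e-3, 100000
dW = rng.normal(0.0, np.sqrt(dt), n)

z = np.zeros(n)  # reference unresolved path: dz = lam*z*dt + sigma*dW
for i in range(1, n):
    z[i] = z[i - 1] + lam * z[i - 1] * dt + sigma * dW[i - 1]

def defect(tau):
    m = int(tau / dt)                              # memory window in steps
    kernel = sigma * np.exp(lam * dt * np.arange(m))
    phi = np.convolve(dW, kernel)[:n]              # truncated convolution
    return np.mean((z[m:] - phi[m:])**2)           # time-averaged misfit

taus = np.array([0.1, 0.25, 0.5, 1.0, 2.0])
Q = np.array([defect(t) for t in taus])
tau_star = taus[np.argmin(Q)]
assert Q.min() < Q[0]  # the selected memory length beats the shortest one
```

In this toy the defect simply decreases with the memory length; in the trained OPM, the minimization on a solution path transverse to the stochastic manifolds produces an interior minimizer \tau_{n}^{\ast}, as sketched in Fig. 1B.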

Once {\bm{\tau}}^{*}=(\tau^{*}_{q+1},\ldots,\tau^{*}_{N}) is obtained, the corresponding q-dimensional optimal reduced system in the class \mathcal{P} (see Section IV.3) is given componentwise by

\dot{y}_{i}=\lambda_{i}y_{i}-\Big{\langle}\Big{(}\sum_{j=1}^{q}y_{j}\bm{e}_{j}+\sum_{n=q+1}^{N}\Phi_{n}(\tau^{*}_{n},\bm{y},t;\omega)\bm{e}_{n}\Big{)}^{3},\bm{e}_{i}\Big{\rangle}, (84)

where i=1,\ldots,q and \bm{y}=(y_{1},\ldots,y_{q})^{\textrm{T}}.

The optimal reduced system (84), also referred to as the OPM closure hereafter, can be further expanded as

\dot{y}_{i}=\lambda_{i}y_{i}+\sum_{j,k,\ell=1}^{q}C_{jk\ell}^{i}y_{j}y_{k}y_{\ell} (85)
+3\sum_{j,k=1}^{q}\sum_{n=q+1}^{N}C_{jkn}^{i}y_{j}y_{k}\Phi_{n}(\tau^{*}_{n},\bm{y},t;\omega)
+3\sum_{j=1}^{q}\sum_{n,n^{\prime}=q+1}^{N}C_{jnn^{\prime}}^{i}y_{j}\Phi_{n}(\tau^{*}_{n},\bm{y},t;\omega)\Phi_{n^{\prime}}(\tau^{*}_{n^{\prime}},\bm{y},t;\omega)
+\sum_{n,n^{\prime},n^{\prime\prime}=q+1}^{N}C_{nn^{\prime}n^{\prime\prime}}^{i}\Phi_{n}(\tau^{*}_{n},\bm{y},t;\omega)\Phi_{n^{\prime}}(\tau^{*}_{n^{\prime}},\bm{y},t;\omega)\Phi_{n^{\prime\prime}}(\tau^{*}_{n^{\prime\prime}},\bm{y},t;\omega).

Here also, the coefficients C_{ijk}^{\ell} are those defined in (83) for 1\leq i,j,k,\ell\leq N. Note that the Z^{n}-terms required to simulate Eq. (85) are advanced in time by solving the corresponding Eq. (49). A total of N-q such equations are solved, each providing the single random coefficient associated with one parameterized mode. Thus, a total of N=q+(N-q) ODEs with random coefficients are solved to obtain a reduced model of the amplitudes of the first q modes. We call Eq. (85), together with its auxiliary equations (49), the optimal reduced model for the sACE.

A compact form of Eq. (85) can be written as follows:

\dot{\bm{y}}=\Pi_{\mathfrak{c}}A\bm{y}+\Pi_{\mathfrak{c}}G_{3}\left(\bm{y}+\Phi_{{\bm{\tau}}^{\ast}}(\bm{y},t;\omega)\right),\;\bm{y}\in H_{\mathfrak{c}}, (86)
\Phi_{{\bm{\tau}}^{\ast}}(\bm{y},t;\omega)=\sum_{n=q+1}^{N}\Phi_{n}(\tau_{n}^{\ast},\bm{y},t;\omega)\bm{e}_{n},

with \Phi_{n} defined in Eq. (82). We refer to Appendix B for the numerical details regarding the simulation of Eq. (86).

Noteworthy is that an approximation of the sACE solution paths can be obtained from the solutions \bm{y}(t,\omega) to the optimal reduced model. Such an approximation consists of simply lifting the surrogate low-mode amplitude \bm{y}(t,\omega), simulated from Eq. (86), to H_{\mathfrak{f}} via the optimal parameterization \Phi_{{\bm{\tau}}^{\ast}} defined in Eq. (86). This approximation then takes the form

u^{\mathrm{opm}}(t,x,\omega)=\sum_{j=1}^{q}y_{j}(t,\omega)\bm{e}_{j}(x)+\sum_{n=q+1}^{N}\Phi_{n}(\tau^{*}_{n},\bm{y}(t,\omega),t;\omega)\bm{e}_{n}(x). (87)

The insets of Fig. 7A below show the approximation skill achieved by this formula, obtained by solving only the reduced equation (86).
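The lifting step of Eq. (87) amounts to a plain modal synthesis. A minimal sketch is given below, using the sine eigenmodes of A=\partial_{x}^{2}+1 with Dirichlet boundary conditions as the basis e_n; the domain length and the amplitude values are illustrative placeholders for the actual output of Eq. (86):

```python
import math

L = 1.0  # domain length (illustrative value)

def e(n, x):
    """Normalized sine eigenmodes of d^2/dx^2 + 1 with Dirichlet BCs on (0, L)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def lift(y, phi_vals, x):
    """u_opm(t, x) of Eq. (87): resolved modal part plus the lifted,
    parameterized unresolved part."""
    resolved = sum(y[j] * e(j + 1, x) for j in range(len(y)))
    unresolved = sum(phi_vals[k] * e(len(y) + k + 1, x)
                     for k in range(len(phi_vals)))
    return resolved + unresolved

y = [0.3, -0.1, 0.05, 0.0]            # surrogate low-mode amplitudes (modes 1..q)
phi_vals = [0.01, -0.02, 0.0, 0.005]  # parameterized amplitudes (modes q+1..N)
u_mid = lift(y, phi_vals, 0.5 * L)
```

In an actual run, `y` would come from integrating Eq. (86) and `phi_vals` from evaluating \Phi_{n}(\tau^{*}_{n},\bm{y},t;\omega) at the same time instant.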

Figure 4: Ginzburg-Landau free energy functional. Here, the Ginzburg-Landau free energy functional \mathcal{E}(u) given by (77) is shown over the reduced state space made of the two dominant modes’ amplitudes. A few solution paths navigating across this energy landscape (over [0,T]) are shown. The “shallower” local minima correspond to \phi_{2}^{a/b} (sign-change profile), while the deeper ones correspond to \phi_{1}^{\pm} (constant-sign profile). The insets show examples of metastable states, reached at t=T by the sACE, located near these steady states. The rare paths are those trapped in the deeper wells due to the degenerate noise employed here; see Text.
Figure 5: Non-Markovian terms and rare path’s small-scale component. Panel A shows the small-scale component of the solution to the sACE, experiencing a rare transition path. Panel B shows the OPM parameterization of this small-scale component over the interval (0,10), before the transition takes place. Panels C to F show the ability of the optimal reduced model, Eq. (85), to predict the large fluctuations experienced by the large-scale modes’ amplitudes beyond this interval.
Figure 6: Non-Markovian terms and typical path’s small-scale component. Same as Fig. 5 but for a typical path.

V.3 Why energy flows in a certain direction: The role of non-Markovian effects

The sACE (76) is integrated over a large ensemble of noise paths (10^{6} members), from u_{0}=0 at t=0 up to a time t=T. In our simulations with a fixed time horizon (T=40), reaching a constant-sign state at the end (t=T) is a rare event, even though these states correspond to the deeper energy wells in the system’s landscape (as shown in Fig. 4). Conversely, sign-change states, which reside in shallower wells, are more common. This might seem counterintuitive from an energy perspective. Nevertheless, it can be explained by the degenerate noise (Eq. (79)) used to force the sACE: only a specific group of unresolved (stable) modes is excited by noise, and this may limit the system’s ability to explore the deeper energy wells efficiently within the simulation timeframe.

Over time (as t increases), the noise term in the sACE interacts with the nonlinear terms, gradually affecting the unforced resolved modes. Our experiments show that the amplitude of the sACE’s solution carried by \bm{e}_{2} grows the fastest for most noise paths. Because \phi_{2}^{a/b} is nearly aligned with \bm{e}_{2} (as shown in Table 1), this means that for most noise paths, the solution is expected to reach a metastable state close to \phi_{2}^{a/b} when the nonlinear effects become significant (nonlinear saturation kicks in).

Understanding why energy primarily accumulates in the \bm{e}_{2} direction is complex. It stems from how noise initially affecting unresolved modes (modes 5 to 8) propagates through nonlinear interactions, gradually influencing larger-scale resolved modes. As the initial state (u_{0}=0) lacks energy, this noise-induced energy transfer to larger scales takes some time. Our analysis reveals that during the interval I=(0,10), the solution is dominated by these small-scale features inherited from the noise forcing (Table 2 and Fig. 3).
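Modal energy shares of the kind reported in Table 2 below can be computed directly from the mode amplitudes. A small helper, assuming each amplitude is available as a time series over the window of interest (the time-averaged normalization is our assumption for illustration):

```python
def energy_fractions(amplitudes):
    """Fraction of (time-averaged) energy carried by each mode:
    mean(u_n^2) / sum_m mean(u_m^2), one entry per mode."""
    energies = [sum(v * v for v in series) / len(series)
                for series in amplitudes]
    total = sum(energies)
    return [en / total for en in energies]

# Toy amplitudes: mode 1 silent, mode 2 weak, mode 3 dominant.
fractions = energy_fractions([[0.0] * 10, [0.1] * 10, [0.3] * 10])
```

Applied to the amplitudes u_1(t), ..., u_8(t) over (0, 10), this yields the kind of row displayed for the rare and typical paths in Table 2.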

This finding has significant implications for closure methods: to accurately reproduce both rare and typical transitions, the underlying parameterization must effectively represent the system’s small-scale response to stochastic forcing before the transition occurs.

Figures 5 and 6 illustrate how the optimal parameterization and reduced model (Eq. (86)) capture the subtle differences in the small-scale component that ultimately lead to rare or typical events in the full sACE system. Notably, for both rare (Figure 5) and typical paths (Figure 6), the low-mode amplitudes of the sACE solution are near zero during the interval I=(0,10) (Panels C-F and Table 2). Therefore, it is the spatial patterns of the small-scale component, rather than its energy content, that control the triggering of rare or typical events (Figure 3A and 3D). The successful parameterization of these small-scale features (Figures 5B and 6B) is crucial for the optimal reduced model’s ability to reproduce the key behaviors of the sACE’s large-scale dynamics.

The non-Markovian terms are essential for accurately capturing these small-scale patterns before a rare or typical event occurs. When large-scale mode amplitudes are near zero during interval I (Table 2), the optimal parameterization (Eq. (86)) is dominated by the (optimized) non-Markovian field \xi_{t} given by:

\xi_{t}(\omega,x)=\sum_{n=q+1}^{N}M_{n}(t,\tau_{n}^{\ast},\omega)\bm{e}_{n}(x),\mbox{ with} (88)
M_{n}(t,\tau_{n}^{\ast},\omega)=\sigma_{n}\big{(}W^{n}_{t}(\omega)-e^{\tau_{n}^{\ast}\lambda_{n}}W^{n}_{t-\tau_{n}^{\ast}}(\omega)\big{)}+Z^{n}_{\tau_{n}^{\ast}}(t;\omega).
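On a discretized Brownian path, the Brownian part of each M_n term in Eq. (88) is immediate to evaluate. The sketch below omits the Z^n contribution of Eq. (41) (whose simulation requires the auxiliary random ODEs (49)) and uses illustrative parameter values:

```python
import math
import random

def memory_term(W, t_idx, tau_idx, sigma, lam, dt):
    """sigma * (W_t - exp(tau * lambda_n) * W_{t - tau}): the Brownian part
    of M_n in Eq. (88). For a stable mode (lam < 0), exp(tau * lam) < 1
    discounts the past value W_{t - tau}."""
    tau = tau_idx * dt
    return sigma * (W[t_idx] - math.exp(tau * lam) * W[t_idx - tau_idx])

# Discretized Brownian path W^n_t.
random.seed(0)
dt = 1e-3
W = [0.0]
for _ in range(10_000):
    W.append(W[-1] + random.gauss(0.0, math.sqrt(dt)))

m = memory_term(W, t_idx=5000, tau_idx=1000, sigma=0.2, lam=-5.0, dt=dt)
```

With zero memory (tau = 0) the two terms cancel exactly, consistent with the interpretation of tau as a finite memory length weighting the noise’s past influence.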

After this initial transient period, nonlinear effects become significant, and the large-scale mode amplitudes undergo substantial fluctuations. Panels C-F in Figures 5 and 6 demonstrate this behavior and the optimal reduced model’s ability to predict these large-scale fluctuations beyond interval I. The OPM reduced system’s ability to capture such transitions is rooted in its very structure. Its coefficients in Eq. (85) are nonlinear functionals of the non-Markovian M_{n}-terms within \xi_{t}, allowing for an accurate representation of the genuine nonlinear interactions between noise and nonlinear terms in the original sACE, which drive the fluctuations in the large-mode amplitudes (modes 1 to 4).

After a transition occurs and energy is transferred to the low modes, the stochastic component \xi_{t} and its non-Markovian terms become secondary, while the nonlinear terms in the stochastic parameterization (82) become crucial for capturing the average behavior of the sACE dynamics, as highlighted in Section V.5 below.

Table 2: Energy distribution over (0,10) for the sACE’s rare and typical paths
             \bm{e}_{1}  \bm{e}_{2}  \bm{e}_{3}  \bm{e}_{4}  \bm{e}_{5}  \bm{e}_{6}  \bm{e}_{7}  \bm{e}_{8}
Rare path    0      0.05%  0.01%  0      49.6%  26.1%  15.1%  9.2%
Typical path 0      0.22%  0      0      34.8%  33%    19.7%  12.4%
Figure 7: Comparison of metastable state statistics between the full sACE model (Eq. (76)) and the optimal reduced model (Eq. (86)) at time t=T. Panels B and D show the distribution of u(T,x,\cdot) from the sACE solution at t=T. Panels C and E show the corresponding approximation u^{\mathrm{opm}}(T,x,\cdot) obtained from the optimal reduced model’s solutions at t=T, leveraging Eq. (87). One million test paths are used to generate these statistics. See Text for more details.

V.4 Rare and typical events statistics from non-Markovian reduced models

Here, we evaluate the ability of our non-Markovian reduced model (Eq. (86)) to statistically predict the rare and typical transitions discussed in the previous section. We test the reduced model’s performance in this task using a vast ensemble of one million noise paths.

Recall that the optimal parameterization (\Phi_{n}, defined in (82)) is trained on a single, “typical” noise path (see Fig. 8A). In other words, we minimized the parameterization defect \mathcal{Q}_{n}(\tau) in equation (53) for this specific path. Our key objective here is to assess how well the reduced model (Eq. (86)) can predict the distribution of final states (metastable states at time T) across a much larger set of out-of-sample noise paths.

To do so, we look at a natural “min+max metric.” This metric consists of adding the minimum and the maximum values of the sACE’s solution profile at time T. The shape of this probability distribution shows that a value close to zero corresponds to a typical sign-change metastable state, while a value close to 1 or -1 corresponds to a rare constant-sign metastable state.
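The metric itself is elementary to compute on a discretized profile. The sketch below uses toy sine profiles as stand-ins for the metastable states, and a hypothetical threshold of 0.5 for the classification (both are our illustrative choices):

```python
import math

def min_plus_max(profile):
    """The "min+max" metric: sum of the extrema of the final-time profile.
    Near 0 -> sign-change (typical) state; near +1 or -1 -> constant-sign
    (rare) state."""
    return min(profile) + max(profile)

def classify(profile, threshold=0.5):
    m = min_plus_max(profile)
    return "rare (constant sign)" if abs(m) > threshold else "typical (sign change)"

# Toy stand-ins for the two kinds of metastable profiles on (0, 1).
xs = [i / 100 for i in range(101)]
sign_change = [math.sin(2 * math.pi * x) for x in xs]   # min + max ~ 0
constant_sign = [math.sin(math.pi * x) for x in xs]     # min + max ~ 1
```

Evaluating `min_plus_max` on each ensemble member at t = T and histogramming the results produces the kind of distribution discussed for Fig. 7A.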

Figure 8: The (normalized) parameterization defects, Q_{n}(\tau). Panel A: The parameterization \Phi_{n} given by (82) is trained offline on a (single) typical transition path taken over the interval (t_{s},T), with t_{s}=10 corresponding to the time at which nonlinear effects kick in (see Fig. 5). This optimized stochastic parameterization is used in the reduced system (86) to produce the online ensemble results of Fig. 7. Panel B: The \tau-dependence of Q_{n}(\tau) is shown on a rare transition path, for information purposes. See Section V.5 for more details.

Figure 7A demonstrates the effectiveness of the non-Markovian optimal reduced model in capturing rare events. The optimal reduced system (Eq. (86)) can predict noise-driven trajectories connecting an initial state (u_{0}=0) to rare final states (sACE solution profiles with a constant sign at time T) with statistically significant accuracy (see Fig. 7A). While these rare events occur less frequently in the reduced system compared to the full sACE (smaller red bumps compared to black ones near 1 and -1), the ability of the reduced equation (86) to predict them highlights the optimal parameterization’s success in capturing the system’s subtle interactions between the noise and nonlinear effects responsible for these rare events.

Thus, the optimal reduced system faithfully reproduces the probability distribution of both common (frequently occurring sign-change profiles) and rare (constant-sign profiles) final states at time T (Fig. 7A). This is particularly impressive considering the model was trained on just a single path, yet generalizes well to predict the behavior across a large ensemble of out-of-sample paths.

Figure 8A shows the training stage over a typical transition path. The optimization of the free parameter \tau in the stochastic parameterization \Phi_{n} (see Eq. (82)) is executed over a typical transition path, by minimizing the normalized parameterization defects Q_{n}(\tau,T)=\mathcal{Q}_{n}(\tau,T)/\overline{|u_{n}(t)|^{2}} (with \mathcal{Q}_{n}(\tau,T) defined in Eq. (53)) for all the relevant n (here n=5,\ldots,8). We observe that Q_{5} and Q_{6} exhibit clear minima, whereas for the other modes the parameterization defect is not substantially diminished at finite \tau-values compared to its asymptotic value (as \tau\rightarrow\infty). The same features are observed over a rare transition path; see Figure 8B. Recall that our reduced system is an ODE system whose stochasticity comes exclusively from its random coefficients, as the resolved modes in the full system are not directly forced by noise. It is this specificity that enables the reduced equations to effectively predict noise-induced transitions, including rare events. This remarkable capability relies on the optimized memory content embedded within the path-dependent, non-Markovian coefficients.

Thus, the optimized, non-Markovian and nonlinear reduced equation (85) is able to track efficiently how the noise, acting on the “unseen” part of the system, interacts with the resolved modes through these non-Markovian coefficients. While these coefficients effectively encode the fluctuations for a good reduction of the sACE system, they alone are insufficient to handle regimes with weak time-scale separation such as the one dealt with here. Understanding the average behavior of the resolved dynamics is equally important. The next section explores how our parameterization, particularly its nonlinear terms, plays a crucial role in capturing this average behavior.

V.5 Weak time-scale separation, nonlinear terms and conditional expectation

We assess the ability of our reduced system to capture this average dynamical behavior in the physical space. In that respect, Figures 7B and 7D show the expected profiles of the sACE’s states obtained at t=T (solid black curves) from the sACE (76), along with their standard deviation (grey shaded areas), for the ensemble used for Fig. 7A. Similarly, Figures 7C and 7E show the expected profiles (solid red curves) with their standard deviation as obtained at t=T from u^{\mathrm{opm}}(T,x,\cdot) defined in Eq. (87), once the low-mode amplitude is simulated by the optimal reduced model (86) and lifted to H_{\mathfrak{f}} by using the stochastic parameterization \Phi_{{\bm{\tau}}^{\ast}}.

Figure 9: Markovian optimal parameterization for the 5th and 6th modes’ amplitudes. Panel A: The black curve shows the time evolution of the 6th mode’s amplitude, u_{6}(t), obtained by solving the governing equation, Eq. (76), over the time interval 0 to 400. This 6th mode’s amplitude is shown against the corresponding first and second modes’ amplitudes, u_{1}(t) and u_{2}(t). For visualization purposes, the other solution components in the four-dimensional subspace H_{\mathfrak{c}}, u_{3}(t) and u_{4}(t), are fixed at their most probable values, p_{3} and p_{4}, respectively. The manifold shown is the underlying Markovian OPM for the 6th mode’s amplitude (Eq. (89) with n=6). Panel B: Same for the 5th mode’s amplitude u_{5}(t)=\langle u(t),\bm{e}_{5}\rangle and its corresponding Markovian OPM (Eq. (89) with n=5).

By conducting large-ensemble (online) simulations of our reduced model (Eq. (86)), we accurately reproduce the mean motion and second-order statistics of the original sACE. This success stems from our parameterization’s ability to effectively represent the average behavior of the unresolved variables during the offline training phase, as defined by the probability measure \mu in (58). Figure 9 illustrates this point for the fifth and sixth modes’ amplitudes of the sACE system. Notably, the nonlinear nature of the Markovian optimal parameterization, evident after taking the expectation (see Eq. (89) below), highlights the crucial role of nonlinear terms in our stochastic parameterization (Eq. (82)) in achieving this accurate representation.

The manifold shown in Fig. 9A (resp. Fig. 9B) is obtained as the graph of the following Markovian OPM for the 6th (resp. 5th) mode’s amplitude, given by the expectation of Eq. (82) with n=6 (resp. n=5) after optimization (Fig. 8),

(X_{1},X_{2})\mapsto\mathbb{E}[\Phi_{n}(\tau_{n}^{\ast},\bm{X})]=\sum_{i,j,k=1}^{q}\Big{(}E_{ijk}^{n}(\tau_{n}^{*})C_{ijk}^{n}X_{i}X_{j}X_{k}\Big{)}, (89)

for \bm{X}=(X_{1},X_{2},X_{3},X_{4}) with X_{3}=p_{3} and X_{4}=p_{4}, where p_{3} and p_{4} denote the most probable values of the 3rd and 4th solution components, u_{3}(t) and u_{4}(t), to enable visualization as a 2D surface while favoring a certain “typicalness” of the visualized manifold. Note that the Markovian OPM corresponds simply to the deterministic cubic term in (82) after replacing \tau by the optimal \tau_{n}^{*}, since the remaining stochastic terms sum to \sigma_{n}\int_{t-\tau_{n}^{*}}^{t}e^{\lambda_{n}(t-s)}\,\mathrm{d}W^{n}_{s}(\omega) (see Remark III.1), whose expectation is zero as a stochastic integral in the sense of Itô.

Without the optimization stage, the parameterizations of Section II suffer from deficient accuracy when the memory content is taken to be \tau=\infty, due to the small values that the spectral gaps \delta_{ijk}^{n} take in denominators. As a reminder, small spectral gaps are a known limitation of traditional invariant manifold techniques, often leading to inaccurate results. Optimizing the parameter \tau helps address this issue. It introduces corrective terms like (1-\exp(-\delta_{ijk}^{n}\tau)) in the cubic coefficients (E_{ijk}^{n}, Eq. (44)) that effectively compensate for these small gaps. This is especially important for the fifth and sixth (unresolved) modes, whose parameterization defects show a clear minimum during optimization (Figure 8).
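The compensation mechanism can be illustrated with the generic memory-corrected factor (1 - e^{-\delta\tau})/\delta, a simplified stand-in for the exact coefficients E_{ijk}^{n}(\tau) of Eq. (44): as \tau\rightarrow\infty it recovers the invariant-manifold gain 1/\delta, which is singular for small spectral gaps, whereas at any finite \tau it remains bounded (tending to \tau as \delta\rightarrow 0):

```python
import math

def corrected_gain(delta, tau):
    """Memory-corrected spectral factor (1 - exp(-delta * tau)) / delta.
    At tau = infinity this is the invariant-manifold gain 1/delta, which
    blows up for small gaps; at finite tau it stays bounded (-> tau as
    delta -> 0)."""
    if abs(delta) < 1e-12:
        return tau  # limiting value of the removable singularity
    return (1.0 - math.exp(-delta * tau)) / delta
```

A vanishing gap (delta -> 0) thus no longer produces a divergent cubic coefficient; the finite memory length caps the gain at tau, which is exactly what the tau-optimization exploits.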

Figure 10: Autocorrelation functions (ACFs). Ensemble averages of the ACFs for the sACE’s solution amplitude carried by the resolved mode \bm{e}_{4} (black curve) and those carried by the unresolved modes \bm{e}_{5},\ldots,\bm{e}_{8} (red curves). The gray (resp. pink) shaded area around the black (resp. red) curve shows the ACF’s standard deviation for \bm{e}_{4} (resp. the unresolved modes). These statistics are computed from the same ensemble of sACE solution paths as in Fig. 7.

Small spectral gaps often translate into a more observable dynamical effect: weak time-scale separation between the resolved and unresolved variables near the cutoff scale. This is where the optimization process plays a particularly important role. In our sACE system, we indeed observe that the timescales of the unresolved modes near the cutoff have the greatest impact on the optimization of the backward integration time \tau (Fig. 8) from our backward-forward system (Eq. (LABEL:Eq_BF_SPDEcubic)), which underpins the parameterization (Eq. (82)). Modes like the fifth and sixth (unresolved) have correlation decay rates closer to that of the fourth resolved mode, compared to the much faster decay rates of the seventh and eighth modes (Fig. 10). Notably, the fourth and sixth modes exhibit very similar decay rates (except at short time lags). This similarity makes finding the optimal value of \tau crucial. Interestingly, the minimum in the optimization is more pronounced for the sixth mode than for the fifth (Fig. 8). This difference could be caused by nonlinear interactions between these modes or by their relative energy content. The seventh and eighth modes exhibit much faster correlation decay (Fig. 10), and consequently their parameterization defects show only marginal optimization at finite \tau-values compared to their asymptotic values.
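The correlation decay rates underlying this discussion come from sample autocorrelation functions of the mode amplitudes. A standard estimator, applied here to a synthetic AR(1) series standing in for a mode amplitude (the AR(1) coefficient 0.9 is an illustrative choice, not a value from the sACE):

```python
import random

def acf(series, max_lag):
    """Sample autocorrelation function of a scalar time series, for
    lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    out = []
    for lag in range(max_lag + 1):
        cov = sum((series[i] - mean) * (series[i + lag] - mean)
                  for i in range(n - lag)) / n
        out.append(cov / var)
    return out

# Toy series: an AR(1) process whose lag-1 autocorrelation ~ a mimics a
# mode amplitude; its correlation time is what matters for choosing tau.
random.seed(1)
a = 0.9
x = [0.0]
for _ in range(20_000):
    x.append(a * x[-1] + random.gauss(0.0, 1.0))
rho = acf(x, 5)
```

Slowly decaying rho (as for modes 5 and 6 relative to mode 4) signals weak time-scale separation, precisely the regime in which the finite-tau optimization matters most.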

Thus, the closure results shown in Figure 7 demonstrate our reduction approach’s effectiveness in handling weak time-scale separation scenarios, while also suggesting potential for further improvement (see Section IX). Our data-driven optimization of analytical parameterization formulas offers a compelling alternative to traditional multiscale methods or those based on invariant manifolds [132, 103, 102, 133, 134]. These methods, while providing rigorous error estimates, often require strong time-scale separation for the derivation of their amplitude equations. Additionally, they too can achieve higher-order nonlinearities in their reduced equations, but through a different route: the Itô formula is applied after exploiting a slow timescale of the system (e.g., [103, Eq. (12)] and [136, Eq. (15)]). Integrating optimization concepts into these alternative frameworks could be a promising avenue to explore how they perform in weak time-scale separation regimes.

VI Predicting Jump-Induced Transitions

VI.1 Spatially extended system with double-fold bifurcation

We consider the following spatially extended system,

\partial_{x}^{2}u+\lambda(1+u^{2}-\epsilon u^{3})=0,\;\mbox{on }(0,L), (90)
u(0)=u(L)=0.

This system serves as a basic example of a spatially extended system exhibiting a double-fold bifurcation (also known as an S-shaped bifurcation). In this bifurcation, two equilibrium points collide and vanish, leading to tipping phenomena within a hysteresis loop. Elliptic problems similar to Eq. (90) with these properties emerge in various applications such as gas combustion and chemical kinetics [58, 59], plasma physics and Grad-Shafranov equilibria in tokamaks [137, 60, 61], and the gravitational equilibrium of polytropic stars [137, 138, 139]. Beyond polynomial nonlinearities, other spatially extended systems are known to exhibit S-shaped bifurcation diagrams. Examples include Earth’s energy balance models [63, 64, 140] and oceanic models based on hydrostatic primitive or Boussinesq equations employed in the study of the Atlantic meridional overturning circulation (AMOC) [65, 141, 66]. We also note that the OPM approach has already been successfully applied to predict tipping phenomena in a prototype Stommel-Cessi model of the AMOC driven by Gaussian noise [28].

Figure 11: Double-fold bifurcation in Eq. (90). The red curve shows unstable steady states for \epsilon=\epsilon^{\ast}/2. The black curves represent stable steady states. The double-fold bifurcation occurs at the two turning points where the unstable states (red curve) collide with the stable states (black curves). The vertical dashed line indicates the value \lambda=1.32 used in Table 3 for the stochastic simulations of Section VI.2 and the prediction experiments of Section VI.3.

For equation (90), the bifurcation parameter is \lambda. For L=2, it is known that the solution curve \lambda\mapsto u_{\lambda} exhibits an S-shaped bifurcation when \epsilon<\epsilon^{*}\approx 0.3103 [142, Theorem 3.2]. This critical constant \epsilon^{\ast} is derived from a general analytic formula applicable to a broader class of equations in which u^{3} is replaced by u^{p} with a fractional exponent p>2 [142]. More precisely, for \epsilon satisfying

\epsilon^{\frac{p}{p-2}}<\frac{1}{3}\bigg{[}\frac{2}{p(p-1)}\bigg{]}^{\frac{p}{p-2}}-\frac{p-1}{p+1}\bigg{[}\frac{2}{p(p-1)}\bigg{]}^{\frac{p}{p-2}}, (91)

Theorem 3.2 of [142] ensures that the solution curve of Eq. (90) (with u^{3} replaced by u^{p}) is exactly S-shaped. We focus here on the case p=3.

To locate the interval of \lambda over which an exact multiplicity of three steady states occurs (as predicted by the S-shape), we compute the bifurcation diagram of Eq. (90) for \epsilon=\epsilon^{\ast}/2; see Appendix C.1 for details. The result is shown in Figure 11. The vertical dashed line at \lambda=1.32 marks the \lambda-value at which stochastic perturbations are included below to trigger jumps across these steady states.

VI.2 System’s sensitivity to jumps, and jump-driven dynamics

In this section, we consider stochastic perturbations of jump type carried by some of the eigenmodes of the linearized problem at the unstable steady state U^{\ast}_{\lambda}. The latter problem is given by

\partial_{x}^{2}\psi+\lambda(2{U^{\ast}_{\lambda}}-3\epsilon{U^{\ast}_{\lambda}}^{2})\psi=\beta\psi,\;\mbox{on }(0,L), (92)
\psi(0)=\psi(L)=0,

and is solved below using a Chebyshev spectral method [143, Chapters 6 & 7] for the numerical experiments. Hereafter we denote by (\beta_{n},\bm{e}_{n}) the eigenpairs solving Eq. (92). Note that the eigenvalues are real and simple for this problem.

Thus, the jump-driven SPDE at the core of our reduction study is given by:

\partial_{t}u-\partial_{x}^{2}u-\lambda(1+u^{2}-\epsilon u^{3})=\lambda\eta_{t}, (93)
u(0)=u_{0}\in H,

supplemented with Dirichlet boundary conditions. The ambient Hilbert space is H=L^{2}(0,L), endowed with its natural inner product denoted by \langle\cdot,\cdot\rangle.

Here, \eta_{t} is a jump noise term that takes the form

\eta_{t}(x)=\sigma\zeta_{t}f(t)(\bm{e}_{3}(x)+\bm{e}_{5}(x)),\;x\in(0,L), (94)

where \sigma\geq 0, \bm{e}_{3} and \bm{e}_{5} are the third and fifth eigenmodes of Eq. (92), \zeta_{t} is a random variable uniformly distributed in (-1,1), and f(t) is a square signal whose activation is randomly distributed in the course of time. More precisely, given a firing rate f_{r} in (0,1) and a duration \Delta t>0, we define the following real-valued jump process:

f(t)=\mathds{1}_{\{\xi_{n}\leq f_{r}\}},\quad n\Delta t\leq t<(n+1)\Delta t, (95)

where \xi_{n} is a uniformly distributed random variable taking values in [0,1] and \mathds{1}_{\{\xi\leq f_{r}\}}=1 if and only if 0\leq\xi\leq f_{r}. We refer to [83, Sec. 3.4] for existence results, in Hilbert spaces, of mild solutions to SPDEs driven by jump processes, which apply to Eq. (93). This type of jump process is also known as a dichotomous Markov noise [144], or a two-state Markov jump process in certain fields [76]. It is encountered in many applications [145].
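A sketch of this dichotomous forcing is immediate to generate; the helper names below are our own, and the spatial pattern \bm{e}_{3}+\bm{e}_{5} would multiply the scalar amplitude in a full simulation:

```python
import random

def jump_signal(n_windows, firing_rate, seed=None):
    """Square-wave process of Eq. (95): on each window of length Delta t,
    f = 1 with probability f_r, else 0."""
    rng = random.Random(seed)
    return [1 if rng.random() <= firing_rate else 0 for _ in range(n_windows)]

def noise_amplitude(f_value, sigma, rng):
    """Scalar amplitude sigma * zeta * f(t) of Eq. (94), with zeta
    uniform in (-1, 1); it multiplies the pattern e_3(x) + e_5(x)."""
    return sigma * rng.uniform(-1.0, 1.0) * f_value

f = jump_signal(100_000, firing_rate=0.35, seed=42)
active_fraction = sum(f) / len(f)
```

Over many windows the fraction of active windows converges to the firing rate, which is the sense in which f_r controls how often the system is kicked between relaxation episodes.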

The specific values for the noise parameters (firing rate f_{r}, time scale \Delta t, and intensity \sigma) are provided in Table 3. The random forcing in our simulations is chosen with two key considerations:

  • Time scale (\Delta t): The time scale \Delta t is large enough to allow the system to gradually relax towards its stable steady states over time.

  • Firing rate (f_{r}): The firing rate f_{r} is adjusted to ensure this relaxation process can occur effectively.

This configuration ensures that if a perturbation is applied for a duration of \Delta t and then removed for several subsequent \Delta t intervals, the system naturally returns to its closest stable steady state.

Table 3: Parameter values for Eq. (93)
\epsilon            f_{r}   \Delta t   \sigma   \lambda   L
\epsilon^{\ast}/2   0.35    1          300      1.32      2
Figure 12: Jump-driven dynamics. Top panel: The top (resp. bottom) horizontal blue dashed line corresponds to \langle U^{M}_{\lambda},\bm{e}_{1}\rangle (resp. \langle U^{m}_{\lambda},\bm{e}_{1}\rangle), with U^{M}_{\lambda} (resp. U^{m}_{\lambda}) denoting the stable steady state of Eq. (90) with maximum (resp. minimum) energy. The horizontal dashed red line corresponds to \langle U^{\ast}_{\lambda},\bm{e}_{1}\rangle, with U^{\ast}_{\lambda} denoting the unstable steady state. The black dotted curve going across these horizontal dashed lines (and the associated potential barriers) shows y_{1}(t)=\langle u(t),\bm{e}_{1}\rangle, with u solving Eq. (93) for the parameter values of Table 3. Bottom panel: Shown here is R(t,x)=u(t,x)/\texttt{std}(u(t))-\eta_{t}(x)/\texttt{std}(\eta_{t}) when \eta_{t}\neq 0. The shaded planes mark two different patterns exhibited by the system’s response, for two close y_{1}-values (vertical purple dashed lines), illustrating the difficulty of closing accurately the y_{1}-variable. See Text for more details.

This relaxation behavior after a perturbation is clearly visible in the first mode’s amplitude, y_{1}(t)=\langle u(t),\bm{e}_{1}\rangle (where u solves Eq. (93)). The top panel of Figure 12 illustrates this concept. Shortly after the vertical dashed magenta line at t=21.4, we observe y_{1}(t) relaxing towards the constant value \langle U^{M}_{\lambda},\bm{e}_{1}\rangle. Here, U^{M}_{\lambda} denotes the stable steady state with the largest energy for \lambda=1.32 (represented by the top blue dot in Figure 11). Similar relaxation episodes occur throughout the simulation, with y_{1}(t) sometimes relaxing towards \langle U^{m}_{\lambda},\bm{e}_{1}\rangle, involving the stable steady state U^{m}_{\lambda} with the smallest energy (refer again to Figure 11).

While the noise intensity (\sigma=300) in Table 3 may seem large, this value is necessary because we specifically target the modes \bm{e}_{3} and \bm{e}_{5} used in the stochastic forcing defined in Eq. (94). These modes contribute less than 1% of the total energy in the unforced parabolic equation. When the jump noise is activated with this intensity, the amplitudes of y_{3}(t) and y_{5}(t) fluctuate significantly, becoming comparable in magnitude to y_{1}(t). This eliminates the previous order-of-magnitude difference.

As a result, the random perturbations introduced by modes \bm{e}_{3} and \bm{e}_{5}, through the system’s nonlinearities, can now drive the SPDE solution across the potential barriers associated with the steady states. This explains the multimodal behavior observed in the time evolution of y_{1}(t) shown in Figure 12.

A crucial question remains: with such a strong random force (\sigma=300), are the observed dynamics in the system solely a reflection of this forcing, or is there more to it? To address this, we calculate a quantity called R(t,x). This quantity compares the normalized solution, u(t,x)/\texttt{std}(u(t)), with the normalized forcing term, \eta_{t}(x)/\texttt{std}(\eta_{t}), at times when the forcing is present (\eta_{t}\neq 0); see the caption of Figure 12. Here, std denotes the standard deviation.
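Computing R(t, x) on discretized profiles is a two-line normalization. In particular, if the response were exactly proportional to the forcing, R would vanish identically, which the sketch below verifies on synthetic profiles (the sample values are our own):

```python
import statistics

def response_pattern(u_profile, eta_profile):
    """R(t, .) = u/std(u) - eta/std(eta), evaluated at a time when the
    forcing is active (eta_t != 0)."""
    su = statistics.pstdev(u_profile)
    se = statistics.pstdev(eta_profile)
    return [u / su - e / se for u, e in zip(u_profile, eta_profile)]

# A response strictly proportional to the forcing yields R = 0 everywhere.
eta = [1.0, -2.0, 3.0, 0.5]
u_proportional = [2.0 * v for v in eta]
R_flat = response_pattern(u_proportional, eta)

# A response with a different spatial structure yields a nonzero R.
R_shaped = response_pattern([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
```

It is this deviation of R from the forcing pattern p(x) that diagnoses a genuinely nonlinear response, as discussed next.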

The key point is that the forcing term, ηt(x)\eta_{t}(x), consists of a time-dependent coefficient multiplied by a specific spatial pattern defined by p(x)=𝒆3(x)+𝒆5(x)p(x)=\bm{e}_{3}(x)+\bm{e}_{5}(x). If the system’s response were strictly proportional to the forcing, then R(t,x)R(t,x) would simply match this spatial pattern (p(x)p(x) with three peaks and two valleys).

However, the bottom panel of Figure 12 tells a different story. We see that R(t,x)R(t,x) deviates significantly and in a complex way from p(x)p(x) as time progresses. This implies that the system’s response is not simply a mirror image of the forcing term. The nonlinear terms in the equation play a crucial role in shaping the response dynamics.

Developing reduced models for this system faces another significant hurdle: the system’s high sensitivity to small changes, particularly in the first mode’s amplitude (y1y_{1}). Here is the key point: even tiny variations in y1y_{1}, especially near the potential barrier of a steady state, can lead to vastly different spatial patterns in the full SPDE solution. Figure 12 illustrates this sensitivity. For example, consider points t1=15.04t_{1}=15.04 and t2=53.49t_{2}=53.49 marked by the two vertical purple dashed lines in the top panel of Figure 12. The corresponding values of y1y_{1} are very close (y1(t1)=3.92y_{1}(t_{1})=3.92 and y1(t2)=3.89y_{1}(t_{2})=3.89). However, the SPDE solutions at these times (shown in the vertical shaded planes) exhibit dramatically different spatial patterns.

The significant difference between the response of the first mode’s amplitude (y1y_{1}) and the full SPDE solution (uu) in equation (LABEL:Eq_SPDEjump) underscores the critical need for an accurate parameterization of the neglected variables. Capturing this kind of sensitive behavior with a reduced model that only tracks y1y_{1} requires a comprehensive representation of the influence from these excluded variables. We demonstrate in the next section that our optimal parameterization framework effectively tackles this challenging parameterization issue.

VI.3 Non-Markovian optimal reduced model

We describe below how our Backward-Forward framework introduced in this work to derive stochastic parameterizations of SPDEs driven by white noise can be readily adapted to SPDEs driven by jump processes. Equation (LABEL:Eq_SPDEjump), driven by the jump process defined in Eq. (94), serves as a concrete example. Our objective is to develop a reduced model for the first mode’s amplitude in Eq. (LABEL:Eq_SPDEjump). In other words, we aim to create a simplified, one-dimensional stochastic model that can accurately reproduce the complex dynamics discussed in Section VI.2, including the stochastic transitions.

To do so, we rewrite Eq. (LABEL:Eq_SPDEjump) for the fluctuation variable

v=uUλ,v=u-{U^{\ast}_{\lambda}}, (96)

where Uλ{U^{\ast}_{\lambda}} denotes the unstable steady state of Eq. (LABEL:Eq_Sbif) (for λ=1.32\lambda=1.32) corresponding to the red dot shown in Fig. 11. The fluctuation equation in the vv-variable reads then as follows:

tvx2v\displaystyle\partial_{t}v-\partial_{x}^{2}v λ(2Uλ3ϵUλ2)vλ(13ϵUλ)v2\displaystyle-\lambda(2{U^{\ast}_{\lambda}}-3\epsilon{U^{\ast}_{\lambda}}^{2})v-\lambda(1-3\epsilon{U^{\ast}_{\lambda}})v^{2} (97)
+λϵv3=ληt,on (0,L)×(0,T).\displaystyle+\lambda\epsilon v^{3}=\lambda\eta_{t},\;\mbox{on }(0,L)\times(0,T).

Each nnth eigenmode, 𝒆n\bm{e}_{n}, solving Eq. (LABEL:Eq_eigenpb) turns out to be very close to the Fourier sine mode 2/Lsin(nπx/L)\sqrt{2/L}\sin(n\pi x/L), and the eigenvalues decay quickly. The first few eigenvalues are β1=0.1815\beta_{1}=0.1815, β2=7.4966\beta_{2}=-7.4966, β3=19.9665\beta_{3}=-19.9665, β4=37.2840\beta_{4}=-37.2840, and β5=59.5108\beta_{5}=-59.5108. Here only the first mode 𝒆1\bm{e}_{1} is unstable while the others are all stable.

Our reduced state space is thus taken to be

H𝔠=span{𝒆1}.H_{\mathfrak{c}}=\mbox{span}\{\bm{e}_{1}\}. (98)

For the parameter setting of Table 3, most of the energy (99.9%99.9\%) is distributed among the unstable mode 𝒆1\bm{e}_{1}, on one hand, and the forced modes, 𝒆3\bm{e}_{3} and 𝒆5\bm{e}_{5}, on the other, with energy carried by the third mode’s amplitude representing up to 60%60\% depending on the noise path. An accurate parameterization of these forced modes along with their nonlinear interactions with the unstable mode is thus key to achieving a reliable reduced model of the first mode’s amplitude.

To do so, we consider the BF parameterization framework of Section III.1 for cubic SPDEs driven by white noise that we adapt to the jump noise case. If one replaces the Brownian motion in Eq. (LABEL:Eq_BF_SPDEcubic) by the jump process term, σζtf(t)\sigma\zeta_{t}f(t) (from Eq. (94)), one gets the following parameterization of the nnth mode’s amplitude:

Φn(τn,\displaystyle\Phi_{n}(\tau_{n}, X,t;ω)\displaystyle X,t;\omega) (99)
=eβnτY+Jτn(t;ω)+D11n(τ)B11nX2\displaystyle=e^{\beta_{n}\tau}Y+J^{n}_{\tau}(t;\omega)+D^{n}_{11}(\tau)B_{11}^{n}X^{2}
+E111n(τ)C111nX3,n{3,5},\displaystyle\;\;+E_{111}^{n}(\tau)C_{111}^{n}X^{3},\;n\in\{3,5\},

where the coefficients B11nB_{11}^{n}, C111nC_{111}^{n}, D11nD^{n}_{11}, and E111nE_{111}^{n} are those defined in Section III.1, while the stochastic term is now given by the non-Markovian coefficient

Jτn(t;ω)=σλtτteβn(ts)ζsf(s)ds.J^{n}_{\tau}(t;\omega)=\sigma\lambda\int_{t-\tau}^{t}e^{\beta_{n}(t-s)}\zeta_{s}f(s)\,\mathrm{d}s. (100)

This stochastic term is the counterpart of the ZnZ^{n}-term defined in (41) for the white noise case. As for the ZnZ^{n}-term, the JnJ^{n}-term is efficiently generated by solving the following ODE with random coefficients:

dJdt=βnJ+σλ(ζtf(t)eβnτζtτf(tτ)),\frac{\,\mathrm{d}J}{\,\mathrm{d}t}=\beta_{n}J+\sigma\lambda\left(\zeta_{t}f(t)-e^{\beta_{n}\tau}\zeta_{t-\tau}f(t-\tau)\right), (101)

for n=3n=3 and n=5n=5; cf. Eq. (49). Note that the parameterization Eq. (99) takes the same structural form as Eq. (82) in the case of white noise forcing (see Remark III.1), with the bounded jump noise replacing the white noise here. This is due to the underlying backward-forward systems at the core of the derivation of these formulas, in which only the nature of the forcing changes; see also Sec. VIII below.
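The efficiency of this auxiliary-ODE route can be checked directly: rather than re-evaluating the sliding-window quadrature in Eq. (100) at each time, one advances a single scalar ODE. The sketch below uses hypothetical values for βn\beta_{n}, τ\tau, and σλ\sigma\lambda, and a smooth stand-in g(t)g(t) for the jump signal ζtf(t)\zeta_{t}f(t):

```python
import numpy as np

# Auxiliary-ODE generation of the memory term (cf. Eqs. (100)-(101)):
# J(t) = sig_lam * int_{t-tau}^{t} e^{beta (t-s)} g(s) ds is advanced by
# dJ/dt = beta*J + sig_lam*(g(t) - e^{beta*tau} g(t - tau)).
# beta, tau, sig_lam are hypothetical; g is a smooth stand-in for zeta_t f(t).
beta, tau, sig_lam = -20.0, 0.5, 1.0
g = lambda t: np.sin(3.0 * t)

def J_quadrature(t, n=20001):
    """Direct (expensive) evaluation of the memory integral at time t."""
    s = np.linspace(t - tau, t, n)
    return sig_lam * np.trapz(np.exp(beta * (t - s)) * g(s), s)

dt, T = 5e-5, 2.0
t_grid = np.arange(tau, T, dt)
J = J_quadrature(t_grid[0])  # initialize from a single quadrature at t = tau
for t in t_grid[:-1]:        # then advance by explicit Euler steps only
    J += dt * (beta * J + sig_lam * (g(t) - np.exp(beta * tau) * g(t - tau)))

err = abs(J - J_quadrature(t_grid[-1])) / abs(J_quadrature(t_grid[-1]))
print(err)  # small: the ODE reproduces the sliding quadrature
```

Only one quadrature is needed to initialize; every subsequent time step costs a single Euler update instead of a full window integral.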

Then, after optimization over a single training path of the parameterization defects Qn(τ)Q_{n}(\tau) (see Section VI.4 below), an analogue to the optimal reduced model (Eq. (86)) becomes in our case the following 1-D ODE with path-dependent coefficients:

y˙=β1y+(λ3λϵUλ)(y(t)𝒆1+𝒫(t))2,𝒆1\displaystyle\dot{y}=\beta_{1}y+\Big{\langle}(\lambda-3\lambda\epsilon{U^{\ast}_{\lambda}})\big{(}y(t)\bm{e}_{1}+\mathcal{P}(t)\big{)}^{2},\bm{e}_{1}\Big{\rangle} (102)
λϵ(y(t)𝒆1+𝒫(t))3,𝒆1,\displaystyle\qquad\qquad\qquad-\lambda\epsilon\Big{\langle}(y(t)\bm{e}_{1}+\mathcal{P}(t))^{3},\bm{e}_{1}\Big{\rangle},
𝒫(t)=Φ3(τ3,y(t),t)𝒆3+Φ5(τ5,y(t),t)𝒆5.\displaystyle\mathcal{P}(t)=\Phi_{3}(\tau^{*}_{3},y(t),t)\bm{e}_{3}+\Phi_{5}(\tau^{*}_{5},y(t),t)\bm{e}_{5}.

Here, 𝒫(t)\mathcal{P}(t) is the optimized stochastic parameterization in the class 𝒫\mathcal{P} (see Section IV.3) for the SPDE (97) driven by the jump process ηt\eta_{t} defined in Eq. (94). Note that the non-Markovian JnJ^{n}-terms required to simulate Eq. (LABEL:Eq_reduced_S_shaped) are obtained by solving Eq. (101) in the course of time. A total of three ODEs with random coefficients is thus solved to obtain a reduced model of the first mode’s amplitude: Eq. (LABEL:Eq_reduced_S_shaped), along with its auxiliary JJ-equations, forms the optimal reduced system.

The inner product in Eq. (LABEL:Eq_reduced_S_shaped) leverages the analytical expression of 𝒫(t)\mathcal{P}(t) provided in the same equation. This inner product is computed once offline, resulting in a degree-9 polynomial with random coefficients after expanding the Φ3\Phi_{3} and Φ5\Phi_{5} terms as described in Eq. (99). However, this expression can be further simplified. By analyzing the order of magnitude of the terms in Eq. (99) for the parameter regime of Table 3 (σ\sigma large), we observe that 𝒫(t)\mathcal{P}(t) simplifies to

𝒫(t)Jτ33(t;ω)𝒆3+Jτ55(t;ω)𝒆5,\mathcal{P}(t)\approx J^{3}_{\tau^{*}_{3}}(t;\omega)\bm{e}_{3}+J^{5}_{\tau^{*}_{5}}(t;\omega)\bm{e}_{5}, (103)

setting Y=0Y=0 in Φn\Phi_{n} (n=3,5n=3,5) defined in Eq. (99). (Note that the other unresolved, unforced modes contain less than 0.1%0.1\% of the total energy and are thus not taken into account in the parameterization 𝒫(t)\mathcal{P}(t).) This approximation essentially expresses a predominance of the stochastic terms over the nonlinear ones in the parameterization (99), unlike for the reduction of the sACE analyzed in Section V.4. The parameterization then becomes purely stochastic, without nonlinear terms. Yet, the optimal reduced system still exhibits nonlinear products of stochastic coefficients.

To make this point precise, we introduce some additional notation. We denote by aλϵ(x)a_{\lambda}^{\epsilon}(x) the spatial coefficient, λ3λϵUλ(x)\lambda-3\lambda\epsilon{U^{\ast}_{\lambda}}(x), involved in the quadratic term of Eq. (LABEL:Eq_reduced_S_shaped) (as inherited from Eq. (97)). To simplify notation, we also write J3J_{3}^{\ast} (resp. J5J_{5}^{\ast}) for the term Jτ33(t;ω)J^{3}_{\tau^{*}_{3}}(t;\omega) (resp. Jτ55(t;ω)J^{5}_{\tau_{5}^{*}}(t;\omega)), and introduce the bilinear operator G2(u,v)(x)=u(x)v(x),G_{2}(u,v)(x)=u(x)v(x), for xx in (0,L)(0,L) and any functions u,vu,v in L2(0,L)L^{2}(0,L). Using the approximation formula (103), the quadratic term in Eq. (LABEL:Eq_reduced_S_shaped) then expands as follows

aλϵ(y(t)𝒆1+𝒫(t))2,𝒆1=y2(t)aλϵG2(𝒆1,𝒆1),𝒆1+(J3(t))2aλϵG2(𝒆3,𝒆3),𝒆1+(J5(t))2aλϵG2(𝒆5,𝒆5),𝒆1\displaystyle\Big{\langle}a_{\lambda}^{\epsilon}\big{(}y(t)\bm{e}_{1}+\mathcal{P}(t)\big{)}^{2},\bm{e}_{1}\Big{\rangle}=y^{2}(t)\langle a_{\lambda}^{\epsilon}G_{2}(\bm{e}_{1},\bm{e}_{1}),\bm{e}_{1}\rangle+(J_{3}^{\ast}(t))^{2}\langle a_{\lambda}^{\epsilon}G_{2}(\bm{e}_{3},\bm{e}_{3}),\bm{e}_{1}\rangle+(J_{5}^{\ast}(t))^{2}\langle a_{\lambda}^{\epsilon}G_{2}(\bm{e}_{5},\bm{e}_{5}),\bm{e}_{1}\rangle (104)
+2aλϵG2(𝒆3,𝒆5),𝒆1J3(t)J5(t)+2aλϵG2(𝒆1,𝒆3),𝒆1y(t)J3(t)+2aλϵG2(𝒆1,𝒆5),𝒆1y(t)J5(t),\displaystyle\hskip 56.9055pt+2\langle a_{\lambda}^{\epsilon}G_{2}(\bm{e}_{3},\bm{e}_{5}),\bm{e}_{1}\rangle J_{3}^{\ast}(t)J_{5}^{\ast}(t)+2\langle a_{\lambda}^{\epsilon}G_{2}(\bm{e}_{1},\bm{e}_{3}),\bm{e}_{1}\rangle y(t)J_{3}^{\ast}(t)+2\langle a_{\lambda}^{\epsilon}G_{2}(\bm{e}_{1},\bm{e}_{5}),\bm{e}_{1}\rangle y(t)J_{5}^{\ast}(t),

with a similar expression obtained from the cubic terms in Eq. (LABEL:Eq_reduced_S_shaped). Already from (LABEL:Expansion_quadterm) one observes the production of stochastic constant terms driven by (J3(t))2(J_{3}^{\ast}(t))^{2}, (J5(t))2(J_{5}^{\ast}(t))^{2} and J3(t)J5(t)J_{3}^{\ast}(t)J_{5}^{\ast}(t) as well as linear terms driven by J3(t)J_{3}^{\ast}(t) and J5(t)J_{5}^{\ast}(t) contributing to the dynamics of the surrogate first mode’s amplitude yy. These stochastic terms turn out to be essential for a good emulation of the first mode’s amplitude. This point is illustrated by the following experiment.
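The selection rules behind these coupling coefficients can be checked by quadrature. The sketch below uses the Fourier sine modes as stand-ins for the eigenmodes (which, as noted above, are very close) and a hypothetical constant coefficient aλϵ1a_{\lambda}^{\epsilon}\equiv 1:

```python
import numpy as np

# Triad coefficients <G2(e_i, e_j), e_1> with e_n(x) = sqrt(2/L) sin(n pi x/L)
# as stand-ins for the eigenmodes, and a_lambda^eps replaced by 1 (hypothetical).
L = 1.0
x = np.linspace(0.0, L, 100001)
e = lambda n: np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

def triad(i, j, k):
    """Inner product <e_i * e_j, e_k> computed by trapezoidal quadrature."""
    return np.trapz(e(i) * e(j) * e(k), x)

# Self- and cross-interactions of the forced modes 3 and 5 all project onto
# mode 1, which is how (J_3)^2, (J_5)^2, J_3*J_5, y*J_3, and y*J_5 force y(t):
coeffs = {(3, 3): triad(3, 3, 1), (5, 5): triad(5, 5, 1),
          (3, 5): triad(3, 5, 1), (1, 3): triad(1, 3, 1), (1, 5): triad(1, 5, 1)}
print(coeffs)  # all nonzero
```

Since odd sine modes have nonvanishing triple products with 𝒆1\bm{e}_{1}, every term displayed in the expansion of the quadratic part indeed contributes.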

Figure 13: The normalized parameterization defects, QnQ_{n} for n=3n=3 and n=5n=5. The optimization is performed on a single path. See Text.

VI.4 Training data, parameterization defects, and reduced model’s skills

The training data for the optimal reduced model comprises N=40,000N=40,000 snapshots. These SPDE snapshots are generated over a training interval from 0 to TT, with TT = 400, for a single noise path driving Eq. (97). The corresponding normalized parameterization defects (quantified by QnQ_{n}) are visualized in Figure 13.

It is important to note here that the distinct minima observed in Q3Q_{3} and Q5Q_{5} are not caused by weak time-scale separations (as discussed in Section V.5 for the sACE). Instead, they are primarily a consequence of the high noise intensity (σ\sigma) used in this case, which leads to solutions with large amplitudes. This is further supported by the observation that these minima become less pronounced as σ\sigma is reduced (verified for arbitrary noise paths).
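The single-path optimization underlying Figure 13 can be illustrated on a toy problem: if the "unresolved" series is itself a memory integral with memory length τtrue\tau_{\mathrm{true}}, then minimizing the normalized defect Q(τ)Q(\tau) over a grid of memory lengths recovers τtrue\tau_{\mathrm{true}}. All signals and parameter values below are hypothetical stand-ins for the SPDE data:

```python
import numpy as np

# Toy defect minimization: the "truth" v(t) is a memory integral with memory
# length tau_true; the normalized defect Q(tau), evaluated on one "training"
# path, is minimized over a grid and recovers tau_true.
rng = np.random.default_rng(0)
beta, dt, T = -20.0, 1e-3, 10.0
t = np.arange(0.0, T, dt)
g = rng.standard_normal(t.size)  # sampled stand-in for the driving noise path

def J_tau(tau):
    """Discrete memory integral: exponentially weighted sum of the last tau/dt samples."""
    m = int(round(tau / dt))
    w = np.exp(beta * dt * np.arange(m))  # kernel e^{beta (t - s)} on the grid
    return np.convolve(g, w)[: t.size] * dt

tau_true = 0.2
v = J_tau(tau_true)  # "unresolved" series to be parameterized

taus = np.arange(0.05, 0.55, 0.05)
Q = [np.mean((v - J_tau(tau)) ** 2) / np.mean(v**2) for tau in taus]
print(taus[int(np.argmin(Q))])  # recovers tau_true = 0.2
```

As in the paper's setting, the minimum is sharp on the short-memory side and shallower for memory lengths exceeding the decay scale of the kernel.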

Figure 14 shows the optimal parameterization’s ability to reproduce the first mode’s complex behavior. This figure showcases the effectiveness of the optimal reduced model (Eq. (LABEL:Eq_reduced_S_shaped) with 𝒫(t)\mathcal{P}(t) from Eq. (103)) in capturing the bimodal nature of the first mode’s amplitude. The model’s performance is evaluated on a large set of unseen test paths. When compared to the actual SPDE solution’s first-mode amplitude (across these test paths), the optimal reduced model closely follows the various transitions between metastable states, crossing the potential barriers as needed. However, it is worth noting that this reduced model tends to slightly overestimate the intensity of these larger excursions (as shown in Figure 15).

Figure 14: Ensemble PDF. Here, 10510^{5} out-of-sample test paths are used to estimate these ensemble PDFs. Each underlying solution path is made of N=2×105N=2\times 10^{5} iterations for a time-step of δt=102\delta t=10^{-2}.
Figure 15: First mode’s amplitude dynamics: full system vs. optimal reduction.

VII Towards the Reduction of More General Lévy-driven Dynamics

We extend the discussion and results of Section VI to more general Lévy noise. For the sake of simplicity, we focus on finite-dimensional SDEs driven by Lévy processes.

There is a vast literature on Lévy processes, which, roughly speaking, are processes given as the sum of a Brownian motion and a jump process [147, 148, 149, 150, 151]. Unlike the case of diffusion processes (SDEs driven by Brownian motions), the representation of Kolmogorov operators is non-unique in the case of Lévy-driven SDEs, and may take the form of operators involving fractional Laplacians or singular integrals, among other representations [152]. We adopt here the definition commonly used in probability theory [153, Theorem 3.5.14], which has the advantage of being particularly intuitive, as recalled below. We consider SDEs of the form

dXt=𝑭(Xt)dt+𝚺(Xt)d𝑾t+d𝑳t,\,\mathrm{d}X_{t}={\bm{F}}(X_{t})\,\mathrm{d}t+{\bm{\Sigma}}(X_{t})\,\mathrm{d}{\bm{W}_{t}}+\,\mathrm{d}\bm{L}_{t}, (105)

where 𝑾t{\bm{W}_{t}} is a Brownian motion in d\mathbb{R}^{d}, and 𝑳t\bm{L}_{t} is a Lévy process on d\mathbb{R}^{d} independent from 𝑾t{\bm{W}_{t}}. Here, we take 𝑭{\bm{F}} to be a smooth vector field on d\mathbb{R}^{d}, and 𝚺(𝒙){\bm{\Sigma}}(\bm{x}) to be, for each 𝒙\bm{x}, a d×dd\times d matrix with smooth entries, such that (aij(𝒙))=𝚺(𝒙)𝚺T(𝒙)(a_{ij}(\bm{x}))={\bm{\Sigma}}(\bm{x}){\bm{\Sigma}}^{T}(\bm{x}) is a positive definite matrix for each 𝒙\bm{x} in d\mathbb{R}^{d}.

Roughly speaking, a Lévy process 𝑳t\bm{L}_{t} on d\mathbb{R}^{d} is a non-Gaussian stochastic process with independent and stationary increments, which experiences sudden, large jumps in any direction. The probability distribution of these jumps is characterized by a non-negative Borel measure ν\nu defined on d\mathbb{R}^{d} and concentrated on d\{0}\mathbb{R}^{d}\backslash\{0\} that satisfies the property d\{0}min(1,𝒚2)ν(d𝒚)<\int_{\mathbb{R}^{d}\backslash\{0\}}\min(1,\|\bm{y}\|^{2})\nu(\,\mathrm{d}\bm{y})<\infty. This measure ν\nu is called the jump measure of the Lévy process 𝑳t\bm{L}_{t}. Sometimes XtX_{t} itself is referred to as a Lévy process with triplet (𝑭,𝚺,ν)({\bm{F}},{\bm{\Sigma}},\nu). Within this convention, we reserve the terminology of a Lévy process for a process with triplet (0,0,ν)(0,0,\nu). We refer to [149] and [148] for the mathematical background on Lévy processes.
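In one dimension, this integrability condition can be checked concretely for a power-law jump density |y|1α|y|^{-1-\alpha} (anticipating the α\alpha-stable case below). A minimal numerical sketch, with α=1.5\alpha=1.5 as an illustrative value:

```python
import numpy as np

# One-dimensional check of int min(1, y^2) nu(dy) < infinity for the power-law
# density nu(dy) = |y|^{-1-alpha} dy (up to a constant), with alpha in (0, 2):
# the small-jump part converges for alpha < 2, the large-jump part for alpha > 0.
alpha = 1.5
y_small = np.logspace(-12, 0, 4000)                       # small jumps, weight y^2
small = np.trapz(y_small**2 * y_small**(-1 - alpha), y_small)
y_large = np.logspace(0, 8, 4000)                         # large jumps, weight 1
large = np.trapz(y_large**(-1 - alpha), y_large)
total = 2 * (small + large)                               # both signs of y
print(total, 2 * (1 / (2 - alpha) + 1 / alpha))  # finite, matching the closed form
```

The y2y^{2} weight near the origin is exactly what tames the infinite mass that such densities assign to arbitrarily small jumps.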

Under suitable assumptions on 𝑭{\bm{F}}, 𝚺{\bm{\Sigma}}, and the Lévy measure ν\nu, the solution XtX_{t} to Eq. (105) is a Markov process (e.g. [154]) and even a Feller process [155], with associated Kolmogorov operator taking the following integro-differential form for (e.g.) ψ\psi in Cc(d)C^{\infty}_{c}(\mathbb{R}^{d}) (Courrège theorem [156, 150]):

Kψ(𝒙)\displaystyle\mathcal{L}_{K}\psi(\bm{x}) =𝑭(𝒙)ψ+i,j=1daij(𝒙)ijψ+Γψ(𝒙),with\displaystyle={\bm{F}}(\bm{x})\cdot\nabla\psi+\sum_{i,j=1}^{d}a_{ij}(\bm{x})\partial_{ij}\psi+\Gamma\psi(\bm{x}),\mbox{with} (106)
Γψ(𝒙)=d\{0}[ψ(𝒙+𝒚)ψ(𝒙)𝒚ψ(𝒙)𝟙{𝒚<1}]ν(d𝒚),\displaystyle\Gamma\psi(\bm{x})=\int_{\mathbb{R}^{d}\backslash\{0\}}\Big{[}\psi(\bm{x}+\bm{y})-\psi(\bm{x})-\bm{y}\cdot\nabla\psi(\bm{x})\mathds{1}_{\{\left\lVert\bm{y}\right\rVert<1\}}\Big{]}\nu(\,\mathrm{d}\bm{y}),

where 𝟙{𝒚<1}\mathds{1}_{\{\left\lVert\bm{y}\right\rVert<1\}} denotes the indicator function of the (open) unit ball in d\mathbb{R}^{d}; see also [149, Theorem 3.3.3].

The first-order term in Eq. (106) is the drift term caused by the deterministic, nonlinear dynamics. The second-order differential operator represents the diffusion part of the process XtX_{t}. It is responsible for the continuous component of the process.

The Γ\Gamma-term, involving the integral, represents the jump part of the process. It captures the discontinuous jumps that the process experiences due to the sudden changes caused by the Lévy process 𝑳t\bm{L}_{t}. Its intuitive interpretation is as follows. The term ψ(𝒙+𝒚)ψ(𝒙)\psi(\bm{x}+\bm{y})-\psi(\bm{x}) calculates the difference in the test function value before and after a potential jump of size 𝒚\bm{y}, capturing the change in the test function due to the jump.

The term 𝒚ψ(𝒙)𝟙{𝒚<1}-\bm{y}\cdot\nabla\psi(\bm{x})\mathds{1}_{\{\left\lVert\bm{y}\right\rVert<1\}} is a first-order compensation for small jumps. Since the Lévy measure may assign infinite mass to arbitrarily small jumps, subtracting this Taylor term makes the integrand of order 𝒚2\left\lVert\bm{y}\right\rVert^{2} near the origin, which is precisely the decay that the integrability condition on ν\nu requires for convergence. Thus, the integral term Γ\Gamma accounts for all possible jump sizes 𝒚\bm{y}, with this compensation applied only to jumps within the unit ball, as weighted by the Lévy measure ν(d𝒚)\nu(\,\mathrm{d}\bm{y}). The notion of Kolmogorov operators can be extended to SPDEs driven by cylindrical Wiener processes, with 𝒙\bm{x} lying in a Hilbert space; see [157]. Building on such a functional framework, it is possible to give a rigorous meaning to the Kolmogorov operator K\mathcal{L}_{K} in the case of an SPDE driven by a general Lévy process, possibly forcing infinitely many modes.

In the case of the SPDE driven by a scalar jump process (forcing only two modes) studied in Section VI (see Eq. (94)), the jump measure is simple to describe. It corresponds to the measure ν(ds)\nu(\,\mathrm{d}s) (ss real) associated with the real-valued jump process f(t)f(t) given by Eq. (95). This measure is the Dirac delta λδs=1\lambda\delta_{s=1}, where λ\lambda here denotes the intensity of the Poisson process associated with the random comb signal. Recall that this intensity is the limit, as the interval becomes infinitesimally small, of the probability of a single event occurring in that interval divided by its length. Therefore, λ\lambda is equal to the firing rate frf_{r} of f(t)f(t).
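This characterization of the intensity suggests a simple sanity check by simulation: drawing one Bernoulli variable per time bin of width dt\mathrm{d}t with success probability frdtf_{r}\,\mathrm{d}t produces a spike train whose empirical rate approaches frf_{r}. A minimal sketch with hypothetical values:

```python
import numpy as np

# Bernoulli-per-bin approximation of a Poisson spike train: with firing rate
# f_r, the probability of an event in a bin of width dt is ~ f_r*dt, so the
# empirical rate over [0, T] approaches f_r. Values are hypothetical.
rng = np.random.default_rng(1)
f_r, T, dt = 4.0, 2000.0, 1e-3
spikes = rng.random(int(T / dt)) < f_r * dt  # one Bernoulli draw per bin
rate = spikes.sum() / T
print(rate)  # close to f_r = 4
```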

In the case of SDEs driven by α\alpha-stable Lévy processes,

ν(d𝒚)=cd,α𝒚dαd𝒚,\nu(\,\mathrm{d}\bm{y})=c_{d,\alpha}\|\bm{y}\|^{-d-\alpha}\,\mathrm{d}\bm{y}, (107)

where the non-Gaussianity index α\alpha lies in (0,2)(0,2), and cd,αc_{d,\alpha} is a constant that depends on the dimension and involves Euler’s gamma function [152]. In the limiting case α=2\alpha=2, the process reduces to a Brownian motion.
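For numerical experiments, increments of a symmetric α\alpha-stable process can be sampled with the classical Chambers–Mallows–Stuck representation. The sketch below is a minimal implementation of the symmetric (skewness-free) case; the parameter values are illustrative:

```python
import numpy as np

def alpha_stable(alpha, size, rng):
    """Symmetric alpha-stable samples via the Chambers-Mallows-Stuck representation."""
    U = rng.uniform(-np.pi / 2, np.pi / 2, size)
    E = rng.exponential(1.0, size)
    if np.isclose(alpha, 1.0):
        return np.tan(U)  # alpha = 1 reduces to the Cauchy distribution
    return (np.sin(alpha * U) / np.cos(U) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * U) / E) ** ((1.0 - alpha) / alpha))

rng = np.random.default_rng(2)
X = alpha_stable(1.5, 10**5, rng)
# Heavy tails: a non-negligible fraction of samples exceeds 10 in magnitude,
# which essentially never happens for a standard Gaussian sample.
print((np.abs(X) > 10).mean())
```

At α=2\alpha=2, this construction reduces to a centered Gaussian (of variance 2 in this normalization), consistent with the Brownian limiting case noted above.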

Figure 16: Chart presenting the main steps, objectives, and characteristics of the OPM reduction approach.

The rigorous theory of slow and center manifolds is still in its infancy for SDEs driven by α\alpha-stable Lévy processes [84, 158], but the main formulas share the same ingredients as recalled in Section II.2, with analogous backward-forward interpretations, as discussed in Section II.3, of the Lyapunov-Perron integrals involved therein ([84, Sec. 4]). The non-Markovian parameterizations of Section III and their data-driven optimization can thus be extended, at least formally, to the case of SDEs driven by α\alpha-stable Lévy processes; this will be pursued elsewhere on concrete applications.

VIII A Chart Summarizing the Approach

To streamline the application of the OPM reduction approach for stochastically forced systems presented in this paper, we provide a concise overview of its key steps, objectives, and characteristics in the chart shown in Fig. 16. This summary is tailored to the class of SPDEs discussed in Section II.1, with the primary goal of performing a reduction with a cutoff scale just below the forcing scale. This reduction involves a solution splitting where only the disregarded “small” scales are stochastically forced (Figures 16A and 16B).

Given this splitting, where the disregarded scales are those directly forced by stochasticity, our approach reveals the importance of non-Markovian terms in the parameterization of these scales. These terms depend on the noise’s path history, enabling the tracking of small-scale fluctuations. The accuracy of this tracking is primarily governed by the memory content (τ\tau) of the parameterization, which is optimized from data by minimizing a natural loss function (Figure 16E) over a single noise realization ω\omega.

Once optimized, the resulting non-Markovian parameterization, the Optimal Parameterizing Manifold (OPM), allows us to capture the average behavior of small-scale fluctuations as a functional of large scales by averaging over noise paths (see inset below Figure 16B). To efficiently optimize the memory content (τ\tau) and avoid the computation of cumbersome quadratures, we decompose the JJ-term in Figure 16, J(t,𝒙)=Jn(t)𝒆n(𝒙)J(t,\bm{x})=\sum J_{n}(t){\bm{e}}_{n}(\bm{x}), into random coefficients JnJ_{n} that solve simple auxiliary ODEs dependent on τ\tau and the noise path ω\omega. The resulting OPM closure, exemplified by Equation (85), takes the form of an SDE system with coefficients that are nonlinear functionals of these JnJ_{n} terms (Figure 16F) reflecting the nonlinear interactions between the noise and nonlinear terms in the original system.

As demonstrated with SPDEs driven by Gaussian noise (Section V) or jump processes (Section VI), the OPM closure effectively predicts noise-induced transitions and rare event statistics. This success is attributed to our hybrid approach, which combines analytical insights for constructing the parameterizing manifold Φ\Phi (Figure 16C and Section III.1) with data-driven learning of the optimal memory content τ\tau (Section III C).

IX Discussion and Perspectives

This work demonstrates that Optimal Parameterizing Manifolds (OPMs) constructed from Backward-Forward (BF) systems and optimized using a single solution path can accurately reproduce noise-induced transitions across a broad range of out-of-sample test paths. This approach is effective for both Gaussian noise and non-Gaussian noise with jumps. A remarkable feature of the approach is that, even though these two types of noise typically lead to different dynamical responses, the OPMs in these two settings take exactly the same structural form, with the corresponding non-Markovian terms depending on the past history of the noise forcing the original SPDE; cf. Eqns. (40)–(41) and Eqns. (99)–(100).

Interestingly, the optimal parameterization formulas align with the approximation of stochastic invariant manifolds when the latter exist. However, when these manifolds are no longer applicable, the optimal parameterization transcends the invariance constraint. In such cases, it provides the optimal stochastic parameterization with respect to the random invariant measure ρω\rho_{\omega} (defined in Section IV.1), conditioned on the resolved variable XX (as proven in Theorem IV.1).

The formulas derived from the BF systems in Section III enable practical approximation of the theoretical optimal parameterization, particularly its non-Markovian random coefficients. These coefficients become crucial, especially when noise affects unresolved modes. They essentially encode how noise propagates to the resolved modes through nonlinear interactions.

We have shown, through various examples, that training on a single noise path equips the coefficients with an optimized historical memory (of the noise) that plays a key role in the accurate prediction of noise-induced excursion statistics, even when applied to a diverse set of test paths. This demonstrates the strength of our hybrid framework, which combines analytical understanding with data-driven insights. This approach allows us to overcome the “extrapolation problem,” a significant challenge for purely data-driven machine-learned parameterizations.

Section II.3 discusses the possibility of constructing more intricate parameterizations using an iterative scheme based on more general BF systems. However, these elaborations require more computational effort due to the repeated stochastic convolutions. Implementing them efficiently necessitates deriving auxiliary ODEs (analogues to Eq. (49)) for the coefficients (ana_{n} and binb_{in}) in Eqns. (31) and (32). Reference [29], Section 7.3, provides examples of such auxiliary ODEs for handling repeated stochastic convolutions.

These more general stochastic parameterizations have the potential to significantly enhance the parameterization capabilities of Φ\Phi (Eq. (33)). This is because they encode higher-order interaction terms with respect to the noise amplitude, as shown in Eq. (30). Resolving these interactions is potentially important for improving the prediction of rare event statistics by non-Markovian reduced systems, especially when the noise intensity is low. This opens doors to potential applications using large deviation or rare event algorithms (e.g., [159, 160, 161, 162, 130, 163, 108, 21]). Additionally, iterating the BF system (Eq. (25)) beyond =2\ell=2 can also introduce higher-order terms involving the low-mode amplitude XX.

The inclusion of higher-order terms in XX is non-unique, though. For instance, inspired by [29, Sec. 7.2.1], stochastic parameterizations accounting for higher-order terms in XX are produced by solving analytically the following BF system:

dp(1)=A𝔠p(1)ds,\displaystyle\mathrm{d}p^{(1)}=A_{\mathfrak{c}}p^{(1)}\,\mathrm{d}s, (108)
dp(2)=(A𝔠p(2)+ΠcG2(p(1)))ds,\displaystyle\mathrm{d}p^{(2)}=\Big{(}A_{\mathfrak{c}}p^{(2)}+\Pi_{c}G_{2}\big{(}p^{(1)}\big{)}\Big{)}\,\mathrm{d}s,
dqn=(λnqn+ΠnG2(p(2)))ds+σndWsn,\displaystyle\mathrm{d}q_{n}=\Big{(}\lambda_{n}q_{n}+\Pi_{n}G_{2}\big{(}p^{(2)}\big{)}\Big{)}\,\mathrm{d}s+\sigma_{n}\,\mathrm{d}W_{s}^{n},
p(1)(t)=p(2)(t)=XH𝔠,qn(tτ)=Y,\displaystyle p^{(1)}(t)=p^{(2)}(t)=X\in H_{\mathfrak{c}},\;q_{n}(t-\tau)=Y,

for which the first two equations for p(1)p^{(1)} and p(2)p^{(2)} are simultaneously integrated backward over [tτ,t][t-\tau,t], followed by the forward integration of the qnq_{n}-equation over the same interval. Compared to Eq. (LABEL:Eq_BF_SPDEcubic), this BF system has two layers in the low-mode variable pp, resulting in parameterizations of order 4 in XX; see [164, Theorem 2] for an example. These higher-order terms come from the nonlinear self-interactions between the low modes, as brought in by the term ΠcG2(p(1))\Pi_{c}G_{2}\big{(}p^{(1)}\big{)}. Accounting for additional interactions, either through (LABEL:Eq_h2) or the iterative procedure of Section II.3, can help improve the parameterization skills (see also [103, 134]), and should be considered when accuracy is essential.
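This backward-forward integration pattern can be sketched in the scalar case, with G2G_{2} taken as a plain square and all coefficients hypothetical: p(1)p^{(1)} and p(2)p^{(2)} are stepped backward over [tτ,t][t-\tau,t] from the terminal condition XX, after which qnq_{n} is stepped forward with the quadratic source.

```python
import numpy as np

def bf_two_layer(lam_c, lam_n, sigma_n, tau, X, Y, dt=1e-3, rng=None):
    """Euler sketch of the two-layer BF system (108), scalar case, G2(p) = p**2."""
    m = int(round(tau / dt))
    # Backward sweep over [t - tau, t], from the terminal condition at s = t:
    p1 = np.empty(m + 1); p2 = np.empty(m + 1)
    p1[m] = p2[m] = X
    for k in range(m, 0, -1):  # step from s down to s - dt
        p1[k - 1] = p1[k] - dt * lam_c * p1[k]
        p2[k - 1] = p2[k] - dt * (lam_c * p2[k] + p1[k] ** 2)
    # Forward sweep for q_n, driven by the low-mode source G2(p2) and noise:
    q = Y
    for k in range(m):
        dW = np.sqrt(dt) * rng.standard_normal() if rng is not None else 0.0
        q += dt * (lam_n * q + p2[k] ** 2) + sigma_n * dW
    return q  # parameterized value, playing the role of Phi_n(tau, X, t; omega)

print(bf_two_layer(0.2, -7.5, 0.5, 1.0, 0.3, 0.0,
                   rng=np.random.default_rng(4)))
```

Setting sigma_n = 0 recovers the deterministic Lyapunov-Perron-type integral of the low-mode source against the kernel eλn(ts)e^{\lambda_{n}(t-s)} over the memory window.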

An alternative route to enhance accuracy would be to stochastically parameterize the parameterization residual from the loss functions in (53), post-minimization. This technique has proven effective in modeling the fast motion transversal to OPM manifolds by means of networks of nonlinear stochastic oscillators [47, 43], as diagnosed by Ruelle-Pollicott resonances [165, 166]. Similarly, rapid dynamical fluctuations, transversal to (Markovian) OPM manifolds (Figure 9), could benefit from refined parameterizations through the incorporation of additional stochastic components following [47], potentially leading to a more accurate representation of rare event statistics by the corresponding closures (Figure 7).

Extending beyond SPDEs driven by Markov processes, our framework can naturally accommodate SPDEs driven by non-Markovian stochastic processes. A prime example is when the stochastic forcing is a fractional Brownian motion (fBm) [167], for which the existence of unstable manifolds has been demonstrated [168]. By leveraging Lyapunov-Perron techniques, as employed in the construction of these manifolds (Section II.2), our parameterization program can proceed analogously to Section III. This involves, here as well, the finite-time integration of relevant backward-forward systems to derive optimizable parameterizations from the governing equation. The random coefficients in these parameterizations would again become functionals of the past history of the forcing noise, albeit this time incorporating a higher degree of memory than the stochastic process ZτnZ^{n}_{\tau} used in the parameterizations of Section III, owing to the memory content of the fBm forcing itself. Such generalized parameterizations should provide valuable insights into the interplay between nonlinearity and noise in physical systems exhibiting long-term memory effects [34, 35, 169].

Regardless of the Markovian or non-Markovian nature of the stochastic forcing, our reduction approach and its potential extensions offer valuable tools for addressing critical questions. By elucidating the intricate interplay between noise and nonlinearity, we can gain deeper dynamical insights. For example, we anticipate applying this reduction program to investigate early warning indicators of tipping phenomena, from a dynamical perspective, by tracing the energy flow from forced scales to physically relevant modes [170, 171]. Climate tipping elements provide a compelling domain for exploring these concepts and applying our methods [67, 172, 173].

The proposed stochastic parameterization method offers a new framework for understanding noise-nonlinearity interactions in stochastic multiscale systems. It is important to note that alternative techniques based on stochastic time scale separation, such as the stochastic path perturbation approach, have been employed to study related phenomena, including the random escape of stochastic fields from criticality in explosive systems [174] and first passage time problems in nonlocal Fisher equations [175, 176]. Future research directions could explore the potential of our method to obtain analytical approximations for first passage times in nonlocal SPDEs near criticality. Such an extension could provide valuable insights into pattern formations for a wide range of physical and biological problems such as in infectious disease modeling [177], nonlinear optics [178], or vegetation models [179].

Acknowledgements.
Several insights discussed in this work have substantially benefited from the reviewers’ constructive comments, and we express our deep gratitude to them. This work has been supported by the Office of Naval Research (ONR) Multidisciplinary University Research Initiative (MURI) grant N00014-20-1-2023, and by the National Science Foundation grants DMS-2108856, DMS-2407483, and DMS-2407484. This work has been also supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 810370) and by a Ben May Center grant for theoretical and/or computational research.

Appendix A Proof of Theorem II.1

Proof. Step 1. Let us remark that for any nn

ΠnGk(e(st)A𝔠X+z𝔠)=\displaystyle\Pi_{n}G_{k}(e^{(s-t)A_{\mathfrak{c}}}X+z_{\mathfrak{c}})=
(j1,,jk)Ik=1k(exp((st)λj)Xj+zj)Gj1jkn,\displaystyle\sum_{(j_{1},\cdots,j_{k})\in I^{k}}\prod_{\ell=1}^{k}\bigg{(}\exp((s-t)\lambda_{j_{\ell}})X_{j_{\ell}}+z_{j_{\ell}}\bigg{)}G_{j_{1}\cdots j_{k}}^{n},

where I={1,,mc}I=\{1,\cdots,m_{c}\} and Gj1jkn=Gk(𝒆j1,,𝒆jk),𝒆nG_{j_{1}\cdots j_{k}}^{n}=\langle G_{k}(\bm{e}_{j_{1}},\cdots,\bm{e}_{j_{k}}),\bm{e}_{n}^{\ast}\rangle.

Note that the product term above can be re-written as
$$\prod_{\ell=1}^{k}\Big(e^{(s-t)\lambda_{j_{\ell}}}X_{j_{\ell}}+z_{j_{\ell}}\Big)=\sum_{K_{z}}\Big(\prod_{p\in K\setminus K_{z}}e^{(s-t)\lambda_{j_{p}}}X_{j_{p}}\Big)\Big(\prod_{q\in K_{z}}z_{j_{q}}\Big), \tag{109}$$
where $K=\{1,\cdots,k\}$ and $K_{z}$ runs through the subsets (possibly empty) of $K$.

Thus,
$$\begin{aligned}\mathfrak{J}(t,X,\omega)&=\sum_{n\geq m_{c}+1}\Big(\int_{-\infty}^{t}e^{(t-s)\lambda_{n}}\,\Pi_{n}G_{k}\big(e^{(s-t)A_{\mathfrak{c}}}X+z_{\mathfrak{c}}(s,\omega)\big)\,\mathrm{d}s\Big)\bm{e}_{n}\\&=\sum_{n\geq m_{c}+1}\sum_{(j_{1},\cdots,j_{k})\in I^{k}}\Big(\int_{-\infty}^{t}\prod_{\ell=1}^{k}\big(e^{(s-t)\lambda_{j_{\ell}}}X_{j_{\ell}}+z_{j_{\ell}}\big)\,e^{(t-s)\lambda_{n}}\,\mathrm{d}s\Big)G_{j_{1}\cdots j_{k}}^{n}\,\bm{e}_{n},\end{aligned} \tag{110}$$
and by using (109), $\mathfrak{J}(t,X,\omega)$ exists if and only if, for each $(j_{1},\cdots,j_{k})\in I^{k}$ with $G_{j_{1}\cdots j_{k}}^{n}\neq 0$, the integrals
$$I_{K_{z}}(t)=\int_{-\infty}^{t}\Big(\prod_{q\in K_{z}}z_{j_{q}}(s,\omega)\Big)\exp\Big((t-s)\Big[\lambda_{n}-\sum_{p\in K\setminus K_{z}}\lambda_{j_{p}}\Big]\Big)\,\mathrm{d}s, \tag{111}$$
exist as $K_{z}$ runs through the subsets (possibly empty) of $K=\{1,\cdots,k\}$.

Since the OU process $z_{j_q}(s,\omega)$ has sublinear growth (see e.g. [30, Lemma 3.1]), that is, $\lim_{s\rightarrow\pm\infty}z_{j_q}(s,\omega)/s=0$ for all $\omega$, one can readily check that the integral $I_{K_{z}}$ defined in (111) is finite for all $t$ provided that
$$\mathrm{Re}\Big(\lambda_{n}-\sum_{p\in K\setminus K_{z}}\lambda_{j_{p}}\Big)<0. \tag{112}$$
Thus, the Lyapunov-Perron integral $\mathfrak{J}$ defined in Eq. (17) is well defined when the non-resonance condition holds.
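As a simple consistency check, for the empty subset $K_{z}=\emptyset$ the noise factor in (111) drops out and the integral can be evaluated in closed form:
$$I_{\emptyset}(t)=\int_{-\infty}^{t}e^{(t-s)\mu}\,\mathrm{d}s=-\frac{1}{\mu},\qquad \mu:=\lambda_{n}-\sum_{p\in K}\lambda_{j_{p}},$$
which is finite precisely when $\mathrm{Re}\,\mu<0$, i.e., when (112) holds with $K_{z}=\emptyset$; these are the coefficients familiar from deterministic invariant-manifold approximations [30].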

Step 2. With $\mathfrak{J}$ thus well defined under the non-resonance condition, a direct calculation gives
$$\frac{\partial\mathfrak{J}(t,X,\omega)}{\partial t}=\Pi_{\mathfrak{f}}G_{k}(X+z_{\mathfrak{c}}(\theta_{t}\omega))-A_{\mathfrak{f}}\mathfrak{J}(t,X,\omega)-\int_{-\infty}^{t}e^{(t-s)A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}DG_{k}\big(e^{(s-t)A_{\mathfrak{c}}}X+z_{\mathfrak{c}}(\theta_{s}\omega)\big)e^{(s-t)A_{\mathfrak{c}}}A_{\mathfrak{c}}X\,\mathrm{d}s, \tag{113}$$
and
$$D_{X}\mathfrak{J}(t,X,\omega)A_{\mathfrak{c}}X=\int_{-\infty}^{t}e^{(t-s)A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}DG_{k}\big(e^{(s-t)A_{\mathfrak{c}}}X+z_{\mathfrak{c}}(\theta_{s}\omega)\big)e^{(s-t)A_{\mathfrak{c}}}A_{\mathfrak{c}}X\,\mathrm{d}s,$$
where $DG_{k}$ denotes the Fréchet derivative of $G_{k}$. As a result, $\mathfrak{J}(t,X,\omega)$ solves (19).

Step 3. The solution to (20) is given explicitly by
$$p(s)=e^{(s-t)A_{\mathfrak{c}}}X,\qquad q(s,\omega)=\int_{t-\tau}^{s}e^{(s-s^{\prime})A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}G_{k}\big(e^{(s^{\prime}-s)A_{\mathfrak{c}}}X+z_{\mathfrak{c}}(s^{\prime},\omega)\big)\,\mathrm{d}s^{\prime}. \tag{114}$$
In particular, the value of $q(s,\omega)$ at $s=t$, denoted by $q_{\tau}(t,X,\omega)$, is given by
$$q_{\tau}(t,X,\omega)=\int_{t-\tau}^{t}e^{(t-s^{\prime})A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}G_{k}\big(e^{(s^{\prime}-t)A_{\mathfrak{c}}}X+z_{\mathfrak{c}}(s^{\prime},\omega)\big)\,\mathrm{d}s^{\prime}.$$
We then get
$$\mathfrak{J}(t,X,\omega)-q_{\tau}(t,X,\omega)=\int_{-\infty}^{t-\tau}e^{(t-s)A_{\mathfrak{f}}}\Pi_{\mathfrak{f}}G_{k}\big(e^{(s-t)A_{\mathfrak{c}}}X+z_{\mathfrak{c}}(s,\omega)\big)\,\mathrm{d}s. \tag{115}$$
The boundedness of $\mathfrak{J}$ implies that the $H_{\mathfrak{f}}$-norm of the integral on the RHS of (115) converges to zero as $\tau$ goes to infinity, and (21) follows. $\blacksquare$

Appendix B Solving the sACE and its optimal reduced system

For the numerical integration of the stochastic Allen-Cahn equation (76), we adopt a semi-implicit Euler method, where the nonlinearity $-u^{3}$ and the noise term $\mathrm{d}\bm{W}_{t}$ are treated explicitly, and the linear term $\partial_{x}^{2}u+u$ is treated implicitly; the spatial discretization is handled with a pseudo-spectral method; see [29, Section 6.1] for a detailed description of this scheme applied to a stochastic Burgers-type equation. We have set the time-step size to $\delta t=10^{-2}$ and the spatial resolution to $\delta x=L/201$.
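For concreteness, the time step just described can be sketched as follows. This is a minimal Python illustration, not the code used for the paper’s experiments: the parameter `lam`, the noise amplitude `sigma`, and the homogeneous Dirichlet sine basis are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dst, idst

def sace_step(u, dt, dx, lam, sigma, rng):
    """One semi-implicit Euler step for du = (u_xx + lam*u - u^3) dt + sigma dW
    with homogeneous Dirichlet boundary conditions on (0, L).  The cubic term
    and the noise increment are explicit; the linear part is inverted in a
    sine (pseudo-)spectral basis, which diagonalizes u_xx."""
    N = u.size                                   # interior grid points
    L = (N + 1) * dx
    k = np.arange(1, N + 1) * np.pi / L          # sine wavenumbers
    # explicit contributions: -u^3 and a discretized space-time white noise
    rhs = u - dt * u**3 + sigma * np.sqrt(dt / dx) * rng.standard_normal(N)
    # implicit solve: (I - dt*(d_xx + lam)) u_new = rhs, done mode by mode
    u_hat = dst(rhs, type=1) / (1.0 - dt * (lam - k**2))
    return idst(u_hat, type=1)
```

Treating the stiff linear operator implicitly keeps the scheme stable at $\delta t=10^{-2}$ even though the high wavenumbers are strongly damped.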

The optimal reduced system (Eq. (85)) is also integrated with a semi-implicit Euler method where the linear term is treated implicitly and the nonlinear terms explicitly, still with $\delta t=10^{-2}$. The ODEs with random coefficients such as (49), used to simulate the non-Markovian coefficients $Z^{n}_{\tau^{*}_{n}}$ involved in the computation of $\Phi_{n}(\tau^{*}_{n},\bm{X},t)$, are simply integrated by a forward Euler scheme, again with $\delta t=10^{-2}$. The initial condition (52) for integrating the RDE (49) is computed using the trapezoidal rule.
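The two elementary ingredients of this step, forward Euler for a pathwise random ODE and the trapezoidal rule for its initial condition, can be sketched generically. The scalar right-hand side `a*z + f(t)` below is a stand-in for illustration; the actual right-hand side of Eq. (49) is not reproduced here.

```python
import numpy as np

def forward_euler_rde(a, f, z0, t0, t1, dt):
    """Forward Euler for a scalar random ODE dz/dt = a*z + f(t), where f
    carries one sampled noise path.  This illustrates only the time-stepping
    used for the non-Markovian coefficients."""
    nsteps = int(round((t1 - t0) / dt))
    z, t = z0, t0
    for _ in range(nsteps):
        z += dt * (a * z + f(t))
        t += dt
    return z

def trapezoidal_ic(g, s0, s1, m):
    """Trapezoidal rule for the integral defining the initial condition
    (cf. Eq. (52)); g is the sampled integrand evaluated on a uniform grid."""
    s = np.linspace(s0, s1, m + 1)
    y = g(s)
    h = (s1 - s0) / m
    return h * (0.5 * (y[0] + y[-1]) + y[1:-1].sum())
```
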

Appendix C Computations of the double-fold bifurcation diagram, and jump-induced transitions

C.1 Computation of the double-fold bifurcation diagram

To compute the double-fold bifurcation diagram shown in Figure 11, we adopt a Fourier sine series expansion of the solutions, in which the solution $u$ of the underlying elliptic boundary value problem is approximated by $u_{N}(x)=\sum_{n=1}^{N}a_{n}\sqrt{2/L}\sin(n\pi x/L)$. By doing so, the elliptic boundary value problem is reduced to a nonlinear algebraic system for the coefficients $\bm{a}_{N}=(a_{1},\ldots,a_{N})^{T}$, which is solved by the Matlab built-in solver fsolve. The results shown in Figure 11 are obtained from a six-dimensional algebraic system (i.e., with six sine modes, $N=6$), which turns out to provide high-precision approximations.
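The sine-Galerkin reduction can be sketched as follows. The paper uses Matlab’s fsolve; here scipy.optimize.fsolve plays the same role, and the nonlinearity $\lambda e^{u}$ (a Gelfand-type problem, which also exhibits a fold) is an illustrative substitute for the paper’s actual right-hand side; only the projection mechanics are shown.

```python
import numpy as np
from scipy.optimize import fsolve

def galerkin_residual(a, lam, L=1.0, M=256):
    """Residual of the sine-Galerkin system for the illustrative problem
    u'' + lam*exp(u) = 0, u(0) = u(L) = 0, projected onto the first N
    normalized sine modes via midpoint quadrature."""
    N = a.size
    x = (np.arange(M) + 0.5) * L / M                      # quadrature nodes
    phi = np.sqrt(2.0 / L) * np.sin(np.outer(np.arange(1, N + 1), np.pi * x / L))
    u = a @ phi                                           # u_N on the grid
    proj = (phi @ (lam * np.exp(u))) * (L / M)            # <lam e^u, phi_n>
    k2 = (np.arange(1, N + 1) * np.pi / L) ** 2
    return -k2 * a + proj                                 # projected residual

# minimal (lower-branch) solution at lam = 1 with six sine modes
a_sol = fsolve(lambda a: galerkin_residual(a, 1.0), np.zeros(6))
```
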

We use a simple continuation approach to compute the bifurcation diagram based on this algebraic system, benefiting from the knowledge that the solution curve is S-shaped. Starting from $\lambda=1$ with a random initial guess, an increment of $\Delta\lambda=10^{-3}$ is used to compute the lower branch $\lambda\mapsto U^{m}_{\lambda}$ of solutions (lower black curve) with minimal energy. In our continuation procedure, the initial guess for $U^{m}_{\lambda+\Delta\lambda}$ is taken to be $U^{m}_{\lambda}$ computed at the previous step. As $\lambda$ is further increased, the procedure is stopped when the associated Jacobian becomes nearly singular, which indicates that $\lambda$ is approaching the lower turning point of the S-shaped curve.

The location of this turning point is estimated to be $\lambda^{\ast}\approx 1.3309$, as obtained by using a finer mesh in $\lambda$ as one gets close to the singularity. We then select a $\lambda$-value just a few $\Delta\lambda$-increments below $\lambda^{\ast}$ (to ensure multiplicity of solutions) and pick as many different random initial guesses as needed to get the two other solutions: the unstable one, $U_{\lambda}^{\ast}$, and the stable one, $U_{\lambda}^{M}$, with maximal energy. From each of these newly computed solutions, we then adopt the same continuation method to trace out the upper branch $\lambda\mapsto U^{M}_{\lambda}$ made of solutions with maximal energy (upper black curve) and the “in between” branch $\lambda\mapsto U_{\lambda}^{\ast}$ made of unstable solutions (red curve).
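The natural-parameter continuation loop just described can be sketched generically. The stopping criterion below (solver non-convergence reported by scipy.optimize.fsolve) is a simple proxy for the near-singular-Jacobian test; the fold residual used in the usage example is a toy scalar problem, not the paper’s algebraic system.

```python
import numpy as np
from scipy.optimize import fsolve

def continue_branch(residual, a0, lam0, dlam, max_steps):
    """Natural-parameter continuation: the solution at the current lam is
    reused as the initial guess at lam + dlam, and the sweep stops once the
    nonlinear solver no longer converges, signalling the approach of a
    turning (fold) point."""
    branch, a, lam = [], np.atleast_1d(a0).astype(float), lam0
    for _ in range(max_steps):
        a, _, ier, _ = fsolve(residual, a, args=(lam,), full_output=True)
        if ier != 1:          # nearly singular Jacobian / no convergence
            break
        branch.append((lam, a.copy()))
        lam += dlam
    return branch
```

For instance, sweeping the toy fold $\lambda-a^{2}=0$ downward from $\lambda=1$ terminates near the fold at $\lambda=0$ instead of exhausting the step budget.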

C.2 Solving the jump-driven dynamics in Eq. (97), and its optimal reduced model

The numerical integration of the stochastic equation (97) is performed in the same way as for the sACE outlined in Appendix B, using a semi-implicit Euler method. Here also the nonlinearity and the noise term are treated explicitly, while the linear term is treated implicitly. The spatial discretization is handled with a pseudo-spectral method; see again [29, Section 6.1]. We have set the time-step size to $\delta t=10^{-2}$ and the spatial resolution to $\delta x=L/257$. The initial data is set to $u_{0}=0.5\,\bm{e}_{1}$ for the results shown in Figs. 12 and 15.

The optimal reduced system is integrated with a forward Euler method, still with $\delta t=10^{-2}$. The initial condition $J^{n}_{\tau^{*}_{n}}(t_{0};\omega)$ (cf. (100)) for integrating (101) is computed using the trapezoidal rule, with $t_{0}=\max\{\tau_{3}^{*},\tau_{5}^{*}\}$. The initial data for the reduced system is set to the projection of the SPDE solution onto $\bm{e}_{1}$ at time $t=t_{0}$.

References

  • [1] P. C. Hohenberg and B. I. Halperin, “Theory of dynamic critical phenomena,” Reviews of Modern Physics, vol. 49, no. 3, p. 435, 1977.
  • [2] J. Pearson, “Complex patterns in a simple system,” Science, vol. 261, no. 5118, pp. 189–192, 1993.
  • [3] M. Cross and P. Hohenberg, “Pattern formation outside of equilibrium,” Reviews of Modern Physics, vol. 65, no. 3, p. 851, 1993.
  • [4] P. S. Berloff and J. C. McWilliams, “Large-scale, low-frequency variability in wind-driven ocean gyres,” Journal of Physical Oceanography, vol. 29, no. 8, pp. 1925–1949, 1999.
  • [5] J. Murray, Mathematical Biology: II: Spatial Models and Biomedical Applications, vol. 3. Springer, 2003.
  • [6] M. Baurmann, T. Gross, and U. Feudel, “Instabilities in spatially extended predator–prey systems: Spatio-temporal patterns in the neighborhood of Turing–Hopf bifurcations,” Journal of Theoretical Biology, vol. 245, no. 2, pp. 220–229, 2007.
  • [7] F. Sagués, J. Sancho, and J. García-Ojalvo, “Spatiotemporal order out of noise,” Reviews of Modern Physics, vol. 79, no. 3, p. 829, 2007.
  • [8] Y. R. Zelnik, E. Meron, and G. Bel, “Gradual regime shifts in fairy circles,” Proc. Natl. Acad. Sci., vol. 112, no. 40, pp. 12327–12331, 2015.
  • [9] E. Meron, “From patterns to function in living systems: Dryland ecosystems as a case study,” Annual Review of Condensed Matter Physics, vol. 9, pp. 79–103, 2018.
  • [10] J. McWilliams, “A survey of submesoscale currents,” Geoscience Letters, vol. 6, no. 1, pp. 1–15, 2019.
  • [11] J. Crawford, “Introduction to bifurcation theory,” Reviews of Modern Physics, vol. 63, no. 4, p. 991, 1991.
  • [12] T. Ma and S. Wang, Phase Transition Dynamics. Springer, second ed., 2019.
  • [13] J. Crawford and E. Knobloch, “Symmetry and symmetry-breaking bifurcations in fluid dynamics,” Annual Review of Fluid Mechanics, vol. 23, no. 1, pp. 341–387, 1991.
  • [14] M. D. Chekroun, E. Park, and R. Temam, “The Stampacchia maximum principle for stochastic partial differential equations and applications,” Journal of Differential Equations, vol. 260, no. 3, pp. 2926–2972, 2016.
  • [15] H. Brand, S. Kai, and S. Wakabayashi, “External noise can suppress the onset of spatial turbulence,” Physical Review Letters, vol. 54, no. 6, p. 555, 1985.
  • [16] S. Kai, H. Fukunaga, and H. R. Brand, “Structure changes induced by external multiplicative noise in the electrohydrodynamic instability of nematic liquid crystals,” Journal of Statistical Physics, vol. 54, pp. 1133–1152, 1989.
  • [17] G. Ahlers, C. W. Meyer, and D. S. Cannell, “Deterministic and stochastic effects near the convective onset,” Journal of Statistical Physics, vol. 54, pp. 1121–1131, 1989.
  • [18] J. García-Ojalvo, A. Hernández-Machado, and J. M. Sancho, “Effects of external noise on the Swift-Hohenberg equation,” Physical Review Letters, vol. 71, no. 10, p. 1542, 1993.
  • [19] I. Rehberg, S. Rasenat, M. De La Torre Juarez, W. Schöpf, F. Hörner, G. Ahlers, and H. Brand, “Thermally induced hydrodynamic fluctuations below the onset of electroconvection,” Physical Review Letters, vol. 67, no. 5, p. 596, 1991.
  • [20] P. Sura and C. Penland, “Sensitivity of a double-gyre ocean model to details of stochastic forcing,” Ocean Modelling, vol. 4, no. 3-4, pp. 327–345, 2002.
  • [21] E. Simonnet, J. Rolland, and F. Bouchet, “Multistability and rare spontaneous transitions in barotropic $\beta$-plane turbulence,” Journal of the Atmospheric Sciences, vol. 78, no. 6, pp. 1889–1911, 2021.
  • [22] R. Saravanan and J. C. McWilliams, “Advective ocean–atmosphere interaction: An analytical stochastic model with implications for decadal variability,” Journal of Climate, vol. 11, no. 2, pp. 165–188, 1998.
  • [23] M. S. Roulston and J. D. Neelin, “The response of an ENSO model to climate noise, weather noise and intraseasonal forcing,” Geophysical Research Letters, vol. 27, no. 22, pp. 3723–3726, 2000.
  • [24] I. Eisenman, L. Yu, and E. Tziperman, “Westerly wind bursts: ENSO’s tail rather than the dog?,” Journal of Climate, vol. 18, no. 24, pp. 5224–5238, 2005.
  • [25] G. Margazoglou, T. Grafke, A. Laio, and V. Lucarini, “Dynamical landscape and multistability of a climate model,” Proceedings of the Royal Society A, vol. 477, p. 20210019, 2021.
  • [26] M. D. Chekroun, H. Liu, and J. C. McWilliams, “The emergence of fast oscillations in a reduced primitive equation model and its implications for closure theories,” Comp. & Fluids, vol. 151, pp. 3–22, 2017.
  • [27] M. Chekroun, H. Liu, and J. McWilliams, “Variational approach to closure of nonlinear dynamical systems: Autonomous case,” J. Stat. Phys., vol. 179, no. 5, pp. 1073–1160, 2020. doi: 10.1007/s10955-019-02458-2.
  • [28] M. Chekroun, H. Liu, and J. C. McWilliams, “Optimal parameterizing manifolds for anticipating tipping points and higher-order critical transitions,” Chaos, vol. 33, no. 9, 2023. doi.org/10.1063/5.0167419.
  • [29] M. D. Chekroun, H. Liu, and S. Wang, Stochastic Parameterizing Manifolds and Non-Markovian Reduced Equations: Stochastic Manifolds for Nonlinear SPDEs II. Springer Briefs in Mathematics, Springer, 2015.
  • [30] M. D. Chekroun, H. Liu, and S. Wang, Approximation of Stochastic Invariant Manifolds: Stochastic Manifolds for Nonlinear SPDEs I. Springer Briefs in Mathematics, Springer, 2015.
  • [31] M. D. Chekroun, H. Liu, J. C. McWilliams, and S. Wang, “Transitions in stochastic non-equilibrium systems: Efficient reduction and analysis,” J. Diff. Eqns., vol. 346, pp. 145–204, 2023.
  • [32] M. Schulz and M. Mudelsee, “REDFIT: estimating red-noise spectra directly from unevenly spaced paleoclimatic time series,” Computers & Geosciences, vol. 28, no. 3, pp. 421–426, 2002.
  • [33] A. Moberg, D. M. Sonechkin, K. Holmgren, N. M. Datsenko, and W. Karlén, “Highly variable Northern Hemisphere temperatures reconstructed from low-and high-resolution proxy data,” Nature, vol. 433, no. 7026, pp. 613–617, 2005.
  • [34] K. Fraedrich, R. Blender, and X. Zhu, “Continuum climate variability: Long-term memory, scaling, and 1/f-noise,” International Journal of Modern Physics B, vol. 23, no. 28n29, pp. 5403–5416, 2009.
  • [35] M. Rypdal and K. Rypdal, “Long-memory effects in linear response models of Earth’s temperature and implications for future global warming,” Journal of Climate, vol. 27, no. 14, pp. 5240–5258, 2014.
  • [36] M. Rypdal and K. Rypdal, “Late Quaternary temperature variability described as abrupt transitions on a 1/f noise background,” Earth System Dynamics, vol. 7, no. 1, pp. 281–293, 2016.
  • [37] K. Hasselmann, “Stochastic climate models. Part I. Theory,” Tellus, vol. 28, no. 6, pp. 473–485, 1976.
  • [38] J. W.-B. Lin and J. D. Neelin, “Influence of a stochastic moist convective parameterization on tropical climate variability,” Geophysical Research Letters, vol. 27, no. 22, pp. 3691–3694, 2000.
  • [39] T. N. Palmer, “A nonlinear dynamical perspective on model error: A proposal for non-local stochastic-dynamic parametrization in weather and climate prediction models,” Quarterly Journal of the Royal Meteorological Society, vol. 127, no. 572, pp. 279–304, 2001.
  • [40] A. J. Majda, I. Timofeyev, and E. Vanden-Eijnden, “A mathematical framework for stochastic climate models,” Comm. Pure Appl. Math, vol. 54, no. 8, pp. 891–974, 2001.
  • [41] C. L. E. Franzke, T. J. O’Kane, J. Berner, P. D. Williams, and V. Lucarini, “Stochastic climate theory and modeling,” Wiley Interdisciplinary Reviews: Climate Change, vol. 6, no. 1, pp. 63–78, 2015.
  • [42] J. Berner, U. Achatz, L. Batté, L. Bengtsson, A. Cámara, H. Christensen, M. Colangeli, D. Coleman, D. Crommelin, S. Dolaptchiev, C. Franzke, P. Friederichs, P. Imkeller, H. Järvinen, S. Juricke, V. Kitsios, F. Lott, V. Lucarini, S. Mahajan, T. Palmer, C. Penland, M. Sakradzija, J.-S. von Storch, A. Weisheimer, M. Weniger, P. Williams, and J.-I. Yano, “Stochastic Parameterization: Toward a New View of Weather and Climate Models,” Bull. Amer. Met. Soc., vol. 98, no. 3, pp. 565–588, 2017.
  • [43] V. Lucarini and M. D. Chekroun, “Theoretical tools for understanding the climate crisis from Hasselmann’s programme and beyond,” Nature Reviews Physics, vol. 5, no. 12, pp. 744–765, 2023. doi:10.1038/s42254-023-00650-8.
  • [44] A. J. Chorin, O. H. Hald, and R. Kupferman, “Optimal prediction with memory,” Physica D, vol. 166, no. 3, pp. 239–257, 2002.
  • [45] D. Givon, R. Kupferman, and A. Stuart, “Extracting macroscopic dynamics: model problems and algorithms,” Nonlinearity, vol. 17, no. 6, pp. R55–R127, 2004.
  • [46] P. Stinis, “A comparative study of two stochastic mode reduction methods,” Physica D, vol. 213, no. 2, pp. 197–213, 2006.
  • [47] M. D. Chekroun, H. Liu, and J. C. McWilliams, “Stochastic rectification of fast oscillations on slow manifold closures,” Proc. Natl. Acad. Sci. USA, vol. 118, no. 48, p. e2113650118, 2021.
  • [48] N. S. Namachchivaya and Y. K. Lin, “Method of stochastic normal forms,” Int. J. Non-linear Mechanics, vol. 26, no. 6, pp. 931–943, 1991.
  • [49] P. H. Coullet, C. Elphick, and E. Tirapegui, “Normal form of a Hopf bifurcation with noise,” Phys. Lett. A, vol. 111, no. 6, pp. 277–282, 1985.
  • [50] L. Arnold, Random Dynamical Systems. Springer, 1998.
  • [51] E. Forgoston and I. B. Schwartz, “Escape rates in a stochastic environment with multiple scales,” SIAM J. Applied Dynamical Systems, vol. 8, no. 3, pp. 1190–1217, 2009.
  • [52] D. Blömker and M. Hairer, Amplitude Equations for SPDEs: Approximate Centre Manifolds and Invariant Measures, in Probability and Partial Differential Equations in Modern Applied Mathematics, vol. 140 of IMA Vol. Math. Appl. Springer, New York, 2005.
  • [53] D. Blömker, Amplitude Equations for Stochastic Partial Differential Equations, vol. 3 of Interdisciplinary Mathematical Science. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2007.
  • [54] D. Blömker, M. Hairer, and G. A. Pavliotis, “Multiscale analysis for stochastic partial differential equations with quadratic nonlinearities,” Nonlinearity, vol. 20, no. 7, pp. 1721–1744, 2007.
  • [55] K. Klepel, D. Blömker, and W. W. Mohammed, “Amplitude equation for the generalized Swift–Hohenberg equation with noise,” Z. Angew. Math. Phys., vol. 65, pp. 1107–1126, 2014.
  • [56] A. Bray, “Theory of phase-ordering kinetics,” Advances in Physics, vol. 51, no. 2, pp. 481–587, 2002.
  • [57] U. C. Täuber, Critical dynamics: a Field Theory Approach to Equilibrium and Non-equilibrium Scaling Behavior. Cambridge University Press, 2014.
  • [58] J. Bebernes and D. Eberly, Mathematical Problems from Combustion Theory, vol. 83. Springer Science & Business Media, 2013.
  • [59] D. A. Frank-Kamenetskii, Diffusion and Heat Exchange in Chemical Kinetics. Princeton University Press, 2015.
  • [60] H. Grad and H. Rubin, “Hydromagnetic equilibria and force-free fields,” Journal of Nuclear Energy, vol. 7, no. 3-4, pp. 284–285, 1958.
  • [61] V. Shafranov, “On magnetohydrodynamical equilibrium configurations,” Soviet Journal of Experimental and Theoretical Physics, vol. 6, 1958.
  • [62] E. M. Izhikevich, Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting. Cambridge: MIT Press, 2010.
  • [63] M. Ghil, “Climate stability for a Sellers-type model,” Journal of the Atmospheric Sciences, vol. 33, no. 1, pp. 3–20, 1976.
  • [64] G. R. North, R. F. Cahalan, and J. A. Coakley Jr, “Energy balance climate models,” Reviews of Geophysics, vol. 19, pp. 91–121, 1981.
  • [65] O. Thual and J. Mcwilliams, “The catastrophe structure of thermohaline convection in a two-dimensional fluid model and a comparison with low-order box models,” Geophysical & Astrophysical Fluid Dynamics, vol. 64, no. 1-4, pp. 67–95, 1992.
  • [66] W. Weijer, W. Cheng, S. Drijfhout, A. Fedorov, A. Hu, L. Jackson, W. Liu, E. McDonagh, J. Mecking, and J. Zhang, “Stability of the Atlantic Meridional Overturning Circulation: A review and synthesis,” Journal of Geophysical Research: Oceans, vol. 124, no. 8, pp. 5336–5375, 2019.
  • [67] T. M. Lenton, H. Held, E. Kriegler, J. W. Hall, W. Lucht, S. Rahmstorf, and H. J. Schellnhuber, “Tipping elements in the Earth’s climate system,” Proceedings of the National Academy of Sciences, vol. 105, no. 6, pp. 1786–1793, 2008.
  • [68] N. Boers, “Observation-based early-warning signals for a collapse of the Atlantic Meridional Overturning Circulation,” Nature Climate Change, vol. 11, no. 8, pp. 680–688, 2021.
  • [69] N. Boers and M. Rypdal, “Critical slowing down suggests that the western Greenland Ice Sheet is close to a tipping point,” Proceedings of the National Academy of Sciences, vol. 118, no. 21, p. e2024192118, 2021.
  • [70] N. Wunderling, J. F. Donges, J. Kurths, and R. Winkelmann, “Interacting tipping elements increase risk of climate domino effects under global warming,” Earth System Dynamics, vol. 12, no. 2, pp. 601–619, 2021.
  • [71] L. Serdukova, Y. Zheng, J. Duan, and J. Kurths, “Metastability for discontinuous dynamical systems under Lévy noise: Case study on Amazonian Vegetation,” Scientific reports, vol. 7, no. 1, p. 9336, 2017.
  • [72] Y. Zheng, F. Yang, J. Duan, X. Sun, L. Fu, and J. Kurths, “The maximum likelihood climate change for global warming under the influence of greenhouse effect and Lévy noise,” Chaos, vol. 30, no. 1, 2020.
  • [73] P. Ditlevsen, “Observation of $\alpha$-stable noise induced millennial climate changes from an ice-core record,” Geophysical Research Letters, vol. 26, no. 10, pp. 1441–1444, 1999.
  • [74] T. Solomon, E. R. Weeks, and H. L. Swinney, “Observation of anomalous diffusion and Lévy flights in a two-dimensional rotating flow,” Physical Review Letters, vol. 71, no. 24, p. 3975, 1993.
  • [75] B. Khouider, A. Majda, and M. Katsoulakis, “Coarse-grained stochastic models for tropical convection and climate,” Proc. Natl. Acad. Sci. USA, vol. 100, no. 21, pp. 11941–11946, 2003.
  • [76] S. N. Stechmann and J. D. Neelin, “A stochastic model for the transition to strong convection,” J. Atmos. Sci., vol. 68, no. 12, pp. 2955–2970, 2011.
  • [77] C. Penland and P. Sardeshmukh, “Alternative interpretations of power-law distributions found in nature,” Chaos, vol. 22, no. 2, p. 023119, 2012.
  • [78] S. Thual, A. J. Majda, N. Chen, and S. N. Stechmann, “Simple stochastic model for El Niño with westerly wind bursts,” Proceedings of the National Academy of Sciences, vol. 113, no. 37, pp. 10245–10250, 2016.
  • [79] N. Chen and A. J. Majda, “Simple stochastic dynamical models capturing the statistical diversity of El Niño Southern Oscillation,” Proceedings of the National Academy of Sciences, vol. 114, no. 7, pp. 1468–1473, 2017.
  • [80] N. Chen, C. Mou, L. M. Smith, and Y. Zhang, “A stochastic precipitating quasi-geostrophic model,” arXiv preprint arXiv:2407.20886, 2024.
  • [81] I. Horenko, “Nonstationarity in multifactor models of discrete jump processes, memory, and application to cloud modeling,” Journal of the Atmospheric Sciences, vol. 68, no. 7, pp. 1493–1506, 2011.
  • [82] M. D. Chekroun, I. Koren, H. Liu, and H. Liu, “Generic generation of noise-driven chaos in stochastic time delay systems: Bridging the gap with high-end simulations,” Science Advances, vol. 8, no. 46, p. eabq7137, 2022.
  • [83] A. Debussche, M. Högele, and P. Imkeller, The Dynamics of Nonlinear Reaction-diffusion Equations with Small Lévy Noise, vol. 2085. Springer, 2013.
  • [84] S. Yuan, J. Hu, X. Liu, and J. Duan, “Slow manifolds for dynamical systems with non-Gaussian stable Lévy noise,” Analysis and Applications, vol. 17, no. 03, pp. 477–511, 2019.
  • [85] X. Liu, “Random invariant manifolds of stochastic evolution equations driven by Gaussian and non-Gaussian noises,” Journal of Mathematical Physics, vol. 62, no. 11, p. 112702, 2021.
  • [86] S. Yuan and D. Bloemker, “Modulation and amplitude equations on bounded domains for nonlinear SPDEs driven by cylindrical $\alpha$-stable Lévy processes,” SIAM Journal on Applied Dynamical Systems, vol. 21, no. 3, pp. 1748–1777, 2022.
  • [87] G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions, vol. 44 of Encyclopedia of Mathematics and its Applications. Cambridge: Cambridge University Press, 2008.
  • [88] G. Da Prato and J. Zabczyk, Ergodicity for Infinite Dimensional Systems. Cambridge University Press, 1996.
  • [89] D. Henry, Geometric Theory of Semilinear Parabolic Equations, vol. 840 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1981.
  • [90] A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, vol. 44 of Applied Mathematical Sciences. Springer-Verlag, New York, 1983.
  • [91] T. Ma and S. Wang, Bifurcation Theory and Applications, vol. 53 of World Scientific Series on Nonlinear Science. Series A: Monographs and Treatises. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2005.
  • [92] G. J. Lord, C. E. Powell, and T. Shardlow, An Introduction to Computational Stochastic PDEs, vol. 50. Cambridge University Press, 2014.
  • [93] M. D. Chekroun and H. Liu, “Effective reduced models from delay differential equations: Bifurcations, tipping solution paths, and ENSO variability,” Physica D, vol. 460, p. 134058, 2024.
  • [94] K.-J. Engel and R. Nagel, One-Parameter Semigroups for Linear Evolution Equations, vol. 194 of Graduate Texts in Mathematics. New York: Springer-Verlag, 2000.
  • [95] M. Chekroun, H. Dijkstra, T. Sengül, and S. Wang, “Transitions of zonal flows in a two-layer quasi-geostrophic ocean model,” Nonlinear dynamics, vol. 109, no. 3, pp. 1887–1904, 2022.
  • [96] A. Bensoussan and F. Flandoli, “Stochastic inertial manifold,” Stochastics Stochastics Rep., vol. 53, pp. 13–39, 1995.
  • [97] G. Da Prato and A. Debussche, “Construction of stochastic inertial manifolds using backward integration,” Stochastics Stochastics Reports, vol. 59, no. 3-4, pp. 305–324, 1996.
  • [98] G. Jetschke, “On the equivalence of different approaches to stochastic partial differential equations,” Mathematische Nachrichten, vol. 128, no. 1, pp. 315–329, 1986.
  • [99] I. Corwin, “The Kardar–Parisi–Zhang equation and universality class,” Random matrices: Theory and applications, vol. 1, no. 01, p. 1130001, 2012.
  • [100] A. J. Roberts, “A step towards holistic discretisation of stochastic partial differential equations,” ANZIAM Journal, vol. 45, pp. C1–C15, 2003.
  • [101] A. J. Roberts, “Resolving the multitude of microscale interactions accurately models stochastic partial differential equations,” LMS Journal of Computation and Mathematics, vol. 9, pp. 193–221, 2006.
  • [102] W. Wang and A. Roberts, “Macroscopic reduction for stochastic reaction–diffusion equations,” The IMA Journal of Applied Mathematics, vol. 78, no. 6, pp. 1237–1264, 2013.
  • [103] D. Blömker and W. W. Mohammed, “Amplitude equations for SPDEs with cubic nonlinearities,” Stochastics: An International Journal of Probability and Stochastic Processes, vol. 85, no. 2, pp. 181–215, 2013.
  • [104] J. Cahn and J. Hilliard, “Free energy of a nonuniform system. I. Interfacial free energy,” The Journal of chemical physics, vol. 28, no. 2, pp. 258–267, 1958.
  • [105] G. Da Prato and A. Debussche, “Stochastic Cahn-Hilliard equation,” Nonlinear Analysis: Theory, Methods & Applications, vol. 26, no. 2, pp. 241–263, 1996.
  • [106] S. Pierini, “Coherence resonance in a double-gyre model of the Kuroshio Extension,” Journal of Physical Oceanography, vol. 40, no. 1, pp. 238–248, 2010.
  • [107] T. P. Sapsis and H. A. Dijkstra, “Interaction of additive noise and nonlinear dynamics in the double-gyre wind-driven ocean circulation,” Journal of Physical Oceanography, vol. 43, no. 2, pp. 366–381, 2013.
  • [108] V. M. Gálfi, V. Lucarini, F. Ragone, and J. Wouters, “Applications of large deviation theory in geophysical fluid dynamics and climate science,” La Rivista del Nuovo Cimento, vol. 44, no. 6, pp. 291–363, 2021.
  • [109] M. Hairer and J. Mattingly, “A theory of hypoellipticity and unique ergodicity for semilinear stochastic PDEs,” Electron. J. Probab., pp. 658–738, 2011.
  • [110] M. Hairer and A. Ohashi, “Ergodic theory for SDEs with extrinsic memory,” Ann. Probab., pp. 1950–1977, 2007.
  • [111] B. Øksendal, Stochastic Differential Equations: An Introduction with Applications. Springer, 2013.
  • [112] B. Goldys and B. Maslowski, “Exponential ergodicity for stochastic Burgers and 2D Navier–Stokes equations,” Journal of Functional Analysis, vol. 226, no. 1, pp. 230–255, 2005.
  • [113] S. Kuksin and A. Shirikyan, Mathematics of Two-Dimensional Turbulence, vol. 194. Cambridge University Press, 2012.
  • [114] M. Hairer and J. Mattingly, “Ergodicity of the 2D Navier-Stokes equations with degenerate stochastic forcing,” Annals of Mathematics, pp. 993–1032, 2006.
  • [115] M. Hairer, J. Mattingly, and M. Scheutzow, “Asymptotic coupling and a general form of Harris theorem with applications to stochastic delay equations,” Probability Theory and Related Fields, vol. 149, no. 1-2, pp. 223–259, 2011.
  • [116] M. Hairer, “Exponential mixing properties of stochastic PDEs through asymptotic coupling,” Probability theory and related fields, vol. 124, no. 3, pp. 345–380, 2002.
  • [117] H. Crauel, “Measure attractors and Markov attractors,” Dynamical Systems, vol. 23, no. 1, pp. 75–107, 2008.
  • [118] F. Flandoli, B. Gess, and M. Scheutzow, “Synchronization by noise for order-preserving random dynamical systems,” The Annals of Probability, vol. 45, no. 2, pp. 1325–1350, 2017.
  • [119] H. Crauel, “Markov measures for random dynamical systems,” Stochastics: An International Journal of Probability and Stochastic Processes, vol. 37, no. 3, pp. 153–173, 1991.
  • [120] H. Crauel, Random Probability Measures on Polish Spaces, vol. 11. The Stochastic Monographs: Theory and Applications of Stochastic Process Series, Taylor & Francis, 2002.
  • [121] I. Chueshov, Monotone Random Systems Theory and Applications, vol. 1779. Springer Science & Business Media, 2002.
  • [122] H. Crauel and F. Flandoli, “Attractors for random dynamical systems,” Probab. Theory Relat. Fields, vol. 100, pp. 365–393, 1994.
  • [123] H. Brézis, Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, 2010.
  • [124] M. D. Chekroun, J. S. W. Lamb, C. J. Pangerl, and M. Rasmussen, “A Girsanov approach to slow parameterizing manifolds in the presence of noise,” arXiv preprint 1903.08598, 2023.
  • [125] S. M. Allen and J. W. Cahn, “A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening,” Acta Metallurgica, vol. 27, no. 6, pp. 1085–1095, 1979.
  • [126] N. Chafee and E. F. Infante, “A bifurcation problem for a nonlinear partial differential equation of parabolic type,” Applicable Analysis, vol. 4, no. 1, pp. 17–37, 1974.
  • [127] V. Ginzburg and L. Landau, “On the theory of superconductivity,” Zh. Eksp. Teor. Fiz., vol. 20, pp. 1064–1082, 1950.
  • [128] M. I. Freidlin and A. D. Wentzell, Random Perturbations of Dynamical Systems, vol. 260 of Grundlehren der Mathematischen Wissenschaften. Heidelberg: Springer, third ed., 2012. Translated from the 1979 Russian original by Joseph Szücs.
  • [129] N. Berglund and B. Gentz, “Sharp estimates for metastable lifetimes in parabolic SPDEs: Kramers’ law and beyond,” Electron. J. Probab., vol. 18, pp. 1–58, 2013.
  • [130] J. Rolland, F. Bouchet, and E. Simonnet, “Computing transition rates for the 1-D stochastic Ginzburg-Landau-Allen-Cahn equation for finite-amplitude noise with a rare event algorithm,” Journal of Statistical Physics, vol. 162, pp. 277–311, 2016.
  • [131] R. Zakine and E. Vanden-Eijnden, “Minimum-action method for nonequilibrium phase transitions,” Physical Review X, vol. 13, no. 4, p. 041044, 2023.
  • [132] T. Caraballo, H. Crauel, J. A. Langa, and J. C. Robinson, “The effect of noise on the Chafee-Infante equation: A nonlinear case study,” Proc. Amer. Math. Soc., vol. 135, no. 2, pp. 373–382, 2007.
  • [133] D. Blömker and H. Fu, “The impact of multiplicative noise in SPDEs close to bifurcation via amplitude equations,” Nonlinearity, vol. 33, no. 8, p. 3905, 2020.
  • [134] S. Qu and H. Gao, “The high-order approximation of SPDEs with multiplicative noise via amplitude equations,” Communications in Nonlinear Science and Numerical Simulation, 2024. doi:10.1016/j.cnsns.2024.107937.
  • [135] F. Bai, A. Spence, and A. M. Stuart, “The numerical computation of heteroclinic connections in systems of gradient partial differential equations,” SIAM J. Appl. Math., vol. 53, no. 3, pp. 743–769, 1993.
  • [136] W. W. Mohammed, D. Blömker, and K. Klepel, “Multi-scale analysis of SPDEs with degenerate additive noise,” J. Evolution Equations, vol. 14, pp. 273–298, 2014.
  • [137] S. Chandrasekhar, An Introduction to the Study of Stellar Structure. Dover Publ., N. Y., 1957.
  • [138] R. H. Fowler, “Further studies of Emden’s and similar differential equations,” The Quarterly Journal of Mathematics, no. 1, pp. 259–288, 1931.
  • [139] E. Hopf, “On Emden’s differential equation,” Monthly Notices of the Royal Astronomical Society, vol. 91, 1931.
  • [140] G. Hetzer, “S-shapedness for energy balance climate models of Sellers-type,” in The Mathematics of Models for Climatology and Environment, pp. 253–287, Springer, 1997.
  • [141] H. Dijkstra and M. J. Molemaker, “Symmetry breaking and overturning oscillations in thermohaline-driven flows,” Journal of Fluid Mechanics, vol. 331, pp. 169–198, 1997.
  • [142] P. Korman and Y. Li, “On the exactness of an S-shaped bifurcation curve,” Proceedings of the American Mathematical Society, vol. 127, no. 4, pp. 1011–1020, 1999.
  • [143] L. N. Trefethen, Spectral Methods in MATLAB. SIAM, 2000.
  • [144] I. Bena, “Dichotomous Markov noise: exact results for out-of-equilibrium systems,” International Journal of Modern Physics B, vol. 20, no. 20, pp. 2825–2888, 2006.
  • [145] W. Horsthemke and R. Lefever, Noise-induced Transitions in Physics, Chemistry, and Biology. Springer, 1984.
  • [146] Note that the other unresolved (and unforced) modes contain less than $0.1\%$ of the total energy and are thus not taken into account here in the parameterization $\mathcal{P}(t)$.
  • [147] R. Cont and P. Tankov, Financial Modelling with Jump Processes. Chapman and Hall/CRC, 2003.
  • [148] S. Peszat and J. Zabczyk, Stochastic Partial Differential Equations with Lévy Noise: An Evolution Equation Approach, vol. 113. Cambridge University Press, 2007.
  • [149] D. Applebaum, “Lévy Processes and Stochastic Calculus,” Cambridge Studies in Advanced Mathematics, vol. 116, 2009.
  • [150] F. Kühn, “Lévy Matters. VI,” Lecture Notes in Mathematics, vol. 2187, 2017.
  • [151] B. Øksendal and A. Sulem, “Stochastic Control of Jump Diffusions,” in Applied Stochastic Control of Jump Diffusions, pp. 93–155, Springer, 2019.
  • [152] M. Kwaśnicki, “Ten equivalent definitions of the fractional Laplace operator,” Fractional Calculus and Applied Analysis, vol. 20, no. 1, pp. 7–51, 2017.
  • [153] D. Applebaum, Semigroups of Linear Operators: With Applications to Analysis, Probability and Physics. London Mathematical Society Student Texts, Cambridge University Press, 2019. doi:10.1017/9781108672641.
  • [154] P. E. Protter, “Stochastic Integration and Differential Equations,” in Stochastic Modelling and Applied Probability, Springer, 2005.
  • [155] F. Kühn, “Solutions of Lévy-driven SDEs with unbounded coefficients as Feller processes,” Proceedings of the American Mathematical Society, vol. 146, no. 8, pp. 3591–3604, 2018.
  • [156] P. Courrège, “Sur la forme intégro-différentielle des opérateurs de $C_{k}^{\infty}$ dans $C$ satisfaisant au principe du maximum,” Séminaire Brelot-Choquet-Deny. Théorie du Potentiel, vol. 10, no. 1, pp. 1–38, 1965.
  • [157] G. Da Prato, Kolmogorov Equations for Stochastic PDEs. Springer Science & Business Media, 2004.
  • [158] S. Yuan, Z. Zeng, and J. Duan, “Stochastic bifurcation for two-time-scale dynamical system with $\alpha$-stable Lévy noise,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2021, no. 3, p. 033204, 2021.
  • [159] W. E, W. Ren, and E. Vanden-Eijnden, “Minimum action method for the study of rare events,” Communications on Pure and Applied Mathematics, vol. 57, pp. 637–656, 2004.
  • [160] F. Cérou and A. Guyader, “Adaptive multilevel splitting for rare event analysis,” Stochastic Analysis and Applications, vol. 25, no. 2, pp. 417–443, 2007.
  • [161] E. Vanden-Eijnden and M. G. Westdickenberg, “Rare events in stochastic partial differential equations on large spatial domains,” J. Stat. Phys., vol. 131, no. 6, pp. 1023–1038, 2008.
  • [162] T. Grafke, R. Grauer, and T. Schäfer, “The instanton method and its numerical implementation in fluid mechanics,” Journal of Physics A: Mathematical and Theoretical, vol. 48, no. 33, p. 333001, 2015.
  • [163] F. Ragone, J. Wouters, and F. Bouchet, “Computation of extreme heat waves in climate models using a large deviation algorithm,” Proceedings of the National Academy of Sciences, vol. 115, no. 1, pp. 24–29, 2018.
  • [164] M. D. Chekroun and H. Liu, “Finite-horizon parameterizing manifolds, and applications to suboptimal control of nonlinear parabolic PDEs,” Acta Applicandae Mathematicae, vol. 135, no. 1, pp. 81–144, 2015.
  • [165] M. D. Chekroun, A. Tantet, H. Dijkstra, and J. D. Neelin, “Ruelle-Pollicott resonances of stochastic systems in reduced state space. Part I: Theory,” J. Stat. Phys., vol. 179, pp. 1366–1402, 2020. doi: 10.1007/s10955-020-02535-x.
  • [166] A. Tantet, M. D. Chekroun, H. Dijkstra, and J. D. Neelin, “Ruelle-Pollicott resonances of stochastic systems in reduced state space. Part II: stochastic Hopf bifurcation,” J. Stat. Phys., vol. 179, pp. 1403–1448, 2020. doi:10.1007/s10955-020-02526-y.
  • [167] B. B. Mandelbrot and J. W. Van Ness, “Fractional Brownian motions, fractional noises and applications,” SIAM Review, vol. 10, no. 4, pp. 422–437, 1968.
  • [168] M. J. Garrido-Atienza, K. Lu, and B. Schmalfuß, “Unstable invariant manifolds for stochastic PDEs driven by a fractional Brownian motion,” Journal of Differential Equations, vol. 248, no. 7, pp. 1637–1667, 2010.
  • [169] M. Pereira, C. Gissinger, and S. Fauve, “1/f noise and long-term memory of coherent structures in a turbulent shear flow,” Physical Review E, vol. 99, p. 023106, 2019.
  • [170] R. M. van Westen, M. Kliphuis, and H. A. Dijkstra, “Physics-based early warning signal shows that AMOC is on tipping course,” Science Advances, vol. 10, p. eadk1189, 2024.
  • [171] V. Lucarini and M. Chekroun, “Detecting and attributing change in climate and complex systems: Foundations, Green’s functions, and nonlinear fingerprints,” Physical Review Letters, vol. 133, p. 244201, 2024. doi:10.1103/PhysRevLett.133.244201.
  • [172] T. M. Lenton, J. Rockström, O. Gaffney, S. Rahmstorf, K. Richardson, W. Steffen, and H. J. Schellnhuber, “Climate tipping points—too risky to bet against,” Nature, vol. 575, pp. 592–595, 2019.
  • [173] D. I. A. McKay, A. Staal, J. F. Abrams, R. Winkelmann, B. Sakschewski, S. Loriani, I. Fetzer, S. E. Cornell, J. Rockström, and T. M. Lenton, “Exceeding 1.5°C global warming could trigger multiple climate tipping points,” Science, vol. 377, p. eabn7950, 2022.
  • [174] M. O. Cáceres and M. A. Fuentes, “Stochastic escape processes from a non-symmetric potential normal form III: extended explosive systems,” Journal of Physics A: Mathematical and General, vol. 32, no. 18, p. 3209, 1999.
  • [175] M. Fuentes, M. Kuperman, and V. Kenkre, “Nonlocal interaction effects on pattern formation in population dynamics,” Physical Review Letters, vol. 91, p. 158104, 2003.
  • [176] M. O. Cáceres and M. A. Fuentes, “First-passage times for pattern formation in nonlocal partial differential equations,” Physical Review E, vol. 92, p. 042122, 2015.
  • [177] S. Ruan, “Spatial-temporal dynamics in nonlocal epidemiological models,” in Mathematics for Life Science and Medicine, Takeuchi, Y., Iwasa, Y., Sato, K. (eds), pp. 97–122, Springer, 2007.
  • [178] C. Fernandez-Oto, M. Clerc, D. Escaff, and M. Tlidi, “Strong nonlocal coupling stabilizes localized structures: an analysis based on front dynamics,” Physical Review Letters, vol. 110, p. 174101, 2013.
  • [179] E. Gilad, J. von Hardenberg, A. Provenzale, M. Shachak, and E. Meron, “Ecosystem engineers: from pattern formation to habitat creation,” Physical Review Letters, vol. 93, p. 098105, 2004.