
Optimal Parameterizing Manifolds for Anticipating Tipping Points and Higher-order Critical Transitions

Mickaël D. Chekroun [email protected] Department of Atmospheric and Oceanic Sciences, University of California, Los Angeles, CA 90095-1565, USA Department of Earth and Planetary Sciences, Weizmann Institute of Science, Rehovot 76100, Israel    Honghu Liu Department of Mathematics, Virginia Tech, Blacksburg, VA 24061, USA    James C. McWilliams Department of Atmospheric and Oceanic Sciences and Institute of Geophysics and Planetary Physics, University of California, Los Angeles, CA 90095-1565, USA
(September 15, 2023)
Abstract

A general, variational approach to derive low-order reduced models from possibly non-autonomous systems is presented. The approach is based on the concept of optimal parameterizing manifold (OPM), which substitutes for the more classical notions of invariant or slow manifolds when a breakdown of “slaving” occurs, i.e. when the unresolved variables can no longer be expressed as an exact functional of the resolved ones. The OPM provides, within a given class of parameterizations of the unresolved variables, the manifold that optimally averages out these variables as conditioned on the resolved ones.

The class of parameterizations retained here consists of continuous deformations of parameterizations that are rigorously valid near the onset of instability. These deformations are produced through the integration of auxiliary backward-forward (BF) systems built from the model’s equations, and they lead to analytic formulas for the parameterizations. In this modus operandi, the backward integration time is the key parameter to select, per scale/variable to parameterize, in order to derive relevant parameterizations; these are bound to be no longer exact away from the instability onset, due to the breakdown of slaving typically encountered e.g. in chaotic regimes. The selection criterion is a data-informed minimization of a least-square parameterization defect. It is thus shown, through optimization of the backward integration time per scale/variable to parameterize, that skillful OPM reduced systems can be derived for accurately predicting higher-order critical transitions or catastrophic tipping phenomena, while training our parameterization formulas on regimes prior to these transitions taking place.

We introduce a framework for model reduction that produces analytic formulas to parameterize the neglected variables. These parameterizations are built from the model’s equations, in which only one scalar is left to calibrate per scale/variable to parameterize. This calibration is accomplished through a data-informed minimization of a least-square parameterization defect. By their analytic fabric, the resulting parameterizations benefit from physical interpretability. Furthermore, our hybrid framework, analytic and data-informed, enables us to bypass the so-called extrapolation problem, known to be an important issue for purely data-driven, machine-learned parameterizations. Here, by training our parameterizations prior to transitions taking place, we are able to accurately predict these transitions via the corresponding reduced systems.

I Introduction

This article is concerned with the efficient reduction of forced-dissipative systems of the form

\frac{\mathrm{d}y}{\mathrm{d}t}=Ay+G(y)+F(t),\qquad y\in H, \qquad (1)

in which $H$ denotes a Hilbert state space, possibly of finite dimension. The forcing term $F$ is either constant in time or time-dependent, while $A$ denotes a linear operator, not necessarily self-adjoint but diagonalizable over the set of complex numbers $\mathbb{C}$, that includes dissipative effects, and $G$ denotes a nonlinear operator which saturates the possible growth due to unstable modes and may account for a loss of regularity (such as for nonlinear advection) in the infinite-dimensional setting. Such systems arise in a broad range of applications; see e.g. [2, 3, 4, 5, 6, 7, 8, 9, 10, 11].
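To fix ideas, here is a minimal numerical sketch of a system of the form (1): a hypothetical two-mode ODE (not one of the cited applications) whose linear part $A=\mathrm{diag}(0.1,-1.0)$ has one weakly unstable and one damped mode, whose quadratic nonlinearity $G$ saturates the unstable growth, and whose forcing $F(t)$ is time-periodic. All coefficients are illustrative assumptions.

```python
import math

def rhs(t, y):
    """Right-hand side of a toy instance of Eq. (1): A*y + G(y) + F(t)."""
    y1, y2 = y
    # A = diag(0.1, -1.0): one unstable mode, one stable (dissipative) mode
    # G: energy-transferring quadratic coupling that saturates the growth
    # F(t): periodic forcing acting on the first mode
    dy1 = 0.1 * y1 - y1 * y2 + 0.5 * math.cos(t)
    dy2 = -1.0 * y2 + y1 * y1
    return (dy1, dy2)

def euler(y0, t0, t1, dt=1e-3):
    """Forward-Euler integration; crude but sufficient for illustration."""
    t, y = t0, list(y0)
    while t < t1:
        dy = rhs(t, y)
        y = [yi + dt * dyi for yi, dyi in zip(y, dy)]
        t += dt
    return y

y_end = euler((0.1, 0.0), 0.0, 50.0)
```

Despite the linear instability, the nonlinear coupling keeps the trajectory bounded, which is the qualitative behavior the operators $A$ and $G$ are assumed to produce in Eq. (1).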

The framework adopted is that of [12], which allows for deriving analytic parameterizations of the neglected scales that efficiently represent the nonlinear interactions with the retained, resolved variables. The originality of the approach of [12] is that the parameterizations are hybrid in nature: they are both model-adaptive, based on the dynamical equations, and data-informed, by high-resolution simulations.

For a given system, the optimization of these parameterizations thus benefits from their analytic origin, resulting in only a few parameters to learn over short-in-time training intervals, mainly one scalar parameter per scale to parameterize. Their analytic fabric then contributes to their physical interpretability, compared e.g. to parameterizations that would be machine-learned. The approach is applicable to deterministic systems, finite- or infinite-dimensional, and is based on the concept of optimal parameterizing manifold (OPM), which substitutes for the more classical notion of slow or invariant manifolds when a breakdown of “slaving” relationships between the resolved and unresolved variables takes place [12], i.e. when the latter can no longer be expressed as an exact functional of the former.

By construction, the OPM takes its origin in a variational approach. The OPM is the manifold that optimally averages out the neglected variables as conditioned on the current state of the resolved ones [12, Theorem 4]. The OPM allows for computing approximations of the conditional expectation term arising in the Mori-Zwanzig approach to stochastic modeling of neglected variables [13, 14, 15, 16, 17]; see also [12, Theorem 5] and [18, 19].

The approach introduced in [12] to determine OPMs in practice consists of first deriving analytic parametric formulas that match rigorous leading-order approximations of unstable/center manifolds or slow manifolds near e.g. the onset of instability [12, Sec. 2], and then performing a data-driven optimization of the manifold formulas’ parameters to handle regimes further away from that instability onset [12, Sec. 4]. In other words, efficient parameterizations away from the onset are obtained as continuous deformations of parameterizations near the onset; deformations that are optimized by minimization of cost functionals tailored to the dynamics and measuring the defect of parameterization.

There, the optimization stage allows for alleviating the small-denominator problems rooted in small spectral gaps and for improving the parameterization of small-energy but dynamically important variables. Thereby, relevant parameterizations are derived in regimes where constraining spectral gap or timescale separation conditions are responsible for the well-known failure of standard invariant/inertial or slow manifolds [20, 21, 22, 23, 24, 25, 12]. As a result, the OPM approach provides (i) a natural remedy to the excessive backscatter transfer of energy to the low modes classically encountered in turbulence [12, Sec. 6], and (ii) a means of deriving optimal models of the slow motion for fast-slow systems not necessarily in the presence of timescale separation [18, 19]. Due to their optimal nature, OPMs also provide accurate parameterizations of dynamically important small-energy variables; a well-known issue encountered in the closure of chaotic dynamics and related to (i).

This work examines the ability of the theory-guided and data-informed parameterization approach of [12] to derive reduced systems able to predict higher-order transitions or catastrophic tipping phenomena, when the original, full system is possibly subject to time-dependent perturbations. From a data-driven perspective, this problem is tied to the so-called extrapolation problem, known for instance to be an important issue in machine learning, requiring more advanced methods such as e.g. transfer learning [26]. While the past few decades have witnessed a surge of advances in data-driven reduced-order modeling methodologies [27, 28], the prediction of non-trivial dynamics for parameter regimes not included in the training dataset remains a challenging task. Here, the OPM approach, by its hybrid framework (analytic and data-informed), allows us to bypass this extrapolation problem at a minimal cost in terms of learning efforts, as illustrated in Secns. IV and V. As shown below, the training of OPMs at parameter values prior to the transitions taking place is sufficient to perform accurate predictions of these transitions via OPM reduced systems.

The remainder of this paper is organized as follows. We first survey in Sec. II the OPM approach and provide the general variational parameterization formulas for model reduction in the presence of time-dependent forcing. We then expand in Sec. III on the backward-forward (BF) systems method [29, 12] to derive these formulas, clarifying fundamental relationships with homological equations arising in normal form and invariant manifold theories [30, 31, 12, 32]. Section III.3 completes this analysis with analytic versions of these formulas in view of applications. The ability to predict noise-induced transitions and catastrophic tipping phenomena [33, 34] through OPM reduced systems is illustrated in Sec. IV for a system arising in the study of the thermohaline circulation. Successes in predicting higher-order transitions such as period-doubling and chaos are reported in Sec. V for a Rayleigh-Bénard problem, and contrasted with the difficulties encountered by standard manifold parameterization approaches in Appendix D. We then summarize the findings of this article with some concluding remarks in Sec. VI.

II Variational Parameterizations for Model Reduction

We summarize in this Section the notion of variational parameterizations introduced in [12]. The state space is decomposed as the sum of the subspace $H_{\mathfrak{c}}$ of resolved variables (“coarse-scale”) and the subspace $H_{\mathfrak{s}}$ of unresolved variables (“small-scale”). In practice, $H_{\mathfrak{c}}$ is spanned by the first few eigenmodes of $A$ with dominant real parts (e.g. unstable), and $H_{\mathfrak{s}}$ by the rest of the modes, typically stable.

In many applications, one is interested in deriving reliable reduced systems of Eq. (1) for wavenumbers smaller than a cutoff scale $k_{c}$, corresponding to a reduced state space $H_{\mathfrak{c}}$ of dimension $m_{c}$. To do so, we are after parameterizations of the unresolved variables $y_{\mathfrak{s}}$ (i.e., the component of the solution $y$ to Eq. (1) in $H_{\mathfrak{s}}$) of the form

\Phi_{\bm{\tau}}(X,t)=\sum_{n\geq m_{c}+1}\Phi_{n}(\tau_{n},X,t)\,\bm{e}_{n},\qquad X\in H_{\mathfrak{c}}, \qquad (2)

in which $\bm{\tau}$, the (possibly infinite) vector formed by the scalars $\tau_{n}$, is a free parameter. As will be apparent below, the goal is to get a small-scale parameterization which is not necessarily exact, but such that $y_{\mathfrak{s}}(t)$ is approximated by $\Phi_{\bm{\tau}}(y_{\mathfrak{c}}(t),t)$ in a least-square sense, where $y_{\mathfrak{c}}$ is the coarse-scale component of $y$ in $H_{\mathfrak{c}}$. The vector $\bm{\tau}$ is meant to serve as a homotopic deformation parameter. The purpose is to have, as $\bm{\tau}$ is varied, parameterizations that cover situations for which slaving relationships between the large and small scales hold ($y_{\mathfrak{s}}(t)=\Phi_{\bm{\tau}}(y_{\mathfrak{c}}(t),t)$) as well as situations in which they are broken ($y_{\mathfrak{s}}(t)\neq\Phi_{\bm{\tau}}(y_{\mathfrak{c}}(t),t)$), e.g. far from criticality. With a suitable $\bm{\tau}$, the goal is to obtain a reduced system resolving only the “coarse scales,” able e.g. to predict higher-order transitions caused by nonlinear interactions with the neglected “small scales.”

As strikingly shown in [12], meaningful parameterizations away from the instability onset can still be obtained by relying on bifurcation and center manifold theories. To accomplish this feat, the parameterizations, rigorously valid near that onset, need though to be revisited as suitably designed continuous deformations. The modus operandi to provide such continuous deformations is detailed in Sec. III below, based on the method of backward-forward systems introduced in [29, Chap. 4] in a stochastic context. In the case where $G(y)=G_{k}(y)+\text{h.o.t.}$ (with $G_{k}$ a homogeneous polynomial of degree $k$), this approach gives analytic parameterization formulas of the form (see Sec. III.1)

\Phi_{\bm{\tau}}(X,t)=\sum_{n\geq m_{c}+1}\bigg(\int_{-\tau_{n}}^{0}e^{-s\lambda_{n}}\Big(\Pi_{n}G_{k}\big(e^{sA_{\mathfrak{c}}}X-\eta(s)\big)+\Pi_{n}F(s+t)\Big)\,\mathrm{d}s\bigg)\bm{e}_{n}, \qquad (3)
\mbox{with } X=\sum_{j=1}^{m_{c}}X_{j}\bm{e}_{j}\in H_{\mathfrak{c}}\;\mbox{ and }\;\eta(s)=\int_{s}^{0}e^{A_{\mathfrak{c}}(s-r)}\Pi_{\mathfrak{c}}F(r+t)\,\mathrm{d}r.

Here, the $(\lambda_{n},\bm{e}_{n})$ denote the spectral elements of $A$, ordered such that $\Re(\lambda_{n})\geq\Re(\lambda_{n+1})$. In Eq. (3), $\Pi_{n}$ and $\Pi_{\mathfrak{c}}$ denote the projectors onto $\mathrm{span}(\bm{e}_{n})$ and $H_{\mathfrak{c}}=\mathrm{span}(\bm{e}_{1},\cdots,\bm{e}_{m_{c}})$, respectively, while $A_{\mathfrak{c}}=\Pi_{\mathfrak{c}}A$. This formula provides an explicit expression for $\Phi_{n}(\tau_{n},X,t)$ in (2), given here by the integral term over $[-\tau_{n},0]$ multiplying $\bm{e}_{n}$ in (3). The only free parameter per mode $\bm{e}_{n}$ to parameterize is the backward integration time $\tau_{n}$.

The vector $\bm{\tau}$, made of the $\tau_{n}$, is then optimized using data from the full model. To allow for better flexibility, the optimization is executed mode by mode, for each mode one wishes to parameterize. Thus, given a solution $y(t)$ of the full model Eq. (1) over an interval $I_{T}$ of length $T$, each parameterization $\Phi_{n}(\tau_{n},X,t)$ of the solution amplitude $y_{n}(t)$ carried by mode $\bm{e}_{n}$ is optimized by minimizing, in the $\tau_{n}$-variable, the following parameterization defect

\mathcal{Q}_{n}(\tau_{n},T)=\overline{\big|y_{n}(t)-\Phi_{n}(\tau_{n},y_{\mathfrak{c}}(t),t)\big|^{2}}, \qquad (4)

for each $n\geq m_{c}+1$. Here, $\overline{(\cdot)}$ denotes the time mean over $I_{T}$, while $y_{n}(t)$ and $y_{\mathfrak{c}}(t)$ denote the projections of $y(t)$ onto the high mode $\bm{e}_{n}$ and the reduced state space $H_{\mathfrak{c}}$, respectively.

Geometrically, the function $\Phi_{\bm{\tau}}(X,t)$ given by (3) provides a time-dependent manifold $\mathfrak{M}_{\bm{\tau}}(t)$ such that:

\overline{\mathrm{dist}(y(t),\mathfrak{M}_{\bm{\tau}}(t))^{2}}\leq\sum_{n=m_{c}+1}^{N}\mathcal{Q}_{n}(\tau_{n},T), \qquad (5)

where $\mathrm{dist}(y(t),\mathfrak{M}_{\bm{\tau}}(t))$ denotes the distance of $y(t)$ (lying e.g. on the system’s global attractor) to the manifold $\mathfrak{M}_{\bm{\tau}}(t)$.

Thus, minimizing each $\mathcal{Q}_{n}(\tau_{n},T)$ (in the $\tau_{n}$-variable) is a natural idea to enforce closeness of $y(t)$, in a least-squares sense, to the manifold $\mathfrak{M}_{\bm{\tau}}(t)$. Panel A in Fig. 1 illustrates (5) for the $y_{n}$-component: the optimal parameterization $\Phi_{n}(\tau_{n}^{\ast},X,t)$ minimizing (4) is shown for a case where the dynamics is transverse to it (e.g. in the absence of slaving), while still providing the best parameterization in a least-squares sense.

Figure 1: Panel A: The black curve represents the training trajectory, here shown to be transverse to the parameterizing manifolds (i.e. absence of exact slaving). The grey smooth curves represent the time-dependent OPM aimed at tracking the state of the neglected variable $y_{n}$ at time $t$ as a function of the resolved variables $X$. Panel B: A schematic of the dependence on $\tau$ of the parameterization defect $\mathcal{Q}_{n}$ given by (4). The red asterisk marks the minimum of $\mathcal{Q}_{n}$, achieved for $\tau=\tau_{n}^{\ast}$.

In practice, the normalized parameterization defect, $Q_{n}$, is often used to compare different parameterizations. It is defined as $Q_{n}(\tau,T)=\mathcal{Q}_{n}(\tau,T)/\overline{|y_{n}|^{2}}$. For instance, the flat manifold corresponding to no parameterization ($\tau_{n}=0$) of the neglected variables (Galerkin approximations) comes with $Q_{n}=1$ for all $n$, while a manifold corresponding to a perfect slaving relationship between $y_{\mathfrak{c}}(t)$ and the $y_{n}(t)$ comes with $Q_{n}=0$. When $0<Q_{n}<1$ for all $n$, the underlying manifold $\mathfrak{M}_{\bm{\tau}}$ will be referred to as a parameterizing manifold (PM). Once the parameters $\tau_{n}$ are optimized by minimization of (4), the resulting manifold will be referred to as the optimal parameterizing manifold (OPM). (Here, we make the abuse of language that the OPM is sought within the class of parameterizations obtained through the backward-forward systems of Sec. III. In general, OPMs are more general objects, related to the conditional expectation associated with the projector $\Pi_{\mathfrak{c}}$; see [12, Sec. 3.2].)
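For concreteness, the scan below sketches the per-mode minimization of the defect (4), in its normalized form $Q_{n}$, on a scalar toy problem where formula (10) integrates in closed form: one resolved mode with eigenvalue lam_c interacts quadratically (coefficient g) with one unresolved mode with eigenvalue lam_n. All numerical values and the synthetic training signal are hypothetical.

```python
import math

lam_c, lam_n, g = 0.1, -1.0, 1.0
delta = 2 * lam_c - lam_n          # the "small denominator" 2*lam_c - lam_n

def a(tau):
    """Closed form of formula (10) for this scalar quadratic case:
    Phi_tau(X) = a(tau) * X**2."""
    return g * (1.0 - math.exp(-delta * tau)) / delta

# Synthetic training data: X(t) oscillates, and the true (non-slaved) relation
# y_n = a_true * X**2 lies strictly inside the family swept by tau.
a_true = 0.5
ts = [0.01 * k for k in range(2000)]
X = [math.sin(0.3 * t) + 0.5 * math.cos(1.7 * t) for t in ts]
yn = [a_true * x * x for x in X]

def Q(tau):
    """Normalized parameterization defect Q_n(tau, T)."""
    num = sum((y - a(tau) * x * x) ** 2 for x, y in zip(X, yn))
    den = sum(y * y for y in yn)
    return num / den

taus = [0.01 * k for k in range(1, 500)]
tau_star = min(taus, key=Q)        # simple scan in place of gradient descent
```

Since the synthetic data correspond to a_true strictly inside the range swept by a(tau), the defect has a unique interior minimum at a finite tau_star, mimicking Panel B of Fig. 1; a gradient descent, as used in [12, Appendix], would locate the same minimizer.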

We conclude this section with a few practical considerations. As documented in [12, Secns. 5 and 6], the amount of training data required to reach a robust estimation of the optimal backward integration time $\tau^{*}_{n}$ is often comparable to the time horizon necessary to resolve the decay of correlations for the high-mode amplitude $y_{n}(t)$. For multiscale systems, one thus often needs a training dataset sufficiently large to resolve the slowest decay of temporal correlations of the scales to parameterize. On the other hand, by benefiting from their dynamical origin, i.e. through the model’s equations, the parameterization formulas employed in this study (see Sec. III) often allow, in practice, for reaching satisfactory OPM approximations when optimized over training intervals shorter than these characteristic timescales.

When these conditions are met, the minimization of the parameterization defect (4) is performed by a simple gradient-descent algorithm [12, Appendix]. There, the first local minimum that one reaches often corresponds to the OPM; see Secns. IV and V below, and [12, Secns. 5 and 6]. In the rare occasions where the parameterization defect exhibits more than one local minimum and the corresponding local minima are close to each other, criteria involving colinearity between the features to parameterize and the parameterization itself can be designed to further assist the OPM selection. Section V.4 below illustrates this point with the notion of parameterization correlation.

III Variational Parameterizations: Explicit Formulas

III.1 Homological Equation and Backward-Forward Systems: Time-dependent Forcing

In this section, we recall from [12], and generalize to the case of time-dependent forcing, the theoretical underpinnings of the parameterization formula (3). First, observe that when $F\equiv 0$ and $\tau_{n}=\infty$ for all $n$, the parameterization (3) reduces (formally) to

\mathfrak{J}(X)=\int_{-\infty}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}G_{k}(e^{sA_{\mathfrak{c}}}X)\,\mathrm{d}s,\qquad X\in H_{\mathfrak{c}}, \qquad (6)

where $A_{\mathfrak{s}}=\Pi_{\mathfrak{s}}A$, with $\Pi_{\mathfrak{s}}$ denoting the canonical projector onto $H_{\mathfrak{s}}$. Eq. (6) is known in invariant manifold theory as a Lyapunov-Perron integral [7]. It provides the leading-order approximation of the underlying invariant manifold function if a suitable spectral gap condition is satisfied, and it solves the homological equation:

D\psi(X)A_{\mathfrak{c}}X-A_{\mathfrak{s}}\psi(X)=\Pi_{\mathfrak{s}}G_{k}(X), \qquad (7)

see [2, Lemma 6.2.4] and [12, Theorem 1]. The latter equation is a simplification of the invariance equation providing the invariant manifold when it exists; see [5, Sec. VIIA1] and [12, Sec. 2.2]. Thus, Lyapunov-Perron integrals such as (6) are intimately related to the homological equation, and the study of the latter informs us on the former, and in turn on the closed form of the underlying invariant manifold.
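As a simple illustration (not spelled out in the original text), the homological equation (7) can be solved explicitly when $H_{\mathfrak{c}}$ is one-dimensional and the nonlinear interaction is quadratic:

```latex
% Take H_c = span(e_1), G_k(X) = g X^2 e_n, and seek psi(X) = c X^2 e_n in (7):
D\psi(X)A_{\mathfrak{c}}X - A_{\mathfrak{s}}\psi(X)
  = 2c\lambda_{1}X^{2} - c\lambda_{n}X^{2}
  = gX^{2}
\quad\Longrightarrow\quad
c = \frac{g}{2\lambda_{1}-\lambda_{n}}.
```

The same coefficient is recovered as the formal $\tau\rightarrow\infty$ limit of (10) below, and the denominator $2\lambda_{1}-\lambda_{n}$ is precisely the type of small denominator that the finite-$\tau$ deformations of this section are designed to control.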

Another key element was pointed out in [29, Chap. 4], in the quest for new insights about invariant manifolds in general, and more specifically about the approximation of stochastic invariant manifolds of stochastic PDEs [36], along with the rigorous low-dimensional approximation of their dynamics [37]. It consists of the method of Backward-Forward (BF) systems, thereafter revealed in [12] as a key tool to produce parameterizations (based on the model’s equations) that are relevant beyond the domain of applicability of invariant manifold theory, i.e. away from the onset of instability.

To better understand this latter feature, let us first recall that BF systems provide a flow interpretation to Lyapunov-Perron integrals such as (6). In the case of (6), this BF system takes the form [12, Eq. (2.29)]:

\frac{\mathrm{d}p}{\mathrm{d}s}=A_{\mathfrak{c}}p,\qquad s\in[-\tau,0], \qquad (8a)
\frac{\mathrm{d}q}{\mathrm{d}s}=A_{\mathfrak{s}}q+\Pi_{\mathfrak{s}}G_{k}(p),\qquad s\in[-\tau,0], \qquad (8b)
\mbox{with } p(s)|_{s=0}=X,\;\mbox{ and } q(s)|_{s=-\tau}=0. \qquad (8c)

The Lyapunov-Perron integral is indeed recovered from this BF system. The solution to Eq. (8b) at $s=0$ is given by

q(0)=\int_{-\tau}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}G_{k}(e^{sA_{\mathfrak{c}}}X)\,\mathrm{d}s,\qquad X\in H_{\mathfrak{c}}, \qquad (9)

which, by formally taking the limit in (9) as $\tau\rightarrow\infty$, leads to $\mathfrak{J}$ given by (6). Thus, stretching $\tau$ to infinity in the course of integration of the BF system (8) allows for recovering rigorous approximations (under some non-resonance conditions [12, Theorem 1]) of well-known objects such as the center manifold; see also [2, Lemma 6.2.4] and [11, Appendix A.1]. The intimate relationships between the Lyapunov-Perron integral $\mathfrak{J}(X)$ and the homological Eq. (7) are hence supplemented by their relationships with the BF system (8).
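The equivalence between the BF system (8) and the truncated Lyapunov-Perron integral (9) can be checked numerically in a scalar toy case (all values hypothetical): the backward equation (8a) is solved in closed form, the forward equation (8b) by Euler stepping, and the result is compared to the closed-form evaluation of (9) for $G_{k}(X)=gX^{2}$.

```python
import math

# Scalar toy case: one resolved mode (lam_c) and one unresolved mode (lam_s),
# quadratic interaction G_k(X) = g * X**2. Hypothetical values throughout.
lam_c, lam_s, g, X, tau = 0.1, -1.0, 1.0, 0.7, 2.0

# Forward-Euler integration of Eq. (8b) over [-tau, 0], with q(-tau) = 0
ds, s, q = 1e-5, -tau, 0.0
while s < 0.0:
    p = math.exp(lam_c * s) * X          # Eq. (8a) solved in closed form
    q += ds * (lam_s * q + g * p * p)
    s += ds

# Closed form of the truncated integral (9) for this scalar case
delta = 2 * lam_c - lam_s
q_exact = g * X * X * (1.0 - math.exp(-delta * tau)) / delta
```

Up to the Euler discretization error, q matches q_exact, illustrating that integrating (8) forward after the backward pass is just an algorithmic way of evaluating (9).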

By breaking down, mode by mode, the backward integration of Eq. (8b) (in the case of $A_{\mathfrak{s}}$ diagonal), one allows for a backward integration time $\tau_{n}$ per mode amplitude to parameterize, leading in turn to the following class of parameterizations

\Phi_{\bm{\tau}}(X)=\sum_{n\geq m_{c}+1}\bigg(\int_{-\tau_{n}}^{0}e^{-s\lambda_{n}}\Pi_{n}G_{k}\big(e^{sA_{\mathfrak{c}}}X\big)\,\mathrm{d}s\bigg)\bm{e}_{n}, \qquad (10)

as indexed by $\bm{\tau}=(\tau_{n})_{n\geq m_{c}+1}$. Clearly, Eq. (3) is a generalization of Eq. (10).

We now make precise the similarities and differences between Eq. (3) and Eq. (10) at a deeper level, as informed by their BF system representation. In that respect, we consider, for the case of time-dependent forcing, the following BF system

\frac{\mathrm{d}p}{\mathrm{d}s}=A_{\mathfrak{c}}p+\Pi_{\mathfrak{c}}F(s),\qquad s\in[t-\tau,t], \qquad (11a)
\frac{\mathrm{d}q}{\mathrm{d}s}=A_{\mathfrak{s}}q+\Pi_{\mathfrak{s}}G_{k}(p)+\Pi_{\mathfrak{s}}F(s),\qquad s\in[t-\tau,t], \qquad (11b)
\mbox{with } p(t)=X\in H_{\mathfrak{c}},\;\mbox{ and } q(t-\tau)=0. \qquad (11c)

Note that, compared to the BF system (8), the backward-forward integration is operated here on $[t-\tau,t]$ to account for the non-autonomous character of the forcing, which makes the corresponding parameterization time-dependent, as summarized in Lemma III.1 below. In the presence of an autonomous forcing, operating the backward-forward integration of the BF system (8) on $[t-\tau,t]$ does not change the parameterization, which remains time-independent.

Since the BF system (11) is one-way coupled (with $p$ forcing (11b) but not reciprocally), if one assumes $A_{\mathfrak{s}}$ to be diagonal over $\mathbb{C}$, one can break down the forward equation (11b) into the equations

\frac{\mathrm{d}q_{n}}{\mathrm{d}s}=\lambda_{n}q_{n}+\Pi_{n}G_{k}(p)+\Pi_{n}F(s),\qquad s\in[t-\tau,t], \qquad (12)

allowing for flexibility in the choice of $\tau$ per mode $\bm{e}_{n}$ whose amplitude is aimed at being parameterized by $q_{n}$, for each $n\geq m_{c}+1$.

After backward integration of Eq. (11a), providing $p(s)$, one obtains by forward integration of Eq. (12) that

q_{n}(t)=\int_{t-\tau}^{t}e^{(t-s^{\prime})\lambda_{n}}\Pi_{n}G_{k}\big(e^{(s^{\prime}-t)A_{\mathfrak{c}}}X-\gamma(s^{\prime})\big)\,\mathrm{d}s^{\prime}+\int_{t-\tau}^{t}e^{(t-s^{\prime})\lambda_{n}}\Pi_{n}F(s^{\prime})\,\mathrm{d}s^{\prime}, \qquad (13)

with $\gamma(s^{\prime})=\int_{s^{\prime}}^{t}e^{(s^{\prime}-r)A_{\mathfrak{c}}}\Pi_{\mathfrak{c}}F(r)\,\mathrm{d}r$. By making the change of variable $s^{\prime}=s+t$, one gets

q_{n}(t)=\int_{-\tau}^{0}e^{-s\lambda_{n}}\Pi_{n}G_{k}\Big(e^{sA_{\mathfrak{c}}}X-\eta(s)\Big)\,\mathrm{d}s+\int_{-\tau}^{0}e^{-s\lambda_{n}}\Pi_{n}F(s+t)\,\mathrm{d}s, \qquad (14)

with $\eta(s)=\int_{s}^{0}e^{(s-r)A_{\mathfrak{c}}}\Pi_{\mathfrak{c}}F(r+t)\,\mathrm{d}r=\gamma(s+t)$. Now, by summing up these parameterizations over the modes $\bm{e}_{n}$ for $n\geq m_{c}+1$, we arrive at Eq. (3). Thus, our general parameterization formula (3) is derived by solving the BF system made of the backward Eq. (11a) and the forward Eqns. (12).
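As a numerical sanity check of the derivation above (scalar toy case, hypothetical values, and $\Pi_{\mathfrak{c}}F=0$ so that $\eta\equiv 0$), forward integration of Eq. (12) over $[t-\tau,t]$ should reproduce the change-of-variable formula (14):

```python
import math

lam_c, lam_n, g, X, tau, t = 0.1, -1.0, 1.0, 0.7, 2.0, 5.0
F_n = lambda r: 0.3 * math.cos(1.3 * r)     # forcing on the unresolved mode

# Direct forward-Euler integration of Eq. (12) on [t - tau, t], q(t - tau) = 0
ds, s, q = 1e-4, t - tau, 0.0
while s < t:
    p = math.exp(lam_c * (s - t)) * X       # Eq. (11a) with Pi_c F = 0
    q += ds * (lam_n * q + g * p * p + F_n(s))
    s += ds

# Formula (14), evaluated by midpoint quadrature
n_pts = 20000
h = tau / n_pts
q14 = 0.0
for k in range(n_pts):
    sk = -tau + (k + 0.5) * h
    q14 += h * math.exp(-sk * lam_n) * (
        g * (math.exp(sk * lam_c) * X) ** 2 + F_n(sk + t)
    )
```

The two evaluations agree up to discretization error, confirming that (14) is just the integrated form of the per-mode forward equations (12).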

We now want to gain insight into the interpretability of such parameterizations in the context of forced-dissipative systems such as Eq. (1). For this purpose, we clarify the fundamental equation satisfied by our parameterizations; the goal here is to characterize the analogue of (7) in this non-autonomous context. To simplify, we restrict ourselves to the case $\Pi_{\mathfrak{c}}F=0$. The next lemma provides the sought equation.

Lemma III.1

Assume $\Pi_{\mathfrak{c}}F=0$ in the BF system (11). The solution $q_{X,\tau}(t)$ to Eq. (11b) is given by

q_{X,\tau}(t)=\int_{-\tau}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}G_{k}(e^{sA_{\mathfrak{c}}}X)\,\mathrm{d}s+\int_{-\tau}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}F(s+t)\,\mathrm{d}s. \qquad (15)

It provides a time-dependent manifold function that maps $H_{\mathfrak{c}}$ into $H_{\mathfrak{s}}$ and satisfies the following first-order system of PDEs:

\Big(\partial_{t}+\mathcal{L}_{A}\Big)\Phi(X,t)=\underbrace{\Pi_{\mathfrak{s}}G_{k}(X)-e^{\tau A_{\mathfrak{s}}}\Pi_{\mathfrak{s}}G_{k}(e^{-\tau A_{\mathfrak{c}}}X)}_{(I)}+\underbrace{\Pi_{\mathfrak{s}}F(t)-e^{\tau A_{\mathfrak{s}}}\Pi_{\mathfrak{s}}F(t-\tau)}_{(II)}, \qquad (16)

with $\mathcal{L}_{A}$ being the differential operator acting on smooth mappings $\psi$ from $H_{\mathfrak{c}}$ into $H_{\mathfrak{s}}$, defined as:

\mathcal{L}_{A}[\psi](X)=D\psi(X)A_{\mathfrak{c}}X-A_{\mathfrak{s}}\psi(X),\qquad X\in H_{\mathfrak{c}}. \qquad (17)

The proof of this Lemma is found in Appendix A.

As will become apparent below, the $\tau$-dependence of the terms in (I) is meant to control the small denominators that arise in the presence of a small spectral gap between the spectra of $A_{\mathfrak{c}}$ and $A_{\mathfrak{s}}$, which typically leads to over-parameterization when standard invariant/inertial manifold theory is applied in such situations; see Remark III.1 below and [12, Sec. 6].

The terms in (II) are responsible for the presence of exogenous memory terms in the solution to the homological equation (16), i.e. in the parameterization $\Phi(X,t)$; see Eq. (32) below.

In the case where $F(t)$ is bounded (one could also express a constraint on the behavior of $F(t)$ as $t\rightarrow\infty$; for instance, one may require that $\lim_{s\rightarrow-\infty}e^{A_{\mathfrak{c}}s}\int_{s}^{0}e^{A_{\mathfrak{c}}s^{\prime}}\Pi_{\mathfrak{c}}F(s^{\prime})\,\mathrm{d}s^{\prime}<\infty$, etc.), and $\Re\sigma(A_{\mathfrak{s}})<0$, Eq. (16) reduces, in the (formal) limit $\tau\rightarrow\infty$, to:

\Big(\partial_{t}+\mathcal{L}_{A}\Big)\Phi(X,t)=\Pi_{\mathfrak{s}}G_{k}(X)+\Pi_{\mathfrak{s}}F(t). \qquad (19)

Such a system of linear PDEs is known as a homological equation and arises in the theory of time-dependent normal forms [39, 31]; see also [40]. In the case where $\Pi_{\mathfrak{s}}F$ is $T$-periodic, one can seek an approximate solution in Fourier expansion form [41, Sec. 5.2], leading to useful insights. For instance, solutions to (19) when $G$ is quadratic exist if the following non-resonance condition is satisfied:

\mathfrak{i}\nu\frac{\pi}{T}+\lambda_{i}+\lambda_{j}-\lambda_{n}\neq 0,\qquad \nu\in\mathbb{Z},\;(i,j)\in I^{2},\;n\geq m_{c}+1, \qquad (20)

(with $I=\{1,\cdots,m_{c}\}$ and $\mathfrak{i}^{2}=-1$), in the case where $\sigma(A)$ is discrete, without Jordan blocks. Actually, one can even prove in this case that

\mbox{Spec}(\partial_{t}+\mathcal{L}_{A})=\bigg\{\mathfrak{i}\nu\frac{\pi}{T}+\delta_{ij}^{n},\;(i,j)\in I^{2},\;n\geq m_{c}+1,\;\nu\in\mathbb{Z}\bigg\}, \qquad (21)

(where $\delta_{ij}^{n}=\lambda_{i}+\lambda_{j}-\lambda_{n}$) on the space of functions

\mathcal{E}=\left\{\sum_{n\geq m_{c}+1}\bigg(\sum_{i,j=1}^{m_{c}}\Gamma^{n}_{ij}(t)X_{i}X_{j}\bigg)\bm{e}_{n},\;\Gamma^{n}_{ij}\in L^{2}(\mathbb{T})\right\}, \qquad (22)

in which $X_{i}$ and $X_{j}$ represent the components of the vector $X$ in $H_{\mathfrak{c}}$ along $\bm{e}_{i}$ and $\bm{e}_{j}$, respectively.

Thus, in view of Lemma III.1 and what precedes, the small-scale parameterizations (15), obtained by solving the BF system (11) over finite time intervals, can be conceptualized as perturbed solutions to the homological equation (19) arising in the computation of invariant manifolds and normal forms of non-autonomous dynamical systems [31, Sec. 2.2]. The perturbative terms brought by the $\tau$-dependence play an essential role in covering situations beyond the domain of application of normal form and invariant manifold theories. As explained in Sec. III.3 and illustrated in Secns. IV-V below, these terms can be optimized to ensure skillful parameterizations for predicting, by reduced systems, higher-order transitions escaping the domain of validity of these theories.

III.2 Homological Equation and Backward-Forward Systems: Constant Forcing

In this Section, we clarify the conditions under which solutions to Eq. (11) exist in the asymptotic limit $\tau\rightarrow\infty$. To simplify, we consider the case where $F$ is constant in time. For the existence of Lyapunov-Perron integrals in the presence of more general time-dependent forcings, we refer to [40]. To this end, we introduce, for $X$ in $H_{\mathfrak{c}}$,

\mathfrak{J}_{F}(X)=\int_{-\infty}^{0}e^{-sA_{\mathfrak{s}}}\left(\Pi_{\mathfrak{s}}G_{k}\big(p(s)\big)+\Pi_{\mathfrak{s}}F\right)\,\mathrm{d}s, \qquad (23)

where p(s)p(s) is the solution to Eq. (11a), namely

p(s)=esA𝔠Xs0eA𝔠(ss)Π𝔠Fds.p(s)=e^{sA_{\mathfrak{c}}}X-\int_{s}^{0}e^{A_{\mathfrak{c}}(s-s^{\prime})}\Pi_{\mathfrak{c}}F\,\mathrm{d}s^{\prime}. (24)

We then have the following result, which we cast for the case of a finite-dimensional ODE system (of dimension NN) to simplify.

Theorem III.1

Assume that FF is constant in time and that AA is diagonalizable over its eigenbasis {𝐞𝐣N|j=1,N}\{\bm{e_{j}}\in\mathbb{C}^{N}\;|\;j=1,\cdots,N\}. Assume furthermore that (λmc+1)<(λmc)\Re(\lambda_{m_{c}+1})<\Re(\lambda_{m_{c}}), and σ(A𝔰)<0\Re\sigma(A_{\mathfrak{s}})<0, i.e. H𝔰H_{\mathfrak{s}} contains only stable eigenmodes.

Finally, assume that the following non-resonance condition holds:

(j1,,jk)(1,,mc)k,nmc+1,\displaystyle\forall\,(j_{1},\cdots,j_{k})\in(1,\cdots,m_{c})^{k},\;n\geq m_{c}+1, (25)
(Gj1jkn0)((λnp=1kλjp)<0),\displaystyle\left(G_{j_{1}\cdots j_{k}}^{n}\neq 0\right)\implies\left(\Re(\lambda_{n}-\sum_{p=1}^{k}\lambda_{j_{p}})<0\right),

with Gj1jkn=Gk(𝐞j1,,𝐞jk),𝐞nG_{j_{1}\cdots j_{k}}^{n}=\big{\langle}G_{k}(\bm{e}_{j_{1}},\cdots,\bm{e}_{j_{k}}),\bm{e}_{n}^{\ast}\big{\rangle}, where ,\langle\cdot,\cdot\rangle denotes the inner product on N\mathbb{C}^{N}, GkG_{k} denotes the leading-order term in the Taylor expansion of GG, and 𝐞n\bm{e}_{n}^{\ast} denotes the eigenvectors of the conjugate transpose of AA.

Then, the Lyapunov-Perron integral 𝔍F\mathfrak{J}_{F} given by (23) is well defined and is a solution to the following homological equation

A[ψ](X)+Dψ[X]Π𝔠F=Π𝔰Gk(X)+Π𝔰F,\mathcal{L}_{A}[\psi](X)+D\psi[X]\Pi_{\mathfrak{c}}F=\Pi_{\mathfrak{s}}G_{k}(X)+\Pi_{\mathfrak{s}}F, (26)

and provides the leading-order approximation of the invariant manifold function h(X)h(X) in the sense that

𝔍F(X)h(X)H𝔰=o(XH𝔠k),XH𝔠.\left\lVert\mathfrak{J}_{F}(X)-h(X)\right\rVert_{H_{\mathfrak{s}}}=o(\left\lVert X\right\rVert_{H_{\mathfrak{c}}}^{k}),\;\;X\in H_{\mathfrak{c}}.

Moreover, 𝔍F\mathfrak{J}_{F} given by (23) is the limit, as τ\tau goes to infinity, of the solution to the BF system (11), when FF is constant in time. That is

limτqX,τ(0)𝔍F(X)=0,\lim_{\tau\rightarrow\infty}\|q_{X,\tau}(0)-\mathfrak{J}_{F}(X)\|=0, (27)

where qX,τ(0)q_{X,\tau}(0) is the solution to Eq. (11b).

Conditions similar to (25) arise in the smooth linearization of dynamical systems near an equilibrium [42]. Here, condition (25) implies that the eigenvalues of the stable part satisfy a Sternberg condition of order kk [42] with respect to the eigenvalues associated with the modes spanning the reduced state space H𝔠H_{\mathfrak{c}}.

This theorem is essentially a consequence of [12, Theorems 1 and 2], in which condition (25) is a stronger version of that used for [12, Theorem 2]; see also [12, Remark 1 (iv)]. This condition is necessary and sufficient here for 0esA𝔰Π𝔰Gk(p(s))ds\int_{-\infty}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}G_{k}(p(s))\,\mathrm{d}s to be well defined. The convergence of 0esA𝔰Π𝔰Fds\int_{-\infty}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}{F}\,\mathrm{d}s is straightforward since σ(A𝔰)<0\Re\sigma(A_{\mathfrak{s}})<0 by assumption and FF is constant. The derivation of (26) follows the same lines as the derivation for [12, Eq. (4.6)].

One can also generalize Theorem III.1 to the case when FF is time-dependent, provided that FF satisfies suitable conditions ensuring that 𝔍F\mathfrak{J}_{F} given by (23) is well defined and that the non-resonance condition (25) is suitably augmented. We leave the precise statement of such a generalization to a future work. For the applications considered in later sections, the forcing term FF is either constant or has sublinear growth. In such cases, 𝔍F\mathfrak{J}_{F} is always well defined under the assumptions of Theorem III.1. We now turn to the explicit formulas of the parameterizations based on the BF system (11).

III.3 Explicit Formulas for Variational Parameterizations

We provide in this Section closed-form parameterization formulas for forced dissipative systems such as Eq. (1). We first consider the case where FF is constant in time, and then deal with the time-dependent forcing case.

III.3.1 The constant forcing case

To favor flexibility in applications, we make our parameterizations scale-aware via BF systems, i.e. we consider parameterizations that are built up, mode by mode, from the integration, for each nmc+1n\geq m_{c}+1, of the following BF systems

dpds=A𝔠p+Π𝔠F,s[τ,0],\displaystyle\frac{\mathrm{d}p}{\,\mathrm{d}s}=A_{\mathfrak{c}}p+\Pi_{\mathfrak{c}}F,\hskip 67.00006pts\in[-\tau,0], (28a)
dqnds=λnqn+ΠnGk(p(s))+ΠnF,s[τ,0]\displaystyle\frac{\mathrm{d}q_{n}}{\,\mathrm{d}s}=\lambda_{n}q_{n}+\Pi_{n}G_{k}\big{(}p(s)\big{)}+\Pi_{n}F,\;s\in[-\tau,0] (28b)
with p(0)=XH𝔠, and qn(τ)=0.\displaystyle\mbox{with }p(0)=X\in H_{\mathfrak{c}},\mbox{ and }q_{n}(-\tau)=0. (28c)

Recall that qn(0)q_{n}(0) is aimed at parameterizing the solution amplitude yn(t)y_{n}(t) carried by mode 𝒆n\bm{e}_{n}, when X=y𝔠(t)X=y_{\mathfrak{c}}(t) in Eq. (28c), and that its parameterization defect is minimized by optimizing the backward integration time τ\tau in (4). Having explicit formulas for qn(0)q_{n}(0) that make the dependence on τ\tau apparent greatly facilitates this minimization.

When the nonlinearity G(y)G(y) is quadratic, denoted by B(y,y)B(y,y), such formulas are easily accessible since (28) is integrable. The system is actually integrable in the presence of higher-order terms as well, but we provide the details here for the quadratic case only, leaving the extension to the interested reader.

Recall that we denote by 𝒆j\bm{e}_{j}^{\ast} the conjugate modes from the adjoint AA^{\ast} and that these modes satisfy the bi-orthogonality condition. That is 𝒆i,𝒆j=1\langle\bm{e}_{i},\bm{e}_{j}^{\ast}\rangle=1 if i=ji=j, and zero otherwise, where ,\langle\cdot,\cdot\rangle denotes the inner product endowing HH. Denoting by Φn(τ,𝝀,X)\Phi_{n}(\tau,\bm{\lambda},X) the solution qn(0)q_{n}(0) to Eq. (28b), we find after calculations that

Φn(τ,𝝀,X)=Rn\displaystyle\Phi_{n}(\tau,\bm{\lambda},X)=R_{n} (F,𝝀,τ,X)1eτλnλnFn,\displaystyle(F,{\bm{\lambda}},\tau,X)-\frac{1-e^{\tau\lambda_{n}}}{\lambda_{n}}F_{n}, (29)
+i,j=1mcDijn(τ,𝝀)BijnXiXj,\displaystyle+\sum_{i,j=1}^{m_{c}}D_{ij}^{n}(\tau,\bm{\lambda})B_{ij}^{n}X_{i}X_{j},

in which XiX_{i} and XjX_{j} denote the components of the vector XX in H𝔠H_{\mathfrak{c}} onto 𝒆i\bm{e}_{i} and 𝒆j\bm{e}_{j}, Bijn=B(𝒆i,𝒆j),𝒆nB_{ij}^{n}=\langle B(\bm{e}_{i},\bm{e}_{j}),\bm{e}^{*}_{n}\rangle and

Dijn(τ,𝝀)=1exp(δijnτ)δijn,if δijn0,D_{ij}^{n}(\tau,{\bm{\lambda}})=\frac{1-\exp\big{(}-\delta_{ij}^{n}\tau\big{)}}{\delta_{ij}^{n}},\quad\text{if }\delta_{ij}^{n}\neq 0, (30)

while Dijn(τ,𝝀)=τD_{ij}^{n}(\tau,{\bm{\lambda}})=\tau otherwise. Here δijn=λi+λjλn\delta_{ij}^{n}=\lambda_{i}+\lambda_{j}-\lambda_{n}, with the λj\lambda_{j} referring to the eigenvalues of AA. We refer to [12] and Appendix B for the expression of Rn(F,𝝀,τ,X)R_{n}(F,{\bm{\lambda}},\tau,X) which accounts for the nonlinear interactions between the forcing components in H𝔠H_{\mathfrak{c}}. Here, the dependence on the λj\lambda_{j} in (29) is made apparent, as this dependence plays a key role in the prediction of the higher-order transitions; see applications to the Rayleigh-Bénard system of Sec. V.
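To make formulas (29)-(30) concrete, the following minimal sketch integrates a two-mode instance of the BF system (28) numerically, with F = 0 so that the R_n and F_n terms drop out, and checks that q_n(0) matches the closed-form value D(τ)·B·X². All numerical values (eigenvalues, interaction coefficient, backward time) are illustrative choices, not taken from the applications below.

```python
import math

# Two-mode toy instance of the BF system (28) with F = 0: one resolved mode p
# (eigenvalue lam_c) and one unresolved mode q (eigenvalue lam_s), coupled by
# a quadratic term B*p^2.  All numerical values are illustrative choices.
lam_c, lam_s, B = -0.5, -4.0, 1.3
X, tau = 0.7, 2.0                    # p(0) = X, backward integration time tau

# Backward pass (28a): dp/ds = lam_c*p with p(0) = X, solved exactly.
def p(s):
    return X * math.exp(lam_c * s)

# Forward pass (28b): dq/ds = lam_s*q + B*p(s)^2, q(-tau) = 0 (Euler steps).
n, ds = 200000, tau / 200000
q, s = 0.0, -tau
for _ in range(n):
    q += ds * (lam_s * q + B * p(s) ** 2)
    s += ds

# Closed-form value from (29)-(30): Phi = D(tau)*B*X^2,
# with D = (1 - exp(-delta*tau))/delta and delta = 2*lam_c - lam_s.
delta = 2 * lam_c - lam_s
phi = (1 - math.exp(-delta * tau)) / delta * B * X ** 2

assert abs(q - phi) < 1e-3           # numerical BF solution matches the formula
```

The same check can be repeated for several values of τ, reproducing the τ-dependence that the optimization stage of Sec. III.3 exploits.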

Remark III.1

OPM balances small spectral gaps. Theorem III.1 teaches us that when δijn>0\delta_{ij}^{n}>0, not only does Dijn(τ,𝛌)D_{ij}^{n}(\tau,{\bm{\lambda}}) defined in (30) converge towards a well-defined quantity as τn\tau_{n}\rightarrow\infty, but so do the coefficients involved in Rn(F,𝛌,τ,X)R_{n}(F,{\bm{\lambda}},\tau,X) (see (85)-(86)-(87)), when an invariant manifold exists. For parameter regimes where the latter fails to exist, some of the δijn\delta_{ij}^{n} or the λjλn\lambda_{j}-\lambda_{n} involved in (86)-(87) can become small, leading to the well-known small spectral gap issue [25], typically manifested by over-parameterization of the 𝐞n\bm{e}_{n}’s mode amplitude when the Lyapunov-Perron parameterization (23) is employed; see [12, Sec. 6]. The presence of the τn\tau_{n} through the exponential terms in (30) and (86)-(87) allows for balancing these small spectral gaps after optimization, notably improving the parameterization and closure skills; see Sec. V and Appendix D.
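The balancing effect described in this remark can be read directly off the coefficient (30). The short sketch below, with illustrative values for the spectral gap δ and the backward time τ, contrasts the finite-τ coefficient with its Lyapunov-Perron limit 1/δ:

```python
import math

def D(delta, tau):
    # coefficient (30): equals tau at exact resonance (delta = 0), and tends
    # to the Lyapunov-Perron value 1/delta as tau -> infinity when delta > 0
    return tau if delta == 0.0 else (1.0 - math.exp(-delta * tau)) / delta

# Away from resonance the finite-tau coefficient is close to 1/delta:
assert abs(D(2.0, 50.0) - 0.5) < 1e-8
# For a small spectral gap, 1/delta blows up (here 1/delta = 1e6), while the
# finite-tau coefficient remains O(tau): the balancing effect of the remark.
assert abs(D(1e-6, 5.0) - 5.0) < 1e-4
```

In other words, the over-parameterization caused by a nearly resonant 1/δ is tamed by keeping τ finite, which the defect minimization then selects per mode.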

Remark III.2

Formulas such as those introduced in [43] in approximate inertial manifold (AIM) theory, and used earlier in atmospheric dynamics [44] in the context of non-normal modes initialization [45, 46, 47], are also tied to leading-order approximations of invariant manifolds, since 𝔍(X)\mathfrak{J}(X) given by (6) satisfies

𝔍(X)=A𝔰1Π𝔰Gk(X)+𝒪(Xk),XH𝔠,\mathfrak{J}(X)=-A_{\mathfrak{s}}^{-1}\Pi_{\mathfrak{s}}G_{k}(X)+\mathcal{O}(\left\lVert X\right\rVert^{k}),\;X\in H_{\mathfrak{c}}, (31)

and the term A𝔰1Π𝔰Gk(X)-A_{\mathfrak{s}}^{-1}\Pi_{\mathfrak{s}}G_{k}(X) is the main parameterization used in [43]; see [29, Lemma 4.1] and [11, Theorem A.1.1]. Adding higher-order terms can potentially lead to more accurate parameterizations and push the validity of the approximation to a larger neighborhood [48, 49], but in the presence of small spectral gaps, such an operation may also become less successful. When such formulas arising in AIM theory are inserted within a proper data-informed variational approach, as done in [12, Sec. 4.4], their optimized version can also handle regimes in which the spectral gap becomes small, as demonstrated in [12, Sec. 6].

III.3.2 The time-dependent forcing case

Here, we assume that the neglected modes are subject to time-dependent forcing according to F(t)=nmc+1σnfn(t)𝒆nF(t)=\sum_{n\geq m_{c}+1}\sigma_{n}f_{n}(t)\bm{e}_{n}. Then by solving the BF systems made of Eq. (11a) and Eqns. (12) with qn(tτ)=ζq_{n}(t-\tau)=\zeta (to account for possibly non-zero mean), we arrive at the following time-dependent parameterization of the nnth mode’s amplitude:

Φn(τ,𝝀,X,t)=eλnτζ+σneλnttτteλnsfn(s)ds(I)+i,j=1mcDijn(τ,𝝀)BijnXiXj,\Phi_{n}(\tau,\bm{\lambda},X,t)=e^{\lambda_{n}\tau}\zeta+\underbrace{\sigma_{n}e^{\lambda_{n}t}\int_{t-\tau}^{t}e^{-\lambda_{n}s}f_{n}(s)\,\mathrm{d}s}_{(I)}+\sum_{i,j=1}^{m_{c}}D_{ij}^{n}(\tau,\bm{\lambda})B_{ij}^{n}X_{i}X_{j}, (32)

where X=j=1mcXj𝒆jX=\sum_{j=1}^{m_{c}}X_{j}\bm{e}_{j} lies in H𝔠H_{\mathfrak{c}}.

This formula for Φn\Phi_{n} gives the solution to the homological equation Eq. (16) of Lemma III.1 in which Π𝔰\Pi_{\mathfrak{s}} is replaced here by Πn\Pi_{n}, the projector onto the mode 𝒆n\bm{e}_{n} whose amplitude is parameterized by Φn\Phi_{n}. Clearly, the terms in (I) are produced by the time-dependent forcing. They are functionals of the past fnf_{n} and thus convey exogenous memory effects.

The integral term in (I) of Eq. (32) is of the form

I(t)=eκttτteκsfn(s)ds.I(t)=e^{\kappa t}\int_{t-\tau}^{t}e^{-\kappa s}f_{n}(s)\,\mathrm{d}s. (33)

By taking derivatives on both sides of (33), we observe that II satisfies

dIdt=κI+fn(t)eκτfn(tτ).\frac{\,\mathrm{d}I}{\,\mathrm{d}t}=\kappa I+f_{n}(t)-e^{\kappa\tau}f_{n}(t-\tau). (34)

As a practical consequence, the computation of I(t)I(t) boils down to solving the scalar ODE (34), which can be done numerically with high accuracy when κ<0\kappa<0. One thus bypasses the computation of the many quadratures (as tt evolves) that would have to be performed when relying on Eq. (33). Instead, only one quadrature is required, corresponding to the initialization of the ODE (34) at t=0t=0.

This latter computational aspect is important not only for simulation purposes, when the corresponding OPM reduced system is run online, but also for training purposes, in the search for the optimal τ\tau during the offline minimization stage of the parameterization defect.
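A minimal sketch of this practical point follows: the memory integral of Eq. (33) is advanced by stepping the scalar ODE (34) (RK4 here), with a single quadrature at t = 0 for initialization, and checked against direct quadrature at the final time. The forcing f_n(s) = sin(s) and the values of κ and τ are illustrative choices.

```python
import math

kappa, tau = -2.0, 1.0    # illustrative decay rate (kappa < 0) and memory tau
f = math.sin              # illustrative forcing f_n

def I_quad(t, n=20000):
    # direct trapezoidal quadrature of Eq. (33) on [t - tau, t]
    h = tau / n
    total = 0.5 * (math.exp(-kappa * (t - tau)) * f(t - tau)
                   + math.exp(-kappa * t) * f(t))
    for k in range(1, n):
        s = t - tau + k * h
        total += math.exp(-kappa * s) * f(s)
    return math.exp(kappa * t) * h * total

def rhs(t, I):
    # right-hand side of the scalar ODE (34)
    return kappa * I + f(t) - math.exp(kappa * tau) * f(t - tau)

dt, n_steps = 1e-3, 5000             # advance from t = 0 to t = 5 by RK4
I = I_quad(0.0)                      # the only quadrature needed online
for k in range(n_steps):
    t = k * dt
    k1 = rhs(t, I)
    k2 = rhs(t + dt / 2, I + dt * k1 / 2)
    k3 = rhs(t + dt / 2, I + dt * k2 / 2)
    k4 = rhs(t + dt, I + dt * k3)
    I += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

assert abs(I - I_quad(5.0)) < 1e-6   # ODE stepping reproduces the quadrature
```

One ODE step per time step replaces a full quadrature over [t − τ, t] at every t, which is the computational saving emphasized above.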

If time-dependent forcing terms are present in the reduced state space H𝔠H_{\mathfrak{c}}, then the BF system (28) can still be solved analytically, albeit with more involved integral terms than (I) in the corresponding parameterization. This aspect will be detailed elsewhere.

III.3.3 OPM reduced systems

In either the constant or time-dependent forcing case, our OPM reduced system takes the form

X˙=A𝔠X+Π𝔠G(X+Φ𝝉(X,t))+Π𝔠F,\dot{X}=A_{\mathfrak{c}}X+\Pi_{\mathfrak{c}}G\big{(}X+\Phi_{\bm{\tau}^{\ast}}(X,t)\big{)}+\Pi_{\mathfrak{c}}F, (35)

where

Φ𝝉(X,t)=nmc+1Φn(τn,𝝀,X,t)𝒆n,\Phi_{\bm{\tau}}(X,t)=\sum_{n\geq m_{c}+1}\Phi_{n}(\tau_{n},{\bm{\lambda}},X,t)\bm{e}_{n}, (36)

in which Φn(τn,𝝀,X,t)\Phi_{n}(\tau_{n},{\bm{\lambda}},X,t) is given by (32) in the time-dependent case, and by (29) otherwise. In either case, the vector 𝝉\bm{\tau}^{\ast} is formed by the minimizers τn\tau_{n}^{\ast} of 𝒬n\mathcal{Q}_{n} given by (4), for each nn. Note that in the case of the time-dependent parameterization (32), the OPM reduced system is run online by augmenting (36) with the equations (34) corresponding to the modes that are forced.

We finally emphasize that, from a data-driven perspective, the OPM reduced system benefits from its dynamical origin. By construction, only a scalar parameter τ\tau is optimized per mode to parameterize. This parameter furthermore enjoys a dynamical interpretation, since it plays a key role in balancing the small spectral gaps known to be the main issue in applications of invariant or inertial manifold theory in practice [25]; see Remark III.1 above.
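To make the offline stage concrete, here is a minimal sketch of the defect minimization over τ for a toy two-mode system whose unresolved amplitude is exactly slaved, y = (B/δ)x², so that the normalized defect reduces to exp(−2δτ) and the optimizer recovers the Lyapunov-Perron limit τ* = ∞. The toy model and all numerical values are our own illustrative choices.

```python
import math

# Offline selection of tau by minimizing a normalized least-square
# parameterization defect, in the spirit of Eq. (4).  Illustrative two-mode
# toy system with exact slaving y = (B/delta) * x^2.
lam1, lam2, B = -0.5, -4.0, 1.3
delta = 2 * lam1 - lam2                      # spectral gap delta_11^2 = 3

xs = [math.exp(lam1 * 0.01 * k) for k in range(2000)]   # resolved amplitude
ys = [(B / delta) * x * x for x in xs]                  # slaved amplitude

def Q(tau):
    # normalized defect of the parameterization Phi(tau, x) = D(tau)*B*x^2
    D = tau if delta == 0 else (1 - math.exp(-delta * tau)) / delta
    num = sum((y - D * B * x * x) ** 2 for x, y in zip(xs, ys))
    return num / sum(y * y for y in ys)

taus = [0.25, 0.5, 1.0, 2.0, 4.0]
defects = [Q(t) for t in taus]
# Slaving holds here, so the defect decreases with tau: the optimizer selects
# the largest tau, i.e. the Lyapunov-Perron limit D -> 1/delta.
assert defects == sorted(defects, reverse=True)
assert abs(Q(1.0) - math.exp(-2 * delta)) < 1e-12   # here Q = exp(-2*delta*tau)
```

When slaving breaks down, Q(τ) is no longer monotone in τ and a finite minimizer τ* emerges, which is precisely the regime the OPM is designed for.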

IV Predicting Tipping Points

IV.1 The Stommel-Cessi model and tipping points

A simple model for oceanic circulation showing bistability is Stommel’s box model [50], where the ocean is represented by two boxes, a low-latitude box with temperature T1T_{1} and salinity S1S_{1}, and a high-latitude box with temperature T2T_{2} and salinity S2S_{2}; see [51, Fig. 1]. The Stommel model can be viewed as the simplest “thermodynamic” version of the Atlantic Meridional Overturning Circulation (AMOC) [52, Chap. 6], a major ocean current system transporting warm surface waters toward the northern Atlantic that constitutes an important tipping element of the climate system; see [53, 54].

Cessi in [51] proposed a variation of this model, based on the box model of [55], consisting of an ODE system describing the evolution of the differences ΔT=T1T2\Delta T=T_{1}-T_{2} and ΔS=S1S2\Delta S=S_{1}-S_{2}; see [51, Eq. (2.3)]. The Cessi model trades the absolute-value functions involved in the original Stommel model for polynomial relations more amenable to analysis. The dynamics of ΔT\Delta T and ΔS\Delta S are coupled via the density difference Δρ\Delta\rho, approximated by the relation Δρ=αSΔSαTΔT\Delta\rho=\alpha_{S}\Delta S-\alpha_{T}\Delta T, which induces an exchange QQ of mass between the boxes given, according to Cessi’s formulation, by Q=1/τd+(q/V)Δρ2Q=1/\tau_{d}+(q/V)\Delta\rho^{2}, where qq denotes the Poiseuille transport coefficient, VV the volume of a box, and τd\tau_{d} the diffusion timescale. The coefficient αS\alpha_{S} is inversely proportional to the practical salinity unit, i.e. a unit based on the properties of sea water conductivity, while αT\alpha_{T} is in C1{}^{\circ}\,\textrm{C}^{-1}; see [51]. In this simple model, ΔT\Delta T relaxes on the timescale τr\tau_{r} to a reference temperature θ\theta (with T1=θ/2T_{1}=\theta/2 and T2=θ/2T_{2}=-\theta/2) in the absence of coupling between the boxes.

Using the dimensionless variables y=αSΔS/(αTθ)y=\alpha_{S}\Delta S/(\alpha_{T}\theta), z=ΔT/θz=\Delta T/\theta, and rescaling time by the diffusion timescale τd\tau_{d}, the Cessi model can be written as [56]

y˙=Fy[1+μ(zy)2],\displaystyle\dot{y}=F-y[1+\mu(z-y)^{2}], (37)
z˙=1ϵ(z1)z[1+μ(zy)2],\displaystyle\dot{z}=-\frac{1}{\epsilon}(z-1)-z[1+\mu(z-y)^{2}],

in which FF is proportional to the freshwater flux, ϵ=τr/τd\epsilon=\tau_{r}/\tau_{d}, and μ2=τd(αTθ)2q/V\mu^{2}=\tau_{d}(\alpha_{T}\theta)^{2}q/V; see [51, Eq. (2.6)] and [56].

The nonlinear exchange of mass between the boxes is reflected by the nonlinear coupling terms in Eq. (37). These nonlinear terms lead to multiple equilibria in certain parts of the parameter space, in particular when FF is varied over a certain range [Fc1,Fc2][F_{c_{1}},F_{c_{2}}] while μ\mu and ϵ\epsilon are fixed. One can even prove that Eq. (37) undergoes two saddle-node bifurcations [57], leading to a typical S-shaped bifurcation diagram; see Fig. 2A.
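This bistability is easy to reproduce numerically. The sketch below integrates Eq. (37) by RK4 for μ = 6.2, ϵ = 0.1 and F = 0.855 (a value inside the bistable range [F_c1, F_c2]); the two initial conditions are illustrative choices, each lying near one of the two stable branches.

```python
# Bistability of the nondimensional Cessi model (37), integrated by RK4.
# Parameter values as in the regime studied below; the two initial
# conditions are illustrative choices.
mu, eps, F = 6.2, 0.1, 0.855

def rhs(u):
    y, z = u
    q = 1.0 + mu * (z - y) ** 2          # nondimensional exchange function
    return (F - y * q, -(z - 1.0) / eps - z * q)

def integrate(u, dt=0.01, n=20000):      # integrate up to t = 200
    for _ in range(n):
        k1 = rhs(u)
        k2 = rhs((u[0] + dt * k1[0] / 2, u[1] + dt * k1[1] / 2))
        k3 = rhs((u[0] + dt * k2[0] / 2, u[1] + dt * k2[1] / 2))
        k4 = rhs((u[0] + dt * k3[0], u[1] + dt * k3[1]))
        u = (u[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             u[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return u

lo = integrate((0.4, 0.8))   # converges to the lower-branch steady state
hi = integrate((0.8, 0.9))   # converges to a distinct upper-branch state

assert max(abs(v) for v in rhs(lo)) < 1e-6   # both are steady states,
assert max(abs(v) for v in rhs(hi)) < 1e-6
assert abs(lo[0] - hi[0]) > 0.3              # lying on different branches
```

The lower-branch state found this way is (y, z) ≈ (0.413, 0.828), the steady state used as the reference point in the OPM training below.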

S-shaped bifurcation diagrams occur in oceanic models that go well beyond Eq. (37), such as those based on the hydrostatic primitive equations or the Boussinesq equations; see e.g. [58, 59, 60, 61]. More generally, S-shaped bifurcation diagrams and more complex multiplicity diagrams are known to occur in a broad class of nonlinear problems [62, 63, 64] that includes energy balance climate models [65, 66, 67, 68], population dynamics models [69, 70], vegetation pattern models [71, 72], combustion models [73, 74, 75, 76], and many other fields [77].

The very presence of such S-shaped bifurcation diagrams provides the skeleton for tipping point phenomena to take place when such models are subject to appropriate stochastic disturbances and parameter drift, causing the system to “tip,” i.e. move away from one branch of attractors to another; see [33, 34]. From an observational viewpoint, the study of tipping points has gained considerable attention due to their role in climate change, as a few components of the climate system (e.g. the Amazon forest, the AMOC) have been identified as candidates for experiencing such critical transitions if forced beyond the point of no return [54, 78, 53].

Whatever the context, tipping phenomena are due to a subtle interplay between nonlinearity, slow parameter drift, and fast disturbances. A better understanding of how these interactions cooperate to produce a tipping phenomenon could help improve the design of early warning signals. Although we will not focus on this latter point per se in this study, we show below, on the Cessi model, that the OPM framework provides useful insights in that direction, by demonstrating its ability to derive reduced models that accurately predict the crossing of a tipping point; see Sec. IV.3.

IV.2 OPM results for a fixed FF value: Noise-induced transitions

We report in this section on the skills of the OPM framework in deriving accurate reduced models that reproduce the noise-induced transitions experienced by the Cessi model (37) when subject to fast disturbances, for a fixed value of FF. The training of the OPM operated here serves as a basis for the harder tipping point prediction problem dealt with in Sec. IV.3 below, when FF is allowed to drift slowly through the critical value Fc2F_{c_{2}} at which the lower branch of steady states experiences a saddle-node bifurcation manifested by a turning point; see Fig. 2A again. Recall that Fc1F_{c_{1}} denotes here the FF-value corresponding to the turning point experienced by the upper branch.

Reformulation of the Cessi model (37). The 1D OPM reduced equation is obtained as follows. First, we fix an arbitrary value of FF in [Fc1,Fc2][F_{c_{1}},F_{c_{2}}], denoted by FrefF_{\mathrm{ref}} and marked by the green vertical dashed line in Fig. 2A. The system (37) is then rewritten for the fluctuation variables y(t)=y(t)y¯y^{\prime}(t)=y(t)-\overline{y} and z(t)=z(t)z¯z^{\prime}(t)=z(t)-\overline{z}, where 𝑿¯=(y¯,z¯)\mkern 1.5mu\overline{\mkern-3.2mu\bm{X}\mkern-1.0mu}\mkern 1.5mu=(\overline{y},\overline{z}) denotes the steady state of Eq. (37) on the lower branch when F=FrefF=F_{\mathrm{ref}}.

The resulting equation for 𝜹=(y,z)\bm{\delta}=(y^{\prime},z^{\prime}) is then of the form

𝜹˙=A𝜹+G2(𝜹)+G3(𝜹),\dot{\bm{\delta}}=A\bm{\delta}+G_{2}(\bm{\delta})+G_{3}(\bm{\delta}), (38)

with AA, G2G_{2}, G3G_{3} given by:

A=(1μ(z¯y¯)2+2μy¯(z¯y¯)2μy¯(z¯y¯)2μz¯(z¯y¯)1ϵ1μ(z¯y¯)22μz¯(z¯y¯)),A=\begin{pmatrix}-1-\mu(\overline{z}-\overline{y})^{2}+2\mu\overline{y}(\overline{z}-\overline{y})&-2\mu\overline{y}(\overline{z}-\overline{y})\\ 2\mu\overline{z}(\overline{z}-\overline{y})&-\frac{1}{\epsilon}-1-\mu(\overline{z}-\overline{y})^{2}-2\mu\overline{z}(\overline{z}-\overline{y})\end{pmatrix}, (39)
G2(𝜹)=(μy¯(zy)22μ(z¯y¯)y(zy)μz¯(zy)22μ(z¯y¯)z(zy)) and G3(𝜹)=(μy(zy)2μz(zy)2).G_{2}(\bm{\delta})=\begin{pmatrix}-\mu\overline{y}(z^{\prime}-y^{\prime})^{2}-2\mu(\overline{z}-\overline{y})y^{\prime}(z^{\prime}-y^{\prime})\\ -\mu\overline{z}(z^{\prime}-y^{\prime})^{2}-2\mu(\overline{z}-\overline{y})z^{\prime}(z^{\prime}-y^{\prime})\end{pmatrix}\quad\text{ and }\quad G_{3}(\bm{\delta})=\begin{pmatrix}-\mu y^{\prime}(z^{\prime}-y^{\prime})^{2}\\ -\mu z^{\prime}(z^{\prime}-y^{\prime})^{2}\end{pmatrix}. (40)

Since 𝑿¯\mkern 1.5mu\overline{\mkern-3.2mu\bm{X}\mkern-1.0mu}\mkern 1.5mu is a locally stable steady state, we add noise to the first component of 𝜹\bm{\delta} to trigger transitions from the lower branch to the top branch in order to learn an OPM that can operate not only locally near 𝑿¯\mkern 1.5mu\overline{\mkern-3.2mu\bm{X}\mkern-1.0mu}\mkern 1.5mu but also when the dynamics is visiting the upper branch. This leads to

𝜹˙=A𝜹+G2(𝜹)+G3(𝜹)+𝝈W˙(t),\dot{\bm{\delta}}=A\bm{\delta}+G_{2}(\bm{\delta})+G_{3}(\bm{\delta})+\bm{\sigma}\dot{W}(t), (41)

where 𝝈=(σ,0)T\bm{\sigma}=(\sigma,0)^{T} and WW denotes a standard one-dimensional two-sided Brownian motion.
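Stochastic systems of this type are integrated in practice with a scheme such as Euler-Maruyama (used for the numerical results below). As a hedged, self-contained sketch, we apply the scheme to a scalar Ornstein-Uhlenbeck test equation dX = κX dt + σ dW, an illustrative choice rather than the Cessi model itself, for which the stationary variance σ²/(2|κ|) is known in closed form.

```python
import math
import random

# Euler-Maruyama integration of the scalar Ornstein-Uhlenbeck test equation
# dX = kappa*X dt + sigma dW, with kappa < 0.  The long-run sample variance
# should approach the stationary value sigma^2 / (2*|kappa|) = 0.5.
random.seed(0)
kappa, sigma, dt = -1.0, 1.0, 1e-3
sqrt_dt = math.sqrt(dt)

X, second_moment, n = 0.0, 0.0, 1_000_000
for _ in range(n):
    # one Euler-Maruyama step: drift * dt + diffusion * sqrt(dt) * N(0, 1)
    X += dt * kappa * X + sigma * sqrt_dt * random.gauss(0.0, 1.0)
    second_moment += X * X
var_est = second_moment / n

assert abs(var_est - 0.5) < 0.12     # matches sigma^2/(2|kappa|) = 0.5
```

The same stepping rule, applied componentwise with the drift of (41), is what generates the noise-induced transitions discussed next.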

Figure 2: Panel A: The S-shaped bifurcation diagram of the Stommel-Cessi model (37) as the parameter FF is varied, shown here for μ=6.2\mu=6.2 and ϵ=0.1\epsilon=0.1. The two branches of locally stable steady states are marked by the solid black curves, while the branch of unstable steady states is marked by the dashed black curve. The two vertical gray lines mark respectively Fc1=0.8513F_{c_{1}}=0.8513 and Fc2=0.8821F_{c_{2}}=0.8821, at which the saddle-node bifurcations occur. Panel B: Parameterization of δ2\delta^{\prime}_{2} by the OPM, Φ2(τ,δ1,t)\Phi_{2}(\tau^{*},\delta^{\prime}_{1},t) given by (44), where FF is fixed at Fref=0.855F_{\mathrm{ref}}=0.855, marked by the vertical green line in Panel A, and the noise strength parameter σ\sigma in (41) is taken to be σ=ϵ\sigma=\sqrt{\epsilon}, leading to σ10.3399\sigma_{1}\approx 0.3399 and σ20.0893\sigma_{2}\approx-0.0893 in (42). Panel C: The normalized parameterization defect QQ for Φ2(τ,δ1,t)\Phi_{2}(\tau,\delta^{\prime}_{1},t) as τ\tau is varied. Panel D: The performance of the OPM reduced equation (46) in reproducing the noise-induced transitions experienced by y(t)y(t) from the stochastically forced Stommel-Cessi model (41). Once (46) is solved, the approximation of yy is constructed using (47).

The eigenvalues of AA have negative real parts since 𝑿¯\mkern 1.5mu\overline{\mkern-3.2mu\bm{X}\mkern-1.0mu}\mkern 1.5mu is locally stable. We assume that the matrix AA has two distinct real eigenvalues, which turns out to be the case for a broad range of parameter regimes. As in Sec. III, the spectral elements of the matrix AA (resp. AA^{\ast}) are denoted by (λj,𝒆j)(\lambda_{j},\bm{e}_{j}) (resp. (λj,𝒆j)(\lambda^{*}_{j},\bm{e}^{*}_{j})), for j=1,2j=1,2. These eigenmodes are normalized to satisfy 𝒆j,𝒆j=1\langle\bm{e}_{j},\bm{e}^{*}_{j}\rangle=1, and are bi-orthogonal otherwise.

Let us introduce 𝜹=(𝜹,𝒆1,𝜹,𝒆2)T\bm{\delta}^{\prime}=(\langle\bm{\delta},\bm{e}^{*}_{1}\rangle,\langle\bm{\delta},\bm{e}^{*}_{2}\rangle)^{T} and σj=𝝈,𝒆j\sigma_{j}=\langle\bm{\sigma},\bm{e}^{*}_{j}\rangle, for j=1,2j=1,2. We also introduce Λ=diag(λ1,λ2)\Lambda=\mbox{diag}(\lambda_{1},\lambda_{2}). In the eigenbasis, Eq. (41) can be written then as

𝜹˙=Λ𝜹+𝒢2(𝜹)+𝒢3(𝜹)+(σ1W˙(t),σ2W˙(t))T,\dot{\bm{\delta}^{\prime}}=\Lambda\bm{\delta}^{\prime}+\mathcal{G}_{2}(\bm{\delta}^{\prime})+\mathcal{G}_{3}(\bm{\delta}^{\prime})+(\sigma_{1}\dot{W}(t),\sigma_{2}\dot{W}(t))^{T}, (42)

with

𝒢kj(𝜹)=Gk(δ1𝒆1+δ2𝒆2),𝒆j,j=1,2,\mathcal{G}_{k}^{j}(\bm{\delta}^{\prime})=\Big{\langle}G_{k}\Big{(}\delta^{\prime}_{1}\bm{e}_{1}+\delta^{\prime}_{2}\bm{e}_{2}\Big{)},\bm{e}^{*}_{j}\Big{\rangle},\;j=1,2, (43)

where δj\delta^{\prime}_{j} denotes the jthj^{\mathrm{th}} component of 𝜹\bm{\delta}^{\prime}.

Derivation of the OPM reduced equation. We can now use the formulas of Sec. III.3 to obtain an explicit variational parameterization of the most stable direction carrying the variable δ2\delta^{\prime}_{2} here, in terms of the least stable one, carrying δ1\delta^{\prime}_{1}.

For Eq. (42), both of the forcing terms appearing in the pp- and qq-equations of the corresponding BF system (28) are stochastic (and thus time-dependent). To simplify, we omit the stochastic forcing in Eq. (28a) and work with the OPM formula given by (32) which, as shown below, is sufficient for deriving an efficient reduced system.

Thus, the formula (32) becomes in this context

Φ2(τ,X,t)\displaystyle\Phi_{2}(\tau,X,t) =D112(τ)B112X2+Zτ(t),\displaystyle=D_{11}^{2}(\tau)B_{11}^{2}X^{2}+Z_{\tau}(t), (44)

where D112D_{11}^{2} is given by (30), B112=𝒢2(𝒆1),𝒆2B_{11}^{2}=\langle\mathcal{G}_{2}(\bm{e}_{1}),\bm{e}^{*}_{2}\rangle, and

Zτ(t)=σ2eλ2ttτteλ2sW˙(s)ds.Z_{\tau}(t)=\sigma_{2}e^{\lambda_{2}t}\int_{t-\tau}^{t}e^{-\lambda_{2}s}\dot{W}(s)\,\mathrm{d}s. (45)

Once the optimal τ\tau^{*} is obtained by minimizing the parameterization defect given by (4), we obtain the following OPM reduced equation for δ1\delta^{\prime}_{1}:

X˙\displaystyle\dot{X} =λ1X+G2(X𝒆1+Φ2(τ,X,t)𝒆2),𝒆1\displaystyle=\lambda_{1}X+\Big{\langle}G_{2}(X\bm{e}_{1}+\Phi_{2}(\tau^{*},X,t)\bm{e}_{2}),\bm{e}^{*}_{1}\Big{\rangle} (46)
+G3(X𝒆1+Φ2(τ,X,t)𝒆2),𝒆1+σ1W˙(t).\displaystyle\quad+\langle G_{3}(X\bm{e}_{1}+\Phi_{2}(\tau^{*},X,t)\bm{e}_{2}),\bm{e}^{*}_{1}\rangle+\sigma_{1}\dot{W}(t).

Obtaining X(t)X(t) online by simulation of the OPM reduced equation Eq. (46) allows us to get the following approximation of the variables (y(t),z(t))(y(t),z(t)) of the original Cessi model (37):

(yapp(t),zapp(t))T=X(t)𝒆1+Φ2(τ,X(t),t)𝒆2+𝑿¯,(y_{\mathrm{app}}(t),z_{\mathrm{app}}(t))^{T}=X(t)\bm{e}_{1}+\Phi_{2}(\tau^{*},X(t),t)\bm{e}_{2}+\mkern 1.5mu\overline{\mkern-3.2mu\bm{X}\mkern-1.0mu}\mkern 1.5mu, (47)

after going back to the original variables.

Numerical results. The OPM reduced equation Eq. (46) is able to reproduce the dynamics of δ1\delta^{\prime}_{1} for a wide range of parameter regimes. We show in Fig. 2 a typical example of these skills for the values of μ\mu, ϵ\epsilon, σ\sigma and F=FrefF=F_{\mathrm{ref}} listed in Table 1. Since FrefF_{\mathrm{ref}} lies in [Fc1,Fc2][F_{c_{1}},F_{c_{2}}] (see Table 1), the Cessi model (37) admits three steady states, among which two are locally stable (lower and upper branches) and the other is unstable (middle branch); see Fig. 2A again. For this choice of FrefF_{\mathrm{ref}}, the steady state corresponding to the lower branch is X¯=(y¯,z¯)\mkern 1.5mu\overline{\mkern-3.2muX\mkern-1.0mu}\mkern 1.5mu=(\overline{y},\overline{z}) with y¯=0.4130\overline{y}=0.4130, z¯=0.8285\overline{z}=0.8285, and the eigenvalues of AA are λ1=0.5168\lambda_{1}=-0.5168 and λ2=15.7650\lambda_{2}=-15.7650.

Table 1: Parameter values
ϵ\epsilon σ\sigma μ\mu Fc1F_{c_{1}} Fc2F_{c_{2}} FrefF_{\mathrm{ref}}
0.1 ϵ\sqrt{\epsilon} 6.2 0.8513 0.8821 0.855

The offline trajectory 𝜹(t)\bm{\delta}^{\prime}(t) used as input for training Φ2\Phi_{2} to find the optimal τ\tau is taken as driven by an arbitrary Brownian path from Eq. (42), for tt in the time interval [20,80][20,80]. The resulting offline skills of the OPM, Φ2(τ,δ1(t),t)\Phi_{2}(\tau^{\ast},\delta_{1}^{\prime}(t),t), are shown as the blue curve in Fig. 2B, while the original targeted time series δ2(t)\delta^{\prime}_{2}(t) to parameterize is shown in black. The optimal τ\tau that minimizes the normalized parameterization defect QQ turns out to be \infty for the considered regime, as shown in Fig. 2C. One observes that the OPM captures, on average, the fluctuations of δ2(t)\delta_{2}^{\prime}(t); compare the blue curve with the black curve.

The skills of the corresponding OPM reduced equation (46) are shown in Fig. 2D, after converting back to the original (y,z)(y,z)-variables using (47). The results are shown out of sample, i.e. for another noise path and over a time interval different from the one used for training. The 1D OPM reduced equation (46) is able to capture the multiple noise-induced transitions occurring across the two equilibria (marked by the cyan lines); compare red with black curves in Fig. 2D. Both the Cessi model (41) and the OPM reduced equation are numerically integrated using the Euler-Maruyama scheme with time step δt=103\delta t=10^{-3}.

Note that we chose here the numerical value of FrefF_{\mathrm{ref}} to be closer to Fc1F_{c_{1}} than to Fc2F_{c_{2}} in order to make the tipping point prediction experiment conducted in Sec. IV.3 below more challenging. There, we indeed train the OPM for F=FrefF=F_{\mathrm{ref}} while aiming at predicting the tipping phenomenon as FF drifts slowly through Fc2F_{c_{2}} (located thus further away) as time evolves.

IV.3 Predicting the crossing of tipping points

OPM reduced equation (46) in the original coordinates. For better interpretability, we first rewrite the OPM reduced equation (46) in the original coordinates in which the Cessi model (37) is formulated. For that purpose, we exploit the components of the eigenvectors of AA given by (39) (for F=FrefF=F_{\mathrm{ref}}), which we write as 𝒆1=(e11,e12)T\bm{e}_{1}=(e_{11},e_{12})^{T} and 𝒆2=(e21,e22)T\bm{e}_{2}=(e_{21},e_{22})^{T}. Then, from (47), the expression of yappy_{\mathrm{app}} rewrites as

yapp(t)\displaystyle y_{\mathrm{app}}(t) =e11X(t)+e21Φ2(τ,X(t),t)+y¯\displaystyle=e_{11}X(t)+e_{21}\Phi_{2}(\tau^{*},X(t),t)+\overline{y} (48)
=e11X(t)+γ(τ)X(t)2+e21Zτ(t)+y¯,\displaystyle=e_{11}X(t)+\gamma(\tau^{*})X(t)^{2}+e_{21}Z_{\tau^{*}}(t)+\overline{y},

where the second line is obtained by using the expression of Φ2\Phi_{2} given by (44) and the notation γ(τ)=e21D112(τ)B112\gamma(\tau^{*})=e_{21}D_{11}^{2}(\tau^{*})B_{11}^{2}. It turns out that

X(t)=e11+(e11)24γ(τ)(e21Zτ(t)+y¯yapp(t))2γ(τ)=defφ(τ,yapp(t),t),t0.X(t)=\frac{-e_{11}+\sqrt{(e_{11})^{2}-4\gamma(\tau^{*})\big{(}e_{21}Z_{\tau^{*}}(t)+\overline{y}-y_{\mathrm{app}}(t)\big{)}}}{2\gamma(\tau^{*})}\stackrel{{\scriptstyle\mathrm{def}}}{{=}}\varphi(\tau^{*},y_{\mathrm{app}}(t),t),\;\;t\geq 0. (49)

From (47), we also have

zapp(t)\displaystyle z_{\mathrm{app}}(t) =e12X(t)+e22Φ2(τ,X(t),t)+z¯\displaystyle=e_{12}X(t)+e_{22}\Phi_{2}(\tau^{*},X(t),t)+\overline{z} (50)
=e12X(t)+γ2(τ)X(t)2+e22Zτ(t)+z¯,\displaystyle=e_{12}X(t)+\gamma_{2}(\tau^{*})X(t)^{2}+e_{22}Z_{\tau^{*}}(t)+\overline{z},

where γ2(τ)=e22D112(τ)B112\gamma_{2}(\tau^{*})=e_{22}D_{11}^{2}(\tau^{*})B_{11}^{2}.

We can now express zapp(t)z_{\mathrm{app}}(t) as a function of yapp(t)y_{\mathrm{app}}(t), i.e. zapp(t)=Ψ(τ,yapp(t),t)z_{\mathrm{app}}(t)=\Psi(\tau^{*},y_{\mathrm{app}}(t),t) with Ψ\Psi given by

Ψ(τ,yapp,t)=z¯\displaystyle\Psi(\tau^{*},y_{\mathrm{app}},t)=\overline{z} +γ2(τ)φ(τ,yapp,t)2\displaystyle+\gamma_{2}(\tau^{*})\varphi(\tau^{*},y_{\mathrm{app}},t)^{2} (51)
+e12φ(τ,yapp,t)+e22Zτ(t),\displaystyle\;\;+e_{12}\varphi(\tau^{*},y_{\mathrm{app}},t)+e_{22}Z_{\tau^{*}}(t),

as obtained by replacing X(t)X(t) with φ(τ,yapp,t)\varphi(\tau^{\ast},y_{\mathrm{app}},t) in Eq. (50). Replacing yappy_{\mathrm{app}} by the dummy variable yy, one observes that Ψ\Psi provides a time-dependent manifold that parameterizes the variable zz in the original Cessi model as (a time-dependent) polynomial function of radical functions in yy due to the expression of φ(τ,y,t)\varphi(\tau^{\ast},y,t); see (49).
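As a quick sanity check of the inversion just used, one can verify that the radical expression (49) composed with the quadratic relation (48) returns X exactly; the coefficient values below are illustrative, not the actual eigenvector entries of A.

```python
import math

# Verify that (49) inverts the quadratic relation (48): the discriminant is
# the perfect square (e11 + 2*gamma*X)^2, so the "+" root recovers X whenever
# e11 + 2*gamma*X > 0.  All coefficient values here are illustrative choices.
e11, gamma, e21, ybar, Z = 0.9, -0.35, 0.2, 0.413, 0.01
X = 0.3

y_app = e11 * X + gamma * X ** 2 + e21 * Z + ybar            # relation (48)
disc = e11 ** 2 - 4 * gamma * (e21 * Z + ybar - y_app)
X_rec = (-e11 + math.sqrt(disc)) / (2 * gamma)               # inversion (49)

assert abs(X_rec - X) < 1e-12
```

The choice of the "+" branch thus selects the solution that vanishes with the fluctuation amplitude, consistent with X being a perturbation variable.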

We are now in a position to derive the equation satisfied by yappy_{\mathrm{app}}, aimed at approximating yy in the original variables. For that purpose, we first differentiate with respect to time the expression of X(t)X(t) given in (49) to obtain

X˙=e21Z˙τ(t)+y˙app(e11)24γ(τ)(e21Zτ(t)+y¯yapp).\dot{X}=\frac{-e_{21}\dot{Z}_{\tau^{*}}(t)+\dot{y}_{\mathrm{app}}}{\sqrt{(e_{11})^{2}-4\gamma(\tau^{*})\big{(}e_{21}Z_{\tau^{*}}(t)+\overline{y}-y_{\mathrm{app}}\big{)}}}. (52)

On the other hand, by taking into account the expressions of y_{\mathrm{app}} in (48) and z_{\mathrm{app}} in (50), we observe that the X-equation (46) can be re-written as

\dot{X}=\langle\mathcal{F}(y_{\mathrm{app}}(t),z_{\mathrm{app}}(t))+\bm{\sigma}\dot{W}(t),\bm{e}^{*}_{1}\rangle, (53)

with

\mathcal{F}(y,z)=\begin{pmatrix}F_{\mathrm{ref}}-y[1+\mu(z-y)^{2}]\\ -\frac{1}{\epsilon}(z-1)-z[1+\mu(z-y)^{2}]\end{pmatrix}, (54)

given by the right-hand side (RHS) of the Cessi model (LABEL:Cessi_model) with F=F_{\mathrm{ref}}.

Now, by equating the RHS of Eq. (53) with that of Eq. (52), we obtain, after substituting z_{\mathrm{app}} with \Psi(\tau^{*},y_{\mathrm{app}},t), that y_{\mathrm{app}} satisfies the following equation

\dot{Y}=\alpha(\tau^{*},Y)\Big\langle\mathcal{F}(Y,\Psi(\tau^{*},Y,t))+\bm{\sigma}\dot{W}(t),\bm{e}^{*}_{1}\Big\rangle (55)
+e_{21}\dot{Z}_{\tau^{*}}(t),

where

\alpha(\tau^{*},Y)=\sqrt{(e_{11})^{2}-4\gamma(\tau^{*})\big(e_{21}Z_{\tau^{*}}(t)+\overline{y}-Y\big)}, (56)

and \dot{Z}_{\tau^{*}} is given by (cf. (45))

\dot{Z}_{\tau^{*}}(t)=\lambda_{2}Z_{\tau^{*}}(t)+\sigma_{2}\dot{W}(t)-\sigma_{2}e^{\lambda_{2}\tau^{*}}\dot{W}(t-\tau^{*}). (57)

Eq. (55) is the OPM reduced equation for the y-variable of the Cessi model, rewritten in the original system of coordinates. In its analytic formulation, this equation mixes information about the two components of the RHS of Eq. (LABEL:Cessi_model) through the inner product involved in Eq. (55), while parameterizing the z-variable by the time-dependent parameterization \Psi(\tau^{*},\cdot,t) given by (51). Although designed for F=F_{\mathrm{ref}}, away from the lower tipping point, we show next that the OPM reduced system built this way demonstrates a remarkable ability to predict this tipping point when F is allowed to drift away from F_{\mathrm{ref}} in the course of time.

Prediction results. We thus aim at predicting, with our OPM reduced model, the tipping phenomenon experienced by the full model when F is subject to a slow drift in time via

F(t)=F_{0}+\kappa t, (58)

where \kappa>0 is a small parameter, and F_{0} is some fixed value of F such that F_{0}<F_{c_{2}}, with F_{c_{2}} denoting the parameter value of the turning point of the lower branch of equilibria; see Fig. 2A.

The original Cessi model is forced as follows

\dot{y}=F(t)-y[1+\mu(z-y)^{2}]+\sigma\dot{W}_{t}, (59)
\dot{z}=-\frac{1}{\epsilon}(z-1)-z[1+\mu(z-y)^{2}].

Introducing g(t)=\langle(F(t)-F_{\mathrm{ref}},0)^{T},\bm{e}^{*}_{1}\rangle, and writing F(t) as F_{\mathrm{ref}}+(F(t)-F_{\mathrm{ref}}), we consider the following forced version of the OPM reduced equation (55):

\dot{Y}=\alpha(\tau^{*},Y)\Big\langle\mathcal{F}(Y,\Psi(\tau^{*},Y,t))+\bm{\sigma}\dot{W}(t),\bm{e}^{*}_{1}\Big\rangle (60)
+e_{21}\dot{Z}_{\tau^{*}}(t)+g(t).

In this equation, recall that the parameterization \Psi(\tau^{*},\cdot,t) has been trained for F=F_{\mathrm{ref}}; see above.

We now set F_{0}=0.85, \kappa=2\times 10^{-4}, and \sigma=\sqrt{\epsilon}/50, while \mu=6.2 and \epsilon=0.1 are kept as in Table 1. Note that compared to the results shown in Fig. 2, the noise level is here substantially reduced in order to focus on the tipping phenomenon occurring in the vicinity of the turning point F=F_{c_{2}}. The goal is thus to compare the behavior of y(t) solving the forced Cessi model (LABEL:Cessi_model_v3) with that of the solution Y(t) of the (forced) OPM reduced equation (60), as F(t) crosses the critical value F_{c_{2}}.
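For readers wishing to reproduce this experiment numerically, the forced Cessi model (59) with the slow drift (58) can be integrated by a standard Euler–Maruyama scheme. The sketch below is illustrative only: the function names, step size, horizon, and initial condition are our own choices, not taken from the text; the parameter values are those stated above.

```python
import numpy as np

# Parameter values from the text (Table 1 and the tipping experiment).
mu, eps = 6.2, 0.1
F0, kappa = 0.85, 2e-4
sigma = np.sqrt(eps) / 50.0

def cessi_rhs(y, z, F):
    """Deterministic part of the forced Cessi model (59)."""
    coupling = 1.0 + mu * (z - y) ** 2
    return F - y * coupling, -(z - 1.0) / eps - z * coupling

def simulate(T=100.0, dt=1e-3, y0=1.0, z0=1.0, seed=0):
    """Euler--Maruyama integration with the slow drift F(t) = F0 + kappa*t of (58);
    the noise acts on the y-equation only, cf. (59)."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    y, z = y0, z0
    ys = np.empty(n)
    for k in range(n):
        dy, dz = cessi_rhs(y, z, F0 + kappa * (k * dt))
        y += dy * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        z += dz * dt
        ys[k] = y
    return ys
```

On horizons of order \kappa^{-1}, the simulated trajectory drifts through F_{c_{2}} and a tipping of the kind shown in Fig. 3 can be observed.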

The results are shown in Fig. 3. The red curve corresponds to the solution of the OPM reduced equation (60), and the black curve to the y-component of the forced Cessi model (LABEL:Cessi_model_v3). Our baseline is the slow manifold parameterization, which consists of simply parameterizing z as z=1+\mathcal{O}(\epsilon) in Eq. (60) and which provides, for \epsilon sufficiently small due to Tikhonov's theorem [79, 56], good reduction skills from the "slow" reduced equation,

\dot{y}=F(t)-y[1+\mu(1-y)^{2}]+\sigma\dot{W}_{t}. (61)

Here, the value \epsilon=0.1 lies beyond the domain of applicability of Tikhonov's theorem, and as a result the slow reduced Eq. (61) fails to predict any tipping phenomenon; see the yellow curve in Fig. 3. In contrast, the OPM reduced equation (60) demonstrates a striking success in predicting the tipping phenomenon to the upper branch, in spite of being trained away from the targeted turning point, at F=F_{\mathrm{ref}} as marked by the vertical green dashed line. The only caveat is the overshoot observed in the magnitude of the predicted random steady state on the upper branch.

Figure 3: The tipping phenomenon as predicted by the OPM reduced equation (60) (red curve) compared with that experienced by the full Cessi model (LABEL:Cessi_model_v3) (black curve). The OPM is trained for F=F_{\mathrm{ref}}=0.855, as marked by the vertical green line. Also shown in yellow is the result obtained from the slow manifold reduced Eq. (61).

The striking prediction results of Fig. 3 are shown for one noise realization. We now explore the accuracy with which the OPM reduced equation (60) predicts such a tipping phenomenon as the noise realization is varied.

To do so, we estimate the statistical distribution of the F-value at which tipping takes place, denoted by F_{\mathrm{transition}}, for both the Cessi model (LABEL:Cessi_model_v3) and Eq. (60). These distributions are estimated as follows. We denote by \bar{y}_{c} the y-component of the steady state at which the saddle-node bifurcation occurs on the lower branch for F=F_{c_{2}}. As noise is turned on and F(t) evolves slowly through F_{c_{2}}, the solution path y(t) of Eq. (LABEL:Cessi_model_v3) increases on average while fluctuating around the \bar{y}_{c}-value (due to small noise) before shooting off to the upper branch as one nears F_{c_{2}}. During this process, there is a time instant, denoted by t_{\mathrm{transition}}, such that for all t>t_{\mathrm{transition}}, y(t) stays above \bar{y}_{c}. We denote the F-value corresponding to t_{\mathrm{transition}} by F_{\mathrm{transition}}, according to (58). Whatever the noise realization, the solution to the OPM reduced equation (60) experiences the same phenomenon, thus leading to its own F_{\mathrm{transition}} for a particular noise realization. As shown by the histograms in Fig. 4, the distribution of F_{\mathrm{transition}} predicted by the OPM reduced equation (60) (blue curve) closely follows that computed from the full system (LABEL:Cessi_model_v3) (orange bars). Thus, not only is the OPM reduced equation (60) qualitatively able to reproduce the passage through a tipping point (as shown in Fig. 3), it is also able to accurately predict the critical F-value (or time instant) at which the tipping phenomenon takes place, with an overall success rate over 99%.
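The detection of F_{\mathrm{transition}} just described can be sketched as follows for a sampled trajectory. The function name and the array-based trajectory format are our own; \bar{y}_{c} must be supplied from the bifurcation diagram.

```python
import numpy as np

def transition_F(y, t, y_c, F0=0.85, kappa=2e-4):
    """F-value corresponding to t_transition: the time after which the sampled
    trajectory y(t) stays above y_c (the y-component of the lower-branch
    turning point, read off the bifurcation diagram). F(t) = F0 + kappa*t as in (58)."""
    above = y > y_c
    if not above[-1]:
        return None  # no tipping within the simulated window
    below = np.flatnonzero(~above)
    t_trans = t[below[-1] + 1] if below.size else t[0]
    return F0 + kappa * t_trans
```

A histogram of transition_F over many noise seeds yields the kind of distribution shown in Fig. 4.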

Figure 4: Statistical distribution of the threshold value of F at which the tipping phenomenon occurs, for both the full Cessi model (LABEL:Cessi_model_v3) (orange bars) and the OPM reduced equation (60) (blue curve). The histograms are computed based on 10^{6} arbitrarily chosen noise realizations.

V Predicting Higher-Order Critical Transitions

V.1 Problem formulation

In this section, we aim at applying our variational parameterization framework to the following Rayleigh-Bénard (RB) system from [80]:

\dot{C_{1}} =-\sigma b_{1}C_{1}-C_{2}C_{4}+b_{4}C_{4}^{2}+b_{3}C_{3}C_{5}-\sigma b_{2}C_{7}, (62)
\dot{C_{2}} =-\sigma C_{2}+C_{1}C_{4}-C_{2}C_{5}+C_{4}C_{5}-\frac{\sigma}{2}C_{9},
\dot{C_{3}} =-\sigma b_{1}C_{3}+C_{2}C_{4}-b_{4}C_{2}^{2}-b_{3}C_{1}C_{5}+\sigma b_{2}C_{8},
\dot{C_{4}} =-\sigma C_{4}-C_{2}C_{3}-C_{2}C_{5}+C_{4}C_{5}+\frac{\sigma}{2}C_{9},
\dot{C_{5}} =-\sigma b_{5}C_{5}+\frac{1}{2}C_{2}^{2}-\frac{1}{2}C_{4}^{2},
\dot{C_{6}} =-b_{6}C_{6}+C_{2}C_{9}-C_{4}C_{9},
\dot{C_{7}} =-b_{1}C_{7}-rC_{1}+2C_{5}C_{8}-C_{4}C_{9},
\dot{C_{8}} =-b_{1}C_{8}+rC_{3}-2C_{5}C_{7}+C_{2}C_{9},
\dot{C_{9}} =-C_{9}-C_{2}(r+2C_{6}+C_{8})+C_{4}(r+2C_{6}+C_{7}).

Here \sigma denotes the Prandtl number, and r denotes the reduced Rayleigh number, defined as the ratio between the Rayleigh number R and its critical value R_{c} at which convection sets in. The coefficients b_{i} are given by

b_{1}=\frac{4(1+a^{2})}{1+2a^{2}},\quad b_{2}=\frac{1+2a^{2}}{2(1+a^{2})},\quad b_{3}=\frac{2(1-a^{2})}{1+a^{2}},
b_{4}=\frac{a^{2}}{1+a^{2}},\quad b_{5}=\frac{8a^{2}}{1+2a^{2}},\quad b_{6}=\frac{4}{1+2a^{2}},

with a=\frac{1}{2} corresponding to the critical horizontal wavenumber of the square convection cell. This system is obtained as a Fourier truncation of the hydrodynamic equations describing Rayleigh-Bénard (RB) convection in a 3D box [80]. The Prandtl number is chosen to be \sigma=0.5 in the experiments performed below, the same value as used in [80]. The reduced Rayleigh number r is varied across these experiments; see Table 2.

Our goal is to assess the ability of our variational parameterization framework to predict higher-order transitions arising in (62) by training the OPM only with data prior to the transition we aim at predicting. The reduced Rayleigh number r is our control parameter. As it increases, Eq. (LABEL:Eq_RBC_eigenbasis) undergoes several critical transitions/bifurcations, leading to chaos via a period-doubling cascade [80]. We focus on the prediction of two transition scenarios beyond the Hopf bifurcation: (I) the period-doubling bifurcation, and (II) the transition from a period-doubling regime to chaos. Noteworthy are the failures that standard invariant manifold theory or AIM encounter in predicting these transitions here; see Appendix D.

To do so, we rewrite Eq. (62) in the following compact form:

\dot{\bm{C}}=L\bm{C}+B(\bm{C},\bm{C}), (63)

where \bm{C}=(C_{1},\cdots,C_{9})^{T}, and L is the 9\times 9 matrix given by

L=\begin{pmatrix}-\sigma b_{1}&0&0&0&0&0&-\sigma b_{2}&0&0\\ 0&-\sigma&0&0&0&0&0&0&-\frac{\sigma}{2}\\ 0&0&-\sigma b_{1}&0&0&0&0&\sigma b_{2}&0\\ 0&0&0&-\sigma&0&0&0&0&\frac{\sigma}{2}\\ 0&0&0&0&-\sigma b_{5}&0&0&0&0\\ 0&0&0&0&0&-b_{6}&0&0&0\\ -r&0&0&0&0&0&-b_{1}&0&0\\ 0&0&r&0&0&0&0&-b_{1}&0\\ 0&-r&0&r&0&0&0&0&-1\end{pmatrix}.

The nonlinear term B is defined as follows. For any \bm{\phi}=(\phi_{1},\cdots,\phi_{9})^{T} and \bm{\psi}=(\psi_{1},\cdots,\psi_{9})^{T} in \mathbb{C}^{9}, we have

B(\bm{\phi},\bm{\psi})=\begin{pmatrix}-\phi_{2}\psi_{4}+b_{4}\phi_{4}\psi_{4}+b_{3}\phi_{3}\psi_{5}\\ \phi_{1}\psi_{4}-\phi_{2}\psi_{5}+\phi_{4}\psi_{5}\\ \phi_{2}\psi_{4}-b_{4}\phi_{2}\psi_{2}-b_{3}\phi_{1}\psi_{5}\\ -\phi_{2}\psi_{3}-\phi_{2}\psi_{5}+\phi_{4}\psi_{5}\\ \frac{1}{2}\phi_{2}\psi_{2}-\frac{1}{2}\phi_{4}\psi_{4}\\ \phi_{2}\psi_{9}-\phi_{4}\psi_{9}\\ 2\phi_{5}\psi_{8}-\phi_{4}\psi_{9}\\ -2\phi_{5}\psi_{7}+\phi_{2}\psi_{9}\\ -2\phi_{2}\psi_{6}-\phi_{2}\psi_{8}+2\phi_{4}\psi_{6}+\phi_{4}\psi_{7}\end{pmatrix}. (64)
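As a consistency check, the compact form (63)-(64) can be implemented directly; evaluating L\bm{C}+B(\bm{C},\bm{C}) then reproduces the componentwise right-hand side of (62). The sketch below assumes \sigma=0.5 and a=1/2 as in the text; the function names are ours.

```python
import numpy as np

# Coefficients b_i for a = 1/2, and the Prandtl number sigma = 0.5 used in the text.
a2 = 0.25
b1 = 4 * (1 + a2) / (1 + 2 * a2)
b2 = (1 + 2 * a2) / (2 * (1 + a2))
b3 = 2 * (1 - a2) / (1 + a2)
b4 = a2 / (1 + a2)
b5 = 8 * a2 / (1 + 2 * a2)
b6 = 4 / (1 + 2 * a2)
sigma = 0.5

def L_matrix(r):
    """Linear part L of the compact form (63)."""
    L = np.zeros((9, 9))
    L[0, 0], L[0, 6] = -sigma * b1, -sigma * b2
    L[1, 1], L[1, 8] = -sigma, -sigma / 2
    L[2, 2], L[2, 7] = -sigma * b1, sigma * b2
    L[3, 3], L[3, 8] = -sigma, sigma / 2
    L[4, 4] = -sigma * b5
    L[5, 5] = -b6
    L[6, 0], L[6, 6] = -r, -b1
    L[7, 2], L[7, 7] = r, -b1
    L[8, 1], L[8, 3], L[8, 8] = -r, r, -1.0
    return L

def B(p, q):
    """Bilinear term (64) (0-based indexing: p[0] is phi_1, etc.)."""
    return np.array([
        -p[1] * q[3] + b4 * p[3] * q[3] + b3 * p[2] * q[4],
        p[0] * q[3] - p[1] * q[4] + p[3] * q[4],
        p[1] * q[3] - b4 * p[1] * q[1] - b3 * p[0] * q[4],
        -p[1] * q[2] - p[1] * q[4] + p[3] * q[4],
        0.5 * p[1] * q[1] - 0.5 * p[3] * q[3],
        p[1] * q[8] - p[3] * q[8],
        2 * p[4] * q[7] - p[3] * q[8],
        -2 * p[4] * q[6] + p[1] * q[8],
        -2 * p[1] * q[5] - p[1] * q[7] + 2 * p[3] * q[5] + p[3] * q[6],
    ])

def rhs(C, r):
    """Right-hand side of (63): L C + B(C, C), which recovers (62)."""
    return L_matrix(r) @ C + B(C, C)
```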

We now rewrite Eq. (62) in terms of fluctuations with respect to its mean state. In that respect, we subtract from \bm{C}(t)=(C_{1}(t),\cdots,C_{9}(t)) its mean state \overline{\bm{C}}_{r}, which is estimated, in practice, from a simulation of Eq. (62) over a typical characteristic time of the dynamics that resolves e.g. the decay of correlations (when the dynamics is chaotic) or a period (when the dynamics is periodic); see also [12]. The corresponding ODE system for the fluctuation variable, \bm{\delta}(t)=\bm{C}(t)-\overline{\bm{C}}_{r}, is then given by:

\frac{\mathrm{d}\bm{\delta}}{\mathrm{d}t}=A\bm{\delta}+B(\bm{\delta},\bm{\delta})+L\overline{\bm{C}}_{r}+B(\overline{\bm{C}}_{r},\overline{\bm{C}}_{r}), (65)

with

A\bm{\delta}=L\bm{\delta}+B(\overline{\bm{C}}_{r},\bm{\delta})+B(\bm{\delta},\overline{\bm{C}}_{r}). (66)

Denote the spectral elements of the matrix A by \{(\lambda_{j},\bm{e}_{j})\,:\,1\leq j\leq 9\} and those of A^{\ast} by \{(\lambda^{*}_{j},\bm{e}^{*}_{j})\,:\,1\leq j\leq 9\}. Here the eigenmodes are normalized so that \langle\bm{e}_{j},\bm{e}^{*}_{k}\rangle=1 if j=k, and 0 otherwise. By expanding \bm{\delta} in terms of the eigenelements of A, we get

\bm{\delta}(t)=\sum_{j=1}^{9}y_{j}(t)\bm{e}_{j}\quad\text{ with }\quad y_{j}(t)=\langle\bm{\delta}(t),\bm{e}^{*}_{j}\rangle. (67)

Assuming that A is diagonalizable over \mathbb{C} and rewriting (65) in the variable \bm{y}=(y_{1},\cdots,y_{9})^{T}, we obtain

\dot{y}_{j}=\lambda_{j}(\overline{\bm{C}}_{r})y_{j}+\sum_{k,\ell=1}^{9}B_{k\ell}^{j}(\overline{\bm{C}}_{r})y_{k}y_{\ell}+F_{j}(\overline{\bm{C}}_{r}),\quad j=1,\cdots,9, (68)

where B_{k\ell}^{j}(\overline{\bm{C}}_{r})=\langle B(\bm{e}_{k},\bm{e}_{\ell}),\bm{e}^{*}_{j}\rangle and

F_{j}(\overline{\bm{C}}_{r})=\langle L\overline{\bm{C}}_{r}+B(\overline{\bm{C}}_{r},\overline{\bm{C}}_{r}),\bm{e}^{*}_{j}\rangle. (69)

Like the \lambda_{j}, the eigenmodes also depend on the mean state \overline{\bm{C}}_{r}, which explains the dependence on \overline{\bm{C}}_{r} of the interaction coefficients B_{k\ell}^{j}. From now on, we work with Eq. (LABEL:Eq_RBC_eigenbasis).
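A minimal numerical sketch of the eigen-expansion (67) is as follows: for a diagonalizable A, the adjoint eigenmodes normalized so that \langle\bm{e}_{j},\bm{e}^{*}_{k}\rangle=\delta_{jk} can be read off from the inverse of the eigenvector matrix. Function names are illustrative.

```python
import numpy as np

def biorthogonal_bases(A):
    """Eigenmodes of A and of its adjoint, normalized so that
    <e_j, e*_k> = 1 if j = k and 0 otherwise. For a diagonalizable A,
    the adjoint modes are the conjugated rows of inv(E)."""
    lam, E = np.linalg.eig(A)           # columns of E are the e_j
    Estar = np.linalg.inv(E).conj().T   # columns are the e*_j
    return lam, E, Estar

def amplitudes(delta, Estar):
    """Amplitudes y_j = <delta, e*_j> as in (67), with <u, v> = v^H u."""
    return Estar.conj().T @ delta
```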

Figure 5: Prediction of transitions: OPM prediction results. Panel D: The black curve shows the dependence on r of the norm of the mean state vector \overline{\bm{C}}_{r}. The vertical dashed lines at r_{1}^{*}=13.991 and r_{2}^{*}=14.173 mark the onset of the period-doubling bifurcation and of chaos, respectively. The points P_{1} and P_{2} correspond to the r-values at which the two prediction experiments are conducted; see Table 2. The orange segments that precede the points D_{1} and D_{2} denote the parameter intervals over which data is used to build the OPM reduced systems for predicting the dynamics at r=r_{P_{1}} and r=r_{P_{2}}, respectively; see Steps 1-4. The (normalized) parameterization defects are shown in Panels C and E for training data at r=r_{D_{1}} and r=r_{D_{2}}, respectively. Panels A and B: Global attractor (in lagged coordinates) and PSDs for three selected components at r=r_{P_{1}} and r=r_{P_{2}}, respectively. Black curves are from the full system (62) and red ones from the OPM reduced system (71).

V.2 Predicting Higher-order Transitions via OPM: Method

We thus aim at predicting for Eq. (LABEL:Eq_RBC_eigenbasis) two types of transition: (I) the period-doubling bifurcation, and (II) the transition from period-doubling to chaos, referred to as Experiments I and II, respectively. For each of these experiments, we are given a reduced state space H_{\mathfrak{c}}=\mbox{span}(\bm{e}_{1},\cdots,\bm{e}_{m_{c}}) with m_{c} as indicated in Table 2, depending on the parameter regime. The eigenmodes are here ranked by the real part of the corresponding eigenvalues, \bm{e}_{1} corresponding to the eigenvalue with the largest real part. The goal is to derive an OPM reduced system (35) able to predict such transitions. The challenge lies in the optimization stage of the OPM: due to the prediction constraint, one is prevented from using data from the full model at the parameter r=r_{P} at which one desires to predict the dynamics. Only data prior to the critical value r^{*} at which the concerned transition, either period-doubling or chaos, takes place are allowed here.

Table 2: Prediction experiments for the RB system (LABEL:Eq_RBC_eigenbasis). The parameter value r_{D}(<r^{\ast}) corresponds to the allowable upper bound up to which the dependence of the mean state \overline{\bm{C}}_{r} on r is estimated, in view of extrapolation at r=r_{P}. In each experiment, I_{r}=[r_{0},r_{D}] with r_{D}-r_{0}=2\times 10^{-2}, corresponding to the segments shown in orange in Fig. 5-(D). The critical value r^{\ast} indicates the parameter value at which the transition occurs, depending on the experiment. The parameter value r_{P}>r^{\ast} at which the prediction is sought is also given.
                                  m_c    r_D      r^*      r_P
Experiment I (period-doubling)     3     13.91    13.99    14
Experiment II (chaos)              5     14.10    14.17    14.22

Due to the dependence on \overline{\bm{C}}_{r} of \bm{\lambda}, as well as of the interaction coefficients B_{k\ell}^{j} and the forcing terms F_{j}, particular attention has to be paid to this dependence. Indeed, recall that the parameterizations \Phi_{n} given by the explicit formula (29) depend heavily here on the spectral elements of the linearized operator A at \overline{\bm{C}}_{r}, and so does their optimization. Since the goal is to eventually dispose of an OPM reduced system able to predict the dynamical behavior at r=r_{P}>r^{\ast}, one cannot rely on data from the full model at r=r_{P}, and thus one cannot exploit in particular the knowledge of the mean state \overline{\bm{C}}_{r} for r=r_{P}. We are thus forced to estimate the latter for our purpose.

To do so, we estimate the dependence on r of \overline{\bm{C}}_{r} over an interval I_{r}=[r_{0},r_{D}] such that r_{0}<r_{D}<r^{\ast} (see Table 2), and use this estimation to extrapolate the value of \overline{\bm{C}}_{r} at r=r_{P}, which we denote by \overline{\bm{C}}^{\text{ext}}_{r_{P}}. For both experiments, it turned out that a linear extrapolation is sufficient. This extrapolated value \overline{\bm{C}}^{\text{ext}}_{r_{P}} allows us to compute the spectral elements of the operator A given by (66), in which \overline{\bm{C}}^{\text{ext}}_{r_{P}} replaces the genuine mean state \overline{\bm{C}}_{r}. Obviously, the better the approximation of \overline{\bm{C}}_{r} by \overline{\bm{C}}^{\text{ext}}_{r_{P}}, the better the approximation of \bm{\lambda}(\overline{\bm{C}}_{r_{P}}) by \bm{\lambda}(\overline{\bm{C}}^{\text{ext}}_{r_{P}}), along with the corresponding eigenspaces.
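The linear extrapolation just described amounts to a per-component degree-1 least-squares fit over I_r, evaluated at r_P. A sketch, with hypothetical function and variable names:

```python
import numpy as np

def extrapolate_mean_state(r_samples, Cbar_samples, r_P):
    """Per-component degree-1 least-squares fit of Cbar_r over I_r = [r0, rD],
    evaluated at the prediction parameter r_P. Cbar_samples has one row per
    sampled r-value (shape: n_samples x 9)."""
    slope, intercept = np.polyfit(np.asarray(r_samples), np.asarray(Cbar_samples), deg=1)
    return slope * r_P + intercept
```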

We turn now to the training of the OPM exploiting these surrogate spectral elements, which will be used for predicting the dynamical behavior at r=r_{P}. To avoid any looking ahead whatsoever, we allow ourselves to use only the data of the full model (LABEL:Eq_RBC_eigenbasis) for r=r_{D}<r_{P}, and we minimize the following parameterization defect

\mathcal{J}_{n}(\tau)=\overline{\big|y_{n}^{r_{D}}(t)-\Phi_{n}(\tau,\bm{\lambda}(\overline{\bm{C}}^{\text{ext}}_{r_{P}}),\bm{y}_{\mathfrak{c}}^{r_{D}}(t))\big|^{2}}, (70)

for each relevant n, in which \bm{y}_{\mathfrak{c}}^{r_{D}}(t) (resp. y_{n}^{r_{D}}(t)) denotes the full model solution's amplitude in H_{\mathfrak{c}} (resp. along \bm{e}_{n}) at r=r_{D}. The normalized defect is then given by J_{n}(\tau)=\mathcal{J}_{n}(\tau)/\overline{|y_{n}^{r_{D}}|^{2}}. To compute these defects we proceed as follows. We first use the spectral elements at r=r_{P} of the linearized matrix A at the extrapolated mean state \overline{\bm{C}}^{\text{ext}}_{r_{P}} to form the coefficients R_{n}, \lambda_{n}^{-1}(1-e^{\tau\lambda_{n}})F_{n} and D_{ij}^{n}(\tau,\bm{\lambda})B_{ij}^{n} involved in the expression (29) of \Phi_{n}. For each n, the defect \mathcal{J}_{n}(\tau) is then fed by the input data at r=r_{D} (i.e. y_{n}^{r_{D}}(t) and \bm{y}_{\mathfrak{c}}^{r_{D}}(t)), and evaluated for \tau varying over an interval sufficiently large to capture global minima. The results are shown in Fig. 5C and Fig. 5E for Experiments I and II, respectively. Note that the evaluation of \mathcal{J}_{n}(\tau) over a range of \tau is used here only for visualization purposes, as it is not needed to find a minimum. A simple gradient-descent algorithm can indeed be used for the latter; see [12, Appendix]. We denote by \hat{\tau}_{n}^{\ast} the resulting optimal value of \tau per variable to parameterize.
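Schematically, the optimization of (70) can be organized as a scan of the normalized defect over a \tau-grid, the gradient descent of [12, Appendix A] being the practical alternative. In the sketch below, phi stands for a user-supplied callable implementing formula (29), which is not reproduced in this section; everything else is hypothetical scaffolding.

```python
import numpy as np

def normalized_defect(tau_grid, y_n, y_c, phi):
    """Normalized defect J_n(tau) of (70), evaluated on a grid of tau-values.
    `phi(tau, y_c)` must return the parameterized time series
    Phi_n(tau, lambda, y_c(t)); it stands in for formula (29)."""
    energy = np.mean(np.abs(y_n) ** 2)
    return np.array([np.mean(np.abs(y_n - phi(tau, y_c)) ** 2)
                     for tau in tau_grid]) / energy

def best_tau(tau_grid, y_n, y_c, phi):
    """Grid-search surrogate for the gradient descent of [12, Appendix A]."""
    J = normalized_defect(tau_grid, y_n, y_c, phi)
    return tau_grid[np.argmin(J)], J
```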

After the \hat{\tau}_{n}^{\ast} are found for each n\geq m_{c}+1, the OPM reduced system used to predict the dynamics at r=r_{P} takes the form:

\dot{X}_{j}=\hat{\lambda}_{j}(r_{P})X_{j}+\sum_{k,\ell=1}^{m_{c}}\hat{B}_{k\ell}^{j}(r_{P})X_{k}X_{\ell}+\sum_{k=1}^{m_{c}}\sum_{\ell=m_{c}+1}^{9}\big(\hat{B}_{k\ell}^{j}(r_{P})+\hat{B}_{\ell k}^{j}(r_{P})\big)X_{k}\Phi_{\ell}(\hat{\tau}_{\ell}^{\ast},\hat{\bm{\lambda}}(r_{P}),X) (71)
+\sum_{k,\ell=m_{c}+1}^{9}\hat{B}_{k\ell}^{j}(r_{P})\Phi_{k}(\hat{\tau}_{k}^{\ast},\hat{\bm{\lambda}}(r_{P}),X)\Phi_{\ell}(\hat{\tau}_{\ell}^{\ast},\hat{\bm{\lambda}}(r_{P}),X)+\hat{F}_{j}(r_{P}),\quad j=1,\cdots,m_{c},

with \hat{\lambda}_{j}(r_{P})=\lambda_{j}(\overline{\bm{C}}^{\text{ext}}_{r_{P}}), \hat{B}_{k\ell}^{j}(r_{P})=B_{k\ell}^{j}(\overline{\bm{C}}^{\text{ext}}_{r_{P}}), and \hat{F}_{j}(r_{P})=F_{j}(\overline{\bm{C}}^{\text{ext}}_{r_{P}}).

We can then summarize our approach for predicting higher-order transitions via OPM reduced systems as follows:

  • Step 1.

    Extrapolation \overline{\bm{C}}^{\text{ext}}_{r_{P}} of \overline{\bm{C}}_{r} at r=r_{P} (the parameter value at which one desires to predict the dynamics).

  • Step 2.

    Computation of the spectral elements of the linearized operator A at r=r_{P}, by replacing \overline{\bm{C}}_{r} with \overline{\bm{C}}^{\text{ext}}_{r_{P}} in (66).

  • Step 3.

    Training of the OPM using the spectral elements of Step 2 and data of the full model for r=r_{D}<r_{P}.

  • Step 4.

    Run the OPM reduced system (71) to predict the dynamics at r=r_{P}.

We mention that the minimization of certain parameterization defects may require special care, such as for J_{6} in Experiment II. Due to the presence of nearby local minima (see the red curve in Fig. 5E), the analysis of the optimal value of \tau to select for calibrating an optimal parameterization of y_{6}(t) is more subtle and actually exploits a complementary metric known as the parameterization correlation; see Sec. V.4.

Obviously, the accuracy in approximating the genuine mean state \overline{\bm{C}}_{r_{P}} by \overline{\bm{C}}^{\text{ext}}_{r_{P}} is a determining factor in the transition prediction procedure described in Steps 1-4 above. Here, the relative error in approximating \|\overline{\bm{C}}_{r_{P_{i}}}\| (i=1,2) is 0.03% for Experiment I, and 0.27% for Experiment II. For the latter case, although the parameter dependence is rough beyond r_{2}^{\ast} (see Panel D), there are no abrupt local variations of relatively large magnitude such as identified for other nonlinear systems [81, 18]. Systems for which a linear response to parameter variation is a valid assumption [82, 83] thus seemingly constitute a favorable ground for applying the proposed transition prediction procedure.

V.3 Prediction of Higher-order Transitions by OPM reduced systems

As summarized in Figs. 5A and B, for each prediction experiment of Table 2, the OPM reduced system (71) not only successfully predicts the occurrence of the targeted transition at either r=r_{P_{1}} or r=r_{P_{2}} (P_{1} and P_{2} in Fig. 5D), but also approximates with good accuracy the embedded (global) attractor as well as key statistics of the dynamics such as the power spectral density (PSD). Note that we choose the reduced dimension to be m_{c}=3 for Experiment I (period-doubling) and m_{c}=5 for Experiment II (transition to chaos). For Experiment I, the global minimizers of \mathcal{J}_{n}(\tau) given by (70) are retained to build up the corresponding OPM. In Experiment II, all the global minimizers are also retained, except for \mathcal{J}_{6}(\tau), for which the second minimizer is used in the construction of the OPM.

Once the OPM is built, an approximation of \bm{C}(t) is obtained from the solution X(t) to the corresponding OPM reduced system (71) according to

\bm{C}^{\text{PM}}(t)=\overline{\bm{C}}^{\text{ext}}_{r_{P}}+\sum_{j=1}^{m_{c}}X_{j}(t)\bm{e}_{j}+\sum_{n=m_{c}+1}^{N}\Phi_{n}(\hat{\tau}_{n}^{\ast},\hat{\bm{\lambda}}(r_{P}),X)\bm{e}_{n}, (72)

with \Phi_{n} given by (29), whose spectral entries are given by the spectral elements of the matrix A of (66) for r=r_{P}, in which \overline{\bm{C}}_{r_{P}} is replaced by \overline{\bm{C}}^{\text{ext}}_{r_{P}}. The \bm{e}_{j} and \bm{e}_{n} denote the eigenvectors of this matrix, and N=9 here.

As a baseline, the skills of the OPM reduced system (71) are compared with those of reduced systems employing parameterizations from invariant manifold theory, such as (6), or from inertial manifold theory, such as (31); see [12, Theorem 2] and Remark III.2. The details and results are discussed in Appendix D. The main message is that compared to the OPM reduced system (71), the reduced systems based on these traditional invariant/inertial manifold theories fail to predict the correct dynamics. A main cause of this failure lies in the inability of these standard parameterizations to accurately approximate small-energy variables that are dynamically important; see Appendix D. The OPM parameterization, by its variational nature, makes it possible to fix this over-parameterization issue here.

V.4 Distinguishing between close local minima: Parameterization correlation analysis

Since the coefficients of the \Phi_{n} depend nonlinearly on \tau (see Eqns. (29)-(30) and (85)), the parameterization defects \mathcal{J}_{n}(\tau) defined in (70) are also highly nonlinear, and may not be convex in the \tau-variable, as shown for certain variables in Panels C and E of Fig. 5. A minimization algorithm that most often reaches the global minimum is nevertheless detailed in [12, Appendix A], and is not limited to the RB system analyzed here. On certain rare occasions, a local minimum may be an acceptable halting point, with an online performance slightly improved compared to that of the global minimum. In such a case, one discriminates between a local minimum and the global one by inspecting another metric offline: the parameterization correlation, which essentially measures the collinearity between the actual high-mode dynamics and its parameterization. Here, such a situation occurs for Experiment II; see J_{6} in Fig. 5E.

Following [12, Sec. 3.1.2], we thus recall a simple criterion to assist the selection of an OPM when the parameterization defect displays more than one local minimum and the corresponding local minimal values are close to each other.

Given a parameterization \Phi that is not trivial (i.e. \Phi\neq 0), we define the parameterization correlation as

c(t)=\frac{\mathrm{Re}\langle\Phi(y_{\mathfrak{c}}(t)),y_{\mathfrak{s}}(t)\rangle}{\|\Phi(y_{\mathfrak{c}}(t))\|\,\|y_{\mathfrak{s}}(t)\|}. (73)

It provides a measure of collinearity between the unresolved variable y_{\mathfrak{s}}(t) and its parameterization \Phi(y_{\mathfrak{c}}(t)) as time evolves. It thus serves as a complementary, more geometric way of measuring the phase coherence between the resolved and unresolved variables than the parameterization defect \mathcal{Q}_{n}(\tau_{n},T) defined in (4). The closer c(t) is to unity for a given parameterization, the better the phase coherence between the resolved and unresolved variables is expected to hold.
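Computing c(t) from trajectory data is straightforward; the sketch below assumes the parameterized and actual unresolved variables are stored as time-by-dimension arrays, with names of our choosing.

```python
import numpy as np

def parameterization_correlation(phi_t, ys_t):
    """Time series c(t) of (73): at each time, the cosine of the angle between
    the parameterization Phi(y_c(t)) and the unresolved variable y_s(t).
    Both inputs are (time, dimension) arrays, possibly complex."""
    inner = np.real(np.sum(np.conj(ys_t) * phi_t, axis=1))
    norms = np.linalg.norm(phi_t, axis=1) * np.linalg.norm(ys_t, axis=1)
    return inner / norms
```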

We illustrate this point on J_{6}(\tau), shown in Fig. 5E. The defect J_{6} exhibits two close minima corresponding to J_{6}\approx 0.1535 and J_{6}\approx 0.1720, occurring at \tau\approx 0.65 and \tau\approx 1.80, respectively. Thus, the parameterization defect alone does not provide a satisfactory discriminatory diagnosis between these two minimizers. On the contrary, the parameterization correlation shown in Fig. 6 allows for diagnosing more clearly that the OPM associated with the local minimizer has a neat advantage over that associated with the global minimizer.

Figure 6: Parameterization correlation c(t) defined by (73) for the OPM in Experiment II (transition to chaos). The metric c(t) is here computed by using the full model solutions for r=r_{D}=14.1 in J_{6} of (70), with \tau=0.65 and \tau=1.80 corresponding to the global minimizer (green curve) and the nearby local minimizer (red curve), respectively; see Fig. 5E.

As explained in [12, Sec. 3.1.2], this parameterization correlation criterion is also useful to assist the selection of the dimension of the reduced state space. Indeed, once an OPM has been determined, the dimension m_{c} of the reduced system should be chosen so that the parameterization correlation is sufficiently close to unity, as measured for instance in the L^{2}-norm over the training time window. The basic idea is that one should not only parameterize properly the statistical effects of the neglected variables but also avoid losing their phase relationships with the resolved variables [84, 12]. For instance, for predicting the transition to chaos, we observe that an OPM comes with a parameterization correlation much closer to unity in the case m_{c}=5 than m_{c}=3 (not shown).

VI Concluding Remarks

In this article, we have described a general framework, based on the OPM approach introduced in [12] for autonomous systems, to derive effective (OPM) reduced systems able to predict either higher-order transitions caused by parameter regime shifts, or tipping phenomena caused by a model's stochastic disturbances and slow parameter drift. In each case, the OPMs are sought as continuous deformations of classical invariant manifolds to handle parameter regimes away from bifurcation onsets while keeping the reduced state space relatively low-dimensional. The underlying OPM parameterizations are derived analytically from the model's equations, constructed as explicit solutions to auxiliary BF systems such as Eq. (11), whose backward integration time is optimized per unresolved mode. This optimization involves the minimization of natural cost functionals tailored upon the full dynamics at parameter values prior to when the transitions take place; see (4). In each case—either for the prediction of higher-order transitions or of tipping points—we presented compelling evidence of the success of the OPM approach in addressing such extrapolation problems, which are typically difficult to handle for data-driven reduction methods.

As reviewed in Sec. III, BF systems such as Eq. (11) allow for drawing insightful relationships with the approximation theory of classical invariant/inertial manifolds (see [12, Sec. 2] and [31, 40, 85]) when the backward integration time τ\tau in Eq. (11) is sent to \infty; see Theorem III.1. Once invariant/inertial manifolds fail to provide legitimate parameterizations, the optimization of the backward integration time may lead to local minima of the parameterization defect at finite τ\tau which, if sufficiently small, give access to skillful parameterizations for predicting e.g. higher-order transitions. This way, as illustrated in Sec. V (Experiment II), the OPM approach allows us to bypass the well-known stumbling blocks tied to the presence of small spectral gaps such as encountered in inertial manifold theory [25], as well as those tied to the accurate parameterization of dynamically important small-energy variables; see also Remark III.1, Appendix D and [12, Sec. 6].
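In practice, the per-mode optimization of the backward integration time amounts to a one-dimensional minimization. The sketch below is illustrative only: it uses a normalized mean-square mismatch as a stand-in for the parameterization defect of (4), and a hypothetical one-parameter family `phi_tau` in place of the BF-system formulas, to show the grid-search workflow.

```python
import numpy as np

def parameterization_defect(tau, y_c, y_s_n, phi_tau):
    """Normalized least-square mismatch between the unresolved mode y_s^n
    and its parameterization Phi_tau(y_c) over a training trajectory
    (a stand-in for the defect Q_n of Eq. (4))."""
    residual = y_s_n - phi_tau(tau, y_c)
    return np.mean(np.abs(residual) ** 2) / np.mean(np.abs(y_s_n) ** 2)

def optimize_tau(y_c, y_s_n, phi_tau, tau_grid):
    """Per-mode grid search over the backward integration time tau."""
    defects = np.array([parameterization_defect(t, y_c, y_s_n, phi_tau)
                        for t in tau_grid])
    return tau_grid[np.argmin(defects)], defects

# Toy one-parameter family (hypothetical): Phi_tau(X) = (1 - e^{-tau}) X^2.
# Synthetic "training data" generated at tau* = 1.3, which the search recovers.
rng = np.random.default_rng(0)
y_c = rng.standard_normal(500)
y_s = (1 - np.exp(-1.3)) * y_c ** 2
phi = lambda tau, X: (1 - np.exp(-tau)) * X ** 2
tau_grid = np.linspace(0.1, 3.0, 30)
tau_star, defects = optimize_tau(y_c, y_s, phi, tau_grid)
```

In the paper's setting the defect landscape may have several comparable minima (cf. Fig. 5E), which is why the parameterization correlation is used as a complementary selection criterion.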

Understanding the dynamical mechanisms at play behind a tipping phenomenon and predicting its occurrence are of utmost importance, but this task is hindered by the often high-dimensional nature of the underlying physical system. Devising accurate, analytic reduced models able to predict tipping phenomena in complex systems is thus of prime importance to serve understanding. The OPM results of Sec. IV indicate great promise for the OPM approach to tackle this important task for more complex systems. Other reduction methods to handle the prediction of tipping point dynamics have been proposed recently in the literature, but mainly for mutualistic networks [86, 87, 88]. The OPM formulas presented here are not limited to this kind of networks and can be directly applied to a broad class of spatially extended systems governed by (stochastic) PDEs, or to nonlinear time-delay systems by adopting the framework of [89] for the latter to devise center manifold parameterizations and their OPM generalization; see [90, 91].

The OPM approach could also be informative for designing, for such systems nearing a tipping point, early warning signal (EWS) indicators from multivariate time series. The extension of EWS techniques to multivariate data is an active field of research, with methods ranging from empirical orthogonal function reduction [92] to methods exploiting the system's Jacobian matrix and its relationships with the cross-correlation matrix [93], or exploiting the detection of the spatial location of "hotspots" of stability loss [94, 95]. By its nonlinear modus operandi for reduction, the OPM parameterization identifies a subgroup of essential variables characterizing a tipping phenomenon, which in turn could be very useful to construct the relevant observables of the system for the design of useful EWS indicators.

Thus, since the examples of Secs. IV and V are representative of more general problems of prime importance, the successes demonstrated by the OPM approach on these examples invite further investigation for more complex systems.

Similarly, we would like to point out another important aspect in that perspective. For spatially extended systems, the modes typically involve wavenumbers that can help interpret certain primary physical patterns. Formulas such as (29) or (32) arising in OPM parameterizations involve a rebalancing of the interactions BijnB_{ij}^{n} among such modes by means of the coefficients Dijn(τ,𝝀)D_{ij}^{n}(\tau,\bm{\lambda}); see Remark III.1. Thus, an OPM parameterization, when skillful in deriving efficient reduced systems, may provide useful insights to explain emergent patterns due to certain nonlinear interactions of wavenumbers, for regimes beyond the instability onset. For more complex systems than those dealt with here, it is already known that near the instability onset of primary bifurcations, center manifold theory provides such physical insights; see e.g. [96, 97, 98, 99, 100, 101, 102].

By the approach proposed here, relying on OPMs obtained as continuous deformations of invariant/center manifolds, one can thus track the interactions that survive or emerge between certain wavenumbers as one progresses through higher-order transitions. Such insights bring new elements to potentially explain the low-frequency variability of recurrent large-scale patterns typically observed e.g. in oceanic models [103, 104], offering at least new lines of thought to the dynamical system approach proposed in previous studies [105, 106, 107, 104]. In that perspective, the analytic expression (85) of terms such as Rn(F,𝝀,τ,X)R_{n}(F,\bm{\lambda},\tau,X) in (29) also brings new elements to break down the nonlinear interactions between the forcing of certain modes compared to others. Formulas such as (85) extend to the case of time-dependent or stochastic forcing of the coarse-scale modes, whose exact expressions will be communicated elsewhere. As for the case of noise-induced transitions reported in Fig. 2D, these generalized formulas are expected to provide in particular new insights into regime shifts not involving the crossing of a bifurcation point but tied to other mechanisms such as slow-fast cyclic transitions or stochastic resonances [108].

Yet another important practical aspect that deserves in-depth investigation is related to the robustness of the OPM approach subject to model errors. Such errors can arise from multiple sources, including e.g. imperfections of the originally utilized high-fidelity model in capturing the true physics, and noise contamination of the solution data used to train the OPM. For the examples of Secs. IV and V, since we train the OPM in a parameter regime prior to the occurrence of the concerned transitions, the utilized training data contain, de facto, model errors. The reported results in these sections show that the OPM is robust in providing effective reduced models subject to such model errors. Nevertheless, it would provide useful insights if one could systematically quantify uncertainties of the OPM reduced models subject to various types of model errors. In that respect, several approaches could be beneficial for stochastic systems, such as those based on the information-theoretic framework of [109], the perturbation theory of ergodic Markov chains and linear response theory [110], as well as methods based on data-assimilation techniques [111].

Acknowledgements.
We are grateful to the reviewers for their constructive comments, which helped enrich the discussion and presentation. This work has been partially supported by the Office of Naval Research (ONR) Multidisciplinary University Research Initiative (MURI) grant N00014-20-1-2023, by the National Science Foundation grant DMS-2108856, and by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement no. 810370). We also acknowledge the computational resources provided by Advanced Research Computing at Virginia Tech for the results shown in Fig. 4.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Appendix A Proof of Lemma III.1

We introduce the notation

gτ(t)=τ0esA𝔰Π𝔰F(s+t)ds,\displaystyle g_{\tau}(t)=\int_{-\tau}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}F(s+t)\,\mathrm{d}s, (74)
Ψτ(X)=τ0esA𝔰Π𝔰B(esA𝔠X)ds.\displaystyle\Psi_{\tau}(X)=\int_{-\tau}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}B(e^{sA_{\mathfrak{c}}}X)\,\mathrm{d}s.

In Eq. (15), replacing XX by erA𝔠Xe^{rA_{\mathfrak{c}}}X where rr is an arbitrary real number, we get

Φτ(erA𝔠X,t)=Ψτ(erA𝔠X)+gτ(t).\Phi_{\tau}(e^{rA_{\mathfrak{c}}}X,t)=\Psi_{\tau}(e^{rA_{\mathfrak{c}}}X)+g_{\tau}(t). (75)

Note that

φ(r)\displaystyle\varphi(r) =Ψτ(erA𝔠X)=τ0esA𝔰Π𝔰B(e(s+r)A𝔠X)ds\displaystyle=\Psi_{\tau}(e^{rA_{\mathfrak{c}}}X)=\int_{-\tau}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}B(e^{(s+r)A_{\mathfrak{c}}}X)\,\mathrm{d}s (76)
=rτre(rs)A𝔰Π𝔰B(esA𝔠X)ds.\displaystyle=\int_{r-\tau}^{r}e^{(r-s^{\prime})A_{\mathfrak{s}}}\Pi_{\mathfrak{s}}B(e^{s^{\prime}A_{\mathfrak{c}}}X)\,\mathrm{d}s^{\prime}.

We then obtain

rφ(r)=Π𝔰\displaystyle\partial_{r}\varphi(r)=\Pi_{\mathfrak{s}} B(erA𝔠X)\displaystyle B(e^{rA_{\mathfrak{c}}}X) (77)
eτA𝔰Π𝔰B(e(rτ)A𝔠X)+A𝔰φ(r).\displaystyle-e^{\tau A_{\mathfrak{s}}}\Pi_{\mathfrak{s}}B(e^{(r-\tau)A_{\mathfrak{c}}}X)+A_{\mathfrak{s}}\varphi(r).

On the other hand, since φ(r)=Φτ(erA𝔠X,t)gτ(t)\varphi(r)=\Phi_{\tau}(e^{rA_{\mathfrak{c}}}X,t)-g_{\tau}(t) due to (15), we also have

rφ(r)=DΦτ(erA𝔠X,t)A𝔠erA𝔠X.\partial_{r}\varphi(r)=D\Phi_{\tau}(e^{rA_{\mathfrak{c}}}X,t)A_{\mathfrak{c}}e^{rA_{\mathfrak{c}}}X. (78)

By taking the limit r0r\rightarrow 0, we obtain from (77)–(78) that

DΦτ(X,t)A𝔠X=Π𝔰B(X)eτA𝔰\displaystyle D\Phi_{\tau}(X,t)A_{\mathfrak{c}}X=\Pi_{\mathfrak{s}}B(X)-e^{\tau A_{\mathfrak{s}}} Π𝔰B(eτA𝔠X)\displaystyle\Pi_{\mathfrak{s}}B(e^{-\tau A_{\mathfrak{c}}}X) (79)
+A𝔰Ψτ(X).\displaystyle+A_{\mathfrak{s}}\Psi_{\tau}(X).

Now if we replace tt in (15) by t+rt+r, we obtain

Φτ(X,t+r)=Ψτ(X)+gτ(t+r).\Phi_{\tau}(X,t+r)=\Psi_{\tau}(X)+g_{\tau}(t+r). (80)

Note that

φ~(r)\displaystyle\widetilde{\varphi}(r) =gτ(t+r)=τ0esA𝔰Π𝔰F(s+t+r)ds\displaystyle=g_{\tau}(t+r)=\int_{-\tau}^{0}e^{-sA_{\mathfrak{s}}}\Pi_{\mathfrak{s}}F(s+t+r)\,\mathrm{d}s (81)
=rτre(rs)A𝔰Π𝔰F(s+t)ds.\displaystyle=\int_{r-\tau}^{r}e^{(r-s^{\prime})A_{\mathfrak{s}}}\Pi_{\mathfrak{s}}F(s^{\prime}+t)\,\mathrm{d}s^{\prime}.

It follows that

rφ~(r)=Π𝔰F(r+t)eτA𝔰Π𝔰F(rτ+t)+A𝔰φ~(r).\partial_{r}\widetilde{\varphi}(r)=\Pi_{\mathfrak{s}}F(r+t)-e^{\tau A_{\mathfrak{s}}}\Pi_{\mathfrak{s}}F(r-\tau+t)+A_{\mathfrak{s}}\widetilde{\varphi}(r). (82)

Since φ~(r)=Φτ(X,t+r)Ψτ(X)\widetilde{\varphi}(r)=\Phi_{\tau}(X,t+r)-\Psi_{\tau}(X), we also have

rφ~(r)=tΦτ(X,t+r).\partial_{r}\widetilde{\varphi}(r)=\partial_{t}\Phi_{\tau}(X,t+r). (83)

By taking the limit r0r\rightarrow 0, we obtain from (82)–(83) that

tΦτ(X,t)=Π𝔰\displaystyle\partial_{t}\Phi_{\tau}(X,t)=\Pi_{\mathfrak{s}} F(t)\displaystyle F(t) (84)
eτA𝔰Π𝔰F(tτ)+A𝔰gτ(t).\displaystyle-e^{\tau A_{\mathfrak{s}}}\Pi_{\mathfrak{s}}F(t-\tau)+A_{\mathfrak{s}}g_{\tau}(t).

The homological equation (16) then follows from (79) and (84) after gathering the relevant terms to form the operator A\mathcal{L}_{A} given by (17).
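The Leibniz-rule computation leading to (77) can be sanity-checked numerically in a scalar caricature, with A𝔰A_{\mathfrak{s}} and A𝔠A_{\mathfrak{c}} replaced by scalars, Π𝔰\Pi_{\mathfrak{s}} by the identity, and BB by the quadratic map y ↦ y² — all illustrative choices, not the paper's model:

```python
import numpy as np

# Scalar stand-ins (illustrative): A_s -> a_s, A_c -> a_c, B(y) = y^2.
a_s, a_c, tau, X = -1.3, 0.4, 0.7, 0.9

def phi(r):
    # Closed form of phi(r) = int_{r-tau}^{r} e^{(r-s) a_s} (e^{s a_c} X)^2 ds
    k = 2 * a_c - a_s
    return X**2 * (np.exp(2 * r * a_c)
                   - np.exp(tau * a_s + 2 * a_c * (r - tau))) / k

def rhs(r):
    # Right-hand side of (77) in the scalar caricature:
    # B(e^{r a_c} X) - e^{tau a_s} B(e^{(r-tau) a_c} X) + a_s * phi(r)
    return ((np.exp(r * a_c) * X) ** 2
            - np.exp(tau * a_s) * (np.exp((r - tau) * a_c) * X) ** 2
            + a_s * phi(r))

# Central finite difference of phi at r matches the right-hand side of (77):
r, h = 0.25, 1e-5
dphi_dr = (phi(r + h) - phi(r - h)) / (2 * h)
print(abs(dphi_dr - rhs(r)))  # agrees up to finite-difference error
```

The check confirms the boundary terms produced by differentiating the moving integration limits in (76): the evaluation at s = r, the evaluation at s = r − τ carried by e^{τ a_s}, and the bulk term a_s φ(r).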

Appendix B The low-order term in the OPM parameterization

In the special case where FF in (28) is time-independent, by introducing Xi=X,𝒆iX_{i}=\langle X,\bm{e}^{*}_{i}\rangle and Fi=F,𝒆iF_{i}=\langle F,\bm{e}^{*}_{i}\rangle, the RnR_{n}-term in the parameterization (29) takes the form

Rn(F,𝝀,τ,X)\displaystyle R_{n}(F,{\bm{\lambda}},\tau,X) =i,j=1mcUijn(τ,𝝀)BijnFiFj\displaystyle=\sum_{i,j=1}^{m_{c}}U_{ij}^{n}(\tau,\bm{\lambda})B_{ij}^{n}F_{i}F_{j} (85)
+i,j=1mcVijn(τ,𝝀)Fj(Bijn+Bjin)Xi,\displaystyle+\sum_{i,j=1}^{m_{c}}V_{ij}^{n}(\tau,\bm{\lambda})F_{j}(B^{n}_{ij}+B^{n}_{ji})X_{i},

with

Uijn(τ,𝝀)={1λiλj(Dijn(τ,𝝀)1exp(τ(λiλn))λiλn1exp(τ(λjλn))λjλn1exp(τλn)λn),if λi0 and λj0,1λi(τexp(τ(λiλn))λiλn1exp(τ(λiλn))(λiλn)2+τexp(τλn)λn+1exp(τλn)(λn)2),if λi0 and λj=0,1λj(τexp(τ(λjλn))λjλn1exp(τ(λjλn))(λjλn)2+τexp(τλn)λn+1exp(τλn)(λn)2),if λi=0 and λj0,τ2exp(τλn)λn2λn(τexp(τλn)λn+1exp(τλn)(λn)2),if λi=0 and λj=0,U_{ij}^{n}(\tau,\bm{\lambda})=\begin{cases}\frac{1}{\lambda_{i}\lambda_{j}}\Big{(}D_{ij}^{n}(\tau,\bm{\lambda})-\frac{1-\exp(-\tau(\lambda_{i}-\lambda_{n}))}{\lambda_{i}-\lambda_{n}}\\ \qquad\qquad-\frac{1-\exp(-\tau(\lambda_{j}-\lambda_{n}))}{\lambda_{j}-\lambda_{n}}-\frac{1-\exp(\tau\lambda_{n})}{\lambda_{n}}\Big{)},&\text{if $\lambda_{i}\neq 0$ and $\lambda_{j}\neq 0$},\vspace{2ex}\\ \frac{1}{\lambda_{i}}\Big{(}\frac{\tau\exp(-\tau(\lambda_{i}-\lambda_{n}))}{\lambda_{i}-\lambda_{n}}-\frac{1-\exp(-\tau(\lambda_{i}-\lambda_{n}))}{(\lambda_{i}-\lambda_{n})^{2}}\\ \qquad\qquad+\frac{\tau\exp(\tau\lambda_{n})}{\lambda_{n}}+\frac{1-\exp(\tau\lambda_{n})}{(\lambda_{n})^{2}}\Big{)},&\text{if $\lambda_{i}\neq 0$ and $\lambda_{j}=0$},\vspace{2ex}\\ \frac{1}{\lambda_{j}}\Big{(}\frac{\tau\exp(-\tau(\lambda_{j}-\lambda_{n}))}{\lambda_{j}-\lambda_{n}}-\frac{1-\exp(-\tau(\lambda_{j}-\lambda_{n}))}{(\lambda_{j}-\lambda_{n})^{2}}\\ \qquad\qquad+\frac{\tau\exp(\tau\lambda_{n})}{\lambda_{n}}+\frac{1-\exp(\tau\lambda_{n})}{(\lambda_{n})^{2}}\Big{)},&\text{if $\lambda_{i}=0$ and $\lambda_{j}\neq 0$},\vspace{2ex}\\ \frac{\tau^{2}\exp(\tau\lambda_{n})}{\lambda_{n}}-\frac{2}{\lambda_{n}}\Big{(}\frac{\tau\exp(\tau\lambda_{n})}{\lambda_{n}}+\frac{1-\exp(\tau\lambda_{n})}{(\lambda_{n})^{2}}\Big{)},&\text{if $\lambda_{i}=0$ and $\lambda_{j}=0$},\end{cases} (86)

and

Vijn(τ,𝝀)={1exp(τ(λi+λjλn))λj(λi+λjλn)1exp(τ(λiλn))λj(λiλn),if λj0,τexp(τ(λiλn))λiλn1exp(τ(λiλn))(λiλn)2.otherwise.V_{ij}^{n}(\tau,\bm{\lambda})=\begin{cases}\frac{1-\exp(-\tau(\lambda_{i}+\lambda_{j}-\lambda_{n}))}{\lambda_{j}(\lambda_{i}+\lambda_{j}-\lambda_{n})}-\frac{1-\exp(-\tau(\lambda_{i}-\lambda_{n}))}{\lambda_{j}(\lambda_{i}-\lambda_{n})},&\text{if $\lambda_{j}\neq 0$},\vspace{2ex}\\ \frac{\tau\exp(-\tau(\lambda_{i}-\lambda_{n}))}{\lambda_{i}-\lambda_{n}}-\frac{1-\exp(-\tau(\lambda_{i}-\lambda_{n}))}{(\lambda_{i}-\lambda_{n})^{2}}.&\text{otherwise}.\end{cases} (87)
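As a sketch, the coefficient (87) can be evaluated as follows; the λj=0\lambda_{j}=0 branch is the removable-singularity limit of the generic branch, which the snippet verifies numerically. (The coefficient (86) is analogous but additionally requires the Dijn(τ,𝝀)D_{ij}^{n}(\tau,\bm{\lambda}) coefficients, so it is omitted from this illustration.)

```python
import numpy as np

def V(lam_i, lam_j, lam_n, tau):
    """Coefficient V_ij^n(tau, lambda) of Eq. (87). The lam_j = 0 branch is
    the analytic limit of the generic branch (removable singularity)."""
    a = lam_i - lam_n
    if lam_j != 0:
        b = lam_i + lam_j - lam_n
        return ((1 - np.exp(-tau * b)) / (lam_j * b)
                - (1 - np.exp(-tau * a)) / (lam_j * a))
    return tau * np.exp(-tau * a) / a - (1 - np.exp(-tau * a)) / a**2

# Continuity check: the generic branch approaches the lam_j = 0 branch
# as lam_j -> 0 (illustrative eigenvalues, not those of the RB system).
v_limit = V(0.8, 0.0, -2.0, 1.5)
v_near = V(0.8, 1e-7, -2.0, 1.5)
print(abs(v_near - v_limit))  # small
```

In a vectorized implementation one would precompute these coefficients once per (τ, 𝝀) and reuse them at every evaluation of (85).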

For the application to the Rayleigh-Bénard system considered in Sec. V, the forcing term FF is produced after rewriting the original unforced system in terms of the fluctuation variable with respect to the mean state. For this problem, the eigenvalues 𝝀\bm{\lambda} and the interaction coefficients BijnB_{ij}^{n} both depend on the mean state.

Appendix C Numerical aspects

In the numerical experiments of Sec. V, the full RB system (62) as well as the OPM reduced system (71) are numerically integrated using a standard fourth-order Runge–Kutta (RK4) method with a time-step size δt=5×103\delta t=5\times 10^{-3}. Note that since the eigenvalues of AA are typically complex-valued, some care is needed when integrating (71). Indeed, since complex eigenmodes of AA must appear in complex conjugate pairs, the corresponding components of 𝒙\bm{x} in (71) also form complex conjugate pairs. To prevent round-off errors that may disrupt the underlying complex conjugacy, after each RK4 time step, we enforce complex conjugacy as follows. Assuming that xjx_{j} and xj+1x_{j+1} form a conjugate pair and that after an RK4 step, xj=a1+ib1x_{j}=a_{1}+ib_{1} and xj+1=a2ib2x_{j+1}=a_{2}-ib_{2} (i2=1i^{2}=-1), where a1,a2,b1a_{1},a_{2},b_{1} and b2b_{2} are real-valued with a1a2a_{1}\approx a_{2} and b1b2b_{1}\approx b_{2}, we redefine xjx_{j} and xj+1x_{j+1} to be respectively given by xj=(a1+a2)/2+i(b1+b2)/2x_{j}=(a_{1}+a_{2})/2+i(b_{1}+b_{2})/2 and xj+1=(a1+a2)/2i(b1+b2)/2x_{j+1}=(a_{1}+a_{2})/2-i(b_{1}+b_{2})/2. For each component corresponding to a real eigenmode, after each RK4 time step, we simply redefine it to be its real part and discard the small imaginary part that may also arise from round-off errors. The same numerical procedure is adopted to integrate the reduced systems obtained from invariant manifold reduction as well as the FMT parameterization used in Appendix D below.
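A minimal sketch of this post-step clean-up is given below; the index bookkeeping (which components pair up, which correspond to real eigenmodes) is assumed supplied by the caller.

```python
import numpy as np

def enforce_conjugacy(x, conj_pairs, real_modes):
    """Post-RK4 clean-up described above: average each conjugate pair
    (x_j, x_{j+1}) into an exact conjugate pair, and strip the spurious
    imaginary part of components tied to real eigenmodes.

    x          : complex state vector (a cleaned copy is returned)
    conj_pairs : list of index pairs (j, j+1) forming conjugate pairs
    real_modes : list of indices of real eigenmodes"""
    x = x.copy()
    for j, jp in conj_pairs:
        a = 0.5 * (x[j].real + x[jp].real)          # (a1 + a2)/2
        b = 0.5 * (x[j].imag - x[jp].imag)          # (b1 + b2)/2, x_{jp} = a2 - i b2
        x[j], x[jp] = a + 1j * b, a - 1j * b
    for j in real_modes:
        x[j] = x[j].real
    return x

# Round-off-perturbed toy state: indices (0, 1) conjugate, index 2 real.
x = np.array([1.0 + 2.0j, 1.0 - 2.0000000001j, 0.5 + 1e-13j])
y = enforce_conjugacy(x, [(0, 1)], [2])
```

After the clean-up, y[0] and y[1] are exact complex conjugates and y[2] is purely real, so conjugacy cannot drift over many RK4 steps.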

Appendix D Failure from standard manifold parameterizations

In this section we briefly report on the reduction skills, for the chaotic case of Sec. V.3, achieved through application of invariant manifold theory or the standard formulas used in inertial manifold theory. Invariant manifold theory is usually applied near a steady state. To work on favorable ground for these theories, we first choose the steady state 𝒀¯r\overline{\bm{Y}}_{\!r} that is closest to the mean state 𝑪¯r\overline{\bm{C}}_{r} at the parameter value rr at which we want to approximate the chaotic dynamics via a reduced system. This way, the chaotic dynamics is located near this steady state.

To derive the invariant manifold (IM) reduced system, we also rewrite the RB system (62) in the fluctuation variable, but this time using the steady state 𝒀¯r\overline{\bm{Y}}_{\!r}, namely for 𝜹(t)=𝑪(t)𝒀¯r\bm{\delta}(t)=\bm{C}(t)-\overline{\bm{Y}}_{\!r}. The analogue of Eq. (65) is then projected onto the eigenmodes of the linearized part

A~𝜹=L𝜹+B(𝒀¯r,𝜹)+B(𝜹,𝒀¯r),\tilde{A}\bm{\delta}=L\bm{\delta}+B(\overline{\bm{Y}}_{\!r},\bm{\delta})+B(\bm{\delta},\overline{\bm{Y}}_{\!r}), (88)

to obtain:

w˙j=λjwj+k,=19B~kjwkw,j=1,,9.\displaystyle\dot{w}_{j}=\lambda_{j}w_{j}+\sum_{k,\ell=1}^{9}\tilde{B}_{k\ell}^{j}w_{k}w_{\ell},\;\;j=1,\cdots,9. (89)

Here the (λj,𝒇j)(\lambda_{j},\bm{f}_{j}) denote the spectral elements of A~\tilde{A}, while the interaction coefficients are given by B~kj=B(𝒇k,𝒇),𝒇j\tilde{B}_{k\ell}^{j}=\langle B(\bm{f}_{k},\bm{f}_{\ell}),\bm{f}^{*}_{j}\rangle with 𝒇j\bm{f}^{*}_{j} denoting the eigenvector of A~\tilde{A}^{*} associated with λj\lambda_{j}^{*}. Note that, unlike the eigenbasis formulation of the RB system used in Sec. V, there is no constant forcing term on the RHS of (89) since L𝒀¯r+B(𝒀¯r,𝒀¯r)=0L\overline{\bm{Y}}_{\!r}+B(\overline{\bm{Y}}_{\!r},\overline{\bm{Y}}_{\!r})=0 as 𝒀¯r\overline{\bm{Y}}_{\!r} is a steady state.

Here also, the reduced state space is taken to be H𝔠=span(𝒆1,,𝒆mc)H_{\mathfrak{c}}=\mbox{span}(\bm{e}_{1},\cdots,\bm{e}_{m_{c}}) with mc=5m_{c}=5 as indicated in Table 2 for the chaotic regime. The local IM associated with H𝔠H_{\mathfrak{c}} has its components approximated by [12, Theorem 2]:

hn(𝑿)=i,j=1mcB~ijnλi+λjλnXiXj,n=mc+1,,9.h_{n}(\bm{X})=\sum_{i,j=1}^{m_{c}}\frac{\tilde{B}_{ij}^{n}}{\lambda_{i}+\lambda_{j}-\lambda_{n}}X_{i}X_{j},\quad n=m_{c}+1,\cdots,9. (90)

This approximation is valid when (λi+λjλn)>0\Re(\lambda_{i}+\lambda_{j}-\lambda_{n})>0 for 1i,jmc1\leq i,j\leq m_{c} and nmc+1n\geq m_{c}+1.

The corresponding IM reduced system takes the form given by (71), in which the components Φ\Phi_{\ell} of the OPM are replaced by the hh_{\ell} given by (90) for =mc+1,,9\ell=m_{c}+1,\cdots,9. Unlike for the OPM reduced system (71), here we use the true eigenvalues λj\lambda_{j} at the parameter rr at which the dynamics is chaotic (r=rP2r=r_{P_{2}} in the notations of Fig. 5).

Figure 7: The IM/FMT reduced systems are derived from Eq. (65) when the fluctuations are taken either with respect to the mean state 𝑪¯r\overline{\bm{C}}_{r} at r=14.22r=14.22 (corresponding to the point P2P_{2} in Fig. 5D) (bottom row) or with respect to the closest steady state to 𝑪¯r\overline{\bm{C}}_{r} (top row). Whichever strategy is retained, the IM/FMT reduced systems fail to predict the proper chaotic dynamics, predicting instead periodic dynamics for the same dimension mc=5m_{c}=5 of the reduced state space as used for the OPM results shown in Fig. 5B.

From the solution z(t)=(z1(t),,zmc(t))z(t)=(z_{1}(t),\cdots,z_{m_{c}}(t)) to this IM reduced system, one then forms the approximation 𝑪IM(t)\bm{C}^{\text{IM}}(t) of 𝑪(t)\bm{C}(t) given by:

𝑪IM(t)=j=1mczj(t)𝒇j+n=mc+19hn(z(t))𝒇n+𝒀¯r.\bm{C}^{\text{IM}}(t)=\sum_{j=1}^{m_{c}}z_{j}(t)\bm{f}_{j}+\sum_{n=m_{c}+1}^{9}h_{n}(z(t))\bm{f}_{n}+\overline{\bm{Y}}_{\!r}. (91)

Similarly, we also test the performance of the reduced system when the local IM (90) is replaced by the following FMT parameterization (see (31) in Remark III.2):

hnFMT=i,j=1mcB~ijnλnXiXj,n=mc+1,,9.h^{\mathrm{FMT}}_{n}=-\sum_{i,j=1}^{m_{c}}\frac{\tilde{B}_{ij}^{n}}{\lambda_{n}}X_{i}X_{j},\quad n=m_{c}+1,\cdots,9. (92)

Formulas such as (92) have been used in earlier studies relying on inertial manifolds for predicting higher-order bifurcations [49].
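For concreteness, here is a minimal sketch of the parameterizations (90) and (92), assuming the interaction coefficients are stored as an array `B[i, j, n]` (an implementation choice for this illustration, not the paper's notation). The example also illustrates that the FMT formula is recovered from the IM one when |λn||\lambda_{n}| dominates |λi+λj||\lambda_{i}+\lambda_{j}|.

```python
import numpy as np

def im_param(X, lam, B, mc):
    """Local IM approximation (90): for each n > mc,
    h_n(X) = sum_{i,j<=mc} B[i,j,n] X_i X_j / (lam_i + lam_j - lam_n)."""
    return np.array([
        sum(B[i, j, n] * X[i] * X[j] / (lam[i] + lam[j] - lam[n])
            for i in range(mc) for j in range(mc))
        for n in range(mc, len(lam))])

def fmt_param(X, lam, B, mc):
    """FMT parameterization (92): h_n(X) = -sum_{i,j<=mc} B[i,j,n] X_i X_j / lam_n."""
    return np.array([
        -sum(B[i, j, n] * X[i] * X[j]
             for i in range(mc) for j in range(mc)) / lam[n]
        for n in range(mc, len(lam))])

# When the unresolved eigenvalue dominates (|lam_n| >> |lam_i + lam_j|),
# the two parameterizations nearly coincide (toy eigenvalues below):
rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3, 3))
X = rng.standard_normal(2)
lam = np.array([0.1, 0.2, -1000.0])
h_im = im_param(X, lam, B, mc=2)
h_fmt = fmt_param(X, lam, B, mc=2)
```

This limit makes explicit why the FMT formula performs comparably to the IM one here: in the regime at hand the unresolved eigenvalues are not strongly dominant, and both formulas inherit the same small-spectral-gap deficiency.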

The predicted orbits obtained from the reduced systems built either on the IM parameterization (90) or the FMT one (92) are shown in the top row of Fig. 7 as blue and magenta curves, respectively. Both reduced systems lead to periodic dynamics and thus fail dramatically in predicting the chaotic dynamics. The small spectral gap issue mentioned in Remark III.1 plays a role in explaining this failure, but it is not the only factor. Another fundamental issue for the closure of chaotic dynamics lies in the difficulty of providing accurate parameterizations of small-energy but dynamically important variables; see e.g. [12, Sec. 6]. This issue is encountered here, as some of the variables to parameterize for Experiment II contain only from 0.23%0.23\% to 2.1%2.1\% of the total energy.

Replacing 𝒀¯r\overline{\bm{Y}}_{\!r} by the genuine mean state 𝑪¯r\overline{\bm{C}}_{r} does not fix this issue, as some of the variables to parameterize still contain only a small fraction of the total energy, from 0.36%0.36\% to 1.5%1.5\%. The IM parameterization (90) and the FMT one (92), whose coefficients are now determined from the spectral elements of A~\tilde{A} in which 𝑪¯r\overline{\bm{C}}_{r} replaces 𝒀¯r\overline{\bm{Y}}_{\!r} in (88), still fail in parameterizing accurately such small-energy variables; see the bottom row of Fig. 7. (Note that here the FMT parameterization accounts for the forcing term (see [12, Eq. (4.40)]) produced by fluctuations calculated with respect to 𝑪¯r\overline{\bm{C}}_{r}.) By comparison, the OPM succeeds in parameterizing accurately these small-energy variables. For Experiment I, inaccurate predictions are also observed from the reduced systems built either on the IM parameterization (90) or the FMT one (92) (not shown).

References

  • Note [1] But diagonalizable over the set of complex numbers \mathbb{C}.
  • Henry [1981] D. Henry, Geometric Theory of Semilinear Parabolic Equations, Lecture Notes in Mathematics, Vol. 840 (Springer-Verlag, Berlin, 1981).
  • Ghil and Childress [1987] M. Ghil and S. Childress, Topics in Geophysical Fluid Dynamics: Atmospheric Dynamics, Dynamo Theory and Climate Dynamics (Springer-Verlag, New York/Berlin, 1987).
  • Hale [1988] J. K. Hale, Asymptotic Behavior of Dissipative Systems, Mathematical Surveys and Monographs, Vol. 25 (American Mathematical Society, Providence, RI, 1988) pp. x+198 pp.
  • Crawford [1991] J. D. Crawford, Reviews of Modern Physics 63, 991 (1991).
  • Cross and Hohenberg [1993] M. C. Cross and P. C. Hohenberg, Rev. Mod. Phys. 65, 851 (1993).
  • Sell and You [2002] G. Sell and Y. You, Dynamics of Evolutionary Equations, Applied Mathematical Sciences, Vol. 143 (Springer-Verlag, New York, 2002) pp. xiv+670.
  • Ma and Wang [2005] T. Ma and S. Wang, Bifurcation Theory and Applications, World Scientific Series on Nonlinear Science. Series A: Monographs and Treatises, Vol. 53 (World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2005).
  • Kloeden and Rasmussen [2011] P. E. Kloeden and M. Rasmussen, Nonautonomous Dynamical Systems, Mathematical Surveys and Monographs, Vol. 176 (American Mathematical Society, Providence, RI, 2011).
  • Temam [2012] R. Temam, Infinite-dimensional Dynamical Systems in Mechanics and Physics, Vol. 68 (Springer Science & Business Media, 2012).
  • Ma and Wang [2014] T. Ma and S. Wang, Phase Transition Dynamics (Springer, 2014).
  • Chekroun et al. [2020a] M. D. Chekroun, H. Liu,  and J. C. McWilliams, J. Stat. Phys. 179, 1073 (2020a).
  • Chorin et al. [2002] A. J. Chorin, O. H. Hald,  and R. Kupferman, Physica D 166, 239 (2002).
  • Givon et al. [2004] D. Givon, R. Kupferman,  and A. Stuart, Nonlinearity 17, R55 (2004).
  • Gottwald [2010] G. Gottwald, Gazette of the Australian Mathematical Society 37, 319 (2010).
  • Gottwald et al. [2017] G. A. Gottwald, D. T. Crommelin,  and C. L. E. Franzke, “Stochastic climate theory,”  (Cambridge University Press, 2017) pp. 209–240.
  • Lucarini and Chekroun [2023] V. Lucarini and M. Chekroun, arXiv:2303.12009  (2023).
  • Chekroun et al. [2017] M. D. Chekroun, H. Liu,  and J. C. McWilliams, Computers & Fluids 151, 3 (2017).
  • Chekroun et al. [2021] M. D. Chekroun, H. Liu,  and J. C. McWilliams, Proc. Natl. Acad. Sci. 118, e2113650118 (2021).
  • Constantin et al. [2012] P. Constantin, C. Foias, B. Nicolaenko,  and R. Temam, Integral Manifolds and Inertial Manifolds for Dissipative Partial Differential Equations, Vol. 70 (Springer Science, 2012).
  • Debussche and Temam [1991] A. Debussche and R. Temam, Differential and Integral Equations 4, 897 (1991).
  • Debussche and Temam [1994] A. Debussche and R. Temam, Journal de Mathématiques Pures et Appliquées 73, 489 (1994).
  • Temam and Wirosoetisno [2010] R. Temam and D. Wirosoetisno, SIAM Journal on Mathematical Analysis 42, 427 (2010).
  • Temam and Wirosoetisno [2011] R. Temam and D. Wirosoetisno, J. Atmos. Sci. 68, 675 (2011).
  • Zelik [2014] S. Zelik, Proc. R. Soc. Edinb. Sec. A: Mathematics 144, 1245 (2014).
  • Subel et al. [2022] A. Subel, Y. Guan, A. Chattopadhyay,  and P. Hassanzadeh, arXiv preprint arXiv:2206.03198  (2022).
  • Ahmed et al. [2021] S. Ahmed, S. Pawar, O. San, A. Rasheed, T. Iliescu,  and B. Noack, Physics of Fluids 33, 091301 (2021).
  • Hesthaven et al. [2022] J. Hesthaven, C. Pagliantini,  and G. Rozza, Acta Numerica 31, 265 (2022).
  • Chekroun et al. [2015a] M. D. Chekroun, H. Liu,  and S. Wang, Stochastic Parameterizing Manifolds and Non-Markovian Reduced Equations: Stochastic Manifolds for Nonlinear SPDEs II (Springer Briefs in Mathematics, Springer, 2015).
  • Cabré et al. [2005] X. Cabré, E. Fontich,  and R. De La Llave, Journal of Differential Equations 218, 444 (2005).
  • Haro et al. [2016] À. Haro, M. Canadell, J.-L. Figueras, A. Luque,  and J.-M. Mondelo, The Parameterization Method for Invariant Manifolds:From Rigorous Results to Effective Computations, Vol. 195 (Springer-Verlag, Berlin, 2016).
  • Roberts [2018] A. Roberts, arXiv preprint arXiv:1804.06998  (2018).
  • Kuehn [2011] C. Kuehn, Physica D 240, 1020 (2011).
  • Ashwin et al. [2012] P. Ashwin, S. Wieczorek, R. Vitolo,  and P. Cox, Phil. Trans. Roy. Soc. A 370, 1166 (2012).
  • Note [2] Here, we make the abuse of language that the OPM is sought within the class of parameterizations obtained through the backward-forward systems of Sec. III. In general, OPMs are more general objects related to the conditional expectation associated with the projector Π𝔠\Pi_{\mathfrak{c}}; see [12, Sec. 3.2].
  • Chekroun et al. [2015b] M. D. Chekroun, H. Liu,  and S. Wang, Approximation of Stochastic Invariant Manifolds: Stochastic Manifolds for Nonlinear SPDEs I (Springer Briefs in Mathematics, Springer, 2015).
  • Chekroun et al. [2023] M. Chekroun, H. Liu, J. McWilliams,  and S. Wang, Journal of Differential Equations 346, 145 (2023).
  • Note [3] One could also express a constraint on the behavior of F(t)F(t) as tt\rightarrow\infty. For instance one may require that
    \lim_{s\rightarrow-\infty}e^{A_{\mathfrak{c}}s}\int_{s}^{0}e^{A_{\mathfrak{c}}s^{\prime}}\Pi_{\mathfrak{c}}F(s^{\prime})\,\mathrm{d}s^{\prime}<\infty,\quad\text{etc.} (93)
  • Arnold [1988] V. I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, 2nd ed. (Springer-Verlag, New York, 1988).
  • Pötzsche and Rasmussen [2006] C. Pötzsche and M. Rasmussen, J. Dyn. Diff. Equat. 18, 427 (2006).
  • Dávid and Sinha [2006] A. Dávid and S. Sinha, in Bifurcation and Chaos in Complex Systems, edited by J.-Q. Sun and A. C. J. Luo (Elsevier, 2006) pp. 279–338.
  • Sell [1985] G. R. Sell, American Journal of Mathematics , 1035 (1985).
  • Foias et al. [1988] C. Foias, O. Manley,  and R. Temam, RAIRO Modél. Math. Anal. Numér. 22, 93 (1988).
  • Daley [1980] R. Daley, Monthly Weather Review 108, 100 (1980).
  • Baer and Tribbia [1977] F. Baer and J. J. Tribbia, Monthly Weather Review 105, 1536 (1977).
  • Machenhauer [1977] B. Machenhauer, Beitr. Phys. Atmos 50, 253 (1977).
  • Leith [1980] C. Leith, J. Atmos. Sci. 37, 958 (1980).
  • Roberts [1988] A. Roberts, The ANZIAM Journal 29, 480 (1988).
  • Brown et al. [1991] H. S. Brown, I. G. Kevrekidis,  and M. S. Jolly, in Patterns and Dynamics in Reactive Media, edited by R. Aris, D. G. Aronson,  and H. L. Swinney (Springer, 1991) pp. 11–31.
  • Stommel [1961] H. Stommel, Tellus 13, 224 (1961).
  • Cessi [1994] P. Cessi, Journal of Physical Oceanography 24, 1911 (1994).
  • Dijkstra [2013] H. A. Dijkstra, Nonlinear climate dynamics (Cambridge University Press, 2013).
  • Boers [2021] N. Boers, Nature Climate Change 11, 680 (2021).
  • Lenton et al. [2008] T. Lenton, H. Held, E. Kriegler, J. Hall, W. Lucht, S. Rahmstorf,  and H. J. Schellnhuber, Proc. Natl. Acad. Sci. 105, 1786 (2008).
  • Bryan and Hansen [1995] K. Bryan and F. Hansen, in Natural Climate Variability on Decade-to-Century Time Scales (National Academy Press, 1995) pp. 355–364.
  • Berglund and Gentz [2002] N. Berglund and B. Gentz, Stochastics and Dynamics 2, 327 (2002).
  • Kuznetsov [2004] Y. A. Kuznetsov, Elements of Applied Bifurcation Theory, 3rd ed., Applied Mathematical Sciences, Vol. 112 (Springer-Verlag, New York, 2004) pp. xxii+631 pp.
  • Thual and McWilliams [1992] O. Thual and J. McWilliams, Geophysical & Astrophysical Fluid Dynamics 64, 67 (1992).
  • Dijkstra and Moleamaker [1997] H. Dijkstra and M. Moleamaker, Journal of Fluid Mechanics 331, 169 (1997).
  • Weijer et al. [2003] W. Weijer, H. Dijkstra, H. Oksuzoglu, F. Wubs,  and A. de Niet, Journal of Computational Physics 192, 452 (2003).
  • Weijer et al. [2019] W. Weijer, W. Cheng, S. Drijfhout, A. Fedorov, A. Hu, L. Jackson, W. Liu, E. McDonagh, J. Mecking,  and J. Zhang, Journal of Geophysical Research: Oceans 124, 5336 (2019).
  • Lions [1982] P.-L. Lions, SIAM review 24, 441 (1982).
  • Kielhöfer [2011] H. Kielhöfer, Bifurcation theory: An Introduction with Applications to Partial Differential Equations, Vol. 156 (Springer Science & Business Media, 2011).
  • Chekroun [2018] M. D. Chekroun, Disc. Cont. Dyn. Syst. B 9, 3723 (2018).
  • Ghil [1976] M. Ghil, Journal of Atmospheric Sciences 33, 3 (1976).
  • Ghil [1981] M. Ghil, in Climatic variations and Variability: Facts and theories (Springer, 1981) pp. 461–480.
  • Hetzer [1997] G. Hetzer, in The Mathematics of models for Climatology and Environment (Springer, 1997) pp. 253–287.
  • Bódai et al. [2015] T. Bódai, V. Lucarini, F. Lunkeit,  and R. Boschi, Climate Dynamics 44, 3361 (2015).
  • Lee et al. [2011] E. Lee, S. Sasi,  and R. Shivaji, Journal of Mathematical Analysis and Applications 381, 732 (2011).
  • Pruitt et al. [2018] J. Pruitt, A. Berdahl, C. Riehl, N. Pinter-Wollman, H. Moeller, E. Pringle, L. Aplin, E. Robinson, J. Grilli, P. Yeh, et al., Phil. Trans. Roy. Soc. A: Biological Sciences 285, 20181282 (2018).
  • Bel et al. [2012] G. Bel, A. Hagberg,  and E. Meron, Theoretical Ecology 5, 591 (2012).
  • Zelnik et al. [2015] Y. R. Zelnik, E. Meron,  and G. Bel, Proc. Natl. Acad. Sci. 112, 12327 (2015).
  • Bebernes and Eberly [2013] J. Bebernes and D. Eberly, Mathematical Problems from Combustion Theory, Vol. 83 (Springer Science & Business Media, 2013).
  • Frank-Kamenetskii [2015] D. A. Frank-Kamenetskii, Diffusion and Heat Exchange in Chemical Kinetics (Princeton University Press, 2015).
  • Du [2000] Y. Du, SIAM Journal on Mathematical Analysis 32, 707 (2000).
  • Du and Lou [2001] Y. Du and Y. Lou, Journal of Differential Equations 173, 213 (2001).
  • Feudel et al. [2018] U. Feudel, A. Pisarchik,  and K. Showalter, Chaos 28 (2018).
  • Caesar et al. [2018] L. Caesar, S. Rahmstorf, A. Robinson, G. Feulner,  and V. Saba, Nature 556, 191 (2018).
  • Tikhonov [1952] A. N. Tikhonov, Mat. Sbornik N.S. 31, 575 (1952).
  • Reiterer et al. [1998] P. Reiterer, C. Lainscsek, F. Schürrer, C. Letellier,  and J. Maquet, J. Phys. A: Math. Gen. 31, 7121 (1998).
  • Chekroun et al. [2014] M. D. Chekroun, J. D. Neelin, D. Kondrashov, J. C. McWilliams,  and M. Ghil, Proc. Natl. Acad. Sci. 111, 1684 (2014).
  • Gallavotti and Cohen [1995] G. Gallavotti and E. Cohen, Journal of Statistical Physics 80, 931 (1995).
  • Lucarini [2018] V. Lucarini, J. Stat. Phys. 173, 1698 (2018).
  • McComb et al. [2001] W. D. McComb, A. Hunter,  and C. Johnston, Physics of Fluids 13, 2030 (2001).
  • Pötzsche and Rasmussen [2009] C. Pötzsche and M. Rasmussen, Numer. Math. 112, 449 (2009).
  • Jiang et al. [2018] J. Jiang, Z.-G. Huang, T. P. Seager, W. Lin, C. Grebogi, A. Hastings,  and Y.-C. Lai, Proc. Natl. Acad. Sci. 115, E639 (2018).
  • Jiang et al. [2019] J. Jiang, A. Hastings,  and Y.-C. Lai, Journal of the Royal Society Interface 16, 20190345 (2019).
  • Laurence et al. [2019] E. Laurence, N. Doyon, L. J. Dubé,  and P. Desrosiers, Physical Review X 9, 011042 (2019).
  • Chekroun et al. [2016] M. D. Chekroun, M. Ghil, H. Liu,  and S. Wang, Disc. Cont. Dyn. Syst. A 36, 4133 (2016), doi:10.3934/dcds.2016.36.4133.
  • Chekroun et al. [2020b] M. D. Chekroun, I. Koren,  and H. Liu, Chaos 30, 053130 (2020b).
  • Chekroun et al. [2022a] M. D. Chekroun, I. Koren, H. Liu,  and H. Liu, Science Advances 8, eabq7137 (2022a).
  • Held and Kleinen [2004] H. Held and T. Kleinen, Geophysical Research Letters 31 (2004).
  • Williamson and Lenton [2015] M. Williamson and T. Lenton, Chaos 25 (2015).
  • Bathiany et al. [2013] S. Bathiany, M. Claussen,  and K. Fraedrich, Earth System Dynamics 4, 63 (2013).
  • Kéfi et al. [2014] S. Kéfi, V. Guttal, W. Brock, S. Carpenter, A. Ellison, V. Livina, D. Seekell, M. Scheffer, E. Van Nes,  and V. Dakos, PLoS ONE 9, e92097 (2014).
  • Chekroun et al. [2022b] M. D. Chekroun, H. Dijkstra, T. Sengul,  and S. Wang, Nonlinear Dynamics 109, 1887 (2022b).
  • Dijkstra et al. [2015] H. Dijkstra, T. Sengul, J. Shen,  and S. Wang, SIAM Journal on Applied Mathematics 75, 2361 (2015).
  • Han et al. [2020] D. Han, Q. Wang,  and X. Wang, Physica D: Nonlinear Phenomena 414, 132687 (2020).
  • Hsia et al. [2008] C.-H. Hsia, T. Ma,  and S. Wang, Zeitschrift für Analysis und ihre Anwendungen 27, 233 (2008).
  • Hsia et al. [2010] C.-H. Hsia, T. Ma,  and S. Wang, Discrete Contin. Dyn. Syst. 28, 99 (2010).
  • Lu et al. [2019] C. Lu, Y. Mao, Q. Wang,  and D. Yan, Journal of Differential Equations 267, 2560 (2019).
  • Sengul et al. [2015] T. Sengul, J. Shen,  and S. Wang, Mathematical Methods in the Applied Sciences 38, 3792 (2015).
  • Berloff and McWilliams [1999] P. Berloff and J. McWilliams, Journal of Physical Oceanography 29, 1925 (1999).
  • Dijkstra and Ghil [2005] H. Dijkstra and M. Ghil, Reviews of Geophysics 43 (2005).
  • Nadiga and Luce [2001] B. Nadiga and B. Luce, Journal of Physical Oceanography 31, 2669 (2001).
  • Simonnet et al. [2003] E. Simonnet, M. Ghil, K. Ide, R. Temam,  and S. Wang, Journal of Physical Oceanography 33, 729 (2003).
  • Simonnet et al. [2005] E. Simonnet, M. Ghil,  and H. A. Dijkstra, J. Mar. Res. 63, 931 (2005).
  • Dakos et al. [2015] V. Dakos, S. Carpenter, E. van Nes,  and M. Scheffer, Philosophical Transactions of the Royal Society B: Biological Sciences 370, 20130263 (2015).
  • Majda and Chen [2018] A. Majda and N. Chen, Entropy 20, 644 (2018).
  • Zhang et al. [2021] H. Zhang, J. Harlim,  and X. Li, Physica D 427, 133022 (2021).
  • Gottwald and Reich [2021] G. Gottwald and S. Reich, Physica D: Nonlinear Phenomena 423, 132911 (2021).
  • Note [4] Note that here the FMT parameterization accounts for the forcing term (see [12, Eq. (4.40)]) produced by fluctuations calculated with respect to $\overline{\bm{C}}_{r}$.