This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Most probable transition paths in piecewise-smooth stochastic differential equations

Kaitlin Hill   , Jessica Zanetell , and John A. Gemmer33footnotemark: 3 Department of Mathematics, St. Mary’s University, San Antonio, TX 78228, USACorresponding author, Email: [email protected] of Mathematics, Wake Forest University, Winston-Salem, NC 27109, USAEmail: [email protected]
Abstract

We develop a path integral framework for determining most probable paths for a class of systems of stochastic differential equations with piecewise-smooth drift and additive noise. This approach extends the Freidlin-Wentzell theory of large deviations to cases where the system is piecewise-smooth and may be non-autonomous. In particular, we consider an nn-dimensional system with a switching manifold in the drift that forms an (n1)(n-1)-dimensional hyperplane and investigate noise-induced transitions between metastable states on either side of the switching manifold. To do this, we mollify the drift and use Γ\Gamma-convergence to derive an appropriate rate functional for the system in the piecewise-smooth limit. The resulting functional consists of the standard Freidlin-Wentzell rate functional, with an additional contribution due to times when the most probable path slides in a crossing region of the switching manifold. We explore implications of the derived functional through two case studies, which exhibit notable phenomena such as non-unique most probable paths and noise-induced sliding in a crossing region.

Keywords: Piecewise smooth dynamical systems, Filippov systems, Freidlin-Wentzell rate functional, Gamma-convergence, noise induced tipping, rare events

2000 MSC: 37H10, 37J45

1 Introduction

In dynamical systems, a tipping event is loosely defined as occurring when a sudden or small change to a variable or parameter induces a large change to the state of the system. While tipping is often studied within the context of climate applications [2, 3, 4, 5, 6, 7, 8], it also has broad applications in ecology [9, 10], ecosystems [11, 12] and epidemiology [13, 14, 15, 16], to name a few. While there is no precise mathematical definition of a tipping event, in [17] it was proposed that tipping events could be classified according to whether the underlying mathematical mechanism involves, predominantly, a bifurcation (B-tipping), noise-induced transitions (N-tipping), or transitions between basins of attraction induced by fast changes in parameters (R-tipping), i.e. rate-induced tipping; see also [18, 19, 20, 21].

The term ‘noise-induced tipping’ encompasses a range of phenomena such as transitions between metastable states, bursting, stochastic resonance, stochastic coherence and stochastic synchronization [22] and can be analyzed from several points of view including the Freidlin-Wentzell (FW) theory of large deviations, [23, 24, 25], the path integral framework [26, 27], transition path theory [28, 29], and formal asymptotics [30, 31]. In particular, for smooth autonomous dynamical systems additively perturbed by Gaussian white noise, the FW theory provides a framework for computing most probable transitions as minimizers of a rate functional in the asymptotic limit of vanishing noise strength. The benefit of the FW framework is that most probable transition paths can be numerically computed using iterative schemes such as the string method for gradient systems [32], the minimum action method [24], the geometric minimum action method [33] and explicit gradient descent [34]. Moreover, knowledge of the most probable transition path can then be coupled with statistical techniques such as importance sampling to compute quantities of interest such as the expected time of tipping for nonzero noise [35, 25].

In this paper we extend the path integral framework to a class of differential equations with additive noise and piecewise-smooth drift. There has been considerable recent interest in understanding the dynamics of piecewise-smooth stochastic dynamical systems, particularly in relation to rare events and tipping, or transitions between metastable states, but also in varying applications [36, 37, 38]. In [39, 40], Chen et al. use the backward Fokker-Planck technique to derive the distribution of paths in a model of dry friction. Notably, in [39], Chen et al. use a similar method to that of the present study, smoothing out the piecewise-smooth vector field to numerically analyze the desingularized SDE. In [41, 42], Baule et al. use the path integral approach, as in the present study, to investigate most probable paths in a piecewise-smooth model of stick-slip friction. Beyond mechanical systems, the study of noise-induced tipping in piecewise-smooth systems has applications in biology [43, 44] and climate models [45, 46, 47, 49, 50]. To our knowledge, most probable paths in general stochastic piecewise-smooth systems have not yet been addressed.

1.1 Most probable paths in smooth systems

During rare event transitions from one metastable state to another, a stochastic system will follow paths according to some distribution, which in the limit of vanishing noise strength is generally singly-peaked along a most probable transition path. Following [23], for a smooth vector field \mathbfcal{F}, the most probable transition path between 𝐱0\mathbf{x}_{0} and 𝐱f\mathbf{x}_{f} can be defined as follows. We first define an admissible set of transition paths 𝒜(t0,tf){\mathcal{A}^{(t_{0},t_{f})}} by

𝒜(t0,tf)={𝜶H1([t0,tf];n):𝜶(t0)=𝐱0 and 𝜶(tf)=𝐱f},{\mathcal{A}^{(t_{0},t_{f})}}=\{\bm{\alpha}\in H^{1}([t_{0},t_{f}];{\mathbb{R}}^{n}):\bm{\alpha}(t_{0})=\mathbf{x}_{0}\text{ and }\bm{\alpha}(t_{f})=\mathbf{x}_{f}\},

where H1([t0,tf];n)H^{1}([t_{0},t_{f}];{\mathbb{R}}^{n}) is the Hilbert space of weakly differentiable curves 𝜶\bm{\alpha} satisfying 𝜶H1\bm{\alpha}\in H^{1} if and only if t0tf|𝜶|2+|𝜶˙|2dt<\int_{t_{0}}^{t_{f}}|\bm{\alpha}|^{2}+|\dot{\bm{\alpha}}|^{2}\,dt<\infty. The most probable transition path 𝜶𝒜(t0,tf)\bm{\alpha}^{*}\in{\mathcal{A}^{(t_{0},t_{f})}} is then defined to be the global minimizer of the Freidlin-Wentzell rate functional I(t0,tf):𝒜(t0,tf)¯,{I^{(t_{0},t_{f})}}:{\mathcal{A}^{(t_{0},t_{f})}}\mapsto\overline{{\mathbb{R}}}, given by

I(t0,tf)[𝜶]=t0tf𝜶˙(t)𝜶2𝑑t,{I^{(t_{0},t_{f})}}[\bm{\alpha}]=\int_{t_{0}}^{t_{f}}\left\|\dot{\bm{\alpha}}(t)-\mathbfcal{F}(\bm{\alpha}(t),t)\right\|^{2}\,dt, (1)

so that 𝜶=argmin𝜶𝒜(t0,tf)I(t0,tf)[𝜶]\bm{\alpha}^{*}=\operatorname*{arg\,min}_{\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}}}{I^{(t_{0},t_{f})}}[\bm{\alpha}]. For this functional, the necessary condition satisfied by second differentiable minimizers is the following Euler-Lagrange equations [25]:

𝜶¨=𝜶˙\displaystyle\ddot{\bm{\alpha}}=\mathbfcal{F}_{t}+\dot{\bm{\alpha}}\left(\nabla\mathbfcal{F}-\nabla\mathbfcal{F}^{\intercal}\right)+\nabla\mathbfcal{F}^{\intercal}\mathbfcal{F}.

Here, the subscript tt refers to the partial derivative with respect to time.

1.2 Framing the piecewise-smooth problem

We determine the most probable transition path of a noise-induced tipping event when the drift is piecewise-smooth. In general, for systems with a piecewise-smooth drift, the appropriate rate functional to minimize is not known. Minimizers of the Freidlin-Wentzell rate functional may not be well-defined when a region of the vector field is not continuous, or even Lipschitz continuous. It is not clear how the most probable path may traverse across a switching manifold, which may have an attracting or repelling sliding region that introduces more complex dynamics. Specifically, letting 𝐅±:nn\mathbf{F}^{\pm}:{\mathbb{R}}^{n}\mapsto{\mathbb{R}}^{n} be smooth vector fields, we consider a system of stochastic differential equations of the form

d𝐱t=𝐅(𝐱t)dt+σd𝐖t,d\mathbf{x}_{t}=\mathbf{F}(\mathbf{x}_{t})dt+\sigma d\mathbf{W}_{t}, (2)

where 𝐱=(x,𝐲)n\mathbf{x}=(x,\mathbf{y})\in{\mathbb{R}}^{n} such that xx\in{\mathbb{R}} and 𝐲n1\mathbf{y}\in{\mathbb{R}}^{n-1}, σ\sigma\in{\mathbb{R}}, 𝐖=(W1,Wn)\mathbf{W}=(W_{1},\ldots W_{n}) is an nn-dimensional Wiener process, and 𝐅:n{x=0}n\mathbf{F}:{\mathbb{R}}^{n}\setminus\{x=0\}\mapsto{\mathbb{R}}^{n} is defined by

𝐅(𝐱)={𝐅+(𝐱),x>0,𝐅(𝐱),x<0.\mathbf{F}(\mathbf{x})=\begin{cases}\mathbf{F}^{+}(\mathbf{x}),&x>0,\\ \mathbf{F}^{-}(\mathbf{x}),&x<0.\end{cases} (3)

We let S+={𝐱n:x>0}S_{+}=\{\mathbf{x}\in{\mathbb{R}}^{n}:x>0\}, S={𝐱n:x<0}S_{-}=\{\mathbf{x}\in{\mathbb{R}}^{n}:x<0\}, and Σ={𝐱n:x=0}\Sigma=\{\mathbf{x}\in{\mathbb{R}}^{n}:x=0\}. The set Σ\Sigma is often called the switching manifold or discontinuity boundary [51]. In general, Σ\Sigma may be defined as the zero level set of a smooth function H:n,H:{\mathbb{R}}^{n}\mapsto{\mathbb{R}}, but we assume for simplicity that Σ\Sigma is a hyperplane in n1{\mathbb{R}}^{n-1}.

We define the deterministic skeleton of Equation (2) as the dynamical system

𝐱˙=𝐅(𝐱).\dot{\mathbf{x}}=\mathbf{F}(\mathbf{x}). (4)

We assume that there exist asymptotically stable fixed points 𝐱0S+\mathbf{x}_{0}\in S_{+} and 𝐱fS\mathbf{x}_{f}\in S_{-} of Equation (4), such that 𝐅+(𝐱0)=0\mathbf{F}^{+}(\mathbf{x}_{0})=0 and 𝐅(𝐱f)=0\mathbf{F}^{-}(\mathbf{x}_{f})=0. By noise-induced transitions, we mean realizations of Equation (2) that transition from the basin of attraction of 𝐱0\mathbf{x}_{0} to that of 𝐱f\mathbf{x}_{f}. More precisely, we define a noise-induced transition on the interval [t0,tf][t_{0},t_{f}] as a solution to the stochastic boundary value problem given by realizations of Equation (2) that satisfy 𝐱(t0)=𝐱0\mathbf{x}(t_{0})=\mathbf{x}_{0} and 𝐱(tf)=𝐱f\mathbf{x}(t_{f})=\mathbf{x}_{f}. Here the boundary conditions mean that the process is conditioned to transition from 𝐱0\mathbf{x}_{0} to 𝐱f\mathbf{x}_{f}.

1.3 Background on piecewise-smooth systems

Dynamics that occur entirely within the smooth regions S±S_{\pm} can be fully described by regular (smooth) dynamical systems theory. On the other hand, dynamics on a switching manifold Σ\Sigma may not be defined and are typically imposed a posteriori, e.g. using Filippov’s convex combination [51, 52, 53]. In general, the lack of smoothness requirements across the switching manifold allows for more diverse phenomena than in smooth systems of similar dimension, since the vector field across Σ\Sigma may be discontinuous or even point in opposite directions. Regions of Σ\Sigma where either occurs are called sliding regions. A solution that reaches Σ\Sigma at a sliding region may “slide” along it. On the other hand, crossing regions occur on Σ\Sigma where sliding is not possible. We differentiate these regions using the following notation:

  1. 1.

    Attracting sliding regions: ΣA={𝐱Σ:F1+(0,𝐲)0,F1(0,𝐲)0}\Sigma_{A}=\{\mathbf{x}\in\Sigma:F_{1}^{+}(0,\mathbf{y})\leq 0,\,F_{1}^{-}(0,\mathbf{y})\geq 0\},

  2. 2.

    Repelling sliding regions: ΣR={𝐱Σ:F1+(0,𝐲)0,F1(0,𝐲)0},\Sigma_{R}=\{\mathbf{x}\in\Sigma:F_{1}^{+}(0,\mathbf{y})\geq 0,\,F_{1}^{-}(0,\mathbf{y})\leq 0\},

  3. 3.

    Positive crossing regions: Σ+={𝐱Σ:F1±(0,𝐲)>0}\Sigma_{+}=\{\mathbf{x}\in\Sigma:F^{\pm}_{1}(0,\mathbf{y})>0\},

  4. 4.

    Negative crossing regions: Σ={𝐱Σ:F1±(0,𝐲)<0}\Sigma_{-}=\{\mathbf{x}\in\Sigma:F^{\pm}_{1}(0,\mathbf{y})<0\},

where F1±{F}^{\pm}_{1} is the first component of 𝐅±\mathbf{F}^{\pm}.

When solutions may slide along the switching manifold Σ,\Sigma, we impose a flow using the Filippov convex method [53, 51]. That is, we define the sliding flow as a convex combination of 𝐅±,\mathbf{F}^{\pm},

𝐅s(0,𝐲)=λ𝐅+(0,𝐲)+(1λ)𝐅(0,𝐲)\mathbf{F}^{s}(0,\mathbf{y})=\lambda\mathbf{F}^{+}(0,\mathbf{y})+(1-\lambda)\mathbf{F}^{-}(0,\mathbf{y}) (5)

with

λ(𝐲)=F1(0,𝐲)F1(0,𝐲)F1+(0,𝐲)[0,1].\lambda(\mathbf{y})=\frac{F_{1}^{-}(0,\mathbf{y})}{F_{1}^{-}(0,\mathbf{y})-F_{1}^{+}(0,\mathbf{y})}\in[0,1]. (6)

Notice that Equation (6) fixes λ(𝐲)\lambda(\mathbf{y}). This flow naturally arises in the context of our main result without a priori imposing it; see Theorem 3.8. Also, note that 𝐅s(0,𝐲)\mathbf{F}^{s}(0,\mathbf{y}) may vanish for some point (0,𝐲)Σ(0,\mathbf{y})\in\Sigma. Since it is not an equilibrium of the smooth flow, we say that such a point 𝐱s=(0,𝐲s)Σ\mathbf{x}_{s}=(0,\mathbf{y}_{s})\in\Sigma is a pseudoequilibrium if it is an equilibrium of the sliding flow; i.e., for some λ(0,1),\lambda\in(0,1), 𝐅s(0,𝐲s)=𝟎\mathbf{F}^{s}(0,\mathbf{y}_{s})=\mathbf{0}.

Filippov’s convex combination interprets solutions of System (3),(4) as a continuous curve 𝐱(t)\mathbf{x}(t) satisfying the following conditions:

{𝐱˙(t)=𝐅(𝐱),𝐱(0)=𝐱0,\begin{cases}\dot{\mathbf{x}}(t)=\mathbf{F}(\mathbf{x}),\\ \mathbf{x}(0)=\mathbf{x}_{0},\end{cases} (7)

where 𝐅:nn\mathbf{F}:{\mathbb{R}}^{n}\mapsto{\mathbb{R}}^{n} is defined as

𝐅(𝐱)={𝐅+(𝐱),𝐱S+Σ+,𝐅(𝐱),𝐱SΣ,𝐅s(𝐱),𝐱ΣAΣR.\mathbf{F}(\mathbf{x})=\begin{cases}\mathbf{F}^{+}(\mathbf{x}),&\mathbf{x}\in S_{+}\cup\Sigma_{+},\\ \mathbf{F}^{-}(\mathbf{x}),&\mathbf{x}\in S_{-}\cup\Sigma_{-},\\ \mathbf{F}^{s}(\mathbf{x}),&\mathbf{x}\in\Sigma_{A}\cup\Sigma_{R}.\end{cases} (8)

That is, for points in time in which a solution curve 𝐱(t)\mathbf{x}(t) intersects Σ\Sigma it will either cross Σ,\Sigma, tracking the flow of either 𝐅+\mathbf{F}^{+} or 𝐅\mathbf{F}^{-} depending on whether the curve is crossing into S+S_{+} or SS_{-}, or it will track Σ\Sigma in the sliding regions until entering a crossing region. Note, implicit in this definition is that solution curves are differentiable everywhere except on Σ\Sigma.

To provide a brief illustration, Figure 1(a)-(c) shows some possible dynamics across Σ\Sigma in a two-dimensional system, with (a) Σ+\Sigma_{+}, (b) Σ±\Sigma_{\pm} and ΣA\Sigma_{A}, and (c) Σ±\Sigma_{\pm} and ΣR\Sigma_{R}. In (a) the flow traverses across Σ\Sigma continuously but not smoothly; in (b) and (c) the flow may also remain on Σ,\Sigma, in intervals indicated by the blue line (for ΣA\Sigma_{A}) and the green line (for ΣR\Sigma_{R}). Note that the flow leaving a repelling sliding region is non-unique in forward time, while the flow leaving an attracting sliding region is non-unique in backward time. In addition, complex dynamics may appear on Σ\Sigma itself, depending on the manner in which the imposed vector field on Σ\Sigma transitions between S+S_{+} and SS_{-}. Figure 1(d) shows one example of a path traversing from a stable equilibrium in S+S_{+}, through the switching manifold in an attracting sliding region using dynamics imposed using Filippov’s convex combination [53], then traversing to a stable equilibrium in SS_{-}. The system in this illustration is a piecewise-smooth version of the Lorenz 63 model,

(x˙y˙z˙)=(σ(yy0x+x0)(xx0)(ρz+z0)y+y0(xx0)(yy0)β(zz0)),x<0, and(x˙y˙z˙)=(σ+(yx)x(ρ+z)yxyβ+z),x>0,\begin{pmatrix}\dot{x}\\ \dot{y}\\ \dot{z}\end{pmatrix}=\begin{pmatrix}\sigma_{-}\left(y-y_{0}-x+x_{0}\right)\\ (x-x_{0})(\rho_{-}-z+z_{0})-y+y_{0}\\ (x-x_{0})(y-y_{0})-\beta_{-}(z-z_{0})\end{pmatrix},\;x<0,\mbox{ and}\;\begin{pmatrix}\dot{x}\\ \dot{y}\\ \dot{z}\end{pmatrix}=\begin{pmatrix}\sigma_{+}\left(y-x\right)\\ x(\rho_{+}-z)-y\\ xy-\beta_{+}z\end{pmatrix},\;x>0, (9)

originally studied in [56]. Here, σ±,ρ±,β±>0\sigma_{\pm},\rho_{\pm},\beta_{\pm}>0 are parameters and x0,y0,z0x_{0},y_{0},z_{0}\in{\mathbb{R}}. For this example, S±S_{\pm} and Σ\Sigma are defined as previously, with n=3n=3.

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Figure 1: (a)-(c) Phase plane of a generic 𝐅\mathbf{F} with n=2n=2, with (a) a crossing region, (b) crossing and attracting sliding, and (c) crossing and repelling sliding. (d) Phase diagram of a piecewise-smooth version of the Lorenz 63 model, System (9). Red points are equilibria of S±,S_{\pm}, and the black point is a pseudoequilibrium of the imposed flow on Σ\Sigma. The sliding region, ΣA,\Sigma_{A}, is shaded blue. The black curve connects the two equilibria, going through the pseudoequilibrium at the origin and sliding in ΣA\Sigma_{A}. The blue curve is the sliding trajectory. Parameters used are σ+=10\sigma_{+}=10, β+=2\beta_{+}=2, ρ±=2\rho_{\pm}=2, σ=11\sigma_{-}=11, β=3\beta_{-}=3, x0=1x_{0}=1, y0=1y_{0}=-1, and z0=0z_{0}=0.

1.4 Outline of analysis

For our problem, noise-induced transitions must cross Σ\Sigma and the Freidlin-Wentzell rate functional, Equation (1), must be modified to account for the discontinuity in 𝐅\mathbf{F} and possible discontinuity-induced dynamics. For example, a most probable path that reaches a repelling sliding region ΣR\Sigma_{R} leaves it non-uniquely due to the non-uniqueness of the flow, as illustrated in the case studies in Sections 4 and 5. Additionally, if one were to naively piece together the most probable paths of the smooth regions and neglect possible contribution from the manner in which the path crosses the switching manifold, this may lead to paths that do not reflect a global minimum of the rate functional.

We resolve these issues by smoothing out 𝐅\mathbf{F} via mollification with a compactly supported, smooth, radially symmetric kernel of characteristic width ε\varepsilon, and consider the sequence of minimizers to the rate functional Equation (1) as ε0\varepsilon\rightarrow 0. We show for any piecewise-smooth system of the form given by System (2),(3) that the most probable path minimizes Equation (1) in the smooth regions S±S_{\pm}, plus an additional functional whose contribution represents time spent sliding in a crossing region of the switching manifold. For 𝐅=(F1,𝐆)\mathbf{F}=(F_{1},\mathbf{G}) such that F1:nΣF_{1}:{\mathbb{R}}^{n}\setminus\Sigma\mapsto{\mathbb{R}} and 𝐆:nΣn1,\mathbf{G}:{\mathbb{R}}^{n}\setminus\Sigma\mapsto{\mathbb{R}}^{n-1}, and 𝜶=(α,𝜷)𝒜(t0,tf)\bm{\alpha}=(\alpha,\bm{\beta})\in{\mathcal{A}^{(t_{0},t_{f})}} and 0<ε10<\varepsilon\ll 1, the appropriate rate functional with both contributions is

I¯(t0,tf)[𝜶]=\displaystyle{\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}]= {tS+S}𝜶˙(t)𝐅(𝜶(t))2𝑑t\displaystyle\int_{\{t\in S_{+}\cup S_{-}\}}\|\dot{\bm{\alpha}}(t)-\mathbf{F}(\bm{\alpha}(t))\|^{2}\,dt
+{tΣ±}minλ[0,1]{[λF1+(0,𝜷)+(1λ)F1(0,𝜷)]2\displaystyle+\int_{\{t\in\Sigma_{\pm}\}}\min_{\lambda\in[0,1]}\left\{\left[\lambda F_{1}^{+}(0,\bm{\beta})+(1-\lambda)F_{1}^{-}(0,\bm{\beta})\right]^{2}\right.
+|𝜷˙λ𝐆+(0,𝜷)(1λ)𝐆(0,𝜷)|2}dt.\displaystyle\left.\mathmakebox[\widthof{$ +\int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]}\min\qquad$}][l]{}+\left|\dot{\bm{\beta}}-\lambda\mathbf{G}^{+}(0,\bm{\beta})-(1-\lambda)\mathbf{G}^{-}(0,\bm{\beta})\right|^{2}\right\}\,dt.

We make the following observations about this result:

  1. 1.

    The form of the above rate functional is independent of the chosen mollifier.

  2. 2.

    When there is a sliding region, the most probable path may track the Filippov dynamics; that is, no additional contribution is made during times where the most probable path slides via the imposed flow on the switching manifold.

We rigorously prove the above result using a technique from the calculus of variations called Γ\Gamma-convergence to show that I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} is the Γ\Gamma-limit of the sequence of rate functionals for the mollified system [57, 58]. The Γ\Gamma-limit is the natural notion of a limiting functional in this context in the sense that minimizers of the rate functional with the mollified drift converge (weakly) to a minimizer of I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} in the limit as ε0\varepsilon\rightarrow 0. While Γ\Gamma-convergence has been used to study the convergence of minimizers of the Onsager-Machlup functional in the small noise limit [59, 60, 61, 62, 63], to our knowledge we are the first to use Γ\Gamma-convergence to compute an appropriate extension of the Freidlin-Wentzell rate functional for piecewise smooth systems.

The utility of the derived rate functional is demonstrated through two case studies. Each case study also presents distinct phenomena in their most probable transition paths that are not possible in systems with smooth drift. The first case study analyzes the most probable transition path in a two-dimensional piecewise-linear system, a simple demonstrative case in which the typical rate functional formulation of the most probable path breaks down. Notably, in this case the most probable path follows the switching manifold along an attracting sliding region, but it does not follow the switching manifold along a repelling sliding region.

The second case study analyzes the most probable path in a simple one-dimensional periodically-forced piecewise-smooth system, constructed similarly to some conceptual models for Arctic energy balance [5, 64]. A significant portion of the analysis of the deterministic dynamics and the Monte Carlo simulations were first derived in Zanetell’s Master’s thesis [65]. Here we again observe the breakdown of the typical formulation of the most probable path. Additionally, we observe an emergent dynamical phenomenon, which we describe as noise-induced sliding, in which the most probable path follows the switching manifold in a crossing region, despite the energy cost of the associated rate functional.

In both case studies, we note in some regimes the most probable path(s) predicted by our result do not match observed Monte Carlo simulations. Several factors may contribute to this discrepancy, including needing a smaller noise strength σ,\sigma, a higher-order functional in the smoothing parameter ε,\varepsilon, or including the correction term in the Onsager-Machlup functional. Such possibilities will be explored in future work.

The paper is organized as follows: in Section 2 we briefly highlight notation we use and some technical assumptions on 𝐅\mathbf{F}. Then in Section 3 we derive the mollified vector field and use Γ\Gamma-convergence to determine the appropriate limiting functional for calculating the most probable path for stochastic differential equations of the form given by System (2),(3). Next, in Section 4 we provide a simple case study of a planar linear piecewise-smooth system as an illustration of the most probable path calculated in Section 3. Finally, in Section 5 we provide a case study that extends the most probable path derivation to the case of a one-dimensional non-autonomous system.

2 Notation and Assumptions

In this paper we use the following conventions for notation:

  1. 1.

    We use the convention that \|\cdot\| represents the L2L^{2} norm for functions and |||\cdot| represents the 2\ell^{2} norm for vectors in n{\mathbb{R}}^{n}. Furthermore, we use ,\langle\cdot,\cdot\rangle to denote the inner product in 2\ell^{2}.

  2. 2.

    Subscripts refer to the index of a vector unless otherwise indicated. E.g., for 𝐅=(F1,F2,,Fn),\mathbf{F}=(F_{1},\,F_{2},\,\ldots,\,F_{n}), we write the components as Fj,F_{j}, j=1,,nj=1,\,\ldots,\,n.

  3. 3.

    Unless noted otherwise, 𝐅\mathbf{F} may be non-autonomous. Although our analysis applies to non-autonomous systems, for simplicity of presentation we suppress the time dependence.

  4. 4.

    Since we assume that the underlying drift is only piecewise-smooth in the first component, we use a convenient shorthand that separates this first component from the remaining ones. Namely, we write 𝐱=(x,𝐲)n,\mathbf{x}=(x,\mathbf{y})\in{\mathbb{R}}^{n}, where xx\in{\mathbb{R}} and 𝐲n1,\mathbf{y}\in{\mathbb{R}}^{n-1}, and 𝐅=(F1,𝐆),\mathbf{F}=(F_{1},\mathbf{G}), where F1:n{x=0}F_{1}:{\mathbb{R}}^{n}\setminus\{x=0\}\mapsto{\mathbb{R}} and 𝐆:n{x=0}n1\mathbf{G}:{\mathbb{R}}^{n}\setminus\{x=0\}\mapsto{\mathbb{R}}^{n-1}.

  5. 5.

    For the path 𝜶𝒜(t0,tf),\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}}, we separate the first component from the remaining ones as 𝜶=(α,𝜷),\bm{\alpha}=\left(\alpha,\bm{\beta}\right), so that αH1([t0,tf];),\alpha\in H^{1}\left([t_{0},t_{f}];\mathbb{R}\right), α(t0)=x0\alpha(t_{0})=x_{0}, and α(tf)=xf,\alpha(t_{f})=x_{f}, where 𝐱0,f=(x0,f,𝐲0,f)\mathbf{x}_{0,f}=\left(x_{0,f},\mathbf{y}_{0,f}\right). Similarly, 𝜷H1([t0,tf];n1),\bm{\beta}\in H^{1}\left([t_{0},t_{f}];\mathbb{R}^{n-1}\right), so that 𝜷(t0)=𝐲0\bm{\beta}(t_{0})=\mathbf{y}_{0} and 𝜷(tf)=𝐲f\bm{\beta}(t_{f})=\mathbf{y}_{f}.

  6. 6.

    We denote the Jacobian of a function 𝐅(x,𝐲)\mathbf{F}(x,\mathbf{y}) as 𝐅(x,𝐲)\nabla\mathbf{F}(x,\mathbf{y}). For the Jacobian comprised of only the partial derivatives with respect to the components of 𝐲,\mathbf{y}, we write 𝐲𝐅(x,𝐲)\nabla_{\mathbf{y}}\mathbf{F}(x,\mathbf{y}).

  7. 7.

    We indicate that a variable is being treated as fixed or pre-determined by separating it with a semicolon. E.g., if the value of yy has been set, then 𝐅(x;y)\mathbf{F}(x;\,y) is only a function of xx.

We also make the following assumptions for each smooth vector field 𝐅±.\mathbf{F}^{\pm}. We will use these assumptions to prove two lemmas for the mollified vector field 𝐅ε{\mathbf{F}^{\varepsilon}} in Section 3 which are necessary to establish existence of minimizers to the FW functional. Note, these assumptions also apply in the non-autonomous case but the explicit dependence on time in the argument of F is suppressed.

Assumptions.

Let 𝐅±:n{x=0}n\mathbf{F}^{\pm}:{\mathbb{R}}^{n}\setminus\{x=0\}\mapsto{\mathbb{R}}^{n} be smooth vector fields. We assume the following properties about 𝐅±\mathbf{F}^{\pm}.

  1. 1.

    (Growth conditions) There exist R1,c1,c3,c5>0R_{1},c_{1},c_{3},c_{5}>0 and c2,c4,c6c_{2},c_{4},c_{6}\in\mathbb{R} such that c5<c1c_{5}<c_{1} and for some 1<p<1<p<\infty, |𝐱|>R1|\mathbf{x}|>R_{1} implies

    {|𝐅(𝐱)|c1|𝐱|p+c2,if x<0,|𝐅+(𝐱)|c1|𝐱|p+c2,if x>0,\begin{cases}|\mathbf{F}^{-}(\mathbf{x})|\geq c_{1}|\mathbf{x}|^{p}+c_{2},&\text{if }x<0,\\ |\mathbf{F}^{+}(\mathbf{x})|\geq c_{1}|\mathbf{x}|^{p}+c_{2},&\text{if }x>0,\end{cases} (10)

    and

    {|𝐅x(𝐱)|c3|𝐱|p+c4,if x<0,|𝐅+x(𝐱)|c3|𝐱|p+c4,if x>0,\begin{cases}\left|\frac{\partial\mathbf{F}^{-}}{\partial x}(\mathbf{x})\right|\leq c_{3}|\mathbf{x}|^{p}+c_{4},&\text{if }x<0,\\ \left|\frac{\partial\mathbf{F}^{+}}{\partial x}(\mathbf{x})\right|\leq c_{3}|\mathbf{x}|^{p}+c_{4},&\text{if }x>0,\end{cases} (11)

    and

    |F+(0,𝐲)F(0,𝐲)|c5|𝐲|p+c6.|F^{+}(0,\mathbf{y})-F^{-}(0,\mathbf{y})|\leq c_{5}|\mathbf{y}|^{p}+c_{6}. (12)
  2. 2.

    (Asymptotically inward-flowing) There exist R2,c7>0R_{2},c_{7}>0 such that |𝐱|>R2|\mathbf{x}|>R_{2} implies 𝐅±(𝐱)0\mathbf{F}^{\pm}(\mathbf{x})\neq 0 and

    {𝐅(𝐱),𝐫(𝐱)<c7|𝐅(𝐱)|,if x0,𝐅+(𝐱),𝐫(𝐱)<c7|𝐅+(𝐱)|,if x0,\begin{cases}\langle\mathbf{F}^{-}(\mathbf{x}),\mathbf{r(\mathbf{x}})\rangle<-c_{7}|\mathbf{F}^{-}(\mathbf{x})|,&\text{if }x\leq 0,\\ \langle\mathbf{F}^{+}(\mathbf{x}),\mathbf{r}(\mathbf{x})\rangle<-c_{7}|\mathbf{F}^{+}(\mathbf{x})|,&\text{if }x\geq 0,\end{cases} (13)

    where 𝐫(𝐱)=𝐱/|𝐱|\mathbf{r}(\mathbf{x})=\mathbf{x}/|\mathbf{x}| is the normalized outward-pointing radial vector at 𝐱\mathbf{x}.

3 Mollified Freidlin-Wentzell rate functional

As discussed in Section 1, the standard form of the Freidlin-Wentzell rate functional cannot be directly used to calculate most probable paths in piecewise-smooth systems. In this section, we seek to ameliorate this deficiency by studying the convergence of the minimizer of the Freidlin-Wentzell functional, where we approximate Equation (3) as a smooth vector field obtained by mollification in the xx-direction by a smooth, compactly supported, symmetric kernel. Specifically, by smoothing 𝐅\mathbf{F} via mollification in xx over a region of width 2ε2\varepsilon, we obtain a sequence of functionals Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} and a corresponding sequence of minimizers 𝜶ε𝒜(t0,tf)\bm{\alpha}_{\varepsilon}\in{\mathcal{A}^{(t_{0},t_{f})}}, which we prove converge — up to a subsequence — to a minimizer of a limiting functional I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} in the limit as ε0\varepsilon\rightarrow 0.

3.1 Mollified functional and existence of a minimum

We begin this section by recalling the definition of a (Friedrichs) mollifier, following the presentation in [66]. Let ζ:\zeta:{\mathbb{R}}\mapsto{\mathbb{R}} be a smooth even bump function, defined by

ζ(x)={Ch(x),|x|<1,0,|x|1,\zeta(x)=\begin{cases}Ch(x),&|x|<1,\\ 0,&|x|\geq 1,\end{cases} (14)

where h:h:\mathbb{R}\rightarrow\mathbb{R} is a smooth function satisfying h(x)0h(x)\geq 0, h(1)=h(1)=0h(-1)=h(1)=0 and all of its derivatives vanish at x=1x=-1 and x=1x=1, e.g., h(x)=exp(1/(|x|21))h(x)=\exp\left(1/\left(|x|^{2}-1\right)\right). Additionally, C>0C>0 is determined so that ζ(x)𝑑x=1\int_{-\infty}^{\infty}\zeta(x)\,dx=1; i.e., C=1/h(x)𝑑xC=1/\int_{-\infty}^{\infty}\,h(x)dx. For each ε>0,\varepsilon>0, we define a sequence of functions ζε(x)=ζ(x/ε)/ε\zeta_{\varepsilon}(x)=\zeta\left(x/\varepsilon\right)/\varepsilon which satisfy εεζε(x)𝑑x=1\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(x)\,dx=1 and are compactly supported on [ε,ε][-\varepsilon,\varepsilon]. The mollification Fε\textbf{F}^{\varepsilon} of F in xx is then defined by convolution with ζε\zeta_{\varepsilon} in the xx-direction:

Fε(x,𝐲)=ζε(x)F(x,𝐲)=\displaystyle\textbf{F}^{\varepsilon}(x,\mathbf{y})=\zeta_{\varepsilon}(x)*\textbf{F}(x,\mathbf{y})= 0ζε(xs)𝐅(s,𝐲)𝑑s+0ζε(xs)𝐅+(s,𝐲)𝑑s,\displaystyle\int_{-\infty}^{0}\zeta_{\varepsilon}(x-s)\mathbf{F}^{-}(s,\mathbf{y})\,ds+\int_{0}^{\infty}\zeta_{\varepsilon}(x-s)\mathbf{F}^{+}(s,\mathbf{y})\,ds, (15)
=\displaystyle= {εεζε(u)𝐅(xu,𝐲)𝑑u,xε,εxζε(u)𝐅+(xu,𝐲)𝑑u+xεζε(u)𝐅(xu,𝐲)𝑑u,|x|<ε,εεζε(u)𝐅+(xu,𝐲)𝑑u,xε,\displaystyle\begin{cases}\displaystyle\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\mathbf{F}^{-}(x-u,\mathbf{y})\,du,&x\leq-\varepsilon,\\ \displaystyle\int_{-\varepsilon}^{x}\zeta_{\varepsilon}(u)\mathbf{F}^{+}(x-u,\mathbf{y})\,du+\int_{x}^{\varepsilon}\zeta_{\varepsilon}(u)\mathbf{F}^{-}(x-u,\mathbf{y})\,du,&|x|<\varepsilon,\\ \displaystyle\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\mathbf{F}^{+}(x-u,\mathbf{y})\,du,&x\geq\varepsilon,\\ \end{cases}
=\displaystyle= εεζε(u)(𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u))𝑑u,\displaystyle\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u)\right)du,

where 𝟙A\mathbbm{1}_{A} is the indicator function on the set AA. The mollified Freidlin-Wentzell rate functional Iε(t0,tf):𝒜(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}}:{\mathcal{A}^{(t_{0},t_{f})}}\mapsto\mathbb{R} is then defined by:

Iε(t0,tf)[𝜶]=t0tf|𝜶˙𝐅ε(𝜶)|2𝑑t.{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}]=\int_{t_{0}}^{t_{f}}\left|\dot{\bm{\alpha}}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha})\right|^{2}\,dt. (16)

Note, by construction, 𝐅ε(𝐱){\mathbf{F}^{\varepsilon}}(\mathbf{x}) is a smooth function that converges pointwise to 𝐅(𝐱)\mathbf{F}(\mathbf{x}) as ε0\varepsilon\rightarrow 0 except at x=0x=0 [66]. However, since 𝐅(𝐱)\mathbf{F}(\mathbf{x}) is not continuous this convergence is necessarily not uniform and thus it is not a priori clear which properties of 𝐅\mathbf{F} are preserved under mollification. The following lemmas ensure that 𝐅ε{\mathbf{F}^{\varepsilon}} also has pp-growth and is asymptotically inward-flowing. These properties are crucial to establishing the existence of a minimum for Iε(t0,tf),{I_{\varepsilon}^{(t_{0},t_{f})}}, but to streamline our main results, the proofs of these lemmas are in Appendix A.

Lemma 3.1 (pp-growth).

There exists ε\varepsilon^{*} such that for all ε>0\varepsilon>0 satisfying ε<ε\varepsilon<\varepsilon^{*}, there exist R1ε,c1ε>0R^{\varepsilon}_{1},c_{1}^{\varepsilon}>0 and c2εc_{2}^{\varepsilon}\in\mathbb{R} such that |𝐱|>R1ε|\mathbf{x}|>R_{1}^{\varepsilon} implies

|𝐅ε(𝐱)|c1ε|𝐱|p+c2ε|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|\geq c_{1}^{\varepsilon}|\mathbf{x}|^{p}+c_{2}^{\varepsilon}

for some p>1p>1.

Lemma 3.2 (Asymptotically inward-flowing).

For all ε>0,\varepsilon>0, there exist R2ε,c7ε>0R_{2}^{\varepsilon},c_{7}^{\varepsilon}>0 such that |𝐱|>R2ε|\mathbf{x}|>R_{2}^{\varepsilon} implies 𝐅ε(𝐱)0{\mathbf{F}^{\varepsilon}}(\mathbf{x})\neq 0 and

𝐅ε(𝐱),𝐫(𝐱)<c7ε|𝐅ε(𝐱)|,\langle{\mathbf{F}^{\varepsilon}}(\mathbf{x}),\mathbf{r}(\mathbf{x})\rangle<-c_{7}^{\varepsilon}|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|,

where 𝐫(𝐱)=𝐱/|𝐱|\mathbf{r}(\mathbf{x})=\mathbf{x}/|\mathbf{x}| is the normalized outward-pointing radial vector at 𝐱\mathbf{x}.

We now prove the existence of a minimizer of Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} using Tonelli’s direct method of the calculus of variations; see, e.g., [67, 58]. This procedure consists of first proving that a minimizing sequence of Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} is bounded with respect to the H1H^{1} norm and thus has a weakly convergent subsequence in the H1H^{1} topology. Note, it is necessary to work in the H1H^{1} topology since closed and bounded sets are not necessarily compact in the strong topology. Nevertheless, upon passing to a subsequence, we can establish a limit point for the minimizing sequence. Finally, by proving that Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} is lower semi-continuous with respect to weak convergence in the H1H^{1} topology, it follows that limit points of the minimizing sequence are indeed minimizers.

To begin the procedure for the direct method of the calculus of variations outlined in the previous paragraph, we first recall the definition of weak convergence in the H1H^{1} topology in the context of our problem.

Definition 3.3 (Weak convergence in H1H^{1} [68]).

A sequence {𝛂m}m=1H1([t0,tf];n)\{\bm{\alpha}_{m}\}_{m=1}^{\infty}\subset H^{1}([t_{0},t_{f}];\mathbb{R}^{n}) converges weakly to 𝛂H1([t0,tf];n)\bm{\alpha}^{*}\in H^{1}([t_{0},t_{f}];\mathbb{R}^{n}), written

𝜶m𝜶inH1([t0,tf];n)or𝜶mH1𝜶,\bm{\alpha}_{m}\rightharpoonup\bm{\alpha}^{*}\;\;\mbox{in}\;\;H^{1}([t_{0},t_{f}];\mathbb{R}^{n})\qquad\mbox{or}\qquad\bm{\alpha}_{m}\,\overset{H^{1}}{\rightharpoonup}\,\bm{\alpha}^{*},

provided 𝛂m𝛂\bm{\alpha}_{m}\rightharpoonup\bm{\alpha}^{*} in L2([t0,tf];n)L^{2}([t_{0},t_{f}];\mathbb{R}^{n}) and 𝛂˙m𝛂˙\dot{\bm{\alpha}}_{m}\rightharpoonup\dot{\bm{\alpha}}^{*} in L2([t0,tf];n)L^{2}([t_{0},t_{f}];\mathbb{R}^{n}). That is, for all 𝛚L2([t0,tf];n)\bm{\omega}\in L^{2}([t_{0},t_{f}];\mathbb{R}^{n}),

limmt0tf𝜶m,𝝎𝑑t=t0tf𝜶,𝝎𝑑tandlimmt0tf𝜶˙m,𝝎𝑑t=t0tf𝜶˙,𝝎𝑑t.\lim_{m\rightarrow\infty}\int_{t_{0}}^{t_{f}}\langle\bm{\alpha}_{m},\bm{\omega}\rangle\,dt=\int_{t_{0}}^{t_{f}}\langle\bm{\alpha}^{*},\bm{\omega}\rangle\,dt\qquad\mbox{and}\qquad\lim_{m\rightarrow\infty}\int_{t_{0}}^{t_{f}}\langle\dot{\bm{\alpha}}_{m},\bm{\omega}\rangle\,dt=\int_{t_{0}}^{t_{f}}\langle\dot{\bm{\alpha}}^{*},\bm{\omega}\rangle\,dt.

The following lemma establishes that boundedness of Iε(t0,tf)[𝜶m]{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m}] implies boundedness of 𝜶m\bm{\alpha}_{m} with respect to the H1H^{1} norm.

Lemma 3.4.

There exists ε>0\varepsilon^{*}>0 such that if ε<ε\varepsilon<\varepsilon^{*} and if 𝛂m𝒜\bm{\alpha}_{m}\in\mathcal{A} satisfies Iε(t0,tf)[𝛂m]<M1{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m}]<M_{1} for some M1>0M_{1}>0, then there exists M2>0M_{2}>0 such that 𝛂mH1<M2\|\bm{\alpha}_{m}\|_{H^{1}}<M_{2}.

Proof.

For contradiction, suppose 𝜶m\bm{\alpha}_{m} is unbounded with respect to the H1H^{1} norm. Therefore, there exists a subsequence 𝜶mk\bm{\alpha}_{m_{k}} such that 𝜶mkH1\|\bm{\alpha}_{m_{k}}\|_{H^{1}}\rightarrow\infty.

1. Suppose there exists M>0M>0 such that 𝜶mk<M\|\bm{\alpha}_{m_{k}}\|_{\infty}<M. Consequently, since 𝐅ε{\mathbf{F}^{\varepsilon}} is smooth there exists C1>0C_{1}>0 such that |𝐅ε(𝜶mk)|<C1|{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})|<C_{1}. By the Cauchy-Schwarz inequality,

Iε(t0,tf)\displaystyle{I_{\varepsilon}^{(t_{0},t_{f})}} =t0tf[|𝜶˙mk|22𝜶mk,𝐅ε(𝜶mk)+|𝐅ε(𝜶mk)|2]𝑑t\displaystyle=\int_{t_{0}}^{t_{f}}\left[|\dot{\bm{\alpha}}_{m_{k}}|^{2}-2\langle\bm{\alpha}_{m_{k}},{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})\rangle+|{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})|^{2}\right]\,dt
t0tf[|𝜶˙mk|22𝜶mk,𝐅ε(𝜶mk)]𝑑t\displaystyle\geq\int_{t_{0}}^{t_{f}}\left[|\dot{\bm{\alpha}}_{m_{k}}|^{2}-2\langle\bm{\alpha}_{m_{k}},{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})\rangle\right]\,dt
𝜶˙mk22C1(tft0)12𝜶˙mk.\displaystyle\geq\|\dot{\bm{\alpha}}_{m_{k}}\|^{2}-2C_{1}(t_{f}-t_{0})^{\frac{1}{2}}\|\dot{\bm{\alpha}}_{m_{k}}\|.

Therefore, since 𝜶mkH1\|\bm{\alpha}_{m_{k}}\|_{H^{1}}\rightarrow\infty, it follows from Poincare’s inequality that 𝜶˙mk\|\dot{\bm{\alpha}}_{m_{k}}\|\rightarrow\infty. Thus, Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}}\rightarrow\infty, which is a contradiction.

2. Suppose 𝜶mk\|\bm{\alpha}_{m_{k}}\|_{\infty} is unbounded. Upon passing to another subsequence which we also label 𝜶mk\bm{\alpha}_{m_{k}}, it follows that 𝜶mk\|\bm{\alpha}_{m_{k}}\|_{\infty}\rightarrow\infty. Applying the elementary inequality a2+b22aba^{2}+b^{2}\geq 2ab, it follows that

Iε(t0,tf)[𝜶mk]\displaystyle{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m_{k}}] 2t0tf[|𝜶˙mk||𝐅ε(𝜶mk)|𝜶m,𝐅ε(𝜶mk)]𝑑t\displaystyle\geq 2\int_{t_{0}}^{t_{f}}\left[|\dot{\bm{\alpha}}_{m_{k}}||{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})|-\langle\bm{\alpha}_{m},{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})\rangle\right]dt
=2t0tf[|𝜶˙mk||𝐅ε(𝜶mk)|sin2(θ2)]𝑑t,\displaystyle=2\int_{t_{0}}^{t_{f}}\left[|\dot{\bm{\alpha}}_{m_{k}}||{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})|\sin^{2}\left(\frac{\theta}{2}\right)\right]dt,

where θ(t)\theta(t) is the angle between 𝜶˙mk(t)\dot{\bm{\alpha}}_{m_{k}}(t) and 𝐅ε(𝜶mk(t)){\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}}(t)).

We now choose ε,R1ε\varepsilon^{*},R_{1}^{\varepsilon} as in Lemma 3.1 and R2εR_{2}^{\varepsilon} as in Lemma 3.2, set Rε=max{R1ε,R2ε}R^{\varepsilon}=\max\{R_{1}^{\varepsilon},R_{2}^{\varepsilon}\}, and let Ak={t:|𝜶mk(t)|>RεA_{k}=\{t:|\bm{\alpha}_{m_{k}}(t)|>R^{\varepsilon} and ddt|𝜶mk(t)|0}\frac{d}{dt}|\bm{\alpha}_{m_{k}}(t)|\geq 0\}. It follows from Lemmas 3.1 and 3.2 that there exists a constant C2>0C_{2}>0 such that

Iε(t0,tf)[𝜶mk]\displaystyle{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m_{k}}] C2Ak|𝜶˙mk||𝐅ε(𝜶mk)|𝑑t\displaystyle\geq C_{2}\int_{A_{k}}|\dot{\bm{\alpha}}_{m_{k}}||{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{m_{k}})|dt
C2Ak|𝜶˙mk||𝜶mk|p𝑑t\displaystyle\geq C_{2}\int_{A_{k}}|\dot{\bm{\alpha}}_{m_{k}}||\bm{\alpha}_{m_{k}}|^{p}\,dt
=C2Ak|𝜶˙mk||𝜶mk||𝜶mk|p1𝑑t.\displaystyle=C_{2}\int_{A_{k}}|\dot{\bm{\alpha}}_{m_{k}}||\bm{\alpha}_{m_{k}}||\bm{\alpha}_{m_{k}}|^{p-1}\,dt.

Applying the Cauchy-Schwarz inequality and the fundamental theorem of calculus, we obtain

Iε(t0,tf)[𝜶mk]\displaystyle{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m_{k}}] C2Ak𝜶˙mk,𝜶mk|𝜶|p1𝑑t\displaystyle\geq C_{2}\int_{A_{k}}\langle\dot{\bm{\alpha}}_{m_{k}},\bm{\alpha}_{m_{k}}\rangle|\bm{\alpha}|^{p-1}dt
=C2Akddt|𝜶|p+1𝑑t\displaystyle=C_{2}\int_{A_{k}}\frac{d}{dt}|\bm{\alpha}|^{p+1}dt
=C2(maxt[t0,tf]|𝜶mk|p+1(Rε)p+1).\displaystyle=C_{2}\left(\max_{t\in[t_{0},t_{f}]}|\bm{\alpha}_{m_{k}}|^{p+1}-(R^{\varepsilon})^{p+1}\right).

Since 𝜶mk\|\bm{\alpha}_{m_{k}}\|_{\infty}\rightarrow\infty implies maxt[t0,tf]|𝜶mk(t)|\max_{t\in[t_{0},t_{f}]}|\bm{\alpha}_{m_{k}}(t)|\rightarrow\infty, it follows that Iε(t0,tf)[𝜶mk]{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m_{k}}]\rightarrow\infty, which is a contradiction. It follows from items 1 and 2 that 𝜶m\bm{\alpha}_{m} is bounded with respect to the H1H^{1} norm. ∎

Finally, we use the direct method of the calculus of variations to prove the existence of a minimum.

Theorem 3.5 (Existence of minimizer of the mollified functional).

There exists an ε>0\varepsilon^{*}>0 such that if ε<ε\varepsilon<\varepsilon^{*}, then there exists an 𝛂ε𝒜(t0,tf){\bm{\alpha}}^{*}_{\varepsilon}\in{\mathcal{A}^{(t_{0},t_{f})}} such that for all 𝛂𝒜(t0,tf)\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}},

Iε(t0,tf)[𝜶ε]Iε(t0,tf)[𝜶].{I_{\varepsilon}^{(t_{0},t_{f})}}[{\bm{\alpha}}^{*}_{\varepsilon}]\leq{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}].
Proof.

Let ε>0,\varepsilon>0, and let 𝜶m\bm{\alpha}_{m} be a minimizing sequence of Iε(t0,tf)[𝜶]{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}]; i.e.,

limmIε(t0,tf)[𝜶m]=inf𝜶𝒜(t0,tf)Iε(t0,tf)[𝜶].\lim_{m\rightarrow\infty}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m}]=\inf_{\bm{\alpha}\in\mathcal{A}^{(t_{0},t_{f})}}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}].

Consequently, there exists M1>0M_{1}>0 such that Iε(t0,tf)[𝜶m]<M1{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m}]<M_{1} and thus by Lemma 3.4 there exists M2>0M_{2}>0 such that 𝜶mH1<M2\|\bm{\alpha}_{m}\|_{H^{1}}<M_{2}. Thus, by the Banach–Alaoglu theorem and the Rellich-Kondrachov theorem there exists 𝜶ε𝒜(t0,tf)\bm{\alpha}^{*}_{\varepsilon}\in{\mathcal{A}^{(t_{0},t_{f})}} and a subsequence 𝜶mk\bm{\alpha}_{m_{k}} such that 𝜶˙mkL2𝜶˙ε\dot{\bm{\alpha}}_{m_{k}}\overset{L^{2}}{\rightharpoonup}\dot{\bm{\alpha}}^{*}_{\varepsilon} and 𝜶mkL𝜶ε\bm{\alpha}_{m_{k}}\overset{L^{\infty}}{\rightarrow}{\bm{\alpha}}^{*}_{\varepsilon} as kk\rightarrow\infty [69]. Since the integrand appearing in Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} is convex with respect to 𝜶˙,\dot{\bm{\alpha}}, it follows that Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} is lower semicontinuous with respect to weak convergence [68], and therefore

Iε(t0,tf)[𝜶ε]lim infkIε(t0,tf)[𝜶mk]=limmIε(t0,tf)[𝜶m]=inf𝜶𝒜(t0,tf)Iε(t0,tf)[𝜶].{I_{\varepsilon}^{(t_{0},t_{f})}}[{\bm{\alpha}}^{*}_{\varepsilon}]\leq\liminf_{k\rightarrow\infty}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m_{k}}]=\lim_{m\rightarrow\infty}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{m}]=\inf_{\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}}}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}].

3.2 Euler-Lagrange equations

Now that the existence of minimizers is established for the Freidlin-Wentzell rate Iε(t0,tf)[𝜶ε]{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}] corresponding to the mollified system 𝐱˙=𝐅ε\dot{\mathbf{x}}={\mathbf{F}^{\varepsilon}}, we can establish a necessary condition for minimizers, which is that 𝜶𝒜Σ(t0,tm)\bm{\alpha}\in{\mathcal{A}_{\Sigma}^{(t_{0},t_{m})}} is a C2C^{2} integral curve of the flow generated by the following system of Euler-Lagrange (EL) equations:

𝐱¨=𝐅tε+(𝐅ε𝐅ε)𝐱˙+𝐅ε𝐅ε,\displaystyle\ddot{\mathbf{x}}=\mathbf{F}^{\varepsilon}_{t}+\left(\nabla{\mathbf{F}^{\varepsilon}}-\nabla{\mathbf{F}^{\varepsilon}}^{\intercal}\right)\dot{\mathbf{x}}+\nabla{\mathbf{F}^{\varepsilon}}^{\intercal}{\mathbf{F}^{\varepsilon}}, (17)

where the subscript tt denotes the partial derivative with respect to time. It is convenient to introduce the scaling z=x/ε,z=x/\varepsilon, z[1,1],z\in[-1,1], so the EL Equations (17) become the fast-slow system

εz¨\displaystyle\varepsilon\ddot{z} =F1,tε+(𝐲F1ε𝐆zε)𝐲˙+𝐆zε𝐆ε,\displaystyle=F^{\varepsilon}_{1,t}+\left(\nabla_{\mathbf{y}}F^{\varepsilon}_{1}-\mathbf{G}^{\varepsilon}_{z}\right)\mathbf{\dot{y}}+\mathbf{G}^{\varepsilon\,\intercal}_{z}\mathbf{G}^{\varepsilon},
𝐲¨\displaystyle\ddot{\mathbf{y}} =𝐆tε+(𝐆zε𝐲F1ε)εz˙+(𝐲𝐆ε𝐲𝐆ε)𝐲˙+𝐲𝐆ε𝐆ε.\displaystyle=\mathbf{G}^{\varepsilon}_{t}+\left(\mathbf{G}^{\varepsilon}_{z}-\nabla_{\mathbf{y}}F^{\varepsilon}_{1}\right)\varepsilon\dot{z}+\left(\nabla_{\mathbf{y}}\mathbf{G}^{\varepsilon}-\nabla_{\mathbf{y}}\mathbf{G}^{\varepsilon\,\intercal}\right)\dot{\mathbf{y}}+\nabla_{\mathbf{y}}\mathbf{G}^{\varepsilon\,\intercal}\mathbf{G}^{\varepsilon}.

Introducing the conjugate momenta φ=εz˙F1ε\varphi=\varepsilon\dot{z}-F^{\varepsilon}_{1} and 𝝍=𝐲˙𝐆ε\bm{\psi}=\dot{\mathbf{y}}-\mathbf{G}^{\varepsilon} leads to the Hamiltonian form,

εz˙\displaystyle\varepsilon\dot{z} =F1ε+φ,\displaystyle=F^{\varepsilon}_{1}+\varphi, (18)
𝐲˙\displaystyle\dot{\mathbf{y}} =𝐆ε+𝝍,\displaystyle=\mathbf{G}^{\varepsilon}+\mathbf{\bm{\psi}},
φ˙\displaystyle\dot{\varphi} =φF1,zε𝝍,𝐆zε,\displaystyle=-\varphi F^{\varepsilon}_{1,z}-\langle\mathbf{\bm{\psi}},\mathbf{G}^{\varepsilon}_{z}\rangle,
𝝍˙\displaystyle\mathbf{\dot{\bm{\psi}}} =φ𝐲F1ε𝝍,𝐲𝐆ε.\displaystyle=-\varphi\nabla_{\mathbf{y}}F^{\varepsilon}_{1}-\langle\mathbf{\bm{\psi}},\nabla_{\mathbf{y}}\mathbf{G}^{\varepsilon}\rangle.

The most probable path of System (2),(3) corresponds to solutions of the EL Equations (18) subject to boundary conditions for appropriate values of tt. In the case studies of Sections 4 and 5, we use the gradient flow to numerically solve System (18) when an explicit solution is not available.

3.3 Limiting functional

In this section, we consider the problem of computing an appropriate limiting functional I¯(t0,tf):𝒜(t0,tf){\overline{I}^{(t_{0},t_{f})}}:{\mathcal{A}^{(t_{0},t_{f})}}\mapsto\mathbb{R}. Theorem 3.5 guarantees the existence of a minimizer of Iε(t0,tf)[𝜶]{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}] for fixed ε\varepsilon and is given by 𝜶ε=argmin𝜶𝒜(t0,tf)Iε(t0,tf)[𝜶(t)].{\bm{\alpha}}^{*}_{\varepsilon}=\operatorname*{arg\,min}_{\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}}}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}(t)]. If we now consider the sequence of minimizers 𝜶ε{\bm{\alpha}}^{*}_{\varepsilon} and suppose there exists 𝜶𝒜(t0,tf)\bm{\alpha}^{*}\in{\mathcal{A}^{(t_{0},t_{f})}} such that 𝜶εH1𝜶\bm{\alpha}^{*}_{\varepsilon}\,\overset{H^{1}}{\rightharpoonup}\,\bm{\alpha}^{*}, an appropriate limiting functional I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} should have the property that 𝜶\bm{\alpha}^{*} minimizes I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}}. Introduced by de Giorgi, the Γ\Gamma-limit is the precise notion of a limiting functional that ensures this property [58, 57]. The following definition of a Γ\Gamma-limit is adapted from [58] for our specific problem.

Definition 3.6 (Γ\Gamma-Convergence [58]).

We say that Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} Γ\Gamma-converges to I¯(t0,tf):𝒜(t0,tf)¯{\overline{I}^{(t_{0},t_{f})}}:{\mathcal{A}^{(t_{0},t_{f})}}\mapsto\overline{{\mathbb{R}}} with respect to H1H^{1} weak convergence if for all 𝛂𝒜(t0,tf)\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}}, we have

  1. 1.

    (liminf inequality) for every sequence 𝜶ε𝒜(t0,tf)\bm{\alpha}_{\varepsilon}\in{\mathcal{A}^{(t_{0},t_{f})}} satisfying 𝜶εH1𝜶\bm{\alpha}_{\varepsilon}\overset{H^{1}}{\rightharpoonup}\bm{\alpha},

    I¯(t0,tf)[𝜶]lim infε0Iε(t0,tf)[𝜶ε];{\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}]\leq\liminf_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}];
  2. 2.

    (recovery sequence) there exists a sequence 𝜶εH1𝜶\bm{\alpha}_{\varepsilon}\overset{H^{1}}{\rightharpoonup}\bm{\alpha} such that

    limε0Iε(t0,tf)[𝜶ε]=I¯(t0,tf)[𝜶].\lim_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}]={\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}].

If Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} Γ\Gamma-converges to I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} we write Γlimε0Iε(t0,tf)=I¯(t0,tf)\displaystyle\Gamma-\lim_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}={\overline{I}^{(t_{0},t_{f})}}.

Showing both the liminf inequality and the existence of a recovery sequence establish that both the limsup and the liminf of Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} over all weakly converergent sequences exists and is equal to I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}}. Indeed, these two conditions ensure that Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} has a common lower bound, the lower bound is optimal, and the limiting functional is precisely this lower bound. Moreover, the definition of Γ\Gamma-convergence is constructed in such a manner that ensures the convergence of minimizers of Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} to the minimizer of I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} if an additional equicoecervity condition is satisfied. The following theorem adapted from [58] ensures this convergence for our specfic problem.

Theorem 3.7.

Suppose Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} Γ\Gamma-converges to I¯(t0,tf):𝒜(t0,tf)¯{\overline{I}^{(t_{0},t_{f})}}:{\mathcal{A}^{(t_{0},t_{f})}}\mapsto\overline{{\mathbb{R}}} as ε0\varepsilon\rightarrow 0 with respect to H1H^{1} weak convergence and suppose for all 𝛂ε\bm{\alpha}^{*}_{\varepsilon} that minimize Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} there exists a constant M>0M>0 such that 𝛂εH1<M\|\bm{\alpha}_{\varepsilon}^{*}\|_{H^{1}}<M. Then there exists an 𝛂𝒜(t0,tf)\bm{\alpha}^{*}\in{\mathcal{A}^{(t_{0},t_{f})}} such that

inf𝜶𝒜(t0,tf)I¯(t0,tf)[𝜶]=I¯(t0,tf)[𝜶].\inf_{\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}}}{\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}]={\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}^{*}].

Moreover, every limit of a subsequence of 𝛂ε\bm{\alpha}^{*}_{\varepsilon} as ε0\varepsilon\rightarrow 0 is a minimizer of I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}}.

One of the challenges with proving a Γ\Gamma-limit is that the target limiting functional must be formed from an educated guess based on analysis of minimizers. In the system of interest, it is natural to expect that the appropriate Γ\Gamma-limit will correspond to the standard FW functional except for intervals of time on which the curve tracks Σ\Sigma. However, for 𝜶𝒜(t0,tf)\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}} satisfying {t:α(t)=0}\{t\in\mathbb{R}:\alpha(t)=0\} has nonzero measure, the recovery sequence must be constructed to minimize the functional for times in which 𝜶(t)Σε={𝐱n:|x|<ε}\bm{\alpha}(t)\in\Sigma^{\varepsilon}=\{\mathbf{x}\in\mathbb{R}^{n}:|x|<\varepsilon\}. To compute the appropriate Γ\Gamma-limit for Iε(t0,tf),{I_{\varepsilon}^{(t_{0},t_{f})}}, we introduce the following intervals of time:

[𝜶]={t(t0,tf):𝜶(t)Σ}={t(t0,tf):α1(t)0},ε[𝜶]={t(t0,tf):𝜶(t)Σε}={t(t0,tf):|α1(t)|ε},Σ[𝜶]={t(t0,tf):𝜶(t)Σ}={t(t0,tf):α1(t)=0},Σ+[𝜶]={t(t0,tf):𝜶(t)Σ+}={t(t0,tf):α1(t)=0 and F1±(0,𝜷)>0},Σ[𝜶]={t(t0,tf):𝜶(t)Σ}={t(t0,tf):α1(t)=0 and F1±(0,𝜷)<0},Σε[𝜶]={t(t0,tf):𝜶(t)Σε}={t(t0,tf):|α1(t)|<ε},±[𝜶]={t(t0,tf):𝜶(t)S±},±ε[𝜶]={t(t0,tf):𝜶(t)S±ε},\begin{split}\mathcal{I}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\not\in\Sigma\}=\{t\in(t_{0},t_{f}):\alpha_{1}(t)\neq 0\},\\ \mathcal{I}^{\varepsilon}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\not\in\Sigma^{\varepsilon}\}=\{t\in(t_{0},t_{f}):|\alpha_{1}(t)|\geq\varepsilon\},\\ \mathcal{I}_{\Sigma}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\in\Sigma\}=\{t\in(t_{0},t_{f}):\alpha_{1}(t)=0\},\\ \mathcal{I}_{\Sigma_{+}}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\in\Sigma_{+}\}=\{t\in(t_{0},t_{f}):\alpha_{1}(t)=0\mbox{ and }F^{\pm}_{1}(0,\bm{\beta})>0\},\\ \mathcal{I}_{\Sigma_{-}}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\in\Sigma_{-}\}=\{t\in(t_{0},t_{f}):\alpha_{1}(t)=0\mbox{ and }F^{\pm}_{1}(0,\bm{\beta})<0\},\\ \mathcal{I}^{\varepsilon}_{\Sigma}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\in\Sigma^{\varepsilon}\}=\{t\in(t_{0},t_{f}):|\alpha_{1}(t)|<\varepsilon\},\\ \mathcal{I}_{\pm}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\in S_{\pm}\},\\ \mathcal{I}_{\pm}^{\varepsilon}[\bm{\alpha}]&=\{t\in(t_{0},t_{f}):\bm{\alpha}(t)\in S_{\pm}^{\varepsilon}\},\\ \end{split}

as well as introduce the function Λ:\Lambda:{\mathbb{R}}\mapsto{\mathbb{R}}, defined by

Λ(z)=12+0zζ(u)𝑑u.\Lambda(z)=\frac{1}{2}+\int_{0}^{z}\zeta(u)\,du. (19)

Note that Λ(z)=1\Lambda(z)=1 if z>1,z>1, Λ(z)=0\Lambda(z)=0 if z<1,z<-1, and Λ(z)\Lambda(z) smoothly transitions from 0 to 11 on the interval [1,1][-1,1]. We now state the following Γ\Gamma-limit result.

Theorem 3.8.

There exists I¯(t0,tf):𝒜(t0,tf)¯{\overline{I}^{(t_{0},t_{f})}}:{\mathcal{A}^{(t_{0},t_{f})}}\mapsto\overline{{\mathbb{R}}} such that

Γlimε0Iε(t0,tf)=I¯(t0,tf),\Gamma-\lim_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}={\overline{I}^{(t_{0},t_{f})}},

where for 𝛂=(α,𝛃)𝒜(t0,tf)\bm{\alpha}=(\alpha,\bm{\beta})\in{\mathcal{A}^{(t_{0},t_{f})}},

I¯(t0,tf)[𝜶]=\displaystyle{\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}]= [𝜶]𝜶˙(t)𝐅(𝜶(t))2𝑑t\displaystyle\int_{\mathcal{I}[\bm{\alpha}]}\|\dot{\bm{\alpha}}(t)-\mathbf{F}(\bm{\alpha}(t))\|^{2}\,dt (20)
+Σ[𝜶]minλ[0,1]{[λF1+(0,𝜷)+(1λ)F1(0,𝜷)]2\displaystyle+\int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]}\min_{\lambda\in[0,1]}\left\{\left[\lambda F_{1}^{+}(0,\bm{\beta})+(1-\lambda)F_{1}^{-}(0,\bm{\beta})\right]^{2}\right.
+|𝜷˙λ𝐆+(0,𝜷)(1λ)𝐆(0,𝜷)|2}dt.\displaystyle\left.\mathmakebox[\widthof{$ +\int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]}\min\qquad$}][l]{}+\left|\dot{\bm{\beta}}-\lambda\mathbf{G}^{+}(0,\bm{\beta})-(1-\lambda)\mathbf{G}^{-}(0,\bm{\beta})\right|^{2}\right\}\,dt.
Proof.

For fixed 𝜶=(α,𝜷)𝒜(t0,tf)\bm{\alpha}=(\alpha,\bm{\beta})\in{\mathcal{A}^{(t_{0},t_{f})}}, we first show the existence of a recovery sequence, 𝜶ε\bm{\alpha}_{\varepsilon}. We will construct 𝜶ε\bm{\alpha}_{\varepsilon} over three intervals, +ε[𝜶],\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}], ε[𝜶],\mathcal{I}^{\varepsilon}_{-}[\bm{\alpha}], and Σε[𝜶],\mathcal{I}^{\varepsilon}_{\Sigma}[\bm{\alpha}], by

𝜶ε(t)\displaystyle\bm{\alpha}_{\varepsilon}(t) =𝜶(t)𝟙{t+ε[𝜶]}+𝜶(t)𝟙{tε[𝜶]}+(εz(t),𝜷(t))𝟙{tΣε[𝜶]}\displaystyle=\bm{\alpha}(t)\mathbbm{1}_{\{t\in\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\}}+\bm{\alpha}(t)\mathbbm{1}_{\{t\in\mathcal{I}^{\varepsilon}_{-}[\bm{\alpha}]\}}+(\varepsilon z(t),\bm{\beta}(t))\mathbbm{1}_{\{t\in\mathcal{I}_{\Sigma}^{\varepsilon}[\bm{\alpha}]\}}
=𝜶(t)𝟙{𝜶S+ε}+𝜶(t)𝟙{𝜶Sε}+(εz(t),𝜷(t))𝟙{(εz,𝜷)Σε},\displaystyle=\bm{\alpha}(t)\mathbbm{1}_{\{\bm{\alpha}\in S^{\varepsilon}_{+}\}}+\bm{\alpha}(t)\mathbbm{1}_{\{\bm{\alpha}\in S^{\varepsilon}_{-}\}}+(\varepsilon z(t),\bm{\beta}(t))\mathbbm{1}_{\{(\varepsilon z,\bm{\beta})\in\Sigma^{\varepsilon}\}},

where z(t)z(t) is given by

z=argminz[1,1]\displaystyle z=\operatorname*{arg\,min}_{z^{\prime}\in[-1,1]} {[Λ(z)F1+(0,𝜷)+(1Λ(z))F1(0,𝜷)]2\displaystyle\left\{\left[\Lambda(z^{\prime})F_{1}^{+}(0,\bm{\beta})+\left(1-\Lambda(z^{\prime})\right)F_{1}^{-}(0,\bm{\beta})\right]^{2}\right. (21)
+|𝜷˙Λ(z)𝐆+(0,𝜷)(1Λ(z))𝐆(0,𝜷)|2}.\displaystyle\mathmakebox[\widthof{$ \left\{ \left[\right.\right.d$}][l]{}+\left.\left|\dot{\bm{\beta}}-\Lambda(z^{\prime})\mathbf{G}^{+}(0,\bm{\beta})-\left(1-\Lambda(z^{\prime})\right)\mathbf{G}^{-}(0,\bm{\beta})\right|^{2}\right\}.

To prove convergence of Iε(t0,tf)[𝜶ε]{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}] to I¯(t0,tf)[𝜶]{\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}] and 𝜶εH1𝜶,\bm{\alpha}_{\varepsilon}\,\overset{H^{1}}{\rightharpoonup}\,\bm{\alpha}, we first consider the interval of time +ε[𝜶]\mathcal{I}_{+}^{\varepsilon}[\bm{\alpha}]. Since 𝜶ε(t)𝟙{𝜶εS+ε}=𝜶(t)𝟙{𝜶S+ε}\bm{\alpha}_{\varepsilon}(t)\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in S^{\varepsilon}_{+}\}}=\bm{\alpha}(t)\mathbbm{1}_{\{\bm{\alpha}\in S^{\varepsilon}_{+}\}} it follows that 𝜶ε(t)𝟙{𝜶εS+ε}H1𝜶(t)𝟙{𝜶S+ε}\bm{\alpha}_{\varepsilon}(t)\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in S^{\varepsilon}_{+}\}}\,\overset{H^{1}}{\rightharpoonup}\,\bm{\alpha}(t)\mathbbm{1}_{\{\bm{\alpha}\in S^{\varepsilon}_{+}\}}. Moreover, we have

+ε[𝜶ε]\displaystyle\int_{\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}_{\varepsilon}]} 𝜶˙ε𝐅ε(𝜶ε)2dt+[𝜶]𝜶˙𝐅+(𝜶)2𝑑t\displaystyle\|\dot{\bm{\alpha}}_{\varepsilon}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\|^{2}\,dt-\int_{\mathcal{I}_{+}[\bm{\alpha}]}\|\dot{\bm{\alpha}}-\mathbf{F}^{+}(\bm{\alpha})\|^{2}\,dt
=\displaystyle= +ε[𝜶]+[𝜶](2𝜶˙,𝐅+(𝜶)𝐅ε(𝜶)+|𝐅ε(𝜶)|2|𝐅+(𝜶)|2)𝑑t\displaystyle\int_{\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\cap\mathcal{I}_{+}[\bm{\alpha}]}\left(2\langle\dot{\bm{\alpha}},\mathbf{F}^{+}(\bm{\alpha})-{\mathbf{F}^{\varepsilon}}(\bm{\alpha})\rangle+|{\mathbf{F}^{\varepsilon}}(\bm{\alpha})|^{2}-|\mathbf{F}^{+}(\bm{\alpha})|^{2}\right)\,dt
++ε[𝜶]+[𝜶]𝜶˙𝐅ε(𝜶)2𝑑t+[𝜶]+ε[𝜶]𝜶˙𝐅+(𝜶)2𝑑t.\displaystyle+\int_{\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\setminus\mathcal{I}_{+}[\bm{\alpha}]}\|\dot{\bm{\alpha}}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha})\|^{2}\,dt-\int_{\mathcal{I}_{+}[\bm{\alpha}]\setminus\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]}\|\dot{\bm{\alpha}}-\mathbf{F}^{+}(\bm{\alpha})\|^{2}\,dt.

Since 𝐅ε(x,𝐲)𝟙{xε}=11ζ(v)𝐅+(xεv,𝐲)𝑑v𝟙{xε}{\mathbf{F}^{\varepsilon}}(x,\mathbf{y})\mathbbm{1}_{\{x\geq\varepsilon\}}=\int_{-1}^{1}\zeta(v)\mathbf{F}^{+}(x-\varepsilon v,\mathbf{y})\,dv\mathbbm{1}_{\{x\geq\varepsilon\}} and 11ζ(v)𝐅+(xεv,𝐲)𝑑v\int_{-1}^{1}\zeta(v)\mathbf{F}^{+}(x-\varepsilon v,\mathbf{y})\,dv converges uniformly to 𝐅+(x,𝐲)\mathbf{F}^{+}(x,\mathbf{y}) as ε0,\varepsilon\rightarrow 0, the integral +ε[𝜶]+[𝜶]𝜶˙,𝐅+(𝜶)𝐅ε(𝜶)𝑑t\int_{\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\cap\mathcal{I}_{+}[\bm{\alpha}]}\langle\dot{\bm{\alpha}},\mathbf{F}^{+}(\bm{\alpha})-{\mathbf{F}^{\varepsilon}}(\bm{\alpha})\rangle\,dt vanishes as ε0\varepsilon\rightarrow 0. By Minkowski’s inequality and the triangle inequality,

|+ε[𝜶]+[𝜶]|𝐅ε(𝜶)|2|𝐅+(𝜶)|2dt|\displaystyle\left|\int_{\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\cap\mathcal{I}_{+}[\bm{\alpha}]}|{\mathbf{F}^{\varepsilon}}(\bm{\alpha})|^{2}-|\mathbf{F}^{+}(\bm{\alpha})|^{2}\,dt\right| +ε[𝜶]+[𝜶]|𝐅ε(𝜶)2𝐅+(𝜶)2|𝑑t\displaystyle\leq\int_{\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\cap\mathcal{I}_{+}[\bm{\alpha}]}\left|\|{\mathbf{F}^{\varepsilon}}(\bm{\alpha})\|^{2}-\|\mathbf{F}^{+}(\bm{\alpha})\|^{2}\right|\,dt
+ε[𝜶]+[𝜶]𝐅ε(𝜶)𝐅+(𝜶)2𝑑t,\displaystyle\leq\int_{\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\cap\mathcal{I}_{+}[\bm{\alpha}]}\|{\mathbf{F}^{\varepsilon}}(\bm{\alpha})-\mathbf{F}^{+}(\bm{\alpha})\|^{2}\,dt,

and thus by the same uniform convergence argument as above vanishes as ε0\varepsilon\rightarrow 0. Since the measures of +ε[𝜶]+[𝜶]\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}]\setminus\mathcal{I}_{+}[\bm{\alpha}] and +[𝜶]+ε[𝜶]\mathcal{I}_{+}[\bm{\alpha}]\setminus\mathcal{I}^{\varepsilon}_{+}[\bm{\alpha}] vanish as ε0\varepsilon\rightarrow 0 it follows that the remaining integrals also vanish. The analogous case for [𝜶]\mathcal{I}_{-}[\bm{\alpha}] is identical. Hence,

limε0Iε(t0,tf)[𝜶ε𝟙{𝜶εS±ε}]\displaystyle\lim_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in S^{\varepsilon}_{\pm}\}}] =limε0ε[𝜶]𝜶˙𝐅ε(𝜶)2𝑑t\displaystyle=\lim_{\varepsilon\rightarrow 0}\int_{\mathcal{I}^{\varepsilon}[\bm{\alpha}]}\|\dot{\bm{\alpha}}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha})\|^{2}\,dt
=[𝜶]𝜶˙𝐅(𝜶)2𝑑t\displaystyle=\int_{\mathcal{I}[\bm{\alpha}]}\|\dot{\bm{\alpha}}-\mathbf{F}(\bm{\alpha})\|^{2}\,dt
=I¯(t0,tf)[𝜶𝟙{𝜶Σ}].\displaystyle={\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}\mathbbm{1}_{\{\bm{\alpha}\not\in\Sigma\}}].

We now consider the interval of time Σε[𝜶]\mathcal{I}^{\varepsilon}_{\Sigma}[\bm{\alpha}]. Note that using the substitution v=u/ε,v=u/\varepsilon, we have

limε0𝐅ε(εz,𝐲)𝟙{|εz|<ε}\displaystyle\lim_{\varepsilon\rightarrow 0}{\mathbf{F}^{\varepsilon}}(\varepsilon z,\mathbf{y})\mathbbm{1}_{\{|\varepsilon z|<\varepsilon\}} =limε0[1zζ(v)𝐅+(εzεv,𝐲)𝑑v+z1ζ(v)𝐅(εzεv,𝐲)𝑑v]\displaystyle=\lim_{\varepsilon\rightarrow 0}\left[\int_{-1}^{z}\zeta(v)\mathbf{F}^{+}(\varepsilon z-\varepsilon v,\mathbf{y})\,dv+\int_{z}^{1}\zeta(v)\mathbf{F}^{-}(\varepsilon z-\varepsilon v,\mathbf{y})\,dv\right]
=Λ(z)𝐅+(0,𝐲)+(1Λ(z))𝐅(0,𝐲).\displaystyle=\Lambda(z)\mathbf{F}^{+}(0,\mathbf{y})+\left(1-\Lambda(z)\right)\mathbf{F}^{-}(0,\mathbf{y}).

Let

𝐇(𝐲;z)=Λ(z)𝐅+(0,𝐲)+(1Λ(z))𝐅(0,𝐲)=λ𝐅+(0,𝐲)+(1λ)𝐅(0,𝐲),\mathbf{H}(\mathbf{y};\,z)=\Lambda(z)\mathbf{F}^{+}(0,\mathbf{y})+\left(1-\Lambda(z)\right)\mathbf{F}^{-}(0,\mathbf{y})=\lambda\mathbf{F}^{+}(0,\mathbf{y})+\left(1-\lambda\right)\mathbf{F}^{-}(0,\mathbf{y}),

where we set z(t)z(t) as in Equation (21), so that λ=Λ(z)[0,1]\lambda=\Lambda(z)\in[0,1] is a constant. Thus, in general limε0𝐅ε(εz,𝐲)=𝐇(z,𝐲)\lim_{\varepsilon\rightarrow 0}{\mathbf{F}^{\varepsilon}}(\varepsilon z,\mathbf{y})=\mathbf{H}(z,\mathbf{y}), where zz varies in [1,1],[-1,1], but we evaluate 𝐇\mathbf{H} at a particular zz-value to correspond with the Γ\Gamma-limit. Since 𝜶ε(t)𝟙{𝜶εΣε}=(εz,𝜷)𝒜(t0,tf)\bm{\alpha}_{\varepsilon}(t)\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}=(\varepsilon z,\bm{\beta})\in{\mathcal{A}^{(t_{0},t_{f})}} it follows that

Σε[𝜶]\displaystyle\int_{\mathcal{I}^{\varepsilon}_{\Sigma}[\bm{\alpha}]} 𝜶˙ε𝐅ε(𝜶ε)2dtΣ[𝜶]𝜶˙𝐇(𝜷;z)2𝑑t\displaystyle\|\dot{\bm{\alpha}}_{\varepsilon}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\|^{2}\,dt-\int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]}\left\|\dot{\bm{\alpha}}-\mathbf{H}(\bm{\beta};\,z)\right\|^{2}\,dt
=\displaystyle= ΣεΣ𝜶˙ε𝐅ε(𝜶ε)2𝜶˙𝐇(𝜷;z)2dt\displaystyle\int_{\mathcal{I}^{\varepsilon}_{\Sigma}\cap\mathcal{I}_{\Sigma}}\|\dot{\bm{\alpha}}_{\varepsilon}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\|^{2}-\left\|\dot{\bm{\alpha}}-\mathbf{H}(\bm{\beta};\,z)\right\|^{2}\,dt
+ΣεΣ𝜶˙ε𝐅ε2𝑑tΣΣε𝜶˙𝐇2𝑑t.\displaystyle+\int_{\mathcal{I}^{\varepsilon}_{\Sigma}\setminus\mathcal{I}_{\Sigma}}\|\dot{\bm{\alpha}}_{\varepsilon}-{\mathbf{F}^{\varepsilon}}\|^{2}\,dt-\int_{\mathcal{I}_{\Sigma}\setminus\mathcal{I}^{\varepsilon}_{\Sigma}}\|\dot{\bm{\alpha}}-\mathbf{H}\|^{2}\,dt.

As above, the last two integrals vanish as ε0,\varepsilon\rightarrow 0, since the measure of ΣεΣ\mathcal{I}^{\varepsilon}_{\Sigma}\setminus\mathcal{I}_{\Sigma} converges to zero and ΣΣε=,\mathcal{I}_{\Sigma}\setminus\mathcal{I}^{\varepsilon}_{\Sigma}=\varnothing, and the integrands are bounded. Since ΣεΣ=Σ,\mathcal{I}^{\varepsilon}_{\Sigma}\cap\mathcal{I}_{\Sigma}=\mathcal{I}_{\Sigma},

ΣεΣ\displaystyle\int_{\mathcal{I}^{\varepsilon}_{\Sigma}\cap\mathcal{I}_{\Sigma}} 𝜶˙ε𝐅ε(𝜶ε)2𝜶˙𝐇(𝜷;z)2dt\displaystyle\|\dot{\bm{\alpha}}_{\varepsilon}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\|^{2}-\left\|\dot{\bm{\alpha}}-\mathbf{H}(\bm{\beta};\,z)\right\|^{2}\,dt
=\displaystyle= Σ|𝜶˙ε|22𝜶˙ε,𝐅ε(𝜶ε)+|𝐅ε(𝜶ε)|2|𝜶˙|2+2𝜶˙,𝐇(𝜷;z)|𝐇(𝜷;z)|2dt.\displaystyle\int_{\mathcal{I}_{\Sigma}}|\dot{\bm{\alpha}}_{\varepsilon}|^{2}-2\langle\dot{\bm{\alpha}}_{\varepsilon},{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\rangle+|{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})|^{2}-\left|\dot{\bm{\alpha}}\right|^{2}+2\langle\dot{\bm{\alpha}},\mathbf{H}(\bm{\beta};\,z)\rangle-\left|\mathbf{H}(\bm{\beta};\,z)\right|^{2}\,dt.

Since 𝜶˙ε𝟙{𝜶εΣε}H1𝜶˙𝟙{𝜶Σ}\dot{\bm{\alpha}}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}\,\overset{H^{1}}{\rightharpoonup}\,\dot{\bm{\alpha}}\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}} and 𝐅ε(𝜶ε)𝟙{𝜶εΣε}H1𝐇(𝜷;z)𝟙{𝜶Σ},{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}\,\overset{H^{1}}{\rightharpoonup}\,\mathbf{H}(\bm{\beta};\,z)\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}},

Σ|𝜶˙ε|2|𝜶˙|2dt=|t0tf|𝜶˙ε𝟙{𝜶εΣε}|2|𝜶˙𝟙{𝜶Σ}|2dt|t0tf(𝜶˙ε𝜶˙)𝟙{𝜶εΣε}2𝑑t0\displaystyle\int_{\mathcal{I}_{\Sigma}}|\dot{\bm{\alpha}}_{\varepsilon}|^{2}-|\dot{\bm{\alpha}}|^{2}\,dt=\left|\int_{t_{0}}^{t_{f}}|\dot{\bm{\alpha}}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}|^{2}-|\dot{\bm{\alpha}}\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}}|^{2}\,dt\right|\leq\int_{t_{0}}^{t_{f}}\|\left(\dot{\bm{\alpha}}_{\varepsilon}-\dot{\bm{\alpha}}\right)\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}\|^{2}\,dt\rightarrow 0

as ε0\varepsilon\rightarrow 0, and

limε0Σ|𝐅ε(𝜶ε)|2|𝐇(𝜷;z)|2dt\displaystyle\lim_{\varepsilon\rightarrow 0}\int_{\mathcal{I}_{\Sigma}}|{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})|^{2}-\left|\mathbf{H}(\bm{\beta};\,z)\right|^{2}\,dt =limε0|t0tf|𝐅ε𝟙{𝜶εΣε}|2|𝐇𝟙{𝜶Σ}|2dt|\displaystyle=\lim_{\varepsilon\rightarrow 0}\left|\int_{t_{0}}^{t_{f}}|{\mathbf{F}^{\varepsilon}}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}|^{2}-|\mathbf{H}\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}}|^{2}\,dt\right|
limε0t0tf(𝐅ε𝐇)𝟙{𝜶εΣε}2𝑑t\displaystyle\leq\lim_{\varepsilon\rightarrow 0}\int_{t_{0}}^{t_{f}}\|\left({\mathbf{F}^{\varepsilon}}-\mathbf{H}\right)\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}\|^{2}\,dt
=0,\displaystyle=0,

where we have used Minkowski’s inequality and the triangle inequality, as above. Also, as a consequence of the uniform boundedness of sequences in H1,H^{1}, the product of a weakly convergent sequence and a strongly convergent sequence is weakly convergent [68], and thus

Σ𝜶˙,𝐇(𝜷;z)𝜶˙ε,𝐅ε(𝜶ε)dt=t0tf𝜶˙,𝐇(𝜷;z)𝟙{𝜶Σ}𝜶˙ε,𝐅ε(𝜶ε)𝟙{𝜶εΣε}dt0\displaystyle\int_{\mathcal{I}_{\Sigma}}\langle\dot{\bm{\alpha}},\mathbf{H}(\bm{\beta};\,z)\rangle-\langle\dot{\bm{\alpha}}_{\varepsilon},{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\rangle\,dt=\int_{t_{0}}^{t_{f}}\langle\dot{\bm{\alpha}},\mathbf{H}(\bm{\beta};\,z)\rangle\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}}-\langle\dot{\bm{\alpha}}_{\varepsilon},{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\rangle\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}\,dt\rightarrow 0

as ε0\varepsilon\rightarrow 0. Hence,

limε0Iε(t0,tf)[𝜶ε𝟙{𝜶εΣε}]\displaystyle\lim_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}] =limε0Σε𝜶˙ε𝐅ε(𝜶ε)2𝑑t\displaystyle=\lim_{\varepsilon\rightarrow 0}\int_{\mathcal{I}^{\varepsilon}_{\Sigma}}\|\dot{\bm{\alpha}}_{\varepsilon}-{\mathbf{F}^{\varepsilon}}(\bm{\alpha}_{\varepsilon})\|^{2}\,dt
=Σ𝜶˙𝐇(𝜷;z)2𝑑t\displaystyle=\int_{\mathcal{I}_{\Sigma}}\|\dot{\bm{\alpha}}-\mathbf{H}(\bm{\beta};z)\|^{2}\,dt
=Σminλ[0,1]{|λF1+(0,𝜷)+(1λ)F1(0,𝜷)|2\displaystyle=\int_{\mathcal{I}_{\Sigma}}\min_{\lambda\in[0,1]}\left\{\left|\lambda F^{+}_{1}(0,\bm{\beta})+\left(1-\lambda\right)F^{-}_{1}(0,\bm{\beta})\right|^{2}\right.
+|𝜷˙λ𝐆+(0,𝜷)(1λ)𝐆(0,𝜷)|2}dt\displaystyle\left.\mathmakebox[\widthof{$ = \int_{\mathcal{I}_{\Sigma}}\min_{\lambda\in[0,1]}$}][l]{}+\left|\dot{\bm{\beta}}-\lambda\mathbf{G}^{+}(0,\bm{\beta})-\left(1-\lambda\right)\mathbf{G}^{-}(0,\bm{\beta})\right|^{2}\right\}\,dt
=I¯(t0,tf)[𝜶𝟙{𝜶Σ}].\displaystyle={\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}\mathbbm{1}_{\{\bm{\alpha}\in{\Sigma}\}}].

We now prove the liminf inequality. For the portion of 𝜶\bm{\alpha} in S±,S_{\pm}, the liminf inequality follows directly from the recovery sequence calculation, since we chose 𝜶ε=𝜶\bm{\alpha}_{\varepsilon}=\bm{\alpha} for any 𝜶𝒜(t0,tf)\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}}. For the portion of 𝜶\bm{\alpha} in Σ,\Sigma, we have that for every 𝜶ε,𝜶𝒜(t0,tf)\bm{\alpha}_{\varepsilon},\bm{\alpha}\in{\mathcal{A}^{(t_{0},t_{f})}} satisfying 𝜶ε𝟙{𝜶εΣε}H1𝜶𝟙{𝜶Σ},\bm{\alpha}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}\,\overset{H^{1}}{\rightharpoonup}\,\bm{\alpha}\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}},

limε0Iε(t0,tf)[𝜶ε𝟙{𝜶εΣε}]\displaystyle\lim_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}] =limε0t0tf|𝜶˙ε𝟙{𝜶εΣε}𝐅ε(𝜶ε𝟙{𝜶εΣε})|2𝑑t\displaystyle=\lim_{\varepsilon\rightarrow 0}\int_{t_{0}}^{t_{f}}|\dot{\bm{\alpha}}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}-{\mathbf{F}^{\varepsilon}}\left(\bm{\alpha}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}\right)|^{2}\,dt
=Σ[𝜶]|𝜶˙𝟙{𝜶Σ}𝐇(𝜶𝟙{𝜶Σ})|2𝑑t\displaystyle=\int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]}|\dot{\bm{\alpha}}\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}}-\mathbf{H}(\bm{\alpha}\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}})|^{2}\,dt
Σ[𝜶]minλ[0,1]{|λF1+(0,𝜷)+(1λ)F1(0,𝜷)|2\displaystyle\geq\int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]}\min_{\lambda\in[0,1]}\left\{\left|\lambda F^{+}_{1}(0,\bm{\beta})+\left(1-\lambda\right)F^{-}_{1}(0,\bm{\beta})\right|^{2}\right.
+|𝜷˙λ𝐆+(0,𝜷)(1λ)𝐆(0,𝜷)|2}dt.\displaystyle\mathmakebox[\widthof{$ = \int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]} \min\left\{\right.$}][l]{}+\left.\left|\dot{\bm{\beta}}-\lambda\mathbf{G}^{+}(0,\bm{\beta})-\left(1-\lambda\right)\mathbf{G}^{-}(0,\bm{\beta})\right|^{2}\right\}\,dt.

Then by the lower semicontinuity of the rate functional with respect to weak convergence,

I¯(t0,tf)[𝜶𝟙{𝜶Σ}]\displaystyle{\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}}] =Σminλ[0,1]{|λF1+(0,𝜷)+(1λ)F1(0,𝜷)|2\displaystyle=\int_{\mathcal{I}_{\Sigma}}\min_{\lambda\in[0,1]}\left\{\left|\lambda F^{+}_{1}(0,\bm{\beta})+\left(1-\lambda\right)F^{-}_{1}(0,\bm{\beta})\right|^{2}\right.
+|𝜷˙λ𝐆+(0,𝜷)(1λ)𝐆(0,𝜷)|2}dt\displaystyle\mathmakebox[\widthof{$ = \int_{\mathcal{I}_{\Sigma}[\bm{\alpha}]} \min\left\{\right.$}][l]{}+\left.\left|\dot{\bm{\beta}}-\lambda\mathbf{G}^{+}(0,\bm{\beta})-\left(1-\lambda\right)\mathbf{G}^{-}(0,\bm{\beta})\right|^{2}\right\}\,dt
lim infε0Iε(t0,tf)[𝜶ε𝟙{𝜶εΣε}].\displaystyle\leq\liminf_{\varepsilon\rightarrow 0}{I_{\varepsilon}^{(t_{0},t_{f})}}[\bm{\alpha}_{\varepsilon}\mathbbm{1}_{\{\bm{\alpha}_{\varepsilon}\in\Sigma^{\varepsilon}\}}].

In other words, this result shows that in the smooth regions S±,S_{\pm}, the minimum of the rate functional I(t0,tf){I^{(t_{0},t_{f})}} by 𝜶(t),\bm{\alpha}(t), where 𝜶\bm{\alpha} is the solution to the EL Equations (18). Then in the switching manifold Σ\Sigma, the rate functional is minimized by some 𝜶(t;z)=(0,𝜷(t;z))𝒜(t0,tf),\bm{\alpha}(t;z)=(0,\bm{\beta}(t;z))\in{\mathcal{A}^{(t_{0},t_{f})}}, where zz is defined using Equation (21). Note that the critical manifold of the fast-slow System (18) is defined implicitly as solutions in n\mathbb{R}^{n} to following system of equations

M0{0=Λ(z)F1+(0,𝐲)+(1Λ(z))F1(0,𝐲)+φ0=φ(F1+(0,𝐲)+F1(0,𝐲))+𝝍,𝐆+(0,𝐲)+𝐆(0,𝐲)},M_{0}\subseteq\left\{\begin{aligned} 0&=\Lambda(z)F_{1}^{+}(0,\mathbf{y})+\left(1-\Lambda(z)\right)F_{1}^{-}(0,\mathbf{y})+\varphi\\ 0&=\varphi\left(F_{1}^{+}(0,\mathbf{y})+F_{1}^{-}(0,\mathbf{y})\right)+\langle\bm{\psi},\mathbf{G}^{+}(0,\mathbf{y})+\mathbf{G}^{-}(0,\mathbf{y})\rangle\end{aligned}\right\},

so when the conjugate momentum has φ=0,\varphi=0, Λ(z)F1+(0,𝐲)+(1Λ(z))F1(0,𝐲)=0\Lambda(z)F_{1}^{+}(0,\mathbf{y})+\left(1-\Lambda(z)\right)F_{1}^{-}(0,\mathbf{y})=0. Thus, the path that minimizes the rate functional in Σ\Sigma is given by 𝜶=(0,𝜷),\bm{\alpha}=(0,\bm{\beta}), where

𝜷˙=Λ(z)𝐆+(0,𝜷)+(1Λ(z))𝐆(0,𝜷).\dot{\bm{\beta}}=\Lambda(z)\mathbf{G}^{+}(0,\bm{\beta})+\left(1-\Lambda(z)\right)\mathbf{G}^{-}(0,\bm{\beta}). (22)

Practically, to obtain the most probable path in the switching manifold, we first determine the value of zz so that Λ(z)F1+(0,𝐲)+(1Λ(z))F1(0,𝐲)=0\Lambda(z)F_{1}^{+}(0,\mathbf{y})+\left(1-\Lambda(z)\right)F_{1}^{-}(0,\mathbf{y})=0 for each 𝐲n1\mathbf{y}\in{\mathbb{R}}^{n-1}. This fixes zz, so we may calculate the path using Equation (22).

To make explicit the connection of this result to the dynamics imposed on the switching manifold by the Filippov convex combination, recall the definition of sliding flow given by Equations (5),(6). Here, we may fix λ=Λ(z)\lambda=\Lambda(z) so that the first component of the sliding flow in Equation (5) becomes zero. The remaining components describing the sliding flow in Equation (5) then either coincide with Equation (22) or its time-reversed flow, with tt.t\rightarrow-t. That is, Theorem 3.8 indicates that the intersection of the most probable path with the switching manifold, 𝜶(t)𝟙{𝜶Σ},\bm{\alpha}(t)\mathbbm{1}_{\{\bm{\alpha}\in\Sigma\}}, may correspond to the solution of the flow given by 𝐲˙=𝐆s(0,𝐲;z),\dot{\mathbf{y}}=\mathbf{G}^{s}(0,\mathbf{y};z), where 𝐆s\mathbf{G}^{s} is the convex combination

𝐆s(0,𝐲;z)=λ𝐆+(0,𝐲;z)+(1λ)𝐆(0,𝐲;z),\mathbf{G}^{s}(0,\mathbf{y};z)=\lambda\mathbf{G}^{+}(0,\mathbf{y};z)+(1-\lambda)\mathbf{G}^{-}(0,\mathbf{y};z),

and λ[0,1]\lambda\in[0,1]. Thus, the most probable path in the switching manifold precisely recovers the Filippov dynamics with this particular form of λ\lambda (or its time-reversed dynamics). This result is independent of the mollifier imposed, as long as it fits the form defined in Equation (14), though the path does depend on the mollifier via Λ(z)\Lambda(z).

Furthermore, this indicates that the functional given by Equation (20) has no contribution during times when the path slides in a sliding region, but it does when the path slides in a crossing region.

4 Case Study 1: Linear system in 2D

In this section, we explore the application of the most probable path derived in Section 3 to a simple piecewise-linear system in 2{\mathbb{R}}^{2}. Consider System (2),(3), where 𝐅±:22\mathbf{F}^{\pm}:{\mathbb{R}}^{2}\mapsto{\mathbb{R}}^{2} are defined as

F+(x,y)\displaystyle\textbf{F}^{+}(x,y) =(f+g+)=(a(x1)+b(yη)c(x1)+yη),\displaystyle=\begin{pmatrix}f^{+}\\ g^{+}\end{pmatrix}=\begin{pmatrix}a(x-1)+b(y-\eta)\\ c(x-1)+y-\eta\end{pmatrix}, (23)
F(x,y)\displaystyle\textbf{F}^{-}(x,y) =(fg)=(p(x+1)+q(y+η)r(x+1)+y+η).\displaystyle=\begin{pmatrix}f^{-}\\ g^{-}\end{pmatrix}=\begin{pmatrix}p(x+1)+q(y+\eta)\\ r(x+1)+y+\eta\end{pmatrix}.

Here, a,b,c,p,q,r,ηa,b,c,p,q,r,\eta\in{\mathbb{R}} are parameters. This splits up the plane into two smooth regions separated by the switching manifold Σ={(x,y)|x=0}\Sigma=\{(x,y)~{}|~{}x=0\}. This system has been nondimensionalized to scale out a parameter.

This case study is intended to represent a prototypical example of a planar model with a one-dimensional switching manifold and linear or linearizable dynamics near the switch; see, e.g., [2, 7, 40, 41, 44, 45, 50]. For example, the stochastic Stommel model for thermohaline convection switches between temperature- and salinity-driven ocean circulation [2, 50]. Here the vector field is nonlinear, but it is reasonable to consider linearizing the vector field near the switch.

We begin by investigating the dynamics of the deterministic skeleton, System (4),(23), as the parameter η\eta varies. For all values of η,\eta, there are two stable fixed points separated by the switching manifold Σ\Sigma. For most values of η,\eta, dynamics fall under two broad categories: (1) the two fixed points are the only asymptotic states possible for 𝐱02Σ\mathbf{x}_{0}\in{\mathbb{R}}^{2}\setminus\Sigma, and (2) there is a sliding-cycle surrounding each fixed point, with a crossing-cycle surrounding all sliding-cycles (see Figure 2(f)). For intermediate values of η\eta we observe (3) both standard and nonstandard discontinuity-induced bifurcations.

Taking the two most prevalent cases (1) and (2), we perform Monte Carlo simulations of System (2),(23). In the case of attracting sliding, Monte Carlo simulations match the most probable path derived by minimizing the functional Equation (20), which slides along Σ\Sigma. However, in the case of repelling sliding, Monte Carlo simulations match a local minimizer that does not slide, instead crossing Σ\Sigma at the onset of sliding.

4.1 Deterministic dynamics

In System (4),(23), there is one equilibrium in each smooth region, at 𝐱±=(±1,±η)\mathbf{x}_{\pm}=(\pm 1,\pm\eta). Figure 2(a) summarizes the linear stability analysis of 𝐱+\mathbf{x}_{+}; due to the symmetry of the system, the linear stability of 𝐱\mathbf{x}_{-} is identical if we consider the vertical axis as qrqr and the horizontal axis as pp. The solid line separates saddle (a<bca<bc) from non-saddle equilibria. For non-saddle equilibria, the dashed line further separates stable equilibria (a<1a<-1) from unstable equilibria. Stable spirals occur in the yellow region and stable nodes occur in the blue region, which are separated by the curve bc=(a1)2/4bc=-(a-1)^{2}/4.

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Refer to caption
(e)
Refer to caption
(f)
Figure 2: (a) Stability plane for 𝐱+\mathbf{x}_{+}. Shaded regions have stable nodes (blue) or stable spirals (yellow). (b) A bifurcation diagram for System (4),(23). Red lines are stable foci and the black line is a pseudosaddle. Brown curves are the crossing points of the large crossing limit cycle. The orange and purple dashed curves are the upper and lower limits of the sliding portion of the two (unstable) sliding limit cycles. The grey shaded region indicates η\eta-values where the two sliding limit cycles may have both period-1 and period-2. Vertical dashed lines indicate bifurcations identified: a visible tangency bifurcation (VV2)(VV_{2}), a crossing cycle-tangency bifurcation (I),(I), two sliding cycle-tangency bifurcations (II)(II) and (III),(III), and a sliding cycle-nonunique period bifurcation (IV)(IV). (c)-(f) Phase portraits for illustrative values of η\eta. Colors correspond to (b), with the period-2 unstable sliding limit cycle indicated as a dashed gray curve. Parameters used in all plots are a=p=2,b=q=7,a=p=-2,\,b=q=-7, and c=r=1c=r=1.

Figure 2(b) shows the bifurcations in the system as η\eta varies, in the case where 𝐱±\mathbf{x}_{\pm} are stable spirals (η\eta does not affect the linear stability of 𝐱±\mathbf{x}_{\pm}). The remainder of this section discusses dynamics shown in this bifurcation diagram. We identify bifurcations using existing terminology where possible, though some bifurcations do not appear to have existing classification. Note, although we discuss the time-reversed dynamics in this section, in general most probable paths are not time-reversible unless there is a gradient structure [70].

For large η,\eta, in addition to the two stable foci x±,\textbf{x}_{\pm}, the system has three limit cycles of the piecewise-smooth flow: one stable crossing limit cycle with large amplitude and two unstable sliding limit cycles around each focus. See Figure 2(f) for a visualization of the phase portrait with all three limit cycles. In (b) the points at which the crossing limit cycle crosses Σ\Sigma are shown as brown curves, and the maximum and minimum of the intersection of each sliding limit cycle with the (repelling) sliding region are dashed orange and purple curves, respectively. Although dynamics on Σ\Sigma are not defined a priori, if we impose a sliding flow using the Filippov convex method, with

λ(y)=a+b(yη)a+b(yη)pq(y+η),\lambda(y)=\frac{-a+b(y-\eta)}{-a+b(y-\eta)-p-q(y+\eta)},

then there are at most two pseudoequilibria 𝐱s=(xs,ys)\mathbf{x}_{s}=(x_{s},y_{s}), with xs=0x_{s}=0 and ysy_{s} given by the root(s) of the quadratic [λ(ys)(r+ys+η)+(1λ(ys))(c+ysη)]\left[\lambda(y_{s})(r+y_{s}+\eta)+(1-\lambda(y_{s}))(-c+y_{s}-\eta)\right]. For a=p,a=p, b=q,b=q, and c=rc=r there is one pseudoequilibrium, at the origin.

As we decrease η\eta from 0.8, the two sliding cycles collide (though not at a point of tangency). Due to the repelling sliding region and nonuniqueness of solution curves leaving the sliding region, this collision leads to higher period behavior for both limit cycles while preserving the period-1 behavior, in that on the limit cycle a solution may either cycle around one focus, or both, or switch between the two. Although similar to a period-doubling bifurcation, in this case the limit cycle may have any number of periods, and the stability is unchanged by the bifurcation. This bifurcation is labeled as (IV)(IV) in Figure 2(b); for brevity, we refer to it as a ‘sliding cycle-nonunique period’ bifurcation. We indicate η\eta-values with potential higher-period limit cycles using gray shading. In reverse time, in which the unstable sliding cycles become stable and the repelling sliding region becomes attracting, this behaves like a period-doubling bifurcation of two sliding limit cycles. Neither the repelling nor attracting sliding cases appear to fit within recent classifications of piecewise-smooth bifurcations; see [71, 72, 73].

At (III)(III) in Figure 2(b), the point of tangency for the right sliding limit cycle collides with the maximum of the left sliding limit cycle on Σ,\Sigma, annihilating the left sliding cycle and thereby the higher-period limit cycles as well. However, the trajectory that formed the left sliding limit cycle persists, though it can no longer slide. Then at (II)(II) a similar bifurcation occurs when the minimum of the right sliding cycle collides with the tangent point of the left formerly-sliding cycle trajectory, breaking the right limit cycle.

At (I)(I) in Figure 2(b), the crossing limit cycle collides with both the left and right formerly-sliding cycle trajectories, breaking the cycle. Although the collision does not occur at the tangency points of the formerly-sliding trajectories, for brevity we refer to this as a ‘crossing cycle-tangency’ bifurcation. As with (IV),(IV), bifurcations (I)(III)(I)-(III) do not appear to have been previously classified.

Finally, at η=0\eta=0 the two points of tangency at the switching manifold collide and slide past each other, causing the repelling sliding region to become an attracting sliding region. In the context of Kuznetsov, Rinaldi, and Gragnani’s classification of bifurcations in planar piecewise-smooth systems, this appears to be a VV2VV_{2}-type bifurcation [71].

4.2 Most probable paths

We determine the most probable path through the switching manifold by smoothing out the boundary, then taking the limit of the resulting most probable path as our smoothing parameter goes to zero to return to the original piecewise-smooth system. We define the mollification of the vector field as in Equation (15), so that 𝐅ε(x,y)=(fε(x,y),gε(x,y))=(fε(𝐱),gε(𝐱)){\mathbf{F}^{\varepsilon}}(x,y)=\left(f^{\varepsilon}(x,y),\,g^{\varepsilon}(x,y)\right)=\left(f^{\varepsilon}(\mathbf{x}),\,g^{\varepsilon}(\mathbf{x})\right), where

fε(𝐱)\displaystyle f^{\varepsilon}(\mathbf{x}) ={f(𝐱),xε,f+(𝐱)Λ(xε)+f(𝐱)(1Λ(xε))aΦ(ε,x)pΦ(x,ε),|x|<ε,f+(𝐱),xε,\displaystyle=\begin{cases}\displaystyle f^{-}(\mathbf{x}),&x\leq-\varepsilon,\\ \displaystyle f^{+}(\mathbf{x})\Lambda\left(\frac{x}{\varepsilon}\right)+f^{-}(\mathbf{x})\left(1-\Lambda\left(\frac{x}{\varepsilon}\right)\right)-a\Phi(-\varepsilon,x)-p\Phi(x,\varepsilon),&|x|<\varepsilon,\\ f^{+}(\mathbf{x}),&x\geq\varepsilon,\end{cases} (24)
gε(𝐱)\displaystyle g^{\varepsilon}(\mathbf{x}) ={g(𝐱),xε,g+(𝐱)Λ(xε)+g(𝐱)(1Λ(xε))cΦ(ε,x)rΦ(x,ε),|x|<ε,g+(𝐱),xε.\displaystyle=\begin{cases}\displaystyle g^{-}(\mathbf{x}),&x\leq-\varepsilon,\\ \displaystyle g^{+}(\mathbf{x})\Lambda\left(\frac{x}{\varepsilon}\right)+g^{-}(\mathbf{x})\left(1-\Lambda\left(\frac{x}{\varepsilon}\right)\right)-c\Phi(-\varepsilon,x)-r\Phi(x,\varepsilon),&|x|<\varepsilon,\\ g^{+}(\mathbf{x}),&x\geq\varepsilon.\\ \end{cases}

Here, Λ(x/ε)\Lambda(x/\varepsilon) is defined as in Equation (19), and we define Φ(x1,x2)\Phi(x_{1},x_{2}) as

Φ(x1,x2)=x1x2uζε(u)𝑑u.\Phi(x_{1},x_{2})=\int_{x_{1}}^{x_{2}}u\zeta_{\varepsilon}(u)\,du.

Note that for |x|<ε|x|<\varepsilon, if a=pa=p the second two terms of fεf^{\varepsilon} vanish, and if c=rc=r the second two terms of gεg^{\varepsilon} vanish. There are now two switching manifolds in the system, Σ±ε={𝐱2:x=±ε},\Sigma^{\varepsilon}_{\pm}=\{\mathbf{x}\in{\mathbb{R}}^{2}:x=\pm\varepsilon\}, but the flow is smooth across them. Note that due to the linearity of the system, mollification does not affect the fixed points 𝐱±\mathbf{x}_{\pm} as long as ε<1\varepsilon<1.

The persistence of the pseudoequilibrium 𝐱s\mathbf{x}_{s} through mollification is not as straightforward as that of 𝐱±\mathbf{x}_{\pm}. Although a regular equilibrium may limit to a pseudoequilibrium as a smooth system becomes piecewise-smooth, nonlinear dynamics on the switching manifold may lead to multiple possible phenomena in smoothing out a piecewise-smooth field [54]. However, for the particular choice of 𝐅\mathbf{F} in this case study, after mollification there is a saddle equilibrium at the origin, which was the pre-mollification location of the pseudosaddle.

We introduce the scaling z=x/ε,z=x/\varepsilon, z[1,1]z\in[-1,1], to simplify the limit as ε0\varepsilon\rightarrow 0. With this change of variables, System (4),(24) becomes a fast-slow system,

εz˙\displaystyle\varepsilon\dot{z} =fε(εz,y),\displaystyle=f^{\varepsilon}(\varepsilon z,y),
y˙\displaystyle\dot{y} =gε(εz,y).\displaystyle=g^{\varepsilon}(\varepsilon z,y).

Minimizers of the rate functional I(t0,tf){I^{(t_{0},t_{f})}} (Equation (1)) in this rescaled flow are solutions of the EL equations

εz¨\displaystyle\varepsilon\ddot{z} =y˙(fyεgzε)+fεfzε+gεgzε,\displaystyle=\dot{y}\left(f^{\varepsilon}_{y}-g^{\varepsilon}_{z}\right)+f^{\varepsilon}f^{\varepsilon}_{z}+g^{\varepsilon}g^{\varepsilon}_{z},
y¨\displaystyle\ddot{y} =εz˙(gzεfyε)+fεfyε+gεgyε,\displaystyle=\varepsilon\dot{z}\left(g^{\varepsilon}_{z}-f^{\varepsilon}_{y}\right)+f^{\varepsilon}f^{\varepsilon}_{y}+g^{\varepsilon}g^{\varepsilon}_{y},

or in Hamiltonian form, with the conjugate momenta φ=εz˙fε\varphi=\varepsilon\dot{z}-f^{\varepsilon} and ψ=y˙gε,\psi=\dot{y}-g^{\varepsilon},

εz˙\displaystyle\varepsilon\dot{z} =fε(εz,y)+φ,\displaystyle=f^{\varepsilon}(\varepsilon z,y)+\varphi, (25)
y˙\displaystyle\dot{y} =gε(εz,y)+ψ,\displaystyle=g^{\varepsilon}(\varepsilon z,y)+\psi,
φ˙\displaystyle\dot{\varphi} =fzε(εz,y)φgzε(εz,y)ψ,\displaystyle=-f_{z}^{\varepsilon}(\varepsilon z,y)\varphi-g_{z}^{\varepsilon}(\varepsilon z,y)\psi,
ψ˙\displaystyle\dot{\psi} =fyε(εz,y)φψ.\displaystyle=-f_{y}^{\varepsilon}(\varepsilon z,y)\varphi-\psi.

This ensures that the stable fixed points 𝐱±\mathbf{x}_{\pm} persist through the rescaling, extended by two additional components due to φ\varphi and ψ\psi. Fixed points in this extended system are given by z±=(z±,y±,0,0)=(±1/ε,±η,0,0)\textbf{z}_{\pm}=(z_{\pm},y_{\pm},0,0)=(\pm 1/\varepsilon,\pm\eta,0,0).

The projection onto the xyxy-plane of the solution to System (25), which is the most probable path of System (4),(23), is shown in Figure 3. Note that we have used xx instead of zz to plot the most probable path for several values of ε\varepsilon together. Figure 3(a) shows the most probable path for the mollified rescaled system, System (25) with ε=0.5,\varepsilon=0.5, including the deterministic flow from 𝐱s\mathbf{x}_{s} to 𝐱\mathbf{x}_{-}. Figure 3(b) shows the most probable path for decreasing values of ε\varepsilon, and Figure 3(c) shows a zoomed-in region of (b). As illustrated in (b) and (c), this is consistent with the minimizer of the rate functional for the piecewise-smooth system as ε0\varepsilon\rightarrow 0. Here and in the remaining figures for this case study, we have used the function h(x)=exp(1/(x21))h(x)=\exp\left(1/\left(x^{2}-1\right)\right) to construct the mollifier. See Appendix B for the derivation of the most probable path as the solution of System (25).

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Figure 3: (a) Most probable path of System (4),(23) from 𝐱+\mathbf{x}_{+} to 𝐱\mathbf{x}_{-}. (b) Most probable path as in (a), with varying ε\varepsilon. (c) Zoomed region of (b). All parameters given by a=p=2,b=q=7,c=r=1,a=p=-2,\,b=q=-7,\,c=r=1, and η=2\eta=-2. In (a) ε=0.5\varepsilon=0.5, and in (b),(c) ε=0.5, 0.1, 0.05,\varepsilon=0.5,\,0.1,\,0.05, and ε0\varepsilon\rightarrow 0 (light grey to black curves).

We can numerically validate the most probable path using Monte Carlo simulations of System (2),(23) with additive noise using the Euler-Maruyama scheme. Note that this system has a discontinuous and non-monotonic drift coefficient. Euler-Maruyama has been shown to converge in probability for such systems [74, 75], but results for stronger convergence remain elusive; see [76] and references therein. Since we are concerned with convergence of the mean of the distribution of paths, Euler-Maruyama is sufficient for this investigation.

Refer to caption
(a)
Refer to caption
(b)
Figure 4: Distribution of tipping events from Monte Carlo simulations and most probable paths (red and black curves) from 𝐱+\mathbf{x}_{+} to 𝐱\mathbf{x}_{-}, with (a) attracting sliding (η=2\eta=-2; 6,487 tips with N=1.056×107N=1.056\times 10^{7} simulations) and (b) repelling sliding (η=1\eta=1; 1,960 tips with N=1.676×107N=1.676\times 10^{7} simulations). Parameters used are a=p=2,a=p=-2, b=q=7,b=q=-7, c=r=1,c=r=1, and σ=0.3\sigma=0.3. All other curves are as in Figure 2.

We performed Monte Carlo simulations for two values of η\eta: η=1,\eta=1, where there is repelling sliding along Σ\Sigma, and η=2,\eta=-2, where there is attracting sliding. See Figure 4 for histograms of the solutions to System (2),(23) that ‘tip’ from 𝐱+\mathbf{x}_{+} to a neighborhood of 𝐱\mathbf{x}_{-}. For (a) η=2\eta=-2, transition paths tip over the basin boundary Σ\Sigma by tracking the attracting sliding region until it ends. Recall that in this parameter regime, there are only two limit points in the deterministic system. For (b) η=1,\eta=1, transition paths tip to a neighborhood of 𝐱\mathbf{x}_{-} in a sequence of basin-crossings, first over the x>0x>0 unstable sliding cycle, then primarily following the deterministic flow toward the crossing limit cycle, but ultimately crossing over the x<0x<0 unstable sliding cycle and approaching 𝐱\mathbf{x}_{-}. For all figures, we have overlaid the most probable path calculated as the path 𝜶\bm{\alpha} that minimizes the rate functional, Equation (20).

Note that in Figure 4(b), we indicate both the most probable path followed by the Monte Carlo simulations (black curve) and the predicted family of nonunique most probable paths that slide (red curves). These paths coincide for x>0x>0 but are distinct for x0.x\leq 0. The observed most probable path in the left-half plane corresponds to the solutions to the EL equations with boundary conditions 𝐱(t1)=(0,maxyΣRy)\mathbf{x}(t_{1})=(0,\max_{y\in\Sigma_{R}}y) and 𝐱(tf)=𝐱.\mathbf{x}(t_{f})=\mathbf{x}_{-}. We calculate such solutions using the gradient flow; that is, we consider solutions 𝜶:[t1,tf]2\bm{\alpha}:[t_{1},t_{f}]\mapsto{\mathbb{R}}^{2} to

𝜶s\displaystyle\frac{\partial\bm{\alpha}}{\partial s} =δI¯(t0,tf)δ𝜶=𝜶¨𝐅𝐅+(𝐅𝐅)𝜶˙,\displaystyle=-\frac{\delta{\overline{I}^{(t_{0},t_{f})}}}{\delta\bm{\alpha}}=\ddot{\bm{\alpha}}-\nabla\mathbf{F}^{\intercal}\mathbf{F}+(\nabla\mathbf{F}-\nabla\mathbf{F}^{\intercal})\dot{\bm{\alpha}}, (26)
𝜶(s,t1)\displaystyle\bm{\alpha}(s,t_{1}) =(0,maxyΣRy),\displaystyle=(0,\max_{y\in\Sigma_{R}}y),
𝜶(s,tf)\displaystyle\bm{\alpha}(s,t_{f}) =𝐱,\displaystyle=\mathbf{x}_{-},
𝜶(0,t)\displaystyle\bm{\alpha}(0,t) =𝐱g(t),\displaystyle=\mathbf{x}_{g}(t),

where 𝐱g(t)\mathbf{x}_{g}(t) is a smooth function satisfying 𝐱g(t1)=(0,maxyΣRy)\mathbf{x}_{g}(t_{1})=(0,\max_{y\in\Sigma_{R}}y) and 𝐱g(tf)=𝐱,\mathbf{x}_{g}(t_{f})=\mathbf{x}_{-}, and s>0s>0 is an artificial time. Along solution curves α(s,t)\alpha(s,t),

ddsI¯(t0,tf)[𝜶(s,t)]0.\frac{d}{ds}{\overline{I}^{(t_{0},t_{f})}}[\bm{\alpha}(s,t)]\leq 0.

Therefore I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} is a Lyapunov functional and thus lims𝜶(s,t)=𝜶(t)\lim_{s\rightarrow\infty}\bm{\alpha}(s,t)=\bm{\alpha}^{*}(t), where 𝜶(t)\bm{\alpha}^{*}(t) is a solution to the EL equations, System (25). That is, steady state solutions of Equation (26) solve the Euler-Lagrange equations. The converged solution to the IBVP in Equation (26), approximated using the Forward Time Centered Space finite difference scheme with ss as the time variable and tt as the space variable, is shown as the black curve in Figure 4(b).

Refer to caption
Figure 5: Values of the derived rate functional, Equation (20) for η=1\eta=1 and 𝐅=𝐅ε\mathbf{F}={\mathbf{F}^{\varepsilon}}, as ε\varepsilon varies. The black line corresponds to the rate functional value for the crossing most probable path and the red lines correspond to the functional values for the non-unique family of sliding most probable paths.

From Equation (20) we might expect the predicted most probable paths in Figure 4(b) to minimize I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} because I¯(t1,tf)=0\overline{I}^{(t_{1},t_{f})}=0. However, if we consider the predicted most probable path of the piecewise-smooth system in the context of the mollified system and calculate Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} for several values of ε>0,\varepsilon>0, the contribution of the rate functional as ε0\varepsilon\rightarrow 0 is substantial: the minimum rate functional values for the predicted family of most probable paths are two magnitudes greater than those of the observed most probable path; see Figure 5. This shows anecdotally that although the minimizers of the Freidlin-Wentzell rate functional converge weakly to minimizers of Equation (20), they may not converge strongly. Furthermore, since the repelling sliding region has Lebesgue measure zero, the probability that the Euler-Maruyama simulations lie on Σ\Sigma at time tt is zero for all tt.

5 Case Study 2: Non-autonomous system in 1D

In this section, we extend the most probable path framework to a linear one-dimensional non-autonomous system with periodic forcing. Consider Equation (2), where 𝐅±:×\mathbf{F}^{\pm}:{\mathbb{R}}\times{\mathbb{R}}\mapsto{\mathbb{R}} are defined by

𝐅+=f+(x,t)=r+(x1)+A+cos(2πt),𝐅=f(x,t)=r(xa)+Acos(2π(tp)).\begin{split}\mathbf{F}^{+}=f^{+}(x,t)&=-r_{+}(x-1)+A_{+}\cos(2\pi t),\\ \mathbf{F}^{-}=f^{-}(x,t)&=-r_{-}(x-a)+A_{-}\cos(2\pi(t-p)).\end{split} (27)

Here, r±,A±,p0r_{\pm},A_{\pm},p\geq 0 and a0a\leq 0 are parameters. Since 𝐅\mathbf{F} is a scalar here, we will use the notation 𝐅=f\mathbf{F}=f throughout this case study. This case study represents a prototypical example of a one-dimensional periodically forced model with a switching manifold in the state variable, which is a typical feature of some low-dimensional conceptual climate models; see, e.g., [3, 5, 46, 47, 49, 64].

Since we are considering periodic forcing, we view the flow generated by System (4),(27) as lying on the cylindrical extended phase space Π=×S1\Pi={\mathbb{R}}\times S^{1}, where S1S^{1} denotes a circle. More precisely, we make the change of variable τ=t\tau=t, augmenting Equation (4) by the additional equation dτ/dt=1d\tau/dt=1, and apply periodic boundary conditions at τ=0\tau=0 and τ=1\tau=1. However, for ease of presentation, we will continue to use Equation (4) to represent the dynamics on Π\Pi. With this convention, Σ\Sigma and S±S_{\pm} are again defined as in Section 1 and the underlying SDE is given by

dx=f(x)dt+σdW,dx=f(x)dt+\sigma dW, (28)

where, following Filippov’s convex combination method, the deterministic skeleton is given by

x˙=f(x)={f+(x,t),(x,t)S+Σ+,f(x,t),(x,t)SΣ,0,(x,t)ΣAΣR.\dot{x}=f(x)=\begin{cases}f^{+}(x,t),&(x,t)\in S_{+}\cup\Sigma_{+},\\ f^{-}(x,t),&(x,t)\in S_{-}\cup\Sigma_{-},\\ 0,&(x,t)\in\Sigma_{A}\cup\Sigma_{R}.\end{cases} (29)

5.1 Deterministic dynamics

Since f(x,t)f(x,t) is linear in xx for (x,t)Σ(x,t)\notin\Sigma, the general solutions to Equation (7) on S±S_{\pm} are given by

x±(t)=C±er±t+h±(t),x_{\pm}(t)=C_{\pm}e^{-r_{\pm}t}+h_{\pm}(t),

respectively. Here C±C_{\pm}\in{\mathbb{R}} are integration constants and

h+(t)=1+r+A+r+2+4π2cos(2πt)+2A+πr+2+4π2sin(2πt),h(t)=a+rAr2+4π2cos(2π(tp))+2Aπr2+4π2sin(2π(tp)).\begin{split}h_{+}(t)&=1+\frac{r_{+}A_{+}}{r_{+}^{2}+4\pi^{2}}\cos(2\pi t)+\frac{2A_{+}\pi}{r_{+}^{2}+4\pi^{2}}\sin(2\pi t),\\ h_{-}(t)&=a+\frac{r_{-}A_{-}}{r_{-}^{2}+4\pi^{2}}\cos(2\pi(t-p))+\frac{2A_{-}\pi}{r_{-}^{2}+4\pi^{2}}\sin(2\pi(t-p)).\end{split}

A solution with initial conditions in either S+S_{+} or SS_{-} is then constructed by selecting C±C_{\pm} to satisfy the initial condition as well as the continuity assumption on Σ\Sigma. Note, in this construction solutions to Equation (7) may be non-unique, since solutions can intersect in forward time when tracking ΣA\Sigma_{A} and in backwards time on ΣR\Sigma_{R}.

We now consider the various parameter regimes of interest. First, if the following inequalities are satisfied:

A+<r+2+4π2 and A<ar2+4π2,A_{+}<\sqrt{r_{+}^{2}+4\pi^{2}}\quad\text{ and }\quad A_{-}<-a\sqrt{r_{-}^{2}+4\pi^{2}}, (30)

then h±(t)h_{\pm}(t) do not intersect Σ\Sigma and thus are stable limit cycle solutions to Equation (7). Since we are ultimately interested in noise-induced transitions from one stable limit cycle to another, we will assume these inequalities hold throughout the rest of the case study. Second, the geometry of the basins of attraction B±B_{\pm} for h±(t)h_{\pm}(t), respectively, and the existence of ΣA,\Sigma_{A}, ΣR,\Sigma_{R}, and Σ±\Sigma_{\pm} depend on whether the nullclines of ff intersect Σ\Sigma. The nullclines are given by curves along which f±(x,t)=0;f^{\pm}(x,t)=0; for this system, this occurs along n±(t),n_{\pm}(t), where

n+(t)=1+A+r+cos(2πt) and n(t)=a+Arcos(2π(tp).n_{+}(t)=1+\frac{A_{+}}{r_{+}}\cos(2\pi t)\quad\mbox{ and }\quad n_{-}(t)=a+\frac{A_{-}}{r_{-}}\cos(2\pi(t-p).

There are thus four cases to consider which are illustrated in Figure 6:

  1. 1.

    If A+<r+A_{+}<r_{+} and A<r|a|,A_{-}<r_{-}|a|, then neither nullcline intersects Σ\Sigma and thus ΣR=Σ\Sigma_{R}=\Sigma and Σ±=ΣA=\Sigma_{\pm}=\Sigma_{A}=\emptyset; see Figure 6(a). In this case B±B_{\pm} correspond to S+S_{+} and SS_{-}.

  2. 2.

    If A+>r+A_{+}>r_{+} and A<r|a|,A_{-}<r_{-}|a|, then n+n_{+} intersects Σ\Sigma and thus Σ\Sigma_{-}\neq\emptyset, ΣR=Σ\Sigma_{R}={\mathbb{R}}-\Sigma_{-}, and Σ+=ΣA=\Sigma_{+}=\Sigma_{A}=\emptyset; see Figure 6(b). In this case BB_{-} has a nontrivial intersection with S+S_{+} and Σ\Sigma.

  3. 3.

    If A+<r+A_{+}<r_{+} and A>r|a|,A_{-}>r_{-}|a|, then nn_{-} intersects Σ\Sigma and thus Σ+\Sigma_{+}\neq\emptyset, ΣR=ΣΣ+\Sigma_{R}=\Sigma-\Sigma_{+}, and Σ=ΣA=\Sigma_{-}=\Sigma_{A}=\emptyset; see Figure 6(c). In this case B+B_{+} has a nontrivial intersection with SS_{-} and Σ\Sigma.

  4. 4.

    If A+>r+A_{+}>r_{+} and A>r|x0|,A_{-}>r_{-}|x_{0}|, then both n±n_{\pm} intersect Σ\Sigma. In this case Σ±\Sigma_{\pm}\neq\emptyset but, depending on the phase shift pp, ΣA\Sigma_{A} can be either empty or nonempty; see Figure 6(d)-(f). Moreover, B±B_{\pm} can both have a nontrivial intersection with S+S_{+} and SS_{-}.

Refer to caption
(a) Neither n±n_{\pm} intersect Σ\Sigma
Refer to caption
(b) n+n_{+} intersects Σ\Sigma
Refer to caption
(c) nn_{-} intersects Σ\Sigma
Refer to caption
(d) n±n_{\pm} both intersect Σ\Sigma
Refer to caption
(e) n±n_{\pm} both intersect Σ\Sigma
Refer to caption
(f) n±n_{\pm} both intersect Σ\Sigma
Figure 6: Example phase portraits for Equations (7)-(8). The red curves show the stable limit cycles h±h_{\pm}, the black curves show the nullclines n±n_{\pm}, the green line is ΣR\Sigma_{R}, the blue line is ΣA\Sigma_{A}, and the dashed black lines are Σ±\Sigma_{\pm}. The dark and light gray regions correspond to the basins of attraction B±B_{\pm} for h±h_{\pm} respectively.

5.2 Most probable paths

In contrast to Case Study 1, this system does not contain stable fixed points but instead stable limit cycles. Consequently, for t0<tft_{0}<t_{f}, we consider the problem of determining the most probable path from h(t0)h_{-}(t_{0}) to h+(tf)h_{+}(t_{f}). That is, we consider the family of optimization problems parameterized by t0t_{0} and tft_{f} and define the most probable transition path from hh_{-} to h+h_{+} as the minimum over this family. To do so, we redefine the admissible set of transition paths 𝒜(t0,tf){\mathcal{A}^{(t_{0},t_{f})}} by

𝒜(t0,tf)={αH1([t0,tf];):α(t0)=h(t0) and α(tf)=h+(tf)},{\mathcal{A}^{(t_{0},t_{f})}}=\{\alpha\in H^{1}([t_{0},t_{f}];{\mathbb{R}}):\alpha(t_{0})=h_{-}(t_{0})\text{ and }\alpha(t_{f})=h_{+}(t_{f})\},

and consider the optimization problem

inft0<tfinfα𝒜(t0,tf)I¯(t0,tf)[α],\inf_{t_{0}<t_{f}}\inf_{\alpha\in{\mathcal{A}^{(t_{0},t_{f})}}}{\overline{I}^{(t_{0},t_{f})}}[\alpha],

where it follows from Theorem 3.8 that

I¯(t0,tf)[α]=[α]|α˙(t)f(α(t),t)|2𝑑t+Σ[α]minλ[0,1]{[λf+(0,t)+(1λ)f(0,t)]2}𝑑t.{\overline{I}^{(t_{0},t_{f})}}[\alpha]=\int_{\mathcal{I}[\alpha]}\left|\dot{\alpha}(t)-f(\alpha(t),t)\right|^{2}\,dt+\int_{\mathcal{I}_{\Sigma}[\alpha]}\min_{\lambda\in[0,1]}\left\{\left[\lambda f^{+}(0,t)+\left(1-\lambda\right)f^{-}(0,t)\right]^{2}\right\}dt. (31)

In this case study, the minimum over λ[0,1]\lambda\in[0,1] in the second integral of I¯(t0,tf){\overline{I}^{(t_{0},t_{f})}} can be explicitly calculated. First, for times in which α\alpha lies in ΣAΣR\Sigma_{A}\cup\Sigma_{R}, the minimum is obtained by setting λ=f(0,t)/(f(0,t)f+(0,t)).\lambda=f^{-}(0,t)/\left(f^{-}(0,t)-f^{+}(0,t)\right). Second, for times in which α\alpha lies in Σ±\Sigma_{\pm}, f+(0,t)f^{+}(0,t) and f(0,t)f^{-}(0,t) have the same sign and thus the minimum is obtained when λ=0\lambda=0 or λ=1\lambda=1. Therefore, it follows that

I¯(t0,tf)[α]=[α]|α˙(t)f(α,t)|2dt+Σ±[α]min{|f+(0,t)|,|f(0,t)|}2dt.{\overline{I}^{(t_{0},t_{f})}}[\alpha]=\int_{\mathcal{I}[\alpha]}|\dot{\alpha}(t)-f(\alpha,t)|^{2}dt+\int_{\mathcal{I}_{\Sigma_{\pm}}[\alpha]}\min\{|f^{+}(0,t)|,|f^{-}(0,t)|\}^{2}dt. (32)

To study the optimization problem defined by Equation (32) we first present the following lemma which simplifies the analysis.

Lemma 5.1.

For the vector field defined by Equation (27) with parameters satisfying (30):

inft0<tfinfα𝒜(t0,tf)I¯(t0,tf)[α]=infα𝒜(,)I¯(,)[α].\inf_{t_{0}<t_{f}}\inf_{\alpha\in\mathcal{A}^{(t_{0},t_{f})}}{\overline{I}^{(t_{0},t_{f})}}[\alpha]=\inf_{\alpha\in\mathcal{A}^{(-\infty,\infty)}}\overline{I}^{(-\infty,\infty)}[\alpha]. (33)
Proof.

Since f±(x,t)f^{\pm}(x,t) have linear growth in xx and are asymptotically inward-flowing, it follows from Theorem 3.7 that for t0<tft_{0}<t_{f} satisfying t0>t_{0}>-\infty and tf<t_{f}<\infty there exists α𝒜(t0,tf)\alpha^{*}\in\mathcal{A}^{(t_{0},t_{f})} such that

I¯(t0,tf)[α]=infα𝒜(t0,tf)I¯(t0,tf)[α].{\overline{I}^{(t_{0},t_{f})}}[\alpha^{*}]=\inf_{\alpha\in{\mathcal{A}^{(t_{0},t_{f})}}}{\overline{I}^{(t_{0},t_{f})}}[\alpha].

Define α¯𝒜(,)\bar{\alpha}\in\mathcal{A}^{(-\infty,\infty)} by

α¯(t)={h(t),tt0,α(t),t0<t<tf,h+(t),t>tf.\bar{\alpha}(t)=\begin{cases}h^{-}(t),&t\leq t_{0},\\ \alpha^{*}(t),&t_{0}<t<t_{f},\\ h^{+}(t),&t>t_{f}.\end{cases}

By construction, α¯˙(t)=f(α¯(t),t)\dot{\bar{\alpha}}(t)=f(\bar{\alpha}(t),t) for tt0t\leq t_{0} and ttf,t\geq t_{f}, and thus

infα𝒜(,)I¯(,)[α]I¯(,)[α¯]=I¯(t0,tf)[α]=infα𝒜(t0,tf)I¯(t0,tf)[α].\inf_{\alpha\in\mathcal{A}^{(-\infty,\infty)}}\overline{I}^{(-\infty,\infty)}[\alpha]\leq\overline{I}^{(-\infty,\infty)}[\bar{\alpha}]={\overline{I}^{(t_{0},t_{f})}}[\alpha^{*}]=\inf_{\alpha\in{\mathcal{A}^{(t_{0},t_{f})}}}{\overline{I}^{(t_{0},t_{f})}}[\alpha].

Since this inequality is true for all values of t0<tft_{0}<t_{f} satisfying t0t_{0}\neq-\infty and tf,t_{f}\neq\infty, the result follows. ∎

Interestingly, it follows from Lemma 33 that minimizers of Equation (31) are not unique. Specifically, since f±(x,t)=f±(x,t+T)f^{\pm}(x,t)=f^{\pm}(x,t+T) for TT\in\mathbb{Z}, it follows that if α𝒜(,)\alpha^{*}\in\mathcal{A}^{(-\infty,\infty)} is a minimum then αT𝒜(,)\alpha^{*}_{T}\in\mathcal{A}^{(-\infty,\infty)} defined by αT(t)=α(t+T)\alpha^{*}_{T}(t)=\alpha^{*}(t+T) is a minimum as well. Nevertheless, when represented as curves in the cylndrical phase space Π\Pi these curves are identical. However, I¯(,)\overline{I}^{(-\infty,\infty)} can possibly contain an uncountable number of non-unique minimizers even on Π\Pi. This possibility arises when ΣR\Sigma_{R} and the closure of B+,B_{+}, which we denote as B¯+,\overline{B}_{+}, have a non-trivial intersection, and a global minimizer of I¯(,)\overline{I}^{(-\infty,\infty)} intersects this region of ΣR\Sigma_{R}. Each member of the uncountable family of global minimizers can then be constructed by joining three curves as follows: the first minimizes the functional to ΣR\Sigma_{R}, the second tracks ΣR\Sigma_{R} until reaching an arbitrary point t=st=s in the intersection of ΣR\Sigma_{R} with B¯+\overline{B}_{+}, and the third curve leaves ΣR\Sigma_{R} and tracks the drift; see Figure 7(a). Key to this construction is that for Filippov systems there is a fundamental non-uniqueness in how solution curves exit a repelling sliding region, and minimizers of I¯(,)\overline{I}^{(-\infty,\infty)} can track the drift at no cost.

Refer to caption
(a)
Refer to caption
(b)
Figure 7: (a) Non-unique minimizers of I¯(,)\overline{I}^{(-\infty,\infty)} (blue curves) overlaid on the deterministic dynamics in the case when Σ=ΣR\Sigma=\Sigma_{R}. (b) Most probable path overlaid on the probability density generated by Monte-Carlo simulations of Equation (28) in a parameter regime in which (tmax,0)B+(t_{\max},0)\in B_{+}. The solid black line corresponds to the most probable path constructed using Theorem 5.2 and the dashed magenta curve is the mean of the probability density. All other curves in both (a) and (b) follow the same convention as in Figure 6. The parameter values used in both (a) and (b) are r+=2,r_{+}=2, r=3,r_{-}=3, p=0p=0, A+=1,A_{+}=1, A=1,A_{-}=1, and a=0.5a=-0.5. We set σ=0.2\sigma=0.2 for the Monte-Carlo simulations.

The following theorem summarizes when explicit minimizers of I¯(,)\overline{I}^{(-\infty,\infty)} can be calculated for this case study. The key to the proof is the fact that since the system is piecewise linear, a change of variables converts the problem of determining the most probable path of the non-autonomous system to Σ\Sigma to that of an autonomous system. Moreover, the autonomous system is a gradient system and thus by Kramers’ law, the most probable transition through Σ\Sigma occurs when the potential difference is a minimum [77, 23]. In this case, this corresponds precisely to when the separation between hh_{-} and Σ\Sigma is minimal,

tmax=p+12πtan1(2πr)+T,t_{max}=p+\frac{1}{2\pi}\tan^{-1}\left(\frac{2\pi}{r_{-}}\right)+T,

where T;T\in\mathbb{Z}; i.e., times when hh_{-} has a maximum.

Theorem 5.2.

If (tmax,0)B¯+(t_{\max},0)\in\overline{B}_{+}, where tmax=p+12πtan1(2πr)+Tt_{\max}=p+\frac{1}{2\pi}\tan^{-1}\left(\frac{2\pi}{r_{-}}\right)+T, and α𝒜(,)\alpha^{*}\in\mathcal{A}^{(-\infty,\infty)} is defined piecewise by

{α(t)=h(tmax)er(ttmax)+h(t),ttmax,α˙(t) satisfies (29),t>tmax,\begin{cases}\alpha^{*}(t)=-h_{-}(t_{\max})e^{r_{-}(t-t_{\max})}+h_{-}(t),&t\leq t_{\max},\\ \dot{\alpha}^{*}(t)\text{ satisfies \eqref{Eq:DetSkeleton}},&t>t_{\max},\end{cases}

then

2h(tmax)2=I¯(,)[α]=infα𝒜(,)I¯(,)[α].2h_{-}(t_{\max})^{2}=\overline{I}^{(-\infty,\infty)}[\alpha^{*}]=\inf_{\alpha\in\mathcal{A}^{(-\infty,\infty)}}\overline{I}^{(-\infty,\infty)}[\alpha].
Proof.

We first compute the most probable transition paths to Σ\Sigma by determining the most probable transition paths for the SDE

dx=f(x,t)dt+σdWdx=f^{-}(x,t)dt+\sigma dW

from h(t)h_{-}(t) to Σ\Sigma. To do so, it is convenient to introduce the change of variables y(t)=x(t)h(t),y(t)=x(t)-h_{-}(t), yielding

dy=rydt+σdW,dy=-r_{-}ydt+\sigma dW,

which corresponds to a gradient system with potential V(y)=ry2/2V(y)=r_{-}y^{2}/2. With this change of variables, the relevant admissible set 𝒜¯Σ(t0,tm){\overline{\mathcal{A}}_{\Sigma}^{(t_{0},t_{m})}} and functional I¯Σ(t0,tm):𝒜¯Σ(t0,tm){\overline{I}_{\Sigma}^{(t_{0},t_{m})}}:{\overline{\mathcal{A}}_{\Sigma}^{(t_{0},t_{m})}}\mapsto\mathbb{R} are defined by

𝒜¯Σ(t0,tm)\displaystyle{\overline{\mathcal{A}}_{\Sigma}^{(t_{0},t_{m})}} ={γH1([t0,tm];):γ(t0)=0,γ(tm)=h(tm), and γ(t)<h(t))},\displaystyle=\{\gamma\in H^{1}([t_{0},t_{m}];{\mathbb{R}}):\gamma(t_{0})=0,\gamma(t_{m})=-h_{-}(t_{m}),\text{ and }\gamma(t)<-h_{-}(t))\},
I¯Σ(t0,tm)[γ]\displaystyle{\overline{I}_{\Sigma}^{(t_{0},t_{m})}}[\gamma] =t0tm|γ˙(t)+rγ(t)|2𝑑t=t0tm|γ˙(t)+V(γ(t))|2𝑑t.\displaystyle=\int_{t_{0}}^{t_{m}}\left|\dot{\gamma}(t)+r_{-}\gamma(t)\right|^{2}\,dt=\int_{t_{0}}^{t_{m}}\left|\dot{\gamma}(t)+V^{\prime}(\gamma(t))\right|^{2}\,dt.

Expanding and integrating, it follows that for all γ𝒜¯Σ(t0,tm)\gamma\in{\overline{\mathcal{A}}_{\Sigma}^{(t_{0},t_{m})}}:

I¯Σ(t0,tm)[γ]\displaystyle{\overline{I}_{\Sigma}^{(t_{0},t_{m})}}[\gamma] =t0tm[γ˙(t)2+2γ˙(t)V(γ(t))+V(γ(t))2]𝑑t\displaystyle=\int_{t_{0}}^{t_{m}}\left[\dot{\gamma}(t)^{2}+2\dot{\gamma}(t)V^{\prime}(\gamma(t))+V^{\prime}(\gamma(t))^{2}\right]dt
=t0tm[(γ˙V(γ(t)))2+4ddtV(γ(t))]𝑑t\displaystyle=\int_{t_{0}}^{t_{m}}\left[\left(\dot{\gamma}-V^{\prime}(\gamma(t))\right)^{2}+4\frac{d}{dt}V(\gamma(t))\right]dt
=t0tm[γ˙V(γ(t))]2𝑑t+4[V(γ(tm))V(γ(t0))]\displaystyle=\int_{t_{0}}^{t_{m}}\left[\dot{\gamma}-V^{\prime}(\gamma(t))\right]^{2}dt+4\left[V(\gamma(t_{m}))-V(\gamma(t_{0}))\right]
4V(γ(tm))\displaystyle\geq 4V(\gamma(t_{m}))
=2h(tm)2.\displaystyle=2h(t_{m})^{2}.

This lower bound is an equality if t0=t_{0}=-\infty and γ˙=V(y)\dot{\gamma}=V^{\prime}(y), i.e. γ\gamma tracks the time-reversed dynamics. Consequently, setting tm=tmaxt_{m}=t_{\max} it implies that γ(t)=h(tmax)er(ttmax)\gamma^{*}(t)=-h_{-}({t_{\max}})e^{r_{-}(t-t_{\max})} minimizes I¯Σ(,tm)\overline{I}^{(-\infty,t_{m})}_{\Sigma} over all tmt_{m}\in\mathbb{R}.

Finally, for all α𝒜(,),\alpha\in\mathcal{A}^{(-\infty,\infty)}, if we let tc=min{t:α(t)=0}t_{c}=\min\{t:\alpha(t)=0\} denote the first time α\alpha crosses Σ\Sigma, then by the above inequality,

I¯(,)[α]I¯Σ(,tc)[αh]=2h(tc)22h(tmax)2.\overline{I}^{(-\infty,\infty)}[\alpha]\geq\overline{I}_{\Sigma}^{(-\infty,t_{c})}[\alpha-h_{-}]=2h_{-}(t_{c})^{2}\geq 2h_{-}(t_{\max})^{2}.

Therefore, since the integrand of I¯(,)[α]\overline{I}^{(-\infty,\infty)}[\alpha] vanishes for times in which α˙\dot{\alpha} satisfies Equation (29)\eqref{Eq:DetSkeleton}, it follows that if (tmax,0)B¯+(t_{\max},0)\in\overline{B}_{+} then α\alpha^{*} as constructed in the hypothesis of the theorem achieves this lower bound. ∎

5.3 Comparison of most probable paths to Monte-Carlo simulations

As in Section 4, we compare the most probable paths with Monte-Carlo simulations of tipping events using the Euler-Maruyama scheme to numerically approximate realizations of Equation (28). Specifically, for a realization χ\chi of Equation (28) we define the following tipping time

τ+(χ)=min{t:χ(t)>h+(t)},\tau_{+}(\chi)=\min\{t:\chi(t)>h_{+}(t)\},

and given t0<tft_{0}<t_{f} we say χ\chi is a tipping event on the interval [t0,tf][t_{0},t_{f}] if τ+(χ)<tf\tau_{+}(\chi)<t_{f}. The Monte-Carlo simulations are then conducted until the distribution of tipping events on [t0,tf][t_{0},t_{f}] converges.

In the case when (tmax,0)B¯+(t_{\max},0)\in\overline{B}_{+}, Theorem 5.2 provides an explicit construction for the most probable paths which can be directly compared with the distribution of tipping events. In Figure 7(b) we plot the most probable path overlaid on the probability density of the tipping events on the interval [0,3][0,3] for the same parameter values used to generate Figure 7(a). In this case, B¯+={x:x0}\overline{B}_{+}=\{x:x\geq 0\} and thus (tmax,0)B¯+(t_{\max},0)\in\overline{B}_{+}. Figure 7(b) illustrates that there is excellent agreement between the predicted most probable path and the mean and mode of the probability density generated by the Monte-Carlo simulations. Note, however, that the most probable path plotted is one that crosses and does not slide as in the other potential most probable paths illustrated in Figure 7(a). Clearly, while the minimizers of I¯(,)\overline{I}^{(-\infty,\infty)} are not unique, the tipping events concentrate about the most probable path that does not slide.

When (tmax,0)B¯+(t_{\max},0)\notin\overline{B}_{+} the most probable paths cannot be directly computed and thus we again approximate the most probable paths by numerically computing the stationary curves of the gradient flow. In this case, to make the problem numerically tractable, we use the mollified vector field computed using a Gaussian kernel

ζε(x)=1ε2πexp(x22ε2).\zeta_{\varepsilon}(x)=\frac{1}{\varepsilon\sqrt{2\pi}}\exp\left(-\frac{x^{2}}{2\varepsilon^{2}}\right). (34)

While ζε\zeta_{\varepsilon} does not have compact support, ζ1(x)<1016\zeta_{1}(x)<10^{-16} for |x|9|x|\geq 9 and thus these kernels serve as an accurate approximation of a function with compact support. With this kernel the mollified vector field for this problem becomes

fε(x,t)=ε(rr+)2π+12(f+(x,t)(1+erf(xε2))+f(x,t)(1erf(xε2))),f^{\varepsilon}(x,t)=\varepsilon\frac{(r_{-}-r_{+})}{\sqrt{2\pi}}+\frac{1}{2}\left(f_{+}(x,t)\left(1+\text{erf}\left(\frac{x}{\varepsilon\sqrt{2}}\right)\right)+f_{-}(x,t)\left(1-\text{erf}\left(\frac{x}{\varepsilon\sqrt{2}}\right)\right)\right), (35)

where erf(x)=2π120xexp(s2)𝑑s\text{erf}(x)=2\pi^{-\frac{1}{2}}\int_{0}^{x}\exp(-s^{2})\,ds denotes the standard error function. The gradient flow in this case is given by the partial differential equation

αs\displaystyle\frac{\partial\alpha}{\partial s} =12δIδα=2αt2fεt|(α,t)fεα|(α,t)fε(α,t),\displaystyle=-\frac{1}{2}\frac{\delta I}{\delta\alpha}=\frac{\partial^{2}\alpha}{\partial t^{2}}-\left.\frac{\partial f^{\varepsilon}}{\partial t}\right|_{(\alpha,t)}-\left.\frac{\partial f^{\varepsilon}}{\partial\alpha}\right|_{(\alpha,t)}f^{\varepsilon}(\alpha,t), (36)
α(s,t0)\displaystyle\alpha(s,t_{0}) =h(t0),\displaystyle=h_{-}(t_{0}),
α(s,tf)\displaystyle\alpha(s,t_{f}) =h+(tf),\displaystyle=h_{+}(t_{f}),
α(0,t)\displaystyle\alpha(0,t) =xg(t),\displaystyle=x_{g}(t),

where xg(t)x_{g}(t) is a smooth function satisfying xg(t0)=h(t0)x_{g}(t_{0})=h_{-}(t_{0}) and xg(tf)=h+(tf)x_{g}(t_{f})=h_{+}(t_{f}) and again s>0s>0 is an artificial time. Note, since we used a Gaussian kernel for the mollification, the functional forms of fεf^{\varepsilon} and its various derivatives are in terms of Gaussian and error functions and thus Equation (36) can be numerically solved using a Forward Time Centered Space finite difference scheme.

In Figure 8 we plot the numerically computed most probable path for the mollified vector field (35) overlaid on the density of tipping events for two sets of parameters in which (tmax,0)B¯+(t_{\max},0)\notin\overline{B}_{+}. In Figure 8(a) the n+n_{+} nullcline intersects Σ\Sigma and thus BB_{-} contains regions in which x>0x>0. Moreover, (tmax,0)(t_{\max},0) lies in a crossing region. Nevertheless, in this case, the most probable path intersects Σ\Sigma slightly to the right of (tmax,0)B¯+(t_{\max},0)\notin\overline{B}_{+} while the mean of the density appears to pass precisely through (tmax,0)(t_{\max},0). In Figure 8(b) the parameters are selected so that both nullclines n+n_{+} and nn_{-} intersect Σ\Sigma and thus, in addition to BB_{-} having nontrivial intersection with x>0x>0, B+B_{+} contains regions in which x<0x<0. Again, (tmax,0)(t_{\max},0) lies in a crossing region but in this case the most probable path and the mean of the tipping events are both to the left of (tmax,0)(t_{\max},0). However, in this case there is remarkable agreement between the most probable path and the mean of the tipping events. Moreover, the most probable path indicates that there is “noise-induced” sliding along Σ\Sigma until reaching the boundary of B+B_{+}.

Refer to caption
(a)
Refer to caption
(b)
Figure 8: Most probable paths overlaid on the probability density generated by Monte-Carlo simulations of Equation (28) in a parameter regime in which (tmax,0)B+(t_{\max},0)\notin B_{+}. The solid black line corresponds to the most probable path numerically computed as a stationary curve of Equation (36) and the dashed magenta curve is the mean of the probability density. All other curves follow the same convention as in Figure 7. In (a) the parameters are r+=2,r_{+}=2, r=3,r_{-}=3, p=0.25,p=0.25, A+=3,A_{+}=3, A=1,A_{-}=1, a=0.5,a=-0.5, σ=0.22,\sigma=0.22, and ε=σ4\varepsilon=\sigma^{4}. In (b) the parameters are r+=2,r_{+}=2, r=3,r_{-}=3, p=0.25,p=0.25, A+=4,A_{+}=4, A=3,A_{-}=3, a=0.5,a=-0.5, and σ=0.07,\sigma=0.07, and ε=σ4\varepsilon=\sigma^{4}.

6 Discussion

We have proposed an extension to the Freidlin-Wentzell theory of large deviations for determining most probable transition paths in stochastic differential equations with a piecewise-smooth drift, where the switch in drift dynamics occurs across an (n1)(n-1)-dimensional hyperplane. Without loss of generality, we assume this hyperplane spans the first component, so Σ={x=0}\Sigma=\{x=0\}. Namely, we derived an appropriate functional to determine the most probable path(s) between two metastable states on either side of Σ\Sigma. For smooth regions of the drift, the derived functional matches the Friedlin-Wentzell rate functional. However, across Σ\Sigma there is an additional possible contribution to the rate functional when the most probable path slides in a crossing region of Σ\Sigma.

Smoothing out the piecewise-smooth drift by mollifying it with a compactly supported, smooth, radially symmetric kernel of characteristic width ε\varepsilon allowed us to consider most probable paths of the mollified system as minimizers of the Friedlin-Wentzell rate functional. We then considered the sequence of minimizers of the Friedlin-Wentzell rate functional in the limit as ε0.\varepsilon\rightarrow 0. Using the direct method of calculus of variations, we showed that this sequence of minimizers converges weakly, and that their limit minimizes our derived functional, which itself is the Γ\Gamma-limit of the sequence of rate functionals for the mollified system as ε0.\varepsilon\rightarrow 0.

Notably, as we recovered the piecewise-smooth drift in the limit as ε0,\varepsilon\rightarrow 0, we encountered a natural choice regarding which dynamics to impose on the switching manifold Σ\Sigma and thereby determine which path across Σ\Sigma has minimal energy. Following the convention of piecewise-smooth dynamical systems, we did not choose either side, but rather introduced a scaling variable z=x/ε[1,1],z=x/\varepsilon\in[-1,1], so that as ε0\varepsilon\rightarrow 0 we could still define the drift using a nonzero variable z.z. Then the path which contributes minimal energy is tangent to Σ\Sigma in the first component and also minimizes the contribution of zz to the drift as ε0.\varepsilon\rightarrow 0. Interpret the limiting drift on Σ\Sigma as a convex combination of the drift on either side of Σ\Sigma for a certain choice of weights, we recovered that the most probable path follows the Filippov convex combination or its time-reversed dynamics — although we did not specify that the flow on Σ\Sigma be defined using this convention.

The derived rate functional, which has a possible contribution in addition to the Friedlin-Wentzell rate functional that depends on minimizing over some parameter λ[0,1],\lambda\in[0,1], presents an intriguing perspective on two fronts: the need for a minimum function in the rate functional and the emergence of the Filippov convex combination. The Filippov convex combination, however, is not the only possible choice for imposing a sliding flow on Σ\Sigma: for example, the possibility of nonlinear sliding has recently been explored as a generalization of Filippov sliding, which may reflect observed types of sliding behavior that Filippov flow cannot reproduce; see, e.g., [54, 55]. Possible extensions of the rate functional derived here may include exploring the effect of imposing a nonlinear sliding flow in Σ,\Sigma, or a complete derivation of the limiting functional for the case of a non-autonomous vector field in n.{\mathbb{R}}^{n}.

We used two illustrative case studies to explore implications of most probable paths in a low-dimensional setting, one case study with 2D linear piecewise-smooth drift and another with periodically-forced 1D piecewise-smooth drift. In both of these case studies, most probable paths calculated using minimizers of the derived rate functional generally matched the mean of Monte Carlo simulations of tipping events between stable states on either side of a switching manifold Σ\Sigma. Additionally, both cases reveal an implication of the piecewise-smooth drift and possibility of most probable paths sliding: the most probable path is naturally non-unique when sliding in a repelling sliding region.

However, there are still open questions arising from regimes where there is disagreement between theory and Monte Carlo simulations. Although minimizer(s) of the derived rate functional theoretically equal the most probable path only in the zero noise limit [22], we expect qualitative agreement for small noise. However, in the linear case study in Section 4, Monte Carlo simulations do not match the predicted (non-unique) family of most probable paths when the most probable paths slide in a repelling sliding region. Instead, simulations in which tipping is observed qualitatively match the most probable path conditioned to travel from the original fixed point 𝐱+\mathbf{x}_{+} to the onset of sliding, (0,maxyΣRy)(0,\max_{y\in\Sigma_{R}}y), but from the onset of sliding they follow the most probable path in the left-half plane from (0,maxyΣRy)(0,\max_{y\in\Sigma_{R}}y) to 𝐱\mathbf{x}_{-}.

Two possible explanations for this behavior are the need for a smaller noise strength, σ\sigma, or the need for a higher-order functional in the smoothing parameter, ε\varepsilon. It may be that although the Monte Carlo simulations that tipped only happened for a small percent of simulations (0.12% for Figure 4(b)), the noise strength σ\sigma needs to be smaller to capture the rare event statistics necessary to slide.

Additionally, when we consider the limiting behavior of the predicted most probable path in the piecewise-smooth limit as ε0\varepsilon\rightarrow 0, although the path seen in simulations is associated with nonzero action, for 0<ε10<\varepsilon\ll 1 the cost of sliding is at least two magnitudes greater than the cost of crossing; see Figure 5 and related discussion in Section 4. This may indicate that in the repelling sliding case, the Γ\Gamma-limit of the Friedlin-Wentzell rate functional does not completely characterize the asymptotic behavior of Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}}. In such a situation, one may consider an asymptotic development to calculate the appropriate correction terms to the limiting functional; i.e., set

Iε(t0,tf)=I¯(t0,tf),0+εI¯(t0,tf),1+ε2I¯(t0,tf),2++εkI¯(t0,tf),k+o(εk),{I_{\varepsilon}^{(t_{0},t_{f})}}=\overline{I}^{(t_{0},t_{f}),0}+\varepsilon\,\overline{I}^{(t_{0},t_{f}),1}+\varepsilon^{2}\,\overline{I}^{(t_{0},t_{f}),2}+\cdots+\varepsilon^{k}\,\overline{I}^{(t_{0},t_{f}),k}+o(\varepsilon^{k}),

where I¯(t0,tf),0=I¯(t0,tf)\overline{I}^{(t_{0},t_{f}),0}={\overline{I}^{(t_{0},t_{f})}} is the Γ\Gamma-limit of Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} and each I¯(t0,tf),\overline{I}^{(t_{0},t_{f}),\ell} for =1,2,,k\ell=1,2,\ldots,k is a functional recursively defined by the Γ\Gamma-limit of I¯(t0,tf),1\overline{I}^{(t_{0},t_{f}),\ell-1} and its infimum [78]. One possible extension to the present work is an asymptotic development of a family of functionals I¯(t0,tf),\overline{I}^{(t_{0},t_{f}),\ell}, >0\ell>0, to more completely characterize the asymptotic behavior of the family Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}}.

Similarly, in the non-autonomous case study in Section 5, while the predicted most probable path may slide non-uniquely in a repelling sliding region, the mean of the observed paths does not slide. One may expect this behavior due to the sliding region having Lebesgue measure zero; see Figure 7 and the related discussion at the end of Section 4. However, as in the linear case study, the observed most probable path intersects Σ\Sigma at the onset of the sliding region. Additionally, although the derived rate functional in Equation (20) penalizes sliding in a crossing region, in certain parameter regimes we nevertheless observe the mean of the paths exhibiting sliding behavior; see Figure 8(b).

The discrepancies described above for both case studies may also be related to factors not accounted for in the derived rate functional. Such factors arise in the probability density describing the transition paths 𝜶(t)\bm{\alpha}(t) from 𝐱(t0)\mathbf{x}(t_{0}) to 𝐱(tf)\mathbf{x}(t_{f}):

P(𝜶(t))=cexp[1σ2(t0tf𝜶˙(t)𝐅(𝜶(t),t)2𝑑t+σ2t0tf𝐅(𝜶(t))𝑑t)],P(\bm{\alpha}(t))=c\exp\left[-\frac{1}{\sigma^{2}}\left(\int_{t_{0}}^{t_{f}}\left\|\dot{\bm{\alpha}}(t)-\mathbf{F}(\bm{\alpha}(t),t)\right\|^{2}\,dt+\sigma^{2}\int_{t_{0}}^{t_{f}}\nabla\cdot\mathbf{F}(\bm{\alpha}(t))\,dt\right)\right],

for some normalization factor cc\in{\mathbb{R}} [27]. Note that the first integral leads to the Freidlin-Wentzell rate functional, but together both integrals are referred to as the Onsager-Machlup rate functional. Neither the normalization factor cc nor the Onsager-Machlup rate functional are well-defined for piecewise-smooth systems. In particular, since 𝐅\mathbf{F} is non-differentiable the divergence term requires considering regimes of the Γ\Gamma-limit of the mollified 𝐅ε{\mathbf{F}^{\varepsilon}} that incorporate sigma explicitly to ensure convergence. Possible connections between most probable paths across repelling sliding regions and terms in the probability density or nonlinear convex combination will be explored in a subsequent study.

Appendix A: Inward-pointing flow and pp-growth of 𝐅ε{\mathbf{F}^{\varepsilon}}

In this appendix, we supply the proofs of Lemmas 3.1 and 3.2.

Lemma 3.1 (pp-growth).

There exists ε\varepsilon^{*} such that for all ε>0\varepsilon>0 satisfying ε<ε\varepsilon<\varepsilon^{*}, there exist R1ε,c1ε>0R^{\varepsilon}_{1},c_{1}^{\varepsilon}>0 and c2εc_{2}^{\varepsilon}\in\mathbb{R} such that |𝐱|>R1ε|\mathbf{x}|>R_{1}^{\varepsilon} implies

|𝐅ε(𝐱)|c1ε|𝐱|p+c2ε.|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|\geq c_{1}^{\varepsilon}|\mathbf{x}|^{p}+c_{2}^{\varepsilon}.
Proof.

For 𝐱=(x,𝐲)\mathbf{x}=(x,\mathbf{y}), we split the proof into the cases |x|>ε|x|>\varepsilon and |x|<ε|x|<\varepsilon.

First, assume |x|>ε|x|>\varepsilon. It follows from the reverse triangle inequality and the fact that ζ\zeta is nonnegative with unit area that

||𝐅ε(𝐱)||𝐅(x)||\displaystyle\bigg{|}|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|-|\mathbf{F}(x)|\bigg{|} |𝐅ε(𝐱)𝐅(x)|\displaystyle\leq|{\mathbf{F}^{\varepsilon}}(\mathbf{x})-\mathbf{F}(x)|
=|εεζε(u)𝐅(xu,𝐲)𝑑u𝐅(𝐱)|\displaystyle=\left|\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\mathbf{F}(x-u,\mathbf{y})\,du-\mathbf{F}(\mathbf{x})\right|
=|εεζε(u)(𝐅(xu,𝐲)𝐅(𝐱))𝑑u|\displaystyle=\left|\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}(x-u,\mathbf{y})-\mathbf{F}(\mathbf{x})\right)du\right|
εεζε(u)|𝐅(xu,𝐲)𝐅(𝐱)|𝑑u.\displaystyle\leq\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left|\mathbf{F}(x-u,\mathbf{y})-\mathbf{F}(\mathbf{x})\right|du.

Let R1,c1,c2,c3,c4,c5>0R_{1},c_{1},c_{2},c_{3},c_{4},c_{5}>0 be defined as in Assumption 1. If |𝐱|>R1+ε|\mathbf{x}|>R_{1}+\varepsilon, it follows from the mean value theorem and growth condition Equation (11) that

||𝐅ε(𝐱)||𝐅(x)||\displaystyle\bigg{|}|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|-|\mathbf{F}(x)|\bigg{|} εεζε(u)(c3max{|(x+ε,𝐲)|p,|(xε,𝐲)|p}+c4)|u|𝑑u\displaystyle\leq\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left(c_{3}\max\{|(x+\varepsilon,\mathbf{y})|^{p},|(x-\varepsilon,\mathbf{y})|^{p}\}+c_{4}\right)|u|\,du
(c3max{|(x+ε,𝐲)|p,|(xε,𝐲)|p}+c4)εεζε(u)ε𝑑u\displaystyle\leq\left(c_{3}\max\{|(x+\varepsilon,\mathbf{y})|^{p},|(x-\varepsilon,\mathbf{y})|^{p}\}+c_{4}\right)\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\varepsilon du
=ε(c3max{|(x+ε,𝐲)|p,|(xε,𝐲)|p}+c4),\displaystyle=\varepsilon\left(c_{3}\max\{|(x+\varepsilon,\mathbf{y})|^{p},|(x-\varepsilon,\mathbf{y})|^{p}\}+c_{4}\right),

and thus

|𝐅ε(𝐱)||𝐅(𝐱)|>ε(c3max{|(x+ε,𝐲)|p,|(xε,𝐲)|p}+c4).|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|-|\mathbf{F}(\mathbf{x})|>-\varepsilon\left(c_{3}\max\{|(x+\varepsilon,\mathbf{y})|^{p},|(x-\varepsilon,\mathbf{y})|^{p}\}+c_{4}\right).

Consequently, since |𝐱|>R1|\mathbf{x}|>R_{1} and |x|>ε|x|>\varepsilon it follows from Equation (10) that

|𝐅ε(𝐱)|\displaystyle|\mathbf{F}^{\varepsilon}(\mathbf{x})| >ε(c3max{|(x+ε,𝐲)|p,|(xε,𝐲)|p}+c4)+c1|𝐱|p+c2\displaystyle>-\varepsilon\left(c_{3}\max\{|(x+\varepsilon,\mathbf{y})|^{p},|(x-\varepsilon,\mathbf{y})|^{p}\}+c_{4}\right)+c_{1}|\mathbf{x}|^{p}+c_{2}
>ε(c3(ε+|𝐱|)p+c4)+c1|𝐱|p+c2\displaystyle>-\varepsilon(c_{3}(\varepsilon+|\mathbf{x}|)^{p}+c_{4})+c_{1}|\mathbf{x}|^{p}+c_{2}
>ε(c3(2|𝐱|)p+c4)+c1|𝐱|p+c2\displaystyle>-\varepsilon\left(c_{3}(2|\mathbf{x}|)^{p}+c_{4}\right)+c_{1}|\mathbf{x}|^{p}+c_{2}
=(c12pc3ε)|𝐱|p+c2c4ε.\displaystyle=(c_{1}-2^{p}c_{3}\varepsilon)|\mathbf{x}|^{p}+c_{2}-c_{4}\varepsilon.

Now, assume |x|<ε|x|<\varepsilon and without loss of generality x0x\geq 0. Again, from the reverse triangle inequality,

||𝐅ε(𝐱)||𝐅(𝐱)||\displaystyle\bigg{|}|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|-|\mathbf{F}(\mathbf{x})|\bigg{|}\leq |εxζε(u)(𝐅+(xu,𝐲)𝐅+(𝐱))𝑑u+xεζε(u)(𝐅(xu,𝐲)𝐅+(𝐱))𝑑u|\displaystyle\left|\int_{-\varepsilon}^{x}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{+}(x-u,\mathbf{y})-\mathbf{F}^{+}(\mathbf{x})\right)du+\int_{x}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{-}(x-u,\mathbf{y})-\mathbf{F}^{+}(\mathbf{x})\right)du\right|
=\displaystyle= |εxζε(u)(𝐅+(xu,𝐲)𝐅+(𝐱))du+xεζε(u)(𝐅(xu,𝐲)𝐅(0,𝐲))du\displaystyle\left|\int_{-\varepsilon}^{x}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{+}(x-u,\mathbf{y})-\mathbf{F}^{+}(\mathbf{x})\right)du+\int_{x}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{-}(x-u,\mathbf{y})-\mathbf{F}^{-}(0,\mathbf{y})\right)du\right.
+xεζε(u)(𝐅(0,𝐲)𝐅+(0,𝐲))du+xεζε(u)(𝐅+(0,𝐲)𝐅+(x,𝐲))du|\displaystyle+\left.\int_{x}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{-}(0,\mathbf{y})-\mathbf{F}^{+}(0,\mathbf{y})\right)du+\int_{x}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{+}(0,\mathbf{y})-\mathbf{F}^{+}(x,\mathbf{y})\right)du\right|
\displaystyle\leq εxζε(u)|𝐅+(xu,𝐲)𝐅+(𝐱)|𝑑u+xεζε(u)|𝐅(xu,𝐲)𝐅(𝐱)|𝑑u\displaystyle\int_{-\varepsilon}^{x}\zeta_{\varepsilon}(u)\left|\mathbf{F}^{+}(x-u,\mathbf{y})-\mathbf{F}^{+}(\mathbf{x})\right|du+\int_{x}^{\varepsilon}\zeta_{\varepsilon}(u)\left|\mathbf{F}^{-}(x-u,\mathbf{y})-\mathbf{F}^{-}(\mathbf{x})\right|du
+|𝐅(0,𝐲)𝐅+(0,𝐲)|+|𝐅+(0,𝐲)𝐅+(𝐱)|.\displaystyle+\left|\mathbf{F}^{-}(0,\mathbf{y})-\mathbf{F}^{+}(0,\mathbf{y})\right|+\left|\mathbf{F}^{+}(0,\mathbf{y})-\mathbf{F}^{+}(\mathbf{x})\right|.

Therefore, if |𝐱|>R1+ε|\mathbf{x}|>R_{1}+\varepsilon it follows from the mean value theorem, the growth condition Equation (11), and the jump condition Equation (12) that

||𝐅ε(𝐱)||𝐅(𝐱)||\displaystyle\bigg{|}|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|-|\mathbf{F}(\mathbf{x})|\bigg{|}\leq εxζε(u)(c3|(x+ε,𝐲)|p+c4)|u|𝑑u+xεζε(u)(c3|(xε,𝐲)|p+c4)|u|𝑑u\displaystyle\int_{-\varepsilon}^{x}\zeta_{\varepsilon}(u)\left(c_{3}|(x+\varepsilon,\mathbf{y})|^{p}+c_{4}\right)|u|du+\int_{x}^{\varepsilon}\zeta_{\varepsilon}(u)\left(c_{3}|(x-\varepsilon,\mathbf{y})|^{p}+c_{4}\right)|u|du
+c5|𝐲|p+c6+ε(c3|(x+ε,𝐲)|p+c4)\displaystyle+c_{5}|\mathbf{y}|^{p}+c_{6}+\varepsilon\left(c_{3}|(x+\varepsilon,\mathbf{y})|^{p}+c_{4}\right)
\displaystyle\leq 2εc4+c3|(x+ε,𝐲)|p+c3|(xε,𝐲)|p+c5|𝐲|p+c6+ε(c3|(x+ε,𝐲)|p+c4).\displaystyle 2\varepsilon c_{4}+c_{3}|(x+\varepsilon,\mathbf{y})|^{p}+c_{3}|(x-\varepsilon,\mathbf{y})|^{p}+c_{5}|\mathbf{y}|^{p}+c_{6}+\varepsilon\left(c_{3}|(x+\varepsilon,\mathbf{y})|^{p}+c_{4}\right).

Therefore,

|𝐅ε(𝐱)|\displaystyle|{\mathbf{F}^{\varepsilon}}(\mathbf{x})| c5|𝐱|pc62εc3max{|2ε|p,|𝐲|p}2εc4+c4+|𝐅(𝐱)|\displaystyle\geq-c_{5}|\mathbf{x}|^{p}-c_{6}-2\varepsilon c_{3}\max\{|2\varepsilon|^{p},|\mathbf{y}|^{p}\}-2\varepsilon c_{4}+c_{4}+|\mathbf{F}(\mathbf{x})|
(c1c52p+1εc3)|𝐱|p+(c2c62εc4).\displaystyle\geq(c_{1}-c_{5}-2^{p+1}\varepsilon c_{3})|\mathbf{x}|^{p}+(c_{2}-c_{6}-2\varepsilon c_{4}).

The same argument can be applied for x<0x<0 to obtain an identical lower bound.

Therefore, from these two cases and the assumption that c1c5>0,c_{1}-c_{5}>0, if

ε\displaystyle\varepsilon <min{(c1c5)/(2p+1c3),c1/(2pc3)}=(c1c5)/(2p+1c3),\displaystyle<\min\{(c_{1}-c_{5})/(2^{p+1}c_{3}),c_{1}/(2^{p}c_{3})\}=(c_{1}-c_{5})/(2^{p+1}c_{3}),
c1ε\displaystyle c_{1}^{\varepsilon} =min{c12pc3ε,c1c52p+1εc3}=c1c52p+1εc3,\displaystyle=\min\{c_{1}-2^{p}c_{3}\varepsilon,c_{1}-c_{5}-2^{p+1}\varepsilon c_{3}\}=c_{1}-c_{5}-2^{p+1}\varepsilon c_{3},
c2ε\displaystyle c_{2}^{\varepsilon} =min{c2c4ε,c2c62εc4}=c262εc4,\displaystyle=\min\{c_{2}-c_{4}\varepsilon,c_{2}-c_{6}-2\varepsilon c_{4}\}=c_{2}-6-2\varepsilon c_{4},
R1ε\displaystyle R_{1}^{\varepsilon} =R1+ε,\displaystyle=R_{1}+\varepsilon,

then the result follows. ∎

Lemma 3.2 (Asymptotically inward-flowing).

For all ε>0\varepsilon>0 there exist R2ε,c7ε>0R_{2}^{\varepsilon},c_{7}^{\varepsilon}>0 such that |𝐱|>R2ε|\mathbf{x}|>R_{2}^{\varepsilon} implies 𝐅ε(𝐱)0{\mathbf{F}^{\varepsilon}}(\mathbf{x})\neq 0 and

𝐅ε(𝐱),𝐫(𝐱)<c7ε|𝐅ε(𝐱)|.\langle{\mathbf{F}^{\varepsilon}}(\mathbf{x}),\mathbf{r}(\mathbf{x})\rangle<-c_{7}^{\varepsilon}|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|.
Proof.

By construction,

𝐅ε(𝐱),𝐫(𝐱)=εεζε(u)𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u),𝐫(𝐱)𝑑u.\langle{\mathbf{F}^{\varepsilon}}(\mathbf{x}),\mathbf{r}(\mathbf{x})\rangle=\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\langle\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u),\mathbf{r}(\mathbf{x})\rangle du.

For 𝐱n\mathbf{x}\in\mathbb{R}^{n} and u(ε,ε)u\in(-\varepsilon,\varepsilon), let θ(u)[0,π]\theta(u)\in[0,\pi] denote the angle between 𝐫(𝐱)\mathbf{r}(\mathbf{x}) and 𝐫(xu,𝐲)\mathbf{r}(x-u,\mathbf{y}). It follows from the mean value theorem that there exists C(𝐱,u)(ε,ε)C(\mathbf{x},u)\in(-\varepsilon,\varepsilon) such that

|1cos(θ(u))|\displaystyle|1-\cos(\theta(u))| =𝐫(𝐱),𝐫(𝐱)𝐫(𝐱),𝐫(xu,𝐲)\displaystyle=\langle\mathbf{r}(\mathbf{x}),\mathbf{r}(\mathbf{x})\rangle-\langle\mathbf{r}(\mathbf{x}),\mathbf{r}(x-u,\mathbf{y})\rangle
=C(𝐱,u)|𝐲|2|𝐱||(C(𝐱,u)x,𝐲)|3u\displaystyle=C(\mathbf{x},u)\frac{|\mathbf{y}|^{2}}{|\mathbf{x}|\cdot|(C(\mathbf{x},u)-x,\mathbf{y})|^{3}}u
=C(𝐱,u)|𝐲|2|𝐱|(C(𝐱,u)22C(𝐱,u)x+|𝐱|2)32u\displaystyle=C(\mathbf{x},u)\frac{|\mathbf{y}|^{2}}{|\mathbf{x}|(C(\mathbf{x},u)^{2}-2C(\mathbf{x},u)x+|\mathbf{x}|^{2})^{\frac{3}{2}}}u
ε2|𝐱|2(1+(C(𝐱,u)22C(𝐱,u)x)/|𝐱|2)32.\displaystyle\leq\frac{\varepsilon^{2}}{|\mathbf{x}|^{2}\left(1+(C(\mathbf{x},u)^{2}-2C(\mathbf{x},u)x)/|\mathbf{x}|^{2}\right)^{\frac{3}{2}}}.

Therefore, for all u(ε,ε)u\in(-\varepsilon,\varepsilon) and θ(0,π]\theta^{*}\in(0,\pi] there exists Rθ>0R_{\theta^{*}}>0 such that |𝐱|>Rθ|\mathbf{x}|>R_{\theta^{*}} implies θ(𝐱,u)<θ\theta(\mathbf{x},u)<\theta^{*}. Furthermore, since 𝐅\mathbf{F} is asymptotically inward-flowing there exist R2,c7>0R_{2},c_{7}>0 such that if |𝐱|>R2|\mathbf{x}|>R_{2} then Inequality (13) is satisfied. Consequently, if |𝐱|>max{R2+ε,Rθ}|\mathbf{x}|>\max\{R_{2}+\varepsilon,R_{\theta^{*}}\}, then for u(ε,ε)u\in(-\varepsilon,\varepsilon) it follows that

𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u),𝐫(𝐱)|(𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u))|\displaystyle\frac{\langle\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u),\mathbf{r}(\mathbf{x})\rangle}{\left|\left(\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u)\right)\right|} cos(cos1(c7)+θ)\displaystyle\leq\cos(\cos^{-1}(-c_{7})+\theta^{*})
=c7cos(θ)1c72sin(θ).\displaystyle=-c_{7}\cos(\theta^{*})-\sqrt{1-c_{7}^{2}}\sin(\theta^{*}).

Hence, if we choose θ\theta^{*} so that c7ε=c7cos(θ)1c72sin(θ)>0c_{7}^{\varepsilon}=-c_{7}\cos(\theta^{*})-\sqrt{1-c_{7}^{2}}\sin(\theta^{*})>0, there there exists R2ε>0R_{2}^{\varepsilon}>0 such that |𝐱|>R2ε|\mathbf{x}|>R_{2}^{\varepsilon} implies

𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u),𝐫(𝐱)|(𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u))|<c7ε.\frac{\langle\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u),\mathbf{r}(\mathbf{x})\rangle}{\left|\left(\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u)\right)\right|}<-c_{7}^{\varepsilon}.

Therefore,

𝐅ε(𝐱),𝐫(𝐱)|𝐅ε(𝐱)|\displaystyle\frac{\langle{\mathbf{F}^{\varepsilon}}(\mathbf{x}),\mathbf{r}(\mathbf{x})\rangle}{|{\mathbf{F}^{\varepsilon}}(\mathbf{x})|} <c7εεεζε(u)(𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u))𝑑u|εεζε(u)(𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u))𝑑u|\displaystyle<-c_{7}^{\varepsilon}\frac{\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u)\right)du}{\left|\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u)\right)du\right|}
<c7εεεζε(u)(𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u))𝑑uεεζε(u)|(𝐅(xu,𝐲)𝟙{ux}(u)+𝐅+(xu,𝐲)𝟙{ux}(u))|𝑑u\displaystyle<-c_{7}^{\varepsilon}\frac{\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left(\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u)\right)du}{\int_{-\varepsilon}^{\varepsilon}\zeta_{\varepsilon}(u)\left|\left(\mathbf{F}^{-}(x-u,\mathbf{y})\mathbbm{1}_{\{u\geq x\}}(u)+\mathbf{F}^{+}(x-u,\mathbf{y})\mathbbm{1}_{\{u\leq x\}}(u)\right)\right|du}
=c7ε.\displaystyle=-c_{7}^{\varepsilon}.

Appendix B: Derivation of the most probable path in Case Study 1

The most probable path corresponds to solutions of System (25) subject to appropriate boundary conditions. Since the vector field is now smooth, there must be at least one invariant manifold separating the basins of attraction of z±\textbf{z}_{\pm}. The most probable path from z+\textbf{z}_{+} to z\textbf{z}_{-} follows an invariant manifold of zs\textbf{z}_{s} (the pseudoequilibrium xs\textbf{x}_{s} in the rescaled and extended system) in two heteroclinic orbits between zs\textbf{z}_{s} and the other fixed points. We will determine the most probable path by minimizing the rate functional Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}} for solutions of System (25) over two stages: (1) from z+\textbf{z}_{+} to the switching manifold Σ+ε,\Sigma^{\varepsilon}_{+}, and (2) from Σ+ε\Sigma^{\varepsilon}_{+} to zs\textbf{z}_{s}. From zs\textbf{z}_{s} to z\textbf{z}_{-} the most probable path follows the deterministic flow, which does not contribute to Iε(t0,tf){I_{\varepsilon}^{(t_{0},t_{f})}}. See Figure 9 for a diagram illustrating the sequence of mollifying, scaling, and translating transformations used in this derivation. Translating the z+\textbf{z}_{+} equilibrium to the origin by defining z~=zz+\tilde{z}=z-z_{+} and y~=yy+\tilde{y}=y-y_{+}, the boundary conditions for the path from 𝐳+\mathbf{z}_{+} to Σ+ε\Sigma^{\varepsilon}_{+} (Stage (1)) become

z~()\displaystyle\tilde{z}(-\infty) =0,\displaystyle=0,\qquad y~()\displaystyle\tilde{y}(-\infty) =0,\displaystyle=0,\qquad φ()\displaystyle\varphi(-\infty) =0,\displaystyle=0,\qquad ψ()\displaystyle\psi(-\infty) =0,\displaystyle=0, (B.37)
z~(0)\displaystyle\tilde{z}(0) =11ε,\displaystyle=1-\frac{1}{\varepsilon}, y~(0)\displaystyle\tilde{y}(0) =y1η,\displaystyle=y_{1}-\eta, φ(0)\displaystyle\varphi(0) =φ1,\displaystyle=\varphi_{1}, ψ(0)\displaystyle\psi(0) =ψ1.\displaystyle=\psi_{1}.

The boundary conditions φ1\varphi_{1} and ψ1\psi_{1} over-determine the system if we consider that the solution must be finite at t=t=-\infty and t=0,t=0, so we allow them to vary. We will discuss choosing y1y_{1} for both Stages (1) and (2) below. We then translate the most probable path for Stage (1) back to the original coordinates before continuing to Stage (2), which gives the most probable path from Σ+ε\Sigma^{\varepsilon}_{+} to 𝐳s\mathbf{z}_{s}.

Refer to caption
Figure 9: Diagram illustrating the transformations made to System (4),(23) when calculating the most probable path, for η=2\eta=-2. Red points are equilibria, the black point is a pseudoequilibrium, and the vertical lines are switching manifolds.

The boundary conditions for Stage (2) are as follows:

z(0)\displaystyle z(0) =1,\displaystyle=1,\qquad y(0)\displaystyle y(0) =y1,\displaystyle=y_{1},\qquad φ(0)\displaystyle\varphi(0) =φ1,\displaystyle=\varphi_{1},\qquad ψ(0)\displaystyle\psi(0) =ψ1,\displaystyle=\psi_{1}, (B.38)
z(t2)\displaystyle z(t_{2}) =0,\displaystyle=0,\qquad y(t2)\displaystyle y(t_{2}) =0,\displaystyle=0,\qquad φ(t2)\displaystyle\varphi(t_{2}) =0,\displaystyle=0,\qquad ψ(t2)\displaystyle\psi(t_{2}) =0,\displaystyle=0,

where t2<t_{2}<\infty is unknown a priori. Solving (25),(B.38) gives us the most probable path for Stage (2). To apply the boundary conditions, we first eliminate coefficients of the solution associated with positive eigenvalues. Applying the boundary condition at t=0t=0 then leads to a system with two unknown coefficients and three initial conditions, which is over-determined. If we introduce the condition that along the most probable path the Hamiltonian of the system is zero,

H=fεφ+gεψ+12φ2+12ψ2=0,H=f^{\varepsilon}\varphi+g^{\varepsilon}\psi+\frac{1}{2}\varphi^{2}+\frac{1}{2}\psi^{2}=0,

we can eliminate an equation in System (25). It suffices to choose the variable associated with a positive real eigenvalue and an eigenvector with nonzero conjugate momentum. For the parameters used in Figure 3, φ\varphi satisfies this condition. There are generally two possible values of φ\varphi: we choose φ=fε+(fε)22gεψψ2,\varphi=-f^{\varepsilon}+\sqrt{(f^{\varepsilon})^{2}-2g^{\varepsilon}\psi-\psi^{2}}, so that φ=0\varphi=0 when ψ=0\psi=0.

7 Acknowledgments

The work of JG was supported by the American Institute of Mathematics, and the work of JG and JZ was supported by Wake Forest University. Computations were performed on the Wake Forest University DEAC Cluster, a centrally managed resource with support provided in part by the University.

References

  • [1]
  • [2] N Berglund and B Gentz. Metastability in simple climate models: pathwise analysis of slowly driven Langevin equations. Stoch Dyn, 2(03):327–356, 2002, doi:doi.org/10.1142/S0219493702000455.
  • [3] I Eisenman and JS Wettlaufer. Nonlinear threshold behavior during the loss of Arctic sea ice. Proc Natl Acad Sci USA, 106(1):28–32, 2009, doi:doi.org/10.1073/pnas.0806887106.
  • [4] S Wieczorek, P Ashwin, CM Luke, and PM Cox. Excitability in ramped systems: the compost-bomb instability. Proc R Soc A, 467(2129):1243–1269, 2010, doi:doi.org/10.1098/rspa.2010.0485.
  • [5] I Eisenman. Factors controlling the bifurcation structure of sea ice retreat. J Geophys Res Atmos, 117(D1), 2012, doi:doi.org/10.1029/2011JD016164.
  • [6] DH Rothman. Characteristic disruptions of an excitable carbon cycle. Proc Natl Acad Sci USA, 116(30):14813–14822, 2019, doi:doi.org/10.1073/pnas.1905164116.
  • [7] CW Arnscheidt and DH Rothman. Routes to global glaciation. Proc Roy Soc A, 476(2239):20200303, 2020, doi:doi.org/10.1098/rspa.2020.0303.
  • [8] J Lohmann and PD Ditlevsen. Risk of tipping the overturning circulation due to increasing rates of ice melt. Proc Natl Acad Sci USA, 118(9), 2021, doi:doi.org/10.1073/pnas.2017989118.
  • [9] A Vanselow, S Wieczorek, and U Feudel. When very slow is too fast-collapse of a predator-prey system. J Theor Biol, 479:64–72, 2019, doi:doi.org/10.1016/j.jtbi.2019.07.008.
  • [10] PE O’Keeffe and S Wieczorek. Tipping phenomena and points of no return in ecosystems: beyond classical bifurcations. SIAM J Appl Dyn Syst, 19(4):2371–2402, 2020, doi:https://doi.org/10.1137/19M1242884.
  • [11] JM Drake and BD Griffen. Early warning signals of extinction in deteriorating environments. Nature, 467(7314):456–459, 2010, doi:https://doi.org/10.1038/nature09389.
  • [12] M Scheffer, EH Van Nes, M Holmgren, and T Hughes. Pulse-driven loss of top-down control: the critical-rate hypothesis. Ecosystems, 11(2):226–237, 2008, doi:https://doi.org/10.1007/s10021-007-9118-8.
  • [13] E Forgoston, S Bianco, LB Shaw, and IB Schwartz. Maximal sensitive dependence and the optimal path to epidemic extinction. Bull Math Biol, 73(3):495–514, 2011, doi:doi.org/10.1007/s11538-010-9537-0.
  • [14] IB Schwartz, E Forgoston, S Bianco, and LB Shaw. Converging towards the optimal path to extinction. J R Soc Interface, 8(65):1699–1707, 2011, doi:doi.org/10.1098/rsif.2011.0159.
  • [15] L Billings and E Forgoston. Seasonal forcing in stochastic epidemiology models. Ric di Mat, 67(1):27–47, 2018, doi:doi.org/10.1007/s11587-017-0346-8.
  • [16] J Hindes and IB Schwartz. Rare events in networks with internal and external noise. Europhys Lett, 120(5):56004, 2018, doi:doi.org/10.1209/0295-5075/120/56004.
  • [17] P Ashwin, S Wieczorek, R Vitolo, and P Cox. Tipping points in open systems: bifurcation, noise-induced and rate-dependent examples in the climate system. Philos Trans R Soc A, 370(1962):1166–1184, 2012, doi:doi.org/10.1098/rsta.2011.0306.
  • [18] JMT Thompson and J Sieber. Climate tipping as a noisy bifurcation: a predictive technique. IMA J Appl Math, 76(1):27–46, 2011, doi:doi.org/10.1093/imamat/hxq060.
  • [19] TM Lenton. Early warning of climate tipping points. Nat Clim Change, 1(4):201–209, 2011, doi:doi.org/10.1038/nclimate1143.
  • [20] L Halekotte and U Feudel. Minimal fatal shocks in multistable complex networks. Sci Rep, 10(1):1–13, 2020, doi:doi.org/10.1038/s41598-020-68805-6.
  • [21] H Alkhayuon, RC Tyson, and S Wieczorek. Phase tipping: how cyclic ecosystems respond to contemporary climate. Proc R Soc A, 477(2254):20210059, 2021, doi:doi.org/10.1098/rspa.2021.0059.
  • [22] B Lindner, J Garcıa-Ojalvo, A Neiman, and L Schimansky-Geier. Effects of noise in excitable systems. Phys Rep, 392(6):321–424, 2004, doi:doi.org/10.1016/j.physrep.2003.10.015.
  • [23] MI Freidlin and AD Wentzell. Random perturbations of dynamical systems, volume 260. Springer Science & Business Media, 2012.
  • [24] W Ren and E Vanden-Eijnden. Minimum action method for the study of rare events. Commun Pure Appl Math, 57(5):637–656, 2004, doi:doi.org/10.1002/cpa.20005.
  • [25] E Forgoston and RO Moore. A primer on noise-induced transitions in applied dynamical systems. SIAM Rev, 60(4):969–1009, 2018, doi:doi.org/10.1137/17M1142028.
  • [26] D Dürr and A Bach. The Onsager-Machlup function as Lagrangian for the most probable path of a diffusion process. Commun Math Phys, 60(2):153–170, 1978, doi:doi.org/10.1007/BF01609446.
  • [27] M Chaichian and A Demichev. Path Integrals in Physics: Stochastic Processes and Quantum Mechanics. Institute of Physics, 2001.
  • [28] E Weinan and E Vanden-Eijnden. Transition-path theory and path-finding algorithms for the study of rare events. Annu Rev Phys Chem, 61:391–420, 2010, doi:doi.org/10.1146/annurev.physchem.040808.090412.
  • [29] J Finkel, DS Abbot, and J Weare. Path properties of atmospheric transitions: illustration with a low-order sudden stratospheric warming model. J Atmos Sci, 77(7):2327–2347, 2020, doi:doi.org/10.1175/JAS-D-19-0278.1.
  • [30] RS Maier and DL Stein. Limiting exit location distributions in the stochastic exit problem. SIAM J Appl Math, 57(3):752–790, 1997, doi:doi.org/10.1137/S0036139994271753.
  • [31] C B Muratov and E Vanden-Eijnden. Noise-induced mixed-mode oscillations in a relaxation oscillator near the onset of a limit cycle. Chaos, 18(1):015111, 2008, doi:https://doi.org/10.1063/1.2779852.
  • [32] E Weinan, W Ren, and E Vanden-Eijnden. String method for the study of rare events. Phys Rev B, 66(5):052301, 2002, doi:doi.org/10.1103/PhysRevB.66.052301.
  • [33] M Heymann and E Vanden-Eijnden. The geometric minimum action method: A least action principle on the space of curves. Commun Pure Appl Math, 61(8):1052–1117, 2008, doi:doi.org/10.1002/cpa.20238.
  • [34] BS Lindley and IB Schwartz. An iterative action minimizing method for computing optimal paths in stochastic dynamical systems. Physica D, 255:22–30, 2013, doi:doi.org/10.1016/j.physd.2013.04.001.
  • [35] GM Donovan and WL Kath. An iterative stochastic method for simulating large deviations and rare events. SIAM J Appl Math, 71(3):903–924, 2011, doi:doi.org/10.1137/100789373.
  • [36] DJW Simpson and R Kuske. The influence of localized randomness on regular grazing bifurcations with applications to impacting dynamics. J Vib Control, 24(2):407–426, 2018, doi:doi.org/10.1177/107754631664205.
  • [37] EJ Staunton and PT Piiroinen. Discontinuity mappings for stochastic nonsmooth systems. Physica D, 406:132405, 2020, doi:doi.org/10.1016/j.physd.2020.132405.
  • [38] EJ Staunton and PT Piiroinen. Estimating the dynamics of systems with noisy boundaries. Nonlinear Anal: Hybrid Syst, 36:100863, 2020, doi:doi.org/10.1016/j.nahs.2020.100863.
  • [39] Y Chen, A Baule, H Touchette, and W Just. Weak-noise limit of a piecewise-smooth stochastic differential equation. Phys Rev E, 88(5):052103, 2013, doi:doi.org/10.1103/PhysRevE.88.052103.
  • [40] Y Chen and W Just. Large-deviation properties of Brownian motion with dry friction. Phys Rev E, 90(4):042102, 2014, doi:doi.org/10.1103/PhysRevE.90.042102.
  • [41] A Baule, H Touchette, and EGD Cohen. Stick–slip motion of solids with dry friction subject to random vibrations and an external field. Nonlinearity, 24(2):351, 2010, doi:doi.org/10.1088/0951-7715/24/2/001.
  • [42] A Baule, EGD Cohen, and H Touchette. A path integral approach to random motion with nonlinear friction. J Phys A Math, 43(2):025003, 2009, doi:doi.org/10.1088/1751-8113/43/2/025003.
  • [43] DJW Simpson and R Kuske. Mixed-mode oscillations in a stochastic, piecewise-linear system. Physica D, 240(14-15):1189–1198, 2011, doi:doi.org/10.1016/j.physd.2011.04.017.
  • [44] SH Piltz, MA Porter, and PK Maini. Prey switching with a linear preference trade-off. SIAM J Appl Dyn Syst, 13(2):658–682, 2014, doi:doi.org/10.1137/130910920.
  • [45] L Serdukova, Y Zheng, J Duan, and J Kurths. Metastability for discontinuous dynamical systems under Lévy noise: Case study on Amazonian vegetation. Sci Rep, 7(1):1–13, 2017, doi:doi.org/10.1038/s41598-017-07686-8.
  • [46] W Moon and JS Wettlaufer. A stochastic dynamical model of Arctic sea ice. J Clim, 30(13):5119–5140, 2017, doi:doi.org/10.1175/JCLI-D-16-0223.1.
  • [47] F Yang, Y Zheng, J Duan, L Fu, and S Wiggins. The tipping times in an Arctic sea ice system under influence of extreme events. Chaos, 30(6):063125, 2020, doi:doi:10.1063/5.0006626.
  • [48] T Mitsui, M Crucifix, and K Aihara. Bifurcations and strange nonchaotic attractors in a phase oscillator model of glacial–interglacial cycles. Physica D, 306:25–33, 2015, doi:doi.org/10.1016/j.physd.2015.05.007.
  • [49] KS Morupisi and C Budd. An analysis of the periodically forced PP04 climate model, using the theory of non-smooth dynamical systems. IMA J Appl Math, 86(1):76–120, 2021, doi:doi.org/10.1093/imamat/hxaa039.
  • [50] AH Monahan. Stabilization of climate regimes by noise in a simple model of the thermohaline circulation. J Phys Oceanogr, 32(7):2072–2085, 2002, doi:doi.org/10.1175/15200485(2002)032%3C2072:SOCRBN%3E2.0.CO;2.
  • [51] M di Bernardo, CJ Budd, AR Champneys, P Kowalczyk, AB Nordmark, GO Tost, and PT Piiroinen. Bifurcations in nonsmooth dynamical systems. SIAM Rev, 50(4):629–701, 2008, doi:doi.org/10.1137/050625060.
  • [52] M di Bernardo, CJ Budd, AR Champneys, and P Kowalczyk. Piecewise-smooth dynamical systems: theory and applications, volume 163 of Applied Mathematical Sciences. Springer Science & Business Media, 2008.
  • [53] AF Filippov. Differential equations with discontinuous righthand sides, volume 18. Springer Science & Business Media, 2088.
  • [54] MR Jeffrey. Hidden dynamics in models of discontinuity and switching. Physica D, 273:34–45, 2014, doi:doi.org/10.1016/j.physd.2014.02.003.
  • [55] DD Novaes and MR Jeffrey. Regularization of hidden dynamics in piecewise smooth flows. J Differ Equ, 259(9):4615–4633, 2015, doi:doi.org/10.1016/j.jde.2015.06.005.
  • [56] EN Lorenz. Deterministic nonperiodic flow. J Atmos Sci, 20(2):130–141, 1963, doi:doi.org/10.1175/1520-0469(1963)020%0130:DNF%2.0.CO;2.
  • [57] G Dal Maso. An introduction to Gamma-convergence, volume 8. Springer Science & Business Media, 2012.
  • [58] A Braides. Gamma-convergence for Beginners, volume 22. Clarendon Press, 2002.
  • [59] FJ Pinski, AM Stuart, and F Theil. γ\gamma-limit for transition paths of maximal probability. J Stat Phys, 146(5):955–974, 2012, doi:doi.org/10.1007/s10955-012-0443-8.
  • [60] Y Lu, A Stuart, and H Weber. Gaussian approximations for transition paths in Brownian dynamics. SIAM J Math Anal, 49(4):3005–3047, 2017, doi:doi.org/10.1137/16M1071845.
  • [61] T Li and X Li. Gamma-limit of the Onsager–Machlup functional on the space of curves. SIAM J Math Anal, 53(1):1–31, 2021, doi:doi.org/10.1137/20M1310539.
  • [62] B Ayanbayev, I Klebanov, HC Lie, and TJ Sullivan. Gamma-convergence of Onsager-Machlup functionals. Part I: With applications to maximum a posteriori estimation in Bayesian inverse problems. Inverse Probl, 38(2):025005, 2021, doi:doi.org/10.1088/1361-6420/ac3f81.
  • [63] B Ayanbayev, I Klebanov, H C Lie, and TJ Sullivan. Gamma-convergence of Onsager-Machlup functionals. Part II: Infinite product measures on Banach spaces. Inverse Probl, 38(2):025006, 2021, doi:doi.org/10.1088/1361-6420/ac3f82.
  • [64] K Hill, DS Abbot, and M Silber. Analysis of an Arctic sea ice loss model in the limit of a discontinuous albedo. SIAM J Appl Dyn Syst, 15(2):1163–1192, 2016, doi:doi.org/10.1137/15M1037718.
  • [65] J Zanetell. Tipping points in stochastically perturbed linear Filippov systems. Master’s thesis, Wake Forest University, 2018.
  • [66] LC Evans. Partial differential equations, volume 19. American Mathematical Society, 2010.
  • [67] J Jost and X Li-Jost. Calculus of variations, volume 64. Cambridge University Press, 1998.
  • [68] LC Evans. Weak convergence methods for nonlinear partial differential equations, volume 74. American Mathematical Socety, 1990.
  • [69] RA Adams and JJF Fournier. Sobolev spaces. Elsevier, 2003.
  • [70] T Grafke. String method for generalized gradient flows: Computation of rare events in reversible stochastic processes. J Stat Mech Theory Exp, 2019(4):043206, 2019, doi:doi.org/10.1088/1742-5468/ab11db.
  • [71] YA Kuznetsov, S Rinaldi, and A Gragnani. One-parameter bifurcations in planar Filippov systems. Int J Bifurc Chaos Appl Sci Eng, 13(8):2157–2188, 2003, doi:doi.org/10.1142/S0218127403007874.
  • [72] M di Bernardo and SJ Hogan. Discontinuity-induced bifurcations of piecewise smooth dynamical systems. Philos Trans R Soc A, 368(1930):4915–4935, 2010, doi:doi.org/10.1098/rsta.2010.0198.
  • [73] P Glendinning, MR Jeffrey, E Bossolini, J T Lázaro, and JM Olm. An introduction to piecewise smooth dynamics. Springer, 2019.
  • [74] I Gyöngy and N Krylov. Existence of strong solutions for Itô’s stochastic equations via approximations. Probab Theory Relat Fields, 105(2):143–158, 1996, doi:doi.org/10.1007/BF01203833.
  • [75] I Gyöngy. A note on Euler’s approximations. Potential Anal, 8(3):205–216, 1998, doi:doi.org/10.1023/A:1008605221617.
  • [76] A Neuenkirch, M Szölgyenyi, and L Szpruch. An adaptive Euler–Maruyama scheme for stochastic differential equations with discontinuous drift and its convergence analysis. SIAM J Numer Anal, 57(1):378–403, 2019, doi:doi.org/10.1137/18M1170017.
  • [77] N Berglund. Kramers’ law: Validity, derivations and generalisations. Markov Process Relat Fields, 19(3):459–490, 2013, doi:math-mprf.org/journal/articles/id1304.
  • [78] G Anzellotti and S Baldo. Asymptotic development by γ\gamma-convergence. Appl Math Optim, 27(2):105–123, 1993, doi:doi.org/10.1007/BF01195977.