Convex computation of maximal Lyapunov exponents

Hans Oeri, David Goluskin Email: [email protected]

(Department of Mathematics and Statistics, University of Victoria, Victoria, Canada)

Abstract

We describe an approach for finding upper bounds on an ODE dynamical system’s maximal Lyapunov exponent among all trajectories in a specified set. A minimization problem is formulated whose infimum is equal to the maximal Lyapunov exponent, provided that trajectories of interest remain in a compact set. The minimization is over auxiliary functions that are defined on the state space and subject to a pointwise inequality. In the polynomial case—i.e., when the ODE’s right-hand side is polynomial, the set of interest can be specified by polynomial inequalities or equalities, and auxiliary functions are sought among polynomials—the minimization can be relaxed into a computationally tractable polynomial optimization problem subject to sum-of-squares constraints. Enlarging the spaces of polynomials over which auxiliary functions are sought yields optimization problems of increasing computational cost whose infima converge from above to the maximal Lyapunov exponent, at least when the set of interest is compact. For illustration, we carry out such polynomial optimization computations for two chaotic examples: the Lorenz system and the Hénon–Heiles system. The computed upper bounds converge as polynomial degrees are raised, and in each example we obtain a bound that is sharp to at least five digits. This sharpness is confirmed by finding trajectories whose leading Lyapunov exponents approximately equal the upper bounds.

1 Introduction

In a dynamical system, the rates at which infinitesimally nearby trajectories converge or diverge are captured by Lyapunov exponents (LEs) [57]. These diagnostic exponents are especially prominent in the study of chaotic systems, where a positive exponent corresponds to exponentially diverging trajectories and, therefore, to sensitive dependence on initial conditions. For trajectories that are periodic orbits, LEs coincide with Floquet multipliers. If periodic orbits are linearly unstable and embedded in a chaotic attractor, their positive multipliers are one measure of their relative importance to the chaotic dynamics [11]. However, the instability of trajectories that positive LEs indicate also makes it difficult to compute such LEs precisely using numerical integration [12, 15, 23]. The present work is concerned with LEs that are maximal among all trajectories in a chosen set. Rather than using numerical integration, we formulate convex optimization problems that yield convergent upper bounds on maximal LEs.

Consider dynamics of $x(t)$ in $\mathbb{R}^{n}$ governed by an autonomous ODE system,

\frac{\mathrm{d}}{\mathrm{d}t}x(t)=f(x(t)),\qquad x(0)=x_{0}.

(1)

We ensure the system is well-posed by assuming $f\in\mathcal{C}^{1}(\mathbb{R}^{n},\mathbb{R}^{n})$ , where $\mathcal{C}^{k}(\mathbb{R}^{n},\mathbb{R}^{n})$ denotes the space of $k$ -times continuously differentiable functions mapping $\mathbb{R}^{n}$ to $\mathbb{R}^{n}$ , and similarly $\mathcal{C}^{k}(\mathbb{R}^{n})$ denotes functions mapping $\mathbb{R}^{n}$ to $\mathbb{R}$ . At each point along $x(t)$ , the linearization of 1 defines dynamics of $y(t)$ evolving in the tangent space $\mathbb{R}^{n}$ according to

\frac{\mathrm{d}}{\mathrm{d}t}y(t)=Df(x(t))\,y(t),\qquad y(0)=y_{0},

(2)

where $Df(x)$ is the $n\times n$ Jacobian matrix for $f(x)$ . The perturbation vector from a trajectory $x(t)$ to a very nearby trajectory evolves proportionally to $y(t)$ , so long as the trajectories remain close enough for the linearization 2 to be accurate. If two trajectories separate on average and evolve nonlinearly, eventually 2 will not describe the vector between them. Thus 2 may not approximate nearby trajectories uniformly in time, but it captures, at each instant $t$ , the dynamics of trajectories that are infinitesimally near $x(t)$ . Since 2 is linear in $y$ , the dynamics of $y$ are independent of its magnitude. Thus we fix $|y_{0}|=1$ , so $y(t)$ initially lies on the sphere $\mathbb{S}^{n-1}\subset\mathbb{R}^{n}$ but can grow or shrink in time.

For each initial condition $x_{0}\in\mathbb{R}^{n}$ and initial direction $y_{0}\in\mathbb{S}^{n-1}$ in the tangent space, define the resulting LE as

\mu(x_{0},y_{0})=\limsup_{t\rightarrow\infty}\frac{1}{t}\log|y(t)|.

(3)

Vectors $y(t)$ evolving in the tangent space of the trajectory $x(t)$ shrink or grow like $e^{\mu t}$ , on average. Under our assumption that $f\in\mathcal{C}^{1}(\mathbb{R}^{n},\mathbb{R}^{n})$ , if $x(t)$ remains bounded then $y(t)$ cannot shrink or grow faster than exponentially, so $\mu(x_{0},y_{0})$ cannot be $\pm\infty$ .

For the trajectory emanating from each $x_{0}$ , the possible values of the limit 3 among all initial directions $y_{0}$ define an associated spectrum of LEs, $\mu_{1}(x_{0})\geq\mu_{2}(x_{0})\geq\cdots$ . There are at most $n$ distinct LEs in the spectrum [45]. The Oseledets ergodic theorem [51, 57] implies that the leading exponent $\mu_{1}(x_{0})$ is attained for almost every initial direction $y_{0}$ , so often the leading LE is of primary interest, and many authors refer to it as “the” LE.

Here we pursue upper bounds that apply to all trajectories starting within any chosen set $\Omega\subset\mathbb{R}^{n}$ . In other words, these upper bounds apply to the maximal Lyapunov exponent $\mu^{*}_{\Omega}$ defined by

\mu_{\Omega}^{*}=\sup_{\begin{subarray}{c}x_{0}\in\Omega~{}~{}\\ ~{}\;y_{0}\in\mathbb{S}^{n-1}\end{subarray}}\mu(x_{0},y_{0}).

(4)

We assume all $x_{0}\in\Omega$ lead to bounded trajectories; if not, the supremum in 4 can be restricted to such $x_{0}$ . A bounded trajectory on which a positive $\mu^{*}_{\Omega}$ value is attained is, by definition, the most unstable trajectory starting in $\Omega$ . The methods we describe for upper-bounding the supremum in 4 could analogously lower-bound the infimum over $(x_{0},y_{0})$ , meaning they would give lower bounds on the smallest LE $\mu_{n}$ among trajectories starting in $\Omega$ , but here we speak only in terms of upper bounds on the maximal LE.

The difficulty in computing $\mu^{*}_{\Omega}$ by numerical integration is twofold. First, for a single $x_{0}$ leading to positive $\mu_{1}$ , the value of $\mu_{1}$ is hard to compute accurately because the exponential separation of trajectories magnifies numerical errors as time goes forward. In chaotic systems, the time average $\tfrac{1}{t}\log|y(t)|$ often converges extremely slowly with increasing $t$ , if at all [53, 5, 33, 60, 45]. Second, even if $\mu_{1}$ can be computed accurately for single trajectories, there may be an infinite number of trajectories with different $\mu_{1}$ values. This is expected of the unstable periodic orbits embedded in a chaotic attractor, for instance. Searching over such trajectories for the largest $\mu_{1}$ value is not a convex search, so even if one finds the trajectory that attains the maximal LE, the existence of another trajectory with larger $\mu_{1}$ cannot be ruled out using numerical integration. These difficulties motivate our current approach, which may complement the search for maximal LEs by numerical integration but is fundamentally different.

Often one is most interested in $\mu_{1}$ of a chaotic attractor, which will be the $\mu_{1}$ value of almost every trajectory in its basin of attraction. However, this $\mu_{1}$ value is usually strictly smaller than the maximal LE $\mu^{*}_{\Omega}$ for any $\Omega$ containing the attractor, with the maximal LE being attained only on a particular unstable orbit. The upper bounds we present here will converge to $\mu^{*}_{\Omega}$ from above and so would be strictly larger than $\mu_{1}$ on the attractor, nonetheless they will be better upper bounds than any existing method can give in many cases.

Our approach to bounding the maximal LE $\mu^{*}_{\Omega}$ from above is based on two ingredients that are explained in the next section. The first ingredient is a simple formula that expresses $\mu(x_{0},y_{0})$ as an infinite-time-averaged integral along trajectories $(x(t),y(t)/|y(t)|)$ in the state space $\mathbb{R}^{n}\times\mathbb{S}^{n-1}$ . Various versions of this formula have appeared in the dynamical systems literature, including the Furstenberg–Khasminskii formula for stochastic dynamics [3, 4]. Because $\mu^{*}_{\Omega}$ can be expressed as a time average that is maximized among certain trajectories, our second ingredient is a method for computing arbitrarily sharp upper bounds on such time averages. In the important case where the right-hand side of an ODE system is polynomial in the state space variables—and in some generalizations of this case—polynomial optimization recently has been used to compute arbitrarily sharp upper bounds on maximal time averages of various quantities [9, 24, 38, 50, 36, 13]. This approach yields upper bounds as numerically computed minima of sum-of-squares (SOS) programs—convex optimization problems in which the coefficients of polynomial expressions are constrained by requiring that the polynomials admit representations as sums of squares of other polynomials. A polynomial being SOS is a sufficient condition for pointwise nonnegativity that is stronger in general but also is more tractable to enforce computationally.

Starting concurrently with the advent of SOS programming two decades ago [48, 54, 39], methods have been developed to study dynamical systems by solving SOS programs. These methods all involve constructing scalar functions on a dynamical system’s state space and requiring that the functions obey pointwise inequalities. The inequalities, which can be enforced by SOS constraints, imply mathematical statements about the dynamics without the need to compute any trajectories. In addition to the method for bounding time averages that we employ here, SOS computations have been used to verify nonlinear stability [54, 2, 27, 21, 37], show that trajectories do or do not enter specified sets [29, 7, 52, 10], bound expectations in stochastic dynamical systems [37, 18], bound extrema over certain trajectories [17, 46] or over global attractors [25], and more. We note that all of these methods can be formulated in terms of occupation measures, which are certain measures induced by the dynamics. When an optimization over measures is truncated to a finite number of moments of the measure, the resulting problem is related to an SOS optimization problem by convex duality. For the methods presented below we speak in terms of SOS optimization, but there is a dual description in terms of measures and moments; see [32] for an introduction to occupation measures and the relationship between moments and SOS polynomials.

Section 2 recalls how SOS computations can bound time averages for ODE systems and then applies this approach to bounding LEs. Section 3 presents computational results showing the convergence of upper bounds to the maximal LE in two chaotic examples: the Lorenz system and the Hénon–Heiles system. Section 4 offers conclusions and future extensions.

2 Optimization problems for bounding maximal LEs

Section 2.1 explains how maximal time averages among ODE trajectories can be bounded above using convex optimization, including how the optimization problems can be relaxed into SOS programs. Section 2.2 specializes the approach of section 2.1 to the task of bounding a maximal LE from above, yielding computationally useful SOS programs along with theoretical sharpness guarantees. Section 2.3 explains how symmetries in the dynamical system allow symmetries to be imposed in the SOS programs for bounding maximal LEs, which improves computational cost and numerical accuracy. The computational methods of section 2.2, including the symmetry restrictions of section 2.3, are implemented to bound LEs in the examples of section 3.

2.1 Bounding time averages using convex optimization

Consider any continuous function $\Phi\in\mathcal{C}(\mathbb{R}^{n})$ taking values in $\mathbb{R}$ , along with its infinite-time average along a trajectory $x(t)$ of the ODE 1,

\displaystyle\overline{\Phi}(x_{0})=\limsup_{T\rightarrow\infty}\frac{1}{T}\int_{0}^{T}\Phi(x(t))\,\mathrm{d}t.

(5)

We define time averages with $\limsup$ to ensure they exist. (The bounds obtained would be unchanged if $\overline{\Phi}$ were defined using $\liminf$ .) If the trajectory emanating from $x_{0}$ is bounded forward in time, then so is $\Phi(x(t))$ , and thus $\overline{\Phi}$ is finite.

The strategy for bounding $\overline{\Phi}$ without knowing any particular trajectories is to introduce a differentiable auxiliary function $V\in\mathcal{C}^{1}(\mathbb{R}^{n})$ . For any such $V$ , the continuous function $f\cdot\nabla V$ , which also maps $\mathbb{R}^{n}$ to $\mathbb{R}$ , has a vanishing infinite-time average along every bounded trajectory. This is true because $\frac{\mathrm{d}}{\mathrm{d}t}V(x(t))=f(x(t))\cdot\nabla V(x(t))$ along trajectories; in other words, $f\cdot\nabla V$ is the Lie derivative of $V$ along ODE solutions. Since $V(x(t))$ is uniformly bounded forward in time along each bounded trajectory, $\overline{f\cdot\nabla V}=\limsup_{T\to\infty}\frac{1}{T}[V(x(T))-V(x_{0})]=0$ . For each differentiable $V$ and each set $\mathcal{B}\subset\mathbb{R}^{n}$ in which $x(t)$ remains for $t\geq 0$ ,

\displaystyle\overline{\Phi}=\overline{\Phi+f\cdot\nabla V}\leq\sup_{x\in\mathcal{B}}\left[\Phi(x)+f(x)\cdot\nabla V(x)\right].

(6)

If nothing is known about trajectories’ locations aside from each trajectory being bounded, one can always choose $\mathcal{B}=\mathbb{R}^{n}$ in 6. For a poor choice of $V$ the upper bound in 6 will be conservative—possibly even infinite if $\mathcal{B}$ is unbounded—but one can seek $V$ that make the right-hand side as small as possible. This optimization of $V$ is described in general in section 2.1.1, followed by its SOS relaxation in the polynomial case in section 2.1.2.

2.1.1 Optimization over auxiliary functions

The upper bound 6 holds for each bounded $x(t)$ and differentiable $V$ , so one can maximize the left-hand side over initial conditions $x_{0}$ and minimize the right-hand side over $V$ . Let $\mathcal{B}_{0}$ be a set of initial conditions whose trajectories remain bounded, uniformly in $t\geq 0$ but not necessarily uniformly in $x_{0}$ . Suppose all of these trajectories eventually remain in a set $\mathcal{B}$ , which may be unbounded. Then 6 can be minimized over $x_{0}\in\mathcal{B}_{0}$ and maximized over $V\in\mathcal{C}^{1}(\mathcal{B})$ to obtain

\sup_{x_{0}\in\mathcal{B}_{0}}\overline{\Phi}\leq\inf_{V\in\mathcal{C}^{1}(\mathcal{B})}\sup_{x\in\mathcal{B}}\left[\Phi(x)+f(x)\cdot\nabla V(x)\right].

(7)

Here $\mathcal{C}^{1}(\mathcal{B})$ is defined as the restriction of $\mathcal{C}^{1}(\mathbb{R}^{n})$ functions to a domain $\mathcal{B}$ (possibly of lower dimension), which is equivalent to the set of functions mapping $\mathcal{B}$ to $\mathbb{R}$ that admit a differentiable extension to an open neighborhood of $\mathcal{B}$ . In 7 one can always choose $\mathcal{B}=\mathbb{R}^{n}$ but often it is possible to choose smaller $\mathcal{B}$ , and this may decrease the upper bound. If $\mathcal{B}_{0}$ lies in the basin of a local attractor, for instance, one can choose any $\mathcal{B}$ that contains the attractor. If $\mathcal{B}_{0}$ is forward invariant, meaning no trajectories leave the set forward in time, then one can choose $\mathcal{B}=\mathcal{B}_{0}$ .

When $\mathcal{B}$ is forward invariant and also compact, and one chooses $\mathcal{B}_{0}=\mathcal{B}$ , then 7 is an equality [62],

\sup_{x_{0}\in\mathcal{B}}\overline{\Phi}=\inf_{V\in\mathcal{C}^{1}(\mathcal{B})}\sup_{x\in\mathcal{B}}\left[\Phi(x)+f(x)\cdot\nabla V(x)\right].

(8)

The equality 8 can be seen as a statement of strong convex duality: the right-hand side can be restated with the inner maximization over invariant measures, and the left-hand side can be stated in the same way with the order of minimization and maximization reversed. For a sketch of the proof of 8, see expression (17) in [62]. The duality result 8 has analogous precedents in the context of discrete-time dynamics, for which we refer to the discussion and references in [6, 35]. In [62] it is assumed that $\mathcal{B}$ is a full-dimensional region, but their proof works more generally for any set $\mathcal{B}$ that is compact and forward invariant [61]. This generality is needed in what follows, where we apply 8 to sets defined by Cartesian products involving the sphere $\mathbb{S}^{n-1}$ .

Although the infimum on the right-hand sides of 7 and 8 may not be attained, there exist $V$ giving arbitrarily sharp upper bounds on the left-hand supremum. However, it can be difficult to evaluate the inner supremum on the right-hand side for a given $V$ , and optimizing over $V$ is still harder. An important exception, in which computationally tractable relaxations of the right-hand minimax problem are possible, is the case where the ODE right-hand side $f(x)$ and the quantity $\Phi(x)$ are polynomials.

2.1.2 SOS relaxations

The minimax problem on the right-hand sides of 7 and 8 can be rewritten as a minimization subject to a pointwise inequality constraint,

\inf_{V\in\mathcal{C}^{1}(\mathcal{B})}\sup_{x\in\mathcal{B}}\left[\Phi(x)+f(x)\cdot\nabla V(x)\right]=\inf_{V\in\mathcal{C}^{1}(\mathcal{B})}B\quad\text{s.t.}\quad S(x)\geq 0~{}~{}\forall x\in\mathcal{B},

(9)

where

S(x)=B-\Phi(x)-f(x)\cdot\nabla V(x).

(10)

In the case where the given $f$ and $\Phi$ are polynomial in the components of $x$ , if $V$ is chosen to be polynomial also, then so is $S$ . Even for polynomial $S$ the right-hand minimization in 9 is often intractable; deciding nonnegativity of a multivariate polynomial on a given set has NP-hard computational complexity in general [47]. However, there is a standard way to enforce the nonnegativity constraint on $S$ using SOS conditions that are more tractable.

The simplest way to enforce nonnegativity of $S$ using SOS constraints is to require that $S$ admits an SOS representation, meaning that there exist a finite number of polynomials $q_{k}(x)$ such that $S=\sum_{k=1}^{K}q_{k}^{2}$ . In other words, $S$ belongs to the set $\Sigma_{n}$ of $n$ -variate SOS polynomials, which is a strict subset (when $n\geq 2$ ) of the set of globally nonnegative polynomials. In the corresponding SOS relaxation of the right-hand side of 9, the set $\mathcal{C}^{1}(\mathcal{B})$ is replaced by its subset $\mathbb{R}[x]$ of real polynomials in $x$ , and the constraint is strengthened to $S\in\Sigma_{n}$ . In some applications [24, 26, 38, 50] this simple SOS relaxation has given an upper bound not only equal to the left-hand side of 9 but also to the maximal time average $\sup_{x_{0}\in\mathcal{B}_{0}}\overline{\Phi}$ that one seeks to bound. In general, however, enforcing $S\in\Sigma_{n}$ can strictly increase the right-hand infimum in 9. This would be the case in our application to LEs below, where nonnegativity of a polynomial must be enforced only when one of its vector arguments has unit length. Thus a constraint weaker than $S\in\Sigma_{n}$ is needed.

When $\mathcal{B}\subsetneq\mathbb{R}^{n}$ there is a standard way to formulate SOS constraints that imply nonnegativity on $\mathcal{B}$ but not on all of $\mathbb{R}^{n}$ , assuming that $\mathcal{B}$ is a semialgebraic set. That is, we assume $\mathcal{B}$ can be specified by a finite number of polynomial inequalities and/or equalities:

\mathcal{B}=\left\{x\in\mathbb{R}^{n}~{}:~{}g_{i}(x)\geq 0~{}\text{for }i=1,\ldots,I,~{}\;h_{j}(x)=0~{}\text{for }j=1,\ldots,J\right\}.

(11)

A sufficient condition for $S$ to be nonnegative on $\mathcal{B}$ is that there exist polynomials $\rho_{j}\in\mathbb{R}[x]$ and SOS polynomials $\sigma_{i}\in\Sigma_{n}$ such that

	$\displaystyle S-\textstyle\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=1}^{J}\rho_{j}h_{j}$	$\displaystyle\in\Sigma_{n},$		(12)
	$\displaystyle\sigma_{i}$	$\displaystyle\in\Sigma_{n},\quad i=1,\ldots,I.$		(12)

When $x\in\mathcal{B}$ , the first sum in 12 is nonnegative and the second vanishes, so $S(x)\geq 0$ . When $x\notin\mathcal{B}$ , either sign of $S$ is possible. Relaxing the right-hand minimization in 9 by restricting to $V\in\mathbb{R}[x]$ and strengthening its constraint to 12 gives a possibly larger infimum,

\inf_{V\in\mathcal{C}^{1}(\mathcal{B})}\sup_{x\in\mathcal{B}}\left[\Phi(x)+f(x)\cdot\nabla V(x)\right]\\[-10.0pt] \leq\inf_{\begin{subarray}{c}V,\sigma_{i},\rho_{j}\in\mathbb{R}[x]\end{subarray}}B\quad\text{s.t.}\quad\begin{array}[]{rl}\\ S-\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=1}^{J}\rho_{j}h_{j}\in\Sigma_{n}{}\\ \sigma_{i}\in\Sigma_{n},&i=1,\ldots,I.\end{array}

(13)

It is proved in [38] that 13 is an equality if the semialgebraic specification 11 of $\mathcal{B}$ satisfies a technical condition called the Archimedean property [40, definition 3.4]. Every set with an Archimedean specification is compact, and every compact semialgebraic set’s specification can be made Archimedean, if it isn’t already, by adding a constraint $g_{i}(x)=R^{2}-|x|^{2}\geq 0$ with $R$ large enough so that the constraint is redundant.

If a compact set $\mathcal{B}$ is specified in an Archimedean way so that 13 is an equality, and if $\mathcal{B}$ is forward invariant so that the equality 8 holds, then the maximal time average we seek to bound is equal to the infimum on the right-hand side of 13 [38, theorem 1]:

\sup_{x_{0}\in\mathcal{B}}\overline{\Phi}=\inf_{\begin{subarray}{c}V,\sigma_{i},\rho_{j}\in\mathbb{R}[x]\end{subarray}}B\quad\text{s.t.}\quad\begin{array}[]{rl}\\ S-\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=1}^{J}\rho_{j}h_{j}\in\Sigma_{n}{}\\ \sigma_{i}\in\Sigma_{n},&i=1,\ldots,I.\end{array}

(14)

Theorem 4 in [36] implies an equality similar to 14 but where the infimum subject to SOS constraints is replaced by its convex dual – a maximization subject to constraints on moments of invariant measures.

One more relaxation of the right-hand minimization in 14 is needed to obtain a computationally tractable problem: one chooses finite polynomial spaces from which to seek $V$ and each $\sigma_{i}$ and $\rho_{j}$ . This yields an SOS program—a convex optimization problem that has SOS constraints and whose tunable parameters appear only affinely in these constraints and in the optimization objective [55]. (Note that the tunable parameters in 14 are $B$ and the coefficients of $V$ , which appear affinely in $S$ , along with the coefficients of each $\sigma_{i}$ and $\rho_{j}$ .) The resulting SOS program will be computationally tractable when the dimensions of the ODE state space and of the vector spaces of tunable polynomials are not too large. A sequence of SOS programs with increasing computational cost can be defined by choosing successively higher-dimensional spaces over which to optimize the tunable polynomials. The infima of these SOS programs converge from above to the right-hand side of 14. In cases where the equality 14 holds, this means that SOS programs can give arbitrarily sharp upper bounds on the left-hand maximal time average.

2.2 Upper bounds on maximal LEs

The framework of section 2.1 gives bounds on a time-integrated average $\overline{\Phi}$ over bounded trajectories, where $\Phi$ is an explicit function of the state space variable. We can apply this framework to the bounding of LEs 3 by expressing the LE $\mu$ as the time-integrated average of a particular $\Phi$ . This is not possible with $\Phi$ depending only on the state space vector $x$ of the ODE system 1, but it is straightforward if $\Phi$ is a function on the enlarged state space $(x,y)$ since the definition 3 of $\mu$ can be rewritten as

\mu(x_{0},y_{0})=\limsup_{T\rightarrow\infty}\frac{1}{T}\int_{0}^{T}\frac{\rm d}{{\rm d}t}\log{|y|}\;{\rm d}t=\overline{y\cdot\tfrac{{\rm d}y}{{\rm d}t}/|y|^{2}}=\overline{y^{\sf T}Df\bigl{(}x\bigr{)}y/|y|^{2}}.

(15)

The first equality follows from the definition 3 by the fundamental theorem of calculus, relying on the absolute continuity of $t\mapsto\log|y(t)|$ . The third equality follows from the $y$ ODE 2.

In cases where $\mu(x_{0},y_{0})$ is positive, $y(t)$ is unbounded despite $x(t)$ being bounded, and so the framework of section 2.1 does not apply to trajectories in $(x,y)$ state space. We therefore project the tangent dynamics 2 onto $\mathbb{S}^{n-1}$ by letting $z(t)=y(t)/|y(t)|$ . Then $\mu$ can be written as a time average over a bounded trajectory in $(x,z)$ state space as

\displaystyle\mu(x_{0},z_{0})

\displaystyle=\overline{z^{\sf T}Df(x)z}.

(16)

Versions of the above formula have appeared in the literature for decades, including with modifications for stochastic dynamics as the Furstenberg–Khasminskii formula [3, 4].

2.2.1 Bounds from minimization over auxiliary functions

To bound the time average on the right-hand side of 16 using the framework of section 2.1, we need evolution equations for the $(x,z)$ state space. The ODE for $z$ is found by taking the time derivative of the definition $z(t)=y(t)/|y(t)|$ and applying the $y$ ODE 2. Coupling this with the $x$ ODE 1 gives

\displaystyle\frac{\mathrm{d}}{\mathrm{d}t}\begin{bmatrix}x\\ z\end{bmatrix}=\begin{bmatrix}f(x)\\ \ell(x,z)\end{bmatrix},\qquad(x,z)\in\mathbb{R}^{n}\times\mathbb{S}^{n-1},

(17)

where

\ell(x,z)=Df(x)z-[z^{\sf T}Df(x)z]z.

(18)

Recall that the framework of section 2.1 was described for an ODE with state space $\mathbb{R}^{n}$ , right-hand side $f(x)$ , and initial set $\mathcal{B}_{0}$ whose trajectories eventually remain in $\mathcal{B}$ . To bound the maximal LE of the $x$ ODE, we apply the same framework to an ODE with state space $\mathbb{R}^{n}\times\mathbb{S}^{n-1}$ , right-hand side as in 17, and initial set $\mathcal{B}_{0}\times\mathbb{S}^{n-1}$ whose trajectories eventually remain in $\mathcal{B}\times\mathbb{S}^{n-1}$ . We choose $\Phi(x,z)=z^{\sf T}Df(x)z$ , so that $\mu=\overline{\Phi}$ . Then the left-hand side of 7 becomes the maximal LE $\mu^{*}_{\mathcal{B}_{0}}$ defined in 4, and the right-hand side of 7 becomes the upper bound that we seek. This upper bound on $\mu^{*}_{\mathcal{B}_{0}}$ is stated in the first part of proposition 1 below, which requires no further proof. When $\mathcal{B}$ is compact and forward invariant for the $x$ ODE, so that $\mathcal{B}\times\mathbb{S}^{n-1}$ is compact and forward invariant for 17, the equality 8 proved in [62] can be applied in the present context to give the second part of the proposition. The assumption $f\in\mathcal{C}^{2}$ may be stronger than needed but suffices to ensure $(f,\ell)\in\mathcal{C}^{1}$ , in which case the result of [62] is directly applicable to the ODE with right-hand side $(f,\ell)$ .

Proposition 1.

Let $\frac{\mathrm{d}}{\mathrm{d}t}x(t)=f(x(t))$ .

Let $f\in\mathcal{C}^{1}(\mathbb{R}^{n},\mathbb{R}^{n})$ , and let $\mathcal{B}_{0}\subset\mathbb{R}^{n}$ and $\mathcal{B}\subset\mathbb{R}^{n}$ be such that each trajectory $x(t)$ with $x_{0}\in\mathcal{B}_{0}$ is bounded for $t\geq 0$ and eventually remains in $\mathcal{B}$ . Then the maximal LE $\mu^{*}_{\mathcal{B}_{0}}$ among trajectories with initial conditions in $\mathcal{B}_{0}$ is bounded above by

\mu_{\mathcal{B}_{0}}^{*}\leq\inf_{V\in\mathcal{C}^{1}(\mathcal{B}\times\mathbb{S}^{n-1})}\sup_{\begin{subarray}{c}x\in\mathcal{B}~{}~{}\\ ~{}\;z\in\mathbb{S}^{n-1}\end{subarray}}\left(z^{\sf T}Dfz+f\cdot\nabla_{x}V+[Dfz-(z^{\sf T}Dfz)z]\cdot\nabla_{z}V\right).

(19)

Let $f\in\mathcal{C}^{2}(\mathbb{R}^{n},\mathbb{R}^{n})$ , and let $\mathcal{B}\subset\mathbb{R}^{n}$ be compact and forward invariant. Then,

\mu_{\mathcal{B}}^{*}=\inf_{V\in\mathcal{C}^{1}(\mathcal{B}\times\mathbb{S}^{n-1})}\sup_{\begin{subarray}{c}x\in\mathcal{B}~{}~{}\\ ~{}\;z\in\mathbb{S}^{n-1}\end{subarray}}\left(z^{\sf T}Dfz+f\cdot\nabla_{x}V+[Dfz-(z^{\sf T}Dfz)z]\cdot\nabla_{z}V\right).

(20)

In the proposition, $\nabla_{x}V$ and $\nabla_{z}V$ denote gradients of the auxiliary function $V(x,z)$ with respect to $x$ and $z$ , respectively. In the first part of the proposition, neither $\mathcal{B}_{0}$ nor $\mathcal{B}$ is necessarily bounded—for instance, 19 holds with $\mathcal{B}_{0}=\mathcal{B}=\mathbb{R}^{n}$ if each trajectory $x(t)$ is bounded forward in time. Note that the right-hand sides of 19 and 20 are identical, but 20 requires stronger assumptions on $\mathcal{B}$ along with choosing $\mathcal{B}_{0}=\mathcal{B}$ on the left-hand side. A result analogous to 20 in the context of linear systems with stochastic forcing is proved in [37].

2.2.2 Bounds from SOS relaxations

If the $x$ ODE 1 is polynomial, then so are the $(x,z)$ ODE 17 and the quantity $\Phi=z^{\sf T}Df(x)z$ , and we let $V(x,z)$ be polynomial also. The inner supremum on the right-hand sides of 19 and 20 is bounded above by a constant $B$ if and only if the polynomial

\displaystyle P(x,z)=B-z^{\sf T}Df(x)z-f(x)\cdot\nabla_{x}V(x,z)-\big{[}Df(x)z-(z^{\sf T}Df(x)z)z\big{]}\cdot\nabla_{z}V(x,z)

(21)

is nonnegative for all $(x,z)\in\mathcal{B}\times\mathbb{S}^{n-1}$ . Note that the $P(x,z)\geq 0$ constraint in the context of the $(x,z)$ ODE 17 is the analogue of the $S(x)\geq 0$ constraint appearing in 9 in the context of the $x$ ODE alone. Just as $S(x)\geq 0$ on $\mathcal{B}$ if the SOS conditions in 12 hold, $P\geq 0$ on $\mathcal{B}\times\mathbb{S}^{n-1}$ if there exist polynomials $\rho_{j}(x,z)$ and SOS polynomials $\sigma_{i}(x,z)$ such that

	$\displaystyle P-\textstyle\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=0}^{J}\rho_{j}h_{j}$	$\displaystyle\in\Sigma_{2n},$		(22)
	$\displaystyle\sigma_{i}$	$\displaystyle\in\Sigma_{2n},\quad i=1,\ldots,I,$		(22)

where $h_{0}(z)=1-|z|^{2}$ so that it vanishes on $\mathbb{S}^{n-1}$ , and the conditions $g_{i}(x)\geq 0$ and $h_{j}(x)=0$ for $i,j\geq 1$ specify $\mathcal{B}$ as in 11. Note that we allow non-compact $\mathcal{B}$ , including the case $\mathcal{B}=\mathbb{R}^{n}$ wherein 22 should be interpreted as the single SOS constraint $P-\rho_{0}h_{0}\in\Sigma_{2n}$ .

The SOS relaxation of the right-hand minimization in 19 and 20 is the minimization of the constant $B$ that appears in $P(x,z)$ , where the polynomials $V,\sigma_{i},\rho_{j}$ are optimized subject to 22. The infimum of the relaxed problem is an upper bound on the right-hand infimum in 19, which gives the first part of proposition 2 below with no further proof needed. If the specification of $\mathcal{B}$ is Archimedean, then adding the constraint $h_{0}(z)=0$ gives an Archimedean specification of $\mathcal{B}\times\mathbb{S}^{n-1}$ . In this case the SOS relaxation does not change the value of the right-hand infimum in 19 and 20, as explained after 13. The second part of proposition 2 thus follows from the second part of proposition 1.

Proposition 2.

Let $\frac{\mathrm{d}}{\mathrm{d}t}x(t)=f(x(t))$ with $f\in\mathbb{R}^{n}[x]$ . Let $\mathcal{B}\subset\mathbb{R}^{n}$ be either all of $\mathbb{R}^{n}$ or specified by a finite number of polynomial inequalities $g_{i}(x)\geq 0$ and equalities $h_{j}(x)=0$ as in 11. Let $h_{0}(z)=1-|z|^{2}$ , and define $P(x,z)$ as in 21.

Let $\mathcal{B}_{0}\subset\mathbb{R}^{n}$ be such that each trajectory $x(t)$ with $x_{0}\in\mathcal{B}_{0}$ is bounded for $t\geq 0$ and eventually remains in $\mathcal{B}$ . Then the maximal LE $\mu^{*}_{\mathcal{B}_{0}}$ among trajectories with initial conditions in $\mathcal{B}_{0}$ is bounded by

\mu_{\mathcal{B}_{0}}^{*}\leq\inf_{\begin{subarray}{c}V,\sigma_{i},\rho_{j}\in\mathbb{R}[x,z]\end{subarray}}B\quad\text{s.t.}\quad\begin{array}[]{rl}\\ P-\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=0}^{I}\rho_{j}h_{j}\in\Sigma_{2n}{}\\ \sigma_{i}\in\Sigma_{2n},&i=1,\ldots,I.\end{array}

(23)

Let $\mathcal{B}$ be compact and forward invariant, and suppose its semialgebraic specification is Archimedean. Then,

\mu_{\mathcal{B}}^{*}=\inf_{\begin{subarray}{c}V,\sigma_{i},\rho_{j}\in\mathbb{R}[x,z]\end{subarray}}B\quad\text{s.t.}\quad\begin{array}[]{rl}\\ P-\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=0}^{I}\rho_{j}h_{j}\in\Sigma_{2n}{}\\ \sigma_{i}\in\Sigma_{2n},&i=1,\ldots,I.\end{array}

(24)

The right-hand sides of 23 and 24 are identical, but 24 requires stronger assumptions on $\mathcal{B}$ along with choosing $\mathcal{B}_{0}=\mathcal{B}$ on the left-hand side. The minimization on the right-hand sides of 23 and 24 is over $V(x,z)$ , $\rho_{j}(x,z)$ , and $\sigma_{i}(x,z)$ in the infinite-dimensional space $\mathbb{R}[x,z]$ of $2n$ -variate polynomials. Restricting each of these tunable polynomials to a finite-dimensional subspace—i.e., choosing a finite basis with tunable coefficients—gives an SOS program that can be solved computationally when $n$ , $I$ , $J$ , and the polynomial subspace dimensions are not too large. As required for an SOS program, all tunable coefficients appear affinely in the SOS expressions since $B$ and $V$ appear affinely in the definition 21 of $P$ . Although restricting the tunable polynomials $V,\sigma_{i},\rho_{j}$ to finite subspaces may increase the infimum on the right-hand sides of 23 and 24, one can always enlarge these subspaces and solve another SOS program with greater computational cost. In this way, a sequence of increasingly expensive SOS programs will give a nonincreasing sequence of infima that converge to the right-hand infimum in 23 and 24. Thus, when the equality 24 holds, SOS computations can give arbitrarily sharp upper bounds on the maximal LE $\mu^{*}_{\mathcal{B}}$ , at least in theory. The computational examples presented below in section 3 show that convergence of upper bounds to the maximal LE indeed can be achieved in practice.

Remark 1.

In the definition 3 of the LE, the Euclidean norm can be replaced by any other norm without changing the value of the LE [45]. If one chooses a weighted Euclidean norm $|y|_{\sf A}=\big{(}y^{\sf T}{\sf A}y\big{)}^{1/2}$ for any positive definite $n\times n$ matrix ${\sf A}$ , all results of section 2.2 can be generalized analogously. In this generalization, $\Phi$ and $\ell$ are replaced by $\Phi_{\sf A}(x,z)=z^{\sf T}{\sf A}Df(x)z$ and $\ell(x,z)=Df(x)z-\Phi_{\sf A}(x,z)z$ , and the normalization of $z$ requires $h_{0}(z)=1-|z|_{\sf A}^{2}=0$ . The resulting SOS minimization on the right-hand sides of 23 and 24 is modified only by the matrix ${\sf A}$ appearing affinely in $P$ and in $h_{0}$ . For an SOS program defined by fixing the spaces over which $V,\sigma_{i},\rho_{j}$ are optimized, different fixed ${\sf A}$ may give smaller or larger minima, meaning better or worse upper bounds on the maximal LE. One might tune ${\sf A}$ along with $V,\sigma_{i},\rho_{j}$ to minimize the bound. However, doing so would make the minimization non-convex because ${\sf A}$ and $\rho_{0}$ multiply each other in the first SOS constraint, so we do not pursue this idea here.

2.3 Symmetries

Making use of symmetries in SOS programming significantly reduces computational cost and numerical imprecision. In particular, when a polynomial that is constrained to be SOS has a known symmetry, block diagonal structure can be imposed on the matrix representation of that polynomial in the corresponding semidefinite program [22]. For the SOS programs described in section 2.1 in the general setting of bounding time averages in ODEs, the SOS programs can be formulated to have symmetries if the ODEs and the quantities whose time averages are being bounded have corresponding symmetries [26, 38]. Since our framework for bounding maximal LEs is a particular application of section 2.1, we can use any symmetries that are shared by the $(x,z)$ ODE 17, the domain $\mathcal{B}\times\mathbb{S}^{n-1}$ , and the function $\Phi=z^{\sf T}Df(x)z$ whose time average is equal to an LE. Section 2.3.1 describes how orthogonal symmetries of the $x$ dynamics induce symmetries of the $(x,z)$ dynamics, and section 2.3.2 describes how the latter lead to symmetries in the SOS programs that give upper bounds on LEs.

2.3.1 Symmetries of the $(x,z)$ dynamics

An ODE is said to be symmetric under a transformation if solutions are mapped to solutions. Let $\Lambda\in GL(n)$ , where $GL(n)$ denotes the group of invertible linear transformations on $\mathbb{R}^{n}$ . An ODE $\frac{\mathrm{d}}{\mathrm{d}t}x=f(x)$ on $\mathbb{R}^{n}$ is symmetric under $\Lambda$ if and only if $f$ is equivariant under $\Lambda$ , meaning $f(\Lambda x)=\Lambda f(x)$ for all $x\in\mathbb{R}^{n}$ . A function $\Phi$ is invariant under $\Lambda$ if $\Phi(\Lambda x)=\Phi(x)$ for all $x\in\mathbb{R}^{n}$ . The invariant transformations of any function form a group, as do its equivariant transformations. A function is said to be $\mathcal{G}$ -invariant (resp. $\mathcal{G}$ -equivariant) for a group $\mathcal{G}$ if it is invariant (resp. equivariant) under all $\Lambda\in\mathcal{G}$ . Turning particularly to the $(x,z)$ dynamics 17, we seek a symmetry group for which the right-hand side of the ODE is equivariant, and the function $\Phi=z^{\sf T}Df(x)z$ is invariant, under all transformations in the group.

The transformation $(x,z)\mapsto(x,-z)$ is a symmetry both of the $(x,z)$ ODE 17 and of the function $\Phi(x,z)=z^{\sf T}Df(x)z$ , reflecting the fact that the growth or decay rate of a vector in the tangent space is unaffected by reversing its direction. The equivariance of the ODE right-hand side under this transformation amounts to the oddness $\ell(x,-z)=-\ell(x,z)$ , where $\ell(x,z)$ is defined as in 18, and the invariance $\Phi(x,-z)=\Phi(x,z)$ is clear.

Further symmetries of the $(x,z)$ dynamics are guaranteed only if the $x$ ODE alone is symmetric under orthogonal transformations, common examples of which include sign changes or rotations. In particular, if the $x$ dynamics are symmetric under $\Lambda\in O(n)$ , where $O(n)$ denotes the group of orthogonal linear transformations on $\mathbb{R}^{n}$ , then the $(x,z)$ dynamics are symmetric under $(x,z)\mapsto(\Lambda x,\Lambda z)$ , and so is the function $\Phi(x,z)$ . To state this precisely in the following proposition, for each group $\mathcal{G}\subset O(n)$ we define the corresponding group $\mathcal{G}^{\prime}\in O(2n)$ as

\mathcal{G}^{\prime}=\left\{\begin{bmatrix}\Lambda&0\\ 0&\pm\Lambda\end{bmatrix}\in O(2n)~{}:~{}\Lambda\in\mathcal{G}\subset O(n)\right\}.

(25)

Proposition 3.

Let $f\in\mathcal{C}^{1}(\mathbb{R}^{n},\mathbb{R}^{n})$ . Define $\ell:\mathbb{R}^{n}\times\mathbb{S}^{n-1}\to\mathbb{R}^{n}$ and $\Phi:\mathbb{R}^{n}\times\mathbb{S}^{n-1}\to\mathbb{R}$ by $\ell(x,z)=Df(x)z-[z^{\sf T}Df(x)z]z$ and $\Phi(x,z)=z^{\sf T}Df(x)z$ . If $f$ is $\mathcal{G}$ -equivariant for some $\mathcal{G}\subset O(n)$ , then $(f,\ell)$ is $\mathcal{G}^{\prime}$ -equivariant for $\mathcal{G}^{\prime}$ defined by 25, and $\Phi$ is $\mathcal{G}^{\prime}$ -invariant.

Proof.

Let $\Lambda\in\mathcal{G}$ . Invariance under $\mathcal{G}^{\prime}$ is equivalent to invariance under $(x,z)\mapsto(x,-z)$ and under $(x,z)\mapsto(\Lambda x,\Lambda z)$ , and likewise for equivariance. For $(x,z)\mapsto(x,-z)$ , the equivariance of $(f,\ell)$ and invariance of $\Phi$ are immediate, as described above in the text. For $(x,z)\mapsto(\Lambda x,\Lambda z)$ , the invariance of $\Phi$ is shown by

\displaystyle\Phi(\Lambda x,\Lambda z)=(\Lambda z)^{\sf T}Df(\Lambda x)\Lambda z=z^{\sf T}\Lambda^{\sf T}\Lambda Df(x)z=z^{\sf T}Df(x)z=\Phi(x,z),

(26)

where the second equality uses the relation $Df(\Lambda x)\Lambda=\Lambda Df(x)$ that is found by differentiating the equivariance definition $f(\Lambda x)=\Lambda f(x)$ , and the third equality requires $\Lambda\in O(n)$ . The claimed equivariance of $(f,\ell)$ holds if $f(\Lambda x)=\Lambda f(x)$ , which is true by assumption, and also $\ell(\Lambda x,\Lambda z)=\Lambda\ell(x,z)$ . To show the latter, we note that $\ell$ is related to $\Phi$ by $\ell(x,z)=Df(x)z-\Phi(x,z)z$ , and we use the relation $Df(\Lambda x)\Lambda=\Lambda Df(x)$ again along with the invariance of $\Phi$ :

\displaystyle\ell(\Lambda x,\Lambda z)=Df(\Lambda x)\Lambda z-\Phi(\Lambda x,\Lambda z)\Lambda z=\Lambda[Df(x)z-\Phi(x,z)z]=\Lambda\ell(x,z).

(27)

This completes the proof. ∎

Remark 2.

The symmetry results described above and in section 2.3.2 below are not directly applicable if $\mathcal{G}\not\subset O(n)$ , but they are applicable after a change of variables if $\mathcal{G}$ is a subgroup of $O(n)$ after a similarity transformation. Concretely, suppose the $x$ dynamics has a symmetry group $\mathcal{G}_{\sf M}\not\subset O(n)$ that takes the form

\mathcal{G}_{\sf M}=\{{\sf M}^{-1}\Lambda{\sf M}~{}:~{}\Lambda\in\mathcal{G}\}

(28)

for any fixed $\mathcal{G}\subset O(n)$ and ${\sf M}\in GL(n)$ . (Groups of the form 28 are exactly all compact subgroups of $GL(n)$ [8, theorem 0.3.5].) Defining $\tilde{x}(t)={\sf M}x(t)$ , so that tangent space vectors for $\tilde{x}(t)$ are ${\sf M}y(t)$ , we note that the $x$ dynamics and the $\tilde{x}$ dynamics have the same LEs. Thus the framework of section 2.2 for bounding LEs can be applied to the $\tilde{x}$ ODE $\frac{\mathrm{d}}{\mathrm{d}t}\tilde{x}={\sf M}f({\sf M}^{-1}\tilde{x})$ , rather than to the $x$ ODE. The $\tilde{x}$ ODE has the orthogonal symmetry group $\mathcal{G}$ , so according to proposition 3 the $(x,z)$ dynamics is symmetric under $\mathcal{G}^{\prime}$ .

2.3.2 Symmetries of the optimization problems

If an ODE, its domain, and a time-averaged quantity share a compact linear group of symmetries, then the upper bound on this time average given by the right-hand infimum in 7 and 8 is unchanged if the auxiliary function $V$ is further constrained to be invariant under the same symmetries. In particular, any $V$ that yields an upper bound on a time average can be used to construct a symmetrized $V$ yielding the same bound. This is proved in [26, Proposition A.1] in the context of a finite cyclic symmetry group. Proposition 6 in the appendix generalizes the result to all compact groups of linear transformations—i.e., all groups of orthogonal transformations, along with similarity transformations 28 of such groups.

We apply these symmetry results in the context of bounding LEs by the infimum on the right-hand sides of 19 and 20, in which case the pertinent symmetry group $\mathcal{G}^{\prime}$ of the $(x,z)$ dynamics is given by proposition 3 above. Provided that the domain $\mathcal{B}$ is $\mathcal{G}$ -invariant, the following proposition guarantees that corresponding symmetries can be imposed on $V(x,z)$ in the right-hand minimization of 19 and 20 without changing the resulting upper bound.

Proposition 4.

Let $\mathcal{G}\subset O(n)$ and $f\in\mathcal{C}^{1}(\mathcal{B},\mathbb{R}^{n})$ . If $f$ is $\mathcal{G}$ -equivariant and $\mathcal{B}$ is $\mathcal{G}$ -invariant, then the infimum over $V\in\mathcal{C}^{1}(\mathcal{B}\times\mathbb{S}^{n-1})$ on the right-hand sides of 19 and 20 is unchanged if $V$ is constrained to be $\mathcal{G}^{\prime}$ -invariant for $\mathcal{G}^{\prime}$ defined by 25.

Proof.

Under the assumptions on $\mathcal{B}$ and $f$ , proposition 3 guarantees that $(f,\ell)$ is $\mathcal{G}^{\prime}$ -equivariant and $\Phi=z^{\sf T}Dfz$ is $\mathcal{G}^{\prime}$ -invariant. The domain $\mathcal{B}\times\mathbb{S}^{n-1}$ is also $\mathcal{G}^{\prime}$ -invariant because $\mathcal{B}$ is $\mathcal{G}$ -invariant by assumption and $\mathbb{S}^{n-1}$ is $\mathcal{G}$ -invariant for any $\mathcal{G}\subset O(n)$ . We can therefore apply proposition 6 from the appendix in the context of symmetry group $\mathcal{G}^{\prime}$ , domain $\mathcal{B}\times\mathbb{S}^{n-1}$ , ODE right-hand side $(f,\ell)$ , and $\Phi=z^{\sf T}Dfz$ . In this context the proposition guarantees that if there exist $V\in\mathcal{C}^{1}(\mathcal{B}\times\mathbb{S}^{n-1})$ and $B\in\mathbb{R}$ such that $P(x,z)\geq 0$ for all $(x,z)\in\mathcal{B}\times\mathbb{S}^{n-1}$ , where $P$ is as defined by 21, then there exists $\widehat{V}(x,z)\in\mathcal{C}^{1}(\mathcal{B}\times\mathbb{S}^{n-1})$ that satisfies the same inequality and is $\mathcal{G}^{\prime}$ -invariant. Therefore, the infimum of

\inf_{V\in\mathcal{C}^{1}(\mathcal{B}\times\mathbb{S}^{n-1})}B\quad\text{s.t.}\quad P(x,z)\geq 0~{}~{}\forall(x,z)\in\mathcal{B}\times\mathbb{S}^{n-1}

(29)

is unchanged if $V$ is further constrained to be $\mathcal{G}^{\prime}$ -invariant. The constrained minimization problem 29 is equivalent to the minimax problem on the right-hand sides of 19 and 20, so the claim is proved. ∎

Proposition 4, which allows invariance to be imposed on $V$ in the minimization that appears on the right-hand sides of 19 and 20, has an analogue for the minimization’s SOS relaxation that appears on the right-hand sides of 23 and 24. This is stated by proposition 5 below, whose assumptions require that the $\mathcal{G}$ -invariant set $\mathcal{B}$ is specified in terms of $\mathcal{G}$ -invariant polynomials, and whose conclusions allow invariance to be imposed not only on $V$ but also on any other tunable polynomials $\sigma_{i}$ and $\rho_{j}$ . We do not include a proof because the result is a direct application of proposition 7 in the same way that proposition 4 is an application of proposition 6.

Proposition 5.

Let $\mathcal{G}\subset O(n)$ and $f\in\mathbb{R}^{n}[x]$ . Let $\mathcal{B}\subset\mathbb{R}^{n}$ be either all of $\mathbb{R}^{n}$ or specified by a finite number of polynomial inequalities $g_{i}(x)\geq 0$ and equalities $h_{j}(x)=0$ as in 11. If $f$ is $\mathcal{G}$ -equivariant and all $g_{i}$ and $h_{j}$ are $\mathcal{G}$ -invariant, then the infimum on the right-hand sides of 23 and 24 over $V,\sigma_{i},\rho_{j}\in\mathbb{R}[x,z]$ (or over subspaces of $\mathbb{R}[x,z]$ ) is unchanged if $V$ and all $\sigma_{i}$ and $\rho_{j}$ are constrained to be $\mathcal{G}^{\prime}$ -invariant for $\mathcal{G}^{\prime}$ defined by 25. When $V$ and all $\sigma_{i}$ and $\rho_{j}$ are $\mathcal{G}^{\prime}$ -invariant, the first expression constrained to be SOS in 23 and 24 is $\mathcal{G}^{\prime}$ -invariant also.

Proposition 5 has practical implications for the SOS programs whose solutions give upper bounds on the maximal LE—i.e., the SOS programs obtained by restricting the minimization on the right-hand sides of 23 and 24 to finite spaces of polynomials. Imposing $\mathcal{G}^{\prime}$ -invariance on $V,\sigma_{i},\rho_{j}$ does not change the resulting upper bounds on LEs, but it ensures that all expressions constrained to be SOS are $\mathcal{G}^{\prime}$ -invariant. Exploiting this latter invariance in the numerical solution of SOS programs can greatly improve computational cost and accuracy [22].

Remark 3.

If the $x$ ODE has no symmetries, then $\mathcal{G}$ contains only the identity, and $\mathcal{G}^{\prime}$ contains the identity and the transformation $(x,z)\mapsto(x,-z)$ . In this case proposition 4 reduces to the statement that making $V(x,z)$ even in $z$ does not change the resulting upper bounds on LEs, and proposition 5 says the same about making all tunable polynomials even in $z$ .

3 Computational examples

This section presents two examples for which we have computed upper bounds on the maximal LE using SOS programming, in particular by numerically solving the minimization on the right-hand sides of 23 and 24 with the tunable polynomials restricted to various maximum degrees. The example of section 3.1 is the Lorenz system, which has a chaotic attractor. The example of section 3.2 is the Hénon–Heiles system with a chosen range of energy values, which is Hamiltonian and chaotic. The computed upper bounds converge rapidly to the maximal LE as we raise the degrees of the tunable polynomials. In both examples our best bounds are sharp to at least five digits.

The SOS programs whose solutions we report below were solved numerically in the standard way, wherein each polynomial that must be SOS is represented by a symmetric matrix using monomial bases. This results in matrices subject to affine and semidefinite constraints, constituting a semidefinite program that can be solved numerically using a variety of standard software. Here we used a version [16] of the parser YALMIP [42, 43] to specify SOS programs, translate them into semidefinite programs, and pass the latter to a solver. For the solver we used Mosek [1], which is not open source but implements a second-order interior-point algorithm.

3.1 The Lorenz system

Consider the Lorenz system [44],

\frac{\mathrm{d}}{\mathrm{d}t}\begin{bmatrix}x_{1}\\ x_{2}\\ x_{3}\end{bmatrix}=\begin{bmatrix}\sigma(x_{2}-x_{1})\\ rx_{1}-x_{2}-x_{1}x_{3}\\ x_{1}x_{2}-\beta x_{3}\end{bmatrix},

(30)

with the standard chaotic parameter values $(\beta,\sigma,r)=(8/3,10,28)$ . For the unstable fixed point at the origin, or any trajectory approaching it along its stable manifold, the LEs are the real parts of the eigenvalues of $Df(\mathbf{0})$ . Thus the leading LE at the origin is $\mu_{1}=\bigl{(}-1-\sigma+\sqrt{1-2\sigma+4r\sigma+\sigma^{2}}\bigr{)}/2\approx 11.82772$ . In fact this is the maximal LE among all trajectories of the Lorenz system, as verified by an upper bound on the maximal LE that is saturated by $\mu_{1}$ at the origin. This sharp upper bound has been proved analytically [41] by a method that is not sharp in general but that is sometimes sharp when the maximal LE is attained on a fixed point. The computations we report below also give the sharp upper bound, at least up to numerical error.

We use the SOS approach of section 2.2.2 to compute upper bounds on the maximal LE $\mu^{*}_{\mathbb{R}^{3}}$ of the Lorenz system. For any $\mathcal{B}$ containing the global attractor, expression 23 gives an upper bound. We choose $\mathcal{B}=\mathbb{R}^{3}$ , so the constraints on the right-hand side of 23 reduce to simply $P-\rho_{0}h_{0}\in\Sigma_{6}$ . For this non-compact $\mathcal{B}$ , part 2 of proposition 2 does not apply to guarantee sharpness of the upper bound, nonetheless the computations described below apparently give the sharp result. The SOS programs we solve are obtained by restricting $V$ and $\rho_{0}$ to finite-dimensional spaces, in particular by specifying a maximum degree $d$ and imposing invariances that, according to proposition 5, do not change the optimum of the SOS program. In addition to the invariance under negation of $z$ that can always be imposed, the symmetry of the Lorenz equations under $(x_{1},x_{2},x_{3})\mapsto(-x_{1},-x_{2},x_{3})$ induces another, so $V$ and $\rho_{0}$ are optimized within the spaces

\mathcal{V}_{d}=\left\{p\in\mathbb{R}[x,z]_{d}~{}:~{}p(x,z)=p(x,-z)=p(\Lambda x,\Lambda z)~{}~{}\forall(x,z)\in\mathbb{R}^{3}\times\mathbb{R}^{3}\right\},

(31)

where $\Lambda$ negates the first two coordinates, and $\mathbb{R}[x,z]_{d}$ denotes the space of polynomials in $(x,z)$ with total degree no larger than $d$ . In our computations we let $\rho_{0}\in\mathcal{V}_{d}$ for various $d$ , but we keep $V\in\mathcal{V}_{2}$ since quadratic $V$ suffice for sharp bounds in this case. Each fixed $d$ gives an SOS program that we solve to obtain an upper bound on the maximal LE of the Lorenz system:

\mu_{\mathbb{R}^{3}}^{*}\leq\inf_{\begin{subarray}{c}V\in\mathcal{V}_{2}\\ \rho_{0}\in\mathcal{V}_{d}\end{subarray}}B\quad\text{s.t.}\quad P-\rho_{0}(1-|z|^{2})\in\Sigma_{6},

(32)

where $P(x,z)$ is defined as in 21.

Table 1 reports the upper bounds 32 found by numerically solving the SOS programs with the maximum degree of $\rho_{0}$ fixed to $d=2$ , 4, and 6. The upper bounds are converged to at least 7 digits when $d\geq 4$ , and we expect that the exact minimum of the SOS program with $d=4$ is identical to the maximal LE of the Lorenz system, meaning 32 is an equality in this case.

Table 1: Upper bounds on the global maximal LE

\mu^{*}_{\mathbb{R}^{3}}

of the Lorenz system 30 at the standard parameters, found by numerically solving the right-hand SOS program in 32 with quadratic

V

and with the maximum degree

d

\rho_{0}

fixed to various values. Tabulated bounds are rounded to the precision shown. To this precision, the converged bound is saturated by the fixed point at the origin (see text).

Degree of $\rho_{0}$	Upper bound on maximal LE
2	14.02562
4	11.82772
6	11.82772

The results of table 1 constitute an example of our method giving sharp upper bounds on the maximal LE of a chaotic system using polynomials of modest degree—and, therefore, using SOS computations of modest cost. The example is relatively easy, even though the Lorenz system is chaotic, in the sense that the maximal LE is attained by the fixed point at the origin rather than a more complicated trajectory. In the next subsection we apply our method to a more challenging example.

3.2 The Hénon–Heiles system

To demonstrate the success of our method when the maximal LE is attained on an orbit that is more complicated than a fixed point, we consider the Hénon–Heiles system [28, 58], which is Hamiltonian. The Hamiltonian function is

\displaystyle H(x_{1},x_{2},x_{3},x_{4})=\frac{1}{2}(x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2})+x_{1}^{2}x_{2}-\frac{1}{3}x_{2}^{3},

(33)

where $x_{1}$ and $x_{2}$ are position variables with corresponding momentum variables $x_{3}$ and $x_{4}$ . This $H$ gives rise to the ODE system

\frac{\mathrm{d}}{\mathrm{d}t}\begin{bmatrix}x_{1}\\ x_{2}\\ x_{3}\\ x_{4}\end{bmatrix}=\begin{bmatrix}x_{3}\\ x_{4}\\ -x_{1}-2x_{1}x_{2}\\ -x_{2}-x_{1}^{2}+x_{2}^{2}\end{bmatrix}.

(34)

For our example, we seek a set that is invariant under the dynamics and where the maximal LE is not attained by a fixed point. The fixed points of 34 have energies of $H=0$ and $H=1/6$ . The three fixed points with $H=1/6$ , which are related by symmetry as described below, have a large leading LE. We omit these points by restricting attention to the set where $0\leq H\leq 1/7$ , which is dynamically invariant since $H$ is conserved along trajectories. This set consists of several disconnected regions in $\mathbb{R}^{4}$ , one of which is bounded; figure 1 shows the intersection of these regions with the $(x_{1},x_{2})$ plane. To restrict to the bounded region we add the condition $x_{1}^{2}+x_{2}^{2}\leq 1$ , so the region of interest is the compact semialgebraic set

\mathcal{B}=\{x\in\mathbb{R}^{4}~{}:~{}g_{i}(x)\geq 0~{}\text{for }i=1,2,3\},

(35)

where


$\displaystyle g_{1}(x)$	$\displaystyle=1/7-H(x_{1},x_{2},x_{3},x_{4}),$	(36a)
$\displaystyle g_{2}(x)$	$\displaystyle=H(x_{1},x_{2},x_{3},x_{4}),$	(36b)
$\displaystyle g_{3}(x)$	$\displaystyle=1-x_{1}^{2}-x_{2}^{2}.$	(36c)

The set $\mathcal{B}$ is invariant under the dynamics since the invariant region where $0\leq H\leq 1/7$ does not intersect the surface where $g_{3}=0$ . We have verified using SOS computations (although not analytically) that the semialgebraic specification of $\mathcal{B}$ is Archimedean, in which case part 2 of proposition 2 guarantees that 24 holds, so SOS programs with increasing polynomial degrees give arbitrarily sharp upper bounds on $\mu^{*}_{\mathcal{B}}$ .

Refer to caption — Figure 1: The shaded areas show the intersection of the $(x_{1},x_{2})$ plane with the regions where $0\leq H\leq 1/7$ . These disconnected regions are separated by the set where $x_{1}^{2}+x_{2}^{2}=1$ ( $\cdot$ $\cdot$ $\cdot$ $\cdot$ $\cdot$ ). The central shaded area is where the compact region $\mathcal{B}$ intersects the $(x_{1},x_{2})$ plane.

We use the SOS approach of section 2.2.2 to compute upper bounds on the maximal LE $\mu^{*}_{\mathcal{B}}$ among trajectories in the dynamically invariant set $\mathcal{B}$ . In order to impose symmetries on the SOS programs as described in section 2.3.2, we note that the Hénon–Heiles ODE has a symmetry group $\mathcal{G}\subset O(4)$ . Each $\Lambda\in\mathcal{G}$ is a transformation in which some $A\in D_{3}$ acts on both the position vector $(x_{1},x_{2})$ and the momentum vector $(x_{3},x_{4})$ , where $D_{3}$ is the dihedral group generated by two planar transformations: reflection in the first coordinate and rotation by $2\pi/3$ . Since the $g_{i}$ polynomials defining $\mathcal{B}$ are $\mathcal{G}$ -invariant also, proposition 5 guarantees that the upper bound from the SOS program is unchanged if $\mathcal{G}^{\prime}$ -invariance is imposed on all tunable polynomials. However, the software [16] with which we implement SOS programs can automatically exploit sign-reversal symmetries but not rotational ones, so the only symmetry of 34 that we make use of is $(x_{1},x_{2},x_{3},x_{4})\mapsto(-x_{1},x_{2},-x_{3},x_{4})$ . The tunable polynomials are therefore optimized within the spaces

\mathcal{W}_{d}=\left\{p\in\mathbb{R}[x,z]_{d}~{}:~{}p(x,z)=p(x,-z)=p(\Lambda x,\Lambda z)~{}~{}\forall(x,z)\in\mathbb{R}^{4}\times\mathbb{R}^{4}\right\},

(37)

where $\Lambda$ negates the first and third coordinates. In our computations we specify the same maximum total degree $d$ for all tunable polynomials. Each fixed $d$ gives an SOS program that we solve to obtain an upper bound on the maximal LE of the Hénon–Heiles system in the region $\mathcal{B}$ :

\mu_{\mathcal{B}}^{*}\leq\inf_{V,\sigma_{1},\sigma_{2},\sigma_{3},\rho_{0}\in\mathcal{W}_{d}}B\quad\text{s.t.}\quad\begin{array}[t]{rl}P-\sum_{i=1}^{3}\sigma_{i}g_{i}-\rho_{0}(1-|z|^{2})\in\Sigma_{8}{}\\ \sigma_{i}\in\Sigma_{8},&i=1,2,3,\end{array}

(38)

where $P(x,z)$ is defined as in 21.

Table 2 reports the upper bounds 38 found by numerically solving the SOS program with the maximum degrees of all tunable polynomials fixed to $d=2$ , 4, 6, 8, and 10. The upper bounds improve as $d$ is raised, appearing to converge to five digits once polynomials are of degree 8. To confirm the sharpness of this upper bound, we computed a number of unstable periodic orbits of the Hénon–Heiles system and calculated the leading LE of each using a procedure described below. On the shortest of these periodic orbit we computed a leading LE of $\mu_{1}\approx 0.23081$ , confirming that the best upper bounds in table 2 are sharp to at least five digits.

Table 2: Upper bounds on the maximal LE

\mu^{*}_{\mathcal{B}}

among trajectories of the Hénon–Heiles system 34 in the set

\mathcal{B}

defined by 35 and 36. Bounds are found by numerically solving the right-hand SOS program in 38 with the maximum degree

d

of all tunable polynomials fixed to various values. Tabulated bounds are rounded to the precision shown. To this precision, the best bound is saturated by the periodic orbits shown in figure 2.

Degree of polynomials	Upper bound on maximal LE
2	0.86999
4	0.41206
6	0.26717
8	0.23081
10	0.23081

To compute a number of unstable periodic orbits—and thus to find one with a large leading LE—we first numerically integrated the Hénon–Heiles system from various initial conditions and searched for close returns to the initial conditions. A fourth–order symplectic integrator [14] was used to conserve $H$ . The numerical coefficients for the scheme were as in [20, equation 4.8]. Choosing 121 initial conditions along the curve where $(x_{3},x_{4})=(0,0)$ and $H=1/7$ , we integrated each trajectory for $0\leq t\leq 33$ . For every trajectory that satisfied the close return condition $|x(t)-x(0)|<10^{-3}$ at some $t\geq 1$ , we selected the smallest such time $T$ and corresponding initial condition $x(0)$ as producing an approximate periodic orbit.

After compiling 19 pairs of initial conditions and periods giving approximate periodic orbits, we used a shooting method to converge each more precisely to a periodic orbit. Precise convergence was needed because errors in an initial condition often produce larger errors in the LE later computed for that orbit. Our shooting method implementation used MATLAB’s fminsearch function to minimize $|x(T)-x(0)|$ by tuning the period $T$ and the $x_{1}$ coordinate of the initial condition. (The initial $x_{2}$ coordinate was determined from the conditions that $(x_{3},x_{4})=(0,0)$ and $H=1/7$ .) On each iteration of the shooting method, symplectic integration up to time $T$ was carried out using a fixed time step close to $10^{-3}$ but evenly dividing $T$ . All 19 orbits were converged such that $|x(T)-x(0)|<10^{-13}$ , which required modifying the default options of the fminsearch function to tighten tolerances and allow more iterations. Among the 19 converged orbits there were only 7 different periods (approximately 6.97, 8.07, 15.6, 23.0, 29.3, 29.7, and 30.0) with orbits of the same period being related to each other by symmetries of the Hénon–Heiles system. The leading LEs of all 7 orbits were computed as described below, and the largest LE was found on the orbit with the shortest period, whose numerically converged initial condition and period are

x(0)=(0.562878385826716,-0.053847890920149,0,0),\quad T=6.966517640959103.

(39)

Figure 2 shows this periodic orbit, along with the two symmetry-related orbits obtained by rotating both the position vector $(x_{1},x_{2})$ and the momentum vector $(x_{3},x_{4})$ by $\pm 2\pi/3$ .

To compute the leading LE on each periodic orbit we integrated the ODEs 1 and 2 for the orbit $x(t)$ and the unnormalized tangent space vector $y(t)$ , using the symplectic integrator for $x$ and the fourth-order Runge–Kutta method for $y$ . In order to align $y$ with the leading Lyapunov vector, we began with $y$ of unit length pointing in a random direction and integrated repeatedly around the orbit, normalizing $y$ once per period. After 14 periods the direction of $y$ converged to machine precision. Using this normalized $y$ for the initial condition, we integrated for one more period and calculated the leading LE as $\mu_{1}=\tfrac{1}{T}\log|y(T)|$ . For the shortest-period orbits, which are shown in figure 2, our procedure gave a leading LE of $\mu_{1}\approx 0.23081$ . This value agrees with our best upper bound in table 2 to all five digits. (We do not compare additional digits because we estimate that the numerically computed infima of the SOS programs are precise to about five digits.) These results strongly suggest that the maximal LE among trajectories in $\mathcal{B}$ indeed occurs on the orbits shown in figure 2.

We emphasize that using numerical integration to find periodic orbits and obtain the precise value $\mu_{1}\approx 0.23081$ on one such orbit was much more difficult than using SOS optimization to compute the upper bound $\mu^{*}_{\mathcal{B}}\leq 0.23081$ . There is inherent numerical difficulty in finding unstable orbits, especially the most unstable orbits that attain the maximal LE, and in converging the orbits to the precision needed to compute their LEs. On the other hand, such considerations do not affect the SOS programs that give upper bounds on the maximal LE. Our MATLAB script implementing these SOS programs with existing software is fewer than 30 lines. For the computation with degree-8 polynomials that gives the nearly exact bound, the runtime was less than 8 minutes on a laptop and could be made faster by fully exploiting the rotational symmetries.

4 Conclusions

We have described an approach for using polynomial optimization to compute upper bounds on a polynomial ODE system’s maximal LE among all bounded trajectories in a given set—i.e., the leading LE of the most unstable trajectory. For trajectories remaining in a compact domain that is specified as a semialgebraic set subject to a mild technical condition, the upper bounds are guaranteed to converge to the maximal LE as the polynomial degrees in the optimization problem are raised. Because symmetries allow for faster and more accurate polynomial optimization, we have studied how an ODE’s symmetries induce symmetries in the tangent dynamics of its Lyapunov vectors and, in turn, in the optimization problems that bound LEs.

The polynomial optimization problems that we formulate are SOS programs, and typically they must be solved numerically. If desired one could produce computer-assisted proofs by a similar approach, either by augmenting numerical computations with rational arithmetic [56] or interval arithmetic [34], or by using exact algebraic computation [30, 31] if its cost is not prohibitive. For polynomials of modest degree, the computations can sometimes guide analytical proofs also [24].

For illustration, we solved SOS programs numerically to find upper bounds on the maximal LE of two chaotic systems: the Lorenz system, which is dissipative, and the Henon–Héiles system, which is Hamiltonian. In each case the bounds become sharp as polynomial degrees are raised in the optimization problems, which also raises the computational cost. Bounds for the Lorenz system indicate that the maximal LE is attained on the fixed point at the origin, which has been proved analytically already [41]. Bounds for the Hénon–Heiles system indicate that the maximal LE in a certain invariant set is attained on the shortest unstable periodic orbit.

The more typical approach to computing LEs, in contrast to ours, requires computing various approximate trajectories by numerical integration, along with the approximate LEs of these trajectories. Computing a variety of unstable trajectories can be hard on its own, but computing LEs on such orbits presents additional difficulty. On periodic orbits, imprecision in computation of the orbits can be magnified in the LE computation. On chaotic orbits, LEs are infinite-time averages approximated over finite times, but the convergence of these time averages is often very slow and without a priori error estimates [59, Chapter 1.9]. Such LE computations, to the extent that they can be trusted, are lower bounds on the maximal LE among all trajectories. Our approach provides upper bounds and thus complements numerical integration.

There are various natural extensions to the present work. For dynamics governed by discrete maps rather than differential equations, there is a direct analogue to the general variational formulation of proposition 1, but additional steps are needed to arrive at an SOS relaxation similar to proposition 2. This is described in the doctoral thesis of the first author [49]. Extensions to LEs governed by stochastic ODEs are immediate since the general method for bounding time averages that we have applied here has an analogue for stochastic ODEs [18]. In an example with linear deterministic dynamics subject to multiplicative noise, upper and lower bounds on the stochastic LE that essentially equal one another have been computed successfully [37]. We expect significantly more difficulty in cases where the deterministic dynamics is nonlinear and has different LEs on different trajectories. This is because computing bounds on stationary expectations for a stochastic system that would be violated by time averages in the corresponding deterministic system requires polynomials of high degree and leads to poor numerical conditioning [18]. Overcoming this obstacle would provide a powerful tool, for instance to find positive lower bounds on LEs of stochastic systems and thereby verify chaotic behavior.

Acknowledgements

We thank Charles Doering, Giovanni Fantuzzi, Jeremy Parker, Sam Punshon-Smith, and Anthony Quas for some very helpful discussions in the course of this work. The discussions with Charles Doering were enabled by funding for his visit to the University of Victoria from the Pacific Institute for the Mathematical Sciences. Both authors were supported by the Canadian NSERC Discovery Grants Program via award numbers RGPIN-2018-04263, RGPAS-2018-522657, and DGECR-2018-0037.

Appendix

Propositions 6 and 7 below concern symmetry in the general framework for bounding time averages that the present work has applied to bounding LEs. Proposition 6 generalizes [26, Proposition A.1], which was proved in the context of finite cyclic symmetry groups, to all compact subgroups of $GL(n)$ . Proposition 7 generalizes [38, Theorem 2] in the same way. Compact subgroups of $GL(n)$ either are subgroups of $O(n)$ or are similarity transformations of such a subgroup as in 28 [8, theorem 0.3.5]. The significance of propositions 6 and 7 is that symmetry can be imposed on the function $V$ being optimized in the minimization on the right-hand sides of 7 and 8, and on $V$ and the other tunable polynomials in the SOS relaxation on the right-hand sides of 13 and 14, without changing the infima that give upper bounds on $\overline{\Phi}$ . In the SOS programs, the polynomial expressions that are constrained to be SOS will be symmetric also, which can be exploited for faster and more accurate numerical solutions of SOS programs.

Proposition 6.

Let $\mathcal{G}$ be a compact subgroup of $GL(n)$ . Suppose the set $\mathcal{B}\subset\mathbb{R}^{n}$ is $\mathcal{G}$ -invariant, the function $f:\mathcal{B}\to\mathbb{R}^{n}$ is $\mathcal{G}$ -equivariant, and the function $\Phi:\mathcal{B}\to\mathbb{R}$ is $\mathcal{G}$ -invariant. If there exist $V\in\mathcal{C}^{1}(\mathcal{B})$ and $B\in\mathbb{R}$ such that $B-\Phi(x)-f(x)\cdot\nabla V(x)\geq 0$ for all $x\in\mathcal{B}$ , then there exists $\widehat{V}\in\mathcal{C}^{1}(\mathcal{B})$ that is $\mathcal{G}$ -invariant and satisfies the same inequality.

Proof.

Because $\mathcal{G}$ is compact it has a Haar probability measure $m$ [19, proposition 11.4] that is invariant in the sense that [8, theorem 3.1]

\int_{\mathcal{G}}\iota(\Lambda\Lambda^{\prime})\,{\rm d}m(\Lambda)=\int_{\mathcal{G}}\iota(\Lambda)\,{\rm d}m(\Lambda)

(40)

for all $\Lambda^{\prime}\in\mathcal{G}$ and $\iota\in\mathcal{C}(\mathcal{G})$ . Define the symmetrization of $V$ as

\displaystyle\widehat{V}(x)=\int_{\mathcal{G}}V(\Lambda x)\;{\rm d}m(\Lambda),

(41)

and note that $\widehat{V}$ is $\mathcal{G}$ -invariant by virtue of 40. Since $B-\Phi(x)-f(x)\cdot\nabla V(x)\geq 0$ holds for all $x\in\mathcal{B}$ , and $\mathcal{B}$ is $\mathcal{G}$ -invariant, the same inequality holds when $x$ is replaced by $\Lambda x$ for any $\Lambda\in\mathcal{G}$ . Evaluating this inequality at $\Lambda x$ and then using the invariance of $\Phi$ and equivariance of $f$ under $\Lambda$ gives

\displaystyle B-\Phi(x)-\Lambda f(x)\cdot\nabla V(\Lambda x)\geq 0\quad\forall x\in\mathcal{B}.

(42)

The gradient in 42 is with respect to the entire argument $\Lambda x$ , but this alternatively can be expressed using a gradient with respect to $x$ since

\Lambda f(x)\cdot\nabla_{x^{\prime}}V(x^{\prime})|_{x^{\prime}=\Lambda x}=f(x)\cdot\nabla_{x}V(\Lambda x).

(43)

Substituting the right-hand side of 43 into 42 and integrating over $\mathcal{G}$ against $m$ gives

\displaystyle B-\Phi(x)-f(x)\cdot\nabla\widehat{V}(x)\geq 0\quad\forall x\in\mathcal{B},

(44)

which is the desired inequality for the symmetrization $\widehat{V}$ . ∎

Proposition 7.

Let $\mathcal{G}$ be a compact subgroup of $GL(n)$ . Suppose all $g_{i},h_{j}\in\mathbb{R}[x]$ that define $\mathcal{B}$ via 11 are $\mathcal{G}$ -invariant, and suppose $\Phi\in\mathbb{R}[x]$ and $f\in\mathbb{R}^{n}[x]$ are $\mathcal{G}$ -invariant and $\mathcal{G}$ -equivariant, respectively. If there exist $V,\sigma_{i},\rho_{j}\in\mathbb{R}[x]$ and $B\in\mathbb{R}$ such that

B-\Phi-f\cdot\nabla V-\textstyle\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=1}^{J}\rho_{j}h_{j}\in\Sigma_{n}\quad\text{ and }\quad\sigma_{i}\in\Sigma_{n}~{}\text{ for }~{}i=1,\ldots,I,

(45)

then there exist $\mathcal{G}$ -invariant $\widehat{V},\widehat{\sigma}_{i},\widehat{\rho}_{j}\in\mathbb{R}[x]$ satisfying 45 in place of $V,\sigma_{i},\rho_{j}$ . With these $\widehat{V},\widehat{\sigma}_{i},\widehat{\rho}_{j}\in\mathbb{R}[x]$ , the expression in the first constraint of 45 is $\mathcal{G}$ -invariant also.

Proof.

Define $\widehat{V}$ as the $\mathcal{G}$ -invariant symmetrization of $V$ using the invariant Haar probability measure $m$ , as in 41, and define $\widehat{\sigma}_{i}$ and $\widehat{\rho}_{j}$ as the analogous symmetrizations of $\sigma_{i}$ and $\rho_{j}$ . To see that $\sigma_{i}\in\Sigma_{n}$ implies $\widehat{\sigma}_{i}\in\Sigma_{n}$ , let $b(x)$ be a vector of all monomials with degree up to half the degree of $\sigma_{i}$ . There exists a symmetric matrix $Q_{\sigma_{i}}$ such that $\sigma_{i}(x)=b(x)^{\sf T}Q_{\sigma_{i}}b(x)$ , and because $\sigma_{i}\in\Sigma_{n}$ it is possible to choose $Q_{\sigma_{i}}\succeq 0$ . For each $\Lambda\in\mathcal{G}$ , every entry in $b(\Lambda x)$ is a linear combination of the entries of $b(x)$ , so there exists a matrix $\Gamma(\Lambda)$ such that $b(\Lambda x)=\Gamma(\Lambda)b(x)$ . Therefore,

\widehat{\sigma}_{i}(x)=\int_{\mathcal{G}}\sigma_{i}(\Lambda x)\,{\rm d}m(\Lambda)=b(x)^{\sf T}\underbrace{\left[\int_{\mathcal{G}}\Gamma(\Lambda)^{\sf T}Q_{\sigma_{i}}\Gamma(\Lambda)\,{\rm d}m(\Lambda)\right]}_{\widehat{Q}_{\sigma_{i}}}b(x).

(46)

This formula makes clear that $\widehat{\sigma}_{i}$ is a polynomial, and it is also SOS since $\Gamma(\Lambda)^{\sf T}Q_{\sigma_{i}}\Gamma(\Lambda)\succeq 0$ for each $\Lambda\in\mathcal{G}$ implies $\widehat{Q}_{\sigma_{i}}\succeq 0$ . Similar arguments confirm that $\widehat{V}$ and $\widehat{\rho}_{j}$ are polynomials but not that they are SOS since $V$ and $\rho_{j}$ need not be.

The argument for the first SOS constraint is similar to the argument for $\widehat{\sigma}_{i}$ . There exist a (possibly different) basis vector $b(x)$ and a matrix $Q\succeq 0$ such that

B-\Phi-f\cdot\nabla V-\textstyle\sum_{i=1}^{I}\sigma_{i}g_{i}-\sum_{j=1}^{J}\rho_{j}h_{j}=b^{\sf T}Q\,b

(47)

since the left-hand expression is assumed to be SOS. The above equality holds for all $x\in\mathbb{R}^{n}$ , including with $x$ replaced by $\Lambda x$ . Substituting $\Lambda x$ for $x$ , we use the $\Lambda$ -invariance of $V,g_{i},h_{j}$ and the $\Lambda$ -equivariance of $f$ , along with 43, to find

B-\Phi(x)-f(x)\cdot\nabla_{x}V(\Lambda x)-\textstyle\sum_{i=1}^{J}\sigma_{i}(\Lambda x)g_{i}(x)-\textstyle\sum_{j=1}^{I}\rho_{j}(\Lambda x)h_{j}(x)\\ =b(x)^{\sf T}\Gamma(\Lambda)^{\sf T}Q\,\Gamma(\Lambda)b(x).

(48)

As in 46, integrating both sides of the above equality over $\mathcal{G}$ against $m$ gives

B-\Phi-f\cdot\nabla\widehat{V}-\textstyle\sum_{i=1}^{J}\widehat{\sigma}_{i}g_{i}-\textstyle\sum_{j=1}^{I}\widehat{\rho}_{j}h_{j}=b^{\sf T}\underbrace{\left[\int_{\mathcal{G}}\Gamma(\Lambda)^{\sf T}Q\Gamma(\Lambda)\;dm(\Lambda)\right]}_{\widehat{Q}}b.

(49)

The integrand defining $\widehat{Q}$ is positive semidefinite for each $\Lambda$ , so $\widehat{Q}\succeq 0$ also, thus the left-hand expression is SOS. This completes the proof that $\widehat{V},\widehat{\sigma}_{i},\widehat{\rho}_{j}$ satisfy 45. The left-hand expression in 49 is also $\mathcal{G}$ -invariant since it can be written as an integral against the Haar probability measure of $\mathcal{G}$ , and this confirms the second part of the claim. ∎

References

Andersen and Andersen [2000] E. D. Andersen and K. D. Andersen. The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. In High performance optimization, pages 197–232. Springer, 2000.
Anderson and Papachristodoulou [2015] J. Anderson and A. Papachristodoulou. Advances in computational Lyapunov analysis using sum-of-squares programming. Discrete Contin. Dyn. Syst-B, 20:2361–2381, 2015.
Arnold and Imkeller [1995] L. Arnold and P. Imkeller. Furstenberg–Khasminskii formulas for Lyapunov exponents via anticipative calculus. Stochastics Stoch. Reports, 54:127–168, 1995.
Bedrossian et al. [2022] J. Bedrossian, A. Blumenthal, and S. Punshon-Smith. A regularity method for lower bounds on the Lyapunov exponent for stochastic differential equations. Invent. Math., 227:429–516, 2022.
Benettin et al. [1980] G. Benettin, L. Galgani, A. Giorgilli, and J.-M. Strelcyn. Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Part 1: Theory. Meccanica, 15:9–20, 1980.
Bochi [2018] J. Bochi. Ergodic optimization of Birkhoff averages and Lyapunov exponents. In Proc. Int. Congr. Math., pages 1825–1846, 2018.
Bramburger and Goluskin [2020] J. J. Bramburger and D. Goluskin. Minimum wave speeds in monostable reaction-diffusion equations: sharp bounds by polynomial optimization. Proc. R. Soc. A, 476:20200450, 2020.
Bredon [1972] G. E. Bredon. Introduction to compact transformation groups. Academic Press, 1972.
Chernyshenko et al. [2014] S. I. Chernyshenko, P. Goulart, D. Huang, and A. Papachristodoulou. Polynomial sum of squares in fluid dynamics: a review with a look ahead. Philos. Trans. R. Soc. Lond. A, 372:20130350, 2014.
Cibulka et al. [2022] V. Cibulka, M. Korda, and T. Hanis. Spatio-temporal decomposition of sum-of-squares programs for the region of attraction and reachability. IEEE Control Syst. Lett., 6:812–817, 2022.
Cvitanović et al. [2016] P. Cvitanović, R. Artuso, R. Mainieri, G. Tanner, and G. Vattay. Chaos: classical and quantum, ChaosBook.org. Niels Bohr Institute, 2016.
Dieci et al. [1997] L. Dieci, R. D. Russell, and E. S. Van Vleck. On the computation of Lyapunov exponents for continuous dynamical systems. SIAM J. Numer. Anal., 34:402–423, 1997.
Doering and McMillan [2022] C. R. Doering and A. McMillan. Optimal time averages in non-autonomous nonlinear dynamical systems. J. Pure Appl. Funct. Anal., 7:231–251, 2022.
Donnelly and Rogers [2005] D. Donnelly and E. Rogers. Symplectic integrators: an introduction. Am. J. Phys., 73:938–945, 2005.
Eckmann and Ruelle [1985] J.-P. Eckmann and D. Ruelle. Ergodic theory of chaos and strange attractors. In The theory of chaotic attractors, pages 273–312. Springer, 1985.
Fantuzzi [2019] G. Fantuzzi. Aeroimperial-YALMIP, 2019. URL https://github.com/aeroimperial-optimization/aeroimperial-yalmip.
Fantuzzi and Goluskin [2020] G. Fantuzzi and D. Goluskin. Bounding extreme events in nonlinear dynamics using convex optimization. SIAM J. Appl. Dyn. Syst, 19:1823–1864, 2020.
Fantuzzi et al. [2016] G. Fantuzzi, D. Goluskin, D. Huang, and S. I. Chernyshenko. Bounds for deterministic and stochastic dynamical systems using sum-of-squares optimization. SIAM J. Appl. Dyn. Syst, 15:1962–1988, 2016.
Folland [1999] G. B. Folland. Real analysis: modern techniques and their applications. John Wiley & Sons, 1999.
Forest and Ruth [1990] E. Forest and R. D. Ruth. Fourth-order symplectic integration. Physica D, 43:105–117, 1990.
Fuentes et al. [2022] F. Fuentes, D. Goluskin, and S. Chernyshenko. Global stability of fluid flows despite transient growth of energy. Phys. Rev. Lett., 128:204502, 2022.
Gatermann and Parrilo [2004] K. Gatermann and P. A. Parrilo. Symmetry groups, semidefinite programs, and sums of squares. J. Pure Appl. Algebra, 192:95–128, 2004.
Geist et al. [1990] K. Geist, U. Parlitz, and W. Lauterborn. Comparison of different methods for computing Lyapunov exponents. Progr. Theoret. Phys., 83:875–893, 1990.
Goluskin [2018] D. Goluskin. Bounding averages rigorously using semidefinite programming: mean moments of the Lorenz system. J. Nonlinear Sci., 28:621–651, 2018.
Goluskin [2020] D. Goluskin. Bounding extrema over global attractors using polynomial optimization. Nonlinearity, 33:4878–4899, 2020.
Goluskin and Fantuzzi [2019] D. Goluskin and G. Fantuzzi. Bounds on mean energy in the Kuramoto–Sivashinsky equation computed using semidefinite programming. Nonlinearity, 32:1705–1730, 2019.
Hafstein et al. [2018] S. Hafstein, S. Gudmundsson, P. Giesl, and E. Scalas. Lyapunov function computation for autonomous linear stochastic differential equations using sum-of-squares programming. Discrete Contin. Dyn. Syst. Ser. B, 23:939–956, 2018.
Hénon and Heiles [1964] M. Hénon and C. Heiles. The applicability of the third integral of motion: some numerical experiments. Astron. J, 69:73, 1964.
Henrion and Korda [2014] D. Henrion and M. Korda. Convex computation of the region of attraction of polynomial control systems. IEEE Trans. Automat. Contr., 59:297–312, 2014.
Henrion et al. [2016] D. Henrion, S. Naldi, and M. Safey El Din. Exact algorithms for linear matrix inequalities. SIAM J. Optim., 26:2512–2539, 2016.
Henrion et al. [2019] D. Henrion, S. Naldi, and M. Safey El Din. SPECTRA – a Maple library for solving linear matrix inequalities in exact arithmetic. Optim. Methods Softw., 34:62–78, 2019.
Henrion et al. [2021] D. Henrion, M. Korda, and J. B. Lasserre. The moment–SOS hierarchy. World Scientific, 2021.
Hubertus et al. [1997] F. Hubertus, F. E. Udwadia, and W. Proskurowski. An efficient QR based method for the computation of Lyapunov exponents. Physica D, 101:1–16, 1997.
Jansson et al. [2008] C. Jansson, D. Chaykin, and C. Keil. Rigorous error bounds for the optimal value in semidefinite programming. SIAM J. Numer. Anal., 46:180–200, 2008.
Jenkinson [2019] O. Jenkinson. Ergodic optimization in dynamical systems. Ergod. Theory Dyn. Syst., 39:2593–2618, 2019.
Korda et al. [2021] M. Korda, D. Henrion, and I. Mezić. Convex computation of extremal invariant measures of nonlinear dynamical systems and Markov processes. J. Nonlinear Sci., 31:14, 2021.
Kuntz et al. [2016] J. Kuntz, M. Ottobre, G.-B. Stan, and M. Barahona. Bounding stationary averages of polynomial diffusions via semidefinite programming. SIAM J. Sci. Comput., 38:A3891–A3920, 2016.
Lakshmi et al. [2020] M. V. Lakshmi, G. Fantuzzi, J. D. Fernández-Caballero, Y. Hwang, and S. I. Chernyshenko. Finding extremal periodic orbits with polynomial optimization, with application to a nine-mode model of shear flow. SIAM J. Appl. Dyn. Syst., 19:763–787, 2020.
Lasserre [2001] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11:796–817, 2001.
Lasserre et al. [2008] J. B. Lasserre, D. Henrion, C. Prieur, and E. Trélat. Nonlinear optimal control via occupation measures and LMI-relaxations. SIAM J. Control Optim., 47:1643–1666, 2008.
Leonov et al. [2016] G. A. Leonov, N. V. Kuznetsov, N. A. Korzhemanova, and D. V. Kusakin. Lyapunov dimension formula for the global attractor of the Lorenz system. Commun. Nonlinear Sci. Numer. Simul., 41:84–103, 2016.
Löfberg [2009] J. Löfberg. Pre- and post-processing sum-of-squares programs in practice. IEEE Trans. Automat. Control, 54:1007–1011, 2009.
Löfberg [2016] J. Löfberg. Sum-of-squares programming, 2016. URL https://yalmip.github.io/tutorial/sumofsquaresprogramming.
Lorenz [1963] E. N. Lorenz. Deterministic nonperiodic flow. J. Atmospheric Sci., 20:130–141, 1963.
Meiss [2017] J. D. Meiss. Differential Dynamical Systems. SIAM, 2017.
Miller et al. [2021] J. Miller, D. Henrion, and M. Sznaier. Peak estimation recovery and safety analysis. IEEE Control Syst. Lett., 5:1982–1987, 2021.
Murty and Kabadi [1987] K. G. Murty and S. N. Kabadi. Some NP-complete problems in quadratic and nonlinear programming. Math. Program., 39:117–129, 1987.
Nesterov [2000] Y. Nesterov. Squared functional systems and optimization problems. In H. Frenk, K. Roos, T. Terlaky, and S. Zhang, editors, High Perform. Optim., pages 405–440. Springer, 2000.
Oeri [2023] H. Oeri. Convex optimization methods for bounding Lyapunov exponents. PhD thesis, University of Victoria, 2023.
Olson et al. [2020] M. L. Olson, D. Goluskin, W. W. Schultz, and C. R. Doering. Heat transport bounds for a truncated model of Rayleigh–Bénard convection via polynomial optimization. Physica D, 415:132748, 2020.
Oseledets [1968] V. I. Oseledets. A multiplicative ergodic theorem. Liapunov characteristic number for dynamical systems. Trans. Moscow Math. Soc., 19:197–231, 1968.
Parker et al. [2021] J. P. Parker, D. Goluskin, and G. M. Vasil. A study of the double pendulum using polynomial optimization. Chaos, 31:103102, 2021.
Parker and Chua [2012] T. S. Parker and L. Chua. Practical numerical algorithms for chaotic systems. Springer Science & Business Media, 2012.
Parrilo [2000] P. A. Parrilo. Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization. PhD thesis, California Institute of Technology, 2000.
Parrilo [2013] P. A. Parrilo. Polynomial optimization, sums of squares, and applications. In G. Blekherman, P. A. Parrilo, and R. R. Thomas, editors, Semidefinite optimization and convex algebraic geometry, chapter 3, pages 47–157. SIAM, 2013.
Peyrl and Parrilo [2008] H. Peyrl and P. A. Parrilo. Computing sum of squares decompositions with rational coefficients. Theor. Comput. Sci., 409:269–281, 2008.
Pikovsky and Politi [2016] A. Pikovsky and A. Politi. Lyapunov exponents: a tool to explore complex dynamics. Cambridge University Press, 2016.
Shevchenko and Mel’nikov [2003] I. I. Shevchenko and A. Mel’nikov. Lyapunov exponents in the Hénon–Heiles problem. J. Exp. Theor. Phys., 77:642–646, 2003.
Sprott [2010] J. C. Sprott. Elegant chaos: algebraically simple chaotic flows. World Scientific, 2010.
Tancredi et al. [2001] G. Tancredi, A. Sánchez, and F. Roig. A comparison between methods to compute Lyapunov exponents. Astron. J., 121:1171–1179, 2001.
Tobasco [2022] I. Tobasco. Personal communication. 2022.
Tobasco et al. [2018] I. Tobasco, D. Goluskin, and C. R. Doering. Optimal bounds and extremal trajectories for time averages in nonlinear dynamical systems. Phys. Lett. A, 382:382–386, 2018.

Convex computation of maximal Lyapunov exponents

Abstract

1 Introduction

2 Optimization problems for bounding maximal LEs

2.1 Bounding time averages using convex optimization

2.1.1 Optimization over auxiliary functions

2.1.2 SOS relaxations

2.2 Upper bounds on maximal LEs

2.2.1 Bounds from minimization over auxiliary functions

Proposition 1.

2.2.2 Bounds from SOS relaxations

Proposition 2.

Remark 1.

2.3 Symmetries

2.3.1 Symmetries of the (x,z)(x,z) dynamics

Proposition 3.

Proof.

Remark 2.

2.3.2 Symmetries of the optimization problems

Proposition 4.

Proof.

Proposition 5.

Remark 3.

3 Computational examples

3.1 The Lorenz system

3.2 The Hénon–Heiles system

4 Conclusions

Acknowledgements

Appendix

Proposition 6.

Proof.

Proposition 7.

Proof.

References

2.3.1 Symmetries of the $(x,z)$ dynamics