
Generalized Nonlinear and Finsler Geometry for Robotics

Nathan D. Ratliff^1, Karl Van Wyk^1, Mandy Xie^{1,2}, Anqi Li^{1,3}, and Muhammad Asif Rana^{1,2}
^1 All authors are with (or interned at) NVIDIA; ^2 Georgia Tech; ^3 University of Washington
Abstract

Robotics research has found numerous important applications of Riemannian geometry. Despite that, the concepts remain challenging to many roboticists because the background material is complex and strikingly foreign. Beyond Riemannian geometry, there are many natural generalizations in the mathematical literature (areas such as Finsler geometry and spray geometry), but those generalizations are largely inaccessible, and as a result there remain few applications within robotics. This paper presents a re-derivation of the spray and Finsler geometries that we found critical for the development of our recent work on a powerful behavioral design tool we call geometric fabrics. These derivations build from basic tools in advanced calculus and the calculus of variations, making them more accessible to a robotics audience than standard presentations. We focus on the pragmatic and calculable results, avoiding the use of tensor notation to appeal to a broader audience, and emphasizing geometric path consistency over ideas around connections and curvature. We hope that these derivations will contribute to an increased understanding of generalized nonlinear, and even classical Riemannian, geometry within the robotics community and inspire future research into new applications.

I Introduction

Nonlinear geometry is in many ways fundamental to robotics. Robotic configuration spaces are naturally modeled as manifolds [1], and the classical mechanical dynamics of a robot are intimately linked to a Riemannian geometry [2]. Gauss-Newton optimization has strong applications in vision [3] and motion optimization [4], and those algorithms are closely related to Riemannian geometry [5, 6] and natural gradients [7], which are core to many modern machine learning methods [8]. Despite their importance, however, even classical Riemannian geometry remains inaccessible to many roboticists. Beyond that, Finsler [9] and spray geometry [10], natural generalizations of Riemannian geometry, are largely unheard of within the community. How many applications are out of reach as a result?

Our own work on Riemannian Motion Policies (RMPs) [11, 12] has led us to a study of what we call geometric fabrics [13, 14] as a formal, provably stable model of reactive behavior, and to get there, we had to address this gap. Both Finsler and spray geometries proved critical for the development of that work, but were largely confined within opaque mathematical manuscripts mired in abstract concepts and foreign notation. The importance of the material, however, led us to entirely re-derive the foundations, rigorously, but in a language we could understand. This paper presents these re-derivations in the hope that they will be more accessible to a broader robotics audience and inspire new applications.

We use notations of advanced calculus [15] as much as possible and build on the Calculus of Variations [16] (Finsler geometry may be viewed as the geometry of these functional optima), and we draw connections to classical mechanics [17], Riemannian geometry [18, 2], and the interrelations observed in geometric mechanics [2]. We limit ourselves to a study of geometric path consistency, which we call generalized nonlinear geometry (see Section II),^1 and show that Finsler geometries are a type of generalized nonlinear geometry (see Section III), inheriting their geometric consistency.

Footnote 1: Our terminology differs from the literature. Equations exhibiting geometric path consistency are commonly called sprays in mathematics [10], and the literature further develops a more general geometry with a very similar name (a semi-spray), which doesn’t exhibit such path consistency. We, therefore, restrict the term geometry to imply this concrete notion of path consistency to aid intuition and reserve the term semi-spray in our applications [13] for these more general non-path-consistent equations.

Figure 1: Paths generated from particle motion initiated at different speeds. Faded paths indicate higher $\alpha$ and opaque paths indicate lower $\alpha$.

Manifolds [19] are a standard mathematical foundation of modern nonlinear geometry. However, these concepts can be very abstract and daunting for those unfamiliar. We, therefore, present our derivations in specific coordinates, following the conventions of many texts on classical mechanics [17]. For those familiar with the concepts, we can say we do so without loss of generality—the Euler-Lagrange equation is naturally covariant, enabling curvilinear changes of coordinates as necessary while traversing a manifold with no change to the underlying behavior; covariant behavior of more general nonlinear geometries can be expressed through a transform tree as was done in [12] in the definition of structured Geometric Dynamical Systems (GDS).

II Generalized nonlinear geometries

A generalized nonlinear geometry (known as a spray in mathematics [10]) is a second-order differential equation describing a smooth collection of paths. These paths are equivalence classes of trajectories all passing through the same points but differing in speed profile. In essence, we allow a trajectory to speed up or slow down arbitrarily, and as long as its geometric shape remains consistent (i.e. it passes through the same one-dimensional set (submanifold) of points in space) we say the trajectory follows the same path (i.e. is part of the equivalence class). Colloquially, similar to how we can think of an arbitrary second-order differential equation as a collection of trajectories (its integral curves), we can think of the geometry as a collection of tubes. Each tube represents a path and contains multiple trajectories (infinitely many of them), the collection of all trajectories following that path with differing speed profiles (see Figure 1).

Concretely, two trajectories are said to be equivalent, and hence along the same path, if they are a time reparameterization away from one another. Given a trajectory with time index $t$ denoted $\mathbf{x}_t(t)$, a time reparameterization is a smooth, strictly monotonically increasing, nonlinear function $t:\mathbb{R}\rightarrow\mathbb{R}$ denoted $t(s)$ giving rise to a new time index $s$. The time reparameterization creates a new trajectory $\mathbf{x}_s(s)=\mathbf{x}_t\big(t(s)\big)$. Since, for a given $s_0$ and corresponding $t_0=t(s_0)$, the points of the trajectories align

$$\mathbf{x}_s(s_0)=\mathbf{x}_t\big(t(s_0)\big)=\mathbf{x}_t(t_0), \tag{1}$$

and since this relationship is a bijection (due to the strict monotonicity; formally, it is a diffeomorphism between coordinate charts of the same one-dimensional manifold of points), the two trajectories $\mathbf{x}_s(s)$ and $\mathbf{x}_t(t)$ pass through the same set of points and we say they follow the same path. Since $\frac{d}{ds}\mathbf{x}_t\big(t(s)\big)=\frac{d\mathbf{x}_t}{dt}\frac{dt}{ds}$ and $\frac{d^2}{ds^2}\mathbf{x}_t\big(t(s)\big)=\frac{d^2\mathbf{x}_t}{dt^2}\left(\frac{dt}{ds}\right)^2+\frac{d\mathbf{x}_t}{dt}\frac{d^2t}{ds^2}$, we see velocities and accelerations under the time reparameterization are linked to one another as

$$\begin{aligned}
\dot{\mathbf{x}}_s &= \frac{dt}{ds}\,\dot{\mathbf{x}}_t \\
\ddot{\mathbf{x}}_s &= \left(\frac{dt}{ds}\right)^2\ddot{\mathbf{x}}_t+\frac{d^2t}{ds^2}\,\dot{\mathbf{x}}_t,
\end{aligned} \tag{2}$$

where notationally the dots are understood to be time derivatives w.r.t. their respective time indices, e.g. $\dot{\mathbf{x}}_s=\frac{d\mathbf{x}_s}{ds}$, $\dot{\mathbf{x}}_t=\frac{d\mathbf{x}_t}{dt}$, and so forth. A smooth nonlinear geometry is defined by the collection of all time-reparameterization invariant smooth paths in a space. For every point $\mathbf{x}_0$ and speed-independent direction vector $\widehat{\mathbf{v}}$, the nonlinear geometry defines a unique path emanating from that point $\mathbf{x}_0$ following the specified initial direction $\widehat{\mathbf{v}}$.
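The velocity and acceleration relations of Equation 2 can be checked numerically. The following is a minimal sketch, assuming a hypothetical trajectory `x_t` and a hypothetical reparameterization `t_of_s` chosen only for illustration; it compares finite-difference derivatives of the reparameterized trajectory against the right-hand sides of Equation 2.

```python
import numpy as np

# Hypothetical example trajectory x_t(t) in R^2 with analytic derivatives.
def x_t(t):
    return np.array([np.cos(t), t**2])

def xd_t(t):
    return np.array([-np.sin(t), 2.0 * t])

def xdd_t(t):
    return np.array([-np.cos(t), 2.0])

# Hypothetical reparameterization t(s) = s + 0.5 s^2 (strictly increasing for s > -1).
def t_of_s(s):
    return s + 0.5 * s**2

def dt_ds(s):
    return 1.0 + s

def d2t_ds2(s):
    return 1.0

def x_s(s):
    """Reparameterized trajectory x_s(s) = x_t(t(s))."""
    return x_t(t_of_s(s))

s0, eps = 0.7, 1e-4
t0 = t_of_s(s0)

# Central finite differences of the reparameterized trajectory w.r.t. s.
xd_s_fd = (x_s(s0 + eps) - x_s(s0 - eps)) / (2.0 * eps)
xdd_s_fd = (x_s(s0 + eps) - 2.0 * x_s(s0) + x_s(s0 - eps)) / eps**2

# Right-hand sides of Equation 2.
xd_s_eq = dt_ds(s0) * xd_t(t0)
xdd_s_eq = dt_ds(s0)**2 * xdd_t(t0) + d2t_ds2(s0) * xd_t(t0)

vel_err = np.max(np.abs(xd_s_fd - xd_s_eq))
acc_err = np.max(np.abs(xdd_s_fd - xdd_s_eq))
print(vel_err, acc_err)
```

Both errors should be small and limited only by the finite-difference step, which is consistent with the chain-rule identities above.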

Denoting the orthogonal projector projecting orthogonally to velocity as $\mathbf{P}_{\dot{\mathbf{x}}}^{\perp}=\mathbf{I}-\widehat{\dot{\mathbf{x}}}\,\widehat{\dot{\mathbf{x}}}^{T}$, the family of geometries we consider here are those characterized by a second-order differential equation of the form

$$\mathbf{P}_{\dot{\mathbf{x}}}^{\perp}\Big[\ddot{\mathbf{x}}+\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})\Big]=\mathbf{0}, \tag{3}$$

where $\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})$ is a smooth function that is positively homogeneous of degree 2 (HD2) in velocities.^3 In this case, $\mathbf{h}_2$ must be HD2, meaning $\mathbf{h}_2(\mathbf{x},\lambda\dot{\mathbf{x}})=\lambda^2\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})$ for $\lambda>0$. We call equations of this form geometric equations.

Footnote 3: Generally, a function $f(\mathbf{z})$ is said to be positively homogeneous of degree $k$ (abbreviated HD$k$) if $f(\lambda\mathbf{z})=\lambda^k f(\mathbf{z})$ for $\lambda\geq 0$ [20]. Homogeneity of degree 1 and 2 is used in the definitions below as well.

Since the projector $\mathbf{P}_{\dot{\mathbf{x}}}^{\perp}$ is reduced rank, there is solution redundancy. We will see that this redundancy precisely describes the ability to arbitrarily speed up or slow down along a trajectory while sticking to the same path.
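The reduced rank of the projector can be seen directly in a small numerical sketch. The velocity and acceleration vectors below are arbitrary illustrative values, and `perp_projector` is a hypothetical helper name; the point is only that any acceleration component along the velocity is invisible to the projected equation.

```python
import numpy as np

def perp_projector(xd):
    """Return P = I - v_hat v_hat^T for a nonzero velocity xd."""
    v_hat = xd / np.linalg.norm(xd)
    return np.eye(xd.size) - np.outer(v_hat, v_hat)

xd = np.array([1.0, 2.0, -0.5])    # arbitrary nonzero velocity (illustrative)
P = perp_projector(xd)

along = P @ xd                                # the velocity direction is annihilated
rank = np.linalg.matrix_rank(P, tol=1e-10)    # n - 1 for an n-dimensional velocity
idempotent = np.allclose(P @ P, P)            # projectors satisfy P P = P

# Adding any multiple of xd to an acceleration leaves the projected value unchanged.
acc = np.array([0.3, -1.0, 2.0])
same = np.allclose(P @ acc, P @ (acc + 4.2 * xd))
print(rank, idempotent, same)
```

The one-dimensional null space is what leaves the acceleration along the direction of motion unconstrained.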

We call the interior equation in isolation,

$$\ddot{\mathbf{x}}+\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})=\mathbf{0}, \tag{4}$$

the generating equation; it is said to generate the geometry via its system of trajectories. Note that since $\mathbf{P}_{\dot{\mathbf{x}}}^{\perp}$ has a null space spanned by $\dot{\mathbf{x}}$, solutions to the geometric equation are solutions to

$$\ddot{\mathbf{x}}+\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})+\alpha(t)\,\dot{\mathbf{x}}=\mathbf{0}, \tag{5}$$

where $\alpha:\mathbb{R}\rightarrow\mathbb{R}$ is a smooth function defining an acceleration along the direction of motion. We call this equation the explicit form geometric equation.

Theorem II.1.

All time reparameterizations of generating solutions are geometric solutions, and each geometric solution characterized by starting position $\mathbf{x}_0$ and direction $\widehat{\mathbf{v}}$ is a time reparameterization away from any generating solution with initial conditions $(\mathbf{x}_0,\gamma\widehat{\mathbf{v}})$ for some $\gamma>0$.

Proof.

We first address reparameterization of generating solutions. Let $\mathbf{x}_s(s)$ be a generating solution trajectory and let $s(t)$ be an arbitrary time reparameterization. In terms of its inverse $t(s)$ (which always exists since $s(t)$ is strictly monotonically increasing by definition), by Equations 2 we have

\begin{align}
&\ddot{\mathbf{x}}_s+\mathbf{h}_2(\mathbf{x}_s,\dot{\mathbf{x}}_s)=\mathbf{0} \tag{6}\\
&\Rightarrow\left(\frac{dt}{ds}\right)^2\ddot{\mathbf{x}}_t+\frac{d^2t}{ds^2}\dot{\mathbf{x}}_t+\mathbf{h}_2\Big(\mathbf{x}_t,\frac{dt}{ds}\dot{\mathbf{x}}_t\Big)=\mathbf{0} \tag{7}\\
&\Rightarrow\left(\frac{dt}{ds}\right)^2\big[\ddot{\mathbf{x}}_t+\mathbf{h}_2(\mathbf{x}_t,\dot{\mathbf{x}}_t)\big]+\frac{d^2t}{ds^2}\dot{\mathbf{x}}_t=\mathbf{0} \tag{8}\\
&\Rightarrow\ddot{\mathbf{x}}_t+\mathbf{h}_2(\mathbf{x}_t,\dot{\mathbf{x}}_t)+\alpha\,\dot{\mathbf{x}}_t=\mathbf{0}, \tag{9}
\end{align}

where $\alpha=\left(\frac{dt}{ds}\right)^{-2}\frac{d^2t}{ds^2}$. Thus, $\mathbf{x}_t(t)$ solves an explicit form geometric equation and is, therefore, a geometric solution.

Next, let $\mathbf{x}_s(s)$ be any solution to the geometric equation. Then at every point,

$$\ddot{\mathbf{x}}_s+\mathbf{h}_2(\mathbf{x}_s,\dot{\mathbf{x}}_s)+\alpha(s)\,\dot{\mathbf{x}}_s=\mathbf{0} \tag{10}$$

for some smooth function $\alpha(s)$ across the trajectory. Under a time reparameterization $s(t)$ we get the following equation in terms of $\mathbf{x}_t$:

\begin{align}
&\left(\frac{dt}{ds}\right)^2\ddot{\mathbf{x}}_t+\frac{d^2t}{ds^2}\dot{\mathbf{x}}_t+\mathbf{h}_2\Big(\mathbf{x}_t,\frac{dt}{ds}\dot{\mathbf{x}}_t\Big)+\alpha(s)\frac{dt}{ds}\dot{\mathbf{x}}_t=\mathbf{0} \tag{11}\\
&\Rightarrow\left(\frac{dt}{ds}\right)^2\Big[\ddot{\mathbf{x}}_t+\mathbf{h}_2(\mathbf{x}_t,\dot{\mathbf{x}}_t)\Big]+\left(\frac{d^2t}{ds^2}+\alpha(s)\frac{dt}{ds}\right)\dot{\mathbf{x}}_t=\mathbf{0}. \tag{12}
\end{align}

Since $\frac{d^2t}{ds^2}+\alpha(s)\frac{dt}{ds}=0$ is an ordinary second-order differential equation, it has a unique solution for every initial condition $\big(t(0),\frac{dt}{ds}(0)\big)$. Under any of those solutions, the second term vanishes and we have $\ddot{\mathbf{x}}_t+\mathbf{h}_2(\mathbf{x}_t,\dot{\mathbf{x}}_t)=\mathbf{0}$ (since $\frac{dt}{ds}\neq 0$). Therefore, under such a time reparameterization, $\mathbf{x}_t(t)$ is a generating solution.

Moreover, since $\dot{\mathbf{x}}_s=\frac{dt}{ds}\dot{\mathbf{x}}_t$, the initial condition $\frac{dt}{ds}(0)$ defines the initial velocity $\dot{\mathbf{x}}_t(0)=\left(\frac{dt}{ds}(0)\right)^{-1}\dot{\mathbf{x}}_s(0)$, which uniquely defines the generating solution moving in direction $\widehat{\mathbf{v}}=\widehat{\dot{\mathbf{x}}}_s(0)$. Since any initial speed can be generated this way, every time reparameterization of this sort maps to a corresponding generating solution by initial conditions, and every generating solution can be created by such a time reparameterization. Therefore, there is a bijection between time reparameterizations solving $\frac{d^2t}{ds^2}+\alpha(s)\frac{dt}{ds}=0$ and generating solutions with initial conditions $(\mathbf{x}_0,\gamma\widehat{\mathbf{v}})$. ∎

The proof of the above theorem shows that any time reparameterization $t(s)$ solving

$$\frac{d^2t}{ds^2}+\alpha(s)\frac{dt}{ds}=0 \tag{13}$$

induces a generating solution. Since solutions to Equation 13 are defined by their initial conditions, given a geometric solution $\mathbf{x}_s$, we can choose a time reparameterization $s(t)$ (the inverse of $t(s)$) so that $\mathbf{x}_t=\mathbf{x}_s\big(s(t)\big)$ is a generating solution whose velocity matches the geometric solution’s velocity at a given time $\bar{s}$ (choose $t(s)$ solving Equation 13 using initial conditions $t(\bar{s})=\bar{s}$ and $\frac{dt}{ds}(\bar{s})=1$). That mapping from $s$ to the instantaneous velocity-matching generating solution is smooth, so we can view the geometric solution as smoothly moving between generating solutions, using the redundant accelerations $\alpha(s)\,\dot{\mathbf{x}}_s$ to do so by speeding up and slowing down along the direction of motion.
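As a worked illustration, Equation 13 can be solved in closed form because it is a first-order linear equation in $\frac{dt}{ds}$. The sketch below uses the velocity-matching initial conditions $t(\bar{s})=\bar{s}$ and $\frac{dt}{ds}(\bar{s})=1$ from the text; the constant value $a$ in the special case is an assumption introduced only for illustration.

```latex
% Let u(s) = dt/ds. Equation 13 becomes u' + \alpha(s) u = 0 with u(\bar{s}) = 1.
\begin{align*}
\frac{dt}{ds}(s) &= \exp\!\left(-\int_{\bar{s}}^{s}\alpha(\sigma)\,d\sigma\right), \\
t(s) &= \bar{s} + \int_{\bar{s}}^{s}\exp\!\left(-\int_{\bar{s}}^{\rho}\alpha(\sigma)\,d\sigma\right)d\rho .
\end{align*}
% Special case of a constant \alpha(s) = a \neq 0 (illustrative assumption):
\begin{align*}
\frac{dt}{ds}(s) = e^{-a(s-\bar{s})}, \qquad
t(s) = \bar{s} + \frac{1-e^{-a(s-\bar{s})}}{a}.
\end{align*}
```

Since the exponential is strictly positive, $\frac{dt}{ds}>0$ everywhere, so $t(s)$ is a valid (strictly monotonically increasing) time reparameterization.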

To illustrate path consistency, we designed a geometry, $\mathbf{h}_2(\mathbf{q},\dot{\mathbf{q}})$, that naturally produces particle paths that avoid a circular object in coordinates, $\mathbf{q}\in\mathbb{R}^2$, as

$$\mathbf{h}_2(\mathbf{q},\dot{\mathbf{q}})=\lambda\|\dot{\mathbf{q}}\|^2\,\partial_{\mathbf{q}}\psi\big(\phi(\mathbf{q})\big) \tag{14}$$

where $\phi(\mathbf{q})$ is a differentiable map that captures the distance to a circular object and $\lambda\in\mathbb{R}^+$ is a scaling gain. More specifically, $\phi(\mathbf{q})=\frac{\|\mathbf{q}-\mathbf{q}_o\|-r}{r}$, where $\mathbf{q}_o$ and $r$ are the circle’s center and radius, respectively. Furthermore, $\psi\big(\phi(\mathbf{q})\big)\in\mathbb{R}^+$ is a barrier potential function, $\psi\big(\phi(\mathbf{q})\big)=\frac{k}{\phi(\mathbf{q})^2}$, where $k\in\mathbb{R}^+$ is a scaling gain. Altogether, $\partial_{\mathbf{q}}\psi\big(\phi(\mathbf{q})\big)$ produces an increasing repulsive force as distance to the object decreases, and $\|\dot{\mathbf{q}}\|^2$ makes $\mathbf{h}_2(\mathbf{q},\dot{\mathbf{q}})$ homogeneous of degree 2 in $\dot{\mathbf{q}}$. For this experiment, $\lambda=0.7$, $k=0.5$ for two scenarios: 1) $\alpha=1.5$, and 2) $\alpha=0.75$ for the initial conditions, $(\mathbf{q}_0,\alpha\hat{\dot{\mathbf{q}}}_0)$. Eleven vertically spaced particles that follow the above geometry are initialized with the two different initial speeds. Traced paths at the two different speeds are overlaid as shown in Fig. 1. Noticeably, the paths generated are completely overlapping, confirming path consistency.
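The experiment can be approximated with a short simulation. This is a minimal sketch, not the code used for Figure 1: the gains $\lambda=0.7$, $k=0.5$ and the two speeds come from the text, while the obstacle center and radius, the start points, the integrator (classical RK4), and the step size are assumptions. The slower run uses a proportionally larger time step so that both runs sample the same points along each path.

```python
import numpy as np

# Gains from the text; obstacle, start points, and step size are illustrative assumptions.
LAM, K = 0.7, 0.5
Q_O, R = np.array([0.0, 0.0]), 1.0

def h2(q, qd):
    """h2 = lam * ||qd||^2 * d/dq psi(phi(q)), with psi = k / phi^2."""
    d = q - Q_O
    dist = np.linalg.norm(d)
    phi = (dist - R) / R
    dphi_dq = d / (R * dist)
    dpsi_dq = -2.0 * K / phi**3 * dphi_dq
    return LAM * (qd @ qd) * dpsi_dq

def rk4_path(q0, qd0, dt, steps):
    """Integrate the generating equation qdd = -h2(q, qd) with classical RK4."""
    def f(y):
        return np.concatenate([y[2:], -h2(y[:2], y[2:])])
    y = np.concatenate([q0, qd0])
    path = [q0.copy()]
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)
        k3 = f(y + 0.5 * dt * k2)
        k4 = f(y + dt * k3)
        y = y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        path.append(y[:2].copy())
    return np.array(path)

direction = np.array([1.0, 0.0])
starts = [np.array([-3.0, y0]) for y0 in (-1.2, -0.6, 0.3, 0.9)]
fast, slow = 1.5, 0.75
dt_fast, steps = 0.01, 400
dt_slow = dt_fast * (fast / slow)   # half the speed, twice the step

max_gap = 0.0
for q0 in starts:
    p_fast = rk4_path(q0, fast * direction, dt_fast, steps)
    p_slow = rk4_path(q0, slow * direction, dt_slow, steps)
    max_gap = max(max_gap, float(np.max(np.abs(p_fast - p_slow))))
print(max_gap)
```

Because $\mathbf{h}_2$ is HD2, scaling the initial velocity by a constant only rescales time along the solution, so the gap between the two sets of sampled paths should be at the level of floating-point rounding.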

III Finsler geometry

Here, we derive a broad class of nonlinear geometries, known as Finsler geometries [9]. These geometries arise from Calculus of Variations problems [16] that generalize the notion of arc length to cases where length elements can vary by direction (Minkowski norms [something]). This section builds to a fundamental result showing that applying the Euler-Lagrange equation to these generalized length elements produces a geometric equation (a Finsler geometry of paths) whose corresponding generator can be derived by applying the Euler-Lagrange equation to the length element’s energy form. Solutions to the latter conserve the derived energy and can thus be viewed as energy levels. From the above analysis, we can characterize geometric solutions to Finsler geometries as smoothly transitioning between these concretely defined energy levels (generating solutions) by speeding up and slowing down along the direction of motion while remaining along the same common path.

III-A The Euler-Lagrange equation: a review

For completeness we review some background on the Calculus of Variations [16]. The Calculus of Variations studies extremal trajectory problems. Given a function $\mathcal{L}(\mathbf{x},\dot{\mathbf{x}})$ of position and velocity, known as a Lagrangian, and the class of smooth trajectories $\Xi$ ranging between two end points $\mathbf{x}_0$ and $\mathbf{x}_1$, we can ask which of those trajectories minimizes the “total Lagrangian” across the trajectory:

$$\min_{\mathbf{x}(t)\in\Xi}\int_0^T\mathcal{L}(\mathbf{x},\dot{\mathbf{x}})\,dt, \tag{15}$$

where $T$ is understood to vary per trajectory based on its natural time interval. The integral $A[\mathbf{x}]=\int\mathcal{L}(\mathbf{x},\dot{\mathbf{x}})\,dt$ is known as the Lagrangian’s action functional.

Figure 2: A depiction of a Calculus of Variations problem. Green curves are select trajectories $\mathbf{x}(t)\in\Xi$ between $\mathbf{x}_0$ and $\mathbf{x}_1$ from a smooth family of trajectories $\Xi$; the blue curve $\mathbf{x}^*(t)$ depicts an extremal trajectory that optimizes an objective functional.

Figure 2 depicts this problem pictorially. Here the green trajectories are possible candidates in $\Xi$ and the blue trajectory is the extremum.

We won’t derive it here, but extremal solutions are characterized by solutions to the following boundary-valued second-order differential equation:

$$\frac{d}{dt}\partial_{\dot{\mathbf{x}}}\mathcal{L}-\partial_{\mathbf{x}}\mathcal{L}=\mathbf{0}\quad\mathrm{s.t.}\quad\left\{\begin{array}{l}\mathbf{x}(0)=\mathbf{x}_0\\ \mathbf{x}(T)=\mathbf{x}_1\end{array}\right. \tag{18}$$

This equation $\frac{d}{dt}\partial_{\dot{\mathbf{x}}}\mathcal{L}-\partial_{\mathbf{x}}\mathcal{L}=\mathbf{0}$ is important in its own right and is known as the Euler-Lagrange equation.^4 By the theory of ordinary differential equations [19], this equation has unique initial-value solutions when it is point-wise well-formed for $\dot{\mathbf{x}}\neq\mathbf{0}$. (We’ll see that this means its velocity Hessian $\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}$ is invertible for $\dot{\mathbf{x}}\neq\mathbf{0}$. This won’t be the case for Finsler structures (and we’ll get the solution redundancy characteristic of geometric equations from that), but it will be the case for the corresponding energy-form generating equation.) This uniqueness of solution means that for any $(\mathbf{x}_0,\dot{\mathbf{x}}_0)$, we can play forward the Euler-Lagrange equation uniquely to generate a solution trajectory starting from that position and velocity. We can connect this solution back to the extremal problem by noting that every point $\mathbf{x}_1$ encountered along this initial-value solution is a possible end point, and the solution to the initial-valued problem solves the boundary value problem with boundary constraints $\mathbf{x}_0$ and $\mathbf{x}_1$. In other words, every subtrajectory of the initial-value solution is an extremal solution between its end points. Importantly, this enables us to consider the Euler-Lagrange equation in isolation and understand all of its solutions as extremal solutions of the action.

Footnote 4: We write it in negated form here relative to the common expression from the Calculus of Variations [16] to match better with the equations of motion below, as we’ll see.

Expanding the time derivative brings the Euler-Lagrange equation to a more concrete form and clarifies its role as a second-order dynamical system:

\begin{align}
&\frac{d}{dt}\partial_{\dot{\mathbf{x}}}\mathcal{L}-\partial_{\mathbf{x}}\mathcal{L}=\mathbf{0} \tag{19}\\
&\Rightarrow\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}\,\ddot{\mathbf{x}}+\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}\,\dot{\mathbf{x}}-\partial_{\mathbf{x}}\mathcal{L}=\mathbf{0} \tag{20}\\
&\Rightarrow\mathbf{M}_{\mathcal{L}}\ddot{\mathbf{x}}+\mathbf{f}_{\mathcal{L}}=\mathbf{0}, \tag{21}
\end{align}

where $\mathbf{M}_{\mathcal{L}}=\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}$ plays a role analogous to a mass matrix and $\mathbf{f}_{\mathcal{L}}=\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}\,\dot{\mathbf{x}}-\partial_{\mathbf{x}}\mathcal{L}$ is a force-like object. Solutions to the Euler-Lagrange equation can be easily integrated forward from an initial position $\mathbf{x}$ and velocity $\dot{\mathbf{x}}$ using the solved acceleration form $\ddot{\mathbf{x}}=-\mathbf{M}_{\mathcal{L}}^{-1}\mathbf{f}_{\mathcal{L}}$. Note that this solved form is only well-defined when $\mathbf{M}_{\mathcal{L}}$ is invertible, as noted earlier. While here we consider $\mathbf{M}_{\mathcal{L}}$ and $\mathbf{f}_{\mathcal{L}}$ as merely analogous to mass and force, we will see that this analogy takes on a deeper, more concrete meaning under Finsler geometry, where we require the Lagrangian to take on a special form so that $\mathcal{L}(\mathbf{x},\dot{\mathbf{x}})$, for fixed $\mathbf{x}$, intuitively becomes a squared-norm-like measure on velocities, giving it an interpretation of “length squared”. We will see below that, under these particular Lagrangians, $\mathbf{M}_{\mathcal{L}}$ plays a role of mass, $\mathbf{f}_{\mathcal{L}}$ plays a role of force, and the equation $\ddot{\mathbf{x}}+\mathbf{M}_{\mathcal{L}}^{-1}\mathbf{f}_{\mathcal{L}}=\mathbf{0}$, or $\ddot{\mathbf{x}}+\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})=\mathbf{0}$ with $\mathbf{h}_2=\mathbf{M}_{\mathcal{L}}^{-1}\mathbf{f}_{\mathcal{L}}$, is a geometry generator.
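As a familiar worked instance of this expansion, consider the classical mechanical Lagrangian of a point mass in a potential field. The mass $m$ and potential $V$ are symbols introduced here for illustration only; they are not part of the derivation above.

```latex
% Illustrative Lagrangian: kinetic minus potential energy of a point mass.
\begin{align*}
\mathcal{L}(\mathbf{x},\dot{\mathbf{x}}) &= \tfrac{1}{2}m\|\dot{\mathbf{x}}\|^{2} - V(\mathbf{x}), \\
\partial_{\dot{\mathbf{x}}}\mathcal{L} &= m\dot{\mathbf{x}}, \qquad
\mathbf{M}_{\mathcal{L}} = \partial^{2}_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L} = m\mathbf{I}, \\
\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L} &= \mathbf{0}, \qquad
\partial_{\mathbf{x}}\mathcal{L} = -\partial_{\mathbf{x}}V, \\
\mathbf{f}_{\mathcal{L}} &= \partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}\,\dot{\mathbf{x}} - \partial_{\mathbf{x}}\mathcal{L} = \partial_{\mathbf{x}}V, \\
\mathbf{M}_{\mathcal{L}}\ddot{\mathbf{x}} + \mathbf{f}_{\mathcal{L}} &= m\ddot{\mathbf{x}} + \partial_{\mathbf{x}}V = \mathbf{0}.
\end{align*}
```

In this case Equation 21 reduces to Newton's second law, which is where the mass and force analogy comes from. Note that this Lagrangian is not homogeneous in velocity, so it is not itself a Finsler energy.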

III-B Finsler structures

We call a Lagrangian whose Euler-Lagrange equation can be expressed in the standard geometric form

$$\mathbf{P}^{\perp}_{\dot{\mathbf{x}}}\Big[\ddot{\mathbf{x}}+\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})\Big]=\mathbf{0}, \tag{22}$$

where $\mathbf{h}_2$ is homogeneous of degree 2 in velocity (HD2), a geometric Lagrangian. Recall that such geometries described by these geometric equations exhibit an invariance to time reparameterization (see Section II) with solutions characterizing a geometry of paths. A Finsler structure is a particular geometric Lagrangian $\mathcal{L}_g$ with the following nice properties:

  1. $\mathcal{L}_g(\mathbf{x},\dot{\mathbf{x}})\geq 0$ with equality if and only if $\dot{\mathbf{x}}=\mathbf{0}$.

  2. $\mathcal{L}_g$ is positively homogeneous (HD1) in $\dot{\mathbf{x}}$ so that $\mathcal{L}_g(\mathbf{x},\lambda\dot{\mathbf{x}})=\lambda\mathcal{L}_g(\mathbf{x},\dot{\mathbf{x}})$ for $\lambda\geq 0$.

  3. $\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e$ is invertible when $\dot{\mathbf{x}}\neq\mathbf{0}$, where $\mathcal{L}_e=\frac{1}{2}\mathcal{L}_g^2$.

We call $\mathcal{L}_e=\frac{1}{2}\mathcal{L}_g^2$, defined in Property 3, the corresponding Finsler energy.
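The three properties can be spot-checked numerically for a simple candidate. The sketch below assumes the Riemannian length element $\mathcal{L}_g=\sqrt{\dot{\mathbf{x}}^T\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}}$ with a hypothetical positive-definite metric `G`; this particular choice and the sample point are illustrative assumptions, and a numerical check at one point is of course not a proof.

```python
import numpy as np

def G(x):
    """Hypothetical position-dependent symmetric positive-definite metric."""
    return np.array([[2.0 + x[0]**2, 0.3],
                     [0.3, 1.0 + x[1]**2]])

def L_g(x, xd):
    """Candidate Finsler structure: the Riemannian length element."""
    return np.sqrt(xd @ G(x) @ xd)

def L_e(x, xd):
    """Finsler energy: one half of the squared length element."""
    return 0.5 * L_g(x, xd)**2

x = np.array([0.4, -1.1])
xd = np.array([0.8, -0.5])
lam = 3.0

positive = bool(L_g(x, xd) > 0.0 and L_g(x, np.zeros(2)) == 0.0)   # Property 1
hd1 = bool(np.isclose(L_g(x, lam * xd), lam * L_g(x, xd)))          # Property 2
hd2 = bool(np.isclose(L_e(x, lam * xd), lam**2 * L_e(x, xd)))       # energy is HD2
# For this choice the velocity Hessian of L_e is G(x) itself, so Property 3
# holds wherever G(x) has a nonzero determinant.
invertible = bool(abs(np.linalg.det(G(x))) > 1e-12)
print(positive, hd1, hd2, invertible)
```

For this Riemannian special case the energy tensor is independent of velocity; general Finsler structures allow it to depend on the direction of $\dot{\mathbf{x}}$.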

The positive homogeneity requirement enforces that the action functional is natively independent of time reparameterization. A time reparameterization defines $s=s(t)$ with $\dot{\mathbf{x}}_s=\frac{dt}{ds}\dot{\mathbf{x}}_t$, so homogeneity implies the action functional has the property

$$\begin{aligned}
A[\mathbf{x}_s] &= \int\mathcal{L}_g(\mathbf{x}_s,\dot{\mathbf{x}}_s)\,ds=\int\mathcal{L}_g\left(\mathbf{x}_t,\frac{dt}{ds}\dot{\mathbf{x}}_t\right)ds \\
&= \int\mathcal{L}_g(\mathbf{x}_t,\dot{\mathbf{x}}_t)\frac{dt}{ds}\,ds \\
&= \int\mathcal{L}_g(\mathbf{x}_t,\dot{\mathbf{x}}_t)\,dt=A[\mathbf{x}_t].
\end{aligned}$$

This property suggests that solutions to the Euler-Lagrange equation of $\mathcal{L}_g$ must also be independent of time reparameterization, i.e. they form paths. Theorem III.1 proves that conjecture and concretely connects Finsler geometries to the generalized notion of nonlinear geometries of paths outlined above. The speed independence of this action functional, in conjunction with the conditions of a Finsler structure listed above, also suggests we can view the Finsler structure $\mathcal{L}_g$ as a generalized length element. The action functional is, therefore, a generalized arc-length integral.
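The reparameterization invariance of the action is easy to observe numerically for the simplest HD1 Lagrangian, the Euclidean length element. The following is a minimal sketch, assuming a hypothetical curve (a quarter of the unit circle) and a hypothetical strictly increasing reparameterization; velocities are estimated with `numpy.gradient` and the integral with the trapezoid rule.

```python
import numpy as np

def L_g(x, xd):
    """Euclidean length element ||xd||, a simple Finsler structure (x is unused)."""
    return np.linalg.norm(xd, axis=-1)

def action(times, path):
    """Trapezoid-rule estimate of the action, with velocities from finite differences."""
    xd = np.gradient(path, times, axis=0)
    vals = L_g(path, xd)
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(times)))

def curve(t):
    """Hypothetical trajectory x_t(t): a quarter of the unit circle for t in [0, 1]."""
    angle = 0.5 * np.pi * t
    return np.stack([np.cos(angle), np.sin(angle)], axis=-1)

grid = np.linspace(0.0, 1.0, 4001)
t_of_s = 0.5 * (grid + grid**2)     # strictly increasing map of [0, 1] onto [0, 1]

A_t = action(grid, curve(grid))     # action in the original time index t
A_s = action(grid, curve(t_of_s))   # action in the reparameterized time index s
print(A_t, A_s)
```

Both estimates should agree with each other, and with the arc length $\pi/2$ of the quarter circle, up to discretization error, even though the two trajectories traverse the curve with different speed profiles.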

Since the Finsler structure $\mathcal{L}_g$ is HD1, its associated Finsler energy has the property $\mathcal{L}_e(\mathbf{x},\lambda\dot{\mathbf{x}})=\frac{1}{2}\big(\mathcal{L}_g(\mathbf{x},\lambda\dot{\mathbf{x}})\big)^2=\lambda^2\frac{1}{2}\big(\mathcal{L}_g(\mathbf{x},\dot{\mathbf{x}})\big)^2=\lambda^2\mathcal{L}_e(\mathbf{x},\dot{\mathbf{x}})$. That means the Finsler energy $\mathcal{L}_e$ is positively homogeneous of degree 2 (HD2). A useful property of homogeneity is given by Euler’s theorem on homogeneous functions [20], which states that if $\mathcal{L}(\mathbf{x},\dot{\mathbf{x}})$ is homogeneous of degree $k$ in $\dot{\mathbf{x}}$, then $\partial_{\dot{\mathbf{x}}}\mathcal{L}^T\dot{\mathbf{x}}=k\mathcal{L}$. The Hamiltonian, a conserved quantity under the Euler-Lagrange equation [21], is therefore

$$\mathcal{H}_{\mathcal{L}}=\partial_{\dot{\mathbf{x}}}\mathcal{L}^T\dot{\mathbf{x}}-\mathcal{L}=(k-1)\mathcal{L}. \tag{23}$$

In the case of the Finsler energy $\mathcal{L}_e=\frac{1}{2}\mathcal{L}_g^2$, we have $k=2$, so

$$\mathcal{H}_{\mathcal{L}_e}=(k-1)\mathcal{L}_e=\mathcal{L}_e. \tag{24}$$

The equations of motion under the Finsler energy $\mathcal{L}_e$, which we call the energy equations, are

\begin{align}
&\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e\,\ddot{\mathbf{x}}+\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}}-\partial_{\mathbf{x}}\mathcal{L}_e=\mathbf{0} \tag{25}\\
&\Leftrightarrow\mathbf{M}_e\ddot{\mathbf{x}}+\mathbf{f}_e=\mathbf{0}, \tag{26}
\end{align}

where $\mathbf{M}_e=\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e$ and $\mathbf{f}_e=\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}}-\partial_{\mathbf{x}}\mathcal{L}_e$. Since these equations are known to conserve $\mathcal{H}_{\mathcal{L}_e}$, and in this case $\mathcal{H}_{\mathcal{L}_e}=\mathcal{L}_e$, we see that this Finsler energy is conserved. $\mathcal{L}_e$ is often referred to as the energy of the system, and $\mathbf{M}_e$ is its energy tensor.
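As a worked check of Euler's theorem and the resulting Hamiltonian, consider the Riemannian special case. The symmetric positive-definite matrix $\mathbf{G}(\mathbf{x})$ is an assumption introduced here for illustration.

```latex
% Illustrative Riemannian case: L_g = sqrt(xd^T G xd), L_e = (1/2) xd^T G xd, G symmetric.
\begin{align*}
\mathcal{L}_e &= \tfrac{1}{2}\dot{\mathbf{x}}^{T}\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}, \qquad
\partial_{\dot{\mathbf{x}}}\mathcal{L}_e = \mathbf{G}(\mathbf{x})\dot{\mathbf{x}}, \\
\partial_{\dot{\mathbf{x}}}\mathcal{L}_e^{T}\dot{\mathbf{x}} &= \dot{\mathbf{x}}^{T}\mathbf{G}(\mathbf{x})\dot{\mathbf{x}} = 2\mathcal{L}_e
\quad (k = 2), \\
\mathcal{H}_{\mathcal{L}_e} &= 2\mathcal{L}_e - \mathcal{L}_e = \mathcal{L}_e . \\[4pt]
\mathcal{L}_g &= \sqrt{\dot{\mathbf{x}}^{T}\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}}, \qquad
\partial_{\dot{\mathbf{x}}}\mathcal{L}_g = \frac{\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}}{\mathcal{L}_g}, \\
\partial_{\dot{\mathbf{x}}}\mathcal{L}_g^{T}\dot{\mathbf{x}} &= \frac{\dot{\mathbf{x}}^{T}\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}}{\mathcal{L}_g} = \mathcal{L}_g
\quad (k = 1), \qquad
\mathcal{H}_{\mathcal{L}_g} = 0 .
\end{align*}
```

The energy form has a nontrivial conserved quantity (the energy itself, here the familiar kinetic energy), while the Hamiltonian of the HD1 length element vanishes identically.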

Note that the third requirement on Finsler structures given above is actually a requirement on the Finsler energy $\mathcal{L}_e$. It ensures that $\mathbf{M}_e$ is invertible by definition when $\dot{\mathbf{x}}\neq\mathbf{0}$. The equations of motion $\mathbf{M}_e\ddot{\mathbf{x}}+\mathbf{f}_e=\mathbf{0}$, therefore, can always be solved to give a unique acceleration form

$$\ddot{\mathbf{x}}+\mathbf{M}_e^{-1}\mathbf{f}_e=\mathbf{0}. \tag{27}$$

This equation, derived from the Finsler energy $\mathcal{L}_e$, is known as the geodesic equation, and we will see that it acts as a generator for the geometry expressed by the equations of motion of the Finsler structure $\mathcal{L}_g$.
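The geodesic equation can be integrated directly once $\mathbf{M}_e$ and $\mathbf{f}_e$ are available. The sketch below assumes a hypothetical conformal energy $\mathcal{L}_e=\frac{1}{2}c(\mathbf{x})\|\dot{\mathbf{x}}\|^2$ with $c(\mathbf{x})=1+\|\mathbf{x}\|^2$, for which both terms can be written by hand; the integrator (classical RK4), initial conditions, and step size are also illustrative choices. It checks that $\mathbf{h}_2=\mathbf{M}_e^{-1}\mathbf{f}_e$ scales quadratically with velocity and that the energy stays constant along the integrated solution.

```python
import numpy as np

def c(x):
    """Hypothetical conformal factor, so that L_e = 0.5 * c(x) * ||xd||^2."""
    return 1.0 + x @ x

def energy(x, xd):
    return 0.5 * c(x) * (xd @ xd)

def M_e(x, xd):
    """Velocity Hessian of L_e; for this energy it does not depend on xd."""
    return c(x) * np.eye(x.size)

def f_e(x, xd):
    """f_e = (d_{xd x} L_e) xd - d_x L_e, worked out by hand for this energy."""
    grad_c = 2.0 * x
    return (grad_c @ xd) * xd - 0.5 * (xd @ xd) * grad_c

def h2(x, xd):
    return np.linalg.solve(M_e(x, xd), f_e(x, xd))

def rk4_step(x, xd, dt):
    """One classical RK4 step of the geodesic equation xdd = -h2(x, xd)."""
    n = x.size
    def f(y):
        return np.concatenate([y[n:], -h2(y[:n], y[n:])])
    y = np.concatenate([x, xd])
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    y = y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y[:n], y[n:]

x, xd = np.array([1.0, 0.0]), np.array([0.3, 0.8])
lam = 2.5
hd2_ok = bool(np.allclose(h2(x, lam * xd), lam**2 * h2(x, xd)))

e0 = energy(x, xd)
for _ in range(2000):
    x, xd = rk4_step(x, xd, 1e-3)
energy_drift = abs(energy(x, xd) - e0) / e0
print(hd2_ok, energy_drift)
```

The relative energy drift should be at the level of the integrator's truncation error, consistent with the geodesic solution remaining on a single energy level.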

Without proving it here, we note that derivatives with respect to $\dot{\mathbf{x}}$ reduce a function’s homogeneity in $\dot{\mathbf{x}}$ by 1 (a general property of homogeneous functions), while derivatives with respect to $\mathbf{x}$ leave it unchanged. Therefore, examining $\mathbf{f}_e$, we see

$$\mathbf{f}_e=\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}}-\partial_{\mathbf{x}}\mathcal{L}_e. \tag{28}$$

Here $\partial_{\mathbf{x}}\mathcal{L}_e$ is already HD2, and $\dot{\mathbf{x}}$ multiplies the HD1 $\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e$, making $\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}}$ HD2 as well. So $\mathbf{f}_e$ is HD2 in its entirety. Moreover, $\mathbf{M}_e=\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e$ has two velocity derivatives, so it is HD0 (i.e. $\mathbf{M}_e(\mathbf{x},\lambda\dot{\mathbf{x}})=\mathbf{M}_e(\mathbf{x},\dot{\mathbf{x}})$, meaning the energy tensor is independent of the scale of the velocity, depending only on the direction of $\dot{\mathbf{x}}$). That means $\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}})=\mathbf{M}_e^{-1}\mathbf{f}_e$ is HD2, making the geodesic equation $\ddot{\mathbf{x}}+\mathbf{M}_e^{-1}\mathbf{f}_e=\mathbf{0}$ a generating equation with associated geometric equation

$$\mathbf{P}^{\perp}_{\dot{\mathbf{x}}}\Big[\ddot{\mathbf{x}}+\mathbf{M}_e^{-1}\mathbf{f}_e\Big]=\mathbf{0}. \tag{29}$$

The following theorem shows that this geometric equation is precisely that characterized by the equations of motion of $\mathcal{L}_g$. We denote the geometric equations of motion by $\mathbf{M}_g\ddot{\mathbf{x}}+\mathbf{f}_g=\mathbf{0}$, where $\mathbf{M}_g=\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_g$ and $\mathbf{f}_g=\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_g\,\dot{\mathbf{x}}-\partial_{\mathbf{x}}\mathcal{L}_g$ for Finsler structure $\mathcal{L}_g$. In this case, $\mathbf{M}_g$ is reduced rank, so we cannot solve for a unique acceleration. Instead, the redundancy is precisely that expressed by the geometric equation.
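The rank deficiency of $\mathbf{M}_g$ can be seen in a small example. The sketch below assumes the Riemannian length element $\mathcal{L}_g=\sqrt{\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}}}$ with a hypothetical constant metric `G`, for which the velocity Hessian can be written analytically as $\mathbf{G}/\mathcal{L}_g-(\mathbf{G}\dot{\mathbf{x}})(\mathbf{G}\dot{\mathbf{x}})^T/\mathcal{L}_g^3$.

```python
import numpy as np

G = np.array([[2.0, 0.3],
              [0.3, 1.0]])    # hypothetical constant positive-definite metric

def M_g(xd):
    """Velocity Hessian of L_g = sqrt(xd^T G xd), written out analytically."""
    Gv = G @ xd
    L = np.sqrt(xd @ Gv)
    return G / L - np.outer(Gv, Gv) / L**3

xd = np.array([0.8, -0.5])    # arbitrary nonzero velocity (illustrative)
M = M_g(xd)

null_residual = float(np.linalg.norm(M @ xd))   # xd lies in the null space of M_g
rank = np.linalg.matrix_rank(M, tol=1e-10)      # n - 1 = 1 in two dimensions
print(null_residual, rank)
```

The velocity direction spans the null space of $\mathbf{M}_g$ in this example, mirroring the null space of the projector in the geometric equation.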

Theorem III.1.

Let $\mathcal{L}_g$ be a Finsler structure with energy form $\mathcal{L}_e = \frac{1}{2}\mathcal{L}_g^2$. Then the energy equation $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e = \mathbf{0}$ is a generating equation $\ddot{\mathbf{x}} + \mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}}) = \mathbf{0}$ with $\mathbf{h}_2 = \mathbf{M}_e^{-1}\mathbf{f}_e$. The associated geometric equation $\mathbf{P}^{\perp}_{\dot{\mathbf{x}}}\big[\ddot{\mathbf{x}} + \mathbf{M}_e^{-1}\mathbf{f}_e\big] = \mathbf{0}$ is given by the geometric equations of motion $\mathbf{M}_g\ddot{\mathbf{x}} + \mathbf{f}_g = \mathbf{0}$.

Proof.

We already observed that $\mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}}) = \mathbf{M}_e^{-1}\mathbf{f}_e$ is homogeneous of degree 2, so $\ddot{\mathbf{x}} + \mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}}) = \mathbf{0}$ is a generating equation where it is (uniquely) defined. Since $\mathbf{M}_e$ is invertible by definition when $\dot{\mathbf{x}} \neq \mathbf{0}$, it is only undefined for $\dot{\mathbf{x}} = \mathbf{0}$. But for $\dot{\mathbf{x}} = \mathbf{0}$, solutions to $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e = \mathbf{0}$ are stationary point trajectories ($\ddot{\mathbf{x}} = \mathbf{0}$) since $\mathbf{f}_e$ is homogeneous of degree 2 and therefore vanishes at $\dot{\mathbf{x}} = \mathbf{0}$. Therefore, defining $\mathbf{h}_2(\mathbf{x},\mathbf{0}) = \mathbf{0}$ creates matching limiting behavior (independent of the characteristic properties of generating equations, which characterize geometrically consistent generating trajectories for $\dot{\mathbf{x}} \neq \mathbf{0}$), so $\ddot{\mathbf{x}} + \mathbf{h}_2(\mathbf{x},\dot{\mathbf{x}}) = \mathbf{0}$ is a generating equation with solutions consistent with $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e = \mathbf{0}$. The energy equation, therefore, generates a geometry. We will see below that this geometry is given by the geometric equations of motion.

To show $\mathbf{M}_g\ddot{\mathbf{x}} + \mathbf{f}_g = \mathbf{0}$ is a geometric equation, we calculate explicit expressions for $\mathbf{M}_g$ and $\mathbf{f}_g$. Since $\mathcal{L}_e = \frac{1}{2}\mathcal{L}_g^2$ and $\mathcal{L}_g = (2\mathcal{L}_e)^{\frac{1}{2}}$, we have

\begin{align}
\mathbf{M}_g &= \partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_g = \sqrt{2}\,\partial_{\dot{\mathbf{x}}}\left[\partial_{\dot{\mathbf{x}}}\mathcal{L}_e^{\frac{1}{2}}\right] = \sqrt{2}\,\partial_{\dot{\mathbf{x}}}\left(\frac{1}{2}\mathcal{L}_e^{-\frac{1}{2}}\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\right) \nonumber\\
&= \frac{\sqrt{2}}{2}\left[\left(-\frac{1}{2}\right)\mathcal{L}_e^{-\frac{1}{2}-1}\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e^T + \mathcal{L}_e^{-\frac{1}{2}}\,\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e\right] \nonumber\\
&= \frac{1}{(2\mathcal{L}_e)^{\frac{1}{2}}}\left[\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e - \frac{1}{2\mathcal{L}_e}\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e^T\right] \tag{30}\\
&= \frac{1}{\mathcal{L}_g}\left[\mathbf{M}_e - \frac{\mathbf{p}_e\mathbf{p}_e^T}{\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e}\right], \nonumber
\end{align}

where we use $\mathbf{M}_e = \partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e$, denote $\mathbf{p}_e = \partial_{\dot{\mathbf{x}}}\mathcal{L}_e$ (this quantity is called the generalized momentum and has a recurring role in Finsler theory), and use the identity $\mathcal{L}_e = \frac{1}{2}\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e$ (see Lemma III.2). Denoting $\mathbf{R}_{\dot{\mathbf{x}}} = \mathbf{M}_e - \frac{\mathbf{p}_e\mathbf{p}_e^T}{\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e}$ and noting $\mathbf{p}_e = \mathbf{M}_e\dot{\mathbf{x}}$ (see again Lemma III.2), we see that $\mathbf{M}_g = \frac{1}{\mathcal{L}_g}\mathbf{R}_{\dot{\mathbf{x}}}$ is a reduced rank matrix with null space spanned by $\dot{\mathbf{x}}$ since

\begin{align}
\mathbf{R}_{\dot{\mathbf{x}}}\dot{\mathbf{x}} &= \left(\mathbf{M}_e - \frac{\mathbf{p}_e\mathbf{p}_e^T}{\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e}\right)\mathbf{M}_e^{-1}\mathbf{p}_e \tag{31}\\
&= \mathbf{M}_e\mathbf{M}_e^{-1}\mathbf{p}_e - \frac{\mathbf{p}_e\big(\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e\big)}{\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e} \tag{32}\\
&= \mathbf{p}_e - \mathbf{p}_e = \mathbf{0}. \tag{33}
\end{align}

Using a calculation analogous to that which we used for Equation 30 we get

\begin{align}
\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_g = \frac{1}{\mathcal{L}_g}\left[\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e - \frac{1}{2\mathcal{L}_e}\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\,\partial_{\mathbf{x}}\mathcal{L}_e^T\right], \tag{34}
\end{align}

and separately,

\begin{align}
\partial_{\mathbf{x}}\mathcal{L}_g = \partial_{\mathbf{x}}(2\mathcal{L}_e)^{\frac{1}{2}} = \frac{\sqrt{2}}{2}\mathcal{L}_e^{-\frac{1}{2}}\,\partial_{\mathbf{x}}\mathcal{L}_e = \frac{1}{\mathcal{L}_g}\,\partial_{\mathbf{x}}\mathcal{L}_e, \tag{35}
\end{align}

so combining we get

\begin{align}
\mathbf{f}_g &= \partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_g\,\dot{\mathbf{x}} - \partial_{\mathbf{x}}\mathcal{L}_g \tag{36}\\
&= \frac{1}{\mathcal{L}_g}\left[\big(\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}} - \partial_{\mathbf{x}}\mathcal{L}_e\big) - \frac{1}{2\mathcal{L}_e}\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\,\partial_{\mathbf{x}}\mathcal{L}_e^T\dot{\mathbf{x}}\right] \tag{37}\\
&= \frac{1}{\mathcal{L}_g}\left[\mathbf{f}_e - \frac{1}{2\mathcal{L}_e}\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\,\partial_{\mathbf{x}}\mathcal{L}_e^T\dot{\mathbf{x}}\right]. \tag{38}
\end{align}

We first show that $\dot{\mathbf{x}}^T\mathbf{f}_g = 0$ (i.e. $\mathbf{f}_g$ is orthogonal to $\dot{\mathbf{x}}$) and then use that insight to derive an explicit projected expression for $\mathbf{f}_g$:

\begin{align}
\dot{\mathbf{x}}^T\mathbf{f}_g &= \frac{1}{\mathcal{L}_g}\left[\dot{\mathbf{x}}^T\mathbf{f}_e - \frac{1}{2\mathcal{L}_e}\,\dot{\mathbf{x}}^T\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\,\partial_{\mathbf{x}}\mathcal{L}_e^T\dot{\mathbf{x}}\right] \tag{39}\\
&= \frac{1}{\mathcal{L}_g}\bigg[\dot{\mathbf{x}}^T\big(\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}} - \partial_{\mathbf{x}}\mathcal{L}_e\big) - \frac{\frac{1}{2}\dot{\mathbf{x}}^T\partial_{\dot{\mathbf{x}}}\mathcal{L}_e}{\mathcal{L}_e}\,\partial_{\mathbf{x}}\mathcal{L}_e^T\dot{\mathbf{x}}\bigg]. \tag{40}
\end{align}

From Lemma III.2 we also get $\mathcal{L}_e = \frac{1}{2}\mathbf{p}_e^T\dot{\mathbf{x}} = \frac{1}{2}\partial_{\dot{\mathbf{x}}}\mathcal{L}_e^T\dot{\mathbf{x}}$, so the expression reduces to

\begin{align}
\dot{\mathbf{x}}^T\mathbf{f}_g = \frac{1}{\mathcal{L}_g}\left[\dot{\mathbf{x}}^T\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}} - 2\,\partial_{\mathbf{x}}\mathcal{L}_e^T\dot{\mathbf{x}}\right]. \tag{41}
\end{align}

Finally,

\begin{align}
&\dot{\mathbf{x}}^T\partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}} - 2\,\partial_{\mathbf{x}}\mathcal{L}_e^T\dot{\mathbf{x}} \tag{42}\\
&\quad = \left(\dot{\mathbf{x}}^T\partial_{\mathbf{x}}\big(\partial_{\dot{\mathbf{x}}}\mathcal{L}_e\big) - 2\,\partial_{\mathbf{x}}\mathcal{L}_e^T\right)\dot{\mathbf{x}} \tag{43}\\
&\quad = \partial_{\mathbf{x}}\left(\dot{\mathbf{x}}^T\partial_{\dot{\mathbf{x}}}\mathcal{L}_e - 2\mathcal{L}_e\right)^T\dot{\mathbf{x}} \tag{44}\\
&\quad = \partial_{\mathbf{x}}\left(2\mathcal{L}_e - 2\mathcal{L}_e\right)^T\dot{\mathbf{x}} = 0, \tag{45}
\end{align}

again since $\mathcal{L}_e = \frac{1}{2}\partial_{\dot{\mathbf{x}}}\mathcal{L}_e^T\dot{\mathbf{x}}$ by Lemma III.2. Therefore, $\dot{\mathbf{x}}^T\mathbf{f}_g = 0$.

By Equation 38, $\mathbf{f}_g$ takes the form $\mathbf{f}_g = \frac{1}{\mathcal{L}_g}\left[\mathbf{f}_e - \alpha\mathbf{p}_e\right]$ (with $\mathbf{p}_e = \partial_{\dot{\mathbf{x}}}\mathcal{L}_e$ and $\alpha = \frac{1}{2\mathcal{L}_e}\partial_{\mathbf{x}}\mathcal{L}_e^T\dot{\mathbf{x}}$). Since we just saw that $\mathbf{f}_g$ is orthogonal to $\dot{\mathbf{x}}$, that $\alpha$ coefficient must be precisely the coefficient on $\mathbf{p}_e$ needed to remove the component of $\mathbf{f}_e$ along $\dot{\mathbf{x}}$. Explicitly, $\alpha$ must satisfy $\dot{\mathbf{x}}^T\big[\mathbf{f}_e - \alpha\mathbf{p}_e\big] = 0$, which means

\begin{align}
\alpha = \frac{\dot{\mathbf{x}}^T\mathbf{f}_e}{\dot{\mathbf{x}}^T\mathbf{p}_e} \tag{46}
\end{align}

is another expression for $\alpha$. That means

\begin{align}
\mathbf{f}_g &= \frac{1}{\mathcal{L}_g}\big[\mathbf{f}_e - \alpha\mathbf{p}_e\big] = \frac{1}{\mathcal{L}_g}\left[\mathbf{f}_e - \left(\frac{\dot{\mathbf{x}}^T\mathbf{f}_e}{\dot{\mathbf{x}}^T\mathbf{p}_e}\right)\mathbf{p}_e\right] \tag{47}\\
&= \frac{1}{\mathcal{L}_g}\left[\mathbf{I} - \frac{\mathbf{p}_e\dot{\mathbf{x}}^T}{\dot{\mathbf{x}}^T\mathbf{p}_e}\right]\mathbf{f}_e. \tag{48}
\end{align}

Noting again that $\mathbf{p}_e = \mathbf{M}_e\dot{\mathbf{x}}$ (see Lemma III.2), we have

\begin{align}
\mathbf{f}_g &= \frac{1}{\mathcal{L}_g}\left[\mathbf{I} - \frac{\mathbf{M}_e\dot{\mathbf{x}}\dot{\mathbf{x}}^T}{\dot{\mathbf{x}}^T\mathbf{M}_e\dot{\mathbf{x}}}\right]\mathbf{f}_e \tag{49}\\
&= \frac{1}{\mathcal{L}_g}\mathbf{M}_e\left[\mathbf{M}_e^{-1} - \frac{\dot{\mathbf{x}}\dot{\mathbf{x}}^T}{\dot{\mathbf{x}}^T\mathbf{M}_e\dot{\mathbf{x}}}\right]\mathbf{f}_e \tag{50}\\
&= \frac{1}{\mathcal{L}_g}\mathbf{M}_e\mathbf{R}_{\mathbf{p}_e}\mathbf{f}_e, \tag{51}
\end{align}

where

\begin{align}
\mathbf{R}_{\mathbf{p}_e} = \mathbf{M}_e^{-1} - \frac{\dot{\mathbf{x}}\dot{\mathbf{x}}^T}{\dot{\mathbf{x}}^T\mathbf{M}_e\dot{\mathbf{x}}} \tag{52}
\end{align}

is a reduced rank matrix analogous to $\mathbf{R}_{\dot{\mathbf{x}}}$, with null space spanned by $\mathbf{p}_e = \mathbf{M}_e\dot{\mathbf{x}}$.

Combining all of these expressions so far, we get geometric equations of motion

\begin{align}
\mathbf{M}_g\ddot{\mathbf{x}} + \mathbf{f}_g = \frac{1}{\mathcal{L}_g}\mathbf{R}_{\dot{\mathbf{x}}}\ddot{\mathbf{x}} + \frac{1}{\mathcal{L}_g}\mathbf{M}_e\mathbf{R}_{\mathbf{p}_e}\mathbf{f}_e = \mathbf{0}. \tag{53}
\end{align}

However, $\mathbf{R}_{\dot{\mathbf{x}}}$ and $\mathbf{R}_{\mathbf{p}_e}$ are related by the identity $\mathbf{M}_e\mathbf{R}_{\mathbf{p}_e}\mathbf{M}_e = \mathbf{R}_{\dot{\mathbf{x}}}$ since

\begin{align}
\mathbf{R}_{\dot{\mathbf{x}}} &= \mathbf{M}_e - \frac{\mathbf{p}_e\mathbf{p}_e^T}{\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e} \tag{54}\\
&= \mathbf{M}_e - \frac{\mathbf{M}_e\dot{\mathbf{x}}\dot{\mathbf{x}}^T\mathbf{M}_e}{\big(\mathbf{M}_e\dot{\mathbf{x}}\big)^T\mathbf{M}_e^{-1}\big(\mathbf{M}_e\dot{\mathbf{x}}\big)} \tag{55}\\
&= \mathbf{M}_e\left[\mathbf{M}_e^{-1} - \frac{\dot{\mathbf{x}}\dot{\mathbf{x}}^T}{\dot{\mathbf{x}}^T\mathbf{M}_e\dot{\mathbf{x}}}\right]\mathbf{M}_e \tag{56}\\
&= \mathbf{M}_e\mathbf{R}_{\mathbf{p}_e}\mathbf{M}_e. \tag{57}
\end{align}

Therefore, the geometric equations of motion can be expressed

\begin{align}
&\frac{1}{\mathcal{L}_g}\mathbf{M}_e\mathbf{R}_{\mathbf{p}_e}\mathbf{M}_e\ddot{\mathbf{x}} + \frac{1}{\mathcal{L}_g}\mathbf{M}_e\mathbf{R}_{\mathbf{p}_e}\mathbf{f}_e = \mathbf{0} \tag{58}\\
&\Rightarrow\ \frac{1}{\mathcal{L}_g}\mathbf{M}_e\mathbf{R}_{\mathbf{p}_e}\big[\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e\big] = \mathbf{0} \tag{59}\\
&\mathrm{or}\ \ \mathbf{P}^{\perp}_{\mathbf{p}_e}\big[\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e\big] = \mathbf{0}, \tag{60}
\end{align}

where $\mathbf{P}^{\perp}_{\mathbf{p}_e}$ is an orthogonal projector with null space spanned by $\mathbf{p}_e$ (any matrix with $\mathbf{p}_e$ spanning its null space would express the same equation). Solutions to this equation are exactly those for which $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e$ lies along the null space direction $\mathbf{p}_e$, so they can be expressed as solutions to the unprojected equation

\begin{align}
\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e + \alpha\mathbf{p}_e = \mathbf{0} \tag{61}
\end{align}

for any $\alpha \in \mathbb{R}$. Since $\mathbf{p}_e = \mathbf{M}_e\dot{\mathbf{x}}$ by Lemma III.2, those solutions also solve

\begin{align}
&\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e + \alpha\mathbf{M}_e\dot{\mathbf{x}} = \mathbf{0} \tag{62}\\
&\Rightarrow\ \ \ddot{\mathbf{x}} + \mathbf{M}_e^{-1}\mathbf{f}_e + \alpha\dot{\mathbf{x}} = \mathbf{0} \tag{63}
\end{align}

for any $\alpha \in \mathbb{R}$, which are exactly the solutions to

\begin{align}
\mathbf{P}^{\perp}_{\dot{\mathbf{x}}}\big[\ddot{\mathbf{x}} + \mathbf{M}_e^{-1}\mathbf{f}_e\big] = \mathbf{0}, \tag{64}
\end{align}

where $\mathbf{P}^{\perp}_{\dot{\mathbf{x}}}$ is the orthogonal projector with null space spanned by $\dot{\mathbf{x}}$. This equation is the geometric equation induced by generator $\ddot{\mathbf{x}} + \mathbf{M}_e^{-1}\mathbf{f}_e = \mathbf{0}$. Therefore, the energy equations of motion $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e = \mathbf{0}$ form a geometric generator whose geometric equation is given by the geometric equations of motion $\mathbf{M}_g\ddot{\mathbf{x}} + \mathbf{f}_g = \mathbf{0}$. ∎
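The main steps of the proof can be sanity-checked numerically. The sketch below again assumes a hypothetical conformally scaled Randers structure $\mathcal{L}_g = c(\mathbf{x})\big(\sqrt{\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}}} + \mathbf{b}^T\dot{\mathbf{x}}\big)$ with $c(\mathbf{x}) = 1 + \frac{1}{2}\|\mathbf{x}\|^2$ (our choice, not the paper's), for which the derivatives of both $\mathcal{L}_e$ and $\mathcal{L}_g$ are available in closed form. It compares Equations 30, 38, and 51 against direct differentiation of $\mathcal{L}_g$ and checks that every acceleration $\ddot{\mathbf{x}}_0 + \alpha\dot{\mathbf{x}}$ built from the generator solves the geometric equations of motion.

```python
import numpy as np

# Hypothetical example data (not from the paper).
G = np.array([[2.0, 0.3], [0.3, 1.0]])
b = np.array([0.2, -0.1])

def finsler_terms(x, v):
    c, dc = 1.0 + 0.5 * (x @ x), x                  # conformal factor and its gradient
    n = np.sqrt(v @ G @ v)
    F0 = n + b @ v                                  # x-independent Randers norm
    dF0 = G @ v / n + b                             # its velocity gradient
    H0 = (G - np.outer(G @ v, G @ v) / n**2) / n    # its velocity Hessian
    L_g = c * F0
    L_e = 0.5 * L_g**2
    p_e = c**2 * F0 * dF0                           # generalized momentum
    M_e = c**2 * (np.outer(dF0, dF0) + F0 * H0)     # energy tensor
    dLe_dx = c * F0**2 * dc
    f_e = 2.0 * c * F0 * dF0 * (dc @ v) - dLe_dx
    # Direct derivatives of L_g, used only as an independent reference.
    M_g_direct = c * H0
    f_g_direct = dF0 * (dc @ v) - F0 * dc
    return L_g, L_e, p_e, M_e, dLe_dx, f_e, M_g_direct, f_g_direct

x = np.array([0.4, -0.7])
v = np.array([1.0, 0.5])
L_g, L_e, p_e, M_e, dLe_dx, f_e, M_g_direct, f_g_direct = finsler_terms(x, v)

# Equation 30: M_g = R_x / L_g
R_x = M_e - np.outer(p_e, p_e) / (p_e @ np.linalg.solve(M_e, p_e))
M_g = R_x / L_g
# Equation 38 and its projected form, Equation 51
f_g_38 = (f_e - p_e * (dLe_dx @ v) / (2.0 * L_e)) / L_g
R_p = np.linalg.inv(M_e) - np.outer(v, v) / (v @ M_e @ v)
f_g_51 = M_e @ R_p @ f_e / L_g

# Generator acceleration plus any multiple of the velocity (Equation 63).
a0 = -np.linalg.solve(M_e, f_e)
residuals = [M_g @ (a0 + alpha * v) + f_g_38 for alpha in (-2.0, 0.0, 1.5)]

print("Eq. 30 matches direct Hessian:", np.allclose(M_g, M_g_direct))
print("velocity spans null space of M_g:", np.allclose(M_g @ v, 0.0))
print("Eq. 38 == Eq. 51:", np.allclose(f_g_38, f_g_51))
print("geometric equation residuals vanish:", np.allclose(residuals, 0.0))
```

The last check is the content of Theorem III.1 in miniature: the geometric equations of motion cannot distinguish between accelerations that differ by a multiple of $\dot{\mathbf{x}}$.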

The above proof relies on a number of properties of the energy $\mathcal{L}_e$ and its generalized momentum $\mathbf{p}_e = \partial_{\dot{\mathbf{x}}}\mathcal{L}_e$. These results are collected and proved in the following lemma. Note that the following identities match those seen in Riemannian geometry, with the Riemannian metric simply replaced by the generalized metric tensor. See Section IV for details.

Lemma III.2 (Energy and momentum identities).

Let $\mathcal{L}_g$ be a Finsler structure with energy form $\mathcal{L}_e = \frac{1}{2}\mathcal{L}_g^2$, and denote the generalized momentum $\mathbf{p}_e = \partial_{\dot{\mathbf{x}}}\mathcal{L}_e$. Then

  1. $\mathbf{p}_e = \mathbf{M}_e\dot{\mathbf{x}}$

  2. $\mathcal{L}_e = \frac{1}{2}\mathbf{p}_e^T\dot{\mathbf{x}} = \frac{1}{2}\dot{\mathbf{x}}^T\mathbf{M}_e\dot{\mathbf{x}} = \frac{1}{2}\mathbf{p}_e^T\mathbf{M}_e^{-1}\mathbf{p}_e$.

Proof.

Since $\mathcal{L}_e$ is homogeneous of degree 2, as we've seen $\mathcal{H}_{\mathcal{L}_e} = \mathbf{p}_e^T\dot{\mathbf{x}} - \mathcal{L}_e = \mathcal{L}_e$. This directly implies

\begin{align}
\mathcal{L}_e = \frac{1}{2}\mathbf{p}_e^T\dot{\mathbf{x}}, \tag{65}
\end{align}

giving the first form of identity (2). The rest of those identities derive from identity (1); to prove that identity, we take the gradient of this expression for $\mathcal{L}_e$:

\begin{align}
\partial_{\dot{\mathbf{x}}}\mathcal{L}_e &= \partial_{\dot{\mathbf{x}}}\left[\frac{1}{2}\partial_{\dot{\mathbf{x}}}\mathcal{L}_e^T\dot{\mathbf{x}}\right] \tag{66}\\
&= \frac{1}{2}\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e\,\dot{\mathbf{x}} + \frac{1}{2}\partial_{\dot{\mathbf{x}}}\mathcal{L}_e, \tag{67}
\end{align}

which implies

\begin{align}
\partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e\,\dot{\mathbf{x}} = 2\,\partial_{\dot{\mathbf{x}}}\mathcal{L}_e - \partial_{\dot{\mathbf{x}}}\mathcal{L}_e = \partial_{\dot{\mathbf{x}}}\mathcal{L}_e \tag{68}
\end{align}

or $\mathbf{M}_e\dot{\mathbf{x}} = \mathbf{p}_e$. ∎
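The identities of Lemma III.2 are easy to confirm on a non-Riemannian example. The sketch below assumes a hypothetical constant-coefficient Randers structure $\mathcal{L}_g = \sqrt{\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}}} + \mathbf{b}^T\dot{\mathbf{x}}$ (the values of `G` and `b` are ours), whose energy tensor $\mathbf{M}_e$ depends on the direction of $\dot{\mathbf{x}}$.

```python
import numpy as np

# Hypothetical example data (not from the paper).
G = np.array([[2.0, 0.3], [0.3, 1.0]])
b = np.array([0.2, -0.1])

def randers_energy(v):
    """L_e, p_e, M_e for L_e = 0.5 * F^2 with F = sqrt(v^T G v) + b^T v."""
    n = np.sqrt(v @ G @ v)
    F = n + b @ v
    dF = G @ v / n + b                             # gradient of F in v
    H = (G - np.outer(G @ v, G @ v) / n**2) / n    # Hessian of F in v
    L_e = 0.5 * F**2
    p_e = F * dF                                   # generalized momentum
    M_e = np.outer(dF, dF) + F * H                 # energy tensor
    return L_e, p_e, M_e

v = np.array([-0.6, 1.3])
L_e, p_e, M_e = randers_energy(v)

# Identity (1): p_e = M_e v
print(np.allclose(p_e, M_e @ v))

# Identity (2): three equivalent expressions for the energy
forms = [0.5 * p_e @ v,
         0.5 * v @ M_e @ v,
         0.5 * p_e @ np.linalg.solve(M_e, p_e)]
print(np.allclose(forms, L_e))
```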

IV Riemannian geometry is a Finsler geometry

Riemannian geometry is a special case of Finsler geometry. This section shows how many of the most important properties of Riemannian geometry arise from the more general properties of Finsler geometry. The Riemannian Finsler structure is $\mathcal{L}_g = \big(\dot{\mathbf{x}}^T\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}\big)^{\frac{1}{2}}$, where $\mathbf{G}(\mathbf{x})$ is a smoothly changing symmetric positive definite matrix. Since $\mathbf{G}$ is independent of $\dot{\mathbf{x}}$, it's easy to see that the Finsler structure is HD1. For $\lambda > 0$,

\begin{align*}
\mathcal{L}_g(\mathbf{x},\lambda\dot{\mathbf{x}}) &= \big((\lambda\dot{\mathbf{x}})^T\mathbf{G}(\mathbf{x})(\lambda\dot{\mathbf{x}})\big)^{\frac{1}{2}}\\
&= \big(\lambda^2\,\dot{\mathbf{x}}^T\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}\big)^{\frac{1}{2}}\\
&= \lambda\,\mathcal{L}_g(\mathbf{x},\dot{\mathbf{x}}).
\end{align*}

Likewise, since $\mathbf{G}$ is positive definite, $\mathcal{L}_g \geq 0$ with equality only when $\dot{\mathbf{x}} = \mathbf{0}$. And since $\mathcal{L}_e = \frac{1}{2}\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}}$, we have $\mathbf{M}_e = \partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e = \mathbf{G}$, which is everywhere invertible. The matrix $\mathbf{G}$ is called a Riemannian metric, and plays the role of the Finsler metric tensor. The Finsler structure defines a norm on the tangent space (the space of velocities at a point $\mathbf{x}$)

\begin{align}
\mathcal{L}_g = (\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}})^{\frac{1}{2}} = \|\dot{\mathbf{x}}\|_{\mathbf{G}} > 0\ \ \text{for $\dot{\mathbf{x}} \neq \mathbf{0}$}, \tag{69}
\end{align}

showing that the action defining the extremal problem's objective is

\begin{align}
A[\mathbf{x}] = \int \mathcal{L}_g\,dt = \int \|\dot{\mathbf{x}}\|_{\mathbf{G}}\,dt. \tag{70}
\end{align}

This action functional can be understood as a generalized arc-length integral across the trajectory.

Since the action is a generalized arc-length integral it seems natural that this measure would be invariant to time-reparameterization of the trajectory (i.e. invariant to speed profile across the trajectory). Using time-reparameterization $t(s)$ with $\dot{\mathbf{x}}_s = \frac{dt}{ds}\dot{\mathbf{x}}_t$ and $\frac{dt}{ds} > 0$, we can perform the calculation of Section IV explicitly

\begin{align*}
A[\mathbf{x}_s] &= \int \|\dot{\mathbf{x}}_s\|_{\mathbf{G}}\,ds = \int \left\|\frac{dt}{ds}\dot{\mathbf{x}}_t\right\|_{\mathbf{G}} ds\\
&= \int \|\dot{\mathbf{x}}_t\|_{\mathbf{G}}\,\frac{dt}{ds}\,ds = \int \|\dot{\mathbf{x}}_t\|_{\mathbf{G}}\,dt = A[\mathbf{x}_t].
\end{align*}
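The invariance above can be illustrated numerically. In this sketch the metric `G`, the curve (a unit-circle arc), and the reparameterization $t(s) = s^2$ are all hypothetical choices made for illustration. The action is approximated with the trapezoid rule under both parameterizations.

```python
import numpy as np

def G(x):
    # Hypothetical position-dependent symmetric positive definite metric.
    return np.array([[1.0 + x[0] ** 2, 0.0], [0.0, 2.0]])

def curve(t):
    return np.array([np.cos(t), np.sin(t)])

def curve_velocity(t):
    return np.array([-np.sin(t), np.cos(t)])

def arc_length(params, positions, velocities):
    """Trapezoid-rule approximation of the integral of ||x_dot||_G."""
    speeds = np.array([np.sqrt(v @ G(x) @ v)
                       for x, v in zip(positions, velocities)])
    return np.sum(0.5 * (speeds[1:] + speeds[:-1]) * np.diff(params))

# Original parameterization: t in [0, 1].
t = np.linspace(0.0, 1.0, 4001)
length_t = arc_length(t, [curve(ti) for ti in t],
                      [curve_velocity(ti) for ti in t])

# Reparameterization t(s) = s^2 on s in [0, 1], so x_dot_s = (dt/ds) x_dot_t.
s = np.linspace(0.0, 1.0, 4001)
t_of_s, dt_ds = s**2, 2.0 * s
length_s = arc_length(s, [curve(ti) for ti in t_of_s],
                      [d * curve_velocity(ti) for ti, d in zip(t_of_s, dt_ds)])

print(length_t, length_s)  # the two values agree up to discretization error
```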

Per Theorem III.1 we would expect the Riemannian Finsler structure's Euler-Lagrange equation $\mathbf{M}_g\ddot{\mathbf{x}} + \mathbf{f}_g = \mathbf{0}$ to have a reduced rank $\mathbf{M}_g = \partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_g$ (otherwise, it would have a unique (non-redundant) solution). Indeed, by calculation, we have

\begin{align}
\mathbf{M}_g &= \partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_g = \partial_{\dot{\mathbf{x}}}\left[\partial_{\dot{\mathbf{x}}}\big(\dot{\mathbf{x}}^T\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}\big)^{\frac{1}{2}}\right] \tag{71}\\
&= \partial_{\dot{\mathbf{x}}}\left[\frac{1}{2}\big(\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}}\big)^{-\frac{1}{2}}\,2\mathbf{G}\dot{\mathbf{x}}\right] = \partial_{\dot{\mathbf{x}}}\left[\frac{\mathbf{G}\dot{\mathbf{x}}}{\|\dot{\mathbf{x}}\|_{\mathbf{G}}}\right] \tag{72}\\
&= \frac{\|\dot{\mathbf{x}}\|_{\mathbf{G}}\,\mathbf{G} - \mathbf{G}\dot{\mathbf{x}}\left(\frac{\dot{\mathbf{x}}^T\mathbf{G}}{\|\dot{\mathbf{x}}\|_{\mathbf{G}}}\right)}{\|\dot{\mathbf{x}}\|_{\mathbf{G}}^2} \tag{73}\\
&= \frac{1}{\|\dot{\mathbf{x}}\|_{\mathbf{G}}}\left[\mathbf{G} - \frac{\mathbf{G}\dot{\mathbf{x}}\dot{\mathbf{x}}^T\mathbf{G}}{\|\dot{\mathbf{x}}\|_{\mathbf{G}}^2}\right] \tag{74}\\
&= \frac{1}{\|\dot{\mathbf{x}}\|_{\mathbf{G}}}\,\mathbf{G}^{\frac{1}{2}}\left[\mathbf{I} - \widehat{\mathbf{v}}\widehat{\mathbf{v}}^T\right]\mathbf{G}^{\frac{1}{2}}, \tag{75}
\end{align}

where $\mathbf{v} = \mathbf{G}^{\frac{1}{2}}\dot{\mathbf{x}}$ and $\widehat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|}$. This matrix is reduced rank since $\mathbf{I} - \widehat{\mathbf{v}}\widehat{\mathbf{v}}^T$ is an orthogonal projector (with null space spanned by $\mathbf{v}$). $\mathbf{M}_g$'s null space is, therefore, spanned by $\mathbf{u}$ such that $\mathbf{G}^{\frac{1}{2}}\mathbf{u} = \mathbf{v}$. Since $\mathbf{v} = \mathbf{G}^{\frac{1}{2}}\dot{\mathbf{x}}$, it must be that $\mathbf{u} = \dot{\mathbf{x}}$. Therefore, solutions to $\mathbf{M}_g\ddot{\mathbf{x}} + \mathbf{f}_g = \mathbf{0}$ can be expressed as any nominal solution $\ddot{\mathbf{x}}_0$ offset by a null space element $\alpha\dot{\mathbf{x}}$, i.e. $\ddot{\mathbf{x}} = \ddot{\mathbf{x}}_0 + \alpha\dot{\mathbf{x}}$.
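A short numerical check of Equations 74 and 75, using an arbitrary symmetric positive definite `G` and velocity `v` (both hypothetical values chosen for illustration), confirms that the two forms agree and that the velocity spans the null space.

```python
import numpy as np

# Hypothetical SPD metric and velocity (illustrative values only).
G = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.5]])
v = np.array([0.3, -1.0, 0.7])

norm = np.sqrt(v @ G @ v)                       # ||x_dot||_G

# Equation 74
M_g_74 = (G - np.outer(G @ v, G @ v) / norm**2) / norm

# Equation 75, with G^{1/2} built from the eigendecomposition of G
w, Q = np.linalg.eigh(G)
G_half = Q @ np.diag(np.sqrt(w)) @ Q.T
v_hat = G_half @ v / norm                       # unit vector along G^{1/2} x_dot
M_g_75 = G_half @ (np.eye(3) - np.outer(v_hat, v_hat)) @ G_half / norm

print(np.allclose(M_g_74, M_g_75))              # both forms agree
print(np.allclose(M_g_74 @ v, 0.0))             # x_dot lies in the null space
print(np.linalg.matrix_rank(M_g_74, tol=1e-10)) # rank is one less than the dimension
```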

Remember that the Hamiltonian (conserved quantity) of the energy system, given in this case by $\mathcal{L}_e = \frac{1}{2}\mathcal{L}_g^2$, is (generically) $\mathcal{H}_{\mathcal{L}_e} = \partial_{\dot{\mathbf{x}}}\mathcal{L}_e^T\dot{\mathbf{x}} - \mathcal{L}_e$. In the case of Riemannian geometry, this Hamiltonian is known to evaluate to

\begin{align}
\mathcal{H}_{\mathcal{L}_e} = \big(\mathbf{G}\dot{\mathbf{x}}\big)^T\dot{\mathbf{x}} - \frac{1}{2}\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}} = \frac{1}{2}\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}} = \mathcal{L}_e. \tag{76}
\end{align}

From the above discussion around homogeneous functions (see Section II) we see that this seemingly coincidental result actually derives from the second-degree homogeneity of $\mathcal{L}_e$, which in turn derives from the first-degree homogeneity of the Finsler structure $\mathcal{L}_g = \big(\dot{\mathbf{x}}^T\mathbf{G}(\mathbf{x})\dot{\mathbf{x}}\big)^{\frac{1}{2}}$, i.e. the Riemannian length element.

The energy equations $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e = \mathbf{0}$ in Riemannian geometry have

\begin{align}
\mathbf{M}_e &= \partial^2_{\dot{\mathbf{x}}\dot{\mathbf{x}}}\mathcal{L}_e = \mathbf{G} \tag{77}\\
\text{and}\ \ \mathbf{f}_e &= \partial_{\dot{\mathbf{x}}\mathbf{x}}\mathcal{L}_e\,\dot{\mathbf{x}} - \partial_{\mathbf{x}}\mathcal{L}_e \tag{78}\\
&= \partial_{\mathbf{x}}\big(\mathbf{G}\dot{\mathbf{x}}\big)\dot{\mathbf{x}} - \partial_{\mathbf{x}}\left(\frac{1}{2}\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}}\right). \nonumber
\end{align}

From these expressions we can easily see that $\mathbf{M}_e$ is HD0 and $\mathbf{f}_e$ is HD2, making $\ddot{\mathbf{x}} + \mathbf{M}_e^{-1}\mathbf{f}_e = \mathbf{0}$ an HD2 geometry generator. We do not fully derive the geometric equation here, but this result, in conjunction with the above observation that $\mathbf{M}_g$ is reduced rank, supports the results of Theorem III.1.
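As a concrete illustration of Equations 77 and 78 (a worked example of ours, not taken from the paper), consider the Euclidean plane in polar coordinates $\mathbf{x} = (r, \theta)$, for which $\mathbf{G} = \mathrm{diag}(1, r^2)$. Evaluating the two terms of $\mathbf{f}_e$ gives the familiar polar-coordinate geodesic equations, whose right-hand sides are visibly quadratic (HD2) in the velocity.

```latex
\begin{align*}
\mathcal{L}_e &= \tfrac{1}{2}\big(\dot{r}^2 + r^2\dot{\theta}^2\big),
\qquad
\mathbf{M}_e = \mathbf{G} = \begin{bmatrix} 1 & 0 \\ 0 & r^2 \end{bmatrix},
\qquad
\mathbf{G}\dot{\mathbf{x}} = \begin{bmatrix} \dot{r} \\ r^2\dot{\theta} \end{bmatrix}, \\
\partial_{\mathbf{x}}\big(\mathbf{G}\dot{\mathbf{x}}\big)\dot{\mathbf{x}}
&= \begin{bmatrix} 0 & 0 \\ 2r\dot{\theta} & 0 \end{bmatrix}
   \begin{bmatrix} \dot{r} \\ \dot{\theta} \end{bmatrix}
 = \begin{bmatrix} 0 \\ 2r\dot{r}\dot{\theta} \end{bmatrix},
\qquad
\partial_{\mathbf{x}}\mathcal{L}_e = \begin{bmatrix} r\dot{\theta}^2 \\ 0 \end{bmatrix}, \\
\mathbf{f}_e &= \begin{bmatrix} -r\dot{\theta}^2 \\ 2r\dot{r}\dot{\theta} \end{bmatrix},
\qquad
\mathbf{h}_2 = \mathbf{M}_e^{-1}\mathbf{f}_e
 = \begin{bmatrix} -r\dot{\theta}^2 \\ 2\dot{r}\dot{\theta}/r \end{bmatrix}, \\
\ddot{\mathbf{x}} + \mathbf{h}_2 = \mathbf{0}
\ &\Longleftrightarrow\
\ddot{r} = r\dot{\theta}^2, \quad \ddot{\theta} = -\frac{2\dot{r}\dot{\theta}}{r}.
\end{align*}
```

Scaling $(\dot{r}, \dot{\theta})$ by $\lambda$ scales both components of $\mathbf{h}_2$ by $\lambda^2$, while $\mathbf{M}_e$ does not depend on the velocity at all.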

Figure 3: Geometric fabrics, an application of general nonlinear and Finsler geometries for the modeling and design of reactive robotic behavior. Top row, a sophisticated geometry constructed layer-by-layer with increasing complexity from left-to-right. Bottom row, the resulting convergent system when forced by an objective potential. See Section V for details.

Finally, we note that there is a direct one-to-one correspondence between Riemannian geometric energy systems and classical mechanical systems, a formulation known as geometric mechanics [2]. Under this correspondence, $\mathbf{M}_e = \mathbf{G}$ is the generalized mass matrix and $\mathbf{f}_e$ captures fictitious forces such as Coriolis and centripetal forces. The energy $\mathcal{L}_e = \frac{1}{2}\dot{\mathbf{x}}^T\mathbf{G}\dot{\mathbf{x}}$ is the kinetic energy of the mechanical system and the quantity $\mathbf{p}_e = \partial_{\dot{\mathbf{x}}}\mathcal{L}_e = \mathbf{G}\dot{\mathbf{x}}$ is the generalized momentum. These quantities match the generalized versions given by Lemma III.2.

System solutions of the generator $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e = \mathbf{0}$ conserve the Hamiltonian and are, therefore, constant energy solutions and can be considered to be energy levels. Since $\mathbf{M}_e\ddot{\mathbf{x}} + \mathbf{f}_e = \mathbf{0}$ is a generator, the solutions are geometrically consistent (solutions of all initial value problems with initial velocity pointing in the same direction from the same initial point follow overlapping paths). This geometric property of classical mechanical systems is less often quoted in robotics, but is clear from the generalized development presented here.
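Both properties, energy conservation and path consistency, can be observed in a small simulation. The sketch below assumes the Euclidean plane in polar coordinates, $\mathbf{G} = \mathrm{diag}(1, r^2)$, for which $\mathbf{h}_2 = (-r\dot{\theta}^2,\ 2\dot{r}\dot{\theta}/r)$. The RK4 integrator, step sizes, and initial conditions are illustrative choices of ours. Doubling the initial velocity while halving the time step should trace the same path sample for sample.

```python
import numpy as np

def h2(x, v):
    """HD2 generator term for the polar metric G = diag(1, r^2)."""
    r, _ = x
    r_dot, th_dot = v
    return np.array([-r * th_dot**2, 2.0 * r_dot * th_dot / r])

def rk4_rollout(x0, v0, dt, steps):
    """Integrate x_ddot + h2(x, x_dot) = 0 with classical RK4."""
    def deriv(state):
        x, v = state[:2], state[2:]
        return np.concatenate([v, -h2(x, v)])
    state = np.concatenate([x0, v0])
    path = [state.copy()]
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(state + 0.5 * dt * k1)
        k3 = deriv(state + 0.5 * dt * k2)
        k4 = deriv(state + dt * k3)
        state = state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        path.append(state.copy())
    return np.array(path)

def energy(state):
    """Kinetic energy L_e = 0.5 * x_dot^T G x_dot."""
    r, _, r_dot, th_dot = state
    return 0.5 * (r_dot**2 + r**2 * th_dot**2)

x0 = np.array([1.0, 0.0])     # (r, theta)
v0 = np.array([0.2, 0.5])     # (r_dot, theta_dot)

slow = rk4_rollout(x0, v0, dt=1e-3, steps=2000)
fast = rk4_rollout(x0, 2.0 * v0, dt=0.5e-3, steps=2000)

# Same path, different speed profile.
print(np.max(np.abs(slow[:, :2] - fast[:, :2])))
# Energy is conserved along each solution (a different level for each speed).
print(energy(slow[0]), energy(slow[-1]))
print(energy(fast[0]), energy(fast[-1]))
```

Since this metric is just the flat plane in disguise, the geodesic is a straight line in Cartesian coordinates, which gives a third independent check on the rollout.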

V An application: Geometric Fabrics

One important application of general nonlinear and Finsler geometries is in the design of reactive robotic behavior using geometric fabrics [13], where behavior is modeled as HD2 generalized nonlinear geometries. The second-order homogeneity property models the behavior as a geometry of paths, making it independent of speed and giving the system a path consistency that is important for design intuition. These behaviors are designed as a combination of many modular components, each weighted using the metric tensor of a paired Finsler energy as its weight matrix (a metric-weighted average of component geometries). Importantly, each component can be understood as a Finsler geometry with a well-defined conserved Finsler energy, whose metric tensor defines the priority and whose geometry has been bent to align with the HD2 nonlinear geometry defining the desired behavior. This bending process curves the Finsler geometry in a way that both preserves its geometric HD2 properties and preserves energy conservation (the bending term performs no work on the system and is known as a zero work modification). What we get in the end is an HD2 geometry that aligns with the desired behavioral geometry but also retains the Finsler geometry's energy metric for prioritization.
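The metric-weighted average mentioned above can be sketched in a few lines. This is only an illustration of the averaging step under our own assumptions, not the full fabric construction of [13]: the function name `combine`, the Euclidean component, and the wall-avoidance component (its metric and its geometry) are all hypothetical. Each component is a pair of a weight matrix $\mathbf{M}_i(\mathbf{x},\dot{\mathbf{x}})$ and an HD2 geometry term $\mathbf{h}_i(\mathbf{x},\dot{\mathbf{x}})$, and the combined acceleration is $\ddot{\mathbf{x}} = -\big(\sum_i \mathbf{M}_i\big)^{-1}\sum_i \mathbf{M}_i\mathbf{h}_i$.

```python
import numpy as np

def combine(components, x, v):
    """Metric-weighted average: x_ddot = -(sum_i M_i)^{-1} sum_i M_i h_i."""
    M_total = sum(M(x, v) for M, _ in components)
    weighted = sum(M(x, v) @ h(x, v) for M, h in components)
    return -np.linalg.solve(M_total, weighted)

# Component 1 (hypothetical): Euclidean geometry, straight lines, unit priority.
euclid = (lambda x, v: np.eye(2), lambda x, v: np.zeros(2))

# Component 2 (hypothetical): a wall at x[1] = 0. The priority grows as the
# distance x[1] shrinks, and the HD2 term accelerates the system away from it.
def wall_metric(x, v):
    return np.eye(2) / x[1] ** 2

def wall_geometry(x, v):
    return np.array([0.0, -(v @ v) / x[1]])

wall = (wall_metric, wall_geometry)

v = np.array([1.0, -0.5])
acc_near = combine([euclid, wall], np.array([0.0, 0.1]), v)
acc_far = combine([euclid, wall], np.array([0.0, 10.0]), v)
acc_fast = combine([euclid, wall], np.array([0.0, 0.1]), 2.0 * v)

print(acc_near, acc_far)   # the wall term dominates only near the wall
print(acc_fast / 4.0)      # HD2: doubling the speed quadruples the acceleration
```

Because the weights here are HD0 and the component terms are HD2 in the velocity, the weighted average is again HD2, so the combined system is still a geometry of paths.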

Figure 3 depicts a complex geometry in a 2D point space constructed layer-by-layer. From left to right (top row), we see first an underlying Euclidean geometry (first panel) creating straight line behaviors. Adding a barrier geometry curves those geodesic paths to bend away from the walls (second panel), and adding an obstacle barrier (third panel) ensures that the system never hits a circular obstacle as well. Then (fourth panel) a random vortex geometry is added which makes the geodesics bend and twist randomly. Despite the random motions, the resulting geodesics still never penetrate either type of barrier because of the prior layers' contributions. Finally (last panel), an attractor geometry funnels the geodesics heuristically toward a target. In all cases, Finsler energies shape each geometry's priority in the weighted average so it dominates only when most important. See [14] for an in-depth description of the geometries and Finsler energies used in this example.

The theory of geometric fabrics ensures that geometries constructed in this way can be optimized over to create system goals represented by local minima of an objective potential simply by forcing the system with the potential’s negative gradient. We do just that for each of these geometries (Figure 3, bottom row) using a simple attractor potential of the type described in [14]. In each case, the geometry greatly shapes the behavior of the system en route, but all differential equations ultimately converge to the target in the end as guaranteed by the theory of fabrics.

VI CONCLUSIONS

Geometric fabrics are powerful tools for designing robot behavior and are an important application of the generalized geometries described here. The derivations here informed the development of geometric fabrics [13], which have produced the most flexible, intuitive, and consistent provably-stable tools to date for designing reactive robot behavior. We hope these derivations and exposition will enable and inspire many more applications of these more general geometries within robotics, in the same way Riemannian geometry has already found substantial application.

References

  • [1] R. M. Murray, Z. Li, and S. S. Sastry, A Mathematical Introduction to Robotic Manipulation.   CRC Press, 1994.
  • [2] F. Bullo and A. D. Lewis, Geometric control of mechanical systems: modeling, analysis, and design for simple mechanical control systems.   Springer Science & Business Media, 2004, vol. 49.
  • [3] T. Schmidt, R. Newcombe, and D. Fox, “DART: Dense Articulated Real-time Tracking with consumer depth cameras,” Autonomous Robots, vol. 39, no. 3, pp. 239–258, 2015.
  • [4] M. Toussaint, “Newton methods for k-order Markov constrained motion problems,” CoRR, vol. abs/1407.0414, 2014. [Online]. Available: http://arxiv.org/abs/1407.0414
  • [5] N. Ratliff, M. Toussaint, and S. Schaal, “Understanding the geometry of workspace obstacles in motion optimization,” in IEEE International Conference on Robotics and Automation (ICRA), 2015.
  • [6] P.-A. Absil, R. Mahony, and R. Sepulchre, Optimization Algorithms on Matrix Manifolds.   Princeton University Press, 2008.
  • [7] S.-I. Amari and H. Nagaoka, Methods of Information Geometry.   American Mathematical Society, 1994, translated 2000, and renewed 2007.
  • [8] J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz, “Trust region policy optimization,” ser. Proceedings of Machine Learning Research, F. Bach and D. Blei, Eds., vol. 37.   Lille, France: PMLR, 07–09 Jul 2015, pp. 1889–1897.
  • [9] D. D.-W. Bao, S.-S. Chern, and Z. Shen, An Introduction to Riemann-Finsler Geometry.   Springer; 2000th Edition, 2000.
  • [10] Z. Shen, Differential Geometry of Spray and Finsler Spaces.   Springer; 2001 Edition, 2001.
  • [11] N. D. Ratliff, J. Issac, D. Kappler, S. Birchfield, and D. Fox, “Riemannian motion policies,” arXiv:1801.02854, 2018.
  • [12] C.-A. Cheng, M. Mukadam, J. Issac, S. Birchfield, D. Fox, B. Boots, and N. Ratliff, “RMPflow: A computational graph for automatic motion policy generation,” in The 13th International Workshop on the Algorithmic Foundations of Robotics, 2018.
  • [13] M. Xie, K. Van Wyk, A. Li, M. A. Rana, D. Fox, B. Boots, and N. Ratliff, “Geometric fabrics for the acceleration based design of robotic motion,” arXiv:2010.14750 [cs.RO], 2020. [Online]. Available: https://arxiv.org/abs/2010.14750
  • [14] N. D. Ratliff, K. Van Wyk, M. Xie, A. Li, and M. A. Rana, “Optimization fabrics for behavioral design,” arXiv:2010.15676 [cs.RO], 2020. [Online]. Available: https://arxiv.org/abs/2010.15676
  • [15] G. B. Folland, Advanced Calculus.   Pearson; 1st Edition, 2001.
  • [16] I. Gelfand and S. Fomin, Calculus of Variations.   Dover, orig. Prentice-Hall, 1963.
  • [17] J. R. Taylor, Classical Mechanics.   University Science Books, 2005.
  • [18] J. M. Lee, Riemannian Manifolds: An Introduction To Curvature.   Springer, 1997.
  • [19] J. Lee, Introduction to Smooth Manifolds, 2nd ed.   Springer, 2012.
  • [20] J. Lewis, Homogeneous Functions and Euler’s Theorem. In: An Introduction to Mathematics.   Palgrave Macmillan, London., 1969.
  • [21] L. Susskind, The Theoretical Minimum: Classical Mechanics.   Stanford: Continuing Studies, 2011. [Online]. Available: http://theoreticalminimum.com/courses/classical-mechanics/2011/fall