
From Second-Order Differential Geometry to
Stochastic Geometric Mechanics

Qiao Huang¹, Jean-Claude Zambrini²

¹ Group of Mathematical Physics (GFMUL), Department of Mathematics, Faculty of Sciences, University of Lisbon, Campo Grande, Edifício C6, PT-1749-016 Lisboa, Portugal. Email: [email protected]
² Group of Mathematical Physics (GFMUL), Department of Mathematics, Faculty of Sciences, University of Lisbon, Campo Grande, Edifício C6, PT-1749-016 Lisboa, Portugal. Email: [email protected]

Abstract

Classical geometric mechanics, including the study of symmetries, Lagrangian and Hamiltonian mechanics, and the Hamilton-Jacobi theory, is founded on geometric structures such as jets, symplectic and contact ones. In this paper, we shall use a partly forgotten framework of second-order (or stochastic) differential geometry, developed originally by L. Schwartz and P.-A. Meyer, to construct second-order counterparts of those classical structures. These will allow us to study symmetries of stochastic differential equations (SDEs), and to establish stochastic Lagrangian and Hamiltonian mechanics and their key relations with second-order Hamilton-Jacobi-Bellman (HJB) equations. Indeed, stochastic prolongation formulae will be derived to study symmetries of SDEs and mixed-order Cartan symmetries. Stochastic Hamilton’s equations will follow from a second-order symplectic structure, and canonical transformations will lead to the HJB equation. A stochastic variational problem on Riemannian manifolds will provide a stochastic Euler-Lagrange equation compatible with the HJB one and equivalent to the Riemannian version of stochastic Hamilton’s equations. A stochastic Noether’s theorem will also follow. The inspirational example, throughout the paper, will be the rich dynamical structure of Schrödinger’s problem in optimal transport, where the latter is also regarded as a Euclidean version of the hydrodynamical interpretation of quantum mechanics.

AMS 2020 Mathematics Subject Classification: 70L10, 35F21, 70H20, 49Q22.
Keywords and Phrases: Stochastic Hamiltonian mechanics, stochastic Lagrangian mechanics, Hamilton-Jacobi-Bellman equations, stochastic Hamilton’s equations, stochastic Euler-Lagrange equation, stochastic Noether’s theorem, Schrödinger’s problem, second-order differential geometry.

1 Introduction

Hamilton-Jacobi (HJ) partial differential equations and the associated theory lie at the center of classical mechanics [1, 7, 63, 35]. Motivated by Hamilton’s approach to geometrical optics, where the action represents the time needed by a particle to move between two points, and by a variational principle due to Fermat, Jacobi extended this approach to Lagrangian and Hamiltonian mechanics. Jacobi designed a concept of “complete” solution of HJ equations allowing him to recover all solutions simply by substitutions and differentiations. Although HJ equations are, in general, more complicated to solve than a system of ODEs like Hamilton’s, they proved to be powerful tools for integrating the classical equations of motion. In addition, Jacobi’s approach led him to ask which diffeomorphisms of the cotangent bundle, the geometric arena of canonical equations, preserve the structure of these first-order equations. Those are called today symplectic or canonical transformations, and Jacobi’s method of integration is precisely one of them.

It is not always recognized as it should be that HJ equations were also fundamental in the construction of quantum mechanics. The reading of Schrödinger [78], Fock [30], Dirac [20] and others until Feynman [28] makes abundantly clear that most of the new ideas in the field made use of HJ equations for the classical system to be “quantized”, or of some quantum deformation of them. There are at least two ways to express this deformation. On the one hand, one can exponentiate the $L^{2}$ wave function, call $S$ its complex exponent and look for the equation solved by $S$ (see [35]). When the system is a single particle in a scalar potential, one obtains the classical HJ equation with an additional Laplacian term and a factor $i\hbar$, representing the regularization expected from the quantization of the system. This complex factor is symptomatic of the basic quantum probability problem, at least for pure states. In a nutshell, it is the reason why Feynman’s diffusions, in his path integral approach, do not exist. On the other hand, there is a hydrodynamical interpretation of quantum mechanics, founded on the Madelung transform, a polar representation of the wave function whose real part is the square root of a probability density. The argument solves another deformation of the HJ equation. The geometry of this transform has been thoroughly investigated recently, highlighting its relations with optimal transport theory [49, 86].

However, the probabilistic content of quantum mechanics, especially for pure states, remained a vexing mathematical mystery right from its beginning, despite several interesting (but unsuccessful) attempts [69]. The current consensus is that regular probability theory and stochastic analysis have little or nothing to teach us about it. And, in particular, that all that can be saved from Feynman path integral theory is Wiener’s measure and perturbations of it by potential terms. This is the “Euclidean approach”, one of the starting points of mathematical quantum field theory.

In 1931, however, Schrödinger suggested, in a paper almost forgotten until the eighties [79] (but insightfully commented on by the probabilist S. Bernstein [11]), the existence of a completely different Euclidean approach to quantum dynamics. In short, a stochastic variational boundary value problem for probability densities characterizes optimal diffusions on a given time interval as those whose density is a product of two positive solutions of time-adjoint heat equations. This idea, revived and elaborated from 1986 [88], is known today as “Schrödinger’s problem” in the community of optimal transport, where it has proved to provide, among other results, very efficient regularization of fundamental problems of this field [58]. In fact, Schrödinger’s problem hinted toward the existence of a stochastic dynamical theory of processes, considerably more general than its initial quantum motivation. In it, various regularizations associated with the tools of stochastic calculus should play the role of those involved in quantum mechanics in Hilbert space, where the looked-for measures do not exist.

The variational side of the stochastic theory has been developed in the last decades, inspired by a number of results in stochastic optimal control [37, 29] and stochastic optimal transport [67]. In this context, the crucial role of the (second-order) Hamilton-Jacobi-Bellman (HJB) equation has been known for a long time. It provides the proper regularization of the (first-order) HJ equation needed to construct well-defined stochastic dynamical theories. In contrast, for instance, with the notion of viscosity solution, whose initial target was the study of the classical PDE, the HJB equation becomes central here as a natural stochastic deformation of the HJ equation, compatible with Itô’s calculus. It is worth mentioning that in any field such as AI or reinforcement learning, where HJB equations play a fundamental role [75], it is natural to expect that such a stochastic dynamical framework, built on them, should be of some interest.

The geometric, and especially, Hamiltonian side of the dynamical theory had resisted until now and constitutes the main contribution of this paper. It is our hope that it will be useful far beyond its initial motivation referred to, afterward, as its “inspirational examples”. In this sense, it can clearly be interpreted as a general contribution to stochastic geometric mechanics. More precisely, we are trying to answer the following questions:

  • Do we have any geometric interpretation of the Hamilton-Jacobi-Bellman equation? That is, can we derive the HJB equation from some sort of canonical transformations?

  • Can we formulate some variational problem that leads to an Euler-Lagrange equation which is equivalent to the HJB equation?

  • More systematically, can we develop some counterpart of Lagrangian and Hamiltonian mechanics that are associated with the HJB equation?

The first question indicates that canonical transformations should be somehow second-order, so that the corresponding symplectic and contact structures are also second-order. Meanwhile, the stochastic generalization of optimal control and optimal transport suggests that the variational problem of the second question should be formulated in a stochastic sense. Combining these hints, the third question amounts to seeking a new theory of geometric mechanics that integrates the stochastic and the second-order viewpoints.

The cornerstone of stochastic analysis, the well-known Itô’s formula, tells us that the generator of a diffusion process is a second-order differential operator. This provides a very natural way to connect the stochastics with the second-order. That is, in order to build a stochastic or second-order counterpart of geometric mechanics, we need to encode the rule of Itô’s formula into the geometric structures.

There is a theory named second-order differential geometry (“stochastic differential geometry” is also used by some authors but we would like to keep the original terminology), which was devised by L. Schwartz and P.-A. Meyer around 1980 [80, 81, 82, 64, 65], and later on developed by Belopolskaya and Dalecky [10], Gliklikh [34], Emery [25], etc. See [26] for a survey of this aspect. Compared with the theory of stochastic analysis on manifolds (or geometric stochastic analysis) developed by Itô [44, 45], Malliavin [62], Bismut [12], Elworthy [24], etc., which focuses on Stratonovich stochastic differential equations on classical geometric structures, like Riemannian manifolds, frame bundles and Lie groups, so that Leibniz’s rule is preserved, Schwartz’ second-order differential calculus alters the underlying geometric structures to include second-order Itô correction terms, and provides a broader picture, even though it loses Leibniz’s rule and is much less known.

In this paper, we will adopt the viewpoint of Schwartz–Meyer and enlarge their picture to develop a theory of stochastic geometric mechanics. We first give an equivalent and more intuitive description of the second-order tangent bundle by equivalence classes of diffusions, via Nelson’s mean derivatives. We then generalize this idea to construct stochastic jets, from which stochastic prolongation formulae are proved and the stochastic counterpart of Cartan symmetries is studied. The second-order cotangent bundle is also studied, which helps us to establish stochastic Hamiltonian mechanics. We formulate the stochastic Hamilton’s equations, a system of stochastic equations on the second-order cotangent bundle in terms of mean derivatives. By introducing the second-order symplectic structure and the mixed-order contact structure, we derive the second-order HJB equations via canonical transformations. Finally, we set up a stochastic variational problem on the space of diffusion processes, also in terms of mean derivatives. Two kinds of stochastic principles of least action are built: the stochastic Hamilton’s principle and the stochastic Maupertuis’s principle. Both of them yield a stochastic Euler-Lagrange equation. The equivalence between the stochastic Euler-Lagrange equation and the HJB equation is proved, which leads precisely to the equivalence between our stochastic variational problem and Schrödinger’s problem in optimal transport. Last but not least (actually vital), a stochastic Noether’s theorem is proved. It says that every symmetry of the HJB equation corresponds to a martingale that is exactly a conservation law in the stochastic sense. It should be observed, however, that the Schwartz-Meyer approach, together with that of Bismut [12], has also inspired a distinct, Stratonovich-type stochastic Hamiltonian framework [53] leading to a stochastic HJ equation [54], without relations with Schrödinger’s problem or optimal transport.

The key results of the present paper and the dependence among them are briefly expressed in the following diagram:

[Diagram: key results grouped by topic; dependence arrows not reproduced]

  • Stochastic symmetries: Stochastic jets; Stochastic prolongation formulae; Mixed-order Cartan symmetries

  • Stochastic Hamiltonian mechanics: Second-order symplectic structure; Mixed-order contact structure; Stochastic Hamilton’s equations; Global stochastic Hamilton’s equations; HJB equations

  • Stochastic Lagrangian mechanics: Stochastic stationary-action principles; Stochastic Maupertuis’s principle; Stochastic Hamilton’s principle; Schrödinger’s problem; Stochastic Euler-Lagrange equation; Stochastic Noether’s theorem

The organization of this paper is the following.

Chapter 2 is a summary of the theory of stochastic differential equations on manifolds, in the perspective appropriate to our goal. In particular, diffusions will be characterized by their mean and quadratic mean derivatives as in Nelson’s stochastic mechanics [69], although the resulting dynamical content of our theory will have very little to do with his. In this way, we are able to rewrite Itô SDEs on manifolds as ODE-like equations that have a better geometric nature. The notion of second-order tangent bundle answers the question: the drift parts of Itô SDEs are sections of what?

Chapter 3 is devoted to the notion of stochastic jets. In the same way as tangent vectors on $M$ are defined as equivalence classes of smooth curves through a given point and then generalized to higher-order cases to produce the notion of jets, a stochastic tangent vector is defined as an equivalence class of diffusions, so that the stochastic tangent bundle is isomorphic to the elliptic subbundle of the second-order tangent bundle. Stochastic jets are also constructed. This provides an intrinsic definition of the SDEs under consideration.

Chapter 4 illustrates the use of the above geometric formulation of SDEs for the study of their symmetries. Prolongations of $M$-valued diffusions are defined as new processes with values in the stochastic tangent bundle. Among all deterministic space-time transformations, bundle homomorphisms will be the only subclass to transform diffusions to diffusions. Total mean and quadratic derivatives are defined in conformity with the rules of Itô’s calculus. The prolongation of diffusions allows one to define symmetries of SDEs and their infinitesimal versions. Stochastic prolongation formulae are derived for infinitesimal symmetries, which yield determining equations for Itô SDEs.

In Chapter 5, the second-order cotangent bundle, as the dual bundle of the second-order tangent bundle, is defined and analyzed. The properties of the second-order differential operator, pushforwards and pullbacks are described. When time is involved, i.e., the base manifold is the product manifold $\mathbb{R}\times M$, the corresponding bundles are mixed-order tangent and cotangent bundles, where “mixed-order” means they are second-order in space but first-order in time. More about this topic, like mixed-order pushforwards and pullbacks, pushforwards and pullbacks by diffusions, and Lie derivatives, can be found in Appendix A. A generalized notion of stochastic Cartan distribution and its symmetries are discussed in Appendix B, based on the mixed-order contact structure.

The point of Chapter 6 is to use the tools developed before in the construction of stochastic Hamiltonian mechanics, which is one of the main goals of the paper. One of our inspirational examples will be the one underlying the dynamical content of Schrödinger’s problem. By analogy with the Poincaré $1$-form in the cotangent bundle of classical mechanics and its associated symplectic form, one can construct counterparts in the second-order cotangent bundle. Using the canonical second-order symplectic form on second-order cotangent bundles, one defines second-order symplectomorphisms. The generalizations of classical Hamiltonian vector fields become second-order operators, for a given real-valued Hamiltonian function on the second-order cotangent bundle. The resulting stochastic Hamiltonian system involves pairs of extra equations compared with their classical versions. Bernstein’s reciprocal processes inspired by Schrödinger’s problem are described in this framework, corresponding to a large class of second-order Hamiltonians on Riemannian manifolds. A mixed-order contact structure describes time-dependent stochastic Hamiltonian systems. The last section of this chapter is devoted to canonical transformations preserving the form of stochastic Hamilton’s equations. The corresponding generating function satisfies the Hamilton-Jacobi-Bellman equation.

Chapter 7 treats the stochastic version of classical Lagrangian mechanics on Riemannian manifolds. Itô’s stochastic deformation of the classical notion of parallel displacement is recalled. Another one, called damped parallel displacement in the mathematical literature, involving the Ricci tensor, is also indicated. Each of these displacements corresponds to a mean covariant derivative along diffusions. The action functional is defined as the expectation of the Lagrangian, and the stochastic Euler-Lagrange equation involves the damped mean covariant derivative. The dynamics of Schrödinger’s problem is, again, used as illustration. The equivalence between stochastic Hamilton’s equations on Riemannian manifolds and the stochastic Euler-Lagrange one, as well as the HJB equation, is derived via the Legendre transform. Relations with stochastic control are also mentioned. The chapter ends with the stochastic Noether’s theorem. The stochastic version of Maupertuis’s principle, as the twin of the stochastic Hamilton’s principle, is left to Appendix C.

We end the introduction with a list of notations and abbreviations frequently used in the paper, for the reader’s convenience.

1.1 List of main notations and abbreviations

HJB equation Hamilton-Jacobi-Bellman equation
MDE Mean differential equations
2nd-order Second-order
SDE Stochastic differential equations
S-EL equation Stochastic Euler-Lagrange equation
S-H equations Stochastic Hamilton’s equations
$A$    A general second-order differential operator or second-order vector field
$A^{X}$    Generator of the diffusion $X$
$\circ\,d$    Stratonovich stochastic differential
$d$    Exterior differential on the manifold $M$, or Itô stochastic differential
d    Linear operator extended from the exterior differential on the tangent bundle $TM$
$d^{2}$    Second-order differential on $M$
$d^{\circ}$    Mixed-order differential on $\mathbb{R}\times M$
$d_{x}$    Horizontal differential on the tangent bundle $TM$ or cotangent bundle $T^{*}M$
$d_{\dot{x}}$    Vertical differential on $TM$
$(DX,QX)$, $D_{\nabla}X$, $Q(X,Y)$    Mean derivatives
$\mathbf{D}_{t},\mathbf{Q}_{t}$    Total mean derivatives
$\frac{\mathbf{D}}{dt},\frac{\overline{\mathbf{D}}}{dt}$    Mean covariant derivative and damped mean covariant derivative
$\Delta$, $\Delta_{\text{LD}}$    Connection Laplacian and Laplace-de Rham operator
$F^{S}_{*},F^{S*}$    Second-order pushforward and pullback of a smooth map $F:M\to N$
$F^{R}_{*},F^{R*}$    Mixed-order pushforward and pullback of $F$
$\Gamma$    Christoffel symbols or stochastic parallel displacement
$\overline{\Gamma}$    Damped parallel displacement
$I_{t}(M),I_{(t,q)}(M),I_{(t,q)}^{T,\mu}(M)$    Various sets of $M$-valued diffusions starting at time $t$
$j_{q}X,j_{q}^{\nabla}X,j_{(t,q)}X,j_{t}X$    Stochastic tangent vectors and stochastic jets
$\mathcal{L}$    Lie derivatives
$\nabla$    Linear connection, Levi-Civita connection, covariant derivative, or gradient on $M$
$\nabla^{2}$    Hessian operator
$\nabla_{p}$    Vertical gradient on $T^{*}M$
$(\Omega,\mathcal{F},\mathbf{P})$    Probability space $\Omega$ with $\sigma$-field $\mathcal{F}$ and probability measure $\mathbf{P}$
$\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$, $\{\mathcal{F}_{t}\}_{t\in\mathbb{R}}$    Past (nondecreasing) filtration and future (nonincreasing) filtration
$\frac{\partial}{\partial t},\partial_{t}$    Differential operator with respect to coordinate $t$
$\frac{\partial}{\partial x^{i}},\partial_{i}$    Differential operator with respect to coordinate $x^{i}$
$\frac{\partial^{2}}{\partial x^{j}\partial x^{k}},\partial_{jk}$    Second-order differential operator with respect to coordinates $x^{j}$ and $x^{k}$
$\frac{\partial}{\partial p_{i}},\partial_{p_{i}}$    Differential operator with respect to coordinate $p_{i}$
$R$, $\mathrm{Ric}$    Riemann curvature tensor and Ricci $(1,1)$-tensor
$\mathcal{T}^{O}M,\mathcal{T}^{E}M$    Second-order tangent bundle and second-order elliptic tangent bundle
$\mathcal{T}^{S}M$    Stochastic tangent bundle
$\mathcal{T}^{S*}M$    Second-order cotangent bundle
$V$    A general vector field
$(x^{i},D^{i}x,Q^{jk}x)$    Canonical coordinates on $\mathcal{T}^{S}M$
$(x^{i},p_{i},o_{jk})$    Canonical coordinates on $\mathcal{T}^{S*}M$
$X_{*},X^{*}$    Pushforward and pullback of the diffusion $X$
$\mathbf{X}$    A horizontal diffusion valued on a general bundle $E$ or on $\mathcal{T}^{S*}M$
$\mathbb{X}$    A horizontal diffusion valued on $T^{*}M$

2 Stochastic differential equations on manifolds

In this chapter, we will study several types of stochastic differential equations on manifolds which are weakly equivalent to Itô SDEs. We start with a $d$-dimensional smooth manifold $M$ and a probability space $(\Omega,\mathcal{F},\mathbf{P})$, and equip the latter with a filtration $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$, i.e., a family of nondecreasing sub-$\sigma$-fields of $\mathcal{F}$. We call $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$ a past filtration. Unless otherwise specified, the manifold $M$ will not be endowed with any structures other than the smooth structure. In some cases, it will be endowed with a linear connection, a Riemannian metric, or a Levi-Civita connection.

Recall from [40, Definition 1.2.1] that by an $M$-valued (forward) $\{\mathcal{P}_{t}\}$-semimartingale, we mean a $\{\mathcal{P}_{t}\}$-adapted continuous $M$-valued process $X=\{X(t)\}_{t\in[t_{0},\tau)}$, where $t_{0}\in\mathbb{R}$ and $\tau$ is a $\{\mathcal{P}_{t}\}$-stopping time satisfying $t_{0}<\tau\leq+\infty$, such that $f(X)$ is a real-valued $\{\mathcal{P}_{t}\}$-semimartingale on $[t_{0},\tau)$ for all $f\in C^{\infty}(M)$. The stopping time $\tau$ is called the lifetime of $X$. If we adopt the convention to introduce the one-point compactification of $M$ by $M^{*}:=M\cup\{\partial_{M}\}$, then the process $X$ can be extended to the whole time line $[t_{0},+\infty)$ by setting $X(t)=\partial_{M}$ for all $t\geq\tau$. The point $\partial_{M}$ is often called the cemetery point in the context of Markovian theory.

2.1 Itô SDEs on manifolds

Given $N+1$ time-dependent vector fields $b,\sigma_{r}$, $r=1,\cdots,N$, on $M$, one can introduce a Stratonovich SDE in local coordinates, which has the same form as in Euclidean space [40, Section 1.2]. The form of Stratonovich SDEs on $M$ is invariant under changes of coordinates, as Stratonovich stochastic differentials obey Leibniz’s rule.

However, for Itô stochastic differentials this is not the case because of Itô’s formula. Hence, we cannot directly write a Euclidean form of Itô SDE on $M$ in local coordinates, since it is no longer invariant under changes of coordinates. Indeed, a change of coordinates will always produce an additional term. To balance this term, a common way is to add a correction term to the drift part of the Euclidean form of Itô SDE, by taking advantage of a linear connection. More precisely, under local coordinates $(x^{i})$, we consider the following Itô SDE [34, Section 7.1, 7.2]:

$$dX^{i}(t)=\left[b^{i}(t,X(t))-\frac{1}{2}\sum_{r=1}^{N}\Gamma^{i}_{jk}(X(t))(\sigma^{j}_{r}\sigma^{k}_{r})(t,X(t))\right]dt+\sigma^{i}_{r}(t,X(t))dW^{r}(t), \tag{2.1}$$

where $(\Gamma^{i}_{jk})$ is the family of Christoffel symbols for a given linear connection $\nabla$ on $TM$. When conditioning on $\{X(t)=q\}$ and taking $(x^{i})$ as normal coordinates at $q\in M$, (2.1) reduces to the Euclidean form, since at $q$,

$$\sum_{r=1}^{N}\Gamma^{i}_{jk}\sigma^{j}_{r}\sigma^{k}_{r}=\frac{1}{2}\sum_{r=1}^{N}\left(\Gamma^{i}_{jk}+\Gamma^{i}_{kj}\right)\sigma^{j}_{r}\sigma^{k}_{r}=0. \tag{2.2}$$

If we denote

$$\sigma\circ\sigma^{*}:=\sum_{r=1}^{N}\sigma_{r}\otimes\sigma_{r}=\sum_{r=1}^{N}\sigma_{r}^{j}\sigma_{r}^{k}\frac{\partial}{\partial{x^{j}}}\otimes\frac{\partial}{\partial{x^{k}}},$$

then clearly $\sigma\circ\sigma^{*}$ is a symmetric and positive semi-definite $(2,0)$-tensor field. We also introduce formally a modified drift $\mathfrak{b}$ which has the following coordinate expression

$$\mathfrak{b}^{i}=b^{i}-\frac{1}{2}\sum_{r=1}^{N}\Gamma^{i}_{jk}\sigma^{j}_{r}\sigma^{k}_{r}. \tag{2.3}$$

We change the coordinate chart from $(U,(x^{i}))$ to $(V,(\tilde{x}^{j}))$ with $U\cap V\neq\emptyset$. Since each $\sigma_{r}$ transforms as a vector, we apply the change-of-coordinate formula for Christoffel symbols (e.g., [50, Proposition III.7.2]) to derive that

$$\begin{split}\Gamma^{i}_{jk}\sigma^{j}_{r}\sigma^{k}_{r}&=\left(\tilde{\Gamma}^{l}_{mn}\frac{\partial\tilde{x}^{m}}{\partial x^{j}}\frac{\partial\tilde{x}^{n}}{\partial x^{k}}\frac{\partial x^{i}}{\partial\tilde{x}^{l}}+\frac{\partial^{2}\tilde{x}^{l}}{\partial x^{j}\partial x^{k}}\frac{\partial x^{i}}{\partial\tilde{x}^{l}}\right)\sigma^{j}_{r}\sigma^{k}_{r}\\ &=\left(\tilde{\Gamma}^{l}_{mn}\tilde{\sigma}^{m}_{r}\tilde{\sigma}^{n}_{r}+\frac{\partial^{2}\tilde{x}^{l}}{\partial x^{j}\partial x^{k}}\sigma^{j}_{r}\sigma^{k}_{r}\right)\frac{\partial x^{i}}{\partial\tilde{x}^{l}}.\end{split}$$

It follows that the coefficients of the modified drift $\mathfrak{b}$ in (2.3) transform as

$$\begin{split}\tilde{\mathfrak{b}}^{l}&=\tilde{b}^{l}-\frac{1}{2}\sum_{r=1}^{N}\tilde{\Gamma}^{l}_{mn}\tilde{\sigma}^{m}_{r}\tilde{\sigma}^{n}_{r}\\ &=b^{i}\frac{\partial\tilde{x}^{l}}{\partial x^{i}}-\frac{1}{2}\sum_{r=1}^{N}\left(\Gamma^{i}_{jk}\frac{\partial\tilde{x}^{l}}{\partial x^{i}}-\frac{\partial^{2}\tilde{x}^{l}}{\partial x^{j}\partial x^{k}}\right)\sigma^{j}_{r}\sigma^{k}_{r}\\ &=\mathfrak{b}^{i}\frac{\partial\tilde{x}^{l}}{\partial x^{i}}+\frac{1}{2}\frac{\partial^{2}\tilde{x}^{l}}{\partial x^{j}\partial x^{k}}\sum_{r=1}^{N}\sigma^{j}_{r}\sigma^{k}_{r}.\end{split} \tag{2.4}$$

Therefore, $\mathfrak{b}$ is not a vector field, as it does not transform pointwise as a vector.

Finally, using Itô’s formula, we derive the transformation of (2.1) as follows:

$$\begin{split}d\tilde{x}^{l}&=\frac{\partial\tilde{x}^{l}}{\partial x^{i}}dx^{i}+\frac{1}{2}\frac{\partial^{2}\tilde{x}^{l}}{\partial x^{j}\partial x^{k}}d[x^{j},x^{k}]\\ &=\left[\frac{\partial\tilde{x}^{l}}{\partial x^{i}}\left(b^{i}-\frac{1}{2}\sum_{r=1}^{N}\Gamma^{i}_{jk}\sigma^{j}_{r}\sigma^{k}_{r}\right)+\frac{1}{2}\sum_{r=1}^{N}\frac{\partial^{2}\tilde{x}^{l}}{\partial x^{j}\partial x^{k}}\sigma^{j}_{r}\sigma^{k}_{r}\right]dt+\frac{\partial\tilde{x}^{l}}{\partial x^{i}}\sigma^{i}_{r}dW^{r}\\ &=\left(\tilde{b}^{l}-\frac{1}{2}\sum_{r=1}^{N}\tilde{\Gamma}^{l}_{mn}\tilde{\sigma}^{m}_{r}\tilde{\sigma}^{n}_{r}\right)dt+\tilde{\sigma}^{l}_{r}dW^{r},\end{split}$$

where the bracket $[\cdot,\cdot]$ on the right-hand side (RHS) of the first equality denotes the quadratic variation. This shows that equation (2.1) is indeed invariant under changes of coordinates.
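
The transformation rule (2.4) can be checked numerically in one dimension. The following Python sketch is only an illustration under our own assumptions: $d=N=1$, a chart change whose first and second derivatives at the point of interest are passed in as plain numbers, and helper names (`modified_drift`, `christoffel_in_old_chart`, `transform_modified_drift`) that are hypothetical and do not come from the paper.

```python
def modified_drift(b, gamma, sigma):
    """One-dimensional form of (2.3): frak_b = b - (1/2) * Gamma * sigma^2."""
    return b - 0.5 * gamma * sigma ** 2


def christoffel_in_old_chart(gamma_new, dphi, d2phi):
    """One-dimensional change-of-coordinate rule for Christoffel symbols.

    With x_new = phi(x), the general rule reduces to
    Gamma = Gamma_new * phi' + phi'' / phi'.
    """
    return gamma_new * dphi + d2phi / dphi


def transform_modified_drift(frak_b, sigma, dphi, d2phi):
    """One-dimensional form of (2.4): frak_b_new = frak_b * phi' + (1/2) * phi'' * sigma^2."""
    return frak_b * dphi + 0.5 * d2phi * sigma ** 2


if __name__ == "__main__":
    # Assumed example: chart change phi(x) = log(x), evaluated at x = 2.
    x = 2.0
    dphi, d2phi = 1.0 / x, -1.0 / x ** 2
    b, sigma, gamma_new = 0.3, 0.7, 0.5  # arbitrary illustrative values

    gamma_old = christoffel_in_old_chart(gamma_new, dphi, d2phi)
    # Left: modified drift computed directly in the new chart,
    # where b and sigma transform as vectors.
    direct = modified_drift(b * dphi, gamma_new, sigma * dphi)
    # Right: modified drift computed in the old chart, then pushed through (2.4).
    via_rule = transform_modified_drift(
        modified_drift(b, gamma_old, sigma), sigma, dphi, d2phi
    )
    print(direct, via_rule)
```

The two printed numbers should agree up to rounding, which is the one-dimensional content of the invariance statement above: the second-derivative term in (2.4) is exactly what the Christoffel correction absorbs.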

Remark 2.1.

One can regard $\sigma=(\sigma_{r})_{r=1}^{N}\in(\mathbb{R}^{N})^{*}\otimes\mathfrak{X}(M)$ as an $(\mathbb{R}^{N})^{*}$-valued vector field on $M$. In this way, the pair $(b,\sigma)$ is called an Itô vector field in [34, Chapter 7], while the pair $(\mathfrak{b},\sigma)$ is called an Itô equation therein.

Now we present the definition of weak solutions to (2.1).

Definition 2.2 (Weak solutions to Itô SDEs).

Given a linear connection on $M$, a weak solution of the Itô SDE (2.1) is a triple $(X,W)$, $(\Omega,\mathcal{F},\mathbf{P})$, $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$, where

  • (i) $(\Omega,\mathcal{F},\mathbf{P})$ is a probability space, and $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$ is a past (i.e., nondecreasing) filtration of $\mathcal{F}$ satisfying the usual conditions,

  • (ii) $X=\{X(t)\}_{t\in[t_{0},\tau)}$ is a continuous, $\{\mathcal{P}_{t}\}$-adapted $M$-valued process with $\{\mathcal{P}_{t}\}$-stopping time $\tau>t_{0}$, $W$ is an $N$-dimensional $\{\mathcal{P}_{t}\}$-Brownian motion, and

  • (iii) for every $q\in M$, $t\geq t_{0}$ and any coordinate chart $(U,(x^{i}))$ of $q$, it holds under the conditional probability $\mathbf{P}(\cdot|X(t_{0})=q)$ that almost surely in the event $\{X(t)\in U\}$,

    $$X^{i}(t)=X^{i}(t_{0})+\int_{t_{0}}^{t}\left(b^{i}(s,X(s))-\frac{1}{2}\sum_{r=1}^{N}\Gamma^{i}_{jk}(X(s))(\sigma^{j}_{r}\sigma^{k}_{r})(s,X(s))\right)ds+\int_{t_{0}}^{t}\sigma^{i}_{r}(s,X(s))dW^{r}(s).$$
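
The integral equation in item (iii) can be discretized inside a single chart. The sketch below uses the standard Euler-Maruyama scheme, which is our choice for illustration and is not a construction from the paper. It assumes $d=N=1$, that the path stays in one chart on the whole interval, and that the coefficients are supplied as plain Python callables; the function name `euler_maruyama_chart` and its parameters are hypothetical.

```python
import math
import random


def euler_maruyama_chart(b, sigma, gamma, x0, t0, t1, n_steps, rng):
    """Simulate one path of the one-dimensional chart form of (2.1) on [t0, t1].

    b(t, x), sigma(t, x): coefficients of the vector fields in the chart.
    gamma(x): the single Christoffel symbol of the connection in the chart.
    rng: a random.Random instance used to draw Brownian increments.
    Returns the approximate chart coordinate of the path at time t1.
    """
    h = (t1 - t0) / n_steps
    t, x = t0, x0
    for _ in range(n_steps):
        s = sigma(t, x)
        # Drift of (2.1): b - (1/2) * Gamma * sigma^2, i.e. the modified drift (2.3).
        drift = b(t, x) - 0.5 * gamma(x) * s * s
        # Brownian increment over a step of length h.
        dw = rng.gauss(0.0, math.sqrt(h))
        x += drift * h + s * dw
        t += h
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed toy data: unit drift, unit noise, flat connection.
    end_point = euler_maruyama_chart(
        lambda t, x: 1.0, lambda t, x: 1.0, lambda x: 0.0,
        x0=0.0, t0=0.0, t1=1.0, n_steps=1000, rng=rng,
    )
    print(end_point)
```

When `sigma` vanishes identically the scheme reduces to the explicit Euler method for the ODE $\dot{x}=b$, which gives a quick sanity check of the drift handling.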

Definition 2.3 (Uniqueness in law).

We say that uniqueness in the sense of probability law holds for the Itô SDE (2.1) if, for any two weak solutions $(X,W)$, $(\Omega,\mathcal{F},\mathbf{P})$, $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$, and $(\hat{X},\hat{W})$, $(\hat{\Omega},\hat{\mathcal{F}},\hat{\mathbf{P}})$, $\{\hat{\mathcal{P}}_{t}\}_{t\in\mathbb{R}}$ with the same initial data, i.e., $\mathbf{P}(X(0)=x_{0})=\hat{\mathbf{P}}(\hat{X}(0)=x_{0})=1$, the two processes $X$ and $\hat{X}$ have the same law.

Note that it is possible to change $\sigma$ and $W$ in the Itô SDE (2.1) but keep the same weak solution in law. In other words, the form of (2.1) does not correspond uniquely to its weak solution in law. For this reason, we will reformulate SDEs in a fashion that makes them look more like ODEs and gives them a better geometric nature. Moreover, we will see that it is the pair $(\mathfrak{b},\sigma\circ\sigma^{*})$ that corresponds uniquely to the weak solution of (2.1).
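
One elementary way to see this non-uniqueness is to rotate the noise: replacing $\sigma$ by $\sigma R$ and $W$ by $R^{\top}W$ for an orthogonal matrix $R$ leaves $\sigma\,dW$ unchanged, $R^{\top}W$ is again a Brownian motion, and $\sigma\circ\sigma^{*}$ is the same. The Python sketch below checks the last point numerically. It is an illustration under our own assumptions ($N=2$, coefficients frozen at one point and stored as nested lists); the helper names `diffusion_tensor` and `rotate_noise` are hypothetical.

```python
import math


def diffusion_tensor(sigma):
    """Return the d x d matrix with entries sum_r sigma[j][r] * sigma[k][r].

    sigma is a d x N nested list: sigma[j][r] is the j-th component of sigma_r.
    """
    d, n = len(sigma), len(sigma[0])
    return [
        [sum(sigma[j][r] * sigma[k][r] for r in range(n)) for k in range(d)]
        for j in range(d)
    ]


def rotate_noise(sigma, theta):
    """Right-multiply a d x 2 coefficient matrix by the planar rotation R(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[row[0] * c + row[1] * s, -row[0] * s + row[1] * c] for row in sigma]


if __name__ == "__main__":
    sigma = [[1.0, 2.0], [0.5, -1.0]]      # arbitrary illustrative coefficients
    sigma_rot = rotate_noise(sigma, 0.7)   # a different sigma ...
    print(diffusion_tensor(sigma))
    print(diffusion_tensor(sigma_rot))     # ... with the same diffusion tensor
```

The two coefficient matrices differ entry by entry, yet the printed tensors coincide up to rounding, consistent with the statement that only $\sigma\circ\sigma^{*}$, and not $\sigma$ itself, is determined by the law.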

2.2 Mean derivatives and mean differential equations on manifolds

In this part, we will recall the definitions of Nelson’s mean derivatives and extend them to MM-valued processes. In Nelson’s stochastic mechanics [69], the probability space (Ω,,𝐏)(\Omega,\mathcal{F},\mathbf{P}) is equipped with two different filtrations. The first one is just an usual nondecreasing filtration {𝒫t}t\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}, a past filtration. The second is a family of nonincreasing sub-σ\sigma-fields of \mathcal{F}, which is denoted by {t}t\{\mathcal{F}_{t}\}_{t\in\mathbb{R}} and called a future filtration. For an d\mathbb{R}^{d}-valued process {X(t)}tI\{X(t)\}_{t\in I}, its forward mean derivative DXDX and forward quadratic mean derivative QXQX are defined by conditional expectations as follows:

DX(t)=limϵ0+𝐄[X(t+ϵ)X(t)ϵ|𝒫t],QX(t)=limϵ0+𝐄[(X(t+ϵ)X(t))(X(t+ϵ)X(t))ϵ|𝒫t],DX(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{X(t+\epsilon)-X(t)}{\epsilon}\bigg{|}\mathcal{P}_{t}\right],\quad QX(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{(X(t+\epsilon)-X(t))\otimes(X(t+\epsilon)-X(t))}{\epsilon}\bigg{|}\mathcal{P}_{t}\right],

Their backward versions, i.e., the backward mean derivative and backward quadratic mean derivative, are defined as follows:

\[
\overleftarrow{D}X(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{X(t)-X(t-\epsilon)}{\epsilon}\,\bigg|\,\mathcal{F}_{t}\right],\quad\overleftarrow{Q}X(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{(X(t)-X(t-\epsilon))\otimes(X(t)-X(t-\epsilon))}{\epsilon}\,\bigg|\,\mathcal{F}_{t}\right].
\]

In the present paper, we will focus only on the “forward” case, so that only the past filtration $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$ will be invoked. The “backward” case is analogous and every part of this paper can have its “backward” counterpart (cf. [89]).
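The forward mean derivatives can be approximated numerically. The following is a minimal sketch under stated assumptions, not a method from the paper: the process is a one-dimensional Markov diffusion $dX=b(X)\,dt+\sigma(X)\,dW$, conditioning on $\mathcal{P}_{t}$ is replaced by fixing $X(t)=x_{0}$, and the increment over $[t,t+\epsilon]$ is replaced by a single Euler-Maruyama step. The function name, coefficients and parameters are hypothetical.

```python
import random

def estimate_mean_derivatives(b, sigma, x0, eps=0.01, n=200_000, seed=0):
    """Monte Carlo estimate of (DX(t), QX(t)) for dX = b dt + sigma dW at X(t) = x0.

    One Euler-Maruyama step of size eps stands in for X(t + eps) - X(t);
    this is an assumption of the sketch, not an exact sampler.
    """
    rng = random.Random(seed)
    sum_dx = 0.0
    sum_dx2 = 0.0
    for _ in range(n):
        dx = b(x0) * eps + sigma(x0) * rng.gauss(0.0, eps ** 0.5)
        sum_dx += dx          # accumulates increments for DX
        sum_dx2 += dx * dx    # accumulates squared increments for QX
    return sum_dx / (n * eps), sum_dx2 / (n * eps)

# Hypothetical Ornstein-Uhlenbeck coefficients: b(x) = -x, sigma = 0.5.
drift, quad = estimate_mean_derivatives(lambda x: -x, lambda x: 0.5, x0=1.0)
```

With these coefficients one expects `drift` to be close to $b(x_{0})=-1$ and `quad` close to $\sigma^{2}=0.25$, up to Monte Carlo error and a bias of order $\epsilon$.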

Denote by $\mathrm{Sym}^{2}(TM)$ (and $\mathrm{Sym}^{2}_{+}(TM)$) the fiber bundle of symmetric (and respectively, symmetric positive semi-definite) $(2,0)$-tensors on $M$. Now we define quadratic mean derivatives for $M$-valued semimartingales, cf. [34, Chapter 9].

Definition 2.4 (Quadratic mean derivatives).

The (forward) quadratic mean derivative of the $M$-valued semimartingale $\{X(t)\}_{t\in[t_{0},\tau)}$ is a $\mathrm{Sym}^{2}_{+}(TM)$-valued process $QX$ on $[t_{0},\tau)$, whose value at time $t\in[t_{0},\tau)$ in any coordinate chart $(U,(x^{i}))$ and in the event $\{X(t)\in U\}$ is given by

\[
(QX)^{ij}(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{(X^{i}(t+\epsilon)-X^{i}(t))(X^{j}(t+\epsilon)-X^{j}(t))}{\epsilon}\,\bigg|\,\mathcal{P}_{t}\right], \tag{2.5}
\]

where the limits are assumed to exist in $L^{1}(\Omega,\mathcal{F},\mathbf{P})$.

More generally, we can define the (forward) quadratic mean derivative for two $M$-valued semimartingales $X$ and $Y$ in local coordinates by

\[
(Q(X,Y))^{ij}(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{(X^{i}(t+\epsilon)-X^{i}(t))(Y^{j}(t+\epsilon)-Y^{j}(t))}{\epsilon}\,\bigg|\,\mathcal{P}_{t}\right].
\]

Due to Itô’s formula for semimartingales, $QX(t)$ does transform as a $(2,0)$-tensor and is obviously symmetric, so that the definition is independent of the choice of $U$. However, the formal limit of $\mathbf{E}[\frac{1}{\epsilon}(X^{i}(t+\epsilon)-X^{i}(t))|\mathcal{P}_{t}]$ as $\epsilon\to 0^{+}$, in arbitrary coordinates $(x^{i})$, no longer transforms as a vector, as can be guessed from (2.4). In order to turn it into a vector we need to specify a coordinate system. A natural choice is the normal coordinate system. For this purpose, we endow $M$ with a linear connection $\nabla$, which determines a normal coordinate system near each point of $M$.

Definition 2.5 ($\nabla$-mean derivatives).

Given a linear connection $\nabla$ on $M$, the (forward) $\nabla$-mean derivative of the $M$-valued semimartingale $\{X(t)\}_{t\in[t_{0},\tau)}$ is a $TM$-valued process $D_{\nabla}X$ on $[t_{0},\tau)$, whose value at time $t\in[t_{0},\tau)$ is defined in normal coordinates $(x^{i})$ on the normal neighborhood $U$ of $q\in M$ and under the conditional probability $\mathbf{P}(\cdot|X(t)=q)$ as follows:

\[
(D_{\nabla}X)^{i}(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{X^{i}(t+\epsilon)-X^{i}(t)}{\epsilon}\,\bigg|\,\mathcal{P}_{t}\right],
\]

where the limits are assumed to exist in $L^{1}(\Omega,\mathcal{F},\mathbf{P})$.

As we force $D_{\nabla}X(t)$ to be vector-valued by definition, its coordinate expression under any other coordinate system can be calculated via Leibniz’s rule. Let us stress that the notation $D_{\nabla}$ should not be confused with that of covariant derivatives in geometry.

Now we formally take forward mean derivatives in the Itô SDE (2.1), and note that the correction term in the modified drift involving Christoffel symbols vanishes by (2.2). Then, we get an ODE-like system:

\[
\left\{\begin{aligned} &D_{\nabla}X(t)=b(t,X(t)),\\ &QX(t)=(\sigma\circ\sigma^{*})(t,X(t)).\end{aligned}\right. \tag{2.6}
\]

We call equations (2.6) a system of mean differential equations (MDEs). Note that both the MDEs (2.6) and the Itô SDE (2.1) rely on linear connections on $M$.

Definition 2.6 (Solutions to MDEs).

Given a linear connection on $M$, a solution of MDEs (2.6) is a triple $X$, $(\Omega,\mathcal{F},\mathbf{P})$, $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$, where

  • (i)

    $(\Omega,\mathcal{F},\mathbf{P})$ is a probability space, and $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$ is a past filtration of $\mathcal{F}$ satisfying the usual conditions,

  • (ii)

    $X=\{X(t)\}_{t\in[t_{0},\tau)}$ is a continuous, $\{\mathcal{P}_{t}\}$-adapted $M$-valued semimartingale whose lifetime is a $\{\mathcal{P}_{t}\}$-stopping time $\tau>t_{0}$, and

  • (iii)

    the $\nabla$-mean derivative and quadratic mean derivative of $X$ exist and satisfy (2.6).

2.3 Second-order operators and martingale problems

Definition 2.7 (Second-order operators).

A second-order operator on $M$ is a linear operator $A:C^{\infty}(M)\to C^{\infty}(M)$, which has the following expression in a coordinate chart $(U,(x^{i}))$,

\[
Af=A^{i}\frac{\partial f}{\partial x^{i}}+A^{ij}\frac{\partial^{2}f}{\partial x^{i}\partial x^{j}},\quad f\in C^{\infty}(M), \tag{2.7}
\]

where $(A^{ij})$ is a symmetric $(2,0)$-tensor field, and the expression is required to be invariant under changes of coordinates. If $(A^{ij})$ is positive semi-definite, then we say the second-order operator $A$ is elliptic; if $(A^{ij})$ is positive definite, we say $A$ is nondegenerate elliptic.

There is a coordinate-free definition of second-order operators. A linear map $A_{q}:C^{\infty}(M)\to\mathbb{R}$ is called a second-order derivation at $q\in M$, if there is a symmetric $(2,0)$-tensor $\Gamma_{A_{q}}$ at $q$ such that $A_{q}(fg)=f(q)A_{q}g+g(q)A_{q}f+(df\otimes dg)(\Gamma_{A_{q}})$ for all $f,g\in C^{\infty}(M)$. Then, a second-order operator is nothing but a smooth field of second-order derivations. From this, we see that for $A$ in (2.7), $A^{i}=A(x^{i})$, $A^{ij}=A(x^{i}x^{j})-x^{i}A(x^{j})-x^{j}A(x^{i})$, and

\[
\Gamma_{A}=A^{ij}\frac{\partial}{\partial x^{i}}\otimes\frac{\partial}{\partial x^{j}}. \tag{2.8}
\]

We call $\Gamma_{A}$ the squared field operator (originally “opérateur carré du champ”) associated with $A$. We also denote $\Gamma_{A}(f,g):=(df\otimes dg)(\Gamma_{A})$. Clearly, for a classical vector field $V$, $\Gamma_{V}\equiv 0$ by Leibniz’s rule.

It is easy to verify from the coordinate-change invariance that the coefficients $A^{i}$ and $A^{ij}$ transform under the change of coordinates from $(x^{i})$ to $(\tilde{x}^{j})$ by the following rule (e.g., [43, Section V.4]),

\[
\tilde{A}^{i}=\frac{\partial\tilde{x}^{i}}{\partial x^{j}}A^{j}+\frac{\partial^{2}\tilde{x}^{i}}{\partial x^{j}\partial x^{k}}A^{jk},\quad\tilde{A}^{ij}=\frac{\partial\tilde{x}^{i}}{\partial x^{k}}\frac{\partial\tilde{x}^{j}}{\partial x^{l}}A^{kl}. \tag{2.9}
\]

The formal generator of the Itô SDE (2.1) is given by

\[
A^{X}_{t}=\mathfrak{b}^{i}(t)\frac{\partial}{\partial x^{i}}+\frac{1}{2}\sum_{r=1}^{N}\sigma^{i}_{r}(t)\sigma^{j}_{r}(t)\frac{\partial^{2}}{\partial x^{i}\partial x^{j}}, \tag{2.10}
\]

which is a time-dependent second-order elliptic operator due to the change-of-coordinate formula (2.4).
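The transformation rule (2.9) can be checked on a familiar one-dimensional case. The sketch below is an illustration with hypothetical names: it applies (2.9) to the generator $\frac{1}{2}\frac{d^{2}}{dx^{2}}$ of Brownian motion (so $A^{1}=0$, $A^{11}=\frac{1}{2}$) under the change of coordinate $\tilde{x}=e^{x}$, which should give the generator $\frac{1}{2}\tilde{x}\frac{d}{d\tilde{x}}+\frac{1}{2}\tilde{x}^{2}\frac{d^{2}}{d\tilde{x}^{2}}$ of geometric Brownian motion.

```python
import math

def transform_coefficients(a1, a11, dphi, d2phi):
    """Apply rule (2.9) in dimension one for a new coordinate x~ = phi(x).

    a1, a11 : coefficients A^1 and A^{11} in the old coordinate x
    dphi    : phi'(x), the first derivative of the coordinate change
    d2phi   : phi''(x), the second derivative of the coordinate change
    """
    new_a1 = dphi * a1 + d2phi * a11   # first-order part picks up a second-order term
    new_a11 = dphi * dphi * a11        # second-order part transforms tensorially
    return new_a1, new_a11

x = 0.7                                # arbitrary sample point
x_tilde = math.exp(x)                  # phi(x) = exp(x), so phi' = phi'' = exp(x)
new_a1, new_a11 = transform_coefficients(0.0, 0.5, math.exp(x), math.exp(x))
```

The extra term `d2phi * a11` is exactly what prevents the first-order coefficients from transforming as a vector.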

Denote by $\mathcal{C}_{t_{0}}$ the subspace of $C([t_{0},\infty),M^{*})$ consisting of all paths always staying in $M$ or eventually stopped at $\partial_{M}$. That is, $\omega\in\mathcal{C}_{t_{0}}$ if and only if there exists $\tau(\omega)\in(t_{0},\infty]$ such that $\omega(t)\in M$ for $t\in[t_{0},\tau(\omega))$ and $\omega(t)=\partial_{M}$ for $t\in[\tau(\omega),\infty)$. Let $\mathcal{B}(\mathcal{C}_{t_{0}})$ be the $\sigma$-field generated by Borel cylinder sets. Let $X(t):\mathcal{C}_{t_{0}}\to M^{*}$, $X(t,\omega)=\omega(t)$, $t\geq t_{0}$, be the coordinate mapping. For each $t\in\mathbb{R}$, define a sub-$\sigma$-field by $\mathcal{B}_{t}=\sigma\{X(s):t_{0}\leq s\leq t_{0}\vee t\}$. Then $\{\mathcal{B}_{t}\}_{t\in\mathbb{R}}$ is a past filtration of $\mathcal{B}(\mathcal{C}_{t_{0}})$ and $\tau$ is a $\{\mathcal{B}_{t}\}$-stopping time.

Definition 2.8 (Martingale problems on manifolds, [40, Definition 1.3.1]).

Given a time-dependent second-order elliptic operator $A=(A_{t})_{t\geq t_{0}}$, a solution to the martingale problem associated with $A$ is a triple $X$, $(\Omega,\mathcal{F},\mathbf{P})$, $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$, where

  • (i)

    $(\Omega,\mathcal{F},\mathbf{P})$ is a probability space, and $\{\mathcal{P}_{t}\}_{t\in\mathbb{R}}$ is a past filtration of $\mathcal{F}$ satisfying the usual conditions,

  • (ii)

    $X:\Omega\to\mathcal{C}_{t_{0}}$ is an $M^{*}$-valued $\{\mathcal{P}_{t}\}$-semimartingale, and

  • (iii)

    for every $f\in C^{\infty}(\mathbb{R}\times M)$, the process $M^{f,X}(t):=f(t,X(t))-f(t_{0},X(t_{0}))-\int_{t_{0}}^{t}(\frac{\partial}{\partial t}+A_{s})f(s,X(s))\,ds$, $t\in[t_{0},\tau(X))$, is a real-valued continuous $\{\mathcal{P}_{t}\}$-martingale.

The process $\{X(t)\}_{t\in[t_{0},\tau(X))}$ is called an $M$-valued $\{\mathcal{P}_{t}\}$-diffusion process with generator $A$ (or simply an $A$-diffusion).
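As a sanity check of condition (iii) in the simplest setting (an illustration added here, not taken from the source), take $M=\mathbb{R}$, $t_{0}=0$, $A=\frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}$, $X=W$ a standard Brownian motion and $f(t,x)=x^{2}$. The compensated process is then the familiar martingale $W(t)^{2}-t$, shifted by its initial value:

```latex
\[
\Big(\frac{\partial}{\partial t}+\frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}\Big)x^{2}=1,
\qquad
M^{f,X}(t)=W(t)^{2}-W(0)^{2}-\int_{0}^{t}1\,ds=W(t)^{2}-W(0)^{2}-t .
\]
```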

The uniqueness in the sense of probability law for both MDEs and martingale problems can be defined in a similar fashion to Definition 2.3. Note that unlike Itô SDEs or MDEs, the definition for martingale problems does not rely on linear connections.

When provided with a linear connection on $M$, one can see, in the same way as in Stroock and Varadhan’s theory (e.g., [48, Section 5.4]), that the existence of a solution to the martingale problem associated with $A^{X}=(A^{X}_{t})_{t\geq t_{0}}$ in (2.10) is equivalent to the existence of a weak solution to the Itô SDE (2.1), and also equivalent to the existence of a solution to MDEs (2.6); uniqueness in law for the three problems is also equivalent.

2.4 The second-order tangent bundle

As we have seen, the modified drift $\mathfrak{b}$ in (2.3) is not a vector field. Is $\mathfrak{b}$ a section (and, if so, of what bundle)? In fact, it is not a section of any bundle, as its change-of-coordinates formula (2.4) involves $\sigma$. But if we look at the formal generator $A^{X}$ in (2.10), or the pair $(\mathfrak{b},\sigma\circ\sigma^{*})$ of its coefficients, then we can construct a bundle whose structure group is governed by the change-of-coordinates formulae (2.9), so that the sections are just second-order operators.

We denote by $\mathrm{Sym}^{2}(\mathbb{R}^{d})$ the space of all symmetric $(2,0)$-tensors on $\mathbb{R}^{d}$, and by $\mathrm{Sym}^{2}_{+}(\mathbb{R}^{d})$ the subset of it consisting of all positive semi-definite $(2,0)$-tensors. Also denote by $\mathcal{L}(\mathbb{R}^{n},\mathbb{R}^{d})$ the space of all linear maps from $\mathbb{R}^{n}$ to $\mathbb{R}^{d}$.

Definition 2.9 (The second-order tangent bundle).

(i). [34, Definition 7.14] The Itô group $G_{I}^{d}$ is the Cartesian product (but not the direct product of groups) $\mathrm{GL}(d,\mathbb{R})\times\mathcal{L}(\mathbb{R}^{d}\otimes\mathbb{R}^{d},\mathbb{R}^{d})$ equipped with the following binary operation:

\[
(g_{2},\kappa_{2})\circ(g_{1},\kappa_{1})=(g_{2}\circ g_{1},\ g_{2}\circ\kappa_{1}+\kappa_{2}\circ(g_{1}\otimes g_{1})),
\]

for all $g_{1},g_{2}\in\mathrm{GL}(d,\mathbb{R})$, $\kappa_{1},\kappa_{2}\in\mathcal{L}(\mathbb{R}^{d}\otimes\mathbb{R}^{d},\mathbb{R}^{d})$.

(ii). The left group action of $G_{I}^{d}$ on $\mathbb{R}^{d}\times\mathrm{Sym}^{2}(\mathbb{R}^{d})$ is defined by

\[
(g,\kappa)\cdot(\mathfrak{b},a)=(g\mathfrak{b}+\kappa a,\ (g\otimes g)a), \tag{2.11}
\]

for all $(g,\kappa)\in G_{I}^{d}$, $\mathfrak{b}\in\mathbb{R}^{d}$, $a\in\mathrm{Sym}^{2}(\mathbb{R}^{d})$.

(iii). The second-order tangent bundle $(\mathcal{T}^{O}M,\tau^{O}_{M},M)$ is the fiber bundle with base space $M$, typical fiber $\mathbb{R}^{d}\times\mathrm{Sym}^{2}(\mathbb{R}^{d})$, and structure group $G_{I}^{d}$.

(iv). The fiber $\mathcal{T}^{O}_{q}M$ at $q\in M$ is called the second-order tangent space to $M$ at $q$. An element $(\mathfrak{b},a)_{q}\in\mathcal{T}^{O}_{q}M$ is called a second-order tangent vector at $q$. A (global or local) section of $\tau^{O}_{M}$ is called a second-order vector field.

(v). Denote by $\mathcal{T}^{E}M$ the subbundle of $\mathcal{T}^{O}M$ consisting of all elements $(\mathfrak{b},a)_{q}\in\mathcal{T}^{O}_{q}M$, $q\in M$, with $a_{q}$ a positive semi-definite $(2,0)$-tensor. Let $\tau^{E}_{M}=\tau^{O}_{M}|_{\mathcal{T}^{E}M}$. We call $(\mathcal{T}^{E}M,\tau^{E}_{M},M)$ the second-order elliptic tangent bundle.
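The group law in (i) and the action (2.11) can be sketched with nested lists. The code below is an illustrative assumption about the storage layout: `kappa[i][j][k]` holds the component $\kappa^{i}_{jk}$, all function names are hypothetical, and the numerical entries are arbitrary integers. It is meant to check on one example that (2.11) is compatible with the composition, i.e. $((g_{2},\kappa_{2})\circ(g_{1},\kappa_{1}))\cdot(\mathfrak{b},a)=(g_{2},\kappa_{2})\cdot((g_{1},\kappa_{1})\cdot(\mathfrak{b},a))$.

```python
# Sketch of the Ito group law and its action (2.11); tensors are nested lists.
# kappa[i][j][k] stores kappa^i_{jk}; names and numbers are illustrative only.

def compose(p2, p1):
    """(g2, k2) o (g1, k1) = (g2 g1, g2 k1 + k2 (g1 x g1))."""
    (g2, k2), (g1, k1) = p2, p1
    r = range(len(g1))
    g = [[sum(g2[i][l] * g1[l][j] for l in r) for j in r] for i in r]
    k = [[[sum(g2[i][l] * k1[l][j][m] for l in r)
           + sum(k2[i][l][n] * g1[l][j] * g1[n][m] for l in r for n in r)
           for m in r] for j in r] for i in r]
    return g, k

def act(p, v):
    """(g, k) . (b, a) = (g b + k a, (g x g) a)."""
    (g, k), (b, a) = p, v
    r = range(len(g))
    new_b = [sum(g[i][j] * b[j] for j in r)
             + sum(k[i][j][m] * a[j][m] for j in r for m in r) for i in r]
    new_a = [[sum(g[i][l] * g[j][n] * a[l][n] for l in r for n in r)
              for j in r] for i in r]
    return new_b, new_a

g1 = [[1, 2], [0, 1]]
k1 = [[[1, 0], [0, 2]], [[0, 1], [1, 0]]]
g2 = [[2, 0], [1, 1]]
k2 = [[[0, 1], [1, 0]], [[3, 0], [0, 1]]]
vec = ([1, -1], [[2, 1], [1, 3]])      # a second-order tangent vector (b, a)

lhs = act(compose((g2, k2), (g1, k1)), vec)
rhs = act((g2, k2), act((g1, k1), vec))
```

Integer entries keep the comparison of `lhs` and `rhs` exact. The mixed term `k a` in `act` is what distinguishes this action from the block-diagonal action of $\mathrm{GL}(d,\mathbb{R})$ on $\mathbb{R}^{d}\times\mathrm{Sym}^{2}(\mathbb{R}^{d})$.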

Remark 2.10.

(i). We indulge in some abuse of terminology. For example, the second-order vector fields should not be confused with the semisprays, which are sections of the double tangent bundle $T^{2}M$ (e.g., [77, Section 1.4], [51, Section IV.3]).

(ii). Some authors simply define second-order vector fields as second-order operators as in Definition 2.7 ([25, Definition 6.3] or [34, Definition 2.74]). As soon as we choose a frame for $\mathcal{T}^{O}M$, it will be clear that second-order vector fields are identified with second-order operators.

(iii). The authors in [10, 34] define a bundle which has the Itô group as its structure group and has the pair $(\mathfrak{b},\sigma)$ of coefficients in the Itô SDE (2.1) as its section. They name it Itô’s bundle and denote it by $\mathcal{I}M$. The difference is that, in our formulation, the pair $(\mathfrak{b},\sigma\circ\sigma^{*})$ of coefficients of the generator of the Itô SDE (2.1) is a section of the second-order elliptic tangent bundle $\tau^{E}_{M}$. The advantage of the bundle $\tau^{E}_{M}$ is that it is a natural generalization of the tangent bundle to second order and has a good geometric interpretation, as we will see in Proposition 3.2.

(iv). Note that the typical fiber $\mathbb{R}^{d}\times\mathrm{Sym}^{2}(\mathbb{R}^{d})$ of $\tau^{O}_{M}$ is a vector space of dimension $d+\frac{d(d+1)}{2}$. But $\tau^{O}_{M}$ is not a vector bundle, since its structure group $G_{I}^{d}$ is not a linear group (subgroup of a general linear group). The typical fiber of $\tau^{E}_{M}$ is $\mathbb{R}^{d}\times\mathrm{Sym}^{2}_{+}(\mathbb{R}^{d})$, which is not even a vector space, so that $\tau^{E}_{M}$ is not a vector bundle either. Indeed, we may call them quadratic bundles, just as Itô’s bundle is called in [10, Chapter 4].

(v). Itô’s bundle $\mathcal{I}M$ defined in [34, Definition 7.17] is the fiber bundle over the manifold $M$, with fiber $\mathbb{R}^{d}\times\mathcal{L}(\mathbb{R}^{N},\mathbb{R}^{d})$ and structure group $G_{I}^{d}$ which acts on the fiber from the left by

\[
(g,\kappa)(\mathfrak{b},\sigma)=\left(g\mathfrak{b}+\tfrac{1}{2}\mathrm{tr}\,(\kappa\circ(\sigma\otimes\sigma)),\ g\circ\sigma\right),
\]

for all $(g,\kappa)\in G_{I}^{d}$, $\mathfrak{b}\in\mathbb{R}^{d}$, $\sigma\in\mathcal{L}(\mathbb{R}^{N},\mathbb{R}^{d})$. For the same reason as $\mathcal{T}^{O}M$ or $\mathcal{T}^{E}M$, Itô’s bundle $\mathcal{I}M$ is not a vector bundle. There is a bundle homomorphism over $M$ from $\mathcal{I}M$ to $\mathcal{T}^{E}M$, which maps the fiber $\mathcal{I}_{q}M$ to $\mathcal{T}^{E}_{q}M$, $q\in M$, by $(\mathfrak{b},\sigma)\mapsto(\mathfrak{b},\sigma\circ\sigma^{*})$. It is easy to see that this bundle homomorphism is also a surjective submersion. If we identify $g\in\mathrm{GL}(d,\mathbb{R})$ with $(g,0)\in G_{I}^{d}$, then $\mathrm{GL}(d,\mathbb{R})$ is a subgroup of $G_{I}^{d}$. We define Stratonovich’s bundle $\mathcal{S}M$ to be the reduction of $\mathcal{I}M$ to the structure group $\mathrm{GL}(d,\mathbb{R})$, that is, the fiber bundle over $M$, with fiber $\mathbb{R}^{d}\times\mathcal{L}(\mathbb{R}^{N},\mathbb{R}^{d})$ and structure group $\mathrm{GL}(d,\mathbb{R})$ which acts on the fiber from the left by

\[
g(\mathfrak{b},\sigma)=(g\mathfrak{b},\ g\circ\sigma).
\]

Unlike $\mathcal{T}^{O}M$ or $\mathcal{I}M$, Stratonovich’s bundle $\mathcal{S}M$ is indeed a vector bundle, and the tangent bundle $TM$ is a vector subbundle of $\mathcal{S}M$. It can be expected that Stratonovich’s bundle is a natural bundle in which to formulate Stratonovich SDEs. But, in this paper, we mainly focus on Itô SDEs and their generators.

It is natural to regard the differential operators

\[
\left\{\frac{\partial}{\partial x^{i}},\ \frac{\partial^{2}}{\partial x^{j}\partial x^{k}}:1\leq i\leq d,\ 1\leq j\leq k\leq d\right\} \tag{2.12}
\]

as a local frame of $\mathcal{T}^{O}M$ over the local chart $(U,(x^{i}))$ on $M$. In the sequel, we will usually abbreviate them as

\[
\left\{\partial_{i},\ \partial_{j}\partial_{k}:1\leq i\leq d,\ 1\leq j\leq k\leq d\right\}.
\]

We make the convention that $\partial_{k}\partial_{j}=\partial_{j}\partial_{k}$ for all $1\leq j\leq k\leq d$. A second-order vector field $(\mathfrak{b},a)$ is expressed in terms of this local frame by

\[
(\mathfrak{b},a)=\mathfrak{b}^{i}\partial_{i}+\tfrac{1}{2}a^{jk}\partial_{j}\partial_{k}.
\]

In this way, every second-order vector field can be regarded as a second-order operator and vice versa. In particular, the generator $A^{X}$ of an $M$-valued diffusion process $X$, for example the generator (2.10) of the Itô SDE, is a time-dependent second-order vector field, so that we can rewrite $A^{X}$ as $A^{X}_{t}=(\mathfrak{b}(t),(\sigma\circ\sigma^{*})(t))$.

The tangent bundle $TM$ is a subbundle (but not a vector subbundle) and also an embedded submanifold of $\mathcal{T}^{O}M$, as the bundle monomorphism

\[
\iota:(TM,\tau_{M},M)\to\left(\mathcal{T}^{O}M,\tau^{O}_{M},M\right),\quad v_{q}\mapsto(v,0)_{q} \tag{2.13}
\]

is also an embedding. However, there is no canonical bundle epimorphism from $\mathcal{T}^{O}M$ to $TM$ which is a left inverse of $\iota$ and linear in fiber. We call such a bundle epimorphism a fiber-linear bundle projection from $\mathcal{T}^{O}M$ to $TM$. The choice of such a bundle epimorphism is exactly the choice of a linear connection on $M$. More precisely, we have the following connection correspondence properties, the first of which can also be found in [34, Section 2.9].

Proposition 2.11 (Connection correspondence).

Any linear connection on $M$ induces a fiber-linear bundle projection from $\mathcal{T}^{O}M$ to $TM$. Conversely, any fiber-linear bundle projection from $\mathcal{T}^{O}M$ to $TM$ induces a torsion-free linear connection on $M$.

Remark 2.12.

The connection correspondence is similar to the correspondence between horizontal subbundles of the tangent bundle of a vector bundle and connections on this vector bundle, cf. [77, Section 3.1].

Proof.

Let $(\Gamma_{ij}^{k})$ be the Christoffel symbols of a linear connection $\nabla$ on $M$. Define a projection by

\[
\varrho_{\nabla}:\mathcal{T}^{O}M\to TM,\quad(\mathfrak{b},a)_{q}\mapsto\left(\mathfrak{b}^{i}+\tfrac{1}{2}a^{jk}\Gamma^{i}_{jk}(q)\right)\partial_{i}\big|_{q}. \tag{2.14}
\]

Clearly, $\varrho_{\nabla}$ is linear in fiber and $\varrho_{\nabla}\circ\iota=\mathbf{Id}_{TM}$. Conversely, let $\varrho:\mathcal{T}^{O}M\to TM$ be a fiber-linear bundle projection. Then, on each coordinate chart $(U,(x^{i}))$ around $q\in M$, there exists a smooth map $B_{U}:U\to\mathcal{L}(\mathrm{Sym}^{2}(\mathbb{R}^{d}),\mathbb{R}^{d})$, such that

\[
\varrho(\mathfrak{b},a)=\left(\mathfrak{b}^{i}+B_{U}(q)(a)^{i}\right)\partial_{i}\big|_{q},\quad(\mathfrak{b},a)\in\mathcal{T}^{O}_{q}M,\ q\in U.
\]

The family of maps $(B_{U})$ determines a spray and then a torsion-free linear connection on $M$ (see, e.g., [51, Section IV.3]). The torsion-freeness follows from the symmetry of the $B_{U}$’s. ∎

Observe that a group action of $\mathrm{GL}(d,\mathbb{R})$ on $\mathrm{Sym}^{2}(\mathbb{R}^{d})$ can be separated from (2.11), which is given by $g\cdot a=(g\otimes g)a$. Thus the second component $a$ of each element $(\mathfrak{b},a)\in\mathcal{T}^{O}_{q}M$ can be regarded as a $(2,0)$-tensor. Recall that we denote by $\mathrm{Sym}^{2}(TM)$ the bundle of symmetric $(2,0)$-tensors on $M$. Then there is a canonical bundle epimorphism

\[
\hat{\varrho}:\mathcal{T}^{O}M\to\mathrm{Sym}^{2}(TM),\quad(\mathfrak{b},a)_{q}\mapsto a_{q}, \tag{2.15}
\]

whose kernel is the image of $\iota$. Conversely, we also have a similar connection correspondence property for $\mathrm{Sym}^{2}(TM)$, as in Proposition 2.11. That is, a linear connection $\nabla$ on $M$ induces a fiber-linear bundle monomorphism from $\mathrm{Sym}^{2}(TM)$ to $\mathcal{T}^{O}M$, which is a right inverse of $\hat{\varrho}$ and given by

\[
\hat{\iota}_{\nabla}:\mathrm{Sym}^{2}(TM)\to\mathcal{T}^{O}M,\quad a_{q}\mapsto a^{ij}\left(\partial_{i}\partial_{j}\big|_{q}-\Gamma^{k}_{ij}(q)\partial_{k}\big|_{q}\right)=a^{ij}\nabla^{2}_{\partial_{i},\partial_{j}}\big|_{q}, \tag{2.16}
\]

where $\nabla^{2}$ is the second covariant derivative [74, Subsection 2.2.2.3] (which is also called the Hessian operator when acting on smooth functions [47]). In other words, $\nabla^{2}_{\partial_{i},\partial_{j}}|_{q}=\hat{\iota}_{\nabla}(dx^{i}\odot dx^{j}|_{q})$, where $\odot$ is the symmetrization operator on $T^{2}M$.

Combining (2.13) and (2.15) together, we have the following short exact sequence:

\[
0\longrightarrow TM\stackrel{\iota}{\longrightarrow}\mathcal{T}^{O}M\stackrel{\hat{\varrho}}{\longrightarrow}\mathrm{Sym}^{2}(TM)\longrightarrow 0. \tag{2.17}
\]

Proposition 2.11 and (2.15), (2.16) imply that when a linear connection $\nabla$ is given, the sequence also splits, in the fiber-wise sense. The induced decomposition is

\[
\mathcal{T}^{O}M=\iota(TM)\oplus\hat{\iota}_{\nabla}\left(\mathrm{Sym}^{2}(TM)\right)\cong TM\oplus\mathrm{Sym}^{2}(TM), \tag{2.18}
\]

where both the first direct sum $\oplus$ and the isomorphism $\cong$ are in the fiber-wise sense (not a Whitney sum and not a bundle isomorphism), while the second direct sum is the Whitney sum. It is given by

\[
(\mathfrak{b},a)_{q}=b^{i}\partial_{i}\big|_{q}+\tfrac{1}{2}a^{ij}\nabla^{2}_{\partial_{i},\partial_{j}}\big|_{q}\mapsto(b_{q},a_{q}), \tag{2.19}
\]

for $b_{q}=(\mathfrak{b}^{i}+\tfrac{1}{2}a^{jk}\Gamma^{i}_{jk}(q))\partial_{i}|_{q}\in T_{q}M$. A similar short exact sequence as (2.17) holds with $\mathcal{T}^{E}M$ and $\mathrm{Sym}^{2}_{+}(TM)$ in place of $\mathcal{T}^{O}M$ and $\mathrm{Sym}^{2}(TM)$, respectively.
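The formula for $b_{q}$ can be tried on a concrete case. The sketch below is an illustration, not taken from the paper: it assumes planar Brownian motion with generator $\frac{1}{2}\Delta$, written in polar coordinates $(r,\theta)$, together with the Levi-Civita connection of the flat metric, whose only nonzero Christoffel symbols are $\Gamma^{r}_{\theta\theta}=-r$ and $\Gamma^{\theta}_{r\theta}=\Gamma^{\theta}_{\theta r}=1/r$. The function names are hypothetical. Although the coordinate drift $\mathfrak{b}=(\frac{1}{2r},0)$ is nonzero, the projected vector $b$ should vanish.

```python
def project_with_connection(b_frak, a, christoffel):
    """Projection (2.14): b^i = b_frak^i + 1/2 * a^{jk} * Gamma^i_{jk}.

    christoffel[i][j][k] stores Gamma^i_{jk} at the base point.
    """
    r = range(len(b_frak))
    return [b_frak[i] + 0.5 * sum(a[j][k] * christoffel[i][j][k]
                                  for j in r for k in r) for i in r]

def polar_data(radius):
    """Coefficients of 1/2 * Laplacian in polar coordinates (index 0 = r, 1 = theta)."""
    b_frak = [1.0 / (2.0 * radius), 0.0]                   # first-order part
    a = [[1.0, 0.0], [0.0, 1.0 / radius ** 2]]             # second-order part
    gamma = [[[0.0, 0.0], [0.0, -radius]],                 # Gamma^r_{theta theta} = -r
             [[0.0, 1.0 / radius], [1.0 / radius, 0.0]]]   # Gamma^theta_{r theta} = 1/r
    return b_frak, a, gamma

b = project_with_connection(*polar_data(2.0))
```

The correction term $\frac{1}{2}a^{\theta\theta}\Gamma^{r}_{\theta\theta}=-\frac{1}{2r}$ cancels the coordinate drift, which is the sense in which $b$, unlike $\mathfrak{b}$, behaves as a vector.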

Now we introduce a subclass of semimartingales on manifolds which contains diffusions. We call the $M$-valued process $X=\{X(t)\}_{t\in[t_{0},\tau)}$ an Itô process, if there exists a $\{\mathcal{P}_{t}\}$-adapted continuous $\mathcal{T}^{E}M$-valued process $\{(\mathfrak{b},a)(t)\}_{t\in[t_{0},\tau)}$ satisfying $(\mathfrak{b},a)(t)\in\mathcal{T}^{E}_{X(t)}M$ for each $t\in[t_{0},\tau)$, such that for every $f\in C^{\infty}(\mathbb{R}\times M)$, the process $M^{f,X}(t):=f(t,X(t))-f(t_{0},X(t_{0}))-\int_{t_{0}}^{t}(\frac{\partial}{\partial t}+\mathcal{A}^{X})f(s,X(s))\,ds$, $t\in[t_{0},\tau)$, is a real-valued $\{\mathcal{P}_{t}\}$-martingale, where $\mathcal{A}^{X}_{t}=(\mathfrak{b},a)(t)=\mathfrak{b}^{i}(t)\partial_{i}+\tfrac{1}{2}a^{ij}(t)\partial_{i}\partial_{j}$. We call the process $\{(\mathfrak{b},a)(t)\}_{t\in[t_{0},\tau)}=\{\mathcal{A}^{X}_{t}\}_{t\in[t_{0},\tau)}$ the random generator of $X$. A similar notion, “Brownian semimartingale”, is also used in the literature (e.g., [22]). If $X$ is a diffusion with generator $A^{X}_{t}=(\mathfrak{b}(t),a(t))$, then it is an Itô process with random generator $\mathcal{A}^{X}_{t}=A^{X}_{(t,X(t))}=(\mathfrak{b}(t,X(t)),a(t,X(t)))$. The difference between Itô processes and diffusions is that the randomness of the random generator of the former can appear not only on the base manifold $M$, but also on the fibers.

Then, we can define forward mean derivatives in a coordinate-free way, without relying on linear connections.

Definition 2.13 (Mean derivatives).

For an $M$-valued Itô process $X=\{X(t)\}_{t\in[t_{0},\tau)}$, we define its (forward) mean derivatives $(DX(t),QX(t))$ at time $t\in[t_{0},\tau)$ by

\[
(DX(t),QX(t))=(\mathfrak{b},a)(t)\in\mathcal{T}^{E}_{X(t)}M,
\]

where $(\mathfrak{b},a)$ is the random generator of $X$.

Comparing with forward mean derivatives defined in local coordinates before, we have the following relations. The proof follows the lines of [34, Lemma 9.4].

Lemma 2.14.

Let $X=\{X(t)\}_{t\in[t_{0},\tau)}$ be an $M$-valued Itô process and let $(U,(x^{i}))$ be a coordinate chart centered at $q\in M$.

(i). In the event $\{X(t)\in U\}$, $QX(t)$ has the coordinate expression (2.5) and

\[
(DX)^{i}(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{X^{i}(t+\epsilon)-X^{i}(t)}{\epsilon}\,\bigg|\,\mathcal{P}_{t}\right].
\]

(ii). Given a linear connection $\nabla$ on $M$, we have, under the conditional probability $\mathbf{P}(\cdot|X(t)=q)$, that

\[
(D_{\nabla}X)^{i}(t)=(DX)^{i}(t)+\frac{1}{2}\Gamma^{i}_{jk}(X(t))(QX)^{jk}(t). \tag{2.20}
\]

It follows from (2.20) that the map $\varrho_{\nabla}$ in (2.14) acts on the generator $A^{X}$ of a diffusion $X$ by

\[
\varrho_{\nabla}(A^{X}_{(t,X(t))})=\varrho_{\nabla}(DX(t),QX(t))=D_{\nabla}X(t). \tag{2.21}
\]

For a time-dependent second-order vector field $A_{t}=(\mathfrak{b}(t),a(t))$, we can follow the pattern of MDEs (2.6) and set up a new type of MDEs by using the mean derivatives of Definition 2.13, as follows:

\[
\left\{\begin{aligned} DX(t)&=\mathfrak{b}(t,X(t)),\\ QX(t)&=a(t,X(t)).\end{aligned}\right. \tag{2.22}
\]

Then, similarly to Definitions 2.6 and 2.3, we may also define solutions and uniqueness in law for MDEs (2.22). We call a solution of (2.22) an integral process of $A=(A_{t})$. Note that the system (2.22) does not rely on linear connections. The equivalence of the well-posedness of (2.22) and the martingale problem in Definition 2.8 is easy to verify. When a linear connection is specified, the system (2.22) and the martingale problem associated with $A^{X}$ in (2.10) are both equivalent to the Itô SDE (2.1) and MDEs (2.6).

3 Stochastic jets

In classical differential geometry, a tangent vector to a manifold may be defined as an equivalence class of curves passing through a given point, where two curves are equivalent if they have the same derivative at that point [55, Chapter 3]. This idea can be generalized to higher-order cases, which leads to the notion of jets. The jet structures allow us to translate a system of differential equations to a system of algebraic equations, and make it more intuitive to study the symmetries of systems of differential equations.

In this chapter we shall generalize these ideas to the stochastic case. We will first give an equivalent description of the second-order elliptic tangent bundle $\tau^{E}_{M}$ by constructing an equivalence relation on diffusions. Then we will define the stochastic jets and identify the “jet-like” bundle structure involved in the space of stochastic jets. Finally, we shall see that the bundle structure is the appropriate platform to formulate SDEs intrinsically. In the next chapter, we will apply stochastic jets to study stochastic symmetries.

3.1 The stochastic tangent bundle

Recall that a tangent vector can be represented as an equivalence class of smooth curves that have the same velocity at the base point. This leads to the following equivalent definition of the tangent bundle $TM$:

\[
TM\cong\left\{[\gamma]_{q}:\gamma\in C^{\infty}_{(0,q)}(M),\ q\in M\right\}, \tag{3.1}
\]

where $C^{\infty}_{(0,q)}(M)$ is the set of all smooth curves on $M$ that pass through $q$ at time $t=0$, and the equivalence relation is defined as follows: $\gamma,\tilde{\gamma}\in C^{\infty}_{(0,q)}(M)$ are equivalent if and only if $(f\circ\gamma)^{\prime}(0)=(f\circ\tilde{\gamma})^{\prime}(0)$ for every real-valued smooth function $f$ defined in a neighborhood of $q$. If we replace smooth curves by diffusion processes, and time derivatives by mean derivatives, then we get the following definition.

Definition 3.1 (The stochastic tangent bundle).

Two $M$-valued diffusion processes $X=\{X(t)\}_{t\in[0,\tau)}$, $Y=\{Y(t)\}_{t\in[0,\sigma)}$ are said to be stochastically equivalent at $(t,q)\in\mathbb{R}\times M$, if, almost surely, $X(t)=Y(t)=q$ and $D(f\circ X)(t)=D(f\circ Y)(t)$ for all $f\in C^{\infty}(M)$. The equivalence class containing $X$ is called the stochastic tangent vector of $X$ at $q$ and is denoted by $j_{(t,q)}X$. When $t=0$, we write $j_{q}X:=j_{(0,q)}X$ for short. Let $I_{(t,q)}(M)$ be the set of all $M$-valued diffusion processes starting from $q$ at time $t$. The stochastic tangent bundle of $M$ is the set

\[
\mathcal{T}^{S}M=\{j_{q}X:X\in I_{(0,q)}(M),\ q\in M\}.
\]

Note that since $X,Y$ are $M$-valued diffusion processes, $f(X)$ and $f(Y)$ are real-valued Itô processes, and hence their mean derivatives exist.
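To make the equivalence relation concrete, the following worked formula (a sketch added for illustration, consistent with the martingale formulation of Definition 2.8) expresses the mean derivative of $f\circ X$ through the generator $A^{X}=(\mathfrak{b},a)$ in a chart around $q$:

```latex
\[
D(f\circ X)(t)=(A^{X}_{t}f)(q)
=\mathfrak{b}^{i}(t,q)\,\partial_{i}f(q)+\tfrac{1}{2}a^{jk}(t,q)\,\partial_{j}\partial_{k}f(q)
\quad\text{on } \{X(t)=q\}.
\]
```

Testing this with functions that coincide near $q$ with $x^{i}$ and with $x^{j}x^{k}$ suggests that $X$ and $Y$ are stochastically equivalent at $(t,q)$ exactly when the coefficients $(\mathfrak{b},a)$ of their generators agree at $(t,q)$.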

At this stage, we have not yet touched the jet-like formulation even though we used the jet-like notation $j_{q}X$. Indeed, if one follows strictly the definition of jet bundles over the trivial bundle $(\mathbb{R}\times M,\pi,\mathbb{R})$, it is more rational to use the time line $\mathbb{R}$ as “source” and the manifold $M$ as “target” (cf. [77, Example 4.1.16]). But here we just assign the “target” to the manifold $M$, because, roughly speaking, one can talk about the velocity of a smooth curve at a moment $t$, but not about the generator of a diffusion at a moment $t$. Instead, we can talk about the generator of a diffusion at a position $q\in M$. Later on, we will define the “bona fide” stochastic jet space, which possesses the time line $\mathbb{R}$ as “source” and the manifold $M$ as “target”.

Similarly to the one-to-one correspondence between tangent space and space of equivalence classes of smooth curves, we have the following:

Proposition 3.2.

There is a one-to-one correspondence between the stochastic tangent bundle $\mathcal{T}^{S}M$ and the second-order elliptic tangent bundle $\mathcal{T}^{E}M$.

Proof.

For an $M$-valued diffusion process $X\in I_{(0,q)}(M)$, $q\in M$, we denote by $A^{X}$ its generator. Then the map $j_{q}X\mapsto A^{X}_{(0,q)}=(DX(0),QX(0))$ defines a one-to-one correspondence between $\mathcal{T}^{S}M$ and $\mathcal{T}^{E}M$. The inverse map is $A_{q}=(\mathfrak{b},a)_{q}\mapsto j_{q}X^{A}$, where $A$ is a section of $\mathcal{T}^{E}M$ (i.e., an elliptic second-order operator) smoothly extending the element $A_{q}\in\mathcal{T}^{E}_{q}M$, and $X^{A}\in I_{(0,q)}(M)$ is a diffusion process having $A$ as its generator. ∎

Therefore, the stochastic tangent bundle $\mathcal{T}^{S}M$ admits a smooth structure which makes it a smooth manifold diffeomorphic to $\mathcal{T}^{E}M$, and hence it is a bona fide fiber bundle over $M$. In the sequel, we will identify $\mathcal{T}^{S}M$ with $\mathcal{T}^{E}M$ without ambiguity. The projection map from $\mathcal{T}^{S}M$ to $M$ will be denoted by $\tau^{S}_{M}$, that is, $\tau^{S}_{M}(j_{q}X)=q$ for any $j_{q}X\in\mathcal{T}^{S}M$.

Definition 3.3 (Canonical coordinate system on 𝒯SM\mathcal{T}^{S}M).

Let $(U,(x^{i}))$ be a coordinate system on $M$. The induced canonical coordinate chart $(U^{(1)},x^{(1)})$ on $\mathcal{T}^{S}M$ is defined by

U(1):={jqX:qU,XI(0,q)(M)},x(1):=(xi,Dix,Qjkx),U^{(1)}:=\left\{j_{q}X:q\in U,X\in I_{(0,q)}(M)\right\},\quad x^{(1)}:=(x^{i},D^{i}x,Q^{jk}x),

where xi(jqX)=xi(q)x^{i}(j_{q}X)=x^{i}(q), Dix(jqX)=(DX)i(0)D^{i}x(j_{q}X)=(DX)^{i}(0) and Qjkx(jqX)=(QX)jk(0)Q^{jk}x(j_{q}X)=(QX)^{jk}(0).
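
The coordinates $D^{i}x$ and $Q^{jk}x$ can be made concrete with a small numerical sketch. It assumes the usual description of mean derivatives as limits of expected increments, $DX\approx\mathbf{E}[\Delta X]/h$ and $QX\approx\mathbf{E}[\Delta X\,\Delta X]/h$, applied to one Euler step of a scalar SDE. The function names, the two-point noise and the coefficients $b(x)=-x$, $\sigma=0.5$ are illustrative choices, not taken from the paper.

```python
import math

def euler_increment(x, b, sigma, h, z):
    """One Euler-Maruyama increment b(x)*h + sigma(x)*sqrt(h)*z of a scalar SDE."""
    return b(x) * h + sigma(x) * math.sqrt(h) * z

def mean_derivatives(x, b, sigma, h):
    """Approximate (DX, QX) at X = x by averaging over the noise z in {-1, +1}.

    The two-point noise has mean 0 and variance 1, so the averages below are
    exact expectations for this toy discretisation (not Monte Carlo estimates).
    """
    incs = [euler_increment(x, b, sigma, h, z) for z in (-1.0, 1.0)]
    d = sum(incs) / (2.0 * h)                        # E[dX] / h
    q = sum(inc * inc for inc in incs) / (2.0 * h)   # E[dX^2] / h
    return d, q

# Hypothetical coefficients: drift b(x) = -x, diffusion coefficient sigma = 0.5.
d, q = mean_derivatives(2.0, lambda x: -x, lambda x: 0.5, h=1e-6)
print(d, q)
```

For small `h` the first value approaches the drift $b(x)$ and the second approaches $\sigma(x)^{2}$, which is what the chart $(x^{i},D^{i}x,Q^{jk}x)$ records in the scalar case.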

Our slightly ambiguous notations DixD^{i}x and QjkxQ^{jk}x are chosen so as to avoid the worse one QxjkQx^{jk}.

When a linear connection \nabla is provided, we can also define the coordinates via the \nabla-mean derivative DD_{\nabla} instead of DD, as follows:

Dix(jqX):=(DX)i(0).D^{i}_{\nabla}x(j_{q}X):=(D_{\nabla}X)^{i}(0).

Then, x(1):=(xi,Dix,Qjkx)x^{(1)}_{\nabla}:=(x^{i},D^{i}_{\nabla}x,Q^{jk}x) also forms a coordinate system on 𝒯SM\mathcal{T}^{S}M, which we call the \nabla-canonical coordinate system. It follows from relation (2.20) that

Dix=Dix+12(Γjkix)Qjkx.D_{\nabla}^{i}x=D^{i}x+\textstyle{{\frac{1}{2}}}(\Gamma^{i}_{jk}\circ x)Q^{jk}x. (3.2)

Using the identification of elements jqX𝒯qSMj_{q}X\in\mathcal{T}^{S}_{q}M and (𝔟,a)q𝒯qEM(\mathfrak{b},a)_{q}\in\mathcal{T}^{E}_{q}M via Proposition 3.2, as well as their relations with the element (bq,aq)TMSym2(TM)(b_{q},a_{q})\in TM\oplus\mathrm{Sym}^{2}(TM), via (2.19), we have Dix(jqX)=𝔟iD^{i}x(j_{q}X)=\mathfrak{b}^{i}, Dix(jqX)=bi=𝔟i+12ajkΓjki(q)D^{i}_{\nabla}x(j_{q}X)=b^{i}=\mathfrak{b}^{i}+\textstyle{{\frac{1}{2}}}a^{jk}\Gamma^{i}_{jk}(q) and Qjkx(jqX)=ajkQ^{jk}x(j_{q}X)=a^{jk}. In this way the fiber-linear bundle projection ϱ\varrho_{\nabla} of (2.14) maps, under the canonical coordinates (x,x˙)(x,\dot{x}) on TMTM, as follows:

x˙iϱ(jqX)=(Dix+12(Γjkix)Qjkx)(jqX)=Dix(jqX),\dot{x}^{i}\circ\varrho_{\nabla}(j_{q}X)=\left(D^{i}x+\textstyle{{\frac{1}{2}}}(\Gamma^{i}_{jk}\circ x)Q^{jk}x\right)(j_{q}X)=D^{i}_{\nabla}x(j_{q}X), (3.3)

so that $D_{\nabla}^{i}x=\dot{x}^{i}\circ\varrho_{\nabla}$. Therefore, $(x^{i},D_{\nabla}^{i}x)$ is a partial coordinate system on $\mathcal{T}^{S}M$ that coincides with $(x^{i},\dot{x}^{i})$ when restricted to $TM$. Moreover, the decomposition in (2.19) yields the following expressions for second-order vector fields:

(Dx,Qx)=Dixi+12Qjkxjk=Dixi+12Qjkxj,k2.(Dx,Qx)=D^{i}x\partial_{i}+\textstyle{{\frac{1}{2}}}Q^{jk}x\partial_{j}\partial_{k}=D_{\nabla}^{i}x\partial_{i}+\textstyle{{\frac{1}{2}}}Q^{jk}x\nabla^{2}_{\partial_{j},\partial_{k}}. (3.4)
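
Relation (3.2) can be checked on a simple example. The sketch below assumes a planar Brownian motion written in polar coordinates $(r,\theta)$, with generator $\tfrac12\Delta$, so that $D^{r}x=1/(2r)$, $D^{\theta}x=0$, $Q^{rr}x=1$, $Q^{\theta\theta}x=1/r^{2}$, together with the standard Christoffel symbols of the flat connection in these coordinates. These inputs and all function names are assumptions made for illustration.

```python
def nabla_mean_derivative(D, Q, Gamma):
    """Return D_nabla^i = D^i + (1/2) * Gamma^i_{jk} * Q^{jk}, as in relation (3.2)."""
    n = len(D)
    return [
        D[i] + 0.5 * sum(Gamma[i][j][k] * Q[j][k] for j in range(n) for k in range(n))
        for i in range(n)
    ]

def polar_christoffel(r):
    """Christoffel symbols Gamma[i][j][k] of the flat connection on the plane
    in polar coordinates (index 0 is r, index 1 is theta)."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r          # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r     # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r     # Gamma^theta_{theta r}
    return G

r = 2.0
D = [1.0 / (2.0 * r), 0.0]               # (D^r x, D^theta x) for Brownian motion
Q = [[1.0, 0.0], [0.0, 1.0 / r ** 2]]    # (Q^{jk} x) for Brownian motion
D_nabla = nabla_mean_derivative(D, Q, polar_christoffel(r))
print(D_nabla)
```

The It\^o drift $1/(2r)$ is cancelled by the connection term, so the $\nabla$-mean derivative vanishes in polar coordinates just as the drift does in Cartesian ones, in line with $D_{\nabla}^{i}x$ behaving like the vector coordinate $\dot{x}^{i}\circ\varrho_{\nabla}$.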

Similarly to Definition 3.1, we define a \nabla-dependent equivalence relation as follows:

Definition 3.4.

Two $M$-valued diffusion processes $X=\{X(t)\}_{t\in[0,\tau)}$, $Y=\{Y(t)\}_{t\in[0,\sigma)}$ are said to be $\nabla$-stochastically equivalent at $(t,q)\in\mathbb{R}\times M$ if, almost surely, $X(t)=Y(t)=q$ and $D_{\nabla}X(t)=D_{\nabla}Y(t)$. The equivalence class containing $X$ is called the $\nabla$-tangent vector of $X$ at $q$ and is denoted by $j^{\nabla}_{(t,q)}X$. When $t=0$, we denote $j^{\nabla}_{q}X:=j^{\nabla}_{(0,q)}X$ for short.

Then, similarly to Proposition 3.2, one can show that the tangent bundle $TM$ can be identified with the following set of equivalence classes of diffusions:

{jqX:XI(0,q)(M),qM},\left\{j_{q}^{\nabla}X:X\in I_{(0,q)}(M),q\in M\right\}, (3.5)

via jqXDX(0)j_{q}^{\nabla}X\mapsto D_{\nabla}X(0). Under this identification, it follows from (2.21) that jqX=ϱ(jqX)j_{q}^{\nabla}X=\varrho_{\nabla}(j_{q}X). Clearly, if we regard all smooth curves as special diffusions, then the partition determined by (3.1) is the restriction of the one determined by (3.5) to the set of all smooth curves.

Remark 3.5.

In the presence of a linear connection $\nabla$ on $M$, one can easily follow Definition 3.1 and Proposition 3.2 with $D_{\nabla}$ in place of $D$, to verify the one-to-one correspondence between the set $\mathcal{T}^{S}M$ of equivalence classes and the Whitney sum $TM\oplus\mathrm{Sym}^{2}_{+}(TM)$, which brings us back to the fiber-wise isomorphism (2.18). But since such a correspondence requires a linear connection to be specified beforehand, we still endow $\mathcal{T}^{S}M$ with the structure of $\mathcal{T}^{E}M$ instead of that of $TM\oplus\mathrm{Sym}^{2}(TM)$ in this paper, although the latter is also feasible and may provide easier calculations.

3.2 The stochastic jet space

In classical jet theory, for the trivial bundle (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}), there is a one-to-one correspondence between 1-jets and tangent vectors, and there is a canonical diffeomorphism between the first-order jet bundle J1πJ^{1}\pi and ×TM\mathbb{R}\times TM [77, Example 4.1.16].

Now using similar ideas, we will introduce the “bona fide” stochastic jet space. The key is to modify the definition of stochastic tangent vectors, to involve the time line \mathbb{R} as the “source” as well as to randomize the initial datum of the diffusion processes. Intuitively, an MM-valued diffusion process XX can be regarded as a random “section” of the trivial “bundle” (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) which is merely continuous in time and depends on the sample point ω\omega.

For a metric space (F,d)(F,d), we denote by L0(Ω,F)L^{0}(\Omega,F) the quotient space of all FF-valued random elements, by the following equivalence relation: two random elements are equivalent if and only if they are identical almost surely. We endow L0(Ω,F)L^{0}(\Omega,F) with the topology of the following 𝐏\mathbf{P}-essential metric (cf. [68, Section 43]):

ρ(ξ,ζ)=inf{c>0:𝐏(d(ξ,ζ)>c)=0}1.\rho(\xi,\zeta)=\inf\{c>0:\mathbf{P}(d(\xi,\zeta)>c)=0\}\wedge 1.
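
To illustrate the $\mathbf{P}$-essential metric, here is a minimal sketch on a finite sample space, where the infimum in the definition of $\rho$ reduces to the largest distance over sample points of positive probability, capped at $1$. The dictionaries representing $\Omega$, $\mathbf{P}$ and the random elements are hypothetical.

```python
def essential_metric(xi, zeta, prob, d):
    """rho(xi, zeta) = min(ess sup d(xi, zeta), 1) on a finite sample space.

    xi, zeta : dicts mapping each sample point to a value in F
    prob     : dict mapping each sample point to its probability
    d        : metric on F
    """
    # Only sample points of positive probability contribute to the essential supremum.
    ess_sup = max((d(xi[w], zeta[w]) for w in prob if prob[w] > 0), default=0.0)
    return min(ess_sup, 1.0)

# Hypothetical three-point sample space; the point "c" is a null set.
prob = {"a": 0.5, "b": 0.5, "c": 0.0}
xi = {"a": 0.0, "b": 1.0, "c": 0.0}
zeta = {"a": 0.3, "b": 1.0, "c": 5.0}
dist = lambda u, v: abs(u - v)
print(essential_metric(xi, zeta, prob, dist))
```

The large discrepancy on the null point `"c"` is ignored, which reflects the identification of random elements that agree almost surely.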
Definition 3.6.

Two MM-valued diffusion processes X={X(s)}s[t,τ)X=\{X(s)\}_{s\in[t,\tau)}, Y={Y(s)}s[t,σ)Y=\{Y(s)\}_{s\in[t,\sigma)} starting at time tt, are said to be stochastically equivalent at tt\in\mathbb{R}, if, almost surely, X(t)=Y(t)X(t)=Y(t) and (DX(t),QX(t))=(DY(t),QY(t))(DX(t),QX(t))=(DY(t),QY(t)). The equivalence class containing XX is called the stochastic jet of XX at tt, denoted by jtXj_{t}X. Let It(M)I_{t}(M) be the set of all MM-valued diffusion processes starting at time tt. Then the stochastic jet space of MM is the set

𝒥SM={jtX:XIt(M),t}.\mathcal{J}^{S}M=\{j_{t}X:X\in I_{t}(M),t\in\mathbb{R}\}.

The functions π1S\pi^{S}_{1} and π1,0S\pi^{S}_{1,0}, called stochastic source and target projections, are defined by

π1S:𝒥SM,jtXt,\pi^{S}_{1}:\mathcal{J}^{S}M\to\mathbb{R},\quad j_{t}X\mapsto t,

and

π1,0S:𝒥SM×L0(Ω,M),jtX(t,X(t)).\pi^{S}_{1,0}:\mathcal{J}^{S}M\to\mathbb{R}\times L^{0}(\Omega,M),\quad j_{t}X\mapsto(t,X(t)).

In the above definition, since πMϕ=𝐈𝐝M\pi_{M}\circ\phi=\mathbf{Id}_{M}, we have π(Y)=πMϕ(X)=X\pi(Y)=\pi_{M}\circ\phi(X)=X a.s., that is, XX is the projection of YY.

To characterize the relation between 𝒥SM\mathcal{J}^{S}M and 𝒯SM\mathcal{T}^{S}M (or 𝒯EM\mathcal{T}^{E}M), we need the following definitions.

Definition 3.7 (Horizontal subspace).

Let (E,πM,M)(E,\pi_{M},M) be a fiber bundle. The horizontal subspace of L0(Ω,E)L^{0}(\Omega,E) is defined by

Lh(Ω;πM):={ϕξL0(Ω,E):ϕ is a section of πM,ξL0(Ω,M)}.L^{h}(\Omega;\pi_{M}):=\{\phi\circ\xi\in L^{0}(\Omega,E):\phi\text{ is a section of }\pi_{M},\xi\in L^{0}(\Omega,M)\}.

An element of the horizontal subspace Lh(Ω;τME)L^{h}(\Omega;\tau^{E}_{M}) of L0(Ω,𝒯EM)L^{0}(\Omega,\mathcal{T}^{E}M) is then of the form AξA\circ\xi, where AA is a section of τME\tau^{E}_{M} and ξL0(Ω,M)\xi\in L^{0}(\Omega,M). Such an element AξA\circ\xi will be denoted by AξA_{\xi}. By the correspondence of 𝒯SM\mathcal{T}^{S}M and 𝒯EM\mathcal{T}^{E}M, one can easily get the following equivalent definition for Lh(Ω;τME)L^{h}(\Omega;\tau^{E}_{M}),

Lh(Ω;τME)=Lh(Ω;τMS):={jX(0)X:XI0(M)}L0(Ω,𝒯SM).L^{h}(\Omega;\tau^{E}_{M})=L^{h}(\Omega;\tau^{S}_{M}):=\{j_{X(0)}X:X\in I_{0}(M)\}\subset L^{0}(\Omega,\mathcal{T}^{S}M).

The correspondence is given explicitly by

\[ j_{X(0)}X=A^{X}_{X(0)}=(DX(0),QX(0)),\quad\text{or}\quad A_{\xi}=j_{\xi}X^{A_{\xi}}, \]

where $X^{A_{\xi}}$ is an $M$-valued diffusion with generator $A$ and with $X^{A_{\xi}}(0)=\xi$ a.s.

Proposition 3.8.

The stochastic jet space 𝒥SM\mathcal{J}^{S}M is trivial. More precisely, we have the homeomorphism

𝒥SM×Lh(Ω;τMS),\mathcal{J}^{S}M\cong\mathbb{R}\times L^{h}(\Omega;\tau^{S}_{M}),

given by jtX(t,jX(t)(θtX))j_{t}X\mapsto(t,j_{X(t)}(\theta_{t}X)), for any XIt(M)X\in I_{t}(M), where θt\theta_{t} is the shift operator on 𝒞\mathcal{C}, that is, θtω()=ω(+t)\theta_{t}\omega(\cdot)=\omega(\cdot+t).

Proof.

The homeomorphism 𝒥SM×𝒥0SM\mathcal{J}^{S}M\cong\mathbb{R}\times\mathcal{J}^{S}_{0}M is given by jtX(t,j0(θtX))j_{t}X\mapsto(t,j_{0}(\theta_{t}X)). The homeomorphism 𝒥0SMLh(Ω;τMS)\mathcal{J}^{S}_{0}M\cong L^{h}(\Omega;\tau^{S}_{M}) is given by j0XjX(0)Xj_{0}X\mapsto j_{X(0)}X, whose inverse map is Aξj0XAξA_{\xi}\mapsto j_{0}X^{A_{\xi}}. ∎

Definition 3.9 (Stochastic fibered space).

(i) Given a fiber bundle $(E,\pi_{M},M)$ with total space $E$, base space $M$ and typical fiber manifold $F$, the stochastic fibered space associated with it is the triplet $(E^{S},\pi^{S}_{M},M)$ where

\[ E^{S}:=\{(q,\xi):q\in M,\ \xi\in\hat{L}(\Omega,E_{q})\}, \]

$\pi^{S}_{M}:E^{S}\to M$ is the natural projection given by $\pi^{S}_{M}(q,\xi)=q$, and $\hat{L}(\Omega,F)$ is a subspace of $L^{0}(\Omega,F)$, with $E_{q}$ denoting the fiber of $\pi_{M}$ over $q$. The fiber bundle $E$ is called the model bundle of $E^{S}$. There is a family of projections $\{\pi_{\omega}\}_{\omega\in\Omega}$ from the stochastic fibered space $E^{S}$ to its model bundle $E$, defined by

πω:ESE,(q,ξ)(q,ξ(ω)).\pi_{\omega}:E^{S}\to E,\quad(q,\xi)\mapsto(q,\xi(\omega)).

(ii) A global section of (ES,πMS,M)(E^{S},\pi^{S}_{M},M) is called a random global section. A random local section is a map σ:UE\sigma:U\to E defined on some measurable subset UΩ×MU\subset\Omega\times M and such that, for almost all ωΩ\omega\in\Omega, σ(ω):UωE\sigma(\omega):U_{\omega}\to E is a local section of (E,πM,M)(E,\pi_{M},M), where Uω=U({ω}×M)U_{\omega}=U\cap(\{\omega\}\times M).

Note that a random global section is a random local section defined on all Ω×M\Omega\times M.

It follows from Proposition 3.8 that the stochastic jet space (𝒥SM,π1S,)(\mathcal{J}^{S}M,\pi_{1}^{S},\mathbb{R}) is a stochastic fibered space, whose associated model bundle is (×𝒯SM,π1,)(\mathbb{R}\times\mathcal{T}^{S}M,\pi_{1},\mathbb{R}). Just like the first-order jet bundle J1πJ^{1}\pi which is diffeomorphic to ×TM\mathbb{R}\times TM, the model bundle ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M is itself a jet bundle and also has two bundle structures, with base space \mathbb{R} and ×M\mathbb{R}\times M, respectively. The corresponding source and target projections are defined, respectively by

π1:×𝒯SM,(t,jqX)t,\pi_{1}:\mathbb{R}\times\mathcal{T}^{S}M\to\mathbb{R},\quad(t,j_{q}X)\mapsto t,

and

π1,0:×𝒯SM×M,(t,jqX)(t,q).\pi_{1,0}:\mathbb{R}\times\mathcal{T}^{S}M\to\mathbb{R}\times M,\quad(t,j_{q}X)\mapsto(t,q).

Moreover, we will denote the natural projection from ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M to 𝒯SM\mathcal{T}^{S}M by π0,1\pi_{0,1}. This projection map is indeed a bundle homomorphism from (×𝒯SM,π1,0,×M)(\mathbb{R}\times\mathcal{T}^{S}M,\pi_{1,0},\mathbb{R}\times M) to (𝒯SM,τMS,M)(\mathcal{T}^{S}M,\tau^{S}_{M},M), whose projection is the natural projection from ×M\mathbb{R}\times M to MM, denoted by π^\hat{\pi}.

Similarly to Proposition 3.8, we have the following diffeomorphisms for the model bundle ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M:

{j(t,q)X:XI(t,q)(M),t,qM}×𝒯SM×𝒯EM,\{j_{(t,q)}X:X\in I_{(t,q)}(M),t\in\mathbb{R},q\in M\}\cong\mathbb{R}\times\mathcal{T}^{S}M\cong\mathbb{R}\times\mathcal{T}^{E}M,

which is given by

j(t,q)X(t,jq(θtX))A(t,q)X=(t,DX(t),QX(t)),j_{(t,q)}X\mapsto(t,j_{q}(\theta_{t}X))\mapsto A^{X}_{(t,q)}=(t,DX(t),QX(t)), (3.6)

for any XI(t,q)(M)X\in I_{(t,q)}(M), where AXA^{X} is the generator of XX as a section of ×𝒯EM\mathbb{R}\times\mathcal{T}^{E}M (i.e., a time-dependent elliptic second-order differential operator). Furthermore, the proof of Proposition 3.2 allows us to find simply the inverse maps, especially for the second diffeomorphism. That is, for any (t,Aq)=(t,𝔟,a)π1,01(t,q)(t,A_{q})=(t,\mathfrak{b},a)\in\pi_{1,0}^{-1}(t,q),

(t,Aq)=(t,𝔟,a)(t,jq(θtXA))j(t,q)XA,(t,A_{q})=(t,\mathfrak{b},a)\mapsto\left(t,j_{q}(\theta_{t}X^{A})\right)\mapsto j_{(t,q)}X^{A}, (3.7)

where AA is a section of ×𝒯EM\mathbb{R}\times\mathcal{T}^{E}M such that A(t,q)=AqA_{(t,q)}=A_{q}, and XAI(t,q)(M)X^{A}\in I_{(t,q)}(M) is a diffusion process having AA as its generator.

The “stochastic target” of $\mathcal{J}^{S}M$, i.e., the trivial bundle $(\mathbb{R}\times L^{0}(\Omega,M),\pi^{S},\mathbb{R})$, is another example of a stochastic fibered space. Its model bundle is the trivial bundle $(\mathbb{R}\times M,\pi,\mathbb{R})$. The graph of an $M$-valued stochastic process defined on a random time interval $[0,\tau)$ is a random (local) section of $(\mathbb{R}\times L^{0}(\Omega,M),\pi^{S},\mathbb{R})$. The projection induced by $\pi_{\omega}$ on the targets, from $\mathbb{R}\times L^{0}(\Omega,M)$ to $\mathbb{R}\times M$, is denoted by $\hat{\pi}_{\omega}$.

We may summarize how all these maps fit together by the following diagram:

\[
\begin{tikzcd}
\mathcal{J}^{S}M\cong\mathbb{R}\times L^{h}(\Omega;\tau^{S}_{M}) \arrow[r,"\pi_{\omega}"] \arrow[d,"\pi^{S}_{1,0}"'] \arrow[ddr,bend right=50,"\pi^{S}_{1}"'] & \mathbb{R}\times\mathcal{T}^{S}M \arrow[d,"\pi_{1,0}"] \arrow[dd,bend left=60,"\pi_{1}"] \arrow[r,"\pi_{0,1}"] & \mathcal{T}^{S}M\cong\mathcal{T}^{E}M \arrow[d,"\tau^{S}_{M}"] & TM \arrow[l,"\iota"'] \arrow[dl,"\tau_{M}"] \\
\mathbb{R}\times L^{0}(\Omega,M) \arrow[r,"\hat{\pi}_{\omega}"] \arrow[dr,"\pi^{S}"'] & \mathbb{R}\times M \arrow[d,"\pi"] \arrow[r,"\hat{\pi}"] & M & \\
 & \mathbb{R} & &
\end{tikzcd}
\]

When a linear connection is specified on MM, one can easily obtain, similarly to (3.6), the following homeomorphism:

{jtX:XIt(M),t}×Lh(Ω;τM),jtX(t,jX(t)(θtX)),\left\{j^{\nabla}_{t}X:X\in I_{t}(M),t\in\mathbb{R}\right\}\cong\mathbb{R}\times L^{h}(\Omega;\tau_{M}),\quad j^{\nabla}_{t}X\mapsto\left(t,j^{\nabla}_{X(t)}(\theta_{t}X)\right),

and the following diffeomorphisms:

{j(t,q)X:XI(t,q)(M),t,qM}×{jqX:XI(0,q)(M),qM}×TMJ1π,\left\{j^{\nabla}_{(t,q)}X:X\in I_{(t,q)}(M),t\in\mathbb{R},q\in M\right\}\cong\mathbb{R}\times\left\{j_{q}^{\nabla}X:X\in I_{(0,q)}(M),q\in M\right\}\cong\mathbb{R}\times TM\cong J^{1}\pi,

where the first two diffeomorphisms are given by

j(t,q)X(t,jq(θtX))(t,DX(t)),j^{\nabla}_{(t,q)}X\mapsto\left(t,j^{\nabla}_{q}(\theta_{t}X)\right)\mapsto(t,D_{\nabla}X(t)),

and the last one is due to the classical theory.

3.3 Intrinsic formulation of SDEs

With the classical machinery of jet structures, it is possible to translate differential equations into algebraic equations on jet bundles [77]. In this section, we follow this approach to formulate intrinsic SDEs.

For a subset SS of the model bundle ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M and tt\in\mathbb{R}, we denote by StS_{t} the intersection of SS with the fiber {t}×𝒯SM\{t\}\times\mathcal{T}^{S}M.

Definition 3.10.

A stochastic differential equation on MM is a closed embedded submanifold SS of the model jet bundle ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M with S0S_{0}\neq\emptyset. A (local) solution of the stochastic differential equation SS is a triple XX, (Ω,,𝐏)(\Omega,\mathcal{F},\mathbf{P}), {𝒫t}t0\{\mathcal{P}_{t}\}_{t\geq 0}, where

  • (i)

    (Ω,,𝐏)(\Omega,\mathcal{F},\mathbf{P}) is a probability space, and {𝒫t}t0\{\mathcal{P}_{t}\}_{t\geq 0} is a past filtration of \mathcal{F} satisfying the usual conditions,

  • (ii)

    X={X(t)}t[0,τ)X=\{X(t)\}_{t\in[0,\tau)} is a {𝒫t}\{\mathcal{P}_{t}\}-adapted MM-valued diffusion process over [0,τ)[0,\tau), where τ\tau is a {𝒫t}\{\mathcal{P}_{t}\}-stopping time, and

  • (iii)

    almost surely jtX=(t,jX(t)(θtX))Sj_{t}X=(t,j_{X(t)}(\theta_{t}X))\in S for every t[0,τ)t\in[0,\tau).

Remark 3.11.

(i). The condition that S0S_{0}\neq\emptyset is just for convenience, in order to set the initial time at t=0t=0.

(ii). There is an equivalent way to formulate the solution of a stochastic differential equation $S$. That is, a (local) solution is a pair $(P,\tau)$, where $P$ is a probability measure on $(\mathcal{C},\mathcal{B}(\mathcal{C}),\{\mathcal{B}_{t}\})$ and $\tau$ is a $\{\mathcal{B}_{t}\}$-stopping time, such that for $P$-almost every $\omega$, $j_{t}\omega=(t,j_{\omega(t)}(\theta_{t}\omega))\in S$ for every $t\in[0,\tau(\omega))$.

This definition does not look like the traditional definition of a stochastic differential equation, but we can see the relationship between the two by using coordinates. Since $S$ is an embedded submanifold of $\mathbb{R}\times\mathcal{T}^{S}M$, it admits a local defining function in a neighborhood of each of its points [55, Proposition 5.16]. That is, for a coordinate chart $(\mathbb{R}\times U^{(1)},(t,x^{(1)}))$ around the point $(0,j_{q}X)\in S_{0}$, there is a function $\Theta:\mathbb{R}\times U^{(1)}\to\mathbb{R}^{K}$, where $K=\dim(\mathbb{R}\times\mathcal{T}^{S}M)-\dim S$ is the codimension of $S$, such that $S\cap(\mathbb{R}\times U^{(1)})=\Theta^{-1}(0)$ and $0$ is a regular value of $\Theta$. Then, the condition $j_{t}X=(t,j_{X(t)}(\theta_{t}X))\in S$ before $X(t)$ leaves the neighborhood $U=\tau^{S}_{M}(U^{(1)})$ reads in local coordinates as

Θ(t,x,Dx,Qx)(jtX)=Θ(t,X(t),DX(t),QX(t))=0,\Theta(t,x,Dx,Qx)(j_{t}X)=\Theta(t,X(t),DX(t),QX(t))=0, (3.8)

which defines a general MDE (in terms of mean derivatives). The use of a submanifold SS is therefore a way to distinguish the definition of the equation from a definition of its solutions.

As an example, the system of MDEs (2.22) can be rewritten in the form (3.8) by setting the defining function

Θ(t,x,Dx,Qx)=(Dx𝔟(t,x),Qx(σσ)(t,x)).\Theta(t,x,Dx,Qx)=\left(Dx-\mathfrak{b}(t,x),Qx-(\sigma\circ\sigma^{*})(t,x)\right). (3.9)
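
The role of the defining function (3.9) can be sketched in code for a scalar equation: a point $(t,x,Dx,Qx)$ of the model jet bundle lies on $S$ exactly when $\Theta$ vanishes there. The helper names and the coefficients $\mathfrak{b}(t,x)=-x$, $\sigma(t,x)=1+t$ are hypothetical and serve only as an illustration.

```python
def make_theta(drift, sigma):
    """Build the scalar defining function Theta(t, x, Dx, Qx) of (3.9)."""
    def theta(t, x, Dx, Qx):
        return (Dx - drift(t, x), Qx - sigma(t, x) ** 2)
    return theta

def on_equation(theta, point, tol=1e-12):
    """Check whether a point (t, x, Dx, Qx) lies in S, i.e. whether Theta vanishes there."""
    return all(abs(c) <= tol for c in theta(*point))

# Hypothetical coefficients: drift b(t, x) = -x, diffusion coefficient sigma(t, x) = 1 + t.
theta = make_theta(lambda t, x: -x, lambda t, x: 1.0 + t)

print(on_equation(theta, (0.5, 2.0, -2.0, 2.25)))  # jet coordinates matching the coefficients
print(on_equation(theta, (0.5, 2.0, -2.0, 1.0)))   # wrong quadratic mean derivative
```

Here $\Theta$ has two components in a four-dimensional jet space, so $S$ is two-dimensional and parametrized by $(t,x)$.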

So far we have not done anything but reformulate the basic problem of finding solutions of systems of stochastic differential equations in a more geometrical form, ideally suited to our investigation into symmetry groups thereof.

4 Stochastic symmetries

The symmetry group of a system of differential equations is the largest local group of transformations acting on the independent and dependent variables of the system with the property that it transforms solutions of the system into other solutions [72]. In the stochastic case, we can proceed analogously.

All methods of this section work in the local case, that is, the vector fields are not necessarily complete and the bundle homomorphisms could be only locally defined.

4.1 Prolongations of diffusions and bundle homomorphisms

Definition 4.1 (Prolongations of diffusions).

Let XX be an MM-valued diffusion process defined on a stopping time interval [t0,τ)[t_{0},\tau). The prolongation of XX is a 𝒯SM\mathcal{T}^{S}M-valued process jXjX defined by, for θt\theta_{t} the shift operator,

jX(t)=jX(t)(θtX),t[t0,τ).jX(t)=j_{X(t)}(\theta_{t}X),\quad t\in[t_{0},\tau).

Note that jtX=(t,jX(t)(θtX))=(t,jX(t))j_{t}X=(t,j_{X(t)}(\theta_{t}X))=(t,jX(t)). Thus the graph of the prolongation process jXjX is nothing but the random section jXjX of the stochastic jet space 𝒥SM\mathcal{J}^{S}M. It is easy to see that if XX is an MM-valued diffusion process, then jXjX is a 𝒯SM\mathcal{T}^{S}M-valued diffusion process.

Given two smooth manifolds MM and NN, a bundle homomorphism FF from (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) to (×N,ρ,)(\mathbb{R}\times N,\rho,\mathbb{R}) is a projectable (or fiber-preserving) smooth map, which means it maps fibers of π\pi to fibers of ρ\rho. Hence, there exist two smooth maps F0:F^{0}:\mathbb{R}\to\mathbb{R} and F¯:×MN\bar{F}:\mathbb{R}\times M\to N such that F(t,q)=(F0(t),F¯(t,q))F(t,q)=(F^{0}(t),\bar{F}(t,q)). This leads to ρF=F0π\rho\circ F=F^{0}\circ\pi which is the original definition of bundle homomorphisms. We denote F=(F0,F¯)F=(F^{0},\bar{F}) and say that FF projects to F0F^{0}.

The following lemma shows that a bundle homomorphism always transforms diffusions into diffusions. A proof can be found in Lemma 4.8 or Corollary A.5.

Lemma 4.2.

Given a bundle homomorphism F=(F0,F¯)F=(F^{0},\bar{F}) from (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) to (×N,ρ,)(\mathbb{R}\times N,\rho,\mathbb{R}), where F0F^{0} is a diffeomorphism, for every MM-valued diffusion process X={X(t)}t[t0,τ)X=\{X(t)\}_{t\in[t_{0},\tau)}, the image of its graph (or its corresponding random local section) {(t,X(t)):t[t0,τ)}\{(t,X(t)):t\in[t_{0},\tau)\} by FF, i.e.,

{F(t,X(t)):t[t0,τ)}\{F(t,X(t)):t\in[t_{0},\tau)\}

is almost surely the graph of a well-defined NN-valued diffusion process X~\tilde{X} given by

X~(s)=F¯((F0)1(s),X((F0)1(s))),s[F0(t0),F0(τ)).\tilde{X}(s)=\bar{F}\left((F^{0})^{-1}(s),X((F^{0})^{-1}(s))\right),\quad s\in[F^{0}(t_{0}),F^{0}(\tau)). (4.1)

As observed in Remark A.6, among all (deterministic) smooth maps from ×M\mathbb{R}\times M to ×N\mathbb{R}\times N, the class of bundle homomorphisms is the only subclass that maps diffusions to diffusions.
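
Formula (4.1) acts path by path, which the following sketch makes explicit for a single sample path given as a function of time. The time change $F^{0}(t)=2t+1$, the map $\bar{F}(t,x)=x+t$ and the smooth curve $X(t)=t^{2}$ (viewed as a degenerate diffusion) are hypothetical choices.

```python
def pushforward(F0_inv, F_bar, X):
    """Return the path s -> F_bar(F0_inv(s), X(F0_inv(s))) of formula (4.1)."""
    def X_tilde(s):
        t = F0_inv(s)          # original time corresponding to the new time s
        return F_bar(t, X(t))
    return X_tilde

# Hypothetical bundle homomorphism F = (F0, F_bar) and sample path X.
F0 = lambda t: 2.0 * t + 1.0
F0_inv = lambda s: (s - 1.0) / 2.0
F_bar = lambda t, x: x + t
X = lambda t: t ** 2

X_tilde = pushforward(F0_inv, F_bar, X)

# The image of a graph point (t, X(t)) under F is the graph point (F0(t), X_tilde(F0(t))).
for t in (0.0, 0.5, 1.5):
    print((F0(t), F_bar(t, X(t))), (F0(t), X_tilde(F0(t))))
```

The two tuples printed on each line agree, which is the statement that the image of the graph of $X$ under $F$ is the graph of $\tilde{X}$. The sketch needs $F^{0}$ to be invertible, as the lemma assumes.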

Definition 4.3 (Pushforwards of diffusions by bundle homomorphisms).

We call the diffusion X~\tilde{X} of Lemma 4.2 the pushforward of XX by FF, and write X~=FX\tilde{X}=F\cdot X. When M=NM=N and FF is a bundle endomorphism on (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}), we also call FXF\cdot X the transform of XX by FF.

We now introduce the idea of stochastic prolongation whereby a bundle homomorphism may be extended to act upon the model jet bundle.

Definition 4.4 (Stochastic prolongations of bundle homomorphisms).

Let FF be a bundle homomorphism from (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) to (×N,ρ,)(\mathbb{R}\times N,\rho,\mathbb{R}) projecting to a diffeomorphism F0:F^{0}:\mathbb{R}\to\mathbb{R}. The stochastic prolongation of FF is the map jF:×𝒯SM×𝒯SNjF:\mathbb{R}\times\mathcal{T}^{S}M\to\mathbb{R}\times\mathcal{T}^{S}N defined by

jF(j(t,q)X)=jF(t,q)(FX).jF(j_{(t,q)}X)=j_{F(t,q)}(F\cdot X). (4.2)

It is easy to see from (4.1) that if j(t,q)X=j(t,q)Yj_{(t,q)}X=j_{(t,q)}Y, then jF(t,q)(FX)=jF(t,q)(FY)j_{F(t,q)}(F\cdot X)=j_{F(t,q)}(F\cdot Y). Therefore, the map jFjF is well defined. By letting F=(F0,F¯)F=(F^{0},\bar{F}), definition (4.2) can be rewritten in a more evident way:

jF(t,jq(θtX))=(F0(t),jF¯(t,q)θF0(t)(FX)).jF(t,j_{q}(\theta_{t}X))=\big{(}F^{0}(t),j_{\bar{F}(t,q)}\theta_{F^{0}(t)}(F\cdot X)\big{)}. (4.3)

The following properties are easy to check.

Corollary 4.5.

(i) The map jF:π1ρ1jF:\pi_{1}\to\rho_{1} is a bundle homomorphism projecting to F0F^{0}.
(ii) The map jF:π1,0ρ1,0jF:\pi_{1,0}\to\rho_{1,0} is a bundle homomorphism projecting to FF.
(iii) j(𝐈𝐝×M)=𝐈𝐝×𝒯SMj(\mathbf{Id}_{\mathbb{R}\times M})=\mathbf{Id}_{\mathbb{R}\times\mathcal{T}^{S}M}. Let FF and GG be two bundle endomorphisms on (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) projecting to diffeomorphisms. Then j(FG)=jFjGj(F\circ G)=jF\circ jG.

By virtue of (4.3) and Corollary 4.5.(i), we may write jF=(F0,jF¯)jF=(F^{0},\overline{jF}), where jF¯:×𝒯SM𝒯SN\overline{jF}:\mathbb{R}\times\mathcal{T}^{S}M\to\mathcal{T}^{S}N is the smooth map given by

jF¯(t,jq(θtX))=jF¯(t,q)θF0(t)(FX).\overline{jF}(t,j_{q}(\theta_{t}X))=j_{\bar{F}(t,q)}\theta_{F^{0}(t)}(F\cdot X). (4.4)

We can also consider the pushforward of the 𝒯SM\mathcal{T}^{S}M-valued process jXjX by the bundle homomorphism jFjF.

Corollary 4.6.

Given a bundle homomorphism F:(×M,π,)(×N,ρ,)F:(\mathbb{R}\times M,\pi,\mathbb{R})\to(\mathbb{R}\times N,\rho,\mathbb{R}) projecting to a diffeomorphism on \mathbb{R}, and an MM-valued diffusion process XX, we have

jFjX=j(FX).jF\cdot jX=j(F\cdot X).
Proof.

It follows from (4.1), (4.4) and Definition 4.1 that

jFjX(s)=jF¯((F0)1(s),jX((F0)1(s)))=jF¯((F0)1(s),jX((F0)1(s))(θ(F0)1(s)X))=jX~(s)(θsX~)=jX~(s).\begin{split}jF\cdot jX(s)&=\overline{jF}\left((F^{0})^{-1}(s),jX((F^{0})^{-1}(s))\right)=\overline{jF}\left((F^{0})^{-1}(s),j_{X((F^{0})^{-1}(s))}(\theta_{(F^{0})^{-1}(s)}X)\right)\\ &=j_{\tilde{X}(s)}(\theta_{s}\tilde{X})=j\tilde{X}(s).\end{split}

The result follows. ∎

Now we need to investigate the coordinate representation of jFjF, in stochastic analysis terms. Before that, we introduce the stochastic version of the notion of total derivatives.

Definition 4.7 (Total mean derivatives).

Let ff be a smooth real-valued function on ×M\mathbb{R}\times M. The total mean derivative and total quadratic mean derivative of ff are the unique smooth functions 𝐃tf\mathbf{D}_{t}f and 𝐐tf\mathbf{Q}_{t}f defined on ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M, with the property that if XI(t0,q)(M)X\in I_{(t_{0},q)}(M) is a representative diffusion process of j(t0,q)Xj_{(t_{0},q)}X, then

(𝐃tf)(j(t0,q)X)\displaystyle(\mathbf{D}_{t}f)(j_{(t_{0},q)}X) =D[f(t0,X(t0))],\displaystyle=D[f(t_{0},X(t_{0}))],
(𝐐tf)(j(t0,q)X)\displaystyle(\mathbf{Q}_{t}f)(j_{(t_{0},q)}X) =Q[f(t0,X(t0))].\displaystyle=Q[f(t_{0},X(t_{0}))].

There is an abuse of notation in the above definition. Indeed, the left-hand sides (LHSs) of the above two equations both involve the subscript $t$, but their right-hand sides (RHSs) do not depend on $t$. These two equations should be understood as saying that the values of the functions $\mathbf{D}_{t}f,\mathbf{Q}_{t}f$ at the point $j_{(t_{0},q)}X\in\mathbb{R}\times\mathcal{T}^{S}M$ are equal to the RHSs.

It is easy to check that the definitions of total mean derivatives are independent of the choice of representative diffusions. By Itô’s formula, we have the following coordinate representation for total mean derivatives in the local chart (×U(1),(t,x(1)))(\mathbb{R}\times U^{(1)},(t,x^{(1)})) on ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M,

𝐃tf\displaystyle\mathbf{D}_{t}f =ft+fxiDix+122fxjxkQjkx,\displaystyle=\frac{\partial f}{\partial t}+\frac{\partial f}{\partial x^{i}}D^{i}x+\frac{1}{2}\frac{\partial^{2}f}{\partial x^{j}\partial x^{k}}Q^{jk}x, (4.5)
𝐐tf\displaystyle\mathbf{Q}_{t}f =fxjfxkQjkx.\displaystyle=\frac{\partial f}{\partial x^{j}}\frac{\partial f}{\partial x^{k}}Q^{jk}x.
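
As a numerical illustration of (4.5) in the scalar case, the sketch below evaluates $\mathbf{D}_{t}f$ and $\mathbf{Q}_{t}f$ at a point $(t,x,Dx,Qx)$, approximating the partial derivatives of $f$ by central finite differences. The function name, the step size and the test function $f(t,x)=t\,x^{2}$ are illustrative assumptions.

```python
def total_mean_derivatives(f, t, x, Dx, Qx, h=1e-4):
    """Evaluate (D_t f, Q_t f) of (4.5) at (t, x, Dx, Qx) for scalar x,
    using central finite differences for the partial derivatives of f."""
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    f_x = (f(t, x + h) - f(t, x - h)) / (2 * h)
    f_xx = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / (h * h)
    D_t_f = f_t + f_x * Dx + 0.5 * f_xx * Qx   # first line of (4.5)
    Q_t_f = f_x * f_x * Qx                     # second line of (4.5)
    return D_t_f, Q_t_f

# Hypothetical test function f(t, x) = t * x**2.
f = lambda t, x: t * x ** 2
print(total_mean_derivatives(f, t=1.0, x=3.0, Dx=0.5, Qx=2.0))
```

For this $f$ one has $\partial_{t}f=x^{2}$, $\partial_{x}f=2tx$ and $\partial_{x}^{2}f=2t$, so the exact values at the chosen point are $9+3+2=14$ and $36\cdot 2=72$, which the finite-difference values approximate.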

If a linear connection \nabla is specified, we can use (3.4) to rewrite 𝐃t\mathbf{D}_{t} as follows:

𝐃t=t+Dixi+12Qjkxj,k2.\mathbf{D}_{t}=\partial_{t}+D_{\nabla}^{i}x\partial_{i}+\textstyle{{\frac{1}{2}}}Q^{jk}x\nabla^{2}_{\partial_{j},\partial_{k}}. (4.6)
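
As a consistency check, (4.6) can be recovered from (4.5) and (3.2). The computation below assumes the standard coordinate expression of the second covariant derivative of a function, $\nabla^{2}_{\partial_{j},\partial_{k}}f=\partial_{j}\partial_{k}f-\Gamma^{i}_{jk}\partial_{i}f$, which is taken here to be the convention behind (3.4).

```latex
\begin{aligned}
D_{\nabla}^{i}x\,\partial_{i}f+\tfrac{1}{2}Q^{jk}x\,\nabla^{2}_{\partial_{j},\partial_{k}}f
&=\bigl(D^{i}x+\tfrac{1}{2}(\Gamma^{i}_{jk}\circ x)\,Q^{jk}x\bigr)\partial_{i}f
 +\tfrac{1}{2}Q^{jk}x\,\bigl(\partial_{j}\partial_{k}f-(\Gamma^{i}_{jk}\circ x)\,\partial_{i}f\bigr)\\
&=D^{i}x\,\partial_{i}f+\tfrac{1}{2}Q^{jk}x\,\partial_{j}\partial_{k}f .
\end{aligned}
```

The two connection terms cancel, and adding $\partial_{t}f$ to both sides gives back the first line of (4.5).
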
Lemma 4.8.

Let us be given a bundle homomorphism F=(F0,F¯)F=(F^{0},\bar{F}) from (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) to (×N,ρ,)(\mathbb{R}\times N,\rho,\mathbb{R}) projecting to a diffeomorphism F0F^{0} and an MM-valued diffusion process X={X(t)}t[t0,τ)X=\{X(t)\}_{t\in[t_{0},\tau)}. If X~=FX\tilde{X}=F\cdot X, then in local coordinates (t,xi)(t,x^{i}) around (t0,q)(t_{0},q) and (s,yj)(s,y^{j}) around F(t0,q)F(t_{0},q),

(DX~)j(F0(t))\displaystyle(D\tilde{X})^{j}(F^{0}(t)) =(𝐃tF¯j)(j(t,X(t))X)d(F0)1ds(F0(t)),\displaystyle=(\mathbf{D}_{t}\bar{F}^{j})\left(j_{(t,X(t))}X\right)\frac{d(F^{0})^{-1}}{ds}(F^{0}(t)),
(QX~)kl(F0(t))\displaystyle(Q\tilde{X})^{kl}(F^{0}(t)) =(F¯kxiF¯lxj)(t,X(t))(QX)ij(t)d(F0)1ds(F0(t)).\displaystyle=\left(\frac{\partial\bar{F}^{k}}{\partial x^{i}}\frac{\partial\bar{F}^{l}}{\partial x^{j}}\right)\left(t,X(t)\right)(QX)^{ij}\left(t\right)\frac{d(F^{0})^{-1}}{ds}(F^{0}(t)).
Proof.

Assume that the diffusion XX can be represented in local coordinates by

dXi(t)=𝔟i(t,X(t))dt+σri(t,X(t))dWr(t),Xi(t0)=xi(q).dX^{i}(t)=\mathfrak{b}^{i}(t,X(t))dt+\sigma^{i}_{r}(t,X(t))dW^{r}(t),\quad X^{i}(t_{0})=x^{i}(q).

where WW is an NN-dimensional Brownian motion, so that

jtX=(DX(t),QX(t))=(𝔟,σσ)(t,X(t)).j_{t}X=(DX(t),QX(t))=(\mathfrak{b},\sigma\circ\sigma^{*})(t,X(t)).

Let (s0,q~)=F(t0,q)=(F0(t0),F¯(t0,q))(s_{0},\tilde{q})=F(t_{0},q)=(F^{0}(t_{0}),\bar{F}(t_{0},q)). Then

Xi((F0)1(s))=xi(q)+(F0)1(s0)(F0)1(s)𝔟i(u,X(u))𝑑u+(F0)1(s0)(F0)1(s)σri(u,X(u))𝑑Wr(u).X^{i}((F^{0})^{-1}(s))=x^{i}(q)+\int_{(F^{0})^{-1}(s_{0})}^{(F^{0})^{-1}(s)}\mathfrak{b}^{i}(u,X(u))du+\int_{(F^{0})^{-1}(s_{0})}^{(F^{0})^{-1}(s)}\sigma^{i}_{r}(u,X(u))dW^{r}(u).

Define

B(s)=0(F0)1(s)(F0)(u)𝑑W(u).B(s)=\int_{0}^{(F^{0})^{-1}(s)}\sqrt{(F^{0})^{\prime}(u)}dW(u).

Then [70, Theorem 8.5.7] says that BB is an NN-dimensional {(F0)1(s)}\{\mathcal{F}_{(F^{0})^{-1}(s)}\}-Brownian motion, as by a change of variable u=(F0)1(v)u=(F^{0})^{-1}(v), we have

(F0)1(s0)(F0)1(s)σri(u,X(u))𝑑Wr(u)=s0sσri((F0)1(v),X((F0)1(v)))(d(F0)1ds(v))12𝑑Br(v).\int_{(F^{0})^{-1}(s_{0})}^{(F^{0})^{-1}(s)}\sigma^{i}_{r}(u,X(u))dW^{r}(u)=\int_{s_{0}}^{s}\sigma^{i}_{r}((F^{0})^{-1}(v),X((F^{0})^{-1}(v)))\left(\frac{d(F^{0})^{-1}}{ds}(v)\right)^{\frac{1}{2}}dB^{r}(v).

Therefore,

Xi((F0)1(s))=xi(q)+s0s𝔟i((F0)1(v),X((F0)1(v)))d(F0)1(v)+s0sσri((F0)1(v),X((F0)1(v)))(d(F0)1ds(v))12𝑑Br(v).\begin{split}X^{i}((F^{0})^{-1}(s))=&\ x^{i}(q)+\int_{s_{0}}^{s}\mathfrak{b}^{i}((F^{0})^{-1}(v),X((F^{0})^{-1}(v)))d(F^{0})^{-1}(v)\\ &\ +\int_{s_{0}}^{s}\sigma^{i}_{r}((F^{0})^{-1}(v),X((F^{0})^{-1}(v)))\left(\frac{d(F^{0})^{-1}}{ds}(v)\right)^{\frac{1}{2}}dB^{r}(v).\end{split}

Recall that $\tilde{X}(s)=\bar{F}\left((F^{0})^{-1}(s),X((F^{0})^{-1}(s))\right)$. Using Itô’s formula, we have

\begin{equation*}
\begin{split}
\tilde{X}^{j}(s)=&\ y^{j}(\tilde{q})+\int_{s_{0}}^{s}\frac{\partial\bar{F}^{j}}{\partial t}\left((F^{0})^{-1}(v),X((F^{0})^{-1}(v))\right)d(F^{0})^{-1}(v)\\
&\ +\int_{s_{0}}^{s}\frac{\partial\bar{F}^{j}}{\partial x^{i}}\left((F^{0})^{-1}(v),X((F^{0})^{-1}(v))\right)dX^{i}((F^{0})^{-1}(v))\\
&\ +\frac{1}{2}\int_{s_{0}}^{s}\frac{\partial^{2}\bar{F}^{j}}{\partial x^{k}\partial x^{l}}\left((F^{0})^{-1}(v),X((F^{0})^{-1}(v))\right)d\langle X^{k}\circ(F^{0})^{-1},X^{l}\circ(F^{0})^{-1}\rangle(v)\\
=&\ y^{j}(\tilde{q})+\int_{s_{0}}^{s}\left[\frac{\partial\bar{F}^{j}}{\partial t}+\frac{\partial\bar{F}^{j}}{\partial x^{i}}\mathfrak{b}^{i}+\frac{1}{2}\frac{\partial^{2}\bar{F}^{j}}{\partial x^{k}\partial x^{l}}\sigma^{k}_{r}\sigma^{l}_{r}\right]\left((F^{0})^{-1}(v),X((F^{0})^{-1}(v))\right)\frac{d(F^{0})^{-1}}{ds}(v)dv\\
&\ +\int_{s_{0}}^{s}\left(\frac{\partial\bar{F}^{j}}{\partial x^{i}}\sigma^{i}_{r}\right)\left((F^{0})^{-1}(v),X((F^{0})^{-1}(v))\right)\left(\frac{d(F^{0})^{-1}}{ds}(v)\right)^{\frac{1}{2}}dB^{r}(v).
\end{split}
\end{equation*}

It follows that

\begin{align*}
(D\tilde{X})^{j}(s)&=\left[\frac{\partial\bar{F}^{j}}{\partial t}+\frac{\partial\bar{F}^{j}}{\partial x^{i}}\mathfrak{b}^{i}+\frac{1}{2}\frac{\partial^{2}\bar{F}^{j}}{\partial x^{k}\partial x^{l}}\sigma^{k}_{r}\sigma^{l}_{r}\right]\left((F^{0})^{-1}(s),X((F^{0})^{-1}(s))\right)\frac{d(F^{0})^{-1}}{ds}(s)\\
&=(\mathbf{D}_{t}\bar{F}^{j})\left(j_{((F^{0})^{-1}(s),X((F^{0})^{-1}(s)))}X\right)\frac{d(F^{0})^{-1}}{ds}(s),\\
(Q\tilde{X})^{kl}(s)&=\left(\frac{\partial\bar{F}^{k}}{\partial x^{i}}\sigma^{i}_{r}\frac{\partial\bar{F}^{l}}{\partial x^{j}}\sigma^{j}_{r}\right)\left((F^{0})^{-1}(s),X((F^{0})^{-1}(s))\right)\frac{d(F^{0})^{-1}}{ds}(s)\\
&=\left(\frac{\partial\bar{F}^{k}}{\partial x^{i}}\frac{\partial\bar{F}^{l}}{\partial x^{j}}\right)\left((F^{0})^{-1}(s),X((F^{0})^{-1}(s))\right)(QX)^{ij}\left((F^{0})^{-1}(s)\right)\frac{d(F^{0})^{-1}}{ds}(s).
\end{align*}

This completes the proof. ∎

We denote the induced local coordinates on 𝒯SN\mathcal{T}^{S}N by (yj,Djy,Qkly)(y^{j},D^{j}y,Q^{kl}y). Then clearly, yjjF=yjjF¯=yjF=F¯jy^{j}\circ jF=y^{j}\circ\overline{jF}=y^{j}\circ F=\bar{F}^{j}. Now take j(t,q)X×𝒯SMj_{(t,q)}X\in\mathbb{R}\times\mathcal{T}^{S}M. Then

\begin{align}
D^{j}y\circ jF(j_{(t,q)}X)&=D^{j}y(j_{F(t,q)}\tilde{X})=(D\tilde{X})^{j}(F^{0}(t))=(\mathbf{D}_{t}\bar{F}^{j})(j_{(t,q)}X)\left(\frac{dF^{0}}{dt}(t)\right)^{-1}, \tag{4.7}\\
Q^{kl}y\circ jF(j_{(t,q)}X)&=Q^{kl}y(j_{F(t,q)}\tilde{X})=(Q\tilde{X})^{kl}(F^{0}(t))=\left(\frac{\partial\bar{F}^{k}}{\partial x^{i}}\frac{\partial\bar{F}^{l}}{\partial x^{j}}\right)(t,X(t))(QX)^{ij}(t)\left(\frac{dF^{0}}{dt}(t)\right)^{-1}. \tag{4.8}
\end{align}
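As a quick illustration of (4.7) and (4.8), the following sketch evaluates them in dimension one. It assumes that the total mean derivative acts on a function of $(t,x)$ as $\mathbf{D}_{t}=\partial_{t}+Dx\,\partial_{x}+\frac{1}{2}Qx\,\partial_{x}^{2}$, which is the form it takes in the proof of Lemma 4.8. The helper name `prolong_1d` and its arguments are hypothetical and serve only this example.

```python
def prolong_1d(dF_dt, dF_dx, d2F_dx2, dF0_dt, D, Q):
    """Evaluate (4.7)-(4.8) for d = 1 (illustrative helper, not from the paper).

    dF_dt, dF_dx, d2F_dx2: partial derivatives of the spatial part F-bar at (t, x).
    dF0_dt: derivative of the time change F^0 at t (assumed nonzero).
    D, Q: mean derivative and quadratic mean derivative of X at time t.
    """
    # Assumed coordinate form of the total mean derivative applied to F-bar.
    total_mean = dF_dt + dF_dx * D + 0.5 * d2F_dx2 * Q
    D_new = total_mean / dF0_dt          # formula (4.7)
    Q_new = dF_dx * dF_dx * Q / dF0_dt   # formula (4.8)
    return D_new, Q_new


# Parabolic scaling F(t, x) = (lam**2 * t, lam * x) with lam = 2.
lam = 2.0
D_new, Q_new = prolong_1d(0.0, lam, 0.0, lam**2, D=3.0, Q=1.0)
print(D_new, Q_new)  # 1.5 1.0
```

Under this scaling the quadratic mean derivative is unchanged while the mean derivative is divided by `lam`, in line with the familiar scaling behaviour of Brownian motion with drift.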

4.2 Symmetries of SDEs

As an important application of the prolongations of diffusions and bundle homomorphisms, we now study the symmetries of stochastic differential equations. As in Lie's classical theory of symmetries of ODEs, a symmetry of a stochastic differential equation is a space-time transformation that maps solutions to solutions. But this alone is not sufficient in the stochastic case. As mentioned in Section 4.1, the only smooth transformations on $\mathbb{R}\times M$ mapping diffusions to diffusions are bundle endomorphisms. Moreover, a solution of a stochastic differential equation is always accompanied by a filtration, which is also altered under space-time transformations. Thus, we have the following definition:

Definition 4.9 (Symmetries).

Given a stochastic differential equation S×𝒯SMS\subset\mathbb{R}\times\mathcal{T}^{S}M, a symmetry of SS is a bundle automorphism FF on (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) projecting to F0F^{0} such that if (X,{𝒫t})(X,\{\mathcal{P}_{t}\}) is a solution of SS, then so is (FX,{𝒫(F0)1(s)})(F\cdot X,\{\mathcal{P}_{(F^{0})^{-1}(s)}\}).

Using the definitions of stochastic differential equations and pushforwards, we have the following equivalent characterization of symmetries.

Lemma 4.10.

Let SS be a stochastic differential equation on MM. A bundle automorphism FF on (×M,π,)(\mathbb{R}\times M,\pi,\mathbb{R}) is a symmetry of SS, if and only if, whenever j(t,q)XSj_{(t,q)}X\in S we have jF(j(t,q)X)SjF(j_{(t,q)}X)\in S, or equivalently, jF(S)SjF(S)\subset S.

Recall that the infinitesimal versions of bundle homomorphisms are the so-called projectable or fiber-preserving vector fields. More precisely, a vector field $V$ on $\mathbb{R}\times M$ is called $\pi$-projectable if the (local) flow (or one-parameter group action) generated by $V$ consists of (local) bundle endomorphisms on $(\mathbb{R}\times M,\pi,\mathbb{R})$ (cf. [72, Example 2.22] or [77, Proposition 3.2.15]). For such a vector field, we define its prolongation to be the infinitesimal generator of the prolonged flow.

Definition 4.11 (Stochastic prolongations of projectable vector fields).

Let VV be a π\pi-projectable vector field on ×M\mathbb{R}\times M, with corresponding (local) flow ψ={ψϵ}ϵ(ε,ε)\psi=\{\psi_{\epsilon}\}_{\epsilon\in(-\varepsilon,\varepsilon)}. Then, the stochastic prolongation of VV, denoted by jVjV, will be a vector field on the model jet bundle ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M, defined as the infinitesimal generator of the corresponding prolonged flow {jψϵ}ϵ(ε,ε)\{j\psi_{\epsilon}\}_{\epsilon\in(-\varepsilon,\varepsilon)}. In other words, jVjV is a vector field on ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M defined by

\[
jV\big|_{j_{(t,q)}X}=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}(j\psi_{\epsilon})(j_{(t,q)}X),
\]

for any j(t,q)X×𝒯SMj_{(t,q)}X\in\mathbb{R}\times\mathcal{T}^{S}M.

Now we can define infinitesimal versions of symmetries.

Definition 4.12 (Infinitesimal symmetries).

Let SS be a stochastic differential equation on MM. An infinitesimal symmetry of SS is a π\pi-projectable vector field VV on ×M\mathbb{R}\times M whose stochastic prolongation jVjV is tangent to SS.

The following properties follow straightforwardly from definitions.

Lemma 4.13.

Given a stochastic differential equation SS on MM, let VV be a complete π\pi-projectable vector field on ×M\mathbb{R}\times M and ψ={ψϵ}ϵ\psi=\{\psi_{\epsilon}\}_{\epsilon\in\mathbb{R}} be its flow. Then
(i) VV is an infinitesimal symmetry of SS if and only if jV(Θ)=0jV(\Theta)=0 for every local defining function Θ\Theta of SS;
(ii) VV is an infinitesimal symmetry of SS if and only if for each ϵ\epsilon\in\mathbb{R}, ψϵ\psi_{\epsilon} is a symmetry of SS.

4.3 Stochastic prolongation formulae

We consider a coordinate chart (×U(1),(t,x(1)))(\mathbb{R}\times U^{(1)},(t,x^{(1)})) on the model jet bundle ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M, which is induced by the coordinate chart (U,(xi))(U,(x^{i})) on MM. A π\pi-projectable vector field VV on ×M\mathbb{R}\times M has the following local coordinate representation

\[
V_{(t,q)}=V^{0}(t)\frac{\partial}{\partial t}\bigg|_{t}+V^{i}(t,q)\frac{\partial}{\partial x^{i}}\bigg|_{q}. \tag{4.9}
\]

Its prolongation $jV$ is a vector field on $\mathbb{R}\times\mathcal{T}^{S}M$ of the form

\[
jV\big|_{j_{(t,q)}X}=V^{0}(t)\frac{\partial}{\partial t}\bigg|_{t}+V^{i}(t,q)\frac{\partial}{\partial x^{i}}\bigg|_{j_{(t,q)}X}+V^{i}_{1}(j_{(t,q)}X)\frac{\partial}{\partial D^{i}x}\bigg|_{j_{(t,q)}X}+V^{jk}_{2}(j_{(t,q)}X)\frac{\partial}{\partial Q^{jk}x}\bigg|_{j_{(t,q)}X}.
\]

Now we use Lemma 4.8 to compute the coefficients V1iV^{i}_{1}’s and V2jkV^{jk}_{2}’s.

Theorem 4.14.

Suppose VV is complete and π\pi-projectable and has the local representation (4.9). Then in the canonical coordinates (t,x(1))(t,x^{(1)}), the coefficient functions of its prolongation jVjV are given by the following formulae:

\begin{align}
V^{i}_{1}(t,x^{(1)})&=(\mathbf{D}_{t}V^{i})(t,x^{(1)})-\dot{V}^{0}(t)D^{i}x, \tag{4.10}\\
V^{jk}_{2}(t,x^{(1)})&=\frac{\partial V^{j}}{\partial x^{i}}(t,x)Q^{ik}x+\frac{\partial V^{k}}{\partial x^{i}}(t,x)Q^{ij}x-\dot{V}^{0}(t)Q^{jk}x. \tag{4.11}
\end{align}
Proof.

Let ψ={ψϵ}ϵ\psi=\{\psi_{\epsilon}\}_{\epsilon\in\mathbb{R}} be the flow generated by VV. Since VV is complete and π\pi-projectable, each ψϵ\psi_{\epsilon} is a bundle endomorphism on ×M\mathbb{R}\times M projecting to a diffeomorphism on \mathbb{R}. Let ψϵ(t,q)=(ψϵ0(t),ψ¯ϵ(t,q))\psi_{\epsilon}(t,q)=(\psi^{0}_{\epsilon}(t),\bar{\psi}_{\epsilon}(t,q)). Note that ψ00(t)=t\psi^{0}_{0}(t)=t, ψ¯0(t,q)=q\bar{\psi}_{0}(t,q)=q and

\[
V^{0}(t)=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\psi^{0}_{\epsilon}(t),\quad V^{i}(t,q)=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\bar{\psi}^{i}_{\epsilon}(t,q).
\]

Let X={X(t)}t[t0,τ)X=\{X(t)\}_{t\in[t_{0},\tau)} be a representative diffusion of j(t0,q)XU(1)j_{(t_{0},q)}X\in U^{(1)}. Then by Lemma 4.2 and Definition 4.4, a representative diffusion of jψϵ(j(t,q)X)j\psi_{\epsilon}(j_{(t,q)}X) is

\[
\tilde{X}_{\epsilon}(s)=\psi_{\epsilon}\cdot X(s)=\bar{\psi}_{\epsilon}\left((\psi^{0}_{\epsilon})^{-1}(s),X((\psi^{0}_{\epsilon})^{-1}(s))\right),\quad s\in[\psi^{0}_{\epsilon}(t_{0}),\psi^{0}_{\epsilon}(\tau)).
\]

Now we apply Lemma 4.8 and take derivatives with respect to ϵ\epsilon. Since ddϵ\frac{d}{d\epsilon} commutes with the total mean derivative 𝐃t\mathbf{D}_{t} as is clear from the coordinate representation, we have

\[
\begin{split}
V_{1}^{i}(j_{(t,q)}X)&=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}(D\tilde{X}_{\epsilon})^{i}(\psi^{0}_{\epsilon}(t))=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\left[(\mathbf{D}_{t}\bar{\psi}_{\epsilon}^{i})\left(j_{(t,X(t))}X\right)\frac{d(\psi_{\epsilon}^{0})^{-1}}{ds}(\psi^{0}_{\epsilon}(t))\right]\\
&=\mathbf{D}_{t}V^{i}(j_{(t,q)}X)-(DX)^{i}(t)\dot{V}^{0}(t).
\end{split}
\]

Also,

\[
\begin{split}
V_{2}^{kl}(j_{(t,q)}X)&=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}(Q\tilde{X}_{\epsilon})^{kl}(\psi^{0}_{\epsilon}(t))\\
&=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\left[\left(\frac{\partial\bar{\psi}_{\epsilon}^{k}}{\partial x^{i}}\frac{\partial\bar{\psi}_{\epsilon}^{l}}{\partial x^{j}}\right)\left(t,X(t)\right)(QX)^{ij}\left(t\right)\frac{d(\psi_{\epsilon}^{0})^{-1}}{ds}(\psi^{0}_{\epsilon}(t))\right]\\
&=\left(\frac{\partial V^{k}}{\partial x^{i}}\delta_{j}^{l}+\delta_{i}^{k}\frac{\partial V^{l}}{\partial x^{j}}\right)(t,X(t))(QX)^{ij}(t)-\delta_{i}^{k}\delta_{j}^{l}(QX)^{ij}(t)\dot{V}^{0}(t)\\
&=\frac{\partial V^{k}}{\partial x^{i}}(t,q)(QX)^{il}(t)+\frac{\partial V^{l}}{\partial x^{j}}(t,q)(QX)^{jk}(t)-(QX)^{kl}(t)\dot{V}^{0}(t).
\end{split}
\]

In the induced coordinate system $(t,x^{(1)})=(t,x^{i},D^{i}x,Q^{jk}x)$, the last two formulae read as (4.10) and (4.11), respectively. ∎
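As a sanity check of (4.10) and (4.11), consider the one-dimensional scaling field $V=2t\,\partial_{t}+x\,\partial_{x}$. This field is a hypothetical example chosen here, not one taken from the text, and the computation assumes that the total mean derivative has the coordinate form $\mathbf{D}_{t}=\partial_{t}+D^{i}x\,\partial_{i}+\frac{1}{2}Q^{jk}x\,\partial_{j}\partial_{k}$ suggested by the proof of Lemma 4.8.

```latex
% Data of the example: V^0(t) = 2t, V^1(t,x) = x, so \dot V^0 = 2.
\begin{align*}
V_1 &= \mathbf{D}_t V^1 - \dot V^0\, Dx = Dx - 2\,Dx = -Dx, \\
V_2 &= 2\,\frac{\partial V^1}{\partial x}\, Qx - \dot V^0\, Qx = 2\,Qx - 2\,Qx = 0, \\
jV  &= 2t\,\frac{\partial}{\partial t} + x\,\frac{\partial}{\partial x}
       - Dx\,\frac{\partial}{\partial Dx}.
\end{align*}
```

So, under these assumptions, the prolonged scaling field rescales the drift coordinate and leaves the diffusivity coordinate untouched.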

Stochastic analogs of contact structure on ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M and Cartan symmetries will be discussed in Appendix B. It turns out that the infinitesimal symmetry of the mixed-order Cartan distribution is equivalent to stochastic prolongation formulae of Theorem 4.14.

Applying Theorem 4.14 to the system of mean differential equations (2.22), we have

Corollary 4.15.

The complete and π\pi-projectable vector field VV in (4.9) is an infinitesimal symmetry of MDEs (2.22) if and only if the coefficients V0V^{0} and ViV^{i}’s satisfy the following “determining equations”:

\begin{align}
V^{0}\frac{\partial\mathfrak{b}^{i}}{\partial t}+V^{j}\frac{\partial\mathfrak{b}^{i}}{\partial x^{j}}&=\frac{\partial V^{i}}{\partial t}+\frac{\partial V^{i}}{\partial x^{j}}\mathfrak{b}^{j}+\frac{1}{2}\frac{\partial^{2}V^{i}}{\partial x^{j}\partial x^{k}}\sigma^{j}_{r}\sigma^{k}_{r}-\dot{V}^{0}\mathfrak{b}^{i}, \notag\\
V^{0}\frac{\partial(\sigma_{r}^{j}\sigma_{r}^{k})}{\partial t}+V^{i}\frac{\partial(\sigma_{r}^{j}\sigma_{r}^{k})}{\partial x^{i}}&=\frac{\partial V^{j}}{\partial x^{i}}\sigma^{i}_{r}\sigma^{k}_{r}+\frac{\partial V^{k}}{\partial x^{i}}\sigma^{i}_{r}\sigma^{j}_{r}-\dot{V}^{0}\sigma^{j}_{r}\sigma^{k}_{r}. \tag{4.12}
\end{align}
Proof.

We apply Lemma 4.13.(i) to (3.9) and then use Theorem 4.14, to get

\begin{align*}
V^{0}\frac{\partial\mathfrak{b}^{i}}{\partial t}+V^{j}\frac{\partial\mathfrak{b}^{i}}{\partial x^{j}}&=\mathbf{D}_{t}V^{i}-\dot{V}^{0}D^{i}x,\\
V^{0}\frac{\partial(\sigma_{r}^{j}\sigma_{r}^{k})}{\partial t}+V^{i}\frac{\partial(\sigma_{r}^{j}\sigma_{r}^{k})}{\partial x^{i}}&=\frac{\partial V^{j}}{\partial x^{i}}Q^{ik}x+\frac{\partial V^{k}}{\partial x^{i}}Q^{ij}x-\dot{V}^{0}Q^{jk}x.
\end{align*}

Then we use the coordinate representation (4.5) for the total mean derivative 𝐃t\mathbf{D}_{t} and plug equation (3.9) in; the results follow. ∎
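To see the determining equations of Corollary 4.15 at work, here is a worked example under an assumed choice of coefficients: dimension $d=1$, $\mathfrak{b}=0$ and $\sigma=1$, i.e. standard Brownian motion. The constants $a_{1},a_{2},a_{3}$ and the function $c(t)$ are introduced only for this illustration.

```latex
% Assumed data: d = 1, \mathfrak{b} = 0, \sigma = 1.
% Second determining equation (4.12) with j = k = 1:
\[
0 = 2\,\partial_x V^1 - \dot V^0
\quad\Longrightarrow\quad
V^1(t,x) = \tfrac{1}{2}\,\dot V^0(t)\,x + c(t).
\]
% First determining equation:
\[
0 = \partial_t V^1 + \tfrac{1}{2}\,\partial_x^2 V^1
  = \tfrac{1}{2}\,\ddot V^0(t)\,x + \dot c(t)
\quad\Longrightarrow\quad
\ddot V^0 = 0,\quad \dot c = 0.
\]
% General solution:
\[
V = (a_1 + 2 a_2 t)\,\partial_t + (a_2 x + a_3)\,\partial_x .
\]
```

In this example the solutions span a three-dimensional space generated by time translation, space translation and the parabolic scaling $2t\,\partial_{t}+x\,\partial_{x}$.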

Remark 4.16.

In [31], the author proved a result similar to Corollary 4.15, with the following equation instead of equation (4.12):

\[
V^{0}\frac{\partial\sigma_{r}^{j}}{\partial t}+V^{i}\frac{\partial\sigma_{r}^{j}}{\partial x^{i}}=\frac{\partial V^{j}}{\partial x^{i}}\sigma^{i}_{r}-\frac{1}{2}\dot{V}^{0}\sigma^{j}_{r}. \tag{4.13}
\]

Multiplying both sides of (4.13) by $\sigma_{r}^{k}$, summing over $r$, and symmetrizing in the indices $j,k$, one easily gets (4.12). So our determining equations for infinitesimal symmetries are more general than those of [31]. Basically, the paper [31] concerns symmetries of the Itô equation with coefficients $(\mathfrak{b},\sigma)$, while we consider symmetries of the diffusion with generator $(\mathfrak{b},\sigma\circ\sigma^{*})$, or equivalently, of a weak formulation of the SDE. Every symmetry of the former kind is clearly one of the latter, but not vice versa.
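The gap between (4.12) and (4.13) can be seen on a small example chosen here for illustration: planar Brownian motion ($\mathfrak{b}=0$, $\sigma$ the identity) and the rotation generator $V=-x^{2}\partial_{1}+x^{1}\partial_{2}$ with $V^{0}=0$. Since $\sigma$ is constant, the left-hand sides of both equations vanish, so it suffices to evaluate the right-hand sides. The function names below are hypothetical.

```python
# Jacobian of the rotation generator V = (-x^2, x^1): JAC[j][i] = dV^j/dx^i.
JAC = [[0.0, -1.0], [1.0, 0.0]]
# Diffusion coefficient sigma^i_r of planar Brownian motion: SIGMA[i][r].
SIGMA = [[1.0, 0.0], [0.0, 1.0]]


def rhs_strong(jac, sigma):
    """Right-hand side of (4.13) with V^0 = 0, indexed [j][r]."""
    n, m = len(sigma), len(sigma[0])
    return [[sum(jac[j][i] * sigma[i][r] for i in range(n)) for r in range(m)]
            for j in range(n)]


def rhs_weak(jac, sigma):
    """Right-hand side of (4.12) with V^0 = 0, indexed [j][k]."""
    n, m = len(sigma), len(sigma[0])
    # a[j][k] = sum_r sigma^j_r sigma^k_r
    a = [[sum(sigma[j][r] * sigma[k][r] for r in range(m)) for k in range(n)]
         for j in range(n)]
    return [[sum(jac[j][i] * a[i][k] + jac[k][i] * a[i][j] for i in range(n))
             for k in range(n)] for j in range(n)]


print(rhs_weak(JAC, SIGMA))    # all entries vanish: (4.12) holds
print(rhs_strong(JAC, SIGMA))  # nonzero entries: (4.13) fails
```

The first determining equation of Corollary 4.15 also holds here, because $V$ is linear in $x$ and independent of $t$. Rotations are therefore infinitesimal symmetries in the weak sense of (4.12) without satisfying (4.13).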

Now given a linear connection $\nabla$ on $M$, we define the $\nabla$-dependent versions of Definitions 4.1, 4.4 and 4.11. More precisely, for a diffusion $X$ on $M$, we define its $\nabla$-prolongation to be the $TM$-valued diffusion $j^{\nabla}X$ given by $j^{\nabla}X(t)=j^{\nabla}_{X(t)}(\theta_{t}X)$. For a bundle homomorphism $F:(\mathbb{R}\times M,\pi,\mathbb{R})\to(\mathbb{R}\times N,\rho,\mathbb{R})$ projecting to a diffeomorphism $F^{0}:\mathbb{R}\to\mathbb{R}$, the $\nabla$-prolongation of $F$ is the map $j^{\nabla}F:\mathbb{R}\times TM\to\mathbb{R}\times TN$ defined by $j^{\nabla}F(j^{\nabla}_{(t,q)}X)=j^{\nabla}_{F(t,q)}(F\cdot X)$. The $\nabla$-prolongation of a $\pi$-projectable vector field $V$, denoted by $j^{\nabla}V$, is defined to be the infinitesimal generator of the corresponding prolonged flow $\{j^{\nabla}\psi_{\epsilon}\}_{\epsilon\in(-\varepsilon,\varepsilon)}$, so that $j^{\nabla}V$ is a vector field on $\mathbb{R}\times TM$ and has the form

\[
j^{\nabla}V\big|_{j^{\nabla}_{(t,q)}X}=V^{0}(t)\frac{\partial}{\partial t}\bigg|_{t}+V^{i}(t,q)\frac{\partial}{\partial x^{i}}\bigg|_{j^{\nabla}_{(t,q)}X}+V^{i}_{\nabla}(j^{\nabla}_{(t,q)}X)\frac{\partial}{\partial\dot{x}^{i}}\bigg|_{j^{\nabla}_{(t,q)}X},
\]

for $V$ of the form (4.9). If we denote $\bar{V}=V^{i}\frac{\partial}{\partial x^{i}}$ so that $V=V^{0}\frac{\partial}{\partial t}+\bar{V}$, we have

Corollary 4.17.

Under the canonical coordinates $(t,x,\dot{x})$, the coefficients $V^{i}_{\nabla}$ of the $\nabla$-prolongation $j^{\nabla}V$ are given by:

\[
V^{i}_{\nabla}(t,x,\dot{x})=\left(\partial_{t}+\dot{x}^{j}\partial_{j}\right)V^{i}(t,x)+\tfrac{1}{2}Q^{jk}x\left[\nabla^{2}_{\partial_{j},\partial_{k}}\bar{V}+R(\bar{V},\partial_{j})\partial_{k}\right]^{i}(t,x)-\dot{V}^{0}(t)\dot{x}^{i},
\]

where RR is the curvature tensor.

Proof.

By (4.10) and (4.11), we have

\[
\begin{split}
V_{\nabla}^{i}(j_{(t,q)}X)&=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}(D_{\nabla}\tilde{X}_{\epsilon})^{i}(\psi^{0}_{\epsilon}(t))=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\left[(D\tilde{X}_{\epsilon})^{i}(\psi^{0}_{\epsilon}(t))+\frac{1}{2}\Gamma^{i}_{jk}(\tilde{X}_{\epsilon}(t))(Q\tilde{X}_{\epsilon})^{jk}(\psi^{0}_{\epsilon}(t))\right]\\
&=V_{1}^{i}(j_{(t,q)}X)+\frac{1}{2}\Gamma^{i}_{jk}(X(t))V_{2}^{jk}(j_{(t,q)}X)+\frac{1}{2}\frac{\partial\Gamma^{i}_{jk}}{\partial x^{l}}(X(t))(QX)^{jk}(t)V^{l}(t,X(t))\\
&=\left[\frac{\partial}{\partial t}+\left((D_{\nabla}X)^{l}(t)-\frac{1}{2}\Gamma_{jk}^{l}(X(t))(QX)^{jk}(t)\right)\frac{\partial}{\partial x^{l}}+\frac{1}{2}(QX)^{jk}(t)\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\right]V^{i}(t,X(t))-(DX)^{i}(t)\dot{V}^{0}(t)\\
&\quad+\frac{1}{2}\Gamma^{i}_{jk}(X(t))\left[\frac{\partial V^{j}}{\partial x^{l}}(t,X(t))(QX)^{kl}(t)+\frac{\partial V^{k}}{\partial x^{m}}(t,X(t))(QX)^{jm}(t)-(QX)^{jk}(t)\dot{V}^{0}(t)\right]\\
&\quad+\frac{1}{2}\frac{\partial\Gamma^{i}_{jk}}{\partial x^{l}}(X(t))(QX)^{jk}(t)V^{l}(t,X(t))\\
&=\left[\frac{\partial}{\partial t}+(D_{\nabla}X)^{l}(t)\frac{\partial}{\partial x^{l}}\right]V^{i}(t,X(t))+\frac{1}{2}(Q_{\nabla}X)^{jk}(t)\left[\nabla^{2}_{\partial_{j},\partial_{k}}\bar{V}+R(\bar{V},\partial_{j})\partial_{k}\right]^{i}(t,X(t))\\
&\quad-(D_{\nabla}X)^{i}(t)\dot{V}^{0}(t).
\end{split}
\]

The proof is complete. ∎

5 The second-order cotangent bundle

5.1 Second-order covectors

Definition 5.1 (Second-order cotangent space).

The second-order cotangent space at $q\in M$ is the dual vector space of $\mathcal{T}^{S}_{q}M$, denoted by $\mathcal{T}^{S*}_{q}M$. The pairing of $\alpha\in\mathcal{T}^{S*}_{q}M$ and $A\in\mathcal{T}^{S}_{q}M$ is denoted by $\langle\alpha,A\rangle$ or $\alpha(A)$. Elements of $\mathcal{T}^{S*}_{q}M$ are called second-order covectors at $q$. The disjoint union $\mathcal{T}^{S*}M:=\amalg_{q\in M}\mathcal{T}^{S*}_{q}M$ is called the stochastic cotangent bundle of $M$. The natural projection map from $\mathcal{T}^{S*}M$ to $M$ is denoted by $\tau^{S*}_{M}$. A (local or global) smooth section of $\mathcal{T}^{S*}M$ is called a second-order covector field or a second-order form.

Dual to the left action (2.11) of GIdG_{I}^{d} on fibers of 𝒯SM\mathcal{T}^{S}M, GIdG_{I}^{d} will act on those of 𝒯SM\mathcal{T}^{S*}M from the right.

Lemma 5.2.

The stochastic cotangent bundle $(\mathcal{T}^{S*}M,\tau^{S*}_{M},M)$ is the fiber bundle dual to $(\mathcal{T}^{S}M,\tau^{S}_{M},M)$, with structure group $G_{I}^{d}$ acting on the typical fiber $(\mathbb{R}^{d}\times\mathrm{Sym}^{2}(\mathbb{R}^{d}))^{*}$ from the right by

\[
(p,o)\cdot(g,\kappa)=(g^{*}p,\kappa^{*}p+(g^{*}\otimes g^{*})o),
\]

for all $(g,\kappa)\in G_{I}^{d}$, $p\in(\mathbb{R}^{d})^{*}$, $o\in(\mathrm{Sym}^{2}(\mathbb{R}^{d}))^{*}$.

The notion of second-order forms should not be confused with the classical one of 2-forms. There are two basic examples of second-order forms, namely $d^{2}f$ and $df\cdot dg$, where $f$ and $g$ are given smooth functions on $M$. They are defined as follows: for $A\in\mathcal{T}^{S}M$,

\[
\langle d^{2}f,A\rangle:=Af,\qquad\langle df\cdot dg,A\rangle:=\Gamma_{A}(f,g)=A(fg)-fAg-gAf, \tag{5.1}
\]

where ΓA\Gamma_{A} is the squared field operator defined in (2.8). These notations go back to L. Schwartz [82] and P.A. Meyer [65] (see also [25, Chapters VI and VII]), where the term d2fd^{2}f is called the second differential of ff, and the term dfdgdf\cdot dg is called the symmetric product of dfdf and dgdg. Note that in these original references, there is a factor 12\frac{1}{2} at the RHS of the definition of dfdgdf\cdot dg. Here we drop this factor. Obviously, when restricted to TMTM, the second differential d2fd^{2}f is just the differential dfdf but the symmetric product dfdgdf\cdot dg vanishes.

The definition of the symmetric product $df\cdot dg$ yields two properties: $df\cdot dg$ is symmetric in $f$ and $g$; and $(df\cdot dg)_{q}=0$ if one of $df_{q}$ and $dg_{q}$ vanishes. These lead to a more general definition of the symmetric product of two 1-forms. More precisely, let $\omega,\eta\in\mathcal{T}^{*}_{q}M$; then there exist smooth functions $f$ and $g$ on $M$ such that $\omega=df_{q}$ and $\eta=dg_{q}$. By the second property, the second-order covector $(df\cdot dg)_{q}$ does not depend on the choice of $f$ and $g$, and we denote it by $\omega\cdot\eta$. Now if $\omega,\eta$ are 1-forms, then their symmetric product is defined pointwise through $(\omega\cdot\eta)_{q}=\omega_{q}\cdot\eta_{q}$. More formally, we have

Definition 5.3 (Symmetric product, [25, Chapter VI]).

There exists a unique fiber-linear bundle homomorphism \bullet from TMTMT^{*}M\otimes T^{*}M to 𝒯SM\mathcal{T}^{S*}M, which is called the symmetric product, such that for all ω,ηTM\omega,\eta\in T^{*}M, (ωη)=ωη\bullet(\omega\otimes\eta)=\omega\cdot\eta.

It is easy to verify from (5.1) that the local frame, dual to (2.12), for (𝒯SM,τSM,M)(\mathcal{T}^{S*}M,\tau^{S*}_{M},M) over the local chart (U,(xi))(U,(x^{i})) is given by (see also [25, Chapter VI])

{d2xi,12dxidxi,dxjdxk:1id,1j<kd}.\left\{d^{2}x^{i},\textstyle{\frac{1}{2}}dx^{i}\cdot dx^{i},dx^{j}\cdot dx^{k}:1\leq i\leq d,1\leq j<k\leq d\right\}.

We adopt the convention that dxkdxj=dxjdxkdx^{k}\cdot dx^{j}=dx^{j}\cdot dx^{k} for all 1j<kd1\leq j<k\leq d. Under this frame, a second-order covector α𝒯SqM\alpha\in\mathcal{T}^{S*}_{q}M has a local expression

\[
\alpha=\alpha_{i}d^{2}x^{i}|_{q}+\tfrac{1}{2}\alpha_{jk}dx^{j}\cdot dx^{k}|_{q}, \tag{5.2}
\]

where $\alpha_{jk}$ is symmetric in $j,k$. The coordinates $(x^{i})$ induce a canonical coordinate system on $\mathcal{T}^{S*}M$, denoted by $(x^{i},p_{i},o_{jk})$ and defined by

\[
x^{i}(\alpha)=x^{i}(q),\quad p_{i}(\alpha)=\alpha_{i},\quad o_{jk}(\alpha)=\alpha_{jk}, \tag{5.3}
\]

for $\alpha$ in (5.2). Since the coefficients $(\alpha_{i})$ do transform like a covector, as indicated in Lemma 5.2, it will cause no ambiguity to retain $(x^{i},p_{i})$ as canonical coordinates on $T^{*}M$. As in classical geometric mechanics [1, 38], we still call the coordinates $(p_{i})$ the conjugate momenta, and we shall call the second-order coordinates $(o_{jk})$ the conjugate diffusivities.

The pairing of $\alpha$ and the second-order vector field $A$ in (2.7) is then

\[
\langle\alpha,A\rangle=\alpha_{i}A^{i}+\alpha_{jk}A^{jk}.
\]

It follows from (5.1) and (2.8) that for smooth functions $f$ and $g$ on $M$,

\[
d^{2}f=\frac{\partial f}{\partial x^{i}}d^{2}x^{i}+\frac{1}{2}\frac{\partial^{2}f}{\partial x^{j}\partial x^{k}}dx^{j}\cdot dx^{k},\qquad df\cdot dg=\frac{\partial f}{\partial x^{i}}\frac{\partial g}{\partial x^{j}}dx^{i}\cdot dx^{j}.
\]
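A useful consequence of (5.1), worked out here as an illustration rather than quoted from the text, is a Leibniz-type rule for the second differential. The sketch uses only the two defining pairings in (5.1); the choice $f=x^{1}$, $g=x^{2}$ is an arbitrary example.

```latex
% For smooth f, g and any second-order vector A, by (5.1):
\[
\langle d^2(fg), A\rangle = A(fg) = f\,Ag + g\,Af + \Gamma_A(f,g)
  = \langle f\,d^2g + g\,d^2f + df\cdot dg,\ A\rangle ,
\]
% hence
\[
d^2(fg) = f\,d^2g + g\,d^2f + df\cdot dg .
\]
% Example with f = x^1 and g = x^2:
\[
d^2(x^1 x^2) = x^2\,d^2x^1 + x^1\,d^2x^2 + dx^1\cdot dx^2 .
\]
```

The last line agrees with the local expression for $d^{2}f$, since the mixed second derivatives of $x^{1}x^{2}$ contribute $\frac{1}{2}(dx^{1}\cdot dx^{2}+dx^{2}\cdot dx^{1})=dx^{1}\cdot dx^{2}$.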

More generally, for 1-forms ω\omega and η\eta with local expressions ω=ωidxi\omega=\omega_{i}dx^{i} and η=ηidxi\eta=\eta_{i}dx^{i}, the symmetric product ωη\omega\cdot\eta has local expression

ωη=ωiηjdxidxj.\omega\cdot\eta=\omega_{i}\eta_{j}dx^{i}\cdot dx^{j}. (5.4)

Dual to the tangent case, there is indeed a canonical bundle epimorphism ϱ^:(𝒯SM,τSM,M)(TM,τM,M)\hat{\varrho}^{*}:(\mathcal{T}^{S*}M,\tau^{S*}_{M},M)\to(T^{*}M,\tau^{*}_{M},M), given by

ϱ^(α)=α|TM.\hat{\varrho}^{*}(\alpha)=\alpha|_{TM}.

In particular $\hat{\varrho}^{*}(d^{2}f)=df$. In local coordinates, $\hat{\varrho}^{*}$ reads as

\[
\hat{\varrho}^{*}\left(\alpha_{i}d^{2}x^{i}|_{q}+\tfrac{1}{2}\alpha_{jk}dx^{j}\cdot dx^{k}|_{q}\right)=\alpha_{i}dx^{i}|_{q}.
\]

The map $\hat{\varrho}^{*}$ is well defined since $\alpha|_{TM}$ is a covector. Clearly, $\hat{\varrho}^{*}$ is also a surjective submersion, so that $\mathcal{T}^{S*}M$ is a fiber bundle over $T^{*}M$. Occasionally, we will use the notation $\hat{\varrho}^{*}_{M}$ to indicate the base manifold $M$.

However, there is no canonical bundle monomorphism from $T^{*}M$ to $\mathcal{T}^{S*}M$ which is a left inverse of $\hat{\varrho}^{*}$ and linear in fiber. We call such a bundle monomorphism a fiber-linear bundle injection from $T^{*}M$ to $\mathcal{T}^{S*}M$. Similarly to Proposition 2.11, we also have a connection correspondence property. Namely, a linear connection $\nabla$ on $M$ induces a fiber-linear bundle injection from $T^{*}M$ to $\mathcal{T}^{S*}M$ by

\[
\hat{\iota}^{*}_{\nabla}:T^{*}M\to\mathcal{T}^{S*}M,\quad dx^{i}|_{q}\mapsto d^{2}x^{i}|_{q}+\tfrac{1}{2}\Gamma_{jk}^{i}(q)dx^{j}\cdot dx^{k}|_{q}=:d^{\nabla}x^{i}|_{q}, \tag{5.5}
\]

or in local coordinates $\hat{\iota}^{*}_{\nabla}(x,p)=(x,p,(\Gamma_{jk}^{i}(x)p_{i}))$. Conversely, any fiber-linear bundle injection from $T^{*}M$ to $\mathcal{T}^{S*}M$ induces a torsion-free linear connection on $M$.

Denote by $\mathrm{Sym}^{2}(T^{*}M)$ the subbundle of $T^{*}M\otimes T^{*}M$ consisting of all symmetric $(0,2)$-tensors on $M$. Then the symmetric product $\bullet$, when restricted to $\mathrm{Sym}^{2}(T^{*}M)$, is a bundle monomorphism whose image is the kernel of $\hat{\varrho}^{*}$. Conversely, still by the connection correspondence, a linear connection $\nabla$ induces a fiber-linear bundle epimorphism from $\mathcal{T}^{S*}M$ to $\mathrm{Sym}^{2}(T^{*}M)$ which is a right inverse of $\bullet$ and is given by

\[
\varrho^{*}_{\nabla}:\mathcal{T}^{S*}M\to\mathrm{Sym}^{2}(T^{*}M),\quad\alpha_{i}d^{2}x^{i}|_{q}+\tfrac{1}{2}\alpha_{jk}dx^{j}\cdot dx^{k}|_{q}\mapsto\left(\alpha_{jk}-\alpha_{i}\Gamma_{jk}^{i}(q)\right)dx^{j}\otimes dx^{k}|_{q}.
\]

We introduce the $\nabla$-dependent coordinates $(o_{jk}^{\nabla})$ by $o_{jk}^{\nabla}(\alpha)=\alpha_{jk}-\alpha_{i}\Gamma_{jk}^{i}(q)$ for $\alpha$ in (5.2), i.e.,

\[
o_{jk}^{\nabla}=o_{jk}-p_{i}(\Gamma_{jk}^{i}\circ x). \tag{5.6}
\]

Then $\varrho^{*}_{\nabla}(\alpha)=o_{jk}^{\nabla}(\alpha)dx^{j}\otimes dx^{k}|_{q}$ and in particular

\[
\varrho^{*}_{\nabla}(d^{2}f)=\left(\frac{\partial^{2}f}{\partial x^{j}\partial x^{k}}-\Gamma_{jk}^{i}\frac{\partial f}{\partial x^{i}}\right)dx^{j}\otimes dx^{k}=\nabla^{2}f.
\]

The coordinates (xi,pi,ojk)(x^{i},p_{i},o_{jk}^{\nabla}) form a coordinate system on 𝒯SM\mathcal{T}^{S*}M, which we call the \nabla-canonical coordinate system. The coordinates (xi,ojk)(x^{i},o_{jk}^{\nabla}) also form a coordinate system on Sym2(TM)\mathrm{Sym}^{2}(T^{*}M) when restricted to it. We will call the coordinates (ojk)(o^{\nabla}_{jk}) the tensorial conjugate diffusivities.
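As an illustration of the tensorial conjugate diffusivities (5.6), consider an assumed setting not taken from the text: the Euclidean plane in polar coordinates $(r,\theta)$ with its Levi-Civita connection, whose nonzero Christoffel symbols are $\Gamma^{r}_{\theta\theta}=-r$ and $\Gamma^{\theta}_{r\theta}=\Gamma^{\theta}_{\theta r}=1/r$, and the function $f=r^{2}$.

```latex
% Canonical coordinates of d^2 f for f = r^2:
\[
p_r = 2r,\quad p_\theta = 0,\qquad
o_{rr} = 2,\quad o_{r\theta} = 0,\quad o_{\theta\theta} = 0 .
\]
% Tensorial conjugate diffusivities from (5.6):
\[
o^\nabla_{rr} = 2,\qquad
o^\nabla_{r\theta} = -\,p_\theta\,\Gamma^\theta_{r\theta} = 0,\qquad
o^\nabla_{\theta\theta} = -\,p_r\,\Gamma^r_{\theta\theta} = 2r^2 .
\]
% Hence
\[
\varrho^*_\nabla(d^2 f) = 2\,dr\otimes dr + 2r^2\,d\theta\otimes d\theta = \nabla^2 f .
\]
```

The result is twice the Euclidean metric $dr\otimes dr+r^{2}d\theta\otimes d\theta$, as expected for the Hessian of the squared distance to the origin, whereas the non-tensorial coordinates $(o_{jk})$ alone do not reveal this.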

To sum up, we have the following short exact sequence which is split when a linear connection is provided:

0Sym2(TM)𝒯SMϱ^TM0.0\longrightarrow\mathrm{Sym}^{2}(T^{*}M)\stackrel{{\scriptstyle\bullet}}{{\longrightarrow}}\mathcal{T}^{S*}M\stackrel{{\scriptstyle\hat{\varrho}^{*}}}{{\longrightarrow}}T^{*}M\longrightarrow 0. (5.7)

It is easy to check that the bundle homomorphisms $\hat{\varrho}^{*}$, $\hat{\iota}^{*}_{\nabla}$, $\bullet$ and $\varrho^{*}_{\nabla}$ are dual to $\iota$, $\varrho_{\nabla}$, $\hat{\varrho}$ and $\hat{\iota}_{\nabla}$ in (2.13), (2.14), (2.15) and (2.16), respectively, so that the short exact sequence (5.7) is dual to (2.17). Similarly to (2.18), we have the following decomposition if a linear connection $\nabla$ is given,

\[
\mathcal{T}^{S*}M=\hat{\iota}^{*}_{\nabla}(T^{*}M)\oplus\bullet\left(\mathrm{Sym}^{2}(T^{*}M)\right)\cong T^{*}M\oplus\mathrm{Sym}^{2}(T^{*}M),
\]

where $\oplus$ is a fiber-wise direct sum and the fiber-wise isomorphism $\cong$ is given by

\[
\alpha=\alpha_{i}d^{\nabla}x^{i}|_{q}+\tfrac{1}{2}\left(\alpha_{jk}-\alpha_{i}\Gamma_{jk}^{i}(q)\right)dx^{j}\cdot dx^{k}|_{q}\mapsto\left(\alpha_{i}dx^{i}|_{q},\left(\alpha_{jk}-\alpha_{i}\Gamma_{jk}^{i}(q)\right)dx^{j}\otimes dx^{k}|_{q}\right).
\]

In particular,

\[
d^{2}f=\partial_{i}f\,d^{\nabla}x^{i}+\tfrac{1}{2}\nabla^{2}_{\partial_{j},\partial_{k}}f\,dx^{j}\cdot dx^{k}\mapsto(df,\nabla^{2}f). \tag{5.8}
\]

Similarly to the classical cotangent space, the second-order cotangent space may be defined via germs. To be precise, we denote by $C_{q}^{\infty}(M)$ the set of all germs of smooth functions at $q\in M$, and define an equivalence relation between germs: $[f]_{q},[g]_{q}\in C_{q}^{\infty}(M)$ are equivalent if and only if their Taylor expansions at $q$ agree in all terms of order one and two. Then one can easily check that there is a one-to-one correspondence between $\mathcal{T}^{S*}_{q}M$ and the quotient space of $C_{q}^{\infty}(M)$ by this equivalence relation. In this way, we can also observe the following diffeomorphism:

\[
\mathcal{T}^{S*}M\times\mathbb{R}\cong J^{2}\hat{\pi}, \tag{5.9}
\]

given by mapping $(d^{2}f_{q},f(q))$ to $j^{2}_{q}f$, where $J^{2}\hat{\pi}$ is the classical second-order jet bundle of $(M\times\mathbb{R},\hat{\pi},M)$. This is similar to the fact that $T^{*}M\times\mathbb{R}$ is diffeomorphic to the first-order jet bundle $J^{1}\hat{\pi}$ (e.g., [32, Example 2.5.11] or [77, Example 4.1.15]). We denote the natural projection maps from $\mathcal{T}^{S*}M\times\mathbb{R}$ to $\mathbb{R}$ and from $T^{*}M\times\mathbb{R}$ to $\mathbb{R}$ by $\hat{\pi}^{2}_{0,1}$ and $\hat{\pi}^{1}_{0,1}$, respectively.

The relations and projection maps are integrated into the following commutative diagram:

[Commutative diagram with nodes $J^{1}\hat{\pi}\cong T^{*}M\times\mathbb{R}$, $J^{2}\hat{\pi}\cong\mathcal{T}^{S*}M\times\mathbb{R}$, $\mathcal{T}^{S*}M$, $T^{*}M$, $M\times\mathbb{R}$, $M$ and $\mathbb{R}$, and arrows labeled $\hat{\pi}_{1}$, $\hat{\pi}_{1,0}$, $\hat{\pi}^{1}_{0,1}$, $\hat{\pi}_{2,0}$, $\hat{\pi}^{2}_{0,1}$, $\hat{\pi}_{2,1}$, $\hat{\pi}_{2}$, $\hat{\pi}_{1,1}$, $\tau^{S*}_{M}$, $\hat{\varrho}^{*}$, $\tau^{*}_{M}$, $\pi$ and $\hat{\pi}$.]
Remark 5.4.

(i). As in Remark 3.5, given a linear connection \nabla, we can obtain a one-to-one correspondence between (TMSym2(TM))×(T^{*}M\oplus\mathrm{Sym}^{2}(T^{*}M))\times\mathbb{R} and J2π^J^{2}\hat{\pi} by mapping (dfq,2fq,f(q))(df_{q},\nabla^{2}f_{q},f(q)) to j2qfj^{2}_{q}f. One can find in [19] an application of the jet-like structure on TMSym2(TM)T^{*}M\oplus\mathrm{Sym}^{2}(T^{*}M) and higher-order bundles to Martin Hairer’s theory of regularity structures [36].

(ii). As we have seen, the product ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M is the model bundle of the stochastic jet space 𝒥SM\mathcal{J}^{S}M, while the product 𝒯SM×\mathcal{T}^{S*}M\times\mathbb{R} is diffeomorphic to the second-order jet bundle J2π^J^{2}\hat{\pi}. So, in a way, we can say that the “stochastic” and the “second-order” are dual to each other. This stochastic–second-order duality is somehow analogous to the particle–wave duality in quantum mechanics.

5.2 Second-order tangent and cotangent maps

Definition 5.5 (Second-order tangent and cotangent maps, [25, Chapter VI]).

Let MM and NN be two smooth manifolds, F:MNF:M\to N be a smooth map. The second-order tangent map of FF at qMq\in M is a linear map d2Fq:𝒯SqM𝒯SF(q)Nd^{2}F_{q}:\mathcal{T}^{S}_{q}M\to\mathcal{T}^{S}_{F(q)}N defined by

d2Fq(A)f=A(fF),for A𝒯SqM,fC(N).d^{2}F_{q}(A)f=A(f\circ F),\quad\text{for }A\in\mathcal{T}^{S}_{q}M,f\in C^{\infty}(N).

The second-order cotangent map of FF at qMq\in M is a linear map d2Fq:𝒯SF(q)N𝒯SqMd^{2}F^{*}_{q}:\mathcal{T}^{S*}_{F(q)}N\to\mathcal{T}^{S*}_{q}M dual to d2Fqd^{2}F_{q}, that is,

d2Fq(α)(A)=α(d2Fq(A)),for A𝒯SqM,α𝒯SF(q)N.d^{2}F^{*}_{q}(\alpha)(A)=\alpha(d^{2}F_{q}(A)),\quad\text{for }A\in\mathcal{T}^{S}_{q}M,\alpha\in\mathcal{T}^{S*}_{F(q)}N.
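To make the definition of $d^{2}F_{q}$ concrete, here is a minimal one-dimensional sketch, assuming $M=N=\mathbb{R}$ and writing a second-order vector as $A=a\,\partial_{x}+b\,\partial_{x}^{2}$. Expanding $A(f\circ F)$ with the chain rule gives $(aF'+bF'')\,f'(F)+bF'^{2}\,f''(F)$, which is what the hypothetical helper `second_order_pushforward_1d` returns.

```python
def second_order_pushforward_1d(a, b, dF, d2F):
    """Coefficients of d^2F_q(A) for A = a d/dx + b d^2/dx^2 in dimension one.

    dF, d2F: first and second derivatives of F at q.
    Returns (coefficient of d/dy, coefficient of d^2/dy^2) at F(q),
    obtained by expanding A(f o F) with the chain rule.
    Illustrative helper only; the name is not from the paper.
    """
    first_order = a * dF + b * d2F   # multiplies f'(F(q))
    second_order = b * dF ** 2       # multiplies f''(F(q))
    return first_order, second_order


# F(x) = x**2 at q = 3, so F'(q) = 6 and F''(q) = 2.
print(second_order_pushforward_1d(1, 0.5, 6, 2))  # (7.0, 18.0)
# A first-order vector (b = 0) is pushed forward by the classical tangent map.
print(second_order_pushforward_1d(2, 0, 6, 2))    # (12, 0)
```

The extra term `b * d2F` in the first-order coefficient is the familiar Itô-type correction: a second-order vector does not stay purely second-order under a nonlinear map.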

The restriction of $d^{2}F_{q}$ to $T_{q}M$ coincides with the classical tangent map $dF_{q}$. But this is not the case for $d^{2}F^{*}_{q}$ when restricted to $T^{*}_{F(q)}N$, since for $\alpha\in T^{*}_{F(q)}N$, $d^{2}F^{*}_{q}(\alpha)$ is still a linear map on $\mathcal{T}^{S}_{q}M$. These phenomena can be seen explicitly in local coordinates, as in the following lemma.

Lemma 5.6.

Let $(U,(x^{i}))$ and $(V,(y^{j}))$ be local coordinate charts around $q$ and $F(q)$, respectively. If

\[
A=A^{i}\frac{\partial}{\partial x^{i}}\bigg|_{q}+A^{ij}\frac{\partial^{2}}{\partial x^{i}\partial x^{j}}\bigg|_{q}\quad\text{and}\quad\alpha=\alpha_{i}d^{2}y^{i}|_{F(q)}+\alpha_{ij}dy^{i}\cdot dy^{j}|_{F(q)},
\]

then

\begin{align*}
d^{2}F_{q}(A)&=(AF^{i})\frac{\partial}{\partial y^{i}}\bigg|_{F(q)}+\frac{1}{2}\Gamma_{A}(F^{i},F^{j})\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg|_{F(q)},\\
d^{2}F^{*}_{q}(\alpha)&=\alpha_{i}d^{2}F^{i}|_{q}+\alpha_{ij}dF^{i}\cdot dF^{j}|_{q}.
\end{align*}

Now if $A\in T_{q}M$, then all the $A^{ij}$ vanish and thereby so do the $\Gamma_{A}(F^{i},F^{j})$’s. Thus, $d^{2}F_{q}(A)=(AF^{i})\frac{\partial}{\partial y^{i}}|_{F(q)}=dF_{q}(A)$. This makes clear that $d^{2}F_{q}|_{T_{q}M}=dF_{q}$. But if $\alpha\in T^{*}_{F(q)}N$, then the $\alpha_{ij}$’s vanish and

$$d^{2}F^{*}_{q}(\alpha)=\alpha_{i}\,d^{2}F^{i}|_{q}=\alpha_{i}\frac{\partial F^{i}}{\partial x^{j}}(q)\,d^{2}x^{j}|_{q}+\alpha_{i}\frac{\partial^{2}F^{i}}{\partial x^{j}\partial x^{k}}(q)\,dx^{j}\cdot dx^{k}|_{q},$$

while $dF^{*}_{q}(\alpha)=\alpha_{i}\,dF^{i}|_{q}=\alpha_{i}\frac{\partial F^{i}}{\partial x^{j}}(q)\,dx^{j}|_{q}$. Hence, $d^{2}F^{*}_{q}|_{T^{*}_{F(q)}N}\neq dF^{*}_{q}$.
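As a quick illustration of this asymmetry, consider the one-dimensional case $M=N=\mathbb{R}$ with $F(x)=x^{2}$; this choice is ours and is not an example taken from the text. The sketch below uses only the duality defining $d^{2}F^{*}_{q}$ together with the pairing $\langle d^{2}f,A\rangle=Af$, which we assume from the preceding subsection.

```latex
% Illustrative data (assumed): M = N = \mathbb{R}, F(x) = x^2, \alpha = d^2 y|_{F(q)}.
% Take a second-order tangent vector A = a\,\partial_x|_q + b\,\partial_x^2|_q.
\begin{aligned}
\langle d^{2}F^{*}_{q}(d^{2}y),A\rangle
  &=\langle d^{2}y,\,d^{2}F_{q}(A)\rangle
   =\big(d^{2}F_{q}(A)\big)y
   =A(y\circ F)=A(x^{2})\\
  &=2q\,a+2b .
\end{aligned}
% The classical pullback dF^*_q(dy) = 2q\,dx pairs only with first-order vectors and returns 2q\,a;
% the additional term 2b is produced by the second derivative of F.
```

The term $2b$ is exactly what the classical cotangent map cannot see, which is one way to read the inequality $d^{2}F^{*}_{q}|_{T^{*}_{F(q)}N}\neq dF^{*}_{q}$.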

Definition 5.7 (Second-order pushforwards and pullbacks).

Let $F:M\to N$ be a smooth map. The second-order pushforward by $F$ is a bundle homomorphism $F^{S}_{*}:(\mathcal{T}^{S}M,\tau^{S}_{M},M)\to(\mathcal{T}^{S}N,\tau^{S}_{N},N)$ defined by

$$F^{S}_{*}|_{\mathcal{T}^{S}_{q}M}=d^{2}F_{q}.$$

Given a second-order form $\alpha$ on $N$, the second-order pullback of $\alpha$ by $F$ is a second-order form $F^{S*}\alpha$ on $M$ defined by

$$(F^{S*}\alpha)_{q}=d^{2}F^{*}_{q}\left(\alpha_{F(q)}\right),\quad q\in M.$$

Let $F$ be a diffeomorphism. The second-order pullback by $F$ is a bundle isomorphism $F^{S*}:(\mathcal{T}^{S*}N,\tau^{S*}_{N},N)\to(\mathcal{T}^{S*}M,\tau^{S*}_{M},M)$ defined by

$$F^{S*}|_{\mathcal{T}^{S*}_{q'}N}=d^{2}F^{*}_{F^{-1}(q')}.$$

Given a second-order vector field $A$ on $M$, the second-order pushforward of $A$ by $F$ is a second-order vector field $F^{S}_{*}A$ on $N$ defined by

$$(F^{S}_{*}A)_{q'}=d^{2}F_{F^{-1}(q')}\left(A_{F^{-1}(q')}\right),\quad q'\in N.$$

Clearly, $F^{S}_{*}|_{TM}=F_{*}$ is the usual pushforward, but $F^{S*}|_{T^{*}N}\neq F^{*}$. The following properties are straightforward.

Lemma 5.8.

Let $F:M\to N$, $G:N\to K$ be two smooth maps. Let $A$ be a second-order vector field on $M$ and $f,g$ be two smooth functions on $N$.
(i) $G^{S}_{*}\circ F^{S}_{*}=(G\circ F)^{S}_{*}$.
(ii) If $F$ is a diffeomorphism, then $((F^{S}_{*}A)f)\circ F=A(f\circ F)$.
(iii) $F^{S*}(d^{2}f)=d^{2}(f\circ F)$, $F^{S*}(df\cdot dg)=d(f\circ F)\cdot d(g\circ F)$.
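
For instance, properties (i) and (ii) follow by unwinding Definitions 5.5 and 5.7. A short sketch, where $h\in C^{\infty}(K)$ is an auxiliary test function introduced here:

```latex
% (i): chain rule, for A \in \mathcal{T}^{S}_{q}M and a test function h \in C^{\infty}(K)
d^{2}G_{F(q)}\big(d^{2}F_{q}(A)\big)h
  = d^{2}F_{q}(A)(h\circ G)
  = A(h\circ G\circ F)
  = d^{2}(G\circ F)_{q}(A)\,h .

% (ii): evaluate at q' = F(q), for F a diffeomorphism
\big((F^{S}_{*}A)f\big)(F(q))
  = d^{2}F_{q}(A_{q})f
  = A_{q}(f\circ F).
```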

5.3 Mixed-order tangent and cotangent bundles

In this section, we will extend the notions of the previous two sections to the product manifold $\mathbb{R}\times M$.

Definition 5.9.

The mixed-order tangent bundle of $\mathbb{R}\times M$ is the product bundle ([77, Definition 1.4.1]) $(T\mathbb{R}\times\mathcal{T}^{S}M,\tau_{\mathbb{R}}\times\tau^{S}_{M},\mathbb{R}\times M)$. The mixed-order cotangent bundle of $\mathbb{R}\times M$ is the product bundle $(T^{*}\mathbb{R}\times\mathcal{T}^{S*}M,\tau^{*}_{\mathbb{R}}\times\tau^{S*}_{M},\mathbb{R}\times M)$. A section of the mixed-order tangent or cotangent bundle is called a mixed-order vector field or mixed-order form, respectively.

The mixed-order tangent and cotangent bundles are dual to each other. The mixed-order tangent (or cotangent) bundle mixes the first-order tangent (or cotangent) bundle in time with the second-order one in space, which is why we use the terminology “mixed-order”. It also matches the fundamental principle of stochastic analysis, summarized by Itô’s rule $(dX(t))^{2}\sim dt$.

For an $M$-valued diffusion $X$ with (time-dependent) generator $A^{X}$, we call the operator $\frac{\partial}{\partial t}+A^{X}$ its extended generator. This extended generator is a mixed-order vector field on $\mathbb{R}\times M$. Also note that the extended generator $\frac{\partial}{\partial t}+A^{X}$ of $X\in I_{t_{0}}(M)$ can be characterized by the property that for every $f\in C^{\infty}(\mathbb{R}\times M)$, the process

$$f(t,X(t))-f(t_{0},X(t_{0}))-\int_{t_{0}}^{t}\left(\frac{\partial}{\partial t}+A^{X}\right)f(s,X(s))\,ds,\quad t\geq t_{0},$$

is a real-valued continuous $\{\mathcal{P}_{t}\}$-martingale. In general, a mixed-order vector field $A$ has the following local expression:

$$A=A^{0}\frac{\partial}{\partial t}+A^{i}\frac{\partial}{\partial x^{i}}+A^{jk}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}.$$

To give an example of mixed-order forms, we consider a smooth function $f$ on $\mathbb{R}\times M$, and define in local coordinates

$$d^{\circ}f:=\frac{\partial f}{\partial t}\,dt+\frac{\partial f}{\partial x^{i}}\,d^{2}x^{i}+\frac{1}{2}\frac{\partial^{2}f}{\partial x^{j}\partial x^{k}}\,dx^{j}\cdot dx^{k}.$$

Then $d^{\circ}f$ is a mixed-order form, and we call it the mixed differential of $f$. Clearly, the pairing of the mixed differential $d^{\circ}f$ and a mixed-order vector field $A$ is $\langle d^{\circ}f,A\rangle=Af$.
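
A simple worked case may help fix ideas. The sketch below assumes $M=\mathbb{R}$ and takes $X=W$ to be a standard Brownian motion, so that $A^{X}=\frac{1}{2}\partial_{x}^{2}$; the test function $f(t,x)=x^{2}-t$ is our choice for illustration.

```latex
% Assumed example: M = \mathbb{R}, X = W standard Brownian motion, f(t,x) = x^2 - t.
d^{\circ}f = -\,dt + 2x\,d^{2}x + \tfrac12\cdot 2\,dx\cdot dx
           = -\,dt + 2x\,d^{2}x + dx\cdot dx,
\qquad
\Big\langle d^{\circ}f,\ \frac{\partial}{\partial t}+\frac12\frac{\partial^{2}}{\partial x^{2}}\Big\rangle
  = \Big(\frac{\partial}{\partial t}+\frac12\frac{\partial^{2}}{\partial x^{2}}\Big)(x^{2}-t)
  = -1+1 = 0 .
% The integral term in the martingale characterization therefore vanishes,
% recovering the familiar martingale W(t)^2 - t.
```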

Given a bundle homomorphism $F:(\mathbb{R}\times M,\pi,\mathbb{R})\to(\mathbb{R}\times N,\rho,\mathbb{R})$, we define its mixed-order tangent map at $(t,q)\in\mathbb{R}\times M$ by

$$d^{\circ}F_{(t,q)}=d^{2}F_{(t,q)}|_{T_{t}\mathbb{R}\times\mathcal{T}^{S}_{q}M}:T\mathbb{R}\times\mathcal{T}^{S}M|_{(t,q)}\to T\mathbb{R}\times\mathcal{T}^{S}N|_{F(t,q)}.$$

Its mixed-order cotangent map at $(t,q)\in\mathbb{R}\times M$ is defined as the linear map $d^{\circ}F^{*}_{(t,q)}:T^{*}\mathbb{R}\times\mathcal{T}^{S*}N|_{F(t,q)}\to T^{*}\mathbb{R}\times\mathcal{T}^{S*}M|_{(t,q)}$ dual to $d^{\circ}F_{(t,q)}$. If, moreover, $F$ is a bundle isomorphism, its mixed-order pushforward and pullback, denoted by $F^{R}_{*}$ and $F^{R*}$, respectively, can be defined in a similar manner to Definition 5.7. We leave their detailed but cumbersome definitions and properties to Appendix A.1.

6 Stochastic Hamiltonian mechanics

6.1 Horizontal diffusions

In this section, we consider a general fiber bundle $(E,\pi_{M},M)$ over a manifold $M$, with fiber dimension $n$. We first introduce a special class of diffusions on this fiber bundle, which we call horizontal diffusions. They are defined in a similar fashion as the horizontal subspaces in Definition 3.7. Roughly speaking, a horizontal diffusion process on $E$ is a diffusion that is random only “horizontally”, but not on fibers.

Definition 6.1 (Horizontal diffusions on fiber bundles).

Let $(E,\pi_{M},M)$ be a fiber bundle. An $E$-valued diffusion process $\mathbf{X}$ is said to be horizontal if there exist an $M$-valued diffusion process $X$ and a smoothly time-dependent section $\phi=(\phi_{t})$ of $\pi_{M}$ such that a.s. $\mathbf{X}(t)=\phi(t,X(t))$ for all $t$.

The process $X$ in the above definition is just the projection of $\mathbf{X}$, since $\pi_{M}(\mathbf{X}(t))=\pi_{M}(\phi(t,X(t)))=X(t)$ a.s. Since the projection map $\pi_{M}$ is smooth, $X$ is still a diffusion process.

Now we are going to define a subclass of “integral processes” for second-order vector fields on $E$ by making use of horizontal diffusions. We use $(x^{i},u^{\mu})$ for an adapted coordinate system on $E$ (see [77, Definition 1.1.5]), where Greek indices label the fiber coordinates.

Given a second-order vector field with local expression

$$A=A^{i}\frac{\partial}{\partial x^{i}}+A^{\mu}\frac{\partial}{\partial u^{\mu}}+A^{jk}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}+A^{j\mu}\frac{\partial^{2}}{\partial x^{j}\partial u^{\mu}}+A^{\mu\nu}\frac{\partial^{2}}{\partial u^{\mu}\partial u^{\nu}},\tag{6.1}$$

where $A^{i},A^{\mu},A^{jk},A^{j\mu},A^{\mu\nu}$ are smooth functions in the local chart of $E$, by a horizontal integral process of $A$ in (6.1) we mean an $E$-valued horizontal diffusion process $\mathbf{X}$ such that $\mathbf{X}$ is an integral process of $A$ in the sense of (2.22), that is, it is determined by the system

$$\left\{\begin{aligned}
(D(x\circ\mathbf{X}))^{i}(t)&=A^{i}(\mathbf{X}(t)),\\
(Q(x\circ\mathbf{X}))^{jk}(t)&=2A^{jk}(\mathbf{X}(t)),\\
(D(u\circ\mathbf{X}))^{\mu}(t)&=A^{\mu}(\mathbf{X}(t)),\\
(Q(x\circ\mathbf{X},u\circ\mathbf{X}))^{j\mu}(t)&=A^{j\mu}(\mathbf{X}(t)),\\
(Q(u\circ\mathbf{X}))^{\mu\nu}(t)&=2A^{\mu\nu}(\mathbf{X}(t)),
\end{aligned}\right.\tag{6.2}$$

where the expression $x\circ\mathbf{X}$ means that the family of coordinate functions $(x^{i})$ acts on $\mathbf{X}$, and so on. Set $\mathbf{X}(t)=\phi(t,X(t))$ for some time-dependent section $\phi$ of $\pi_{M}$ and $M$-valued diffusion $X$. Denote $\phi^{\mu}=u^{\mu}\circ\phi$. By Itô’s formula, the system (6.2) can be written as

$$\left\{\begin{aligned}
&(DX)^{i}(t)=A^{i}(\phi(t,X(t))),\\
&(QX)^{jk}(t)=2A^{jk}(\phi(t,X(t))),\\
&\bigg(\frac{\partial}{\partial t}+A^{i}(\phi(t,X(t)))\frac{\partial}{\partial x^{i}}+A^{jk}(\phi(t,X(t)))\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg)\phi^{\mu}(t,X(t))=A^{\mu}(\phi(t,X(t))),\\
&2A^{jk}(\phi(t,X(t)))\frac{\partial\phi^{\mu}}{\partial x^{k}}(t,X(t))=A^{j\mu}(\phi(t,X(t))),\\
&A^{jk}(\phi(t,X(t)))\frac{\partial\phi^{\mu}}{\partial x^{j}}\frac{\partial\phi^{\nu}}{\partial x^{k}}(t,X(t))=A^{\mu\nu}(\phi(t,X(t))).
\end{aligned}\right.\tag{6.3}$$

If $X(t)$ has full support for all $t$, then the last three equations in (6.3) translate into a system of (possibly degenerate) parabolic equations on $E$,

$$\left\{\begin{aligned}
&\bigg(\frac{\partial}{\partial t}+A^{i}(\phi(t,q))\frac{\partial}{\partial x^{i}}+A^{jk}(\phi(t,q))\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg)\phi^{\mu}(t,q)=A^{\mu}(\phi(t,q)),\\
&2A^{jk}(\phi(t,q))\frac{\partial\phi^{\mu}}{\partial x^{k}}(t,q)=A^{j\mu}(\phi(t,q)),\\
&A^{jk}(\phi(t,q))\frac{\partial\phi^{\mu}}{\partial x^{j}}\frac{\partial\phi^{\nu}}{\partial x^{k}}(t,q)=A^{\mu\nu}(\phi(t,q)).
\end{aligned}\right.\tag{6.4}$$

Therefore, under suitable assumptions on the coefficients $A^{i},A^{\mu},A^{jk},A^{j\mu},A^{\mu\nu}$, the system (6.4) is solvable, at least locally, by some time-dependent local section $\phi=(\phi_{t})$ over a time interval $[0,T]$. Then, plugging $\phi$ into the first two equations of (6.3), we can find $X$ and hence $\mathbf{X}$. We call $X$ a projective integral process of $A$.
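
To see the system (6.4) at work, here is a minimal sketch on the trivial line bundle $E=\mathbb{R}\times\mathbb{R}$ over $M=\mathbb{R}$, with the section $\phi(t,x)=x$ and $X$ a standard Brownian motion. These choices are ours, made purely for illustration.

```latex
% Assumed data: A^{i} = 0,\ A^{jk} = \tfrac12 \ (\text{so that } X = W),\quad \phi^{\mu}(t,x) = x .
% Reading (6.4) from left to right then fixes the remaining coefficients:
A^{\mu} = \Big(\frac{\partial}{\partial t} + \frac12\frac{\partial^{2}}{\partial x^{2}}\Big)x = 0,
\qquad
A^{j\mu} = 2\cdot\frac12\cdot\frac{\partial\phi}{\partial x} = 1,
\qquad
A^{\mu\nu} = \frac12\Big(\frac{\partial\phi}{\partial x}\Big)^{2} = \frac12 ,

% so that the second-order vector field (6.1) reads
A = \frac12\frac{\partial^{2}}{\partial x^{2}} + \frac{\partial^{2}}{\partial x\,\partial u} + \frac12\frac{\partial^{2}}{\partial u^{2}}
  = \frac12\Big(\frac{\partial}{\partial x}+\frac{\partial}{\partial u}\Big)^{2},
% which is the generator of the diagonal process \mathbf{X}(t) = (W(t), W(t)).
```

In this example the randomness of $\mathbf{X}$ is entirely inherited from $W$ through the section, which is the meaning of “horizontal” above.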

6.2 The second-order symplectic structure on $\mathcal{T}^{S*}M$ and stochastic Hamilton’s equations

It is well known that the classical cotangent bundle $T^{*}M$ has a natural symplectic structure, given by the canonical symplectic form $\omega_{0}=dx^{i}\wedge dp_{i}$, where $(x^{i},p_{i})$ are the canonical local coordinates on $T^{*}M$ induced by local coordinates $(x^{i})$ on $M$. Clearly $\omega_{0}$ is closed, because it is exact: $\omega_{0}=-d\theta_{0}$, where $\theta_{0}=p_{i}dx^{i}$ is called the Poincaré (or tautological) 1-form.

Now we need to define a similar structure on the second-order cotangent bundle $\mathcal{T}^{S*}M$, which is a second-order counterpart of the symplectic structure. Firstly, we adapt the coordinate-free definition of the tautological 1-form to the second-order case.

Definition 6.2.

The second-order tautological form $\theta$ is a second-order form on $\mathcal{T}^{S*}M$ defined by

$$\theta_{\alpha}=d^{2}(\tau_{M}^{S*})_{\alpha}^{*}(\alpha),\quad\alpha\in\mathcal{T}^{S*}_{q}M.$$

Under the induced coordinate system on $\mathcal{T}^{S*}M$ defined in (5.3), the second-order tautological form $\theta$ has the following coordinate representation

$$\theta=p_{i}\,d^{2}x^{i}+\tfrac{1}{2}o_{jk}\,dx^{j}\cdot dx^{k}.\tag{6.5}$$

We introduce the canonical second-order symplectic form $\omega$ on $\mathcal{T}^{S*}M$ by writing $\omega=-d^{2}\theta$. Although we do not define the exterior differential for second-order forms, we can still take $d^{2}$ formally on both sides of (6.5), using Leibniz’s rule and the composition rule $d\circ d=d^{2}$ (cf. [66, Section 6.(e)]), and forcing $d^{3}=0$ and $(d^{2}-)\cdot(d-)=(d-)\cdot(d^{2}-)=0$. Then, we get

$$\begin{split}\omega=&\ d\left(d^{2}x^{i}\wedge dp_{i}+\tfrac{1}{2}dx^{j}\cdot dx^{k}\wedge do_{jk}-p_{i}\,d^{3}x^{i}+o_{jk}\,d^{2}x^{j}\wedge dx^{k}\right)\\ =&\ d^{2}x^{i}\wedge d^{2}p_{i}+\tfrac{1}{2}dx^{j}\cdot dx^{k}\wedge d^{2}o_{jk}.\end{split}\tag{6.6}$$

We call the pair $(\mathcal{T}^{S*}M,\omega)$ a second-order symplectic manifold. The complete axiom system for a second-order differential system $(d,d^{2},\wedge,\cdot)$ is beyond the scope of this paper.

Remark 6.3.

In the formal expression $(d\circ d)f=d^{2}f$, $f\in C^{\infty}(M)$, the two differential operators $d$ on the LHS are different. The second $d$ is still de Rham’s exterior differential on $M$, while the first needs to be understood as the exterior differential on $TM$ by regarding the first differential $df$ as a function on $TM$. Thus the complete expression should be $d_{TM}\circ d_{M}=d^{2}$. In this way, the differential operator $d_{TM}$ can be extended to a linear transform that maps 1-forms to 2nd-order forms and satisfies Leibniz’s rule, see [25, Theorem 7.1]. We shall denote the linear operator extended from $d_{TM}$ by $\mathbf{d}$ in order to distinguish it from $d$. In local coordinates, it acts on a 1-form $\eta=\eta_{i}dx^{i}$ by $\mathbf{d}\eta=\eta_{i}\,d^{2}x^{i}+\frac{1}{2}\frac{\partial\eta_{i}}{\partial x^{j}}\,dx^{i}\cdot dx^{j}$, so that $\hat{\varrho}^{*}(\mathbf{d}\eta)=\eta$ and $d^{2}=\mathbf{d}\circ d$. When a linear connection $\nabla$ is specified, $\mathbf{d}\eta=\eta_{i}\,d^{\nabla}x^{i}+\frac{1}{2}\nabla\eta(\partial_{i},\partial_{j})\,dx^{i}\cdot dx^{j}$, which covers (5.8).

As in the classical case, we have the following property for the second-order tautological form.

Lemma 6.4.

The second-order tautological form $\theta$ is the unique second-order form on $\mathcal{T}^{S*}M$ with the property that, for every second-order form $\alpha$ on $M$, $\alpha^{S*}\theta=\alpha$.

Proof.

From Lemma 5.8, we have, for any second-order vector $A\in\mathcal{T}^{S}_{q}M$,

$$\begin{aligned}
\langle(\alpha^{S*}\theta)_{q},A\rangle
&=\langle\theta_{\alpha_{q}},d^{2}\alpha_{q}(A)\rangle
=\langle d^{2}(\tau_{M}^{S*})_{\alpha_{q}}^{*}(\alpha_{q}),d^{2}\alpha_{q}(A)\rangle\\
&=\langle\alpha_{q},d^{2}(\tau_{M}^{S*})_{\alpha_{q}}\circ d^{2}\alpha_{q}(A)\rangle
=\langle\alpha_{q},A\rangle,
\end{aligned}$$

since $\tau_{M}^{S*}\circ\alpha=\mathbf{Id}_{M}$. ∎
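
In the induced coordinates, the identity $\alpha^{S*}\theta=\alpha$ can also be read off from (6.5). The following sketch assumes, as is standard for pullbacks, that $\alpha^{S*}$ is compatible with multiplication by functions, and uses Lemma 5.8 (iii) together with $x^{i}\circ\alpha=x^{i}$.

```latex
% Write the section in coordinates as \alpha(x) = (x^{i}, p_{i}(x), o_{jk}(x)).
\alpha^{S*}\theta
  = (p_{i}\circ\alpha)\,d^{2}(x^{i}\circ\alpha)
    + \tfrac12\,(o_{jk}\circ\alpha)\,d(x^{j}\circ\alpha)\cdot d(x^{k}\circ\alpha)
  = p_{i}(x)\,d^{2}x^{i} + \tfrac12\,o_{jk}(x)\,dx^{j}\cdot dx^{k}
  = \alpha .
```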

Recall that, in Definition 5.7, we have defined the second-order pullbacks of second-order forms. Now, given a smooth map $\mathbf{F}:\mathcal{T}^{S*}M\to\mathcal{T}^{S*}N$ and a second-order 2-form $\eta$ on $\mathcal{T}^{S*}N$, we may also define the second-order pullback $\mathbf{F}^{S*}\eta$ of $\eta$ by $\mathbf{F}$ by allowing $\mathbf{F}^{S*}$ to commute with the symmetric product $\cdot$ as well as the wedge product $\wedge$. Then, as a corollary of Lemma 6.4, we have

$$\alpha^{S*}\omega=-d^{2}\alpha.$$
Definition 6.5.

Let $\omega$ and $\eta$ be the canonical second-order symplectic forms on $\mathcal{T}^{S*}M$ and $\mathcal{T}^{S*}N$, respectively. A bundle homomorphism $\mathbf{F}:(\mathcal{T}^{S*}M,\hat{\varrho}^{*}_{M},T^{*}M)\to(\mathcal{T}^{S*}N,\hat{\varrho}^{*}_{N},T^{*}N)$ is called second-order symplectic, or a second-order symplectomorphism, if $\mathbf{F}^{S*}\eta=\omega$.

Theorem 6.6.

Let $F:N\to M$ be a diffeomorphism. The second-order pullback $F^{S*}:\mathcal{T}^{S*}M\to\mathcal{T}^{S*}N$ by $F$ is second-order symplectic; in fact $(F^{S*})^{S*}\vartheta=\theta$, where $\vartheta$ is the second-order tautological form on $\mathcal{T}^{S*}N$.

Proof.

For $q\in M$, $\alpha_{q}\in\mathcal{T}^{S*}_{q}M$ and $A\in\mathcal{T}^{S}_{\alpha_{q}}\mathcal{T}^{S*}M$,

$$\begin{split}
\langle(F^{S*})^{S*}\vartheta,A\rangle&=\langle\vartheta,d^{2}(F^{S*})_{\alpha_{q}}A\rangle=\langle d^{2}(\tau_{N}^{S*})_{F^{S*}(\alpha_{q})}^{*}(F^{S*}(\alpha_{q})),d^{2}(F^{S*})_{\alpha_{q}}A\rangle\\
&=\langle F^{S*}(\alpha_{q}),d^{2}(\tau_{N}^{S*})_{F^{S*}(\alpha_{q})}\circ d^{2}(F^{S*})_{\alpha_{q}}A\rangle\\
&=\langle\alpha_{q},d^{2}F_{F^{-1}(q)}\circ d^{2}(\tau_{N}^{S*})_{F^{S*}(\alpha_{q})}\circ d^{2}(F^{S*})_{\alpha_{q}}A\rangle\\
&=\langle\alpha_{q},d^{2}(\tau_{M}^{S*})_{\alpha_{q}}A\rangle\\
&=\langle d^{2}(\tau_{M}^{S*})^{*}_{\alpha_{q}}(\alpha_{q}),A\rangle\\
&=\langle\theta_{\alpha_{q}},A\rangle,
\end{split}$$

where we used the fact that $F\circ\tau_{N}^{S*}\circ F^{S*}=\tau_{M}^{S*}$ in the fourth line. ∎

Clearly, the counterparts of Hamiltonian vector fields on $T^{*}M$ are now second-order vector fields on $\mathcal{T}^{S*}M$. Note that for a second-order vector field $A$ on $\mathcal{T}^{S*}M$, the form $A\lrcorner\,\omega$ takes values in the second-order cotangent bundle $\mathcal{T}^{S*}\mathcal{T}^{S*}M$.

Definition 6.7.

Let $H:\mathcal{T}^{S*}M\to\mathbb{R}$ be a given smooth function. A second-order vector field $A_{H}$ on $\mathcal{T}^{S*}M$ satisfying

$$A_{H}\lrcorner\,\omega=d^{2}H\tag{6.7}$$

is called a second-order Hamiltonian vector field of $H$. We call the triple $(\mathcal{T}^{S*}M,\omega,H)$ a second-order Hamiltonian system. The function $H$ is called the second-order Hamiltonian of the system.

According to (6.7), the 2nd-order vector field $A_{H}$ satisfies

$$A_{H}H=d^{2}H(A_{H})=\omega(A_{H},A_{H})=0.\tag{6.8}$$

The condition (6.7) does not determine $A_{H}$ uniquely. It is easy to verify that $A_{H}$ is of the general form

$$\begin{split}A_{H}=&\ \frac{\partial H}{\partial p_{i}}\frac{\partial}{\partial x^{i}}-\frac{\partial H}{\partial x^{i}}\frac{\partial}{\partial p_{i}}+\frac{\partial H}{\partial o_{jk}}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}-\left(\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}+C_{jk}\right)\frac{\partial}{\partial o_{jk}}\\ &\ +A_{jk}\frac{\partial^{2}}{\partial p_{j}\partial p_{k}}+A_{ijkl}\frac{\partial^{2}}{\partial o_{ij}\partial o_{kl}}+A^{j}_{k}\frac{\partial^{2}}{\partial x^{j}\partial p_{k}}+A^{j}_{kl}\frac{\partial^{2}}{\partial x^{j}\partial o_{kl}}+A_{jkl}\frac{\partial^{2}}{\partial p_{j}\partial o_{kl}},\end{split}\tag{6.9}$$

where the coefficients $C_{jk},A_{jk},A_{ijkl},A^{j}_{k},A^{j}_{kl},A_{jkl}$ are smooth functions on the local chart satisfying

$$C_{jk}\frac{\partial H}{\partial o_{jk}}=A_{jk}\frac{\partial^{2}H}{\partial p_{j}\partial p_{k}}+A_{ijkl}\frac{\partial^{2}H}{\partial o_{ij}\partial o_{kl}}+A^{j}_{k}\frac{\partial^{2}H}{\partial x^{j}\partial p_{k}}+A^{j}_{kl}\frac{\partial^{2}H}{\partial x^{j}\partial o_{kl}}+A_{jkl}\frac{\partial^{2}H}{\partial p_{j}\partial o_{kl}},$$

such that the local expression (6.9) is invariant under the canonical change of coordinates on $\mathcal{T}^{S*}M$ induced by a change of coordinates on $M$, governed by the structure group in Lemma 5.2.

Given such a second-order Hamiltonian vector field of $H$, its horizontal integral process is a $\mathcal{T}^{S*}M$-valued horizontal diffusion $\mathbf{X}$ determined by the following MDEs on $\mathcal{T}^{S*}M$,

$$\left\{\begin{aligned}
(D(x\circ\mathbf{X}))^{i}(t)&=\frac{\partial H}{\partial p_{i}}(\mathbf{X}(t)),\\
(Q(x\circ\mathbf{X}))^{jk}(t)&=2\frac{\partial H}{\partial o_{jk}}(\mathbf{X}(t)),\\
(D(p\circ\mathbf{X}))_{i}(t)&=-\frac{\partial H}{\partial x^{i}}(\mathbf{X}(t)),\\
(D(o\circ\mathbf{X}))_{jk}(t)&=-\left(\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}+C_{jk}\right)(\mathbf{X}(t)),\\
\left(C_{ij}\frac{\partial H}{\partial o_{ij}}\right)(\mathbf{X}(t))&=\frac{1}{2}(Q(p\circ\mathbf{X}))_{jk}(t)\frac{\partial^{2}H}{\partial p_{j}\partial p_{k}}(\mathbf{X}(t))+\frac{1}{2}(Q(o\circ\mathbf{X}))_{ijkl}(t)\frac{\partial^{2}H}{\partial o_{ij}\partial o_{kl}}(\mathbf{X}(t))\\
&\quad+(Q(x\circ\mathbf{X},p\circ\mathbf{X}))^{j}_{k}\frac{\partial^{2}H}{\partial x^{j}\partial p_{k}}(\mathbf{X}(t))+(Q(x\circ\mathbf{X},o\circ\mathbf{X}))^{j}_{kl}\frac{\partial^{2}H}{\partial x^{j}\partial o_{kl}}(\mathbf{X}(t))\\
&\quad+(Q(p\circ\mathbf{X},o\circ\mathbf{X}))_{jkl}\frac{\partial^{2}H}{\partial p_{j}\partial o_{kl}}(\mathbf{X}(t)),
\end{aligned}\right.\tag{6.10}$$

or, in coordinates,

$$\left\{\begin{aligned}
D^{i}x&=\frac{\partial H}{\partial p_{i}},\\
Q^{jk}x&=2\frac{\partial H}{\partial o_{jk}},\\
D_{i}p&=-\frac{\partial H}{\partial x^{i}},\\
D_{jk}o&=-\left(\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}+C_{jk}\right),\\
C_{ij}\frac{\partial H}{\partial o_{ij}}&=\frac{1}{2}Q_{jk}p\frac{\partial^{2}H}{\partial p_{j}\partial p_{k}}+\frac{1}{2}Q_{ijkl}o\frac{\partial^{2}H}{\partial o_{ij}\partial o_{kl}}+Q^{j}_{k}(x,p)\frac{\partial^{2}H}{\partial x^{j}\partial p_{k}}\\
&\quad+Q^{j}_{kl}(x,o)\frac{\partial^{2}H}{\partial x^{j}\partial o_{kl}}+Q_{jkl}(p,o)\frac{\partial^{2}H}{\partial p_{j}\partial o_{kl}},
\end{aligned}\right.$$

where $\big(x^{i},p_{i},o_{jk},D^{i}x,D_{i}p,D_{jk}o,Q^{jk}x,Q_{jk}p,Q_{ijkl}o,Q^{j}_{k}(x,p),Q^{j}_{kl}(x,o),Q_{jkl}(p,o)\big)$ are canonical coordinates on $\mathcal{T}^{S}\mathcal{T}^{S*}M$. The first and third equations have been conjectured in [89] as stochastic Hamilton’s equations in the Euclidean space, since they have the same form as classical Hamilton’s equations (e.g., [1, Proposition 3.3.2]) except that the mean derivative $D$ replaces the classical time derivative.

At first glance, one may think that the system (6.10) is underdetermined, as there are fewer equations than unknowns (the number of unknowns is equal to the fiber dimension of $\mathcal{T}^{S}\mathcal{T}^{S*}M$). Besides, we have not yet given (6.10) initial or terminal data. These points will become clear after we make the following observations. Firstly, the first two equations of (6.10) constitute MDEs that are equivalent to an Itô SDE for $x(\mathbf{X})$ in the weak sense, as we have seen in Section 2.4. So $x(\mathbf{X})$ should be assigned an initial value, say,

$$\text{Law}((x\circ\mathbf{X})(0))=\mu_{0},\tag{6.11}$$

where $\mu_{0}$ is a given probability measure on $M$. Secondly, in the third and fourth equations of (6.10), only the “drift” information of $p(\mathbf{X})$ and $o(\mathbf{X})$ is clear. To overcome the lack of information, we need to assign $p(\mathbf{X})$ and $o(\mathbf{X})$ terminal values, say,

$$\left\{\begin{aligned}
(p\circ\mathbf{X})(T)&=p^{*}(x\circ\mathbf{X}(T)),\\
(o\circ\mathbf{X})(T)&=o^{*}(x\circ\mathbf{X}(T)),
\end{aligned}\right.\tag{6.12}$$

where $(p^{*},o^{*})$ is a given second-order form. Therefore, the third and fourth equations are understood as backward SDEs, whose drifts rely on diffusion coefficients via the last equation. The system (6.10) together with boundary values (6.11) and (6.12) could be understood as a (coupled) forward-backward system of SDEs [87] (where “backward” is taken in a different sense from ours in Chapter 2).

Notice that those forward-backward SDEs are not necessarily solvable (see [87, Proposition 7.5.2] for an example). In order to solve (6.10)–(6.12), we have to take the horizontal condition into consideration, and make some compatibility assumption. More precisely, we set $X=\tau_{M}^{S*}(\mathbf{X})$ and

$$\mathbf{X}(t)=\alpha(t,X(t)),\tag{6.13}$$

for some time-dependent second-order form $\alpha$ on $M$, and denote $p_{i}(t,x)=p_{i}(\alpha(t,x))$ and $o_{jk}(t,x)=o_{jk}(\alpha(t,x))$, so that $\alpha(t,x)=(p(t,x),o(t,x))$. Assume that for each $t\in(0,T)$, $X(t)$ has full support. Then, by applying Itô’s formula, in the same way as in (6.4), the system (6.10) reduces to

$$\left\{\begin{aligned}
&\bigg(\frac{\partial}{\partial t}+\frac{\partial H}{\partial p_{j}}\frac{\partial}{\partial x^{j}}+\frac{\partial H}{\partial o_{jk}}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg)p_{i}=-\frac{\partial H}{\partial x^{i}},\\
&\bigg(\frac{\partial}{\partial t}+\frac{\partial H}{\partial p_{k}}\frac{\partial}{\partial x^{k}}+\frac{\partial H}{\partial o_{kl}}\frac{\partial^{2}}{\partial x^{k}\partial x^{l}}\bigg)o_{ij}=-\left(\frac{\partial^{2}H}{\partial x^{i}\partial x^{j}}+C_{ij}\right),\\
&C_{ij}\frac{\partial H}{\partial o_{ij}}=\frac{\partial H}{\partial o_{ij}}\bigg(\frac{\partial p_{k}}{\partial x^{i}}\frac{\partial p_{l}}{\partial x^{j}}\frac{\partial^{2}H}{\partial p_{k}\partial p_{l}}+\frac{\partial o_{kl}}{\partial x^{i}}\frac{\partial o_{mn}}{\partial x^{j}}\frac{\partial^{2}H}{\partial o_{kl}\partial o_{mn}}+2\frac{\partial p_{k}}{\partial x^{i}}\frac{\partial^{2}H}{\partial x^{j}\partial p_{k}}\\
&\qquad\qquad\qquad+2\frac{\partial o_{kl}}{\partial x^{i}}\frac{\partial^{2}H}{\partial x^{j}\partial o_{kl}}+2\frac{\partial p_{k}}{\partial x^{i}}\frac{\partial o_{lm}}{\partial x^{j}}\frac{\partial^{2}H}{\partial p_{k}\partial o_{lm}}\bigg).
\end{aligned}\right.\tag{6.14}$$

Next, by taking the partial derivative $\frac{\partial}{\partial x^{j}}$ on both sides of the first equation of (6.14) and comparing with the next two, we find the following sufficient condition for the last two equations of (6.14):

$$o_{ij}(t,x)=\frac{\partial p_{i}}{\partial x^{j}}(t,x)=\frac{\partial p_{j}}{\partial x^{i}}(t,x),\tag{6.15}$$

or equivalently, for the terminal value $(p^{*},o^{*})$,

$$o^{*}_{ij}(x)=\frac{\partial p^{*}_{i}}{\partial x^{j}}(x)=\frac{\partial p^{*}_{j}}{\partial x^{i}}(x).\tag{6.16}$$

Equation (6.15) implies that $\alpha$ in (6.13) is “exact”, in the sense that $\alpha=\mathbf{d}\eta$ for the time-dependent 1-form $\eta=p_{i}dx^{i}$, where $\mathbf{d}$ is the extended differential operator defined in Remark 6.3. Similarly, equation (6.16) implies that $(p^{*},o^{*})=\mathbf{d}\eta^{*}$ for the 1-form $\eta^{*}=p^{*}_{i}dx^{i}$. The second equality of (6.15) (or (6.16)), called Onsager reciprocity or Maxwell relations [1, Section 5.3], implies that the 1-form $\eta$ (or $\eta^{*}$) is closed. We will refer to equation (6.15) or (6.16) as second-order Maxwell relations.
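
A basic family satisfying the second-order Maxwell relations is given by second-order differentials of functions. As a sketch, take a smooth time-dependent function $S(t,x)$ (a symbol introduced here for illustration) and set $\eta=dS$, with all differentials taken in the space variable only:

```latex
% Assumed: S = S(t,x) smooth, \eta = dS, i.e. p_{i} = \partial S/\partial x^{i}.
p_{i}=\frac{\partial S}{\partial x^{i}},
\qquad
o_{ij}=\frac{\partial p_{i}}{\partial x^{j}}
      =\frac{\partial^{2}S}{\partial x^{i}\partial x^{j}}
      =\frac{\partial p_{j}}{\partial x^{i}},
\qquad
\alpha=\mathbf{d}(dS)
      =\frac{\partial S}{\partial x^{i}}\,d^{2}x^{i}
       +\frac12\frac{\partial^{2}S}{\partial x^{i}\partial x^{j}}\,dx^{i}\cdot dx^{j}
      =d^{2}S .
```

Here (6.15) holds by the symmetry of second partial derivatives, and the last equality is the rule $d^{2}=\mathbf{d}\circ d$ of Remark 6.3.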

Under the 2nd-order Maxwell relations, the original stochastic Hamilton’s system (6.10) turns into the following coupled MDE-PDE system:

$$\left\{\begin{aligned}
&(DX)^{i}(t)=\frac{\partial H}{\partial p_{i}}(X(t),p(t,X(t)),o(t,X(t))),\\
&(QX)^{jk}(t)=2\frac{\partial H}{\partial o_{jk}}(X(t),p(t,X(t)),o(t,X(t))),\\
&\bigg(\frac{\partial}{\partial t}+\frac{\partial H}{\partial p_{j}}(x,p(t,x),o(t,x))\frac{\partial}{\partial x^{j}}+\frac{\partial H}{\partial o_{jk}}(x,p(t,x),o(t,x))\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg)p_{i}(t,x)=-\frac{\partial H}{\partial x^{i}}(x,p(t,x),o(t,x)),\\
&o_{ij}(t,x)=\frac{\partial p_{i}}{\partial x^{j}}(t,x).
\end{aligned}\right.\tag{6.17}$$

The boundary values in (6.11) and (6.12) now read

$$\text{Law}(X(0))=\mu_{0},\quad(p,o)(T)=\mathbf{d}\eta^{*}.\tag{6.18}$$

We first use the terminal value in (6.18), which satisfies (6.16), to solve the last two PDEs in (6.17). This gives $(p,o)$ and hence the 2nd-order form $\alpha$. Then we plug $p$ and $o$ into the first two MDEs and solve them with the initial distribution in (6.18). This yields in law the $M$-valued diffusion $X=\tau_{M}^{S*}(\mathbf{X})$ as a projective integral process of $A_{H}$.

We call system (6.10) or (6.17) the stochastic Hamilton’s equations (S-H equations for short). The second-order Maxwell relations are sufficient for the component $o$ of $\alpha$ in (6.13) to solve the last two equations of (6.10), so we refer to them as an integrability condition of (6.10). When restricting the setting to Riemannian manifolds, the S-H equations (6.10) can be simplified to a global Hamiltonian-type system on $T^{*}M$, as we will see in Subsection 7.4.2.
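
To indicate how (6.17) connects with Hamilton-Jacobi-Bellman equations, consider the following sketch on $M=\mathbb{R}^{d}$ with the illustrative Hamiltonian $H=\frac{1}{2}\delta^{ij}p_{i}p_{j}+\frac{1}{2}\delta^{ij}o_{ij}$ and the ansatz $p_{i}=\partial_{i}S$. Both are assumptions made here for the example, not the general case treated in the text.

```latex
% Assumed: M = \mathbb{R}^{d},\ H(x,p,o) = \tfrac12\delta^{ij}p_{i}p_{j} + \tfrac12\delta^{ij}o_{ij},\ p_{i} = \partial_{i}S.
\frac{\partial H}{\partial p_{j}} = \delta^{jk}p_{k},\qquad
\frac{\partial H}{\partial o_{jk}} = \tfrac12\delta^{jk},\qquad
\frac{\partial H}{\partial x^{i}} = 0 .

% Third equation of (6.17), using \partial_{i}p_{j} = \partial_{j}p_{i}:
\Big(\frac{\partial}{\partial t} + \delta^{jk}p_{k}\frac{\partial}{\partial x^{j}} + \tfrac12\Delta\Big)p_{i} = 0
\;\Longleftrightarrow\;
\frac{\partial}{\partial x^{i}}\Big(\frac{\partial S}{\partial t} + \tfrac12|\nabla S|^{2} + \tfrac12\Delta S\Big) = 0 ,

% so S solves an HJB-type equation up to a function of t alone, while the first two equations give
(DX)^{i}(t) = \delta^{ij}\partial_{j}S(t,X(t)),\qquad (QX)^{jk}(t) = \delta^{jk}.
```

In this toy case the projective integral process is a Brownian motion with gradient drift $\nabla S$.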

Lemma 6.8.

Let $H:\mathcal{T}^{S*}M\times\mathbb{R}\to\mathbb{R}$ be a time-dependent 2nd-order Hamiltonian, and $\mathbf{X}$ be a horizontal integral process of $A_{H}$. Then, the total mean derivative of $H$ along $\mathbf{X}$ is

$$\mathbf{D}_{t}H=\frac{\partial H}{\partial t}.$$
Proof.

We use (6.10) and local coordinates to derive

$$\begin{split}
\mathbf{D}_{t}H&=D[H(\mathbf{X}(t),t)]=\frac{\partial H}{\partial t}+D^{i}x\frac{\partial H}{\partial x^{i}}+D_{i}p\frac{\partial H}{\partial p_{i}}+D_{jk}o\frac{\partial H}{\partial o_{jk}}+\frac{1}{2}Q^{jk}x\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}\\
&\quad+\frac{1}{2}Q_{jk}p\frac{\partial^{2}H}{\partial p_{j}\partial p_{k}}+\frac{1}{2}Q_{ijkl}o\frac{\partial^{2}H}{\partial o_{ij}\partial o_{kl}}+Q^{j}_{k}(x,p)\frac{\partial^{2}H}{\partial x^{j}\partial p_{k}}+Q^{j}_{kl}(x,o)\frac{\partial^{2}H}{\partial x^{j}\partial o_{kl}}+Q_{jkl}(p,o)\frac{\partial^{2}H}{\partial p_{j}\partial o_{kl}}\\
&=\frac{\partial H}{\partial t}+D^{i}x\frac{\partial H}{\partial x^{i}}+D_{i}p\frac{\partial H}{\partial p_{i}}+D_{jk}o\frac{\partial H}{\partial o_{jk}}+\frac{1}{2}Q^{jk}x\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}+C_{ij}\frac{\partial H}{\partial o_{ij}}=\frac{\partial H}{\partial t}.
\end{split}$$

The result follows. ∎

In particular, when $H$ is time-independent, we have

$$\mathbf{D}_{t}H=0,\tag{6.19}$$

which is also a consequence of (6.8). Equivalently, $H$ is harmonic with respect to the horizontal integral process $\mathbf{X}$. In this case, we can say that $H$ is stochastically conserved, or is a stochastic conserved quantity. In particular, the expectation $\mathbf{E}[H(\mathbf{X}(t))]$ is constant in $t$.

6.3 Two inspirational examples

Let $M$ be a Riemannian manifold with Riemannian metric $g$. Assume for simplicity that $M$ is compact. Let $\nabla$ be the Levi-Civita connection on $TM$ with Christoffel symbols $(\Gamma^{k}_{ij})$. In this section, we consider two types of processes on $M$ to provide some intuition for our stochastic Hamiltonian formalism.

6.3.1 Diffusion processes on Riemannian manifolds

Consider a second-order Hamiltonian $H$ on $\mathcal{T}^{S*}M$ with the following coordinate expression:

\[
H(x,p,o)=b^{i}(x)p_{i}-\frac{1}{2}g^{ij}(x)\Gamma_{ij}^{k}(x)p_{k}+\frac{1}{2}g^{ij}(x)o_{ij}+F(x), \tag{6.20}
\]

where $b$ is a given smooth vector field on $M$ and $F$ a smooth function on $M$. One can easily verify that the right-hand side of (6.20) is indeed invariant under changes of coordinates. We consider the S-H equations (6.17) subject to the boundary conditions $\text{Law}(X(0))=\mu_{0}$ and $(p,o)(T)=d^{2}S_{T}$, where $\mu_{0}$ is a given probability distribution and $S_{T}$ a given smooth function on $M$.

By the first two equations of system (6.17), the projection diffusion $X$ satisfies the following MDEs,

\[
\left\{\begin{aligned}
(DX)^{i}(t)&=b^{i}(X(t))-\frac{1}{2}g^{jk}(X(t))\Gamma_{jk}^{i}(X(t)),\\
(QX)^{jk}(t)&=g^{jk}(X(t)),
\end{aligned}\right. \tag{6.21}
\]

subject to the initial distribution $\text{Law}(X(0))=\mu_{0}$; or equivalently (according to the end of Section 2.4), it can be rewritten as the following Itô SDE in the weak sense,

\[
dX^{i}(t)=\left[b^{i}(X(t))-\frac{1}{2}g^{jk}(X(t))\Gamma_{jk}^{i}(X(t))\right]dt+\sigma_{r}^{i}(X(t))dW^{r}(t),\quad\text{Law}(X(0))=\mu_{0}, \tag{6.22}
\]

where $\sigma$ is the positive definite square root $(1,1)$-tensor of $g$, i.e., $\sum_{r=1}^{d}\sigma^{i}_{r}\sigma^{j}_{r}=g^{ij}$, and $W$ denotes an $\mathbb{R}^{d}$-valued standard Brownian motion. Note that equations (6.21) are independent of the coordinates $(p,o)$, so they form a closed system on the base manifold $M$ and can be solved independently. Indeed, the solution $X$ is a diffusion on $M$ with generator $A^{X}=(b^{i}-\frac{1}{2}g^{jk}\Gamma_{jk}^{i})\partial_{i}+\frac{1}{2}g^{jk}\partial_{j}\partial_{k}=\nabla_{b}+\frac{1}{2}\Delta$.
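The identification of the generator with $\nabla_{b}+\frac{1}{2}\Delta$ rests on the standard coordinate formula for the Laplace-Beltrami operator. A brief check, applied to a smooth test function $f$ on $M$ (the symbol $f$ is introduced here only for illustration), might read as follows.

```latex
% Coordinate form of the Laplace-Beltrami operator and of the directional derivative along b:
\[
\Delta f=g^{jk}\nabla_{j}\nabla_{k}f
=g^{jk}\left(\frac{\partial^{2}f}{\partial x^{j}\partial x^{k}}
-\Gamma^{i}_{jk}\frac{\partial f}{\partial x^{i}}\right),
\qquad
\nabla_{b}f=b^{i}\frac{\partial f}{\partial x^{i}},
\]
% so that the two expressions for the generator agree:
\[
\nabla_{b}f+\tfrac{1}{2}\Delta f
=\left(b^{i}-\tfrac{1}{2}g^{jk}\Gamma^{i}_{jk}\right)\frac{\partial f}{\partial x^{i}}
+\tfrac{1}{2}g^{jk}\frac{\partial^{2}f}{\partial x^{j}\partial x^{k}}
=A^{X}f .
\]
```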

Now we consider the last two equations of (6.17). The LHS of the third equation reads

\[
\bigg[\frac{\partial}{\partial t}+\left(b^{j}-\frac{1}{2}g^{kl}\Gamma_{kl}^{j}\right)\frac{\partial}{\partial x^{j}}+\frac{1}{2}g^{jk}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg]p_{i}=\left(\frac{\partial}{\partial t}+\langle b,\nabla\rangle+\frac{1}{2}\Delta\right)p_{i},
\]

where $\langle\cdot,\cdot\rangle$ denotes the pairing of vectors and covectors, $\Delta$ is the Laplace-Beltrami operator and $\nabla$ the gradient, with respect to $g$. In order to find the solution of the third equation of (6.17), we consider the following linear backward parabolic equation (where “backward” has a meaning different from that in Section 2.2)

\[
\frac{\partial S}{\partial t}+\langle b,\nabla S\rangle+\frac{1}{2}\Delta S+F=0,\quad t\in[0,T), \tag{6.23}
\]

with terminal value $S(T,x)=S_{T}(x)$. We let

\[
p_{i}=\frac{\partial S}{\partial x^{i}}, \tag{6.24}
\]

and use (6.23) and (6.15) to derive

\[
\begin{split}
-\frac{\partial F}{\partial x^{i}}&=\frac{\partial}{\partial x^{i}}\left(\frac{\partial S}{\partial t}+\langle b,\nabla S\rangle+\frac{1}{2}\Delta S\right)\\
&=\left(\frac{\partial}{\partial t}+\langle b,\nabla\rangle+\frac{1}{2}\Delta\right)p_{i}+\left(\frac{\partial b^{j}}{\partial x^{i}}p_{j}-\frac{1}{2}\frac{\partial g^{kl}}{\partial x^{i}}\Gamma_{kl}^{j}p_{j}-\frac{1}{2}g^{kl}\frac{\partial\Gamma_{kl}^{j}}{\partial x^{i}}p_{j}+\frac{1}{2}\frac{\partial g^{jk}}{\partial x^{i}}o_{jk}\right)\\
&=\left(\frac{\partial}{\partial t}+\langle b,\nabla\rangle+\frac{1}{2}\Delta\right)p_{i}+\frac{\partial}{\partial x^{i}}(H-F),
\end{split} \tag{6.25}
\]

which agrees with the third equation of (6.17).

Finally, we combine (6.24) with (6.15) to conclude that the horizontal integral process $\mathbf{X}$ is

\[
\mathbf{X}(t)=(p,o)(t,X(t))=\left(\frac{\partial S}{\partial x^{i}},\frac{\partial^{2}S}{\partial x^{j}\partial x^{k}}\right)(t,X(t))=d^{2}S(t,X(t)).
\]
Example 6.9 (Brownian motions).

When $b\equiv 0$ and $F\equiv 0$, the 2nd-order Hamiltonian is $H(x,p,o)=\frac{1}{2}g^{ij}(x)(o_{ij}-\Gamma_{ij}^{k}(x)p_{k})$, and the solution process $X$ is a standard Brownian motion on $M$ with initial distribution $\mu_{0}$. Such a 2nd-order Hamiltonian $H$ can be regarded as a “stochastic deformation” of the trivial classical Hamiltonian $H_{0}=0$. Indeed, $H$ is the $g$-canonical lift of $H_{0}$ that will be defined in the forthcoming Section 6.6. Therefore, we may regard Brownian motions as a “stochastization” or “stochastic deformation” of trivially constant curves on the base manifold $M$.

In the next example we describe a dynamical approach to diffusions inspired by Schrödinger, which is elaborated further in Section 7.3.

6.3.2 Reciprocal processes and diffusion bridges on Riemannian manifolds

With the same coefficients $b,F$ and boundary data $\mu_{0},S_{T}$ as in Subsection 6.3.1, we consider the S-H system (6.17) with the following second-order Hamiltonian $H$ on $\mathcal{T}^{S*}M$:

\[
H(x,p,o)=\frac{1}{2}g^{ij}(x)p_{i}p_{j}+b^{i}(x)p_{i}-\frac{1}{2}g^{ij}(x)\Gamma_{ij}^{k}(x)p_{k}+\frac{1}{2}g^{ij}(x)o_{ij}+F(x), \tag{6.26}
\]

subject to the boundary conditions $\text{Law}(X(0))=\mu_{0}$ and $(p,o)(T)=d^{2}S_{T}$. Here, $b$ and $F$ are called, respectively, the vector and scalar potentials in classical mechanics. Again, it is easy to verify that the right-hand side of (6.26) is invariant under changes of coordinates.

The LHS of the third equation in (6.17) now reads

\[
\bigg[\frac{\partial}{\partial t}+\left(g^{jk}p_{k}+b^{j}-\frac{1}{2}g^{kl}\Gamma_{kl}^{j}\right)\frac{\partial}{\partial x^{j}}+\frac{1}{2}g^{jk}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg]p_{i}=\left(\frac{\partial}{\partial t}+p\cdot\nabla+\langle b,\nabla\rangle+\frac{1}{2}\Delta\right)p_{i}.
\]

In order to find the solution of the third equation of (6.17), we first consider the positive solution of the following backward parabolic equation on $M$,

\[
\frac{\partial u}{\partial t}+\langle b,\nabla u\rangle+\frac{1}{2}\Delta u+Fu=0,\quad t\in[0,T), \tag{6.27}
\]

with terminal value $u(T,x)=e^{S_{T}(x)}$, where $\langle\cdot,\cdot\rangle$ denotes the Riemannian inner product with respect to $g$. If we let $S=\ln u$, then it is easy to verify that $S$ satisfies the following Hamilton-Jacobi-Bellman (HJB) equation

\[
\frac{\partial S}{\partial t}+\langle b,\nabla S\rangle+\frac{1}{2}|\nabla S|^{2}+\frac{1}{2}\Delta S+F=0,\quad t\in[0,T), \tag{6.28}
\]

with terminal value $S(T,x)=S_{T}(x)$, where $|\cdot|$ denotes the Riemannian norm with respect to $g$. Now we let

\[
p_{i}=\frac{\partial S}{\partial x^{i}}=\frac{\partial\ln u}{\partial x^{i}}, \tag{6.29}
\]

and use (6.28) and (6.15) to derive, in a way similar to (6.25),

\[
-\frac{\partial F}{\partial x^{i}}=\frac{\partial}{\partial x^{i}}\left(\frac{\partial S}{\partial t}+\langle b,\nabla S\rangle+\frac{1}{2}|\nabla S|^{2}+\frac{1}{2}\Delta S\right)=\left(\frac{\partial}{\partial t}+p\cdot\nabla+\langle b,\nabla\rangle+\frac{1}{2}\Delta\right)p_{i}+\frac{\partial}{\partial x^{i}}(H-F),
\]

which agrees with the third equation of (6.17). Therefore, the projection diffusion $X$ of the system (6.17) satisfies the following MDEs,

\[
\left\{\begin{aligned}
(DX)^{i}(t)&=g^{ij}(X(t))\frac{\partial\ln u}{\partial x^{j}}(t,X(t))+b^{i}(X(t))-\frac{1}{2}g^{jk}(X(t))\Gamma_{jk}^{i}(X(t)),\\
(QX)^{jk}(t)&=g^{jk}(X(t)),
\end{aligned}\right. \tag{6.30}
\]

subject to the initial distribution $\text{Law}(X(0))=\mu_{0}$; or equivalently (according to the end of Section 2.4), it can be rewritten as the following Itô SDE in the weak sense,

\[
\left\{\begin{aligned}
dX^{i}(t)&=\left[g^{ij}(X(t))\frac{\partial\ln u}{\partial x^{j}}(t,X(t))+b^{i}(X(t))-\frac{1}{2}g^{jk}(X(t))\Gamma_{jk}^{i}(X(t))\right]dt+\sigma_{r}^{i}(X(t))dW^{r}(t),\\
\text{Law}(X(0))&=\mu_{0},
\end{aligned}\right. \tag{6.31}
\]

where $\sigma$ is the positive definite square root $(1,1)$-tensor of $g$, i.e., $\sum_{r=1}^{d}\sigma^{i}_{r}\sigma^{j}_{r}=g^{ij}$, and $W$ denotes an $\mathbb{R}^{d}$-valued standard Brownian motion.
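The passage from the linear equation (6.27) to the HJB equation (6.28) through $S=\ln u$ is a direct chain-rule computation. A short check, using only $u=e^{S}>0$ and the product rule for the Laplace-Beltrami operator, could be written as follows.

```latex
% Derivatives of u = e^{S}:
\[
\frac{\partial u}{\partial t}=u\frac{\partial S}{\partial t},\qquad
\nabla u=u\nabla S,\qquad
\Delta u=u\left(\Delta S+|\nabla S|^{2}\right),
\]
% Dividing (6.27) by u > 0 yields (6.28):
\[
0=\frac{1}{u}\left(\frac{\partial u}{\partial t}+\langle b,\nabla u\rangle+\frac{1}{2}\Delta u+Fu\right)
=\frac{\partial S}{\partial t}+\langle b,\nabla S\rangle+\frac{1}{2}|\nabla S|^{2}+\frac{1}{2}\Delta S+F .
\]
```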

The solution process $X$ of (6.31) is called a Bernstein process [11, 18] (or the reciprocal process derived from the $M$-valued diffusion in (6.22) [46]). The time marginal distribution $\mu_{t}$ of $X$ satisfies a Born-type formula $\mu_{t}(dx)=u(t,x)v(t,x)dx$ (see, e.g., [88, Corollary 3.3.1] or [17, Equations (2.9), (4.6) and (4.8)]), where $v$ satisfies the adjoint equation of (6.27). The terminal law of $X$ can be determined as follows: first solve (6.27) to get $u(0,x)$; then find the initial value of $v$ via $\mu_{0}(dx)=u(0,x)v(0,x)dx$ and solve the equation for $v$ to get $v(T,x)$; finally, the terminal law of $X$ is given by $\mu_{T}(dx)=u(T,x)v(T,x)dx$. In particular, when $\mu_{0}=\delta_{q_{1}}$ and $\mu_{T}=\delta_{q_{2}}$ for $q_{1},q_{2}\in M$, the solution $X$ of (6.31) is the Markovian bridge of the diffusion $Y$ conditioned on the end point $q_{2}$ [13].

Again we combine (6.29) with (6.15) to conclude that the horizontal integral process $\mathbf{X}$ is

\[
\mathbf{X}(t)=(p,o)(t,X(t))=\left(\frac{\partial S}{\partial x^{i}},\frac{\partial^{2}S}{\partial x^{j}\partial x^{k}}\right)(t,X(t))=d^{2}S(t,X(t)). \tag{6.32}
\]
Remark 6.10.

(i). The derivation of the reciprocal process (6.31) from the diffusion (6.22) was the route chosen by Jamison [46], inspired by Schrödinger’s original problem [79]. He involved neither geometry nor dynamical equations such as the HJB equation (6.28). As here, Jamison’s construction involved only the past (nondecreasing) filtration. The dynamical content dates back to [88, 17, 15], where a reciprocal process was constructed from the sole data of a Hamiltonian operator, as required by Schrödinger’s original problem, and the future (nonincreasing) filtration was also used to study the time-reversed dynamics. Cf. also Example 6.12 and Section 7.3.

(ii). Equations (6.30) suggest that the transformation from coordinates $(x,p,o)$ to coordinates $(x,Dx,Qx)$ is not invertible. More precisely, the coordinates $(D^{i}x)$ are transformed from $(x,p)$, but the coordinates $(Q^{jk}x)$ are only related to $(x^{i})$. Besides, these two equations have nothing to do with the coordinates $(o_{jk})$. However, if we look at the $\nabla$-canonical coordinates $(D^{i}_{\nabla}x)$ for (6.30), then

\[
(D_{\nabla}X)^{i}(t)=g^{ij}(X(t))p_{j}(t,X(t))+b^{i}(X(t)),
\]

which indicates that the transformation from $(x,p)$ to $(x,D_{\nabla}x)$ is invertible. These observations will help us establish stochastic Lagrangian mechanics and second-order Legendre transforms in the forthcoming Chapter 7.

(iii). As observed in Section 2.2, every result presented here has a backward version (in the sense of backward mean derivatives with respect to the future filtration $\{\mathcal{F}_{t}\}$). Indeed, two forward-backward SDE systems for Bernstein diffusions on Euclidean space were derived in [16]: one is under the past filtration and coincides with ours, whereas the other is under the future filtration.

Some special cases are of independent interest and have been considered in the literature.

Example 6.11 (Brownian (free) reciprocal processes and Brownian bridges).

Consider the case where $b\equiv 0$, $F\equiv 0$. In this case, $Y$ is a Brownian motion on $M$, so we call $X$ a Brownian reciprocal process. In particular, the Brownian bridge from $q_{1}$ to $q_{2}$ of time length $T>0$ is driven by the Itô SDE (6.31), where $X(0)=q_{1}$, $b\equiv 0$ and $u$ satisfies the backward heat equation (6.27) with $F\equiv 0$ and final value $u(T,x)=\delta_{q_{2}}(x)$. See also [40, Theorem 5.4.4]. Thus, Brownian bridges are understood as stochastic Hamiltonian flows of the 2nd-order Hamiltonian $H(x,p,o)=\frac{1}{2}g^{ij}(x)p_{i}p_{j}-\frac{1}{2}g^{ij}(x)\Gamma_{ij}^{k}(x)p_{k}+\frac{1}{2}g^{ij}(x)o_{ij}$, compared with geodesics as Hamiltonian flows of the classical Hamiltonian $H_{0}(x,p)=\frac{1}{2}g^{ij}(x)p_{i}p_{j}$ (cf. [1, Theorem 3.7.1]). Here, the 2nd-order Hamiltonian $H$ is the $g$-canonical lift of $H_{0}$. We can also say that Brownian bridges are a “stochastization” or “stochastic deformation” of geodesics, cf. Example 6.9. Relations between geodesics and Brownian motions have attracted many studies. For example, one can find various interpolation relations between geodesics and Brownian motions in [4, 61].
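For orientation, the flat case can be worked out explicitly. The sketch below assumes $M=\mathbb{R}^{d}$ with the Euclidean metric (so $g^{ij}=\delta^{ij}$, the Christoffel symbols vanish and $\sigma$ is the identity), an assumption made here only for illustration, and takes $u$ to be the Gaussian heat kernel, which is the classical solution of the backward heat equation with final value $\delta_{q_{2}}$.

```latex
% Backward heat kernel with terminal value \delta_{q_2} on flat R^d:
\[
u(t,x)=\bigl(2\pi(T-t)\bigr)^{-d/2}\exp\left(-\frac{|x-q_{2}|^{2}}{2(T-t)}\right),
\qquad
\frac{\partial u}{\partial t}+\frac{1}{2}\Delta u=0,\quad t\in[0,T),
\]
% Its logarithmic gradient gives the drift of (6.31), i.e. the familiar Brownian bridge SDE:
\[
\nabla\ln u(t,x)=-\frac{x-q_{2}}{T-t},
\qquad
dX(t)=\frac{q_{2}-X(t)}{T-t}\,dt+dW(t),\quad X(0)=q_{1}.
\]
```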

Example 6.12 (Euclidean quantum mechanics [15, 2, 3]).

It is insightful to consider the case $M=\mathbb{R}^{d}$ and $b\equiv 0$. The Riemannian metric under consideration is the flat Euclidean one. To make the analogy with quantum mechanics visible, we introduce the reduced Planck constant $\hbar$ into the second-order Hamiltonian $H$ of (6.26), so that

\[
H_{\hbar}(x,p,o)=\frac{1}{2}|p|^{2}+\frac{\hbar}{2}\mathrm{tr}\,o+F(x).
\]

The system (6.10) then reads

\[
\left\{\begin{aligned}
(DX)^{i}(t)&=p_{i}(t,X(t)),\\
(QX)^{jk}(t)&=\hbar\delta^{jk},\\
D[p_{i}(t,X(t))]&=-\frac{\partial F}{\partial x^{i}}(X(t)),\\
o_{ik}(t,x)&=\frac{\partial p_{k}}{\partial x^{i}}(t,x).
\end{aligned}\right.
\]

Note that the first three equations form a sub-system and can be solved separately, as they are independent of the coordinates $o_{ij}$. Equation (6.27) and its adjoint now reduce to the following $\hbar$-dependent backward and forward heat equations, respectively,

\[
\hbar\frac{\partial u}{\partial t}+\frac{\hbar^{2}}{2}\Delta u+Fu=0,\qquad-\hbar\frac{\partial v}{\partial t}+\frac{\hbar^{2}}{2}\Delta v+Fv=0,
\]

which, together with the Born-type formula $\mu_{t}(dx)=u(t,x)v(t,x)dx$, display a strong analogy with quantum mechanics [88]. The function $S=\hbar\ln u$ solves the following $\hbar$-dependent HJB equation

\[
\frac{\partial S}{\partial t}+\frac{1}{2}|\nabla S|^{2}+\frac{\hbar}{2}\Delta S+F=0.
\]

The first three equations can then be solved by letting $p=\nabla S$. The first and third equations imply a Newton-type equation

\[
DDX(t)=-\nabla F(X(t)). \tag{6.33}
\]

This is indeed the equation of motion of the Euclidean version of quantum mechanics, which was the original motivation of Schrödinger in his well-known problem to be discussed in Section 7.3. See [15, p. 158] and [89, Eq. (4.17)] for more. Note that [15, 89] used $V=-F$ to denote the physical scalar potential and used the relations $S=-\hbar\ln u$ and $p=-\nabla S$ to formulate the HJB equation from the backward heat equation in the case of the nondecreasing (past) filtration.

Two special cases will be studied further later.

(i). When $d=1$ and $F(x)=\frac{1}{2}x^{2}$, i.e., $H(x,p,o)=\frac{1}{2}(p^{2}+x^{2})+\frac{1}{2}o$, we call its projective integral process $X$ the (forward) stochastic harmonic oscillator. It is a stochastization of the classical harmonic oscillator with Hamiltonian $H_{0}(x,p)=\frac{1}{2}(p^{2}+x^{2})$ [1, Example 5.2.3]. Likewise, here $H$ is the canonical lift of $H_{0}$, see Section 6.6.

(ii). When $d=1$ and $F(x)=-\frac{1}{2}x^{2}$, i.e., $H(x,p,o)=\frac{1}{2}(p^{2}-x^{2})+\frac{1}{2}o$, we call it the (forward) Euclidean harmonic oscillator.
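As a concrete illustration of case (ii), the following worked computation takes $\hbar=1$ (as in the Hamiltonian of (ii)) and assumes the particular terminal value $S_{T}(x)=-\frac{1}{2}x^{2}+\frac{1}{2}T$. This terminal value is a hypothetical choice, made here only so that the HJB equation admits an explicit solution.

```latex
% Candidate solution of the HJB equation with F(x) = -x^2/2 and \hbar = 1:
\[
S(t,x)=-\tfrac{1}{2}x^{2}+\tfrac{1}{2}t:\qquad
\frac{\partial S}{\partial t}+\frac{1}{2}\left(\frac{\partial S}{\partial x}\right)^{2}
+\frac{1}{2}\frac{\partial^{2}S}{\partial x^{2}}+F
=\tfrac{1}{2}+\tfrac{1}{2}x^{2}-\tfrac{1}{2}-\tfrac{1}{2}x^{2}=0,
\]
% Momentum, second-order coordinate and the resulting projection process:
\[
p=\frac{\partial S}{\partial x}=-x,\qquad
o=\frac{\partial^{2}S}{\partial x^{2}}=-1,\qquad
dX(t)=-X(t)\,dt+dW(t),
\]
% Newton-type equation (6.33) and the value of H along d^2 S:
\[
DDX(t)=D[-X(t)]=X(t)=-F'(X(t)),\qquad
H(X(t),p,o)=\tfrac{1}{2}\left(X(t)^{2}-X(t)^{2}\right)-\tfrac{1}{2}=-\tfrac{1}{2}.
\]
```

Under this assumption the projection process is an Ornstein-Uhlenbeck process, the Newton-type equation (6.33) holds, and $H$ is constant along $d^{2}S$, in line with (6.19).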

6.4 The mixed-order contact structure on $\mathcal{T}^{S*}M\times\mathbb{R}$

In the later sections we will investigate time-dependent systems. The proper space for consideration is now $\mathcal{T}^{S*}M\times\mathbb{R}$. Recall from (5.9) that $\mathcal{T}^{S*}M\times\mathbb{R}=J^{2}\hat{\pi}$, where the latter is the second-order jet bundle of $(M\times\mathbb{R},\hat{\pi},M)$.

In classical differential geometry, the first-order jet bundle $J^{1}\hat{\pi}=T^{*}M\times\mathbb{R}$ can be equipped with an exact contact structure in several ways [1, Section 5.1]. Among others, the canonical symplectic form $\omega_{0}$ on $T^{*}M$ corresponds to a contact structure on $J^{1}\hat{\pi}$ via $\tilde{\omega}_{0}=\hat{\pi}^{*}\omega_{0}$, which is indeed exact as $\tilde{\omega}_{0}=-d\tilde{\theta}_{0}$ for $\tilde{\theta}_{0}=dt+\hat{\pi}^{*}\theta_{0}$. Another commonly used contact structure is the Poincaré-Cartan form $\omega^{0}_{H_{0}}=\tilde{\omega}_{0}+dH_{0}\wedge dt$ for a given function $H_{0}\in C^{\infty}(J^{1}\hat{\pi})$. It is also exact, as $\omega^{0}_{H_{0}}=-d\theta^{0}_{H_{0}}$ where $\theta^{0}_{H_{0}}=\hat{\pi}^{*}\theta_{0}-H_{0}dt$. The advantage of the Poincaré-Cartan form, compared with the contact form $\omega_{0}$, is that it can be related to the (time-dependent) Hamiltonian vector field $V_{H_{0}}$ on $T^{*}M$ of $H_{0}$. More precisely, the vector field $\tilde{V}_{H_{0}}=\frac{\partial}{\partial t}+V_{H_{0}}$, treated as a vector field on $J^{1}\hat{\pi}$ and called the characteristic vector field of $\omega^{0}_{H_{0}}$, is the unique vector field satisfying $\tilde{V}_{H_{0}}\lrcorner\,\omega^{0}_{H_{0}}=0$ and $\tilde{V}_{H_{0}}\lrcorner\,dt=1$.

Now we proceed in a similar way for the second-order jet bundle $J^{2}\hat{\pi}$. Define

\[
\tilde{\omega}=\hat{\pi}^{S*}\omega\quad\text{and}\quad\tilde{\theta}=dt+\hat{\pi}^{S*}\theta.
\]

Then $\tilde{\omega}=-d\tilde{\theta}$. We call the pair $(J^{2}\hat{\pi},\tilde{\omega})$ a second-order contact manifold and the pair $(J^{2}\hat{\pi},\tilde{\theta})$ a mixed-order exact contact manifold. In local coordinates, $\tilde{\omega}$ has the same expression as $\omega$ in (6.6), but we stress that it is now a second-order form on $\mathcal{T}^{S*}M\times\mathbb{R}$. The form $\tilde{\theta}$ has the local expression

\[
\tilde{\theta}=dt+p_{i}d^{2}x^{i}+\tfrac{1}{2}o_{jk}dx^{j}\cdot dx^{k}.
\]

This makes clear that $\tilde{\theta}$ is a mixed-order form on $\mathcal{T}^{S*}M\times\mathbb{R}$.

A time-dependent second-order Hamiltonian $H$ is a smooth function on $J^{2}\hat{\pi}\cong\mathcal{T}^{S*}M\times\mathbb{R}$. The second-order Hamiltonian vector field $A_{H}$ of $H$ is now a time-dependent second-order vector field on $\mathcal{T}^{S*}M$; its horizontal integral processes satisfy the same equations (6.10) or (6.17), only with $H$ depending explicitly on time. Define a mixed-order vector field $\tilde{A}_{H}$ on $\mathcal{T}^{S*}M\times\mathbb{R}$ by

\[
\tilde{A}_{H}:=A_{H}+\frac{\partial}{\partial t},
\]

where $A_{H}$ is a second-order Hamiltonian vector field of the form (6.9). We call $\tilde{A}_{H}$ the extended second-order Hamiltonian vector field of $H$.

We define the second-order counterpart of the Poincaré-Cartan form by

\[
\omega_{H}:=\tilde{\omega}+d^{\circ}H\wedge dt=d^{2}x^{i}\wedge d^{2}p_{i}+\tfrac{1}{2}dx^{j}\cdot dx^{k}\wedge d^{2}o_{jk}+d^{2}H\wedge dt,
\]

and call it the mixed-order Poincaré-Cartan form on $\mathcal{T}^{S*}M\times\mathbb{R}$. It is exact in the sense that $\omega_{H}=-d^{\circ}\theta_{H}$, where $\theta_{H}=\hat{\pi}^{S*}\theta-Hdt=p_{i}d^{2}x^{i}+\tfrac{1}{2}o_{jk}dx^{j}\cdot dx^{k}-Hdt$.

The following lemma gives the relations between ωH\omega_{H} and A~H\tilde{A}_{H}.

Lemma 6.13.

The class of extended second-order Hamiltonian vector fields $\tilde{A}_{H}$ is the unique class of mixed-order vector fields on $\mathcal{T}^{S*}M\times\mathbb{R}$ satisfying

\[
\tilde{A}_{H}\lrcorner\,\omega_{H}=0\quad\text{and}\quad\tilde{A}_{H}\lrcorner\,dt=1.
\]
Proof.

Firstly, we show that $\tilde{A}_{H}$ satisfies the two equalities. The second equality is trivial. For the first one, we pick a mixed-order vector field $B$ on $\mathcal{T}^{S*}M\times\mathbb{R}$; then,

\[
\begin{split}
\omega_{H}(\tilde{A}_{H},B)&=\tilde{\omega}(\tilde{A}_{H},B)+d^{\circ}H(\tilde{A}_{H})dt(B)-dt(\tilde{A}_{H})d^{\circ}H(B)\\
&=\omega(A_{H},\hat{\pi}^{S}_{*}(B))+\left[d^{\circ}H(A_{H})+d^{\circ}H(\tfrac{\partial}{\partial t})\right]dt(B)-d^{\circ}H(B)\\
&=d^{2}H(\hat{\pi}^{S}_{*}(B))+\tfrac{\partial H}{\partial t}dt(B)-d^{\circ}H(B)\\
&=0.
\end{split}
\]

To prove the uniqueness, it suffices to show that any mixed-order vector field $A$ on $\mathcal{T}^{S*}M\times\mathbb{R}$ satisfying $A\lrcorner\,\omega_{H}=0$ is a multiple of $\tilde{A}_{H}$. Suppose that $A$ has the local expression

\[
\begin{split}
A=&\ A^{0}\frac{\partial}{\partial t}+A^{i}\frac{\partial}{\partial x^{i}}+A_{i}\frac{\partial}{\partial p_{i}}+A^{jk}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}+A^{2}_{jk}\frac{\partial}{\partial o_{jk}}\\
&\ +A^{11}_{jk}\frac{\partial^{2}}{\partial p_{j}\partial p_{k}}+A_{ijkl}\frac{\partial^{2}}{\partial o_{ij}\partial o_{kl}}+A^{j}_{k}\frac{\partial^{2}}{\partial x^{j}\partial p_{k}}+A^{j}_{kl}\frac{\partial^{2}}{\partial x^{j}\partial o_{kl}}+A_{jkl}\frac{\partial^{2}}{\partial p_{j}\partial o_{kl}}.
\end{split}
\]

Then, it follows that

\[
\begin{split}
0=A\lrcorner\,\omega_{H}=&\ A^{i}d^{2}p_{i}-A_{i}d^{2}x^{i}+A^{jk}d^{2}o_{jk}-\tfrac{1}{2}A^{2}_{jk}dx^{j}\cdot dx^{k}+\text{terms}\left(A^{11}_{jk},A_{ijkl},A^{j}_{k},A^{j}_{kl},A_{jkl}\right)\\
&\ -A^{0}\left(\frac{\partial H}{\partial x^{i}}d^{2}x^{i}+\frac{\partial H}{\partial p_{i}}d^{2}p_{i}+\frac{\partial H}{\partial o_{jk}}d^{2}o_{jk}+\frac{1}{2}\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}dx^{j}\cdot dx^{k}+\cdots\right)\\
&\ +\left(A^{i}\frac{\partial H}{\partial x^{i}}+A_{i}\frac{\partial H}{\partial p_{i}}+A^{jk}\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}+A^{2}_{jk}\frac{\partial H}{\partial o_{jk}}+\cdots\right)dt.
\end{split}
\]

The vanishing of each coefficient gives

\[
A^{i}=A^{0}\frac{\partial H}{\partial p_{i}},\quad A_{i}=-A^{0}\frac{\partial H}{\partial x^{i}},\quad A^{jk}=A^{0}\frac{\partial H}{\partial o_{jk}},\quad A^{2}_{jk}=-A^{0}\left(\frac{\partial^{2}H}{\partial x^{j}\partial x^{k}}+\cdots\right),\quad\cdots.
\]

Therefore, $A=A^{0}\tilde{A}_{H}$. ∎

6.5 Canonical transformations and Hamilton-Jacobi-Bellman equations

Let us study the second-order analogs of canonical transformations and their generating functions. To do so, we need to find a change of coordinates from $(x^{i},p_{i},o_{jk},t)$ to $(y^{i},P_{i},O_{jk},s)$ that preserves the form of the stochastic Hamilton’s equations (6.10) (with a time-dependent 2nd-order Hamiltonian). More precisely, we have the following definition of canonical transformations between mixed-order contact structures, which is adapted from those between classical contact structures in [9].

Definition 6.14.

Let $(\mathcal{T}^{S*}M\times\mathbb{R},\tilde{\omega})$ and $(\mathcal{T}^{S*}N\times\mathbb{R},\tilde{\eta})$ be two second-order contact manifolds corresponding to second-order tautological forms $\theta$ and $\vartheta$. A bundle isomorphism $\mathbf{F}:(\mathcal{T}^{S*}M\times\mathbb{R},\hat{\pi}_{2,1},T^{*}M\times\mathbb{R})\to(\mathcal{T}^{S*}N\times\mathbb{R},\hat{\rho}_{2,1},T^{*}N\times\mathbb{R})$ is called a canonical transformation if its projection $\mathbb{F}$ is a bundle isomorphism from $(T^{*}M\times\mathbb{R},\hat{\pi}^{1}_{0,1},\mathbb{R})$ to $(T^{*}N\times\mathbb{R},\hat{\rho}^{1}_{0,1},\mathbb{R})$ projecting to $F^{0}:\mathbb{R}\to\mathbb{R}$, and there is a function $H_{\mathbf{F}}\in C^{\infty}(\mathcal{T}^{S*}M\times\mathbb{R})$ such that

\[
\mathbf{F}^{R*}\tilde{\eta}=\omega_{H_{\mathbf{F}}}, \tag{6.34}
\]

where $\omega_{H_{\mathbf{F}}}=\tilde{\omega}+d^{\circ}H_{\mathbf{F}}\wedge dF^{0}$.

The map $\mathbf{F}$ in the definition is also a bundle isomorphism from $(\mathcal{T}^{S*}M\times\mathbb{R},\hat{\pi}^{2}_{0,1},\mathbb{R})$ to $(\mathcal{T}^{S*}N\times\mathbb{R},\hat{\rho}^{2}_{0,1},\mathbb{R})$ projecting to $F^{0}$. Hence, we may assume $\mathbf{F}(\alpha_{q},t)=(\bar{\mathbf{F}}(\alpha_{q},t),F^{0}(t))$ for all $(\alpha_{q},t)\in\mathcal{T}^{S*}M\times\mathbb{R}$, where $\bar{\mathbf{F}}$ is a smooth map from $\mathcal{T}^{S*}M\times\mathbb{R}$ to $\mathcal{T}^{S*}N$. For each $t\in\mathbb{R}$, we define a map $\bar{\mathbf{F}}_{t}:\mathcal{T}^{S*}M\to\mathcal{T}^{S*}N$ by $\bar{\mathbf{F}}_{t}(\alpha_{q})=\bar{\mathbf{F}}(\alpha_{q},t)$. We also introduce an injection $\jmath_{t}:\mathcal{T}^{S*}M\to\mathcal{T}^{S*}M\times\mathbb{R}$ by $\jmath_{t}(\alpha_{q})=(\alpha_{q},t)$. Then, we have $\bar{\mathbf{F}}_{t}=\hat{\rho}_{1,1}\circ\mathbf{F}\circ\jmath_{t}$.

Lemma 6.15.

The map $\bar{\mathbf{F}}_{t}$ is second-order symplectic for each $t\in\mathbb{R}$ if and only if there is a mixed-order form $\alpha$ on $\mathcal{T}^{S*}M\times\mathbb{R}$ such that

\[
\mathbf{F}^{R*}\tilde{\eta}=\tilde{\omega}+\alpha\wedge dt.
\]

In particular, condition (6.34) implies that each $\bar{\mathbf{F}}_{t}$ is a second-order symplectomorphism.

Proof.

The sufficiency follows from

\[
\begin{split}
(\bar{\mathbf{F}}_{t})^{S*}\eta&=(\jmath_{t})^{R*}\circ\mathbf{F}^{R*}\circ(\hat{\rho}_{1,1})^{S*}\eta=(\jmath_{t})^{R*}\circ\mathbf{F}^{R*}\tilde{\eta}\\
&=(\jmath_{t})^{R*}\tilde{\omega}+(\jmath_{t})^{R*}\alpha\wedge(\jmath_{t})^{R*}dt=\omega+(\jmath_{t})^{R*}\alpha\wedge 0=\omega.
\end{split}
\]

For the necessity, we observe that

\[
(\jmath_{t})^{R*}(\mathbf{F}^{R*}\tilde{\eta}-\tilde{\omega})=(\bar{\mathbf{F}}_{t})^{S*}\eta-\omega=0.
\]

So we can write $\mathbf{F}^{R*}\tilde{\eta}-\tilde{\omega}=\alpha\wedge dt+\gamma$, where $\gamma$ is a mixed-order form which does not involve $dt$. This leads to $\gamma=(\hat{\pi}_{1,1})^{R*}\circ(\jmath_{t})^{R*}\gamma=(\hat{\pi}_{1,1})^{R*}\circ(\jmath_{t})^{R*}(\mathbf{F}^{R*}\tilde{\eta}-\tilde{\omega}-\alpha\wedge dt)=0$. The result follows. ∎

The following lemma gives some equivalent statements to the condition (6.34).

Lemma 6.16.

Condition (6.34) is equivalent to each of the following:
(i) $\mathbf{F}^{R*}\tilde{\vartheta}-\tilde{\theta}+H_{\mathbf{F}}dF^{0}$ is mixed-order closed;
(ii) for all $K\in C^{\infty}(\mathcal{T}^{S*}N\times\mathbb{R})$, $\mathbf{F}^{R*}\eta_{K}=\omega_{H}$;
(iii) for all $K\in C^{\infty}(\mathcal{T}^{S*}N\times\mathbb{R})$, $\mathbf{F}^{R}_{*}\tilde{A}_{H}=\tilde{A}_{K}$;
where $H=(K\circ\mathbf{F}+H_{\mathbf{F}})\dot{F}^{0}$.

Proof.

The equivalence between (6.34) and (i) is clear. For (6.34)$\Rightarrow$(ii), since $\mathbf{F}$ projects to $F^{0}$,

\[
\begin{split}
\mathbf{F}^{R*}\eta_{K}&=\mathbf{F}^{R*}\tilde{\eta}+d^{\circ}(K\circ\mathbf{F})\wedge d(t\circ\mathbf{F})=\tilde{\omega}+d^{\circ}H_{\mathbf{F}}\wedge dF^{0}+d^{\circ}(K\circ\mathbf{F})\wedge dF^{0}\\
&=\tilde{\omega}+d^{\circ}H\wedge dt=\omega_{H}.
\end{split}
\]

The converse (ii)$\Rightarrow$(6.34) is straightforward by letting $K\equiv 0$. To show (ii)$\Rightarrow$(iii), by applying Lemma 6.13, it suffices to prove that

\[
\mathbf{F}^{R}_{*}\tilde{A}_{H}\lrcorner\,\eta_{K}=0\quad\text{and}\quad\mathbf{F}^{R}_{*}\tilde{A}_{H}\lrcorner\,dt=1,
\]

while

\[
\mathbf{F}^{R}_{*}\tilde{A}_{H}\lrcorner\,\eta_{K}=(\mathbf{F}^{R*})^{-1}(\tilde{A}_{H}\lrcorner\,\mathbf{F}^{R*}\eta_{K})=(\mathbf{F}^{R*})^{-1}(\tilde{A}_{H}\lrcorner\,\omega_{H})=0,
\]

and

\[
\mathbf{F}^{R}_{*}\tilde{A}_{H}\lrcorner\,dt=(\mathbf{F}^{R*})^{-1}(\tilde{A}_{H}\lrcorner\,\mathbf{F}^{R*}dt)=(\mathbf{F}^{R*})^{-1}(\dot{F}^{0}\tilde{A}_{H}\lrcorner\,dt)=(\mathbf{F}^{R*})^{-1}(\dot{F}^{0})=1.
\]

(iii)$\Rightarrow$(ii) is similar. ∎

Definition 6.17.

Let $\mathbf{F}:\mathcal{T}^{S*}M\times\mathbb{R}\to\mathcal{T}^{S*}N\times\mathbb{R}$ be canonical. If we can locally write

\[
\mathbf{F}^{R*}\tilde{\vartheta}-\tilde{\theta}+H_{\mathbf{F}}dF^{0}=-d^{\circ}G \tag{6.35}
\]

for $G\in C^{\infty}(M\times\mathbb{R})$, then we call $G$ a generating function for the canonical transformation $\mathbf{F}$.

We use $(x,p,o,t)$ for local coordinates on $\mathcal{T}^{S*}M\times\mathbb{R}$ and $(y,P,O,s)$ for those on $\mathcal{T}^{S*}N\times\mathbb{R}$. Recall that $\mathbf{F}(\alpha_{q},t)=(\bar{\mathbf{F}}(\alpha_{q},t),F^{0}(t))$. Then, using (A.4), the relation (6.35) reads in coordinates as

\[
\begin{split}
&\left[\dot{F}^{0}+(P_{i}\circ\mathbf{F})\frac{\partial\bar{\mathbf{F}}^{i}}{\partial t}\right]dt+(P_{i}\circ\mathbf{F})\frac{\partial\bar{\mathbf{F}}^{i}}{\partial x^{j}}d^{2}x^{j}+\frac{1}{2}\left[(P_{i}\circ\mathbf{F})\frac{\partial^{2}\bar{\mathbf{F}}^{i}}{\partial x^{k}\partial x^{l}}+(O_{ij}\circ\mathbf{F})\frac{\partial\bar{\mathbf{F}}^{i}}{\partial x^{k}}\frac{\partial\bar{\mathbf{F}}^{j}}{\partial x^{l}}\right]dx^{k}\cdot dx^{l}\\
&-\left(dt+p_{i}d^{2}x^{i}+\frac{1}{2}o_{jk}dx^{j}\cdot dx^{k}\right)+H_{\mathbf{F}}dF^{0}+\frac{\partial G}{\partial t}dt+\frac{\partial G}{\partial x^{i}}d^{2}x^{i}+\frac{1}{2}\frac{\partial^{2}G}{\partial x^{j}\partial x^{k}}dx^{j}\cdot dx^{k}=0.
\end{split}
\]

Balancing the coefficient of $dt$, and using $dF^{0}=\dot{F}^{0}dt$, we get

\[
\frac{\partial G}{\partial t}+H_{\mathbf{F}}\dot{F}^{0}+(P_{i}\circ\mathbf{F})\frac{\partial\bar{\mathbf{F}}^{i}}{\partial t}+\dot{F}^{0}-1=0.
\]

By Lemma 6.16, the new Hamiltonian function KK after transformation 𝐅\mathbf{F} is related to the old Hamiltonian HH by (HK𝐅)F˙0=H𝐅(H-K\circ\mathbf{F})\dot{F}^{0}=H_{\mathbf{F}}. Let us further assume that we can choose coordinates in which (yi)(y^{i}) and (xi)(x^{i}) are independent, so that the independent variables in (6.35) are (x,y,t)(x,y,t). Then, relation (6.35) means

(Pid2yi+12Ojkdyjdyk+dF0)(pid2xi+12ojkdxjdxk+dt)+(HdtKdF0)=dG,\left(P_{i}d^{2}y^{i}+\textstyle{\frac{1}{2}}O_{jk}dy^{j}\cdot dy^{k}+dF^{0}\right)-\left(p_{i}d^{2}x^{i}+\textstyle{\frac{1}{2}}o_{jk}dx^{j}\cdot dx^{k}+dt\right)+(Hdt-KdF^{0})=-d^{\circ}G, (6.36)

which implies that the generating function $G(x,y,t)$ of the canonical transformation satisfies

\[\left\{\begin{aligned} &p_{i}=\frac{\partial G}{\partial x^{i}},\quad o_{jk}\frac{\partial x^{k}}{\partial y^{l}}=\frac{\partial^{2}G}{\partial x^{j}\partial x^{k}}\frac{\partial x^{k}}{\partial y^{l}}+\frac{\partial^{2}G}{\partial x^{j}\partial y^{l}},\quad P_{i}=-\frac{\partial G}{\partial y^{i}},\quad O_{jk}=-\frac{\partial^{2}G}{\partial y^{j}\partial y^{k}}-\frac{\partial^{2}G}{\partial y^{j}\partial x^{l}}\frac{\partial x^{l}}{\partial y^{k}},\\ &(K-1)\dot{F}^{0}-H+1=\frac{\partial G}{\partial t}.\end{aligned}\right. \tag{6.37}\]

The expressions for $(o_{jk})$ and $(O_{jk})$ are due to the mixed differential term in $d^{\circ}G$, and correspond to the relation (6.15).

Remark 6.18.

Unlike the canonical transformations of classical Hamiltonian systems, which have four types of generating functions related via the classical Legendre transform (see [35, Section 9.1]), here we can only have the type using $(x,y,t)$ as independent variables, but not the others. This can be attributed to the ill-behaved nature of the 2nd-order analog of the Legendre transform, as indicated in Remark 6.10.(iii). However, if the configuration space $M$ is a Riemannian manifold, stochastic Hamiltonian mechanics can be simplified to share the same phase space $T^{*}M$ as classical Hamiltonian mechanics, so that we can also have four types of generating functions. See Subsection 7.4.2 for details and examples of canonical transformations.

The Hamilton-Jacobi-Bellman (HJB) equation can be introduced as a special case of a time-dependent canonical transformation (6.37). In the case where $F^{0}=\mathbf{Id}_{\mathbb{R}}$ and the new Hamiltonian $K$ vanishes formally, we denote by $S$ the corresponding generating function $G$. It follows from (6.37) that $S$ solves the Hamilton-Jacobi-Bellman equation,

\[\frac{\partial S}{\partial t}+H\left(x^{i},\frac{\partial S}{\partial x^{i}},\frac{\partial^{2}S}{\partial x^{j}\partial x^{k}},t\right)=0. \tag{6.38}\]

We will refer to equation (6.38) as the HJB equation associated with the second-order Hamiltonian $H$, and to a solution $S$ of (6.38) as a second-order Hamilton’s principal function of $H$.
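As a worked illustration of (6.38), consider a simplified setting that is an assumption of this sketch rather than the general case treated here: $M=\mathbb{R}^{d}$ with the Euclidean metric and the second-order Hamiltonian $H(x,p,o)=\frac{1}{2}|p|^{2}+\frac{1}{2}\delta^{jk}o_{jk}$, with no drift or potential terms. The substitution $S=\ln u$ with $u>0$ then linearizes the HJB equation:

```latex
\begin{aligned}
&\frac{\partial S}{\partial t}+\frac{1}{2}|\nabla S|^{2}+\frac{1}{2}\Delta S=0,\\
&S=\ln u:\quad
\frac{\partial S}{\partial t}=\frac{1}{u}\frac{\partial u}{\partial t},\quad
\nabla S=\frac{\nabla u}{u},\quad
\Delta S=\frac{\Delta u}{u}-\frac{|\nabla u|^{2}}{u^{2}},\\
&\frac{\partial S}{\partial t}+\frac{1}{2}|\nabla S|^{2}+\frac{1}{2}\Delta S
=\frac{1}{u}\left(\frac{\partial u}{\partial t}+\frac{1}{2}\Delta u\right).
\end{aligned}
```

In this special case, $S$ solves (6.38) exactly when $u$ solves the backward heat equation $\frac{\partial u}{\partial t}+\frac{1}{2}\Delta u=0$.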

More generally, we have

Theorem 6.19.

Let $A_{H}$ be a second-order Hamiltonian vector field on $(\mathcal{T}^{S*}M,\omega)$ and let $S\in C^{\infty}(M\times\mathbb{R})$. Then, the following statements are equivalent:
(i) for every $M$-valued diffusion $X$ satisfying

\[(DX(t),QX(t))=d^{2}(\tau_{M}^{*})_{d^{2}S(t,X(t))}A_{H},\]

the $\mathcal{T}^{S*}M$-valued process $d^{2}S\circ X$ is a horizontal integral process of $A_{H}$;
(ii) $S$ satisfies the Hamilton-Jacobi-Bellman equation

\[\frac{\partial S}{\partial t}+H(d^{2}S,t)=f(t), \tag{6.39}\]

for some function $f$ depending only on $t$.

Proof.

Let $\mathbf{X}=d^{2}S\circ X$ and set $x^{i}=x^{i}\circ d^{2}S$, $p_{i}=p_{i}\circ d^{2}S$, $o_{jk}=o_{jk}\circ d^{2}S$. Then

\[p_{i}(t,x)=\frac{\partial S}{\partial x^{i}}(t,x),\quad o_{jk}(t,x)=\frac{\partial^{2}S}{\partial x^{j}\partial x^{k}}(t,x). \tag{6.40}\]

These imply that the last equation of the system (6.17) holds. Since

\[d^{2}(\tau_{M}^{*})_{\mathbf{X}(t)}A_{H}=\frac{\partial H}{\partial p_{i}}(\mathbf{X}(t))\frac{\partial}{\partial{x^{i}}}+\frac{\partial H}{\partial o_{jk}}(\mathbf{X}(t))\frac{\partial^{2}}{\partial x^{j}\partial x^{k}},\]

the first two equations in (6.10) or (6.17) hold. Hence, for the process $\mathbf{X}=d^{2}S\circ X$ to be a horizontal integral process of $A_{H}$, it is necessary and sufficient that the third equation in (6.17) holds. Plugging the first equation of (6.40) into the third equation, it reads as

\[\bigg(\frac{\partial}{\partial t}+\frac{\partial H}{\partial p_{j}}\frac{\partial}{\partial{x^{j}}}+\frac{\partial H}{\partial o_{jk}}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg)\frac{\partial S}{\partial x^{i}}=-\frac{\partial H}{\partial x^{i}}.\]

A straightforward reinterpretation yields

\[\frac{\partial}{\partial{x^{i}}}\left[\frac{\partial S}{\partial t}+H\left(x^{j},\frac{\partial S}{\partial x^{j}},\frac{\partial^{2}S}{\partial x^{j}\partial x^{k}},t\right)\right]=0.\]

The result follows. ∎
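The reinterpretation step in the proof can be spelled out as follows. This is a sketch: it uses only the chain rule, the commutation of partial derivatives, and (6.40), and it assumes the same summation convention over $(j,k)$ for $\frac{\partial H}{\partial o_{jk}}$ as in the operator displayed in the proof.

```latex
\begin{aligned}
\frac{\partial}{\partial x^{i}}\left[\frac{\partial S}{\partial t}+H\big(x,p(t,x),o(t,x),t\big)\right]
&=\frac{\partial^{2}S}{\partial x^{i}\partial t}
+\frac{\partial H}{\partial x^{i}}
+\frac{\partial H}{\partial p_{j}}\frac{\partial p_{j}}{\partial x^{i}}
+\frac{\partial H}{\partial o_{jk}}\frac{\partial o_{jk}}{\partial x^{i}}\\
&=\left(\frac{\partial}{\partial t}
+\frac{\partial H}{\partial p_{j}}\frac{\partial}{\partial x^{j}}
+\frac{\partial H}{\partial o_{jk}}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\right)\frac{\partial S}{\partial x^{i}}
+\frac{\partial H}{\partial x^{i}}.
\end{aligned}
```

The right-hand side vanishes for every $i$ precisely when the third equation of (6.17) holds along $d^{2}S$. In that case the bracket has no dependence on $x$ within a coordinate chart, which is where the function $f(t)$ in (6.39) comes from.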

Remark 6.20.

If $S$ solves the HJB equation (6.39), then $\tilde{S}=S-\tilde{f}$ solves (6.38), where $\tilde{f}$ is a primitive function of $f$. As a matter of fact, one can always absorb the time-dependent function $f$ into the 2nd-order Hamiltonian function $H$, so that the HJB equation (6.39) takes the same form as (6.38). More precisely, if we let $\tilde{H}=H-f$, then Theorem 6.19 also holds with $\tilde{H}$ and the zero function in place of $H$ and $f$, respectively. A similar argument holds for the S-H equations (6.10). Indeed, adding a function $f$ depending only on time to a 2nd-order Hamiltonian does not change its S-H equations.

Example 6.21.

The function $S=\ln u$ considered in Section 6.3 satisfies the Hamilton-Jacobi-Bellman equation (6.28), which is exactly $\frac{\partial S}{\partial t}+H(d^{2}S)=0$ with the second-order Hamiltonian $H$ given in (6.26). Hence, Theorem 6.19 yields that the process $d^{2}S\circ X$ is a horizontal integral process of $A_{H}$, which coincides with (6.32). The Euclidean case of this argument was found in [15, p. 180] or [89, Eq. (4.20)].

By (6.38) and (6.40), the total mean derivative of a 2nd-order Hamilton’s principal function $S$ is given by

\[\mathbf{D}_{t}S=\frac{\partial S}{\partial t}+D^{i}x\frac{\partial S}{\partial x^{i}}+\frac{1}{2}Q^{jk}x\frac{\partial^{2}S}{\partial x^{j}\partial x^{k}}=p_{i}D^{i}x+\frac{1}{2}o_{jk}Q^{jk}x-H(x,p,o,t), \tag{6.41}\]

where $(p(t,x),o(t,x))=d^{2}S(t,x)$ as in (6.40).

6.6 Second-order Hamiltonian functions from classical ones

In the presence of a linear connection $\nabla$ on $M$, we are able to reduce (or produce) second-order Hamiltonian functions to (from) classical ones.

Let a second-order Hamiltonian function $H:\mathcal{T}^{S*}M\times\mathbb{R}\to\mathbb{R}$ be given. We make use of the fiber-linear bundle injection $\hat{\iota}^{*}_{\nabla}:T^{*}M\to\mathcal{T}^{S*}M$ in (5.5) to define a classical Hamiltonian by

\[H_{0}=H\circ(\hat{\iota}^{*}_{\nabla}\times\mathbf{Id}_{\mathbb{R}}):T^{*}M\times\mathbb{R}\to\mathbb{R}. \tag{6.42}\]

In canonical coordinates, it reads $H_{0}(x,p,t)=H(x,p,(\Gamma_{jk}^{i}(x)p_{i}),t)$. Introducing a family of auxiliary variables by

\[\hat{o}_{jk}=\hat{o}_{jk}(x,p):=\Gamma_{jk}^{i}(x)p_{i}, \tag{6.43}\]

we can write

\[H_{0}(x,p,t)=H(x,p,\hat{o}(x,p),t).\]

We say $H$ reduces to $H_{0}$ under the connection $\nabla$, or $H_{0}$ is the $\nabla$-reduction of $H$.

Clearly, the way to lift a classical Hamiltonian $H_{0}:T^{*}M\times\mathbb{R}\to\mathbb{R}$ to a second-order Hamiltonian function that reduces to $H_{0}$ under $\nabla$ is not unique. But there is a canonical lift when we are provided with a symmetric $(2,0)$-tensor field $g$ (not necessarily Riemannian), given by

\[\overline{H}^{g}_{0}(x,p,o,t):=H_{0}(x,p,t)+\textstyle{\frac{1}{2}}g^{jk}(x)\left(o_{jk}-\Gamma^{i}_{jk}(x)p_{i}\right)=H_{0}(x,p,t)+\textstyle{\frac{1}{2}}g^{jk}(x)o_{jk}^{\nabla}. \tag{6.44}\]

Then, $H_{0}$ is the $\nabla$-reduction of $\overline{H}^{g}_{0}$, and

\[\textstyle{\frac{1}{2}}o_{jk}g^{jk}-\overline{H}^{g}_{0}(x,p,o,t)=\textstyle{\frac{1}{2}}\hat{o}_{jk}g^{jk}-\overline{H}^{g}_{0}(x,p,\hat{o},t). \tag{6.45}\]

We call $\overline{H}^{g}_{0}$ the $(g,\nabla)$-canonical lift of $H_{0}$. If $g$ is a Riemannian metric and $\nabla$ is the associated Levi-Civita connection, then we simply call $\overline{H}^{g}_{0}$ the $g$-canonical lift of $H_{0}$. If there is a classical Hamiltonian $H_{0}$ such that the second-order Hamiltonian $H$ is the $(g,\nabla)$- (or $g$-) canonical lift of $H_{0}$, we say $H$ is $(g,\nabla)$- (or $g$-) canonical.
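The reduction property and the identity (6.45) can be checked numerically at a single point. The following is a minimal sketch and not part of the construction above: the function names (`o_hat`, `contract`, `h0`, `canonical_lift`), the sample classical Hamiltonian $H_{0}=\frac{1}{2}g^{ij}p_{i}p_{j}$, and the randomly drawn arrays standing for $g^{jk}$, $\Gamma^{i}_{jk}$, $p_{i}$ and $o_{jk}$ at a fixed $x$ are all illustrative assumptions.

```python
import random

def o_hat(gamma, p):
    """Auxiliary variables (6.43): o_hat[j][k] = Gamma^i_{jk} p_i.

    gamma[i][j][k] stores Gamma^i_{jk} at a fixed point x.
    """
    d = len(p)
    return [[sum(gamma[i][j][k] * p[i] for i in range(d)) for k in range(d)]
            for j in range(d)]

def contract(g_inv, o):
    """Full contraction g^{jk} o_{jk}."""
    d = len(o)
    return sum(g_inv[j][k] * o[j][k] for j in range(d) for k in range(d))

def h0(g_inv, p):
    """Sample classical Hamiltonian H_0 = 1/2 g^{ij} p_i p_j (an assumption)."""
    d = len(p)
    return 0.5 * sum(g_inv[i][j] * p[i] * p[j]
                     for i in range(d) for j in range(d))

def canonical_lift(g_inv, gamma, p, o):
    """Lift (6.44): H_0 + 1/2 g^{jk} (o_jk - Gamma^i_{jk} p_i)."""
    return h0(g_inv, p) + 0.5 * (contract(g_inv, o)
                                 - contract(g_inv, o_hat(gamma, p)))

def random_symmetric(d, rng):
    """Random symmetric d x d array with entries in [-1, 1]."""
    a = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(d)]
    return [[0.5 * (a[j][k] + a[k][j]) for k in range(d)] for j in range(d)]

rng = random.Random(0)
d = 3
g_inv = random_symmetric(d, rng)   # symmetric (2,0)-tensor, not necessarily positive
gamma = [random_symmetric(d, rng) for _ in range(d)]  # symmetric in (j, k) for simplicity
p = [rng.uniform(-1.0, 1.0) for _ in range(d)]
o = random_symmetric(d, rng)
oh = o_hat(gamma, p)

# Reduction: evaluating the lift at o = o_hat returns H_0.
reduction_gap = canonical_lift(g_inv, gamma, p, oh) - h0(g_inv, p)
# Identity (6.45): both sides equal 1/2 g^{jk} o_hat_{jk} - H_0.
lhs = 0.5 * contract(g_inv, o) - canonical_lift(g_inv, gamma, p, o)
rhs = 0.5 * contract(g_inv, oh) - canonical_lift(g_inv, gamma, p, oh)
print(abs(reduction_gap), abs(lhs - rhs))
```

Both printed gaps should be zero up to floating-point rounding, whatever values are drawn, because the $o$-dependence of the lift is exactly the term $\frac{1}{2}g^{jk}o_{jk}$.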

As an example, the second-order Hamiltonian $H$ of (6.26) is $g$-canonical and reduces to $H_{0}(x,p)=\frac{1}{2}g^{ij}(x)p_{i}p_{j}+b^{i}(x)p_{i}+F(x)$.

Furthermore, for the canonical transformation $\mathbf{F}:\mathcal{T}^{S*}M\to\mathcal{T}^{S*}N$ in Definition 6.14, we can reduce its associated function $H_{\mathbf{F}}\in C^{\infty}(\mathcal{T}^{S*}M\times\mathbb{R})$ to a classical function $H^{0}_{\mathbf{F}}\in C^{\infty}(T^{*}M\times\mathbb{R})$ via (6.42). As a consequence of (6.34), the projection of $\mathbf{F}$, i.e., the map $\mathbb{F}:T^{*}M\times\mathbb{R}\to T^{*}N\times\mathbb{R}$, satisfies $\mathbb{F}^{*}\tilde{\eta}_{0}=\omega^{0}_{H^{0}_{\mathbf{F}}}$, where $\omega^{0}_{H^{0}_{\mathbf{F}}}=\tilde{\omega}_{0}+dH^{0}_{\mathbf{F}}\wedge dF^{0}$. It follows that $\mathbb{F}$ is a classical canonical transformation [1, Definition 5.2.6].

We will return to this issue in Section 7.4, where the second-order Legendre transform will be developed. In particular, we will show there that for the canonical 2nd-order Hamiltonian in (6.44), the corresponding 2nd-order Hamilton’s equations (6.17) can be rewritten on the cotangent bundle $T^{*}M$ in a global fashion; see Theorem 7.22.

7 Stochastic Lagrangian mechanics

In this section, we fix a Riemannian metric $g$ on the manifold $M$ and a $g$-compatible linear connection $\nabla$. Note that such $g$ and $\nabla$ always exist but are not unique in general.

We will denote by $|\cdot|$ and $\langle\cdot,\cdot\rangle$ the Riemannian norm and inner product, respectively. Also, denote by $\check{g}$ the inverse metric tensor of $g$, and by $(\Gamma_{jk}^{i})$ the Christoffel symbols of $\nabla$. We observe that $\check{g}$ is a $(2,0)$-tensor field. Denote by $R$ the Riemann curvature tensor and by $\mathrm{Ric}$ the Ricci $(1,1)$-tensor.

7.1 Mean covariant derivatives

Definition 7.1 (Vector fields and 1-forms along diffusions).

Let $X$ be a diffusion on $M$. By a vector field along $X$, we mean a $TM$-valued process $V$ such that $\tau_{M}(V(t))=X(t)$ for all $t$. Similarly, by a 1-form along $X$, we mean a $T^{*}M$-valued process $\eta$ such that $\tau^{*}_{M}(\eta(t))=X(t)$ for all $t$.

Clearly, for a time-dependent vector field $V$ on $M$, the restriction of $V$ on $X$, i.e., $\{V_{(t,X(t))}\}$, is a vector field along $X$. In this case, we call $\{V_{(t,X(t))}\}$ a vector field restricted on $X$. In this way, vector fields restricted on $X$ are just $TM$-valued horizontal diffusions projecting to $X$. The same applies to 1-forms.

Definition 7.2 (Parallelisms along diffusions).

Let $X\in I_{t_{0}}(M)$. A vector field $V$ along $X$ is said to be parallel along $X$ if the following Stratonovich SDE holds in local coordinates,

\[dV^{i}(t)+\Gamma_{jk}^{i}(X(t))V^{j}(t)\circ dX^{k}(t)=0. \tag{7.1}\]

A 1-form $\eta$ along $X$ is said to be parallel along $X$ if

\[d\eta_{j}(t)-\Gamma_{jk}^{i}(X(t))\eta_{i}(t)\circ dX^{k}(t)=0.\]

Definition 7.3 (Stochastic parallel displacements).

Given a diffusion $X\in I_{t_{0}}(M)$ and a (random) vector $v\in T_{X(t_{0})}M$, the stochastic parallel displacement of $v$ along $X$ is the extension of $v$ to a parallel vector field $V$ along $X$, that is, $V$ satisfies the SDE (7.1) with initial condition $V(t_{0})=v$. We denote $\Gamma(X)_{t_{0}}^{t}v:=V(t)$ and $\Gamma(X)_{t}^{t_{0}}V(t):=v$. The stochastic parallel displacement of a (random) covector $\eta\in T^{*}_{X(t_{0})}M$ along $X$ is defined in a similar fashion.

Definition 7.4 (Damped parallel displacements).

Let $X\in I_{t_{0}}(M)$. Given a (random) vector $v\in T_{X(t_{0})}M$ and covector $\eta_{0}\in T^{*}_{X(t_{0})}M$, the damped parallel displacement of $v$ along $X$ is the extension of $v$ to a vector field $V$ along $X$ that satisfies the SDE

\[dV^{i}(t)+\Gamma_{jk}^{i}(X(t))V^{j}(t)\circ dX^{k}(t)+\frac{1}{2}R^{i}_{kjl}(X(t))V^{j}(t)(QX)^{kl}(t)dt=0,\quad V(t_{0})=v. \tag{7.2}\]

The damped parallel displacement of $\eta_{0}$ along $X$ is the extension of $\eta_{0}$ to a 1-form $\eta$ along $X$ that satisfies the SDE

\[d\eta_{j}(t)-\Gamma_{jk}^{i}(X(t))\eta_{i}(t)\circ dX^{k}(t)-\frac{1}{2}R^{i}_{kjl}(X(t))\eta_{i}(t)(QX)^{kl}(t)dt=0,\quad\eta(t_{0})=\eta_{0}. \tag{7.3}\]

We denote $\overline{\Gamma}(X)_{t_{0}}^{t}v:=V(t)$, $\overline{\Gamma}(X)_{t_{0}}^{t}\eta_{0}:=\eta(t)$, and $\overline{\Gamma}(X)_{t}^{t_{0}}V(t):=v$, $\overline{\Gamma}(X)_{t}^{t_{0}}\eta(t):=\eta_{0}$.

If $V$ and $\eta$ are restrictions on $X$, that is, $V(t)=V_{(t,X(t))}$ and $\eta(t)=\eta_{(t,X(t))}$, then equations (7.2) and (7.3) can be rewritten, respectively, as

\[\frac{\partial V}{\partial t}dt+\nabla_{\circ dX}V+\frac{1}{2}R(V,\circ dX)\circ dX=0,\qquad\frac{\partial\eta}{\partial t}dt+\nabla_{\circ dX}\eta-\frac{1}{2}R(\eta,\circ dX)\circ dX=0,\]

where we mean by $R(\eta,V)W$ the 1-form $[R(\eta^{\sharp},V)W]^{\flat}$. The Stratonovich stochastic differentials can be transformed into Itô ones. For example, (7.3) is equivalent to

\[d\eta_{j}(t)=\Gamma_{jk}^{i}(X(t))\eta_{i}(t)dX^{k}(t)+\frac{1}{2}(QX)^{kl}(t)\left(\frac{\partial\Gamma_{jk}^{i}}{\partial x^{l}}+\Gamma_{jk}^{m}\Gamma_{ml}^{i}\right)(X(t))\eta_{i}(t)dt+\frac{1}{2}R^{i}_{kjl}(X(t))\eta_{i}(t)(QX)^{kl}(t)dt. \tag{7.4}\]
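The correction term in (7.4) can be recovered with the usual conversion rule $a\circ dX=a\,dX+\frac{1}{2}d[a,X]$. The sketch below assumes the convention $d[X^{l},X^{k}]=(QX)^{lk}dt$ and uses the fact that the $dt$-terms of (7.3) do not contribute to quadratic covariations.

```latex
\begin{aligned}
\Gamma_{jk}^{i}(X)\eta_{i}\circ dX^{k}
&=\Gamma_{jk}^{i}(X)\eta_{i}\,dX^{k}+\frac{1}{2}d\big[\Gamma_{jk}^{i}(X)\eta_{i},X^{k}\big],\\
d[\eta_{i},X^{k}]
&=\Gamma_{il}^{m}(X)\eta_{m}\,d[X^{l},X^{k}]
=\Gamma_{il}^{m}(X)\eta_{m}(QX)^{kl}dt,\\
d\big[\Gamma_{jk}^{i}(X)\eta_{i},X^{k}\big]
&=\frac{\partial\Gamma_{jk}^{i}}{\partial x^{l}}(X)\eta_{i}\,d[X^{l},X^{k}]
+\Gamma_{jk}^{i}(X)\,d[\eta_{i},X^{k}]\\
&=(QX)^{kl}\left(\frac{\partial\Gamma_{jk}^{i}}{\partial x^{l}}
+\Gamma_{jk}^{m}\Gamma_{ml}^{i}\right)(X)\,\eta_{i}\,dt.
\end{aligned}
```

The last line relabels the summation indices $i\leftrightarrow m$ in the product of Christoffel symbols. Half of it is the second term on the right-hand side of (7.4).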
Remark 7.5.

The notion of stochastic parallel displacement was introduced by Itô [45] and Dynkin [23]. The notion of damped parallel displacement is due to Malliavin [62]. It was originally introduced by Dohrn and Guerra [21], who called it the geodesic correction to the stochastic parallel displacement.

Lemma 7.6.

Let $X\in I_{t_{0}}(M)$.
(i). Let $\eta$ be a 1-form on $M$ parallel along $X$. If $V$ is a vector field on $M$ which is also parallel along $X$, then $\eta(V)(t)=\eta(V)(t_{0})$ for all $t\geq t_{0}$; if $v\in T_{X(t_{0})}M$, then $\eta(\Gamma(X)_{t_{0}}^{t}v)(t)=\eta(v)(t_{0})$ for all $t\geq t_{0}$.
(ii). Let $\eta$ be a 1-form along $X$ satisfying the SDE (7.3). If $V$ is a vector field along $X$ satisfying the SDE (7.2), then $\eta(V)(t)=\eta(V)(t_{0})$ for all $t\geq t_{0}$; if $v\in T_{X(t_{0})}M$, then $\eta(\overline{\Gamma}(X)_{t_{0}}^{t}v)(t)=\eta(v)(t_{0})$ for all $t\geq t_{0}$.

Proof.

We only prove Assertion (ii), as (i) is similar. Since Stratonovich stochastic differentials obey Leibniz’s rule, we have

\[
\begin{split}
d[\eta(V)]&=\eta_{i}\circ dV^{i}+V^{j}\circ d\eta_{j}\\
&=-\eta_{i}\Gamma_{jk}^{i}V^{j}\circ dX^{k}-\frac{1}{2}\eta_{i}R^{i}_{kjl}V^{j}(QX)^{kl}dt+V^{j}\Gamma_{jk}^{i}\eta_{i}\circ dX^{k}+\frac{1}{2}V^{j}R^{i}_{kjl}\eta_{i}(QX)^{kl}dt\\
&=0.
\end{split}
\]

This proves the first statement of (ii). The second statement of (ii) follows by letting $V(t):=\overline{\Gamma}(X)_{t_{0}}^{t}v$. ∎

Definition 7.7 (Mean covariant derivatives along diffusions).

Let $X$ be a diffusion on $M$, and let $V$ and $\eta$ be a time-dependent vector field and 1-form along $X$, respectively. The (forward) mean covariant derivative of $V$ with respect to $X$ is a time-dependent vector field $\frac{\mathbf{D}V}{dt}$ along $X$, defined by

\[\frac{\mathbf{D}V}{dt}(t)=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{\Gamma(X)_{t+\epsilon}^{t}V(t+\epsilon)-V(t)}{\epsilon}\Bigg|\mathcal{P}_{t}\right]. \tag{7.5}\]

The damped mean covariant derivative of $V$ with respect to $X$ is the time-dependent vector field $\frac{\overline{\mathbf{D}}V}{dt}$ along $X$ defined by (7.5) with $\overline{\Gamma}$ in place of $\Gamma$. Similarly, we can define $\frac{\mathbf{D}\eta}{dt}$ and $\frac{\overline{\mathbf{D}}\eta}{dt}$.
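To get a feeling for the limit in (7.5), one can evaluate the difference quotient in the simplest flat situation. The following is a minimal sketch under assumptions that are not part of the definition: $M=\mathbb{R}$ with $\Gamma\equiv 0$ (so parallel displacement is the identity and $R=0$), a diffusion with constant drift `b` and constant diffusion coefficient `sigma`, a hypothetical test field `V`, and a three-point Gauss-Hermite rule in place of the conditional expectation. In this setting Itô's formula gives the value $\partial_{t}V+b\,\partial_{x}V+\frac{1}{2}\sigma^{2}\partial_{x}^{2}V$, against which the quotient is compared.

```python
import math

# Three-point Gauss-Hermite rule for a standard normal variable Z:
# E[f(Z)] ~ sum(w * f(z)); exact for polynomials of degree <= 5.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)

def V(t, x):
    """Hypothetical test field on the real line."""
    return math.sin(x) + t * x * x

def mean_derivative(t, x, b, sigma, eps):
    """Difference quotient of (7.5) on flat R (identity parallel transport).

    Given X(t) = x, X(t + eps) = x + b*eps + sigma*sqrt(eps)*Z, Z ~ N(0, 1).
    """
    expected = sum(
        w * V(t + eps, x + b * eps + sigma * math.sqrt(eps) * z)
        for z, w in zip(NODES, WEIGHTS)
    )
    return (expected - V(t, x)) / eps

def generator_formula(t, x, b, sigma):
    """dV/dt + b dV/dx + (sigma^2 / 2) d^2V/dx^2 for the test field above."""
    dV_dt = x * x
    dV_dx = math.cos(x) + 2.0 * t * x
    d2V_dx2 = -math.sin(x) + 2.0 * t
    return dV_dt + b * dV_dx + 0.5 * sigma ** 2 * d2V_dx2

t0, x0, b, sigma = 1.0, 0.3, 0.5, 0.8
estimate = mean_derivative(t0, x0, b, sigma, eps=1e-4)
exact = generator_formula(t0, x0, b, sigma)
print(estimate, exact)
```

The quadrature rule is used instead of Monte Carlo sampling so that the result is deterministic; the remaining discrepancy is of order `eps`. On a curved manifold the parallel displacement $\Gamma(X)_{t+\epsilon}^{t}$ would have to be applied before taking the difference, which this flat sketch does not attempt.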

Lemma 7.8.

(i). Let $V$ and $\eta$ be a vector field and a 1-form along $X$. If $\eta$ is parallel along $X$, then

\[\textstyle{\mathbf{E}\left[\eta\left(\frac{\mathbf{D}V}{dt}\right)\right]=\mathbf{E}\left(D[\eta(V)]\right).} \tag{7.6}\]

If $\eta$ satisfies the SDE (7.3), then (7.6) holds true with $\frac{\overline{\mathbf{D}}}{dt}$ instead of $\frac{\mathbf{D}}{dt}$.

(ii). Let $V$ be a vector field restricted on $X$. Then

\[\frac{\overline{\mathbf{D}}V}{dt}=\frac{\mathbf{D}V}{dt}+\frac{1}{2}(QX)^{ij}R(V,\partial_{i})\partial_{j}=\frac{\partial V}{\partial t}+\nabla_{D_{\nabla}X}V+\frac{1}{2}(QX)^{ij}\left(\nabla^{2}_{\partial_{i},\partial_{j}}V+R(V,\partial_{i})\partial_{j}\right).\]

(iii). Let $\eta$ be a 1-form restricted on $X$. Then

\[\frac{\overline{\mathbf{D}}\eta}{dt}=\frac{\mathbf{D}\eta}{dt}-\frac{1}{2}(QX)^{ij}R(\eta,\partial_{j})\partial_{i}=\frac{\partial\eta}{\partial t}+\nabla_{D_{\nabla}X}\eta+\frac{1}{2}(QX)^{ij}\left(\nabla^{2}_{\partial_{i},\partial_{j}}\eta-R(\eta,\partial_{j})\partial_{i}\right).\]

(iv). Let $V$ and $\eta$ be a vector field and a 1-form restricted on $X$. Then

\[\mathbf{D}_{t}[\eta(V)]=\eta\left(\frac{\mathbf{D}V}{dt}\right)+\frac{\mathbf{D}\eta}{dt}(V)+(QX)^{ij}(\nabla_{\partial_{i}}\eta)(\nabla_{\partial_{j}}V)=\eta\left(\frac{\overline{\mathbf{D}}V}{dt}\right)+\frac{\overline{\mathbf{D}}\eta}{dt}(V)+(QX)^{ij}(\nabla_{\partial_{i}}\eta)(\nabla_{\partial_{j}}V).\]

Proof.

(i). By Lemma 7.6.(i), we have

\[
\begin{split}
\mathbf{E}\left[\eta\left(\frac{\mathbf{D}V}{dt}\right)(t)\right]&=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{\eta(t)(\Gamma(X)_{t+\epsilon}^{t}V(t+\epsilon))-\eta(t)(V(t))}{\epsilon}\right]\\
&=\lim_{\epsilon\to 0^{+}}\mathbf{E}\left[\frac{\eta(V)(t+\epsilon)-\eta(V)(t)}{\epsilon}\right]\\
&=\mathbf{E}\left(D[\eta(V)(t)]\right).
\end{split}
\]

This proves the first statement of (i). The second statement of (i) follows by a similar argument with $\frac{\overline{\mathbf{D}}}{dt}$ in place of $\frac{\mathbf{D}}{dt}$ and $\overline{\Gamma}$ in place of $\Gamma$.

(ii). It suffices to derive the expression for $\frac{\overline{\mathbf{D}}V}{dt}$. Suppose that $\eta$ is a 1-form satisfying the SDE (7.3) and the diffusion $X$ satisfies $QX(t)=(\sigma\circ\sigma^{*})(t,X(t))$. Then, we apply Itô’s formula to $\eta(V)(X(t))$ and make use of (2.20) and (7.4). We get

\[
\begin{split}
d[\eta(V)]&=d(\eta_{i}V^{i})=\eta_{i}\left(\frac{\partial V^{i}}{\partial t}dt+\frac{\partial V^{i}}{\partial x^{j}}dX^{j}+\frac{1}{2}\frac{\partial^{2}V^{i}}{\partial x^{j}\partial x^{k}}d[X^{j},X^{k}]\right)+V^{j}d\eta_{j}+d[\eta_{j},V^{j}]\\
&=\eta_{i}\left(\frac{\partial V^{i}}{\partial t}+\frac{\partial V^{i}}{\partial x^{j}}(DX)^{j}+\frac{1}{2}\frac{\partial^{2}V^{i}}{\partial x^{j}\partial x^{k}}(QX)^{jk}\right)dt+\eta_{i}\frac{\partial V^{i}}{\partial x^{j}}\sigma_{r}^{j}dB^{r}\\
&\quad+V^{j}\left[\Gamma_{jk}^{i}(DX)^{k}+\frac{1}{2}(QX)^{kl}\left(\frac{\partial\Gamma_{jk}^{i}}{\partial x^{l}}+\Gamma_{jk}^{m}\Gamma_{ml}^{i}\right)+\frac{1}{2}R^{i}_{kjl}(QX)^{kl}\right]\eta_{i}dt+V^{j}\Gamma_{jk}^{i}\eta_{i}\sigma_{r}^{k}dB^{r}\\
&\quad+\Gamma_{jk}^{i}\eta_{i}\frac{\partial V^{j}}{\partial x^{l}}(QX)^{kl}dt\\
&=\eta_{i}\left[\frac{\partial V^{i}}{\partial t}+\left(\frac{\partial V^{i}}{\partial x^{k}}+V^{j}\Gamma_{jk}^{i}\right)(D_{\nabla}X)^{k}\right]dt\\
&\quad+\frac{1}{2}\eta_{i}(QX)^{kl}\left[-\frac{\partial V^{i}}{\partial x^{j}}\Gamma^{j}_{kl}+\frac{\partial^{2}V^{i}}{\partial x^{k}\partial x^{l}}+V^{j}\left(-\Gamma_{jm}^{i}\Gamma^{m}_{kl}+\frac{\partial\Gamma_{jk}^{i}}{\partial x^{l}}+\Gamma_{jk}^{m}\Gamma_{ml}^{i}\right)+2\Gamma_{jk}^{i}\frac{\partial V^{j}}{\partial x^{l}}\right]dt\\
&\quad+\frac{1}{2}\eta_{i}R^{i}_{kjl}(QX)^{kl}V^{j}dt+\eta_{i}\left(\frac{\partial V^{i}}{\partial x^{k}}+V^{j}\Gamma_{jk}^{i}\right)\sigma_{r}^{k}dB^{r}\\
&=\eta\left(\frac{\partial V}{\partial t}+\nabla_{D_{\nabla}X}V+\frac{1}{2}(QX)^{ij}\left(\nabla^{2}_{\partial_{i},\partial_{j}}V+R(V,\partial_{i})\partial_{j}\right)\right)dt+\eta\left(\nabla_{\sigma_{r}}V\right)dB^{r}.
\end{split}
\]

Hence, the result (i) implies

\[\mathbf{E}\left[\eta\left(\frac{\overline{\mathbf{D}}V}{dt}\right)\right]=\mathbf{E}\left(D[\eta(V)(t)]\right)=\mathbf{E}\left[\eta\left(\frac{\partial V}{\partial t}+\nabla_{D_{\nabla}X}V+\frac{1}{2}(QX)^{ij}\left(\nabla^{2}_{\partial_{i},\partial_{j}}V+R(V,\partial_{i})\partial_{j}\right)\right)\right].\]

The arbitrariness of $\eta$ yields (ii).

(iii). Similar to (ii).

(iv). We only prove the first equality, as the second is similar. By (4.6),

\[
\begin{split}
\mathbf{D}_{t}[\eta(V)]&=\left(\frac{\partial}{\partial t}+(D_{\nabla}X)^{i}\partial_{i}+\frac{1}{2}(QX)^{ij}\nabla^{2}_{\partial_{i},\partial_{j}}\right)[\eta(V)]\\
&=\left(\frac{\partial\eta}{\partial t}\right)(V)+\eta\left(\frac{\partial V}{\partial t}\right)+\left(\nabla_{D_{\nabla}X}\eta\right)(V)+\eta\left(\nabla_{D_{\nabla}X}V\right)\\
&\quad+\frac{1}{2}(QX)^{ij}\left[\left(\nabla^{2}_{\partial_{i},\partial_{j}}\eta\right)(V)+\eta\left(\nabla^{2}_{\partial_{i},\partial_{j}}V\right)+\left(\nabla_{\partial_{i}}\eta\right)\left(\nabla_{\partial_{j}}V\right)+\left(\nabla_{\partial_{j}}\eta\right)\left(\nabla_{\partial_{i}}V\right)\right]\\
&=\eta\left(\frac{\mathbf{D}V}{dt}\right)+\frac{\mathbf{D}\eta}{dt}(V)+(QX)^{ij}(\nabla_{\partial_{i}}\eta)(\nabla_{\partial_{j}}V).
\end{split}
\]

The result follows. ∎

If $QX(t)=\check{g}(X(t))$, then

\[\frac{\overline{\mathbf{D}}V}{dt}=\frac{\partial V}{\partial t}+\nabla_{D_{\nabla}X}V+\frac{1}{2}\Delta V+\frac{1}{2}\mathrm{Ric}(V),\]

and similarly,

\[\frac{\overline{\mathbf{D}}\eta}{dt}=\frac{\partial\eta}{\partial t}+\nabla_{D_{\nabla}X}\eta+\frac{1}{2}\Delta\eta-\frac{1}{2}\mathrm{Ric}(\eta)=\frac{\partial\eta}{\partial t}+\nabla_{D_{\nabla}X}\eta+\frac{1}{2}\Delta_{\mathrm{LD}}\eta, \tag{7.7}\]

where $\Delta$ is the connection Laplacian, and $\Delta_{\mathrm{LD}}=-(dd^{*}+d^{*}d)$ is the Laplace-de Rham operator on forms. The relation $\Delta_{\mathrm{LD}}=\Delta-\mathrm{Ric}$ is due to the Weitzenböck identity [74, Theorem 9.4.1]. We remark here that the operator $\Delta+\mathrm{Ric}$ acting on vector fields is also called the Laplace-de Rham operator in [21].

In the context of fluid dynamics, the operator $\frac{\partial}{\partial t}+\nabla_{v}$, with $v$ a vector field, is often referred to as the material derivative or hydrodynamic derivative. So the mean covariant derivative $\frac{\mathbf{D}}{dt}$ and its damped variant $\frac{\overline{\mathbf{D}}}{dt}$ can be regarded as stochastic deformations of the material derivative.

7.2 A stochastic stationary-action principle

In this section, we will establish a type of stochastic stationary-action principle: the stochastic Hamilton’s principle. Another version for systems with conserved energy, the stochastic Maupertuis’s principle, can be found in Appendix C.

In contrast to second-order Hamiltonians, not all real-valued functions on $\mathcal{T}^{S}M$ can be used as second-order Lagrangians in stochastic Lagrangian mechanics. This was hinted at in Section 6.3 and mentioned in Remark 6.10. For this reason, we will produce a class of second-order Lagrangians from classical Lagrangians, via the fiber-linear bundle projection $\varrho_{\nabla}$ in (3.3) and the $\nabla$-canonical coordinates $(D_{\nabla}^{i}x)$ in (3.2).

Definition 7.9.

By an admissible second-order Lagrangian, we mean a function $L:\mathbb{R}\times\mathcal{T}^{S}M\to\mathbb{R}$ such that there exists a classical Lagrangian $L_{0}:\mathbb{R}\times TM\to\mathbb{R}$ satisfying $L=L_{0}\circ(\mathbf{Id}_{\mathbb{R}}\times\varrho_{\nabla})$. We call $L$ the $\nabla$-lift of $L_{0}$.

In local coordinates, the $\nabla$-lift $L$ of $L_{0}$ is expressed as

\[L(t,x,Dx,Qx)=L_{0}\circ\varrho_{\nabla}(t,x,Dx,Qx)=L_{0}(t,x,D_{\nabla}x). \tag{7.8}\]

Let $T>0$. Our stochastic variational problem consists in finding the extrema (maxima or minima) of the stochastic action functional

\[\mathcal{S}[X;0,T]:=\mathbf{E}\int_{0}^{T}L\left(t,X(t),DX(t),QX(t)\right)dt=\mathbf{E}\int_{0}^{T}L_{0}\left(t,X(t),D_{\nabla}X(t)\right)dt \tag{7.9}\]

over a suitable domain of diffusions $X$ on $M$, where $L$ is an admissible second-order Lagrangian lifted from $L_{0}$.
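The action functional (7.9) can be approximated numerically in a flat toy setting. The following is a minimal sketch, with assumptions that go beyond the text: $M=\mathbb{R}$ with the Euclidean metric (so that $QX=1$ and $D_{\nabla}X=DX$), a Markovian drift field `drift(t, x)` playing the role of $DX(t)$, Euler-Maruyama time stepping, and no constraint on the final distribution. The names `stochastic_action` and `kinetic` are hypothetical.

```python
import math
import random

def stochastic_action(L0, drift, x0, T, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E int_0^T L0(t, X, DX) dt on flat R.

    Sketch assumptions: Euclidean metric, unit diffusion coefficient
    (QX = 1), DX(t) = drift(t, X(t)), Euler-Maruyama discretization.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, action = x0, 0.0
        for k in range(n_steps):
            t = k * dt
            v = drift(t, x)                 # mean velocity DX(t)
            action += L0(t, x, v) * dt      # left-endpoint Riemann sum
            x += v * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += action
    return total / n_paths

def kinetic(t, x, v):
    """Hypothetical classical Lagrangian L0 = |v|^2 / 2."""
    return 0.5 * v * v

c, T = 0.7, 2.0
action = stochastic_action(kinetic, lambda t, x: c, x0=0.0, T=T,
                           n_steps=200, n_paths=50)
print(action, 0.5 * c * c * T)
```

For a constant drift and the purely kinetic Lagrangian the integrand does not depend on the path, so the estimate agrees with $\frac{1}{2}c^{2}T$ up to rounding, which makes this a convenient sanity check. For Lagrangians depending on $x$, the estimate carries both Monte Carlo and discretization errors.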

In order to formulate a well-posed stochastic variational problem in an economical way, we assume that the manifold $M$ is connected and the metric $g$ is geodesically complete (which will be used to characterize the variations of diffusions in Lemma 7.13), and that the connection $\nabla$ is the associated Levi-Civita connection. Geodesic completeness is ensured, for example, if $M$ is compact (see, e.g., [55, p. 346]). Whenever the metric $g$ is given, the associated Levi-Civita connection is uniquely determined, by the fundamental theorem of Riemannian geometry [50, Theorem IV.2.2]. We will refer to such a geodesically complete Riemannian metric as a reference metric tensor.

For a fixed point $q\in M$ and a probability distribution $\mu\in\mathcal{P}(M)$ on $M$, we define an admissible class of diffusions by

\[\mathcal{A}_{g}([0,T];q,\mu)=\left\{X\in I_{(0,q)}^{(T,\mu)}(M):QX(t)=\check{g}(X(t)),\forall t\in[0,T],\text{a.s.}\right\}, \tag{7.10}\]

where $I_{(0,q)}^{(T,\mu)}(M)$ denotes the set of all $M$-valued diffusion processes starting from $q$ at $t=0$ and with final distribution $\mu$, i.e., $\mathbf{P}\circ(X(T))^{-1}=\mu$. The action functional $\mathcal{S}$ is now defined on the set $\mathcal{A}_{g}([0,T];q,\mu)$, that is, $\mathcal{S}:\mathcal{A}_{g}([0,T];q,\mu)\to\mathbb{R}$.

Note that the admissible class $\mathcal{A}_{g}$ is similar to the Wiener space, so that a candidate for its “tangent space” is the Cameron–Martin space. Denote by $\mathcal{H}([0,T];q)$ the Hilbert space of absolutely continuous curves $v:[0,T]\to T_{q}M$ such that $\int_{0}^{T}|\dot{v}(t)|^{2}dt<\infty$. Let $\mathcal{H}_{0}([0,T];q)$ be the subspace consisting of all $v\in\mathcal{H}([0,T];q)$ satisfying $v(0)=v(T)=0$.

Definition 7.10.

Let $X\in\mathcal{A}_{g}([0,T];q,\mu)$. For a curve $v\in\mathcal{H}_{0}([0,T];q)$, the vector field along $X$ given by $V(t):=\Gamma(X)_{0}^{t}v(t)$ is called a tangent vector to $\mathcal{A}_{g}([0,T];q,\mu)$ at $X$. The tangent space to $\mathcal{A}_{g}([0,T];q,\mu)$ at $X$ is the set of all such tangent vectors, that is,

\[T_{X}\mathcal{A}_{g}([0,T];q,\mu):=\left\{\Gamma(X)_{0}^{\cdot}v(\cdot):v\in\mathcal{H}_{0}([0,T];q)\right\}.\]

Definition 7.11.

By a variation (or deformation) of a diffusion $X\in\mathcal{A}_{g}([0,T];q,\mu)$ along $v\in\mathcal{H}_{0}([0,T];q)$, we mean a one-parameter family of diffusions $\{X^{v}_{\epsilon}\}_{\epsilon\in(-\varepsilon,\varepsilon)}$, where for each $t\in[0,T]$, $X^{v}_{\epsilon}(t)$ satisfies the following ODE

\[\frac{\partial}{\partial\epsilon}X^{v}_{\epsilon}(t)=\Gamma(X^{v}_{\epsilon})_{0}^{t}v(t),\quad X^{v}_{0}(t)=X(t). \tag{7.11}\]

The diffusion $X\in\mathcal{A}_{g}([0,T];q,\mu)$ is called a stationary (or critical) point of $\mathcal{S}$ if the first variation $\delta\mathcal{S}$ vanishes at $X$, i.e.,

\[\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\mathcal{S}[X^{v}_{\epsilon};0,T]=0,\quad\text{for all }v\in\mathcal{H}_{0}([0,T];q). \tag{7.12}\]

Remark 7.12.

(i). Variations of diffusions on manifolds via the differential equation (7.11) are standard in stochastic analysis on path spaces of Riemannian manifolds. See for example [22, Eq. (2.3)] and [39, Theorem 4.1], where it is shown that the Wiener measure is quasi-invariant under such variations. This kind of variation has some equivalent constructions. For instance, the previous two references also provide an approach by lifting to the frame bundle and projecting to the Euclidean space (a stochastic analog of Cartan’s development), while Malliavin and Fang [27] provide an alternative perspective via the Bismut connection.

(ii). The stochastic variational problem (7.9)–(7.12) in the Euclidean context is also familiar in stochastic optimal transport and control. See Section 7.3 and Subsection 7.4.4 for connections to those areas.

(iii). Unlike the infinitesimal variation used in Definition 4.11 for studying symmetries of SDEs, the infinitesimal variation here in (7.11) needs to be a parallel vector field.

The following lemma is the key to establishing the stochastic Hamilton’s principle. The first statement shows that the variation $X^{v}_{\epsilon}$ is well defined on the path space $\mathcal{A}_{g}([0,T];q,\mu)$. The second one describes the infinitesimal changes of $D_{\nabla}X^{v}_{\epsilon}$ with respect to the variation parameter $\epsilon$. The proof of the latter is based on a geodesic approximation technique, which is originally due to Itô [44].

Lemma 7.13.

Let $X\in\mathcal{A}_{g}([0,T];q,\mu)$ and $v\in\mathcal{H}_{0}([0,T];q)$. Then
(i) for each $\epsilon\in(-\varepsilon,\varepsilon)$, $X^{v}_{\epsilon}\in\mathcal{A}_{g}([0,T];q,\mu)$; and
(ii) for all $t\in[0,T]$,

\[\frac{D}{d\epsilon}\bigg|_{\epsilon=0}D_{\nabla}X^{v}_{\epsilon}(t)=\Gamma(X)_{0}^{t}\dot{v}(t)+\frac{1}{2}(QX)^{ij}(t)R\left(\Gamma(X)_{0}^{t}v(t),\partial_{i}\right)\partial_{j}, \tag{7.13}\]

where $\dot{v}(t)=\frac{d}{dt}v(t)\in T_{v(t)}T_{q}M\cong T_{q}M$, and $\frac{D}{d\epsilon}$ is the (classical) covariant derivative with respect to the parameter $\epsilon$.

Proof.

(i). Let $\xi$ and $\xi_{\epsilon}$ be the anti-developments ([40, Definition 2.3.1]) of $X$ and $X^{v}_{\epsilon}$, respectively, with fixed initial frame $r(0)\in O_{q}M$. Equivalently, for example, $\xi$ is an $\mathbb{R}^{d}$-valued diffusion related to $X$ by the following SDEs [40, Section 2.3]

\[\left\{\begin{aligned} dX^{i}(t)&=r_{j}^{i}(t)\circ d\xi^{j}(t),\\ dr_{j}^{i}(t)&=-\Gamma_{kl}^{i}(X(t))r_{j}^{l}(t)r_{m}^{k}(t)\circ d\xi^{m}(t).\end{aligned}\right.\]

Applying the fact that $\sum_{k=1}^{d}r_{k}^{i}r_{k}^{j}=g^{ij}$ (e.g., [50, Proposition 1.5]) and the condition $QX(t)=\check{g}(X(t))$, we have

\[r_{k}^{i}(t)r_{l}^{j}(t)\delta^{kl}=g^{ij}(X(t))=(QX)^{ij}(t)=r_{k}^{i}(t)r_{l}^{j}(t)(Q\xi)^{kl}(t), \tag{7.14}\]

and consequently, since the frame $r(t)$ is invertible, $Q\xi\equiv\mathbf{I}_{d}$. Meanwhile, it follows from [27, Section 3.5] (or [22, Theorem 5.1], [39, Section 3]) that

\[d\xi_{\epsilon}(t)=\exp\left(\epsilon\int_{0}^{t}\Omega\left(\left(r(0)^{-1}v\right)(s),\circ d\xi(s)\right)\right)\circ d\xi(t)+\epsilon d\left(r(0)^{-1}v\right)(t),\]

where $\Omega$ is the curvature form on the orthogonal frame bundle $OM$, taking values in $\mathfrak{so}(d)$, and the frame $r(0)$ is viewed as an isomorphism from $\mathbb{R}^{d}$ to $T_{q}M$. It follows that $Q\xi_{\epsilon}=Q\xi\equiv\mathbf{I}_{d}$. By the same reasoning as in (7.14), we have $QX^{v}_{\epsilon}(t)=\check{g}(X^{v}_{\epsilon}(t))$. The result follows. See [22, Theorem 8.3] for a more elaborate proof.

(ii). Fix $n,m\in\mathbb{N}_{+}$. Let $0=t_{0}<t_{1}<\cdots<t_{n}=T$ be a division of the time interval $[0,T]$, and let $-\varepsilon=\epsilon_{-m}<\cdots<\epsilon_{-1}<0=\epsilon_{0}<\epsilon_{1}<\cdots<\epsilon_{m}=\varepsilon$ be a division of the variation parameter interval $(-\varepsilon,\varepsilon)$. Denote $\Delta t_{i}:=t_{i}-t_{i-1}$. Consider the polygonal curve $x^{n}=\{x^{n}(t)\}_{t\in[0,T]}$, which is an approximation of $X$ made of minimizing geodesic segments joining $X(t_{i-1})$ with $X(t_{i})$ for all $1\leq i\leq n$. Such segments exist by geodesic completeness. We will construct an approximation scheme for the variational processes $X^{v}_{\epsilon}$.

For $\epsilon\in[\epsilon_{0},\epsilon_{1}]$, we construct the approximation $x^{n}_{\epsilon}$ of $X^{v}_{\epsilon}$ as follows. We extend each $X(t_{i})$, $0\leq i\leq n$, to a geodesic

\[\gamma_{0}^{(i)}(\epsilon)=\exp_{X(t_{i})}\left(\epsilon\Gamma(x^{n})_{0}^{t_{i}}v(t_{i})\right),\quad\epsilon\in[\epsilon_{0},\epsilon_{1}].\]

Let $x^{n}_{\epsilon}=\{x^{n}_{\epsilon}(t)\}_{t\in[0,T]}$ be the polygonal curve consisting of minimizing geodesic segments joining $\gamma_{0}^{(i-1)}(\epsilon)$ with $\gamma_{0}^{(i)}(\epsilon)$ for all $1\leq i\leq n$.

Then, we construct $x^{n}_{\epsilon}$ for $\epsilon\in[\epsilon_{j},\epsilon_{j+1}]$, $1\leq j\leq m-1$, by induction. Suppose $x^{n}_{\epsilon}$, $\epsilon\in[\epsilon_{j-1},\epsilon_{j}]$, has been defined. Then, in particular, we have a curve $x^{n}_{\epsilon_{j}}$. Extend each $x^{n}_{\epsilon_{j}}(t_{i})$, $0\leq i\leq n$, to a geodesic by

\[\gamma_{j}^{(i)}(\epsilon)=\exp_{x^{n}_{\epsilon_{j}}(t_{i})}\left((\epsilon-\epsilon_{j})\Gamma(x^{n}_{\epsilon_{j}})_{0}^{t_{i}}v(t_{i})\right),\quad\epsilon\in[\epsilon_{j},\epsilon_{j+1}].\]

Let $x^{n}_{\epsilon}$ be the polygonal curve consisting of minimizing geodesic segments joining $\gamma_{j}^{(i-1)}(\epsilon)$ with $\gamma_{j}^{(i)}(\epsilon)$ for all $1\leq i\leq n$. In a similar way, we can define $x^{n}_{\epsilon}$ for $\epsilon\in[\epsilon_{j},\epsilon_{j+1}]$, $-m\leq j\leq-1$.

Now we have a family of polygonal curves $\{x^{n}_{\epsilon}:\epsilon\in(-\varepsilon,\varepsilon)\}$, which satisfies $x^{n}_{0}=x^{n}$ and

\[\frac{\partial^{\mathrm{sign}(\epsilon)}}{\partial\epsilon}\bigg|_{\epsilon=\epsilon_{j}}x^{n}_{\epsilon}(t_{i})=\Gamma(x^{n}_{\epsilon_{j}})_{0}^{t_{i}}v(t_{i}).\]

Since, for each $\epsilon\in(-\varepsilon,\varepsilon)$ and $1\leq i\leq n$, $\{x^{n}_{\epsilon}(t)\}_{t\in[t_{i-1},t_{i}]}$ is a geodesic, the vector field

\[J(t):=\frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0}x^{n}_{\epsilon}(t),\quad t\in[t_{i-1},t_{i}]\]

is a Jacobi field along $\{x^{n}(t)\}_{t\in[t_{i-1},t_{i}]}$. This leads to the following Jacobi equation

\[\frac{D^{2}}{dt^{2}}J(t)+R\left(J(t),\dot{x}^{n}(t)\right)\dot{x}^{n}(t)=0,\quad t\in[t_{i-1},t_{i}], \tag{7.15}\]

with boundary values

\[J(t_{i-1})=\Gamma(x^{n})_{0}^{t_{i-1}}v(t_{i-1}),\quad J(t_{i})=\Gamma(x^{n})_{0}^{t_{i}}v(t_{i}). \tag{7.16}\]

Since the connection is torsion-free, we can exchange the covariant derivative and the standard derivative to obtain

\[\frac{D}{dt}J(t_{i-1})=\frac{D}{dt}\frac{\partial}{\partial\epsilon}x^{n}_{\epsilon}(t)\bigg|_{\epsilon=0,t=t_{i-1}}=\frac{D}{d\epsilon}\frac{\partial}{\partial t}x^{n}_{\epsilon}(t)\bigg|_{\epsilon=0,t=t_{i-1}}=\frac{D}{d\epsilon}\bigg|_{\epsilon=0}\dot{x}^{n}_{\epsilon}(t_{i-1}). \tag{7.17}\]

On the other hand, Taylor’s theorem yields

\[\Gamma(x^{n})_{t_{i}}^{t_{i-1}}J(t_{i})=J(t_{i-1})+\frac{D}{dt}J(t_{i-1})\Delta t_{i}+\frac{1}{2}\frac{D^{2}}{dt^{2}}J(t_{i-1})(\Delta t_{i})^{2}+o\left((\Delta t_{i})^{2}\right). \tag{7.18}\]

Combining (7.15)–(7.18), we have

\[\frac{D}{d\epsilon}\bigg|_{\epsilon=0}\dot{x}^{n}_{\epsilon}(t_{i-1})=\Gamma(x^{n})_{0}^{t_{i-1}}\frac{v(t_{i})-v(t_{i-1})}{\Delta t_{i}}+\frac{1}{2}R\left(\Gamma(x^{n})_{0}^{t_{i-1}}v(t_{i-1}),\dot{x}^{n}(t_{i-1})\right)\dot{x}^{n}(t_{i-1})\Delta t_{i}+o\left(\Delta t_{i}\right).\]

A standard limit theorem yields the result (ii). ∎
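The finite-difference formula obtained by combining (7.15)–(7.18) can be sanity-checked numerically in a toy setting. The sketch below assumes a surface of constant curvature $K>0$ and a normal Jacobi field along a unit-speed geodesic, so that (7.15) reduces to the scalar equation $j''+Kj=0$; this reduction, the function names and the constants are illustrative assumptions and are not taken from the paper.

```python
import math

def jacobi(t, a, b, K):
    """Scalar normal Jacobi field solving j'' + K*j = 0 (K > 0)
    with j(0) = a and j'(0) = b."""
    w = math.sqrt(K)
    return a * math.cos(w * t) + (b / w) * math.sin(w * t)

def derivative_estimate(dt, a, b, K):
    """Estimate j'(0) from the two boundary values j(0) and j(dt):
    difference quotient plus the curvature correction (1/2)*K*j(0)*dt,
    mirroring the display obtained from (7.15)-(7.18)."""
    return (jacobi(dt, a, b, K) - a) / dt + 0.5 * K * a * dt

def error(dt, a=1.0, b=0.5, K=1.0):
    """Absolute error of the estimate against the exact value j'(0) = b."""
    return abs(b - derivative_estimate(dt, a, b, K))
```

Halving `dt` should divide the error by roughly four, which is the numerical counterpart of the $o(\Delta t_{i})$ remainder in the last display (in this smooth toy case the remainder is in fact of order $(\Delta t_{i})^{2}$).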

Remark 7.14.

(i). The constraint $QX(t)=\check{g}(X(t))$ in (7.10) looks strong. A possibly better viewpoint is to force all diffusions under consideration to have the same nondegenerate diffusion tensor $a$, i.e., $QX(t)=a(X(t))$. Then the inverse of $a$ defines a Riemannian metric $g$, cf. [43, Section V.4]. As can be seen from the first part of the above proof, fixing the diffusion tensor is a natural constraint in the literature on variational calculus on path space. An intuitive reason for this constraint is to ensure that the induced measures are equivalent, which is necessary for equation (7.11) to be well-posed, cf. [22]. The assumption that $\nabla$ is the Levi-Civita connection may be relaxed to requiring that $\nabla$ be $g$-compatible and torsion skew symmetric [22, Definition 8.1], in which case the second assertion of the lemma acquires an additional torsion term.

(ii). One may expect from the limits of (7.15) and (7.16) that there is a “stochastic” Jacobi equation with two boundary values describing the difference between a diffusion and an “infinitesimally close” diffusion, cf. [5].

For a smooth function $f$ on $TM$, we denote by $d_{\dot{x}}f$ the differential of $f$ with respect to the coordinates $(\dot{x}^{i})$. Since $T_{(x,\dot{x})}T_{x}M\cong T_{x}M$, $d_{\dot{x}}f$ is treated as a 1-form on $T_{x}M$ and

\[d_{\dot{x}}f=\frac{\partial f}{\partial\dot{x}^{i}}dx^{i}. \tag{7.19}\]

We call $d_{\dot{x}}f$ the vertical differential of $f$. Regarding the differential with respect to the coordinates $(x^{i})$, we introduce the horizontal differential, which depends on the connection $\nabla$, by

\[d_{x}f=\left(\frac{\partial f}{\partial x^{i}}-\Gamma_{ij}^{k}\dot{x}^{j}\frac{\partial f}{\partial\dot{x}^{k}}\right)dx^{i}. \tag{7.20}\]

It is easy to check that both definitions (7.19) and (7.20) are invariant under change of coordinates. In fact, by the classical theory [77, Section 3.5 and Example 4.6.7], we know that the connection $\nabla$ uniquely determines a $TTM$-valued 1-form on $TM$ horizontal over $M$, which is given in local coordinates by

\[\Gamma=dx^{i}\otimes\left(\frac{\partial}{\partial x^{i}}-\Gamma_{ij}^{k}\dot{x}^{j}\frac{\partial}{\partial\dot{x}^{k}}\right).\]

Hence, the horizontal differential is $d_{x}f=\Gamma(df)$, where $df$ is the total differential of $f$. Given a vector field $V$ on $M$, $f\circ V:q\mapsto f(V_{q})$ is a smooth function on $M$. Then, it is easy to check that

\[d(f\circ V)=d_{x}f\circ V+(d_{\dot{x}}f\circ V)(\nabla_{\partial_{i}}V)dx^{i}. \tag{7.21}\]
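Identity (7.21) can be checked numerically in a one-dimensional example, where the Christoffel terms of (7.20) and of $\nabla_{\partial_{i}}V$ are seen to cancel. The sketch below assumes $M=\mathbb{R}$ with the metric $g(x)=e^{2x}$ (so the single Christoffel symbol is $g'/(2g)=1$), and hypothetical choices $f(x,\dot{x})=\sin(x)\,\dot{x}^{2}$ and $V(x)=\cos(x)$; none of these choices come from the paper.

```python
import math

GAMMA = 1.0  # Christoffel symbol of the assumed metric g(x) = exp(2x) on R

def f(x, xdot):      return math.sin(x) * xdot ** 2
def f_x(x, xdot):    return math.cos(x) * xdot ** 2       # partial f / partial x
def f_xdot(x, xdot): return 2.0 * math.sin(x) * xdot      # partial f / partial xdot
def V(x):            return math.cos(x)                   # component of the vector field
def V_prime(x):      return -math.sin(x)

def horizontal(x, xdot):
    """Coefficient of dx in the horizontal differential (7.20)."""
    return f_x(x, xdot) - GAMMA * xdot * f_xdot(x, xdot)

def vertical(x, xdot):
    """Coefficient of dx in the vertical differential (7.19)."""
    return f_xdot(x, xdot)

def rhs_7_21(x):
    """Right-hand side of (7.21): d_x f at V plus (d_xdot f at V) applied to nabla V."""
    cov_dV = V_prime(x) + GAMMA * V(x)   # component of nabla_{d/dx} V
    return horizontal(x, V(x)) + vertical(x, V(x)) * cov_dV

def lhs_7_21(x, h=1e-5):
    """Left-hand side of (7.21): derivative of x -> f(x, V(x)) by central differences."""
    return (f(x + h, V(x + h)) - f(x - h, V(x - h))) / (2.0 * h)
```

Comparing `lhs_7_21(x)` with `rhs_7_21(x)` at a few points gives agreement up to the finite-difference error.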

The following integration-by-parts formula will be used. Its proof is straightforward from definitions of stochastic integrals and mean derivatives, cf. [17, Lemma 4.4].

Lemma 7.15.

Let $X=\{X(t)\}_{t\in[0,T]}$ be a real-valued continuous semimartingale such that $DX$ exists, and let $f$ be a real-valued continuous process of finite variation on $[0,T]$. Then

\[\mathbf{E}\int_{0}^{T}X(t)\dot{f}(t)dt=\mathbf{E}\left[f(T)X(T)-f(0)X(0)\right]-\mathbf{E}\int_{0}^{T}f(t)DX(t)dt.\]
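A quick Monte Carlo illustration of this integration-by-parts formula is sketched below. It assumes the simplest case we could think of, $X(t)=x_{0}+\beta t+W(t)$ with constant $\beta$ (so that $DX\equiv\beta$) and the deterministic function $f(t)=t^{2}$; the discretization, the parameter values and the function name `simulate` are illustrative assumptions.

```python
import random

def simulate(n_paths=4000, n_steps=100, T=1.0, x0=0.3, drift=0.5, seed=0):
    """Monte Carlo estimates of both sides of the integration-by-parts formula
    for X(t) = x0 + drift*t + W(t) (so DX = drift) and f(t) = t**2."""
    rng = random.Random(seed)
    dt = T / n_steps
    sd = dt ** 0.5
    f = lambda t: t * t
    lhs = rhs = 0.0
    for _ in range(n_paths):
        x = x0
        int_x_df = 0.0   # Riemann sum of X(t) * f'(t) dt
        int_f_dx = 0.0   # Riemann sum of f(t) * DX(t) dt
        for k in range(n_steps):
            t, t_next = k * dt, (k + 1) * dt
            int_x_df += x * (f(t_next) - f(t))
            int_f_dx += f(t) * drift * dt
            x += drift * dt + sd * rng.gauss(0.0, 1.0)   # Euler step
        lhs += int_x_df
        rhs += f(T) * x - f(0.0) * x0 - int_f_dx
    return lhs / n_paths, rhs / n_paths
```

For these choices both sides equal $x_{0}T^{2}+\tfrac{2}{3}\beta T^{3}$ exactly, so the two estimates should agree with each other and with this value up to sampling and discretization error.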

Now we are in a position to present the stochastic version of Hamilton’s principle.

Theorem 7.16 (Stochastic Hamilton’s principle).

Let $L_{0}$ be a regular Lagrangian on $\mathbb{R}\times TM$. A diffusion $X\in\mathcal{A}_{g}([0,T];q,\mu)$ is a stationary point of $\mathcal{S}$ if and only if $X$ satisfies the following stochastic Euler-Lagrange (S-EL) equation

\[\frac{\overline{\mathbf{D}}}{dt}\big(d_{\dot{x}}L_{0}\left(t,X(t),D_{\nabla}X(t)\right)\big)=d_{x}L_{0}\left(t,X(t),D_{\nabla}X(t)\right), \tag{7.22}\]

where $\frac{\overline{\mathbf{D}}}{dt}$ is the damped mean covariant derivative with respect to $X$.

We remark that since $QX(t)=\check{g}(X(t))$, the operator $\frac{\overline{\mathbf{D}}}{dt}$ in (7.22) is just the one of (7.7). The unknown in (7.22) is the process $X$, so the conditions $X(0)=q$ and $\mathbf{P}\circ(X(T))^{-1}=\mu$, indicated in the assumption $X\in\mathcal{A}_{g}([0,T];q,\mu)$, can be regarded as boundary conditions of (7.22).

Proof.

Denote $V(t)=\Gamma(X)_{0}^{t}v(t)$. It follows from (7.13) and (7.21) that

\[\begin{split}\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\mathcal{S}[X^{v}_{\epsilon};0,T]&=\mathbf{E}\int_{0}^{T}\frac{d}{d\epsilon}\bigg|_{\epsilon=0}L_{0}\left(t,X^{v}_{\epsilon}(t),D_{\nabla}X^{v}_{\epsilon}(t)\right)dt\\ &=\mathbf{E}\int_{0}^{T}\left[d_{x}L_{0}\left(\frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0}X^{v}_{\epsilon}(t)\right)+d_{\dot{x}}L_{0}\left(\frac{D}{d\epsilon}\bigg|_{\epsilon=0}D_{\nabla}X^{v}_{\epsilon}(t)\right)\right]dt\\ &=\mathbf{E}\int_{0}^{T}\left[d_{x}L_{0}\left(V(t)\right)+d_{\dot{x}}L_{0}\left(\Gamma(X)_{0}^{t}\dot{v}(t)\right)+\frac{1}{2}(QX)^{ij}(t)d_{\dot{x}}L_{0}\left(R(V(t),\partial_{i})\partial_{j}\right)\right]dt.\end{split} \tag{7.23}\]

By Lemmas 7.6.(ii) and 7.15 and the fact that $v(0)=v(T)=0$, we have

\[\begin{split}\mathbf{E}\int_{0}^{T}d_{\dot{x}}L_{0}\left(\Gamma(X)_{0}^{t}\dot{v}(t)\right)dt&=\mathbf{E}\int_{0}^{T}\Gamma(X)_{t}^{0}(d_{\dot{x}}L_{0})\left(\dot{v}(t)\right)dt\\ &=-\mathbf{E}\int_{0}^{T}\lim_{\epsilon\to 0}\mathbf{E}\left[\left(\frac{\Gamma(X)_{t+\epsilon}^{0}(d_{\dot{x}}L_{0})-\Gamma(X)_{t}^{0}(d_{\dot{x}}L_{0})}{\epsilon}\right)\left(v(t)\right)\Bigg|\mathcal{P}_{t}\right]dt\\ &=-\mathbf{E}\int_{0}^{T}\lim_{\epsilon\to 0}\mathbf{E}\left[\left(\frac{\Gamma(X)_{t+\epsilon}^{t}(d_{\dot{x}}L_{0})-d_{\dot{x}}L_{0}}{\epsilon}\right)\left(\Gamma(X)_{0}^{t}v(t)\right)\Bigg|\mathcal{P}_{t}\right]dt\\ &=-\mathbf{E}\int_{0}^{T}\lim_{\epsilon\to 0}\mathbf{E}\left[\frac{\Gamma(X)_{t+\epsilon}^{t}(d_{\dot{x}}L_{0})-d_{\dot{x}}L_{0}}{\epsilon}\Bigg|\mathcal{P}_{t}\right]\left(\Gamma(X)_{0}^{t}v(t)\right)dt\\ &=-\mathbf{E}\int_{0}^{T}\frac{\mathbf{D}}{dt}(d_{\dot{x}}L_{0})\left(V(t)\right)dt.\end{split} \tag{7.24}\]

Thus, by Lemma 7.8.(iii),

\[\begin{split}\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\mathcal{S}[X^{v}_{\epsilon};0,T]&=\mathbf{E}\int_{0}^{T}\left[d_{x}L_{0}\left(V(t)\right)-\frac{\mathbf{D}}{dt}(d_{\dot{x}}L_{0})\left(V(t)\right)+\frac{1}{2}(QX)^{ij}(t)R(d_{\dot{x}}L_{0},\partial_{j})\partial_{i}\left(V(t)\right)\right]dt\\ &=\mathbf{E}\int_{0}^{T}\left(d_{x}L_{0}-\frac{\overline{\mathbf{D}}}{dt}(d_{\dot{x}}L_{0})\right)\left(V(t)\right)dt.\end{split}\]

The arbitrariness of vv yields the desired result. ∎

Remark 7.17.

(i). For a special class of Lagrangians in the Euclidean context, the stochastic Euler-Lagrange equation (7.22) has been established in [17, Subsection 5.1], where it is called the stochastic Newton equation; see also [89]. For general Lagrangians on Riemannian manifolds, equation (7.22) is new (to the authors’ best knowledge). See Section 7.3 for a discussion of a special case.

(ii). The second author and his collaborator formulated a weak stochastic Euler-Lagrange equation in [52]. By “weak” they mean that their stochastic Euler-Lagrange equation holds in the sense of stochastic integrals. The main difference between their formulation and ours is that we get rid of the stochastic integral (martingale) part in our equation, since we use mean derivatives instead of stochastic differentials.

7.3 An inspirational example: Schrödinger’s problem

The inspirational example of stochastic Hamiltonian mechanics presented in Section 6.3 also provides an example of our stochastic Lagrangian mechanics. Consider the following Lagrangian defined on $\mathbb{R}\times TM$:

\[L_{0}(t,x,\dot{x})=\frac{1}{2}|\dot{x}-b(t,x)|^{2}-F(t,x), \tag{7.25}\]

where $b$ is a given time-dependent vector field on $M$. It is related to the 2nd-order Hamiltonian $H$ in (6.26) via the 2nd-order Legendre transform, which will be considered in Section 7.4. For such a Lagrangian, we can directly work out the relation between the stochastic Euler-Lagrange equation (7.22) and the Hamilton-Jacobi-Bellman equation. We denote by $I_{0}^{T}(M)$ the set of all $M$-valued diffusion processes over the time interval $[0,T]$.

Theorem 7.18 (S-EL & HJB).

Let $L_{0}$ be as in (7.25). If $X\in I_{0}^{T}(M)$ satisfies

\[D_{\nabla}X(t)=\nabla S(t,X(t))+b(t,X(t)) \tag{7.26}\]

for a function $S:\mathbb{R}\times M\to\mathbb{R}$, then $X$ is a solution of the stochastic Euler-Lagrange equation (7.22) if and only if $S$ solves the following Hamilton-Jacobi-Bellman equation

\[\frac{\partial S}{\partial t}+\langle b,\nabla S\rangle+\frac{1}{2}|\nabla S|^{2}+\frac{1}{2}\Delta S+F=f, \tag{7.27}\]

with $f$ a function depending only on $t$.

Proof.

For a function $g$ on $\mathbb{R}\times M$, we will denote by $dg$ the exterior differential of $g$ on $M$, i.e., with respect to the coordinates $(x^{i})$. Condition (7.26) can be rewritten in local coordinates as

\[\dot{x}=\nabla S+b. \tag{7.28}\]

Then, it is clear that

\[d_{\dot{x}}L_{0}=\frac{\partial L_{0}}{\partial\dot{x}^{i}}dx^{i}=g_{ij}(\dot{x}^{j}-b^{j})dx^{i}=dS. \tag{7.29}\]

Since $\nabla g=0$, we use Leibniz’s rule to derive

\[d_{x}L_{0}(\partial_{k})=\frac{1}{2}d[g(\dot{x}-b,\dot{x}-b)](\partial_{k})-dF(\partial_{k})=-g\left(\nabla_{\partial_{k}}b,\dot{x}-b\right)-dF(\partial_{k})=-dS\left(\nabla_{\partial_{k}}b\right)-dF(\partial_{k}). \tag{7.30}\]

Now we differentiate the left-hand side of the HJB equation (7.27) with respect to $x$, term by term. Clearly,

\[d\frac{\partial S}{\partial t}=\frac{\partial}{\partial t}dS=\frac{\partial}{\partial t}d_{\dot{x}}L_{0}.\]

For the second term,

\[\begin{split}d(\langle b,\nabla S\rangle)(\partial_{k})&=d[dS(b)](\partial_{k})=\left(\nabla_{\partial_{k}}dS\right)(b)+dS\left(\nabla_{\partial_{k}}b\right)\\ &=\nabla^{2}_{\partial_{k},b}S+dS\left(\nabla_{\partial_{k}}b\right)=\left(\nabla_{b}dS\right)(\partial_{k})+dS\left(\nabla_{\partial_{k}}b\right).\end{split}\]

For the third term, we use again $\nabla g=0$. Then, we have

\[\frac{1}{2}d\left(|\nabla S|^{2}\right)(\partial_{k})=\frac{1}{2}d[dS\otimes dS(\check{g})](\partial_{k})=\left((\nabla_{\partial_{k}}dS)\otimes dS\right)(\check{g})=\left(\nabla_{\partial_{k}}dS\right)(\nabla S)=\left(\nabla_{\nabla S}dS\right)(\partial_{k}).\]

For the fourth term, in the same way we have

\[\begin{split}&\quad\ d(\Delta S)(\partial_{k})=d\left(g^{ij}\nabla^{2}_{\partial_{i},\partial_{j}}S\right)(\partial_{k})=d\left(\nabla^{2}S(\check{g})\right)(\partial_{k})=\left(\nabla_{\partial_{k}}\nabla^{2}S\right)(\check{g})=g^{ij}\nabla^{3}_{\partial_{k},\partial_{i},\partial_{j}}S\\ &=g^{ij}\left[\left(\nabla^{3}_{\partial_{k},\partial_{i},\partial_{j}}S-\nabla^{3}_{\partial_{i},\partial_{k},\partial_{j}}S\right)+\left(\nabla^{3}_{\partial_{i},\partial_{k},\partial_{j}}S-\nabla^{3}_{\partial_{i},\partial_{j},\partial_{k}}S\right)+\nabla^{3}_{\partial_{i},\partial_{j},\partial_{k}}S\right]\\ &=g^{ij}\left[\left(\nabla^{2}_{\partial_{k},\partial_{i}}dS-\nabla^{2}_{\partial_{i},\partial_{k}}dS\right)(\partial_{j})+0+\nabla^{2}_{\partial_{i},\partial_{j}}dS(\partial_{k})\right]=g^{ij}\left[R(\partial_{k},\partial_{i})dS(\partial_{j})+\nabla^{2}_{\partial_{i},\partial_{j}}dS(\partial_{k})\right]\\ &=g^{ij}\left[-R(dS,\partial_{j})\partial_{i}(\partial_{k})+\nabla^{2}_{\partial_{i},\partial_{j}}dS(\partial_{k})\right]=[\Delta dS-\mathrm{Ric}(dS)](\partial_{k})=\Delta_{\mathrm{LD}}(dS)(\partial_{k}).\end{split}\]

Combining these and applying (7.26)–(7.30) as well as (7.7), we obtain

\[\begin{split}&\ d\left(\frac{\partial S}{\partial t}+\langle b,\nabla S\rangle+\frac{1}{2}|\nabla S|^{2}+\frac{1}{2}\Delta S+F\right)(\partial_{k})=\left(\frac{\partial}{\partial t}+\nabla_{b+\nabla S}+\frac{1}{2}\Delta_{\mathrm{LD}}\right)(dS)(\partial_{k})+dS\left(\nabla_{\partial_{k}}b\right)+dF(\partial_{k})\\ =&\ \frac{\overline{\mathbf{D}}}{dt}(dS)(\partial_{k})+dS\left(\nabla_{\partial_{k}}b\right)+dF(\partial_{k})=\left[\frac{\overline{\mathbf{D}}}{dt}(d_{\dot{x}}L_{0})-d_{x}L_{0}\right](\partial_{k}).\end{split}\]

The result follows. ∎
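The mechanism of this proof can be illustrated in a flat one-dimensional toy case. The sketch below assumes $M=\mathbb{R}$ with the Euclidean metric (so the Ricci term vanishes and, following the last display of the proof, the damped mean covariant derivative acting on $dS$ reduces to $\partial_{t}+(b+S_{x})\partial_{x}+\frac{1}{2}\partial_{xx}$). The functions $S(t,x)=e^{-t}\sin x$ and $b(t,x)=t\cos x$ are hypothetical choices, and $F$ is defined so that (7.27) holds with $f\equiv 0$.

```python
import math

# Hypothetical data on M = R (flat): S(t, x) = exp(-t) sin x, b(t, x) = t cos x.
def S_t(t, x):   return -math.exp(-t) * math.sin(x)
def S_x(t, x):   return math.exp(-t) * math.cos(x)
def S_xx(t, x):  return -math.exp(-t) * math.sin(x)
def S_xxx(t, x): return -math.exp(-t) * math.cos(x)
def S_tx(t, x):  return -math.exp(-t) * math.cos(x)
def b(t, x):     return t * math.cos(x)
def b_x(t, x):   return -t * math.sin(x)

def F_hjb(t, x):
    """Potential F chosen so that the HJB equation (7.27) holds with f = 0."""
    return -(S_t(t, x) + b(t, x) * S_x(t, x) + 0.5 * S_x(t, x) ** 2 + 0.5 * S_xx(t, x))

def sel_residual(t, x, F, h=1e-5):
    """Residual of the S-EL equation (7.22) along (7.26) in flat 1D:
    damped derivative of dS minus d_x L0, with d_x L0 taken from (7.30)."""
    F_x = (F(t, x + h) - F(t, x - h)) / (2.0 * h)
    damped = S_tx(t, x) + (b(t, x) + S_x(t, x)) * S_xx(t, x) + 0.5 * S_xxx(t, x)
    dxL0 = -S_x(t, x) * b_x(t, x) - F_x
    return damped - dxL0
```

With `F_hjb` the residual vanishes up to finite-difference error, whereas perturbing the potential by a term with nonzero $x$-derivative (for instance adding $x^{2}$) produces a residual equal to that derivative, in line with the "if and only if" in Theorem 7.18.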

Remark 7.19.

Equation (7.29) gives the relation between Lagrangians and 2nd-order Hamilton’s principal functions. It is valid for more general Lagrangians, see Remark 7.23.(i).

Theorem 7.18 strongly suggests some relations between stochastic Lagrangian (and also Hamiltonian) mechanics and Schrödinger’s problem in its reinterpretation via optimal transport. In the setting of the latter (see, e.g., [18, 58, 59]), there is a given reversible positive measure $\mathbf{R}$ on the path space $\mathcal{C}_{0}^{T}=C([0,T],M)$, called the reference measure, as well as two probability distributions $\mu_{0},\mu_{T}\in\mathcal{P}(M)$. Schrödinger’s problem aims to minimize the following relative entropy

\[H(\mathbf{P}|\mathbf{R})=\begin{cases}\int_{\mathcal{C}_{0}^{T}}\log\left(\frac{d\mathbf{P}}{d\mathbf{R}}\right)d\mathbf{P},&\mathbf{P}\ll\mathbf{R},\\ +\infty,&\text{otherwise},\end{cases} \tag{7.31}\]

over all probability measures $\mathbf{P}$ on $\mathcal{C}_{0}^{T}$ such that $\mu_{0},\mu_{T}$ are the initial and final time marginal distributions of $\mathbf{P}$, i.e., $\mathbf{P}_{0}=\mu_{0}$ and $\mathbf{P}_{T}=\mu_{T}$, where $\mathbf{P}_{t}:=\mathbf{P}\circ(X(t))^{-1}$ is the time marginal distribution of $\mathbf{P}$ and $X(t):\mathcal{C}_{0}^{T}\to M$, $X(t,\omega)=\omega(t)$, is the coordinate mapping. Denote by $X_{\mathbf{R}}$ and $X_{\mathbf{P}}$ the coordinate process $X$ under the measures $\mathbf{R}$ and $\mathbf{P}$, respectively. Then Girsanov’s theorem implies [56, Theorem 1] that a necessary condition for the finite entropy condition $H(\mathbf{P}|\mathbf{R})<\infty$ is $QX_{\mathbf{P}}=QX_{\mathbf{R}}$, $\mathbf{P}$-a.s. Furthermore, if $\mathbf{R}$ is a diffusion measure, i.e., $X_{\mathbf{R}}$ is a diffusion process, then a similar application of Girsanov’s theorem yields that a necessary condition for $H(\mathbf{P}|\mathbf{R})<\infty$ is that $\mathbf{P}$ is also a diffusion measure and there exists a time-dependent vector field $w$ such that

\[\left(DX_{\mathbf{P}}(t),QX_{\mathbf{P}}(t)\right)=\left(DX_{\mathbf{R}}(t)+w(t,X(t)),QX_{\mathbf{R}}(t)\right),\quad\forall t\in[0,T],\text{ a.s.}\]

The solution $\mathbf{P}$ of Schrödinger’s problem, i.e., the minimizer of (7.31), is related to the reference measure $\mathbf{R}$ by a time-symmetric version of Doob’s $h$-transform [58, Section 3]. Its coordinate process $X_{\mathbf{P}}$ is sometimes called a Schrödinger bridge or Schrödinger process. When the reference measure $\mathbf{R}$ is Markovian, i.e., the law of a Markov process, the solution process $X_{\mathbf{P}}$ is also called a reciprocal [11, 46] or Bernstein process [18, 16].
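To make the two branches of definition (7.31) concrete, here is a minimal finite-state analog. It assumes measures on a finite set represented as lists of nonnegative weights, which is an illustrative simplification of the path-space setting; the function name is hypothetical.

```python
import math

def relative_entropy(p, r):
    """Finite-state analog of (7.31): sum_i p_i * log(p_i / r_i) if P << R,
    and +infinity otherwise. p is a probability vector, r a positive measure."""
    h = 0.0
    for pi, ri in zip(p, r):
        if pi == 0.0:
            continue             # convention 0 * log 0 = 0
        if ri == 0.0:
            return math.inf      # P charges a point that R does not: P is not << R
        h += pi * math.log(pi / ri)
    return h
```

For example, `relative_entropy([1.0, 0.0], [0.5, 0.5])` equals $\log 2$, while a reference measure that vanishes where $\mathbf{P}$ has mass gives $+\infty$.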

If the manifold $M$ is endowed with a Riemannian metric $g$, and the reference coordinate process $X_{\mathbf{R}}$ has generator

\[A^{X_{\mathbf{R}}}=\langle b,\nabla\rangle+\tfrac{1}{2}\Delta+F,\]

for some time-dependent vector field $b$ on $M$, then the density $\mu(t,x)=\frac{d\mathbf{P}^{*}_{t}}{d\mathrm{Vol}}(x)$ of the minimizer $\mathbf{P}^{*}$ of (7.31) solves the following Kolmogorov forward equation

\[\left\{\begin{aligned} &\frac{\partial}{\partial t}\mu(t,x)+\mathrm{div}\,\left[\mu(\nabla S+b)\right]-\frac{1}{2}\Delta\mu(t,x)=0,\quad(t,x)\in(0,T]\times M,\\ &\mu(0,x)=\mu_{0}(x),\quad x\in M,\end{aligned}\right. \tag{7.32}\]

where $S$ solves the HJB equation (7.27) with $f\equiv 0$, or (6.28).

Moreover, an analog of the Benamou-Brenier formula was derived (see [58]). Consider the problem of minimizing the average action

\[\int_{0}^{T}\int_{M}\left(\frac{1}{2}|v(t,x)-b(t,x)|^{2}-F(t,x)\right)\rho(t,dx)dt \tag{7.33}\]

among all pairs $(\rho,v)$, where $\rho=(\rho(t))_{t\in[0,T]}$ is a measurable path in $\mathcal{P}(M)$, $v=(v(t))_{t\in[0,T]}$ is a measurable time-dependent vector field, and the following constraints are satisfied (in the weak sense of PDEs):

\[\left\{\begin{aligned} &\frac{\partial}{\partial t}\rho+\mathrm{div}\,\left(\rho v\right)-\frac{1}{2}\Delta\rho=0,\\ &\rho(0)=\mu_{0},\ \rho(T)=\mu_{T}.\end{aligned}\right. \tag{7.34}\]

The relation between $\rho$ in (7.33) and $\mathbf{P}$ in (7.31) is just that $\rho$ is the time marginal of $\mathbf{P}$, namely,

\[\rho(t)=\mathbf{P}_{t}=\mathbf{P}\circ(X(t))^{-1}. \tag{7.35}\]

The minimizer of (7.33) is the pair $(\mu,\nabla S+b)$, where $\mu$ solves (7.32) and $S$ solves (6.28).

These results are summarized in the following equalities:

\[\begin{split}&\ \inf\left\{H(\mathbf{P}|\mathbf{R}):\mathbf{P}\in\mathcal{P}(\mathcal{C}_{0}^{T}),\mathbf{P}_{0}=\mu_{0},\mathbf{P}_{T}=\mu_{T}\right\}-H\left(\mu_{0}|\mathbf{R}_{0}\right)\\ =&\ \inf\left\{\int_{0}^{T}\int_{M}\left(\frac{1}{2}|v(t,x)-b(t,x)|^{2}-F(t,x)\right)\rho(t,dx)dt:(\rho,v)\text{ satisfies (7.34)}\right\}\\ =&\ \int_{0}^{T}\int_{M}\left(\frac{1}{2}|\nabla S(t,x)|^{2}-F(t,x)\right)\mu(t,dx)dt.\end{split} \tag{7.36}\]

Now if the coordinate process $X_{\mathbf{R}}$ under the reference measure $\mathbf{R}$ is a nondegenerate $M$-valued diffusion in $I_{0}^{T}(M)$ which is diffusion-homogeneous, then assigning such a reference measure $\mathbf{R}$ amounts to assigning a pair $(b_{\mathbf{R}},g_{\mathbf{R}})\in\Gamma(TM\otimes\mathrm{Sym}^{2}(T^{*}M))$, where $g_{\mathbf{R}}$ is a positive-definite symmetric $(0,2)$-tensor, i.e., a Riemannian metric tensor. More precisely, we let $A^{X_{\mathbf{R}}}=(\mathfrak{b},a)+F$ be the generator of $X_{\mathbf{R}}$. Since $X_{\mathbf{R}}$ is nondegenerate and diffusion-homogeneous, $a$ is a time-independent nondegenerate symmetric $(2,0)$-tensor field. Let $g_{\mathbf{R}}=\hat{a}$ be the inverse of $a$, so that $g_{\mathbf{R}}$ is a Riemannian metric tensor. We then equip the Riemannian manifold $(M,g_{\mathbf{R}})$ with the associated Levi-Civita connection $\nabla$. The isomorphism (2.19) implies that

\[A^{X_{\mathbf{R}}}=b_{\mathbf{R}}^{i}\partial_{i}+\tfrac{1}{2}g_{\mathbf{R}}^{ij}\nabla^{2}_{\partial_{i},\partial_{j}}+F=\langle b_{\mathbf{R}},\nabla\rangle+\tfrac{1}{2}\Delta+F,\]

where $b_{\mathbf{R}}$ is the time-dependent vector field given by $b_{\mathbf{R}}^{i}=\mathfrak{b}^{i}+\tfrac{1}{2}g_{\mathbf{R}}^{jk}\Gamma^{i}_{jk}$, and $\nabla$ and $\Delta$ are the gradient and the Laplace-Beltrami operator with respect to $g_{\mathbf{R}}$, respectively.

We assume that $\mathbf{P}$ is a diffusion measure and $QX_{\mathbf{P}}=QX_{\mathbf{R}}=\check{g}_{\mathbf{R}}$, $\mathbf{P}$-a.s., which is a necessary condition for $H(\mathbf{P}|\mathbf{R})<\infty$. Then, by (3.4), the generator of $X_{\mathbf{P}}$ is given by

\[(DX_{\mathbf{P}}(t),QX_{\mathbf{P}}(t))=(D_{\nabla}X_{\mathbf{P}})^{i}(t)\partial_{i}|_{X(t)}+\tfrac{1}{2}\Delta|_{X(t)}.\]

From (7.34) and (7.35), one can see that $v(t,X(t))=D_{\nabla}X_{\mathbf{P}}(t)$ and that the action (7.33) equals

\[\mathbf{E}_{\mathbf{P}}\int_{0}^{T}\left(\frac{1}{2}|D_{\nabla}X(t)-b_{\mathbf{R}}(t,X(t))|^{2}-F(t,X(t))\right)dt. \tag{7.37}\]

So the minimization problem turns into minimizing the action (7.37) over all diffusion measures $\mathbf{P}\in\mathcal{P}(\mathcal{C}_{0}^{T})$ with $\mathbf{P}_{0}=\mu_{0}$, $\mathbf{P}_{T}=\mu_{T}$ and $QX_{\mathbf{P}}=\check{g}_{\mathbf{R}}$, $\mathbf{P}$-a.s. If $\mu_{0}=\delta_{q}$ and $\mu_{T}=\mu$, this brings us back to our stochastic variational problem, that is, to minimize the action functional $\mathcal{S}$ in (7.9) over $\mathcal{A}_{g_{\mathbf{R}}}([0,T];q,\mu)$, with Lagrangian $L_{0}(t,x,\dot{x})=\frac{1}{2}|\dot{x}-b_{\mathbf{R}}(t,x)|^{2}-F(t,x)$. Note that in this case, since $\mathbf{P}_{0}=\mu_{0}$ is a Dirac measure, the relative entropy in (7.31) and $H(\mu_{0}|\mathbf{R}_{0})$ are always infinite, while their difference $H(\mathbf{P}|\mathbf{R})-H(\mu_{0}|\mathbf{R}_{0})$ can be finite as in (7.36). Moreover, by Theorems 7.16 and 7.18, a necessary condition for $X_{\mathbf{P}}$ to be the minimizer of $\mathcal{S}$ is that $X_{\mathbf{P}}$ satisfies (7.26) and (7.27), which is consistent with (7.32).

Remark 7.20.

(i). Besides the Lagrangian (7.25) used here for addressing Schrödinger’s problem, another type of Lagrangian is used in the Euclidean version of quantum mechanics in [17, Eq. (5.4)]. The latter has an additional term involving the divergence of $b$, which helps to express part of the action functional as a Stratonovich integral. The stochastic Euler-Lagrange equation (7.22) applied to those Lagrangians recovers the equations of motion in [17, Theorem 5.3].

(ii). In the seminal paper [73], F. Otto provided a geometric perspective on numerous PDEs by introducing a Riemannian structure on the Wasserstein space, now known as Otto’s calculus. A similar idea can be traced back to V.I. Arnold, who established a geometric framework for hydrodynamics by studying the Riemannian nature of the infinite-dimensional group of diffeomorphisms [8]. The recent paper [33] formulated Schrödinger’s problem via Otto’s calculus, where the equation of motion is given by an infinite-dimensional Newton equation, cf. [49, 86] on related matters. All these works can be called a “geometrization” of (stochastic) dynamics. In contrast, the present framework can be called a “stochastization” of geometric mechanics. The differences and relations between our framework and theirs are similar to those between the two ways of producing HJ equations for quantum mechanics mentioned in the introduction. More precisely, while (second-order) HJB equations play a key role in our framework, various HJ equations with density-dependent potential terms are derived in those works (see [33, Corollary 23] and [49, Proposition 2.4]).

7.4 Second-order Legendre transform

7.4.1 From $\mathcal{T}^{S*}M$ to $\mathcal{T}^{S}M$ and back

Let us fix a linear connection $\nabla$ on $M$. Here, for simplicity, we consider time-independent Hamiltonians and Lagrangians.

We first produce second-order Lagrangians from second-order Hamiltonians. To this end, we first reduce the second-order Hamiltonian to a classical one. Given a time-independent second-order Hamiltonian $H:\mathcal{T}^{S*}M\to\mathbb{R}$, its $\nabla$-reduction is the classical Hamiltonian $H_{0}=H\circ\hat{\iota}^{*}_{\nabla}:T^{*}M\to\mathbb{R}$, as in (6.42). If $H_{0}$ is hyperregular (see [1, Section 3.6]), then its fiber derivative $\mathbf{F}H_{0}:T^{*}M\to TM$, which is given in canonical coordinates by $\dot{x}^{i}=\frac{\partial H_{0}}{\partial p_{i}}$, is a diffeomorphism and defines the classical Legendre transform [1, Section 3.6]:

\[L_{0}(x,\dot{x})=p_{i}\dot{x}^{i}-H_{0}(x,p)=p_{i}\dot{x}^{i}-H\left(x,p,\hat{o}\right), \tag{7.38}\]

where $(\hat{o}_{jk})$ is a family of auxiliary variables introduced in (6.43). Then we lift $L_{0}$ to an admissible second-order Lagrangian $L:\mathcal{T}^{S}M\to\mathbb{R}$ as in Definition 7.9, that is, $L=L_{0}\circ\varrho_{\nabla}$. Combining (7.38) with (7.8), the relation between $L$ and $H$ is

\[L(x,Dx,Qx)=p_{i}D_{\nabla}x^{i}-H\left(x,p,\hat{o}\right)=p_{i}D^{i}x+\tfrac{1}{2}\hat{o}_{jk}Q^{jk}x-H(x,p,\hat{o}). \tag{7.39}\]

We call (7.39) the second-order Legendre transform. In particular, if we restrict the admissible 2nd-order Lagrangian $L$ to the subbundle of $\mathcal{T}^{S}M$ with coordinate constraint $Q^{jk}x=g^{jk}(x)$ for some symmetric $(2,0)$-tensor field $g$ (which is just the condition in (7.10)), and let $H$ be $(g,\nabla)$-canonical, then by (6.45), we have

\[L(x,Dx,Qx)=p_{i}D^{i}x+\tfrac{1}{2}o_{jk}Q^{jk}x-H(x,p,o). \tag{7.40}\]

Consequently, we can find the relation between 2nd-order Hamilton’s principal functions and action functionals. By (6.41) and (7.40),

\[\mathbf{D}_{t}S=L(t,x,Dx,Qx)=L_{0}(t,x,D_{\nabla}x).\]

One concludes from Dynkin’s formula that, for an $M$-valued diffusion $X\in\mathcal{A}_{g}([0,T];q,\mu)$,

\[\mathbf{E}S(T,X(T))-S(0,q)=\mathbf{E}\int_{0}^{T}L_{0}\left(t,X(t),D_{\nabla}X(t)\right)dt=\mathcal{S}[X;0,T],\]

and

\[S(t,x)-S(0,q)=\mathbf{E}_{(t,x)}[S(t,X(t))-S(0,X(0))]=\mathbf{E}_{(t,x)}\int_{0}^{t}L_{0}\left(s,X(s),D_{\nabla}X(s)\right)ds,\]

where $\mathbf{E}_{(t,x)}$ is the conditional expectation $\mathbf{E}(\cdot|X(t)=x)$. This means that the action functional is the expectation of the 2nd-order Hamilton’s principal function (up to an undetermined constant), while the 2nd-order Hamilton’s principal function is the conditional-expectation version of the action functional.

Conversely, let us be given an admissible 2nd-order Lagrangian $L:\mathcal{T}^{S}M\to\mathbb{R}$ which is the $\nabla$-lift of a classical Lagrangian $L_{0}:TM\to\mathbb{R}$. If $L_{0}$ is hyperregular, then its fiber derivative

\[ \mathbf{F}L_{0}:TM\to T^{*}M,\quad(x,\dot{x})\mapsto(x,d_{\dot{x}}L_{0}), \tag{7.41} \]

which is written in coordinates as $p_{i}=\frac{\partial L_{0}}{\partial\dot{x}^{i}}$, is a diffeomorphism and defines the classical inverse Legendre transform:

\[ H_{0}(x,p)=p_{i}\dot{x}^{i}-L_{0}(x,\dot{x}). \tag{7.42} \]

We replace coordinates $(\dot{x}^{i})$ by $(D_{\nabla}^{i}x)$, due to (3.2). Now, given a symmetric $(2,0)$-tensor field $g$, we lift $H_{0}$ to the $(g,\nabla)$-canonical $\overline{H}^{g}_{0}$ in (6.44). The relation between $\overline{H}^{g}_{0}$ and $L$ is

\[ \begin{split}\overline{H}^{g}_{0}(x,p,o)&=p_{i}D_{\nabla}^{i}x-L_{0}(x,D_{\nabla}x)+\textstyle{\frac{1}{2}}g^{jk}(x)\left(o_{jk}-\Gamma^{i}_{jk}(x)p_{i}\right)\\ &=p_{i}D^{i}x+\textstyle{\frac{1}{2}}o_{jk}Q^{jk}x-L(x,Dx,Qx)+\textstyle{\frac{1}{2}}\left(g^{jk}(x)-Q^{jk}x\right)o^{\nabla}_{jk},\end{split} \tag{7.43} \]

where $(o^{\nabla}_{jk})$ are the tensorial conjugate diffusivities defined in (5.6). We call (7.43) the $(g,\nabla)$-canonical inverse 2nd-order Legendre transform. When $g$ is Riemannian and $\nabla$ is the associated Levi-Civita connection, we call (7.43) the $g$-canonical inverse 2nd-order Legendre transform. In particular, when restricting $L$ to the subbundle of $\mathcal{T}^{S}M$ with coordinate constraint $Q^{jk}x=g^{jk}(x)$, we have

\[ \overline{H}^{g}_{0}(x,p,o)=p_{i}D^{i}x+\textstyle{\frac{1}{2}}o_{jk}Q^{jk}x-L(x,Dx,Qx). \tag{7.44} \]

Following the procedure in classical mechanics [1, Definition 3.5.11], for a given classical Lagrangian $L_{0}:TM\to\mathbb{R}$, we define a function $A_{0}:TM\to\mathbb{R}$ by $A_{0}(v_{x})=\mathbf{F}L_{0}(v_{x})\cdot v_{x}$, and the classical energy $E_{0}:TM\to\mathbb{R}$ by $E_{0}=A_{0}-L_{0}$. Notice that in local coordinates, $A_{0}=\dot{x}^{i}\frac{\partial L_{0}}{\partial\dot{x}^{i}}$ and $E_{0}=\dot{x}^{i}\frac{\partial L_{0}}{\partial\dot{x}^{i}}-L_{0}$.

Example 7.21.

It is easy to check that the $\nabla$-lift of the classical Lagrangian $L_{0}$ in (7.25) is the second-order Legendre transform of the second-order Hamiltonian $H$ in (6.26). Conversely, the latter is the $g$-canonical inverse 2nd-order Legendre transform of the former. The classical energy associated with this Lagrangian is given by

\[ E_{0}(t,x,\dot{x})=\frac{1}{2}|\dot{x}-b(t,x)|^{2}+\langle\dot{x}-b(t,x),b(t,x)\rangle+F(t,x). \tag{7.45} \]

The three terms on the right-hand side correspond to a kinetic energy, a vector potential energy and a scalar potential energy, respectively.
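The energy formula (7.45) can be sanity-checked numerically. The sketch below works in the flat one-dimensional case and assumes that the classical part of (6.26) reads $H_{0}(t,x,p)=\frac{1}{2}|p|^{2}+b(t,x)p+F(t,x)$, which is how that form is used in Example 7.24; the particular `b` and `F` are hypothetical choices made only for illustration. It builds $L_{0}$ through the classical Legendre transform (7.38), evaluates $E_{0}=\dot{x}\,\partial L_{0}/\partial\dot{x}-L_{0}$ by a finite difference, and compares the result with the right-hand side of (7.45).

```python
import math

def b(t, x):
    """Hypothetical drift (vector potential), chosen only for illustration."""
    return math.sin(x) + 0.5 * t

def F(t, x):
    """Hypothetical scalar potential, chosen only for illustration."""
    return 0.5 * x * x

def H0(t, x, p):
    """Assumed classical Hamiltonian: 1/2 p^2 + b p + F (flat 1D case)."""
    return 0.5 * p * p + b(t, x) * p + F(t, x)

def L0(t, x, v):
    """Legendre transform (7.38): L0 = p*v - H0, with v = dH0/dp = p + b."""
    p = v - b(t, x)  # inverse of the fiber derivative
    return p * v - H0(t, x, p)

def energy_fd(t, x, v, h=1e-5):
    """E0 = v * dL0/dv - L0, with dL0/dv from a central difference."""
    dL_dv = (L0(t, x, v + h) - L0(t, x, v - h)) / (2.0 * h)
    return v * dL_dv - L0(t, x, v)

def energy_formula(t, x, v):
    """Right-hand side of (7.45) in one dimension."""
    w = v - b(t, x)
    return 0.5 * w * w + w * b(t, x) + F(t, x)

if __name__ == "__main__":
    t, x, v = 0.3, 1.2, -0.7
    print(energy_fd(t, x, v), energy_formula(t, x, v))
```

Under these assumptions the two values agree up to finite-difference rounding, and $L_{0}$ reduces to $\frac{1}{2}|\dot{x}-b|^{2}-F$.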

7.4.2 Stochastic Hamiltonian mechanics on Riemannian manifolds

Given a reference metric tensor $g$, i.e., a geodesically complete Riemannian metric as in Section 7.2, let $\nabla$ be the associated Levi-Civita connection. If a 2nd-order Hamiltonian $H$ is the $g$-canonical lift of a classical Hamiltonian $H_{0}$, namely, $H=\overline{H}_{0}^{g}$ as in (6.44), then the stochastic Hamilton’s equations (6.17) reduce to a simpler Hamilton-type system on $T^{*}M$, which is equivalent to the stochastic Euler-Lagrange equation (7.22) via the classical Legendre transform (7.41) and (7.42).

Similarly to (7.19) and (7.20), we introduce, for a smooth function $f$ on $T^{*}M$, the vertical gradient $\nabla_{p}f$ and the horizontal differential $d_{x}f$, which are given in local coordinates $(x,p)$ by

\[ \nabla_{p}f=\frac{\partial f}{\partial p_{i}}\frac{\partial}{\partial{x^{i}}},\quad d_{x}f=\left(\frac{\partial f}{\partial x^{i}}+\Gamma_{ij}^{k}p_{k}\frac{\partial f}{\partial p_{j}}\right)dx^{i}. \]

Both are invariant under change of coordinates. Still by the classical theory, the connection $\nabla$ uniquely determines a $TT^{*}M$-valued 1-form on $T^{*}M$ horizontal over $M$, given by

\[ \Gamma^{*}=dx^{i}\otimes\left(\frac{\partial}{\partial x^{i}}+\Gamma_{ij}^{k}p_{k}\frac{\partial}{\partial p_{j}}\right). \]

Hence, we have $d_{x}f=\Gamma^{*}(df)$. Given a 1-form $\eta$ on $M$, $f\circ\eta:q\mapsto f(\eta_{q})$ is a smooth function on $M$. Then, it is easy to verify that

\[ d(f\circ\eta)=d_{x}f\circ\eta+\nabla_{(\nabla_{p}f\circ\eta)}\eta. \tag{7.46} \]
Theorem 7.22.

Let $H_{0}:T^{*}M\times\mathbb{R}\to\mathbb{R}$ be a smooth function.

(i). Let $H=\overline{H}_{0}^{g}:\mathcal{T}^{S*}M\times\mathbb{R}\to\mathbb{R}$ be the $g$-canonical lift of $H_{0}$. Let $\mathbf{X}$ be the horizontal integral process of stochastic Hamilton’s equations (6.17) corresponding to $H$ and $X=\tau_{M}^{S*}(\mathbf{X})$. Define a $T^{*}M$-valued horizontal diffusion by $\mathbb{X}:=\hat{\varrho}^{*}(\mathbf{X})$. Then $\mathbb{X}(t)=p(t,X(t))$ solves the following system on $T^{*}M$,

\[ \left\{\begin{aligned} D_{\nabla}X(t)&=\nabla_{p}H_{0}(\mathbb{X}(t),t),\\ \frac{\overline{\mathbf{D}}}{dt}p(t,X(t))&=-d_{x}H_{0}(\mathbb{X}(t),t),\end{aligned}\right. \tag{7.47} \]

subject to $QX(t)=\check{g}(X(t))$, where $\frac{\overline{\mathbf{D}}}{dt}$ is the damped mean covariant derivative with respect to $X$. In this case, we refer to the system (7.47) as the $g$-canonical reduction of (6.17), or global stochastic Hamilton’s equations.

(ii). If $H_{0}$ is hyperregular, then the global stochastic Hamilton’s equations (7.47) are equivalent to the stochastic Euler-Lagrange equation (7.22) via the classical Legendre transform $p=d_{\dot{x}}L_{0}$ and $H_{0}(x,p,t)=p\cdot\dot{x}-L_{0}(t,x,\dot{x})$.

(iii). Let $S\in C^{\infty}(M\times\mathbb{R})$. Then the following statements are equivalent:
(a) for every $M$-valued diffusion $X$ satisfying

\[ D_{\nabla}X(t)=\nabla_{p}H_{0}(dS(t,X(t)),t),\quad QX(t)=\check{g}(X(t)), \tag{7.48} \]

the $T^{*}M$-valued process $dS\circ X$ solves the global stochastic Hamilton’s equations (7.47);
(b) $S$ satisfies the following Hamilton-Jacobi-Bellman equation

\[ \frac{\partial S}{\partial t}+H_{0}(dS,t)+\frac{1}{2}\Delta S=f(t), \tag{7.49} \]

for some function $f$ depending only on $t$.

Proof.

(i). Since $H=\overline{H}_{0}^{g}=H_{0}+\textstyle{\frac{1}{2}}g^{jk}(o_{jk}-\Gamma^{i}_{jk}p_{i})$, $(QX)^{jk}=2\frac{\partial H}{\partial o_{jk}}$ if and only if $QX(t)=\check{g}(X(t))$. Since

\[ \frac{\partial H}{\partial p_{i}}=\frac{\partial H_{0}}{\partial p_{i}}-\frac{1}{2}g^{jk}\Gamma^{i}_{jk}=dx^{i}(\nabla_{p}H_{0})-\frac{1}{2}(QX)^{jk}\Gamma^{i}_{jk}, \]

we have $(DX)^{i}=\frac{\partial H}{\partial p_{i}}$ if and only if $D_{\nabla}X=\nabla_{p}H_{0}$, due to (2.20). This proves the first equation of (7.47). Furthermore,

\[ \frac{\partial H}{\partial x^{i}}=\frac{\partial H_{0}}{\partial x^{i}}+\frac{1}{2}\partial_{i}g^{jk}\left(o_{jk}-\Gamma^{l}_{jk}p_{l}\right)-\frac{1}{2}g^{jk}\partial_{i}\Gamma^{l}_{jk}p_{l}=\frac{\partial H_{0}}{\partial x^{i}}-g^{jm}\Gamma_{im}^{k}\left(o_{jk}-\Gamma^{l}_{jk}p_{l}\right)-\frac{1}{2}g^{jk}\partial_{i}\Gamma^{l}_{jk}p_{l}. \]

On the other hand, by applying Lemma 7.8 (ii) and (iv), and the equation $D_{\nabla}X=\nabla_{p}H_{0}$, we have

\[ \begin{split}(D(p\circ\mathbf{X}))_{i}&=\mathbf{D}_{t}p_{i}=\mathbf{D}_{t}[p(\partial_{i})]=\frac{\overline{\mathbf{D}}p}{dt}(\partial_{i})+p\left(\frac{\overline{\mathbf{D}}\partial_{i}}{dt}\right)+(QX)^{jk}(\nabla_{\partial_{j}}p)(\nabla_{\partial_{k}}\partial_{i})\\ &=\frac{\overline{\mathbf{D}}p}{dt}(\partial_{i})+p\left(\nabla_{D_{\nabla}X}\partial_{i}+\frac{1}{2}g^{jk}\nabla^{2}_{\partial_{j},\partial_{k}}\partial_{i}+\frac{1}{2}g^{jk}R(\partial_{i},\partial_{j})\partial_{k}\right)+g^{jk}(\nabla_{\partial_{j}}p)(\nabla_{\partial_{k}}\partial_{i})\\ &=\frac{\overline{\mathbf{D}}p}{dt}(\partial_{i})+p_{l}\left(\frac{\partial H_{0}}{\partial p_{j}}\Gamma_{ij}^{l}+\frac{1}{2}g^{jk}\partial_{i}\Gamma_{jk}^{l}\right)+g^{jk}\Gamma_{ik}^{m}\left(\partial_{j}p_{m}-\Gamma_{jm}^{l}p_{l}\right).\end{split} \tag{7.50} \]

Hence,

\[ (D(p\circ\mathbf{X}))_{i}+\frac{\partial H}{\partial x^{i}}=\frac{\overline{\mathbf{D}}p}{dt}(\partial_{i})+d_{x}H_{0}(\partial_{i})+g^{jm}\Gamma_{im}^{k}\left(\partial_{j}p_{k}-o_{jk}\right). \]

The second equation of (7.47) follows from (6.15).

(ii). The equivalence between (7.47) and (7.22) follows from the following calculations:

\[ \begin{aligned}\nabla_{p}H_{0}&=\nabla_{p}(p\cdot\dot{x}-L_{0})=\dot{x},\\ d_{x}H_{0}&=\left(\frac{\partial H_{0}}{\partial x^{i}}+\Gamma_{ij}^{k}p_{k}\frac{\partial H_{0}}{\partial p_{j}}\right)dx^{i}=\left(-\frac{\partial L_{0}}{\partial x^{i}}+\Gamma_{ij}^{k}\frac{\partial L_{0}}{\partial\dot{x}^{k}}\dot{x}^{j}\right)dx^{i}=-d_{x}L_{0}.\end{aligned} \]

(iii). By (7.7), conditions (7.48) and (7.46),

\[ \begin{split}\frac{\overline{\mathbf{D}}}{dt}(dS)&=\left(\frac{\partial}{\partial t}+\nabla_{D_{\nabla}X}+\frac{1}{2}\Delta_{\mathrm{LD}}\right)(dS)=d\frac{\partial S}{\partial t}+\nabla_{(\nabla_{p}H_{0}\circ dS)}dS-\frac{1}{2}(dd^{*}+d^{*}d)dS\\ &=d\frac{\partial S}{\partial t}+d(H_{0}\circ dS)-d_{x}H_{0}\circ dS-\frac{1}{2}dd^{*}dS=d\left(\frac{\partial S}{\partial t}+H_{0}\circ dS+\frac{1}{2}\Delta S\right)-d_{x}H_{0}\circ dS.\end{split} \]

The result follows. ∎

Remark 7.23.

(i). Assertions (ii) and (iii) of Theorem 7.22 generalize Theorem 7.18, since from the Legendre transform $p=d_{\dot{x}}L_{0}$ we observe that the S-EL equation (7.22) is related to HJB equation (7.49) via equation (7.29). However, assertion (iii) is a special case of Theorem 6.19, since HJB equation (7.49) is just the one in (6.39) with $H=\overline{H}_{0}^{g}$ the $g$-canonical lift of $H_{0}$, due to the observation that $\overline{H}_{0}^{g}(d^{2}S,t)=H_{0}(dS,t)+\frac{1}{2}\Delta S$.

(ii). The advantage of Theorem 7.22 is that it formulates stochastic Hamiltonian mechanics in a global way similar to stochastic Lagrangian mechanics, while its disadvantage is that it depends on the choice of Riemannian structures. However, unlike stochastic Hamiltonian mechanics of Chapter 6, neither global S-H equations (7.47) nor HJB equation (7.49) encodes any new symplectic or contact structures, as the Hamiltonian functions therein are still classical.

(iii). By a direct calculation similar to (7.50), one easily obtains the following local version of stochastic Euler-Lagrange equation (7.22):

\[ \mathbf{D}_{t}\left(\frac{\partial L_{0}}{\partial\dot{x}^{i}}\right)=\frac{\partial L_{0}}{\partial x^{i}}+\frac{1}{2}g^{jk}\partial_{i}\Gamma^{l}_{jk}\frac{\partial L_{0}}{\partial\dot{x}^{l}}-\frac{1}{2}\partial_{i}g^{jk}\left(\frac{\partial^{2}L_{0}}{\partial x^{j}\partial\dot{x}^{k}}-\Gamma^{l}_{jk}\frac{\partial L_{0}}{\partial\dot{x}^{l}}\right). \tag{7.51} \]

This local version is related to stochastic Hamilton’s equations (6.10) via the canonical 2nd-order Legendre transform (7.43).

(iv). Similarly to Remark 6.20, if we let $\tilde{H}=H-f$, then Theorem 7.22 holds with $\tilde{H}$ and the zero function in place of $H$ and $f$. We will refer to equation (7.49) with $f\equiv 0$ as the HJB equation associated with the Hamiltonian $H_{0}$, or the HJB equation associated with the Lagrangian $L_{0}$ related to $H_{0}$ via the Legendre transform (when $H_{0}$ is hyperregular).

On Riemannian manifolds, canonical transformations of Section 6.5 can also be reduced to cotangent bundles. We consider a bundle isomorphism $\mathbf{F}$ from $\mathcal{T}^{S*}M\times\mathbb{R}$ to $\mathcal{T}^{S*}N\times\mathbb{R}$, projecting to a time-change map $F^{0}:\mathbb{R}\to\mathbb{R}$. The transformation $\mathbf{F}$ is a map from coordinates $(x^{i},p_{i},o_{jk},t)$ to $(y^{i},P_{i},O_{jk},s)$ satisfying $s=F^{0}(t)$. Both base manifolds $M$ and $N$ are equipped with Riemannian metrics and the corresponding Levi-Civita connections.

By the inverse 2nd-order Legendre transform (7.44) and the integrability condition (6.15), the action functional in (7.9) can be rewritten as

\[ \begin{split}\mathcal{S}[X;0,T]&=\mathbf{E}\int_{0}^{T}\left[p_{i}(t,X(t))(DX)^{i}(t)+\frac{1}{2}\frac{\partial p_{j}}{\partial x^{k}}(t,X(t))(QX)^{jk}(t)-\overline{H}^{g}_{0}(\mathbf{X}(t),t)\right]dt\\ &=\mathbf{E}\int_{0}^{T}\left[p_{i}(t,X(t))\circ dX^{i}(t)-\overline{H}^{g}_{0}(\mathbf{X}(t),t)dt\right],\end{split} \]

where $\circ\,d$ denotes the Stratonovich stochastic differential and $\overline{H}^{g}_{0}=H_{0}+\frac{1}{2}g^{jk}(o_{jk}-\Gamma^{i}_{jk}p_{i})$. We denote simply $x^{i}=x^{i}\circ\mathbf{X}$, $p_{i}=p_{i}\circ\mathbf{X}$ and $H=\overline{H}^{g}_{0}$. Then $\mathcal{S}=\mathbf{E}\int_{0}^{T}(p_{i}\circ dx^{i}(t)-Hdt)$. Now we make a change of coordinates from $(x^{i},p_{i},t)$ to $(y^{i},P_{i},s)$ satisfying $s=F^{0}(t)$, and write $y^{i}=y^{i}\circ\mathbf{X}$ and $P_{i}=P_{i}\circ\mathbf{X}$. We have

\[ \mathcal{S}=\mathbf{E}\int_{0}^{T}\left(P_{i}\circ dy^{i}(s)-Kds\right)=\mathbf{E}\int_{0}^{T}\left(P_{i}\circ d(y^{i}\circ F^{0})(t)-K\dot{F}^{0}dt\right), \]

where the function $K$ plays the role of the 2nd-order Hamiltonian in the new coordinate system.

As in Section 6.5, the general condition for a transformation to be canonical is to preserve the form of stochastic Hamilton’s system (7.47). This is equivalent to preserving the form of the stochastic stationary-action principle (7.12), according to Theorem 7.22.(ii). It follows from $\delta\mathcal{S}=0$ that

\[ \delta\,\mathbf{E}\int_{0}^{T}\left(p_{i}\circ dx^{i}(t)-Hdt\right)=\delta\,\mathbf{E}\int_{0}^{T}\left(P_{i}\circ d(y^{i}\circ F^{0})(t)-K\dot{F}^{0}dt\right)=0. \]

Since the underlying process $X$ has zero variation at the endpoints, both equalities will be satisfied if the integrands are related by the following SDE:

\[ p_{i}\circ dx^{i}-Hdt=P_{i}\circ dy^{i}-K\dot{F}^{0}dt+dG, \tag{7.52} \]

where $G$, called the generating function, is a function of the phase space coordinates $(x,p,t)$ or $(y,P,s)$ or any mixture of them. Note that, in contrast with the classical theory of canonical transformations and also with (6.36), equation (7.52) is a stochastic differential equation rather than an equation for differential forms.

Consider the type one generating function $G_{1}$, that is, $G=G_{1}(x,y,t)$ is given as a function of the old and new generalized position coordinates (cf. [35, Section 9.1]). Then, using Itô’s formula in Stratonovich form, $dG_{1}=\frac{\partial G_{1}}{\partial t}dt+\frac{\partial G_{1}}{\partial x^{i}}\circ dx^{i}+\frac{\partial G_{1}}{\partial y^{i}}\circ dy^{i}$, and equating the coefficients of the (stochastic) differentials $\circ dx$, $\circ dy$ and $dt$ in (7.52), we get

\[ p_{i}=\frac{\partial G_{1}}{\partial x^{i}},\quad P_{i}=-\frac{\partial G_{1}}{\partial y^{i}},\quad K\dot{F}^{0}-H=\frac{\partial G_{1}}{\partial t}, \]

which recovers (6.37). By taking $F^{0}=\mathbf{Id}_{\mathbb{R}}$ (i.e., no time-change), requiring the new Hamiltonian $K_{0}$ to be identically zero, and writing $G_{1}$ as $S$, the last equation turns into the following HJB equation

\[ \frac{\partial S}{\partial t}(x,y,t)+H_{0}\left(x^{i},\frac{\partial S}{\partial x^{i}}(x,y,t),t\right)+\frac{1}{2}\Delta_{x}S(x,y,t)+\frac{1}{2}\Delta_{y}S(x,y,t)=0, \]

where $(x,y)$ are regarded as coordinates on the product manifold $M\times N$ equipped with the direct-sum Riemannian metric and its corresponding Levi-Civita connection, and $\Delta_{x}$ and $\Delta_{y}$ are the Laplacians on $M$ and $N$, respectively, so that $\Delta_{x}+\Delta_{y}$ is the Laplacian on $M\times N$ under the aforementioned connection.

In contrast to the mixed-order contact approach to canonical transformations of Section 6.5, since the changes of coordinates proceed on $T^{*}M$, one can easily formulate four types of generating functions that are related to each other through classical Legendre transforms in the same way as in classical mechanics [35, Section 9.1]. For example, the type two generating function takes the form $G=G_{2}(x,P,t)-y^{i}P_{i}$, for which we have

\[ p_{i}=\frac{\partial G_{2}}{\partial x^{i}},\quad y^{i}=\frac{\partial G_{2}}{\partial P_{i}},\quad K\dot{F}^{0}-H=\frac{\partial G_{2}}{\partial t}. \tag{7.53} \]

In this case, since $(x^{i})$ and $(y^{i})$ are no longer independent variables, the Riemannian structures on $M$ and $N$ should be related by the transformation. In view of this, we only consider point transformations, a subclass of canonical transformations. That is, we assume $G_{2}$ to be of the form

\[ G_{2}(x,P,t)=f^{i}(x,t)P_{i}+h(x,t) \]

for some time-dependent family of diffeomorphisms $f(\cdot,t):M\to N$ and functions $h(\cdot,t):M\to\mathbb{R}$. The second equation of (7.53) implies

\[ y^{i}=f^{i}(x,t). \]

So we equip $N$ with the (time-dependent) pushforward by $f$ of the Riemannian metric $g$ on $M$, and with the corresponding Levi-Civita connection.

Example 7.24 (Canonical transformations for one-dimensional Bernstein’s reciprocal processes).

Consider the scalar case of Example 6.11, that is, the $\mathbb{R}$-valued Brownian reciprocal process with 2nd-order Hamiltonian $H(x,p,o)=H_{0}(x,p)+\frac{1}{2}o=\frac{1}{2}|p|^{2}+\frac{1}{2}o$. The equations of motion are $DDX=0$, $QX=1$ (cf. (6.33)). In the following, we will consider two canonical transformations which transform Brownian reciprocal processes to reciprocal processes derived from diffusions with linear potentials and quadratic potentials, respectively.

(i). Consider the time-dependent change of coordinates from $(x,p,t)$ to $(y,P,t)$ (without time-change) induced by $G_{2}(x,P,t)=(x+\textstyle{\frac{1}{2}}t^{2})P-tx$. By (7.53),

\[ y=x+\textstyle{\frac{1}{2}}t^{2},\quad p=P-t,\quad K=H+Pt-y+\textstyle{\frac{1}{2}}t^{2}. \tag{7.54} \]

For the latent 2nd-order coordinates, we have

\[ O=\frac{\partial P}{\partial y}=\frac{\partial p}{\partial x}=o. \]

Hence, by the last equation of (7.53), the new 2nd-order Hamiltonian is

\[ K(y,P,O,t)=K_{0}(y,P,t)+\frac{1}{2}O=\frac{1}{2}|P|^{2}-y+t^{2}+\frac{1}{2}O, \]

which is still of the form (6.26), with $b\equiv 0$ and $F(t,y)=-y+t^{2}$. The equations of motion under the new coordinates are $DDY=1$ and $QY=1$. By Remark 6.20, $K$ shares the same equations of motion with $\tilde{K}(y,P,O)=\frac{1}{2}|P|^{2}-y+\frac{1}{2}O$. In other words, (7.54) transforms Brownian reciprocal processes to reciprocal processes derived from diffusions with linear potentials. This example is taken from [60, Theorem 4.1.(1)], where the authors used (7.54) to transform free heat equations to heat equations with linear potentials. We refer readers to [60] for more applications of canonical transformations of contact Hamiltonian systems to Euclidean quantum mechanics in Example 6.12.

(ii). Consider the following change of coordinates from $(x,p,t)$ to $(y,P,s)$ (with time-change)

\[ x=y\sqrt{1-t^{2}},\quad P=p\sqrt{1-t^{2}}+yt,\quad s=\mathrm{arctanh}\,t. \tag{7.55} \]

Clearly, the map $(x,p)\mapsto(y,P)$ is induced by the type three generating function $G_{3}(y,p,t)=-py\sqrt{1-t^{2}}-\frac{y^{2}}{2}t$ via the relations $x=-\frac{\partial G_{3}}{\partial p}$ and $P=-\frac{\partial G_{3}}{\partial y}$. The relation between the latent coordinates $o$ and $O$ is

\[ O=\frac{\partial P}{\partial y}=\frac{\partial p}{\partial x}\frac{\partial x}{\partial y}\sqrt{1-t^{2}}+t=(1-t^{2})o+t. \tag{7.56} \]

The new 2nd-order Hamiltonian $K$ satisfies $K\frac{ds}{dt}-H=\frac{\partial G_{3}}{\partial t}$. Hence, combining with (7.55) and (7.56), we obtain

\[ K(y,P,O,s)=(1-t^{2})\left(\frac{1}{2}|p|^{2}+\frac{pyt}{\sqrt{1-t^{2}}}-\frac{1}{2}|y|^{2}+\frac{1}{2}o\right)=\frac{1}{2}|P|^{2}-\frac{1}{2}|y|^{2}+\frac{1}{2}O-\frac{1}{2}\tanh s. \]

This differs from the 2nd-order Hamiltonian of Euclidean harmonic oscillators in Example 6.12.(ii), i.e., $\tilde{K}(y,P,O)=\frac{1}{2}|P|^{2}-\frac{1}{2}|y|^{2}+\frac{1}{2}O$, by a term depending only on time. So by virtue of Remark 6.20, $K$ and $\tilde{K}$ share the same equations of motion $DDY=Y$, $QY=1$. Therefore, (7.55) transforms free reciprocal processes to Euclidean harmonic oscillators.
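Both transformations of this example can be checked numerically. The following sketch differentiates the generating functions $G_{2}$ and $G_{3}$ by central finite differences, applies the relations (7.53) and $K\frac{ds}{dt}-H=\frac{\partial G_{3}}{\partial t}$, and compares the outcome with the closed-form Hamiltonians stated above. The helper names (`partial`, `residual_linear`, `residual_oscillator`) and the sample points are hypothetical and serve only as an illustration.

```python
import math

def partial(f, args, k, h=1e-6):
    """Central finite-difference derivative of f with respect to args[k]."""
    up, dn = list(args), list(args)
    up[k] += h
    dn[k] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

def H(p, o):
    """2nd-order Hamiltonian of the Brownian reciprocal process."""
    return 0.5 * p * p + 0.5 * o

def G2(x, P, t):
    """Type two generating function of part (i)."""
    return (x + 0.5 * t * t) * P - t * x

def residual_linear(x, P, t, o):
    """K from (7.53) minus the stated 1/2 P^2 - y + t^2 + 1/2 O."""
    p = partial(G2, (x, P, t), 0)            # p = dG2/dx
    y = partial(G2, (x, P, t), 1)            # y = dG2/dP
    K = H(p, o) + partial(G2, (x, P, t), 2)  # no time-change: K - H = dG2/dt
    O = o                                    # latent coordinate is unchanged
    return K - (0.5 * P * P - y + t * t + 0.5 * O)

def G3(y, p, t):
    """Type three generating function of part (ii)."""
    return -p * y * math.sqrt(1.0 - t * t) - 0.5 * y * y * t

def residual_oscillator(y, p, t, o):
    """K from K ds/dt - H = dG3/dt minus the stated closed form."""
    P = -partial(G3, (y, p, t), 0)           # P = -dG3/dy
    ds_dt = 1.0 / (1.0 - t * t)              # s = arctanh t
    K = (H(p, o) + partial(G3, (y, p, t), 2)) / ds_dt
    O = (1.0 - t * t) * o + t                # relation (7.56)
    s = math.atanh(t)
    return K - (0.5 * P * P - 0.5 * y * y + 0.5 * O - 0.5 * math.tanh(s))

if __name__ == "__main__":
    print(residual_linear(0.4, -1.1, 0.7, 0.3))
    print(residual_oscillator(0.8, 0.5, 0.3, -0.2))
```

Both residuals vanish up to finite-difference rounding for $|t|<1$, which is the range on which (7.55) is defined.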

Example 7.25 (Canonical transformations for vanishing potentials).

Let $(M,g)$ be Riemannian. Take $G_{2}(x,P,t)=x^{i}P_{i}-S(x,t)$ for some function $S$. Then

\[ y=x,\quad p_{i}=P_{i}-\frac{\partial S}{\partial x^{i}},\quad K=H-\frac{\partial S}{\partial t}. \]

Since the transformation on the base manifold $M$ is the identity, it does not change the Riemannian metric, and

\[ o_{ij}=\frac{\partial p_{i}}{\partial x^{j}}=\frac{\partial P_{i}}{\partial y^{j}}-\frac{\partial^{2}S}{\partial x^{i}\partial x^{j}}. \]

(i). We consider the Hamiltonian $H_{0}(x,p)\equiv b^{i}(x)p_{i}-F(x)$, whose corresponding 2nd-order Hamiltonian $H=\overline{H}^{g}_{0}$ has a diffusion with generator $\frac{1}{2}\Delta+b\cdot\nabla-F$ for solution process (see Subsection 6.3.1). Then, the new Hamiltonian is

\[ K_{0}(y,P,t)=b^{i}(y)P_{i}-\langle b(y),\nabla S(y,t)\rangle-F(y)-\frac{1}{2}\Delta S(y,t)-\frac{\partial S}{\partial t}(y,t). \]

If we choose $S$ solving the backward PDE (6.23), then $K_{0}(y,P)=b^{i}(y)P_{i}$ has a diffusion process with generator $\frac{1}{2}\Delta+\nabla_{b}$ for solution. In particular, such a canonical transformation can transform a diffusion process with a scalar potential into a free motion.

(ii). Consider the Hamiltonian $H_{0}(x,p,t)=\frac{1}{2}g^{ij}(x)p_{i}p_{j}+g^{ij}(x)p_{i}\frac{\partial S}{\partial x^{j}}(x,t)+b^{i}(x)p_{i}-F(x)$, whose corresponding 2nd-order Hamiltonian $H=\overline{H}^{g}_{0}$ has a Schrödinger’s bridge with vector potential $(b+\nabla S)$ and scalar potential $-F$ for solution process. Then, the new Hamiltonian is

\[ K_{0}(y,P,t)=\frac{1}{2}g^{ij}(y)P_{i}P_{j}+b^{i}(y)P_{i}-\langle b(y),\nabla S(y,t)\rangle-\frac{1}{2}|\nabla S(y,t)|^{2}-F(y)-\frac{1}{2}\Delta S(y,t)-\frac{\partial S}{\partial t}(y,t). \]

To transform $K_{0}$ into the standard form $K_{0}(y,P,t)=\frac{1}{2}g^{ij}(y)P_{i}P_{j}+b^{i}(y)P_{i}$ whose solution is a Schrödinger’s bridge with vector potential $b$, we only need to assume that $S$ solves HJB equation (6.28). In particular, such a canonical transformation transforms a Schrödinger’s bridge with a scalar potential into a free one.

Regarding the classical energy introduced at the end of Subsection 7.4.1, for a given classical Lagrangian $L_{0}:\mathbb{R}\times TM\to\mathbb{R}$, we introduce its generalized (or deformed) energy $E:\mathbb{R}\times TM\to\mathbb{R}$ by

\[ E(t,x,\dot{x})=E_{0}(t,x,\dot{x})+\textstyle{\frac{1}{2}}\Delta S(t,x), \]

where $S$ is the solution of the Hamilton-Jacobi-Bellman equation (7.49) associated with $L_{0}$ (with $f\equiv 0$). The term $\frac{1}{2}\Delta S$ stands for the stochastic deformation.

7.4.3 Small-noise limits

In this part, we will see, informally, how our stochastic framework degenerates into classical mechanics as the noise goes to zero. Let $\epsilon>0$ be a small parameter which we refer to as diffusivity. The limit $\epsilon\to 0$ is called the small-noise limit.

Let $\mathcal{A}^{\epsilon}_{g}([0,T];q,\mu)$ be the small-noise version of the admissible class (7.10), that is, with the constraint $QX(t)=\epsilon\check{g}(X(t))$. The $\epsilon$-dependent stochastic variational problem is to minimize the action functional $\mathcal{S}[X;0,T]$ in (7.9) among all $X\in\mathcal{A}^{\epsilon}_{g}([0,T];q,\mu)$. Then, the same procedure as in Section 7.2 yields the following $\epsilon$-dependent stochastic Euler-Lagrange equation,

\[ \frac{\overline{\mathbf{D}}^{\epsilon}}{dt}\big(d_{\dot{x}}L_{0}\left(t,X_{\epsilon}(t),D_{\nabla}X_{\epsilon}(t)\right)\big)=d_{x}L_{0}\left(t,X_{\epsilon}(t),D_{\nabla}X_{\epsilon}(t)\right), \tag{7.57} \]

which is an equivalent condition for $X_{\epsilon}\in\mathcal{A}^{\epsilon}_{g}([0,T];q,\mu)$ to be a stationary point of $\mathcal{S}$. Here $\frac{\overline{\mathbf{D}}^{\epsilon}}{dt}$ is the damped mean covariant derivative with respect to $X_{\epsilon}$, so that

\[ \frac{\overline{\mathbf{D}}^{\epsilon}}{dt}=\frac{\partial}{\partial t}+\nabla_{D_{\nabla}X_{\epsilon}}+\frac{\epsilon}{2}\Delta_{\mathrm{LD}}. \]

Now as $\epsilon\to 0$, since $QX_{\epsilon}\to 0$, $X_{\epsilon}$ tends to some deterministic curve $\gamma=(\gamma(t))_{t\in[0,T]}$ (in a suitable probabilistic sense), and $D_{\nabla}X_{\epsilon}(t)$ tends to $\dot{\gamma}(t)$. Thus, we can write informally

\[ \mathcal{A}^{\epsilon}_{g}([0,T];q,\mu)\to\mathcal{A}^{0}_{g}([0,T];q,\mu):=\left\{\gamma\text{ is adapted with paths in }C^{2}([0,T],M):\gamma(0)=q,\ \mathbf{P}\circ(\gamma(T))^{-1}=\mu\right\}. \]

The $\epsilon$-dependent stochastic variational problem tends to the following deterministic variational problem

\[ \min_{\gamma\in\mathcal{A}^{0}_{g}([0,T];q,\mu)}\int_{0}^{T}L_{0}\left(t,\gamma(t),\dot{\gamma}(t)\right)dt. \tag{7.58} \]

And the $\epsilon$-dependent stochastic Euler-Lagrange equation (7.57) tends to

\[ \frac{D}{dt}\big(d_{\dot{x}}L_{0}\left(t,\gamma(t),\dot{\gamma}(t)\right)\big)=d_{x}L_{0}\left(t,\gamma(t),\dot{\gamma}(t)\right), \tag{7.59} \]

where $\frac{D}{dt}=\frac{\partial}{\partial t}+\nabla_{\dot{\gamma}}$ is the material derivative along $\gamma$. This is the classical Euler-Lagrange equation in global form, cf. [85, p. 153].

We introduce the following $\epsilon$-dependent version of the $g$-canonical lift (6.44):

\[ H_{\epsilon}(x,p,o,t):=H_{0}(x,p,t)+\textstyle{\frac{\epsilon}{2}}g^{jk}(x)\left(o_{jk}-\Gamma^{i}_{jk}(x)p_{i}\right). \]

Let $\mathbf{X}_{\epsilon}$ be a horizontal integral process of stochastic Hamilton’s equations (6.10) corresponding to $H_{\epsilon}$ and $X_{\epsilon}=\tau_{M}^{S*}(\mathbf{X}_{\epsilon})$. Since $(Q(x\circ\mathbf{X}_{\epsilon}))^{jk}=2\frac{\partial H_{\epsilon}}{\partial o_{jk}}=\epsilon\check{g}\to 0$ as $\epsilon\to 0$, $\mathbf{X}_{\epsilon}$ converges to a $T^{*}M$-valued process. And since $\frac{\partial H_{\epsilon}}{\partial p_{i}}\to\frac{\partial H_{0}}{\partial p_{i}}$ and $\frac{\partial H_{\epsilon}}{\partial x^{i}}\to\frac{\partial H_{0}}{\partial x^{i}}$, the limit $T^{*}M$-valued process satisfies classical Hamilton’s equations,

\[ \left\{\begin{aligned} \dot{x}^{i}(t)&=\textstyle{\frac{\partial H_{0}}{\partial p_{i}}}(x(t),p(t),t),\\ \dot{p}_{i}(t)&=-\textstyle{\frac{\partial H_{0}}{\partial x^{i}}}(x(t),p(t),t).\end{aligned}\right. \tag{7.60} \]

Let $\mathbb{X}_{\epsilon}:=\hat{\varrho}^{*}(\mathbf{X}_{\epsilon})$. Then, $\mathbb{X}_{\epsilon}(t)=p(t,X_{\epsilon}(t))$ solves the system of global stochastic Hamilton’s equations (7.47), with $\mathbb{X}_{\epsilon}$, $X_{\epsilon}$ and $\frac{\overline{\mathbf{D}}^{\epsilon}}{dt}$ in place of $\mathbb{X}$, $X$ and $\frac{\overline{\mathbf{D}}}{dt}$, respectively, subject to $QX_{\epsilon}(t)=\epsilon\check{g}(X_{\epsilon}(t))$. As $\epsilon$ goes to 0, this system tends to the following deterministic system,

\[ \left\{\begin{aligned} \dot{x}(t)&=\nabla_{p}H_{0}(x(t),p(t),t),\\ \frac{D}{dt}p(t)&=-d_{x}H_{0}(x(t),p(t),t).\end{aligned}\right. \tag{7.61} \]

This is indeed the global form of (7.60), which is equivalent to the global Euler-Lagrange equation (7.59) via the classical Legendre transform.

The corresponding $\epsilon$-dependent Hamilton-Jacobi-Bellman equation is now

\[ \frac{\partial S}{\partial t}+H_{\epsilon}(d^{2}S,t)=\frac{\partial S}{\partial t}+H_{0}(dS,t)+\frac{\epsilon}{2}\Delta S=f(t), \]

which, as $\epsilon\to 0$, goes to the classical Hamilton-Jacobi equation

\[ \frac{\partial S}{\partial t}+H_{0}(dS,t)=f(t). \]

The latter corresponds to (7.59)–(7.61) via classical Hamilton-Jacobi theory (e.g., [1, Chapter 5]).
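To make this limit concrete, consider the flat one-dimensional case with $H_{0}(p)=\frac{1}{2}p^{2}$ and $f\equiv 0$. These choices, the parameter `a` and the function names below are assumptions made only for illustration. A direct computation shows that $S_{\epsilon}(t,x)=\epsilon\log\cosh(ax/\epsilon)-\frac{1}{2}a^{2}t$ solves the $\epsilon$-dependent HJB equation, and that it converges to $a|x|-\frac{1}{2}a^{2}t$, which solves the classical Hamilton-Jacobi equation away from $x=0$. The sketch below checks both statements numerically with finite differences.

```python
import math

def S(t, x, eps, a=1.0):
    """S_eps(t, x) = eps * log cosh(a x / eps) - a^2 t / 2 (illustrative)."""
    z = abs(a * x / eps)
    # Overflow-safe log cosh: log cosh z = z + log(1 + exp(-2 z)) - log 2.
    log_cosh = z + math.log1p(math.exp(-2.0 * z)) - math.log(2.0)
    return eps * log_cosh - 0.5 * a * a * t

def hjb_residual(t, x, eps, a=1.0, h=1e-4):
    """Residual of S_t + (1/2) S_x^2 + (eps/2) S_xx by central differences."""
    S_t = (S(t + h, x, eps, a) - S(t - h, x, eps, a)) / (2.0 * h)
    S_x = (S(t, x + h, eps, a) - S(t, x - h, eps, a)) / (2.0 * h)
    S_xx = (S(t, x + h, eps, a) - 2.0 * S(t, x, eps, a)
            + S(t, x - h, eps, a)) / (h * h)
    return S_t + 0.5 * S_x ** 2 + 0.5 * eps * S_xx

def S_classical(t, x, a=1.0):
    """Small-noise limit a|x| - a^2 t / 2."""
    return a * abs(x) - 0.5 * a * a * t

if __name__ == "__main__":
    for eps in (1.0, 0.5):
        print(eps, hjb_residual(0.2, 0.3, eps))
    print(S(0.2, 0.3, 1e-3) - S_classical(0.2, 0.3))
```

In this example the gap between $S_{\epsilon}$ and its limit is of order $\epsilon\log 2$, and the limit has a kink at $x=0$, where it is not a classical solution.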

We list here some previous works that have independent interests in the above small-noise limits, in some special cases. The time-asymptotic large deviation for Brownian bridges of Example 6.11 was studied in [41]. The second author of the present paper and his collaborator proved in [76] a large deviation result for one-dimensional Bernstein bridges which are solution processes of Euclidean quantum mechanics in Example 6.12. The paper [57] proved that the $\Gamma$-limit of Schrödinger’s problem in Section 7.3 with small variance is the Monge-Kantorovich problem. The latter is the optimal transport problem associated with the classical variational problem (7.58) [85, Chapter 7]. See [67, Section 2.3] for more on small-noise limits of stochastic optimal transport.

Remark 7.26.

There are various terminologies in other areas related to the small-noise limit. In thermodynamics [42], $\epsilon$ stands for the Boltzmann constant, which relates to the diffusion coefficient via the Einstein relation, consistent with Schrödinger’s original statistical problem [79]; when applied to quantum mechanics as in Example 6.12, the small-noise limit is called the semiclassical limit and the parameter $\epsilon$ stands for the reduced Planck constant $\hbar$; when (or if) applied to hydrodynamics (cf. [6, 14]), it is often called the vanishing viscosity limit and $\epsilon$ stands for the kinematic viscosity $\nu$. The latter may be expected to solve Kolmogorov’s conjecture that the “stochastization” of dynamical systems is related to hydrodynamic PDEs as viscosity vanishes [8]. In physics, diffusivity, Planck constant and viscosity are indeed related to each other [84].

7.4.4 Relations to stochastic optimal control

Following the way of converting problems of classical calculus of variations into optimal control problems (see [29]), we can regard the stochastic variational problem of Section 7.2 as a stochastic optimal control problem.

Assume that $(M,g)$ is compact (for simplicity). Consider a stochastic control model in which the state evolves according to an $M$-valued diffusion $X$ governed by a system of MDEs on the time interval $[t,T]$, of the form

\left\{\begin{aligned} &D_{\nabla}X(s)=U(s),\\ &QX(s)=g(X(s)),\end{aligned}\right. (7.62)

or equivalently, by an Itô SDE of the form

dX^{i}(s)=\left(U^{i}(s)-\frac{1}{2}g^{jk}(X(s))\Gamma_{jk}^{i}(X(s))\right)ds+\sigma_{r}^{i}(X(s))dW^{r}(s),\quad s\in[t,T],

where $\sigma$ is the positive definite square root $(1,1)$-tensor of $g$, i.e., $\sum_{r=1}^{d}\sigma^{i}_{r}\sigma^{j}_{r}=g^{ij}$, $W$ is an $\mathbb{R}^{d}$-valued standard Brownian motion and, most importantly, $U$ is a $TM$-valued process called the control process. There are no constraints on $U$ other than that it is admissible in the sense of [29, Definition 2.1]. As an endpoint condition, we require that $X(t)=x$.

The control problem on a finite time interval $s\in[t,T]$ is to choose $U$ to minimize

J(t,x;X,U):=\mathbf{E}_{(t,x)}\left[\int_{t}^{T}L_{0}\left(s,X(s),U(s)\right)ds-S_{T}(X(T))\right], (7.63)

among all pairs $(X,U)$ satisfying the system (7.62) and the endpoint condition, where $S_{T}$ is a given smooth function on $M$. The real-valued smooth function $L_{0}$ on $\mathbb{R}\times TM$ is called the running cost function and $J$ the payoff functional. The problem is called a stochastic Bolza problem. In the case $S_{T}\equiv 0$, this stochastic control problem is of the same form as our stochastic variational problem of Section 7.2. For this reason, we say that the latter stochastic control problem is in Lagrange form. By an argument similar to that of Theorem 7.16, one can derive the same equation as (7.22), but with boundary conditions $X(t)=x$ and $d_{\dot{x}}L_{0}(T,X(T),D_{\nabla}X(T))=dS_{T}(X(T))$.

The starting point of dynamic programming is to regard the infimum of the functional $J$, up to a sign, as a function $S(t,x)$ of the initial data:

S(t,x)=-\inf_{(X,U)}J(t,x;X,U).

Then, Bellman’s principle of dynamic programming [29, Section III.7] states that for $t\leq t+\epsilon\leq T$,

0=\inf_{X\in I_{0}^{T}(M)}\mathbf{E}_{(t,x)}\left[\int_{t}^{t+\epsilon}L_{0}\left(s,X(s),D_{\nabla}X(s)\right)ds-S(t+\epsilon,X(t+\epsilon))+S(t,x)\right].

Divide the equation by $\epsilon$, let $\epsilon\to 0^{+}$, and then use Dynkin’s formula. We get the dynamic programming equation

0=\inf\left[L_{0}(t,x,D_{\nabla}x)-(\mathbf{D}_{t}S)(t,x,Dx,Qx)\right], (7.64)

subject to the terminal condition $S(T,x)=S_{T}(x)$. By (4.5) and (7.62),

\mathbf{D}_{t}S=\partial_{t}S+D^{i}x\partial_{i}S+\textstyle{{\frac{1}{2}}}Q^{ij}x\partial_{i}\partial_{j}S=\partial_{t}S+\left(D^{i}_{\nabla}x-\textstyle{{\frac{1}{2}}}\Gamma^{i}_{jk}g^{jk}\right)\partial_{i}S+\textstyle{{\frac{1}{2}}}g^{ij}\partial_{i}\partial_{j}S.

We let

H(x,p,o,t)=\sup\left[\left(D^{i}_{\nabla}x-\textstyle{{\frac{1}{2}}}\Gamma^{i}_{jk}(x)g^{jk}(x)\right)p_{i}+\textstyle{{\frac{1}{2}}}g^{ij}(x)o_{ij}-L_{0}(t,x,D_{\nabla}x)\right],

where the supremum, taken over the velocity variable $D_{\nabla}x$, can be dropped if $L_{0}$ is convex in that variable (it is then attained where $d_{\dot{x}}L_{0}=p$), so that $H=\overline{H}^{g}_{0}$ is exactly the canonical inverse 2nd-order Legendre transform in (7.43). Then, the dynamic programming equation (7.64) can be written as the HJB equation (6.38), cf. [29, Section IV.3].
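To make the passage from the dynamic programming equation (7.64) to an HJB equation concrete, here is a minimal worked sketch under assumptions that are ours rather than the paper's: $M=\mathbb{R}^{d}$ with the flat metric (so $g^{ij}=\delta^{ij}$ and $\Gamma^{i}_{jk}=0$) and the quadratic running cost $L_{0}(t,x,\dot{x})=\frac{1}{2}|\dot{x}|^{2}$. The symbol $v$ is a hypothetical shorthand for the velocity variable $D_{\nabla}x$.

```latex
% Illustrative assumptions: M = R^d, g^{ij} = \delta^{ij}, \Gamma^{i}_{jk} = 0,
% L_0(t,x,\dot{x}) = |\dot{x}|^2/2; v denotes the velocity variable D_\nabla x.
\[
\begin{aligned}
H(x,p,o,t)&=\sup_{v\in\mathbb{R}^{d}}\left[v^{i}p_{i}+\tfrac{1}{2}\delta^{ij}o_{ij}-\tfrac{1}{2}|v|^{2}\right]
=\tfrac{1}{2}|p|^{2}+\tfrac{1}{2}\delta^{ij}o_{ij}
\quad(\text{attained at } v=p),\\
0&=\inf_{v\in\mathbb{R}^{d}}\left[\tfrac{1}{2}|v|^{2}-\partial_{t}S-v^{i}\partial_{i}S-\tfrac{1}{2}\Delta S\right]
=-\left(\partial_{t}S+\tfrac{1}{2}|\nabla S|^{2}+\tfrac{1}{2}\Delta S\right)
\quad(\text{attained at } v=\nabla S).
\end{aligned}
\]
```

In this special case the dynamic programming equation is therefore $\partial_{t}S+H(x,p,o,t)=0$ evaluated at $p_{i}=\partial_{i}S$ and $o_{ij}=\partial_{i}\partial_{j}S$, and the minimizing control is the gradient field $U=\nabla S$.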

There is also a stochastic version of Pontryagin’s maximum principle [87, Theorem 3.3.2]. The crucial objects in stochastic Pontryagin’s principle are first- and second-order adjoint processes, $p$ and $o$, respectively. Corresponding to the stochastic control problem (7.62)–(7.63), its adjoint processes $p$ and $o$ satisfy the following backward SDEs [87, Section 3.3.2] (where “backward” is again in a different sense from ours in Chapter 2),

{dpi(t)=[12(igklΓklj+gkliΓklj)(X(t))pj(t)r=1diσrj(X(t))zjr(t)+L0xi(t,X(t),U(t))]dt+zir(t)dWr(t),pi(T)=iST(X(T)),\left\{\begin{aligned} dp_{i}(t)&=\left[\frac{1}{2}\left(\partial_{i}g^{kl}\Gamma_{kl}^{j}+g^{kl}\partial_{i}\Gamma_{kl}^{j}\right)(X(t))p_{j}(t)-\sum_{r=1}^{d}\partial_{i}\sigma_{r}^{j}(X(t))z_{jr}(t)+\frac{\partial L_{0}}{\partial x^{i}}\left(t,X(t),U(t)\right)\right]dt\\ &\quad+z_{ir}(t)dW^{r}(t),\\ p_{i}(T)&=\partial_{i}S_{T}(X(T)),\end{aligned}\right. (7.65)

and

{doij(t)=[2H¯g0xixj(X(t),p(t),o(t),t)+oik(t)(ojl(t)2H0pkpl(X(t),p(t),t)+22H0xjpk(X(t),p(t),t))oik(t)(jglmΓlmk+glmjΓlmk)(X(t))+r=1d(jσrk(X(t))Zikr(t)+jσrl(X(t))Zilr(t))]dt+Zijr(t)dWr(t),oij(T)=ijST(X(T)),\left\{\begin{aligned} do_{ij}(t)&=-\Bigg{[}\frac{\partial^{2}\overline{H}^{g}_{0}}{\partial x^{i}\partial x^{j}}(X(t),p(t),o(t),t)\\ &\qquad\ +o_{ik}(t)\left(o_{jl}(t)\frac{\partial^{2}H_{0}}{\partial p_{k}\partial p_{l}}(X(t),p(t),t)+2\frac{\partial^{2}H_{0}}{\partial x^{j}\partial p_{k}}(X(t),p(t),t)\right)\\ &\qquad\ -o_{ik}(t)\left(\partial_{j}g^{lm}\Gamma_{lm}^{k}+g^{lm}\partial_{j}\Gamma_{lm}^{k}\right)(X(t))\\ &\qquad\ +\sum_{r=1}^{d}\left(\partial_{j}\sigma_{r}^{k}(X(t))Z_{ikr}(t)+\partial_{j}\sigma_{r}^{l}(X(t))Z_{ilr}(t)\right)\Bigg{]}dt+Z_{ijr}(t)dW^{r}(t),\\ o_{ij}(T)&=\partial_{i}\partial_{j}S_{T}(X(T)),\end{aligned}\right. (7.66)

which are called the first- and second-order adjoint equations, respectively. The unknowns in (7.65) and (7.66) are the pairs $(p,z)$ and $(o,Z)$, respectively. Suppose that $p_{i}(t)=p_{i}(t,X(t))$ and $o_{ij}(t)=o_{ij}(t,X(t))$ for a time-dependent 2nd-order form $(p,o)$ that satisfies the 2nd-order Maxwell relations (6.15). Then

z_{ir}=\frac{\partial p_{i}}{\partial x^{j}}\sigma^{j}_{r},\quad Z_{ijr}=\frac{\partial o_{ij}}{\partial x^{k}}\sigma^{k}_{r}.

Plugging them into (7.65) and (7.66), we get

D_{i}p=\frac{1}{2}\left(\partial_{i}g^{kl}\Gamma_{kl}^{j}+g^{kl}\partial_{i}\Gamma_{kl}^{j}\right)p_{j}-\frac{1}{2}\partial_{i}g^{jk}\frac{\partial p_{j}}{\partial x^{k}}+\frac{\partial L_{0}}{\partial x^{i}}=-\frac{\partial\overline{H}^{g}_{0}}{\partial x^{i}}, (7.67)
D_{ij}o=-\left(\frac{\partial^{2}\overline{H}^{g}_{0}}{\partial x^{i}\partial x^{j}}+\frac{\partial p_{k}}{\partial x^{i}}\frac{\partial p_{l}}{\partial x^{j}}\frac{\partial^{2}\overline{H}^{g}_{0}}{\partial p_{k}\partial p_{l}}+2\frac{\partial p_{k}}{\partial x^{i}}\frac{\partial^{2}\overline{H}^{g}_{0}}{\partial x^{j}\partial p_{k}}+2\frac{\partial o_{kl}}{\partial x^{i}}\frac{\partial^{2}\overline{H}^{g}_{0}}{\partial x^{j}\partial o_{kl}}\right).

These coincide with the corresponding equations in the S-H system (6.10) for the 2nd-order Hamiltonian $\overline{H}^{g}_{0}$. The first equality of (7.67) also recovers (7.51).

7.5 Stochastic variational symmetries

Definition 7.27.

Given an action functional $\mathcal{S}$ as in (7.9), a bundle automorphism $F$ on $(\mathbb{R}\times M,\pi,\mathbb{R})$ projecting to $F^{0}$ is called a variational symmetry of $\mathcal{S}$ if, whenever $[t_{1},t_{2}]$ is a subinterval of $[0,T]$, we have $\mathcal{S}[F\cdot X,F^{0}(t_{1}),F^{0}(t_{2})]=\mathcal{S}[X,t_{1},t_{2}]$. A $\pi$-projectable vector field $V$ on $\mathbb{R}\times M$ is called an infinitesimal variational symmetry of $\mathcal{S}$ if its flow consists of variational symmetries of $\mathcal{S}$.

Lemma 7.28.

The $\pi$-projectable vector field $V$ of the form (4.9) is an infinitesimal variational symmetry of $\mathcal{S}$ if and only if

\left[(j^{\nabla}V)(L_{0})+L_{0}\dot{V}^{0}\right](j^{\nabla}_{t}X),\quad t\in[0,T]

is a martingale, for all $X\in I_{0}^{T}(M)$.

Proof.

As in the proof of Theorem 4.14, we let $\psi=\{(\psi^{0}_{\epsilon},\bar{\psi}_{\epsilon})\}_{\epsilon\in\mathbb{R}}$ be the flow generated by $V$, and denote $\tilde{X}_{\epsilon}=\psi_{\epsilon}\cdot X$. Then, by the change of variable $s=\psi^{0}_{\epsilon}(t)$,

\begin{split}\mathcal{S}[\tilde{X}_{\epsilon},\psi^{0}_{\epsilon}(t_{1}),\psi^{0}_{\epsilon}(t_{2})]&=\mathbf{E}\int_{\psi^{0}_{\epsilon}(t_{1})}^{\psi^{0}_{\epsilon}(t_{2})}L_{0}\left(s,\tilde{X}_{\epsilon}(s),D_{\nabla}\tilde{X}_{\epsilon}(s)\right)ds\\ &=\mathbf{E}\int_{t_{1}}^{t_{2}}L_{0}\left(\psi^{0}_{\epsilon}(t),\bar{\psi}_{\epsilon}(t,X(t)),D_{\nabla}\tilde{X}_{\epsilon}(\psi^{0}_{\epsilon}(t))\right)\frac{d\psi^{0}_{\epsilon}}{dt}(t)dt.\end{split}

Since for all $[t_{1},t_{2}]\subset[0,T]$ and each $\epsilon$, $\mathcal{S}[\tilde{X}_{\epsilon},\psi^{0}_{\epsilon}(t_{1}),\psi^{0}_{\epsilon}(t_{2})]=\mathcal{S}[X,t_{1},t_{2}]$, we have that the difference

L_{0}\left(\psi^{0}_{\epsilon}(t),\bar{\psi}_{\epsilon}(t,X(t)),D_{\nabla}\tilde{X}_{\epsilon}(\psi^{0}_{\epsilon}(t))\right)\frac{d\psi^{0}_{\epsilon}}{dt}(t)-L_{0}\left(t,X(t),D_{\nabla}X(t)\right)

is a martingale (depending on $\epsilon$). Differentiating this difference with respect to $\epsilon$ at $\epsilon=0$, and recalling that $j^{\nabla}V=\frac{d}{d\epsilon}\big|_{\epsilon=0}j^{\nabla}\psi_{\epsilon}$, we obtain the desired result. ∎

Definition 7.29.

Let $\Phi:\mathbb{R}\times M\to\mathbb{R}$ be a smooth function. A $\pi$-projectable vector field $V$ on $\mathbb{R}\times M$ is called an infinitesimal $\Phi$-divergence symmetry of $\mathcal{S}$ if

\left[(j^{\nabla}V)(L_{0})+L_{0}\dot{V}^{0}\right](j^{\nabla}_{t}X)=\mathbf{D}_{t}\Phi(j^{\nabla}_{t}X),

for all $X\in I_{0}^{T}(M)$ and $t\in[0,T]$.

Recall that for the $\pi$-projectable vector field $V$ of the form (4.9), we denote $\bar{V}=V^{i}\frac{\partial}{\partial{x^{i}}}$, as in Corollary 4.17.

Proposition 7.30.

A vector field $V$ of the form (4.9) is an infinitesimal $\Phi$-divergence symmetry of $\mathcal{S}$ if and only if

V^{0}\partial_{t}L_{0}+d_{x}L_{0}(\bar{V})+d_{\dot{x}}L_{0}\left(\frac{\overline{\mathbf{D}}\bar{V}}{dt}\right)-\dot{V}^{0}E_{0}=\mathbf{D}_{t}\Phi.
Proof.

It follows from Corollary 4.17 and (7.19), (7.20) that

𝐃tΦ=V0tL0+ViiL0+[(t+x˙jj)Vi+12(ΔV¯+Ric(V¯))iV˙0x˙i]x˙iL0+V˙0L0=V0tL0+dxL0(V¯)+dx˙L0((t+x˙+12ΔLD)V¯)V˙0(x˙ix˙iL0L0)=V0tL0+dxL0(V¯)+dx˙L0(𝐃¯V¯dt)V˙0E0.\begin{split}\mathbf{D}_{t}\Phi&=V^{0}\partial_{t}L_{0}+V^{i}\partial_{i}L_{0}+\left[\left(\partial_{t}+\dot{x}^{j}\partial_{j}\right)V^{i}+\textstyle{{\frac{1}{2}}}\left(\Delta\bar{V}+\mathrm{Ric}(\bar{V})\right)^{i}-\dot{V}^{0}\dot{x}^{i}\right]\partial_{\dot{x}^{i}}L_{0}+\dot{V}^{0}L_{0}\\ &=V^{0}\partial_{t}L_{0}+d_{x}L_{0}(\bar{V})+d_{\dot{x}}L_{0}\left(\left(\partial_{t}+\nabla_{\dot{x}}+\textstyle{{\frac{1}{2}}}\Delta_{\text{LD}}\right)\bar{V}\right)-\dot{V}^{0}\left(\dot{x}^{i}\partial_{\dot{x}^{i}}L_{0}-L_{0}\right)\\ &=V^{0}\partial_{t}L_{0}+d_{x}L_{0}(\bar{V})+d_{\dot{x}}L_{0}\left(\frac{\overline{\mathbf{D}}\bar{V}}{dt}\right)-\dot{V}^{0}E_{0}.\end{split}

This concludes the proof. ∎

Corollary 7.31.

Let $L_{0}:\mathbb{R}\times TM\to\mathbb{R}$ be a hyperregular Lagrangian. Let $V$ be a vector field of the form (4.9). Given a smooth function $\Phi:\mathbb{R}\times M\to\mathbb{R}$, define the $\Phi$-extension of $V$ by

V_{\Phi}=V+\Phi\frac{\partial}{\partial{u}}, (7.68)

which is a vector field on $\mathbb{R}\times M\times\mathbb{R}$. Suppose that $V$ satisfies

\frac{1}{2}\dot{V}^{0}\Delta S=g^{ij}\nabla^{2}_{\partial_{i},\nabla_{\partial_{j}}\bar{V}}S,

for $S$ the solution of the Hamilton-Jacobi-Bellman equation (7.49) associated with $L_{0}$ (for $f\equiv 0$). Then, $V$ is an infinitesimal $\Phi$-divergence symmetry of $\mathcal{S}$ if and only if $V_{\Phi}$ is an infinitesimal symmetry of equation (7.49).

Proof.

By the classical jet bundle theory, we know that $V_{\Phi}$ is an infinitesimal symmetry of the Hamilton-Jacobi-Bellman equation (7.49) if and only if [72, Theorem 2.31]

j1,2V(ut+H0(x,(ui),t)+12gij(x)uij12gij(x)Γijk(x)uk)=0,j^{1,2}V\left(u_{t}+H_{0}(x,(u_{i}),t)+\textstyle{{\frac{1}{2}}}g^{ij}(x)u_{ij}-\textstyle{{\frac{1}{2}}}g^{ij}(x)\Gamma_{ij}^{k}(x)u_{k}\right)=0, (7.69)

where

j1,2V=V0t+Vixi+Φu+Vtut+Viui+Vijuij,j^{1,2}V=V^{0}\frac{\partial}{\partial{t}}+V^{i}\frac{\partial}{\partial{x^{i}}}+\Phi\frac{\partial}{\partial{u}}+V_{t}\frac{\partial}{\partial u_{t}}+V_{i}\frac{\partial}{\partial u_{i}}+V_{ij}\frac{\partial}{\partial u_{ij}},

with coefficients given by [72, Theorem 2.36 or Example 2.38]

Vt=ΦtV˙0utVitui,Vi=ΦxiVjxiuj,Vij=2Φxixj2VkxixjukVkxiujkVkxjuik.V_{t}=\frac{\partial\Phi}{\partial t}-\dot{V}^{0}u_{t}-\frac{\partial V^{i}}{\partial t}u_{i},\quad V_{i}=\frac{\partial\Phi}{\partial x^{i}}-\frac{\partial V^{j}}{\partial x^{i}}u_{j},\quad V_{ij}=\frac{\partial^{2}\Phi}{\partial x^{i}\partial x^{j}}-\frac{\partial^{2}V^{k}}{\partial x^{i}\partial x^{j}}u_{k}-\frac{\partial V^{k}}{\partial x^{i}}u_{jk}-\frac{\partial V^{k}}{\partial x^{j}}u_{ik}.

Moreover, the jet coordinates (ut,ui,uij)(u_{t},u_{i},u_{ij}) satisfy

(ut,ui,uij)=(tS,iS,ijS)=(E012ΔS,x˙iL0,ijS),(u_{t},u_{i},u_{ij})=(\partial_{t}S,\partial_{i}S,\partial_{ij}S)=(-E_{0}-\textstyle{{\frac{1}{2}}}\Delta S,\partial_{\dot{x}^{i}}L_{0},\partial_{ij}S),

where we recall $dS=d_{\dot{x}}L_{0}$ from equation (7.29) and Remark 7.23, and also that $\partial_{t}S=-H_{0}(dS,t)-\frac{1}{2}\Delta S=-E_{0}-\frac{1}{2}\Delta S$. Plugging these into (7.69) and using the fact that $\partial_{t}H_{0}=-\partial_{t}L_{0}$ and $\partial_{x^{i}}H_{0}=-\partial_{x^{i}}L_{0}$ due to the classical Legendre transform, we have

\begin{split}0=&\ V^{0}\partial_{t}H_{0}+V^{i}\left(\partial_{x^{i}}H_{0}+\textstyle{{\frac{1}{2}}}\partial_{i}g^{jk}u_{jk}-\textstyle{{\frac{1}{2}}}\partial_{i}g^{jk}\Gamma_{jk}^{l}u_{l}-\textstyle{{\frac{1}{2}}}g^{jk}\partial_{i}\Gamma_{jk}^{l}u_{l}\right)+\left(\partial_{t}\Phi-\dot{V}^{0}u_{t}-\partial_{t}V^{i}u_{i}\right)\\
&\ +\left(\partial_{i}\Phi-\partial_{i}V^{l}u_{l}\right)\left(\partial_{p_{i}}H_{0}-\textstyle{{\frac{1}{2}}}g^{jk}\Gamma_{jk}^{i}\right)+\textstyle{{\frac{1}{2}}}g^{ij}\left(\partial_{i}\partial_{j}\Phi-\partial_{i}\partial_{j}V^{k}u_{k}-\partial_{i}V^{k}u_{jk}-\partial_{j}V^{k}u_{ik}\right)\\
=&\ V^{0}\partial_{t}H_{0}+V^{i}\partial_{x^{i}}H_{0}-\left(\partial_{t}+\partial_{p_{j}}H_{0}\partial_{j}\right)V^{i}u_{i}-\textstyle{{\frac{1}{2}}}g^{ij}\left(\partial_{i}\partial_{j}V^{k}-\Gamma_{ij}^{l}\partial_{l}V^{k}+2\Gamma_{il}^{k}\partial_{j}V^{l}+\partial_{l}\Gamma_{ij}^{k}V^{l}\right)u_{k}\\
&\ -\dot{V}^{0}u_{t}-g^{ij}\left(\partial_{j}V^{k}+\Gamma_{jm}^{k}V^{m}\right)\left(u_{ik}-\Gamma_{ik}^{l}u_{l}\right)+\left[\partial_{t}\Phi+\left(\partial_{p_{i}}H_{0}-\textstyle{{\frac{1}{2}}}g^{jk}\Gamma_{jk}^{i}\right)\partial_{i}\Phi+\textstyle{{\frac{1}{2}}}g^{ij}\partial_{i}\partial_{j}\Phi\right]\\
=&\ -V^{0}\partial_{t}L_{0}-V^{i}\partial_{x^{i}}L_{0}-\left(\partial_{t}+\dot{x}^{j}\partial_{j}\right)V^{i}\partial_{\dot{x}^{i}}L_{0}-\textstyle{{\frac{1}{2}}}\left[\Delta\bar{V}+\mathrm{Ric}(\bar{V})\right]^{k}\partial_{\dot{x}^{k}}L_{0}\\
&\ +\dot{V}^{0}\left(E_{0}+\textstyle{{\frac{1}{2}}}\Delta S\right)-g^{ij}\nabla^{2}_{\partial_{i},\nabla_{\partial_{j}}\bar{V}}S+\left(\partial_{t}\Phi+\dot{x}^{i}\partial_{i}\Phi+\textstyle{{\frac{1}{2}}}\Delta\Phi\right)\\
=&\ -\left[V^{0}\partial_{t}L_{0}+d_{x}L_{0}(\bar{V})+d_{\dot{x}}L_{0}\left(\frac{\overline{\mathbf{D}}\bar{V}}{dt}\right)-\dot{V}^{0}E_{0}\right]+\left(\textstyle{{\frac{1}{2}}}\dot{V}^{0}\Delta S-g^{ij}\nabla^{2}_{\partial_{i},\nabla_{\partial_{j}}\bar{V}}S\right)+\mathbf{D}_{t}\Phi,\end{split} (7.70)

where, in the last equality, we used the fact that $(QX)^{ij}(t)=g^{ij}(X(t))$ to derive $\mathbf{D}_{t}\Phi$. The result then follows from Proposition 7.30. ∎

Theorem 7.32 (Stochastic Noether’s theorem).

Let $L_{0}:\mathbb{R}\times TM\to\mathbb{R}$ be a hyperregular Lagrangian. Suppose that the vector field $V_{\Phi}$ in (7.68) is an infinitesimal symmetry of the Hamilton-Jacobi-Bellman equation (7.49) associated with $L_{0}$ (with $f\equiv 0$). Then the following stochastic conservation law holds for the stochastic Euler-Lagrange equation (7.22),

\mathbf{D}_{t}\left[V^{i}\partial_{\dot{x}^{i}}L_{0}-V^{0}E-\Phi\right]=0. (7.71)
Proof.

Recall that $dS=d_{\dot{x}}L_{0}$ and $\partial_{t}S=-E_{0}-\frac{1}{2}\Delta S=-E$. By applying Lemma 7.8.(iv) and (7.22), as well as the fact that $(QX)^{ij}(t)=g^{ij}(X(t))$, we have

𝐃t[dx˙L0(V¯)]=dx˙L0(𝐃¯V¯dt)+𝐃¯(dx˙L0)dt(V¯)+(QX)ij(i(dx˙L0))(jV¯)=dx˙L0(𝐃¯V¯dt)+dxL0(V¯)+gij2i,jV¯S.\begin{split}\mathbf{D}_{t}\left[d_{\dot{x}}L_{0}(\bar{V})\right]&=d_{\dot{x}}L_{0}\left(\frac{\overline{\mathbf{D}}\bar{V}}{dt}\right)+\frac{\overline{\mathbf{D}}(d_{\dot{x}}L_{0})}{dt}(\bar{V})+(QX)^{ij}(\nabla_{\partial_{i}}(d_{\dot{x}}L_{0}))(\nabla_{\partial_{j}}\bar{V})\\ &=d_{\dot{x}}L_{0}\left(\frac{\overline{\mathbf{D}}\bar{V}}{dt}\right)+d_{x}L_{0}(\bar{V})+g^{ij}\nabla^{2}_{\partial_{i},\nabla_{\partial_{j}}\bar{V}}S.\end{split}

Then, we use the HJB equation (7.49) (with $f\equiv 0$) and the classical Legendre transform $H_{0}=d_{\dot{x}}L_{0}\cdot\dot{x}-L_{0}$ to derive

𝐃tE=𝐃ttS=t(t+x˙+12Δ)S=t[dSx˙+(t+12Δ)S]=t(dx˙L0x˙H0)=tL0.\begin{split}\mathbf{D}_{t}E&=-\mathbf{D}_{t}\partial_{t}S=-\partial_{t}\left(\partial_{t}+\nabla_{\dot{x}}+\textstyle{{\frac{1}{2}}}\Delta\right)S=-\partial_{t}\left[dS\cdot\dot{x}+\left(\partial_{t}+\textstyle{{\frac{1}{2}}}\Delta\right)S\right]\\ &=-\partial_{t}\left(d_{\dot{x}}L_{0}\cdot\dot{x}-H_{0}\right)=-\partial_{t}L_{0}.\end{split}

Combining these with the S-EL equation (7.22) and the criterion (7.70) for symmetries of the HJB equation (7.49), we have

𝐃t[Vix˙iL0V0EΦ]=𝐃t[dx˙L0(V¯)]V˙0EV0𝐃tE𝐃tΦ=dx˙L0(𝐃¯V¯dt)+dxL0(V¯)+gij2i,jV¯SV˙0(E0+12ΔS)+V0tL0𝐃tΦ=0.\begin{split}\mathbf{D}_{t}\left[V^{i}\partial_{\dot{x}^{i}}L_{0}-V^{0}E-\Phi\right]&=\mathbf{D}_{t}\left[d_{\dot{x}}L_{0}(\bar{V})\right]-\dot{V}^{0}E-V^{0}\mathbf{D}_{t}E-\mathbf{D}_{t}\Phi\\ &=d_{\dot{x}}L_{0}\left(\frac{\overline{\mathbf{D}}\bar{V}}{dt}\right)+d_{x}L_{0}(\bar{V})+g^{ij}\nabla^{2}_{\partial_{i},\nabla_{\partial_{j}}\bar{V}}S-\dot{V}^{0}\left(E_{0}+\textstyle{{\frac{1}{2}}}\Delta S\right)+V^{0}\partial_{t}L_{0}-\mathbf{D}_{t}\Phi\\ &=0.\end{split}

The result follows. ∎

Remark 7.33.

(i). In the stochastic Hamiltonian formalism, (7.71) reads $\mathbf{D}_{t}\left[V^{i}p_{i}-V^{0}H-\Phi\right]=0$.

(ii). The stochastic conservation law (6.19) of a time-independent $g$-canonical 2nd-order Hamiltonian $H=\overline{H}_{0}^{g}$ can be regarded as a special case of the above stochastic Noether’s theorem. Indeed, consider the infinitesimal unit time translation $V=\frac{\partial}{\partial t}$, i.e., $V^{0}=1$, $\bar{V}=0$, $\Phi=0$. Then, the criterion (7.70) reduces to $0=\partial_{t}L_{0}=-\partial_{t}H_{0}$, which means that $H=\overline{H}_{0}^{g}$ is time-independent. The resulting stochastic conservation law is $\mathbf{D}_{t}E=\mathbf{D}_{t}H=0$.
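A companion example, worked out here only as an illustrative sketch under our own simplifying assumptions ($M=\mathbb{R}^{d}$ with the flat metric, so that a constant vector field has vanishing covariant derivative, and a fixed index $k$), is the unit translation in the $x^{k}$ direction. With $V^{0}=0$, constant $\bar{V}$ and $\Phi=0$, every term of the criterion (7.70) except $d_{x}L_{0}(\bar{V})$ vanishes:

```latex
% Illustrative assumptions: M = R^d with the flat metric (so \nabla\bar{V} = 0
% for constant \bar{V}); k is a fixed index; this choice of V is ours.
\[
\begin{aligned}
&V=\frac{\partial}{\partial x^{k}},\quad\text{i.e.,}\quad V^{0}=0,\quad \bar{V}=\frac{\partial}{\partial x^{k}},\quad \Phi=0,\\
&\text{criterion (7.70):}\quad 0=-\,d_{x}L_{0}(\bar{V})=-\frac{\partial L_{0}}{\partial x^{k}},\\
&\text{conservation law (7.71):}\quad \mathbf{D}_{t}\left[\frac{\partial L_{0}}{\partial\dot{x}^{k}}\right]=0.
\end{aligned}
\]
```

So, in this flat setting, if $L_{0}$ does not depend on $x^{k}$, the $k$-th momentum $p_{k}=\partial_{\dot{x}^{k}}L_{0}$ has vanishing mean derivative along solutions of (7.22), which is the stochastic counterpart of the conservation of linear momentum.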

Applying stochastic Noether’s theorem to Schrödinger’s problem of Section 7.3, we have the following corollary. Its Euclidean case with zero vector potential (i.e., $b\equiv 0$) has already been formulated in [83].

Corollary 7.34 (Stochastic Noether’s theorem for Schrödinger’s problem).

Let $L_{0}$ be the Lagrangian given in (7.25). Suppose that the vector field $V_{\Phi}$ in (7.68) is an infinitesimal symmetry of the Hamilton-Jacobi-Bellman equation (7.27) with $f\equiv 0$. Then the following stochastic conservation law holds for the coordinate process of the solution of Schrödinger’s problem in (7.33),

\mathbf{D}_{t}\left[g_{ij}\left(D_{\nabla}^{j}x-b^{j}\right)V^{i}-V^{0}\left(E_{0}+\textstyle{{\frac{1}{2}}}\Delta S\right)-\Phi\right]=0,

where $E_{0}$ is the classical energy given in (7.45) and $S$ is the solution of (7.27).

Appendix A Mixed-order tangent and cotangent bundles

A.1 Mixed-order tangent and cotangent maps

Clearly, the mixed-order tangent bundle $T\mathbb{R}\times\mathcal{T}^{S}M$ is a subbundle of the totally second-order tangent bundle $\mathcal{T}^{S}(\mathbb{R}\times M)$, and contains the tangent bundle $T(\mathbb{R}\times M)\cong T\mathbb{R}\times TM$ as a subbundle. Similar properties hold for the mixed-order cotangent bundle.

It is easy to verify that the mixed-order tangent bundle can be characterized as follows:

T\mathbb{R}\times\mathcal{T}^{S}M=\{A\in\mathcal{T}^{S}(\mathbb{R}\times M):\pi^{S}_{*}(A)\in T\mathbb{R}\}.

We also define the stochastic analog of the vertical bundle as

V^{S}\pi=\{A\in T\mathbb{R}\times\mathcal{T}^{S}M:\pi^{S}_{*}(A)=0\}.

Then, it is easy to see that $V^{S}\pi\cong\mathbb{R}\times\mathcal{T}^{S}M$.

Given a smooth map $F:\mathbb{R}\times M\to\mathbb{R}\times N$, we can define its second-order pushforward $F_{*}^{S}$ as in Definitions 5.5 and 5.7, so that $F_{*}^{S}$ is a bundle homomorphism from $\tau^{S}_{\mathbb{R}\times M}$ to $\tau^{S}_{\mathbb{R}\times N}$. In general, $F_{*}^{S}$ neither maps the mixed-order tangent bundle to the mixed-order tangent bundle, nor maps the vertical bundle to the vertical bundle. But if $F$ is projectable, then it does.

Lemma A.1.

Let $M$ and $N$ be two smooth manifolds, with $M$ connected. Let $F:\mathbb{R}\times M\to\mathbb{R}\times N$ be a smooth map. Then the following statements are equivalent:
(i) $F$ is a bundle homomorphism from $(\mathbb{R}\times M,\pi,\mathbb{R})$ to $(\mathbb{R}\times N,\rho,\mathbb{R})$;
(ii) $F^{S}_{*}(T\mathbb{R}\times\mathcal{T}^{S}M)\subset T\mathbb{R}\times\mathcal{T}^{S}N$;
(iii) $F^{S}_{*}(V^{S}\pi)\subset V^{S}\rho$.

Proof.

We first prove that (i) implies both (ii) and (iii). Suppose that $F$ is a bundle homomorphism projecting to $F^{0}$. Then, $\rho\circ F=F^{0}\circ\pi$ and hence, for any $A\in\mathcal{T}^{S}(\mathbb{R}\times M)$,

\rho^{S}_{*}(F^{S}_{*}(A))=(F^{0})^{S}_{*}\pi^{S}_{*}(A).

If $A\in T\mathbb{R}\times\mathcal{T}^{S}M$, then $\pi^{S}_{*}(A)\in T\mathbb{R}$ and thus $\rho^{S}_{*}(F^{S}_{*}(A))\in(F^{0})^{S}_{*}(T\mathbb{R})=(F^{0})_{*}(T\mathbb{R})\subset T\mathbb{R}$. This implies $F^{S}_{*}(A)\in T\mathbb{R}\times\mathcal{T}^{S}N$. If $A\in V^{S}\pi$, then $\pi^{S}_{*}(A)=0$, so that $\rho^{S}_{*}(F^{S}_{*}(A))=0$ and therefore $F^{S}_{*}(A)\in V^{S}\rho$.

Next we prove that each of (ii) and (iii) implies (i). Choose local coordinates $(t,x^{i})$ around $(t_{0},q)\in\mathbb{R}\times M$ and $(s,y^{j})$ around $F(t_{0},q)$. Suppose $F$ has the local expression $F=(F^{0},\bar{F}^{j})$. Let $A\in T\mathbb{R}\times\mathcal{T}^{S}M|_{(t_{0},q)}$ have the following local expression:

A=A^{0}\frac{\partial}{\partial{t}}\bigg|_{t_{0}}+A^{i}\frac{\partial}{\partial{x^{i}}}\bigg|_{q}+A^{jk}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg|_{q}. (A.1)

Then, Lemma 5.6 yields

\begin{split}F^{S}_{*}A=&\ (AF^{0})\frac{\partial}{\partial s}\bigg|_{F^{0}(t_{0},q)}+(A\bar{F}^{i})\frac{\partial}{\partial y^{i}}\bigg|_{\bar{F}(t_{0},q)}+\Gamma_{A}(F^{0},F^{0})\frac{\partial^{2}}{\partial s^{2}}\bigg|_{F^{0}(t_{0},q)}\\ &\ +\Gamma_{A}(\bar{F}^{i},\bar{F}^{j})\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg|_{\bar{F}(t_{0},q)}+2\Gamma_{A}(F^{0},\bar{F}^{i})\frac{\partial^{2}}{\partial s\partial y^{i}}\bigg|_{F(t_{0},q)}.\end{split}

If (ii) holds, then $F^{S}_{*}(A)\in T\mathbb{R}\times\mathcal{T}^{S}N|_{F(t_{0},q)}$. It then follows that

\Gamma_{A}(F^{0},F^{0})=A^{jk}\frac{\partial F^{0}}{\partial x^{j}}\frac{\partial F^{0}}{\partial x^{k}}=0\quad\text{and}\quad\Gamma_{A}(F^{0},\bar{F}^{i})=A^{jk}\frac{\partial F^{0}}{\partial x^{j}}\frac{\partial\bar{F}^{i}}{\partial x^{k}}=0. (A.2)

Since $A$ is arbitrary, we know that $\frac{\partial F^{0}}{\partial x^{i}}=0$ for all $i$. Then, by the connectedness of $M$, $F^{0}$ is independent of $q\in M$. This implies that $F$ is a bundle homomorphism. Now assume that $A\in V^{S}\pi|_{(t_{0},q)}$ has the local expression (A.1) with $A^{0}=0$. If (iii) holds, then $F^{S}_{*}(A)\in V^{S}\rho|_{F(t_{0},q)}$. This amounts to (A.2) together with

AF^{0}=A^{i}\frac{\partial F^{0}}{\partial x^{i}}+A^{jk}\frac{\partial^{2}F^{0}}{\partial x^{j}\partial x^{k}}=0.

Again, the arbitrariness of $A$ yields that $\frac{\partial F^{0}}{\partial x^{i}}=0$ for all $i$. Thus, $F$ is a bundle homomorphism. ∎
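To see concretely how (ii) and (iii) fail for a non-projectable map, consider the following one-dimensional sketch. The choices $M=N=\mathbb{R}$, $F(t,x)=(t+x,x)$ and $A=\partial^{2}/\partial x^{2}$ are ours, made only for illustration; the coefficients are computed from the local expression of $F^{S}_{*}A$ above with $\Gamma_{A}$ as in (A.2).

```latex
% Illustrative choice (ours): M = N = R, F(t,x) = (t + x, x), A = \partial^2/\partial x^2,
% i.e. A^0 = A^1 = 0 and A^{11} = 1 in (A.1).
\[
\begin{aligned}
&F^{0}(t,x)=t+x,\quad \bar{F}(t,x)=x,\quad \frac{\partial F^{0}}{\partial x}=1\neq 0,\\
&AF^{0}=0,\quad A\bar{F}=0,\quad
\Gamma_{A}(F^{0},F^{0})=\Gamma_{A}(F^{0},\bar{F})=\Gamma_{A}(\bar{F},\bar{F})=1,\\
&F^{S}_{*}A=\frac{\partial^{2}}{\partial s^{2}}+2\frac{\partial^{2}}{\partial s\partial y}+\frac{\partial^{2}}{\partial y^{2}},
\qquad\text{consistent with}\quad
\frac{\partial^{2}}{\partial x^{2}}\big[f(t+x,x)\big]=\left(\partial_{s}^{2}f+2\partial_{s}\partial_{y}f+\partial_{y}^{2}f\right)(t+x,x).
\end{aligned}
\]
```

Here $A$ belongs to $V^{S}\pi$, yet its image carries nonzero $\partial^{2}/\partial s^{2}$ and $\partial^{2}/\partial s\partial y$ components, so it lies neither in $T\mathbb{R}\times\mathcal{T}^{S}N$ nor in $V^{S}\rho$. This matches the lemma, since $F^{0}$ depends on $x$ and $F$ is therefore not a bundle homomorphism.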

It is easy to deduce from the proof that if $F=(F^{0},\bar{F})$ is a bundle homomorphism from $\pi$ to $\rho$, then $F^{S}_{*}|_{T\mathbb{R}\times\mathcal{T}^{S}M}$ is a bundle homomorphism from $\tau_{\mathbb{R}}\times\tau^{S}_{M}$ to $\tau_{\mathbb{R}}\times\tau^{S}_{N}$.

When $F:\mathbb{R}\times M\to\mathbb{R}\times N$ is a diffeomorphism, we can also consider the second-order pullback map $F^{S*}$, which is a bundle homomorphism from $\tau^{S*}_{\mathbb{R}\times N}$ to $\tau^{S*}_{\mathbb{R}\times M}$. But when we restrict $F^{S*}$ to the mixed-order cotangent bundle $T^{*}\mathbb{R}\times\mathcal{T}^{S*}N$, there are difficulties. We can check that even if $F$ is a bundle homomorphism, $F^{S*}$ does not necessarily map $T^{*}\mathbb{R}\times\mathcal{T}^{S*}N$ into $T^{*}\mathbb{R}\times\mathcal{T}^{S*}M$. The reason is basically that the restrictions of second-order pullbacks to the cotangent bundle do not coincide with usual pullbacks. To overcome this, we consider the dual map of $F^{S}_{*}|_{T\mathbb{R}\times\mathcal{T}^{S}M}$. This motivates the following definition, which contrasts with Definitions 5.5 and 5.7.

Definition A.2 (Mixed-order pushforward and pullback).

Let $F$ be a bundle homomorphism from $(\mathbb{R}\times M,\pi,\mathbb{R})$ to $(\mathbb{R}\times N,\rho,\mathbb{R})$. The mixed-order tangent map of $F$ at $(t,q)\in\mathbb{R}\times M$ is the linear map $d^{\circ}F_{(t,q)}:T\mathbb{R}\times\mathcal{T}^{S}M|_{(t,q)}\to T\mathbb{R}\times\mathcal{T}^{S}N|_{F(t,q)}$ defined by

d^{\circ}F_{(t,q)}=d^{2}F_{(t,q)}|_{T_{t}\mathbb{R}\times\mathcal{T}^{S}_{q}M}.

The mixed-order cotangent map of $F$ at $(t,q)\in\mathbb{R}\times M$ is the linear map $d^{\circ}F^{*}_{(t,q)}:T^{*}\mathbb{R}\times\mathcal{T}^{S*}N|_{F(t,q)}\to T^{*}\mathbb{R}\times\mathcal{T}^{S*}M|_{(t,q)}$ dual to $d^{\circ}F_{(t,q)}$, that is,

d^{\circ}F^{*}_{(t,q)}(\alpha)(A)=\alpha(d^{\circ}F_{(t,q)}(A)),\quad\text{for }A\in T_{t}\mathbb{R}\times\mathcal{T}^{S}_{q}M,\ \alpha\in T^{*}\mathbb{R}\times\mathcal{T}^{S*}N|_{F(t,q)}.

The mixed-order pushforward by $F$ is the bundle homomorphism $F^{R}_{*}:(T\mathbb{R}\times\mathcal{T}^{S}M,\tau_{\mathbb{R}}\times\tau^{S}_{M},\mathbb{R}\times M)\to(T\mathbb{R}\times\mathcal{T}^{S}N,\tau_{\mathbb{R}}\times\tau^{S}_{N},\mathbb{R}\times N)$ defined by

F^{R}_{*}|_{T_{t}\mathbb{R}\times\mathcal{T}_{q}^{S}M}=d^{\circ}F_{(t,q)}.

Given a mixed-order form $\alpha$ on $\mathbb{R}\times N$, the mixed-order pullback of $\alpha$ by $F$ is the mixed-order form $F^{R*}\alpha$ on $\mathbb{R}\times M$ defined by

(F^{R*}\alpha)_{(t,q)}=d^{\circ}F^{*}_{(t,q)}\left(\alpha_{F(t,q)}\right),\quad(t,q)\in\mathbb{R}\times M.

If, moreover, $F$ is a bundle isomorphism, then the mixed-order pullback by $F$ is the bundle isomorphism $F^{R*}:(T^{*}\mathbb{R}\times\mathcal{T}^{S*}N,\tau_{\mathbb{R}}\times\tau^{S*}_{N},\mathbb{R}\times N)\to(T^{*}\mathbb{R}\times\mathcal{T}^{S*}M,\tau_{\mathbb{R}}\times\tau^{S*}_{M},\mathbb{R}\times M)$ defined by

F^{R*}|_{T^{*}_{s}\mathbb{R}\times\mathcal{T}_{q^{\prime}}^{S*}N}=d^{\circ}F^{*}_{F^{-1}(s,q^{\prime})}.

Given a mixed-order vector field $A$ on $\mathbb{R}\times M$, the mixed-order pushforward of $A$ by $F$ is the mixed-order vector field $F^{R}_{*}A$ on $\mathbb{R}\times N$ defined by

(F^{R}_{*}A)_{(s,q^{\prime})}=d^{\circ}F_{F^{-1}(s,q^{\prime})}\left(A_{F^{-1}(s,q^{\prime})}\right),\quad(s,q^{\prime})\in\mathbb{R}\times N.

Clearly, the mixed-order pushforward FRF^{R}_{*} is nothing but FS|T×𝒯SMF^{S}_{*}|_{T\mathbb{R}\times\mathcal{T}^{S}M}. Write F=(F0,F¯)F=(F^{0},\bar{F}). Then, in local coordinates, FRF^{R}_{*} acts on AA of (A.1) as follows:

FRA=A0dF0dt(t0)s|F0(t0)+(AF¯i)(t0,q)yi|F¯(t0,q)+AklF¯ixkF¯jxl(t0,q)2yiyj|F¯(t0,q)=A0dF0dt(t0)s|F0(t0)+[A0F¯it(t0,q)+AjF¯ixj(t0,q)+Ajk2F¯ixjxk(t0,q)]yi|F¯(t0,q)+AklF¯ixkF¯jxl(t0,q)2yiyj|F¯(t0,q).\begin{split}F^{R}_{*}A=&\ A^{0}\frac{dF^{0}}{dt}(t_{0})\frac{\partial}{\partial s}\bigg{|}_{F^{0}(t_{0})}+(A\bar{F}^{i})(t_{0},q)\frac{\partial}{\partial y^{i}}\bigg{|}_{\bar{F}(t_{0},q)}+A^{kl}\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}(t_{0},q)\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg{|}_{\bar{F}(t_{0},q)}\\ =&\ A^{0}\frac{dF^{0}}{dt}(t_{0})\frac{\partial}{\partial s}\bigg{|}_{F^{0}(t_{0})}+\left[A^{0}\frac{\partial\bar{F}^{i}}{\partial t}(t_{0},q)+A^{j}\frac{\partial\bar{F}^{i}}{\partial x^{j}}(t_{0},q)+A^{jk}\frac{\partial^{2}\bar{F}^{i}}{\partial x^{j}\partial x^{k}}(t_{0},q)\right]\frac{\partial}{\partial y^{i}}\bigg{|}_{\bar{F}(t_{0},q)}\\ &\ +A^{kl}\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}(t_{0},q)\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg{|}_{\bar{F}(t_{0},q)}.\end{split} (A.3)

And $F^{R*}$ acts on the mixed-order cotangent vector $\alpha=\alpha_{0}ds|_{F^{0}(t_{0})}+\alpha_{i}d^{2}y^{i}|_{\bar{F}(t_{0},q)}+\alpha_{ij}dy^{i}\cdot dy^{j}|_{\bar{F}(t_{0},q)}\in T^{*}\mathbb{R}\times\mathcal{T}^{S*}N|_{F(t_{0},q)}$ by

\begin{split}F^{R*}\alpha=&\ \alpha_{0}\frac{dF^{0}}{dt}(t_{0})dt|_{t_{0}}+\alpha_{i}d^{\circ}\bar{F}^{i}|_{(t_{0},q)}+\alpha_{ij}\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}(t_{0},q)dx^{k}\cdot dx^{l}|_{q}\\
=&\ \left[\alpha_{0}\frac{dF^{0}}{dt}(t_{0})+\alpha_{i}\frac{\partial\bar{F}^{i}}{\partial t}(t_{0},q)\right]dt|_{t_{0}}+\alpha_{i}\frac{\partial\bar{F}^{i}}{\partial x^{j}}(t_{0},q)d^{2}x^{j}|_{q}\\
&\ +\left[\frac{\alpha_{i}}{2}\frac{\partial^{2}\bar{F}^{i}}{\partial x^{k}\partial x^{l}}(t_{0},q)+\alpha_{ij}\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}(t_{0},q)\right]dx^{k}\cdot dx^{l}|_{q}.\end{split} (A.4)

By virtue of these local expressions, one easily deduce that

FR|Tt×𝒯SqM=F(q)|Tt×F¯(t)S|𝒯SqM,FR|𝒯s×𝒯SqN=F(q)|Ts×F¯(t)S|𝒯SqN.F^{R}_{*}|_{T_{t}\mathbb{R}\times\mathcal{T}^{S}_{q}M}=F(q)_{*}|_{T_{t}\mathbb{R}}\times\bar{F}(t)^{S}_{*}|_{\mathcal{T}^{S}_{q}M},\quad F^{R*}|_{\mathcal{T}^{*}_{s}\mathbb{R}\times\mathcal{T}^{S*}_{q^{\prime}}N}=F(q)^{*}|_{T^{*}_{s}\mathbb{R}}\times\bar{F}(t)^{S*}|_{\mathcal{T}^{S*}_{q^{\prime}}N}.

And in turn, these verify the linearity of FRF^{R}_{*} and FRF^{R*}. The following property is easy to check.

Lemma A.3.

Let $F$ be a bundle isomorphism from $(\mathbb{R}\times M,\pi,\mathbb{R})$ to $(\mathbb{R}\times N,\rho,\mathbb{R})$ and $A$ be a mixed-order vector field. Let $f$ be a smooth function on $\mathbb{R}\times N$. Then $((F^{R}_{*}A)f)\circ F=A(f\circ F)$.

A.2 Pushforwards of generators

A smooth map $F:M\to N$ can be associated naturally with a bundle homomorphism $\mathbf{Id}_{\mathbb{R}}\times F:(\mathbb{R}\times M,\pi,\mathbb{R})\to(\mathbb{R}\times N,\rho,\mathbb{R})$ that projects to the identity on $\mathbb{R}$. In this case, the pushforward of a diffusion $X$ by $\mathbf{Id}_{\mathbb{R}}\times F$ is just $(\mathbf{Id}_{\mathbb{R}}\times F)\cdot X=F(X)$. The stochastic prolongation of the bundle homomorphism $\mathbf{Id}_{\mathbb{R}}\times F$ is then

\[j(\mathbf{Id}_{\mathbb{R}}\times F)(j_{(t,q)}X)=j_{(t,F(q))}(F(X)).\]
Corollary A.4.

Let $F:M\to N$ be a diffeomorphism. If a diffusion $X$ on $M$ has a generator $A=(A_{t})$, then the process $F(X)$ is a diffusion on $N$, with generator $F^{S}_{*}A=(F^{S}_{*}A_{t})$.

Proof.

Assume $X\in I_{t_{0}}(M)$. For every $f\in C^{\infty}(N)$ we have $f\circ F\in C^{\infty}(M)$, so by assumption the process

\[\begin{split}&\ f\circ F(X(t))-f\circ F(X(t_{0}))-\int_{t_{0}}^{t}A_{s}(f\circ F)(X(s))ds\\
=&\ f(F(X(t)))-f(F(X(t_{0})))-\int_{t_{0}}^{t}\left((F^{S}_{*}A_{s})f\right)(F(X(s)))ds\end{split}\]

is a real-valued continuous $\{\mathcal{P}_{t}\}$-martingale. This proves that $F(X)\in I_{t_{0}}(N)$ has generator $F^{S}_{*}A$. ∎
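As a quick sanity check of Corollary A.4, the following worked example (not part of the original text) may help. It assumes $M=\mathbb{R}$, $N=(0,\infty)$, the diffeomorphism $F(x)=e^{x}$, and a standard Brownian motion $X$ with generator $A=\frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}$, so that $A^{1}=0$ and $A^{11}=\frac{1}{2}$ in the convention $A=A^{i}\frac{\partial}{\partial x^{i}}+A^{ij}\frac{\partial^{2}}{\partial x^{i}\partial x^{j}}$. The coordinate formula used is the time-independent part of (A.3).

```latex
% Illustrative assumptions: X is a standard Brownian motion on M = \mathbb{R},
% F(x) = e^x maps M diffeomorphically onto N = (0,\infty), and y = F(x).
\begin{align*}
F^{S}_{*}A
 &= \big[(AF)\circ F^{-1}\big]\frac{\partial}{\partial y}
  + \big[\big(A^{11}(F')^{2}\big)\circ F^{-1}\big]\frac{\partial^{2}}{\partial y^{2}}
  = \frac{1}{2}\,y\,\frac{\partial}{\partial y}
  + \frac{1}{2}\,y^{2}\,\frac{\partial^{2}}{\partial y^{2}}, \\
A(f\circ F)(x)
 &= \frac{1}{2}\frac{d^{2}}{dx^{2}}f(e^{x})
  = \frac{1}{2}e^{x}f'(e^{x})+\frac{1}{2}e^{2x}f''(e^{x})
  = \big((F^{S}_{*}A)f\big)(F(x)).
\end{align*}
```

The resulting operator is the generator of geometric Brownian motion, in agreement with Itô's formula $d(e^{W})=e^{W}dW+\frac{1}{2}e^{W}dt$.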

This corollary, together with the identification between $\mathbb{R}\times\mathcal{T}^{S}M$ and $\mathbb{R}\times\mathcal{T}^{E}M$ in (3.6) and (3.7), gives rise to the relation between prolongations and pushforwards as follows:

\[\begin{split}j(\mathbf{Id}_{\mathbb{R}}\times F)(t,A_{q})&=j(\mathbf{Id}_{\mathbb{R}}\times F)(j_{(t,q)}X^{A})=j_{(t,F(q))}(F\circ X^{A})=\left(t,(F^{S}_{*}A_{t})_{F(q)}\right)\\
&=\left(t,d^{2}F_{q}(A_{(t,q)})\right)=\left(t,d^{2}F_{q}(A_{q})\right)=(\mathbf{Id}_{\mathbb{R}}\times F^{S}_{*})(t,A_{q}),\end{split}\]

so that $j(\mathbf{Id}_{\mathbb{R}}\times F)=\mathbf{Id}_{\mathbb{R}}\times F^{S}_{*}$.

The following corollary is an extension of Corollary A.4 and a straightforward consequence of Lemma 4.8. Here, we will present another proof, using notions of Appendix A.1.

Corollary A.5.

Let $F$ be a bundle isomorphism from $(\mathbb{R}\times M,\pi,\mathbb{R})$ to $(\mathbb{R}\times N,\rho,\mathbb{R})$ projecting to $F^{0}$. If $X$ is a diffusion on $M$ with respect to $\{\mathcal{P}_{t}\}$ and has an extended generator $\frac{\partial}{\partial t}+A$, where $A$ is a time-dependent second-order vector field, then the pushforward $F\cdot X$ is a diffusion on $N$ with respect to $\{\mathcal{F}_{(F^{0})^{-1}(s)}\}$, with extended generator

\[\frac{d(F^{0})^{-1}}{ds}F^{R}_{*}\left(\frac{\partial}{\partial t}+A\right).\]
Proof.

Assume that $X\in I_{t_{0}}(M)$ and $F=(F^{0},\bar{F})$. For every $f\in C^{\infty}(\mathbb{R}\times N)$, Lemma A.3 yields that the process

\[\begin{split}&\ f\circ F(t,X(t))-f\circ F(t_{0},X(t_{0}))-\int_{t_{0}}^{t}\left(\frac{\partial}{\partial t}+A\right)(f\circ F)(u,X(u))du\\
=&\ f\left(F^{0}(t),\bar{F}(t,X(t))\right)-f\left(F^{0}(t_{0}),\bar{F}(t_{0},X(t_{0}))\right)-\int_{t_{0}}^{t}F^{R}_{*}\left(\frac{\partial}{\partial t}+A\right)f\left(F^{0}(u),\bar{F}(u,X(u))\right)du\end{split}\]

is a continuous $\{\mathcal{P}_{t}\}$-martingale. Denote $s_{0}=F^{0}(t_{0})$. We substitute $t=(F^{0})^{-1}(s)$, which is possible because $F^{0}$ is an isomorphism, change variables by $u=(F^{0})^{-1}(v)$, and recall that $F\cdot X(s)=\bar{F}\left((F^{0})^{-1}(s),X((F^{0})^{-1}(s))\right)$. Then the process

\[\begin{split}&\ f(s,F\cdot X(s))-f(s_{0},F\cdot X(s_{0}))-\int_{(F^{0})^{-1}(s_{0})}^{(F^{0})^{-1}(s)}F^{R}_{*}\left(\frac{\partial}{\partial t}+A\right)f\left(F^{0}(u),\bar{F}(u,X(u))\right)du\\
=&\ f(s,F\cdot X(s))-f(s_{0},F\cdot X(s_{0}))-\int_{s_{0}}^{s}\frac{d(F^{0})^{-1}}{ds}(v)F^{R}_{*}\left(\frac{\partial}{\partial t}+A\right)f(v,F\cdot X(v))dv\end{split}\]

is a continuous $\{\mathcal{F}_{(F^{0})^{-1}(s)}\}$-martingale. The result follows. ∎

Remark A.6.

(i) As a consequence, the generator of the pushforward $F\cdot X$ is given in local coordinates by

\[\frac{d(F^{0})^{-1}}{ds}\left[\left(\frac{\partial}{\partial t}+A\right)\bar{F}^{i}\circ F^{-1}\right]\frac{\partial}{\partial y^{i}}+\frac{d(F^{0})^{-1}}{ds}\left[\left(A^{kl}\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}\right)\circ F^{-1}\right]\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}.\]

This coincides with Lemma 4.8.

(ii) This corollary together with Lemma A.1 indicates that the bundle homomorphisms from $\mathbb{R}\times M$ to $\mathbb{R}\times N$ are the only (deterministic) smooth maps between them that map diffusions to diffusions. Indeed, if a smooth map $F$ from $\mathbb{R}\times M$ to $\mathbb{R}\times N$ pushes forward a diffusion to another diffusion, then an argument similar to that of Corollary A.5 implies that $F^{S}_{*}$ would map the extended generator of the former diffusion to that of the latter, whereas Lemma A.1 says that such an $F^{S}_{*}$ must be the second-order pushforward of some bundle homomorphism.

(iii) In particular, if $F$ is a smooth map from $M$ to $N$ and $X$ is a diffusion on $M$ with generator $A$, then $F(X)$ is a diffusion on $N$ with respect to the same filtration, with generator $F^{S}_{*}(A)$.
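To illustrate Remark A.6.(i), here is a worked example that is not taken from the paper: the scaling invariance of Brownian motion. It assumes $M=N=\mathbb{R}$, a standard Brownian motion $X$ with extended generator $\frac{\partial}{\partial t}+\frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}$, and the bundle isomorphism $F(t,x)=(ct,\sqrt{c}\,x)$ with a constant $c>0$; all of these choices are illustrative.

```latex
% Illustrative assumptions: A = (1/2) d^2/dx^2, hence A^{11} = 1/2;
% F^0(t) = c t and \bar F(t,x) = \sqrt{c}\, x with a constant c > 0.
\begin{align*}
\frac{d(F^{0})^{-1}}{ds} &= \frac{1}{c}, \qquad
\Big(\frac{\partial}{\partial t}+A\Big)\bar{F} = 0, \qquad
A^{11}\Big(\frac{\partial\bar{F}}{\partial x}\Big)^{2} = \frac{c}{2}, \\
\text{generator of } F\cdot X
 &= \frac{1}{c}\cdot 0\cdot\frac{\partial}{\partial y}
  + \frac{1}{c}\cdot\frac{c}{2}\,\frac{\partial^{2}}{\partial y^{2}}
  = \frac{1}{2}\frac{\partial^{2}}{\partial y^{2}}.
\end{align*}
```

Since $F\cdot X(s)=\sqrt{c}\,X(s/c)$ here, the computation recovers the familiar fact that the rescaled process is again a Brownian motion.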

A.3 Pushforwards and pullbacks by diffusions

Definition A.7 (Pushforwards and pullbacks by diffusions).

Let $X$ be an $M$-valued diffusion process. Let $(\mathbb{R}\times U,(t,x^{i}))$ be a coordinate chart on $\mathbb{R}\times M$. The pushforward map $X_{*}$ from $T_{t}\mathbb{R}$ to $T_{t}\mathbb{R}\times\mathcal{T}^{S}_{X(t)}M$ is defined in local coordinates by

\[X_{*}\left(\tau\frac{d}{dt}\bigg|_{t_{0}}\right)=\tau\left(\frac{\partial}{\partial t}\bigg|_{t_{0}}+(DX)^{i}(t_{0})\frac{\partial}{\partial x^{i}}\bigg|_{X(t_{0})}+\frac{1}{2}(QX)^{jk}(t_{0})\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg|_{X(t_{0})}\right).\tag{A.5}\]

The pullback map $X^{*}$ from $\mathcal{T}^{*}_{t}\mathbb{R}\times\mathcal{T}^{S*}_{X(t)}M$ to $\mathcal{T}^{*}_{t}\mathbb{R}$ is defined by

\[X^{*}\left(\alpha_{0}dt|_{t_{0}}+\alpha_{i}d^{2}x^{i}|_{X(t_{0})}+\tfrac{1}{2}\alpha_{jk}dx^{j}\cdot dx^{k}|_{X(t_{0})}\right)=\left(\alpha_{0}+\alpha_{i}(DX)^{i}(t_{0})+\tfrac{1}{2}\alpha_{jk}(QX)^{jk}(t_{0})\right)dt|_{t_{0}}.\tag{A.6}\]
Remark A.8.

Recall that in classical differential geometry, the pushforward by a smooth curve $\gamma=(\gamma(t))_{t\in[-1,1]}$ on $M$ is a map $\gamma_{*}:T\mathbb{R}\to TM$ given by $\gamma_{*}(\frac{d}{dt}|_{t_{0}})=\dot{\gamma}^{i}(t_{0})\frac{\partial}{\partial x^{i}}|_{\gamma(t_{0})}$. If instead we regard the graph of $\gamma$ as a section of the trivial bundle $(\mathbb{R}\times M,\pi,\mathbb{R})$, denoted by $\bar{\gamma}$, then the pushforward map by $\bar{\gamma}$ is $\bar{\gamma}_{*}(\frac{d}{dt}|_{t_{0}})=\frac{d}{dt}|_{t_{0}}+\dot{\gamma}^{i}(t_{0})\frac{\partial}{\partial x^{i}}|_{\gamma(t_{0})}$. For this reason, it would be more appropriate to call $X_{*}$ and $X^{*}$ in Definition A.7 the pushforward and pullback by the graph of $X$, or by the random section corresponding to $X$, instead of by $X$ itself. But we avoid that for simplicity.

One can see from the definition that the pushforward $X_{*}$ maps the time vector $\frac{d}{dt}|_{t_{0}}$ to the value of the extended generator of $X$ at $(t_{0},X(t_{0}))$. There is an informal way to look at the pullback map $X^{*}$: one first replaces all $x$'s by $X$'s in the brackets on the LHS of (A.6) and obtains

\[\alpha_{0}dt+\alpha_{i}dX^{i}+\tfrac{1}{2}\alpha_{jk}dX^{j}\cdot dX^{k};\]

then, substituting for $dX^{i}$ and $dX^{j}\cdot dX^{k}$ according to Itô's calculus,

\[dX^{i}=(DX)^{i}dt+\text{martingale part},\qquad dX^{j}\cdot dX^{k}=(QX)^{jk}dt,\]

and discarding the martingale part, one gets the RHS of (A.6).
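The informal recipe above can be tried on a one-dimensional example, which is an illustration and not part of the original text. It assumes that $X$ solves $dX=b\,dt+\sigma\,dW$ on $\mathbb{R}$ with constants $b$ and $\sigma$, so that $DX=b$ and $QX=\sigma^{2}$ under the usual meaning of the mean derivatives, and it uses the coordinate form of the mixed differential $d^{\circ}f$ that can be read off from (A.4).

```latex
% Illustrative assumptions: dX = b dt + \sigma dW on \mathbb{R}, so DX = b and QX = \sigma^2.
% The coordinate form of d^\circ f below is the one read off from (A.4).
\begin{align*}
d^{\circ}f &= \frac{\partial f}{\partial t}\,dt
            + \frac{\partial f}{\partial x}\,d^{2}x
            + \frac{1}{2}\frac{\partial^{2}f}{\partial x^{2}}\,dx\cdot dx, \\
X^{*}(d^{\circ}f) &= \Big(\frac{\partial f}{\partial t}
            + b\,\frac{\partial f}{\partial x}
            + \frac{1}{2}\sigma^{2}\frac{\partial^{2}f}{\partial x^{2}}\Big)(t_{0},X(t_{0}))\,dt|_{t_{0}}.
\end{align*}
```

The coefficient of $dt$ is the drift term in Itô's formula for $f(t,X(t))$, that is, the extended generator of $X$ applied to $f$.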

The following corollary is straightforward. We will see that pushforward and pullback maps by diffusions are also closely related to the concept of “total derivatives”.

Corollary A.9.

(i). Let $X$ be an $M$-valued diffusion process. For all $\tau\frac{d}{dt}|_{t_{0}}\in\mathcal{T}_{t_{0}}\mathbb{R}$ and $\alpha\in\mathcal{T}_{t_{0}}^{*}\mathbb{R}\times\mathcal{T}_{X(t_{0})}^{S*}M$,

\[\left\langle X^{*}\left(\alpha\right),\tau\tfrac{d}{dt}|_{t_{0}}\right\rangle=\left\langle\alpha,X_{*}(\tau\tfrac{d}{dt}|_{t_{0}})\right\rangle.\tag{A.7}\]

(ii). If $X\in I_{(t_{0},q)}(M)$, $f$ is a smooth function on $\mathbb{R}\times M$ and $g$ a smooth function on $M$, then

\begin{align*}
\left\langle X^{*}(d^{\circ}f),\tfrac{d}{dt}\right\rangle\big|_{t_{0}} &=X_{*}(\tfrac{d}{dt})(f)\big|_{(t_{0},q)}=(\mathbf{D}_{t}f)(j_{(t_{0},q)}X)=\langle\tfrac{\partial}{\partial t}+A^{X},d^{\circ}f\rangle(t_{0},q),\\
\left\langle X^{*}(dg\cdot dg),\tfrac{d}{dt}\right\rangle\big|_{t_{0}} &=\left\langle dg\cdot dg,X_{*}(\tfrac{d}{dt})\right\rangle\big|_{(t_{0},q)}=(\mathbf{Q}_{t}g)(j_{(t_{0},q)}X).
\end{align*}

(iii). Let $X,Y$ be $M$-valued diffusion processes satisfying $X(t)=Y(t)$ a.s. Then, $j_{t}X=j_{t}Y$ a.s. if and only if $X_{*}(\frac{d}{dt}|_{t})=Y_{*}(\frac{d}{dt}|_{t})$ a.s. In particular, if $X,Y\in I_{(t,q)}(M)$, then $j_{(t,q)}X=j_{(t,q)}Y$ if and only if $X_{*}(\frac{d}{dt}|_{t})=Y_{*}(\frac{d}{dt}|_{t})$.
(iv). Let $F$ be a bundle homomorphism from $(\mathbb{R}\times M,\pi,\mathbb{R})$ to $(\mathbb{R}\times N,\rho,\mathbb{R})$ projecting to $F^{0}$, and $X$ be an $M$-valued diffusion process. Then $F^{R}_{*}\circ X_{*}=(F\cdot X)_{*}\circ(F^{0})_{*}$.
(v). Let $F$ be a smooth function from $M$ to $M$, and $X$ be an $M$-valued diffusion process. Then $(\mathbf{Id}_{T\mathbb{R}}\times F^{S}_{*})\circ X_{*}=(F\circ X)_{*}$.

Proof.

Assertions (i), (ii) and (iii) are easy to deduce from the definitions, and (v) is the special case of (iv) for the bundle homomorphism $\mathbf{Id}_{\mathbb{R}}\times F$. We prove (iv) using local expressions. Assume that $F=(F^{0},\bar{F})$ and denote $\tilde{X}=F\cdot X$. Recall that $\tilde{X}(F^{0}(t))=\bar{F}(t,X(t))$. Then

\[\begin{split}F^{R}_{*}\circ X_{*}\left(\frac{d}{dt}\bigg|_{t}\right)=&\ \frac{dF^{0}}{dt}(t)\frac{\partial}{\partial s}\bigg|_{F^{0}(t)}+\bigg[\frac{\partial\bar{F}^{i}}{\partial t}(t,X(t))+(DX)^{j}(t)\frac{\partial\bar{F}^{i}}{\partial x^{j}}(t,X(t))\\
&\ +\frac{1}{2}(QX)^{jk}(t)\frac{\partial^{2}\bar{F}^{i}}{\partial x^{j}\partial x^{k}}(t,X(t))\bigg]\frac{\partial}{\partial y^{i}}\bigg|_{\bar{F}(t,X(t))}+\frac{1}{2}(QX)^{kl}(t)\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}(t,X(t))\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg|_{\bar{F}(t,X(t))}\\
=&\ \frac{dF^{0}}{dt}(t)\Bigg[\frac{\partial}{\partial s}\bigg|_{F^{0}(t)}+(D\tilde{X})^{i}(F^{0}(t))\frac{\partial}{\partial y^{i}}\bigg|_{\tilde{X}(F^{0}(t))}+\frac{1}{2}(Q\tilde{X})^{ij}(F^{0}(t))\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg|_{\tilde{X}(F^{0}(t))}\Bigg]\\
=&\ \frac{dF^{0}}{dt}(t)(F\cdot X)_{*}\left(\frac{\partial}{\partial s}\bigg|_{F^{0}(t)}\right)\\
=&\ (F\cdot X)_{*}\circ(F^{0})_{*}\left(\frac{d}{dt}\bigg|_{t}\right).\end{split}\]

The result follows. ∎

A.4 Lie derivatives

Definition A.10 (Lie derivatives).

Let $V$ be a vector field on $M$ and $\psi=\{\psi_{\epsilon}\}_{\epsilon\in\mathbb{R}}$ be its flow. Let $A$ be a second-order vector field and $\alpha$ be a second-order form on $M$. The Lie derivative of $A$ with respect to $V$ is a second-order vector field on $M$, denoted by $\mathcal{L}_{V}A$, and defined by

\[(\mathcal{L}_{V}A)_{q}=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}(\psi_{-\epsilon})^{S}_{*}(A_{\psi_{\epsilon}(q)})=\lim_{\epsilon\to 0}\frac{(\psi_{-\epsilon})^{S}_{*}(A_{\psi_{\epsilon}(q)})-A_{q}}{\epsilon}.\]

The Lie derivative of $\alpha$ with respect to $V$ is a second-order form on $M$, denoted by $\mathcal{L}_{V}\alpha$, and defined by

\[(\mathcal{L}_{V}\alpha)_{q}=\frac{d}{d\epsilon}\bigg|_{\epsilon=0}(\psi_{\epsilon})^{S*}(\alpha_{\psi_{\epsilon}(q)})=\lim_{\epsilon\to 0}\frac{(\psi_{\epsilon})^{S*}(\alpha_{\psi_{\epsilon}(q)})-\alpha_{q}}{\epsilon}.\]

For sufficiently small $\epsilon\neq 0$, $\psi_{\epsilon}$ is defined in a neighborhood of $q\in M$ and $\psi_{-\epsilon}$ is the inverse of $\psi_{\epsilon}$, so the difference quotients in the above definitions of Lie derivatives make sense. It is easy to verify that the derivatives exist for each $q\in M$, that $\mathcal{L}_{V}A$ is a smooth second-order vector field, and that $\mathcal{L}_{V}\alpha$ is a smooth second-order covector field. Likewise, the restrictions of $\mathcal{L}_{V}$ to $\mathcal{T}_{q}M$ and $\mathcal{T}^{*}_{q}M$ coincide with the classical Lie derivatives. In the following, we will seek properties of $\mathcal{L}$. Some of them can be found in [66, Section 6.(d)].

Lemma A.11.

Let $V$ be a vector field and $f$ be a smooth function. Let $A$ and $\alpha$ be a second-order vector field and second-order form, respectively. Then
(i) $\mathcal{L}_{V}A=[V,A]$, where the RHS denotes the commutator of $V$ and $A$ as linear operators;
(ii) $\mathcal{L}_{V}(fA)=(Vf)A+f\mathcal{L}_{V}A$;
(iii) $\langle\mathcal{L}_{V}\alpha,A\rangle=V(\langle\alpha,A\rangle)-\langle\alpha,\mathcal{L}_{V}A\rangle$;
(iv) $\mathcal{L}_{V}(f\alpha)=(Vf)\alpha+f\mathcal{L}_{V}\alpha$;
(v) $\mathcal{L}_{V}(d^{2}f)=d^{2}(Vf)$.

Remark A.12.

Note that the commutator $[V,A]$ is a second-order vector field. Indeed, if $V$ and $A$ have coordinate expressions $V=V^{i}\frac{\partial}{\partial x^{i}}$ and $A=A^{i}\frac{\partial}{\partial x^{i}}+A^{ij}\frac{\partial^{2}}{\partial x^{i}\partial x^{j}}$, then the following local expression for $[V,A]$ is easy to verify:

\[\begin{split}[V,A]=&\ \left(V^{j}\frac{\partial A^{i}}{\partial x^{j}}-A^{j}\frac{\partial V^{i}}{\partial x^{j}}-A^{jk}\frac{\partial^{2}V^{i}}{\partial x^{j}\partial x^{k}}\right)\frac{\partial}{\partial x^{i}}+V^{i}\frac{\partial A^{jk}}{\partial x^{i}}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\\
&\ -A^{jk}\left(\frac{\partial V^{i}}{\partial x^{j}}\frac{\partial^{2}}{\partial x^{i}\partial x^{k}}+\frac{\partial V^{i}}{\partial x^{k}}\frac{\partial^{2}}{\partial x^{i}\partial x^{j}}\right).\end{split}\]
Proof.

(i) For a function $f\in C^{\infty}(M)$,

\[\begin{split}(\mathcal{L}_{V}A)_{q}f&=\lim_{\epsilon\to 0}\frac{(\psi_{-\epsilon})^{S}_{*}(A_{\psi_{\epsilon}(q)})f-A_{q}f}{\epsilon}=\lim_{\epsilon\to 0}\frac{(A_{\psi_{\epsilon}(q)})(f\circ\psi_{-\epsilon})-A_{q}f}{\epsilon}\\
&=\lim_{\epsilon\to 0}\frac{(A_{\psi_{\epsilon}(q)})(f\circ\psi_{-\epsilon}-f)}{\epsilon}+\lim_{\epsilon\to 0}\frac{(A_{\psi_{\epsilon}(q)})f-A_{q}f}{\epsilon}.\end{split}\]

Then, an argument similar to the derivation of classical Lie derivatives yields

\[(\mathcal{L}_{V}A)_{q}f=-A_{q}(Vf)+V_{q}(Af)=[V,A]_{q}f.\]

(ii) $\mathcal{L}_{V}(fA)g=[V,fA]g=V(fAg)-fAVg=(Vf)(Ag)+fVAg-fAVg=(Vf)(Ag)+f(\mathcal{L}_{V}A)g$.

(iii) For a second-order vector field $A$,

\[\begin{split}\langle\mathcal{L}_{V}\alpha,A\rangle&=\lim_{\epsilon\to 0}\frac{\langle(\psi_{\epsilon})^{S*}(\alpha_{\psi_{\epsilon}(q)}),A\rangle-\langle\alpha_{q},A\rangle}{\epsilon}=\lim_{\epsilon\to 0}\frac{\langle\alpha_{\psi_{\epsilon}(q)},(\psi_{\epsilon})^{S}_{*}A\rangle-\langle\alpha_{q},A\rangle}{\epsilon}\\
&=\lim_{\epsilon\to 0}\frac{\langle\alpha_{\psi_{\epsilon}(q)}-\alpha_{q},(\psi_{\epsilon})^{S}_{*}A\rangle}{\epsilon}+\lim_{\epsilon\to 0}\frac{\langle\alpha_{q},(\psi_{\epsilon})^{S}_{*}A-A\rangle}{\epsilon}\\
&=\lim_{\epsilon\to 0}\frac{\langle\alpha_{\psi_{\epsilon}(q)}-\alpha_{q},A\rangle}{\epsilon}-\lim_{\epsilon\to 0}\frac{\langle\alpha_{q},(\psi_{-\epsilon})^{S}_{*}A-A\rangle}{\epsilon}\\
&=V(\langle\alpha,A\rangle)-\langle\alpha,\mathcal{L}_{V}A\rangle.\end{split}\]

(iv) Use (iii) to derive

\[\begin{split}\langle\mathcal{L}_{V}(f\alpha),A\rangle&=V(f\langle\alpha,A\rangle)-f\langle\alpha,\mathcal{L}_{V}A\rangle=(Vf)\langle\alpha,A\rangle+fV(\langle\alpha,A\rangle)-f\langle\alpha,\mathcal{L}_{V}A\rangle\\
&=(Vf)\langle\alpha,A\rangle+f\langle\mathcal{L}_{V}\alpha,A\rangle.\end{split}\]

(v) Again using (iii) we have $\langle\mathcal{L}_{V}(d^{2}f),A\rangle=V(\langle d^{2}f,A\rangle)-\langle d^{2}f,\mathcal{L}_{V}A\rangle=VAf-[V,A]f=AVf=\langle d^{2}(Vf),A\rangle$. ∎
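The local expression in Remark A.12 can be checked on a simple case, given here as an illustration that does not appear in the original text. It assumes $M=\mathbb{R}$, the dilation field $V=x\frac{\partial}{\partial x}$ and $A=\frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}$, so that $V^{1}=x$, $A^{1}=0$ and $A^{11}=\frac{1}{2}$.

```latex
% Illustrative assumptions: M = \mathbb{R}, V = x \partial_x, A = (1/2) \partial_x^2.
\begin{align*}
\text{Remark A.12:}\quad
[V,A] &= -A^{11}\Big(\frac{\partial V^{1}}{\partial x}+\frac{\partial V^{1}}{\partial x}\Big)\frac{\partial^{2}}{\partial x^{2}}
       = -\frac{\partial^{2}}{\partial x^{2}} = -2A, \\
\text{direct computation:}\quad
[V,A]f &= \frac{x}{2}f''' - \frac{1}{2}\big(xf'\big)''
        = \frac{x}{2}f''' - \frac{1}{2}\big(2f''+xf'''\big) = -f''.
\end{align*}
```

Both routes give $\mathcal{L}_{V}A=-2A$, as Lemma A.11.(i) predicts; all other terms in the coordinate formula vanish because $A^{1}=0$, $A^{11}$ is constant and $V^{1}$ is linear.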

Corollary A.13.

(i) $\mathcal{L}_{V}(df\cdot dg)=d(Vf)\cdot dg+df\cdot d(Vg)$.
(ii) $\mathcal{L}_{V}(\omega\cdot\eta)=\mathcal{L}_{V}\omega\cdot\eta+\omega\cdot\mathcal{L}_{V}\eta$.
(iii) $\mathcal{L}_{V}$ commutes with the symmetric product operator $\bullet$.

Proof.

For the first assertion,

\[\begin{split}&\ \langle\mathcal{L}_{V}(df\cdot dg),A\rangle=V(\langle df\cdot dg,A\rangle)-\langle df\cdot dg,\mathcal{L}_{V}A\rangle=V(\Gamma_{A}(f,g))-\Gamma_{[V,A]}(f,g)\\
=&\ V(A(fg)-fAg-gAf)-([V,A](fg)-f[V,A]g-g[V,A]f)\\
=&\ VA(fg)-VfAg-fVAg-VgAf-gVAf\\
&\ -(VA(fg)-AV(fg)-fVAg+fAVg-gVAf+gAVf)\\
=&\ AV(fg)-VfAg-VgAf-fAVg-gAVf\\
=&\ [A(Vfg)-VfAg-gAVf]+[A(fVg)-VgAf-fAVg]\\
=&\ \langle d(Vf)\cdot dg,A\rangle+\langle df\cdot d(Vg),A\rangle.\end{split}\]

We use the local expressions to prove the second assertion. Assume, locally, that $\omega=\omega_{i}dx^{i}$ and $\eta=\eta_{i}dx^{i}$. Then, by (5.4), Lemma A.11.(iv) and the first assertion,

\[\begin{split}\mathcal{L}_{V}(\omega\cdot\eta)&=\mathcal{L}_{V}(\omega_{i}\eta_{j}dx^{i}\cdot dx^{j})=V(\omega_{i}\eta_{j})dx^{i}\cdot dx^{j}+\omega_{i}\eta_{j}\mathcal{L}_{V}(dx^{i}\cdot dx^{j})\\
&=(\eta_{j}V\omega_{i}+\omega_{i}V\eta_{j})dx^{i}\cdot dx^{j}+\omega_{i}\eta_{j}(dV^{i}\cdot dx^{j}+dx^{i}\cdot dV^{j})\\
&=(V\omega_{i}dx^{i}+\omega_{i}dV^{i})\cdot(\eta_{j}dx^{j})+(\omega_{i}dx^{i})\cdot(V\eta_{j}dx^{j}+\eta_{j}dV^{j})\\
&=\mathcal{L}_{V}\omega\cdot\eta+\omega\cdot\mathcal{L}_{V}\eta.\end{split}\]

The last assertion is a consequence of the second one. Indeed,

\[\mathcal{L}_{V}(\bullet(\omega\otimes\eta))=\mathcal{L}_{V}(\omega\cdot\eta)=\mathcal{L}_{V}\omega\cdot\eta+\omega\cdot\mathcal{L}_{V}\eta=\bullet(\mathcal{L}_{V}\omega\otimes\eta+\omega\otimes\mathcal{L}_{V}\eta)=\bullet(\mathcal{L}_{V}(\omega\otimes\eta)).\]

Given a vector field $V$ on $\mathbb{R}\times M$, the Lie derivative $\mathcal{L}_{V}$ can also be defined for second-order vector fields and second-order forms on $\mathbb{R}\times M$, as in Definition A.10, without any changes. But when restricting to mixed-order vector fields and mixed-order forms, it is necessary that the flow in Definition A.10 consist of bundle homomorphisms on $(\mathbb{R}\times M,\pi,\mathbb{R})$, so that its mixed-order pushforwards and pullbacks are well defined. In terms of the vector field $V$, this amounts to $V$ being $\pi$-projectable. In this case, we just replace the second-order pushforwards and pullbacks in Definition A.10 by mixed-order pushforwards and pullbacks, to define the Lie derivative $\mathcal{L}_{V}$ for mixed-order vector fields and mixed-order forms on $\mathbb{R}\times M$.

Now let $V$ be a $\pi$-projectable vector field on $\mathbb{R}\times M$. Then Lemma A.11.(i)–(iv) still hold for smooth functions $f$ on $\mathbb{R}\times M$, mixed-order vector fields $A$ and mixed-order forms $\alpha$ on $\mathbb{R}\times M$. Assertion (v) holds with the mixed differential instead of the second-order differential, that is, $\mathcal{L}_{V}(d^{\circ}f)=d^{\circ}(Vf)$. Moreover, if $V$ and $A$ have coordinate expressions $V=V^{0}\frac{\partial}{\partial t}+V^{i}\frac{\partial}{\partial x^{i}}$ and $A=A^{0}\frac{\partial}{\partial t}+A^{i}\frac{\partial}{\partial x^{i}}+A^{ij}\frac{\partial^{2}}{\partial x^{i}\partial x^{j}}$, where $V^{0}$ only depends on time, then the Lie derivative $\mathcal{L}_{V}A$ has the following expression:

\[\begin{split}\mathcal{L}_{V}A=[V,A]=&\ \left(V^{0}\frac{\partial A^{0}}{\partial t}+V^{j}\frac{\partial A^{0}}{\partial x^{j}}-A^{0}\frac{\partial V^{0}}{\partial t}\right)\frac{\partial}{\partial t}\\
&\ +\left(V^{0}\frac{\partial A^{i}}{\partial t}+V^{j}\frac{\partial A^{i}}{\partial x^{j}}-A^{0}\frac{\partial V^{i}}{\partial t}-A^{j}\frac{\partial V^{i}}{\partial x^{j}}-A^{jk}\frac{\partial^{2}V^{i}}{\partial x^{j}\partial x^{k}}\right)\frac{\partial}{\partial x^{i}}\\
&\ +\left(V^{0}\frac{\partial A^{jk}}{\partial t}+V^{i}\frac{\partial A^{jk}}{\partial x^{i}}\right)\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}-A^{jk}\left(\frac{\partial V^{i}}{\partial x^{j}}\frac{\partial^{2}}{\partial x^{i}\partial x^{k}}+\frac{\partial V^{i}}{\partial x^{k}}\frac{\partial^{2}}{\partial x^{i}\partial x^{j}}\right).\end{split}\]
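As an illustration of the mixed-order formula for $\mathcal{L}_{V}A$ (an example that is not in the original text), one may take the parabolic scaling field on $\mathbb{R}\times\mathbb{R}$ together with the heat-type operator. The choices $V=2t\frac{\partial}{\partial t}+x\frac{\partial}{\partial x}$ and $A=\frac{\partial}{\partial t}+\frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}$ are assumptions made only for this sketch; note that $V^{0}=2t$ depends on time only, so $V$ is $\pi$-projectable.

```latex
% Illustrative assumptions: V^0 = 2t, V^1 = x, A^0 = 1, A^1 = 0, A^{11} = 1/2.
\begin{align*}
\partial_t\text{-component:}\quad
 & V^{0}\frac{\partial A^{0}}{\partial t}+V^{1}\frac{\partial A^{0}}{\partial x}-A^{0}\frac{\partial V^{0}}{\partial t} = -2, \\
\partial_x\text{-component:}\quad
 & -A^{0}\frac{\partial V^{1}}{\partial t}-A^{1}\frac{\partial V^{1}}{\partial x}-A^{11}\frac{\partial^{2}V^{1}}{\partial x^{2}} = 0, \\
\partial_x^{2}\text{-component:}\quad
 & -A^{11}\Big(\frac{\partial V^{1}}{\partial x}+\frac{\partial V^{1}}{\partial x}\Big) = -1, \\
\mathcal{L}_{V}A = [V,A] &= -2\frac{\partial}{\partial t}-\frac{\partial^{2}}{\partial x^{2}} = -2A.
\end{align*}
```

The relation $[V,A]=-2A$ is the infinitesimal form of the familiar invariance of this operator, up to a factor, under the scaling $(t,x)\mapsto(c^{2}t,cx)$.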

Appendix B The mixed-order contact structure on ×𝒯SM\mathbb{R}\times\mathcal{T}^{S}M

B.1 Mixed-order total derivatives and mixed-order contact forms

We denote by $\pi_{1,0}^{*}(T\mathbb{R}\times\mathcal{T}^{S}M)$ the pullback bundle (see [77, Definition 1.4.5]) of $\tau_{\mathbb{R}}\times\tau^{S}_{M}$ by $\pi_{1,0}$. It is a fiber bundle over $\mathbb{R}\times\mathcal{T}^{S}M$.

Definition B.1 (Mixed-order holonomic lift).

Let $t\in\mathbb{R}$, $q\in M$, $X\in I_{(t,q)}(M)$ and $\tau\frac{d}{dt}|_{t}\in T_{t}\mathbb{R}$. The mixed-order holonomic lift of $\tau\frac{d}{dt}|_{t}$ by $X$ is defined to be

\[\left(X_{*}(\tau\tfrac{d}{dt}|_{t}),j_{(t,q)}X\right)\in\pi_{1,0}^{*}(T\mathbb{R}\times\mathcal{T}^{S}M).\]

The set of all mixed-order holonomic lifts is denoted by $H^{R}\pi_{1,0}$, that is,

\[H^{R}\pi_{1,0}:=\left\{\left(X_{*}(\tau\tfrac{d}{dt}|_{t}),j_{(t,q)}X\right)\in\pi_{1,0}^{*}(T\mathbb{R}\times\mathcal{T}^{S}M):j_{(t,q)}X\in\mathbb{R}\times\mathcal{T}^{S}M,\ \tau\tfrac{d}{dt}|_{t}\in T_{t}\mathbb{R}\right\}.\]

Since $X_{*}$ depends only upon the mean derivatives of $X$ at $t$, the holonomic lift of a tangent vector is completely determined by $j_{(t,q)}X$ and does not depend on the choice of the representative diffusion $X$. In particular, the set $H^{R}\pi_{1,0}$ is well defined and is clearly a subbundle of $\pi_{1,0}^{*}(T\mathbb{R}\times\mathcal{T}^{S}M)$.

Lemma B.2.

The fiber bundle $(\pi_{1,0}^{*}(T\mathbb{R}\times\mathcal{T}^{S}M),\pi_{1,0}^{*}(\tau_{\mathbb{R}}\times\tau^{S}_{M}),\mathbb{R}\times\mathcal{T}^{S}M)$ can be written as the Whitney sum of two subbundles

\[\pi_{1,0}^{*}(V^{S}\pi)\times_{\mathbb{R}\times\mathcal{T}^{S}M}H^{R}\pi_{1,0}.\]
Proof.

Suppose that $(A,j_{(t,q)}X)\in\pi_{1,0}^{*}(T\mathbb{R}\times\mathcal{T}^{S}M)$. Then $A\in T\mathbb{R}\times\mathcal{T}^{S}M$, and

\[\left(X_{*}(\pi^{R}_{*}(A)),j_{(t,q)}X\right)\in H^{R}\pi_{1,0}.\]

It follows easily from the definition of pushforward (A.5) that $\pi^{R}_{*}(A-X_{*}(\pi^{R}_{*}(A)))=0$. Hence, $A-X_{*}(\pi^{R}_{*}(A))\in V^{S}\pi$ and

\[\left(A-X_{*}(\pi^{R}_{*}(A)),j_{(t,q)}X\right)\in\pi_{1,0}^{*}(V^{S}\pi).\]

The result follows. ∎

The decomposition of $(A,j_{(t,q)}X)\in\pi_{1,0}^{*}(T\mathbb{R}\times\mathcal{T}^{S}M)$ may then be found by letting

\[\begin{split}A=&\ A^{0}\frac{\partial}{\partial t}\bigg|_{t}+A^{i}\frac{\partial}{\partial x^{i}}\bigg|_{q}+A^{jk}\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg|_{q}\\
=&\ \left(A^{i}-A^{0}D^{i}x(j_{(t,q)}X)\right)\frac{\partial}{\partial x^{i}}\bigg|_{q}+\left(A^{jk}-\frac{1}{2}A^{0}Q^{jk}x(j_{(t,q)}X)\right)\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg|_{q}\\
&\ +A^{0}\left(\frac{\partial}{\partial t}\bigg|_{t}+D^{i}x(j_{(t,q)}X)\frac{\partial}{\partial x^{i}}\bigg|_{q}+\frac{1}{2}Q^{jk}x(j_{(t,q)}X)\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg|_{q}\right).\end{split}\]
Definition B.3.

A section of the bundle $(H^{R}\pi_{1,0},\pi_{1,0}^{*}(\tau_{\mathbb{R}}\times\tau^{S}_{M})|_{H^{R}\pi_{1,0}},\mathbb{R}\times\mathcal{T}^{S}M)$ is called a mixed-order total derivative. The specific section

\[\frac{\partial}{\partial t}+D^{i}x\frac{\partial}{\partial x^{i}}+\frac{1}{2}Q^{jk}x\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\]

is called the coordinate mixed-order total derivative, and is denoted by $\mathbf{D}_{t}$.

The coordinate mixed-order total derivative is just the total mean derivative in Definition 4.7. The dual construction is the mixed-order contact cotangent vector, which may be described as being in the kernel of $X^{*}$.

Definition B.4.

An element $(\alpha,j_{(t,q)}X)\in\pi_{1,0}^{*}(T^{*}\mathbb{R}\times\mathcal{T}^{S*}M)$ is called a mixed-order contact cotangent vector if $X^{*}(\alpha)=0$. The set of all mixed-order contact cotangent vectors is denoted by $C^{R*}\pi_{1,0}$, that is,

\[C^{R*}\pi_{1,0}:=\left\{(\alpha,j_{(t,q)}X)\in\pi_{1,0}^{*}(T^{*}\mathbb{R}\times\mathcal{T}^{S*}M):j_{(t,q)}X\in\mathbb{R}\times\mathcal{T}^{S}M,\ X^{*}(\alpha)=0\right\}.\]

It is straightforward to check that the vanishing of $X^{*}$ does not depend on the particular choice of the representative diffusion $X$. The dual relation between $X^{*}$ and $X_{*}$ in (A.7) implies that the mixed-order contact and holonomic elements annihilate each other.

To express a mixed-order contact cotangent vector $(\alpha,j_{(t,q)}X)$ in coordinates, let us consider

\[\alpha=\alpha_{0}dt|_{t}+\alpha_{i}d^{2}x^{i}|_{q}+\alpha_{jk}dx^{j}\cdot dx^{k}|_{q}.\tag{B.1}\]

Using the definition (A.6) we get

\[0=X^{*}(\alpha)=\left(\alpha_{0}+\alpha_{i}(DX)^{i}+\alpha_{jk}(QX)^{jk}\right)dt|_{t}.\]

There are two basic nontrivial solutions of the above equation, say,

\[\left\{\begin{aligned}&\alpha_{0}=-\alpha_{i}(DX)^{i},\\ &\alpha_{jk}=0,\end{aligned}\right.\qquad\text{and}\qquad\left\{\begin{aligned}&\alpha_{0}=-\alpha_{jk}(QX)^{jk},\\ &\alpha_{i}=0.\end{aligned}\right.\]

Plugging these solutions into (B.1), we get two basic types of mixed-order contact cotangent vectors

\[(d^{2}x^{i}-D^{i}x\,dt)|_{j_{(t,q)}X}\qquad\text{and}\qquad(dx^{j}\cdot dx^{k}-Q^{jk}x\,dt)|_{j_{(t,q)}X}.\]

Thus, every mixed-order contact cotangent vector in $(C^{R*}\pi_{1,0})_{j_{(t,q)}X}$ is a linear combination of these basic mixed-order contact cotangent vectors.
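The statement that contact and holonomic elements annihilate each other can be verified directly on the basic contact cotangent vectors. The short computation below is a sketch added for illustration. It assumes the pairing convention implied by comparing (A.5), (A.6) and (A.7), namely $\langle d^{2}x^{i},\frac{\partial}{\partial x^{l}}\rangle=\delta^{i}_{l}$, $\langle d^{2}x^{i},\frac{\partial^{2}}{\partial x^{l}\partial x^{m}}\rangle=0$, $\langle dx^{j}\cdot dx^{k},\frac{\partial}{\partial x^{l}}\rangle=0$ and $\langle dx^{j}\cdot dx^{k},\frac{\partial^{2}}{\partial x^{l}\partial x^{m}}\rangle=\delta^{j}_{l}\delta^{k}_{m}+\delta^{j}_{m}\delta^{k}_{l}$, together with the symmetry of $Q^{jk}x$.

```latex
% Pairing convention: an assumption inferred from (A.5)-(A.7), not stated explicitly in this section.
% D_t = \partial_t + D^l x\, \partial_{x^l} + (1/2) Q^{lm} x\, \partial^2_{x^l x^m} is the coordinate total derivative.
\begin{align*}
\big\langle d^{2}x^{i}-D^{i}x\,dt,\ \mathbf{D}_{t}\big\rangle
 &= D^{l}x\,\delta^{i}_{l} - D^{i}x = 0, \\
\big\langle dx^{j}\cdot dx^{k}-Q^{jk}x\,dt,\ \mathbf{D}_{t}\big\rangle
 &= \tfrac{1}{2}Q^{lm}x\,\big(\delta^{j}_{l}\delta^{k}_{m}+\delta^{j}_{m}\delta^{k}_{l}\big) - Q^{jk}x
  = \tfrac{1}{2}\big(Q^{jk}x+Q^{kj}x\big) - Q^{jk}x = 0.
\end{align*}
```

Both pairings vanish, consistent with the dual relation (A.7).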

Lemma B.5.

The fiber bundle $(\pi_{1,0}^{*}(T^{*}\mathbb{R}\times\mathcal{T}^{S*}M),\pi_{1,0}^{*}(\tau^{*}_{\mathbb{R}}\times\tau^{S*}_{M}),\mathbb{R}\times\mathcal{T}^{S}M)$ can be written as the Whitney sum of two subbundles

\[\pi_{1}^{*}(T^{*}\mathbb{R})\times_{\mathbb{R}\times\mathcal{T}^{S}M}C^{R*}\pi_{1,0}.\]
Proof.

Suppose that $(\alpha,j_{(t,q)}X)\in\pi_{1,0}^{*}(T^{*}\mathbb{R}\times\mathcal{T}^{S*}M)$. Then, $\alpha\in T^{*}\mathbb{R}\times\mathcal{T}^{S*}M$, and the definition of pullback yields

\[\left(X^{*}(\alpha),j_{(t,q)}X\right)\in\pi_{1}^{*}(T^{*}\mathbb{R}).\]

Since $X^{*}(\alpha-X^{*}(\alpha))=0$, it follows that

\[\left(\alpha-X^{*}(\alpha),j_{(t,q)}X\right)\in C^{R*}\pi_{1,0}.\]

This ends the proof. ∎

The decomposition of $(\alpha,j_{(t,q)}X)\in\pi_{1,0}^{*}(T^{*}\mathbb{R}\times\mathcal{T}^{S*}M)$ may then be found by letting

\[\begin{split}\alpha=&\ \alpha_{0}dt|_{t}+\alpha_{i}d^{2}x^{i}|_{q}+\alpha_{jk}dx^{j}\cdot dx^{k}|_{q}\\
=&\ \left(\alpha_{0}+\alpha_{i}D^{i}x(j_{(t,q)}X)+\alpha_{jk}Q^{jk}x(j_{(t,q)}X)\right)dt|_{t}\\
&\ +\alpha_{i}\left(d^{2}x^{i}-D^{i}x(j_{(t,q)}X)dt\right)\big|_{(t,q)}+\alpha_{jk}\left(dx^{j}\cdot dx^{k}-Q^{jk}x(j_{(t,q)}X)dt\right)\Big|_{(t,q)}.\end{split}\]
Definition B.6.

A section of the bundle (CRπ1,0,π1,0(τ×τSM)|CRπ1,0,×𝒯SM)(C^{R*}\pi_{1,0},\pi_{1,0}^{*}(\tau^{*}_{\mathbb{R}}\times\tau^{S*}_{M})|_{C^{R*}\pi_{1,0}},\mathbb{R}\times\mathcal{T}^{S}M) is called a mixed-order contact form. The following specific sections

d2xiDixdt,dxjdxkQjkxdt,1i,j,kd,d^{2}x^{i}-D^{i}xdt,\quad dx^{j}\cdot dx^{k}-Q^{jk}xdt,\qquad 1\leq i,j,k\leq d,

are called basic mixed-order contact forms.

It follows from the construction that the set of basic mixed-order contact forms defines a local frame of the bundle π1,0(τ×τSM)|CRπ1,0\pi_{1,0}^{*}(\tau^{*}_{\mathbb{R}}\times\tau^{S*}_{M})|_{C^{R*}\pi_{1,0}}.

Remark B.7.

In contrast, we recall the classical contact forms on the first-order jet bundle $J^{1}\pi=\mathbb{R}\times TM$. Using the coordinates $(t,x^{i},\dot{x}^{i})$, the classical basic contact forms are $dx^{i}-\dot{x}^{i}dt$, $1\leq i\leq d$. See [77, Section 4.3] and [71, Theorem 4.23], also cf. [32, p. 9], for a one-dimensional example.

Corollary B.8.

Let $(\mathbb{R}\times U,(t,x^{i}))$ be a coordinate chart on $\mathbb{R}\times M$. Let $\mathbf{X}$ be a $\mathcal{T}^{S}M$-valued diffusion process. In local coordinates, the pushforward map $\mathbf{X}_{*}$ from $T\mathbb{R}$ to $T\mathbb{R}\times\mathcal{T}^{S}\mathcal{T}^{S}M$ is given by

\[
\begin{split}
\mathbf{X}_{*}\left(\tau\frac{d}{dt}\bigg|_{t}\right)=&\ \tau\bigg(\frac{\partial}{\partial{t}}+D^{i}(x\circ\mathbf{X})\frac{\partial}{\partial{x^{i}}}+D^{i}(Dx\circ\mathbf{X})\frac{\partial}{\partial{D^{i}x}}+D^{jk}(Qx\circ\mathbf{X})\frac{\partial}{\partial{Q^{jk}x}}\\
&\ +\frac{1}{2}Q^{jk}(x\circ\mathbf{X})\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}+\frac{1}{2}Q^{jk}(Dx\circ\mathbf{X})\frac{\partial^{2}}{\partial D^{j}x\partial D^{k}x}+\frac{1}{2}Q^{jklm}(Qx\circ\mathbf{X})\frac{\partial^{2}}{\partial Q^{jk}x\partial Q^{lm}x}\\
&\ +\frac{1}{2}Q^{jk}(x\circ\mathbf{X},Dx\circ\mathbf{X})\frac{\partial^{2}}{\partial x^{j}\partial D^{k}x}+\frac{1}{2}Q^{jkl}(x\circ\mathbf{X},Qx\circ\mathbf{X})\frac{\partial^{2}}{\partial x^{j}\partial Q^{kl}x}\\
&\ +\frac{1}{2}Q^{jkl}(Dx\circ\mathbf{X},Qx\circ\mathbf{X})\frac{\partial^{2}}{\partial D^{j}x\partial Q^{kl}x}\bigg)\bigg|_{(t,\mathbf{X}(t))}.
\end{split}
\]

The pullback map $\mathbf{X}^{*}$ from $T^{*}\mathbb{R}\times\mathcal{T}^{S*}\mathcal{T}^{S}M$ to $T^{*}\mathbb{R}$ is given by

\[
\begin{split}
&\ \mathbf{X}^{*}\Big(\alpha_{0}dt+\alpha_{i}d^{2}x^{i}+\alpha^{1}_{i}d^{2}D^{i}x+\alpha^{2}_{jk}d^{2}Q^{jk}x+\alpha_{jk}dx^{j}\cdot dx^{k}+\alpha^{1}_{jk}dD^{j}x\cdot dD^{k}x+\alpha^{2}_{jklm}dQ^{jk}x\cdot dQ^{lm}x\\
&\qquad+\alpha^{01}_{jk}dx^{j}\cdot dD^{k}x+\alpha^{02}_{jkl}dx^{j}\cdot dQ^{kl}x+\alpha^{12}_{jkl}dD^{j}x\cdot dQ^{kl}x\Big)\Big|_{(t,\mathbf{X}(t))}\\
=&\ \Big(\alpha_{0}+\alpha_{i}D^{i}(x\circ\mathbf{X})+\alpha^{1}_{i}D^{i}(Dx\circ\mathbf{X})+\alpha^{2}_{jk}D^{jk}(Qx\circ\mathbf{X})\\
&\ +\alpha_{jk}Q^{jk}(x\circ\mathbf{X})+\alpha^{1}_{jk}Q^{jk}(Dx\circ\mathbf{X})+\alpha^{2}_{jklm}Q^{jklm}(Qx\circ\mathbf{X})\\
&\ +\alpha^{01}_{jk}Q^{jk}(x\circ\mathbf{X},Dx\circ\mathbf{X})+\alpha^{02}_{jkl}Q^{jkl}(x\circ\mathbf{X},Qx\circ\mathbf{X})+\alpha^{12}_{jkl}Q^{jkl}(Dx\circ\mathbf{X},Qx\circ\mathbf{X})\Big)dt|_{t}.
\end{split}
\]
Corollary B.9.

Let $\alpha$ be a section of $(T^{*}\mathbb{R}\times\mathcal{T}^{S*}\mathcal{T}^{S}M,\tau^{*}_{\mathbb{R}}\times\tau^{S*}_{\mathcal{T}^{S}M},\mathbb{R}\times\mathcal{T}^{S}M)$. Then $\alpha$ is a mixed-order contact form if and only if, for every $t\in\mathbb{R}$ and every $X\in\cup_{q\in M}I_{(t,q)}(M)$,

\[(jX)^{*}(\alpha|_{j_{(t,q)}X})=0.\]
Proof.

We first let $\alpha=\alpha_{0}dt+\alpha_{i}d^{2}x^{i}+\alpha_{jk}dx^{j}\cdot dx^{k}$ be a mixed-order contact form and let $X\in I_{(t,q)}(M)$. Then

\[(jX)^{*}(\alpha|_{j_{(t,q)}X})=\left(\alpha_{0}+\alpha_{i}D^{i}x+\alpha_{jk}Q^{jk}x\right)(j_{(t,q)}X)dt|_{t}=X^{*}(\alpha|_{j_{(t,q)}X})=0. \tag{B.2}\]

To prove the converse, we suppose

\[
\begin{split}
\alpha=&\ \alpha_{0}dt+\alpha_{i}d^{2}x^{i}+\alpha^{1}_{i}d^{2}D^{i}x+\alpha^{2}_{jk}d^{2}Q^{jk}x+\alpha_{jk}dx^{j}\cdot dx^{k}+\alpha^{1}_{jk}dD^{j}x\cdot dD^{k}x+\alpha^{2}_{jklm}dQ^{jk}x\cdot dQ^{lm}x\\
&\ +\alpha^{01}_{jk}dx^{j}\cdot dD^{k}x+\alpha^{02}_{jkl}dx^{j}\cdot dQ^{kl}x+\alpha^{12}_{jkl}dD^{j}x\cdot dQ^{kl}x.
\end{split}
\]

Fix a particular index $i_{0}$ with $1\leq i_{0}\leq d$. Let $Y\in I_{(t,q)}(M)$ be such that $j_{(t,q)}X=j_{(t,q)}Y$, $D^{i}DY=D^{i}DX+\delta_{i_{0}}^{i}$ and

\[
\begin{split}
&\ \left(D^{jk}QY,Q^{jk}DY,Q^{jklm}QY,Q^{jk}(Y,DY),Q^{jkl}(Y,QY),Q^{jkl}(DY,QY)\right)\\
=&\ \left(D^{jk}QX,Q^{jk}DX,Q^{jklm}QX,Q^{jk}(X,DX),Q^{jkl}(X,QX),Q^{jkl}(DX,QX)\right).
\end{split}
\]

Then,

\[0=(jY)^{*}(\alpha|_{j_{(t,q)}Y})=(jX)^{*}(\alpha|_{j_{(t,q)}X})+\alpha^{1}_{i}\delta_{i_{0}}^{i}=\alpha^{1}_{i_{0}}.\]

It follows from the arbitrariness of $i_{0}$ that $\alpha^{1}_{i}=0$ for all $1\leq i\leq d$. Similarly, all $\alpha^{1}_{jk}$, $\alpha^{2}_{jk}$, $\alpha^{2}_{jklm}$, $\alpha^{01}_{jk}$, $\alpha^{02}_{jkl}$ and $\alpha^{12}_{jkl}$ vanish. Consequently, $\alpha=\alpha_{0}dt+\alpha_{i}d^{2}x^{i}+\alpha_{jk}dx^{j}\cdot dx^{k}$. As in (B.2), we have $(jX)^{*}(\alpha|_{j_{(t,q)}X})=X^{*}(\alpha|_{j_{(t,q)}X})=0$. Hence, $\alpha$ is a mixed-order contact form. ∎

Corollary B.10.

Let $\mathbf{X}$ be a $\mathcal{T}^{S}M$-valued diffusion process. Then $\mathbf{X}=jX$, with $X$ an $M$-valued diffusion process, if and only if $\mathbf{X}^{*}(\alpha)=0$ for every mixed-order contact form $\alpha$ on $\mathbb{R}\times\mathcal{T}^{S}M$.

Proof.

We first suppose $\mathbf{X}=jX$ with $X$ an $M$-valued diffusion process. Then, for a mixed-order contact form $\alpha$,

\[\mathbf{X}^{*}(\alpha)=(jX)^{*}(\alpha)=X^{*}(\alpha)=0.\]

To prove the converse, it suffices to show, in local coordinates, that

\[D^{i}x(\mathbf{X})=D^{i}(x\circ\mathbf{X}),\qquad Q^{jk}x(\mathbf{X})=Q^{jk}(x\circ\mathbf{X}).\]

This can be done as soon as we let $\alpha$ be a basic mixed-order contact form. For example, let $\alpha=d^{2}x^{i}-D^{i}x\,dt$; then

\[0=\mathbf{X}^{*}(\alpha)=\left(D^{i}(x\circ\mathbf{X})-D^{i}x\circ\mathbf{X}\right)dt,\]

which leads to $D^{i}x(\mathbf{X})=D^{i}(x\circ\mathbf{X})$. ∎
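For completeness, the second identity used in the proof above can be obtained in the same way. Taking the other type of basic mixed-order contact form and applying the pullback formula of Corollary B.8 (only the coefficients of $dt$ and $dx^{j}\cdot dx^{k}$ are nonzero) gives the following worked step:

```latex
\alpha=dx^{j}\cdot dx^{k}-Q^{jk}x\,dt
\quad\Longrightarrow\quad
0=\mathbf{X}^{*}(\alpha)=\left(Q^{jk}(x\circ\mathbf{X})-Q^{jk}x\circ\mathbf{X}\right)dt,
\qquad\text{hence}\qquad
Q^{jk}x(\mathbf{X})=Q^{jk}(x\circ\mathbf{X}).
```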

B.2 The mixed-order Cartan distribution and its symmetries

The model bundle $\mathbb{R}\times\mathcal{T}^{S}M$ is a trivial bundle over $\mathbb{R}$ in its own right, and so we may consider its mixed-order tangent bundle $(T\mathbb{R}\times\mathcal{T}^{S}\mathcal{T}^{S}M,\tau_{\mathbb{R}}\times\tau^{S}_{\mathcal{T}^{S}M},\mathbb{R}\times\mathcal{T}^{S}M)$.

Definition B.11.

The bundle endomorphism $(v,\mathbf{Id}_{E})$ of $\pi_{1,0}^{*}(\tau_{\mathbb{R}}\times\tau^{S}_{M})$ is defined by

\[v(A^{h}+A^{v})=A^{v},\]

where $A^{h}\in H^{R}\pi_{1,0}$ and $A^{v}\in\pi_{1,0}^{*}(V^{S}\pi)$.

Definition B.12 (Mixed-order Cartan distribution).

The mixed-order Cartan distribution is the kernel of the vector bundle homomorphism over $\mathbf{Id}_{\mathbb{R}\times\mathcal{T}^{S}M}$

\[v\circ(\pi_{1,0*},\tau_{\mathbb{R}}\times\tau^{S}_{\mathcal{T}^{S}M}):T\mathbb{R}\times\mathcal{T}^{S}\mathcal{T}^{S}M\to\pi_{1,0}^{*}(\tau_{\mathbb{R}}\times\tau^{S}_{M})\]

and is denoted by $C^{R}\pi_{1,0}$.

Note that $C^{R}\pi_{1,0}$ is a subbundle of $\tau_{\mathbb{R}}\times\tau^{S}_{\mathcal{T}^{S}M}$. It follows from the above two definitions that

\[C^{R}\pi_{1,0}=(\pi_{1,0*},\tau_{\mathbb{R}}\times\tau^{S}_{\mathcal{T}^{S}M})^{-1}H^{R}\pi_{1,0}.\]

Hence, for each $X\in I_{(t,q)}(M)$,

\[C^{R}\pi_{1,0}|_{j_{(t,q)}X}=(jX)_{*}(T_{t}\mathbb{R})\oplus V^{S}\pi_{1,0}|_{j_{(t,q)}X}.\]

Similarly to the proof of Lemma B.2, we can decompose an element $\mathbf{A}\in C^{R}\pi_{1,0}|_{j_{(t,q)}X}$ as

\[\mathbf{A}=(jX)_{*}((\pi_{1})^{R}_{*}(\mathbf{A}))+\left[\mathbf{A}-(jX)_{*}((\pi_{1})^{R}_{*}(\mathbf{A}))\right], \tag{B.3}\]

where $(jX)_{*}((\pi_{1})^{R}_{*}(\mathbf{A}))\in(jX)_{*}(T_{t}\mathbb{R})|_{j_{(t,q)}X}$ and $\mathbf{A}-(jX)_{*}((\pi_{1})^{R}_{*}(\mathbf{A}))\in V^{S}_{j_{(t,q)}X}\pi_{1,0}$.

From the duality relations it also follows that $(\tau^{*}_{\mathbb{R}}\times\tau^{S*}_{\mathcal{T}^{S}M})|_{C^{R*}\pi_{1,0}}$ is the annihilator of $(\tau_{\mathbb{R}}\times\tau^{S}_{\mathcal{T}^{S}M})|_{C^{R}\pi_{1,0}}$, or in other words, the basic mixed-order contact forms are local defining forms for the mixed-order Cartan distribution $C^{R}\pi_{1,0}$. A typical element $\mathbf{A}\in C^{R}\pi_{1,0}|_{j_{(t,q)}X}$ may be written in coordinates as

\[
\begin{split}
\mathbf{A}=&\ \mathbf{A}^{0}\left(\frac{\partial}{\partial{t}}\bigg|_{j_{(t,q)}X}+D^{i}x(j_{(t,q)}X)\frac{\partial}{\partial{x^{i}}}\bigg|_{j_{(t,q)}X}+\frac{1}{2}Q^{jk}x(j_{(t,q)}X)\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\bigg|_{j_{(t,q)}X}\right)\\
&\ +\mathbf{A}^{i}_{1}\frac{\partial}{\partial{D^{i}x}}\bigg|_{j_{(t,q)}X}+\mathbf{A}^{jk}_{2}\frac{\partial}{\partial{Q^{jk}x}}\bigg|_{j_{(t,q)}X}+\mathbf{A}^{jk}_{11}\frac{\partial^{2}}{\partial D^{j}x\partial D^{k}x}\bigg|_{j_{(t,q)}X}+\mathbf{A}^{jklm}_{22}\frac{\partial^{2}}{\partial Q^{jk}x\partial Q^{lm}x}\bigg|_{j_{(t,q)}X}\\
&\ +\mathbf{A}^{jk}_{01}\frac{\partial^{2}}{\partial x^{j}\partial D^{k}x}\bigg|_{j_{(t,q)}X}+\mathbf{A}^{jkl}_{02}\frac{\partial^{2}}{\partial x^{j}\partial Q^{kl}x}\bigg|_{j_{(t,q)}X}+\mathbf{A}^{jkl}_{12}\frac{\partial^{2}}{\partial D^{j}x\partial Q^{kl}x}\bigg|_{j_{(t,q)}X}.
\end{split} \tag{B.4}
\]

From this it is easy to deduce that $(\pi_{1,0})_{*}^{R}\mathbf{A}\in H^{R}\pi_{1,0}$.
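The last two claims can be spelled out in coordinates. The sketch below makes two assumptions that are not stated in this section: the bracket $\langle\cdot,\cdot\rangle$ is our notation for the pairing of mixed-order forms with mixed-order vectors, and its values on the coordinate bases are read off from Corollary B.8 (so that $d^{2}x^{i}$ picks out the coefficient of $\partial/\partial x^{i}$ and $dx^{j}\cdot dx^{k}$ picks out twice the coefficient of $\partial^{2}/\partial x^{j}\partial x^{k}$). Since every term of (B.4) that differentiates along the fibre coordinates $D^{i}x$, $Q^{jk}x$ is killed by $(\pi_{1,0})^{R}_{*}$, one finds:

```latex
(\pi_{1,0})^{R}_{*}\mathbf{A}
=\mathbf{A}^{0}\left(\frac{\partial}{\partial t}
+D^{i}x(j_{(t,q)}X)\frac{\partial}{\partial x^{i}}
+\frac{1}{2}Q^{jk}x(j_{(t,q)}X)\frac{\partial^{2}}{\partial x^{j}\partial x^{k}}\right)\bigg|_{(t,q)}
=X_{*}\left(\mathbf{A}^{0}\frac{d}{dt}\bigg|_{t}\right),
\\[1ex]
\left\langle d^{2}x^{i}-D^{i}x\,dt,\ \mathbf{A}\right\rangle
=\mathbf{A}^{0}D^{i}x-D^{i}x\,\mathbf{A}^{0}=0,
\qquad
\left\langle dx^{j}\cdot dx^{k}-Q^{jk}x\,dt,\ \mathbf{A}\right\rangle
=\mathbf{A}^{0}Q^{jk}x-Q^{jk}x\,\mathbf{A}^{0}=0.
```

The first line exhibits the projection as the image under $X_{*}$ of a tangent vector of $\mathbb{R}$, which is how we understand elements of $H^{R}\pi_{1,0}$ (its definition appears earlier in the paper). The second line confirms that both families of basic mixed-order contact forms annihilate $\mathbf{A}$.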

Definition B.13.

A symmetry of the mixed-order Cartan distribution on $\mathbb{R}\times\mathcal{T}^{S}M$ is a bundle automorphism $\mathbf{F}$ of $\mathbb{R}\times\mathcal{T}^{S}M$ which satisfies $\mathbf{F}^{R}_{*}(C^{R}\pi_{1,0})=C^{R}\pi_{1,0}$.

It follows by duality that symmetries of the mixed-order Cartan distribution are those bundle automorphisms which satisfy $\mathbf{F}^{R*}(C^{R*}\pi_{1,0})=C^{R*}\pi_{1,0}$. For this reason, $\mathbf{F}$ is also called a mixed-order contact transformation. Similarly, $\mathbf{F}$ may be characterized by the fact that whenever $\alpha$ is a mixed-order contact form, then so is $\mathbf{F}^{R*}(\alpha)$.

Proposition B.14.

Let $\mathbf{F}$ be a bundle homomorphism from $(\mathbb{R}\times\mathcal{T}^{S}M,\pi_{1},\mathbb{R})$ to $(\mathbb{R}\times\mathcal{T}^{S}N,\rho_{1},\mathbb{R})$ that projects to a diffeomorphism $F^{0}:\mathbb{R}\to\mathbb{R}$. Then $\mathbf{F}^{R}_{*}(C^{R}\pi_{1,0})\subset C^{R}\rho_{1,0}$ if and only if $\mathbf{F}=jF$, where $F$ is a bundle homomorphism from $(\mathbb{R}\times M,\pi,\mathbb{R})$ to $(\mathbb{R}\times N,\rho,\mathbb{R})$ that projects to $F^{0}$.

Proof.

First, we prove the sufficiency. Assume $\mathbf{F}=jF$ and let $\mathbf{A}\in C^{R}\pi_{1,0}|_{j_{(t,q)}X}$. According to (B.3), we decompose $\mathbf{A}$ as $\mathbf{A}=\mathbf{A}_{1}+\mathbf{A}_{2}$ with $\mathbf{A}_{1}=(jX)_{*}((\pi_{1})^{R}_{*}(\mathbf{A}))\in(jX)_{*}(T_{t}\mathbb{R})$ and $\mathbf{A}_{2}\in V^{S}_{j_{(t,q)}X}\pi_{1,0}$. By Corollaries 4.6 and A.9.(iv), $(jF)^{R}_{*}\circ(jX)_{*}=(jF\cdot jX)_{*}\circ(F^{0})_{*}=(j\tilde{X})_{*}\circ(F^{0})_{*}$, where $\tilde{X}=F\cdot X$ is the pushforward of $X$ by $F$. Hence

\[\mathbf{F}^{R}_{*}(\mathbf{A}_{1})=(jF)^{R}_{*}(\mathbf{A}_{1})=(jF)^{R}_{*}(jX)_{*}(\pi_{1})^{R}_{*}\mathbf{A}=(j\tilde{X})_{*}(F^{0})_{*}(\pi_{1})^{R}_{*}\mathbf{A}\in(j\tilde{X})_{*}(T_{F^{0}(t)}\mathbb{R}).\]

Besides, since $jF:\pi_{1,0}\to\rho_{1,0}$ is a bundle homomorphism projecting to $F$ by Corollary 4.5.(ii), we have $\rho_{1,0}\circ jF=F\circ\pi_{1,0}$. Then,

\[(\rho_{1,0})^{S}_{*}(\mathbf{F}^{R}_{*}(\mathbf{A}_{2}))=(\rho_{1,0})^{S}_{*}((jF)^{R}_{*}(\mathbf{A}_{2}))=F^{S}_{*}(\pi_{1,0})^{S}_{*}(\mathbf{A}_{2})=0,\]

which yields $\mathbf{F}^{R}_{*}(\mathbf{A}_{2})\in V^{S}\rho_{1,0}$. This proves $\mathbf{F}^{R}_{*}(C^{R}\pi_{1,0})\subset C^{R}\rho_{1,0}$.

For the necessity, we first prove that $\mathbf{F}$ is a bundle homomorphism from $\pi_{1,0}$ to $\rho_{1,0}$ by showing $\mathbf{F}^{S}_{*}(V^{S}\pi_{1,0})\subset V^{S}\rho_{1,0}$, by virtue of Lemma A.1. Let $\mathbf{A}\in V^{S}\pi_{1,0}$. Set $\mathbf{F}^{R}_{*}\mathbf{A}=\mathbf{A}_{1}+\mathbf{A}_{2}$, where $\mathbf{A}_{1}\in(jY)_{*}(T_{F^{0}(t)}\mathbb{R})$ and $\mathbf{A}_{2}\in V^{S}\rho_{1,0}$ for some diffusion $Y$. Since $\mathbf{F}$ projects to $F^{0}$,

\[(\rho_{1})^{S}_{*}(\mathbf{F}^{S}_{*}\mathbf{A})=(F^{0})^{S}_{*}(\pi_{1})^{S}_{*}\mathbf{A}=(F^{0})^{S}_{*}\pi^{S}_{*}(\pi_{1,0})^{S}_{*}\mathbf{A}=0,\]

while $(\rho_{1})^{S}_{*}\mathbf{A}_{2}=\rho^{S}_{*}(\rho_{1,0})^{S}_{*}\mathbf{A}_{2}=0$. Thus, $(\rho_{1})^{S}_{*}\mathbf{A}_{1}=0$. Since $\mathbf{A}_{1}\in(jY)_{*}(T_{F^{0}(t)}\mathbb{R})$, we set $\mathbf{A}_{1}=(jY)_{*}(\tau\frac{\partial}{\partial{s}}|_{F^{0}(t)})$. Then $(\rho_{1})^{S}_{*}\mathbf{A}_{1}=\tau\frac{\partial}{\partial{s}}|_{F^{0}(t)}=0$. Hence, $\tau=0$ and so $\mathbf{A}_{1}=0$. This leads to $\mathbf{F}^{R}_{*}(V^{S}\pi_{1,0})\subset V^{S}\rho_{1,0}$, so that $\mathbf{F}$ is a bundle homomorphism from $\pi_{1,0}$ to $\rho_{1,0}$. Denote by $F$ the projection of $\mathbf{F}$ to a map from $\mathbb{R}\times M$ to $\mathbb{R}\times N$. It follows that

\[\rho\circ F\circ\pi_{1,0}=\rho\circ\rho_{1,0}\circ\mathbf{F}=\rho_{1}\circ\mathbf{F}=F^{0}\circ\pi_{1}=F^{0}\circ\pi\circ\pi_{1,0}.\]

Since $\pi_{1,0}$ is surjective, we obtain $\rho\circ F=F^{0}\circ\pi$, so that $F$ is a bundle homomorphism from $\pi$ to $\rho$ projecting to $F^{0}$. We shall write $F=(F^{0},\bar{F})$ and $\mathbf{F}=(F^{0},\bar{\mathbf{F}})$.

Next, we will show $\mathbf{F}=jF$. Fix $j_{(t,q)}X\in\mathbb{R}\times\mathcal{T}^{S}M$ and let $\mathbf{F}(j_{(t,q)}X)=j_{(s,q^{\prime})}Y$. Then, $s=F^{0}(t)$ and $(s,q^{\prime})=F(t,q)$. For an element $\mathbf{A}\in C^{R}\pi_{1,0}|_{j_{(t,q)}X}$ with local expression (B.4), we have from (A.3) that

\[
\begin{split}
\mathbf{F}^{R}_{*}\mathbf{A}=&\ \mathbf{A}^{0}\frac{dF^{0}}{dt}(t)\frac{\partial}{\partial s}\bigg|_{j_{(s,q^{\prime})}Y}+(\mathbf{A}\bar{F}^{i})(j_{(t,q)}X)\frac{\partial}{\partial y^{i}}\bigg|_{j_{(s,q^{\prime})}Y}+\frac{\mathbf{A}^{0}}{2}Q^{kl}x(j_{(t,q)}X)\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}(t,q)\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg|_{j_{(s,q^{\prime})}Y}\\
&\ +\text{terms}\left(\frac{\partial}{\partial D^{i}y}\bigg|_{j_{(s,q^{\prime})}Y},\frac{\partial}{\partial Q^{ij}y}\bigg|_{j_{(s,q^{\prime})}Y},\cdots\right).
\end{split}
\]

Since $\bar{F}$ only depends on the variables on $\mathbb{R}\times M$, we have

\[
\begin{split}
(\mathbf{A}\bar{F}^{i})(j_{(t,q)}X)&=\left((\pi_{1,0})_{*}^{R}\mathbf{A}\right)\bar{F}^{i}(j_{(t,q)}X)\\
&=\mathbf{A}^{0}\left[\frac{\partial\bar{F}^{i}}{\partial t}(t,q)+D^{j}x(j_{(t,q)}X)\frac{\partial\bar{F}^{i}}{\partial x^{j}}(t,q)+\frac{1}{2}Q^{jk}x(j_{(t,q)}X)\frac{\partial^{2}\bar{F}^{i}}{\partial x^{j}\partial x^{k}}(t,q)\right].
\end{split}
\]

Then, the local expressions for $jF$ in (4.7) and (4.8) show that the part of $\mathbf{F}^{R}_{*}\mathbf{A}$ along $\partial/\partial s$, $\partial/\partial y^{i}$ and $\partial^{2}/\partial y^{i}\partial y^{j}$ equals

\[\mathbf{A}^{0}\frac{dF^{0}}{dt}(t)\left[\frac{\partial}{\partial s}\bigg|_{j_{(s,q^{\prime})}Y}+D^{i}y\circ jF(j_{(t,q)}X)\frac{\partial}{\partial y^{i}}\bigg|_{j_{(s,q^{\prime})}Y}+\frac{1}{2}Q^{ij}y\circ jF(j_{(t,q)}X)\frac{\partial^{2}}{\partial y^{i}\partial y^{j}}\bigg|_{j_{(s,q^{\prime})}Y}\right].\]

Since $\mathbf{F}^{R}_{*}\mathbf{A}\in C^{R}\rho_{1,0}|_{j_{(s,q^{\prime})}Y}$ by the assumption, it follows that $jF(j_{(t,q)}X)=j_{(s,q^{\prime})}Y=\mathbf{F}(j_{(t,q)}X)$. This proves that $\mathbf{F}=jF$. ∎
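The coefficient comparison used in the last step of the proof above can be made explicit. The formulae (4.7) and (4.8) are not reproduced in this appendix, so the display below is only what they must amount to for the computation in the proof to be consistent; it is obtained by dividing the coefficients of $\partial/\partial y^{i}$ and $\partial^{2}/\partial y^{i}\partial y^{j}$ in $\mathbf{F}^{R}_{*}\mathbf{A}$ by $\mathbf{A}^{0}\frac{dF^{0}}{dt}(t)$. The one-dimensional scaling map at the end is an illustrative choice of ours.

```latex
D^{i}y\circ jF
=\left(\frac{dF^{0}}{dt}\right)^{-1}\left[\frac{\partial\bar{F}^{i}}{\partial t}
+D^{j}x\,\frac{\partial\bar{F}^{i}}{\partial x^{j}}
+\frac{1}{2}Q^{jk}x\,\frac{\partial^{2}\bar{F}^{i}}{\partial x^{j}\partial x^{k}}\right],
\qquad
Q^{ij}y\circ jF
=\left(\frac{dF^{0}}{dt}\right)^{-1}Q^{kl}x\,\frac{\partial\bar{F}^{i}}{\partial x^{k}}\frac{\partial\bar{F}^{j}}{\partial x^{l}}.
\\[1ex]
\text{Example } (d=1,\ \lambda>0):\quad
F(t,x)=(\lambda^{2}t,\lambda x)
\ \Longrightarrow\
Dy\circ jF=\frac{\lambda\,Dx}{\lambda^{2}}=\frac{Dx}{\lambda},
\qquad
Qy\circ jF=\frac{\lambda^{2}\,Qx}{\lambda^{2}}=Qx.
```

In the example the drift coordinate rescales while the quadratic-variation coordinate is unchanged, as one would expect for a parabolic scaling.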

Corollary B.15.

Let $\mathbf{F}$ be a bundle automorphism on $(\mathbb{R}\times\mathcal{T}^{S}M,\pi_{1},\mathbb{R})$ projecting to a diffeomorphism $F^{0}:\mathbb{R}\to\mathbb{R}$. Then $\mathbf{F}$ is a symmetry of $C^{R}\pi_{1,0}$ if and only if $\mathbf{F}=jF$, where $F$ is a bundle automorphism on $(\mathbb{R}\times M,\pi,\mathbb{R})$ that projects to $F^{0}$.

Proof.

If $\mathbf{F}$ is a symmetry, then $\mathbf{F}^{R}_{*}(C^{R}\pi_{1,0})\subset C^{R}\pi_{1,0}$ and $(\mathbf{F}^{-1})^{R}_{*}(C^{R}\pi_{1,0})\subset C^{R}\pi_{1,0}$. By Proposition B.14, $\mathbf{F}=jF$ and $\mathbf{F}^{-1}=jG$ for some bundle endomorphisms $F$ and $G$ of $(\mathbb{R}\times M,\pi,\mathbb{R})$ that project to $F^{0}$ and $(F^{0})^{-1}$, respectively. Then, Corollary 4.5.(iii) implies that $j(F\circ G)=jF\circ jG=\mathbf{F}\circ\mathbf{F}^{-1}=\mathbf{Id}_{\mathbb{R}\times\mathcal{T}^{S}M}$ and hence $F\circ G=\mathbf{Id}_{\mathbb{R}\times M}$. For the same reason, $G\circ F=\mathbf{Id}_{\mathbb{R}\times M}$. Thus, $F$ is a bundle automorphism of $\pi$. Conversely, if $\mathbf{F}=jF$ and $F$ is a bundle automorphism, then $\mathbf{F}\circ jF^{-1}=jF^{-1}\circ\mathbf{F}=\mathbf{Id}_{\mathbb{R}\times\mathcal{T}^{S}M}$, which yields $\mathbf{F}^{-1}=jF^{-1}$, and hence $\mathbf{F}$ is a bundle automorphism of $\pi_{1}$. Applying Proposition B.14 to both $\mathbf{F}$ and $\mathbf{F}^{-1}$ then gives $\mathbf{F}^{R}_{*}(C^{R}\pi_{1,0})=C^{R}\pi_{1,0}$, so $\mathbf{F}$ is a symmetry. ∎

B.3 Infinitesimal symmetries

Definition B.16.

An infinitesimal symmetry of the mixed-order Cartan distribution is a $\pi_{1}$-projectable vector field $\mathbf{V}$ on $\mathbb{R}\times\mathcal{T}^{S}M$ with the property that, whenever the mixed-order vector field $\mathbf{A}$ belongs to $C^{R}\pi_{1,0}$, then so does the mixed-order vector field $\mathcal{L}_{\mathbf{V}}\mathbf{A}$.

As in the classical case, an infinitesimal symmetry of the mixed-order Cartan distribution may also be called an infinitesimal mixed-order contact transformation. By duality, $\mathbf{V}$ is such an infinitesimal symmetry precisely when $\mathcal{L}_{\mathbf{V}}\alpha$ is a mixed-order contact form for every mixed-order contact form $\alpha$.

The following lemma is a consequence of the definition of Lie derivatives.

Lemma B.17.

Let $\mathbf{V}$ be a $\pi_{1}$-projectable vector field on $\mathbb{R}\times\mathcal{T}^{S}M$ with flow $\Psi=\{\Psi_{\epsilon}\}_{\epsilon\in\mathbb{R}}$. Then, $\mathbf{V}$ is an infinitesimal symmetry of the mixed-order Cartan distribution if and only if, for each $\epsilon$, the diffeomorphism $\Psi_{\epsilon}$ is a symmetry of the mixed-order Cartan distribution.

The following result is the infinitesimal version of Corollary B.15. It can be deduced directly from Lemma B.17 and Corollary B.15, but here we give a computational proof based on the Lie derivative of mixed-order contact forms.

Theorem B.18.

Let $\mathbf{V}$ be a $\pi_{1}$-projectable vector field on $\mathbb{R}\times\mathcal{T}^{S}M$. Then, $\mathbf{V}$ is an infinitesimal symmetry of the mixed-order Cartan distribution if and only if $\mathbf{V}$ is the prolongation of a $\pi$-projectable vector field $V$ on $\mathbb{R}\times M$.

Proof.

Let the vector field $\mathbf{V}$ have the following local expression:

\[\mathbf{V}=\mathbf{V}^{0}\frac{\partial}{\partial{t}}+\mathbf{V}^{i}\frac{\partial}{\partial{x^{i}}}+\mathbf{V}_{1}^{i}\frac{\partial}{\partial{D^{i}x}}+\mathbf{V}_{2}^{jk}\frac{\partial}{\partial{Q^{jk}x}},\]

where $\mathbf{V}^{0}$ depends only on time, due to the projectability of $\mathbf{V}$. We then compute the Lie derivative $\mathcal{L}_{\mathbf{V}}$ of the basic mixed-order contact forms $d^{2}x^{i}-D^{i}x\,dt$ and $dx^{j}\cdot dx^{k}-Q^{jk}x\,dt$ as follows:

\[
\begin{split}
&\ \mathcal{L}_{\mathbf{V}}(d^{2}x^{i}-D^{i}xdt)\\
=&\ d^{\circ}\mathbf{V}^{i}-\mathbf{V}_{1}^{i}dt-D^{i}xd\mathbf{V}^{0}\\
=&\ \frac{\partial\mathbf{V}^{i}}{\partial t}dt+\frac{\partial\mathbf{V}^{i}}{\partial x^{j}}d^{2}x^{j}+\frac{1}{2}\frac{\partial^{2}\mathbf{V}^{i}}{\partial x^{j}\partial x^{k}}dx^{j}\cdot dx^{k}+\text{terms}\left(\frac{\partial\mathbf{V}^{i}}{\partial D^{j}x},\frac{\partial\mathbf{V}^{i}}{\partial Q^{jk}x},\cdots\right)\\
&\ -\mathbf{V}_{1}^{i}dt-D^{i}x\frac{d\mathbf{V}^{0}}{dt}dt\\
=&\ \frac{\partial\mathbf{V}^{i}}{\partial x^{j}}(d^{2}x^{j}-D^{j}xdt)+\frac{1}{2}\frac{\partial^{2}\mathbf{V}^{i}}{\partial x^{j}\partial x^{k}}(dx^{j}\cdot dx^{k}-Q^{jk}xdt)\\
&\ +\left(\frac{\partial\mathbf{V}^{i}}{\partial t}+\frac{\partial\mathbf{V}^{i}}{\partial x^{j}}D^{j}x+\frac{1}{2}\frac{\partial^{2}\mathbf{V}^{i}}{\partial x^{j}\partial x^{k}}Q^{jk}x-\mathbf{V}_{1}^{i}-D^{i}x\frac{d\mathbf{V}^{0}}{dt}\right)dt+\text{terms}\left(\frac{\partial\mathbf{V}^{i}}{\partial D^{j}x},\frac{\partial\mathbf{V}^{i}}{\partial Q^{jk}x},\cdots\right),
\end{split}
\]

and

\[
\begin{split}
&\ \mathcal{L}_{\mathbf{V}}(dx^{j}\cdot dx^{k}-Q^{jk}xdt)\\
=&\ d\mathbf{V}^{j}\cdot dx^{k}+dx^{j}\cdot d\mathbf{V}^{k}-\mathbf{V}_{2}^{jk}dt-Q^{jk}xd\mathbf{V}^{0}=\frac{\partial\mathbf{V}^{j}}{\partial x^{i}}dx^{i}\cdot dx^{k}+\frac{\partial\mathbf{V}^{k}}{\partial x^{i}}dx^{j}\cdot dx^{i}-\mathbf{V}_{2}^{jk}dt-Q^{jk}xd\mathbf{V}^{0}\\
=&\ \frac{\partial\mathbf{V}^{j}}{\partial x^{i}}(dx^{i}\cdot dx^{k}-Q^{ik}xdt)+\frac{\partial\mathbf{V}^{k}}{\partial x^{i}}(dx^{i}\cdot dx^{j}-Q^{ij}xdt)+\left(\frac{\partial\mathbf{V}^{j}}{\partial x^{i}}Q^{ik}x+\frac{\partial\mathbf{V}^{k}}{\partial x^{i}}Q^{ij}x-\mathbf{V}_{2}^{jk}-Q^{jk}x\frac{d\mathbf{V}^{0}}{dt}\right)dt.
\end{split}
\]

Thus, the mixed-order forms $\mathcal{L}_{\mathbf{V}}(d^{2}x^{i}-D^{i}x\,dt)$ and $\mathcal{L}_{\mathbf{V}}(dx^{j}\cdot dx^{k}-Q^{jk}x\,dt)$ are mixed-order contact forms if and only if

\begin{align}
&\text{terms }\frac{\partial\mathbf{V}^{i}}{\partial D^{j}x},\frac{\partial\mathbf{V}^{i}}{\partial Q^{jk}x},\text{ etc., vanish and} \tag{B.5}\\
&\frac{\partial\mathbf{V}^{i}}{\partial t}+\frac{\partial\mathbf{V}^{i}}{\partial x^{j}}D^{j}x+\frac{1}{2}\frac{\partial^{2}\mathbf{V}^{i}}{\partial x^{j}\partial x^{k}}Q^{jk}x-\mathbf{V}_{1}^{i}-D^{i}x\frac{d\mathbf{V}^{0}}{dt}=0, \tag{B.6}\\
&\frac{\partial\mathbf{V}^{j}}{\partial x^{i}}Q^{ik}x+\frac{\partial\mathbf{V}^{k}}{\partial x^{i}}Q^{ij}x-\mathbf{V}_{2}^{jk}-Q^{jk}x\frac{d\mathbf{V}^{0}}{dt}=0. \tag{B.7}
\end{align}

Now (B.5) means that the $\mathbf{V}^{i}$ depend only on the variables on $\mathbb{R}\times M$, so that the vector field $\mathbf{V}$ is also $\pi_{1,0}$-projectable. The two equations (B.6) and (B.7) are just restatements of the prolongation formulae in Theorem 4.14. ∎
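To see the conditions (B.6) and (B.7) at work, one can solve them for $\mathbf{V}_{1}$ and $\mathbf{V}_{2}$ in one space dimension and evaluate them on simple generators. The sketch below is illustrative only: the function name `prolong`, the restriction to $d=1$ and the two sample vector fields are our assumptions, and the symbols `D`, `Q` stand for the jet coordinates $Dx$, $Qx$.

```python
import sympy as sp

t, x, D, Q = sp.symbols("t x D Q")  # d = 1 coordinates (t, x, Dx, Qx)

def prolong(V0, V):
    """Solve (B.6) and (B.7) for V_1 and V_2 when d = 1.

    V0 is the time component (a function of t only) and V is the space
    component (a function of t and x) of a projectable vector field.
    """
    V0_dot = sp.diff(V0, t)
    V1 = (sp.diff(V, t) + sp.diff(V, x) * D
          + sp.Rational(1, 2) * sp.diff(V, x, 2) * Q - D * V0_dot)
    V2 = 2 * sp.diff(V, x) * Q - Q * V0_dot
    return sp.simplify(V1), sp.simplify(V2)

# Parabolic scaling generator 2t d/dt + x d/dx (an illustrative choice).
V1_scaling, V2_scaling = prolong(2 * t, x)
# Space translation d/dx.
V1_shift, V2_shift = prolong(sp.Integer(0), sp.Integer(1))
```

For the scaling generator this gives $\mathbf{V}_{1}=-Dx$ and $\mathbf{V}_{2}=0$, so the drift coordinate is rescaled while the quadratic-variation coordinate is left fixed to first order; the translation has vanishing $\mathbf{V}_{1}$ and $\mathbf{V}_{2}$.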

Appendix C Stochastic Maupertuis’s principle

Based on Definition 7.11, if we further consider the variation caused by time-change, as in classical mechanics (cf. [1, Definition 3.8.4] or the so-called $\Delta$-variation in [35, Section 8.6]), then we need to impose the constraint of constant energy. So the path space $\mathcal{A}_{g}([0,T];q,\mu)$ in (7.10) is modified to

\[
\begin{split}
\mathcal{A}_{g}([0,T];q,\mu;e):=\Big\{(X,\tau):&\ \tau\in C^{2}([0,T],\mathbb{R}),\ \tau^{\prime}>0,\ X\in I_{(\tau(0),q)}^{(\tau(T),\mu)}(M),\\
&\ QX(t)=\check{g}(X(t)),\ \forall t\in[\tau(0),\tau(T)],\ \text{a.s.},\\
&\ \mathbf{E}E_{0}(t,X(t),D_{\nabla}X(t))=e,\ \forall t\in[\tau(0),\tau(T)]\Big\},
\end{split}
\]

where $e\in\mathbb{R}$ is a regular value of $E_{0}$.

Definition C.1.

Given $v\in\mathcal{H}([0,T];q)$ and $\varsigma\in\mathcal{C}^{1}([0,T],\mathbb{R})$, by a variation of the pair $(X,\tau)\in\mathcal{A}_{g}([0,T];q,\mu;e)$ along $(v,\varsigma)$ we mean a family of pairs $\{(X_{\epsilon}^{v,\varsigma},\tau^{\varsigma}_{\epsilon})\}_{\epsilon\in(-\varepsilon,\varepsilon)}$ such that $\tau^{\varsigma}_{0}=\tau$, $\frac{\partial}{\partial\epsilon}\tau^{\varsigma}_{\epsilon}|_{\epsilon=0}=\varsigma$ and, for each $\epsilon$, $\frac{\partial}{\partial t}\tau^{\varsigma}_{\epsilon}>0$ and $X_{\epsilon}^{v,\varsigma}\in I_{(\tau^{\varsigma}_{\epsilon}(0),q)}^{(\tau^{\varsigma}_{\epsilon}(T),\mu)}(M)$. Moreover, for each $t\in[\tau^{\varsigma}_{\epsilon}(0),\tau^{\varsigma}_{\epsilon}(T)]$, we require $\mathbf{E}E_{0}(t,X_{\epsilon}^{v,\varsigma}(t),D_{\nabla}X_{\epsilon}^{v,\varsigma}(t))=e$ and that $X^{v,\varsigma}_{\epsilon}(t)$ satisfies the ODE

\[\frac{\partial}{\partial\epsilon}X^{v,\varsigma}_{\epsilon}(t)=\Gamma(X^{v,\varsigma}_{\epsilon})_{\tau^{\varsigma}_{\epsilon}(0)}^{t}v(t),\quad X^{v,\varsigma}_{0}(t)=X(t). \tag{C.1}\]

Define a functional $\mathcal{I}:\mathcal{A}_{g}([0,T];q,\mu;e)\to\mathbb{R}$ by

\[\mathcal{I}[X,\tau]:=\mathbf{E}\int_{\tau(0)}^{\tau(T)}A_{0}\left(t,X(t),D_{\nabla}X(t)\right)dt.\]

The pair $(X,\tau)\in\mathcal{A}_{g}([0,T];q,\mu;e)$ is called a stationary point of $\mathcal{I}$ if

\[\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\mathcal{I}[X_{\epsilon}^{v,\varsigma},\tau^{\varsigma}_{\epsilon}]=0,\quad\text{for all }v\in\mathcal{H}([0,T];q)\text{ and }\varsigma\in\mathcal{C}^{1}([0,T],\mathbb{R}).\]

As in Lemma 7.13, it is easy to deduce from (C.1) that $QX^{v,\varsigma}_{\epsilon}(t)=\check{g}(X^{v,\varsigma}_{\epsilon}(t))$ for each $t\in[\tau^{\varsigma}_{\epsilon}(0),\tau^{\varsigma}_{\epsilon}(T)]$, so that $(X^{v,\varsigma}_{\epsilon},\tau^{\varsigma}_{\epsilon})\in\mathcal{A}_{g}([0,T];q,\mu;e)$. Moreover, formula (7.13) still holds for all $t\in[\tau(0),\tau(T)]$, with $X^{v,\varsigma}_{\epsilon}$ in place of $X^{v}_{\epsilon}$.

Lemma C.2.

Keep the notation of Definition C.1. Then, in normal coordinates $(x^{i})$ we have

\[\frac{\partial}{\partial\epsilon}\bigg|_{\epsilon=0}\mathbf{E}\left[(X^{v,\varsigma}_{\epsilon})^{i}(\tau^{\varsigma}_{\epsilon}(s))\big|\mathcal{P}_{\tau(s)}\right]=\left(\Gamma(X)_{\tau(0)}^{\tau(s)}v(\tau(s))\right)^{i}+\varsigma(s)(D_{\nabla}X)^{i}(\tau(s)).\]
Proof.

Without loss of generality, we assume $\tau^{\varsigma}_{\epsilon}(s)\geq\tau(s)$. It follows from (C.1) and Definition 2.5 that

\[
\begin{split}
\text{LHS}&=\lim_{\epsilon\to 0}\mathbf{E}\left[\frac{(X^{v,\varsigma}_{\epsilon})^{i}(\tau^{\varsigma}_{\epsilon}(s))-X^{i}(\tau(s))}{\epsilon}\bigg|\mathcal{P}_{\tau(s)}\right]\\
&=\lim_{\epsilon\to 0}\mathbf{E}\left[\frac{(X^{v,\varsigma}_{\epsilon})^{i}(\tau^{\varsigma}_{\epsilon}(s))-X^{i}(\tau^{\varsigma}_{\epsilon}(s))}{\epsilon}\bigg|\mathcal{P}_{\tau(s)}\right]+\lim_{\epsilon\to 0}\mathbf{E}\left[\frac{X^{i}(\tau^{\varsigma}_{\epsilon}(s))-X^{i}(\tau(s))}{\tau^{\varsigma}_{\epsilon}(s)-\tau(s)}\bigg|\mathcal{P}_{\tau(s)}\right]\varsigma(s)\\
&=\text{RHS}.
\end{split}
\]

Done. ∎
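As a sanity check, the formula of Lemma C.2 can be tested in the simplest possible setting: flat one-dimensional space without noise, where parallel transport is the identity, conditional expectations drop out, and $D_{\nabla}X$ is the ordinary derivative. The sketch below rests on these simplifying assumptions; the concrete path, variation field and time change are arbitrary illustrative choices, and `sigma` stands for $\varsigma$.

```python
import sympy as sp

s, eps, u = sp.symbols("s epsilon u")

X = sp.sin(u)          # unperturbed (deterministic) path X(u)
v = u**2               # variation field v(u)
tau = 2 * s + 1        # time change tau(s), with tau' > 0
sigma = sp.cos(s)      # variation of the time change

# Flat, noise-free analogue of (C.1): dX_eps/d(eps) = v, X_0 = X.
X_eps = X + eps * v
# A family of time changes with d(tau_eps)/d(eps) = sigma at eps = 0.
tau_eps = tau + eps * sigma

# Left-hand side: derivative in eps of X_eps(tau_eps(s)) at eps = 0.
lhs = sp.diff(X_eps.subs(u, tau_eps), eps).subs(eps, 0)
# Right-hand side: v(tau(s)) + sigma(s) * X'(tau(s)).
rhs = v.subs(u, tau) + sigma * sp.diff(X, u).subs(u, tau)
difference = sp.simplify(lhs - rhs)
```

In this degenerate case the two terms of the lemma are just the two contributions of the chain rule, and `difference` simplifies to zero.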

Theorem C.3 (Stochastic Maupertuis’s principle).

Let $L_{0}$ be a regular Lagrangian on $\mathbb{R}\times TM$. Let $X\in I_{(0,q)}^{(T,\mu)}(M)$ be such that $(X,\mathbf{Id}_{[0,T]})\in\mathcal{A}_{g}([0,T];q,\mu;e)$. Then, the pair $(X,\mathbf{Id}_{[0,T]})$ is a stationary point of $\mathcal{I}$ if and only if $X$ satisfies the stochastic Euler-Lagrange equation (7.22).

Proof.

Since all diffusions in $\mathcal{A}_{g}([0,T];q,\mu;e)$ have the same average energy $e$, we have

\[\mathcal{I}[X,\tau]=\mathbf{E}\int_{\tau(0)}^{\tau(T)}[L_{0}\left(t,X(t),D_{\nabla}X(t)\right)+e]dt.\]

Denote $V(t)=\Gamma(X)_{0}^{t}v(t)$. As in (7.23),

\[
\begin{split}
\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\mathcal{I}[X_{\epsilon}^{v,\varsigma},\tau^{\varsigma}_{\epsilon}]&=\mathbf{E}\int_{0}^{T}\frac{d}{d\epsilon}\bigg|_{\epsilon=0}L_{0}\left(t,X^{v,\varsigma}_{\epsilon}(t),D_{\nabla}X^{v,\varsigma}_{\epsilon}(t)\right)dt+\varsigma(t)\mathbf{E}[L_{0}\left(t,X(t),D_{\nabla}X(t)\right)+e]\big|_{0}^{T}\\
&=\mathbf{E}\int_{0}^{T}\left[d_{x}L_{0}\left(V(t)\right)+d_{\dot{x}}L_{0}\left(\Gamma(X)_{0}^{t}\dot{v}(t)\right)+\frac{1}{2}(QX)^{ij}(t)d_{\dot{x}}L_{0}\left(R(V(t),\partial_{i})\partial_{j}\right)\right]dt\\
&\quad+\varsigma(t)\mathbf{E}[L_{0}\left(t,X(t),D_{\nabla}X(t)\right)+e]\big|_{0}^{T}.
\end{split}
\]

We apply (7.24) and notice that in the present situation we do not have $v(0)=v(T)=0$ in general. Hence,

\[
\begin{split}
\mathbf{E}\int_{0}^{T}d_{\dot{x}}L_{0}\left(\Gamma(X)_{0}^{t}\dot{v}(t)\right)dt&=\mathbf{E}\int_{0}^{T}\Gamma(X)_{t}^{0}(d_{\dot{x}}L_{0})\left(\dot{v}(t)\right)dt\\
&=\mathbf{E}[d_{\dot{x}}L_{0}\left(V(t)\right)]\big|_{0}^{T}-\mathbf{E}\int_{0}^{T}\frac{\mathbf{D}}{dt}(d_{\dot{x}}L_{0})\left(V(t)\right)dt.
\end{split}
\]

On the other hand, since for all $\epsilon$ we have $X_{\epsilon}^{v,\varsigma}(\tau^{\varsigma}_{\epsilon}(0))=q$ and $\mathbf{P}\circ(X_{\epsilon}^{v,\varsigma}(\tau^{\varsigma}_{\epsilon}(T)))^{-1}=\mu$, it follows from Lemma C.2 that

\[V(s)+\varsigma(s)D_{\nabla}X(s)=0,\quad\text{for }s=0\text{ or }s=T.\]

Therefore,

\[
\begin{split}
\frac{d}{d\epsilon}\bigg|_{\epsilon=0}\mathcal{I}[X_{\epsilon}^{v,\varsigma},\tau^{\varsigma}_{\epsilon}]&=\mathbf{E}\int_{0}^{T}\left(d_{x}L_{0}-\frac{\overline{\mathbf{D}}}{dt}(d_{\dot{x}}L_{0})\right)\left(V(t)\right)dt\\
&\quad+\varsigma(t)\mathbf{E}\left[L_{0}\left(t,X(t),D_{\nabla}X(t)\right)-(d_{\dot{x}}L_{0})\left(D_{\nabla}X(t)\right)+e\right]\big|_{0}^{T}.
\end{split}
\]

By the definition of the energy $E_{0}$, we know that

\[\mathbf{E}\left[L_{0}\left(t,X(t),D_{\nabla}X(t)\right)-(d_{\dot{x}}L_{0})\left(D_{\nabla}X(t)\right)\right]=-\mathbf{E}E_{0}\left(t,X(t),D_{\nabla}X(t)\right)=-e.\]

The result follows. ∎

Data Availability

Our manuscript has no associated data.

Acknowledgements.

We would like to thank Prof. Ana Bela Cruzeiro and Prof. Marc Arnaudon for their careful reading and helpful discussions, which helped us considerably, especially in improving Sections 6.2 and 7.2. We would also like to thank Prof. Maosong Xiang for his helpful suggestions and kind sharing of experience. This paper is supported by FCT, Portugal, project PTDC/MAT-STA/28812/2017, “Schrödinger’s problem and optimal transport: a multidisciplinary perspective (SchröMoka)”.
