A Feynman-Kac Type Theorem for ODEs: Solutions of Second Order ODEs as Modes of Diffusions

Zachary Selk Department of Mathematics, Purdue University, West Lafayette IN 47907, USA. and Harsha Honnappa School of Industrial Engineering, Purdue University, West Lafayette IN 47907, USA. [zselk,honnappa]@purdue.edu

Abstract.

In this article, we prove a Feynman-Kac type result for a broad class of second order ordinary differential equations. The classical Feynman-Kac theorem says that the solution to a broad class of second order parabolic equations is the mean of a particular diffusion. In our situation, we show that the solution to a system of second order ordinary differential equations is the mode of a diffusion, defined through the Onsager-Machlup formalism. One potential utility of our result is to use Monte Carlo type methods to estimate the solutions of ordinary differential equations. We conclude with examples of our result illustrating its utility in numerically solving linear second order ODEs.

2010 Mathematics Subject Classification:

60H30, 60G15, 34B05

H. Honnappa is supported by the NSF grant DMS-1812197.

1. Introduction

The Feynman-Kac theorem is a widely known and extensively used formula that relates the solutions of second order parabolic partial differential equations (PDEs) to the average of a transformation of a diffusion. It was introduced in [22] and has since seen tremendous utility and generalization. The Feynman-Kac formula has been extended to stochastic PDE in e.g. [4, 19, 20]. It has also seen utility in physics, see e.g. [1, 8, 17, 24, 27]. It has been used also in interacting particle systems, see e.g. [7, 11, 12, 13]. It has been used in finance, see e.g. [21].

There are many forms of the Feynman-Kac formula, see e.g. [25] for a standard reference. In one form, the Feynman-Kac formula says the following.

Theorem 1 (Feynman-Kac).

Consider the partial differential equation

\frac{\partial}{\partial t}u(t,x)+\mu(t,x)\frac{\partial}{\partial x}u(t,x)+\frac{1}{2}\sigma^{2}(t,x)\frac{\partial^{2}}{\partial x^{2}}u(t,x)=V(t,x)u(t,x),

on the domain $[0,T]\times\mathbb{R}$ , with terminal condition $u(T,x)=\psi(x)$ and $\mu,\sigma,V$ sufficiently regular. Then

u(t,x)=E_{\mu_{0}}[e^{-\int_{t}^{T}V(s,X(s))ds}\psi(X(T))|X(t)=x],

where $X(t)$ is the solution to the solution to the stochastic differential equation

dX(t)=\mu(t,X(t))dt+\sigma(t,X(t))dB(t),

with initial condition $X(t)=x$ , $B(t)$ a standard Brownian motion and $\mu_{0}$ is the measure associated to $B(t)$ .

Linear second order ordinary differential equations and systems thereof find extensive use in engineering, physics, biology, geometry, control theory and many other areas of pure and applied mathematics. One example of a system of such equations in physics is the matrix Airy equation, which has connections to the intersection theory on the moduli space of curves, see [23]. Further examples abound in the computation of long-run average costs associated with diffusion processes, which can be expressed as solutions of second-order ordinary differential equations (ODEs). In this article, we demonstrate a theorem analogous to Theorem 1 for second order ordinary differential equations. However in the ODE case, solutions are represented as the conditional “mode” of a diffusion instead of a conditional mean. By “mode”, we mean the minimizer of an Onsager-Machlup function. It was first shown in [15] that the Onsager-Machlup function is a Lagrangian for the “most likely path” or “mode” of a diffusion. Our main theorem can be summarized as follows, see Theorem 22 for a full version.

Theorem 2 (Feynman-Kac for ODEs).

Consider the differential equation

(1)

\ddot{y}(t)+f(t)\dot{y}(t)+g(t)y(t)+h(t)=0,

where the unknown $y\in C^{2}([0,T],\mathbb{R})$ , with boundary conditions $y(0)=0$ and $y(T)=a$ . Let $f(t),g(t)$ and $h(t)$ be (almost) arbitrary $C^{1}$ real valued functions. Then there exists a choice of $C^{1}([0,T],\mathbb{R})$ functions $A(t)$ and $D(t)$ that depend on $f,g$ and $h$ so that the solution to (1) is

(2)

y(t)=e^{-V(t)}\operatorname{Mode}\left[X(t)\bigg{|}X(T)=a\right],

where $X(t)=\int_{0}^{t}[A(s)B(s)+D(s)]ds+B(t)$ , $B(t)$ is a standard Brownian motion and additionally $V(t)=2^{-1}\int_{0}^{t}{f(s)}ds$ . Here, the conditional mode of a diffusion is defined as the minimizer of the Onsager-Machlup function defined in Theorem 13 over all paths $z$ with $z(T)=a$ .

The $e^{-V(t)}$ term is analogous to the Feynman-Kac for PDEs case, Theorem 1. The Feynman-Kac formula can first be proven for diffusions with no factor in front of the $u(t,x)$ term. Then including the exponential weight term, one can get more general terms in front of $u(t,x)$ .

The primary tool we will use is the Onsager-Machlup formalism established in [15, 10, 26]. In particular, we will extend the Portmanteau theorem in [26], which relates information projections, Kullback-Leibler weighted control, finding the most likely path of a diffusion and Euler-Lagrange equations. In [15], it was shown that the so-called “Onsager-Machlup” function is a Lagrangian for the most likely path of a diffusion. That is, the minimizer is the “mode” of the path. We show that the solution to (almost) any second order linear ODE is the (potentially weighted) conditional mode a diffusion, defined as the minimizer of the Onsager-Machlup function. Additionally, we extend our results to include systems of second order linear ODEs.

In Section 2, we introduce this formalism and the required stochastic analysis. In Section 3 we extend the Portmanteau theorem of [26] for our purposes. In Section 4 we prove Theorem 2. Finally in Section 5 we give some examples.

2. Preliminaries

In this section, we collect some preliminary results on stochastic analysis on Wiener space and finding modes of diffusions through the Onsager-Machlup formalism. For more information, see [3, 10, 18, 26]. We first define classical Wiener space and the Cameron-Martin space of Brownian motion which play a central role in our paper.

Definition 3.

We say that $(\mathcal{C}_{0}([0,T],\mathbb{R}^{n}),\mathcal{B}(\|\cdot\|_{\infty}),\mu_{0})$ is classical Wiener space, where $\mathcal{C}_{0}[0,T]$ is the space of continuous $\mathbb{R}^{n}$ valued functions $f$ on the compact interval $[0,T]$ with $f(0)=0$ , $\mathcal{B}(\|\cdot\|_{\infty})$ are the Borel sets from the supremum norm $\|\cdot\|_{\infty}$ and $\mu_{0}$ is classical Wiener measure corresponding to a standard $n$ -dimensional Brownian motion $B(t)$ .

Definition 4.

The Cameron-Martin space $\mathcal{H}_{\mu_{0}}$ associated to classical Wiener space is the Sobolev space $W_{0}^{1,2}[0,T]$ of all absolutely continuous functions whose derivative lie in $L^{2}[0,T]$ . We also denote the Cameron-Martin norm of an element $z\in\mathcal{H}_{\mu_{0}}$ by

\|z\|_{\mu_{0}}^{2}:=\int_{0}^{T}\|\dot{z}(t)\|^{2}dt.

The primary reason why we need the Cameron-Martin space is the following theorem, called the Cameron-Martin theorem.

Theorem 5.

[18, Theorem 3.41] Let $\mu_{0}$ be classical Wiener measure on $\mathcal{C}_{0}$ . For $h\in\mathcal{C}_{0}$ , define the pushforward measure $\mu:=T_{h}^{\ast}(\mu_{0})$ by

T_{h}^{\ast}(\mu_{0})(A)=\mu_{0}(A-h),

for all Borel $A\in\mathcal{B}(\|\cdot\|_{\infty})$ . Then the pushforward measure $\mu:=T_{h}^{\ast}(\mu_{0})$ is absolutely continuous with respect to $\mu_{0}$ if and only $h\in\mathcal{H}_{\mu_{0}}$ . Furthermore, the Radon-Nikodym derivative of $\mu$ with respect to $\mu_{0}$ is

\frac{d\mu}{d\mu_{0}}=\exp\left(\int_{0}^{T}\dot{h}(t)\cdot dB(t)-\frac{1}{2}\int_{0}^{T}\|\dot{h}(t)\|^{2}dt\right).

For more information on Cameron-Martin spaces see [18], Chapter 3. We formally define the set of Gaussian shift measures from Theorem 5 to be

(3)

\mathcal{P}:=\{T_{h}^{\ast}(\mu_{0}):h\in\mathcal{H}_{\mu_{0}}\}.

The Cameron-Martin theorem states that one can shift by “deterministic” paths that lie in the Cameron-Martin space in a way that the pushforward measure is absolutely continuous with respect to $\mu_{0}$ . The generalization to shifting classical Wiener measure by more general processes is Girsanov’s theorem. We provide both a forward and a reverse direction here.

Theorem 6.

[3, Lemma 5.76] Let $(\mathcal{C}_{0}[0,T],\mu_{0})$ be classical Wiener space, and assume:

(a) $F$ is a progressively measurable process with respect to the filtration generated by the standard Brownian motion $B(t)$ ;

(b) The sample paths are almost surely in the Cameron-Martin space $W_{0}^{1,2}$ ;

E_{\mu_{0}}\left[\exp\left(\int_{0}^{T}\|f(s)\|^{2}ds\right)\right]<\infty,

where $f(s)=\dot{F}(s)$ , holds.

Then, the process $B(t)-F(t)$ is a standard Brownian motion under $\mu$ with density

\frac{d\mu}{d\mu_{0}}=\exp\left(\int_{0}^{T}f(s)\cdot dB(s)-\frac{1}{2}\int_{0}^{T}\|f(s)\|^{2}ds\right).

Theorem 7.

[3, Theorem 5.72] Let $(\mathcal{C}_{0}[0,T],\mu_{0})$ be classical Wiener space, and suppose $\mu\sim\mu_{0}$ . Then, there exists a progressively measurable process $F(t)$ with sample paths almost surely in the Cameron-Martin space $W_{0}^{1,2}$ so that the process $B(t)-F(t)$ is a standard Brownian motion under $\mu$ . Furthermore, the density is given by

\frac{d\mu}{d\mu_{0}}=\exp\left(\int_{0}^{T}f(s)\cdot dB(s)-\frac{1}{2}\int_{0}^{T}\|f(s)\|^{2}ds\right),

where $f(s)=\dot{F}(s)$ .

We can now introduce Onsager-Machlup formalism for finding the mode of a diffusion. The Onsager-Machlup function is defined as below.

Definition 8.

Let $\mu_{0}$ be a classical Wiener measure on $\mathcal{C}_{0}$ . Let $B^{\delta}(z)\subset\mathcal{B}$ be the open ball of radius $\delta$ around $z$ . Let $\mu$ be another measure that is absolutely continuous with respect to $\mu_{0}$ . If the limit

\lim_{\delta\to 0}\frac{\mu(B^{\delta}(z_{2}))}{\mu(B^{\delta}(z_{1}))}=\exp\left(OM(z_{1})-OM(z_{2})\right)

exists for all $z_{1},z_{2}\in\mathcal{H}_{\mu_{0}}$ , then $OM$ is called the Onsager-Machlup function for $\mu$ .

In order to ensure that the Onsager-Machlup function exists, we need the following hypothesis and theorem taken from [10].

Hypothesis 9.

Let $\Phi:\mathcal{C}_{0}\to\mathbb{R}$ be a functional satisfying the following conditions:

(i) For every $\varepsilon>0$ there is an $M\in\mathbb{R}$ such that

\Phi(\omega)\geq M-\varepsilon\|\omega\|^{2},

for all $\omega\in\mathcal{C}_{0}$ .

(ii) $\Phi$ is locally bounded above, i.e., for every $r>0$ there is a $K=K(r)>0$ such that

\Phi(\omega)\leq K,

for all $\omega\in\mathcal{C}_{0}$ with $\|\omega\|<r$ .

(iii) $\Phi$ is locally Lipschitz continuous, i.e., for every $r>0$ there exists $L=L(r)>0$ such that

|\Phi(\omega_{1})-\Phi(\omega_{2})|\leq L\|\omega_{1}-\omega_{2}\|,

for all $\omega_{1},\omega_{2}\in\mathcal{C}_{0}$ with $\|\omega_{1}\|<r$ and $\|\omega_{2}\|<r$ .

Theorem 10.

[10, Theorem 3.2] Let $\Phi:\mathcal{C}_{0}\to\mathbb{R}$ satisfy Hypothesis 9. Let the measure $\mu$ have the density

\frac{d\mu}{d\mu_{0}}=\frac{e^{-\Phi}}{E_{\mu_{0}}[e^{-\Phi}]}.

Then, the OM function is given by

OM_{\Phi}(z)=\begin{cases}\Phi(z)+\frac{1}{2}\|z\|_{\mu_{0}}^{2}&\text{ if }z\in\mathcal{H}_{\mu_{0}}\\ \infty&\text{ else}\end{cases}.

Remark 11.

The minimizer of the Onsager-Machlup function can be seen as the “most likely” path of the diffusion, which was shown in [15]. This fact is central to the approach in this paper. Furthermore, we define the conditional mode to minimize $OM$ over a region. That is, for a subset $A\subset\mathcal{H}_{\mu_{0}}$ and a process $X$ whose path measure has a density

\frac{d\mu}{d\mu_{0}}=\frac{e^{-\Phi}}{E_{\mu_{0}}[e^{-\Phi}]},

we define

\operatorname{Mode}\left[X(t)\bigg{|}A\right]:=\arg\min_{z\in A}OM_{\Phi}(z),

if the infimum is indeed achieved.

We also recall the definition of Kullback-Leibler divergence. For two probability measures $\mu_{1},\mu_{2}$ on some measurable space $(X,\mathcal{X})$ , the KL divergence is defined by

D_{KL}(\mu_{1}||\mu_{2})=\begin{cases}E_{\mu_{1}}\left[\log\left(\frac{d\mu_{1}}{d\mu_{2}}\right)\right]&\text{ if }\frac{d\mu_{1}}{d\mu_{2}}\text{ exists }\\ \infty&\text{ else}.\end{cases}

We recall three key properties of the KL-divergence: (i) $D_{KL}(\mu||\mu_{0})\geq 0$ ; (ii) $D_{KL}(\mu||\mu_{0})=0$ if and only if $\mu=\mu_{0}$ ; and (iii) $\mu\mapsto D_{KL}(\mu||\mu_{0})$ is convex. The next result shows how to compute the KL-divergence between two shift measures drawn from the Cameron-Martin space $\mathcal{H}_{\mu_{0}}$ .

Theorem 12.

[9, Lemma 3.20] Let $\mu_{0}$ be a centered Gaussian measure on $\mathcal{B}$ with Cameron-Martin space $\mathcal{H}_{\mu_{0}}$ , and let $\mu_{1}=T_{h_{1}}^{\ast}(\mu_{0})$ and $\mu_{2}=T_{h_{2}}^{\ast}(\mu_{0})$ for some $h_{1},h_{2}\in\mathcal{H}_{\mu_{0}}$ . Then, the Radon-Nikodym derivative $\frac{d\mu_{1}}{d\mu_{2}}$ exists and

D_{KL}(\mu_{1}||\mu_{2})=\frac{1}{2}\|h_{1}-h_{2}\|_{\mu_{0}}^{2}.

3. A Portmanteau Theorem

The final result we need is analogous to Theorem 1.1 from [26], a portmanteau theorem that lets us convert from Euler-Lagrange equations, to information projections, to Kullback Leibler (KL) divergence weighted control, to Onsager-Machlup functions for measures on classical Wiener space.

Theorem 13.

Let $\mathcal{C}_{0}$ be a classical Wiener space with Cameron-Martin space $\mathcal{H}_{\mu_{0}}$ (see Definition 4). Let $C:\mathcal{C}_{0}\to\mathbb{R}$ be a functional so that $\mu_{0}(C<+\infty)>0$ and $E_{\mu_{0}}[\exp(-C)\,|C|]<+\infty$ . Furthermore, assume that the functional $\Phi:\mathcal{C}_{0}\to\mathbb{R}$ defined by $\Phi(z)=E_{\mu_{0}}[C(\omega+z)]$ exists and satisfies Hypothesis 9. Define the probability measure $\mu^{\ast}$ on $\mathcal{C}_{0}$ with density

\frac{d\mu^{\ast}}{d\mu_{0}}:=\frac{e^{-C}}{E_{\mu_{0}}[e^{-C}]},

and the probability measure $\tilde{\mu}$ on $\mathcal{C}_{0}$ with density

\frac{d\tilde{\mu}}{d\mu_{0}}:=\frac{e^{-\Phi}}{E_{\mu_{0}}[e^{-\Phi}]}.

Let $\mathcal{P}$ be the set of Gaussian shift measures which are absolutely continuous with respect to $\mu_{0}$ (defined in Eq. (3)). Define $\mathcal{P}_{a}$ to be the subset of $\mathcal{P}$ defined in (3) defined by

(4)

\mathcal{P}_{a}:=\{T_{h}^{\ast}(\mu_{0}):h\in\mathcal{H}_{\mu_{0}},h(T)=a\}.

Furthermore, suppose that $C$ is such that that $E_{\mu_{0}}[C^{2}]<\infty$ and therefore there is a progressively measurable process $\Gamma(t,B(t))$ so that $C=\int_{0}^{T}\Gamma(t,B(t))\cdot dB(t)$ by Itô’s representation theorem. Let $L(t,z,\dot{z})$ be the Lagrangian

(5)

L(t,q(t),\dot{q}(t);\,\Gamma):=E_{\mu_{0}}\left[\|\dot{q}(t)-\Gamma(t,B(t)+q(t))\|^{2}\right].

Suppose that there exist $\alpha>0$ and $\beta\geq 0$ such that $L(t,x,\dot{x})\geq\alpha|\dot{x}|^{2}-\beta$ for all $t\in[0,T]$ , $x\in\mathbb{R}^{n}$ , and $\dot{x}\in\mathbb{R}^{n}$ . Consider the calculus of variations problem:

Consider the following four optimization problems:

(a) (Information projection)

\mathbb{K}:=\inf_{\mu\in\mathcal{P}_{a}}D_{KL}(\mu||\mu^{\ast}).

(b) (State independent KL-weighted control)

\mathbb{P}:=\inf_{\mu\in\mathcal{P}_{a}}\left\{E_{\mu}[C]+D_{KL}(\mu||\mu_{0})\right\}.

\mathbb{M}:=\inf_{z\in\mathcal{H}_{\mu_{0}},z(T)=a}OM_{\Phi}(z).

(d) (Calculus of variations)

\mathbb{L}:\inf_{z\in W_{0}^{1,2}[0,T]}\left\{\mathcal{L}(z,\dot{z};\,\Gamma):=\frac{1}{2}\int_{0}^{T}L(t,z(t),\dot{z}(t);\,\Gamma)dt\text{ s.t. }z(0)=0,z(T)=a\right\}.

Then, the optimal values $\operatorname{val}(\mathbb{K})$ , $\operatorname{val}(\mathbb{P})$ , $\operatorname{val}(\mathbb{M})$ and $\operatorname{val}(\mathbb{L})$ are all attained (and so all three problems have optimal solutions). By the Cameron-Martin theorem (see Theorem 5), we can identify each shift measure $\mu_{z}\in\mathcal{P}$ with its corresponding shift $z\in\mathcal{H}_{\mu_{0}}$ , and so we have

\operatorname{sol}(\mathbb{K})=\operatorname{sol}(\mathbb{P})\equiv\operatorname{sol}(\mathbb{M})=\operatorname{sol}(\mathbb{L}).

In the above, we have used equivalence “ $\equiv$ ” for shorthand to denote that $\mu_{z}\in\operatorname{sol}(\mathbb{P})$ (or $\mu_{z}\in\operatorname{sol}(\mathbb{K})$ ) if and only if its corresponding shift $z\in\operatorname{sol}(\mathbb{M}).$ (or $z\in\operatorname{sol}(\mathbb{L})$ ).

Proof.

Equivalence of (a) and (b) Under assumptions the $\mu_{0}(C<+\infty)>0$ and $E_{\mu_{0}}[\exp(-C)\,|C|]<+\infty$ the measure $\mu^{\ast}$ is well defined, see Lemma 2.4 in [6]. Then we can note that for any $\mu$ equivalent to $\mu^{\ast}$ we have that

	$\displaystyle D_{KL}(\mu\|\|\mu^{\ast})$	$\displaystyle=E_{\mu}\left[\log\left(\frac{d\mu}{d\mu^{\ast}}\right)\right]$
		$\displaystyle=E_{\mu}\left[\log\left(\frac{d\mu}{d\mu_{0}}\right)\right]-E_{\mu}\left[\log\left(\frac{d\mu^{\ast}}{d\mu_{0}}\right)\right]$
		$\displaystyle=E_{\mu}[C]+D_{KL}(\mu\|\|\mu_{0})-\log E_{\mu_{0}}[e^{-C}].$

Taking an infimum of both sides shows that problem (a) is equivalent to problem (b).

Equivalence of (a) and (d) We just have to check that for all $\mu\in\mathcal{P}$ where $\mathcal{P}$ is defined in (3), with shift $q\in\mathcal{H}_{\mu_{0}}$ we have that

D_{KL}(\mu||\mu^{\ast})=\frac{1}{2}\int_{0}^{T}L(t,q(t),\dot{q}(t);\Gamma)dt.

To this aim, we note that if $\mu\in\mathcal{P}$ where $\mathcal{P}$ is defined in (3) with corresponding shift $q$ that

\frac{d\mu}{d\mu^{\ast}}=\exp\left(\int_{0}^{T}(\dot{q}-\Gamma(t,B(t)))\cdot dB(t)-\frac{1}{2}\int_{0}^{T}\|\dot{q}\|^{2}-\|\Gamma(t,B(t))\|^{2}ds\right).

By Girsanov’s theorem, Theorem 6, we have that $B(t)=\tilde{B}(t)+\dot{q}(t)$ where $\tilde{B}(t)$ is a Brownian motion under $\mu$ . Therefore the KL divergence is thus

	$\displaystyle D_{KL}(\mu\|\|\mu^{\ast})$	$\displaystyle=E_{\mu}\bigg{[}\int_{0}^{T}(\dot{q}(t)-\Gamma(t,\tilde{B}(t)+q(t)))\cdot d(\tilde{B}(t)+q(t))$
		$\displaystyle\qquad-\frac{1}{2}\int_{0}^{T}(\\|\dot{q}\\|^{2}-\\|\Gamma(t,\tilde{B}(t)+q(t))\\|^{2})dt\bigg{]}.$

Using the mean zero property of Itô integration yields that

	$\displaystyle D_{KL}(\mu\|\|\mu^{\ast})$	$\displaystyle=E_{\mu}\bigg{[}\int_{0}^{T}(\\|\dot{q}(t)\\|^{2}-\Gamma(t,\tilde{B}(t)+q(t)))\cdot\dot{q}(t))dt$
		$\displaystyle\qquad-\frac{1}{2}\int_{0}^{T}(\\|\dot{q}\\|^{2}-\\|\Gamma(t,\tilde{B}(t)+q(t))\\|^{2})dt\bigg{]}.$

Therefore using Fubini’s theorem and a polarization identity we conclude that

D_{KL}(\mu||\mu^{\ast})=\frac{1}{2}\int_{0}^{T}E_{\mu_{0}}\left[\|\dot{q}(t)-\Gamma(t,B(t)+q(t))\|^{2}\right]dt.

The assumptions on the Lagrangian $L$ imply that there is a unique solution to problem (d) and hence there is a solution to problem (a).

Equivalence of (a) and (c) Using Theorem 12 and the definition of $\Phi$ for each $\mu\in\mathcal{P}$ with corresponding shift $z_{\mu}\in\mathcal{H}_{\mu_{0}}$ , we have that

	$\displaystyle D_{KL}(\mu\|\|\mu^{\ast})$	$\displaystyle=E_{\mu}[C]+D_{KL}(\mu\|\|\mu_{0})-\log E_{\mu_{0}}[e^{-C}]$
		$\displaystyle=E_{\mu_{0}}[C(\omega+z_{\mu})]+\frac{1}{2}\\|z_{\mu}\\|_{\mu_{0}}^{2}-\log E_{\mu_{0}}[e^{-C}]$
		$\displaystyle=\Phi(z_{\mu})+\frac{1}{2}\\|z_{\mu}\\|_{\mu_{0}}^{2}-\log E_{\mu_{0}}[e^{-C}]$
		$\displaystyle=OM_{\Phi}(z_{\mu})-\log E_{\mu_{0}}[e^{-C}].$

Using the equivalence between (a) and (d), along with the assumptions on the Lagrangian $L$ concludes.

∎

Remark 14.

Theorem 13 is based on [26, Theorem 1.1]. However, there are three key differences. First, parts (a), (b), and (c) are stated in terms of a general Gaussian measure in [26, Theorem 1.1] whereas in Theorem 13 it is stated for a $n$ dimensional Brownian motion. Second, part (d) in [26, Theorem 1.1] is only stated in terms of a $1$ dimensional Brownian motion whereas in Theorem 13 it is for $n$ dimensions. Third, in [26, Theorem 1.1], everything is stated in terms of unconditional mode instead of conditional mode from Theorem 13.

4. Proof of Feynman-Kac for ODEs

In this section, we restate Theorem 2 in more detail and offer a proof. For pedagogical reasons, we first prove the result for a single ODE. We then extend to the case of a system of ODEs. The proof for the system is similar but it is included for completeness.

4.1. Feynman-Kac for single ODE

In this subsection we prove our Feynman-Kac result for a single linear second order ODE. First we prove the case in which $f(t)=0$ and thus $V(t)=0$ and we are working with no exponential weight. We then use this to construct the solution for general $f$ . First we have the following lemma.

Lemma 15.

If $\Gamma(t,x)=A(t)x+D(t)$ for $C^{1}([0,T],\mathbb{R}^{n\times n})$ matrix $A$ and $C^{1}([0,T],\mathbb{R}^{n})$ vector $D$ , then $\mu^{\ast}=\tilde{\mu}$ .

Proof.

We just have to show that $C=\Phi$ . Recall that $C=\int_{0}^{T}\Gamma(t,B(t))\cdot dB(t)$ . To the stated aim, we write that

	$\displaystyle\Phi(z)$	$\displaystyle=E_{\mu_{0}}[C(\omega+z)]$
		$\displaystyle=E_{\mu_{0}}\left[\int_{0}^{T}[A(t)(B(t)+z(t))+D(t)]\cdot d(B(t)+z(t))\right]$
		$\displaystyle=\int_{0}^{T}[A(t)z(t)+D(t)]\cdot dz(t)$
		$\displaystyle=C(z).$

∎

Proposition 16.

Consider the differential equation

(6)

\ddot{y}(t)+g(t)y(t)+h(t)=0,

where the unknown $y\in C^{2}([0,T],\mathbb{R})$ , with boundary conditions $y(0)=0$ and $y(T)=a$ . Let $g(t)$ and $h(t)$ be $C^{1}([0,T],\mathbb{R})$ functions so that the differential equations

(7)

A^{2}(t)+\dot{A}(t)=-g(t)

and

\dot{D}(t)+D(t)A(t)=-h(t)

have solutions, $A(t),D(t)$ . Let $X$ be the diffusion corresponding to $\mu^{\ast}$ given in Theorem 13 with $\Gamma(t,x)=A(t)x+D(t)$ . Then the solution to (6) is

(8)

y(t)=\operatorname{Mode}\left[X(t)\bigg{|}X(T)=a\right],

Here, the conditional mode of a diffusion is defined as the minimizer of the Onsager-Machlup function defined in Theorem 13 over all paths $z$ with $z(T)=a$ , as in Remark 11.

Proof.

Let $\Gamma(t,x)=A(t)x+D(t)$ for the Lagrangian defined in Theorem 13. The Lagrangian is thus

(9)

L(t,q,\dot{q})=E_{\mu_{0}}\left[(\dot{q}(t)-A(t)(B(t)+q(t))-D(t))^{2}\right].

First, we note that all of the hypotheses of Theorem 13 hold as $A,D$ are both $C^{1}$ functions by assumption. Computing the derivatives of $L$ yields that

L_{q}(t,q,\dot{q})=-2A(t)\left(\dot{q}(t)-A(t)q(t)-D(t)\right)

and

L_{\dot{q}}(t,q,\dot{q})=2\left(\dot{q}(t)-A(t)q(t)-D(t)\right).

The Euler Lagrange equation for $L$ is then

-A(t)\left(\dot{q}(t)-A(t)q(t)-D(t)\right)-\ddot{q}(t)+\dot{A}(t)q(t)+A(t)\dot{q}(t)+\dot{D}(t)=0.

We may simplify this as

(10)

-\ddot{q}(t)+q(t)\left(A^{2}(t)+\dot{A}(t)\right)+\left(\dot{D}(t)+A(t)D(t)\right)=0.

By the assumption of $A$ and $D$ , we have that the Euler Lagrange equation for the Lagrangian $L$ is equation (6). Therefore by Theorem 13 we have that the solution to equation (6) is the conditional mode of $\tilde{\mu}$ . By Lemma 15 we have that $\mu^{\ast}=\tilde{\mu}$ . Therefore the solution to equation (6) is the conditional mode of $\mu^{\ast}$ . ∎

Remark 17.

The equation

(11)

A^{2}(t)+\dot{A}(t)=-g(t)

is an example of a Riccati equation, whose general form is

\dot{A}(t)=\gamma_{0}(t)+\gamma_{1}(t)A(t)+\gamma_{2}(t)A^{2}(t),

where $\gamma_{i}$ are continuous functions and $\gamma_{0},\gamma_{2}\neq 0$ . See [2] for conditions on solving Riccati equations explicitly. To this aim, they convert the Riccati equation to a second order linear differential equation

\ddot{z}(t)+(-\gamma_{1}(t)+\dot{\gamma}_{2}(t)/\gamma_{2}(t))\dot{z}(t)+\gamma_{0}(t)\gamma_{2}(t)z(t)=0,

where $A(t)=-\dot{z}(t)/(\gamma_{2}(t)z(t)).$ In the case of Theorem 16, the Riccati equation (11) turns into the homogeneous equation

\ddot{z}(t)+g(t)z(t)=0,

with $A(t)=-\dot{z}(t)/z(t).$ Note however that this is not quite the homogeneous version of (6) because in (6) we insist that $y(0)=0$ .

Explicitly giving $C$ for an equation (6) comes down to the solvability of the equation (11). We give an example of when (11) is explicitly solvable as a corollary.

Corollary 18.

Consider the differential equation

(12)

\ddot{y}(t)+gy(t)+h(t)=0,

where the unknown $y\in C^{2}([0,T],\mathbb{R})$ , with boundary conditions $y(0)=0$ and $y(T)=a$ . Let $h(t)$ be $C^{1}([0,T],\mathbb{R})$ and let $g\leq 0$ be a constant. Then the solution to (12) is

(13)

y(t)=\operatorname{Mode}\left[X(t)\bigg{|}X(T)=a\right],

where $X$ is the diffusion

X(t)=\int_{0}^{t}\left(\sqrt{-g}B(u)-e^{-\sqrt{-g}}\int_{1}^{u}e^{s\sqrt{-g}}h(s)ds\right)du+B(t).

Remark 19.

One might ask if one can find a $\Gamma(t,x)$ that yields an Euler-Lagrange equation of the form (1) with $f\neq 0$ . The answer is no, and the $\Gamma(t,x)$ used in Proposition 16 is of maximal generality, as the below lemma will show.

Lemma 20.

Let $\Gamma(t,x)$ as in Portmanteau theorem be such that $\frac{\partial}{\partial x}\Gamma(t,x)$ depends nontrivially on $x$ . Then the resulting Euler Lagrange equation has a nonlinearity in $q$ .

Proof.

We write down the Lagrangian as

L(t,q,\dot{q})=E_{\mu_{0}}\left[(\dot{q}(t)-\Gamma(t,B(t)+q(t)))^{2}\right].

Taking derivatives yields

L_{q}(t,q,\dot{q})=2E_{\mu_{0}}\left[(\dot{q}(t)-\Gamma(t,B(t)+q(t)))(-\Gamma_{x}(t,B(t)+q(t)))\right],

and

L_{\dot{q}}(t,q,\dot{q})=2E_{\mu_{0}}\left[(\dot{q}(t)-\Gamma(t,B(t)+q(t)))\right].

The Euler-Lagrange equation for $L$ is thus

	$\displaystyle-\dot{q}(t)E_{\mu_{0}}[\Gamma_{x}(t,B(t)+q(t))]+E_{\mu_{0}}[\Gamma(t,B(t)+q(t))\Gamma_{x}(t,B(t)+q(t))]$
	$\displaystyle\leavevmode\nobreak\ \leavevmode\nobreak\ -\ddot{q}(t)+\frac{d}{dt}E_{\mu_{0}}[\Gamma(t,B(t)+q(t))]=0.$

Because $\Gamma_{x}(t,x)$ depends nontrivially on $x$ , the term

E_{\mu_{0}}[\Gamma(t,B(t)+q(t))\Gamma_{x}(t,B(t)+q(t))]

depends nonlinearly on $q$ with no terms involving $\dot{q}$ or $\ddot{q}$ . The three other terms will all involve a $\dot{q}$ or a $\ddot{q}$ , so there is no chance for cancellation. ∎

Corollary 21.

If an ODE

\ddot{y}(t)+f(t)\dot{y}(t)+g(t)y(t)+h(t)=0,

where $f,g,h\in C^{1}(0,T)$ , comes from the Euler-Lagrange equation in Theorem 13, then $\Gamma(t,x)=A(t)x+C(t),$ for some $A,C\in C^{1}(0,T)$ . Thus $f=0$ in Proposition 16 is of maximal generality.

We finally prove our main theorem for general $f$ by applying a simple transformation to solutions given in Proposition 16. This is the full version of Theorem 2.

Theorem 22 (Feynman-Kac for ODEs).

Consider the differential equation

(14)

\ddot{y}(t)+f(t)\dot{y}(t)+g(t)y(t)+h(t)=0,

where the unknown $y\in C^{2}([0,T],\mathbb{R})$ , with boundary conditions $y(0)=0$ and $y(T)=a$ . Let $f(t),g(t)$ and $h(t)$ be $C^{1}$ real valued functions. Let $V(t)=2^{-1}\int_{0}^{t}f(s)ds$ . Define the functions

\hat{g}(t)=-\ddot{V}(t)+(\dot{V}(t))^{2}-f(t)\dot{V}(t)+g(t)

and

\hat{h}(t)=e^{V(t)}h(t).

Suppose that the differential equations

A^{2}(t)+\dot{A}(t)=-\hat{g}(t)

and

\dot{D}(t)+D(t)A(t)=-\hat{h}(t)

have solutions, $A(t),D(t)$ , implying that the solution $\hat{y}(t)=\operatorname{Mode}\left[X\mid X(T)=a\right]$ given in Proposition 16 to the equation

\hat{y}^{\prime\prime}(t)+\hat{g}(t)\hat{y}(t)+\hat{h}(t)=0

exists. Then the solution to (14) is

(15)

y(t)=e^{-V(t)}\hat{y}(t)=e^{-V(t)}\operatorname{Mode}\left[X\mid X(T)=a\right].

Proof.

Differentiating $y$ once yields that

\dot{y}(t)=e^{-V(t)}[\hat{y}^{\prime}(t)-\dot{V}(t)\hat{y}(t)].

Differentiating again gives

\ddot{y}(t)=e^{-V(t)}[\hat{y}^{\prime\prime}(t)-\ddot{V}(t)\hat{y}(t)-\dot{V}(t)\hat{y}^{\prime}(t)-\dot{V}(t)\hat{y}^{\prime}(t)+(\dot{V}(t))^{2}\hat{y}(t)].

Checking that $y$ satisfies equation (14) concludes that

	$\displaystyle\ddot{y}(t)+f(t)\dot{y}(t)+g(t)y(t)+h(t)$	$\displaystyle=e^{-V(t)}[\hat{y}^{\prime\prime}(t)+\hat{g}(t)\hat{y}(t)]+h(t)$
		$\displaystyle=e^{-V(t)}[-e^{V(t)}h(t)]+h(t)$
		$\displaystyle=0.$

∎

4.2. Feynman-Kac for Systems of Equations

In this subsection, we extend Theorem 22 to the case of systems of linear second order ODEs. The proofs are similar to the proofs in the previous section, but we include them for completeness.

Proposition 23.

Consider the system of differential equations

(16)

\ddot{y}(t)+g(t)y(t)+h(t)=0,

where the unknown $y\in C^{2}([0,T],\mathbb{R}^{n})$ , with boundary conditions $y(0)=0$ and $y(T)=a$ . Let $g(t)$ be a $C^{1}([0,T],\mathbb{R}^{n\times n})$ matrix and let $h(t)$ be a $C^{1}([0,T],\mathbb{R}^{n})$ vector so that the differential equations

A^{T}(t)A(t)+\dot{A}(t)=-g(t)

and

\dot{D}(t)+A^{T}(t)D(t)=-h(t)

have solutions, $A(t),D(t)$ . Let $X$ be the diffusion corresponding to $\mu$ given in Theorem 13 with $\Gamma(t,x)=A(t)x+D(t)$ . Then the solution to (16) is

(17)

y(t)=\operatorname{Mode}\left[X(t)\bigg{|}X(T)=a\right],

Here, the conditional mode of a diffusion is defined as the minimizer of the Onsager-Machlup function defined in Theorem 13 over all paths $z$ with $z(T)=a$ , as in Remark 11.

Proof.

All the assumptions for Theorem 13 are satisfied so we may write the Lagrangian for $\Gamma$ as

(18)

L(t,q,\dot{q})=E_{\mu_{0}}\left[\|\dot{q}(t)-A(t)(B(t)+q(t))-D(t)\|^{2}\right].

Differentiating under the integral sign yields that

	$\displaystyle L_{q}(t,q,\dot{q})$	$\displaystyle=-E_{\mu_{0}}\left[2(\dot{q}(t)-A(t)(B(t)+q(t))-D(t))^{T}A(t)\right]$
		$\displaystyle=-2(\dot{q}(t)-A(t)q(t)-D(t))^{T}A(t)$
		$\displaystyle=-2(\dot{q}^{T}(t)A(t)-q^{T}(t)A^{T}(t)A(t)-D^{T}(t)A(t))$

and

	$\displaystyle L_{\dot{q}}(t,q,\dot{q})$	$\displaystyle=E_{\mu_{0}}\left[2(\dot{q}(t)-A(t)(B(t)+q(t))-D(t))^{T}\right]$
		$\displaystyle=2(\dot{q}(t)-A(t)q(t)-D(t))^{T}$
		$\displaystyle=2(\dot{q}^{T}(t)-q^{T}(t)A^{T}(t)-D^{T}(t)).$

The Euler-Lagrange equation is then

-\dot{q}^{T}(t)A(t)+q^{T}(t)A^{T}(t)A(t)+D^{T}(t)A(t)-\ddot{q}^{T}(t)+q^{T}\dot{A}^{T}(t)+\dot{q}^{T}(t)A^{T}(t)+\dot{D}^{T}(t)=0.

Collecting terms yields that

(19)

-\ddot{q}^{T}(t)+q^{T}(t)(A^{T}(t)A(t)+\dot{A}(t))+D^{T}(t)A(t)+\dot{D}^{T}(t)=0.

Taking the transpose of equation (19) yields that

-\ddot{q}(t)+(A^{T}(t)A(t)+\dot{A}^{T}(t))\dot{q}(t)+A^{T}(t)D(t)+\dot{D}(t)=0,

which is equation (16). ∎

Theorem 24 (Feynman-Kac for system of ODEs).

Consider the system of differential equations

(20)

\ddot{y}(t)+f(t)\dot{y}(t)+g(t)y(t)+h(t)=0,

where the unknown $y\in C^{2}([0,T],\mathbb{R}^{n})$ , with boundary conditions $y(0)=0$ and $y(T)=a$ . Let $f(t),g(t)$ be $C^{1}([0,T],\mathbb{R}^{n\times n})$ matrices and let $h(t)$ be a $C^{1}([0,T],\mathbb{R}^{n})$ vector. Let $V(t)=2^{-1}\int_{0}^{t}f(s)ds$ . Suppose that the matrix exponentials $e^{V(t)}$ and $e^{-V(t)}$ exist and $e^{-V(t)}$ commutes with both $f(t)$ and $g(t)$ . Define the functions

\hat{g}(t)=-\ddot{V}(t)+(\dot{V}(t))^{2}-f(t)\dot{V}(t)+g(t)

and

\hat{h}(t)=e^{V(t)}h(t),

where $e^{V(t)}$ denotes the standard exponential of a matrix. Suppose that the differential equations

(21)

A^{T}(t)A(t)+\dot{A}(t)=-\hat{g}(t)

and

\dot{D}(t)+A^{T}(t)D(t)=-\hat{h}(t)

have solutions, $A(t),D(t)$ , implying that the solution $\hat{y}(t)=\operatorname{Mode}\left[X\mid X(T)=a\right]$ given in Proposition 23 to the equation

\hat{y}^{\prime\prime}(t)+\hat{g}(t)\hat{y}(t)+\hat{h}(t)=0

exists. Then the solution to (14) is

(22)

y(t)=e^{-V(t)}\hat{y}(t)=e^{-V(t)}\operatorname{Mode}\left[X\mid X(T)=a\right].

Proof.

Differentiating $y$ once yields that

\dot{y}(t)=e^{-V(t)}[\hat{y}^{\prime}(t)-\dot{V}(t)\hat{y}(t)].

Differentiating again gives

\ddot{y}(t)=e^{-V(t)}[\hat{y}^{\prime\prime}(t)-\ddot{V}(t)\hat{y}(t)-\dot{V}(t)\hat{y}^{\prime}(t)-\dot{V}(t)\hat{y}^{\prime}(t)+(\dot{V}(t))^{2}\hat{y}(t)].

Checking that $y$ satisfies equation (14) using the commutivity assumption concludes that

	$\displaystyle\ddot{y}(t)+f(t)\dot{y}(t)+g(t)y(t)+h(t)$	$\displaystyle=e^{-V(t)}[\hat{y}^{\prime\prime}(t)+\hat{g}(t)\hat{y}^{\prime}(t)]+h(t)$
		$\displaystyle=e^{-V(t)}[-e^{V(t)}h(t)]+h(t)$
		$\displaystyle=0.$

∎

5. Examples

In this section, we give a few examples of the process $X$ for various ODEs.

Example 1.

Consider the ODE

\ddot{y}(t)-y(t)=0.

As there is no $\dot{y}$ term, we may use Proposition 16 directly. In this case, $A(t)$ solves the first order differential equation

A^{2}(t)+\dot{A}(t)=1.

Thus we may let $A=\pm 1$ . As there is no $h$ term, we may simply let $D=0$ . Therefore choosing $\Gamma(t,x)=x$ and thus

C=\int_{0}^{T}B(t)dB(t).

Thus the solution is the conditional mode of the process $X(t)$ with density

\frac{d\mu^{\ast}}{d\mu_{0}}=\frac{e^{-C}}{E_{\mu_{0}}[e^{-C}]}.

That is, the process

X(t)=\int_{0}^{t}B(s)ds+B(t).

Example 2.

Consider the ODE

\ddot{y}(t)+\dot{y}(t)-y(t)=0.

As $f\neq 0$ we must use the full Theorem 22. We note that $\hat{g}(t)=-1$ and $\hat{h}(t)=-1$ . So therefore $X$ is the same as the previous example and we only need to include the exponential weight to get that

y(t)=e^{-t/2}\operatorname{Mode}\left[X(t)\mid X(T)=a\right]

Example 3.

Consider the ODE

\ddot{y}(t)-y(t)+t=0.

Then we can let $A(t)=1$ and $D(t)=1-t$ . Therefore $\Gamma(t,x)=x+1-t$ . Therefore

C=\int_{0}^{T}[B(t)+1-t]dB(t).

Thus the solution is the conditional mode of the process $X(t)$ with density

\frac{d\mu^{\ast}}{d\mu_{0}}=\frac{e^{-C}}{E_{\mu_{0}}[e^{-C}]}.

That is, the process is

X(t)=\int_{0}^{t}[B(s)+1-s]ds+B(t).

We conclude with an example of an ODE so that equation (7) cannot be solved explicitly.

Example 4.

Consider the Airy ODE

\ddot{y}(t)-ty(t)=0,

with boundary conditions $y(0)=0$ and $y(1)=2$ . As $f=0$ we may use Proposition 16. $A$ must satisfy

A^{2}(t)+\dot{A}(t)=t.

However, this equation is not explicitly solvable by hand. Therefore we estimate $A$ using a regular perturbation expansion and use part (d) of our portmanteau Theorem 13 to estimate the solution. We numerically minimize the functional

\frac{1}{2}\int_{0}^{T}E_{\mu_{0}}[(\dot{q}(t)-A(t)B(t))^{2}]dt

over all $q$ which are Legendre polynomials of degree less than $200$ with the estimated $A$ , we call this minimizer $q_{\ast}$ . We plot the estimated solutions against the true solution, along with the value of the functional

L(q_{\ast})=\int_{0}^{T}(\ddot{q}_{\ast}(t)-tq_{\ast}(t))^{2}dt

for various degrees of approximation in Figure 1.

Refer to caption — (a) Approximate solution compared to the Airy function.

The approximation can be expected to improve further by employing higher degree Legendre polynomials. However, as Figure 1(B) and Figure 1(C) show, this improvement is very slow. Nonetheless, what we have demonstrated is a completely new approach to solving the Airy equation.

6. Conclusions

We presented a new Feynman-Kac type formula for systems of linear second order ODEs, by demonstrating that the solution of the ODE is the mode of a specific diffusion process that is determined by the coefficients of the ODE. To the best of our knowledge, this characterization of the solution of a given linear ODE in terms of a path functional of a stochastic process is novel and has not been identified in the literature before. This result opens the door to new ways of solving second order linear ODEs, potentially leveraging stochastic simulation, just as the Feynman-Kac theorem has led to the development of Monte Carlo methods for solving linear parabolic PDEs. We gave a few examples of the utility of our result for numerically solving ODEs. The primary bottleneck to numerically implementing this method is the calculation of $A$ which, as noted before, is the solution to a Riccati equation which is known to be difficult to solve explicitly. There is work, of course, on numerical solutions of Riccati equations, see e.g. [5]. In our current example with the Airy equation, we compute an approximate solution to the Ricatti equation by using a regular perturbation expansion (which entails the introduction of some ‘bias’ into final solution to the second order ODE).

However, once $A$ is identified, numerically computing the solution to the second order ODE can be done reasonably efficiently. Part (d) of our Portmanteau in Theorem 13 provides one possible way of computing the mode. Alternatively, the part (a) of the theorem suggests that the mode could be computed by solving the constrained information projection. One approach to doing this would be to use the Iterative Proportional Fitting Procedure (IPFP) or its discrete counterpart the Sinkhorn algorithm [14], for instance. There are also direct methods for computing the mode; see [16] for examples using a trapezoidal discretization.

To conclude, we pose two open questions:

Question 1.

Is there an efficient way of numerically estimating the solution without estimating $A$ first?

Question 2.

Can we extend our results to nonlinear ODE? That is, given a general equation

\ddot{y}(t)+F(t,y,\dot{y})=0,

is there a condition on $F$ so that we can guarantee to find a corresponding $\Gamma$ ?

Acknowledgements

The authors would like to acknowledge Zihe Zhou for her code for the Airy equation, Example 4.

References

[1] Luigi Accardi. On the quantum Feynman-Kac formula. Rend. Sem. Mat. Fis. Milano, 48:135–180 (1980), 1978.
[2] Anas Al Bastami, Milivoj R. Belić, and Nikola Z. Petrović. Special solutions of the Riccati equation with applications to the Gross-Pitaevskii nonlinear PDE. Electron. J. Differential Equations, pages No. 66, 10, 2010.
[3] Fabrice Baudoin. Diffusion processes and stochastic calculus. EMS Textbooks in Mathematics. European Mathematical Society (EMS), Zürich, 2014.
[4] Lorenzo Bertini and Nicoletta Cancrini. The stochastic heat equation: Feynman-Kac formula and intermittence. J. Statist. Phys., 78(5-6):1377–1401, 1995.
[5] Jafar Biazar and Mohsen Didgar. Numerical solution of Riccati equations by the Adomian and asymptotic decomposition methods over extended domains. Int. J. Differ. Equ., pages Art. ID 580741, 7, 2015.
[6] Joris Bierkens and Hilbert J. Kappen. Explicit solution of relative entropy weighted control. Systems Control Lett., 72:36–43, 2014.
[7] F. Cérou, P. Del Moral, and A. Guyader. A nonasymptotic theorem for unnormalized Feynman-Kac particle models. Ann. Inst. Henri Poincaré Probab. Stat., 47(3):629–649, 2011.
[8] K. L. Chung and K. M. Rao. Feynman-Kac functional and the Schrödinger equation. In Seminar on Stochastic Processes, 1981 (Evanston, Ill., 1981), volume 1 of Progr. Prob. Statist., pages 1–29. Birkhäuser, Boston, Mass., 1981.
[9] Y Dan. Bayesian inference for Gaussian models: Inverse problems and evolution equations. 2020.
[10] M. Dashti, K. J. H. Law, A. M. Stuart, and J. Voss. MAP estimators and their consistency in Bayesian nonparametric inverse problems. Inverse Problems, 29(9):095017, 27, 2013.
[11] P. Del Moral and L. Miclo. Branching and interacting particle systems approximations of Feynman-Kac formulae with applications to non-linear filtering. In Séminaire de Probabilités, XXXIV, volume 1729 of Lecture Notes in Math., pages 1–145. Springer, Berlin, 2000.
[12] Pierre Del Moral. Feynman-Kac formulae. Probability and its Applications (New York). Springer-Verlag, New York, 2004. Genealogical and interacting particle systems with applications.
[13] Pierre Del Moral, Arnaud Doucet, and Sumeetpal S. Singh. A backward particle interpretation of Feynman-Kac formulae. M2AN Math. Model. Numer. Anal., 44(5):947–975, 2010.
[14] Simone Di Marino and Augusto Gerolin. An optimal transport approach for the schrödinger bridge problem and convergence of sinkhorn algorithm. Journal of Scientific Computing, 85(2):1–28, 2020.
[15] Detlef Dürr and Alexander Bach. The Onsager-Machlup function as Lagrangian for the most probable path of a diffusion process. Communications in Mathematical Physics, 60(2):153–170, 1978.
[16] Dimas Abreu Dutra, Bruno Otávio Soares Teixeira, and Luis Antonio Aguirre. Maximum a posteriori state path estimation: discretization limits and their interpretation. Automatica J. IFAC, 50(5):1360–1368, 2014.
[17] Batu Güneysu, Matthias Keller, and Marcel Schmidt. A Feynman-Kac-Itô formula for magnetic Schrödinger operators on graphs. Probab. Theory Related Fields, 165(1-2):365–399, 2016.
[18] Martin Hairer. An introduction to stochastic PDEs. arXiv preprint arXiv:0907.4178, 2009.
[19] Yaozhong Hu, Fei Lu, and David Nualart. Feynman-Kac formula for the heat equation driven by fractional noise with Hurst parameter $H<1/2$ . Ann. Probab., 40(3):1041–1068, 2012.
[20] Yaozhong Hu, David Nualart, and Jian Song. Feynman-Kac formula for heat equation driven by fractional white noise. Ann. Probab., 39(1):291–326, 2011.
[21] Svante Janson and Johan Tysk. Feynman-Kac formulas for Black-Scholes-type operators. Bull. London Math. Soc., 38(2):269–282, 2006.
[22] M. Kac. On distributions of certain Wiener functionals. Trans. Amer. Math. Soc., 65:1–13, 1949.
[23] Maxim Kontsevich. Intersection theory on the moduli space of curves and the matrix Airy function. Comm. Math. Phys., 147(1):1–23, 1992.
[24] József Lőrinczi, Fumio Hiroshima, and Volker Betz. Feynman-Kac-type theorems and Gibbs measures on path space, volume 34 of De Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin, 2011. With applications to rigorous quantum field theory.
[25] Bernt Ø ksendal. Stochastic differential equations. Universitext. Springer-Verlag, Berlin, sixth edition, 2003. An introduction with applications.
[26] Zachary Selk, William Haskell, and Harsha Honnappa. Information projection on banach spaces with applications to state independent kl-weighted optimal control. Appl. Math. Optim., 2021.
[27] Zhong Xin Zhao. Green function for Schrödinger operator and conditioned Feynman-Kac gauge. J. Math. Anal. Appl., 116(2):309–334, 1986.

	$\displaystyle D_{KL}(\mu\|\|\mu^{\ast})$	$\displaystyle=E_{\mu}[C]+D_{KL}(\mu\|\|\mu_{0})-\log E_{\mu_{0}}[e^{-C}]$
		$\displaystyle=E_{\mu_{0}}[C(\omega+z_{\mu})]+\frac{1}{2}\\|z_{\mu}\\|_{\mu_{0}}^{2}-\log E_{\mu_{0}}[e^{-C}]$
		$\displaystyle=\Phi(z_{\mu})+\frac{1}{2}\\|z_{\mu}\\|_{\mu_{0}}^{2}-\log E_{\mu_{0}}[e^{-C}]$
		$\displaystyle=OM_{\Phi}(z_{\mu})-\log E_{\mu_{0}}[e^{-C}].$