
Gradient Flow Structure of a Multidimensional Nonlinear Sixth Order Quantum-Diffusion Equation

Daniel Matthes, Zentrum Mathematik, TU München, Boltzmannstrasse 3, D-85748 Garching, Germany, and Eva-Maria Rott, Zentrum Mathematik, TU München, Boltzmannstrasse 3, D-85748 Garching, Germany
Abstract.

A nonlinear parabolic equation of sixth order is analyzed. The equation arises as a reduction of a model from quantum statistical mechanics, and also as the gradient flow of a second-order information functional with respect to the $L^{2}$-Wasserstein metric. First, we prove global existence of weak solutions for initial conditions of finite entropy by means of the time-discrete minimizing movement scheme. Second, we calculate the linearization of the dynamics around the unique stationary solution, for which we can explicitly compute the entire spectrum. A key element in our approach is a particular relation between the entropy, the Fisher information and the second order functional that generates the gradient flow under consideration.

Key words and phrases:
Higher-order diffusion equations, quantum diffusion model, Wasserstein gradient flow, flow interchange estimate, long-time behavior, linearization
2010 Mathematics Subject Classification:
Primary: 35K30, Secondary: 35B45, 35B40
This research was supported by the DFG Collaborative Research Center TRR 109, “Discretization in Geometry and Dynamics”.

1. Introduction

1.1. The equation

The following nonlinear parabolic evolution equation of sixth order is considered:

\[
\partial_{t}u=\mathrm{div}\big(u\,\nabla\big(\Phi(u)+\lambda^{3}|x|^{2}\big)\big),\quad\Phi(u)=\frac{1}{2}\left\lVert\nabla^{2}\log u\right\rVert^{2}+\frac{1}{u}\nabla^{2}:(u\nabla^{2}\log u), \tag{1}
\]

where $\lambda\geq 0$ is a given parameter. There are at least two different contexts in which (1) plays a role.

The first is the semi-classical approximation of the nonlocal quantum drift-diffusion model by Degond et al. [5]. In the formal asymptotic expansion of that equation in terms of the Planck constant $\hbar$, the right-hand side of (1) with $\lambda=0$ appears as the term of order $\hbar^{4}$, after the linear diffusion operator $\Delta u$ at order $\hbar^{0}$ and the operator $-\nabla^{2}:(u\nabla^{2}\log u)$, which is related to the Bohm potential, at order $\hbar^{2}$. More details on the derivation of the model and its formal expansion are given below in Section 2.1.

The other context, more relevant to the paper at hand, is that of gradient flows in the $L^{2}$-Wasserstein distance. Consider the following three functionals, defined on probability densities $u:\mathbb{R}^{d}\to\mathbb{R}$ (which, for simplicity, we assume to be strictly positive for the moment) by

\[
\begin{split}
\mathcal{H}_{\lambda}(u)&=\int_{\mathbb{R}^{d}}u\left(\log u+\frac{\lambda}{2}|x|^{2}\right)\,\mathrm{d}x,\\
\mathcal{F}_{\lambda}(u)&=\frac{1}{2}\int_{\mathbb{R}^{d}}u\left(|\nabla\log u|^{2}+\lambda^{2}|x|^{2}\right)\,\mathrm{d}x,\\
\mathcal{E}_{\lambda}(u)&=\frac{1}{2}\int_{\mathbb{R}^{d}}u\left(\|\nabla^{2}\log u\|^{2}+2\lambda^{3}|x|^{2}\right)\,\mathrm{d}x,
\end{split} \tag{2}
\]

which we shall refer to as the ($\lambda$-perturbed, if $\lambda>0$) entropy, Fisher information, and energy, respectively. The celebrated result of [9] is that the gradient flow of $\mathcal{H}_{\lambda}$ in the $L^{2}$-Wasserstein metric is the linear Fokker-Planck equation,

\[
\partial_{t}u=\Delta u+\lambda\,\mathrm{div}(xu). \tag{3}
\]

In [8] (see also [1, Example 11.1.10]) it has been shown that the gradient flow of $\mathcal{F}_{\lambda}$ is the so-called quantum drift-diffusion or DLSS equation,

\[
\partial_{t}u=-\nabla^{2}:(u\nabla^{2}\log u)+\lambda\,\mathrm{div}(xu). \tag{4}
\]

The starting point for our analysis is that (1) is the gradient flow of $\mathcal{E}_{\lambda}$, at least formally, that is, the potential $\Phi$ in (1) is the variational derivative of $\mathcal{E}_{\lambda}$. The reason for considering $\mathcal{E}_{\lambda}$ as potential for a gradient flow is more profound than its formal similarity with $\mathcal{H}_{\lambda}$ and $\mathcal{F}_{\lambda}$. There is an intimate relation between $\mathcal{H}_{\lambda}$, $\mathcal{F}_{\lambda}$ and $\mathcal{E}_{\lambda}$ that has already been the basis for deriving sharp self-similar asymptotics in [13], and that we shall elaborate on in Section 1.3 below. In a nutshell, the dissipation of $\mathcal{H}_{\lambda}$ along the heat flow is $\mathcal{F}_{\lambda}$, the dissipation of $\mathcal{F}_{\lambda}$ along the heat flow is $\mathcal{E}_{\lambda}$, up to a multiple of $\mathcal{F}_{\lambda}$ itself, and this equals the dissipation of $\mathcal{H}_{\lambda}$ along the flow of (4). In this spirit, one may consider (4) and (1), respectively, as fourth and sixth order analogues of the second order linear Fokker-Planck equation (3).

Our analytical results are two-fold. First, we give a proof of existence of weak solutions to the initial value problem for (1) on the whole space $\mathbb{R}^{d}$ for initial data with finite entropy and finite second moment. This result is proven with full rigor. Second, we study the long-time asymptotics of solutions using a linearization around the steady state. This part is formal in the sense that we calculate the linearization for sufficiently smooth perturbations of the steady state and discuss the spectral properties of an appropriate closure of the linear operator.

1.2. Existence of solutions

Global existence of non-negative weak solutions to (1) with $\lambda=0$ on the $d$-dimensional torus, i.e., with periodic boundary conditions, has been proven in [11] for $d=1$, and in [3] for $d=2$ and $d=3$. The main technical ingredient of these proofs is a particular regularization of the evolution equation, namely by $\varepsilon\Delta^{3}\log u$. This regularization produces approximations of the true solution that are smooth and have an $\varepsilon$-dependent positive lower bound. Smoothness of both $u$ and $\log u$ then allows one to perform all the necessary a priori estimates on the approximation, and these pass to the limit $\varepsilon\downarrow 0$. An extension of this method from the torus to the whole space, if possible at all, is at least not straightforward, since the intermediate estimates involve simultaneous Sobolev estimates on $u$ and $\log u$ that would contradict each other on unbounded domains.

Here, we shall prove existence of solutions in $d\leq 3$ on the whole space $\mathbb{R}^{d}$. Our technical device is very different from the one recalled above: we invoke the machinery of metric gradient flows. Our approximations are of lower regularity, and we do not have any information about positivity. To justify the a priori estimates, we need the method of flow interchange with the heat equation [8, 13]. In contrast to the constructions in [11, 3], we do not modify the equation, but perform a variational discretization in time using the minimizing movement scheme. More specifically: given an initial probability density $u_{0}$ of finite second moment and entropy, and a time step $\tau>0$, define inductively a sequence $(u_{\tau}^{n})_{n=0}^{\infty}$ by $u_{\tau}^{0}=u_{0}$, and $u_{\tau}^{n}$ being a minimizer of

\[
u\mapsto\frac{1}{2\tau}\mathbf{W}_{2}(u,u_{\tau}^{n-1})^{2}+\mathcal{E}_{\lambda}(u). \tag{5}
\]

We recall the $L^{2}$-Wasserstein distance $\mathbf{W}_{2}$ below in Section 2.2, and we prove that the inductive procedure is well-defined. Our result about existence is that the sequences $(u_{\tau}^{n})$ approximate a weak solution to (1).
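The scheme (5) can be made concrete in a heavily simplified setting. The following is a minimal sketch, assuming $d=1$ and restricting the minimization in (5) to centered Gaussians with standard deviation $s$; this restriction is an assumption of the sketch, not part of the construction in the text. In that family, the $L^{2}$-Wasserstein distance between two Gaussians is $|s-s'|$, and $\mathcal{E}_{\lambda}$ reduces to $\frac{1}{2}s^{-4}+\lambda^{3}s^{2}$. The function names are hypothetical.

```python
def energy(s, lam):
    """E_lambda of the centered Gaussian with standard deviation s (d = 1)."""
    return 0.5 * s ** -4 + lam ** 3 * s ** 2


def jko_step(s_prev, tau, lam):
    """One minimizing-movement step, restricted to centered Gaussians.

    Minimizes (s - s_prev)^2 / (2 tau) + energy(s, lam) over s > 0.
    The objective is strictly convex, so we bisect on its derivative.
    """
    def d_objective(s):
        return (s - s_prev) / tau - 2.0 * s ** -5 + 2.0 * lam ** 3 * s

    lo, hi = 1e-3, max(s_prev, 1.0) + 10.0  # derivative is < 0 at lo, > 0 at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if d_objective(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def minimizing_movement(s0, tau, lam, steps):
    """Return the discrete sequence s^0, s^1, ..., s^steps."""
    seq = [s0]
    for _ in range(steps):
        seq.append(jko_step(seq[-1], tau, lam))
    return seq
```

In this toy setting the energy decreases along the discrete sequence, and for $\lambda>0$ the iterates approach $s=\lambda^{-1/2}$, the standard deviation of the Gaussian equilibrium.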

Theorem 1.1.

Assume $d\leq 3$. Let an initial datum $u_{0}$ be given that is a probability density with finite entropy, $\mathcal{H}_{0}(u_{0})<\infty$, and finite second moment. For each $\tau>0$, define a sequence $(u_{\tau}^{n})_{n=0}^{\infty}$ inductively as described above. Then the piecewise constant “interpolations” $\bar{u}_{\tau}:\mathbb{R}_{\geq 0}\to L^{1}(\mathbb{R}^{d})$ with

\[
\bar{u}_{\tau}(t)=u_{\tau}^{n}\quad\text{for all $(n-1)\tau<t\leq n\tau$} \tag{6}
\]

converge along a suitable null sequence $(\tau_{k})$ to a limit $u:\mathbb{R}_{\geq 0}\times\mathbb{R}^{d}\to\mathbb{R}$ in $L^{2}_{\text{loc}}(\mathbb{R}_{>0};W^{2,2}(\mathbb{R}^{d}))$. Moreover, the limit $u$ is a weak solution to (1) with $u(0)=u_{0}$ in the following sense: $t\mapsto u(t,\cdot)\mathcal{L}^{d}$ is a locally Hölder continuous curve from $\mathbb{R}_{\geq 0}$ into $(\mathcal{P}_{2}(\mathbb{R}^{d}),\mathbf{W}_{2})$, the roots $s=\sqrt{u}$ and $z=\sqrt[4]{u}$ are of regularity

\[
s\in L^{2}_{\text{loc}}(\mathbb{R}_{>0};W^{2,2}(\mathbb{R}^{d})),\quad z\in L^{4}_{\text{loc}}(\mathbb{R}_{>0};W^{1,4}(\mathbb{R}^{d})), \tag{7}
\]

respectively, and for every test function $\varphi\in C^{\infty}_{c}(\mathbb{R}_{>0}\times\mathbb{R}^{d})$,

\[
\int_{0}^{\infty}\int_{\mathbb{R}^{d}}\big(\partial_{t}\varphi-2\lambda^{3}x\cdot\nabla\varphi\big)u\,\mathrm{d}x\,\mathrm{d}t=\int_{0}^{\infty}\mathcal{N}[u,\varphi]\,\mathrm{d}t, \tag{8}
\]

where the nonlinearity is given by

\[
\mathcal{N}[u,\varphi]=-4\int_{\mathbb{R}^{d}}\big\{2\nabla^{2}\varphi:(\ell^{2})+\frac{1}{2}(s\nabla^{2}\Delta\varphi+2\nabla s\cdot\nabla^{3}\varphi):\ell\big\}\,\mathrm{d}x, \tag{9}
\]

with $\ell=\nabla^{2}s-4\nabla z\otimes\nabla z$, which coincides with $\frac{1}{2}\sqrt{u}\,\nabla^{2}\log u$ where $u>0$.

Remark 1.2.

In view of (7), one has $\ell\in L^{2}_{\text{loc}}(\mathbb{R}_{>0}\times\mathbb{R}^{d})$, and so $\varphi$ is “tested” against a local $L^{1}$-function in (9). It is far from obvious that (8) with the nonlinearity (9) is indeed a weak formulation of (1). Equality of $-\mathcal{N}[u,\varphi]$ with a more “natural” weak formulation of the nonlinearity on the right-hand side of (1), like

\[
\int_{\mathbb{R}^{d}}\varphi\,\mathrm{div}\big(u\nabla\Phi(u)\big)\,\mathrm{d}x=\int_{\mathbb{R}^{d}}\mathrm{div}(u\nabla\varphi)\,\Phi(u)\,\mathrm{d}x,
\]

for smooth and positive solutions $u$ can be verified by a direct but tedious computation involving various integrations by parts. A more conceptual way to recognize $-\mathcal{N}[u,\varphi]$ as a weak formulation is explained at the beginning of Section 3.3.

Theorem 1.1 is proven by time-discrete approximation of the solution $u$ via the celebrated minimizing movement scheme. The main compactness estimate for passing to the time-continuous limit is provided by the dissipation of the (unperturbed) entropy $\mathcal{H}_{0}$, which formally amounts to

\[
-\frac{\mathrm{d}}{\mathrm{d}t}\mathcal{H}_{0}(u)\geq\kappa\int_{\mathbb{R}^{d}}\left(\interleave\nabla^{3}\sqrt{u}\interleave^{2}+\left\lvert\nabla\sqrt[6]{u}\right\rvert^{6}\right)\,\mathrm{d}x, \tag{10}
\]

with some constant $\kappa>0$. The formal calculations leading to (10) via integration by parts are identical to the ones used in [3]. The justification of these estimates in the whole-space case is technically more involved.

1.3. Long-time asymptotics

The interpretation of (1) as a Wasserstein gradient flow is essential for our second main result, which concerns the long-time asymptotics of $u$. First assume $\lambda>0$, in which case there is a unique equilibrium $U_{\lambda}$ for (1) (with mass equal to 1), given by

\[
U_{\lambda}(x)=\left(\frac{\lambda}{2\pi}\right)^{d/2}\exp\left(-\frac{\lambda}{2}|x|^{2}\right). \tag{11}
\]
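That the Gaussian (11) is stationary for (1) means that $\Phi(U_{\lambda})+\lambda^{3}|x|^{2}$ is constant in $x$. The following is a rough numerical check of this in $d=1$, using nested central differences; the helper names and the step size are choices of this sketch. A short hand computation, which is not in the text, suggests that the constant is $\frac{3}{2}\lambda^{2}$.

```python
import math


def gaussian(x, lam):
    """U_lambda in d = 1: centered Gaussian with variance 1 / lam."""
    return math.sqrt(lam / (2.0 * math.pi)) * math.exp(-0.5 * lam * x * x)


def second_derivative(f, x, h=5e-3):
    """Central second difference of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def potential(u, x, lam):
    """Phi(u)(x) + lam^3 x^2 for d = 1, where
    Phi(u) = (1/2) ((log u)'')^2 + (1/u) (u (log u)'')''."""
    def log_u_xx(y):
        return second_derivative(lambda z: math.log(u(z)), y)

    def flux(y):  # u * (log u)''
        return u(y) * log_u_xx(y)

    phi = 0.5 * log_u_xx(x) ** 2 + second_derivative(flux, x) / u(x)
    return phi + lam ** 3 * x * x
```

Evaluating `potential(lambda y: gaussian(y, lam), x, lam)` at several points `x` should give nearly the same value each time, up to the finite-difference error.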

Our approach to understanding the dynamics near $U_{\lambda}$ is to formally calculate a linearization of (1) at $U_{\lambda}$ and to determine its spectrum. We wrote “a linearization” since there are several competing concepts for linearization in nonlinear diffusion equations, providing different pieces of information about the long-time asymptotics. Following the ideas of [7], we study here the “linearization in Wasserstein”, which is given by the so-called displacement Hessian of $\mathcal{E}_{\lambda}$ at $U_{\lambda}$. Very informally, the displacement Hessian of a functional $\mathcal{U}$ at a critical point $U_{*}$ is the representation of the second variational derivative of $\mathcal{U}$ with respect to the scalar product in $H^{-1}(\mathbb{R}^{d};U_{*}\,\mathrm{d}\mathcal{L}^{d})$. A definition and a more intuitive interpretation in terms of Lagrangian maps is given in Section 4.3.

Displacement Hessians are rarely used for the analysis of long-time asymptotics since the extraction of rigorous analytical information requires a lot of a priori knowledge about regularity from the solution. Particularly when it comes to proving higher order asymptotics, alternative linearizations are often easier to handle; we refer to the discussions in [21, 6]. The most famous application of displacement Hessians concerns the result on self-similar asymptotics for the porous medium equation [17]; further applications, also to higher order diffusion equations, can be found e.g. in [14, 15, 20]. In the situation at hand, we use the Wasserstein linearization because of its compatibility with the special structure of (1) that we outline below.

Since the derivation of the result is probably more interesting than the result itself, we briefly indicate the main ingredient, which is the aforementioned intimate relation between the three functionals in (2): recall that the linear Fokker-Planck equation (3) is the Wasserstein gradient flow of $\mathcal{H}_{\lambda}$. Consider a solution $(w_{r})_{r\geq 0}$ to that flow, i.e.,

\[
\partial_{r}w_{r}=\Delta w_{r}+\lambda\,\mathrm{div}(xw_{r}). \tag{12}
\]

It is a well-known fact that $\mathcal{F}_{\lambda}$ is the dissipation of $\mathcal{H}_{\lambda}$ along (12), i.e.,

\[
\mathcal{F}_{\lambda}(w_{r})-\mathcal{F}_{\lambda}(U_{\lambda})=-\frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}r}\mathcal{H}_{\lambda}(w_{r}). \tag{13}
\]

The connection between $\mathcal{H}_{\lambda}$ and $\mathcal{E}_{\lambda}$ (and, implicitly, also $\mathcal{F}_{\lambda}$) is given by

\[
\mathcal{E}_{\lambda}(w_{r})-\mathcal{E}_{\lambda}(U_{\lambda})=\frac{1}{4}\frac{\mathrm{d}^{2}}{\mathrm{d}r^{2}}\mathcal{H}_{\lambda}(w_{r})-\frac{\lambda}{2}\frac{\mathrm{d}}{\mathrm{d}r}\mathcal{H}_{\lambda}(w_{r}). \tag{14}
\]

The relation (14) is derived in Section 4.2. It will play the same role in our studies on (1) as (13) has played for the analysis of the DLSS equation (4) in [14]. That is, we use (14) to express the displacement Hessian of $\mathcal{E}_{\lambda}$ in terms of the displacement Hessian of $\mathcal{H}_{\lambda}$.
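The identities (13) and (14) can be checked by hand on a simple family of solutions. The following is a minimal sketch, assuming $d=1$ and a centered Gaussian $w_{r}$ with variance $v(r)$, for which (12) reduces to $v'=2-2\lambda v$ and all three functionals in (2) have closed forms. These reductions are computations of this sketch, not taken from the text, and the function names are hypothetical.

```python
import math


def entropy(v, lam):
    """H_lambda of a centered Gaussian with variance v (d = 1)."""
    return -0.5 * math.log(2.0 * math.pi * math.e * v) + 0.5 * lam * v


def fisher(v, lam):
    """F_lambda of a centered Gaussian with variance v (d = 1)."""
    return 0.5 * (1.0 / v + lam ** 2 * v)


def energy(v, lam):
    """E_lambda of a centered Gaussian with variance v (d = 1)."""
    return 0.5 * (1.0 / v ** 2 + 2.0 * lam ** 3 * v)


def entropy_derivatives(v, lam):
    """First and second r-derivatives of H_lambda(w_r), by the chain rule."""
    dv = 2.0 - 2.0 * lam * v          # v'  under the Fokker-Planck flow (12)
    ddv = -2.0 * lam * dv             # v''
    h_v = -0.5 / v + 0.5 * lam        # dH/dv
    h_vv = 0.5 / v ** 2               # d^2H/dv^2
    return h_v * dv, h_vv * dv ** 2 + h_v * ddv


def residuals(v, lam):
    """Left-hand side minus right-hand side of (13) and of (14)."""
    d1, d2 = entropy_derivatives(v, lam)
    v_eq = 1.0 / lam                  # variance of the equilibrium U_lambda
    r13 = fisher(v, lam) - fisher(v_eq, lam) + 0.5 * d1
    r14 = energy(v, lam) - energy(v_eq, lam) - (0.25 * d2 - 0.5 * lam * d1)
    return r13, r14
```

Both residuals vanish up to rounding for any variance $v>0$ and any $\lambda>0$, which is consistent with (13) and (14) on this Gaussian family.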

For the linear Fokker-Planck equation (12), which is the gradient flow of the relative entropy $\mathcal{H}_{\lambda}$ and has $U_{\lambda}$ as a critical point as well, the displacement Hessian has been calculated in [7, Proposition 2]: it is the extension of the formal operator

\[
\mathbf{L}_{\lambda}\varphi=-\frac{1}{U_{\lambda}}\mathrm{div}\big(U_{\lambda}\,\nabla\varphi\big)=-\Delta\varphi+\lambda x\cdot\nabla\varphi \tag{15}
\]

to $W^{1,2}(\mathbb{R}^{d};U_{\lambda}\,\mathrm{d}\mathcal{L}^{d})$. The corresponding spectrum of $\mathbf{L}_{\lambda}$ is well-known: it is purely discrete with eigenvalues that are precisely the positive integer multiples of $\lambda$. The corresponding eigenfunctions are Hermite polynomials. The derivation of $\mathbf{L}_{\lambda}$ might appear a bit odd: first, one rewrites the linear Fokker-Planck equation (12) on the $L^{2}$-Wasserstein space, obtaining a highly nonlinear gradient flow, and then calculates its linearization, which is, up to dualization, again the original equation (12). The key point is that the Wasserstein linearization is compatible with the relation (14), i.e., the linearization of the highly nonlinear sixth order equation (1) is easily expressible in terms of $\mathbf{L}_{\lambda}$, see Theorem 1.3 below. A similar idea has been used in [14], where the authors showed by exploiting the relation (13) above that the displacement Hessian for $\mathcal{F}_{\lambda}$ is formally given by $\mathbf{L}_{\lambda}^{2}$, and discussed implications on the long-time asymptotics of the nonlinear fourth order DLSS equation (4).

Our second main result is:

Theorem 1.3.

Given a test function $\psi\in C^{\infty}_{c}(\mathbb{R}^{d})$, let $u_{\sigma}$ be the solution of the transport equation

\[
\partial_{\sigma}u_{\sigma}=-\mathrm{div}(u_{\sigma}\nabla\psi)\quad\text{subject to the initial condition $u_{0}=U_{\lambda}$}.
\]

Then:

\[
\frac{\mathrm{d}^{2}}{\mathrm{d}\sigma^{2}}\bigg|_{\sigma=0}\mathcal{E}_{\lambda}(u_{\sigma})=\int_{\mathbb{R}^{d}}\nabla\psi\cdot\nabla\big(\mathbf{L}_{\lambda}^{3}\psi\big)U_{\lambda}\,\mathrm{d}\mathcal{L}^{d}+\lambda\int_{\mathbb{R}^{d}}\nabla\psi\cdot\nabla\big(\mathbf{L}_{\lambda}^{2}\psi\big)U_{\lambda}\,\mathrm{d}\mathcal{L}^{d}.
\]

Consequently, if $\mathcal{E}_{\lambda}$ possesses a displacement Hessian at $U_{\lambda}$, then it is an extension of the linear operator $\mathbf{L}_{\lambda}^{3}+\lambda\mathbf{L}_{\lambda}^{2}$.

Theorem 1.3 could be proven by direct calculations, evaluating the second variational derivative of $\mathcal{E}_{\lambda}$ along the transport equation, and then integrating by parts until the desired form is attained. This would be tedious but in principle straightforward once the desired terminal form $\mathbf{L}_{\lambda}^{3}+\lambda\mathbf{L}_{\lambda}^{2}$ is known. Here we present a more conceptual approach, based on the relation (14), which leads us to the form of the Hessian in the first place.

As said before, the implications of Theorem 1.3 on the long-time asymptotics of (1) are far from obvious. Naturally, the hope is that the dynamics of all sufficiently smooth solutions $u$ close to equilibrium is approximately the same as that of the linearized equation, and in particular, that the eigenvalues of $\mathbf{L}_{\lambda}^{3}+\lambda\mathbf{L}_{\lambda}^{2}$ determine the exponential decay of the “nonlinear modes” of $u$ in the long-time limit. It is not hard to see that the formal differential operator $\mathbf{L}_{\lambda}^{3}+\lambda\mathbf{L}_{\lambda}^{2}$ possesses a unique self-adjoint closure in $W^{1,2}(\mathbb{R}^{d};U_{\lambda}\,\mathrm{d}\mathcal{L}^{d})$, and that the spectrum of the latter is purely discrete with eigenvalues $\lambda^{3}(k^{3}+k^{2})$ for $k=0,1,2,\ldots$
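The eigenvalue formula follows from the spectrum of $\mathbf{L}_{\lambda}$ by plain arithmetic: an eigenfunction of $\mathbf{L}_{\lambda}$ with eigenvalue $k\lambda$ is an eigenfunction of $\mathbf{L}_{\lambda}^{3}+\lambda\mathbf{L}_{\lambda}^{2}$ with eigenvalue $(k\lambda)^{3}+\lambda(k\lambda)^{2}$. The sketch below tabulates this and adds one consistency check. The check rests on an assumption of this sketch, not on the text: in $d=1$, a centered Gaussian with variance $v$ evolves under (1) according to $v'=4/v^{2}-4\lambda^{3}v$. Under that assumption, the linearized decay rate of the variance at equilibrium matches the eigenvalue for $k=2$.

```python
def ou_eigenvalue(k, lam):
    """k-th eigenvalue of L_lambda, as stated in the text: k * lam."""
    return k * lam


def hessian_eigenvalue(k, lam):
    """Eigenvalue of L^3 + lam * L^2 on the k-th eigenfunction of L."""
    mu = ou_eigenvalue(k, lam)
    return mu ** 3 + lam * mu ** 2


def variance_rhs(v, lam):
    """Assumed Gaussian reduction of (1) in d = 1: dv/dt = 4/v^2 - 4 lam^3 v."""
    return 4.0 / v ** 2 - 4.0 * lam ** 3 * v


def variance_decay_rate(lam, h=1e-6):
    """Minus the derivative of variance_rhs at the equilibrium v = 1 / lam,
    approximated by a central difference."""
    v_eq = 1.0 / lam
    slope = (variance_rhs(v_eq + h, lam) - variance_rhs(v_eq - h, lam)) / (2.0 * h)
    return -slope
```

For instance, `hessian_eigenvalue(1, lam)` returns $2\lambda^{3}$, the first nonzero eigenvalue, and `variance_decay_rate(lam)` is close to `hessian_eigenvalue(2, lam)`, which equals $12\lambda^{3}$.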

The strong results from [13], where the fully nonlinear exponential stability of the Gaussian steady state for the DLSS equation (4) has been proven on grounds of (13), give rise to a conjecture, namely that the spectral gap $2\lambda^{3}$ in the linearization actually determines the global rate of convergence to equilibrium.

Conjecture 1.

The weak solutions to (1) constructed in the course of the proof of Theorem 1.1 above converge to $U_{\lambda}$ in $L^{1}(\mathbb{R}^{d})$ at an exponential rate of $\lambda^{3}$, i.e., there exists a constant $C(u_{0})$ that is expressible in terms of the initial datum $u_{0}$ alone such that

\[
\|u(t,\cdot)-U_{\lambda}\|_{L^{1}(\mathbb{R}^{d})}\leq C(u_{0})e^{-\lambda^{3}t}\quad\text{for all $t\geq 0$}. \tag{16}
\]

A direct consequence of Conjecture 1 would be the asymptotic self-similarity of $u$ in the case $\lambda=0$. More precisely, in Section 4.6 we show that if Conjecture 1 were true, then any solution $u$ to (1) with $\lambda=0$ approaches in $L^{1}(\mathbb{R}^{d})$ with algebraic rate $t^{-1/6}$ the self-similar solution

\[
u_{*}(t;x)=[1+12t]^{-d/6}\,U_{1}\big([1+12t]^{-1/6}x\big), \tag{17}
\]

at least if $u$ is already sufficiently close to self-similarity initially.
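The scaling in (17) can be cross-checked in $d=1$. The profile $u_{*}(t;\cdot)$ is a centered Gaussian with variance $[1+12t]^{1/3}$. The sketch below assumes, as a computation of this sketch rather than a statement from the text, that a centered Gaussian with variance $v$ evolves under (1) with $\lambda=0$ according to $v'=4/v^{2}$, integrates this ODE with a classical Runge-Kutta step, and compares with the variance read off from (17).

```python
def variance_rhs(v):
    """Assumed Gaussian reduction of (1) with lam = 0 in d = 1: dv/dt = 4 / v^2."""
    return 4.0 / v ** 2


def rk4_variance(v0, t_end, steps):
    """Integrate dv/dt = variance_rhs(v) from t = 0 to t_end with classical RK4."""
    dt = t_end / steps
    v = v0
    for _ in range(steps):
        k1 = variance_rhs(v)
        k2 = variance_rhs(v + 0.5 * dt * k1)
        k3 = variance_rhs(v + 0.5 * dt * k2)
        k4 = variance_rhs(v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v


def self_similar_variance(t):
    """Variance of u_*(t; .) in (17) for d = 1."""
    return (1.0 + 12.0 * t) ** (1.0 / 3.0)
```

Starting from `v0 = 1.0`, which is the variance of $U_{1}$, the numerically integrated variance agrees with `self_similar_variance(t)`.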


Outline

After explaining the origin and relevance of equation (1) in Section 2.1, we review standard background on the $L^{2}$-Wasserstein space and metric gradient flows in Section 2.2. Section 3 is then devoted to the existence proof, while Section 4 deals with formal results on the intermediate and long-time behaviour of the solutions obtained.

Notation

The Euclidean norm is denoted by $\lvert\cdot\rvert$, while $\lVert\cdot\rVert$ and $\interleave\cdot\interleave$ are given by $\lVert A\rVert^{2}=\sum_{i,j=1}^{d}a_{ij}^{2}$ for $A=(a_{ij})\in\mathbb{R}^{d\times d}$ and $\interleave\mathbb{B}\interleave^{2}=\sum_{i,j,k=1}^{d}b_{ijk}^{2}$ for $\mathbb{B}=(b_{ijk})\in\mathbb{R}^{d\times d\times d}$. The domain $\mathrm{Dom}(\mathcal{G})$ of a functional $\mathcal{G}$ defined on a set $X$ consists of all $u\in X$ such that $\mathcal{G}(u)<\infty$.

2. Derivation and Preliminaries

2.1. Derivation of the equation

We sketch the derivation of (1) from a quantum mechanical model. In [5], building on [4], the following nonlinear and nonlocal quantum analogue of the classical Fokker-Planck equation (12) has been derived:

\[
\partial_{t}u=\mathrm{div}\left(u\,\nabla\left(A[u]+\frac{\lambda}{2}|x|^{2}\right)\right). \tag{18}
\]

Here $u$ is the macroscopic density of quantum particles whose dynamics aims at minimizing the ensemble’s relative von Neumann entropy $\mathtt{H}_{\lambda}$. The precise definition of $A[u]$, sometimes referred to as the quantum logarithm of $u$, is intricate; for the sake of completeness, we mention one possible definition of $A[u]$ as the uniquely determined potential $A:\mathbb{R}^{d}\to\mathbb{R}$ such that

\[
\operatorname{Tr}\left[\varphi\,\exp\left(-\frac{\hbar^{2}}{2}\Delta+A\right)\right]=\int_{\mathbb{R}^{d}}\varphi(x)u(x)\,\mathrm{d}x
\]

for all test functions $\varphi\in C^{\infty}_{c}(\mathbb{R}^{d})$. Here $\operatorname{Tr}$ denotes the trace over $L^{2}(\mathbb{R}^{d})$, and $\exp$ is the exponential of a self-adjoint operator.

Under the specific hypotheses made in [5], the von Neumann entropy can be expressed in terms of the macroscopic density $u$ alone, and is given by

\[
\mathtt{H}_{\lambda}(u)=\int_{\mathbb{R}^{d}}u\left(A[u]+\frac{\lambda}{2}|x|^{2}\right)\,\mathrm{d}x.
\]

We remark that the variational derivative of $\mathtt{H}_{\lambda}(u)$ is $A[u]+\frac{\lambda}{2}|x|^{2}$, and thus equation (18) has the formal structure of a gradient flow in $\mathbf{W}_{2}$. In the semi-classical limit $\hbar\to 0$, the von Neumann entropy $\mathtt{H}_{\lambda}(u)$ formally approaches the Boltzmann entropy $\mathcal{H}_{\lambda}(u)$, and the variational derivative $A[u]$ formally approaches the pointwise logarithm $\log u$. Consequently, (18) turns into the classical Fokker-Planck equation.

The existence analysis for the full equation (18) goes far beyond classical parabolic theory, and has been carried out just recently, and only in one space dimension [18]. Already the rigorous definition of $A[u]$ as solution to an inverse problem has been challenging [16]. A way to approximate (18) by local equations is via the expansion of $A[u]$ in powers of the small parameter $\hbar$. In [2, Appendix] the following asymptotic expansion up to $\mathcal{O}(\hbar^{6})$ has been computed (for $\lambda=0$):

\[
A=\underbrace{\log u}_{=:A_{0}[u]}+\frac{\hbar^{2}}{12}\underbrace{\left(-2\frac{\Delta\sqrt{u}}{\sqrt{u}}\right)}_{=:A_{1}[u]}+\frac{\hbar^{4}}{360}\underbrace{\left(\frac{1}{2}\left\lVert\nabla^{2}\log u\right\rVert^{2}+\frac{1}{u}\nabla^{2}:(u\nabla^{2}\log u)\right)}_{=:A_{2}[u]}+\mathcal{O}(\hbar^{6}).
\]

As mentioned above, a reduction of $A[u]$ in (18) to the leading order term $A_{0}[u]$ yields the linear Fokker-Planck (or rather heat, since $\lambda=0$) equation (12),

\[
\partial_{t}u=\mathrm{div}(u\,\nabla A_{0}[u])=\Delta u.
\]

Replacing $A[u]$ by $A_{1}[u]$ leads to the Derrida-Lebowitz-Speer-Spohn (DLSS) equation

\[
\partial_{t}u=\mathrm{div}(u\,\nabla A_{1}[u])=-\nabla^{2}:(u\nabla^{2}\log u).
\]

Finally, since $A_{2}[u]$ is identical to the functional $\Phi(u)$, the equation $\partial_{t}u=\mathrm{div}(u\,\nabla A_{2}[u])$ coincides with the sixth-order equation (1) under consideration here.

2.2. Background for Wasserstein gradient flows

In this section we briefly review some basics about the theory of optimal transport and $L^{2}$-Wasserstein gradient flows, but only as far as it is needed later. For a more profound introduction to these topics, we refer to the textbooks [1, 19, 22]. By $\mathcal{P}_{2}(\mathbb{R}^{d})$ we denote the set of all probability measures with finite second moment,

\[
\mathbf{m}_{2}(\mu)=\int_{\mathbb{R}^{d}}|x|^{2}\,\mathrm{d}\mu(x)<\infty.
\]

We shall frequently identify absolutely continuous (with respect to the Lebesgue measure $\mathcal{L}^{d}$) probability measures $\mu$ on $\mathbb{R}^{d}$ with their densities $u\in L^{1}(\mathbb{R}^{d})$. A sequence $(\mu_{n})_{n\in\mathbb{N}}\subset\mathcal{P}(\mathbb{R}^{d})$ of probability measures converges narrowly to $\mu\in\mathcal{P}(\mathbb{R}^{d})$ if

\[
\lim_{n\rightarrow\infty}\int_{\mathbb{R}^{d}}f(x)\,\mathrm{d}\mu_{n}(x)=\int_{\mathbb{R}^{d}}f(x)\,\mathrm{d}\mu(x)
\]

holds for all bounded, continuous functions $f:\mathbb{R}^{d}\rightarrow\mathbb{R}$. The $L^{2}$-Wasserstein distance between two measures $\mu,\nu\in\mathcal{P}_{2}(\mathbb{R}^{d})$ is defined via

\[
\mathbf{W}_{2}^{2}(\mu,\nu)=\min_{\pi\in\Pi(\mu,\nu)}\int_{\mathbb{R}^{d}\times\mathbb{R}^{d}}|x-y|^{2}\,\mathrm{d}\pi(x,y), \tag{19}
\]

where $\Pi(\mu,\nu)$ denotes the set of all transport plans between $\mu$ and $\nu$, that is, all $\pi\in\mathcal{P}(\mathbb{R}^{d}\times\mathbb{R}^{d})$ with respective marginals $\mu$ and $\nu$. Convergence in $\mathbf{W}_{2}$ is equivalent to narrow convergence together with convergence of the second moments, and $\mathbf{W}_{2}$ is itself lower semi-continuous with respect to narrow convergence.
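The minimization in (19) becomes elementary for empirical measures on the real line. The following is a minimal sketch, assuming $d=1$ and two measures that each consist of $n$ atoms of mass $1/n$. In that case the monotone matching, which pairs the sorted atoms, is an optimal plan. The brute-force variant minimizes over all permutation couplings and serves only as a cross-check; both function names are hypothetical.

```python
import itertools
import math


def w2_sorted(xs, ys):
    """W_2 between two empirical measures on the line with equally many,
    equally weighted atoms, using the sorted (monotone) matching."""
    if len(xs) != len(ys):
        raise ValueError("both measures need the same number of atoms")
    cost = sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys)))
    return math.sqrt(cost / len(xs))


def w2_bruteforce(xs, ys):
    """Same distance, by minimizing the cost over all permutation couplings."""
    best = min(
        sum((a - b) ** 2 for a, b in zip(xs, perm))
        for perm in itertools.permutations(ys)
    )
    return math.sqrt(best / len(xs))
```

For example, shifting the two-atom measure on $\{0,1\}$ by one unit to $\{1,2\}$ gives `w2_sorted([0.0, 1.0], [1.0, 2.0]) == 1.0`.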

We are not going to define metric gradient flows on $(\mathcal{P}_{2}(\mathbb{R}^{d}),\mathbf{W}_{2})$ in general. Here we just need a particularly nice subclass.

Definition 2.1.

Let $\mathcal{V}:\mathcal{P}_{2}(\mathbb{R}^{d})\to\mathbb{R}\cup\{+\infty\}$ be a proper, lower semi-continuous functional. Further, let a semi-group $(\mathcal{S}_{s})_{s\geq 0}$ of continuous maps $\mathcal{S}_{s}$ on $\mathcal{P}_{2}(\mathbb{R}^{d})$ be given. We call the semi-group an $\alpha$-flow for $\mathcal{V}$ if the following evolutionary variational inequality holds at any $s\geq 0$ and with any $\mu,\nu\in\mathrm{Dom}(\mathcal{V})$:

\[
\frac{1}{2}\frac{\mathrm{d}^{+}}{\mathrm{d}s}\mathbf{W}_{2}^{2}(\mathcal{S}_{s}(\mu),\nu)+\frac{\alpha}{2}\mathbf{W}_{2}^{2}(\mathcal{S}_{s}(\mu),\nu)\leq\mathcal{V}(\nu)-\mathcal{V}(\mathcal{S}_{s}(\mu)). \tag{20}
\]

Example 1.

The following three examples of $\alpha$-flows will be important in the sequel. They are all special cases of [1, Example 11.2.7].

  1. The linear heat flow, given as distributional solution to $\partial_{s}\mu=\Delta\mu$, is a $0$-flow for the entropy $\mathcal{V}=\mathcal{H}_{0}$.

  2. The linear mass transport, given as distributional solution to $\partial_{s}\mu=\mathrm{div}(\mu\nabla V)$ for a potential $V\in C^{2}(\mathbb{R}^{d})$ with globally bounded second derivatives, is an $\alpha$-flow for the potential energy $\mathcal{V}(\mu)=\int_{\mathbb{R}^{d}}V(x)\,\mathrm{d}\mu(x)$, for every $\alpha$ such that $\nabla^{2}V\geq\alpha\mathbb{I}$.

  3. The rescaling, given as distributional solution to $\partial_{s}\mu=\mathrm{div}(x\,\mu)$, is a $1$-flow for the potential energy $\mathcal{V}=\frac{1}{2}\mathbf{m}_{2}$.
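The third example can be tested numerically against the inequality (20). The following is a minimal sketch, assuming $d=1$ and empirical measures with equally weighted atoms, for which the rescaling flow moves every atom along $x'=-x$ and the squared Wasserstein distance is given by the sorted matching. The one-sided derivative in (20) is replaced by a forward difference, so the check holds only up to an error of the order of the step `h`. All names are hypothetical.

```python
import math


def w2_squared(xs, ys):
    """Squared W_2 between empirical measures on the line (sorted matching)."""
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def rescale(xs, s):
    """Flow of d_s mu = div(x mu) for s units of time: atoms follow x' = -x."""
    return [math.exp(-s) * x for x in xs]


def half_second_moment(xs):
    """V = (1/2) m_2 for an empirical measure."""
    return 0.5 * sum(x * x for x in xs) / len(xs)


def evi_gap(xs, ys, s, alpha=1.0, h=1e-6):
    """Right-hand side minus left-hand side of (20) at time s.
    A nonnegative value (up to O(h)) means the inequality holds."""
    now = w2_squared(rescale(xs, s), ys)
    later = w2_squared(rescale(xs, s + h), ys)
    lhs = 0.5 * (later - now) / h + 0.5 * alpha * now
    rhs = half_second_moment(ys) - half_second_moment(rescale(xs, s))
    return rhs - lhs
```

In this one-dimensional toy setting a direct computation suggests that (20) holds with equality for $\alpha=1$, so `evi_gap` returns values close to zero.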

Solutions to (1) will be constructed via discrete-in-time approximation by means of the minimizing movement scheme, i.e., a variational Euler method, see Section 3.2 below. Inductively, the approximation $\mu_{\tau}^{n}$ at time $t=n\tau$ is obtained from $\mu_{\tau}^{n-1}$, the one at $t=(n-1)\tau$, as minimizer in

\[
\mu\mapsto\frac{1}{2\tau}\mathbf{W}_{2}\big(\mu,\mu_{\tau}^{n-1}\big)^{2}+\overline{\mathcal{E}}_{\lambda}(\mu). \tag{21}
\]

For passage to the continuous limit, a priori estimates independent of the time step $\tau$ are essential. The key is to give a rigorous meaning to a priori estimates related to dissipations $-\frac{\mathrm{d}}{\mathrm{d}t}\mathcal{V}(\mu_{t})$ of auxiliary functionals $\mathcal{V}$ already on the time-discrete level. For this, we shall use the so-called flow interchange method [13, Theorem 3.2], where the minimizer $\mu_{\tau}^{n}$ is perturbed along the $\alpha$-flow $\mathcal{S}^{\mathcal{V}}_{(\cdot)}$ of the auxiliary functional $\mathcal{V}$.

Lemma 2.2 (Flow Interchange).

Let $\mathcal{U},\mathcal{V}:\mathcal{P}_{2}(\mathbb{R}^{d})\rightarrow\mathbb{R}\cup\{+\infty\}$ be two proper, lower semi-continuous functionals with $\mathrm{Dom}(\mathcal{U})\subseteq\mathrm{Dom}(\mathcal{V})$. Assume further that $\mathcal{V}$ produces an $\alpha$-flow $\mathcal{S}^{\mathcal{V}}_{(\cdot)}$. Let $\mu^{*}$ be a global minimizer of the following Yosida-penalization of $\mathcal{U}$,

\[
\mu\mapsto\frac{1}{2\tau}\mathbf{W}_{2}^{2}(\mu,\bar{\mu})+\mathcal{U}(\mu), \tag{22}
\]

where $\bar{\mu}$ is given. Then

\[
\limsup_{\sigma\downarrow 0}\frac{\mathcal{U}(\mu^{*})-\mathcal{U}(\mathcal{S}^{\mathcal{V}}_{\sigma}(\mu^{*}))}{\sigma}\leq\frac{\mathcal{V}(\bar{\mu})-\mathcal{V}(\mu^{*})}{\tau}-\frac{\alpha}{2\tau}\mathbf{W}_{2}^{2}(\mu^{*},\bar{\mu}). \tag{23}
\]

Sketch of proof.

On the one hand, since $\mu^{*}$ is the minimizer in (22), we have for each $\sigma>0$ that

\[
\frac{1}{\sigma}\big[\mathcal{U}(\mu^{*})-\mathcal{U}(\mathcal{S}^{\mathcal{V}}_{\sigma}(\mu^{*}))\big]\leq\frac{1}{2\tau\sigma}\big[\mathbf{W}_{2}^{2}(\mathcal{S}^{\mathcal{V}}_{\sigma}(\mu^{*}),\bar{\mu})-\mathbf{W}_{2}^{2}(\mu^{*},\bar{\mu})\big]. \tag{24}
\]

On the other hand, by the EVI (20) at $s=0$, applied with $\mu=\mu^{*}$ and $\nu=\bar{\mu}$ and divided by $\tau$,

\[
\limsup_{\sigma\downarrow 0}\frac{1}{2\tau\sigma}\big[\mathbf{W}_{2}^{2}(\mathcal{S}^{\mathcal{V}}_{\sigma}(\mu^{*}),\bar{\mu})-\mathbf{W}_{2}^{2}(\mu^{*},\bar{\mu})\big]\leq\frac{1}{\tau}\big[\mathcal{V}(\bar{\mu})-\mathcal{V}(\mu^{*})\big]-\frac{\alpha}{2\tau}\mathbf{W}_{2}^{2}(\mu^{*},\bar{\mu}). \tag{25}
\]

Combining (24) with (25) yields (23). ∎

Remark 2.3.

The left-hand side in (23) is an approximation of $\mathcal{U}$’s dissipation along $\mathcal{V}$’s flow. At least on a formal level, one expects that it is also an approximation to $\mathcal{V}$’s dissipation along $\mathcal{U}$’s flow, i.e., a time-discrete analogue of $-\frac{\mathrm{d}}{\mathrm{d}t}\mathcal{V}(\mu_{t})$. Indeed, in a corresponding smooth situation, with a map $x_{(\cdot,\cdot)}:\mathbb{R}\times\mathbb{R}\to\mathbb{R}^{n}$ that is a gradient flow with respect to two “time” parameters $s$ and $t$,

\[ \partial_{t}x_{(s,t)}=-\nabla U(x_{(s,t)}),\quad\partial_{s}x_{(s,t)}=-\nabla V(x_{(s,t)}), \]

for smooth functions $U,V:\mathbb{R}^{n}\to\mathbb{R}$, we have

\[ -\frac{\mathrm{d}}{\mathrm{d}t}V(x_{(s,t)})=\nabla U(x_{(s,t)})\cdot\nabla V(x_{(s,t)})=-\frac{\mathrm{d}}{\mathrm{d}s}U(x_{(s,t)}). \]

In the non-smooth situation at hand, the two dissipations might not be identical, but typically, one can be controlled in terms of the other.
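The symmetry of the two dissipations in the smooth situation can be checked numerically. The following is only a minimal sketch: the functions `U` and `V` on $\mathbb{R}^{2}$, the base point `x0` and the finite-difference step sizes are hypothetical choices made for illustration, and derivatives are approximated by central differences.

```python
def grad(f, x, h=1e-6):
    """Central-difference gradient of f at the point x (a list of floats)."""
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g


def U(x):
    # hypothetical smooth function on R^2
    return 0.5 * (2.0 * x[0] ** 2 + x[1] ** 2) + x[0] * x[1]


def V(x):
    # hypothetical smooth function on R^2
    return 0.25 * x[0] ** 4 + 0.5 * x[1] ** 2


def dissipation(f, g, x, step=1e-5):
    """Approximate -d/dt f(x_t) at t = 0 along the gradient flow x' = -grad g(x)."""
    gg = grad(g, x)
    x_fwd = [xi - step * gi for xi, gi in zip(x, gg)]
    x_bwd = [xi + step * gi for xi, gi in zip(x, gg)]
    return -(f(x_fwd) - f(x_bwd)) / (2 * step)


x0 = [0.7, -0.3]
inner = sum(a * b for a, b in zip(grad(U, x0), grad(V, x0)))
dV_along_U = dissipation(V, U, x0)  # -d/dt V along the flow of U
dU_along_V = dissipation(U, V, x0)  # -d/ds U along the flow of V
print(inner, dV_along_U, dU_along_V)
```

Up to discretization error, all three printed numbers agree, which is the finite-dimensional content of the displayed identity.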

3. Existence of Solutions

3.1. Properties of the energy functional

We begin by giving a proper definition of the energy functional.

Definition 3.1.

The energy functional $\overline{\mathcal{E}}_{\lambda}:\mathcal{P}_{2}(\mathbb{R}^{d})\to\mathbb{R}_{\geq 0}\cup\{+\infty\}$ is defined as follows: if $\mu=u\mathcal{L}^{d}$ is absolutely continuous with $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$, then

\[ \overline{\mathcal{E}}_{\lambda}(\mu):=2\int_{\mathbb{R}^{d}}\big\|\nabla^{2}\sqrt{u}-4\nabla\sqrt[4]{u}\otimes\nabla\sqrt[4]{u}\big\|^{2}\,\mathrm{d}x+\lambda^{3}\int_{\mathbb{R}^{d}}|x|^{2}\,\mathrm{d}\mu(x); \tag{26} \]

otherwise, $\overline{\mathcal{E}}_{\lambda}(\mu):=+\infty$.

Remark 3.2.

Several comments are in order.

  • For well-definedness of the right-hand side in (26), we implicitly use the fact that $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$ implies $\sqrt[4]{u}\in W^{1,4}(\mathbb{R}^{d})$, see e.g. [12, Théorème 1(ii)]. Actually, there is a constant $C$ such that $\|\nabla\sqrt[4]{u}\|_{L^{4}}^{2}\leq C\|\nabla^{2}\sqrt{u}\|_{L^{2}}$, and thus,

    \[ \overline{\mathcal{E}}_{0}(\mu)\leq C^{\prime}\|\nabla^{2}\sqrt{u}\|_{L^{2}}^{2}. \tag{27} \]
  • The condition $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$ is an essential part of the definition. Note that in principle, cancellation effects inside the norm under the integral in (26) could lead to a finite integral value despite $\sqrt{u}\notin W^{2,2}(\mathbb{R}^{d})$. Our choice of $\overline{\mathcal{E}}_{\lambda}$’s domain is justified by Propositions 3.3 and 3.4 below.

  • If $\mu$’s density $u$ is positive everywhere, then one has

    \[ \nabla^{2}\sqrt{u}-4\nabla\sqrt[4]{u}\otimes\nabla\sqrt[4]{u}=\frac{1}{2}\sqrt{u}\,\nabla^{2}\log u, \]

    and so $\overline{\mathcal{E}}_{\lambda}(\mu)=\mathcal{E}_{\lambda}(u)$ with $\mathcal{E}_{\lambda}$ defined in (2).

  • The authors of [8] consider a slightly different version $\mathfrak{K}_{-1}(\cdot|\mathcal{L}^{d})$ of $\mathcal{E}_{0}$: assuming $\mu=u\mathcal{L}^{d}$ satisfies the weaker condition $\sqrt{u}\in W^{2,2}_{\text{loc}}(\mathbb{R}^{d})$, they set

    \[ \mathfrak{K}_{-1}(\mu|\mathcal{L}^{d})=\int_{\mathbb{R}^{d}}\left\|\frac{\sqrt{u}\,\nabla^{2}\sqrt{u}-\nabla\sqrt{u}\otimes\nabla\sqrt{u}}{\sqrt{u}}\right\|^{2}\,\mathrm{d}\mu(x). \tag{28} \]

    Since the zero set of $u$ is clearly $\mu$-negligible, the integrand is $\mu$-a.e. well-defined. For $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$, one has $\mathfrak{K}_{-1}(\mu|\mathcal{L}^{d})=\overline{\mathcal{E}}_{0}(\mu)$; this is a consequence of the fact that the derivatives of Sobolev functions are zero $\mathcal{L}^{d}$-a.e. on their level sets, i.e., the integrand in (26) vanishes a.e. on $\{u=0\}$. The representation in (28) is the relevant one in the proof of lower semi-continuity, see Proposition 3.4 below. For our later needs, the representation (26) is better suited.
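The pointwise identity from the third item of the remark can be sanity-checked numerically in dimension $d=1$, where it reads $(\sqrt{u})''-4\big((\sqrt[4]{u})'\big)^{2}=\frac{1}{2}\sqrt{u}\,(\log u)''$. The sketch below is an illustration under stated assumptions: the positive profile `u` is a hypothetical choice (it is not normalized to unit mass), and derivatives are replaced by central finite differences.

```python
import math


def d1(f, x, h=1e-4):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)


def d2(f, x, h=1e-4):
    """Central-difference second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def u(x):
    # hypothetical smooth, strictly positive profile (not normalized)
    return math.exp(-x * x) * (1.0 + 0.5 * math.sin(x))


def lhs(x):
    # (sqrt(u))'' - 4 * ((u^(1/4))')^2
    s = lambda t: math.sqrt(u(t))
    z = lambda t: u(t) ** 0.25
    return d2(s, x) - 4.0 * d1(z, x) ** 2


def rhs(x):
    # (1/2) * sqrt(u) * (log u)''
    logu = lambda t: math.log(u(t))
    return 0.5 * math.sqrt(u(x)) * d2(logu, x)


points = [-1.0, -0.3, 0.0, 0.4, 1.2]
max_gap = max(abs(lhs(x) - rhs(x)) for x in points)
print(max_gap)
```

The printed gap is of the order of the finite-difference error, consistent with the identity for positive densities.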

The energy $\overline{\mathcal{E}}_{\lambda}$ defined above is part of a family of second-order functionals that have been studied in [8, Section 3] in connection with the DLSS equation (4). We recall and adapt two results here.

Proposition 3.3 (adapted from Corollary 3.2 in [8]).

There is a constant $C$, only depending on the dimension $d$, such that for all absolutely continuous $\mu=u\mathcal{L}^{d}\in\mathcal{P}_{2}(\mathbb{R}^{d})$:

\[ \int_{\mathbb{R}^{d}}\big(\big\|\nabla^{2}\sqrt{u}\big\|^{2}+\big|\nabla\sqrt[4]{u}\big|^{4}\big)\,\mathrm{d}x\leq C\,\overline{\mathcal{E}}_{\lambda}(\mu). \tag{29} \]

It is important to remark here that [8, Corollary 3.2] indeed applies to the case $\Omega=\mathbb{R}^{d}$: the derivation of (29) only involves an integration by parts with the vector field $\mathbf{v}=|\nabla\sqrt[4]{u}|^{2}\nabla\sqrt{u}$, which is integrable on $\mathbb{R}^{d}$ since $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$ and $\sqrt[4]{u}\in W^{1,4}(\mathbb{R}^{d})$. We refer to Lemma C.1 and to the subsequent discussion in the Appendix.

Proposition 3.4 (adapted from Corollary 3.4 of [8]).

The functional $\overline{\mathcal{E}}_{\lambda}$ is sequentially lower semi-continuous with respect to narrow convergence.

Proof.

In [8, Corollary 3.4], the lower semi-continuity of $\mathfrak{K}_{-1}(\cdot|\mathcal{L}^{d})$ recalled in (28) above is shown. The only difference between $\mathfrak{K}_{-1}(\mu|\mathcal{L}^{d})$ and $\overline{\mathcal{E}}_{0}(\mu)$ is that the former is defined by the integral value (possibly $+\infty$) for all $\mu=u\mathcal{L}^{d}$ with $\sqrt{u}\in W^{2,2}_{\text{loc}}(\mathbb{R}^{d})$, and the latter is $+\infty$ unless $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$.

The “stability” of $\overline{\mathcal{E}}_{0}$’s restricted domain follows directly from Proposition 3.3 above: if $(\mu_{n})$ converges narrowly to $\mu$ and has $\sup_{n}\overline{\mathcal{E}}_{\lambda}(\mu_{n})<\infty$, then $(\sqrt{u_{n}})$ is bounded in $W^{2,2}(\mathbb{R}^{d})$ thanks to (29). Consequently, $(\sqrt{u_{n}})$ converges strongly to a limit $\sqrt{u}$ in $L^{2}(\mathbb{R}^{d})$, which in turn implies that $\mu=u\mathcal{L}^{d}$ is absolutely continuous. Boundedness of $(\sqrt{u_{n}})$ in $W^{2,2}(\mathbb{R}^{d})$ further implies that also $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$. ∎

3.2. Minimizing movements

Let a time step size $\tau>0$ be fixed. For a given $\nu\in\mathcal{P}_{2}(\mathbb{R}^{d})$, define the Yosida-penalized energy functional $\overline{\mathcal{E}}_{\lambda,\tau}(\cdot;\nu)$ by

\[ \overline{\mathcal{E}}_{\lambda,\tau}(\mu;\nu)=\frac{1}{2\tau}\mathbf{W}_{2}(\mu,\nu)^{2}+\overline{\mathcal{E}}_{\lambda}(\mu). \]

Lemma 3.5.

For each $\nu\in\mathcal{P}_{2}(\mathbb{R}^{d})$, there exists a global minimizer $\mu^{*}\in\mathcal{P}_{2}(\mathbb{R}^{d})$ of $\overline{\mathcal{E}}_{\lambda,\tau}(\cdot;\nu)$.

Proof.

This is a standard argument from the calculus of variations. Observe that the functional $\mu\mapsto\overline{\mathcal{E}}_{\lambda,\tau}(\mu;\nu)$ has the following properties:

  • It is bounded from below (in fact: non-negative), and is not identically $+\infty$ (since it has a finite value for any absolutely continuous $\mu=u\mathcal{L}^{d}\in\mathcal{P}_{2}(\mathbb{R}^{d})$ with $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$). Hence, it has a finite infimum.

  • It is coercive. Indeed, by non-negativity of $\overline{\mathcal{E}}_{\lambda}$ and the properties of $\mathbf{W}_{2}$, one easily shows that $\overline{\mathcal{E}}_{\lambda,\tau}(\mu;\nu)\geq\frac{1}{4\tau}\mathfrak{m}_{2}(\mu)-C$ with a constant $C$ that is expressible in terms of $\nu$ and $\tau$. Hence, sublevels are tight and thus pre-compact with respect to narrow convergence.

  • It is sequentially lower semi-continuous with respect to narrow convergence, see Proposition 3.4 above.

The existence of a minimizer now follows by standard arguments. ∎

As a consequence of Lemma 3.5, the minimizing movement scheme for $\overline{\mathcal{E}}_{\lambda}$ is well-defined for every initial condition $\mu_{0}\in\mathcal{P}_{2}(\mathbb{R}^{d})$. That is, starting from $\mu_{\tau}^{0}:=\mu_{0}$, one can define a sequence $(\mu_{\tau}^{n})_{n=0}^{\infty}$ inductively by choosing $\mu_{\tau}^{n}$ as a minimizer of $\overline{\mathcal{E}}_{\lambda,\tau}(\cdot;\mu_{\tau}^{n-1})$ for $n=1,2,\ldots$ For notational convenience, we also introduce the usual time-discrete “interpolation” $\bar{\mu}_{\tau}:\mathbb{R}_{\geq 0}\to\mathcal{P}_{2}(\mathbb{R}^{d})$ by

\[ \bar{\mu}_{\tau}(t)=\mu_{\tau}^{n}\quad\text{for all $t\in((n-1)\tau,n\tau]$} \]

for $n=1,2,\ldots$, and $\bar{\mu}_{\tau}(0)=\mu_{0}$. The sequence $(\mu_{\tau}^{n})_{n=0}^{\infty}$ and its interpolation $\bar{\mu}_{\tau}$ satisfy a variety of energy estimates, which follow directly from the construction via minimization.

Lemma 3.6 (Basic discrete estimates).

For every $N=1,2,\ldots$, the energy $\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})$ is finite, and moreover,

\[ \overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})\leq\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N-1})\leq\overline{\mathcal{E}}_{\lambda}(\mu_{0}), \tag{30} \]

\[ \frac{\tau}{2}\sum_{n=N}^{\infty}\left(\frac{\mathbf{W}_{2}(\mu_{\tau}^{n+1},\mu_{\tau}^{n})}{\tau}\right)^{2}\leq\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N}). \tag{31} \]

And further, for all $s,t\geq N\tau$,

\[ \mathbf{W}_{2}^{2}(\bar{\mu}_{\tau}(t),\bar{\mu}_{\tau}(s))\leq 2\,\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})\,\max\{\tau,|t-s|\}. \tag{32} \]

The derivation of these estimates is a standard procedure, see e.g. [1].
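To illustrate the structure of the scheme and of the estimates (30) and (31), here is a minimal sketch of a finite-dimensional analogue. It is not the Wasserstein scheme of this section: as an assumption made purely for illustration, $\mathcal{P}_{2}(\mathbb{R}^{d})$ is replaced by the real line, $\mathbf{W}_{2}$ by the Euclidean distance, and $\overline{\mathcal{E}}_{\lambda}$ by the hypothetical convex model energy $E(x)=x^{4}/4$. Each step minimizes $x\mapsto\frac{1}{2\tau}(x-y)^{2}+E(x)$, whose minimizer solves $x-y+\tau x^{3}=0$.

```python
def energy(x):
    # hypothetical convex model energy on the real line
    return 0.25 * x ** 4


def mm_step(y, tau, newton_iters=50):
    """One minimizing-movement step: minimize (x - y)^2 / (2 tau) + energy(x).

    The minimizer is the root of x - y + tau * x^3, found by Newton's method.
    """
    x = y
    for _ in range(newton_iters):
        g = x - y + tau * x ** 3
        dg = 1.0 + 3.0 * tau * x ** 2
        x -= g / dg
    return x


def mm_sequence(x0, tau, n_steps):
    """Iterate the scheme n_steps times, starting from x0."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(mm_step(xs[-1], tau))
    return xs


tau = 0.1
xs = mm_sequence(2.0, tau, 200)
energies = [energy(x) for x in xs]
# Euclidean analogue of the left-hand side of (31), truncated to a finite sum
kinetic = 0.5 * tau * sum(((b - a) / tau) ** 2 for a, b in zip(xs, xs[1:]))
print(energies[0], energies[-1], kinetic)
```

In this toy setting the energies decrease monotonically along the sequence and the accumulated squared velocities stay below the initial energy, mirroring (30) and (31).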

3.3. Discrete equation

In this section, we derive a time-discrete surrogate of (1) that is satisfied by the discrete approximation $(\mu_{\tau}^{n})$. Following the seminal idea from [9, 8], we produce a time-discrete, very weak formulation of (1), eventually leading to (8), by performing an inner variation of the minimizer $\mu_{\tau}^{n}$ of $\overline{\mathcal{E}}_{\lambda,\tau}(\cdot;\mu_{\tau}^{n-1})$. That weak formulation is well-defined under the hypothesis that $\mu_{\tau}^{n}\in\mathrm{Dom}(\overline{\mathcal{E}}_{\lambda})$, which follows trivially by the construction.

More specifically, let a smooth and compactly supported function $\varphi\in C^{\infty}_{c}(\mathbb{R}^{d})$ be given, and define the associated $\mathbb{R}^{d}$-gradient flow $X_{(\cdot)}:\mathbb{R}\times\mathbb{R}^{d}\to\mathbb{R}^{d}$ as solution to the ODE initial value problem

\[ \frac{\mathrm{d}}{\mathrm{d}\sigma}X_{\sigma}=-\nabla\varphi\circ X_{\sigma},\quad X_{0}=\textnormal{id}. \tag{33} \]

For a given absolutely continuous measure $\mu=u\mathcal{L}^{d}\in\mathrm{Dom}(\overline{\mathcal{E}}_{\lambda})$, we consider the deformations $\mu_{\sigma}:=X_{\sigma}\#\mu\in\mathcal{P}_{2}(\mathbb{R}^{d})$. These satisfy the continuity equation along the vector field $-\nabla\varphi$, i.e.,

\[ \partial_{\sigma}\mu_{\sigma}=\mathrm{div}(\mu_{\sigma}\nabla\varphi). \tag{34} \]

In Lemma 3.7 below, the $\sigma$-derivative of $\overline{\mathcal{E}}_{\lambda}(\mu_{\sigma})$ at $\sigma=0$ is given explicitly. In view of (34), we obtain formally (that is, in case that $u$ is positive and smooth) that

\[ \begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\overline{\mathcal{E}}_{\lambda}(\mu_{\sigma}) &=\int_{\mathbb{R}^{d}}\left(\frac{\delta\mathcal{E}_{0}}{\delta u}+\lambda^{3}|x|^{2}\right)\mathrm{div}(u\nabla\varphi)\,\mathrm{d}x \\ &=\int_{\mathbb{R}^{d}}\varphi\,\mathrm{div}\big(u\nabla\Phi(u)\big)\,\mathrm{d}x-2\lambda^{3}\int_{\mathbb{R}^{d}}x\cdot\nabla\varphi\,u\,\mathrm{d}x. \end{aligned} \]

In other words, calculating the $\sigma$-derivative of $\overline{\mathcal{E}}_{\lambda}(\mu_{\sigma})$ at $\sigma=0$ amounts to calculating a special form of the right-hand side in (1), “tested” against $\varphi$. This philosophy, which was the founding idea behind the flow interchange lemma, see (23), is made rigorous in Lemma 3.8 further below.

Lemma 3.7 (First variation).

Let $\mu=u\mathcal{L}^{d}\in\mathrm{Dom}(\overline{\mathcal{E}}_{\lambda})$ and define accordingly $s:=\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$ and $z:=\sqrt[4]{u}\in W^{1,4}(\mathbb{R}^{d})$. Given $\varphi\in C^{\infty}_{c}(\mathbb{R}^{d})$, define the flow $X_{(\cdot)}$ as in (33) above. Then the map $\sigma\mapsto\overline{\mathcal{E}}_{\lambda}(X_{\sigma}\#\mu)$ is differentiable at $\sigma=0$, with derivative

\[ \frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\overline{\mathcal{E}}_{\lambda}(X_{\sigma}\#\mu)=-\mathcal{N}[u,\varphi]-2\lambda^{3}\int_{\mathbb{R}^{d}}x\cdot\nabla\varphi\,\mathrm{d}\mu, \tag{35} \]

with $\mathcal{N}[u,\varphi]$ given in (9).

Proof.

First, let us assume that $\lambda=0$; the minor modifications for $\lambda>0$ are described at the end of the proof. Introduce $u_{\sigma}$ as the density of $X_{\sigma}\#\mu$, and accordingly $s_{\sigma}=\sqrt{u_{\sigma}}$ and $z_{\sigma}=\sqrt[4]{u_{\sigma}}$. For later reference, observe that

\[ X_{\sigma}=\textnormal{id}-\sigma\nabla\varphi+O(\sigma^{2}),\quad\mathrm{D}X_{\sigma}=\mathbb{I}-\sigma\mathrm{D}\nabla\varphi+O(\sigma^{2}),\quad\mathrm{D}^{2}X_{\sigma}=-\sigma\mathrm{D}^{2}\nabla\varphi+O(\sigma^{2}). \]

Next, define the volume distortion $V_{\sigma}:=\det(\mathrm{D}X_{\sigma})$, which is a positive and smooth function, and observe that

\[ V_{\sigma}=1-\sigma\Delta\varphi+O(\sigma^{2}),\quad\mathrm{D}V_{\sigma}=-\sigma\mathrm{D}\Delta\varphi+O(\sigma^{2}),\quad\mathrm{D}^{2}V_{\sigma}=-\sigma\mathrm{D}^{2}\Delta\varphi+O(\sigma^{2}). \]

By the change of variables formula, we have with $x=X_{\sigma}(y)$,

\[ \begin{split}\overline{\mathcal{E}}_{0}(X_{\sigma}\#\mu)&=2\int_{\mathbb{R}^{d}}\|\nabla^{2}s_{\sigma}-4\nabla z_{\sigma}\otimes\nabla z_{\sigma}\|^{2}\,\mathrm{d}x\\ &=2\int_{\mathbb{R}^{d}}\big\|V_{\sigma}^{1/2}\nabla^{2}s_{\sigma}\circ X_{\sigma}-4(V_{\sigma}^{1/4}\nabla z_{\sigma}\circ X_{\sigma})\otimes(V_{\sigma}^{1/4}\nabla z_{\sigma}\circ X_{\sigma})\big\|^{2}\,\mathrm{d}y.\end{split} \tag{36} \]

We shall now express the spatial derivatives of $s_{\sigma}$ and $z_{\sigma}$ in terms of the respective derivatives of $s$ and $z$. For the next calculations, which involve a repeated application of the chain rule, we use instead of gradients and Hessians the less intuitive but more consistent notations with total derivatives $\mathrm{D}$ that produce row vectors.

Recall the effect of the push-forward on densities:

\[ u_{\sigma}=\frac{u}{V_{\sigma}}\circ X_{\sigma}^{-1}. \tag{37} \]

Hence we have:

\[ s_{\sigma}=(V_{\sigma}^{-1/2}s)\circ X_{\sigma}^{-1},\quad z_{\sigma}=(V_{\sigma}^{-1/4}z)\circ X_{\sigma}^{-1}. \tag{38} \]

For the first derivatives, we thus obtain

\[ \begin{aligned} \mathrm{D}s_{\sigma}&=\left\{\left(V_{\sigma}^{-1/2}\mathrm{D}s-\frac{1}{2}V_{\sigma}^{-3/2}s\,\mathrm{D}V_{\sigma}\right)(\mathrm{D}X_{\sigma})^{-1}\right\}\circ X_{\sigma}^{-1},\\ \mathrm{D}z_{\sigma}&=\left\{\left(V_{\sigma}^{-1/4}\mathrm{D}z-\frac{1}{4}V_{\sigma}^{-5/4}z\,\mathrm{D}V_{\sigma}\right)(\mathrm{D}X_{\sigma})^{-1}\right\}\circ X_{\sigma}^{-1}, \end{aligned} \]

and for the second derivative of $s_{\sigma}$,

\[ \begin{aligned} \mathrm{D}^{2}s_{\sigma}&=\bigg\{\bigg[V_{\sigma}^{-1/2}\mathrm{D}^{2}s-V_{\sigma}^{-3/2}\mathrm{D}s\otimes_{s}\mathrm{D}V_{\sigma}-\frac{1}{2}V_{\sigma}^{-3/2}s\,\mathrm{D}^{2}V_{\sigma}+\frac{3}{4}V_{\sigma}^{-5/2}s\,\mathrm{D}V_{\sigma}\otimes\mathrm{D}V_{\sigma}\\ &\qquad-\left(V_{\sigma}^{-1/2}\mathrm{D}s-\frac{1}{2}V_{\sigma}^{-3/2}s\,\mathrm{D}V_{\sigma}\right)(\mathrm{D}X_{\sigma})^{-1}\mathrm{D}^{2}X_{\sigma}\bigg]:\big((\mathrm{D}X_{\sigma})^{-1}\otimes(\mathrm{D}X_{\sigma})^{-1}\big)\bigg\}\circ X_{\sigma}^{-1}. \end{aligned} \]

The expression on the right-hand side calls for some explanation: the part in the square brackets is a symmetric bilinear form; and when $\mathrm{D}^{2}s_{\sigma}\circ X_{\sigma}$ is applied to two vectors $\xi,\eta\in\mathbb{R}^{d}$, then that bilinear form is evaluated on the vectors $(\mathrm{D}X_{\sigma})^{-1}\xi,(\mathrm{D}X_{\sigma})^{-1}\eta$.

Since all the $\sigma$-dependence on the right-hand side is now in $X_{\sigma}$ and $V_{\sigma}$, these expressions are obviously differentiable in $\sigma$, with derivatives

\[ \frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\big(V_{\sigma}^{1/4}\mathrm{D}z_{\sigma}\circ X_{\sigma}\big)=\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\left\{\left(\mathrm{D}z-\frac{1}{4}V_{\sigma}^{-1}z\,\mathrm{D}V_{\sigma}\right)(\mathrm{D}X_{\sigma})^{-1}\right\}=\mathrm{D}z\,\mathrm{D}\nabla\varphi+\frac{1}{4}z\,\mathrm{D}\Delta\varphi \]

and

\[ \begin{aligned} &\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\big(V_{\sigma}^{1/2}\mathrm{D}^{2}s_{\sigma}\circ X_{\sigma}\big)\\ &=\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\bigg\{\bigg[\mathrm{D}^{2}s-V_{\sigma}^{-1}\mathrm{D}s\otimes_{s}\mathrm{D}V_{\sigma}-\frac{1}{2}V_{\sigma}^{-1}s\,\mathrm{D}^{2}V_{\sigma}+\frac{3}{4}V_{\sigma}^{-2}s\,\mathrm{D}V_{\sigma}\otimes\mathrm{D}V_{\sigma}\\ &\qquad-\left(\mathrm{D}s-\frac{1}{2}V_{\sigma}^{-1}s\,\mathrm{D}V_{\sigma}\right)(\mathrm{D}X_{\sigma})^{-1}\mathrm{D}^{2}X_{\sigma}\bigg]:\big((\mathrm{D}X_{\sigma})^{-1}\otimes(\mathrm{D}X_{\sigma})^{-1}\big)\bigg\}\\ &=2\mathrm{D}^{2}s\,\mathrm{D}\nabla\varphi+\mathrm{D}s\otimes_{s}\mathrm{D}\Delta\varphi+\frac{1}{2}s\,\mathrm{D}^{2}\Delta\varphi+\mathrm{D}s\,\mathrm{D}^{2}\nabla\varphi. \end{aligned} \]

We are now in the position to calculate the $\sigma$-derivative of $\overline{\mathcal{E}}_{0}(X_{\sigma}\#\mu)$ at $\sigma=0$ by differentiating in (36) directly under the integral. Recall the abbreviation $\ell=\nabla^{2}s-4\nabla z\otimes\nabla z$, which is a symmetric $d\times d$-matrix. For the derivative of the integrand we obtain:

\[ \begin{aligned} &\frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\big\|V_{\sigma}^{1/2}\nabla^{2}s_{\sigma}\circ X_{\sigma}-4(V_{\sigma}^{1/4}\nabla z_{\sigma}\circ X_{\sigma})\otimes(V_{\sigma}^{1/4}\nabla z_{\sigma}\circ X_{\sigma})\big\|^{2}\\ &=\left(\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\big(V_{\sigma}^{1/2}\nabla^{2}s_{\sigma}\circ X_{\sigma}\big)\right):\ell-8\nabla z\cdot\ell\cdot\left(\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\big(V_{\sigma}^{1/4}\nabla z_{\sigma}\circ X_{\sigma}\big)\right)\\ &=2\nabla^{2}\varphi:(\ell\cdot\nabla^{2}s)+\nabla\Delta\varphi\cdot\ell\cdot\nabla s+\frac{1}{2}s\nabla^{2}\Delta\varphi:\ell+\nabla s\cdot\nabla^{3}\varphi:\ell\\ &\qquad-8\nabla^{2}\varphi:\big(\ell\cdot(\nabla z\otimes\nabla z)\big)-2z\nabla\Delta\varphi\cdot\ell\cdot\nabla z\\ &=2\nabla^{2}\varphi:(\ell^{2})+\frac{1}{2}(s\nabla^{2}\Delta\varphi+2\nabla s\cdot\nabla^{3}\varphi):\ell, \end{aligned} \]

where we have used that $\nabla s=2z\nabla z$ to cancel the two terms multiplying $\nabla\Delta\varphi$, and that $\ell\cdot\nabla^{2}s-4\ell\cdot(\nabla z\otimes\nabla z)=\ell^{2}$. Integration of this equality with respect to $x$ yields (35) with $\lambda=0$.

When $\lambda>0$, we calculate

\[ \frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\overline{\mathcal{E}}_{\lambda}(X_{\sigma}\#\mu)=\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\overline{\mathcal{E}}_{0}(X_{\sigma}\#\mu)+\lambda^{3}\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\int_{\mathbb{R}^{d}}|x|^{2}\,\mathrm{d}(X_{\sigma}\#\mu)(x). \]

Since by definition of the push-forward,

\[ \int_{\mathbb{R}^{d}}|x|^{2}\,\mathrm{d}(X_{\sigma}\#\mu)(x)=\int_{\mathbb{R}^{d}}|X_{\sigma}(y)|^{2}\,\mathrm{d}\mu(y), \]

we directly obtain

\[ \frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\int_{\mathbb{R}^{d}}|X_{\sigma}(y)|^{2}\,\mathrm{d}\mu(y)=\int_{\mathbb{R}^{d}}2X_{0}(y)\cdot\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}X_{\sigma}(y)\,\mathrm{d}\mu(y)=-2\int_{\mathbb{R}^{d}}y\cdot\nabla\varphi(y)\,\mathrm{d}\mu(y), \]

which provides the additional term in (35). ∎
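The derivative of the second moment along the flow (33) can be illustrated numerically. The sketch below works in dimension $d=1$ under assumptions chosen only for illustration: the measure $\mu$ is replaced by a hypothetical empirical measure on the points `ys`, the test function is $\varphi(x)=e^{-x^{2}}$ (rapidly decaying rather than compactly supported), and $X_{\sigma}$ is approximated by a classical Runge-Kutta scheme.

```python
import math


def dphi(x):
    # derivative of the hypothetical test function phi(x) = exp(-x^2)
    return -2.0 * x * math.exp(-x * x)


def flow(y, sigma, n=100):
    """RK4 approximation of X_sigma(y) for dX/dsigma = -phi'(X), X_0 = y."""
    f = lambda x: -dphi(x)
    h = sigma / n
    x = y
    for _ in range(n):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


# hypothetical sample points carrying equal mass
ys = [-1.5, -0.8, -0.2, 0.1, 0.6, 1.1, 1.9]


def second_moment(sigma):
    """Second moment of the push-forward of the empirical measure under X_sigma."""
    return sum(flow(y, sigma) ** 2 for y in ys) / len(ys)


sigma = 1e-4
fd = (second_moment(sigma) - second_moment(-sigma)) / (2 * sigma)
exact = -2.0 * sum(y * dphi(y) for y in ys) / len(ys)
print(fd, exact)
```

The central difference quotient `fd` agrees with the closed-form value `exact`, which is the discrete counterpart of $-2\int y\cdot\nabla\varphi\,\mathrm{d}\mu$.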

Lemma 3.8.

For any test function $\varphi\in C^{\infty}_{c}(\mathbb{R}^{d})$, there is a constant $\alpha$ such that for each $n\geq 1$, the measures $\mu_{\tau}^{n}=u_{\tau}^{n}\mathcal{L}^{d}$ and $\mu_{\tau}^{n-1}=u_{\tau}^{n-1}\mathcal{L}^{d}$ satisfy the following time-discrete version of (8):

\[ \left|\int_{\mathbb{R}^{d}}\varphi\,\frac{u_{\tau}^{n}-u_{\tau}^{n-1}}{\tau}\,\mathrm{d}x+\mathcal{N}[u^{n}_{\tau},\varphi]+2\lambda^{3}\int_{\mathbb{R}^{d}}x\cdot\nabla\varphi\,u_{\tau}^{n}\,\mathrm{d}x\right|\leq\frac{\alpha\tau}{2}\left(\frac{\mathbf{W}_{2}(\mu^{n}_{\tau},\mu^{n-1}_{\tau})}{\tau}\right)^{2}. \tag{39} \]
Proof.

Choose $\alpha>0$ such that $-\alpha\mathds{1}\leq\nabla^{2}\varphi\leq\alpha\mathds{1}$. According to Example 1, the solution $\mu_{\sigma}=X_{\sigma}\#\mu_{\tau}^{n}$ to the continuity equation (34) follows an $(-\alpha)$-flow for the potential energy $\mathcal{V}(\mu)=\int_{\mathbb{R}^{d}}\varphi(x)\,\mathrm{d}\mu(x)$ emerging from $\mu_{\tau}^{n}$. As a consequence of the flow interchange estimate (23),

\[ \begin{aligned} -\frac{\mathrm{d}}{\mathrm{d}\sigma}\bigg|_{\sigma=0}\overline{\mathcal{E}}_{\lambda}(X_{\sigma}\#\mu_{\tau}^{n})&=\limsup_{\sigma\downarrow 0}\frac{\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{n})-\overline{\mathcal{E}}_{\lambda}(X_{\sigma}\#\mu_{\tau}^{n})}{\sigma}\\ &\leq\frac{\mathcal{V}(\mu_{\tau}^{n-1})-\mathcal{V}(\mu_{\tau}^{n})}{\tau}+\frac{\alpha}{2\tau}\mathbf{W}_{2}^{2}(\mu_{\tau}^{n},\mu_{\tau}^{n-1})\\ &=-\int_{\mathbb{R}^{d}}\varphi\,\frac{u_{\tau}^{n}-u_{\tau}^{n-1}}{\tau}\,\mathrm{d}x+\frac{\alpha}{2\tau}\mathbf{W}_{2}^{2}(\mu_{\tau}^{n},\mu_{\tau}^{n-1}). \end{aligned} \]

Substitution of (35) for the $\sigma$-derivative of $\overline{\mathcal{E}}_{\lambda}$ yields

\[ \int_{\mathbb{R}^{d}}\varphi\,\frac{u_{\tau}^{n}-u_{\tau}^{n-1}}{\tau}\,\mathrm{d}x+\mathcal{N}[u^{n}_{\tau},\varphi]+2\lambda^{3}\int_{\mathbb{R}^{d}}x\cdot\nabla\varphi\,u_{\tau}^{n}\,\mathrm{d}x\leq\frac{\alpha}{2\tau}\mathbf{W}_{2}^{2}(\mu_{\tau}^{n},\mu_{\tau}^{n-1}). \]

Replacing $\varphi$ by $-\varphi$ produces the same inequality (also with the same value of $\alpha$) but with an overall minus sign on the left-hand side. Combination of these two estimates leads to (39). ∎

3.4. Additional a priori estimates

The next step is to derive a time-discrete version of the a priori estimate (10).

Proposition 3.9.

The sequence $(\mu_{\tau}^{n})$ of time-discrete approximations $\mu_{\tau}^{n}=u_{\tau}^{n}\mathcal{L}^{d}$ constructed above satisfies at each $n=1,2,\ldots$

\[ \kappa\int_{\mathbb{R}^{d}}\big(\interleave\nabla^{3}\sqrt{u_{\tau}^{n}}\interleave^{2}+|\nabla\sqrt[6]{u_{\tau}^{n}}|^{6}\big)\,\mathrm{d}x\leq\frac{\mathcal{H}_{0}(u_{\tau}^{n-1})-\mathcal{H}_{0}(u_{\tau}^{n})}{\tau}+2d\lambda^{3}, \tag{40} \]

where $\kappa>0$ is a constant that is expressible in terms of the dimension $d$ alone.

The proof of the Proposition builds on the analogous result derived in [3] for a more hands-on time-discrete approximation of solutions to (1) with periodic boundary conditions. The formal calculations there are identical to the ones that lead to (40). The rigorous justification is more difficult: first, because the time steps in [3] have a higher degree of spatial regularity than the Wasserstein approximants here; and second, because the periodic boundary conditions make any discussion of boundary terms related to integration by parts unnecessary.

We shall perform several approximations before we can apply the formal calculations from [3]. For notational convenience, we assume $\lambda=0$ for the moment; the minor modifications for $\lambda>0$ are summarized at the end of the proof of Proposition 3.9.

The first ingredient in our approximation procedure is the following regularization of the energy functional: for each $\varepsilon>0$, define

\[ \mathcal{E}^{\varepsilon}(u):=2\int_{\mathbb{R}^{d}}\big\|\nabla^{2}\sqrt{u+\varepsilon}-4\nabla\sqrt[4]{u+\varepsilon}\otimes\nabla\sqrt[4]{u+\varepsilon}\big\|^{2}\,\mathrm{d}x \]

for $u\in L^{1}(\mathbb{R}^{d})$ with $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$.

Lemma 3.10.

Assume that $u\mathcal{L}^{d}\in\mathrm{Dom}(\overline{\mathcal{E}}_{0})$. Then also $\mathcal{E}^{\varepsilon}(u)<\infty$, and $\lim_{\varepsilon\downarrow 0}\mathcal{E}^{\varepsilon}(u)=\overline{\mathcal{E}}_{0}(u\mathcal{L}^{d})$.

Proof.

By definition of $\overline{\mathcal{E}}_{0}$, we have that $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$ and $\sqrt[4]{u}\in W^{1,4}(\mathbb{R}^{d})$. According to the chain rule for the composition of $u$ with the smooth functions $r\mapsto\sqrt{r+\varepsilon}$ and $r\mapsto\sqrt[4]{r+\varepsilon}$, which are sublinear with globally bounded first and second derivatives, also $\sqrt{u+\varepsilon}\in W^{2,2}_{\text{loc}}(\mathbb{R}^{d})$ and $\sqrt[4]{u+\varepsilon}\in W^{1,4}_{\text{loc}}(\mathbb{R}^{d})$, and

\[ \begin{aligned} \nabla^{2}\sqrt{u+\varepsilon}&=\left[\frac{u}{u+\varepsilon}\right]^{1/2}\left(\nabla^{2}\sqrt{u}+4\frac{\varepsilon}{u+\varepsilon}\nabla\sqrt[4]{u}\otimes\nabla\sqrt[4]{u}\right),\\ \nabla\sqrt[4]{u+\varepsilon}&=\left[\frac{u}{u+\varepsilon}\right]^{3/4}\nabla\sqrt[4]{u}. \end{aligned} \]

In particular, $\nabla^{2}\sqrt{u+\varepsilon}\in L^{2}(\mathbb{R}^{d})$ and $\nabla\sqrt[4]{u+\varepsilon}\in L^{4}(\mathbb{R}^{d})$, and so $\mathcal{E}^{\varepsilon}(u)<\infty$. Further, the quotients $u/(u+\varepsilon)$ and $\varepsilon/(u+\varepsilon)$ are bounded by one, and converge as $\varepsilon\downarrow 0$ to one and to zero, respectively, $u\mathcal{L}^{d}$-a.e. It follows by dominated convergence that

\[ \nabla^{2}\sqrt{u+\varepsilon}\to\nabla^{2}\sqrt{u}\quad\text{in $L^{2}(\mathbb{R}^{d})$},\qquad\nabla\sqrt[4]{u+\varepsilon}\to\nabla\sqrt[4]{u}\quad\text{in $L^{4}(\mathbb{R}^{d})$}, \]

and therefore also $\mathcal{E}^{\varepsilon}(u)\to\overline{\mathcal{E}}_{0}(u\mathcal{L}^{d})$. ∎
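The convergence $\mathcal{E}^{\varepsilon}(u)\to\overline{\mathcal{E}}_{0}(u\mathcal{L}^{d})$ can be observed numerically in a simple case. The following sketch rests on assumptions made for illustration only: the dimension is $d=1$, the density is the hypothetical Gaussian $u(x)=\pi^{-1/2}e^{-x^{2}}$ (for which the integrand of (26) equals $2u$, so that $\overline{\mathcal{E}}_{0}(u\mathcal{L}^{1})=2$), the derivatives of $u$ are inserted in closed form, and the integral is approximated by the trapezoidal rule on a truncated interval.

```python
import math


def gaussian(x):
    # hypothetical example density: centered Gaussian with variance 1/2
    return math.exp(-x * x) / math.sqrt(math.pi)


def integrand(x, eps):
    """Integrand 2 * |(sqrt(v))'' - 4 ((v^(1/4))')^2|^2 of E^eps, with v = u + eps."""
    u = gaussian(x)
    du = -2.0 * x * u
    d2u = (4.0 * x * x - 2.0) * u
    v = u + eps
    # by the chain rule: (sqrt(v))'' - 4 ((v^(1/4))')^2 = u''/(2 v^(1/2)) - (u')^2/(2 v^(3/2))
    ell = 0.5 * d2u / math.sqrt(v) - 0.5 * du * du / v ** 1.5
    return 2.0 * ell * ell


def energy_eps(eps, half_width=10.0, n=40000):
    """Trapezoidal approximation of E^eps(u) on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.5 * (integrand(-half_width, eps) + integrand(half_width, eps))
    for i in range(1, n):
        total += integrand(-half_width + i * h, eps)
    return total * h


values = {eps: energy_eps(eps) for eps in (1e-2, 1e-4, 1e-6)}
for eps, val in values.items():
    print(eps, val)
```

For small `eps` the computed values approach the limit value $2$ of this example.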

We are now going to study certain properties of $\mathcal{E}^{\varepsilon}$ along solutions of the heat flow. More specifically, let some probability density $u\in L^{1}(\mathbb{R}^{d})$ be given, and consider for each $r>0$:

\[ w_{r}:=K_{r}\ast u\quad\text{for}\quad K_{r}(z)=(4\pi r)^{-d/2}\exp\left(-\frac{|z|^{2}}{4r}\right). \tag{41} \]

Note that $w_{r}$ is the unique solution to the initial value problem

\[ \partial_{r}w_{r}=\Delta w_{r},\quad w_{0}=u. \]

Our first observation is that $r\mapsto\mathcal{E}^{\varepsilon}(w_{r})$ has the expected limit for $r\downarrow 0$.
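As a quick illustration of the heat flow (41), consider in dimension $d=1$ a centered Gaussian initial density with variance `var0` (a hypothetical choice). Since $K_{r}$ is itself a centered Gaussian with variance $2r$, the convolution $w_{r}=K_{r}\ast u$ is the centered Gaussian with variance `var0 + 2r`. The sketch below checks by central finite differences that this closed form satisfies $\partial_{r}w_{r}=\Delta w_{r}$; the sample points and step size are assumptions.

```python
import math


def heat_solution(x, r, var0=0.5):
    """w_r(x) = (K_r * u)(x) for a centered Gaussian u with variance var0 (d = 1)."""
    var = var0 + 2.0 * r
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def residual(x, r, h=1e-3):
    """Finite-difference approximation of d/dr w_r(x) - d^2/dx^2 w_r(x)."""
    dr = (heat_solution(x, r + h) - heat_solution(x, r - h)) / (2.0 * h)
    dxx = (
        heat_solution(x + h, r) - 2.0 * heat_solution(x, r) + heat_solution(x - h, r)
    ) / (h * h)
    return dr - dxx


worst = max(
    abs(residual(x, r)) for x in (-1.0, 0.0, 0.5, 2.0) for r in (0.1, 0.5, 1.0)
)
print(worst)
```

The residual is of the size of the discretization error, and letting `r` tend to zero in `heat_solution` recovers the initial Gaussian.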

Lemma 3.11.

Assume that $u\mathcal{L}^{d}\in\mathrm{Dom}(\overline{\mathcal{E}}_{0})$, and define $w_{r}$ as in (41) above. Then $\lim_{r\downarrow 0}\mathcal{E}^{\varepsilon}(w_{r})=\mathcal{E}^{\varepsilon}(u)$.

Proof.

We present a proof that heavily uses the dimensionality restriction $d\leq 3$.

By definition of $\overline{\mathcal{E}}_{0}$, we have that $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$ and $\sqrt[4]{u}\in W^{1,4}(\mathbb{R}^{d})$. First, we show that

\[ u\in W^{2,2}(\mathbb{R}^{d})\cap W^{1,4}(\mathbb{R}^{d}). \tag{42} \]

For this, we use that since $d\leq 3$, one has $W^{2,2}(\mathbb{R}^{d})\hookrightarrow L^{\infty}(\mathbb{R}^{d})$, and so $u$ is globally bounded. Now, by the chain rule,

\[ \begin{aligned} \nabla u&=\nabla\big(\sqrt[4]{u}^{4}\big)=4\sqrt[4]{u}^{3}\,\nabla\sqrt[4]{u},\\ \nabla^{2}u&=\nabla^{2}\big(\sqrt{u}^{2}\big)=2\sqrt{u}\,\nabla^{2}\sqrt{u}+2\nabla\sqrt{u}\otimes\nabla\sqrt{u}=2\sqrt{u}\,\nabla^{2}\sqrt{u}+8\sqrt[4]{u}^{2}\,\nabla\sqrt[4]{u}\otimes\nabla\sqrt[4]{u}, \end{aligned} \]

which shows $\nabla u\in L^{4}(\mathbb{R}^{d})$ and $\nabla^{2}u\in L^{2}(\mathbb{R}^{d})$. Interpolation with $u\in L^{1}(\mathbb{R}^{d})$ yields (42).

By the representation (41) of $w_{r}$ as convolution with $K_{r}$, and thanks to Young’s inequality, it follows from (42) that

\[ w_{r}\to u\quad\text{in $W^{2,2}(\mathbb{R}^{d})\cap W^{1,4}(\mathbb{R}^{d})$}. \tag{43} \]

Recall that $w_{r}$ is smooth and positive. It follows by the chain rule that:

\[ \begin{aligned} \nabla\sqrt[4]{w_{r}+\varepsilon}&=\frac{1}{4}(w_{r}+\varepsilon)^{-3/4}\nabla w_{r},\\ \nabla^{2}\sqrt{w_{r}+\varepsilon}&=\frac{1}{2}(w_{r}+\varepsilon)^{-1/2}\nabla^{2}w_{r}-\frac{1}{4}(w_{r}+\varepsilon)^{-3/2}\nabla w_{r}\otimes\nabla w_{r}. \end{aligned} \]

Combining (43) with the fact that, for any $p>0$, the function $(w_{r}+\varepsilon)^{-p}$ is bounded by $\varepsilon^{-p}$, and converges in measure to $(u+\varepsilon)^{-p}$ as $r\downarrow 0$, we conclude that

\[ \begin{aligned} \nabla\sqrt[4]{w_{r}+\varepsilon}&\to\frac{1}{4}(u+\varepsilon)^{-3/4}\nabla u=\nabla\sqrt[4]{u+\varepsilon}\quad\text{in $L^{4}(\mathbb{R}^{d})$},\\ \nabla^{2}\sqrt{w_{r}+\varepsilon}&\to\frac{1}{2}(u+\varepsilon)^{-1/2}\nabla^{2}u-\frac{1}{4}(u+\varepsilon)^{-3/2}\nabla u\otimes\nabla u=\nabla^{2}\sqrt{u+\varepsilon}\quad\text{in $L^{2}(\mathbb{R}^{d})$}, \end{aligned} \]

and so we have that

\[ \nabla^{2}\sqrt{w_{r}+\varepsilon}-4\nabla\sqrt[4]{w_{r}+\varepsilon}\otimes\nabla\sqrt[4]{w_{r}+\varepsilon}\to\nabla^{2}\sqrt{u+\varepsilon}-4\nabla\sqrt[4]{u+\varepsilon}\otimes\nabla\sqrt[4]{u+\varepsilon}\quad\text{in $L^{2}(\mathbb{R}^{d})$}, \]

which is the claim. ∎

The next result is our core computation, namely of the derivative of ε(wr)\mathcal{E}^{\varepsilon}(w_{r}) at r>0r>0.

Lemma 3.12.

Given a probability density $u$, define $w_{r}$ by (41). Then $r\mapsto\mathcal{E}^{\varepsilon}(w_{r})$ is differentiable at each $r>0$, with

\[ -\frac{\mathrm{d}}{\mathrm{d}r}\mathcal{E}^{\varepsilon}(w_{r})\geq\kappa\int_{\mathbb{R}^{d}}\big(\interleave\nabla^{3}\sqrt{w_{r}+\varepsilon}\interleave^{2}+|\nabla\sqrt[6]{w_{r}+\varepsilon}|^{6}\big)\,\mathrm{d}x. \tag{44} \]
Proof.

The heat kernel $K_{s}(z)$ from (41) is smooth and positive at every $s>0$ and $z\in{\mathbb{R}}^{d}$, and all spatial derivatives $\nabla^{\alpha}K_{s}$ with arbitrary multi-index $\alpha$ belong to any $L^{p}({\mathbb{R}}^{d})$ with $p\in[1,\infty]$. Consequently, the function ${\mathbb{R}}_{>0}\times{\mathbb{R}}^{d}\ni(r;x)\mapsto w_{r}(x)$ is positive and smooth, each $w_{r}$ is a probability density, and, thanks to Young's inequality for convolutions, all spatial derivatives $\nabla^{\alpha}w_{r}$ are in any $L^{p}({\mathbb{R}}^{d})$.

Next, observe that yr:=log(wr+ε)y_{r}:=\log(w_{r}+\varepsilon) is a smooth function as well, that satisfies

reyr=Δeyr\displaystyle\partial_{r}e^{y_{r}}=\Delta e^{y_{r}} (45)

in the classical sense. Moreover, despite the fact that yry_{r} itself is clearly not integrable on d{\mathbb{R}}^{d}, its spatial derivatives αyr\partial^{\alpha}y_{r} belong to any Lp(d)L^{p}({\mathbb{R}}^{d}); the latter is most easily seen from the fact that each αyr\partial^{\alpha}y_{r} can be written as a linear combination of products of terms βwr/(wr+ε)\partial^{\beta}w_{r}/(w_{r}+\varepsilon), with suitable multi-indices β\beta, where 1|β||α|1\leq|\beta|\leq|\alpha|. This further means that also eyr=wr+εe^{y_{r}}=w_{r}+\varepsilon times any linear combination of products of spatial derivatives of yry_{r} are integrable. In particular, Gauss’ theorem as stated in Lemma C.1 in the Appendix is applicable to vector fields built from such functions.

We can now calculate the derivative of ¯(wrd)\overline{\mathcal{E}}(w_{r}\mathcal{L}^{d}), which we write equivalently in the form

\[\mathcal{E}^{\varepsilon}(w_{r})=\frac{1}{2}\int_{\mathbb{R}^{d}}(w_{r}+\varepsilon)\big\|\nabla^{2}\log(w_{r}+\varepsilon)\big\|^{2}\,\mathrm{d}x=\frac{1}{2}\int_{\mathbb{R}^{d}}e^{y_{r}}\|\nabla^{2}y_{r}\|^{2}\,\mathrm{d}x.\]

Thanks to the aforementioned smoothness of $y_{r}$ and the admissibility of integration by parts via Lemma C.1 from the Appendix, we obtain (recalling (45), and suppressing the sub-index $r>0$ of $y$ from now on)

ddrε(wr)=12dΔey2y2dxdey2y:2(eyΔey)dx=dey3y:2ydxdey2y:2(Δy+|y|2)dx=deyy3y:2ydx+d(ey2y)3ydx2dey2y:(2yy)dx=dey3y2dx2dey2y:([2y]2)dx.\begin{split}-\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{E}^{\varepsilon}(w_{r})&=-\frac{1}{2}\int_{\mathbb{R}^{d}}\Delta e^{y}\|\nabla^{2}y\|^{2}\,\mathrm{d}x-\int_{\mathbb{R}^{d}}e^{y}\nabla^{2}y:\nabla^{2}(e^{-y}\Delta e^{y})\,\mathrm{d}x\\ &=\int_{\mathbb{R}^{d}}\nabla e^{y}\cdot\nabla^{3}y:\nabla^{2}y\,\mathrm{d}x-\int_{\mathbb{R}^{d}}e^{y}\nabla^{2}y:\nabla^{2}\big{(}\Delta y+|\nabla y|^{2}\big{)}\,\mathrm{d}x\\ &=\int_{\mathbb{R}^{d}}e^{y}\nabla y\cdot\nabla^{3}y:\nabla^{2}y\,\mathrm{d}x+\int_{\mathbb{R}^{d}}\nabla\big{(}e^{y}\nabla^{2}y)\vdots\nabla^{3}y\,\mathrm{d}x-2\int_{\mathbb{R}^{d}}e^{y}\nabla^{2}y:\nabla(\nabla^{2}y\cdot\nabla y)\,\mathrm{d}x\\ &=\int_{\mathbb{R}^{d}}e^{y}\interleave\nabla^{3}y\interleave^{2}\,\mathrm{d}x-2\int_{\mathbb{R}^{d}}e^{y}\nabla^{2}y:\big{(}[\nabla^{2}y]^{2}\big{)}\,\mathrm{d}x.\end{split} (46)

From this point on, we proceed in analogy to the proof of [3, inequality (19)]. That means, we define (ad hoc) the smooth and integrable vector field

𝐯\displaystyle\mathbf{v} =ey{[2|y|2+Δy+5y2yy)+5yΔy]y\displaystyle=e^{y}\big{\{}\big{[}2|\nabla y|^{2}+\Delta y+5\nabla y\cdot\nabla^{2}y\cdot\nabla y)+5\nabla y\cdot\nabla\Delta y\big{]}\nabla y
+[3|y|2y+11Δy+24y2y]2y[5yy+112y]:3y}.\displaystyle\quad+\big{[}3|\nabla y|^{2}\nabla y+11\nabla\Delta y+24\nabla y\cdot\nabla^{2}y\big{]}\cdot\nabla^{2}y-\big{[}5\nabla y\otimes\nabla y+11\nabla^{2}y\big{]}:\nabla^{3}y\big{\}}.

The proof of [3, inequality (19)] amounts to showing that

12ey[3y222y:([2y]2)]+div𝐯κ[263ey/22+66|ey/6|6]\displaystyle 12e^{y}\big{[}\interleave\nabla^{3}y\interleave^{2}-2\nabla^{2}y:\big{(}[\nabla^{2}y]^{2}\big{)}\big{]}+\mathrm{div}\mathbf{v}\geq\kappa\big{[}2^{6}\interleave\nabla^{3}e^{y/2}\interleave^{2}+6^{6}|\nabla e^{y/6}|^{6}\big{]} (47)

holds pointwise, i.e., yy’s domain of definition plays no role here. The key idea in proving (47) is that after division by eye^{y}, it becomes an inequality between polynomials in the first, second and third derivatives of yy, hence the proof of its validity is a (cumbersome) algebraic problem. For more details on the choice of 𝐯\mathbf{v} and the general concepts behind the algebraic method for proving entropy dissipation inequalities, we refer the reader to [3, 10].

Integration of (47) on d{\mathbb{R}}^{d}, recalling (46), and making use of Lemma C.1, leads to (44). ∎

We are finally in the position to prove the main estimate (40).

Proof of Proposition 3.9.

This is another application of the flow interchange method, see Lemma 2.2. According to Example 1, the heat flow given by 𝒮rμ=Krμ\mathcal{S}_{r}\mu=K_{r}\ast\mu defines a 0-flow on 𝒫2(d)\mathcal{P}_{2}(\mathbb{R}^{d}) for the (unperturbed) entropy 0\mathcal{H}_{0}. Therefore,

lim infr0(1r[¯0(wrd)¯0(μτn)])0(uτn1)0(uτn)τ.\displaystyle\liminf_{r\downarrow 0}\left(-\frac{1}{r}\big{[}\overline{\mathcal{E}}_{0}(w_{r}\mathcal{L}^{d})-\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n})\big{]}\right)\leq\frac{\mathcal{H}_{0}(u_{\tau}^{n-1})-\mathcal{H}_{0}(u_{\tau}^{n})}{\tau}. (48)

Formally, the left-hand side above is minus the derivative of $r\mapsto\overline{\mathcal{E}}_{0}(w_{r}\mathcal{L}^{d})$ at $r=0$, and the eventual goal is to control this in the spirit of (44), for $\varepsilon=0$. The main technical obstacle that prevents us from carrying out this differentiation, and from concluding (40) directly, is the possible irregularity of $\overline{\mathcal{E}}_{0}(w_{r}\mathcal{L}^{d})$ at $r=0$. It is not even clear that $\overline{\mathcal{E}}_{0}(w_{r}\mathcal{L}^{d})\to\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n})$ as $r\downarrow 0$. Indeed, while lower semi-continuity is known from Proposition 3.4, upper semi-continuity might fail. The problem is that, due to the non-differentiability of $s\mapsto\sqrt[4]{s}$ at $s=0$, one cannot conclude from $\sqrt[4]{u_{\tau}^{n}}\in W^{1,4}({\mathbb{R}}^{d})$ that $\nabla\sqrt[4]{w_{r}}\to\nabla\sqrt[4]{u_{\tau}^{n}}$ in $L^{4}({\mathbb{R}}^{d})$. We tackle this problem by approximating $\overline{\mathcal{E}}_{0}$ by $\mathcal{E}^{\varepsilon}$, for which continuity at $r=0$ has been shown in Lemma 3.11.

It follows from Lemma 3.12 that rε(wr)r\mapsto\mathcal{E}^{\varepsilon}(w_{r}) is differentiable at every r>0r>0, with derivative given in (44). Thanks to Lemma 3.11, we can apply the fundamental theorem of calculus to obtain that

1r¯[ε(wr¯)ε(uτn)]\displaystyle-\frac{1}{\bar{r}}\big{[}\mathcal{E}^{\varepsilon}(w_{\bar{r}})-\mathcal{E}^{\varepsilon}(u_{\tau}^{n})\big{]} =0r¯(ddrε(wr))dr\displaystyle=\fint_{0}^{\bar{r}}\left(-\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{E}^{\varepsilon}(w_{r})\right)\,\mathrm{d}r
κ0r¯d(3wr+ε2+|wr+ε6|6)dxdr\displaystyle\geq\kappa\fint_{0}^{\bar{r}}\int_{\mathbb{R}^{d}}\big{(}\interleave\nabla^{3}\sqrt{w_{r}+\varepsilon}\interleave^{2}+|\nabla\sqrt[6]{w_{r}+\varepsilon}|^{6}\big{)}\,\mathrm{d}x\,\mathrm{d}r

We pass to the limit $\varepsilon\downarrow 0$, using Lemma 3.10 on the left-hand side, and Fatou's lemma together with the lower semi-continuity of norms on the right-hand side. Letting subsequently $\bar{r}\downarrow 0$, we obtain

lim infr¯0(1r¯[¯0(wr¯)¯0(uτn)])\displaystyle\liminf_{\bar{r}\downarrow 0}\left(-\frac{1}{\bar{r}}\big{[}\overline{\mathcal{E}}_{0}(w_{\bar{r}})-\overline{\mathcal{E}}_{0}(u_{\tau}^{n})\big{]}\right) κlim infr¯00r¯d(3wr2+|wr6|6)dxdr\displaystyle\geq\kappa\liminf_{\bar{r}\downarrow 0}\fint_{0}^{\bar{r}}\int_{\mathbb{R}^{d}}\big{(}\interleave\nabla^{3}\sqrt{w_{r}}\interleave^{2}+|\nabla\sqrt[6]{w_{r}}|^{6}\big{)}\,\mathrm{d}x\,\mathrm{d}r
κd(3uτn2+|uτn6|6)dx.\displaystyle\geq\kappa\int_{\mathbb{R}^{d}}\big{(}\interleave\nabla^{3}\sqrt{u_{\tau}^{n}}\interleave^{2}+|\nabla\sqrt[6]{u_{\tau}^{n}}|^{6}\big{)}\,\mathrm{d}x.

Plugging this into (48) yields (40) in the case λ=0\lambda=0.

Passing from $\lambda=0$ to $\lambda>0$ amounts to adding on the right-hand side of (44) the contribution

λ3ddrd|x|2dνr=λ3d|x|2Δwrdx=λ3d2𝑑wrdx=2dλ3,\displaystyle\lambda^{3}\frac{\mathrm{d}}{\,\mathrm{d}r}\int_{\mathbb{R}^{d}}|x|^{2}\,\mathrm{d}\nu_{r}=\lambda^{3}\int_{\mathbb{R}^{d}}|x|^{2}\Delta w_{r}\,\mathrm{d}x=\lambda^{3}\int_{\mathbb{R}^{d}}2dw_{r}\,\mathrm{d}x=2d\lambda^{3},

which is independent of $r>0$. The passage to the limit $r\downarrow 0$ is trivial here. In combination with the result for $\lambda=0$, we arrive at (40). ∎

3.5. A universal bound on the energy

In the spirit of [8, Theorem 1.4], we prove the following bound on the energy.

Proposition 3.13.

There is a constant Eλ{\mathrm{E}_{\lambda}}, expressible in terms of the initial entropy value 0(u0)\mathcal{H}_{0}(u_{0}) and the initial second moment 𝔪2(u0)\mathfrak{m}_{2}(u_{0}), such that, for each N=1,2,N=1,2,\ldots,

¯λ(μτN)Eλ(1+(Nτ)2/3).\displaystyle\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})\leq{\mathrm{E}_{\lambda}}\big{(}1+(N\tau)^{-2/3}\big{)}. (49)

This proposition is a consequence of the following a priori estimates.

Lemma 3.14.

There is a universal constant AA such that, for each N=1,2,N=1,2,\ldots,

κ2τn=1Nd(3uτn2+|uτn6|6)dx\displaystyle\frac{\kappa}{2}\tau\sum_{n=1}^{N}\int_{\mathbb{R}^{d}}\big{(}\interleave\nabla^{3}\sqrt{u_{\tau}^{n}}\interleave^{2}+|\nabla\sqrt[6]{u_{\tau}^{n}}|^{6}\big{)}\,\mathrm{d}x π𝔪2(u0)+0(u0)+(A+2dλ3)Nτ,\displaystyle\leq\pi\mathfrak{m}_{2}(u_{0})+\mathcal{H}_{0}(u_{0})+(A+2d\lambda^{3})N\tau, (50)
π𝔪2(uτN)\displaystyle\pi\mathfrak{m}_{2}(u_{\tau}^{N}) 2π𝔪2(u0)+0(u0)+2(A+dλ3)Nτ.\displaystyle\leq 2\pi\mathfrak{m}_{2}(u_{0})+\mathcal{H}_{0}(u_{0})+2(A+d\lambda^{3})N\tau. (51)

As a technical ingredient in the proofs of both Proposition 3.13 and Lemma 3.14 above, we need:

Lemma 3.15.

There exists a constant BB such that for any μ=udDom(¯0)\mu=u\mathcal{L}^{d}\in\mathrm{Dom}(\overline{\mathcal{E}}_{0}),

[¯0(μ)]3/2Bd3u2dx.\displaystyle\big{[}\overline{\mathcal{E}}_{0}(\mu)\big{]}^{3/2}\leq B\int_{\mathbb{R}^{d}}\interleave\nabla^{3}\sqrt{u}\interleave^{2}\,\mathrm{d}x. (52)

And consequently, for each ε>0\varepsilon>0, there exists a CεC_{\varepsilon} independent of μ\mu such that

¯0(μ)εd3u2dx+Cε.\displaystyle\overline{\mathcal{E}}_{0}(\mu)\leq\varepsilon\int_{\mathbb{R}^{d}}\interleave\nabla^{3}\sqrt{u}\interleave^{2}\,\mathrm{d}x+C_{\varepsilon}. (53)
Proof of Proposition 3.13 from Lemmas 3.14 and 3.15.

Without loss of generality, it suffices to prove

\[\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})\leq{\mathrm{E}_{\lambda}}\,(N\tau)^{-2/3}\tag{54}\]

for all NN such that Nτ1N\tau\leq 1; the estimate (49) for larger NN is then a trivial consequence of the monotonicity (30). Choosing NN accordingly, the monotonicity (30) of ¯λ\overline{\mathcal{E}}_{\lambda} implies that

Nτ[¯λ(μτN)]3/2τn=1N[¯λ(μτn)]3/2.\displaystyle N\tau\big{[}\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})\big{]}^{3/2}\leq\tau\sum_{n=1}^{N}\big{[}\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{n})\big{]}^{3/2}.

Substitute the elementary estimate

[¯λ(μτn)]3/22(¯0(μτn)3/2+λ9/2𝔪2(μτn)3/2)\displaystyle\big{[}\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{n})\big{]}^{3/2}\leq\sqrt{2}\left(\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n})^{3/2}+\lambda^{9/2}\mathfrak{m}_{2}(\mu_{\tau}^{n})^{3/2}\right)

in the sum on the right-hand side and use (52) to obtain

\[N\tau\big[\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})\big]^{3/2}\leq\tau\sum_{n=1}^{N}\sqrt{2}\left(B\int_{\mathbb{R}^{d}}\interleave\nabla^{3}\sqrt{u_{\tau}^{n}}\interleave^{2}\,\mathrm{d}x+\lambda^{9/2}\mathfrak{m}_{2}(\mu_{\tau}^{n})^{3/2}\right).\]

Since we assume $n\tau\leq N\tau\leq 1$, the terms on the right-hand side are bounded thanks to (50) and (51). That is, $N\tau\big[\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{N})\big]^{3/2}\leq C$, where $C$ depends only on $\mathcal{H}_{0}(u_{0})$ and $\mathfrak{m}_{2}(u_{0})$. Divide by $N\tau$ and take the power $2/3$ to obtain (54) with ${\mathrm{E}_{\lambda}}=C^{2/3}$. ∎
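The elementary estimate used in the proof above is the convexity of $s\mapsto s^{3/2}$, applied to $a=\overline{\mathcal{E}}_{0}$ and $b=\lambda^{3}\mathfrak{m}_{2}$. As a quick numerical sanity check, the following sketch compares both sides on a sample grid. The grid values and the helper names `lhs` and `rhs` are illustrative assumptions and not part of the argument.

```python
import math

def lhs(a: float, b: float) -> float:
    """(a + b)^(3/2), the quantity to be bounded."""
    return (a + b) ** 1.5

def rhs(a: float, b: float) -> float:
    """sqrt(2) * (a^(3/2) + b^(3/2)), the claimed upper bound."""
    return math.sqrt(2.0) * (a ** 1.5 + b ** 1.5)

# Hypothetical sample values for a = E_0 and b = lambda^3 * m_2 (both non-negative).
grid = [0.0, 0.1, 0.5, 1.0, 2.0, 7.5, 100.0]

# Smallest observed slack rhs - lhs over the grid; it should never be negative
# beyond rounding error.
worst_gap = min(rhs(a, b) - lhs(a, b) for a in grid for b in grid)

# Equality is attained for a = b, which shows that sqrt(2) cannot be improved.
equality_gap = rhs(1.0, 1.0) - lhs(1.0, 1.0)
```

The equality case $a=b$ indicates that the factor $\sqrt{2}$ is sharp for this particular inequality.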

Proof of Lemma 3.14.

Summation of (40) from n=1n=1 to n=Nn=N yields

κτn=1Nd(3uτn2+|uτn6|6)dx0(u0)0(uτN)+2dλ3Nτ.\displaystyle\kappa\tau\sum_{n=1}^{N}\int_{\mathbb{R}^{d}}\big{(}\interleave\nabla^{3}\sqrt{u_{\tau}^{n}}\interleave^{2}+|\nabla\sqrt[6]{u_{\tau}^{n}}|^{6}\big{)}\,\mathrm{d}x\leq\mathcal{H}_{0}(u_{0})-\mathcal{H}_{0}(u_{\tau}^{N})+2d\lambda^{3}N\tau. (55)

We shall now derive a suitable lower bound on 0(uτN)\mathcal{H}_{0}(u_{\tau}^{N}). More precisely, we derive an upper bound on 𝔪2(uτN)\mathfrak{m}_{2}(u_{\tau}^{N}); notice that entropy and second moment are connected via

0(uτN)π𝔪2(uτN),\displaystyle-\mathcal{H}_{0}(u_{\tau}^{N})\leq\pi\mathfrak{m}_{2}(u_{\tau}^{N}), (56)

which follows from a scaling argument — see e.g. [8, Section 2.2] for details. To estimate the second moment, we apply once again the flow interchange technique. The flow 𝒮()\mathcal{S}_{(\cdot)} is given by exponential dilations, that is 𝒮σμ=(eσid)#μ\mathcal{S}_{\sigma}\mu=(e^{-\sigma}\textnormal{id})\#\mu, or more explicitly in terms of the densities uσu_{\sigma} of 𝒮σμ\mathcal{S}_{\sigma}\mu:

uσ(x)=edσu(eσx).\displaystyle u_{\sigma}(x)=e^{d\sigma}u(e^{\sigma}x).

Since, by the chain rule,

[2loguσ](x)=e2σ[2logu](eσx),\displaystyle\big{[}\nabla^{2}\log u_{\sigma}\big{]}(x)=e^{2\sigma}\big{[}\nabla^{2}\log u\big{]}(e^{\sigma}x),

and since

𝔪2(uσ)=d|x|2uσ(x)dx=d|eσy|2u(y)dy=e2σ𝔪2(u),\displaystyle\mathfrak{m}_{2}(u_{\sigma})=\int_{\mathbb{R}^{d}}|x|^{2}u_{\sigma}(x)\,\mathrm{d}x=\int_{\mathbb{R}^{d}}|e^{-\sigma}y|^{2}u(y)\,\mathrm{d}y=e^{-2\sigma}\mathfrak{m}_{2}(u),

it is easily seen that

λ(uσ)=e4σ0(u)+λ3e2σ𝔪2(u).\mathcal{E}_{\lambda}(u_{\sigma})=e^{4\sigma}\mathcal{E}_{0}(u)+\lambda^{3}e^{-2\sigma}\mathfrak{m}_{2}(u).

Clearly, this scaling property carries over to the extension ¯λ\overline{\mathcal{E}}_{\lambda}, and we obtain

ddσ|σ=0¯λ(𝒮σμτn)=4¯0(μτn)2λ3𝔪2(μτn).\displaystyle\frac{\mathrm{d}}{\,\mathrm{d}\sigma}\bigg{|}_{\sigma=0}\overline{\mathcal{E}}_{\lambda}(\mathcal{S}_{\sigma}\mu_{\tau}^{n})=4\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n})-2\lambda^{3}\mathfrak{m}_{2}(\mu_{\tau}^{n}).
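The scaling law and the resulting derivative at $\sigma=0$ can be checked numerically in dimension $d=1$. The sketch below is only an illustration under stated assumptions: the test density $u\propto\exp(-x^{4}/4-x^{2}/2)$, the integration window, the grid size, and the normalization $\mathcal{E}_{0}(u)=\frac{1}{2}\int u\,|(\log u)''|^{2}\,\mathrm{d}x$ are our choices, and all function names are hypothetical.

```python
import math

def trapz(f, a, b, n):
    """Composite trapezoidal rule for f on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Hypothetical test density u(x) = exp(-x^4/4 - x^2/2) / Z in dimension d = 1.
def log_u_unnormalized(x):
    return -x ** 4 / 4.0 - x ** 2 / 2.0

def d2_log_u(x):
    """Second derivative of log u (the normalization constant drops out)."""
    return -3.0 * x ** 2 - 1.0

R, N_GRID = 8.0, 4000  # integration window and grid size (assumptions)
Z = trapz(lambda x: math.exp(log_u_unnormalized(x)), -R, R, N_GRID)

def u_sigma(x, sigma):
    """Dilated density u_sigma(x) = e^{sigma} u(e^{sigma} x) for d = 1."""
    s = math.exp(sigma)
    return s * math.exp(log_u_unnormalized(s * x)) / Z

def energy0(sigma):
    """E_0(u_sigma), using (log u_sigma)''(x) = e^{2 sigma} (log u)''(e^{sigma} x)."""
    s = math.exp(sigma)
    return 0.5 * trapz(
        lambda x: u_sigma(x, sigma) * (s * s * d2_log_u(s * x)) ** 2, -R, R, N_GRID
    )

def second_moment(sigma):
    """m_2(u_sigma) = int x^2 u_sigma(x) dx."""
    return trapz(lambda x: x * x * u_sigma(x, sigma), -R, R, N_GRID)

def energy_lambda(sigma, lam):
    """E_lambda(u_sigma) = E_0(u_sigma) + lam^3 * m_2(u_sigma)."""
    return energy0(sigma) + lam ** 3 * second_moment(sigma)

def derivative_at_zero(lam, h=1e-4):
    """Central finite difference of sigma -> E_lambda(u_sigma) at sigma = 0."""
    return (energy_lambda(h, lam) - energy_lambda(-h, lam)) / (2.0 * h)
```

Comparing `derivative_at_zero(lam)` with `4 * energy0(0.0) - 2 * lam**3 * second_moment(0.0)` reproduces the displayed identity up to discretization error.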

Recall from Example 1 that 𝒮()\mathcal{S}_{(\cdot)} is a 11-flow for the auxiliary functional 12𝔪2\frac{1}{2}\mathfrak{m}_{2}. The flow interchange estimate (23) now yields

4¯0(μτn)+2λ3𝔪2(μτn)\displaystyle-4\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n})+2\lambda^{3}\mathfrak{m}_{2}(\mu_{\tau}^{n}) =lim supσ0¯λ(μτn)¯λ(𝒮σμτn)σ\displaystyle=\limsup_{\sigma\to 0}\frac{\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{n})-\overline{\mathcal{E}}_{\lambda}(\mathcal{S}_{\sigma}\mu_{\tau}^{n})}{\sigma}
𝔪2(μτn1)𝔪2(μτn)2ττ2(W2(μτn,μτn1)τ)2.\displaystyle\leq\frac{\mathfrak{m}_{2}(\mu_{\tau}^{n-1})-\mathfrak{m}_{2}(\mu_{\tau}^{n})}{2\tau}-\frac{\tau}{2}\left(\frac{\textnormal{W}_{2}(\mu_{\tau}^{n},\mu_{\tau}^{n-1})}{\tau}\right)^{2}.

Neglecting the last term, we obtain from here the recursion formula

𝔪2(μτn)𝔪2(μτn1)+8τ¯0(μτn)4λ3τ𝔪2(μτn),\displaystyle\mathfrak{m}_{2}(\mu_{\tau}^{n})\leq\mathfrak{m}_{2}(\mu_{\tau}^{n-1})+8\tau\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n})-4\lambda^{3}\tau\mathfrak{m}_{2}(\mu_{\tau}^{n}),

which clearly implies that

𝔪2(μτN)𝔪2(μτ0)+8τn=1N¯0(μτn).\displaystyle\mathfrak{m}_{2}(\mu_{\tau}^{N})\leq\mathfrak{m}_{2}(\mu_{\tau}^{0})+8\tau\sum_{n=1}^{N}\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n}).

On the right-hand side, we estimate ¯0(μτn)\overline{\mathcal{E}}_{0}(\mu_{\tau}^{n}) by means of (53) with ε:=κ/(16π)\varepsilon:=\kappa/(16\pi),

𝔪2(μτN)𝔪2(μτ0)+κ2πτn=1Nd3uτn2dx+8CεNτ.\displaystyle\mathfrak{m}_{2}(\mu_{\tau}^{N})\leq\mathfrak{m}_{2}(\mu_{\tau}^{0})+\frac{\kappa}{2\pi}\tau\sum_{n=1}^{N}\int_{\mathbb{R}^{d}}\interleave\nabla^{3}\sqrt{u_{\tau}^{n}}\interleave^{2}\,\mathrm{d}x+8C_{\varepsilon}N\tau. (57)

By means of the inequality (56) between entropy and second moment, this provides an estimate from above on 0(uτN)-\mathcal{H}_{0}(u_{\tau}^{N}) on the right-hand side of (55). Rearranging terms, one finally obtains (50), with A:=8πCεA:=8\pi C_{\varepsilon}. The estimate (51) is now easily obtained by estimating the first sum in (57) above with (50), and neglecting the second sum. ∎

Proof of Lemma 3.15.

By standard interpolation of Sobolev norms, there exists a constant AA such that

2fL22A3fL24/3fL22/3\displaystyle\|\nabla^{2}f\|_{L^{2}}^{2}\leq A\|\nabla^{3}f\|_{L^{2}}^{4/3}\|f\|_{L^{2}}^{2/3}

for all fW3,2(d)f\in W^{3,2}({\mathbb{R}}^{d}). Apply this to f:=uf:=\sqrt{u}. Since

uL22=dudx=1,\displaystyle\big{\|}\sqrt{u}\big{\|}_{L^{2}}^{2}=\int_{\mathbb{R}^{d}}u\,\mathrm{d}x=1,

we obtain by means of (27) that

¯0(μ)CA3uL24/3,\displaystyle\overline{\mathcal{E}}_{0}(\mu)\leq C^{\prime}A\big{\|}\nabla^{3}\sqrt{u}\big{\|}_{L^{2}}^{4/3},

which is (52) after raising both sides to the power $3/2$, with $B:=(C^{\prime}A)^{3/2}$. The other estimate (53) follows directly from (52) via Young's inequality. ∎
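The passage from (52) to (53) can be made explicit. Writing $X$ for the integral on the right-hand side of (52), one has $\overline{\mathcal{E}}_{0}\leq(BX)^{2/3}$, and maximizing $(BX)^{2/3}-\varepsilon X$ over $X\geq 0$ suggests the admissible value $C_{\varepsilon}=4B^{2}/(27\varepsilon^{2})$. This explicit constant is our own computation and is not stated in the text; the sketch below merely checks it numerically, with hypothetical helper names and sample values.

```python
def young_constant(B: float, eps: float) -> float:
    """Candidate C_eps = 4 B^2 / (27 eps^2), the maximum of (B X)^(2/3) - eps X over X >= 0."""
    return 4.0 * B ** 2 / (27.0 * eps ** 2)

def slack(B: float, eps: float, X: float) -> float:
    """eps * X + C_eps - (B X)^(2/3); non-negative exactly when (53) follows from (52)."""
    return eps * X + young_constant(B, eps) - (B * X) ** (2.0 / 3.0)

def critical_point(B: float, eps: float) -> float:
    """Maximizer X* = 8 B^2 / (27 eps^3), where the slack vanishes."""
    return 8.0 * B ** 2 / (27.0 * eps ** 3)

# Hypothetical sample values for B, eps and X.
samples = [(B, eps, X)
           for B in (0.5, 2.0, 10.0)
           for eps in (0.01, 0.5, 3.0)
           for X in (0.0, 1e-3, 1.0, 50.0, 1e4)]
min_slack = min(slack(B, eps, X) for (B, eps, X) in samples)
```

Since the slack vanishes at `critical_point(B, eps)`, this value of $C_{\varepsilon}$ is the smallest one compatible with (52) alone.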

Corollary 3.16.

There is a constant CC, expressible in terms of the initial entropy 0(u0)\mathcal{H}_{0}(u_{0}) and the initial second moment 𝔪2(u0)\mathfrak{m}_{2}(u_{0}) alone, such that, for each T>0T>0,

0Td(3u¯τ2+|u¯τ6|6)dxdtC(1+T)\displaystyle\int_{0}^{T}\int_{\mathbb{R}^{d}}\big{(}\interleave\nabla^{3}\sqrt{\bar{u}_{\tau}}\interleave^{2}+|\nabla\sqrt[6]{\bar{u}_{\tau}}|^{6}\big{)}\,\mathrm{d}x\,\mathrm{d}t\leq C(1+T) (58)
Proof.

This is a direct consequence of (50). ∎

3.6. Uniform almost continuity of the discrete trajectory

In this section we show that the piecewise constant interpolations μ¯τ\bar{\mu}_{\tau} are “uniformly almost continuous” as curves in 𝒫2(d)\mathcal{P}_{2}(\mathbb{R}^{d}). More specifically, we derive an approximate Hölder estimate with a τ\tau-independent Hölder constant, see Proposition 3.20 below.

Lemma 3.17.

There is a constant AA, depending on μ0=u0d\mu_{0}=u_{0}\mathcal{L}^{d} just in terms of 𝔪2(μ0)\mathfrak{m}_{2}(\mu_{0}), such that for every τ(0,1)\tau\in(0,1):

W2(μτ1,μ0)Aτ1/6.\displaystyle\textnormal{W}_{2}(\mu_{\tau}^{1},\mu_{0})\leq A\tau^{1/6}. (59)
Proof.

The idea is to show that, for a constant AA^{\prime} expressible just in terms of 𝔪2(μ0)\mathfrak{m}_{2}(\mu_{0}),

12τW22(𝒦σμ0,μ0)+λ(𝒦σu0)Aτ2/3.\displaystyle\frac{1}{2\tau}\textnormal{W}_{2}^{2}({\mathcal{K}}_{\sigma}\ast\mu_{0},\mu_{0})+\mathcal{E}_{\lambda}({\mathcal{K}}_{\sigma}\ast u_{0})\leq A^{\prime}\tau^{-2/3}. (60)

It then follows thanks to non-negativity of ¯λ\overline{\mathcal{E}}_{\lambda} that the minimizer μτ1\mu_{\tau}^{1} of ¯λ,τ(;μ0)\overline{\mathcal{E}}_{\lambda,\tau}(\cdot;\mu_{0}) satisfies

12τW22(μτ1,μ0)¯λ,τ(μτ1;μ0)¯λ,τ(𝒦σμτ0;μ0)Aτ2/3,\displaystyle\frac{1}{2\tau}\textnormal{W}_{2}^{2}(\mu_{\tau}^{1},\mu_{0})\leq\overline{\mathcal{E}}_{\lambda,\tau}(\mu_{\tau}^{1};\mu_{0})\leq\overline{\mathcal{E}}_{\lambda,\tau}({\mathcal{K}}_{\sigma}\ast\mu_{\tau}^{0};\mu_{0})\leq A^{\prime}\tau^{-2/3},

which immediately implies (59). To prove (60), we show that, on the one hand,

W22(𝒦σμ0,μ0)2dσ,\displaystyle\textnormal{W}_{2}^{2}({\mathcal{K}}_{\sigma}\ast\mu_{0},\mu_{0})\leq 2d\sigma, (61)

and that, on the other hand,

λ(𝒦σu0)Bσ2,\displaystyle\mathcal{E}_{\lambda}({\mathcal{K}}_{\sigma}\ast u_{0})\leq B\sigma^{-2}, (62)

for all σ(0,1)\sigma\in(0,1), with a constant BB that again depends on μ0\mu_{0} only via 𝔪2(μ0)\mathfrak{m}_{2}(\mu_{0}). The choice σ:=τ1/3\sigma:=\tau^{1/3} then yields (60). For the proof of (61), we compare the Wasserstein distance with the transport cost generated by the plan π\pi that has Lebesgue density

g(x,y)=u0(x)𝒦σ(yx).\displaystyle g(x,y)=u_{0}(x){\mathcal{K}}_{\sigma}(y-x).

It is easily seen that the two marginals are indeed u0u_{0} and 𝒦σu0{\mathcal{K}}_{\sigma}\ast u_{0}, respectively. According to (19), we have

W22(𝒦σμ0,μ0)d×d|xy|2g(x,y)d(x,y)=du0(x)dxd|z|2𝒦σ(z)dz=2dσ.\displaystyle\textnormal{W}_{2}^{2}({\mathcal{K}}_{\sigma}\ast\mu_{0},\mu_{0})\leq\int_{{\mathbb{R}}^{d}\times{\mathbb{R}}^{d}}|x-y|^{2}g(x,y)\,\mathrm{d}(x,y)=\int_{\mathbb{R}^{d}}u_{0}(x)\,\mathrm{d}x\int_{\mathbb{R}^{d}}|z|^{2}{\mathcal{K}}_{\sigma}(z)\,\mathrm{d}z=2d\sigma.

The proof of (62) is a bit more elaborate. First, note that

𝒦σ(||2)(y)\displaystyle{\mathcal{K}}_{\sigma}\ast\big{(}|\cdot|^{2}\big{)}(y) =d|yx|2𝒦σ(x)dx\displaystyle=\int_{\mathbb{R}^{d}}|y-x|^{2}{\mathcal{K}}_{\sigma}(x)\,\mathrm{d}x
=|y|2d𝒦σ(x)dx2ydx𝒦σ(x)dx+d|x|2𝒦σ(x)dx\displaystyle=|y|^{2}\int_{\mathbb{R}^{d}}{\mathcal{K}}_{\sigma}(x)\,\mathrm{d}x-2y\cdot\int_{\mathbb{R}^{d}}x{\mathcal{K}}_{\sigma}(x)\,\mathrm{d}x+\int_{\mathbb{R}^{d}}|x|^{2}{\mathcal{K}}_{\sigma}(x)\,\mathrm{d}x
=|y|22y0+2dσ,\displaystyle=|y|^{2}-2y\cdot 0+2d\sigma,

and hence

𝔪2(𝒦σu0)=d|x|2d(𝒦σu0)(x)\displaystyle\mathfrak{m}_{2}({\mathcal{K}}_{\sigma}\ast u_{0})=\int_{\mathbb{R}^{d}}|x|^{2}\,\mathrm{d}\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}(x) =d𝒦σ(||2)(y)dμ0(y)\displaystyle=\int_{\mathbb{R}^{d}}{\mathcal{K}}_{\sigma}\ast\big{(}|\cdot|^{2}\big{)}(y)\,\mathrm{d}\mu_{0}(y)
=2dσddμ0(y)+d|y|2dμ0(y)=2dσ+𝔪2(μ0).\displaystyle=2d\sigma\int_{\mathbb{R}^{d}}\,\mathrm{d}\mu_{0}(y)+\int_{\mathbb{R}^{d}}|y|^{2}\,\mathrm{d}\mu_{0}(y)=2d\sigma+\mathfrak{m}_{2}(\mu_{0}).

Concerning the estimate of 0\mathcal{E}_{0}, observe that for a smooth and positive probability density ff,

f2logf2=f2fffff2222f2f+2|f|4f3.\displaystyle f\big{\|}\nabla^{2}\log f\big{\|}^{2}=f\left\|\frac{\nabla^{2}f}{f}-\frac{\nabla f\otimes\nabla f}{f^{2}}\right\|^{2}\leq 2\frac{\|\nabla^{2}f\|^{2}}{f}+2\frac{|\nabla f|^{4}}{f^{3}}.

Now we plug f:=𝒦σu0f:={\mathcal{K}}_{\sigma}\ast u_{0} in. By Jensen’s inequality,

2(𝒦σu0)2(x)\displaystyle\big{\|}\nabla^{2}({\mathcal{K}}_{\sigma}\ast u_{0})\big{\|}^{2}(x) =(𝒦σu0)2(x)d2𝒦σ(y)𝒦σ(y)𝒦σ(y)u0(xy)dy(𝒦σu0)(x)2\displaystyle=\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}^{2}(x)\left\|\int_{\mathbb{R}^{d}}\frac{\nabla^{2}{\mathcal{K}}_{\sigma}(y)}{{\mathcal{K}}_{\sigma}(y)}\,\frac{{\mathcal{K}}_{\sigma}(y)u_{0}(x-y)\,\mathrm{d}y}{\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}(x)}\right\|^{2}
(𝒦σu0)2(x)d2𝒦σ(y)𝒦σ(y)2𝒦σ(y)u0(xy)dy(𝒦σu0)(x)\displaystyle\leq\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}^{2}(x)\int_{\mathbb{R}^{d}}\left\|\frac{\nabla^{2}{\mathcal{K}}_{\sigma}(y)}{{\mathcal{K}}_{\sigma}(y)}\right\|^{2}\,\frac{{\mathcal{K}}_{\sigma}(y)u_{0}(x-y)\,\mathrm{d}y}{\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}(x)}
=(𝒦σu0)(x)(2𝒦σ2𝒦σu0)(x).\displaystyle=\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}(x)\,\left(\frac{\|\nabla^{2}{\mathcal{K}}_{\sigma}\|^{2}}{{\mathcal{K}}_{\sigma}}\ast u_{0}\right)(x).

It thus follows that

d2(𝒦σu0)2𝒦σu0dxd(2𝒦σ2𝒦σu0)dx=d2𝒦σ2𝒦σdx=(12σ)2d(d+1).\displaystyle\int_{\mathbb{R}^{d}}\frac{\|\nabla^{2}({\mathcal{K}}_{\sigma}\ast u_{0})\|^{2}}{{\mathcal{K}}_{\sigma}\ast u_{0}}\,\mathrm{d}x\leq\int_{\mathbb{R}^{d}}\left(\frac{\|\nabla^{2}{\mathcal{K}}_{\sigma}\|^{2}}{{\mathcal{K}}_{\sigma}}\ast u_{0}\right)\,\mathrm{d}x=\int_{\mathbb{R}^{d}}\frac{\|\nabla^{2}{\mathcal{K}}_{\sigma}\|^{2}}{{\mathcal{K}}_{\sigma}}\,\mathrm{d}x=\left(\frac{1}{2\sigma}\right)^{2}d(d+1).

In an analogous manner,

|(𝒦σu0)|4(x)\displaystyle\big{|}\nabla({\mathcal{K}}_{\sigma}\ast u_{0})\big{|}^{4}(x) =(𝒦σu0)4(x)|d𝒦σ(y)𝒦σ(y)𝒦σ(y)u0(xy)dy(𝒦σu0)(x)|4\displaystyle=\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}^{4}(x)\,\left|\int_{\mathbb{R}^{d}}\frac{\nabla{\mathcal{K}}_{\sigma}(y)}{{\mathcal{K}}_{\sigma}(y)}\,\frac{{\mathcal{K}}_{\sigma}(y)u_{0}(x-y)\,\mathrm{d}y}{\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}(x)}\right|^{4}
(𝒦σu0)4(x)d|𝒦σ(y)𝒦σ(y)|4𝒦σ(y)u0(xy)dy(𝒦σu0)(x)\displaystyle\leq\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}^{4}(x)\,\int_{\mathbb{R}^{d}}\left|\frac{\nabla{\mathcal{K}}_{\sigma}(y)}{{\mathcal{K}}_{\sigma}(y)}\right|^{4}\,\frac{{\mathcal{K}}_{\sigma}(y)u_{0}(x-y)\,\mathrm{d}y}{\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}(x)}
=(𝒦σu0)3(x)(|𝒦σ|4𝒦σ3u0)(x),\displaystyle=\big{(}{\mathcal{K}}_{\sigma}\ast u_{0}\big{)}^{3}(x)\,\left(\frac{|\nabla{\mathcal{K}}_{\sigma}|^{4}}{{\mathcal{K}}_{\sigma}^{3}}\ast u_{0}\right)(x),

implying that

d|(𝒦σu0)|4(𝒦σu0)3dxd(|𝒦σ|4𝒦σ3u0)dx=d|𝒦σ|4𝒦σ3dx=(12σ)2d(d+2).\displaystyle\int_{\mathbb{R}^{d}}\frac{|\nabla({\mathcal{K}}_{\sigma}\ast u_{0})|^{4}}{({\mathcal{K}}_{\sigma}\ast u_{0})^{3}}\,\mathrm{d}x\leq\int_{\mathbb{R}^{d}}\left(\frac{|\nabla{\mathcal{K}}_{\sigma}|^{4}}{{\mathcal{K}}_{\sigma}^{3}}\ast u_{0}\right)\,\mathrm{d}x=\int_{\mathbb{R}^{d}}\frac{|\nabla{\mathcal{K}}_{\sigma}|^{4}}{{\mathcal{K}}_{\sigma}^{3}}\,\mathrm{d}x=\left(\frac{1}{2\sigma}\right)^{2}d(d+2).

Collecting these estimates, we finally obtain

λ(𝒦σu0)\displaystyle\mathcal{E}_{\lambda}({\mathcal{K}}_{\sigma}\ast u_{0}) =0(𝒦σu0)+λ3𝔪2(𝒦σu0)\displaystyle=\mathcal{E}_{0}({\mathcal{K}}_{\sigma}\ast u_{0})+\lambda^{3}\mathfrak{m}_{2}({\mathcal{K}}_{\sigma}\ast u_{0})
d2(𝒦σu0)2𝒦σu0dx+d|(𝒦σu0)|4(𝒦σu0)3dx+λ3𝔪2(𝒦σu0)\displaystyle\leq\int_{\mathbb{R}^{d}}\frac{\|\nabla^{2}({\mathcal{K}}_{\sigma}\ast u_{0})\|^{2}}{{\mathcal{K}}_{\sigma}\ast u_{0}}\,\mathrm{d}x+\int_{\mathbb{R}^{d}}\frac{|\nabla({\mathcal{K}}_{\sigma}\ast u_{0})|^{4}}{({\mathcal{K}}_{\sigma}\ast u_{0})^{3}}\,\mathrm{d}x+\lambda^{3}\mathfrak{m}_{2}({\mathcal{K}}_{\sigma}\ast u_{0})
(12σ)2d(2d+3)+λ3[2dσ+𝔪2(u0)].\displaystyle\leq\left(\frac{1}{2\sigma}\right)^{2}d(2d+3)+\lambda^{3}\big{[}2d\sigma+\mathfrak{m}_{2}(u_{0})\big{]}.

From here, (62) follows immediately. ∎
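The three Gaussian integrals used in this proof can be checked by quadrature for $d=1$, where they should equal $2\sigma$, $2/(4\sigma^{2})$ and $3/(4\sigma^{2})$, respectively. The sketch assumes that the kernel from (41) is normalized as ${\mathcal{K}}_{\sigma}(x)=(4\pi\sigma)^{-1/2}e^{-x^{2}/(4\sigma)}$, which is consistent with the second moment $2d\sigma$ used above; the function names, window and grid size are hypothetical.

```python
import math

def trapz(f, a, b, n):
    """Composite trapezoidal rule for f on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def heat_kernel(x, sigma):
    """Assumed 1D heat kernel: a centered Gaussian with variance 2 * sigma."""
    return math.exp(-x * x / (4.0 * sigma)) / math.sqrt(4.0 * math.pi * sigma)

def kernel_integrals(sigma, n=8000):
    """Return (int x^2 K, int (K'')^2 / K, int (K')^4 / K^3) for d = 1.

    The ratios K'/K = -x/(2 sigma) and K''/K = x^2/(4 sigma^2) - 1/(2 sigma)
    are inserted in closed form to avoid dividing by tiny kernel values.
    """
    R = 40.0 * math.sqrt(sigma)  # the integrands are negligible beyond this window
    second_moment = trapz(lambda x: x * x * heat_kernel(x, sigma), -R, R, n)
    hessian_term = trapz(
        lambda x: (x * x / (4.0 * sigma ** 2) - 1.0 / (2.0 * sigma)) ** 2
        * heat_kernel(x, sigma), -R, R, n)
    gradient_term = trapz(
        lambda x: (x / (2.0 * sigma)) ** 4 * heat_kernel(x, sigma), -R, R, n)
    return second_moment, hessian_term, gradient_term
```

For instance, `kernel_integrals(0.5)` should be close to `(1.0, 2.0, 3.0)`, matching $d(d+1)$ and $d(d+2)$ with $d=1$ and $(2\sigma)^{-2}=1$.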

Lemma 3.18.

There is a constant L\mathrm{L} such that, for all M¯>M¯1\overline{M}>\underline{M}\geq 1 with M¯τ1\overline{M}\tau\leq 1,

W2(uτM¯,uτM¯)L(M¯τM¯τ)1/12.\displaystyle\textnormal{W}_{2}(u_{\tau}^{\overline{M}},u_{\tau}^{\underline{M}})\leq\mathrm{L}\big{(}\overline{M}\tau-\underline{M}\tau\big{)}^{1/12}. (63)
Proof.

Let NτN_{\tau} be the largest integer NN with Nτ1N\tau\leq 1. Using Lemma A.1 from the appendix and Proposition 3.13, we find that

\begin{align*}
\tau\sum_{n=1}^{N_{\tau}}(n\tau)^{5/6}\left(\frac{\textnormal{W}_{2}(u_{\tau}^{n+1},u_{\tau}^{n})}{\tau}\right)^{2}
&\leq\tau\sum_{n=1}^{N_{\tau}}\left[\tau\sum_{N=1}^{n}(\tau N)^{-1/6}\left(\frac{\textnormal{W}_{2}(u_{\tau}^{n+1},u_{\tau}^{n})}{\tau}\right)^{2}\right]\\
&\leq\tau\sum_{N=1}^{N_{\tau}}\left[(\tau N)^{-1/6}\ \tau\sum_{n=N}^{\infty}\left(\frac{\textnormal{W}_{2}(u_{\tau}^{n+1},u_{\tau}^{n})}{\tau}\right)^{2}\right]\\
&\leq 2\tau\sum_{N=1}^{N_{\tau}}(\tau N)^{-1/6}\overline{\mathcal{E}}_{\lambda}(u_{\tau}^{N})\\
&\leq 4{\mathrm{E}_{\lambda}}\,\tau\sum_{N=1}^{N_{\tau}}(\tau N)^{-5/6}\leq 4{\mathrm{E}_{\lambda}}\int_{0}^{1}s^{-5/6}\,\mathrm{d}s=24{\mathrm{E}_{\lambda}}.
\end{align*}
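The last step of this chain compares a Riemann sum of the decreasing function $s\mapsto s^{-5/6}$ with its integral: $\tau\sum_{N=1}^{N_{\tau}}(\tau N)^{-5/6}\leq\int_{0}^{1}s^{-5/6}\,\mathrm{d}s=6$. The sketch below evaluates the sum for a few step sizes. Representing $\tau$ as an exact `Fraction` (so that $N_{\tau}$ is computed without rounding) and the name `weighted_step_sum` are our own choices.

```python
import math
from fractions import Fraction

def weighted_step_sum(tau: Fraction) -> float:
    """tau * sum_{N=1}^{N_tau} (tau * N)^(-5/6).

    N_tau is the largest integer N with N * tau <= 1, computed exactly
    from the rational step size tau.
    """
    n_tau = math.floor(1 / tau)
    t = float(tau)
    return sum(t * (t * N) ** (-5.0 / 6.0) for N in range(1, n_tau + 1))

# Hypothetical step sizes; each value should stay below the integral bound 6.
step_sizes = [Fraction(1, 2), Fraction(3, 7), Fraction(1, 10), Fraction(1, 1000)]
sums = [weighted_step_sum(tau) for tau in step_sizes]
```

Multiplying the bound $6$ by $4{\mathrm{E}_{\lambda}}$ gives the constant $24{\mathrm{E}_{\lambda}}$ at the end of the chain.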

Now we combine this with the triangle inequality for 𝐖2{\mathbf{W}}_{2}, the basic energy estimate (31), and Lemma A.2 from the appendix:

\begin{align*}
\textnormal{W}_{2}(u_{\tau}^{\overline{M}},u_{\tau}^{\underline{M}})
&\leq\sum_{n=\underline{M}}^{\overline{M}-1}\textnormal{W}_{2}(u_{\tau}^{n+1},u_{\tau}^{n})\\
&\leq\left(\tau\sum_{n=1}^{N_{\tau}}(n\tau)^{5/6}\left(\frac{\textnormal{W}_{2}(u_{\tau}^{n+1},u_{\tau}^{n})}{\tau}\right)^{2}\right)^{1/2}\left(\tau\sum_{n=\underline{M}}^{\overline{M}-1}(n\tau)^{-5/6}\right)^{1/2}\\
&\leq(24{\mathrm{E}_{\lambda}})^{1/2}\left(12\left[\big(\overline{M}\tau\big)^{1/6}-\big(\underline{M}\tau\big)^{1/6}\right]\right)^{1/2}.
\end{align*}

To conclude (63) from here, with L=122Eλ\mathrm{L}=12\sqrt{2{\mathrm{E}_{\lambda}}}, it suffices to recall that (a+b)1/6a1/6+b1/6(a+b)^{1/6}\leq a^{1/6}+b^{1/6} for arbitrary non-negative reals aa and bb. ∎
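The arithmetic behind the constant $\mathrm{L}=12\sqrt{2{\mathrm{E}_{\lambda}}}$ is $\sqrt{24\cdot 12}=12\sqrt{2}$, combined with $(a+b)^{1/6}-b^{1/6}\leq a^{1/6}$ for $a=\overline{M}\tau-\underline{M}\tau$ and $b=\underline{M}\tau$. The following sketch checks this on sample values; the function names and the samples are hypothetical.

```python
import math

def holder_bound(E_lambda: float, a: float, b: float) -> float:
    """sqrt(24 E) * sqrt(12 [(a + b)^(1/6) - b^(1/6)]), with a the time increment and b the start time."""
    increment = (a + b) ** (1.0 / 6.0) - b ** (1.0 / 6.0)
    return math.sqrt(24.0 * E_lambda) * math.sqrt(12.0 * increment)

def target_bound(E_lambda: float, a: float) -> float:
    """L * a^(1/12) with L = 12 sqrt(2 E), the right-hand side of (63)."""
    return 12.0 * math.sqrt(2.0 * E_lambda) * a ** (1.0 / 12.0)

# Hypothetical sample values with a > 0 and b >= 0.
pairs = [(a, b) for a in (0.01, 0.3, 0.9) for b in (0.0, 0.05, 0.1)]
ok = all(holder_bound(1.7, a, b) <= target_bound(1.7, a) * (1.0 + 1e-12)
         for (a, b) in pairs)
```

For $b=0$ the two expressions coincide, so the constant cannot be lowered by this argument alone.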

Lemma 3.19.

For all M¯M¯\overline{M}\geq\underline{M} with M¯τ1\underline{M}\tau\geq 1,

W2(μτM¯,μτM¯)2Eλ(M¯τM¯τ)1/2.\displaystyle\textnormal{W}_{2}(\mu_{\tau}^{\overline{M}},\mu_{\tau}^{\underline{M}})\leq 2\sqrt{{\mathrm{E}_{\lambda}}}\big{(}\overline{M}\tau-\underline{M}\tau\big{)}^{1/2}. (64)
Proof.

From the basic estimate (31), we obtain that

\begin{align*}
\textnormal{W}_{2}(u_{\tau}^{\overline{M}},u_{\tau}^{\underline{M}})
&\leq\sum_{n=\underline{M}}^{\overline{M}-1}\textnormal{W}_{2}(u_{\tau}^{n+1},u_{\tau}^{n})\\
&\leq\left(\tau\sum_{n=\underline{M}}^{\overline{M}-1}\left(\frac{\textnormal{W}_{2}(u_{\tau}^{n+1},u_{\tau}^{n})}{\tau}\right)^{2}\right)^{1/2}\left(\tau\sum_{n=\underline{M}}^{\overline{M}-1}1\right)^{1/2}\\
&\leq\big(2\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{\underline{M}})\big)^{1/2}\big(\overline{M}\tau-\underline{M}\tau\big)^{1/2}.
\end{align*}

To conclude (64), observe that $\overline{\mathcal{E}}_{\lambda}(\mu_{\tau}^{\underline{M}})\leq 2{\mathrm{E}_{\lambda}}$, thanks to Proposition 3.13 and since $\underline{M}\tau\geq 1$. ∎

Recall that a modulus of continuity is a map ω:0×00\omega:{\mathbb{R}}_{\geq 0}\times{\mathbb{R}}_{\geq 0}\to{\mathbb{R}}_{\geq 0} with the property that lim(s,t)(r,r)ω(s,t)=0\lim_{(s,t)\to(r,r)}\omega(s,t)=0 for arbitrary r0r\in{\mathbb{R}}_{\geq 0}.

Proposition 3.20.

With the constants AA from Lemma 3.17, L\mathrm{L} from Lemma 3.18, and Eλ{\mathrm{E}_{\lambda}} from Proposition 3.13, one has, for all τ(0,1)\tau\in(0,1),

W2(μ¯τ(t),μ¯τ(s))(A+L+2Eλ)(τ1/12+|ts|1/12+|ts|1/2)for all s,t0.\displaystyle\textnormal{W}_{2}(\bar{\mu}_{\tau}(t),\bar{\mu}_{\tau}(s))\leq\left(A+\mathrm{L}+2\sqrt{{\mathrm{E}_{\lambda}}}\right)\big{(}\tau^{1/12}+|t-s|^{1/12}+|t-s|^{1/2}\big{)}\quad\text{for all $s,t\geq 0$}. (65)
Proof.

Without loss of generality, assume that t>st>s.

Given τ\tau, let M¯\underline{M} and M¯\overline{M} be the smallest integers with M¯τs\underline{M}\tau\geq s and M¯τt\overline{M}\tau\geq t, respectively. By definition of μ¯τ\bar{\mu}_{\tau}, we have that

W2(μ¯τ(t),μ¯τ(s))=W2(μτM¯,μτM¯).\displaystyle\textnormal{W}_{2}(\bar{\mu}_{\tau}(t),\bar{\mu}_{\tau}(s))=\textnormal{W}_{2}(\mu_{\tau}^{\overline{M}},\mu_{\tau}^{\underline{M}}).

Further note that $\overline{M}\tau-\underline{M}\tau\leq t-s+\tau$. Let $N_{\tau}$ be the smallest integer $N$ with $N\tau\geq 1$. If either $\underline{M}\geq N_{\tau}$ or $\overline{M}\leq N_{\tau}$, then (65) follows directly from (64) or (63), respectively, using that

(M¯τM¯τ)1/12\displaystyle\big{(}\overline{M}\tau-\underline{M}\tau\big{)}^{1/12} (ts+τ)1/12(ts)1/12+τ1/12,\displaystyle\leq(t-s+\tau)^{1/12}\leq(t-s)^{1/12}+\tau^{1/12},
(M¯τM¯τ)1/2\displaystyle\big{(}\overline{M}\tau-\underline{M}\tau\big{)}^{1/2} (ts+τ)1/2(ts)1/2+τ1/2(ts)1/2+τ1/12.\displaystyle\leq(t-s+\tau)^{1/2}\leq(t-s)^{1/2}+\tau^{1/2}\leq(t-s)^{1/2}+\tau^{1/12}.

If instead M¯<Nτ<M¯\underline{M}<N_{\tau}<\overline{M}, then we estimate further:

\[\textnormal{W}_{2}(\mu_{\tau}^{\overline{M}},\mu_{\tau}^{\underline{M}})\leq\textnormal{W}_{2}(\mu_{\tau}^{\overline{M}},\mu_{\tau}^{N_{\tau}})+\textnormal{W}_{2}(\mu_{\tau}^{N_{\tau}},\mu_{\tau}^{\underline{M}}),\]

and apply (64) and (63), respectively, to the two terms on the right-hand side, using that, trivially,

\[\overline{M}\tau-N_{\tau}\tau\leq t-s+\tau,\quad N_{\tau}\tau-\underline{M}\tau\leq t-s.\]

Finally, if $\underline{M}=0$, i.e., $s=0$, then we estimate

W2(μτM¯,μ0)W2(μτM¯,μτ1)+W2(μτ1,μ0),\displaystyle\textnormal{W}_{2}(\mu_{\tau}^{\overline{M}},\mu_{0})\leq\textnormal{W}_{2}(\mu_{\tau}^{\overline{M}},\mu_{\tau}^{1})+\textnormal{W}_{2}(\mu_{\tau}^{1},\mu_{0}),

apply the reasoning above to the first distance, and Lemma 3.17 to the second, using τ1/6τ1/12\tau^{1/6}\leq\tau^{1/12}. ∎
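The index bookkeeping in this proof can be illustrated as follows: for given $s<t$ and step size $\tau$, compute the smallest grid indices $\underline{M}$, $\overline{M}$ and check that $\overline{M}\tau-\underline{M}\tau\leq t-s+\tau$, together with the subadditivity of $r\mapsto r^{1/12}$. Using exact rational arithmetic via `Fraction`, and the names `grid_indices` and `index_gap`, are our own choices for this sketch.

```python
import math
from fractions import Fraction

def grid_indices(s: Fraction, t: Fraction, tau: Fraction):
    """Smallest integers M_lo, M_hi with M_lo * tau >= s and M_hi * tau >= t."""
    return math.ceil(s / tau), math.ceil(t / tau)

def index_gap(s: Fraction, t: Fraction, tau: Fraction) -> Fraction:
    """The time span M_hi * tau - M_lo * tau covered by the two grid indices."""
    m_lo, m_hi = grid_indices(s, t, tau)
    return (m_hi - m_lo) * tau

def subadditive_twelfth_root(a: float, b: float) -> bool:
    """Check (a + b)^(1/12) <= a^(1/12) + b^(1/12) for non-negative a, b."""
    p = 1.0 / 12.0
    return (a + b) ** p <= a ** p + b ** p + 1e-15

# Hypothetical sample times and step size.
s, t, tau = Fraction(1, 3), Fraction(5, 4), Fraction(1, 10)
gap = index_gap(s, t, tau)
```

With these sample values the indices are $\underline{M}=4$ and $\overline{M}=13$, so the gap is $9/10$, which is below $t-s+\tau=61/60$.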

3.7. Passage to the continuous equation

The collected estimates will be enough to pass to the limit τ0\tau\downarrow 0 in the time-discrete evolution equation (39). We choose for any τ>0\tau>0 an interpolated time discrete solution μ¯τ:0𝒫2(d)\bar{\mu}_{\tau}:{\mathbb{R}}_{\geq 0}\to\mathcal{P}_{2}(\mathbb{R}^{d}) starting from μτ0=u0d\mu_{\tau}^{0}=u_{0}\mathcal{L}^{d}, with respective Lebesgue densities u¯τ:0L1(d)\bar{u}_{\tau}:{\mathbb{R}}_{\geq 0}\to L^{1}(\mathbb{R}^{d}).

Lemma 3.21.

There is a sequence τk0\tau_{k}\downarrow 0 and a Hölder continuous limit curve ud:0(𝒫2(d),𝐖2)u\mathcal{L}^{d}:{\mathbb{R}}_{\geq 0}\to(\mathcal{P}_{2}(\mathbb{R}^{d}),{\mathbf{W}}_{2}) with u(0,)=u0u(0,\cdot)=u_{0} and uLloc2(>0;W2,2(d))u\in L_{loc}^{2}(\mathbb{R}_{>0};W^{2,2}(\mathbb{R}^{d})), such that

  1. i)

    u¯τk(t,)du(t,)d\bar{u}_{\tau_{k}}(t,\cdot)\mathcal{L}^{d}\rightarrow u(t,\cdot)\mathcal{L}^{d} narrowly at each t[0,T]t\in[0,T],

  2. ii)

    u¯τku\sqrt{\bar{u}_{\tau_{k}}}\rightarrow\sqrt{u} strongly in Lloc2(>0;W2,2(d))L_{loc}^{2}(\mathbb{R}_{>0};W^{2,2}(\mathbb{R}^{d})),

  3. iii)

    u¯τk4u4\sqrt[4]{\bar{u}_{\tau_{k}}}\rightarrow\sqrt[4]{u} strongly in Lloc4(>0;W1,4(d))L_{loc}^{4}(\mathbb{R}_{>0};W^{1,4}(\mathbb{R}^{d})),

  4. iv)

    u¯τk6u6\sqrt[6]{\bar{u}_{\tau_{k}}}\rightarrow\sqrt[6]{u} weakly in Lloc6(>0;W1,6(d))L_{loc}^{6}(\mathbb{R}_{>0};W^{1,6}(\mathbb{R}^{d})).

Proof.

In the following, let some time horizon $T>0$ be fixed. Thanks to the moment estimate (51) and the $\tau$-uniform Hölder regularity (65), the curves $\bar{\mu}_{\tau}$ satisfy the hypotheses of the generalized Arzelà-Ascoli theorem [1, Theorem 3.3.1], see Lemma D.1.

Hence, for a suitable vanishing sequence τk\tau_{k}, the μ¯τk\bar{\mu}_{\tau_{k}} converge — narrowly at each t0t\geq 0 — to a limit curve μ:0𝒫2(d)\mu:{\mathbb{R}}_{\geq 0}\to\mathcal{P}_{2}(\mathbb{R}^{d}). And that limit inherits the Hölder continuity (65), i.e.,

𝐖2(μ(t),μ(s))C(|ts|1/12+|ts|1/2)for all s,t0.\displaystyle{\mathbf{W}}_{2}(\mu(t),\mu(s))\leq C\big{(}|t-s|^{1/12}+|t-s|^{1/2}\big{)}\quad\text{for all $s,t\geq 0$}. (66)

By the lower semi-continuity of ¯λ\overline{\mathcal{E}}_{\lambda}, see Proposition 3.4, and the universal bound (49), one has

¯λ(μ(t))lim infk¯λ(μ¯τk(t))Eλ(1+t2/3)\displaystyle\overline{\mathcal{E}}_{\lambda}(\mu(t))\leq\liminf_{k\to\infty}\overline{\mathcal{E}}_{\lambda}(\bar{\mu}_{\tau_{k}}(t))\leq{\mathrm{E}_{\lambda}}\big{(}1+t^{-2/3}\big{)}

at each t>0t>0. Hence the limit measures are absolutely continuous, μ(t)=u(t)d\mu(t)=u(t)\mathcal{L}^{d}, with u(t)W2,2(d)\sqrt{u(t)}\in W^{2,2}(\mathbb{R}^{d}). Additionally, in view of the estimate (29), we have that

u¯τ(t)W2,22CEλ(1+t2/3),\displaystyle\|\sqrt{\bar{u}_{\tau}(t)}\|_{W^{2,2}}^{2}\leq C{\mathrm{E}_{\lambda}}\big{(}1+t^{-2/3}\big{)}, (67)

and consequently,

u¯τk(t)u(t)in W1,2(d)\displaystyle\sqrt{\bar{u}_{\tau_{k}}(t)}\to\sqrt{u(t)}\quad\text{in $W^{1,2}(\mathbb{R}^{d})$} (68)

at each t>0t>0. Indeed, by Rellich’s theorem, a subsubsequence of an arbitrary subsequence of u¯τk(t)\sqrt{\bar{u}_{\tau_{k}}(t)} converges strongly to some limit vv in W1,2(d)W^{1,2}(\mathbb{R}^{d}); but then the implied pointwise a.e. convergence leads to v2=u(t)v^{2}=u(t). By independence of the limit from the chosen subsequence, we conclude convergence of the entire sequence. Next, since (67) provides a uniform bound on u¯τ\sqrt{\bar{u}_{\tau}} in L2(0,T;W1,2(d))L^{2}(0,T;W^{1,2}(\mathbb{R}^{d})), the dominated convergence theorem applies and yields

\[ \sqrt{\bar{u}_{\tau_{k}}}\to\sqrt{u}\quad\text{in $L^{2}(0,T;W^{1,2}(\mathbb{R}^{d}))$}. \]

We combine this convergence with the uniform bound on u¯τ\sqrt{\bar{u}_{\tau}} in L2(0,T;W3,2(d))L^{2}(0,T;W^{3,2}(\mathbb{R}^{d})) from (58) to obtain claim ii) above via interpolation. With ii) and the uniform bound on u¯τ6\sqrt[6]{\bar{u}_{\tau}} in L6(0,T;W1,6(d))L^{6}(0,T;W^{1,6}(\mathbb{R}^{d})) from (58) at hand, we verify claim iii) via Theorem B.1 from the Appendix. The last claim iv) is another direct consequence of the bound (58) on u¯τ6\sqrt[6]{\bar{u}_{\tau}}. That uLloc2(>0;W2,2(d))u\in L_{loc}^{2}(\mathbb{R}_{>0};W^{2,2}(\mathbb{R}^{d})) follows from the arguments given in Lemma 3.11.

Lemma 3.22.

The limit uu defined in Lemma 3.21 above is a solution to (1) in the sense of (8).

Proof.

Let ψCc(>0×d)\psi\in C^{\infty}_{c}({\mathbb{R}}_{>0}\times{\mathbb{R}}^{d}) be a test function in time and space. Fix some τk\tau_{k}; without loss of generality, we assume that τk\tau_{k} is so small that ψ(t,x)=0\psi(t,x)=0 for all 0<t<τk0<t<\tau_{k} and xdx\in{\mathbb{R}}^{d}. For each n=1,2,n=1,2,\ldots, use φτkn:=ψ(nτk;)\varphi_{\tau_{k}}^{n}:=\psi(n\tau_{k};\cdot) as test function in (39), then sum over all nn\in{\mathbb{N}}; this is actually a finite sum since ψ\psi is compactly supported. With the help of the triangle inequality,

\begin{align*}
&\left|-\tau_{k}\sum_{n=1}^{\infty}\int_{\mathbb{R}^{d}}\frac{\varphi_{\tau_{k}}^{n+1}-\varphi_{\tau_{k}}^{n}}{\tau_{k}}u_{\tau_{k}}^{n}\,\mathrm{d}x+\tau_{k}\sum_{n=1}^{\infty}\left(\mathcal{N}[u_{\tau_{k}}^{n},\varphi_{\tau_{k}}^{n}]+2\lambda^{3}\int_{\mathbb{R}^{d}}x\cdot\nabla\varphi_{\tau_{k}}^{n}u_{\tau_{k}}^{n}\,\mathrm{d}x\right)\right|\\
&\qquad\leq\frac{\alpha\tau_{k}}{2}\sum_{n=1}^{\infty}\left(\frac{\textnormal{W}_{2}(u_{\tau_{k}}^{n},u_{\tau_{k}}^{n-1})}{\tau_{k}}\right)^{2}.
\end{align*}

The right-hand side converges to zero for kk\to\infty thanks to the estimate (31). This implies, after rewriting everything in terms of the interpolated functions,

limk0Tdδτkψu¯τkdxdt=limk0T(𝒩[u¯τk,ψ¯τk]+2λ3dxψ¯τku¯τkdx)dt,\displaystyle\lim_{k\to\infty}\int_{0}^{T}\int_{\mathbb{R}^{d}}\delta_{\tau_{k}}\psi\,\bar{u}_{\tau_{k}}\,\mathrm{d}x\,\mathrm{d}t=\lim_{k\to\infty}\int_{0}^{T}\left(\mathcal{N}[\bar{u}_{\tau_{k}},\bar{\psi}_{\tau_{k}}]+2\lambda^{3}\int_{\mathbb{R}^{d}}x\cdot\nabla\bar{\psi}_{\tau_{k}}\bar{u}_{\tau_{k}}\,\mathrm{d}x\right)\,\mathrm{d}t,

where T>0T>0 is chosen large enough so that suppψ(0,T)×d\operatorname{supp}\psi\subset(0,T)\times{\mathbb{R}}^{d}, and we have introduced

\[ \bar{\psi}_{\tau_{k}}(t)=\psi(n\tau_{k}),\quad\delta_{\tau_{k}}\psi(t)=\frac{\psi((n+1)\tau_{k})-\psi(n\tau_{k})}{\tau_{k}}\quad\text{for all $t\in((n-1)\tau_{k},n\tau_{k}]$}. \]

Notice that

ψ¯τkψ,δτkψtψuniformly on >0×d.\displaystyle\bar{\psi}_{\tau_{k}}\to\psi,\quad\delta_{\tau_{k}}\psi\to\partial_{t}\psi\quad\text{uniformly on ${\mathbb{R}}_{>0}\times{\mathbb{R}}^{d}$}.

It is now easily checked that the convergences stated in Lemma 3.21 above are sufficient to pass to the respective limits inside the integrals, that is,

0Tdtψudxdt=0T(𝒩[u,ψ]+2λ3dxψudx)dt.\displaystyle\int_{0}^{T}\int_{\mathbb{R}^{d}}\partial_{t}\psi\,u\,\mathrm{d}x\,\mathrm{d}t=\int_{0}^{T}\left(\mathcal{N}[u,\psi]+2\lambda^{3}\int_{\mathbb{R}^{d}}x\cdot\nabla\psi\,u\,\mathrm{d}x\right)\,\mathrm{d}t.

This is equivalent to the weak formulation (8). ∎
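The first sum in the estimate above comes from a discrete summation by parts in time. The following minimal sketch checks that identity on scalar sequences. It assumes that the time-discrete equation (39) contains the increment $u^{n}-u^{n-1}$ tested against $\varphi^{n}$; the lists `phi` and `u` are hypothetical stand-ins for $\varphi_{\tau_k}^{n}$ and $\int\varphi\,u_{\tau_k}^{n}\,\mathrm{d}x$, and `phi` vanishes at the first and last index to mimic the compact support of $\psi$ in $(0,T)$.

```python
# Stand-in sequences (assumed data, not taken from the paper).
# phi[1] = 0 and phi[-1] = 0 mimic a test function with compact support in time.
phi = [0.0, 0.0, 0.2, 0.9, 0.4, 0.0]
u = [1.0, 0.8, 0.7, 0.65, 0.6, 0.58]

N = len(u) - 1

# Left-hand side: sum over n >= 1 of phi^n * (u^n - u^{n-1}).
tested_increment = sum(phi[n] * (u[n] - u[n - 1]) for n in range(1, N + 1))

# Right-hand side: minus the sum over n >= 1 of (phi^{n+1} - phi^n) * u^n.
# Boundary terms drop out because phi[1] = phi[N] = 0.
shifted_sum = -sum((phi[n + 1] - phi[n]) * u[n] for n in range(1, N))

print(tested_increment, shifted_sum)
```

The two printed numbers agree up to rounding, which is the discrete counterpart of moving the time derivative onto the test function.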

This finishes the proof of Theorem 1.1.

4. Long Time Behaviour

4.1. An illustration by ODEs

We illustrate the role played by (14) by an analogous situation for smooth gradient flows on n{\mathbb{R}}^{n}. We are given a family of strictly convex functions hλ:nh_{\lambda}:{\mathbb{R}}^{n}\to{\mathbb{R}} and fix a parameter λ>0\lambda>0. From here, we define derived functions fλ,eλ:nf_{\lambda},e_{\lambda}:{\mathbb{R}}^{n}\to{\mathbb{R}} by

fλ:=12|Dhλ|2,eλ:=12DfλDhλ+λfλ=12DhλD2hλDhλ+λ2|Dhλ|2.\displaystyle f_{\lambda}:=\frac{1}{2}|\mathrm{D}h_{\lambda}|^{2},\quad e_{\lambda}:=\frac{1}{2}\mathrm{D}f_{\lambda}\cdot\mathrm{D}h_{\lambda}+\lambda f_{\lambda}=\frac{1}{2}\mathrm{D}h_{\lambda}\cdot\mathrm{D}^{2}h_{\lambda}\cdot\mathrm{D}h_{\lambda}+\frac{\lambda}{2}|\mathrm{D}h_{\lambda}|^{2}. (69)

Thanks to strict convexity of hλh_{\lambda}, there exists precisely one minimum point xλx_{\lambda} of hλh_{\lambda}, and this is by construction also the unique minimum point of fλf_{\lambda} and of eλe_{\lambda}. We wish to study the linearized dynamics of the gradient flow x˙=Deλ(x)\dot{x}=-\mathrm{D}e_{\lambda}(x) near the stationary point xλx_{\lambda}. Since

Deλ=12DhλD3hλDhλ+D2hλD2hλDhλ+λD2hλDhλ,\displaystyle\mathrm{D}e_{\lambda}=\frac{1}{2}\mathrm{D}h_{\lambda}\cdot\mathrm{D}^{3}h_{\lambda}\cdot\mathrm{D}h_{\lambda}+\mathrm{D}^{2}h_{\lambda}\cdot\mathrm{D}^{2}h_{\lambda}\cdot\mathrm{D}h_{\lambda}+\lambda\mathrm{D}^{2}h_{\lambda}\cdot\mathrm{D}h_{\lambda},

and since Dhλ(xλ)=0\mathrm{D}h_{\lambda}(x_{\lambda})=0, it follows that the linearization Aλn×nA_{\lambda}\in{\mathbb{R}}^{n\times n} of eλe_{\lambda}’s gradient vector field Deλ(x)\mathrm{D}e_{\lambda}(x) near x=xλx=x_{\lambda} is given by

Aλξ=[D2hλ(xλ)]3ξ+λ[D2hλ(xλ)]2ξ.\displaystyle A_{\lambda}\xi=\big{[}\mathrm{D}^{2}h_{\lambda}(x_{\lambda})\big{]}^{3}\xi+\lambda\big{[}\mathrm{D}^{2}h_{\lambda}(x_{\lambda})\big{]}^{2}\xi.

Assuming that the eigenvalues $\mu_{1},\ldots,\mu_{n}$ of $\mathrm{D}^{2}h_{\lambda}(x_{\lambda})$ are known, the eigenvalues of $A_{\lambda}$ are known as well: these are precisely $\mu_{k}^{3}+\lambda\mu_{k}^{2}$ for $k=1,2,\ldots,n$. In particular, the smallest of these is a lower bound on the exponential rate of convergence to $x_{\lambda}$ in the linearized dynamics.
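The toy computation can be checked numerically. The sketch below assumes a quadratic $h(x)=\frac12 x^{T}Hx$ on ${\mathbb{R}}^{2}$ with a hypothetical positive definite matrix `H`, so that $x_{\lambda}=0$ and $\mathrm{D}^{2}h\equiv H$. It compares a finite-difference gradient of $e_{\lambda}$ with $(H^{3}+\lambda H^{2})x$ and the eigenvalues of $A_{\lambda}$ with $\mu_{k}^{3}+\lambda\mu_{k}^{2}$.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(a, x):
    """Product of a 2x2 matrix with a vector of length 2."""
    return [a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1]]

def sym_eigs(m):
    """Eigenvalues of a symmetric 2x2 matrix, in ascending order."""
    mean = 0.5 * (m[0][0] + m[1][1])
    rad = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    return [mean - rad, mean + rad]

H = [[2.0, 0.5], [0.5, 1.0]]  # assumed Hessian of h, positive definite
lam = 0.7                     # assumed value of the parameter lambda

def e(x):
    """e = (1/2) Dh . D^2h . Dh + (lam/2) |Dh|^2 with Dh(x) = H x."""
    g = matvec(H, x)
    Hg = matvec(H, g)
    return 0.5 * (g[0] * Hg[0] + g[1] * Hg[1]) + 0.5 * lam * (g[0] ** 2 + g[1] ** 2)

def grad_e_fd(x, eps=1e-5):
    """Central finite-difference gradient of e."""
    out = []
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        out.append((e(xp) - e(xm)) / (2 * eps))
    return out

H2 = matmul(H, H)
H3 = matmul(H2, H)
A = [[H3[i][j] + lam * H2[i][j] for j in range(2)] for i in range(2)]

x = [0.3, -0.8]
grad_fd = grad_e_fd(x)
grad_lin = matvec(A, x)                       # A x, exact here since h is quadratic
eig_A = sym_eigs(A)
eig_pred = [m ** 3 + lam * m ** 2 for m in sym_eigs(H)]
print(grad_fd, grad_lin)
print(eig_A, eig_pred)
```

For a quadratic $h$ the third-derivative term in $\mathrm{D}e_{\lambda}$ vanishes identically, so the gradient is linear and the comparison is exact up to rounding; for a general $h_{\lambda}$ the agreement would only hold to first order near $x_{\lambda}$.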

4.2. Derivation of the relation (14)

We shall now derive the relation (14), which is the basis for all further analysis below, and which plays the same role for (1) as (69) has played in the analysis of the toy problem above.

Lemma 4.1.

Let wrw_{r} be a solution to the linear Fokker-Planck equation (12). Then, at each r>0r>0,

\begin{align}
\frac{1}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{H}_{\lambda}(w_{r}) &=-\frac{1}{2}\int_{\mathbb{R}^{d}}w_{r}\left|\nabla\log\frac{w_{r}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x, \tag{70}\\
\frac{1}{4}\frac{\mathrm{d}^{2}}{\,\mathrm{d}r^{2}}\mathcal{H}_{\lambda}(w_{r}) &=-\frac{1}{4}\int_{\mathbb{R}^{d}}\mathrm{div}\left(w_{r}\nabla\log\frac{w_{r}}{U_{\lambda}}\right)\left|\nabla\log\frac{w_{r}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x \notag\\
&\qquad+\frac{1}{2}\int_{\mathbb{R}^{d}}\frac{1}{w_{r}}\left[\mathrm{div}\left(w_{r}\nabla\log\frac{w_{r}}{U_{\lambda}}\right)\right]^{2}\,\mathrm{d}x. \tag{71}
\end{align}
Proof.

Thanks to the regularizing properties of the linear Fokker-Planck equation, (r,x)wr(x)(r,x)\mapsto w_{r}(x) is an everywhere positive CC^{\infty}-function on >0×d{\mathbb{R}}_{>0}\times{\mathbb{R}}^{d}, where it satisfies (12) in the classical sense, or equivalently

rwr=div(wrlogwrUλ).\displaystyle\partial_{r}w_{r}=\mathrm{div}\left(w_{r}\nabla\log\frac{w_{r}}{U_{\lambda}}\right). (72)

Moreover, wrw_{r} is a probability density at any r>0r>0, and wr(x)w_{r}(x) decays sufficiently rapidly for |x||x|\to\infty to justify the integration by parts below.

To differentiate $\mathcal{H}_{\lambda}$, we rewrite it in the form

λ(w)=dwUλlogwUλUλdx.\displaystyle\mathcal{H}_{\lambda}(w)=\int_{\mathbb{R}^{d}}\frac{w}{U_{\lambda}}\log\frac{w}{U_{\lambda}}\,U_{\lambda}\,\mathrm{d}x. (73)

For its $r$-derivative, we obtain after substitution of (72) and an integration by parts:

12ddrλ(wr)=12d(1+logwrUλ)rwrdx=12dwr|logwrUλ|2dx,\displaystyle\frac{1}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{H}_{\lambda}(w_{r})=\frac{1}{2}\int_{\mathbb{R}^{d}}\left(1+\log\frac{w_{r}}{U_{\lambda}}\right)\,\partial_{r}w_{r}\,\mathrm{d}x=-\frac{1}{2}\int_{\mathbb{R}^{d}}w_{r}\left|\nabla\log\frac{w_{r}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x,

which is (70).

For computation of the second rr-derivative of λ\mathcal{H}_{\lambda}, we differentiate in (70), substitute (72) again, and integrate by parts to obtain

14d2dr2λ(wr)\displaystyle\frac{1}{4}\frac{\mathrm{d}^{2}}{\,\mathrm{d}r^{2}}\mathcal{H}_{\lambda}(w_{r}) =14ddrdwr|logwrUλ|2dx\displaystyle=-\frac{1}{4}\frac{\mathrm{d}}{\,\mathrm{d}r}\int_{\mathbb{R}^{d}}w_{r}\left|\nabla\log\frac{w_{r}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x
=14drwr|logwrUλ|2dx12dwr[logwrUλ][rwrwr]dx\displaystyle=-\frac{1}{4}\int_{\mathbb{R}^{d}}\partial_{r}w_{r}\,\left|\nabla\log\frac{w_{r}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x-\frac{1}{2}\int_{\mathbb{R}^{d}}w_{r}\,\nabla\left[\log\frac{w_{r}}{U_{\lambda}}\right]\cdot\nabla\left[\frac{\partial_{r}w_{r}}{w_{r}}\right]\,\mathrm{d}x
=14ddiv(wrlogwrUλ)|logwrUλ|2dx+12d1wr[div(wrlogwrUλ)]2dx,\displaystyle=-\frac{1}{4}\int_{\mathbb{R}^{d}}\mathrm{div}\left(w_{r}\nabla\log\frac{w_{r}}{U_{\lambda}}\right)\left|\nabla\log\frac{w_{r}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x+\frac{1}{2}\int_{\mathbb{R}^{d}}\frac{1}{w_{r}}\,\left[\mathrm{div}\left(w_{r}\nabla\log\frac{w_{r}}{U_{\lambda}}\right)\right]^{2}\,\mathrm{d}x,

and this is (71). ∎
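The first-derivative identity computed in the proof, $\frac{\mathrm{d}}{\mathrm{d}r}\mathcal{H}_{\lambda}(w_{r})=-\int w_{r}|\nabla\log\frac{w_{r}}{U_{\lambda}}|^{2}\,\mathrm{d}x$, can be sanity-checked on explicit solutions. The sketch below makes several assumptions that are not spelled out in this section: $d=1$, $U_{\lambda}$ is the centered Gaussian with variance $1/\lambda$, and (72) reads $\partial_{r}w=w''+\lambda(xw)'$. Under these assumptions a centered Gaussian stays Gaussian and its variance $v(r)$ solves $v'=2-2\lambda v$.

```python
import math

lam = 1.3  # assumed value of lambda

def variance(r, v0):
    """Solution of v' = 2 - 2*lam*v with v(0) = v0."""
    return 1.0 / lam + (v0 - 1.0 / lam) * math.exp(-2.0 * lam * r)

def rel_entropy(v):
    """Relative entropy of N(0, v) with respect to N(0, 1/lam)."""
    return 0.5 * (lam * v - 1.0 - math.log(lam * v))

def dissipation(v):
    """Integral of w * |d/dx log(w/U)|^2 for w = N(0, v): (lam - 1/v)^2 * v."""
    return (lam - 1.0 / v) ** 2 * v

def dH_dr(r, v0, eps=1e-5):
    """Central finite difference of r -> H(w_r)."""
    hp = rel_entropy(variance(r + eps, v0))
    hm = rel_entropy(variance(r - eps, v0))
    return (hp - hm) / (2 * eps)

r, v0 = 0.4, 2.5
lhs = dH_dr(r, v0)
rhs = -dissipation(variance(r, v0))
print(lhs, rhs)
```

Both numbers coincide up to the finite-difference error, and both are negative, in line with the decay of the relative entropy along the Fokker-Planck flow.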

Lemma 4.2.

Let wrw_{r} be a solution to the linear Fokker-Planck equation (12). Then

\begin{align}
-\frac{1}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{H}_{\lambda}(w_{r}) &=\mathcal{F}_{\lambda}(w_{r})-\mathcal{F}_{\lambda}(U_{\lambda}), \tag{74}\\
-\frac{1}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{F}_{\lambda}(w_{r}) &=\mathcal{E}_{\lambda}(w_{r})-\mathcal{E}_{\lambda}(U_{\lambda})-\lambda\big[\mathcal{F}_{\lambda}(w_{r})-\mathcal{F}_{\lambda}(U_{\lambda})\big], \tag{75}
\end{align}

for all r>0r>0.

Proof.

As in the previous proof, we rely on the regularity of the Fokker-Planck flow in the calculations below. Still, to enhance readability, we shall use simply uu instead of wrw_{r} below.

To establish the connection of the right hand side in (70) to λ\mathcal{F}_{\lambda}, we substitute

loguUλ=logu+λx,\displaystyle\nabla\log\frac{u}{U_{\lambda}}=\nabla\log u+\lambda x, (76)

and then integrate by parts in the term with linear dependence on xx:

12du|[logu+λx]|2dx\displaystyle\frac{1}{2}\int_{\mathbb{R}^{d}}u|[\nabla\log u+\lambda x]|^{2}\,\mathrm{d}x =12du|logu|2dx+λ22d|x|2udx+λdxudx\displaystyle=\frac{1}{2}\int_{\mathbb{R}^{d}}u|\nabla\log u|^{2}\,\mathrm{d}x+\frac{\lambda^{2}}{2}\int_{\mathbb{R}^{d}}|x|^{2}u\,\mathrm{d}x+\lambda\int_{\mathbb{R}^{d}}x\cdot\nabla u\,\mathrm{d}x
=λ(u)dλdudx.\displaystyle=\mathcal{F}_{\lambda}(u)-d\lambda\int_{\mathbb{R}^{d}}u\,\mathrm{d}x.

Since uu is a probability density, the constant above amounts to dλd\lambda. To conclude (74) from here, it remains to observe that

λ(Uλ)=12dUλ|logUλ|2dx+λ22d|x|2Uλdx=λ2d|x|2Uλdx=dλ.\displaystyle\mathcal{F}_{\lambda}(U_{\lambda})=\frac{1}{2}\int_{\mathbb{R}^{d}}U_{\lambda}|\nabla\log U_{\lambda}|^{2}\,\mathrm{d}x+\frac{\lambda^{2}}{2}\int_{\mathbb{R}^{d}}|x|^{2}U_{\lambda}\,\mathrm{d}x=\lambda^{2}\int_{\mathbb{R}^{d}}|x|^{2}U_{\lambda}\,\mathrm{d}x=d\lambda.

Next we need to show that the right-hand sides in (71) and (75), respectively, are the same. Substitution of (76) into (71) yields, after an integration by parts,

12ddrλ(u)\displaystyle-\frac{1}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{F}_{\lambda}(u) =14d|[logu+λx]|2div(u[logu+λx])dx\displaystyle=-\frac{1}{4}\int_{\mathbb{R}^{d}}\big{|}[\nabla\log u+\lambda x]\big{|}^{2}\mathrm{div}(u[\nabla\log u+\lambda x])\,\mathrm{d}x
12du[logu+λx](div(u[logu+λx])u)dx\displaystyle\qquad-\frac{1}{2}\int_{\mathbb{R}^{d}}u[\nabla\log u+\lambda x]\cdot\nabla\left(\frac{\mathrm{div}(u[\nabla\log u+\lambda x])}{u}\right)\,\mathrm{d}x
=14d|[logu+λx]|2div(u[logu+λx])dx\displaystyle=-\frac{1}{4}\int_{\mathbb{R}^{d}}\big{|}[\nabla\log u+\lambda x]\big{|}^{2}\mathrm{div}(u[\nabla\log u+\lambda x])\,\mathrm{d}x
+12d[logu+λx]logudiv(u[logu+λx])dx\displaystyle\qquad+\frac{1}{2}\int_{\mathbb{R}^{d}}[\nabla\log u+\lambda x]\cdot\nabla\log u\,\mathrm{div}(u[\nabla\log u+\lambda x])\,\mathrm{d}x
12d[logu+λx]div(u[logu+λx])dx\displaystyle\qquad-\frac{1}{2}\int_{\mathbb{R}^{d}}[\nabla\log u+\lambda x]\cdot\nabla\mathrm{div}(u[\nabla\log u+\lambda x])\,\mathrm{d}x
=14d(|logu|2λ2|x|2)div(u[logu+λx])dx\displaystyle=\frac{1}{4}\int_{\mathbb{R}^{d}}(|\nabla\log u|^{2}-\lambda^{2}|x|^{2})\,\mathrm{div}(u[\nabla\log u+\lambda x])\,\mathrm{d}x
12d[logu+λx]div{(u[logu+λx])}dx.\displaystyle\qquad-\frac{1}{2}\int_{\mathbb{R}^{d}}[\nabla\log u+\lambda x]\cdot\mathrm{div}\big{\{}\nabla\otimes(u[\nabla\log u+\lambda x])\big{\}}\,\mathrm{d}x.

Now we integrate by parts to remove the divergence in both integrals:

12ddrλ(u)\displaystyle-\frac{1}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{F}_{\lambda}(u) =12dulogu2logu[logu+λx]dx+λ22dxudx+λ32du|x|2dx\displaystyle=-\frac{1}{2}\int_{\mathbb{R}^{d}}u\nabla\log u\cdot\nabla^{2}\log u\cdot[\nabla\log u+\lambda x]\,\mathrm{d}x+\frac{\lambda^{2}}{2}\int_{\mathbb{R}^{d}}x\cdot\nabla u\,\mathrm{d}x+\frac{\lambda^{3}}{2}\int_{\mathbb{R}^{d}}u|x|^{2}\,\mathrm{d}x
+12d(2logu+λ𝕀):(u[logu+λx])dx+12du2logu+λ𝕀2dx\displaystyle\qquad+\frac{1}{2}\int_{\mathbb{R}^{d}}(\nabla^{2}\log u+\lambda\mathbb{I}):\big{(}\nabla u\otimes[\nabla\log u+\lambda x]\big{)}\,\mathrm{d}x+\frac{1}{2}\int_{\mathbb{R}^{d}}u\|\nabla^{2}\log u+\lambda\mathbb{I}\|^{2}\,\mathrm{d}x
=dλ22dudx+λ32d|x|2udx+λ2du|logu|2dx+λ22dxudx\displaystyle=-\frac{d\lambda^{2}}{2}\int_{\mathbb{R}^{d}}u\,\mathrm{d}x+\frac{\lambda^{3}}{2}\int_{\mathbb{R}^{d}}|x|^{2}u\,\mathrm{d}x+\frac{\lambda}{2}\int_{\mathbb{R}^{d}}u|\nabla\log u|^{2}\,\mathrm{d}x+\frac{\lambda^{2}}{2}\int_{\mathbb{R}^{d}}x\cdot\nabla u\,\mathrm{d}x
+12du2logu2dx+λduΔlogudx+dλ22dudx\displaystyle\qquad+\frac{1}{2}\int_{\mathbb{R}^{d}}u\|\nabla^{2}\log u\|^{2}\,\mathrm{d}x+\lambda\int_{\mathbb{R}^{d}}u\Delta\log u\,\mathrm{d}x+\frac{d\lambda^{2}}{2}\int_{\mathbb{R}^{d}}u\,\mathrm{d}x
=12du2logu2dx+λ32d|x|2udxλ2du|logu|2dxdλ22\displaystyle=\frac{1}{2}\int_{\mathbb{R}^{d}}u\|\nabla^{2}\log u\|^{2}\,\mathrm{d}x+\frac{\lambda^{3}}{2}\int_{\mathbb{R}^{d}}|x|^{2}u\,\mathrm{d}x-\frac{\lambda}{2}\int_{\mathbb{R}^{d}}u|\nabla\log u|^{2}\,\mathrm{d}x-\frac{d\lambda^{2}}{2}
=12du2logu2dx+λ3d|x|2udxλ[λ(u)λ(Uλ)]3dλ22.\displaystyle=\frac{1}{2}\int_{\mathbb{R}^{d}}u\|\nabla^{2}\log u\|^{2}\,\mathrm{d}x+\lambda^{3}\int_{\mathbb{R}^{d}}|x|^{2}u\,\mathrm{d}x-\lambda\big{[}\mathcal{F}_{\lambda}(u)-\mathcal{F}_{\lambda}(U_{\lambda})\big{]}-\frac{3d\lambda^{2}}{2}.

Finally, observe that

λ(Uλ)=12dUλλ𝕀2dx+λ3d|x|2Uλdx=dλ22+dλ3λ=3dλ22.\displaystyle\mathcal{E}_{\lambda}(U_{\lambda})=\frac{1}{2}\int_{\mathbb{R}^{d}}U_{\lambda}\|\lambda\mathbb{I}\|^{2}\,\mathrm{d}x+\lambda^{3}\int_{\mathbb{R}^{d}}|x|^{2}U_{\lambda}\,\mathrm{d}x=\frac{d\lambda^{2}}{2}+\frac{d\lambda^{3}}{\lambda}=\frac{3d\lambda^{2}}{2}.

This yields (75). ∎
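The two constants $\mathcal{F}_{\lambda}(U_{\lambda})=d\lambda$ and $\mathcal{E}_{\lambda}(U_{\lambda})=\frac{3d\lambda^{2}}{2}$ can be reproduced by quadrature. The sketch below is restricted to $d=1$ and assumes the forms of the functionals that appear in the displays above, namely $\mathcal{F}_{\lambda}(u)=\frac12\int u|(\log u)'|^{2}+\frac{\lambda^{2}}{2}\int x^{2}u$ and $\mathcal{E}_{\lambda}(u)=\frac12\int u|(\log u)''|^{2}+\lambda^{3}\int x^{2}u$, with $U_{\lambda}$ the centered Gaussian of variance $1/\lambda$. The helper names are illustrative only.

```python
import math

def U(x, lam):
    """Centered Gaussian density with variance 1/lam (assumed form of U_lambda)."""
    return math.sqrt(lam / (2.0 * math.pi)) * math.exp(-0.5 * lam * x * x)

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        s += f(a + i * h)
    return s * h

def fisher_at_U(lam):
    """F_lambda(U_lambda) in d = 1, using (log U)' = -lam * x."""
    R = 12.0 / math.sqrt(lam)  # truncation at twelve standard deviations
    return integrate(
        lambda x: 0.5 * U(x, lam) * (lam * x) ** 2 + 0.5 * lam ** 2 * x * x * U(x, lam),
        -R, R)

def energy_at_U(lam):
    """E_lambda(U_lambda) in d = 1, using (log U)'' = -lam."""
    R = 12.0 / math.sqrt(lam)
    return integrate(
        lambda x: 0.5 * U(x, lam) * lam ** 2 + lam ** 3 * x * x * U(x, lam),
        -R, R)

lam = 2.0
print(fisher_at_U(lam), lam)               # expected to be close to d*lam with d = 1
print(energy_at_U(lam), 1.5 * lam ** 2)    # expected to be close to 3*d*lam^2/2
```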

The combination of (74) and (75) suggests that for all sufficiently regular uu,

λ(u)=12ddrλ(u)+λλ(u)+dλ22=14d2dr2λ(u)λ2ddrλ(u)+3dλ22.\displaystyle\mathcal{E}_{\lambda}(u)=-\frac{1}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{F}_{\lambda}(u)+\lambda\mathcal{F}_{\lambda}(u)+\frac{d\lambda^{2}}{2}=\frac{1}{4}\frac{\mathrm{d}^{2}}{\,\mathrm{d}r^{2}}\mathcal{H}_{\lambda}(u)-\frac{\lambda}{2}\frac{\mathrm{d}}{\,\mathrm{d}r}\mathcal{H}_{\lambda}(u)+\frac{3d\lambda^{2}}{2}. (77)

4.3. The displacement Hessian

The geometric idea behind the linearization by means of the displacement Hessian is the representation of the dynamics on the space of probability measures in Lagrangian coordinates. For the moment, let us consider a general $L^{2}$-Wasserstein gradient flow, written in the form of a nonlinear transport equation,

\[ \partial_{t}u_{t}=-\mathrm{div}(u_{t}\mathbf{v}[u_{t}])\quad\text{with}\quad\mathbf{v}[u_{t}]=-\nabla\frac{\delta{\mathcal{U}}}{\delta u}\bigg|_{u=u_{t}}. \tag{78} \]

By a Lagrangian representation of a solution utu_{t} with respect to some reference measure UU, we mean a time-dependent diffeomorphism Xt:ddX_{t}:{\mathbb{R}}^{d}\to{\mathbb{R}}^{d} satisfying

ut=Xt#U=UdetDXtXt1.\displaystyle u_{t}=X_{t}\#U=\frac{U}{\det\mathrm{D}X_{t}}\circ X_{t}^{-1}. (79)

Note that there is a freedom of gauge here: (79) determines XtX_{t} only up to a concatenation from the right with any tt-dependent map that leaves UU invariant.

Since $u_{t}$ satisfies the transport equation (78), it is easily deduced that $X_{t}$ “follows the vector field $\mathbf{v}$”, i.e., it satisfies the Lagrangian equation

tXt𝐯[ut]Xt.\displaystyle\partial_{t}X_{t}\cong\mathbf{v}[u_{t}]\circ X_{t}. (80)

Here \cong refers to the aforementioned freedom of gauge for XtX_{t}: the left and right sides in (80) may differ by a vector field ζt\zeta_{t} that is divergence-free with respect to UU, i.e., div(Uζt)=0\mathrm{div}(U\zeta_{t})=0.

Now assume that UU is a stationary solution of (78); then X=idX=\textnormal{id} is a stationary solution of (80). The Wasserstein-linearization of (78) around UU is an appropriate linearization of (80) around id. Taking into account that 𝐯[U]=0\mathbf{v}[U]=0, one obtains for any smooth, compactly supported vector field Ξ\Xi:

\[ \frac{\mathrm{d}}{\,\mathrm{d}h}\bigg|_{h=0}\big(\mathbf{v}[(\textnormal{id}-h\Xi)\#U]\circ(\textnormal{id}-h\Xi)\big)=\nabla\left[\frac{\delta^{2}{\mathcal{U}}}{\delta u^{2}}\bigg|_{u=U}\mathrm{div}(U\Xi)\right]. \]

Consequently, the linearized Lagrangian dynamics is given by

tΞt[δ2𝒰δu2|u=Udiv(UΞ)].\displaystyle\partial_{t}\Xi_{t}\cong\nabla\left[\frac{\delta^{2}{\mathcal{U}}}{\delta u^{2}}\bigg{|}_{u=U}\mathrm{div}(U\Xi)\right]. (81)

For definition of the displacement Hessian and the Wasserstein linearization, one chooses a particular gauge in (80) — and consequently also in (81) — to remove the ambiguity. Thanks to the Brenier theorem from optimal transportation, one may assume Xt=idφtX_{t}=\textnormal{id}-\nabla\varphi_{t} with some time-dependent potential φt\varphi_{t}. Inserting this into (81) yields

tφt=δ2𝒰δu2|u=Udiv(Uφt).\displaystyle\partial_{t}\varphi_{t}=\frac{\delta^{2}{\mathcal{U}}}{\delta u^{2}}\bigg{|}_{u=U}\mathrm{div}(U\nabla\varphi_{t}). (82)

The operator on the right-hand side acting on $\varphi_{t}$ is the negative of the displacement Hessian of ${\mathcal{U}}$. More rigorously, one defines:

Definition 4.3.

Assume that UU is a global minimizer of 𝒰{\mathcal{U}}, and assume further that there exists a densely defined self-adjoint linear operator 𝐋{\mathbf{L}} on H1(d;Ud)H^{1}({\mathbb{R}}^{d};U\mathcal{L}^{d}) such that, for any test function ψCc(d)\psi\in C^{\infty}_{c}({\mathbb{R}}^{d}),

dψ(𝐋ψ)Udx=d2dσ2|σ=0𝒰(uσ),\displaystyle\int_{\mathbb{R}^{d}}\nabla\psi\cdot\nabla\big{(}{\mathbf{L}}\psi\big{)}U\,\mathrm{d}x=\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}{\mathcal{U}}(u_{\sigma}),

where $u_{\sigma}$ is the solution to the transport equation $\partial_{\sigma}u_{\sigma}+\mathrm{div}(u_{\sigma}\nabla\psi)=0$ with $u_{0}=U$. Then ${\mathbf{L}}$ is called the displacement Hessian of ${\mathcal{U}}$ at $U$, and is denoted by $\mathrm{Hess}_{U}{\mathcal{U}}$.

4.4. Calculation of the displacement Hessian

We shall now calculate the displacement Hessian for the three functionals from (2). Note that they all have UλU_{\lambda} as global minimizer.

Proposition 4.4.

Define the linear operator 𝐋{\mathbf{L}} on Cc(d)C^{\infty}_{c}({\mathbb{R}}^{d}) by

𝐋φ:=1Uλdiv(Uλφ).\displaystyle{\mathbf{L}}\varphi:=-\frac{1}{U_{\lambda}}\mathrm{div}(U_{\lambda}\nabla\varphi).

Then we have for all ψCc(d)\psi\in C^{\infty}_{c}({\mathbb{R}}^{d}):

\[ \big(\mathrm{Hess}_{U_{\lambda}}\mathcal{H}_{\lambda}\big)\psi={\mathbf{L}}\psi,\quad\big(\mathrm{Hess}_{U_{\lambda}}\mathcal{F}_{\lambda}\big)\psi={\mathbf{L}}^{2}\psi,\quad\big(\mathrm{Hess}_{U_{\lambda}}\mathcal{E}_{\lambda}\big)\psi=({\mathbf{L}}^{3}+\lambda{\mathbf{L}}^{2})\psi. \tag{83} \]
Proof.

Let some ψCc(d)\psi\in C^{\infty}_{c}({\mathbb{R}}^{d}) be fixed. For the curve uσu_{\sigma}, we choose the solution of the transport problem

σuσ=div(uσψ),u0=Uλ.\displaystyle\partial_{\sigma}u_{\sigma}=-\mathrm{div}(u_{\sigma}\nabla\psi),\quad u_{0}=U_{\lambda}. (84)

Note that the transport vector field $\nabla\psi$ is independent of time and smooth with compact support, hence $(\sigma;x)\mapsto u_{\sigma}(x)$ is an everywhere positive smooth function on ${\mathbb{R}}\times{\mathbb{R}}^{d}$.

For the relative entropy, recalling the representation (73), and that loguUλ0\log\frac{u}{U_{\lambda}}\equiv 0 for u=Uλu=U_{\lambda},

d2dσ2|σ=0λ(uσ)=ddσ|σ=0d(1+loguσUλ)σuσdx=d(σ|σ=0uσ)2Uλdx.\displaystyle\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\mathcal{H}_{\lambda}(u_{\sigma})=\frac{\mathrm{d}}{\,\mathrm{d}\sigma}\bigg{|}_{\sigma=0}\int_{\mathbb{R}^{d}}\left(1+\log\frac{u_{\sigma}}{U_{\lambda}}\right)\partial_{\sigma}u_{\sigma}\,\mathrm{d}x=\int_{\mathbb{R}^{d}}\frac{\big{(}\partial_{\sigma}\big{|}_{\sigma=0}u_{\sigma}\big{)}^{2}}{U_{\lambda}}\,\mathrm{d}x.

Substituting (84) and integrating by parts, we obtain

\[ \frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg|_{\sigma=0}\mathcal{H}_{\lambda}(u_{\sigma})=\int_{\mathbb{R}^{d}}\frac{1}{U_{\lambda}}[\mathrm{div}(U_{\lambda}\nabla\psi)]^{2}\,\mathrm{d}x=-\int_{\mathbb{R}^{d}}U_{\lambda}\nabla\psi\cdot\nabla\left(\frac{1}{U_{\lambda}}\mathrm{div}(U_{\lambda}\nabla\psi)\right)\,\mathrm{d}x. \]

This gives the first identity in (83).

For the perturbed Fisher information, we start from the representation

λ(u)=12du|loguUλ|2dx+λ(Uλ)\displaystyle\mathcal{F}_{\lambda}(u)=\frac{1}{2}\int_{\mathbb{R}^{d}}u\left|\nabla\log\frac{u}{U_{\lambda}}\right|^{2}\,\mathrm{d}x+\mathcal{F}_{\lambda}(U_{\lambda})

that follows from (70) and (74), and obtain

d2dσ2|σ=0λ(uσ)\displaystyle\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\mathcal{F}_{\lambda}(u_{\sigma}) =ddσ|σ=0d{12(σuσ)|loguσUλ|2+uσ[loguσUλ][σuσuσ]}dx\displaystyle=\frac{\mathrm{d}}{\,\mathrm{d}\sigma}\bigg{|}_{\sigma=0}\int_{\mathbb{R}^{d}}\left\{\frac{1}{2}(\partial_{\sigma}u_{\sigma})\left|\nabla\log\frac{u_{\sigma}}{U_{\lambda}}\right|^{2}+u_{\sigma}\,\nabla\left[\log\frac{u_{\sigma}}{U_{\lambda}}\right]\cdot\nabla\left[\frac{\partial_{\sigma}u_{\sigma}}{u_{\sigma}}\right]\right\}\,\mathrm{d}x
=dUλ|σ|σ=0uσUλ|2dx.\displaystyle=\int_{\mathbb{R}^{d}}U_{\lambda}\left|\nabla\frac{\partial_{\sigma}\big{|}_{\sigma=0}u_{\sigma}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x.

Hence, by (84) and two consecutive integrations by parts,

d2dσ2|σ=0λ(uσ)\displaystyle\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\mathcal{F}_{\lambda}(u_{\sigma}) =dUλ|div(Uλψ)Uλ|2dx\displaystyle=\int_{\mathbb{R}^{d}}U_{\lambda}\left|\nabla\frac{\mathrm{div}(U_{\lambda}\nabla\psi)}{U_{\lambda}}\right|^{2}\,\mathrm{d}x
=d1Uλdiv(Uλψ)div[Uλ(1Uλdiv(Uλψ))]dx\displaystyle=-\int_{\mathbb{R}^{d}}\frac{1}{U_{\lambda}}\mathrm{div}(U_{\lambda}\nabla\psi)\,\mathrm{div}\left[U_{\lambda}\nabla\left(\frac{1}{U_{\lambda}}\mathrm{div}(U_{\lambda}\nabla\psi)\right)\right]\,\mathrm{d}x
=dUλψ{1Uλdiv[Uλ(1Uλdiv(Uλψ))]}dx,\displaystyle=\int_{\mathbb{R}^{d}}U_{\lambda}\nabla\psi\cdot\nabla\left\{\frac{1}{U_{\lambda}}\mathrm{div}\left[U_{\lambda}\nabla\left(\frac{1}{U_{\lambda}}\mathrm{div}(U_{\lambda}\nabla\psi)\right)\right]\right\}\,\mathrm{d}x,

which confirms the second identity in (83).

Finally, for computation of the Hessian of λ\mathcal{E}_{\lambda}, we make use of (71) and (75) and so obtain

d2dσ2|σ=0λ(uσ)\displaystyle\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\mathcal{E}_{\lambda}(u_{\sigma}) =λd2dσ2|σ=0λ(uσ)\displaystyle=\lambda\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\mathcal{F}_{\lambda}(u_{\sigma})
14d2dσ2|σ=0ddiv(uσloguσUλ)|loguσUλ|2dx\displaystyle\qquad-\frac{1}{4}\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\int_{\mathbb{R}^{d}}\mathrm{div}\left(u_{\sigma}\nabla\log\frac{u_{\sigma}}{U_{\lambda}}\right)\left|\nabla\log\frac{u_{\sigma}}{U_{\lambda}}\right|^{2}\,\mathrm{d}x
+12d2dσ2|σ=0d1uσ[div(uσloguσUλ)]2dx\displaystyle\qquad+\frac{1}{2}\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\int_{\mathbb{R}^{d}}\frac{1}{u_{\sigma}}\left[\mathrm{div}\left(u_{\sigma}\nabla\log\frac{u_{\sigma}}{U_{\lambda}}\right)\right]^{2}\,\mathrm{d}x

A quick inspection of the last two lines above reveals that there is exactly one term that does not vanish automatically because of $\log\frac{u_{0}}{U_{\lambda}}\equiv 0$, namely the one where both logarithmic terms inside the last integral get differentiated. In combination with the calculation for $\mathcal{F}_{\lambda}$ above, we conclude that

d2dσ2|σ=0λ(uσ)=λdUλψ(𝐋2ψ)dx+d1Uλ[div(Uλσ|σ=0uσUλ)]2dx.\begin{split}\frac{\mathrm{d}^{2}}{\,\mathrm{d}\sigma^{2}}\bigg{|}_{\sigma=0}\mathcal{E}_{\lambda}(u_{\sigma})&=\lambda\int_{\mathbb{R}^{d}}U_{\lambda}\nabla\psi\cdot\nabla({\mathbf{L}}^{2}\psi)\,\mathrm{d}x+\int_{\mathbb{R}^{d}}\frac{1}{U_{\lambda}}\left[\mathrm{div}\left(U_{\lambda}\nabla\frac{\partial_{\sigma}|_{\sigma=0}u_{\sigma}}{U_{\lambda}}\right)\right]^{2}\,\mathrm{d}x.\end{split} (85)

We consider the last integral, substitute (84), and repeatedly integrate by parts:

d1Uλ[div(Uλσ|σ=0uσUλ)]2dx=d1Uλ[div(Uλdiv(Uλψ)Uλ)]2dx\displaystyle\int_{\mathbb{R}^{d}}\frac{1}{U_{\lambda}}\left[\mathrm{div}\left(U_{\lambda}\nabla\frac{\partial_{\sigma}|_{\sigma=0}u_{\sigma}}{U_{\lambda}}\right)\right]^{2}\,\mathrm{d}x=\int_{\mathbb{R}^{d}}\frac{1}{U_{\lambda}}\left[\mathrm{div}\left(U_{\lambda}\nabla\frac{\mathrm{div}(U_{\lambda}\nabla\psi)}{U_{\lambda}}\right)\right]^{2}\,\mathrm{d}x
=d1Uλdiv(Uλψ)div{Uλ[1Uλdiv(Uλdiv(Uλψ)Uλ)]}dx\displaystyle\qquad=\int_{\mathbb{R}^{d}}\frac{1}{U_{\lambda}}\mathrm{div}(U_{\lambda}\nabla\psi)\,\mathrm{div}\left\{U_{\lambda}\nabla\left[\frac{1}{U_{\lambda}}\mathrm{div}\left(U_{\lambda}\nabla\frac{\mathrm{div}(U_{\lambda}\nabla\psi)}{U_{\lambda}}\right)\right]\right\}\,\mathrm{d}x
=dUλψ(1Uλdiv{Uλ[1Uλdiv(Uλdiv(Uλψ)Uλ)]})dx.\displaystyle\qquad=-\int_{\mathbb{R}^{d}}U_{\lambda}\nabla\psi\cdot\nabla\left(\frac{1}{U_{\lambda}}\mathrm{div}\left\{U_{\lambda}\nabla\left[\frac{1}{U_{\lambda}}\mathrm{div}\left(U_{\lambda}\nabla\frac{\mathrm{div}(U_{\lambda}\nabla\psi)}{U_{\lambda}}\right)\right]\right\}\right)\,\mathrm{d}x.

Substituting this in (85) yields the final identity in (83). ∎

4.5. Special solutions and their linearization

To illustrate the applicability of the linearization of (1), we completely characterize the dynamics of (1) and its Wasserstein linearization

tφt=(𝐋λ3+λ𝐋λ2)φt\displaystyle\partial_{t}\varphi_{t}=-\big{(}{\mathbf{L}}_{\lambda}^{3}+\lambda{\mathbf{L}}_{\lambda}^{2}\big{)}\varphi_{t} (86)

in the invariant finite dimensional submanifold defined by affine deformations of Gaussians. More specifically, we consider the — non-linear and linearized — dynamics induced on the set of positive definite matrices Sd×dS\in{\mathbb{R}}^{d\times d} and vectors ada\in{\mathbb{R}}^{d} by means of

ut=Xt#UλwithXt(y)=St1/2y+at.\displaystyle u_{t}=X_{t}\#U_{\lambda}\quad\text{with}\quad X_{t}(y)=S_{t}^{1/2}y+a_{t}. (87)

We begin with the linearized dynamics. Since we have

Xt=idψtwithψt(y)=12yT(𝕀St1/2)yatTy,\displaystyle X_{t}=\textnormal{id}-\nabla\psi_{t}\quad\text{with}\quad\psi_{t}(y)=\frac{1}{2}y^{T}(\mathbb{I}-S_{t}^{1/2})y-a_{t}^{T}y, (88)

we need to study solutions φt\varphi_{t} to (86) of quadratic type,

φt(x)=12xTAtx+btTx+ct.\displaystyle\varphi_{t}(x)=\frac{1}{2}x^{T}A_{t}x+b_{t}^{T}x+c_{t}.

It is obvious that these form an invariant subspace under (86). For this ansatz, we obtain

𝐋λ3φt+λ𝐋λ2φt=6λ3xTAtx+2λ3btTx6λ2tr[At],\displaystyle{\mathbf{L}}_{\lambda}^{3}\varphi_{t}+\lambda{\mathbf{L}}_{\lambda}^{2}\varphi_{t}=6\lambda^{3}x^{T}A_{t}x+2\lambda^{3}b_{t}^{T}x-6\lambda^{2}\operatorname{tr}[A_{t}],

and so it follows from (86) that

\[ \dot{A}_{t}=-12\lambda^{3}A_{t},\quad\dot{b}_{t}=-2\lambda^{3}b_{t},\quad\dot{c}_{t}=6\lambda^{2}\operatorname{tr}[A_{t}]. \tag{89} \]

Note that the equation for ctc_{t} can be neglected, since the value of ctc_{t} is irrelevant in Xt=idφtX_{t}=\textnormal{id}-\nabla\varphi_{t}.
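The action of ${\mathbf{L}}_{\lambda}^{3}+\lambda{\mathbf{L}}_{\lambda}^{2}$ on quadratic polynomials can be verified symbolically on coefficient lists. The sketch below works in $d=1$ and assumes $U_{\lambda}\propto e^{-\lambda x^{2}/2}$, so that ${\mathbf{L}}_{\lambda}\varphi=-\varphi''+\lambda x\varphi'$. A polynomial is stored as the list of its coefficients in increasing degree, and the numerical values of `lam`, `A`, `b`, `c` are arbitrary.

```python
def deriv(p):
    """Coefficients of the derivative of the polynomial with coefficients p."""
    return [k * p[k] for k in range(1, len(p))]

def apply_L(p, lam):
    """Apply L = -d^2/dx^2 + lam * x * d/dx to a polynomial (1D, assumed form)."""
    d1 = deriv(p)
    d2 = deriv(d1)
    out = [0.0] * len(p)
    for k, coeff in enumerate(d2):
        out[k] -= coeff            # contribution of -p''
    for k, coeff in enumerate(d1):
        out[k + 1] += lam * coeff  # contribution of lam * x * p'
    return out

lam, A, b, c = 0.8, 1.5, -0.4, 2.0
p = [c, b, 0.5 * A]                # phi(x) = A x^2 / 2 + b x + c

L2 = apply_L(apply_L(p, lam), lam)
L3 = apply_L(L2, lam)
gen = [x3 + lam * x2 for x3, x2 in zip(L3, L2)]  # coefficients of (L^3 + lam L^2) phi

# Compare with 6 lam^3 A x^2 + 2 lam^3 b x - 6 lam^2 A (tr A = A in 1D).
expected = [-6 * lam ** 2 * A, 2 * lam ** 3 * b, 6 * lam ** 3 * A]
print(gen, expected)

# Decay rates read off from d/dt phi = -(L^3 + lam L^2) phi.
rate_A = -gen[2] / (0.5 * A)       # should equal -12 lam^3
rate_b = -gen[1] / b               # should equal -2 lam^3
print(rate_A, rate_b)
```

In this one-dimensional setting the rates $2\lambda^{3}$ and $12\lambda^{3}$ are the values of $\mu^{3}+\lambda\mu^{2}$ at $\mu=\lambda$ and $\mu=2\lambda$, the eigenvalues of ${\mathbf{L}}_{\lambda}$ on linear and (up to constants) quadratic polynomials.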

Now for the full nonlinear dynamics. From the ansatz (87), we obtain that

\[ u_{t}(x)=\left(\frac{\lambda}{2\pi}\right)^{d/2}\frac{1}{\sqrt{\det S_{t}}}\exp\left(-\frac{\lambda}{2}(x-a_{t})^{T}S_{t}^{-1}(x-a_{t})\right). \]

It follows that, on the one hand,

\begin{align}
\partial_{t}u_{t}(x) &=\left[-\frac{1}{2}\left(\frac{\mathrm{d}}{\,\mathrm{d}t}\log\det S_{t}\right)-\frac{\lambda}{2}(x-a_{t})^{T}\frac{\mathrm{d}}{\,\mathrm{d}t}\big(S_{t}^{-1}\big)(x-a_{t})+\lambda(x-a_{t})^{T}S_{t}^{-1}\dot{a}_{t}\right]u_{t}(x) \notag\\
&=\left[\frac{\lambda}{2}(x-a_{t})^{T}S_{t}^{-1}\dot{S}_{t}S_{t}^{-1}(x-a_{t})+\lambda(x-a_{t})^{T}S_{t}^{-1}\dot{a}_{t}-\frac{1}{2}\operatorname{tr}[S_{t}^{-1}\dot{S}_{t}]\right]u_{t}(x). \tag{90}
\end{align}

And on the other hand, using that $\nabla^{2}u_{t}(x)=\big(\lambda^{2}S_{t}^{-1}(x-a_{t})\otimes S_{t}^{-1}(x-a_{t})-\lambda S_{t}^{-1}\big)u_{t}(x)$, and also that $\nabla^{2}\log u_{t}\equiv-\lambda S_{t}^{-1}$, we obtain

\[
\Phi(u_{t})=\frac{3}{2}\lambda^{2}\|S_{t}^{-1}\|^{2}-\lambda^{3}(x-a_{t})^{T}S_{t}^{-3}(x-a_{t}),
\]

and thus further

\[
\begin{aligned}
&\mathrm{div}\big(u_{t}\big[\nabla\Phi(u_{t})+2\lambda^{3}x\big]\big)=\mathrm{div}\big(u_{t}\big[2\lambda^{3}(\mathbb{I}-S_{t}^{-3})(x-a_{t})+2\lambda^{3}a_{t}\big]\big)\\
&\quad=\left(2\lambda^{3}\operatorname{tr}[\mathbb{I}-S_{t}^{-3}]-2\lambda^{4}(x-a_{t})^{T}S_{t}^{-1}a_{t}-2\lambda^{4}(x-a_{t})^{T}(S_{t}^{-1}-S_{t}^{-4})(x-a_{t})\right)u_{t}(x).
\end{aligned}
\tag{91}
\]

By equating the right-hand sides of (90) and (91), we see that (1) is satisfied if and only if

\[
\dot{S}_{t}=-4\lambda^{3}(S_{t}-S_{t}^{-2}),\quad\dot{a}_{t}=-2\lambda^{3}a_{t},\quad\operatorname{tr}[S_{t}^{-1}\dot{S}_{t}]=-4\lambda^{3}\operatorname{tr}[\mathbb{I}-S_{t}^{-3}].
\]

Note that the last of these three differential equations is a trivial consequence of the first. Note further that the equations for $a_{t}$ and $S_{t}$ are easy to solve:

\[
a_{t}=e^{-2\lambda^{3}t}a_{0},\quad S_{t}=\big(\mathbb{I}+e^{-12\lambda^{3}t}(S_{0}^{3}-\mathbb{I})\big)^{1/3}; \tag{92}
\]

the cube root is well-defined since the expression in the brackets equals $(1-e^{-12\lambda^{3}t})\mathbb{I}+e^{-12\lambda^{3}t}S_{0}^{3}$, a convex combination of positive definite matrices, and is therefore positive definite at each $t\geq 0$.
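The closed form for $S_{t}$ in (92) can be compared against a direct numerical integration of $\dot{S}_{t}=-4\lambda^{3}(S_{t}-S_{t}^{-2})$. The sketch below does this in the scalar case $d=1$ only, with a classical Runge-Kutta scheme; the parameter values and the function names `S_exact`, `rhs` and `rk4` are illustrative assumptions, not part of the paper.

```python
import math


def S_exact(t, S0, lam):
    """Closed-form solution (92) in the scalar case d = 1."""
    return (1.0 + math.exp(-12.0 * lam**3 * t) * (S0**3 - 1.0)) ** (1.0 / 3.0)


def rhs(S, lam):
    """Right-hand side of dS/dt = -4 lam^3 (S - S^(-2))."""
    return -4.0 * lam**3 * (S - S ** (-2))


def rk4(S0, lam, t_end, steps):
    """Classical fourth-order Runge-Kutta integration of the scalar ODE."""
    h = t_end / steps
    S = S0
    for _ in range(steps):
        k1 = rhs(S, lam)
        k2 = rhs(S + 0.5 * h * k1, lam)
        k3 = rhs(S + 0.5 * h * k2, lam)
        k4 = rhs(S + h * k3, lam)
        S += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return S


# Hypothetical parameters chosen for illustration.
lam, S0, t_end = 0.7, 2.5, 1.0
S_num = rk4(S0, lam, t_end, 2000)
S_ref = S_exact(t_end, S0, lam)
err = abs(S_num - S_ref)
print(S_num, S_ref, err)
```

In the matrix case the same computation applies in an eigenbasis of $S_{0}$, since $S_{t}$ given by (92) commutes with $S_{0}$.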

In view of (88), the quantities $(A_{t},b_{t})$ and $(S_{t},a_{t})$ are related by

\[
S_{t}=(\mathbb{I}-A_{t})^{2},\quad a_{t}=-b_{t}.
\]

It is thus obvious that the linear ODEs (89) capture the correct asymptotic behaviour of $S_{t}$ and $a_{t}$ as $t\to\infty$. Indeed, to first order in $A_{t}$ one has $S_{t}\approx\mathbb{I}-2A_{t}$ and $S_{t}^{-2}\approx\mathbb{I}+4A_{t}$, so that the nonlinear equation for $S_{t}$ reduces to $\dot{A}_{t}=-12\lambda^{3}A_{t}$. It should be noted that the $d$-fold eigenvalues $2\lambda^{3}$ and $12\lambda^{3}$ constitute the lowest eigenvalues in the spectrum of $\mathrm{Hess}_{U_{\lambda}}\mathcal{E}_{\lambda}$; the next eigenvalue is $36\lambda^{3}$. The interpretation in terms of higher-order asymptotics would be this: an arbitrary initial datum $u_{0}$ that is sufficiently close to $U_{\lambda}$ in an appropriate sense can be modified by a suitable transformation of the form

\[
\tilde{u}_{0}=G\#u_{0},\quad\text{with}\quad G(x)=S^{1/2}x+a,
\]

such that the corresponding solution $\tilde{u}_{t}$ converges to $U_{\lambda}$ at an exponential rate of $36\lambda^{3}$. The rigorous proof of such a result is currently out of reach. For a related result on the porous medium equation, we refer to [6].

4.6. Asymptotic self-similarity

We discuss the following consequence of Conjecture 1.

Corollary 4.5.

If Conjecture 1 is true, then for any initial datum $u_{0}$ of finite second moment and finite entropy, there is a corresponding weak solution $u$ to the initial value problem for (1) with $\lambda=0$ that approaches the self-similar solution $u_{*}$ from (17) at algebraic rate $t^{-1/6}$. That is,

\[
\big\|u(t;\cdot)-u_{*}(t;\cdot)\big\|_{L^{1}(\mathbb{R}^{d})}\leq C(u_{0})(1+12t)^{-1/6}. \tag{93}
\]
Proof.

By the usual rescaling for homogeneous parabolic equations, we relate the solution $u$ for $\lambda=0$ to a solution $v$ for $\lambda=1$. Specifically, we set

\[
\kappa(t):=(1+12t)^{1/6},\quad\tau(t)=\log\kappa(t),
\]

and then introduce $v=v(s;y)$ implicitly via

\[
u(t;x)=\kappa(t)^{-d}v\big(\tau(t);\kappa(t)^{-1}x\big). \tag{94}
\]

Note that $u(0;x)=v(0;x)$. Now let $v$ be the solution to (1) with $\lambda=1$ according to Theorem 1.1 for the initial condition $v(0;x)=u_{0}(x)$. Performing a change of variables under the integral in (9), it is easily seen that $u$ satisfies the weak formulation (8) with $\lambda=0$.

To conclude the self-similar asymptotics (93) via Conjecture 1, we write (16) for $v$ in place of $u$, and with $\lambda=1$, and perform the change of variables $y=\kappa(t)^{-1}x$ under the integral:

\[
\begin{aligned}
C(u_{0})(1+12t)^{-1/6}=C(u_{0})e^{-\tau(t)}&\geq\big\|v(\tau(t);\cdot)-U_{1}\big\|_{L^{1}(\mathbb{R}^{d})}\\
&=\int_{\mathbb{R}^{d}}|v(\tau(t);y)-U_{1}(y)|\,\mathrm{d}y\\
&=\int_{\mathbb{R}^{d}}|v(\tau(t);\kappa(t)^{-1}x)-U_{1}(\kappa(t)^{-1}x)|\,\kappa(t)^{-d}\,\mathrm{d}x\\
&=\int_{\mathbb{R}^{d}}|u(t;x)-u_{*}(t;x)|\,\mathrm{d}x=\|u(t;\cdot)-u_{*}(t;\cdot)\|_{L^{1}(\mathbb{R}^{d})},
\end{aligned}
\]

where we have used the definition of $u_{*}$ in (17). ∎
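The two ingredients of this proof, namely $e^{-\tau(t)}=(1+12t)^{-1/6}$ and the invariance of the $L^{1}$ distance under the mass-preserving rescaling (94), can be illustrated numerically. The sketch below works in dimension $d=1$ and uses stand-in profiles: a standard Gaussian in place of $U_{1}$ and a shifted, narrower Gaussian in place of $v(\tau(t);\cdot)$. Both profiles, the chosen time and the helper `l1_distance` are assumptions made for illustration only.

```python
import math


def kappa(t):
    return (1.0 + 12.0 * t) ** (1.0 / 6.0)


def tau(t):
    return math.log(kappa(t))


def U1(y):
    # ASSUMPTION: stand-in for the stationary profile (standard Gaussian).
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def v(y):
    # ASSUMPTION: hypothetical rescaled profile at the fixed time tau(t).
    s = 0.8
    return math.exp(-0.5 * ((y - 0.3) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def l1_distance(f, g, half_width, n):
    """Midpoint rule for the integral of |f - g| over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += abs(f(x) - g(x))
    return total * h


t = 3.0
k = kappa(t)
d = 1


def u(x):
    # Rescaling (94) in dimension d = 1.
    return k ** (-d) * v(x / k)


def u_star(x):
    # Same rescaling applied to the stand-in stationary profile.
    return k ** (-d) * U1(x / k)


dist_y = l1_distance(v, U1, 12.0, 40000)          # distance in the y variable
dist_x = l1_distance(u, u_star, 12.0 * k, 50000)  # distance in the x variable
print(dist_y, dist_x, math.exp(-tau(t)))
```

The two grids are deliberately different, so the agreement of `dist_y` and `dist_x` reflects the change of variables rather than identical quadrature nodes.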

Appendix A Two elementary inequalities for sums of powers

Lemma A.1.

For each $n=1,2,\ldots$,

\[
(n\tau)^{5/6}\leq\tau\sum_{N=1}^{n}(N\tau)^{-1/6}. \tag{95}
\]
Proof.

We prove (95) by induction on $n$. For $n=1$, one has equality. Now assume that (95) holds for some $n\geq 1$; we need to show that

\[
((n+1)\tau)^{5/6}-(n\tau)^{5/6}\leq((n+1)\tau)^{-1/6}\tau.
\]

Since the graph of the concave function $s\mapsto s^{5/6}$ lies below each of its tangents, we have

\[
((n+1)\tau)^{5/6}\leq(n\tau)^{5/6}+\frac{5}{6}(n\tau)^{-1/6}\tau.
\]

It remains to verify that $\frac{5}{6}(n\tau)^{-1/6}\leq((n+1)\tau)^{-1/6}$, which follows by observing that

\[
\frac{5}{6}\left(\frac{n+1}{n}\right)^{1/6}\leq\frac{5}{6}2^{1/6}<1
\]

for all $n=1,2,\ldots$. ∎
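Inequality (95) is easy to test numerically. The following sketch evaluates both sides for a range of $n$ and several step sizes $\tau$; the sampled values and the small relative tolerance (needed because equality holds at $n=1$) are choices made here for illustration.

```python
def lhs_A1(n, tau):
    """Left-hand side of (95)."""
    return (n * tau) ** (5.0 / 6.0)


def rhs_A1(n, tau):
    """Right-hand side of (95)."""
    return tau * sum((N * tau) ** (-1.0 / 6.0) for N in range(1, n + 1))


def check_A1(max_n, taus, rel_tol=1e-12):
    """Return True if (95) holds, up to rounding, for all sampled n and tau."""
    for tau in taus:
        for n in range(1, max_n + 1):
            if lhs_A1(n, tau) > rhs_A1(n, tau) * (1.0 + rel_tol):
                return False
    return True


ok_A1 = check_A1(200, [1e-3, 0.1, 1.0, 10.0])
# Constant used in the induction step of the proof.
step_constant = 5.0 / 6.0 * 2.0 ** (1.0 / 6.0)
print(ok_A1, step_constant)
```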

Lemma A.2.

For integers $\overline{M}>\underline{M}\geq 1$,

\[
\frac{\tau}{12}\sum_{n=\underline{M}}^{\overline{M}-1}(n\tau)^{-5/6}\leq(\overline{M}\tau)^{1/6}-(\underline{M}\tau)^{1/6}. \tag{96}
\]
Proof.

Inequality (96) follows by summation over $n=\underline{M},\ldots,\overline{M}-1$ from

\[
\frac{\tau}{12}(n\tau)^{-5/6}\leq((n+1)\tau)^{1/6}-(n\tau)^{1/6}
\]

for all $n=1,2,\ldots$, since the right-hand sides form a telescoping sum. This is a consequence of the fact that the graph of the concave function $s\mapsto s^{1/6}$ lies below its tangent at $s=(n+1)\tau$,

\[
(n\tau)^{1/6}\leq((n+1)\tau)^{1/6}-\frac{1}{6}((n+1)\tau)^{-5/6}\tau,
\]

in combination with the observation that

\[
\frac{1}{6}\left(\frac{n}{n+1}\right)^{5/6}\geq\frac{1}{6}2^{-5/6}>\frac{1}{12}.
\]

∎
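As with (95), inequality (96) can be checked by direct evaluation. The sketch below samples pairs $\underline{M}<\overline{M}$ and several values of $\tau$; the sampled ranges and function names are illustrative choices.

```python
def lhs_A2(m_lo, m_hi, tau):
    """Left-hand side of (96): (tau/12) * sum_{n=m_lo}^{m_hi-1} (n tau)^(-5/6)."""
    return tau / 12.0 * sum((n * tau) ** (-5.0 / 6.0) for n in range(m_lo, m_hi))


def rhs_A2(m_lo, m_hi, tau):
    """Right-hand side of (96)."""
    return (m_hi * tau) ** (1.0 / 6.0) - (m_lo * tau) ** (1.0 / 6.0)


def check_A2(max_m, taus):
    """Return True if (96) holds for all sampled 1 <= m_lo < m_hi <= max_m."""
    for tau in taus:
        for m_lo in range(1, max_m):
            for m_hi in range(m_lo + 1, max_m + 1):
                if lhs_A2(m_lo, m_hi, tau) > rhs_A2(m_lo, m_hi, tau):
                    return False
    return True


ok_A2 = check_A2(40, [1e-3, 0.1, 1.0, 10.0])
# Constant used in the last step of the proof.
tangent_constant = 2.0 ** (-5.0 / 6.0) / 6.0
print(ok_A2, tangent_constant)
```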

Appendix B A convergence theorem for powers

Theorem B.1.

For $0<\beta<\gamma<\alpha<\infty$ with $\alpha p=\beta q=\gamma r$ and a sequence of nonnegative functions $(u_{n})_{n\in\mathbb{N}}$ on $(0,T)\times\mathbb{R}^{d}$ such that

  (i) $u_{n}^{\alpha}\rightarrow u^{\alpha}$ strongly in $L^{p}(0,T;W^{1,p}(\mathbb{R}^{d}))$,

  (ii) $(u_{n}^{\beta})_{n\in\mathbb{N}}$ is bounded in $L^{q}(0,T;W^{1,q}(\mathbb{R}^{d}))$,

we have (up to subsequences) $u_{n}^{\gamma}\rightarrow u^{\gamma}$ strongly in $L^{r}(0,T;W^{1,r}(\mathbb{R}^{d}))$.

We rephrase a slight variant, respectively extension, of the proof of [11, Proposition 6.1].

Proof.

At first assume that the sequence is uniformly bounded away from 0, i.e. $u_{n}\geq\varepsilon$ for all $n\in\mathbb{N}$ and some $\varepsilon>0$, and that all $u_{n}$ have support on a ball with radius $R>0$, $\Omega:=B_{0}(R)$. By the “missing term in Fatou’s Lemma” it is enough to show pointwise convergence and convergence of the norms for the strong convergence of $(u_{n}^{\gamma})_{n\in\mathbb{N}}$ in $L^{r}(0,T;W^{1,r}(\Omega))$.

For this, define $\mu_{n},\mu$ to be the measures with densities $u_{n},u$ (with respect to the Lebesgue measure). As $(u_{n})_{n\in\mathbb{N}}$ converges in $L^{1}(0,T;L^{1}(\Omega))$, which of course implies weak convergence in $L^{1}(0,T;L^{1}(\Omega))$, $(\mu_{n})_{n\in\mathbb{N}}$ converges narrowly to $\mu$ on $(0,T)\times\Omega$ (since $C_{b}((0,T)\times\Omega)\subset L^{\infty}((0,T)\times\Omega)$). Moreover, for $v_{n}:=\nabla u_{n}/u_{n}$ we have

\[
\int_{(0,T)\times\Omega}|v_{n}|^{p}\,d\mu_{n}(x)=\int_{(0,T)\times\Omega}|\nabla u_{n}|^{p}u_{n}^{p(\alpha-1)}\,dx=\alpha^{-p}\int_{(0,T)\times\Omega}|\nabla u_{n}^{\alpha}|^{p}\,dx \tag{97}
\]

and hence $v_{n}\in L^{p}(\mu_{n},(0,T)\times\Omega)$.

Further, (97) and assumption (i) imply

\[
\limsup_{n\rightarrow\infty}\left\lVert v_{n}\right\rVert_{L^{p}(\mu_{n},(0,T)\times\Omega)}=\left\lVert v\right\rVert_{L^{p}(\mu,(0,T)\times\Omega)},
\]

hence $(v_{n})_{n\in\mathbb{N}}$ converges in the sense of [1, Definition 5.4.3.] strongly in $L^{p}(\mu,(0,T)\times\Omega)$, which, in addition to the narrow convergence of $(\mu_{n})_{n\in\mathbb{N}}$, implies the narrow convergence of the transport plans $\gamma_{n}=(\mathrm{id}\times v_{n})\#\mu_{n}$ to $\gamma=(\mathrm{id}\times v)\#\mu$ in $\mathcal{P}(((0,T)\times\Omega)\times((0,T)\times\Omega))$, as stated in [1, Theorem 5.4.4.]. Since the $p$th and $q$th moments of $\mu_{n}$ are uniformly bounded, by applying [1, Theorem 5.1.7.] to $(\gamma_{n})_{n\in\mathbb{N}}$ we find

\[
\lim_{n\rightarrow\infty}\int_{(0,T)\times\Omega}f(x,v_{n}(x))\,d\mu_{n}(x)=\int_{(0,T)\times\Omega}f(x,v(x))\,d\mu(x)
\]

for functions $f$ with at most $r$-growth. Hence choosing $f(x,y)=|y|^{r}$ yields the norm convergence:

\[
\lim_{n\rightarrow\infty}\int_{(0,T)\times\Omega}|\nabla u_{n}^{\gamma}|^{r}\,dx=\lim_{n\rightarrow\infty}\int_{(0,T)\times\Omega}|v_{n}(x)|^{r}\,d\mu_{n}(x)=\int_{(0,T)\times\Omega}|\nabla u^{\gamma}|^{r}\,dx.
\]

By assumption (i) we can extract pointwise convergent subsequences (not relabelled)

\[
\begin{aligned}
u^{\alpha}_{n}(x)&\rightarrow u^{\alpha}(x)\quad\text{a.e. }x\in(0,T)\times\Omega,\\
\nabla u^{\alpha}_{n}(x)&\rightarrow\nabla u^{\alpha}(x)\quad\text{a.e. }x\in(0,T)\times\Omega,
\end{aligned}
\]

which, taking the strict positivity of the functions into account, yields

\[
\begin{aligned}
u^{\gamma}_{n}(x)&\rightarrow u^{\gamma}(x)\quad\text{a.e. }x\in(0,T)\times\Omega,\\
\nabla u^{\gamma}_{n}(x)=u_{n}^{\gamma-\alpha}(x)\nabla u^{\alpha}_{n}(x)&\rightarrow u^{\gamma-\alpha}(x)\nabla u^{\alpha}(x)=\nabla u^{\gamma}(x)\quad\text{a.e. }x\in(0,T)\times\Omega,
\end{aligned}
\]

and hence finally the strong convergence of the subsequence.

For a nonnegative sequence $(u_{n})_{n\in\mathbb{N}}$ we apply the above result to the truncated functions

\[
u_{n,\varepsilon}(x):=\max\{u_{n}(x),\varepsilon\}\quad\text{a.e. }x\in(0,T)\times\Omega
\]

for a sequence $\varepsilon\downarrow 0$. Using the norm convergence of $(u_{n}^{\gamma})_{n\in\mathbb{N}}$ and the truncation property of $W^{1,r}(\Omega)$, we have

\[
\left\lVert u_{n}^{\gamma}-u_{n,\varepsilon}^{\gamma}\right\rVert_{L^{r}(0,T;W^{1,r}(\Omega))}=\left\lVert u_{n}^{\gamma}-(u^{\gamma})_{n,\varepsilon^{\gamma}}\right\rVert_{L^{r}(0,T;W^{1,r}(\Omega))}\rightarrow 0 \tag{98}
\]

for $\varepsilon\rightarrow 0$ and thus the result also holds for nonnegative functions.

To extend the argument to the whole of $\mathbb{R}^{d}$, one argues by approximating $u_{n}$ by the sequence $u_{n,l}=\chi_{l}u_{n}$ for $\chi_{l}(x)=\chi(|x|/l)$, where $\chi\in C^{\infty}_{c}(\mathbb{R}^{d})$ is a cut-off function with $0\leq\chi\leq 1$, where $\chi(x)\equiv 1$ for $|x|\leq 1$ and $\chi(x)=0$ for $|x|\geq 2$. Passing to subsequences and using the former calculations, by diagonalization one finds a sequence converging on every ball $B_{0}(l)$ for $l\in\mathbb{N}$. With the norm boundedness of $u^{\gamma}_{n}$ in $L^{r}(0,T;W^{1,r}(\Omega))$, obtained by using the representation given in (97) and interpolating between the uniformly bounded $\left\lVert v_{n}\right\rVert_{L^{p}(\mu,(0,T)\times\mathbb{R}^{d})}$ and $\left\lVert v_{n}\right\rVert_{L^{q}(\mu,(0,T)\times\mathbb{R}^{d})}$, one eventually concludes the argument for functions on the whole of $\mathbb{R}^{d}$. ∎

Appendix C Integration by parts

For convenience of the reader, we recall the basic rule for integration by parts.

Lemma C.1.

Let $f,g\in L^{1}(\mathbb{R}^{d})$ be given, and assume that there exists a vector field $\mathbf{v}\in L^{1}(\mathbb{R}^{d};\mathbb{R}^{d})$ such that $f=g+\mathrm{div}\,\mathbf{v}$ in the sense of distributions. Then

\[
\int_{\mathbb{R}^{d}}f\,\mathrm{d}x=\int_{\mathbb{R}^{d}}g\,\mathrm{d}x. \tag{99}
\]
Proof.

Let $\chi\in C^{\infty}_{c}(\mathbb{R}^{d})$ be a cut-off function with $0\leq\chi\leq 1$, with $\chi(x)\equiv 1$ for $|x|\leq 1$, and with $\chi(x)=0$ for $|x|\geq 2$. Defining as usual $\chi_{R}(x):=\chi(x/R)$ for $R>1$, we obtain

\[
\int_{\mathbb{R}^{d}}\chi_{R}f\,\mathrm{d}x=\int_{\mathbb{R}^{d}}\chi_{R}g\,\mathrm{d}x+\langle\mathrm{div}\,\mathbf{v},\chi_{R}\rangle=\int_{\mathbb{R}^{d}}g\chi_{R}\,\mathrm{d}x-\int_{\mathbb{R}^{d}}\mathbf{v}\cdot\nabla\chi_{R}\,\mathrm{d}x.
\]

On the one hand, since $g\in L^{1}(\mathbb{R}^{d})$, we have that $g\chi_{R}\to g$ in $L^{1}(\mathbb{R}^{d})$ as $R\to\infty$ by dominated convergence. On the other hand, since $\mathbf{v}\in L^{1}(\mathbb{R}^{d};\mathbb{R}^{d})$ and $\nabla\chi_{R}(x)=\frac{1}{R}\nabla\chi(x/R)$, we have that $\mathbf{v}\cdot\nabla\chi_{R}\to 0$ in $L^{1}(\mathbb{R}^{d})$ as $R\to\infty$, again by dominated convergence. This shows (99). ∎
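The decay of the boundary term $\mathbf{v}\cdot\nabla\chi_{R}$ in $L^{1}$ can be observed in a one-dimensional toy computation. The sketch below takes the integrable field $v(x)=1/(1+x^{2})$ and, for simplicity, a piecewise-linear cut-off instead of a smooth one; both choices, as well as the function names, are assumptions made for illustration and are not taken from the paper.

```python
import math


def v(x):
    # ASSUMPTION: hypothetical integrable field with L^1 norm equal to pi.
    return 1.0 / (1.0 + x * x)


def dchi(s):
    """Derivative of a piecewise-linear cut-off: chi = 1 on |s| <= 1,
    chi = 2 - |s| on 1 <= |s| <= 2, chi = 0 otherwise (not smooth)."""
    if 1.0 < abs(s) < 2.0:
        return -1.0 if s > 0 else 1.0
    return 0.0


def cutoff_term_l1(R, n=20000):
    """Midpoint rule for the L^1 norm of v * chi_R', where
    chi_R'(x) = chi'(x / R) / R is supported in R <= |x| <= 2R."""
    h = 4.0 * R / n
    total = 0.0
    for i in range(n):
        x = -2.0 * R + (i + 0.5) * h
        total += abs(v(x) * dchi(x / R) / R)
    return total * h


radii = [1.0, 10.0, 100.0, 1000.0]
terms = [cutoff_term_l1(R) for R in radii]
for R, term in zip(radii, terms):
    print(R, term)
```

Each value is bounded by $\|v\|_{L^{1}}\sup|\chi'|/R=\pi/R$, which is the elementary estimate behind the convergence claimed in the proof.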

An application of this variant of integration by parts is the following identity, taken from [8, Theorem 3.1]:

\[
4\int_{\mathbb{R}^{d}}|\nabla\sqrt[4]{u}|^{4}\,\mathrm{d}x=2\int_{\mathbb{R}^{d}}\nabla\sqrt[4]{u}\cdot\nabla^{2}\sqrt{u}\cdot\nabla\sqrt[4]{u}\,\mathrm{d}x+\int_{\mathbb{R}^{d}}\Delta\sqrt{u}\,|\nabla\sqrt[4]{u}|^{2}\,\mathrm{d}x
\]

for all positive $u\in L^{1}(\mathbb{R}^{d})$ with $\sqrt{u}\in W^{2,2}(\mathbb{R}^{d})$. The respective vector field is given by $\mathbf{v}=|\nabla\sqrt[4]{u}|^{2}\nabla\sqrt{u}$, which belongs to $L^{1}(\mathbb{R}^{d};\mathbb{R}^{d})$ since $\nabla\sqrt{u}\in L^{2}(\mathbb{R}^{d};\mathbb{R}^{d})$ and $\nabla\sqrt[4]{u}\in L^{4}(\mathbb{R}^{d})$. The relation $f+\mathrm{div}\,\mathbf{v}=g$ is easily seen for positive and smooth functions $u$, and carries over to the aforementioned more general $u$ via local approximation.

Appendix D Arzelà-Ascoli theorem

For the sake of self-containment, we reproduce the statement of the generalized Arzelà-Ascoli theorem from [1, Proposition 3.3.1], which has been essential in the proof of Lemma 3.21.

Lemma D.1.

Let $(\mathcal{I},d)$ be a complete metric space and $\sigma$ a Hausdorff topology on $\mathcal{I}$ compatible with $d$ in the sense that for sequences $(x_{n}),(y_{n})\subset\mathcal{I}$

\[
\begin{aligned}
x_{n}\xrightarrow{d}x\quad&\Longrightarrow\quad x_{n}\xrightarrow{\sigma}x\\
\text{and}\quad(x_{n},y_{n})\xrightarrow{\sigma}(x,y)\quad&\Longrightarrow\quad\liminf_{n\rightarrow\infty}d(x_{n},y_{n})\geq d(x,y).
\end{aligned}
\]

Further, let $K\subset\mathcal{I}$ be a sequentially compact set w.r.t. $\sigma$ and let $u_{n}:[0,T]\rightarrow\mathcal{I}$ be curves such that

\[
\begin{aligned}
&u_{n}(t)\in K\quad\forall n\in\mathbb{N},\quad t\in[0,T],\\
&\limsup_{n\rightarrow\infty}d(u_{n}(s),u_{n}(t))\leq w(s,t)\quad\forall s,t\in[0,T],
\end{aligned}
\]

for a (symmetric) function $w:[0,T]\times[0,T]\rightarrow[0,+\infty)$, such that

\[
\lim_{(s,t)\rightarrow(r,r)}w(s,t)=0\quad\forall r\in[0,T]\setminus\mathcal{C},
\]

where $\mathcal{C}$ is an (at most) countable subset of $[0,T]$. Then there exists an increasing subsequence $k\mapsto n(k)$ and a limit curve $u:[0,T]\rightarrow\mathcal{I}$ such that

\[
u_{n(k)}(t)\xrightarrow{\sigma}u(t)\quad\forall t\in[0,T],\quad\text{$u$ is $d$-continuous in }[0,T]\setminus\mathcal{C}.
\]

In the proof of Lemma 3.21, this result is applied as follows. The complete metric space is $(\mathcal{P}_{2}(\mathbb{R}^{d}),\mathbf{W}_{2})$, the auxiliary topology is the one induced by narrow convergence; we have recalled the compatibility conditions in Section 2.2. The compact subset $K\subset\mathcal{P}_{2}(\mathbb{R}^{d})$ is formed by the probability measures satisfying the uniform bound on the second moment from (51); sequential compactness of $K$ is a consequence of Prokhorov’s theorem. The modulus $w$ of continuity is given by the right-hand side in the Hölder estimate (65) at $\tau=0$. Notice that the conclusion above guarantees convergence in $\sigma$, i.e., narrowly, not necessarily in $\mathbf{W}_{2}$.

References

  • [1] Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré. Gradient flows in metric spaces and in the space of probability measures. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, second edition, 2008.
  • [2] Mario Bukal, Ansgar Jüngel, and Daniel Matthes. Entropies for radially symmetric higher-order nonlinear diffusion equations. Commun. Math. Sci., 9(2):353–382, 2011.
  • [3] Mario Bukal, Ansgar Jüngel, and Daniel Matthes. A multidimensional nonlinear sixth-order quantum diffusion equation. Ann. Inst. H. Poincaré Anal. Non Linéaire, 30(2):337–365, 2013.
  • [4] P. Degond and C. Ringhofer. Quantum moment hydrodynamics and the entropy principle. J. Statist. Phys., 112(3-4):587–628, 2003.
  • [5] Pierre Degond, Florian Méhats, and Christian Ringhofer. Quantum energy-transport and drift-diffusion models. J. Stat. Phys., 118(3-4):625–667, 2005.
  • [6] Jochen Denzler, Herbert Koch, and Robert J. McCann. Higher-order time asymptotics of fast diffusion in Euclidean space: a dynamical systems approach. Mem. Amer. Math. Soc., 234(1101):vi+81, 2015.
  • [7] Jochen Denzler and Robert J. McCann. Fast diffusion to self-similarity: complete spectrum, long-time asymptotics, and numerology. Arch. Ration. Mech. Anal., 175(3):301–342, 2005.
  • [8] Ugo Gianazza, Giuseppe Savaré, and Giuseppe Toscani. The Wasserstein gradient flow of the Fisher information and the quantum drift-diffusion equation. Arch. Ration. Mech. Anal., 194(1):133–220, 2009.
  • [9] Richard Jordan, David Kinderlehrer, and Felix Otto. The variational formulation of the Fokker-Planck equation. SIAM J. Math. Anal., 29(1):1–17, 1998.
  • [10] Ansgar Jüngel and Daniel Matthes. An algorithmic construction of entropies in higher-order nonlinear PDEs. Nonlinearity, 19(3):633–659, 2006.
  • [11] Ansgar Jüngel and Josipa-Pina Milišić. A sixth-order nonlinear parabolic equation for quantum systems. SIAM J. Math. Anal., 41(4):1472–1490, 2009.
  • [12] Pierre-Louis Lions and Cédric Villani. Régularité optimale de racines carrées. C. R. Acad. Sci. Paris Sér. I Math., 321(12):1537–1541, 1995.
  • [13] Daniel Matthes, Robert J. McCann, and Giuseppe Savaré. A family of nonlinear fourth order equations of gradient flow type. Comm. Partial Differential Equations, 34(10-12):1352–1397, 2009.
  • [14] Robert J. McCann and Christian Seis. The spectrum of a family of fourth-order nonlinear diffusions near the global attractor. Comm. Partial Differential Equations, 40(2):191–218, 2015.
  • [15] Robert J. McCann and Dejan Slepčev. Second-order asymptotics for the fast-diffusion equation. Int. Math. Res. Not., pages Art. ID 24947, 22, 2006.
  • [16] Florian Méhats and Olivier Pinaud. An inverse problem in quantum statistical physics. J. Stat. Phys., 140(3):565–602, 2010.
  • [17] Felix Otto. The geometry of dissipative evolution equations: the porous medium equation. Comm. Partial Differential Equations, 26(1-2):101–174, 2001.
  • [18] Olivier Pinaud. The quantum drift-diffusion model: existence and exponential convergence to the equilibrium. Ann. Inst. H. Poincaré Anal. Non Linéaire, 36(3):811–836, 2019.
  • [19] Filippo Santambrogio. Optimal transport for applied mathematicians, volume 87 of Progress in Nonlinear Differential Equations and their Applications. Birkhäuser/Springer, Cham, 2015. Calculus of variations, PDEs, and modeling.
  • [20] Christian Seis. Long-time asymptotics for the porous medium equation: the spectrum of the linearized operator. J. Differential Equations, 256(3):1191–1223, 2014.
  • [21] Christian Seis. The thin-film equation close to self-similarity. Anal. PDE, 11(5):1303–1342, 2018.
  • [22] Cédric Villani. Topics in optimal transportation, volume 58 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2003.