
A Stein’s Method Approach to the Linear Noise Approximation for Stationary Distributions of Chemical Reaction Networks

Theodore W. Grunberg Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA 02139 USA ([email protected]).    Domitilla Del Vecchio Department of Mechanical Engineering, MIT, Cambridge, MA 02139 USA ([email protected]).
Abstract

Stochastic Chemical Reaction Networks are continuous time Markov chain models that describe the time evolution of the molecular counts of species interacting stochastically via discrete reactions. Such models are ubiquitous in systems and synthetic biology, but often have a large or infinite number of states, and thus are not amenable to computation and analysis. Due to this, approximations that rely on the molecular counts and the volume being large are commonly used, with the most common being the Reaction Rate Equations and the Linear Noise Approximation. For finite time intervals, Kurtz established the validity of the Reaction Rate Equations and Linear Noise Approximation, by proving law of large numbers and central limit theorem results respectively. However, the analogous question for the stationary distribution of the Markov chain model has remained mostly unanswered, except for chemical reaction networks with special structures or bounded molecular counts. In this work, we use Stein’s Method to obtain sufficient conditions for the stationary distribution of an appropriately scaled Stochastic Chemical Reaction Network to converge to the Linear Noise Approximation as the system size goes to infinity. Our results give non asymptotic bounds on the error in the Reaction Rate Equations and in the Linear Noise Approximation as applied to the stationary distribution. As a special case, we give conditions under which the global exponential stability of an equilibrium point of the Reaction Rate Equations is sufficient to obtain our error bounds, thus permitting one to obtain conclusions about the Markov chain by analyzing the deterministic Reaction Rate Equations.

1 Introduction

1.1 Background and existing results

Systems composed of a set of chemical species interacting via a finite set of reactions are most often modeled as a Stochastic Chemical Reaction Network (SCRN), a continuous time Markov chain (CTMC) model for how the molecular counts of each species change over time. Such models have found widespread use in systems and synthetic biology as a method to capture the counts of relevant biomolecules within a single cell, where due to the small numbers of molecules present, the effect of stochasticity in the system’s evolution can be profound [67, 61, 52, 46, 6, 20, 65]. The time evolution of the probability distribution of molecular counts can be computed via the Kolmogorov forward equations, often called the Chemical Master Equation [27], or by a Monte Carlo approach using the Doob-Gillespie Algorithm [16, 26]. However, due to the very large, and often countably infinite, number of states in the Markov chain, simpler approximations have been proposed, the most prominent of which exploit the fact that for many physical systems the volume in which the reactions occur and the molecular counts of all species are large. These approximations are formalized by considering a family of Markov chains 𝑿Ω(t)\bm{X}_{\Omega}(t), where 𝑿Ω(t)\bm{X}_{\Omega}(t) is the vector of molecular counts at time tt and Ω\Omega is the volume. The Reaction Rate Equation (RRE) model is an Ordinary Differential Equation (ODE), whose solution 𝒗(t)\bm{v}(t) is a deterministic approximation for 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega, rigorously justified by Kurtz on finite time intervals in the sense that under mild technical conditions, 𝑿Ω(t)/Ωp𝒗(t)\bm{X}_{\Omega}(t)/\Omega\rightarrow_{p}\bm{v}(t) uniformly on any finite time interval [39]. Due to 𝒗(t)\bm{v}(t) being the solution to an ODE, the computational and analytical simplicity of the RRE model has lead to its widespread use in both modeling in systems biology [3] and design in synthetic biology [14, 24, 19]. In essence, the RRE model approximates the concentrations by a constant at each point in time, neglecting any stochastic fluctuation away from this value. In order to account for these fluctuations, Van Kampen expanded the Forward Kolmogorov Equations in powers of 1/Ω1/\sqrt{\Omega} [35, 62]. The first order approximation he derived in this manner, called the Linear Noise Approximation (LNA), adds a correction term to the RREs, specifically a Gaussian approximation to fluctuations of 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega about 𝒗(t)\bm{v}(t). The LNA was rigorously justified by Kurtz in the sense of giving technical conditions for the central limit theorem result 1Ω(𝑿Ω(t)Ω𝒗(t))p𝒩(0,P(t))\frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}(t)-\Omega\bm{v}(t)\right)\rightarrow_{p}\mathcal{N}(0,P(t)) to hold uniformly on finite time intervals with an appropriate covariance matrix P(t)P(t). The LNA is similar to the RREs in that the approximating distribution can be easily computed from the solution to an ODE, and due to this simplicity it has been used widely to quantify noise in natural biological systems, [50, 51, 18, 60], as well as for design [56, 55], and inference [38, 22, 23] in synthetic biology.

However, a key weakness of the theoretical justification for the RREs and the LNA is that the results given in [39] do not say anything about the relationship between the stationary distribution of the Markov chain model and the steady state behavior of the RREs or LNA, even when the stationary distribution exists uniquely for all Ω\Omega and the RREs/LNA has a well defined limiting behavior. In a later work, Kurtz gave a sufficient condition for the uniform in time convergence of 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega to 𝒗(t)\bm{v}(t) and of 1Ω(𝑿Ω(t)Ω𝒗(t))\frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}(t)-\Omega\bm{v}(t)\right) to 𝒩(0,P(t))\mathcal{N}(0,P(t)) [40]. However, that result requires that the state space of 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega be bounded uniformly in Ω\Omega, which rules out the analysis of many biological systems of interest where the molecular counts of certain species can grow unbounded, and thus for all Ω\Omega, 𝑿Ω(t)\bm{X}_{\Omega}(t) evolves on a countably infinite subset of the integer lattice. This technical difficulty in using the RREs or LNA to approximate the stationary distribution of an SCRN has long been understood [62], and in fact it is often the case that the Markov chain and RREs/LNA models have different long term behavior [43]. In this work we give sufficient conditions under which there exists a constant CC such that for all sufficiently large Ω\Omega,

𝒲1(1Ω(𝑿ΩΩ𝒗),𝒀)ClnΩΩ,\mathcal{W}_{1}\left(\frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}^{\infty}-\Omega\bm{v}^{*}\right),\bm{Y}^{\infty}\right)\leq C\frac{\ln{\Omega}}{\sqrt{\Omega}}, (1)

where \bm{X}_{\Omega}^{\infty} is distributed according to the stationary distribution of \bm{X}_{\Omega}(t), \bm{v}^{*} is an appropriate equilibrium point of the RRE, \bm{Y}^{\infty}\sim\mathcal{N}(0,P^{\infty}) with P^{\infty} a covariance matrix computed via the LNA, and \mathcal{W}_{1} is the 1-Wasserstein distance between the laws of the two random variables. This result immediately implies both that the Markov chain's stationary distribution converges to an equilibrium point of the RRE in the sense \bm{X}_{\Omega}^{\infty}/\Omega\rightarrow_{d}\bm{v}^{*}, where \bm{v}^{*} is an appropriate equilibrium of the RRE, and that \frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}^{\infty}-\Omega\bm{v}^{*}\right)\rightarrow_{d}\mathcal{N}(0,P^{\infty}) [25]. We give sufficient conditions in terms of the second moment of \frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}^{\infty}-\Omega\bm{v}^{*}\right) for (1) to hold, and additionally give sufficient conditions on only the RRE model for (1) to hold, which include global exponential stability of \bm{v}^{*}. Interestingly, the convergence rate given by (1) is identical to the convergence rate shown by Kurtz on finite time intervals [41].

The primary technique we use to prove our results is Stein’s Method [13]. Stein’s Method is a powerful technique for proving central limit theorems introduced by Stein in 1972 [59, 58], and used in a wide variety of situations since then, such as when the limiting distribution is Poisson [7] or Binomial [17], as well as for convergence to a diffusion process [8]. The most closely related work to our study comes from queueing theory, where Braverman et al. using Stein’s Method, and Gurvich using a closely related technique, studied diffusion approximations of certain queueing systems in the heavy traffic regime [11, 10, 33]. Our application of Stein’s method follows that of [11] at a high level, where the key difference is that our family of CTMCs and limiting process is in nn dimensions instead of one. In order to handle this, we use results on Stein Factors from [30].

1.2 Related work

Although rigorous work on approximating the stationary distribution of SCRNs in the large volume limit has been limited, a variety of related ideas have been presented in the literature. In particular, when the microscopic rates are all affine, it is known that the LNA matches the first and second moments of 𝑿Ω(t)\bm{X}_{\Omega}(t) for all time [62, 43]. This idea has been extended to a restricted class of SCRNs with nonlinear propensities in [31]. Additionally, Sontag and Singh gave a special class of SCRNs where the moments can be exactly computed even when they may not match the moments of the LNA exactly [57]. However, in these cases it is still unclear if the stationary distribution of 1Ω(𝑿Ω(t)Ωv)\frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}(t)-\Omega v^{*}\right) converges to the Gaussian given by the LNA in the large volume limit. In fact, the results we present allow one to rigorously go from convergence of 1Ω(𝑿ΩΩv)\frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}^{\infty}-\Omega v^{*}\right) to 𝒩(0,P)\mathcal{N}(0,P^{\infty}) with respect to the first and second moments, to convergence in 1-Wasserstein distance. We can thus strengthen the results of e.g. [31] with limited additional work. We note that methods for establishing convergence of the stationary distributions of a family of Markov Processes to the stationary distribution of a limiting process are given in [21]. However, not only have these methods not been successfully applied to SCRNs, but also our Stein’s Method approach yields non-asymptotic bounds on the approximation error in 1-Wasserstein distance.

In certain special cases, such as detailed or complex balance, the stationary distribution of 𝑿Ω(t)\bm{X}_{\Omega}(t) can be computed in closed form, and thus in principle one could directly analyze the approximation error when using the LNA [66, 4, 48]. However, for most SCRNs of interest in biology, such results are not applicable. For example, a simple two step transcription-translation model for protein expression is not detailed or complex balanced [14], and thus the results in [66, 4] cannot be used to obtain a closed form expression for the stationary distribution, whereas the results of [48] are only exact for systems where 𝑿Ω(t)\bm{X}_{\Omega}(t) has finitely many states.

We note that the LNA is only one diffusion approximation for SCRNs, and in fact the Chemical Langevin Equation (CLE) is often a more accurate approximation to 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega for moderately large volumes [28]. The convergence of 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega to the CLE on finite time intervals was established by Kurtz [41], but using the CLE to approximate the stationary distribution of 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega has remained mostly unstudied. In fact, the CLE often exhibits finite time breakdown and thus does not have a well defined stationary distribution, a problem that has been tackled by allowing complex values in the Stochastic Differential equation [54], as well as imposing the physical constraint that concentrations are nonnegative [42].

1.3 Outline of the rest of the paper

The rest of the paper is as follows: In Section 2 we give notation and mathematical background, and then introduce the three relevant models. In Section 3 we state our main results, the proofs of which appear in Section 4. In Section 5 we illustrate our results with several examples. In Section 6 we provide concluding remarks and future research directions.

2 Problem Setting

In this section we first introduce the notation we will use, and then introduce SCRNs and the associated CTMC model along with the Reaction Rate Equation and Linear Noise Approximation models. We also introduce the 1-Wasserstein distance, which we will use to measure the distance between probability distributions.

2.1 Notation

We denote (column) vectors by bold symbols \bm{x},\bm{Z}, with the elements indexed as \bm{x}=[x_{1},\dots,x_{n}]^{T}. Let \mathbb{Z}_{\geq 0}^{n} [\mathbb{Z}_{>0}^{n}] denote the nonnegative [positive] integer lattice in n dimensions. Let \mathbb{Q} denote the set of rational numbers. Let \mathbb{R}^{n}, \mathbb{R}_{\geq 0}^{n}, and \mathbb{R}^{n}_{>0} denote n-dimensional Euclidean space, and the nonnegative and positive orthants of the same, respectively. For \bm{x}\in\mathbb{R}^{n} we denote the standard 2-norm by \lVert\bm{x}\rVert, and for a matrix A\in\mathbb{R}^{n\times m} we denote by \left\lVert A\right\rVert the induced 2-norm \left\lVert A\right\rVert=\sup_{\bm{x}\neq 0}\frac{\left\lVert A\bm{x}\right\rVert}{\lVert\bm{x}\rVert}, and denote by \mathrm{cond}(A) the condition number of A. When all eigenvalues of A\in\mathbb{R}^{n\times n} have strictly negative real parts we say that A is Hurwitz. Given a function g:\mathbb{R}^{n}\rightarrow\mathbb{R}, we denote by \frac{\partial g}{\partial\bm{z}}, \nabla g, and \nabla^{2}g the row vector of partial derivatives, the gradient, and the Hessian of g respectively. We denote the third (directional) derivative of g along the directions \bm{x}_{1},\bm{x}_{2},\bm{x}_{3} by \nabla^{3}g[\bm{x}_{1},\bm{x}_{2},\bm{x}_{3}], and define its induced norm as \sup_{\lVert\bm{x}_{1}\rVert=\lVert\bm{x}_{2}\rVert=\lVert\bm{x}_{3}\rVert=1}\lVert\nabla^{3}g[\bm{x}_{1},\bm{x}_{2},\bm{x}_{3}]\rVert. For two functions f and g from \mathbb{R}^{n} to \mathbb{R}, we define the convolution of f and g as (f\star g)(\bm{x})=\int_{\mathbb{R}^{n}}f(\bm{x}-\bm{s})g(\bm{s})\mu(d\bm{s}), with \mu the Lebesgue measure on \mathbb{R}^{n}. We denote the set of k-times continuously differentiable functions on a domain \mathcal{D} by C^{k}(\mathcal{D}), where we omit the domain if it is clear from context.

2.2 Stochastic Chemical Reaction Networks

We consider Stochastic Chemical Reaction Networks parameterized by \Omega\in\mathcal{O}\subseteq\mathbb{R}_{>0}, composed of n species \mathrm{X}_{1},\mathrm{X}_{2},\dots,\mathrm{X}_{n}, interacting according to r reactions, each characterized by a nonzero reaction vector \bm{\xi}_{i}\in\mathbb{Z}^{n} and a microscopic propensity function \lambda_{i}(\bm{x}), which depends on \Omega. The reaction vector describes the change in the counts of the species when reaction i fires, and the microscopic propensity function describes the rate at which reaction i fires. Thus, for each \Omega\in\mathcal{O}, the Stochastic Chemical Reaction Network describes a CTMC \{\bm{X}_{\Omega}(t),t\geq 0\}, where \bm{X}_{\Omega}(t)=[X_{1}(t),X_{2}(t),\dots,X_{n}(t)]^{T}\in\mathcal{X}_{\Omega}\subseteq\mathbb{Z}_{\geq 0}^{n} represents the molecular counts of the species. The state space \mathcal{X}_{\Omega} can be chosen as any subset of \mathbb{Z}_{\geq 0}^{n} such that 1) \lambda_{i}(\bm{x})\geq 0 for all \bm{x}\in\mathcal{X}_{\Omega} and 2) for all \bm{x}\in\mathcal{X}_{\Omega} and all i\in\{1,\dots,r\} such that \bm{x}+\bm{\xi}_{i}\notin\mathcal{X}_{\Omega}, \lambda_{i}(\bm{x})=0. The infinitesimal transition rate from \bm{x} to \bm{x}^{\prime} is given by q_{\bm{x},\bm{x}^{\prime}}=\sum_{i:\bm{x}+\bm{\xi}_{i}=\bm{x}^{\prime}}\lambda_{i}(\bm{x}). We refer the interested reader to [5] for a more thorough background on SCRNs. Throughout this work we will need to impose the following assumption on our SCRNs:

Assumption 2.1.

The propensity functions have the form

λi(𝒙)=Ωλ¯i(𝒙Ω).\lambda_{i}(\bm{x})=\Omega\bar{\lambda}_{i}\left(\frac{\bm{x}}{\Omega}\right). (2)

Additionally, there exists \mathcal{X}=\prod_{j=1}^{n}I_{j}\subseteq\mathbb{R}_{\geq 0}^{n} where for each j, either I_{j}=[0,x_{j}^{max}] for some x_{j}^{max}\in\mathbb{Q}_{>0} or I_{j}=[0,\infty), such that:

  1. For all i\in\{1,\dots,r\}, \bar{\lambda}_{i}(\bm{v})>0 for all \bm{v}\in\mathcal{X} such that there exists \eta>0 satisfying \bm{v}+\eta\bm{\xi}_{i}\in\mathcal{X}.

  2. For all i\in\{1,\dots,r\}, \bar{\lambda}_{i}(\bm{v})=0 for all \bm{v}\in\mathcal{X} such that for all \eta>0, \bm{v}+\eta\bm{\xi}_{i}\notin\mathcal{X}.

Remark 2.1.

The functions λ¯i(𝒗)\bar{\lambda}_{i}(\bm{v}) are the macroscopic propensities of each reaction.

Under Assumption 2.1, we can define 𝒪={Ω>0|j,Ωxjmax>0}\mathcal{O}=\left\{\Omega\in\mathbb{R}_{>0}\middle|\forall j,\;\Omega x_{j}^{max}\in\mathbb{Z}_{>0}\right\}. Note that our assumption that xjmaxx_{j}^{max}\in\mathbb{Q} ensures that sup𝒪=+\sup\mathcal{O}=+\infty, i.e. our family of Markov chains includes those with arbitrarily large volume. Under Assumption 2.1 we will always choose to define the state space of our CTMC as 𝒳Ω=0nΩ𝒳\mathcal{X}_{\Omega}=\mathbb{Z}_{\geq 0}^{n}\cap\Omega\mathcal{X}. In order to apply our Stein’s Method technique, we will need the following assumption:

Assumption 2.2.

For all i{1,,r}i\in\{1,\dots,r\}, λ¯i(𝒗)\bar{\lambda}_{i}(\bm{v}) is twice continuously differentiable on some open set 𝒟\mathcal{D} containing 𝒳\mathcal{X}, and has bounded second derivatives on 𝒳\mathcal{X}.

One form of λi(𝒙)\lambda_{i}(\bm{x}) which satisfies Assumptions 2.1 and 2.2 is mass action kinetics [62], which means that the propensities are of the form

λi(𝒙)=kiΩ1jξri,jj=1nxjξri,j,\lambda_{i}(\bm{x})=k_{i}\Omega^{1-\sum_{j}\xi_{ri,j}}\prod_{j=1}^{n}x_{j}^{\xi_{ri,j}}, (3)

where 𝝃i=𝝃pi𝝃ri\bm{\xi}_{i}=\bm{\xi}_{pi}-\bm{\xi}_{ri} with 𝝃pi,𝝃ri0n\bm{\xi}_{pi},\bm{\xi}_{ri}\in\mathbb{Z}_{\geq 0}^{n} is the reaction vector of the chemical reaction

\sum_{j}\xi_{ri,j}\mathrm{X}_{j}\;\xrightarrow{\;k_{i}\;}\;\sum_{j}\xi_{pi,j}\mathrm{X}_{j}. (4)

The requirement that \sum_{j}\xi_{ri,j}\leq 2 (i.e., each reaction has at most two reactant molecules) must be imposed for \lambda_{i}(\bm{x}) to satisfy Assumption 2.2. See [27] for a discussion of the physical assumptions necessary to reach such propensities. We note that (3) assumes that no reactant appears twice on the left-hand side of the reaction; having a reactant \mathrm{X}_{j} appear twice on the left-hand side of a reaction creates a propensity proportional to x_{j}^{2} instead of the physically accurate x_{j}(x_{j}-1) [27]. Here k_{i} is the reaction rate constant of reaction i and \xi_{ri,j} is the jth element of \bm{\xi}_{ri}. Another form of \lambda_{i}(\bm{x}) that satisfies Assumptions 2.1 and 2.2 is

λi(𝒙)=Ωk1i(xji/Ω)pk2i+(xji/Ω)q,\lambda_{i}(\bm{x})=\Omega\frac{k^{i}_{1}(x_{j_{i}}/\Omega)^{p}}{k^{i}_{2}+(x_{j_{i}}/\Omega)^{q}}, (5)

where ji{1,,n}j_{i}\in\{1,\dots,n\} and p,q>0p,q\in\mathbb{Z}_{>0} such that p+q3p+q\leq 3 or p=qp=q. Such “Hill-type propensities” can arise through timescale separation in systems of chemical reactions with mass action kinetics [36, 47, 34]. For simplicity, we will impose the following assumption:

Assumption 2.3.

The stoichiometric subspace, span{𝝃1,,𝝃r}\mathrm{span}\{\bm{\xi}_{1},\dots,\bm{\xi}_{r}\}, has dimension nn, and for each Ω𝒪\Omega\in\mathcal{O}, 𝑿Ω(t)\bm{X}_{\Omega}(t) is irreducible.

Though Assumption 2.3 appears to rule out SCRNs that have conservation laws, i.e. a linear combination of species counts that remains unchanged by all reactions, we will show through an example in Section 5 that Assumption 2.3 does not fundamentally prevent the analysis of SCRNs with conservation laws, as long as a change of coordinates exists in which the system can be represented on a lower dimensional lattice \mathbb{Z}_{\geq 0}^{n^{\prime}} as an SCRN with stoichiometric subspace of dimension n^{\prime}. For f:\mathbb{R}^{n}\rightarrow\mathbb{R} the generator of \bm{X}_{\Omega}(t) is

G𝑿Ωf(𝒙)=i=1rλi(𝒙)(f(𝒙+𝝃i)f(𝒙)).G_{\bm{X}_{\Omega}}f(\bm{x})=\sum_{i=1}^{r}\lambda_{i}(\bm{x})\left(f(\bm{x}+\bm{\xi}_{i})-f(\bm{x})\right). (6)
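To make the generator concrete, the following minimal sketch (not part of the paper's development) evaluates (6) numerically; the two-reaction birth-death network and its rates are hypothetical illustrations.

```python
import numpy as np

def generator_apply(f, x, xi, lam):
    """Evaluate (G_X f)(x) = sum_i lambda_i(x) * (f(x + xi_i) - f(x)) as in (6)."""
    x = np.asarray(x)
    rates = lam(x)
    return sum(rates[i] * (f(x + xi[:, i]) - f(x)) for i in range(xi.shape[1]))

# Hypothetical birth-death network: 0 -> X1 at rate k1*Omega, X1 -> 0 at rate k2*x1.
Omega, k1, k2 = 100.0, 1.0, 0.5
xi = np.array([[1, -1]])                             # columns are reaction vectors
lam = lambda x: np.array([k1 * Omega, k2 * x[0]])    # microscopic propensities

# Applied to f(x) = x1, the generator returns the net drift k1*Omega - k2*x1.
print(generator_apply(lambda x: float(x[0]), [150], xi, lam))
```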

We will also consider two shifted and scaled versions of \bm{X}_{\Omega}(t), defined by

𝑿~Ω(t)=1Ω(𝑿Ω(t)Ω𝒗),\tilde{\bm{X}}_{\Omega}(t)=\frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}(t)-\Omega\bm{v}^{*}\right), (7)

and

𝑿~~Ω(t)=1Ω𝑿Ω(t)𝒗,\tilde{\tilde{\bm{X}}}_{\Omega}(t)=\frac{1}{\Omega}\bm{X}_{\Omega}(t)-\bm{v}^{*}, (8)

where 𝒗\bm{v}^{*} is a specified point in 𝒳\mathcal{X}. For f:nf:\mathbb{R}^{n}\rightarrow\mathbb{R} the generator of 𝑿~Ω(t)\tilde{\bm{X}}_{\Omega}(t) is given by

G𝑿~Ωf(𝒙)=i=1rλi(Ω𝒙+Ω𝒗)(f(𝒙+1Ω𝝃i)f(𝒙)),G_{\tilde{\bm{X}}_{\Omega}}f(\bm{x})=\sum_{i=1}^{r}\lambda_{i}(\sqrt{\Omega}\bm{x}+\Omega\bm{v}^{*})\left(f(\bm{x}+\frac{1}{\sqrt{\Omega}}\bm{\xi}_{i})-f(\bm{x})\right), (9)

and the generator of 𝑿~~Ω(t)\tilde{\tilde{\bm{X}}}_{\Omega}(t) is given by

G𝑿~~Ωf(𝒙)=i=1rΩλ¯i(𝒙+𝒗)(f(𝒙+1Ω𝝃i)f(𝒙)).G_{\tilde{\tilde{\bm{X}}}_{\Omega}}f(\bm{x})=\sum_{i=1}^{r}\Omega\bar{\lambda}_{i}(\bm{x}+\bm{v}^{*})\left(f(\bm{x}+\frac{1}{\Omega}\bm{\xi}_{i})-f(\bm{x})\right). (10)

When 𝑿Ω(t)\bm{X}_{\Omega}(t) has a unique stationary distribution π\pi, we denote by 𝑿Ω\bm{X}_{\Omega}^{\infty} a random variable having law π\pi. We likewise define 𝑿~Ω\tilde{\bm{X}}_{\Omega}^{\infty} and 𝑿~~Ω\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}.
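As an illustration of the CTMC whose stationary distribution is studied below, here is a minimal Doob-Gillespie simulation sketch (again not from the paper; the birth-death network and its parameters are hypothetical), with long-run trajectory samples used as a crude proxy for draws from \pi.

```python
import numpy as np

def gillespie(x0, xi, lam, t_end, seed=0):
    """Doob-Gillespie simulation of the CTMC with reaction vectors `xi`
    (one column per reaction) and microscopic propensities `lam(x)`."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, np.array(x0, dtype=int)
    times, states = [t], [x.copy()]
    while t < t_end:
        rates = lam(x)
        total = rates.sum()
        if total == 0.0:                       # absorbing state
            break
        t += rng.exponential(1.0 / total)      # holding time until the next reaction
        i = rng.choice(len(rates), p=rates / total)
        x = x + xi[:, i]
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)

# Hypothetical birth-death network: 0 -> X1 (rate k1*Omega), X1 -> 0 (rate k2*x1);
# its stationary distribution is Poisson with mean Omega*k1/k2.
Omega, k1, k2 = 100.0, 1.0, 0.5
xi = np.array([[1, -1]])
lam = lambda x: np.array([k1 * Omega, k2 * x[0]])
T, X = gillespie([0], xi, lam, t_end=500.0)
print(X[len(X) // 2:, 0].mean())   # roughly Omega*k1/k2 = 200
```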

2.3 Reaction Rate Equations

The Reaction Rate Equations (RREs) are an Ordinary Differential Equation Model which is used to approximate 𝑿Ω(t)/Ω\bm{X}_{\Omega}(t)/\Omega. Defining 𝝀¯(𝒗)=[λ¯1(𝒗),λ¯2(𝒗),,λ¯r(𝒗)]T\bar{\bm{\lambda}}(\bm{v})=\begin{bmatrix}\bar{\lambda}_{1}(\bm{v}),&\bar{\lambda}_{2}(\bm{v}),&\dots,&\bar{\lambda}_{r}(\bm{v})\end{bmatrix}^{T} and S=[𝝃1𝝃2𝝃r]S=\begin{bmatrix}\bm{\xi}_{1}&\bm{\xi}_{2}&\dots&\bm{\xi}_{r}\end{bmatrix}, the reaction rate equations are

ddt𝒗(t)\displaystyle\frac{d}{dt}\bm{v}(t) =F(𝒗(t)),\displaystyle=F(\bm{v}(t)), 𝒗(0)\displaystyle\bm{v}(0) =𝒗0\displaystyle=\bm{v}_{0} (11)

where F(\bm{v})=S\bar{\bm{\lambda}}(\bm{v}) and \bm{v}\in\mathcal{X}\subseteq\mathbb{R}^{n}_{\geq 0}. An equilibrium point of the RREs is any point \bm{v}^{*}\in\mathcal{X} satisfying 0=F(\bm{v}^{*}). We note that under Assumption 2.1, \mathcal{X} is positively invariant with respect to (11).
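For concreteness, the RREs (11) can be integrated numerically; the sketch below is illustrative only and uses the same hypothetical birth-death network as above, for which F(v) = k_1 - k_2 v_1.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical network: 0 -> X1 (macroscopic rate k1), X1 -> 0 (rate k2*v1),
# so S = [[1, -1]] and F(v) = S @ lambda_bar(v) = k1 - k2*v1.
k1, k2 = 1.0, 0.5
S = np.array([[1.0, -1.0]])
lambda_bar = lambda v: np.array([k1, k2 * v[0]])
F = lambda t, v: S @ lambda_bar(v)

sol = solve_ivp(F, (0.0, 30.0), y0=[0.0], rtol=1e-9, atol=1e-12)
print(sol.y[0, -1])   # approaches the equilibrium v* = k1/k2 = 2, where F(v*) = 0
```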

2.4 Linear Noise Approximation

The Linear Noise Approximation (LNA) is a diffusion approximation obtained by expanding the Chemical Master Equation using the ansatz 𝑿Ω(t)Ω𝒗(t)+Ω𝒀(t)\bm{X}_{\Omega}(t)\approx\Omega\bm{v}(t)+\sqrt{\Omega}\bm{Y}^{\prime}(t) [62]. The terms 𝒗(t)\bm{v}(t) and 𝒀(t)\bm{Y}^{\prime}(t) are given by

ddt𝒗(t)\displaystyle\frac{d}{dt}\bm{v}(t) =F(𝒗(t)),\displaystyle=F(\bm{v}(t)), 𝒗(0)\displaystyle\bm{v}(0) =𝒗0,\displaystyle=\bm{v}_{0}, (12a)
d𝒀(t)\displaystyle d\bm{Y}^{\prime}(t) =F(𝒗)𝒗𝒀(t)dt+Sdiag𝝀¯(𝒗)d𝑾(t),\displaystyle=\frac{\partial F(\bm{v})}{\partial\bm{v}}\bm{Y}^{\prime}(t)dt+S\operatorname{diag}\sqrt{\bar{\bm{\lambda}}(\bm{v})}d\bm{W}(t), 𝒀(0)\displaystyle\bm{Y}^{\prime}(0) =𝒀0,\displaystyle=\bm{Y}^{\prime}_{0}, (12b)

where evidently (12a) is the RRE (11). Here 𝑾(t)\bm{W}(t) is a unit covariance Wiener Process. Let 𝒗\bm{v}^{*} be an equilibrium point of (12a). We note that λ¯i(𝒗)0\bar{\lambda}_{i}(\bm{v})\geq 0 must hold for (12b) to make sense, which is guaranteed by e.g. Assumption 2.1. We denote by 𝒀(t)\bm{Y}(t) the solution to (12b) with 𝒗(t)=𝒗\bm{v}(t)=\bm{v}^{*} and 𝒀(0)=𝒀0\bm{Y}^{\prime}(0)=\bm{Y}_{0}. The generator of 𝒀(t)\bm{Y}(t) is

G𝒀f(𝒙)=f(𝒙)𝒙F(𝒗)𝒗𝒙+12i=1,j=1nDij2f(𝒙)xixj,G_{\bm{Y}}f(\bm{x})=\frac{\partial f(\bm{x})}{\partial\bm{x}}\cdot\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+\frac{1}{2}\sum_{i=1,j=1}^{n}D_{ij}\frac{\partial^{2}f(\bm{x})}{\partial x_{i}\partial x_{j}}, (13)

where D=Sdiag(𝝀¯(𝒗))STD=S\operatorname{diag}\left(\bar{\bm{\lambda}}(\bm{v}^{*})\right)S^{T}. We remark that 𝒀(t)\bm{Y}(t) is an Ornstein–Uhlenbeck process, and thus 𝒀(t)\bm{Y}(t) is Gaussian as long as 𝒀0\bm{Y}_{0} is. The stationary distribution of 𝒀(t)\bm{Y}(t) is a zero mean Gaussian with covariance matrix Pn×nP^{\infty}\in\mathbb{R}^{n\times n} given as the solution to the Lyapunov Equation

F(𝒗)𝒗P+PF(𝒗)𝒗T=Sdiag𝝀¯(𝒗)ST,\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}P^{\infty}+P^{\infty}\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}^{T}=-S\operatorname{diag}\bar{\bm{\lambda}}(\bm{v}^{*})S^{T}, (14)

which has a unique positive definite solution as long as F(𝒗)𝒗\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}} is Hurwitz. In this case, we denote by 𝒀\bm{Y}^{\infty} a random variable distributed according to 𝒩(0,P)\mathcal{N}(0,P^{\infty}).
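The stationary covariance P^{\infty} in (14) can be computed directly with a standard Lyapunov solver; the sketch below (hypothetical rates, same birth-death example as above) checks the result against the known Poisson stationary variance of that network.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical birth-death network: lambda_bar(v) = (k1, k2*v1), S = [[1, -1]],
# equilibrium v* = k1/k2, Jacobian A = dF/dv(v*) = [[-k2]].
k1, k2 = 1.0, 0.5
S = np.array([[1.0, -1.0]])
v_star = k1 / k2
A = np.array([[-k2]])
D = S @ np.diag([k1, k2 * v_star]) @ S.T          # S diag(lambda_bar(v*)) S^T

# Solve A P + P A^T = -D, i.e. equation (14).
P_inf = solve_continuous_lyapunov(A, -D)
print(P_inf)   # [[2.0]] = k1/k2, matching the Poisson stationary variance divided by Omega
```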

2.5 1-Wasserstein Distance

In this work we measure the distance between probability distributions using the 1-Wasserstein distance [25]. By a slight abuse of notation, we define the 1-Wasserstein distance between two random variables as the 1-Wasserstein distance between the laws of the random variables, formally:

Definition 2.1.

Given two random variables 𝒁1\bm{Z}_{1} and 𝒁2\bm{Z}_{2}, both taking values in n\mathbb{R}^{n}, we define the 1-Wasserstein distance between 𝒁1\bm{Z}_{1} and 𝒁2\bm{Z}_{2} as

𝒲1(𝒁1,𝒁2)=suphLip(1)|𝔼[h(𝒁1)]𝔼[h(𝒁2)]|,\mathcal{W}_{1}\left(\bm{Z}_{1},\bm{Z}_{2}\right)=\sup_{h\in\mathrm{Lip}(1)}\left\lvert\mathbb{E}\left[h\left(\bm{Z}_{1}\right)\right]-\mathbb{E}\left[h\left(\bm{Z}_{2}\right)\right]\right\rvert, (15)

where Lip(1)={h:n|𝒙,𝒙n,h(𝒙)h(𝒙)𝒙𝒙}\mathrm{Lip}(1)=\left\{h:\mathbb{R}^{n}\rightarrow\mathbb{R}\middle|\forall\bm{x},\bm{x}^{\prime}\in\mathbb{R}^{n},\;\left\lVert h(\bm{x})-h(\bm{x}^{\prime})\right\rVert\leq\left\lVert\bm{x}-\bm{x}^{\prime}\right\rVert\right\}.

We remark that this definition of 1-Wasserstein distance uses the dual formulation [63], and is a slight abuse of notation since the 1-Wasserstein distance is usually defined between two probability distributions. Our definition is equivalent to taking the 1-Wasserstein distance between the laws of 𝒁1\bm{Z}_{1} and 𝒁2\bm{Z}_{2}.
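In one dimension the 1-Wasserstein distance of Definition 2.1 is easy to estimate from samples; the illustrative sketch below (hypothetical birth-death network again, whose stationary law is Poisson(\Omega k_1/k_2)) compares the centered and scaled stationary samples, as in (7), with the Gaussian suggested by the LNA.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
k1, k2, n_samples = 1.0, 0.5, 200_000
for Omega in (10.0, 100.0, 1000.0):
    mean = Omega * k1 / k2                              # stationary mean of X_Omega
    x_tilde = (rng.poisson(mean, n_samples) - mean) / np.sqrt(Omega)
    y = rng.normal(0.0, np.sqrt(k1 / k2), n_samples)    # LNA: N(0, P_inf) with P_inf = k1/k2
    print(Omega, wasserstein_distance(x_tilde, y))      # decreases as Omega grows
```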

3 Approximating Stationary Distributions of SCRNs

Here we state the main results of this work. To begin, we present the technical conditions that will guarantee that the stationary distribution of 𝑿~Ω(t)\tilde{\bm{X}}_{\Omega}(t) converges to 𝒩(0,P)\mathcal{N}(0,P^{\infty}). We then give our main result, showing that when the first and second moments of 𝑿~Ω\tilde{\bm{X}}_{\Omega}^{\infty} are controlled, we can bound the 1-Wasserstein distance between 𝑿~Ω\tilde{\bm{X}}_{\Omega}^{\infty} and 𝒀\bm{Y}^{\infty}. We next show that this result can be used to analyze the RRE approximation as well. Finally, we give a method to check that the second moment of 𝑿~Ω\tilde{\bm{X}}_{\Omega}^{\infty} is controlled by relating the global stability properties of the RRE to the required moment bound.

Condition 3.1.

There exists p(𝒙)p(\bm{x}), a polynomial of degree κ\kappa, such that for all ii and all 𝒙𝒳\bm{x}\in\mathcal{X}, λ¯i(𝒙)p(𝒙)\bar{\lambda}_{i}(\bm{x})\leq p(\bm{x}). Furthermore, there exists Ω0\Omega_{0} such that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}\right\}, 𝑿~Ω(t)\tilde{\bm{X}}_{\Omega}(t) has a unique stationary distribution π~\tilde{\pi}, which satisfies

𝔼𝑿~Ωπ~[𝑿~Ωκ+1]<.\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert^{\kappa+1}\right]<\infty. (16)

Condition 3.1 does not require any type of uniformity in Ω\Omega, and therefore the condition 𝔼[𝑿Ωκ+1]<\mathbb{E}\left[\lVert\bm{X}_{\Omega}^{\infty}\rVert^{\kappa+1}\right]<\infty can be checked instead.

Condition 3.2 (Uniform moment bounds).

There exist constants M1M_{1}, M2M_{2}, and Ω0\Omega_{0} such that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}\right\}, 𝑿~Ω(t)\tilde{\bm{X}}_{\Omega}(t) has a unique stationary distribution π~\tilde{\pi}, which satisfies

𝔼𝑿~Ωπ~[𝑿~Ω]M1,\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert\right]\leq M_{1}, (17)

and

𝔼𝑿~Ωπ~[𝑿~Ω2]M2.\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert^{2}\right]\leq M_{2}. (18)
Remark 3.1.

In Condition 3.2, the bound on 𝔼𝑿~Ωπ~[𝑿~Ω2]\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert^{2}\right] implies the existence of a bound on 𝔼𝑿~Ωπ~[𝑿~Ω]\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert\right]. This can be seen from the fact that by Jensen’s inequality, 𝔼𝑿~Ωπ~[𝑿~Ω]2𝔼𝑿~Ωπ~[𝑿~Ω2]\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert\right]^{2}\leq\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert^{2}\right], and thus 𝔼𝑿~Ωπ~[𝑿~Ω2]M2\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert^{2}\right]\leq M_{2} implies that 𝔼𝑿~Ωπ~[𝑿~Ω]M2\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert\right]\leq\sqrt{M_{2}}. However, we state Condition 3.2 as it is to allow for the possibility of using a stronger bound on 𝔼𝑿~Ωπ~[𝑿~Ω]\mathbb{E}_{\tilde{\bm{X}}_{\Omega}^{\infty}\sim\tilde{\pi}}\left[\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert\right] than the one derived from M2M_{2}.

Remark 3.2.

Recalling that 𝑿~~Ω=1Ω(XΩΩ𝒗)\tilde{\tilde{\bm{X}}}_{\Omega}=\frac{1}{\Omega}\left(X_{\Omega}-\Omega\bm{v}^{*}\right) is the “concentration scaled” Markov chain shifted to 𝒗\bm{v}^{*}, Condition 3.2 is equivalent to the existence of Ω0\Omega_{0}, M1M_{1}, and M2M_{2} such that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}\right\}, there exists a unique stationary distribution of 𝑿~~Ω(t)\tilde{\tilde{\bm{X}}}_{\Omega}(t), π~~\tilde{\tilde{\pi}}, and it satisfies

𝔼𝑿~~Ωπ~~[𝑿~~Ω]1ΩM1,\mathbb{E}_{\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}\sim\tilde{\tilde{\pi}}}\left[\lVert\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}\rVert\right]\leq\frac{1}{\sqrt{\Omega}}M_{1}, (19)

and

𝔼𝑿~~Ωπ~~[𝑿~~Ω2]1ΩM2.\mathbb{E}_{\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}\sim\tilde{\tilde{\pi}}}\left[\lVert\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}\rVert^{2}\right]\leq\frac{1}{\Omega}M_{2}. (20)

We are now ready to state our main result.

Theorem 3.1.

Consider an SCRN 𝐗Ω(t)\bm{X}_{\Omega}(t) satisfying Assumptions 2.1, 2.2, and 2.3 and let 𝐯int(𝒳)\bm{v}^{*}\in\mathrm{int}(\mathcal{X}) be an equilibrium point of (12a) such that F𝐯(𝐯)\frac{\partial F}{\partial\bm{v}}(\bm{v}^{*}) is Hurwitz. If Conditions 3.1 and 3.2 hold, then there exist CC and Ω0\Omega_{0}^{\prime} such that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}^{\prime}\right\},

𝒲1(𝑿~Ω,𝒀)ClnΩΩ.\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)\leq C\frac{\ln\Omega}{\sqrt{\Omega}}. (21)
Proof.

See Section 4.1. ∎

Remark 3.3.

The conclusion of Theorem 3.1 implies that

limΩ𝒲1(𝑿~Ω,𝒀)=0,\lim_{\Omega\rightarrow\infty}\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)=0, (22)

from which we can conclude convergence in distribution of 𝑿~Ω\tilde{\bm{X}}_{\Omega}^{\infty} to 𝒀\bm{Y}^{\infty} [63].

Remark 3.4.

The constants CC and Ω0\Omega_{0}^{\prime} in Theorem 3.1 depend on the propensities and stoichiometry vectors, as well as M1M_{1}, M2M_{2}, and Ω0\Omega_{0} from Condition 3.2. We note that an explicit computation of CC and Ω0\Omega_{0}^{\prime} is possible, as will be seen in the proof, where we give an explicit upper bound that is dominated by a term proportional to lnΩΩ\frac{\ln\Omega}{\sqrt{\Omega}}.

Though our Stein’s Method approach does not require the preliminary step of establishing a law of large numbers type result relating 𝑿~~Ω\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty} to 𝒗\bm{v}^{*}, an equilibrium point of (12a), we can conclude such a result whenever the assumptions of Theorem 3.1 are satisfied.

Corollary 3.1.

Consider an SCRN 𝐗Ω(t)\bm{X}_{\Omega}(t) satisfying Assumptions 2.1, 2.2, and 2.3 and let 𝐯int(𝒳)\bm{v}^{*}\in\mathrm{int}(\mathcal{X}) be an equilibrium point of (12a) such that F𝐯(𝐯)\frac{\partial F}{\partial\bm{v}}(\bm{v}^{*}) is Hurwitz. If Conditions 3.1 and 3.2 hold, then there exist C~\tilde{C} and Ω~0\tilde{\Omega}_{0}^{\prime} such that for all Ω{Ω𝒪|ΩΩ~0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\tilde{\Omega}_{0}^{\prime}\right\},

𝒲1(𝑿~~Ω,0)C~1Ω.\mathcal{W}_{1}\left(\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty},0\right)\leq\tilde{C}\frac{1}{\sqrt{\Omega}}. (23)
Proof.

From the triangle inequality, we have that

𝒲1(𝑿~~Ω,0)𝒲1(𝑿~~Ω,1Ω𝒀)+𝒲1(1Ω𝒀,0).\mathcal{W}_{1}\left(\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty},0\right)\leq\mathcal{W}_{1}\left(\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty},\frac{1}{\sqrt{\Omega}}\bm{Y}^{\infty}\right)+\mathcal{W}_{1}\left(\frac{1}{\sqrt{\Omega}}\bm{Y}^{\infty},0\right). (24)

To bound the right hand side, notice that 𝑿~~Ω=1Ω𝑿~Ω\tilde{\tilde{\bm{X}}}^{\infty}_{\Omega}=\frac{1}{\sqrt{\Omega}}\tilde{\bm{X}}^{\infty}_{\Omega}. Thus,

𝒲1(𝑿~~Ω,1Ω𝒀)=1Ω𝒲1(𝑿~Ω,𝒀).\mathcal{W}_{1}\left(\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty},\frac{1}{\sqrt{\Omega}}\bm{Y}^{\infty}\right)=\frac{1}{\sqrt{\Omega}}\mathcal{W}_{1}\left(\tilde{\bm{X}}^{\infty}_{\Omega},\bm{Y}^{\infty}\right). (25)

Under the assumptions of the corollary, we can apply Theorem 3.1 to conclude that there exist CC and Ω0\Omega_{0}^{\prime} such that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}^{\prime}\right\},

𝒲1(𝑿~Ω,𝒀)Cln(Ω)Ω,\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)\leq C\frac{\ln(\Omega)}{\sqrt{\Omega}}, (26)

and thus from (25), we have that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}^{\prime}\right\},

𝒲1(𝑿~~Ω,1Ω𝒀)ClnΩΩ.\mathcal{W}_{1}\left(\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty},\frac{1}{\sqrt{\Omega}}\bm{Y}^{\infty}\right)\leq C\frac{\ln\Omega}{\Omega}. (27)

Observing that 𝒲1(1Ω𝒀,0)=1Ω𝒲1(𝒀,0)\mathcal{W}_{1}\left(\frac{1}{\sqrt{\Omega}}\bm{Y}^{\infty},0\right)=\frac{1}{\sqrt{\Omega}}\mathcal{W}_{1}\left(\bm{Y}^{\infty},0\right), we have from (24) and (27) that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}^{\prime}\right\},

𝒲1(𝑿~~Ω,0)\displaystyle\mathcal{W}_{1}\left(\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty},0\right) ClnΩΩ+1Ω𝒲1(𝒀,0),\displaystyle\leq C\frac{\ln\Omega}{\Omega}+\frac{1}{\sqrt{\Omega}}\mathcal{W}_{1}\left(\bm{Y}^{\infty},0\right), (28)
(C+𝒲1(𝒀,0))1Ω,\displaystyle\leq\left(C+\mathcal{W}_{1}\left(\bm{Y}^{\infty},0\right)\right)\frac{1}{\sqrt{\Omega}}, (29)

which completes the proof. ∎

A special case where Conditions 3.1 and 3.2 are easy to verify is an SCRN with mass action kinetics and only zeroth and first order reactions. This idea is formalized in Corollary 3.2.

Corollary 3.2.

Consider an SCRN 𝐗Ω(t)\bm{X}_{\Omega}(t) satisfying Assumption 2.3 where for all ii, λ¯i(𝐱)\bar{\lambda}_{i}(\bm{x}) is either of the form λ¯i(𝐱)=ki\bar{\lambda}_{i}(\bm{x})=k_{i} or λ¯i(𝐱)=kixji\bar{\lambda}_{i}(\bm{x})=k_{i}x_{j_{i}} for some ki>0k_{i}>0 and ji{1,,n}j_{i}\in\{1,\dots,n\}. Let 𝐯int(𝒳)\bm{v}^{*}\in\mathrm{int}(\mathcal{X}) be an equilibrium point of (12a) such that F(𝐯)𝐯\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}} is Hurwitz. Then, there exist CC and Ω0\Omega_{0}^{\prime} such that for all ΩΩ0\Omega\geq\Omega_{0}^{\prime},

𝒲1(𝑿~Ω,𝒀)ClnΩΩ.\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)\leq C\frac{\ln\Omega}{\sqrt{\Omega}}. (30)
Proof.

The proof relies on the fact that for SCRNs with only zeroth and first order reactions, the LNA gives exactly the first and second moments of 𝑿Ω\bm{X}_{\Omega}^{\infty}. Specifically, the fact that F(𝒗)𝒗\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}} is Hurwitz implies that we can apply Proposition 7 in [32] to conclude that all moments of 𝑿Ω\bm{X}_{\Omega}^{\infty} are finite, and thus Condition 3.1 holds. Additionally, since all of the propensities are affine, a computation of the first and second moment dynamics of 𝑿Ω(t)\bm{X}_{\Omega}(t) reveals that for all time [43], 𝔼[𝑿~~Ω(t)]=0\mathbb{E}\left[\tilde{\tilde{\bm{X}}}_{\Omega}(t)\right]=0 and 𝔼[𝑿~~Ω(t)𝑿~~ΩT(t)]=1Ω𝔼[𝒀(t)𝒀T(t)]\mathbb{E}\left[\tilde{\tilde{\bm{X}}}_{\Omega}(t)\tilde{\tilde{\bm{X}}}_{\Omega}^{T}(t)\right]=\frac{1}{\Omega}\mathbb{E}\left[\bm{Y}(t)\bm{Y}^{T}(t)\right]. Therefore, by Remarks 3.1 and 3.2, Condition 3.2 is satisfied. The result then follows from Theorem 3.1. ∎
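As a quick numerical illustration of the moment matching used in this proof (hypothetical example, not from the paper): for the birth-death network 0 \rightarrow X (rate k_1\Omega) and X \rightarrow 0 (rate k_2 x), the stationary law is Poisson(\Omega k_1/k_2), so the centered and scaled chain has mean 0 and second moment k_1/k_2 = P^{\infty} for every \Omega, consistent with Condition 3.2.

```python
import numpy as np

rng = np.random.default_rng(2)
k1, k2 = 1.0, 0.5
for Omega in (10.0, 100.0, 1000.0):
    mean = Omega * k1 / k2
    samples = rng.poisson(mean, 500_000)
    x_tilde = (samples - mean) / np.sqrt(Omega)
    # First and second moments of X_tilde: approximately 0 and k1/k2, independent of Omega,
    # so Condition 3.2 holds with M1 = sqrt(k1/k2) and M2 = k1/k2.
    print(Omega, x_tilde.mean(), (x_tilde ** 2).mean())
```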

We now give a result providing sufficient conditions on the RREs for the conclusion of Theorem 3.1 to hold. This result is substantially easier to use in practice than Theorem 3.1, since it requires only analyzing an ODE with n variables, and not the stationary distribution of a CTMC over \mathcal{X}_{\Omega}.

Theorem 3.2.

Consider an SCRN 𝐗Ω(t)\bm{X}_{\Omega}(t) satisfying Assumptions 2.1, 2.2, and 2.3, and let 𝐯int(𝒳)\bm{v}^{*}\in\mathrm{int}(\mathcal{X}). Suppose that there exist K,γ1>0K,\gamma_{1}>0 such that for all 𝐯0𝒳\bm{v}_{0}\in\mathcal{X},

𝒗(t)𝒗Keγ1t𝒗0𝒗,\left\lVert\bm{v}(t)-\bm{v}^{*}\right\rVert\leq Ke^{-\gamma_{1}t}\left\lVert\bm{v}_{0}-\bm{v}^{*}\right\rVert, (31)

where 𝐯(t)\bm{v}(t) is the solution to (12a), and that there exists 𝐜>0n\bm{c}\in\mathbb{R}^{n}_{>0} and d,γ2>0d,\gamma_{2}>0 such that for all 𝐯{𝐯𝒳|𝐜T𝐯d}\bm{v}\in\left\{\bm{v}\in\mathcal{X}\middle|\bm{c}^{T}\bm{v}\geq d\right\},

𝒄TF(𝒗)γ2𝒄T𝒗.\bm{c}^{T}F(\bm{v})\leq-\gamma_{2}\bm{c}^{T}\bm{v}. (32)

Then, there exist CC and Ω0\Omega_{0}^{\prime} such that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}^{\prime}\right\},

𝒲1(𝑿~Ω,𝒀)ClnΩΩ.\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)\leq C\frac{\ln\Omega}{\sqrt{\Omega}}. (33)
Proof.

See Section 4.4. ∎

4 Proofs of the Main Results

In this section we provide the proofs of the results presented in Section 3.

4.1 Proof of Theorem 3.1

We require the following Lemma from [29], which gives sufficient conditions for the Basic Adjoint Relationship (BAR) to hold.

Lemma 4.1 (Proposition 3 in [29]).

Consider a CTMC over state space 𝒮\mathcal{S} with rate matrix having entries G(𝐱,𝐱)G(\bm{x},\bm{x}^{\prime}) and stationary distribution π(𝐱)\pi(\bm{x}). If

𝒙𝒮π(𝒙)|G(𝒙,𝒙)||f(𝒙)|<,\sum_{\bm{x}\in\mathcal{S}}\pi(\bm{x})\left\lvert G(\bm{x},\bm{x})\right\rvert\left\lvert f(\bm{x})\right\rvert<\infty, (34)

then

𝔼𝑿π[Gf(𝑿)]=0.\mathbb{E}_{\bm{X}^{\infty}\sim\pi}\left[Gf(\bm{X}^{\infty})\right]=0. (35)

It will be convenient for us to consider the 1-Wasserstein distance computed by taking a supremum over not the 1-Lipschitz functions, but the 1-Lipschitz functions that are additionally in C3C^{3} with bounded derivatives. To this end, let

={hC3(n)|hLip(1),k{1,2,3},sup𝒙nkh(𝒙)<}.\mathcal{H}=\{h\in C^{3}(\mathbb{R}^{n})|h\in\mathrm{Lip}(1),\;\forall k\in\{1,2,3\},\;\sup_{\bm{x}\in\mathbb{R}^{n}}\lVert\nabla^{k}h(\bm{x})\rVert<\infty\}. (36)

We have the following Lemma, which is proved in Section 4.3.

Lemma 4.2.

The following holds for any two probability distributions ν\nu and ρ\rho over n\mathbb{R}^{n}:

suphLip(1)|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|=suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|.\sup_{h\in\mathrm{Lip}(1)}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert=\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert. (37)

We now present the derivative bounds that we will need. The proof of Lemma 4.3 uses results from [30], and is provided in Section 4.2.

Lemma 4.3 (Derivative Bounds).

Consider the Stein Equation

G𝒀fh(𝒙)=h(𝒙)𝔼[h(𝒀)],G_{\bm{Y}}f_{h}(\bm{x})=h(\bm{x})-\mathbb{E}\left[h(\bm{Y}^{\infty})\right], (38)

for hh\in\mathcal{H}. Assume that there exists a symmetric matrix H>0H>0 and a real number ϕ>0\phi>0 such that

HF(𝒗)𝒗+(F(𝒗)𝒗)TH2ϕH.H\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}+\left(\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}\right)^{T}H\leq-2\phi H. (39)

Assume also that Sdiag𝛌¯(𝐯)S\operatorname{diag}\sqrt{\bar{\bm{\lambda}}(\bm{v}^{*})} has a right inverse. Then, there exists a solution fhf_{h} to the Stein Equation, which satisfies

𝒙n,fh(𝒙)C1,\forall\bm{x}\in\mathbb{R}^{n},\;\left\lVert\nabla f_{h}(\bm{x})\right\rVert\leq C_{1}, (40)
𝒙n,2fh(𝒙)C2,\forall\bm{x}\in\mathbb{R}^{n},\;\left\lVert\nabla^{2}f_{h}(\bm{x})\right\rVert\leq C_{2}, (41)

and for all 0<ζ<10<\zeta<1, there exists C3(ζ)C_{3}(\zeta) such that

𝒙,𝒙n,2fh(𝒙)2fh(𝒙)C3(ζ)𝒙𝒙1ζ,\forall\bm{x},\bm{x}^{\prime}\in\mathbb{R}^{n},\;\left\lVert\nabla^{2}f_{h}(\bm{x})-\nabla^{2}f_{h}(\bm{x}^{\prime})\right\rVert\leq C_{3}(\zeta)\lVert\bm{x}-\bm{x}^{\prime}\rVert^{1-\zeta}, (42)

where C1C_{1}, C2C_{2}, and C3C_{3} are given by

C1\displaystyle C_{1} =0eAt𝑑t,\displaystyle=\int_{0}^{\infty}\left\lVert e^{At}\right\rVert dt, (43)
C2\displaystyle C_{2} =Σ1cond(H)(2+eϕC1),\displaystyle=\left\lVert\Sigma^{-1}\right\rVert\mathrm{cond}(H)\left(2+e^{-\phi}C_{1}\right), (44)
C3\displaystyle C_{3} =2C3ζ+eϕC3C1,\displaystyle=\frac{2C_{3}^{\prime}}{\zeta}+e^{-\phi}C_{3}^{\prime}C_{1}, (45)

where A=F(𝐯)𝐯A=\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}, Σ1\Sigma^{-1} is the right inverse of Sdiag𝛌¯(𝐯)S\operatorname{diag}\sqrt{\bar{\bm{\lambda}}(\bm{v}^{*})}, and

C3=2max{Σ1,Σ12}cond(H)52.C_{3}^{\prime}=2\max\{\left\lVert\Sigma^{-1}\right\rVert,\left\lVert\Sigma^{-1}\right\rVert^{2}\}\mathrm{cond}(H)^{\tfrac{5}{2}}. (46)
Proof.

See Section 4.2. ∎

Remark 4.1.

The assumption that there exists a symmetric matrix H>0H>0 and a real number ϕ>0\phi>0 such that

HF(𝒗)𝒗+(F(𝒗)𝒗)TH2ϕHH\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}+\left(\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}\right)^{T}H\leq-2\phi H (47)

is equivalent to the assumption that F(𝒗)𝒗\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}} is Hurwitz.
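One direction of this equivalence is constructive, as the following sketch illustrates (the matrix A here is hypothetical): for Hurwitz A, a valid H can be obtained by solving A^{T}H+HA=-I, and the largest admissible \phi for that H is 1/(2\lambda_{\max}(H)).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-2.0, 1.0],
              [0.0, -1.0]])                          # hypothetical Hurwitz Jacobian
assert np.all(np.linalg.eigvals(A).real < 0)         # Hurwitz check

H = solve_continuous_lyapunov(A.T, -np.eye(2))       # A^T H + H A = -I, with H > 0
phi = 1.0 / (2.0 * np.linalg.eigvalsh(H).max())      # then H A + A^T H = -I <= -2*phi*H
print(np.linalg.eigvalsh(H), phi)
```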

Our goal is to analyze

𝒲1(𝑿~Ω,𝒀)=suphLip(1)|𝔼[h(𝑿~Ω)]𝔼[h(𝒀)]|.\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)=\sup_{h\in\mathrm{Lip}(1)}\left\lvert\mathbb{E}\left[h\left(\tilde{\bm{X}}_{\Omega}^{\infty}\right)\right]-\mathbb{E}\left[h\left(\bm{Y}^{\infty}\right)\right]\right\rvert. (48)

We begin by observing that per Lemma 4.2 we can obtain 𝒲1(𝑿~Ω,𝒀)\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right) by taking the supremum over hh\in\mathcal{H} instead of hLip(1)h\in\mathrm{Lip}(1), and thus we can analyze

𝒲1(𝑿~Ω,𝒀)=suph|𝔼[h(𝑿~Ω)]𝔼[h(𝒀)]|.\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)=\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}\left[h\left(\tilde{\bm{X}}_{\Omega}^{\infty}\right)\right]-\mathbb{E}\left[h\left(\bm{Y}^{\infty}\right)\right]\right\rvert. (49)

To do this we use Stein’s Method following [11]. The Stein Equation is

G𝒀fh(𝒙)=h(𝒙)𝔼[h(𝒀)],G_{\bm{Y}}f_{h}(\bm{x})=h(\bm{x})-\mathbb{E}\left[h(\bm{Y}^{\infty})\right], (50)

which in our case is the Poisson equation

fh(𝒙)𝒙F(𝒗)𝒗𝒙+12i=1,j=1nDij2fh(𝒙)xixj=h(𝒙)𝔼[h(𝒀)],\frac{\partial f_{h}(\bm{x})}{\partial\bm{x}}\cdot\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+\frac{1}{2}\sum_{i=1,j=1}^{n}D_{ij}\frac{\partial^{2}f_{h}(\bm{x})}{\partial x_{i}\partial x_{j}}=h(\bm{x})-\mathbb{E}\left[h(\bm{Y}^{\infty})\right], (51)

which is guaranteed to have a solution by Lemma 4.3. Throughout the rest of this proof, we will denote by fhf_{h} a solution to (51) with hh\in\mathcal{H} that satisfies the bounds given by Lemma 4.3. We take the expectation of the Stein Equation with respect to 𝑿~Ω\tilde{\bm{X}}_{\Omega}^{\infty} to find

𝔼[G𝒀fh(𝑿~Ω)]=𝔼[h(𝑿~Ω)]𝔼[h(𝒀)].\mathbb{E}\left[G_{\bm{Y}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})\right]=\mathbb{E}\left[h(\tilde{\bm{X}}_{\Omega}^{\infty})\right]-\mathbb{E}\left[h(\bm{Y}^{\infty})\right]. (52)

Before proceeding, we show that 𝔼[G𝑿~Ωfh(𝑿~Ω)]=0\mathbb{E}\left[G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})\right]=0 for all hh\in\mathcal{H}. We have that G𝑿~Ω(𝒙,𝒙)=i=1rλi(Ω𝒙+Ω𝒗)G_{\tilde{\bm{X}}_{\Omega}}(\bm{x},\bm{x})=-\sum_{i=1}^{r}\lambda_{i}(\sqrt{\Omega}\bm{x}+\Omega\bm{v}^{*}), and so by Condition 3.1, there exists K>0K>0 such that

𝒙[𝑿~Ω=𝒙]|G𝑿~Ω(𝒙,𝒙)fh(𝒙)|𝒙[𝑿~Ω=𝒙](K+Ω𝒙+Ω𝒗κ)|fh(𝒙)|,\sum_{\bm{x}}\mathbb{P}[\tilde{\bm{X}}_{\Omega}^{\infty}=\bm{x}]\lvert G_{\tilde{\bm{X}}_{\Omega}}(\bm{x},\bm{x})f_{h}(\bm{x})\rvert\leq\sum_{\bm{x}}\mathbb{P}[\tilde{\bm{X}}_{\Omega}^{\infty}=\bm{x}](K+\lVert\sqrt{\Omega}\bm{x}+\Omega\bm{v}^{*}\rVert^{\kappa})\lvert f_{h}(\bm{x})\rvert, (53)

where the sums are taken over 𝒙\bm{x} such that Ω𝒙+Ω𝒗𝒳Ω\sqrt{\Omega}\bm{x}+\Omega\bm{v}^{*}\in\mathcal{X}_{\Omega}. By Lemma 4.3, fhf_{h} is Lipschitz, and so |fh(𝒙)||fh(0)|+𝒙\lvert f_{h}(\bm{x})\rvert\leq\lvert f_{h}(0)\rvert+\lVert\bm{x}\rVert. Therefore,

𝒙[𝑿~Ω=𝒙]|G𝑿~Ω(𝒙,𝒙)fh(𝒙)|\displaystyle\sum_{\bm{x}}\mathbb{P}[\tilde{\bm{X}}_{\Omega}^{\infty}=\bm{x}]\lvert G_{\tilde{\bm{X}}_{\Omega}}(\bm{x},\bm{x})f_{h}(\bm{x})\rvert 𝒙[𝑿~Ω=𝒙](K+Ω𝒙+Ω𝒗κ)(|fh(0)|+𝒙),\displaystyle\leq\sum_{\bm{x}}\mathbb{P}[\tilde{\bm{X}}_{\Omega}^{\infty}=\bm{x}](K+\lVert\sqrt{\Omega}\bm{x}+\Omega\bm{v}^{*}\rVert^{\kappa})\left(\lvert f_{h}(0)\rvert+\lVert\bm{x}\rVert\right), (54)
𝒙[𝑿~Ω=𝒙]((K|fh(0)|+K𝒙+|fh(0)|Ω𝒙+Ω𝒗κ+max{1,(Ω𝒙+Ω𝒗)κ}𝒙),\displaystyle\begin{split}&\leq\sum_{\bm{x}}\mathbb{P}[\tilde{\bm{X}}_{\Omega}^{\infty}=\bm{x}]\left((K\lvert f_{h}(0)\rvert+K\lVert\bm{x}\rVert\vphantom{\max\left\{1,\left(\lVert\sqrt{\Omega}\bm{x}\rVert+\lVert\Omega\bm{v}^{*}\rVert\right)^{\kappa}\right\}}\right.\\ &\left.\quad\quad\quad+\lvert f_{h}(0)\rvert\lVert\sqrt{\Omega}\bm{x}+\Omega\bm{v}^{*}\rVert^{\kappa}\right.\\ &\quad\quad\quad\left.+\max\left\{1,\left(\lVert\sqrt{\Omega}\bm{x}\rVert+\lVert\Omega\bm{v}^{*}\rVert\right)^{\kappa}\right\}\lVert\bm{x}\rVert\right),\end{split} (55)

which is finite due to Condition 3.1. Therefore, from Lemma 4.1, 𝔼[G𝑿~Ωfh(𝑿~Ω)]=0\mathbb{E}\left[G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})\right]=0. Thus, for all hh\in\mathcal{H}, (52) implies that

|𝔼[h(𝑿~Ω)]𝔼[h(𝒀)]|\displaystyle\left\lvert\mathbb{E}\left[h\left(\tilde{\bm{X}}_{\Omega}^{\infty}\right)\right]-\mathbb{E}\left[h\left(\bm{Y}^{\infty}\right)\right]\right\rvert =|𝔼[G𝒀fh(𝑿~Ω)]𝔼[G𝑿~Ωfh(𝑿~Ω)]|,\displaystyle=\left\lvert\mathbb{E}\left[G_{\bm{Y}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})\right]-\mathbb{E}\left[G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})\right]\right\rvert, (56)
=|𝔼[G𝒀fh(𝑿~Ω)G𝑿~Ωfh(𝑿~Ω)]|,\displaystyle=\left\lvert\mathbb{E}\left[G_{\bm{Y}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})-G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})\right]\right\rvert, (57)
𝔼|G𝒀fh(𝑿~Ω)G𝑿~Ωfh(𝑿~Ω)|,\displaystyle\leq\mathbb{E}\left\lvert G_{\bm{Y}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})-G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\tilde{\bm{X}}_{\Omega}^{\infty})\right\rvert, (58)

where fhf_{h} is a solution to the Stein equation. The existence of fhf_{h} is guaranteed by Lemma 4.3. For conciseness, let us define δ=1Ω\delta=\frac{1}{\sqrt{\Omega}}. Let 𝒍=𝒙/δ+𝒗/δ2\bm{l}=\bm{x}/\delta+\bm{v}^{*}/\delta^{2}. For a fixed 𝒍\bm{l} we take the first order Taylor expansion of G𝑿~Ωfh(𝒙)G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\bm{x}) in δ\delta using the Lagrange form of the remainder.

G𝑿~Ωfh(𝒙)\displaystyle G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\bm{x}) =i=1rλi(𝒍)(f(𝒙+δ𝝃i)f(𝒙)),\displaystyle=\sum_{i=1}^{r}\lambda_{i}(\bm{l})\left(f(\bm{x}+\delta\bm{\xi}_{i})-f(\bm{x})\right), (59)
=δi=1rλi(𝒍)fh(𝒙)𝒙𝝃i+δ212i=1rλi(l)𝝃iT2fh(𝒙+ϵ𝝃i)𝝃i,\displaystyle=\delta\sum_{i=1}^{r}\lambda_{i}(\bm{l})\frac{\partial f_{h}(\bm{x})}{\partial\bm{x}}\bm{\xi}_{i}+\delta^{2}\frac{1}{2}\sum_{i=1}^{r}\lambda_{i}(l)\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x}+\epsilon\bm{\xi}_{i})\bm{\xi}_{i}, (60)
=δi=1rλi(𝒍)fh(𝒙)𝒙𝝃i+δ212i=1rλi(l)𝝃iT2fh(𝒙)𝝃i+δ212i=1rλi(𝒍)(𝝃iT2fh(𝒙+ϵ𝝃i)𝝃i𝝃iT2fh(𝒙)𝝃i),\displaystyle\begin{split}&=\delta\sum_{i=1}^{r}\lambda_{i}(\bm{l})\frac{\partial f_{h}(\bm{x})}{\partial\bm{x}}\bm{\xi}_{i}+\delta^{2}\frac{1}{2}\sum_{i=1}^{r}\lambda_{i}(l)\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\\ &\quad\quad+\delta^{2}\frac{1}{2}\sum_{i=1}^{r}\lambda_{i}(\bm{l})\left(\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x}+\epsilon\bm{\xi}_{i})\bm{\xi}_{i}-\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right),\end{split} (61)

where 0ϵδ0\leq\epsilon\leq\delta, and ϵ\epsilon can depend on 𝒍\bm{l}. We have that λi(𝒍)=1δ2λ¯i(δ𝒙+𝒗)\lambda_{i}(\bm{l})=\frac{1}{\delta^{2}}\bar{\lambda}_{i}(\delta\bm{x}+\bm{v}^{*}), and so from Taylor’s Theorem we have

1δ2λ¯i(δ𝒙+𝒗)=1δ2λ¯i(𝒗)+1δλ¯i(𝒗)𝒗𝒙+Ri(𝒙),\frac{1}{\delta^{2}}\bar{\lambda}_{i}(\delta\bm{x}+\bm{v}^{*})=\frac{1}{\delta^{2}}\bar{\lambda}_{i}(\bm{v}^{*})+\frac{1}{\delta}\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+R_{i}(\bm{x}), (62)

where \lvert R_{i}(\bm{x})\rvert\leq\frac{1}{2}\lVert\bm{x}\rVert^{2}\sup_{\bm{z}\in\mathcal{X}}\lVert\nabla^{2}{\bar{\lambda}_{i}}(\bm{z})\rVert, in which the supremum over \mathcal{X} is finite due to our assumption that \bar{\lambda}_{i}(\bm{x}) has bounded second derivatives. We have that (61) becomes

G𝑿~Ωfh(𝒙)=i=1r(1δ2λ¯i(𝒗)+1δλ¯i(𝒗)𝒗𝒙+Ri(𝒙))δfh(𝒙)𝒙𝝃i+i=1r(1δ2λ¯i(𝒗)+1δλ¯i(𝒗)𝒗𝒙+Ri(𝒙))δ212𝝃iT2fh(𝒙)𝝃i+i=1r(1δ2λ¯i(𝒗)+1δλ¯i(𝒗)𝒗𝒙)δ212(𝝃iT2fh(𝒙+ϵ𝝃i)𝝃i𝝃iT2fh(𝒙)𝝃i)+i=1rRi(𝒙)δ212(𝝃iT2fh(𝒙+ϵ𝝃i)𝝃i𝝃iT2fh(𝒙)𝝃i).\displaystyle\begin{split}G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\bm{x})&=\sum_{i=1}^{r}\left(\frac{1}{\delta^{2}}\bar{\lambda}_{i}(\bm{v}^{*})+\frac{1}{\delta}\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+R_{i}(\bm{x})\right)\delta\frac{\partial f_{h}(\bm{x})}{\partial\bm{x}}\bm{\xi}_{i}\\ &+\sum_{i=1}^{r}\left(\frac{1}{\delta^{2}}\bar{\lambda}_{i}(\bm{v}^{*})+\frac{1}{\delta}\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+R_{i}(\bm{x})\right)\delta^{2}\frac{1}{2}\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\\ &+\sum_{i=1}^{r}\left(\frac{1}{\delta^{2}}\bar{\lambda}_{i}(\bm{v}^{*})+\frac{1}{\delta}\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}\right)\delta^{2}\frac{1}{2}\left(\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x}+\epsilon\bm{\xi}_{i})\bm{\xi}_{i}-\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right)\\ &+\sum_{i=1}^{r}R_{i}(\bm{x})\delta^{2}\frac{1}{2}\left(\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x}+\epsilon\bm{\xi}_{i})\bm{\xi}_{i}-\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right).\end{split} (63)

Collecting terms and using the fact that F(𝒗)=0F(\bm{v}^{*})=0, we obtain

G𝑿~Ωfh(𝒙)=fh(𝒙)𝒙i=1r𝝃iλ¯i(𝒗)𝒗𝒙+12i=1rλ¯i(𝒗)𝝃iT2fh(𝒙)𝝃i+δ12i=1r[2Ri(𝒙)fh(𝒙)𝒙𝝃i+λ¯i(𝒗)𝒗𝒙𝝃iT2fh(𝒙)𝝃i]+δ212i=1rRi(𝒙)𝝃iT2fh(𝒙)𝝃i+12i=1r(λ¯i(𝒗)+δλ¯i(𝒗)𝒗𝒙+δ2Ri(𝒙))(𝝃iT2fh(𝒙+ϵ𝝃i)𝝃i𝝃iT2fh(𝒙)𝝃i).\displaystyle\begin{split}&G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\bm{x})=\frac{\partial f_{h}(\bm{x})}{\partial\bm{x}}\sum_{i=1}^{r}\bm{\xi}_{i}\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+\frac{1}{2}\sum_{i=1}^{r}\bar{\lambda}_{i}(\bm{v}^{*})\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\\ &\quad\quad+\delta\frac{1}{2}\sum_{i=1}^{r}\left[2R_{i}(\bm{x})\frac{\partial f_{h}(\bm{x})}{\partial\bm{x}}\bm{\xi}_{i}+\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right]\\ &\quad\quad+\delta^{2}\frac{1}{2}\sum_{i=1}^{r}R_{i}(\bm{x})\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\\ &\quad\quad+\frac{1}{2}\sum_{i=1}^{r}\left(\bar{\lambda}_{i}(\bm{v}^{*})+\delta\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+\delta^{2}R_{i}(\bm{x})\right)\left(\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x}+\epsilon\bm{\xi}_{i})\bm{\xi}_{i}-\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right).\end{split} (64)

Observe that the first two terms are identical to GYfh(𝒙)G_{Y}f_{h}(\bm{x}), since

\frac{1}{2}\sum_{i=1}^{r}\bar{\lambda}_{i}(\bm{v}^{*})\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}=\frac{1}{2}\sum_{p=1,q=1}^{n}D_{pq}\frac{\partial^{2}f_{h}(\bm{x})}{\partial x_{p}\partial x_{q}}, (65)

with D=Sdiag(𝝀¯(𝒗))STD=S\operatorname{diag}(\bar{\bm{\lambda}}(\bm{v}^{*}))S^{T}. Therefore, for all 𝒙\bm{x} such that Ω𝒙+Ω𝒗𝒳Ω\sqrt{\Omega}\bm{x}+\Omega\bm{v}^{*}\in\mathcal{X}_{\Omega},

|G𝒀fh(𝒙)G𝑿~Ωfh(𝒙)||12δi=1r[2Ri(𝒙)fh(𝒙)𝒙𝝃i+λ¯i(𝒗)𝒗𝒙𝝃iT2fh(𝒙)𝝃i]|+|12δ2i=1rRi(𝒙)𝝃iT2fh(𝒙)𝝃i|+|12i=1r(λ¯i(𝒗)+δλ¯i(𝒗)𝒗𝒙+δ2Ri(𝒙))(𝝃iT2fh(𝒙+ϵ𝝃i)𝝃i𝝃iT2fh(𝒙)𝝃i)|.\displaystyle\begin{split}&\left\lvert G_{\bm{Y}}f_{h}(\bm{x})-G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\bm{x})\right\rvert\leq\left\lvert\frac{1}{2}\delta\sum_{i=1}^{r}\left[2R_{i}(\bm{x})\frac{\partial f_{h}(\bm{x})}{\partial\bm{x}}\bm{\xi}_{i}+\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right]\right\rvert\\ &\quad\quad\quad+\left\lvert\frac{1}{2}\delta^{2}\sum_{i=1}^{r}R_{i}(\bm{x})\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right\rvert\\ &\quad\quad+\left\lvert\frac{1}{2}\sum_{i=1}^{r}\left(\bar{\lambda}_{i}(\bm{v}^{*})+\delta\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\bm{x}+\delta^{2}R_{i}(\bm{x})\right)\left(\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x}+\epsilon\bm{\xi}_{i})\bm{\xi}_{i}-\bm{\xi}_{i}^{T}\nabla^{2}{f_{h}}(\bm{x})\bm{\xi}_{i}\right)\right\rvert.\end{split} (66)

Letting Q_{i}=\sup_{\bm{v}\in\mathcal{X}}\lVert\nabla^{2}{\bar{\lambda}_{i}}(\bm{v})\rVert and invoking Lemma 4.3, which applies because Assumptions 2.1 and 2.3 ensure that S\operatorname{diag}\sqrt{\bar{\bm{\lambda}}(\bm{v}^{*})} has a right inverse (since \bar{\bm{\lambda}}(\bm{v}^{*})>0 and S has full row rank), we have

|G𝒀fh(𝒙)G𝑿~Ωfh(𝒙)|12δi=1r(C1Qi𝒙2𝝃i+C2λ¯i(𝒗)𝒗𝒙𝝃i2)+14δ2C2𝒙2i=1rQi𝝃i2+12δ2ζC3(ζ)𝒙i=1rλ¯i(𝒗)𝒗𝝃i3ζ+14δ3ζC3(ζ)𝒙2i=1rQi𝝃i3ζ+12δ1ζC3(ζ)i=1r|λ¯i(𝒗)|𝝃i3ζ.\displaystyle\begin{split}\left\lvert G_{\bm{Y}}f_{h}(\bm{x})-G_{\tilde{\bm{X}}_{\Omega}}f_{h}(\bm{x})\right\rvert&\leq\frac{1}{2}\delta\sum_{i=1}^{r}\left(C_{1}Q_{i}\left\lVert\bm{x}\right\rVert^{2}\lVert\bm{\xi}_{i}\rVert+C_{2}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert\left\lVert\bm{x}\right\rVert\lVert\bm{\xi}_{i}\rVert^{2}\right)\\ &+\frac{1}{4}\delta^{2}C_{2}\left\lVert\bm{x}\right\rVert^{2}\sum_{i=1}^{r}Q_{i}\left\lVert\bm{\xi}_{i}\right\rVert^{2}\\ &+\frac{1}{2}\delta^{2-\zeta}C_{3}(\zeta)\lVert\bm{x}\rVert\sum_{i=1}^{r}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}\\ &+\frac{1}{4}\delta^{3-\zeta}C_{3}(\zeta)\left\lVert\bm{x}\right\rVert^{2}\sum_{i=1}^{r}Q_{i}\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}\\ &+\frac{1}{2}\delta^{1-\zeta}C_{3}(\zeta)\sum_{i=1}^{r}\left\lvert\bar{\lambda}_{i}(\bm{v}^{*})\right\rvert\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}.\end{split} (67)

This bound holds for any hh\in\mathcal{H}, with fhf_{h} chosen as the solution to the Stein equation (51) specified by Lemma 4.3, and thus from (56) we have

|𝔼[h(𝑿~Ω)]𝔼[h(𝒀)]|𝔼[12δi=1r(C1Qi𝑿~Ω2𝝃i+C2λ¯i(𝒗)𝒗𝑿~Ω𝝃i2)+14δ2C2𝑿~Ω2i=1rQi𝝃i2+12δ2ζC3(ζ)𝑿~Ωi=1rλ¯i(𝒗)𝒗𝝃i3ζ+14δ3ζC3(ζ)𝑿~Ω2i=1rQi𝝃i3ζ+12δ1ζC3(ζ)i=1r|λ¯i(𝒗)|𝝃i3ζ].\begin{split}\left\lvert\mathbb{E}\left[h\left(\tilde{\bm{X}}_{\Omega}^{\infty}\right)\right]-\mathbb{E}\left[h\left(\bm{Y}^{\infty}\right)\right]\right\rvert&\\ &\leq\mathbb{E}\left[\frac{1}{2}\delta\sum_{i=1}^{r}\left(C_{1}Q_{i}\left\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\right\rVert^{2}\lVert\bm{\xi}_{i}\rVert+C_{2}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert\left\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\right\rVert\lVert\bm{\xi}_{i}\rVert^{2}\right)\right.\\ +\frac{1}{4}\delta^{2}C_{2}\left\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\right\rVert^{2}\sum_{i=1}^{r}Q_{i}\left\lVert\bm{\xi}_{i}\right\rVert^{2}\\ +\frac{1}{2}\delta^{2-\zeta}C_{3}(\zeta)\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\rVert\sum_{i=1}^{r}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}\\ +\frac{1}{4}\delta^{3-\zeta}C_{3}(\zeta)\left\lVert\tilde{\bm{X}}_{\Omega}^{\infty}\right\rVert^{2}\sum_{i=1}^{r}Q_{i}\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}\\ \left.+\frac{1}{2}\delta^{1-\zeta}C_{3}(\zeta)\sum_{i=1}^{r}\left\lvert\bar{\lambda}_{i}(\bm{v}^{*})\right\rvert\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}\right].\end{split} (68)

Using the moment bounds and the linearity of the expectation we have

|𝔼[h(𝑿~Ω)]𝔼[h(𝒀)]|12δi=1r(C1QiM2𝝃i+C2λ¯i(𝒗)𝒗M1𝝃i2)+14δ2C2M2i=1rQi𝝃i2+12δ2ζC3(ζ)M1i=1rλ¯i(𝒗)𝒗𝝃i3ζ+14δ3ζC3(ζ)M2i=1rQi𝝃i3ζ+12δ1ζC3(ζ)i=1r|λ¯i(𝒗)|𝝃i3ζ.\begin{split}\left\lvert\mathbb{E}\left[h\left(\tilde{\bm{X}}_{\Omega}^{\infty}\right)\right]-\mathbb{E}\left[h\left(\bm{Y}^{\infty}\right)\right]\right\rvert\leq\frac{1}{2}\delta\sum_{i=1}^{r}\left(C_{1}Q_{i}M_{2}\lVert\bm{\xi}_{i}\rVert+C_{2}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert M_{1}\lVert\bm{\xi}_{i}\rVert^{2}\right)\\ +\frac{1}{4}\delta^{2}C_{2}M_{2}\sum_{i=1}^{r}Q_{i}\left\lVert\bm{\xi}_{i}\right\rVert^{2}\\ +\frac{1}{2}\delta^{2-\zeta}C_{3}(\zeta)M_{1}\sum_{i=1}^{r}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}\\ +\frac{1}{4}\delta^{3-\zeta}C_{3}(\zeta)M_{2}\sum_{i=1}^{r}Q_{i}\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}\\ +\frac{1}{2}\delta^{1-\zeta}C_{3}(\zeta)\sum_{i=1}^{r}\left\lvert\bar{\lambda}_{i}(\bm{v}^{*})\right\rvert\left\lVert\bm{\xi}_{i}\right\rVert^{3-\zeta}.\end{split} (69)

To prove the error bound proportional to lnΩΩ\frac{\ln\Omega}{\sqrt{\Omega}}, we take ζ=(lnδ)1\zeta=-(\ln\delta)^{-1} for Ω>e2\Omega>e^{2}, in which case C3(ζ)=2ln(δ)C3+eϕC3C1C_{3}(\zeta)=-2\ln(\delta)C_{3}^{\prime}+e^{-\phi}C_{3}^{\prime}C_{1}. Let ι>0\iota>0. For all Ωmax{Ω0,e2+ι}\Omega\geq\max\{\Omega_{0},e^{2}+\iota\} we have

|𝔼[h(𝑿~Ω)]𝔼[h(𝒀)]|12δi=1r(C1QiM2𝝃i+C2λ¯i(𝒗)𝒗M1𝝃i2)+14δ2C2M2i=1rQi𝝃i2+12δ2C3e(2ln(δ1)+eϕC1)M1i=1rλ¯i(𝒗)𝒗max{𝝃i2,𝝃i3}+14δ3C3e(2ln(δ1)+eϕC1)M2i=1rQimax{𝝃i2,𝝃i3}+12δC3e(2ln(δ1)+eϕC1)i=1r|λ¯i(𝒗)|max{𝝃i2,𝝃i3}.\begin{split}\left\lvert\mathbb{E}\left[h\left(\tilde{\bm{X}}_{\Omega}^{\infty}\right)\right]-\mathbb{E}\left[h\left(\bm{Y}^{\infty}\right)\right]\right\rvert\leq\frac{1}{2}\delta\sum_{i=1}^{r}\left(C_{1}Q_{i}M_{2}\lVert\bm{\xi}_{i}\rVert+C_{2}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert M_{1}\lVert\bm{\xi}_{i}\rVert^{2}\right)\\ +\frac{1}{4}\delta^{2}C_{2}M_{2}\sum_{i=1}^{r}Q_{i}\left\lVert\bm{\xi}_{i}\right\rVert^{2}\\ +\frac{1}{2}\delta^{2}C_{3}^{\prime}e\left(2\ln(\delta^{-1})+e^{-\phi}C_{1}\right)M_{1}\sum_{i=1}^{r}\left\lVert\frac{\partial\bar{\lambda}_{i}(\bm{v}^{*})}{\partial\bm{v}}\right\rVert\max\{\left\lVert\bm{\xi}_{i}\right\rVert^{2},\left\lVert\bm{\xi}_{i}\right\rVert^{3}\}\\ +\frac{1}{4}\delta^{3}C_{3}^{\prime}e\left(2\ln(\delta^{-1})+e^{-\phi}C_{1}\right)M_{2}\sum_{i=1}^{r}Q_{i}\max\{\left\lVert\bm{\xi}_{i}\right\rVert^{2},\left\lVert\bm{\xi}_{i}\right\rVert^{3}\}\\ +\frac{1}{2}\delta C_{3}^{\prime}e\left(2\ln(\delta^{-1})+e^{-\phi}C_{1}\right)\sum_{i=1}^{r}\left\lvert\bar{\lambda}_{i}(\bm{v}^{*})\right\rvert\max\{\left\lVert\bm{\xi}_{i}\right\rVert^{2},\left\lVert\bm{\xi}_{i}\right\rVert^{3}\}.\end{split} (70)

The δln(δ1)=12δln(δ2)=12lnΩΩ\delta\ln(\delta^{-1})=\frac{1}{2}\delta\ln(\delta^{-2})=\frac{1}{2}\frac{\ln\Omega}{\sqrt{\Omega}} term decays the slowest, which yields the desired result.
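As a purely illustrative aside, the scaling in this final step is easy to check numerically. The short Python sketch below (ours, not part of the argument) verifies that with the choice ζ=(lnδ)1\zeta=-(\ln\delta)^{-1} one has δζ=e\delta^{-\zeta}=e exactly, so the slowest term behaves like δln(δ1)=lnΩ/(2Ω)\delta\ln(\delta^{-1})=\ln\Omega/(2\sqrt{\Omega}).

```python
# Illustrative numerical check (not part of the proof) of the choice zeta = -1/ln(delta),
# with delta = Omega**(-1/2): then delta**(-zeta) = e exactly, so the delta**(1-zeta) term
# in (69) becomes e*delta*(2*ln(1/delta) + const), i.e. it decays like ln(Omega)/sqrt(Omega).
import numpy as np

for Omega in [1e2, 1e4, 1e6, 1e8]:   # any Omega > e**2
    delta = Omega ** -0.5
    zeta = -1.0 / np.log(delta)
    print(f"Omega={Omega:.0e}  delta^(-zeta)={delta ** -zeta:.6f}  "
          f"delta*ln(1/delta)={delta * np.log(1.0 / delta):.3e}  "
          f"ln(Omega)/(2*sqrt(Omega))={np.log(Omega) / (2.0 * np.sqrt(Omega)):.3e}")
```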

4.2 Proof of Lemma 4.3

Here we prove Lemma 4.3. The proof is an application of Theorem 5 in [30], with the bounds tightened by a more detailed analysis in our special case. For convenience we adopt the following notation from [30]. For a function gg with domain n\mathbb{R}^{n}, we define

Mj(g)=sup𝒙,𝒙n,𝒙𝒙j1g(𝒙)j1g(𝒙)𝒙𝒙{j},M_{j}(g)=\sup_{\bm{x},\bm{x}^{\prime}\in\mathbb{R}^{n},\bm{x}\neq\bm{x}^{\prime}}\frac{\lVert\nabla^{\lceil j\rceil-1}g(\bm{x})-\nabla^{\lceil j\rceil-1}g(\bm{x}^{\prime})\rVert}{\lVert\bm{x}-\bm{x}^{\prime}\rVert^{\{j\}}}, (71)

where {j}=jj1\{j\}=j-\lceil j-1\rceil, and

F_{j}(g)=\sup_{\bm{x}\in\mathbb{R}^{n},\lVert\bm{z}_{1}\rVert=\lVert\bm{z}_{2}\rVert=\dots=\lVert\bm{z}_{j}\rVert=1}\lVert\nabla^{j}g(\bm{x})[\bm{z}_{1},\dots,\bm{z}_{j}]\rVert_{F}. (72)

Here, for g:\mathbb{R}^{n}\rightarrow\mathbb{R}, \nabla^{j}g(\bm{x})[\bm{z}_{1},\dots,\bm{z}_{j}] denotes the jjth derivative of gg along the directions \bm{z}_{1},\dots,\bm{z}_{j}. An Itô diffusion 𝒁(t)\bm{Z}(t) is the solution to

d\bm{Z}_{\bm{x}}(t)=b(\bm{Z}_{\bm{x}}(t))dt+\sigma(\bm{Z}_{\bm{x}}(t))d\bm{W}(t),\;\bm{Z}_{\bm{x}}(0)=\bm{x}, (73)

where 𝑾(t)\bm{W}(t) is an mm-dimensional Wiener process. We denote the transition semigroup of 𝒁(t)\bm{Z}(t) by (Pt)t0(P_{t})_{t\geq 0}, and the infinitesimal generator of 𝒁(t)\bm{Z}(t) by 𝒜\mathcal{A}. We say that 𝒁(t)\bm{Z}(t) has Wasserstein decay rate r(t)r(t), for a nonincreasing integrable function r:0r:\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}, if for all 𝒛,𝒛n\bm{z},\bm{z}^{\prime}\in\mathbb{R}^{n} and all t0t\geq 0,

𝒲1(𝒁(t)|𝒁(0)=𝒛,𝒁(t)|𝒁(0)=𝒛)r(t)𝒲1(𝒛,𝒛).\mathcal{W}_{1}\left(\bm{Z}(t)|\bm{Z}(0)=\bm{z},\bm{Z}(t)|\bm{Z}(0)=\bm{z}^{\prime}\right)\leq r(t)\mathcal{W}_{1}(\bm{z},\bm{z}^{\prime}). (74)

For clarity, we start by stating the following theorem, slightly adapted from [30], which ensures the existence of constants C1C_{1}, C2C_{2} and C3C_{3}^{\prime}. We will shortly improve these constants in our special case.

Theorem 4.1 (Theorem 5 in [30]).

Let h:nh:\mathbb{R}^{n}\rightarrow\mathbb{R} be Lipschitz. Consider an nn-dimensional Itô diffusion given by (73) with Wasserstein decay rate r¯(t)\bar{r}(t) and invariant measure PP. If bb and σ\sigma have locally Lipschitz second and third derivatives, σ(𝐱)\sigma(\bm{x}) has a right inverse σ1(𝒙)\sigma^{-1}(\bm{x}) for each 𝒙n\bm{x}\in\mathbb{R}^{n}, and hC3(n)h\in C^{3}(\mathbb{R}^{n}) has bounded second and third derivatives, then

f_{h}=\int_{0}^{\infty}\left(\mathbb{E}_{P}\left[h(\bm{Z})\right]-P_{t}h\right)dt (75)

is twice continuously differentiable, satisfies the Stein equation

𝒜fh=h𝔼P[h(𝒁)],\mathcal{A}f_{h}=h-\mathbb{E}_{P}\left[h(\bm{Z})\right], (76)

and additionally,

M1(fh)M1(h)0r¯(t)𝑑t,M_{1}(f_{h})\leq M_{1}(h)\int_{0}^{\infty}\bar{r}(t)dt, (77)

and

M2(fh)M1(h)(β1+β2),M_{2}(f_{h})\leq M_{1}(h)(\beta_{1}+\beta_{2}), (78)

where

β1=r¯(0)(2M0(σ1)+r¯(0)M1(σ)M0(σ1)+r¯(0)α),\displaystyle\beta_{1}=\bar{r}(0)\left(2M_{0}(\sigma^{-1})+\bar{r}(0)M_{1}(\sigma)M_{0}(\sigma^{-1})+\bar{r}(0)\sqrt{\alpha}\right), (79)
β2=r¯(0)(eγ2M0(σ1)+eγ2M1(σ)M0(σ1)+23eγ4α),\displaystyle\beta_{2}=\bar{r}(0)\left(e^{\gamma_{2}}M_{0}(\sigma^{-1})+e^{\gamma_{2}}M_{1}(\sigma)M_{0}(\sigma^{-1})+\frac{2}{3}e^{\gamma_{4}}\sqrt{\alpha}\right), (80)

with γρ=ρM1(b)+ρ22ρ2M1(σ)2+ρ2F1(σ)2\gamma_{\rho}=\rho M_{1}(b)+\frac{\rho^{2}-2\rho}{2}M_{1}(\sigma)^{2}+\frac{\rho}{2}F_{1}(\sigma)^{2} and α=M2(b)22M1(b)+4M1(σ)2+2F2(σ)2\alpha=\frac{M_{2}(b)^{2}}{2M_{1}(b)+4M_{1}(\sigma)^{2}}+2F_{2}(\sigma)^{2}. Furthermore, for all ζ(0,1)\zeta\in(0,1),

M3ζ(fh)M1(h)1K(1ζ+0r¯(t)𝑑t),M_{3-\zeta}(f_{h})\leq M_{1}(h)\frac{1}{K}\left(\frac{1}{\zeta}+\int_{0}^{\infty}\bar{r}(t)dt\right), (81)

for K>0K>0 that depends only on M1:3(σ)M_{1:3}(\sigma), M1:3(b)M_{1:3}(b), M0(σ1)M_{0}(\sigma^{-1}), and r¯\bar{r}.

We now show that 𝒀(t)\bm{Y}(t) has a Wasserstein decay rate of eAt\left\lVert e^{At}\right\rVert, meaning that

𝒲1(𝒀(t)|𝒀(0)=𝒚,𝒀(t)|𝒀(0)=𝒚)eAt𝒲1(𝒚,𝒚),\mathcal{W}_{1}(\bm{Y}(t)|\bm{Y}(0)=\bm{y},\bm{Y}(t)|\bm{Y}(0)=\bm{y}^{\prime})\leq\left\lVert e^{At}\right\rVert\mathcal{W}_{1}(\bm{y},\bm{y}^{\prime}), (82)

where 𝒀(t)|𝒀(0)=𝒚\bm{Y}(t)|\bm{Y}(0)=\bm{y} denotes the solution to (12b) with 𝒗(t)=𝒗\bm{v}(t)=\bm{v}^{*} and initial condition 𝒀(0)=𝒚\bm{Y}(0)=\bm{y}. Observe that 𝒀(t)|𝒀(0)=𝒚\bm{Y}(t)|\bm{Y}(0)=\bm{y} and 𝒀(t)|𝒀(0)=𝒚\bm{Y}(t)|\bm{Y}(0)=\bm{y}^{\prime} are Gaussian with the same covariance. Therefore, letting 𝒀𝒚(t)=𝒀(t)|𝒀(0)=𝒚\bm{Y}_{\bm{y}}(t)=\bm{Y}(t)|\bm{Y}(0)=\bm{y}, we have

𝒲1(𝒀𝒚(t),𝒀𝒚(t))=𝔼[𝒀𝒚(t)]𝔼[𝒀𝒚(t)],\mathcal{W}_{1}\left(\bm{Y}_{\bm{y}}(t),\bm{Y}_{\bm{y}^{\prime}}(t)\right)=\left\lVert\mathbb{E}\left[\bm{Y}_{\bm{y}}(t)\right]-\mathbb{E}\left[\bm{Y}_{\bm{y}^{\prime}}(t)\right]\right\rVert, (83)

which can be shown by considering the lower bound

𝒲1(𝒀𝒚(t),𝒀𝒚(t))𝔼[𝒀𝒚(t)]𝔼[𝒀𝒚(t)]\mathcal{W}_{1}\left(\bm{Y}_{\bm{y}}(t),\bm{Y}_{\bm{y}^{\prime}}(t)\right)\geq\left\lVert\mathbb{E}\left[\bm{Y}_{\bm{y}}(t)\right]-\mathbb{E}\left[\bm{Y}_{\bm{y}^{\prime}}(t)\right]\right\rVert (84)

from Jensen’s inequality, and the upper bound

𝒲1(𝒀𝒚(t),𝒀𝒚(t))𝔼[𝒀𝒚(t)]𝔼[𝒀𝒚(t)]\mathcal{W}_{1}\left(\bm{Y}_{\bm{y}}(t),\bm{Y}_{\bm{y}^{\prime}}(t)\right)\leq\left\lVert\mathbb{E}\left[\bm{Y}_{\bm{y}}(t)\right]-\mathbb{E}\left[\bm{Y}_{\bm{y}^{\prime}}(t)\right]\right\rVert (85)

as evidenced by the coupling 𝒀𝒚(t)=𝒀𝒚(t)+𝔼[𝒀𝒚(t)]𝔼[𝒀𝒚(t)]\bm{Y}_{\bm{y}}(t)=\bm{Y}_{\bm{y}^{\prime}}(t)+\mathbb{E}\left[\bm{Y}_{\bm{y}}(t)\right]-\mathbb{E}\left[\bm{Y}_{\bm{y}^{\prime}}(t)\right]. Theorem 4.1 then gives us the constant C1C_{1}. To prove the claimed expressions for C2C_{2} and C3C_{3}^{\prime}, we must specialize the arguments of [30] to our diffusion, whose drift bb is linear with a Hurwitz coefficient matrix and whose σ\sigma is constant. As in [30], denote by 𝑽𝒗(t)\bm{V}_{\bm{v}}(t) the first directional derivative flow, the solution to

d\bm{V}_{\bm{v}}(t)=\nabla b(\bm{Z}_{\bm{x}}(t))\bm{V}_{\bm{v}}(t)dt+\nabla\sigma(\bm{Z}_{\bm{x}}(t))\bm{V}_{\bm{v}}(t)d\bm{W}(t),\quad\bm{V}_{\bm{v}}(0)=\bm{v}. (86)

Further, let 𝑼𝒗,𝒗(t)\bm{U}_{\bm{v},\bm{v}^{\prime}}(t), the second directional derivative flow, be defined as the solution to

d𝑼𝒗,𝒗(t)=(b(𝒁𝒙(t))𝑼𝒗,𝒗+2b(𝒁𝒙(t))[𝑽𝒗(t)]𝑽𝒗(t))dt+(σ(𝒁𝒙(t))𝑼𝒗,𝒗(t)+2σ(𝒁𝒙(t))[𝑽𝒗(t)]𝑽𝒗(t))d𝑾(t),𝑼𝒗,𝒗(0)=0.\begin{split}d\bm{U}_{\bm{v},\bm{v}^{\prime}}(t)=\left(\nabla b(\bm{Z}_{\bm{x}}(t))\bm{U}_{\bm{v},\bm{v}^{\prime}}+\nabla^{2}b(\bm{Z}_{\bm{x}}(t))[\bm{V}_{\bm{v}^{\prime}}(t)]\bm{V}_{\bm{v}}(t)\right)dt+\left(\nabla\sigma(\bm{Z}_{\bm{x}}(t))\bm{U}_{\bm{v},\bm{v}^{\prime}}(t)\right.\\ \left.+\nabla^{2}\sigma(\bm{Z}_{\bm{x}}(t))[\bm{V}_{\bm{v}^{\prime}}(t)]\bm{V}_{\bm{v}}(t)\right)d\bm{W}(t),\;\bm{U}_{\bm{v},\bm{v}^{\prime}}(0)=0.\end{split} (87)

In our proof the following lemma replaces the derivative flow bounds given in Lemma 16 of [30].

Lemma 4.4.

Consider an Itô diffusion given by (73) with transition semigroup (Pt)t0(P_{t})_{t\geq 0} and assume that b(𝐳)=A𝐳b(\bm{z})=A\bm{z} where An×nA\in\mathbb{R}^{n\times n} satisfies

HA+ATH2ϕH,HA+A^{T}H\leq-2\phi H, (88)

for a matrix H>0H>0 and a real number ϕ>0\phi>0. Assume also that σ(𝐳)=σ\sigma(\bm{z})=\sigma is a constant. Then, for all t0t\geq 0, ρ>0\rho>0, and 𝐯n\bm{v}\in\mathbb{R}^{n},

𝔼[𝑽𝒗(t)ρ]cond(H)ρ/2eρϕt𝒗ρ\mathbb{E}\left[\lVert\bm{V}_{\bm{v}}(t)\rVert^{\rho}\right]\leq\mathrm{cond}(H)^{\rho/2}e^{-\rho\phi t}\lVert\bm{v}\rVert^{\rho} (89)

and for all t0t\geq 0, 𝐔𝐯,𝐯(t)=0\bm{U}_{\bm{v},\bm{v}^{\prime}}(t)=0.

Proof.

Under the assumptions of the Lemma, (86) becomes

d\bm{V}_{\bm{v}}(t)=A\bm{V}_{\bm{v}}(t)dt,\quad\bm{V}_{\bm{v}}(0)=\bm{v}, (90)

and thus 𝑽𝒗(t)=eAt𝒗\bm{V}_{\bm{v}}(t)=e^{At}\bm{v}. Therefore, \mathbb{E}\left[\lVert\bm{V}_{\bm{v}}(t)\rVert^{\rho}\right]=\lVert e^{At}\bm{v}\rVert^{\rho}. To bound this quantity, observe that the condition

HA+ATH2ϕH,HA+A^{T}H\leq-2\phi H, (91)

implies that

HAH1+(HAH1)T2ϕI,\sqrt{H}A\sqrt{H}^{-1}+\left(\sqrt{H}A\sqrt{H}^{-1}\right)^{T}\leq-2\phi I, (92)

where H\sqrt{H} is the principal square root of HH. It then follows from standard arguments in linear systems theory [15, 45] that for all t0t\geq 0,

eHAH1teϕt.\lVert e^{\sqrt{H}A\sqrt{H}^{-1}t}\rVert\leq e^{-\phi t}. (93)

We then have that for all t0t\geq 0 and 𝒗n\bm{v}\in\mathbb{R}^{n},

eAt𝒗\displaystyle\lVert e^{At}\bm{v}\rVert =H1eHAH1tH𝒗\displaystyle=\lVert\sqrt{H}^{-1}e^{\sqrt{H}A\sqrt{H}^{-1}t}\sqrt{H}\bm{v}\rVert (94)
H1eHAH1tH𝒗\displaystyle\leq\lVert\sqrt{H}^{-1}\rVert\lVert e^{\sqrt{H}A\sqrt{H}^{-1}t}\rVert\lVert\sqrt{H}\rVert\lVert\bm{v}\rVert (95)
cond(H)eϕt𝒗\displaystyle\leq\sqrt{\mathrm{cond}(H)}e^{-\phi t}\lVert\bm{v}\rVert (96)

which establishes the desired bound on eAt𝒗\lVert e^{At}\bm{v}\rVert. To complete the proof, observe that under the assumptions of the lemma, (87) becomes

d\bm{U}_{\bm{v},\bm{v}^{\prime}}(t)=\nabla b(\bm{Z}_{\bm{x}}(t))\bm{U}_{\bm{v},\bm{v}^{\prime}}(t)dt,\;\bm{U}_{\bm{v},\bm{v}^{\prime}}(0)=0, (97)

and thus 𝑼𝒗,𝒗(t)=0\bm{U}_{\bm{v},\bm{v}^{\prime}}(t)=0. ∎
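For illustration, the bound of Lemma 4.4 on eAt\lVert e^{At}\rVert can be verified numerically. The following Python sketch (ours; the Hurwitz matrix AA is an arbitrary example) obtains an HH from a Lyapunov equation, extracts the largest admissible ϕ\phi, and checks eAtcond(H)eϕt\lVert e^{At}\rVert\leq\sqrt{\mathrm{cond}(H)}e^{-\phi t} on a time grid.

```python
# Illustrative check of the bound ||e^{At}|| <= sqrt(cond(H)) * e^{-phi*t} established above,
# for an arbitrary Hurwitz matrix A (our choice) and H obtained from a Lyapunov equation.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, expm

A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])                              # eigenvalues -1, -3: Hurwitz
H = solve_continuous_lyapunov(A.T, -np.eye(2))           # A^T H + H A = -I, so H > 0
M = -(H @ A + A.T @ H)                                   # here M = I by construction
phi = 0.5 * np.min(np.linalg.eigvals(np.linalg.solve(H, M)).real)  # largest phi with HA+A^TH <= -2*phi*H
bound_const = np.sqrt(np.linalg.cond(H))

for t in [0.0, 0.5, 1.0, 2.0, 4.0]:
    lhs = np.linalg.norm(expm(A * t), 2)
    rhs = bound_const * np.exp(-phi * t)
    print(f"t={t:3.1f}  ||e^(At)||={lhs:.4f}  sqrt(cond(H))*e^(-phi*t)={rhs:.4f}  ok={lhs <= rhs + 1e-9}")
```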

With this specialization, we now trace the construction of C2C_{2} and C3(ζ)C_{3}(\zeta) through the same argument used in the proof of Theorem 5 in [30] to obtain the claimed expressions. Using Lemma 4.4, we make the specializations described in Lemmas 4.5 and 4.6, which we state and prove before proceeding. In our case, the bound on M2(f)M_{2}(f) given by Lemma 15 in [30] can be replaced by the following lemma.

Lemma 4.5.

Consider an Itô diffusion given by (73) with transition semigroup (Pt)t0(P_{t})_{t\geq 0} and assume that b(𝐳)=A𝐳b(\bm{z})=A\bm{z} where An×nA\in\mathbb{R}^{n\times n} satisfies

HA+ATH2ϕH,HA+A^{T}H\leq-2\phi H, (98)

for a matrix H>0H>0 and a real number ϕ>0\phi>0. Assume also that σ(𝐳)=σ\sigma(\bm{z})=\sigma is a constant with right inverse σ1\sigma^{-1}. Then, for all t>0t>0 and fC2f\in C^{2} with bounded first and second derivatives, PtfP_{t}f satisfies

M2(Ptf)inft0(0,t]M1(f)r(tt0)M0(σ1)eϕt0cond(H)1t0,M_{2}(P_{t}f)\leq\inf_{t_{0}\in(0,t]}M_{1}(f)r(t-t_{0})M_{0}(\sigma^{-1})e^{-\phi t_{0}}\mathrm{cond}(H)\frac{1}{\sqrt{t_{0}}}, (99)

where r(t)=eAtr(t)=\lVert e^{At}\rVert.

Proof.

First, observe that under the assumptions of the lemma, the Itô diffusion has Wasserstein decay rate r(t)=eAtr(t)=\lVert e^{At}\rVert. Thus, we can invoke Lemma 15 in [30] directly to obtain that PtfP_{t}f is twice continuously differentiable. The remainder of the proof follows that of Lemma 15 in [30], substituting our forms of bb and σ\sigma, and using Lemma 4.4 instead of Lemma 16 in [30] for the bounds on the derivative flows. ∎

In our special case, the bound on M3(Ptf)M_{3}(P_{t}f) from Lemma 20 in [30] can be replaced with the following lemma.

Lemma 4.6.

Consider an Itô diffusion given by (73) with transition semigroup (Pt)t0(P_{t})_{t\geq 0} and assume that b(𝐳)=A𝐳b(\bm{z})=A\bm{z} where An×nA\in\mathbb{R}^{n\times n} satisfies

HA+ATH2ϕH,HA+A^{T}H\leq-2\phi H, (100)

for a matrix H>0H>0 and a real number ϕ>0\phi>0. Assume also that σ(𝐳)=σ\sigma(\bm{z})=\sigma is a constant with right inverse σ1\sigma^{-1}. Then, for all t>0t>0 and fC3f\in C^{3} with bounded second and third derivatives, PtfP_{t}f satisfies

M3(Ptf)inft0(0,t]2t0M1(f)r(tt0)M02(σ1)cond(H)52e32ϕt0,M_{3}(P_{t}f)\leq\inf_{t_{0}\in(0,t]}\frac{2}{t_{0}}M_{1}(f)r(t-t_{0})M_{0}^{2}(\sigma^{-1})\mathrm{cond}(H)^{\frac{5}{2}}e^{-\frac{3}{2}\phi t_{0}}, (101)

where r(t)=eAtr(t)=\lVert e^{At}\rVert.

Proof.

First, observe that under the assumptions of the lemma, the Itô diffusion has a Wasserstein decay rate of r(t)=eAtr(t)=\lVert e^{At}\rVert. Then, the proof follows that of Lemma 20 in [30], but with the substitution of the bound on M2(Ptf)M_{2}(P_{t}f) given by Lemma 4.5, and the derivative flow bounds of Lemma 16 in [30] replaced by Lemma 4.4. ∎

We now finish our proof of Lemma 4.3. Observe that in this case we have A=\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}} and σ=Σ\sigma=\Sigma, where Σ\Sigma has a right inverse Σ1\Sigma^{-1}. Let 0<ζ<10<\zeta<1. For fhf_{h} solving 𝒜fh=h𝔼P[h(𝒁)]\mathcal{A}f_{h}=h-\mathbb{E}_{P}\left[h(\bm{Z})\right], we have by the Dominated Convergence Theorem and Jensen’s inequality that

M_{1-\zeta}(\nabla^{2}f_{h})=M_{1-\zeta}\left(-\int_{0}^{\infty}\nabla^{2}P_{t}h\,dt\right)\leq\int_{0}^{\infty}M_{1-\zeta}\left(\nabla^{2}P_{t}h\right)dt. (102)

Splitting the last integral into the interval [0,1][0,1] and [1,][1,\infty] we have

M_{1-\zeta}(\nabla^{2}f_{h})\leq T_{1}+T_{2}, (103)

where

T1=01M1ζ(2Pth)𝑑t,T_{1}=\int_{0}^{1}M_{1-\zeta}\left(\nabla^{2}P_{t}h\right)dt, (104)

and

T2=1M1ζ(2Pth)𝑑t.T_{2}=\int_{1}^{\infty}M_{1-\zeta}\left(\nabla^{2}P_{t}h\right)dt. (105)

Applying the seminorm interpolation result of Lemma 19 in [30], we have

M1ζ(2Pth)\displaystyle M_{1-\zeta}\left(\nabla^{2}P_{t}h\right) 2ζ(M0(2Pth))ζ(M1(2Pth))1ζ\displaystyle\leq 2^{\zeta}\left(M_{0}(\nabla^{2}P_{t}h)\right)^{\zeta}\left(M_{1}(\nabla^{2}P_{t}h)\right)^{1-\zeta} (106)
=2ζ(M2(Pth))ζ(M3(Pth))1ζ.\displaystyle=2^{\zeta}\left(M_{2}(P_{t}h)\right)^{\zeta}\left(M_{3}(P_{t}h)\right)^{1-\zeta}. (107)

Under the assumptions of the lemma, we can apply Lemmas 4.5 and 4.6. For t1t\leq 1 we set t0=tt_{0}=t to obtain a bound on T1T_{1} as follows.

T1\displaystyle T_{1} =01M1ζ(2Pth)𝑑t\displaystyle=\int_{0}^{1}M_{1-\zeta}\left(\nabla^{2}P_{t}h\right)dt (108)
012ζ(M2(Pth))ζ(M3(Pth))1ζ𝑑t\displaystyle\leq\int_{0}^{1}2^{\zeta}\left(M_{2}(P_{t}h)\right)^{\zeta}\left(M_{3}(P_{t}h)\right)^{1-\zeta}dt (109)
012ζ(r(0)M0(Σ1)eϕtcond(H)1t)ζ(2tr(0)M02(Σ1)cond(H)52e32ϕt)1ζ𝑑t\displaystyle\leq\int_{0}^{1}2^{\zeta}\left(r(0)M_{0}(\Sigma^{-1})e^{-\phi t}\mathrm{cond}(H)\frac{1}{\sqrt{t}}\right)^{\zeta}\left(\frac{2}{t}r(0)M_{0}^{2}(\Sigma^{-1})\mathrm{cond}(H)^{\frac{5}{2}}e^{-\frac{3}{2}\phi t}\right)^{1-\zeta}dt (110)
=2r(0)M02ζ(Σ1)cond(H)5232ζ01e(3212ζ)ϕtt1+12ζ𝑑t\displaystyle=2r(0)M_{0}^{2-\zeta}(\Sigma^{-1})\mathrm{cond}(H)^{\frac{5}{2}-\frac{3}{2}\zeta}\int_{0}^{1}e^{-(\frac{3}{2}-\frac{1}{2}\zeta)\phi t}t^{-1+\frac{1}{2}\zeta}dt (111)
2r(0)M02ζ(Σ1)cond(H)5232ζ2ζ\displaystyle\leq 2r(0)M_{0}^{2-\zeta}(\Sigma^{-1})\mathrm{cond}(H)^{\frac{5}{2}-\frac{3}{2}\zeta}\frac{2}{\zeta} (112)
=4ζmax{Σ1,Σ12}cond(H)52,\displaystyle=\frac{4}{\zeta}\max\left\{\lVert\Sigma^{-1}\rVert,\lVert\Sigma^{-1}\rVert^{2}\right\}\mathrm{cond}(H)^{\frac{5}{2}}, (113)

where we have used the observations that r(0)=1r(0)=1 and cond(H)1\mathrm{cond}(H)\geq 1. Turning to T2T_{2}, for t1t\geq 1 we set t0=1t_{0}=1 in Lemmas 4.5 and 4.6, obtaining

T2\displaystyle T_{2} =1M1ζ(2Pth)𝑑t\displaystyle=\int_{1}^{\infty}M_{1-\zeta}\left(\nabla^{2}P_{t}h\right)dt (114)
12ζ(M2(Pth))ζ(M3(Pth))1ζ𝑑t\displaystyle\leq\int_{1}^{\infty}2^{\zeta}\left(M_{2}(P_{t}h)\right)^{\zeta}\left(M_{3}(P_{t}h)\right)^{1-\zeta}dt (115)
1(2r(t1)M0(Σ1)eϕcond(H))ζ(2r(t1)M02(Σ1)cond(H)52e32ϕ)1ζ𝑑t\displaystyle\leq\int_{1}^{\infty}\left(2r(t-1)M_{0}(\Sigma^{-1})e^{-\phi}\mathrm{cond}(H)\right)^{\zeta}\left(2r(t-1)M_{0}^{2}(\Sigma^{-1})\mathrm{cond}(H)^{\frac{5}{2}}e^{-\frac{3}{2}\phi}\right)^{1-\zeta}dt (116)
=2M02ζ(Σ1)cond(H)5232ζe(3212ζ)ϕ1r(t1)𝑑t\displaystyle=2M_{0}^{2-\zeta}(\Sigma^{-1})\mathrm{cond}(H)^{\frac{5}{2}-\frac{3}{2}\zeta}e^{-(\frac{3}{2}-\frac{1}{2}\zeta)\phi}\int_{1}^{\infty}r(t-1)dt (117)
2M02ζ(Σ1)cond(H)5232ζeϕ0r(t)𝑑t\displaystyle\leq 2M_{0}^{2-\zeta}(\Sigma^{-1})\mathrm{cond}(H)^{\frac{5}{2}-\frac{3}{2}\zeta}e^{-\phi}\int_{0}^{\infty}r(t)dt (118)
=2max{Σ1,Σ12}cond(H)52eϕ0r(t)𝑑t.\displaystyle=2\max\left\{\lVert\Sigma^{-1}\rVert,\lVert\Sigma^{-1}\rVert^{2}\right\}\mathrm{cond}(H)^{\frac{5}{2}}e^{-\phi}\int_{0}^{\infty}r(t)dt. (119)

Finally, by the definition of M1ζM_{1-\zeta} we have

𝒙,𝒙n,2fh(𝒙)2fh(𝒙)(T1+T2)𝒙𝒙1ζ,\forall\bm{x},\bm{x}^{\prime}\in\mathbb{R}^{n},\;\lVert\nabla^{2}f_{h}(\bm{x})-\nabla^{2}f_{h}(\bm{x}^{\prime})\rVert\leq(T_{1}+T_{2})\lVert\bm{x}-\bm{x}^{\prime}\rVert^{1-\zeta}, (120)

which proves the desired bound when combined with the bounds on T1T_{1} and T2T_{2} derived above.
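As a sanity check on the elementary integral estimate used for T1T_{1}, the following Python sketch (illustrative only; the values of ζ\zeta and ϕ\phi are arbitrary) compares the integral with the bound 2/ζ2/\zeta.

```python
# Illustrative numerical check of the elementary bound used for T_1 above:
# int_0^1 exp(-(3/2 - zeta/2)*phi*t) * t**(zeta/2 - 1) dt <= int_0^1 t**(zeta/2 - 1) dt = 2/zeta.
import numpy as np
from scipy.integrate import quad

for zeta in [0.4, 0.6, 0.8]:              # arbitrary values in (0, 1)
    for phi in [0.2, 1.0, 5.0]:           # arbitrary positive decay rates
        val, _ = quad(lambda t: np.exp(-(1.5 - 0.5 * zeta) * phi * t) * t ** (0.5 * zeta - 1.0),
                      0.0, 1.0)           # the singularity at t = 0 is integrable
        print(f"zeta={zeta:.1f} phi={phi:.1f}  integral={val:.4f}  2/zeta={2.0 / zeta:.4f}  ok={val <= 2.0 / zeta}")
```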

4.3 Proof of Lemma 4.2

Proof.

We have that

suphLip(1)|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|.\sup_{h\in\mathrm{Lip}(1)}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert\geq\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert. (121)

We will show that

suphLip(1)|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|.\sup_{h\in\mathrm{Lip}(1)}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert\leq\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert. (122)

Our proof is based on mollifying hLip(1)h\in\mathrm{Lip}(1). Let φϵ(𝒙)\varphi_{\epsilon}(\bm{x}) be the bump function defined by

φϵ(𝒙)={1ϵnCe11𝒙/ϵ2,𝒙/ϵ1,0,else,\varphi_{\epsilon}(\bm{x})=\left\{\begin{array}[]{cc}\frac{1}{\epsilon^{n}}Ce^{\frac{-1}{1-\lVert\bm{x}/\epsilon\rVert^{2}}},&\lVert\bm{x}/\epsilon\rVert\leq 1,\\ 0,&\text{else},\end{array}\right. (123)

where C=(ne11𝒙2𝑑𝒙)1C=\left(\int_{\mathbb{R}^{n}}e^{\frac{-1}{1-\lVert\bm{x}\rVert^{2}}}d\bm{x}\right)^{-1}. Suppose for contradiction that

suphLip(1)|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|>suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|,\sup_{h\in\mathrm{Lip}(1)}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert>\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert, (124)

in which case there exists hLip(1)h^{*}\in\mathrm{Lip}(1) such that hh^{*}\notin\mathcal{H} and

|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|=:a>0.\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}(\bm{Y})]\right\rvert-\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert=:a>0. (125)

Let ha/4=hφa/4h^{*}_{a/4}=h^{*}\star\varphi_{a/4}. We will show that ha/4h^{*}_{a/4}\in\mathcal{H} and that hha/4a/4\lVert h^{*}-h^{*}_{a/4}\rVert_{\infty}\leq a/4. Let ϵ=a/4\epsilon=a/4. Recall that

h^{*}_{\epsilon}(\bm{x})=\left(\varphi_{\epsilon}\star h^{*}\right)(\bm{x})=\int_{\mathbb{R}^{n}}\varphi_{\epsilon}(\bm{x}-\bm{y})h^{*}(\bm{y})\mu(d\bm{y}), (126)

where μ\mu is the Lebesgue measure. Observe that

\left\lvert h^{*}_{\epsilon}(\bm{x})-h^{*}_{\epsilon}(\bm{x}^{\prime})\right\rvert=\left\lvert\int_{\mathbb{R}^{n}}\varphi_{\epsilon}(\bm{y})\left(h^{*}(\bm{x}-\bm{y})-h^{*}(\bm{x}^{\prime}-\bm{y})\right)\mu(d\bm{y})\right\rvert, (127)
\leq\int_{\mathbb{R}^{n}}\left\lvert\varphi_{\epsilon}(\bm{y})\left(h^{*}(\bm{x}-\bm{y})-h^{*}(\bm{x}^{\prime}-\bm{y})\right)\right\rvert\mu(d\bm{y}), (128)
\leq\int_{\mathbb{R}^{n}}\varphi_{\epsilon}(\bm{y})\left\lvert h^{*}(\bm{x}-\bm{y})-h^{*}(\bm{x}^{\prime}-\bm{y})\right\rvert\mu(d\bm{y}), (129)
𝒙𝒙nφϵ(𝒚)μ(d𝒚),\displaystyle\leq\lVert\bm{x}-\bm{x}^{\prime}\rVert\int_{\mathbb{R}^{n}}\varphi_{\epsilon}(\bm{y})\mu(d\bm{y}), (130)
=𝒙𝒙,\displaystyle=\lVert\bm{x}-\bm{x}^{\prime}\rVert, (131)

and thus hϵLip(1)h^{*}_{\epsilon}\in\mathrm{Lip}(1). Next, let 𝜶𝒙𝜶\frac{\partial^{\bm{\alpha}}}{\partial\bm{x}^{\bm{\alpha}}} with 𝜶0n\bm{\alpha}\in\mathbb{Z}_{\geq 0}^{n} denote j=1nαjxjαj\prod_{j=1}^{n}\frac{\partial^{\alpha_{j}}}{\partial x_{j}^{\alpha_{j}}}, and observe that by the Dominated Convergence Theorem,

\frac{\partial^{\bm{\alpha}}h^{*}_{\epsilon}(\bm{x})}{\partial\bm{x}^{\bm{\alpha}}}=\int_{\mathbb{R}^{n}}\frac{\partial^{\bm{\alpha}}\varphi_{\epsilon}(\bm{x}-\bm{y})}{\partial\bm{x}^{\bm{\alpha}}}h^{*}(\bm{y})\mu(d\bm{y}), (132)

showing that hϵ(𝒙)C(n)h^{*}_{\epsilon}(\bm{x})\in C^{\infty}(\mathbb{R}^{n}), and furthermore that 𝜶hϵ(𝒙)𝒙𝜶\left\lVert\frac{\partial^{\bm{\alpha}}h^{*}_{\epsilon}(\bm{x})}{\partial\bm{x}^{\bm{\alpha}}}\right\rVert is bounded. We have thus shown that hϵh^{*}_{\epsilon}\in\mathcal{H}. Now, consider

|h(𝒙)ha/4(𝒙)|\displaystyle\left\lvert h^{*}(\bm{x})-h^{*}_{a/4}(\bm{x})\right\rvert =|nφa/4(𝒚)h(𝒙)μ(d𝒚)nφa/4(𝒚)h(𝒙𝒚)μ(d𝒚)|,\displaystyle=\left\lvert\int_{\mathbb{R}^{n}}\varphi_{a/4}(\bm{y})h^{*}(\bm{x})\mu(d\bm{y})-\int_{\mathbb{R}^{n}}\varphi_{a/4}(\bm{y})h^{*}(\bm{x}-\bm{y})\mu(d\bm{y})\right\rvert, (133)
n|φa/4(𝒚)(h(𝒙)h(𝒙𝒚))|μ(d𝒚),\displaystyle\leq\int_{\mathbb{R}^{n}}\left\lvert\varphi_{a/4}(\bm{y})\left(h^{*}(\bm{x})-h^{*}(\bm{x}-\bm{y})\right)\right\rvert\mu(d\bm{y}), (134)
𝒚a/4φa/4(𝒚)𝒚μ(d𝒚),\displaystyle\leq\int_{\lVert\bm{y}\rVert\leq a/4}\varphi_{a/4}(\bm{y})\lVert\bm{y}\rVert\mu(d\bm{y}), (135)
a4𝒚a/4φa/4(𝒚)μ(d𝒚),\displaystyle\leq\frac{a}{4}\int_{\lVert\bm{y}\rVert\leq a/4}\varphi_{a/4}(\bm{y})\mu(d\bm{y}), (136)
=a4,\displaystyle=\frac{a}{4}, (137)

where we have used the fact that φa/4(𝒚)=0\varphi_{a/4}(\bm{y})=0 for all 𝒚>a4\lVert\bm{y}\rVert>\frac{a}{4}, and the fact that hLip(1)h^{*}\in\mathrm{Lip}(1). We have that

|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]||𝔼𝑿ν[ha/4(𝑿)]𝔼𝒀ρ[ha/4(𝒀)]|\displaystyle\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}(\bm{Y})]\right\rvert-\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}_{a/4}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}_{a/4}(\bm{Y})]\right\rvert (138)
=|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]𝔼𝑿ν[ha/4(𝑿)]+𝔼𝒀ρ[ha/4(𝒀)]+𝔼𝑿ν[ha/4(𝑿)]𝔼𝒀ρ[ha/4(𝒀)]||𝔼𝑿ν[ha/4(𝑿)]𝔼𝒀ρ[ha/4(𝒀)]|,\displaystyle\begin{split}&=\left|\mathbb{E}_{\bm{X}\sim\nu}[h^{*}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}(\bm{Y})]-\mathbb{E}_{\bm{X}\sim\nu}[h^{*}_{a/4}(\bm{X})]+\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}_{a/4}(\bm{Y})]+\mathbb{E}_{\bm{X}\sim\nu}[h^{*}_{a/4}(\bm{X})]\right.\\ &\quad\quad\quad\left.-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}_{a/4}(\bm{Y})]\right|-\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}_{a/4}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}_{a/4}(\bm{Y})]\right\rvert,\end{split} (139)
|𝔼𝑿ν[h(𝑿)ha/4(𝑿)]|+|𝔼𝒀ρ[h(𝒀)ha/4(𝒀)]|,\displaystyle\leq\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}(\bm{X})-h^{*}_{a/4}(\bm{X})]\right\rvert+\left\lvert\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}(\bm{Y})-h^{*}_{a/4}(\bm{Y})]\right\rvert, (140)
𝔼𝑿ν[|h(𝑿)ha/4(𝑿)|]+𝔼𝒀ρ[|h(𝒀)ha/4(𝒀)|]a2,\displaystyle\leq\mathbb{E}_{\bm{X}\sim\nu}[\left\lvert h^{*}(\bm{X})-h^{*}_{a/4}(\bm{X})\right\rvert]+\mathbb{E}_{\bm{Y}\sim\rho}[\left\lvert h^{*}(\bm{Y})-h^{*}_{a/4}(\bm{Y})\right\rvert]\leq\frac{a}{2}, (141)

where the second to last inequality follows from Jensen’s Inequality and the last follows from the fact that hha/4a4\lVert h^{*}-h^{*}_{a/4}\rVert_{\infty}\leq\frac{a}{4}. Now, we have from (125) and (138) that

suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|+a=|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]||𝔼𝑿ν[ha/4(𝑿)]𝔼𝒀ρ[ha/4(𝒀)]|+a/2,\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert+a=\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}(\bm{Y})]\right\rvert\\ \leq\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}_{a/4}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}_{a/4}(\bm{Y})]\right\rvert+a/2, (142)

and thus

suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|<|𝔼𝑿ν[ha/4(𝑿)]𝔼𝒀ρ[ha/4(𝒀)]|,\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert<\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h^{*}_{a/4}(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h^{*}_{a/4}(\bm{Y})]\right\rvert, (143)

which cannot be true since ha/4h^{*}_{a/4}\in\mathcal{H}. Thus, by contradiction,

suphLip(1)|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|suph|𝔼𝑿ν[h(𝑿)]𝔼𝒀ρ[h(𝒀)]|,\sup_{h\in\mathrm{Lip}(1)}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert\leq\sup_{h\in\mathcal{H}}\left\lvert\mathbb{E}_{\bm{X}\sim\nu}[h(\bm{X})]-\mathbb{E}_{\bm{Y}\sim\rho}[h(\bm{Y})]\right\rvert, (144)

which completes the proof. ∎
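To illustrate the two mollification facts used in this proof, the following one-dimensional Python sketch (ours; the test function h(x)=|x|h(x)=|x|, the value of ϵ\epsilon, and the grid are arbitrary choices) convolves a 1-Lipschitz function with the bump kernel and checks that the result remains 1-Lipschitz and stays within ϵ\epsilon in supremum norm.

```python
# Illustrative 1-d check of the mollification facts used above: h_eps = h * phi_eps is
# still 1-Lipschitz and ||h - h_eps||_inf <= eps. The test function h(x) = |x| and the
# discretization are arbitrary choices.
import numpy as np

eps, dx = 0.25, 1e-3
x = np.arange(-3.0, 3.0, dx)
h = np.abs(x)                                        # a 1-Lipschitz test function

u = np.arange(-eps, eps + dx, dx)
kern = np.where(np.abs(u) < eps,
                np.exp(-1.0 / np.maximum(1.0 - (u / eps) ** 2, 1e-12)), 0.0)
kern /= kern.sum() * dx                              # normalize so the kernel integrates to 1

h_eps = np.convolve(h, kern, mode="same") * dx       # discrete mollification h * phi_eps

inner = (x > -2.0) & (x < 2.0)                       # ignore convolution edge effects
print("sup|h - h_eps| =", np.max(np.abs(h - h_eps)[inner]), "(should be <= eps =", eps, ")")
print("Lipschitz estimate of h_eps =", np.max(np.abs(np.diff(h_eps[inner])) / dx),
      "(should be <= 1, up to discretization error)")
```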

4.4 Proof of Theorem 3.2

We first give a lemma that uses a Foster-Lyapunov criterion [49] to conclude that Condition 3.2 holds.

Lemma 4.7.

Consider an SCRN 𝐗Ω(t)\bm{X}_{\Omega}(t) satisfying Assumptions 2.1 and 2.3, and suppose there exist V:{𝐱n|𝐱+𝐯𝒳}V:\{\bm{x}\in\mathbb{R}^{n}|\bm{x}+\bm{v}^{*}\in\mathcal{X}\}\rightarrow\mathbb{R}, a,b,γ,K>0a,b,\gamma,K>0, and Ω0\Omega_{0} such that a𝐱2V(𝐱)b𝐱2a\left\lVert\bm{x}\right\rVert^{2}\leq V(\bm{x})\leq b\left\lVert\bm{x}\right\rVert^{2} and for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}\right\} and all 𝐱{𝐱n|Ω(𝐱+𝐯)𝒳Ω}\bm{x}\in\{\bm{x}\in\mathbb{R}^{n}|\Omega(\bm{x}+\bm{v}^{*})\in\mathcal{X}_{\Omega}\},

G𝑿~~ΩV(𝒙)γ𝒙2+KΩ.G_{\tilde{\tilde{\bm{X}}}_{\Omega}}V(\bm{x})\leq-\gamma\lVert\bm{x}\rVert^{2}+\frac{K}{\Omega}. (145)

Then, Condition 3.2 is satisfied.

Proof.

We begin by showing that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}\right\}, 𝑿Ω(t)\bm{X}_{\Omega}(t) has a unique stationary distribution and is nonexplosive. If |𝒳Ω||\mathcal{X}_{\Omega}| is finite, then since 𝑿Ω(t)\bm{X}_{\Omega}(t) is irreducible by Assumption 2.3, 𝑿Ω(t)\bm{X}_{\Omega}(t) has a unique stationary distribution and is nonexplosive. Otherwise, observe that V(𝒙)V(\bm{x}) is radially unbounded due to the assumption that a𝒙2V(𝒙)a\left\lVert\bm{x}\right\rVert^{2}\leq V(\bm{x}). Additionally, by using the bound V(𝒙)b𝒙2V(\bm{x})\leq b\lVert\bm{x}\rVert^{2}, we have that

G𝑿~~ΩV(𝒙)γbV(𝒙)+KΩ.G_{\tilde{\tilde{\bm{X}}}_{\Omega}}V(\bm{x})\leq-\frac{\gamma}{b}V(\bm{x})+\frac{K}{\Omega}. (146)

Thus, by Theorem 7.1 in [49], 𝑿~~Ω(t)\tilde{\tilde{\bm{X}}}_{\Omega}(t) is exponentially ergodic, and thus has a unique stationary distribution and is nonexplosive. Now we show the desired moment bounds. We have that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}\right\},

G𝑿~~ΩV(𝒙)γ(𝒙2+1)+γ+KΩ.G_{\tilde{\tilde{\bm{X}}}_{\Omega}}V(\bm{x})\leq-\gamma(\lVert\bm{x}\rVert^{2}+1)+\gamma+\frac{K}{\Omega}. (147)

By Theorem 4.3 in [49], we therefore have that 𝔼[𝑿~~Ω2]+1γ+KΩγ\mathbb{E}\left[\left\lVert\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}\right\rVert^{2}\right]+1\leq\frac{\gamma+\frac{K}{\Omega}}{\gamma}, which implies that

𝔼[𝑿~~Ω2]KγΩ.\mathbb{E}\left[\left\lVert\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}\right\rVert^{2}\right]\leq\frac{K}{\gamma\Omega}. (148)

By Jensen’s inequality, 𝔼[𝑿~~Ω]KγΩ\mathbb{E}\left[\left\lVert\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty}\right\rVert\right]\leq\sqrt{\frac{K}{\gamma\Omega}}. Thus, the SCRN satisfies Condition 3.2. ∎
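For intuition, the O(1/Ω)O(1/\Omega) second-moment scaling in (148) can be seen explicitly on the simplest possible SCRN. The Python sketch below (ours; the production and degradation rates are arbitrary) uses the fact that the birth-death network with propensities Ωk1\Omega k_{1} and k2xk_{2}x has a Poisson stationary distribution with mean Ωk1/k2\Omega k_{1}/k_{2}.

```python
# Illustrative check of the O(1/Omega) second-moment scaling in (148) on a birth-death SCRN
# (0 -> X at rate Omega*k1, X -> 0 at rate k2*x; the rate constants are arbitrary choices),
# whose stationary distribution is Poisson(Omega*k1/k2).
k1, k2 = 2.0, 1.0
v_star = k1 / k2                                   # RRE equilibrium
for Omega in [10, 100, 1000, 10000]:
    poisson_mean = Omega * k1 / k2                 # mean = variance for a Poisson distribution
    second_moment = poisson_mean / Omega ** 2      # E[(X/Omega - v*)^2] = Var(X)/Omega^2
    print(f"Omega={Omega:6d}  E[||Xtt_Omega||^2]={second_moment:.2e}  (k1/k2)/Omega={v_star / Omega:.2e}")
```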

Building on Lemma 4.7, we now prove a lemma that relates the existence of a quadratic-type Lyapunov function for the RRE to Condition 3.2, by using this Lyapunov function as a Foster-Lyapunov function for 𝑿~~Ω(t)\tilde{\tilde{\bm{X}}}_{\Omega}(t).

Lemma 4.8.

Consider an SCRN 𝐗Ω(t)\bm{X}_{\Omega}(t) satisfying Assumptions 2.1, 2.2, and 2.3, and let 𝐯int(𝒳)\bm{v}^{*}\in\mathrm{int}(\mathcal{X}) be an equilibrium point of (12a). Suppose that there exist an open set 𝒟{𝐱n|𝐱+𝐯𝒳}\mathcal{D}\supseteq\left\{\bm{x}\in\mathbb{R}^{n}\middle|\bm{x}+\bm{v}^{*}\in\mathcal{X}\right\}, a twice continuously differentiable Lyapunov function V:𝒟0V:\mathcal{D}\rightarrow\mathbb{R}_{\geq 0} for the RRE, and constants a,b,γ>0a,b,\gamma>0 such that for all 𝐱{𝐱n|𝐱+𝐯𝒳}\bm{x}\in\left\{\bm{x}\in\mathbb{R}^{n}\middle|\bm{x}+\bm{v}^{*}\in\mathcal{X}\right\}, a𝐱2V(𝐱)b𝐱2a\lVert\bm{x}\rVert^{2}\leq V(\bm{x})\leq b\lVert\bm{x}\rVert^{2} and

\frac{\partial V(\bm{x})}{\partial\bm{x}}F(\bm{x}+\bm{v}^{*})\leq-\gamma\lVert\bm{x}\rVert^{2}, (149)

and additionally, there exists BB such that sup𝐱{𝐱n|𝐱+𝐯𝒳}2V(𝐱)B\sup_{\bm{x}\in\left\{\bm{x}\in\mathbb{R}^{n}\middle|\bm{x}+\bm{v}^{*}\in\mathcal{X}\right\}}\lVert\nabla^{2}V(\bm{x})\rVert\leq B. Then, Condition 3.2 is satisfied.

Proof.

Let η=1Ω\eta=\frac{1}{\Omega}, and for 𝒛𝒳Ω\bm{z}\in\mathcal{X}_{\Omega} let 𝒙=η𝒛𝒗\bm{x}=\eta\bm{z}-\bm{v}^{*}. Consider

G𝑿~~ΩV(𝒙)=i=1rλ¯i(𝒙+𝒗)1η(V(𝒙+η𝝃i)V(𝒙)).G_{\tilde{\tilde{\bm{X}}}_{\Omega}}V(\bm{x})=\sum_{i=1}^{r}\bar{\lambda}_{i}(\bm{x}+\bm{v}^{*})\frac{1}{\eta}\left(V(\bm{x}+\eta\bm{\xi}_{i})-V(\bm{x})\right). (150)

If we take the Taylor expansion in η\eta using the Lagrange form of the remainder we obtain

G𝑿~~ΩV(𝒙)=i=1rλ¯i(𝒙+𝒗)(V𝒙𝝃i+12η𝝃iT2V(𝒙+ϵ𝝃i)𝝃i),G_{\tilde{\tilde{\bm{X}}}_{\Omega}}V(\bm{x})=\sum_{i=1}^{r}\bar{\lambda}_{i}(\bm{x}+\bm{v}^{*})\left(\frac{\partial V}{\partial\bm{x}}\bm{\xi}_{i}+\frac{1}{2}\eta\bm{\xi}_{i}^{T}\nabla^{2}{V}(\bm{x}+\epsilon\bm{\xi}_{i})\bm{\xi}_{i}\right), (151)

where 0ϵη0\leq\epsilon\leq\eta depends on 𝒙\bm{x} and ii. Thus, for sufficiently large Ω\Omega we obtain

G_{\tilde{\tilde{\bm{X}}}_{\Omega}}V(\bm{x})\leq\frac{\partial V(\bm{x})}{\partial\bm{x}}F(\bm{x}+\bm{v}^{*})+\frac{K}{\Omega}(1+\lVert\bm{x}\rVert^{2})\leq-\frac{\gamma}{2}\lVert\bm{x}\rVert^{2}+\frac{K}{\Omega}, (152)

for some K>0K>0. Here we have used the fact that by Assumption 2.2, there exists KK^{\prime} such that |i=1rλ¯i(𝒙+𝒗)|K(1+𝒙2)\left\lvert\sum_{i=1}^{r}\bar{\lambda}_{i}(\bm{x}+\bm{v}^{*})\right\rvert\leq K^{\prime}(1+\lVert\bm{x}\rVert^{2}). Hence, by Lemma 4.7, Condition 3.2 is satisfied. ∎
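The inequality (152) is easy to visualize on a concrete network. The Python sketch below (ours; the network, rate constants, Ω\Omega, KK, and the grid of test points are all arbitrary choices) evaluates the generator of the centered birth-death chain applied to V(x)=x2V(x)=x^{2} and compares it with (γ/2)x2+K/Ω-(\gamma/2)x^{2}+K/\Omega.

```python
# Illustrative check of (152) for the birth-death SCRN 0 -> X (rate Omega*k1), X -> 0 (rate k2*x),
# centered at v* = k1/k2, with the quadratic Lyapunov function V(x) = x**2 (so gamma = 2*k2).
# The rates, Omega, K, and the grid of test points are arbitrary choices.
import numpy as np

k1, k2 = 2.0, 1.0
v_star = k1 / k2
Omega = 500.0
eta = 1.0 / Omega
gamma = 2.0 * k2
K = 20.0                                              # a constant large enough on the grid below

def generator_V(x):
    # G V(x) = sum_i lambda_bar_i(x + v*) * (V(x + eta*xi_i) - V(x)) / eta
    lam = np.array([k1, k2 * (x + v_star)])           # scaled propensities at x + v*
    xi = np.array([+1.0, -1.0])                       # reaction vectors
    return float(np.sum(lam * ((x + eta * xi) ** 2 - x ** 2) / eta))

for x in np.linspace(-v_star, 5.0, 6):
    gv = generator_V(x)
    bound = -0.5 * gamma * x ** 2 + K / Omega
    print(f"x={x:+.2f}  G V(x)={gv:+.4f}  -(gamma/2)x^2 + K/Omega={bound:+.4f}  ok={gv <= bound}")
```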

We now prove Theorem 3.2. We note that we only need to verify Condition 3.1 for κ=2\kappa=2, since we are assuming that the Hessian of λ¯i(𝑿)\bar{\lambda}_{i}(\bm{X}) is bounded.

Proof.

We begin by checking that Condition 3.1 holds. Let us consider the scaled Markov chain 𝒁Ω(t)=1Ω𝑿Ω(t)\bm{Z}_{\Omega}(t)=\frac{1}{\Omega}\bm{X}_{\Omega}(t) and the function V(𝒗)=(𝒄T𝒗)κ+1V(\bm{v})=\left(\bm{c}^{T}\bm{v}\right)^{\kappa+1}. We have that

G𝒁ΩV(𝒗)\displaystyle G_{\bm{Z}_{\Omega}}V(\bm{v}) =i=1rλi(Ω𝒗)(V(𝒗+1Ω𝝃i)V(𝒗))\displaystyle=\sum_{i=1}^{r}\lambda_{i}\left(\Omega\bm{v}\right)\left(V(\bm{v}+\frac{1}{\Omega}\bm{\xi}_{i})-V(\bm{v})\right) (153)
=i=1rΩλ¯i(𝒗)(V(𝒗+1Ω𝝃i)V(𝒗))\displaystyle=\sum_{i=1}^{r}\Omega\bar{\lambda}_{i}\left(\bm{v}\right)\left(V(\bm{v}+\frac{1}{\Omega}\bm{\xi}_{i})-V(\bm{v})\right) (154)
=i=1rΩλ¯i(𝒗)((𝒄T(𝒗+1Ω𝝃i))κ+1(𝒄T𝒗)κ+1)\displaystyle=\sum_{i=1}^{r}\Omega\bar{\lambda}_{i}\left(\bm{v}\right)\left(\left(\bm{c}^{T}(\bm{v}+\frac{1}{\Omega}\bm{\xi}_{i})\right)^{\kappa+1}-\left(\bm{c}^{T}\bm{v}\right)^{\kappa+1}\right) (155)

For κ=2\kappa=2, we have

Ω((𝒄T(𝒗+1Ω𝝃i))κ+1(𝒄T𝒗)κ+1)=Ω((𝒄T𝒗)3+3Ω(𝒄T𝒗)2𝒄T𝝃i+3Ω2𝒄T𝒗(𝒄T𝝃i)2+1Ω3(𝒄T𝝃i)3(𝒄T𝒗)3)\displaystyle\begin{split}\Omega\left(\left(\bm{c}^{T}(\bm{v}+\frac{1}{\Omega}\bm{\xi}_{i})\right)^{\kappa+1}-\left(\bm{c}^{T}\bm{v}\right)^{\kappa+1}\right)&=\Omega\left((\bm{c}^{T}\bm{v})^{3}+\frac{3}{\Omega}(\bm{c}^{T}\bm{v})^{2}\bm{c}^{T}\bm{\xi}_{i}\right.\\ &\left.+\frac{3}{\Omega^{2}}\bm{c}^{T}\bm{v}(\bm{c}^{T}\bm{\xi}_{i})^{2}+\frac{1}{\Omega^{3}}(\bm{c}^{T}\bm{\xi}_{i})^{3}-(\bm{c}^{T}\bm{v})^{3}\right)\end{split} (156)
=3(𝒄T𝒗)2𝒄T𝝃i+3Ω𝒄T𝒗(𝒄T𝝃i)2+1Ω2(𝒄T𝝃i)3\displaystyle=3(\bm{c}^{T}\bm{v})^{2}\bm{c}^{T}\bm{\xi}_{i}+\frac{3}{\Omega}\bm{c}^{T}\bm{v}(\bm{c}^{T}\bm{\xi}_{i})^{2}+\frac{1}{\Omega^{2}}(\bm{c}^{T}\bm{\xi}_{i})^{3} (157)

Noting that 𝒗V(𝒗)=(κ+1)(𝒄T𝒗)κ𝒄T\frac{\partial}{\partial\bm{v}}V(\bm{v})=(\kappa+1)(\bm{c}^{T}\bm{v})^{\kappa}\bm{c}^{T}, we have

G𝒁ΩV(𝒗)\displaystyle G_{\bm{Z}_{\Omega}}V(\bm{v}) =i=1rλ¯i(𝒗)(3(𝒄T𝒗)2𝒄T𝝃i+3Ω𝒄T𝒗(𝒄T𝝃i)2+1Ω2(𝒄T𝝃i)3)\displaystyle=\sum_{i=1}^{r}\bar{\lambda}_{i}\left(\bm{v}\right)\left(3(\bm{c}^{T}\bm{v})^{2}\bm{c}^{T}\bm{\xi}_{i}+\frac{3}{\Omega}\bm{c}^{T}\bm{v}(\bm{c}^{T}\bm{\xi}_{i})^{2}+\frac{1}{\Omega^{2}}(\bm{c}^{T}\bm{\xi}_{i})^{3}\right) (158)
=V(𝒗)𝒗F(𝒗)+3Ωi=1rλ¯i(𝒗)𝒄T𝒗(𝒄T𝝃i)2+1Ω2i=1rλ¯i(𝒗)(𝒄T𝝃i)3\displaystyle=\frac{\partial V(\bm{v})}{\partial\bm{v}}F(\bm{v})+\frac{3}{\Omega}\sum_{i=1}^{r}\bar{\lambda}_{i}\left(\bm{v}\right)\bm{c}^{T}\bm{v}(\bm{c}^{T}\bm{\xi}_{i})^{2}+\frac{1}{\Omega^{2}}\sum_{i=1}^{r}\bar{\lambda}_{i}\left(\bm{v}\right)(\bm{c}^{T}\bm{\xi}_{i})^{3} (159)

Thus, for sufficiently large Ω\Omega, and 𝒗0\bm{v}\geq 0 such that 𝒄T𝒗d\bm{c}^{T}\bm{v}\geq d, we have that G𝒁ΩV(𝒗)12γ2(𝒄T𝒗)3G_{\bm{Z}_{\Omega}}V(\bm{v})\leq-\frac{1}{2}\gamma_{2}(\bm{c}^{T}\bm{v})^{3}. Therefore,

G𝒁ΩV(𝒗)(12γ2(𝒄T𝒗)3+1)+max𝒗1Ω𝒳Ω:𝒄T𝒗d(G𝒁ΩV(𝒗)+12γ2(𝒄T𝒗)3+1),G_{\bm{Z}_{\Omega}}V(\bm{v})\leq-\left(\frac{1}{2}\gamma_{2}(\bm{c}^{T}\bm{v})^{3}+1\right)+\max\limits_{\bm{v}\in\frac{1}{\Omega}\mathcal{X}_{\Omega}:\bm{c}^{T}\bm{v}\leq d}\left(G_{\bm{Z}_{\Omega}}V(\bm{v})+\frac{1}{2}\gamma_{2}(\bm{c}^{T}\bm{v})^{3}+1\right), (160)

which shows by Theorem 7.1 in [49] that 𝒁Ω(t)\bm{Z}_{\Omega}(t) has a unique stationary distribution and that by Theorem 4.3 in [49] 𝔼[(𝒄T𝒁Ω)3]<\mathbb{E}[(\bm{c}^{T}\bm{Z}_{\Omega}^{\infty})^{3}]<\infty, and thus 𝑿~~Ω(t)\tilde{\tilde{\bm{X}}}_{\Omega}(t) has a unique stationary distribution and 𝔼[(𝒄T𝑿~~Ω)3]<\mathbb{E}[(\bm{c}^{T}\tilde{\tilde{\bm{X}}}_{\Omega}^{\infty})^{3}]<\infty. Therefore, Condition 3.1 is satisfied. We now verify that Condition 3.2 holds. We will need the following Lemma, which is a slight variation on the standard converse Lyapunov theorem, see e.g. [37].

Lemma 4.9.

Let 𝒳0n\mathcal{X}\subseteq\mathbb{R}^{n}_{\geq 0} be closed and let 𝒟n\mathcal{D}\subset\mathbb{R}^{n} be an open set such that 𝒳𝒟\mathcal{X}\subset\mathcal{D}. Let F¯:𝒟n\bar{F}:\mathcal{D}\rightarrow\mathbb{R}^{n} be a C2C^{2} function defining a vector field for which 𝒳\mathcal{X} is positively invariant. Suppose that for all 𝐱𝒳\bm{x}\in\mathcal{X}, the solution to

𝒙˙=F¯(𝒙),𝒙(0)=𝒙,\dot{\bm{x}}^{\prime}=\bar{F}(\bm{x}^{\prime}),\quad\bm{x}^{\prime}(0)=\bm{x}, (161)

ϕ(t;𝒙)\phi(t;\bm{x}), exists uniquely on [0,)[0,\infty) and that there exists K¯,γ¯1>0\bar{K},\bar{\gamma}_{1}>0 and 𝐱𝒳\bm{x}^{*}\in\mathcal{X} such that for all 𝐱𝒳\bm{x}\in\mathcal{X} and all t0t\geq 0,

ϕ(t;𝒙)𝒙K¯eγ¯1t𝒙𝒙,\lVert\phi(t;\bm{x})-\bm{x}^{*}\rVert\leq\bar{K}e^{-\bar{\gamma}_{1}t}\lVert\bm{x}-\bm{x}^{*}\rVert, (162)

and furthermore, there exist 𝒄¯>0n\bar{\bm{c}}\in\mathbb{R}^{n}_{>0} and d¯,γ¯2>0\bar{d},\bar{\gamma}_{2}>0 such that for all 𝐱{𝐱𝒳|𝐜¯T𝐱d¯}\bm{x}\in\left\{\bm{x}\in\mathcal{X}\middle|\bar{\bm{c}}^{T}\bm{x}\geq\bar{d}\right\},

\bar{\bm{c}}^{T}\bar{F}(\bm{x})\leq-\bar{\gamma}_{2}\bar{\bm{c}}^{T}\bm{x}. (163)

Then, for each dd¯d^{\prime}\geq\bar{d}, there exists W¯1:𝒟\overline{W}_{1}:\mathcal{D}\rightarrow\mathbb{R} twice continuously differentiable and a¯1,b¯1,γ¯>0\bar{a}_{1},\bar{b}_{1},\bar{\gamma}>0 such that for all 𝐱{𝐱𝒳|𝐜¯T𝐱d}\bm{x}\in\left\{\bm{x}\in\mathcal{X}\middle|\bar{\bm{c}}^{T}\bm{x}\leq d^{\prime}\right\},

a¯1𝒙𝒙2W¯1(𝒙)\displaystyle\bar{a}_{1}\lVert\bm{x}-\bm{x}^{*}\rVert^{2}\leq\overline{W}_{1}(\bm{x}) b¯1𝒙𝒙2,\displaystyle\leq\bar{b}_{1}\lVert\bm{x}-\bm{x}^{*}\rVert^{2}, (164)
W¯1(𝒙)𝒗F¯(𝒙)\displaystyle\frac{\partial\overline{W}_{1}(\bm{x})}{\partial\bm{v}}\bar{F}(\bm{x}) γ¯𝒙𝒙2.\displaystyle\leq-\bar{\gamma}\left\lVert\bm{x}-\bm{x}^{*}\right\rVert^{2}. (165)
Proof.

The proof that there exists W¯1\overline{W}_{1} satisfying the quadratic bounds proceeds analogously to the proof of Theorem 4.14 in [37], with the only adjustment being that we use {𝒙𝒳|𝒄¯T𝒙d}\left\{\bm{x}\in\mathcal{X}\middle|\bar{\bm{c}}^{T}\bm{x}\leq d^{\prime}\right\} as the domain instead of a ball around the origin. That W¯1C2\overline{W}_{1}\in C^{2} is assured by the form

W¯1(𝒙)=0sϕ(t;𝒙)Tϕ(t;𝒙)𝑑t,\overline{W}_{1}(\bm{x})=\int_{0}^{s}\phi(t;\bm{x})^{T}\phi(t;\bm{x})dt, (166)

where s>0s>0 is a suitably chosen constant, ϕ(t;𝒙)\phi(t;\bm{x}) is the solution to 𝒙˙=F¯(𝒙)\dot{\bm{x}}^{\prime}=\bar{F}(\bm{x}^{\prime}) with 𝒙(0)=𝒙\bm{x}^{\prime}(0)=\bm{x}, and ϕ(t;𝒙)C2\phi(t;\bm{x})\in C^{2} since F¯C2\bar{F}\in C^{2}, by e.g. Theorem 8.43 in [64]. ∎
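To illustrate the construction (166), the following Python sketch (ours; the Hurwitz matrix AA and the horizon ss are arbitrary choices) specializes to a linear field F¯(𝒙)=A𝒙\bar{F}(\bm{x})=A\bm{x}, in which case W¯1(𝒙)=𝒙TG𝒙\overline{W}_{1}(\bm{x})=\bm{x}^{T}G\bm{x} with G=0seATteAt𝑑tG=\int_{0}^{s}e^{A^{T}t}e^{At}dt, and checks the quadratic bounds and the decrease condition numerically.

```python
# Illustrative sketch (ours) of the converse-Lyapunov construction (166) for a linear field
# F_bar(x) = A*x: then phi(t;x) = e^{At} x and W1_bar(x) = x^T G x with G = int_0^s e^{A^T t} e^{A t} dt.
# The Hurwitz matrix A and the horizon s are arbitrary choices.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec

A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])
s = 5.0
G, _ = quad_vec(lambda t: expm(A.T * t) @ expm(A * t), 0.0, s)

eig_G = np.linalg.eigvalsh(G)
print("quadratic bounds:  a1 =", eig_G.min(), "  b1 =", eig_G.max())   # a1||x||^2 <= W <= b1||x||^2

# Along solutions, d/dt W(phi(t;x)) at t = 0 equals x^T (A^T G + G A) x, which should be <= -gamma ||x||^2.
Q = A.T @ G + G @ A                      # equals e^{A^T s} e^{A s} - I, negative definite here
gamma = -np.linalg.eigvalsh(Q).max()
print("decrease rate gamma =", gamma, "(positive, so W decreases along trajectories)")
```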

We now use Lemma 4.9 to construct a Lyapunov function W1W_{1}. Recall that γ1\gamma_{1}, KK, 𝒄\bm{c}, and dd are assumed to exist by the assumptions of Theorem 3.2. We apply Lemma 4.9 to F¯(𝒙)=F(𝒙)\bar{F}(\bm{x})=F(\bm{x}) with γ¯1=γ1\bar{\gamma}_{1}=\gamma_{1}, K¯=K\bar{K}=K, 𝒄¯=𝒄\bar{\bm{c}}=\bm{c}, d¯=d\bar{d}=d, 𝒙=𝒗\bm{x}^{*}=\bm{v}^{*}, and γ¯2=γ2\bar{\gamma}_{2}=\gamma_{2}, picking d>dd^{\prime}>d to obtain a function W¯1\overline{W}_{1} with associated values a¯1\bar{a}_{1}, b¯1\bar{b}_{1}, and γ¯\bar{\gamma}. Let W1(𝒗)=W¯1(𝒗)W_{1}(\bm{v})=\overline{W}_{1}(\bm{v}). In order to create a Lyapunov function on all of 𝒳\mathcal{X}, we define W2:𝒟W_{2}:\mathcal{D}\rightarrow\mathbb{R} as

W2(𝒗)=2b¯1((𝒄T𝒗)2(d)2).W_{2}(\bm{v})=2\bar{b}_{1}\left(\left(\bm{c}^{T}\bm{v}\right)^{2}-(d^{\prime})^{2}\right). (167)

We construct a global Lyapunov function by merging W1W_{1} near the origin with W2W_{2} far from the origin. We must do this in a way that creates a C2C^{2} function with bounded second derivative so that we can apply Lemma 4.8. Our approach is similar to a technique used to prove Theorem B.1 in [44]. To this end, let W:𝒟W:\mathcal{D}\rightarrow\mathbb{R} be defined by

W(𝒗)=max{W1(𝒗),W2(𝒗)}.W(\bm{v})=\max\left\{W_{1}(\bm{v}),W_{2}(\bm{v})\right\}. (168)

We have that for all 𝒗{𝒗𝒳|𝒄T𝒗<d}\bm{v}\in\left\{\bm{v}\in\mathcal{X}\middle|\bm{c}^{T}\bm{v}<d^{\prime}\right\}, W(𝒗)=W1(𝒗)W(\bm{v})=W_{1}(\bm{v}), and hence in this region,

W(𝒗)𝒗F(𝒗)\displaystyle\frac{\partial W(\bm{v})}{\partial\bm{v}}F(\bm{v}) γ¯𝒗𝒗2.\displaystyle\leq-\bar{\gamma}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (169)

Observe that due to the quadratic upper bound on W1(𝒗)W_{1}(\bm{v}), there exists d′′>dd^{\prime\prime}>d^{\prime} such that for all 𝒗{𝒗𝒳|𝒄T𝒗>d′′}\bm{v}\in\left\{\bm{v}\in\mathcal{X}\middle|\bm{c}^{T}\bm{v}>d^{\prime\prime}\right\}, W2(𝒗)>W1(𝒗)W_{2}(\bm{v})>W_{1}(\bm{v}), and thus in this region

W(𝒗)𝒗F(𝒗)\displaystyle\frac{\partial W(\bm{v})}{\partial\bm{v}}F(\bm{v}) 4b¯1γ2(𝒄T𝒗)2,\displaystyle\leq-4\bar{b}_{1}\gamma_{2}(\bm{c}^{T}\bm{v})^{2}, (170)
4b¯1γ2ici2vi2,\displaystyle\leq-4\bar{b}_{1}\gamma_{2}\sum_{i}c_{i}^{2}v_{i}^{2}, (171)
4b¯1γ2(minici2)ivi2,\displaystyle\leq-4\bar{b}_{1}\gamma_{2}\left(\min_{i}c_{i}^{2}\right)\sum_{i}v_{i}^{2}, (172)
4b¯1γ2(minici2)i(vi22vivi),\displaystyle\leq-4\bar{b}_{1}\gamma_{2}\left(\min_{i}c_{i}^{2}\right)\sum_{i}(v_{i}^{2}-2v_{i}v_{i}^{*}), (173)

where we have used that 𝒳0n\mathcal{X}\subseteq\mathbb{R}_{\geq 0}^{n} in the last line. Hence, there exists γ~2>0\tilde{\gamma}_{2}>0 such that for all 𝒗{𝒗𝒳|𝒄T𝒗>d′′}\bm{v}\in\left\{\bm{v}\in\mathcal{X}\middle|\bm{c}^{T}\bm{v}>d^{\prime\prime}\right\},

W(𝒗)𝒗F(𝒗)\displaystyle\frac{\partial W(\bm{v})}{\partial\bm{v}}F(\bm{v}) γ~2i(vi22vivi+(vi)2),\displaystyle\leq-\tilde{\gamma}_{2}\sum_{i}(v_{i}^{2}-2v_{i}v_{i}^{*}+(v^{*}_{i})^{2}), (174)
=γ~2𝒗𝒗2.\displaystyle=-\tilde{\gamma}_{2}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (175)

Let d1,d2,d3,d4d_{1},d_{2},d_{3},d_{4} be such that 𝒄T𝒗<d1<d2<d\bm{c}^{T}\bm{v}^{*}<d_{1}<d_{2}<d^{\prime} and d′′<d3<d4d^{\prime\prime}<d_{3}<d_{4}. Let S1={𝒗𝒳|𝒄T𝒗<d2}S_{1}=\left\{\bm{v}\in\mathcal{X}\middle|\bm{c}^{T}\bm{v}<d_{2}\right\}, S2={𝒗𝒳|d1<𝒄T𝒗<d4}S_{2}=\left\{\bm{v}\in\mathcal{X}\middle|d_{1}<\bm{c}^{T}\bm{v}<d_{4}\right\}, and S3={𝒗𝒳|d3<𝒄T𝒗}S_{3}=\left\{\bm{v}\in\mathcal{X}\middle|d_{3}<\bm{c}^{T}\bm{v}\right\}. Now consider 𝒗S2\bm{v}\in S_{2}. Let φϵ(𝒙)\varphi_{\epsilon}(\bm{x}) be the bump function defined by

φϵ(𝒙)={1ϵnCe11𝒙/ϵ2,𝒙/ϵ1,0,else,\varphi_{\epsilon}(\bm{x})=\left\{\begin{array}[]{cc}\frac{1}{\epsilon^{n}}Ce^{\frac{-1}{1-\lVert\bm{x}/\epsilon\rVert^{2}}},&\lVert\bm{x}/\epsilon\rVert\leq 1,\\ 0,&\text{else},\end{array}\right. (176)

where C=(ne11𝒙2𝑑𝒙)1C=\left(\int_{\mathbb{R}^{n}}e^{\frac{-1}{1-\lVert\bm{x}\rVert^{2}}}d\bm{x}\right)^{-1}, and let Wϵ(𝒗)=WφϵW_{\epsilon}(\bm{v})=W\star\varphi_{\epsilon}, where we treat W(𝒗)=0W(\bm{v})=0 for all 𝒗𝒟\bm{v}\notin\mathcal{D}. We have that

Wϵ(𝒗)𝒗F(𝒗)\displaystyle\frac{\partial W_{\epsilon}(\bm{v})}{\partial\bm{v}}F(\bm{v}) =𝒗nW(𝒗𝒔)φϵ(𝒔)μ(d𝒔)F(𝒗),\displaystyle=\frac{\partial}{\partial\bm{v}}\int_{\mathbb{R}^{n}}W(\bm{v}-\bm{s})\varphi_{\epsilon}(\bm{s})\mu(d\bm{s})\cdot F(\bm{v}), (177)

where μ\mu is the Lebesgue measure on n\mathbb{R}^{n}. For 𝝉n\bm{\tau}\in\mathbb{R}^{n}, we have that

𝒗W(𝒗𝒔)φϵ(𝒔)𝝉=limη0W(𝒗𝒔+η𝝉)W(𝒗𝒔)ηφϵ(𝒔),\frac{\partial}{\partial\bm{v}}W(\bm{v}-\bm{s})\varphi_{\epsilon}(\bm{s})\cdot\bm{\tau}=\lim_{\eta\rightarrow 0}\frac{W(\bm{v}-\bm{s}+\eta\bm{\tau})-W(\bm{v}-\bm{s})}{\eta}\varphi_{\epsilon}(\bm{s}), (178)

and for η>0\eta>0 we have

W(𝒗𝒔+η𝝉)W(𝒗𝒔)ηφϵ(𝒔)L𝝉max𝒔nφϵ(𝒔),\left\lVert\frac{W(\bm{v}-\bm{s}+\eta\bm{\tau})-W(\bm{v}-\bm{s})}{\eta}\varphi_{\epsilon}(\bm{s})\right\rVert\leq L\lVert\bm{\tau}\rVert\cdot\max_{\bm{s}\in\mathbb{R}^{n}}\lVert\varphi_{\epsilon}(\bm{s})\rVert, (179)

where LL is the Lipschitz constant of WW on S2S_{2} and max𝒔nφϵ(𝒔)=C/(eϵn)\max_{\bm{s}\in\mathbb{R}^{n}}\lVert\varphi_{\epsilon}(\bm{s})\rVert=C/(e\epsilon^{n}). By the Dominated Convergence Theorem and the fact that \lim_{\eta\rightarrow 0}\frac{W(\bm{v}-\bm{s}+\eta\bm{\tau})-W(\bm{v}-\bm{s})}{\eta}\varphi_{\epsilon}(\bm{s})=\frac{\partial W(\bm{v}-\bm{s})}{\partial\bm{v}}\varphi_{\epsilon}(\bm{s})\cdot\bm{\tau} almost everywhere,

Wϵ(𝒗)𝒗F(𝒗)=nW(𝒗𝒔)𝒗F(𝒗)φϵ(𝒔)μ(d𝒔).\frac{\partial W_{\epsilon}(\bm{v})}{\partial\bm{v}}F(\bm{v})=\int_{\mathbb{R}^{n}}\frac{\partial W(\bm{v}-\bm{s})}{\partial\bm{v}}\cdot F(\bm{v})\varphi_{\epsilon}(\bm{s})\mu(d\bm{s}). (180)

For sufficiently small ϵ\epsilon, there exists γ>0\gamma>0 such that for all 𝒔ϵ\lVert\bm{s}\rVert\leq\epsilon and 𝒗S2\bm{v}\in S_{2} such that WW is differentiable at 𝒗𝒔\bm{v}-\bm{s},

W(𝒗𝒔)𝒗F(𝒗)γ𝒗𝒗2.\frac{\partial W(\bm{v}-\bm{s})}{\partial\bm{v}}F(\bm{v})\leq-\gamma\lVert\bm{v}-\bm{v}^{*}\rVert^{2}. (181)

Thus, for 𝒗S2\bm{v}\in S_{2}, we have

Wϵ(𝒗)𝒗F(𝒗)γ𝒗𝒗2.\frac{\partial W_{\epsilon}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\gamma\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (182)

Now, let βi:𝒟[0,1]\beta_{i}:\mathcal{D}\rightarrow[0,1] for i{1,2,3}i\in\{1,2,3\} be a smooth partition of unity subordinate to {S1,S2,S3}\{S_{1},S_{2},S_{3}\}, and let

Ws(𝒗)=β1(𝒗)W1(𝒗)+β2(𝒗)Wϵ(𝒗)+β3(𝒗)W2(𝒗).W_{s}(\bm{v})=\beta_{1}(\bm{v})W_{1}(\bm{v})+\beta_{2}(\bm{v})W_{\epsilon}(\bm{v})+\beta_{3}(\bm{v})W_{2}(\bm{v}). (183)

We have that

\displaystyle\begin{split}\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})&=\left(\beta_{1}(\bm{v})\frac{\partial W_{1}(\bm{v})}{\partial\bm{v}}+\beta_{2}(\bm{v})\frac{\partial W_{\epsilon}(\bm{v})}{\partial\bm{v}}+\beta_{3}(\bm{v})\frac{\partial W_{2}(\bm{v})}{\partial\bm{v}}\right)F(\bm{v})\\ &\quad+\left(\frac{\partial\beta_{1}(\bm{v})}{\partial\bm{v}}W_{1}(\bm{v})+\frac{\partial\beta_{2}(\bm{v})}{\partial\bm{v}}W_{\epsilon}(\bm{v})+\frac{\partial\beta_{3}(\bm{v})}{\partial\bm{v}}W_{2}(\bm{v})\right)F(\bm{v}),\end{split} (184)
\displaystyle\begin{split}\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})&=\left(\beta_{1}(\bm{v})\frac{\partial W_{1}(\bm{v})}{\partial\bm{v}}+\beta_{2}(\bm{v})\frac{\partial W_{\epsilon}(\bm{v})}{\partial\bm{v}}+\beta_{3}(\bm{v})\frac{\partial W_{2}(\bm{v})}{\partial\bm{v}}\right)F(\bm{v})\\ &\quad+\left(\frac{\partial\beta_{1}(\bm{v})}{\partial\bm{v}}(W_{1}(\bm{v})-W(\bm{v}))+\frac{\partial\beta_{2}(\bm{v})}{\partial\bm{v}}(W_{\epsilon}(\bm{v})-W(\bm{v}))\right.\\ &\left.+\frac{\partial\beta_{3}(\bm{v})}{\partial\bm{v}}(W_{2}(\bm{v})-W(\bm{v}))\right)F(\bm{v}),\end{split} (185)

where we have used the fact that j=13βj(𝒗)𝒗=0\sum_{j=1}^{3}\frac{\partial\beta_{j}(\bm{v})}{\partial\bm{v}}=0.

Let us consider 𝒗𝒳(S1S2)\bm{v}\in\mathcal{X}\cap(S_{1}\setminus S_{2}). In this region, W(𝒗)=W1(𝒗)W(\bm{v})=W_{1}(\bm{v}), β2(𝒗)=β3(𝒗)=0\beta_{2}(\bm{v})=\beta_{3}(\bm{v})=0, and β2(𝒗)𝒗=β3(𝒗)𝒗=0\frac{\partial\beta_{2}(\bm{v})}{\partial\bm{v}}=\frac{\partial\beta_{3}(\bm{v})}{\partial\bm{v}}=0. Thus,

\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\bar{\gamma}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (186)

Now, let us consider 𝒗𝒳S1S2¯\bm{v}\in\overline{\mathcal{X}\cap S_{1}\cap S_{2}}. In this region, W(𝒗)=W1(𝒗)W(\bm{v})=W_{1}(\bm{v}), β3(𝒗)=0\beta_{3}(\bm{v})=0, and β3(𝒗)𝒗=0\frac{\partial\beta_{3}(\bm{v})}{\partial\bm{v}}=0, implying that

\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\min\{\gamma,\bar{\gamma}\}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}+\left\lVert\frac{\partial\beta_{2}(\bm{v})}{\partial\bm{v}}(W_{\epsilon}(\bm{v})-W(\bm{v}))\right\rVert. (187)

By e.g. Lemma B.2 in [44], Wϵ(𝒗)W(𝒗)W_{\epsilon}(\bm{v})\rightarrow W(\bm{v}) uniformly on the compact set 𝒳S1S2¯\overline{\mathcal{X}\cap S_{1}\cap S_{2}}. Additionally, observe that min𝒗𝒳S1S2¯𝒗𝒗2>0\min\limits_{\bm{v}\in\overline{\mathcal{X}\cap S_{1}\cap S_{2}}}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}>0, and thus for sufficiently small ϵ\epsilon, for all 𝒗𝒳S1S2¯\bm{v}\in\overline{\mathcal{X}\cap S_{1}\cap S_{2}},

\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\frac{1}{2}\min\{\gamma,\bar{\gamma}\}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (188)

Now, let us consider 𝒗𝒳(S2S1S3)¯\bm{v}\in\overline{\mathcal{X}\cap(S_{2}\setminus S_{1}\setminus S_{3})}. In this region, β1(𝒗)=β3(𝒗)=0\beta_{1}(\bm{v})=\beta_{3}(\bm{v})=0 and β1(𝒗)𝒗=β3(𝒗)𝒗=0\frac{\partial\beta_{1}(\bm{v})}{\partial\bm{v}}=\frac{\partial\beta_{3}(\bm{v})}{\partial\bm{v}}=0, and thus

Ws(𝒗)𝒗F(𝒗)γ𝒗𝒗2+β2(𝒗)𝒗(Wϵ(𝒗)W(𝒗)).\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\gamma\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}+\left\lVert\frac{\partial\beta_{2}(\bm{v})}{\partial\bm{v}}(W_{\epsilon}(\bm{v})-W(\bm{v}))\right\rVert. (189)

By noting that 𝒳(S2S1S3)¯\overline{\mathcal{X}\cap(S_{2}\setminus S_{1}\setminus S_{3})} is compact, and again using Lemma B.2 from [44] and the fact that
min𝒗𝒳(S2S1S3)¯𝒗𝒗2>0\min\limits_{\bm{v}\in\overline{\mathcal{X}\cap(S_{2}\setminus S_{1}\setminus S_{3})}}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}>0, we have that for sufficiently small ϵ\epsilon,

Ws(𝒗)𝒗F(𝒗)12γ𝒗𝒗2.\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\frac{1}{2}\gamma\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (190)

Let us consider 𝒗𝒳(S3S2)¯\bm{v}\in\overline{\mathcal{X}\cap(S_{3}\cap S_{2})}. In this region, β1(𝒗)=0\beta_{1}(\bm{v})=0 and W(𝒗)=W2(𝒗)W(\bm{v})=W_{2}(\bm{v}). Thus,

Ws(𝒗)𝒗F(𝒗)min{γ,γ~2}𝒗𝒗2+β2(𝒗)𝒗(Wϵ(𝒗)W(𝒗)).\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\min\{\gamma,\tilde{\gamma}_{2}\}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}+\left\lVert\frac{\partial\beta_{2}(\bm{v})}{\partial\bm{v}}(W_{\epsilon}(\bm{v})-W(\bm{v}))\right\rVert. (191)

Again using Lemma B.2 from [44], and the fact that min𝒗𝒳(S3S2)¯𝒗𝒗2>0\min\limits_{\bm{v}\in\overline{\mathcal{X}\cap(S_{3}\cap S_{2})}}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}>0, we have that for sufficiently small ϵ\epsilon,

Ws(𝒗)𝒗F(𝒗)12min{γ,γ~2}𝒗𝒗2.\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\frac{1}{2}\min\{\gamma,\tilde{\gamma}_{2}\}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (192)

Finally, consider 𝒗𝒳(S3S2)¯\bm{v}\in\overline{\mathcal{X}\cap(S_{3}\setminus S_{2})}. In this region, W(𝒗)=W2(𝒗)W(\bm{v})=W_{2}(\bm{v}), β1(𝒗)=β2(𝒗)=0\beta_{1}(\bm{v})=\beta_{2}(\bm{v})=0, and β1(𝒗)𝒗=β2(𝒗)𝒗=0\frac{\partial\beta_{1}(\bm{v})}{\partial\bm{v}}=\frac{\partial\beta_{2}(\bm{v})}{\partial\bm{v}}=0. Thus,

Ws(𝒗)𝒗F(𝒗)γ~2𝒗𝒗2.\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\tilde{\gamma}_{2}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (193)

Combining the preceding results for each subset of 𝒳\mathcal{X}, we have that for sufficiently small ϵ\epsilon, for all 𝒗𝒳\bm{v}\in\mathcal{X},

\frac{\partial W_{s}(\bm{v})}{\partial\bm{v}}F(\bm{v})\leq-\frac{1}{2}\min\{\gamma,\bar{\gamma},\tilde{\gamma}_{2}\}\left\lVert\bm{v}-\bm{v}^{*}\right\rVert^{2}. (194)

Observe that

\sup_{\bm{v}\in\mathcal{X}}\left\lVert\nabla^{2}{W_{s}}(\bm{v})\right\rVert=\max\left\{\max_{\bm{v}\in\overline{\mathcal{X}\cap(S_{1}\cup S_{2})}}\left\lVert\nabla^{2}{W_{s}}(\bm{v})\right\rVert,4\bar{b}_{1}\left\lVert\bm{c}\bm{c}^{T}\right\rVert\right\}, (195)

where \max_{\bm{v}\in\overline{\mathcal{X}\cap(S_{1}\cup S_{2})}}\left\lVert\nabla^{2}{W_{s}}(\bm{v})\right\rVert exists by the smoothness of WsW_{s} and the compactness of \overline{\mathcal{X}\cap(S_{1}\cup S_{2})}, while on 𝒳(S1S2)\mathcal{X}\setminus(S_{1}\cup S_{2}) we have Ws=W2W_{s}=W_{2}, whose Hessian is 4b¯1𝒄𝒄T4\bar{b}_{1}\bm{c}\bm{c}^{T}. Therefore, for some sufficiently small ϵ\epsilon, we can apply Lemma 4.8 with V(𝒗)=Ws(𝒗)V(\bm{v})=W_{s}(\bm{v}) to conclude that the SCRN satisfies Condition 3.2.

We have now verified Conditions 3.1 and 3.2. Observe that 0=F(𝒗)0=F(\bm{v}^{*}), i.e. 𝒗\bm{v}^{*} is an equilibrium point of (12a), and by Corollary 4.3 in [37], the global exponential stability of 𝒗\bm{v}^{*} implies that F(𝒗)𝒗\frac{\partial F(\bm{v}^{*})}{\partial\bm{v}} is Hurwitz. Thus, we can apply Theorem 3.1 to obtain the desired result. ∎

5 Examples

5.1 Antithetic motif

The antithetic feedback motif is a common SCRN encountered in synthetic biology because it can approximately implement an integrator [53, 12]. The antithetic motif is given by the following set of chemical reactions:

\emptyset\;\underset{k_{2}}{\overset{k_{1}}{\rightleftharpoons}}\;\mathrm{X}_{1},\qquad\emptyset\;\underset{k_{4}}{\overset{k_{3}}{\rightleftharpoons}}\;\mathrm{X}_{2},\qquad\mathrm{X}_{1}+\mathrm{X}_{2}\;\overset{k_{5}}{\longrightarrow}\;\emptyset. (196)

The reaction vectors and propensities are:

λ1(𝒙)\displaystyle\lambda_{1}(\bm{x}) =Ωk1,\displaystyle=\Omega k_{1}, 𝝃1=[+10]T,\displaystyle\bm{\xi}_{1}=\begin{bmatrix}+1&0\end{bmatrix}^{T}, (197)
λ2(𝒙)\displaystyle\lambda_{2}(\bm{x}) =k2x1,\displaystyle=k_{2}x_{1}, 𝝃2=[10]T,\displaystyle\bm{\xi}_{2}=\begin{bmatrix}-1&0\end{bmatrix}^{T}, (198)
λ3(𝒙)\displaystyle\lambda_{3}(\bm{x}) =Ωk3,\displaystyle=\Omega k_{3}, 𝝃3=[0+1]T,\displaystyle\bm{\xi}_{3}=\begin{bmatrix}0&+1\end{bmatrix}^{T}, (199)
λ4(𝒙)\displaystyle\lambda_{4}(\bm{x}) =k4x2,\displaystyle=k_{4}x_{2}, 𝝃4=[01]T,\displaystyle\bm{\xi}_{4}=\begin{bmatrix}0&-1\end{bmatrix}^{T}, (200)
λ5(𝒙)\displaystyle\lambda_{5}(\bm{x}) =k5Ωx1x2,\displaystyle=\frac{k_{5}}{\Omega}x_{1}x_{2}, 𝝃5=[11]T.\displaystyle\bm{\xi}_{5}=\begin{bmatrix}-1&-1\end{bmatrix}^{T}. (201)

This SCRN satisfies Assumptions 2.1, 2.2, and 2.3, with 𝒳Ω=02\mathcal{X}_{\Omega}=\mathbb{Z}_{\geq 0}^{2}. We show that we can apply Theorem 3.2. Observe that as shown in [9], the RREs

v˙1\displaystyle\dot{v}_{1} =k1k2v1k5v1v2,\displaystyle=k_{1}-k_{2}v_{1}-k_{5}v_{1}v_{2}, (203)
v˙2\displaystyle\dot{v}_{2} =k3k4v2k5v1v2,\displaystyle=k_{3}-k_{4}v_{2}-k_{5}v_{1}v_{2}, (204)

have a unique, globally exponentially stable equilibrium point \bm{v}^{*} in \mathbb{R}^{2}_{\geq 0}, with \bm{v}^{*}>0. Letting \bm{c}=\begin{bmatrix}1&1\end{bmatrix}^{T} we have that \bm{c}^{T}F(\bm{v})=k_{1}+k_{3}-k_{2}v_{1}-k_{4}v_{2}-2k_{5}v_{1}v_{2}\leq k_{1}+k_{3}-\min\{k_{2},k_{4}\}(v_{1}+v_{2}), and thus there exist d,\gamma>0 such that for all \bm{v}\in\left\{\bm{v}\in\mathbb{R}_{\geq 0}^{2}\middle|\bm{c}^{T}\bm{v}\geq d\right\}, \bm{c}^{T}F(\bm{v})\leq-\gamma\bm{c}^{T}\bm{v}. Therefore, by Theorem 3.2, there exist C,\Omega_{0}^{\prime}>0 such that for all \Omega\geq\Omega_{0}^{\prime},

𝒲1(𝑿~Ω,𝒀)ClnΩΩ.\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)\leq C\frac{\ln\Omega}{\sqrt{\Omega}}. (205)
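To make this example concrete, the following Python sketch (ours; the rate constants are arbitrary choices) computes the equilibrium 𝒗\bm{v}^{*}, confirms that the Jacobian is Hurwitz, and solves the Lyapunov equation for the stationary covariance of the fluctuation process 𝒀\bm{Y}, assuming, consistently with the notation used in the earlier sections, that Σ=Sdiag𝝀¯(𝒗)\Sigma=S\operatorname{diag}\sqrt{\bar{\bm{\lambda}}(\bm{v}^{*})}.

```python
# Illustrative computation (arbitrary rate constants) of v*, the Jacobian A = dF/dv(v*),
# and the stationary covariance P of the LNA fluctuation process for the antithetic motif,
# obtained from the Lyapunov equation A P + P A^T + Sigma Sigma^T = 0 with
# Sigma = S diag(sqrt(lambda_bar(v*))) (as the text's notation suggests).
import numpy as np
from scipy.optimize import fsolve
from scipy.linalg import solve_continuous_lyapunov

k1, k2, k3, k4, k5 = 1.0, 0.5, 1.2, 0.5, 2.0

def F(v):
    v1, v2 = v
    return np.array([k1 - k2 * v1 - k5 * v1 * v2,
                     k3 - k4 * v2 - k5 * v1 * v2])

v_star = fsolve(F, np.array([1.0, 1.0]))
v1, v2 = v_star
lam = np.array([k1, k2 * v1, k3, k4 * v2, k5 * v1 * v2])   # lambda_bar(v*)
S = np.array([[+1.0, -1.0, 0.0, 0.0, -1.0],
              [0.0, 0.0, +1.0, -1.0, -1.0]])               # columns are the reaction vectors xi_i
A = np.array([[-k2 - k5 * v2, -k5 * v1],
              [-k5 * v2, -k4 - k5 * v1]])                  # dF/dv at v*
Sigma = S @ np.diag(np.sqrt(lam))

P = solve_continuous_lyapunov(A, -Sigma @ Sigma.T)         # A P + P A^T = -Sigma Sigma^T

print("v* =", v_star)
print("eigenvalues of A:", np.linalg.eigvals(A))           # both have negative real part
print("LNA stationary covariance P =\n", P)
```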

5.2 Transcriptional activation

We consider a simple model of a protein PP that is transcriptionally regulated by RR binding to the promoter TT to form complex CC [14]:

\mathrm{R}+\mathrm{T}\;\underset{k_{2}}{\overset{k_{1}}{\rightleftharpoons}}\;\mathrm{C}, (206)
\mathrm{C}\;\overset{k_{3}}{\longrightarrow}\;\mathrm{C}+\mathrm{P}, (207)
\mathrm{P}\;\overset{k_{4}}{\longrightarrow}\;\emptyset. (208)

This SCRN has four species; however, we can use the two conservation laws R+C=RtotR+C=R_{tot} and T+C=TtotT+C=T_{tot} to obtain the following reduced model:

X1\displaystyle X_{1} =C\displaystyle=C (209)
X2\displaystyle X_{2} =P\displaystyle=P (210)

with reactions

λ1(𝒙)\displaystyle\lambda_{1}(\bm{x}) =k1Ω(Rtotx1)(Ttotx1),\displaystyle=\frac{k_{1}}{\Omega}(R_{tot}-x_{1})(T_{tot}-x_{1}), 𝝃1=[+10]T,\displaystyle\bm{\xi}_{1}=\begin{bmatrix}+1&0\end{bmatrix}^{T}, (211)
λ2(𝒙)\displaystyle\lambda_{2}(\bm{x}) =k2x1,\displaystyle=k_{2}x_{1}, 𝝃2=[10]T,\displaystyle\bm{\xi}_{2}=\begin{bmatrix}-1&0\end{bmatrix}^{T}, (212)
λ3(𝒙)\displaystyle\lambda_{3}(\bm{x}) =k3x1,\displaystyle=k_{3}x_{1}, 𝝃3=[0+1]T,\displaystyle\bm{\xi}_{3}=\begin{bmatrix}0&+1\end{bmatrix}^{T}, (213)
λ4(𝒙)\displaystyle\lambda_{4}(\bm{x}) =k4x2,\displaystyle=k_{4}x_{2}, 𝝃4=[01]T,\displaystyle\bm{\xi}_{4}=\begin{bmatrix}0&-1\end{bmatrix}^{T}, (214)

resulting in the RREs

v˙1\displaystyle\dot{v}_{1} =k1(R¯totv1)(T¯totv1)k2v1,\displaystyle=k_{1}(\bar{R}_{tot}-v_{1})(\bar{T}_{tot}-v_{1})-k_{2}v_{1}, (216)
v˙2\displaystyle\dot{v}_{2} =k3v1k4v2.\displaystyle=k_{3}v_{1}-k_{4}v_{2}. (217)

If we set Rtot=ΩR¯totR_{tot}=\Omega\bar{R}_{tot} and Ttot=ΩT¯totT_{tot}=\Omega\bar{T}_{tot}, with R¯tot,T¯tot\bar{R}_{tot},\bar{T}_{tot}\in\mathbb{Q}, the SCRN satisfies Assumptions 2.1 and 2.2 with 𝒳=[0,min{R¯tot,T¯tot}]×[0,)\mathcal{X}=[0,\min\{\bar{R}_{tot},\bar{T}_{tot}\}]\times[0,\infty). Additionally, the SCRN satisfies Assumption 2.3. The RREs have a unique, globally exponentially stable equilibrium point in 𝒳\mathcal{X}, and letting 𝒄=[01]T\bm{c}=\begin{bmatrix}0&1\end{bmatrix}^{T}, we have 𝒄TF(𝒗)=k3v1k4v2\bm{c}^{T}F(\bm{v})=k_{3}v_{1}-k_{4}v_{2}, and so there exists γ,d>0\gamma,d>0 such that for all 𝒗{𝒗𝒳|𝒄T𝒗d}\bm{v}\in\left\{\bm{v}\in\mathcal{X}\middle|\bm{c}^{T}\bm{v}\geq d\right\}, 𝒄TF(𝒗)γ𝒄T𝒗\bm{c}^{T}F(\bm{v})\leq-\gamma\bm{c}^{T}\bm{v}. Therefore, by Theorem 3.2, there exists C,Ω0>0C,\Omega_{0}^{\prime}>0 such that for all Ω{Ω𝒪|ΩΩ0}\Omega\in\left\{\Omega\in\mathcal{O}\middle|\Omega\geq\Omega_{0}^{\prime}\right\},

𝒲1(𝑿~Ω,𝒀)ClnΩΩ.\mathcal{W}_{1}\left(\tilde{\bm{X}}_{\Omega}^{\infty},\bm{Y}^{\infty}\right)\leq C\frac{\ln\Omega}{\sqrt{\Omega}}. (218)
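The linear drift condition used above admits explicit constants in this example. The Python sketch below (ours; the rate constants and total concentrations are arbitrary) checks 𝒄TF(𝒗)γ𝒄T𝒗\bm{c}^{T}F(\bm{v})\leq-\gamma\bm{c}^{T}\bm{v} on {𝒄T𝒗d}\{\bm{c}^{T}\bm{v}\geq d\} with the (non-unique) choices γ=k4/2\gamma=k_{4}/2 and d=2k3min{R¯tot,T¯tot}/k4d=2k_{3}\min\{\bar{R}_{tot},\bar{T}_{tot}\}/k_{4}.

```python
# Illustrative check of the linear drift condition for the reduced model, with c = [0, 1]^T,
# gamma = k4/2 and d = 2*k3*min(Rtot_bar, Ttot_bar)/k4 (explicit but non-unique choices).
# All rate constants and totals below are arbitrary.
import numpy as np

k1, k2, k3, k4 = 1.0, 2.0, 3.0, 1.0
Rtot_bar, Ttot_bar = 1.5, 2.0
m = min(Rtot_bar, Ttot_bar)                    # v1 stays in [0, m] on the state space X
gamma, d = k4 / 2.0, 2.0 * k3 * m / k4

def cTF(v1, v2):
    # c^T F(v) = k3*v1 - k4*v2 for c = [0, 1]^T
    return k3 * v1 - k4 * v2

worst = -np.inf
for v1 in np.linspace(0.0, m, 30):
    for v2 in np.linspace(d, d + 50.0, 30):    # a grid inside the region {c^T v >= d}
        worst = max(worst, cTF(v1, v2) + gamma * v2)
print("max over grid of c^T F(v) + gamma * c^T v =", worst, "(<= 0, as required)")
```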

6 Conclusion

In this work we studied the relationship between the stationary distributions of appropriately scaled Markov chains corresponding to SCRNs and the steady-state behavior of the RRE and LNA models, which are commonly used as approximations. Using Stein’s Method, we obtained a bound on the 1-Wasserstein distance between 𝑿~Ω=1Ω(𝑿ΩΩ𝒗)\tilde{\bm{X}}_{\Omega}^{\infty}=\frac{1}{\sqrt{\Omega}}\left(\bm{X}_{\Omega}^{\infty}-\Omega\bm{v}^{*}\right) and 𝒀\bm{Y}^{\infty}, the fluctuation term of the LNA, that is proportional to lnΩΩ\frac{\ln\Omega}{\sqrt{\Omega}}, and a bound on the 1-Wasserstein distance between 1Ω𝑿Ω\frac{1}{\Omega}\bm{X}_{\Omega}^{\infty} and 𝒗\bm{v}^{*} that is proportional to 1Ω\frac{1}{\sqrt{\Omega}}. Our main result requires one to control the second moment of 𝑿~Ω\tilde{\bm{X}}_{\Omega}^{\infty}, and we presented a Foster-Lyapunov based method to do so by analyzing just the RREs. Our condition requires global exponential stability of 𝒗\bm{v}^{*} for the RRE, along with the existence of a linear function of the state that decays exponentially along trajectories of the RRE far from the origin. While proving the global exponential stability of an equilibrium point of the RRE is in and of itself a challenging task, a large body of literature on the topic is available, since the RRE have been the subject of intense study over the years. In fact, recent advances in the global stability of equilibria of the RRE can potentially be leveraged via our results to analyze the error in the RRE and LNA for large volume [2, 1]. The authors foresee this work being applied to model reduction via timescale separation of SCRNs and to system identification algorithms, where the computational simplicity of the LNA makes it an attractive model. In both cases, the fact that we can obtain non-asymptotic error bounds on the LNA is critical to analyzing the error in the reduced models and in the identified system, respectively.

7 Acknowledgments

This work was supported in part by the U.S. AFOSR under grant FA9550-22-1-0316. The authors thank Dylan Hirsch for his comments on the proofs of Lemma 4.2 and Theorem 3.2, and thank Ruth J Williams for her helpful feedback on a preliminary version of this work.

References

  • [1] M Ali Al-Radhawi. Graphical characterizations of robust stability in biological interaction networks. Mathematics of Control, Signals, and Systems, pages 1–33, 2023.
  • [2] Muhammad Ali Al-Radhawi and David Angeli. New approach to the stability of chemical reaction networks: Piecewise linear in rates Lyapunov functions. IEEE Transactions on Automatic Control, 61(1):76–89, 2015.
  • [3] Uri Alon. An introduction to systems biology: design principles of biological circuits. Chapman and Hall/CRC, 2006.
  • [4] David F Anderson, Gheorghe Craciun, and Thomas G Kurtz. Product-form stationary distributions for deficiency zero chemical reaction networks. Bulletin of mathematical biology, 72:1947–1970, 2010.
  • [5] David F Anderson and Thomas G Kurtz. Stochastic analysis of biochemical systems, volume 674. Springer, 2015.
  • [6] Adam Arkin, John Ross, and Harley H McAdams. Stochastic Kinetic Analysis of Developmental Pathway Bifurcation in Phage λ\lambda-Infected Escherichia coli Cells. Genetics, 149(4):1633–1648, 08 1998.
  • [7] Andrew D Barbour. Stein's method and Poisson process convergence. Journal of Applied Probability, 25(A):175–184, 1988.
  • [8] Andrew D Barbour. Stein’s method for diffusion approximations. Probability theory and related fields, 84(3):297–322, 1990.
  • [9] Franco Blanchini and Elisa Franco. Structurally robust biological networks. BMC Systems Biology, 5(1):74, 2011.
  • [10] Anton Braverman and JG Dai. Stein's method for steady-state diffusion approximations of M/Ph/n+M systems. Annals of Applied Probability, 27(1):550–581, 2017.
  • [11] Anton Braverman, JG Dai, and Jiekun Feng. Stein's method for steady-state diffusion approximations: an introduction through the Erlang-A and Erlang-C models. Stochastic Systems, 6(2):301–366, 2017.
  • [12] Corentin Briat, Ankit Gupta, and Mustafa Khammash. Antithetic integral feedback ensures robust perfect adaptation in noisy biomolecular networks. Cell Systems, 2(1):15–26, 2016.
  • [13] Sourav Chatterjee. A short survey of Stein's method. In Proceedings of the International Congress of Mathematicians, pages 1–24, 2014.
  • [14] D. Del Vecchio and R. M. Murray. Biomolecular Feedback Systems. Princeton University Press, 1st edition, 2014.
  • [15] Charles Desoer and Hiromasa Haneda. The measure of a matrix as a tool to analyze computer algorithms for circuit analysis. IEEE Transactions on Circuit Theory, 19(5):480–486, 1972.
  • [16] Joseph L Doob. Markoff chains—denumerable case. Transactions of the American Mathematical Society, 58:455–473, 1945.
  • [17] Werner Ehm. Binomial approximation to the Poisson binomial distribution. Statistics & Probability Letters, 11(1):7–16, 1991.
  • [18] Johan Elf and Måns Ehrenberg. Fast evaluation of fluctuations in biochemical networks with the linear noise approximation. Genome research, 13(11):2475–2484, 2003.
  • [19] M. B. Elowitz and S. Leibler. A synthetic oscillatory network of transcriptional regulators. Nature, 403:335–338, 2000.
  • [20] Michael B Elowitz, Arnold J Levine, Eric D Siggia, and Peter S Swain. Stochastic gene expression in a single cell. Science, 297(5584):1183–1186, 2002.
  • [21] Stewart N Ethier and Thomas G Kurtz. Markov processes: characterization and convergence. John Wiley & Sons, 2009.
  • [22] Paul Fearnhead, Vasilieos Giagos, and Chris Sherlock. Inference for reaction networks using the linear noise approximation. Biometrics, 70(2):457–466, 2014.
  • [23] Bärbel Finkenstädt, Dan J Woodcock, Michal Komorowski, Claire V Harper, Julian RE Davis, Mike RH White, and David A Rand. Quantifying intrinsic and extrinsic noise in gene transcription using the linear noise approximation: An application to single cell data. The Annals of Applied Statistics, pages 1960–1982, 2013.
  • [24] T.S. Gardner, C.R. Cantor, and J.J. Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403:339–342, 2000.
  • [25] Alison L Gibbs and Francis Edward Su. On choosing and bounding probability metrics. International statistical review, 70(3):419–435, 2002.
  • [26] Daniel T Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of computational physics, 22(4):403–434, 1976.
  • [27] Daniel T Gillespie. A rigorous derivation of the chemical master equation. Physica A: Statistical Mechanics and its Applications, 188(1-3):404–425, 1992.
  • [28] Daniel T. Gillespie. The chemical Langevin equation. The Journal of Chemical Physics, 113(1):297–306, 2000.
  • [29] Peter W Glynn and Assaf Zeevi. Bounding stationary expectations of Markov processes. In Markov processes and related topics: a Festschrift for Thomas G. Kurtz, pages 195–214. Institute of Mathematical Statistics, 2008.
  • [30] Jackson Gorham, Andrew B Duncan, Sebastian J Vollmer, and Lester Mackey. Measuring sample quality with diffusions. The Annals of Applied Probability, 29(5):2884–2928, 2019.
  • [31] Ramon Grima. Linear-noise approximation and the chemical master equation agree up to second-order moments for a class of chemical systems. Phys. Rev. E, 92:042124, Oct 2015.
  • [32] Ankit Gupta, Corentin Briat, and Mustafa Khammash. A scalable computational framework for establishing long-term behavior of stochastic reaction networks. PLoS computational biology, 10(6):e1003669, 2014.
  • [33] Itai Gurvich. Diffusion models and steady-state approximations for exponentially ergodic Markovian queues. Annals of Applied Probability, 24(6):2527–2559, 2014.
  • [34] Dylan Hirsch, Theodore W Grunberg, and Domitilla Del Vecchio. Error bound for Hill-function approximations in a class of stochastic transcriptional network models. In Proceedings of the 62nd IEEE Conference on Decision and Control, Singapore, 2023. IEEE.
  • [35] NG van Kampen. A power series expansion of the master equation. Canadian Journal of Physics, 39(4):551–567, 1961.
  • [36] Hye-Won Kang and Thomas G Kurtz. Separation of time-scales and model reduction for stochastic reaction networks. The Annals of Applied Probability, 23(2):529–583, 2013.
  • [37] Hassan K Khalil. Nonlinear Systems. Prentice Hall, Upper Saddle River, NJ, 3rd edition, 2002.
  • [38] Michał Komorowski, Bärbel Finkenstädt, Claire V Harper, and David A Rand. Bayesian inference of biochemical kinetic parameters using the linear noise approximation. BMC bioinformatics, 10:1–10, 2009.
  • [39] Thomas G Kurtz. The relationship between stochastic and deterministic models for chemical reactions. The Journal of Chemical Physics, 57(7):2976–2978, 1972.
  • [40] Thomas G Kurtz. Limit theorems and diffusion approximations for density dependent Markov chains. Stochastic Systems: Modeling, Identification and Optimization, I, pages 67–78, 1976.
  • [41] Thomas G Kurtz. Strong approximation theorems for density dependent Markov chains. Stochastic Processes and their Applications, 6(3):223–240, 1978.
  • [42] Saul C Leite and Ruth J Williams. A constrained Langevin approximation for chemical reaction networks. The Annals of Applied Probability, 29(3):1541–1608, 2019.
  • [43] Ioannis Lestas, Johan Paulsson, Nicholas E. Ross, and Glenn Vinnicombe. Noise in gene regulatory networks. IEEE Transactions on Automatic Control, 53(Special Issue):189–200, 2008.
  • [44] Yuandan Lin, Eduardo D Sontag, and Yuan Wang. A smooth converse Lyapunov theorem for robust stability. SIAM Journal on Control and Optimization, 34(1):124–160, 1996.
  • [45] Winfried Lohmiller and Jean-Jacques E Slotine. Nonlinear process control using contraction theory. AIChE journal, 46(3):588–596, 2000.
  • [46] Harley H. McAdams and Adam Arkin. Stochastic mechanisms in gene expression. Proceedings of the National Academy of Sciences, 94(3):814–819, 1997.
  • [47] Bence Mélykúti, Joao P Hespanha, and Mustafa Khammash. Equilibrium distributions of simple biochemical reaction systems for time-scale separation in stochastic reaction networks. Journal of The Royal Society Interface, 11(97):20140054, 2014.
  • [48] X Flora Meng, Ania-Ariadna Baetica, Vipul Singhal, and Richard M Murray. Recursively constructing analytic expressions for equilibrium distributions of stochastic biochemical reaction networks. Journal of The Royal Society Interface, 14(130):20170157, 2017.
  • [49] Sean P Meyn and Richard L Tweedie. Stability of Markovian processes II: Continuous-time processes and sampled chains. Advances in Applied Probability, 25(3):487–517, 1993.
  • [50] Johan Paulsson. Summing up the noise in gene networks. Nature, 427(6973):415–418, 2004.
  • [51] Johan Paulsson. Models of stochastic gene expression. Physics of life reviews, 2(2):157–175, 2005.
  • [52] Johan Paulsson and Måns Ehrenberg. Random signal fluctuations can reduce random fluctuations in regulated components of chemical regulatory networks. Physical review letters, 84(23):5447, 2000.
  • [53] Yili Qian and Domitilla Del Vecchio. Realizing integral control in living cells: how to overcome leaky integration due to dilution? Journal of The Royal Society Interface, 15(139), 2018.
  • [54] David Schnoerr, Guido Sanguinetti, and Ramon Grima. The complex chemical Langevin equation. The Journal of chemical physics, 141(2), 2014.
  • [55] Abhyudai Singh. Negative feedback through mRNA provides the best control of gene-expression noise. IEEE transactions on nanobioscience, 10(3):194–200, 2011.
  • [56] Abhyudai Singh and Joao P Hespanha. Optimal feedback strength for noise suppression in autoregulatory gene networks. Biophysical journal, 96(10):4013–4023, 2009.
  • [57] Eduardo D Sontag and Abhyudai Singh. Exact moment dynamics for feedforward nonlinear chemical reaction networks. IEEE Life Sciences Letters, 1(2):26–29, 2015.
  • [58] Charles Stein. Approximate Computation of Expectations, volume 7 of IMS Lecture Notes Monograph Series. Institute of Mathematical Statistics, Hayward, CA, 1986.
  • [59] Charles Stein. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, volume 6, pages 583–603. University of California Press, 1972.
  • [60] Yi Tao, Xiudeng Zheng, and Yuehua Sun. Effect of feedback regulation on stochastic gene expression. Journal of theoretical biology, 247(4):827–836, 2007.
  • [61] Tianhai Tian and Kevin Burrage. Stochastic models for regulatory networks of the genetic toggle switch. Proceedings of the National Academy of Sciences, 103(22):8372–8377, 2006.
  • [62] Nicolaas Godfried Van Kampen. Stochastic processes in physics and chemistry, volume 1. Elsevier, 1992.
  • [63] Cédric Villani. Optimal transport: old and new, volume 338. Springer, 2009.
  • [64] Walter G Kelley and Allan C Peterson. The theory of differential equations: Classical and qualitative, 2010.
  • [65] Leor S. Weinberger, John C. Burnett, Jared E. Toettcher, Adam P. Arkin, and David V. Schaffer. Stochastic gene expression in a lentiviral positive-feedback loop: HIV-1 Tat fluctuations drive phenotypic diversity. Cell, 122(2):169–182, 2005.
  • [66] Peter Whittle. Systems in stochastic equilibrium. John Wiley & Sons, Inc., 1986.
  • [67] Darren J Wilkinson. Stochastic modelling for quantitative description of heterogeneous biological systems. Nature Reviews Genetics, 10(2):122–133, 2009.