A Stein’s Method Approach to the Linear Noise Approximation for Stationary Distributions of Chemical Reaction Networks
Abstract
Stochastic Chemical Reaction Networks are continuous time Markov chain models that describe the time evolution of the molecular counts of species interacting stochastically via discrete reactions. Such models are ubiquitous in systems and synthetic biology, but often have a large or infinite number of states, and thus are not amenable to computation and analysis. Due to this, approximations that rely on the molecular counts and the volume being large are commonly used, with the most common being the Reaction Rate Equations and the Linear Noise Approximation. For finite time intervals, Kurtz established the validity of the Reaction Rate Equations and Linear Noise Approximation, by proving law of large numbers and central limit theorem results respectively. However, the analogous question for the stationary distribution of the Markov chain model has remained mostly unanswered, except for chemical reaction networks with special structures or bounded molecular counts. In this work, we use Stein’s Method to obtain sufficient conditions for the stationary distribution of an appropriately scaled Stochastic Chemical Reaction Network to converge to the Linear Noise Approximation as the system size goes to infinity. Our results give non-asymptotic bounds on the error in the Reaction Rate Equations and in the Linear Noise Approximation as applied to the stationary distribution. As a special case, we give conditions under which the global exponential stability of an equilibrium point of the Reaction Rate Equations is sufficient to obtain our error bounds, thus permitting one to obtain conclusions about the Markov chain by analyzing the deterministic Reaction Rate Equations.
1 Introduction
1.1 Background and existing results
Systems composed of a set of chemical species interacting via a finite set of reactions are most often modeled as a Stochastic Chemical Reaction Network (SCRN), a continuous time Markov chain (CTMC) model for how the molecular counts of each species change over time. Such models have found widespread use in systems and synthetic biology as a method to capture the counts of relevant biomolecules within a single cell, where due to the small numbers of molecules present, the effect of stochasticity in the system’s evolution can be profound [67, 61, 52, 46, 6, 20, 65]. The time evolution of the probability distribution of molecular counts can be computed via the Kolmogorov forward equations, often called the Chemical Master Equation [27], or by a Monte Carlo approach using the Doob-Gillespie Algorithm [16, 26]. However, due to the very large, and often countably infinite, number of states in the Markov chain, simpler approximations have been proposed, the most prominent of which exploit the fact that for many physical systems the volume in which the reactions occur and the molecular counts of all species are large. These approximations are formalized by considering a family of Markov chains , where is the vector of molecular counts at time and is the volume. The Reaction Rate Equation (RRE) model is an Ordinary Differential Equation (ODE), whose solution is a deterministic approximation for , rigorously justified by Kurtz on finite time intervals in the sense that under mild technical conditions, uniformly on any finite time interval [39]. Due to being the solution to an ODE, the computational and analytical simplicity of the RRE model has led to its widespread use in both modeling in systems biology [3] and design in synthetic biology [14, 24, 19]. In essence, the RRE model approximates the concentrations by a constant at each point in time, neglecting any stochastic fluctuation away from this value.
In order to account for these fluctuations, Van Kampen expanded the Forward Kolmogorov Equations in powers of [35, 62]. The first order approximation he derived in this manner, called the Linear Noise Approximation (LNA), adds a correction term to the RREs, specifically a Gaussian approximation to fluctuations of about . The LNA was rigorously justified by Kurtz in the sense of giving technical conditions for the central limit theorem result to hold uniformly on finite time intervals with an appropriate covariance matrix . The LNA is similar to the RREs in that the approximating distribution can be easily computed from the solution to an ODE, and due to this simplicity it has been used widely to quantify noise in natural biological systems [50, 51, 18, 60], as well as for design [56, 55] and inference [38, 22, 23] in synthetic biology.
However, a key weakness of the theoretical justification for the RREs and the LNA is that the results given in [39] do not say anything about the relationship between the stationary distribution of the Markov chain model and the steady state behavior of the RREs or LNA, even when the stationary distribution exists uniquely for all and the RREs/LNA has a well defined limiting behavior. In a later work, Kurtz gave a sufficient condition for the uniform in time convergence of to and of to [40]. However, that result requires that the state space of be bounded uniformly in , which rules out the analysis of many biological systems of interest where the molecular counts of certain species can grow unbounded, and thus for all , evolves on a countably infinite subset of the integer lattice. This technical difficulty in using the RREs or LNA to approximate the stationary distribution of an SCRN has long been understood [62], and in fact it is often the case that the Markov chain and RREs/LNA models have different long term behavior [43]. In this work we give sufficient conditions under which there exists a constant such that for all sufficiently large ,
(1) |
where is distributed according to the stationary distribution of , is an appropriate equilibrium point of the RRE, with a covariance matrix computed via the LNA, and is the 1-Wasserstein distance between the laws of the two random variables. This result immediately implies both that the Markov chain’s stationary distribution converges to an equilibrium point of the RRE in the sense , where is an appropriate equilibrium of the RRE, and that [25]. We give sufficient conditions in terms of the second moment of for (1) to hold, and additionally give sufficient conditions on only the RRE model for (1) to hold, which include global exponential stability of . Interestingly, the convergence rate given by (1) is identical to the convergence rate shown by Kurtz on finite time intervals [41].
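To make the advertised rate concrete, consider a hypothetical one-species birth-death network (production at rate kV and unit per-molecule degradation; this example and its rates are our illustration, not taken from the results above), whose stationary distribution is Poisson with mean kV. In one dimension the 1-Wasserstein distance equals the integral of the absolute difference of the CDFs, so the distance between the centered, √V-scaled stationary law and the limiting Gaussian can be computed numerically and compared against the O(V^{-1/2}) rate:

```python
import numpy as np
from scipy import stats

def w1_to_gaussian(V, lam=1.0):
    """W1 between (N - V*lam)/sqrt(V), N ~ Poisson(V*lam), and N(0, lam).

    Uses the 1-D identity W1(mu, nu) = integral over t of |F_mu(t) - F_nu(t)|,
    approximated by a Riemann sum on a fine grid."""
    t = np.linspace(-8.0, 8.0, 16001)
    F = stats.poisson.cdf(np.floor(V * lam + t * np.sqrt(V)), mu=V * lam)
    G = stats.norm.cdf(t, loc=0.0, scale=np.sqrt(lam))
    return np.sum(np.abs(F - G)) * (t[1] - t[0])

errs = [w1_to_gaussian(V) for V in (10, 40, 160)]
# The errors should shrink roughly like 1/sqrt(V): each 4x increase
# in V should cut the Wasserstein error roughly in half.
```

The grid truncation at ±8 standard deviations is harmless here since both tails are negligible beyond that range.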
The primary technique we use to prove our results is Stein’s Method [13]. Stein’s Method is a powerful technique for proving central limit theorems introduced by Stein in 1972 [59, 58], and used in a wide variety of situations since then, such as when the limiting distribution is Poisson [7] or Binomial [17], as well as for convergence to a diffusion process [8]. The most closely related work to our study comes from queueing theory, where Braverman et al., using Stein’s Method, and Gurvich, using a closely related technique, studied diffusion approximations of certain queueing systems in the heavy traffic regime [11, 10, 33]. Our application of Stein’s method follows that of [11] at a high level, where the key difference is that our family of CTMCs and limiting process is in dimensions instead of one. In order to handle this, we use results on Stein Factors from [30].
1.2 Related work
Although rigorous work on approximating the stationary distribution of SCRNs in the large volume limit has been limited, a variety of related ideas have been presented in the literature. In particular, when the microscopic rates are all affine, it is known that the LNA matches the first and second moments of for all time [62, 43]. This idea has been extended to a restricted class of SCRNs with nonlinear propensities in [31]. Additionally, Sontag and Singh gave a special class of SCRNs where the moments can be exactly computed even when they may not match the moments of the LNA exactly [57]. However, in these cases it is still unclear if the stationary distribution of converges to the Gaussian given by the LNA in the large volume limit. In fact, the results we present allow one to rigorously go from convergence of to with respect to the first and second moments, to convergence in 1-Wasserstein distance. We can thus strengthen the results of e.g. [31] with limited additional work. We note that methods for establishing convergence of the stationary distributions of a family of Markov Processes to the stationary distribution of a limiting process are given in [21]. However, not only have these methods not been successfully applied to SCRNs, but also our Stein’s Method approach yields non-asymptotic bounds on the approximation error in 1-Wasserstein distance.
In certain special cases, such as detailed or complex balance, the stationary distribution of can be computed in closed form, and thus in principle one could directly analyze the approximation error when using the LNA [66, 4, 48]. However, for most SCRNs of interest in biology, such results are not applicable. For example, a simple two step transcription-translation model for protein expression is not detailed or complex balanced [14], and thus the results in [66, 4] cannot be used to obtain a closed form expression for the stationary distribution, whereas the results of [48] are only exact for systems where has finitely many states.
We note that the LNA is only one diffusion approximation for SCRNs, and in fact the Chemical Langevin Equation (CLE) is often a more accurate approximation to for moderately large volumes [28]. The convergence of to the CLE on finite time intervals was established by Kurtz [41], but using the CLE to approximate the stationary distribution of has remained mostly unstudied. In fact, the CLE often exhibits finite time breakdown and thus does not have a well defined stationary distribution, a problem that has been tackled by allowing complex values in the Stochastic Differential Equation [54], as well as by imposing the physical constraint that concentrations are nonnegative [42].
1.3 Outline of the rest of the paper
The rest of the paper is as follows: In Section 2 we give notation and mathematical background, and then introduce the three relevant models. In Section 3 we state our main results, the proofs of which appear in Section 4. In Section 5 we illustrate our results with several examples. In Section 6 we provide concluding remarks and future research directions.
2 Problem Setting
In this section we first introduce the notation we will use, and then introduce SCRNs and the associated CTMC model along with the Reaction Rate Equations and Linear Noise Approximation models. We also introduce the 1-Wasserstein distance, which we will use to measure the distance between probability distributions.
2.1 Notation
We denote (column) vectors by bold symbols , with the elements indexed as . Let [] denote the nonnegative [positive] integer lattice in dimensions. Let denote the set of rational numbers. Let , , and denote n-dimensional Euclidean space, and the nonnegative and positive orthants of the same respectively. For we denote the standard 2-norm by , and for a matrix we denote by the induced 2-norm , and denote by the condition number of . When all eigenvalues of have strictly negative real parts we say that is Hurwitz. Given a function , we denote by , , and the row vector of partial derivatives, the gradient, and the Hessian of respectively. We denote the third (directional) derivative of by , along the directions , and define its induced norm as . For two functions and from to , we define the convolution of and as , with the Lebesgue measure on . We denote the set of k-continuously differentiable functions on a domain by , where we omit the domain if it is clear from context.
2.2 Stochastic Chemical Reaction Networks
We consider Stochastic Chemical Reaction Networks parameterized by , composed of species , interacting according to reactions, each characterized by a nonzero reaction vector and a microscopic propensity function , which depends on . The reaction vector describes the change in the counts of the species when reaction fires, and the microscopic propensity function describes the rate at which reaction fires. Thus, for each , the Stochastic Chemical Reaction Network describes a CTMC , where represents the molecular counts of the species. The state space can be chosen as any subset of such that 1) for all and 2) for all and all such that , . The infinitesimal transition rates from to are given by . We refer the interested reader to [5] for a more thorough background on SCRNs. Throughout this work we will need to impose the following assumption on our SCRNs:
Assumption 2.1.
The propensity functions have the form
(2) |
Additionally, there exists where for each , for some , or , such that:
-
1.
For all , for all such that there exists satisfying .
-
2.
For all , for all such that for all , .
Remark 2.1.
The functions are the macroscopic propensities of each reaction.
Under Assumption 2.1, we can define . Note that our assumption that ensures that , i.e. our family of Markov chains includes those with arbitrarily large volume. Under Assumption 2.1 we will always choose to define the state space of our CTMC as . In order to apply our Stein’s Method technique, we will need the following assumption:
Assumption 2.2.
For all , is twice continuously differentiable on some open set containing , and has bounded second derivatives on .
One form of which satisfies Assumptions 2.1 and 2.2 is mass action kinetics [62], which means that the propensities are of the form
(3) |
where with is the reaction vector of the chemical reaction
(4) |
The requirement that must be imposed for to satisfy Assumption 2.2. See [27] for a discussion of the physical assumptions necessary to reach such propensities. We note that (3) requires that no reactant appears twice on the left hand side of the reaction. Having a reactant appear twice on the left hand side of a reaction creates a propensity proportional to instead of the physically accurate [27]. Here is the reaction rate constant of reaction and is the th element of . Another form of that satisfies Assumptions 2.1 and 2.2 is
(5) |
where and such that or . Such “Hill-type propensities” can arise through timescale separation in systems of chemical reactions with mass action kinetics [36, 47, 34]. For simplicity, we will impose the following assumption:
Assumption 2.3.
The stoichiometric subspace, , has dimension , and for each , is irreducible.
Though Assumption 2.3 appears to rule out SCRNs that have conservation laws, i.e. a linear combination of species counts that remains unchanged by all reactions, we will show through an example in Section 5 that Assumption 2.3 does not fundamentally prevent the analysis of SCRNs with conservation laws, as long as a change of coordinates exists where the system can be represented in a lower dimensional lattice as an SCRN with stoichiometric subspace of dimension . For the generator of is
(6) |
We will also consider two shifted and scaled versions of defined by
(7) |
and
(8) |
where is a specified point in . For the generator of is given by
(9) |
and the generator of is given by
(10) |
When has a unique stationary distribution , we denote by a random variable having law . We likewise define and .
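The CTMC described above can be simulated directly with the Doob-Gillespie algorithm: in each state, sample an exponential holding time with rate equal to the total propensity, then fire a reaction chosen with probability proportional to its propensity. A minimal sketch, using a hypothetical birth-death network (0 → S at rate kV, S → 0 at per-molecule rate g) chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def gillespie(x0, reaction_vectors, propensities, t_end):
    """One Doob-Gillespie trajectory of an SCRN.

    propensities(x) returns the vector of reaction rates in state x."""
    t, x = 0.0, np.array(x0, dtype=float)
    while True:
        a = propensities(x)
        a0 = a.sum()
        if a0 <= 0.0:
            return x                      # absorbing state reached
        t += rng.exponential(1.0 / a0)    # holding time ~ Exp(total rate)
        if t > t_end:
            return x
        j = rng.choice(len(a), p=a / a0)  # reaction chosen prop. to its rate
        x += reaction_vectors[j]

# Hypothetical birth-death example: 0 -> S at rate k*V, S -> 0 at rate g*x.
V, k, g = 100.0, 1.0, 1.0
vecs = np.array([[1.0], [-1.0]])
prop = lambda x: np.array([k * V, g * x[0]])
x_end = gillespie([0.0], vecs, prop, t_end=50.0)  # stationary mean is k*V/g
```

For this linear example the stationary distribution is Poisson with mean kV/g, so a long trajectory should fluctuate around 100 with standard deviation 10.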
2.3 Reaction Rate Equations
The Reaction Rate Equations (RREs) are an Ordinary Differential Equation model used to approximate . Defining and , the reaction rate equations are
(11) |
where and . An equilibrium point of the RREs is any point satisfying . We note that under Assumption 2.1, will be positively invariant with respect to (11).
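Since the RREs are an ODE, their transients and equilibria can be computed with any standard solver. A minimal sketch for a hypothetical network 0 → A at rate k1 and A + A → 0 at rate k2 (our own illustrative example with mass action macroscopic propensities, so dc/dt = k1 − 2·k2·c² since each firing of the second reaction removes two molecules), which converges to the equilibrium where the drift vanishes:

```python
import numpy as np
from scipy.integrate import solve_ivp

k1, k2 = 1.0, 0.5

def rre(t, c):
    # Macroscopic drift: production minus dimerization consumption.
    return [k1 - 2.0 * k2 * c[0] ** 2]

sol = solve_ivp(rre, (0.0, 20.0), [0.0], rtol=1e-8, atol=1e-10)
c_eq = np.sqrt(k1 / (2.0 * k2))  # equilibrium: k1 - 2*k2*c^2 = 0
```

With these rates c_eq = 1, and the numerical solution approaches it exponentially fast.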
2.4 Linear Noise Approximation
The Linear Noise Approximation (LNA) is a diffusion approximation obtained by expanding the Chemical Master Equation using the ansatz [62]. The terms and are given by
(12a) | ||||||
(12b) |
where evidently (12a) is the RRE (11). Here is a unit covariance Wiener Process. Let be an equilibrium point of (12a). We note that must hold for (12b) to make sense, which is guaranteed by e.g. Assumption 2.1. We denote by the solution to (12b) with and . The generator of is
(13) |
where . We remark that is an Ornstein–Uhlenbeck process, and thus is Gaussian as long as is. The stationary distribution of is a zero mean Gaussian with covariance matrix given as the solution to the Lyapunov Equation
(14) |
which has a unique positive definite solution as long as is Hurwitz. In this case, we denote by a random variable distributed according to .
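Numerically, the stationary covariance of the LNA is obtained by solving the Lyapunov equation with a standard linear algebra routine. A sketch with a hypothetical Hurwitz Jacobian J and diffusion matrix D (both matrices are invented for illustration; in practice they come from linearizing the RRE drift and evaluating the LNA diffusion at the equilibrium):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical two-species linearization: J is the (Hurwitz) Jacobian of the
# RRE drift at the equilibrium, D the LNA diffusion matrix evaluated there.
J = np.array([[-2.0, 0.0], [1.0, -1.0]])
D = np.array([[2.0, 0.0], [0.0, 1.0]])

# solve_continuous_lyapunov(A, Q) solves A X + X A^T = Q, so passing -D
# yields the stationary covariance: J Sigma + Sigma J^T + D = 0.
Sigma = solve_continuous_lyapunov(J, -D)
```

Since J is Hurwitz and D is positive definite, the resulting Sigma is the unique symmetric positive definite solution.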
2.5 1-Wasserstein Distance
In this work we measure the distance between probability distributions using the 1-Wasserstein distance [25]. By a slight abuse of notation, we define the 1-Wasserstein distance between two random variables as the 1-Wasserstein distance between the laws of the random variables, formally:
Definition 2.1.
Given two random variables and , both taking values in , we define the 1-Wasserstein distance between and as
(15) |
where .
We remark that this definition of 1-Wasserstein distance uses the dual formulation [63], and is a slight abuse of notation since the 1-Wasserstein distance is usually defined between two probability distributions. Our definition is equivalent to taking the 1-Wasserstein distance between the laws of and .
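For one-dimensional laws, the 1-Wasserstein distance can also be computed from samples, since it equals the L1 distance between the quantile functions; SciPy exposes this as wasserstein_distance. As a quick sanity check (our own example, not from the text), for two Gaussians with equal variance the distance is exactly the distance between their means:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=200_000)
y = rng.normal(0.5, 1.0, size=200_000)

# For N(m1, s^2) and N(m2, s^2), W1 = |m1 - m2|, so the empirical
# distance computed from samples should be close to 0.5.
d = wasserstein_distance(x, y)
```

The empirical estimate converges to the true distance as the sample size grows, with error on the order of n^{-1/2} here.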
3 Approximating Stationary Distributions of SCRNs
Here we state the main results of this work. To begin, we present the technical conditions that will guarantee that the stationary distribution of converges to . We then give our main result, showing that when the first and second moments of are controlled, we can bound the 1-Wasserstein distance between and . We next show that this result can be used to analyze the RRE approximation as well. Finally, we give a method to check that the second moment of is controlled by relating the global stability properties of the RRE to the required moment bound.
Condition 3.1.
There exists , a polynomial of degree , such that for all and all , . Furthermore, there exists such that for all , has a unique stationary distribution , which satisfies
(16) |
Condition 3.1 does not require any type of uniformity in , and therefore the condition can be checked instead.
Condition 3.2 (Uniform moment bounds).
There exist constants , , and such that for all , has a unique stationary distribution , which satisfies
(17) |
and
(18) |
Remark 3.1.
Remark 3.2.
Recalling that is the “concentration scaled” Markov chain shifted to , Condition 3.2 is equivalent to the existence of , , and such that for all , there exists a unique stationary distribution of , , and it satisfies
(19) |
and
(20) |
We are now ready to state our main result.
Theorem 3.1.
Proof.
See Section 4.1. ∎
Remark 3.3.
Remark 3.4.
Though our Stein’s Method approach does not require the preliminary step of establishing a law of large numbers type result relating to , an equilibrium point of (12a), we can conclude such a result whenever the assumptions of Theorem 3.1 are satisfied.
Corollary 3.1.
Proof.
From the triangle inequality, we have that
(24) |
To bound the right hand side, notice that . Thus,
(25) |
Under the assumptions of the corollary, we can apply Theorem 3.1 to conclude that there exist and such that for all ,
(26) |
and thus from (25), we have that for all ,
(27) |
Observing that , we have from (24) and (27) that for all ,
(28) | ||||
(29) |
which completes the proof. ∎
A special case where Conditions 3.1 and 3.2 are easy to verify is an SCRN with mass action kinetics and only zeroth and first order reactions. This idea is formalized in Corollary 3.2.
Corollary 3.2.
Proof.
The proof relies on the fact that for SCRNs with only zeroth and first order reactions, the LNA gives exactly the first and second moments of . Specifically, the fact that is Hurwitz implies that we can apply Proposition 7 in [32] to conclude that all moments of are finite, and thus Condition 3.1 holds. Additionally, since all of the propensities are affine, a computation of the first and second moment dynamics of reveals that for all time [43], and . Therefore, by Remarks 3.1 and 3.2, Condition 3.2 is satisfied. The result then follows from Theorem 3.1. ∎
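The moment-matching fact used in this proof can be checked by hand for the simplest linear network. For a hypothetical birth-death system 0 → S at rate kV and S → 0 at per-molecule rate g (our illustration), the CTMC stationary law is Poisson with mean kV/g, while the LNA gives equilibrium c* = k/g and a scalar Lyapunov equation 2(−g)σ² + (k + g c*) = 0, i.e. σ² = k/g; on the counts scale the mean and variance agree exactly:

```python
# Hypothetical birth-death network: 0 -> S at rate k*V, S -> 0 at rate g*x.
V, k, g = 50.0, 2.0, 1.0

# CTMC: stationary distribution is Poisson(k*V/g), so mean = variance.
mean_ctmc = var_ctmc = k * V / g

# LNA: equilibrium c* = k/g; the scalar Lyapunov equation
# 2*(-g)*s2 + (k + g*c) = 0 gives the stationary variance s2 = k/g.
c = k / g
s2 = (k + g * c) / (2.0 * g)
mean_lna, var_lna = V * c, V * s2  # moments on the counts scale
```

Both mean and variance equal kV/g, consistent with the exactness of the LNA moments for zeroth and first order mass action networks.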
We now give a result that gives sufficient conditions on the RREs for the conclusion of Theorem 3.1 to hold. This result is substantially easier to use in practice than Theorem 3.1, since it requires only analyzing an ODE with variables, and not the stationary distribution of a CTMC over .
Theorem 3.2.
Proof.
See Section 4.4. ∎
4 Proofs of the Main Results
In this section we provide the proofs of the results presented in Section 3.
4.1 Proof of Theorem 3.1
We require the following Lemma from [29], which gives sufficient conditions for the Basic Adjoint Relationship (BAR) to hold.
Lemma 4.1 (Proposition 3 in [29]).
Consider a CTMC over state space with rate matrix having entries and stationary distribution . If
(34) |
then
(35) |
It will be convenient for us to consider the 1-Wasserstein distance computed by taking a supremum not over all 1-Lipschitz functions, but over the 1-Lipschitz functions that are additionally in with bounded derivatives. To this end, let
(36) |
We have the following Lemma, which is proved in Section 4.3.
Lemma 4.2.
The following holds for any two probability distributions and over :
(37) |
We now present the derivative bounds that we will need. The proof of Lemma 4.3 uses results from [30], and is provided in Section 4.2.
Lemma 4.3 (Derivative Bounds).
Consider the Stein Equation
(38) |
for . Assume that there exists a symmetric matrix and a real number such that
(39) |
Assume also that has a right inverse. Then, there exists a solution to the Stein Equation, which satisfies
(40) |
(41) |
and for all , there exists such that
(42) |
where , , and are given by
(43) | ||||
(44) | ||||
(45) |
where , is the right inverse of , and
(46) |
Proof.
See Section 4.2. ∎
Remark 4.1.
The assumption that there exists a symmetric matrix and a real number such that
(47) |
is equivalent to the assumption that is Hurwitz.
Our goal is to analyze
(48) |
We begin by observing that per Lemma 4.2 we can obtain by taking the supremum over instead of , and thus we can analyze
(49) |
To do this we use Stein’s Method following [11]. The Stein Equation is
(50) |
which in our case is the Poisson equation
(51) |
which is guaranteed to have a solution by Lemma 4.3. Throughout the rest of this proof, we will denote by a solution to (51) with that satisfies the bounds given by Lemma 4.3. We take the expectation of the Stein Equation with respect to to find
(52) |
Before proceeding, we show that for all . We have that , and so by Condition 3.1, there exists such that
(53) |
where the sums are taken over such that . By Lemma 4.3, is Lipschitz, and so . Therefore,
(54) | ||||
(55) |
which is finite due to Condition 3.1. Therefore, from Lemma 4.1, . Thus, for all , (52) implies that
(56) | ||||
(57) | ||||
(58) |
where is a solution to the Stein equation. The existence of is guaranteed by Lemma 4.3. For conciseness, let us define . Let . For a fixed we take the first order Taylor expansion of in using the Lagrange form of the remainder.
(59) | ||||
(60) | ||||
(61) |
where , and can depend on . We have that , and so from Taylor’s Theorem we have
(62) |
where , in which the supremum over is finite due to our assumption that has bounded second derivative. The Taylor expansion above then becomes
(63) |
Collecting terms and using the fact that , we obtain
(64) |
Observe that the first two terms are identical to , since
(65) |
with . Therefore, for all such that ,
(66) |
Letting and invoking Lemma 4.3, using the fact that Assumptions 2.1 and 2.3 ensure that has a right inverse (since and has full column rank), we have
(67) |
This bound holds for any , with chosen as the solution to the Stein equation (51) specified by Lemma 4.3, and thus from (56) we have
(68) |
Using the moment bounds and the linearity of the expectation we have
(69) |
To prove the error bound proportional to , we take for , in which case . Let . For all we have
(70) |
The term decays the slowest, which yields the desired result.
4.2 Proof of Lemma 4.3
Here we prove Lemma 4.3. The proof is an application of Theorem 5 in [30], with the bounds tightened with more detailed analysis in our special case. For convenience we adopt the following notation from [30]. For a function with domain , we use the notation
(71) |
where , and
(72) |
Here, for , denotes the th derivative of along the directions . An Itô Diffusion is the solution to
(73) |
where is an dimensional Wiener process. We denote the transition semigroup of by , and the infinitesimal generator of by . has Wasserstein decay rate r(t) for a nonincreasing integrable function if for all and all ,
(74) |
For clarity, we start by stating the following theorem, slightly adapted from [30], which ensures the existence of constants , and . We will shortly improve these constants in our special case.
Theorem 4.1 (Theorem 5 in [30]).
Let be Lipschitz. Consider an dimensional Itô diffusion given by (73) with Wasserstein decay rate and invariant measure . If and have locally Lipschitz second and third derivatives and a right inverse for each , and with bounded second and third derivatives, then
(75) |
is twice continuously differentiable, satisfies the Stein equation
(76) |
and additionally,
(77) |
and
(78) |
where
(79) | |||
(80) |
with and . Furthermore, for all ,
(81) |
for that depends only on , , , and .
We now show that has a Wasserstein decay rate of , meaning that
(82) |
where denotes , the solution to (12b) with and . Observe that and are Gaussian with the same covariance. Therefore, letting , we have
(83) |
which can be shown by considering the lower bound
(84) |
from Jensen’s inequality, and the upper bound
(85) |
as evidenced by the coupling . Theorem 4.1 then gives us the constant . To prove the claimed expressions for and , we must specialize the arguments of [30] to our diffusion, which has a linear and Hurwitz and a uniform . As in [30], denote by the first directional derivative flow, defined as the solution to
(86) |
Further, define , the second directional derivative flow, as the solution to
(87) |
In our proof the following lemma replaces the derivative flow bounds given in Lemma 16 of [30].
Lemma 4.4.
Consider an Itô diffusion given by (73) with transition semigroup and assume that where satisfies
(88) |
for a matrix and a real number . Assume also that is a constant. Then, for all , , and ,
(89) |
and for all , .
Proof.
Under the assumptions of the Lemma, (86) becomes
(90) |
and thus . Therefore, . To bound this quantity, observe that the condition
(91) |
implies that
(92) |
where is the principal square root of . It then follows from standard arguments from linear systems theory [15, 45] that for all ,
(93) |
We then have that for all and ,
(94) | ||||
(95) | ||||
(96) |
which establishes the desired bound on . To complete the proof, observe that under the assumptions of the lemma, (87) becomes
(97) |
and thus . ∎
With this specialization, we now trace the construction of and through the same argument used in the proof of Theorem 5 in [30] to obtain the claimed expressions. Using Lemma 4.4 we make the specializations described in Lemmas 4.5 and 4.6, which we state and prove before proceeding. In our case, the bound on given by Lemma 15 in [30] can be replaced by the following lemma.
Lemma 4.5.
Consider an Itô diffusion given by (73) with transition semigroup and assume that where satisfies
(98) |
for a matrix and a real number . Assume also that is a constant with right inverse . Then, for all and with bounded first and second derivatives, satisfies
(99) |
where .
Proof.
First, observe that under the assumptions of the lemma, the Itô diffusion has a Wasserstein decay rate of . Thus, we can invoke Lemma 15 directly to obtain that is twice continuously differentiable. The remainder of the proof follows that of Lemma 15 in [30], with the substitution of our form of and , and using Lemma 4.4 instead of Lemma 16 in [30] for bounds on the derivative flows. ∎
In our special case, the bound on from Lemma 20 in [30] can be replaced with the following lemma.
Lemma 4.6.
Consider an Itô diffusion given by (73) with transition semigroup and assume that where satisfies
(100) |
for a matrix and a real number . Assume also that is a constant with right inverse . Then, for all and with bounded second and third derivatives, satisfies
(101) |
where .
Proof.
We now finish our proof of Lemma 4.3. Observe that in this case we have and , where has a right inverse . Let . For solving , we have by the Dominated Convergence Theorem and Jensen's inequality that
(102) |
Splitting the last integral into the intervals and , we have
(103) |
where
(104) |
and
(105) |
Applying the seminorm interpolation result of Lemma 19 in [30], we have
(106) | ||||
(107) |
Under the assumptions of the lemma, we can apply Lemmas 4.5 and 4.6. For we set to obtain a bound on as follows.
(108) | ||||
(109) | ||||
(110) | ||||
(111) | ||||
(112) | ||||
(113) |
where we have used the observations that and . Turning to , for we set in Lemmas 4.5 and 4.6, obtaining
(114) | ||||
(115) | ||||
(116) | ||||
(117) | ||||
(118) | ||||
(119) |
Finally, by the definition of we have
(120) |
which proves the desired bound when combined with the bounds on and derived above.
4.3 Proof of Lemma 4.2
Proof.
We have that
(121) |
We will show that
(122) |
Our proof is based on mollifying . Let be the bump function defined by
(123) |
where . Suppose for contradiction that
(124) |
then, there exists such that , and
(125) |
Let . We will show that , and . Let . Recall that
(126) |
where is the Lebesgue measure. Observe that
(127) | ||||
(128) | ||||
(129) | ||||
(130) | ||||
(131) |
and thus . Next, let with denote , and observe that by the Dominated Convergence Theorem,
(132) |
showing that , and furthermore that is bounded. We have shown that . Now, consider
(133) | ||||
(134) | ||||
(135) | ||||
(136) | ||||
(137) |
where we have used the fact that for all , and the fact that . We have that
(138) | ||||
(139) | ||||
(140) | ||||
(141) |
where the second to last inequality follows from Jensen’s Inequality and the last follows from the fact that . Now, we have from (125) and (138) that
(142) |
and thus
(143) |
which cannot be true since . Thus, by contradiction,
(144) |
which completes the proof. ∎
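The mollification step in this proof can be seen numerically: convolving a 1-Lipschitz function with a normalized bump supported on (−ε, ε) changes it by at most ε in the uniform norm and does not increase the Lipschitz constant. A discrete sketch (f = |x| is our test function, and the quadrature is a plain weighted sum with weights from the standard bump):

```python
import numpy as np

def mollify(f, eps, xs, n=2001):
    """Convolve f with a normalized C-infinity bump supported on (-eps, eps)."""
    u = np.linspace(-eps, eps, n)[1:-1]        # open support avoids 0/0
    w = np.exp(-1.0 / (1.0 - (u / eps) ** 2))  # classical bump profile
    w /= w.sum()                               # discrete normalization
    return np.array([np.sum(f(x - u) * w) for x in xs])

f = np.abs                                 # a 1-Lipschitz test function
xs = np.linspace(-2.0, 2.0, 401)
smooth = mollify(f, eps=0.1, xs=xs)

unif_err = np.max(np.abs(smooth - f(xs)))  # bounded by eps
dx = xs[1] - xs[0]
lip_est = np.max(np.abs(np.diff(smooth)) / dx)  # at most 1
```

Since each mollified value is a convex combination of shifts of f, the smoothed function inherits the Lipschitz constant exactly, and the pointwise error is bounded by the first absolute moment of the bump, which is less than ε.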
4.4 Proof of Theorem 3.2
We first give a lemma that uses a Foster-Lyapunov criterion [49] to conclude that Condition 3.2 holds.
Lemma 4.7.
Proof.
We begin by showing that for all , has a unique stationary distribution and is nonexplosive. If is finite, then since is irreducible by Assumption 2.3, has a unique stationary distribution and is nonexplosive. Otherwise, observe that is radially unbounded due to the assumption that . Additionally, by using the bound , we have that
(146) |
Thus, by Theorem 7.1 in [49], is exponentially ergodic, and thus has a unique stationary distribution and is nonexplosive. Now we show the desired moment bounds. We have that for all ,
(147) |
By Theorem 4.3 in [49], we therefore have that , which implies that
(148) |
By Jensen’s inequality, . Thus, the SCRN satisfies Condition 3.2. ∎
Building on Lemma 4.7, we now prove a lemma that relates the existence of a quadratic-type Lyapunov function for the RRE to Conditions 3.1 and 3.2, by using the Lyapunov function as a Foster-Lyapunov function for .
Lemma 4.8.
Consider an SCRN satisfying Assumptions 2.1, 2.2, and 2.3, and let be an equilibrium point of (12a). Suppose that there exists an open set and a Lyapunov function for the RRE, twice continuously differentiable, such that for all , and
(149) |
and additionally, there exists such that . Then, Condition 3.2 is satisfied.
Proof.
Let , and for let . Consider
(150) |
If we take the Taylor expansion in using the Lagrange form of the remainder we obtain
(151)
where depends on . Thus, for sufficiently large we obtain
(152)
for some . Here we have used the fact that by Assumption 2.2, there exists such that . Hence, by Lemma 4.7, Condition 3.2 is satisfied. ∎
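The Taylor-expansion step above can be summarized in generic notation (the symbols ζ_k, λ_k, f, Ω, and V below are generic stand-ins for the corresponding objects, not new quantities): for a twice continuously differentiable test function V, the generator of the scaled process satisfies

```latex
\mathcal{A}_{\Omega} V(x)
  = \sum_{k} \Omega\,\lambda_{k}(x)\Bigl[V\bigl(x + \tfrac{\zeta_{k}}{\Omega}\bigr) - V(x)\Bigr]
  = \nabla V(x)^{\top} f(x)
    + \frac{1}{2\Omega} \sum_{k} \lambda_{k}(x)\,
      \zeta_{k}^{\top}\,\nabla^{2}V(\xi_{k})\,\zeta_{k},
\qquad
f(x) = \sum_{k} \lambda_{k}(x)\,\zeta_{k},
```

where each ξ_k lies on the segment between x and x + ζ_k/Ω (Lagrange form of the remainder). When the first-order term inherits a negative-definite bound from the RRE Lyapunov function and the Hessian of V is bounded, the remainder term is of order 1/Ω, which is the origin of the drift inequality for sufficiently large Ω.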
We now prove Theorem 3.2. We note that we only need to verify Condition 3.1 for , since we are assuming that the Hessian of is bounded.
Proof.
We begin by checking that Condition 3.1 holds. Let us consider the scaled Markov chain and the function . We have that
(153)–(155)
For , we have
(156)–(157)
Noting that , we have
(158)–(159)
Thus, for sufficiently large , and such that , we have that . Therefore,
(160)
which shows by Theorem 7.1 in [49] that has a unique stationary distribution, and by Theorem 4.3 in [49] that , and thus has a unique stationary distribution and . Therefore, Condition 3.1 is satisfied. We now verify that Condition 3.2 holds. We will need the following lemma, which is a slight variation on the standard converse Lyapunov theorem; see, e.g., [37].
Lemma 4.9.
Let be closed and let be an open set such that . Let be a function defining a vector field for which is positively invariant. Suppose that for all , the solution to
(161)
, exists uniquely on and that there exists and such that for all and all ,
(162)
and furthermore, there exist , such that for all ,
(163)
Then, for each , there exists twice continuously differentiable and such that for all ,
(164)–(165)
Proof.
The proof that there exists satisfying the quadratic bounds proceeds analogously to the proof of Theorem 4.14 in [37], with the only adjustment being that we use as the domain instead of a ball around the origin. That is ensured by the form
(166)
where is the solution to with , and by and, e.g., Theorem 8.43 in [64]. ∎
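For reference, the classical construction behind Lemma 4.9 can be sketched in generic notation (a sketch of the standard argument in the spirit of Theorem 4.14 of [37], with the equilibrium shifted to the origin; the symbols φ, δ, M, λ, L below are generic stand-ins):

```latex
V(x) \;=\; \int_{0}^{\delta} \phi(t;x)^{\top}\phi(t;x)\,\mathrm{d}t ,
\qquad
\dot V(x) \;=\; |\phi(\delta;x)|^{2} - |x|^{2}
          \;\le\; -\bigl(1 - M^{2}e^{-2\lambda\delta}\bigr)\,|x|^{2},
```

where the derivative bound uses the assumed exponential estimate |φ(t;x)| ≤ M e^{-λt}|x| and δ is chosen so that M²e^{-2λδ} < 1. Combining the same estimate with the Lipschitz lower bound |φ(t;x)| ≥ e^{-Lt}|x|, for L a Lipschitz constant of the vector field, yields the quadratic sandwich

```latex
\frac{1-e^{-2L\delta}}{2L}\,|x|^{2}
\;\le\; V(x) \;\le\;
\frac{M^{2}\bigl(1-e^{-2\lambda\delta}\bigr)}{2\lambda}\,|x|^{2}.
```

Smoothness of V then follows from smooth dependence of φ on initial conditions.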
We now use Lemma 4.9 to construct a Lyapunov function . Recall that , , , and are assumed to exist by the assumptions of Theorem 3.2. We apply Lemma 4.9 to with , , , , , and , picking to obtain a function with associated values , , and . Let . In order to create a Lyapunov function on all of , we define as
(167)
We construct a global Lyapunov function by merging near the origin with far from the origin. We must do this in a way that creates a function with bounded second derivative so that we can apply Lemma 4.8. Our approach is similar to a technique used to prove Theorem B.1 in [44]. To this end, let be defined by
(168)
We have that for all , , and hence in this region,
(169)
Observe that due to the quadratic upper bound on , there exists such that for all , , and thus in this region
(170)–(173)
where we have used that in the last line. Hence, there exists such that for all ,
(174)–(175)
Let be such that and . Let , , and . Now consider . Let be the bump function defined by
(176)
where , and let , where we treat for all . We have that
(177)
where is the Lebesgue measure on . For , we have that
(178)
and for we have
(179)
where is the Lipschitz constant of on , and . By the Dominated Convergence Theorem and the fact that almost everywhere,
(180)
For sufficiently small , there exists such that for all and such that is differentiable at ,
(181)
Thus, for , we have
(182)
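The mollification step above follows the standard pattern, which in generic notation (the symbols η, W, ε, n below are standard stand-ins, not the paper's specific objects) reads:

```latex
\eta(x) =
\begin{cases}
  C \exp\!\Bigl(\dfrac{1}{|x|^{2}-1}\Bigr), & |x| < 1,\\[4pt]
  0, & |x| \ge 1,
\end{cases}
\qquad
\eta_{\epsilon}(x) = \epsilon^{-n}\,\eta(x/\epsilon),
\qquad
W_{\epsilon} = W * \eta_{\epsilon},
```

with C chosen so that η integrates to one. For W locally Lipschitz, W_ε is smooth, W_ε converges to W uniformly on compact sets as ε → 0, and the gradient of W_ε is the convolution of η_ε with the gradient of W, which exists almost everywhere by Rademacher's theorem; these are the properties invoked in (177)–(182).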
Now, let for be a smooth partition of unity subordinate to , and let
(183)
We have that
(184)–(185)
where we have used the fact that .
Let us consider . In this region, , , and . Thus,
(186)
Now, let us consider . In this region, , , and , implying that
(187)
By e.g. Lemma B.2 in [44], uniformly on the compact set . Additionally, observe that , and thus for sufficiently small , for all ,
(188)
Now, let us consider . In this region, and , and thus
(189)
By noting that is compact, and again using Lemma B.2 from [44] and the fact that , we have that for sufficiently small ,
(190)
Let us consider . In this region, and . Thus,
(191)
Again using Lemma B.2 from [44], and the fact that , we have that for sufficiently small ε,
(192)
Finally, consider . In this region, , , and . Thus,
(193)
Combining the preceding results for each subset of , we have that for sufficiently small , for all ,
(194)
Observe that
(195)
where exists by the smoothness of and the compactness of . Therefore, for some sufficiently small , we can apply Lemma 4.8 with to conclude that the SCRN satisfies Condition 3.2. ∎
5 Examples
5.1 Antithetic motif
The antithetic feedback motif is a common SCRN encountered in synthetic biology because it can approximately implement an integrator [53, 12]. The antithetic motif is given by the following set of chemical reactions:
(196)
The reaction vectors and propensities are:
(197)–(201)
This SCRN satisfies Assumptions 2.1, 2.2, and 2.3, with . We show that we can apply Theorem 3.2. Observe that as shown in [9], the RREs
(203)–(204)
have a unique, globally exponentially stable equilibrium point in , with . Letting , we have that , and thus there exists such that for all , . Therefore, by Theorem 3.2, there exists such that for all ,
(205)
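As a quick sanity check on the equilibrium structure, the following minimal sketch (not the paper's exact model: the ODEs and the rate names mu, theta, eta, k, gamma follow the standard antithetic integral feedback motif of [12] and are illustrative choices) verifies the closed-form equilibrium of the RREs directly. The annihilation balance pins the output at the set point mu/theta, which is the integral action property of the motif.

```python
# Illustrative sketch (not the paper's exact model or rates): equilibrium of
# the RREs for a standard antithetic feedback motif, cf. [12].  The rate
# names mu, theta, eta, k, gamma are hypothetical.
#   dz1/dt = mu - eta*z1*z2         (constitutive production of z1, annihilation)
#   dz2/dt = theta*x - eta*z1*z2    (x-dependent production of z2, annihilation)
#   dx/dt  = k*z1 - gamma*x         (actuation and degradation/dilution)

def antithetic_equilibrium(mu, theta, eta, k, gamma):
    """Closed-form equilibrium: the balance mu = eta*z1*z2 = theta*x
    forces the output to the set point x = mu/theta."""
    x = mu / theta
    z1 = gamma * x / k
    z2 = mu / (eta * z1)
    return z1, z2, x

def rre_rhs(z1, z2, x, mu, theta, eta, k, gamma):
    """Right-hand side of the three RREs above."""
    return (mu - eta * z1 * z2,
            theta * x - eta * z1 * z2,
            k * z1 - gamma * x)

params = dict(mu=2.0, theta=1.0, eta=50.0, k=1.0, gamma=1.0)
z1, z2, x = antithetic_equilibrium(**params)
res = rre_rhs(z1, z2, x, **params)
print(x, max(abs(r) for r in res) < 1e-12)  # 2.0 True
```

Note that the equilibrium output depends only on mu and theta, not on eta, k, or gamma, reflecting the robustness property that motivates the motif.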
5.2 Transcriptional activation
We consider a simple model of a protein that is transcriptionally regulated by binding to the promoter to form complex [14]:
(206)–(208)
This SCRN has four species; however, we can use the two conservation laws and to obtain the following reduced model:
(209)–(210)
with reactions
(211)–(214)
resulting in the RREs
(216)–(217)
If we set and , with , the SCRN satisfies Assumptions 2.1 and 2.2 with . Additionally, the SCRN satisfies Assumption 2.3. The RREs have a unique, globally exponentially stable equilibrium point in , and letting , we have , and so there exists such that for all , . Therefore, by Theorem 3.2, there exists such that for all ,
(218)
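For a concrete (hypothetical) instance of such a reduced production-degradation-binding model, the sketch below computes the RRE equilibrium in closed form; the ODEs and the rate names k, gamma, a, d, p_tot are illustrative choices, not the paper's (216)–(217). At equilibrium the binding flux cancels from the protein equation, so the free protein level is set by production and degradation alone, while the complex follows a Michaelis-Menten-type binding curve.

```python
# Illustrative sketch (hypothetical rates, not the paper's reduced model):
# equilibrium of a standard production-degradation-binding RRE with
# k (production), gamma (degradation), a/d (binding/unbinding), and
# p_tot (total promoter, fixed by a conservation law):
#   dx/dt = k - gamma*x - a*x*(p_tot - c) + d*c
#   dc/dt = a*x*(p_tot - c) - d*c

def transcriptional_equilibrium(k, gamma, a, d, p_tot):
    """At equilibrium the binding flux vanishes, so production balances
    degradation (x* = k/gamma) and the complex sits on a binding curve."""
    x = k / gamma
    Kd = d / a                    # dissociation constant
    c = p_tot * x / (x + Kd)      # promoter occupancy at equilibrium
    return x, c

def rhs(x, c, k, gamma, a, d, p_tot):
    """Right-hand side of the two RREs above."""
    bind = a * x * (p_tot - c) - d * c
    return k - gamma * x - bind, bind

params = dict(k=4.0, gamma=2.0, a=1.0, d=2.0, p_tot=10.0)
x, c = transcriptional_equilibrium(**params)
print(x, c, rhs(x, c, **params))  # 2.0 5.0 (0.0, 0.0)
```

This mirrors the structure used in the example: a unique equilibrium in the positive orthant whose stability can then be checked to apply Theorem 3.2.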
6 Conclusion
In this work we studied the relationship between the stationary distributions of appropriately scaled Markov chains corresponding to SCRNs and the steady-state behavior of the RRE and LNA models that are commonly used as approximations. Using Stein's Method, we obtained a bound on the 1-Wasserstein distance between and , the fluctuation term of the LNA, that is proportional to , and a bound on the 1-Wasserstein distance between and that is proportional to . Our main result requires one to control the second moment of , and we presented a Foster-Lyapunov based method to do so by analyzing just the RREs. Our condition requires global exponential stability of in the RRE, along with the existence of a linear function with exponential decay far from the origin. While proving the global exponential stability of an equilibrium point of the RRE is in and of itself a challenging task, a large body of literature on the topic is available, since the RRE have been the subject of intense study over the years. In fact, recent advances on the global stability of equilibria of the RRE can potentially be leveraged via our results to analyze the error in the RRE and LNA for large volume [2, 1]. The authors foresee this work applying to model reduction via timescale separation of SCRNs and to system identification algorithms, where the computational simplicity of the LNA makes it an attractive model. In both cases, the fact that we can obtain non-asymptotic error bounds on the LNA is critical to analyzing the error in the reduced models and the identified system, respectively.
7 Acknowledgments
References
- [1] M Ali Al-Radhawi. Graphical characterizations of robust stability in biological interaction networks. Mathematics of Control, Signals, and Systems, pages 1–33, 2023.
- [2] Muhammad Ali Al-Radhawi and David Angeli. New approach to the stability of chemical reaction networks: Piecewise linear in rates Lyapunov functions. IEEE Transactions on Automatic Control, 61(1):76–89, 2015.
- [3] Uri Alon. An introduction to systems biology: design principles of biological circuits. Chapman and Hall/CRC, 2006.
- [4] David F Anderson, Gheorghe Craciun, and Thomas G Kurtz. Product-form stationary distributions for deficiency zero chemical reaction networks. Bulletin of mathematical biology, 72:1947–1970, 2010.
- [5] David F Anderson and Thomas G Kurtz. Stochastic analysis of biochemical systems, volume 674. Springer, 2015.
- [6] Adam Arkin, John Ross, and Harley H McAdams. Stochastic Kinetic Analysis of Developmental Pathway Bifurcation in Phage -Infected Escherichia coli Cells. Genetics, 149(4):1633–1648, 08 1998.
- [7] Andrew D Barbour. Stein's method and Poisson process convergence. Journal of Applied Probability, 25(A):175–184, 1988.
- [8] Andrew D Barbour. Stein’s method for diffusion approximations. Probability theory and related fields, 84(3):297–322, 1990.
- [9] Franco Blanchini and Elisa Franco. Structurally robust biological networks. BMC Systems Biology, 5(1):74, 2011.
- [10] Anton Braverman and JG Dai. Stein's method for steady-state diffusion approximations of M/Ph/n+M systems. Annals of Applied Probability, 27(1):550–581, 2017.
- [11] Anton Braverman, JG Dai, and Jiekun Feng. Stein's method for steady-state diffusion approximations: an introduction through the Erlang-A and Erlang-C models. Stochastic Systems, 6(2):301–366, 2017.
- [12] Corentin Briat, Ankit Gupta, and Mustafa Khammash. Antithetic integral feedback ensures robust perfect adaptation in noisy biomolecular networks. Cell Systems, 2(1):15–26, 2016.
- [13] Sourav Chatterjee. A short survey of Stein's method. In Proceedings of the International Congress of Mathematicians, pages 1–24, 2014.
- [14] D. Del Vecchio and R. M. Murray. Biomolecular Feedback Systems. Princeton University Press, 1st edition, 2014.
- [15] Charles Desoer and Hiromasa Haneda. The measure of a matrix as a tool to analyze computer algorithms for circuit analysis. IEEE Transactions on Circuit Theory, 19(5):480–486, 1972.
- [16] Joseph L Doob. Markoff chains—denumerable case. Transactions of the American Mathematical Society, 58:455–473, 1945.
- [17] Werner Ehm. Binomial approximation to the Poisson binomial distribution. Statistics & Probability Letters, 11(1):7–16, 1991.
- [18] Johan Elf and Måns Ehrenberg. Fast evaluation of fluctuations in biochemical networks with the linear noise approximation. Genome research, 13(11):2475–2484, 2003.
- [19] M. B. Elowitz and S. Leibler. A synthetic oscillatory network of transcriptional regulators. Nature, 403:335–338, 2000.
- [20] Michael B Elowitz, Arnold J Levine, Eric D Siggia, and Peter S Swain. Stochastic gene expression in a single cell. Science, 297(5584):1183–1186, 2002.
- [21] Stewart N Ethier and Thomas G Kurtz. Markov processes: characterization and convergence. John Wiley & Sons, 2009.
- [22] Paul Fearnhead, Vasilieos Giagos, and Chris Sherlock. Inference for reaction networks using the linear noise approximation. Biometrics, 70(2):457–466, 2014.
- [23] Bärbel Finkenstädt, Dan J Woodcock, Michal Komorowski, Claire V Harper, Julian RE Davis, Mike RH White, and David A Rand. Quantifying intrinsic and extrinsic noise in gene transcription using the linear noise approximation: An application to single cell data. The Annals of Applied Statistics, pages 1960–1982, 2013.
- [24] T.S. Gardner, C.R. Cantor, and J.J. Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403:339–342, 2000.
- [25] Alison L Gibbs and Francis Edward Su. On choosing and bounding probability metrics. International statistical review, 70(3):419–435, 2002.
- [26] Daniel T Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of computational physics, 22(4):403–434, 1976.
- [27] Daniel T Gillespie. A rigorous derivation of the chemical master equation. Physica A: Statistical Mechanics and its Applications, 188(1-3):404–425, 1992.
- [28] Daniel T. Gillespie. The chemical Langevin equation. The Journal of Chemical Physics, 113(1):297–306, 2000.
- [29] Peter W Glynn and Assaf Zeevi. Bounding stationary expectations of markov processes. In Markov processes and related topics: a Festschrift for Thomas G. Kurtz, pages 195–214. Institute of Mathematical Statistics, 2008.
- [30] Jackson Gorham, Andrew B Duncan, Sebastian J Vollmer, and Lester Mackey. Measuring sample quality with diffusions. The Annals of Applied Probability, 29(5):2884–2928, 2019.
- [31] Ramon Grima. Linear-noise approximation and the chemical master equation agree up to second-order moments for a class of chemical systems. Phys. Rev. E, 92:042124, Oct 2015.
- [32] Ankit Gupta, Corentin Briat, and Mustafa Khammash. A scalable computational framework for establishing long-term behavior of stochastic reaction networks. PLoS computational biology, 10(6):e1003669, 2014.
- [33] Itai Gurvich. Diffusion models and steady-state approximations for exponentially ergodic Markovian queues. Annals of Applied Probability, 24(6):2527–2559, 2014.
- [34] Dylan Hirsch, Theodore W Grunberg, and Domitilla Del Vecchio. Error bound for Hill-function approximations in a class of stochastic transcriptional network models. In Proceedings of the 62nd IEEE Conference on Decision and Control, Singapore, 2023. IEEE.
- [35] NG van Kampen. A power series expansion of the master equation. Canadian Journal of Physics, 39(4):551–567, 1961.
- [36] Hye-Won Kang and Thomas G Kurtz. Separation of time-scales and model reduction for stochastic reaction networks. The Annals of Applied Probability, 23(2):529–583, 2013.
- [37] Hassan K Khalil. Nonlinear Systems. Prentice Hall, Upper Saddle River, NJ, 3rd edition, 2002.
- [38] Michał Komorowski, Bärbel Finkenstädt, Claire V Harper, and David A Rand. Bayesian inference of biochemical kinetic parameters using the linear noise approximation. BMC bioinformatics, 10:1–10, 2009.
- [39] Thomas G Kurtz. The relationship between stochastic and deterministic models for chemical reactions. The Journal of Chemical Physics, 57(7):2976–2978, 1972.
- [40] Thomas G Kurtz. Limit theorems and diffusion approximations for density dependent markov chains. Stochastic Systems: Modeling, Identification and Optimization, I, pages 67–78, 1976.
- [41] Thomas G Kurtz. Strong approximation theorems for density dependent markov chains. Stochastic Processes and their Applications, 6(3):223–240, 1978.
- [42] Saul C Leite and Ruth J Williams. A constrained Langevin approximation for chemical reaction networks. The Annals of Applied Probability, 29(3):1541–1608, 2019.
- [43] Ioannis Lestas, Johan Paulsson, Nicholas E. Ross, and Glenn Vinnicombe. Noise in gene regulatory networks. IEEE Transactions on Automatic Control, 53(Special Issue):189–200, 2008.
- [44] Yuandan Lin, Eduardo D Sontag, and Yuan Wang. A smooth converse Lyapunov theorem for robust stability. SIAM Journal on Control and Optimization, 34(1):124–160, 1996.
- [45] Winfried Lohmiller and Jean-Jacques E Slotine. Nonlinear process control using contraction theory. AIChE journal, 46(3):588–596, 2000.
- [46] Harley H. McAdams and Adam Arkin. Stochastic mechanisms in gene expression. Proceedings of the National Academy of Sciences, 94(3):814–819, 1997.
- [47] Bence Mélykúti, Joao P Hespanha, and Mustafa Khammash. Equilibrium distributions of simple biochemical reaction systems for time-scale separation in stochastic reaction networks. Journal of The Royal Society Interface, 11(97):20140054, 2014.
- [48] X Flora Meng, Ania-Ariadna Baetica, Vipul Singhal, and Richard M Murray. Recursively constructing analytic expressions for equilibrium distributions of stochastic biochemical reaction networks. Journal of The Royal Society Interface, 14(130):20170157, 2017.
- [49] Sean P Meyn and Richard L Tweedie. Stability of Markovian processes II: Continuous-time processes and sampled chains. Advances in Applied Probability, 25(3):487–517, 1993.
- [50] Johan Paulsson. Summing up the noise in gene networks. Nature, 427(6973):415–418, 2004.
- [51] Johan Paulsson. Models of stochastic gene expression. Physics of life reviews, 2(2):157–175, 2005.
- [52] Johan Paulsson and Måns Ehrenberg. Random signal fluctuations can reduce random fluctuations in regulated components of chemical regulatory networks. Physical review letters, 84(23):5447, 2000.
- [53] Yili Qian and Domitilla Del Vecchio. Realizing integral control in living cells: how to overcome leaky integration due to dilution? Journal of The Royal Society Interface, 15(139), 2018.
- [54] David Schnoerr, Guido Sanguinetti, and Ramon Grima. The complex chemical Langevin equation. The Journal of Chemical Physics, 141(2), 2014.
- [55] Abhyudai Singh. Negative feedback through mrna provides the best control of gene-expression noise. IEEE transactions on nanobioscience, 10(3):194–200, 2011.
- [56] Abhyudai Singh and Joao P Hespanha. Optimal feedback strength for noise suppression in autoregulatory gene networks. Biophysical journal, 96(10):4013–4023, 2009.
- [57] Eduardo D Sontag and Abhyudai Singh. Exact moment dynamics for feedforward nonlinear chemical reaction networks. IEEE Life Sciences Letters, 1(2):26–29, 2015.
- [58] Charles Stein. Approximate Computation of Expectations. IMS Lecture Notes–Monograph Series, volume 7. Institute of Mathematical Statistics, Hayward, CA, 1986.
- [59] Charles Stein. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, volume 6, pages 583–603. University of California Press, 1972.
- [60] Yi Tao, Xiudeng Zheng, and Yuehua Sun. Effect of feedback regulation on stochastic gene expression. Journal of theoretical biology, 247(4):827–836, 2007.
- [61] Tianhai Tian and Kevin Burrage. Stochastic models for regulatory networks of the genetic toggle switch. Proceedings of the National Academy of Sciences, 103(22):8372–8377, 2006.
- [62] Nicolaas Godfried Van Kampen. Stochastic processes in physics and chemistry, volume 1. Elsevier, 1992.
- [63] Cédric Villani. Optimal transport: old and new, volume 338. Springer, 2009.
- [64] Walter G. Kelley and Allan C. Peterson. The Theory of Differential Equations: Classical and Qualitative. Springer, 2nd edition, 2010.
- [65] Leor S. Weinberger, John C. Burnett, Jared E. Toettcher, Adam P. Arkin, and David V. Schaffer. Stochastic gene expression in a lentiviral positive-feedback loop: Hiv-1 tat fluctuations drive phenotypic diversity. Cell, 122(2):169–182, 2005.
- [66] Peter Whittle. Systems in stochastic equilibrium. John Wiley & Sons, Inc., 1986.
- [67] Darren J Wilkinson. Stochastic modelling for quantitative description of heterogeneous biological systems. Nature Reviews Genetics, 10(2):122–133, 2009.