A sufficient condition for the quasipotential to be the rate function of the invariant measure of countable-state mean-field interacting particle systems
Abstract
This paper considers the family of invariant measures of Markovian mean-field interacting particle systems on a countably infinite state space and studies its large deviation asymptotics. The Freidlin-Wentzell quasipotential is the usual candidate rate function for the sequence of invariant measures indexed by the number of particles. The paper provides two counterexamples where the quasipotential is not the rate function. The quasipotential arises from finite horizon considerations. However there are certain barriers that cannot be surmounted easily in any finite time horizon, but these barriers can be crossed in the stationary regime. Consequently, the quasipotential is infinite at some points where the rate function is finite. After highlighting this phenomenon, the paper studies some sufficient conditions on a class of interacting particle systems under which one can continue to assert that the Freidlin-Wentzell quasipotential is indeed the rate function.
MSC 2020 subject classifications: Primary 60F10; Secondary 60K35, 82C22, 60J74, 90B15
Keywords: Mean-field interaction, invariant measure, large deviations, static large deviation, Freidlin-Wentzell quasipotential, relative entropy
1 Introduction
For a broad class of Markov processes such as small-noise diffusions, finite-state mean-field models, simple exclusion processes, etc., it is well-known that the Freidlin-Wentzell quasipotential is the rate function that governs the large deviation principle (LDP) for the family of invariant measures [18, 33, 7, 17]. The quasipotential is the minimum cost (arising from the rate function for a process-level large deviation principle) associated with trajectories of arbitrary but finite duration, with fixed initial and terminal conditions. We begin this paper with two counterexamples of independently evolving countable-state particle systems for which the quasipotential is not the rate function for the family of invariant measures. In each of these counterexamples, the family of invariant measures satisfies the LDP with a suitable relative entropy as its rate function, and we show that the quasipotential is not the same as this relative entropy. Specifically, we show that there are points in the state space where the rate function is finite, but the quasipotential is infinite. These points cannot be reached easily via trajectories of arbitrary but finite time duration. However, the barriers to reach these points are surmounted in the stationary regime. There are, however, some sufficient conditions, at least on a family of such countable-state interacting particle systems, under which the Freidlin-Wentzell quasipotential is indeed the correct rate function; this will be the main result of this paper. Intuitively, the sufficient conditions cut down the speed of outward excursions and ensure that the insurmountable barriers for the finite horizon trajectories continue to be insurmountable in the stationary regime.
Before we describe the counterexamples and the main result, let us introduce some notations and describe the model of a countable-state mean-field interacting particle system. Let denote the set of non-negative integers and let denote a directed graph on . Let denote the space of probability measures on equipped with the total variation metric (which we denote by ). For each , let denote the set of probability measures on that can arise as empirical measures of -particle configurations on . For each , we consider a Markov process with the infinitesimal generator acting on functions on :
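\[ \mathscr{L}^N f(\xi) = \sum_{(z,z') \in \mathcal{E}} N \xi(z)\, \lambda_{z,z'}(\xi) \left[ f\left(\xi + \frac{\delta_{z'}}{N} - \frac{\delta_{z}}{N}\right) - f(\xi) \right], \quad \xi \in \mathcal{M}_1^N(\mathcal{Z}); \tag{1.1} \]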
here , , are given functions that describe the transition rates and denotes the Dirac measure. Such processes arise as the empirical measure of weakly interacting Markovian mean-field particle systems where the evolution of the state of a particle depends on the states of the other particles only through the empirical measure of the states of all the particles. Under suitable assumptions on the model, the martingale problem for is well posed and the associated Markov process possesses a unique invariant probability measure . This paper highlights certain nuances associated with the large deviation principle for the sequence on .
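The jump mechanism described above can be sketched as a standard Gillespie-type simulation: with N particles, a move of one particle from state z to state z' occurs at rate (number of particles at z) times the transition rate for that edge, evaluated at the current empirical measure. The following is a minimal sketch under stated assumptions, not the construction used in the paper. The function names, the edge structure in `targets` (forward edges z to z+1 and backward edges z to 0), and the particular rates in `rate` are all hypothetical choices made only for illustration.

```python
import random

def simulate_empirical_measure(counts, targets, rate, t_end, rng):
    """Simulate the empirical measure of an N-particle mean-field system up to time t_end.

    counts  : dict mapping state -> number of particles in that state
    targets : function z -> list of states reachable from z along an edge
    rate    : function (z, z_next, xi) -> per-particle transition rate,
              where xi is the current empirical measure (dict state -> mass)
    """
    n = sum(counts.values())
    counts = {z: c for z, c in counts.items() if c > 0}
    t = 0.0
    while True:
        xi = {z: c / n for z, c in counts.items()}
        # Total rate of the move z -> z_next is (number of particles at z) * rate.
        moves = [(z, zp, c * rate(z, zp, xi))
                 for z, c in counts.items() for zp in targets(z)]
        total = sum(w for _, _, w in moves)
        if total <= 0.0:
            break
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick one move with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for src, dst, w in moves:
            acc += w
            if u < acc:
                break
        counts[src] -= 1
        if counts[src] == 0:
            del counts[src]
        counts[dst] = counts.get(dst, 0) + 1
    return {z: c / n for z, c in counts.items()}

# Hypothetical edge set: forward edges (z, z+1) and backward edges (z, 0) for z >= 1.
def targets(z):
    return [z + 1] if z == 0 else [z + 1, 0]

# Hypothetical rates: forward rate decays like 1/(z+1) and depends on the mass at 0.
def rate(z, zp, xi):
    if zp == z + 1:
        return (1.0 + xi.get(0, 0.0)) / (z + 1)
    return 1.0

xi_T = simulate_empirical_measure({0: 50}, targets, rate, 5.0, random.Random(0))
```

Working with integer particle counts rather than floating-point masses keeps the state exactly on the grid of N-particle empirical measures throughout the run.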
Fix and let denote the Markov process with initial condition whose infinitesimal generator is . Its sample paths are elements of , the space of -valued functions on that are right-continuous with left limits equipped with the Skorohod topology. Such processes have been well studied in the past. Under mild conditions on the transition rates, when in as , it is well-known that the family converges in probability, in , as to the mean-field limit (see McKean [25] in the context of interacting diffusions and Bordenave et al. [6] in the context of countable-state mean-field models):
\[ \dot{\mu}_t = \Lambda_{\mu_t}^{*}\, \mu_t, \quad t \geq 0, \quad \mu_0 = \nu; \tag{1.2} \]
here denotes the derivative of at time , , , denotes the rate matrix when the empirical measure is (i.e., when , when , and ), and denotes the transpose of . The above dynamical system on is called the McKean-Vlasov equation. This mean-field convergence allows one to view the process as a small random perturbation of the dynamical system (1.2). The starting point of our study of the asymptotics of is the process-level LDP for , whenever converges to in . This LDP was established by Léonard [21] when the initial conditions are fixed, and by Borkar and Sundaresan [7] when the initial conditions converge in . (Often, as done in [7], one lets be random, and only requires in distribution, where is deterministic. For simplicity, we restrict to be deterministic.) The rate function of this LDP is governed by “costs” associated with trajectories on with initial condition , which we denote by , (see (2.8) for its definition).
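To make the McKean-Vlasov dynamics concrete, the following is a minimal sketch of an explicit Euler scheme for an equation of the form "time derivative of the measure equals the transposed rate matrix (evaluated at the current measure) applied to the measure", on a truncated state space. The truncation level, the step size, and the rate matrix in `rate_matrix` (forward edges z to z+1 with a rate that depends on the mass at 0, backward edges z to 0) are hypothetical choices for illustration only and are not taken from the paper.

```python
def rate_matrix(mu):
    """Hypothetical rate matrix on the truncated state space {0, ..., n-1}.

    Off-diagonal entry [z][z_next] is the rate of the edge (z, z_next) when the
    empirical measure is mu; each diagonal entry makes its row sum to zero.
    """
    n = len(mu)
    lam = [[0.0] * n for _ in range(n)]
    for z in range(n):
        if z + 1 < n:
            lam[z][z + 1] = (1.0 + mu[0]) / (z + 1)  # forward edge, decaying rate
        if z > 0:
            lam[z][0] = 1.0                          # backward edge to state 0
        lam[z][z] = -sum(lam[z])                     # diagonal is still 0 here
    return lam

def mckean_vlasov_euler(mu0, rate_matrix, dt, n_steps):
    """Explicit Euler steps for d(mu)/dt = transpose(rate_matrix(mu)) applied to mu."""
    mu = list(mu0)
    n = len(mu)
    for _ in range(n_steps):
        lam = rate_matrix(mu)
        drift = [sum(mu[z] * lam[z][zp] for z in range(n)) for zp in range(n)]
        mu = [mu[zp] + dt * drift[zp] for zp in range(n)]
    return mu

mu0 = [0.0] * 20
mu0[5] = 1.0  # all mass initially at state 5
mu_T = mckean_vlasov_euler(mu0, rate_matrix, dt=0.01, n_steps=1000)
```

Because every row of the rate matrix sums to zero, each Euler step preserves total mass up to rounding, and the small step size keeps all entries non-negative for these rates.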
We assume that is the unique globally asymptotically stable equilibrium of (1.2). Define the Freidlin-Wentzell quasipotential
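\[ V(\xi) := \inf\left\{ S_{[0,T]}(\varphi \mid \xi^*) : \varphi(0) = \xi^*, \ \varphi(T) = \xi, \ T > 0 \right\}, \quad \xi \in \mathcal{M}_1(\mathcal{Z}). \tag{1.3} \]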
From the theory of large deviations of the invariant measure of Markov processes [18, 33, 11, 7], is a natural candidate for the rate function of the family .
1.1 Two counterexamples
We begin with two counterexamples for which is not the rate function for the family of invariant measures.
1.1.1 Non-interacting M/M/1 queues
Consider the graph whose edge set consists of forward edges and backward edges (see Figure 1). Let and be two positive numbers. Consider the generator acting on functions on by
where for each and for each . When , the invariant probability measure associated with this Markov process is
For each , we consider particles, each of which evolves independently as a Markov process on with the infinitesimal generator . That is, the particles are independent M/M/1 queues. It is easy to check that the empirical measure of the system of particles is also a Markov process on the state space and it possesses a unique invariant probability measure, which we denote by .
On one hand, it is straightforward to see that the family satisfies the LDP on . Indeed, under stationarity, the state of each particle is distributed as . As a consequence, is the law of the random variable on , where are independent and identically distributed (i.i.d.) as . Therefore, by Sanov’s theorem [13, Theorem 6.2.10], satisfies the LDP with the rate function , where is the relative entropy defined as follows (we use the convention ):
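\[ I(\zeta \,\|\, \nu) := \begin{cases} \displaystyle \sum_{z \in \mathcal{Z}} \zeta(z) \log \frac{\zeta(z)}{\nu(z)}, & \text{if } \zeta \ll \nu, \\ +\infty, & \text{otherwise}. \end{cases} \tag{1.4} \]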
On the other hand, it is natural to conjecture that the rate function for the family is given by the quasipotential (1.3) with replaced by . However, as discussed in the next paragraph, the quasipotential is not the same as . Hence, from the uniqueness of the large deviations rate function [13, Lemma 4.1.4], the quasipotential does not govern the rate function for the family .
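As a small illustration of the Sanov rate function in this example, the sketch below forms the empirical measure of a particle configuration and evaluates its relative entropy with respect to a geometric law. It assumes the textbook form of the stationary law of a stable M/M/1 queue, namely mass (1 - rho) * rho**z at state z with rho equal to the ratio of the forward rate to the backward rate; the function names and the value of rho are hypothetical.

```python
import math
from collections import Counter

def empirical_measure(states):
    """Empirical measure of a particle configuration, as a dict state -> mass."""
    n = len(states)
    return {z: c / n for z, c in Counter(states).items()}

def geometric_pmf(rho, z):
    """Geometric law (1 - rho) * rho**z, assumed here as the M/M/1 stationary law."""
    return (1.0 - rho) * rho ** z

def relative_entropy(xi, pmf):
    """Relative entropy of xi with respect to pmf, with the convention 0 log 0 = 0."""
    total = 0.0
    for z, mass in xi.items():
        if mass > 0.0:
            ref = pmf(z)
            if ref == 0.0:
                return math.inf  # xi is not absolutely continuous w.r.t. pmf
            total += mass * math.log(mass / ref)
    return total

xi = empirical_measure([0, 0, 1, 3])
rate_value = relative_entropy(xi, lambda z: geometric_pmf(0.5, z))
```

For a geometric reference law, the relative entropy splits into the negative entropy of the measure, a constant, and a positive multiple of its first moment, which is consistent with the finiteness criterion stated in the text.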
We now provide some intuition on why the quasipotential is not the rate function in the example under consideration. For a formal proof, see Section 8. We first introduce some notation. Let denote the infinite product of equipped with the product topology. We view as the subset of with the subspace topology (e.g., see [15, Chapter 3, Section 2]). If , we define
(1.5) |
whenever the limit exists. Also, define by
\[ \vartheta(z) := z \log z, \quad z \in \mathcal{Z}, \tag{1.6} \]
with the convention that , and define , . Using the fact that has geometric decay, it can be checked that is finite if and only if the first moment of (i.e., ) is finite. However it turns out that (i.e., the quantity in (1.3) with replaced by ) is finite if and only if the -moment of (i.e., ) is finite. In particular, if we consider a whose first moment is finite but -moment is infinite then . Let , be such that but , and consider the -neighbourhood of in . By Sanov’s theorem, the probability of this neighbourhood under is of the form . For a fixed , let us now try to estimate the probability of being in this neighbourhood when is in a small neighbourhood of . If the process is initiated at a near , then the probability that the random variable is in the -neighbourhood of is at most
Since is lower semicontinuous (we prove this in Lemma 5.4), we must have
Hence we can choose an small enough so that . For this , the probability that lies in the -neighbourhood of is upper bounded by , which is smaller than , even in the exponential scale, for large enough . That is, for any arbitrary but fixed , we can find a small neighbourhood of such that the probability that lies in that neighbourhood is smaller than what we expect to see in the stationary regime. In other words, there are some barriers in that cannot be surmounted in any finite time, yet these barriers can be crossed in the stationary regime. These barriers indicate that, to obtain the correct stationary regime probability of a small neighbourhood of using the dynamics of , one should wait longer than any fixed time horizon. That is, one should consider the random variable , where is a suitable function of , and estimate the probability that belongs to a small neighbourhood of . However, it is not straightforward to obtain such estimates from the process-level large deviation estimates of since the latter are usually available for a fixed time duration.
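Points with finite first moment but infinite moment of the kind used above are easy to exhibit. The following worked example is a sketch that assumes the moment in question is the $z \log z$ moment; the particular measure $\xi$ is chosen here only for illustration and does not come from the paper.

```latex
% Hypothetical example: a probability measure on the non-negative integers
% with finite first moment but infinite z log z moment.
\[
  \xi(z) = \frac{c}{z^{2} (\log z)^{2}}, \quad z \geq 2, \qquad \xi(0) = \xi(1) = 0,
\]
where $c > 0$ is the normalising constant. By the integral test,
\[
  \sum_{z \geq 2} z \, \xi(z) = c \sum_{z \geq 2} \frac{1}{z (\log z)^{2}} < \infty
  \quad \text{since} \quad \int_{2}^{\infty} \frac{dx}{x (\log x)^{2}} = \frac{1}{\log 2},
\]
whereas
\[
  \sum_{z \geq 2} z \log z \, \xi(z) = c \sum_{z \geq 2} \frac{1}{z \log z} = \infty
  \quad \text{since} \quad \int_{2}^{M} \frac{dx}{x \log x} = \log\log M - \log\log 2 \to \infty .
\]
```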
There are natural barriers in the context of finite-state mean-field models when the limiting dynamical system has multiple (but finitely many) stable equilibria [36]. In such situations, passages from a neighbourhood of one equilibrium to a neighbourhood of another take place over time durations of the form where is the number of particles (here refers to a bounded sequence, and refers to a sequence that goes to ). Interestingly, these barriers can be surmounted using trajectories of finite time durations; i.e., for any fixed , the probability that the empirical measure process reaches a neighbourhood of an equilibrium at time when it is initiated in a small neighbourhood of another equilibrium is of the form . In contrast, in the case of the above counterexample, the barriers cannot be surmounted in finite time durations; for any fixed , the probability that reaches a small neighbourhood of a point in with finite first moment but infinite -moment when it is initiated from a neighbourhood of is of the form . Hence we anticipate that the barriers that we encounter in the above counterexample are somehow more difficult to surmount than those that arise in the case of finite-state mean-field models with multiple stable equilibria.
1.1.2 Non-interacting nodes in a wireless network
We provide another counterexample where the issue is similar. Consider the graph whose edge set consists of forward edges and backward edges (see Figure 2). Let and be positive numbers. Consider the generator acting on functions on by
where for each and for each . The invariant probability measure associated with this Markov process is
Similar to the previous example, for each , we consider particles, each of which evolves independently as a Markov process on with the infinitesimal generator . It is easy to check that the empirical measure of the system of particles possesses a unique invariant probability measure, which we denote by . Under stationarity, the state of each particle is distributed as . As a consequence, is the law of the random variable on , where are i.i.d. . Hence, by Sanov’s theorem, the family satisfies the LDP with the rate function . As we show in Section 8, in this example too, the quasipotential (1.3) with replaced by is not the same as . As in the previous example, there are points where but , points that have a finite first moment but infinite -moment. Once again, the quasipotential does not govern the rate function for the family .
1.2 Assumptions and main result
We now provide some assumptions on the model of countable-state mean-field interacting particle systems that ensure that the barriers in that are insurmountable using trajectories of arbitrary but finite time duration remain insurmountable in the stationary regime as well. Under these assumptions, we prove the main result of this paper, i.e., the sequence of invariant measures satisfies the LDP with rate function .
1.2.1 Assumptions
Our first set of assumptions is on the mean-field interacting particle system (i.e., on the generator defined in (1.1)).
(A1) The edge set is given by
(A2) There exist positive constants and such that
for all .
(A3) The functions , and , , are uniformly Lipschitz continuous on .
Note that assumption (A1) considers a specific transition graph (Figure 2) for each particle. This graph arises in the contexts of random backoff algorithms for medium access in wireless local area networks [20] and decentralised control of loads in a smart grid [26]. Assumption (A2) ensures that the forward transition rates at state decay as . This key assumption cuts down the speed of outward excursions and enables us to overcome the issue described in the counterexamples. To highlight this, consider a modified example of Section 1.1.2 where , ; the rest of the description remains the same. Let denote the invariant probability measure associated with one particle. It can be checked that is of the order of , unlike which has geometric decay. As a consequence, is finite if and only if the -moment of is finite. Hence, by imposing (A2), we have ensured that the barriers in that are insurmountable for finite time duration trajectories continue to remain insurmountable in the stationary regime; this is the key property that enables us to prove the main result of this paper. Assumption (A3) is a uniform Lipschitz continuity property for the transition rates, which is required for the process-level LDP for to hold and for the McKean-Vlasov equation (1.2) to be well-posed.
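The contrast between geometric and factorial-type decay of the single-particle invariant measure can be checked numerically. The sketch below assumes, for illustration only, a chain with forward edges from z to z+1 and backward edges from every z >= 1 to 0, truncated to finitely many states; the function name, the rate values, and the truncation level are hypothetical. Global balance at a state z >= 1 then reads: mass at z-1 times the forward rate out of z-1 equals mass at z times the total rate out of z.

```python
def stationary_reset_chain(forward_rate, backward_rate, n_states):
    """Truncated stationary law of a chain with edges z -> z+1 and z -> 0 (z >= 1).

    forward_rate  : function z -> rate of the edge (z, z+1)
    backward_rate : constant rate of the edge (z, 0) for z >= 1
    """
    weights = [1.0]
    for z in range(1, n_states):
        inflow = forward_rate(z - 1)                 # only z-1 feeds state z
        outflow = forward_rate(z) + backward_rate    # total rate of leaving z
        weights.append(weights[-1] * inflow / outflow)
    total = sum(weights)
    return [w / total for w in weights]

# Constant forward rates: successive ratios are constant, i.e. geometric decay.
pi_const = stationary_reset_chain(lambda z: 1.0, 1.0, 40)

# Forward rates decaying like 1/(z+1): successive ratios are below 1/z,
# so the mass at z is at most a constant times 1/z!.
pi_decay = stationary_reset_chain(lambda z: 1.0 / (z + 1), 1.0, 40)

ratios_const = [pi_const[z] / pi_const[z - 1] for z in range(1, 40)]
ratios_decay = [pi_decay[z] / pi_decay[z - 1] for z in range(1, 40)]
```

Since the logarithm of z! grows like z log z, factorial-type decay of the reference measure is what ties finiteness of the relative entropy to a z log z moment rather than to the first moment.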
Our second set of assumptions is on the McKean-Vlasov equation (1.2). Let , , denote the solution to the limiting dynamics (1.2) with initial condition . Recall the function . Define , .
(B1) There exists a unique globally asymptotically stable equilibrium for the McKean-Vlasov equation (1.2).
(B2) and for each .
The first assumption above asserts that all the trajectories of (1.2) converge to as time becomes large. The proof of the LDP upper and lower bounds for the family involves construction of trajectories that start at suitable compact sets, reach the stable equilibrium using arbitrarily small cost, and then terminate at a desired point in starting from . All these are enabled by assumption (B1) (see more remarks about this assumption in Section 1.3). The second assumption asserts that the -moment of the solution to the limiting dynamics converges uniformly over initial conditions lying in sets of bounded -moment. In the case of a non-interacting system that satisfies (A1) but with constant forward transition rates (for example, see in Section 1.1.2), the analogue of this assumption can easily be verified: the first moment of the solution to the limiting dynamics converges uniformly over initial conditions lying in sets of bounded first moment. In fact, one can explicitly write down the first moment of the solution to the limiting dynamics in this case and verify this assumption easily. Assumption (B2) is the analogous statement for our mean-field system that satisfies the -decay of the forward transition rates in assumption (A2).
1.2.2 Main result
We now state the main result of this paper, namely the LDP for the family of invariant measures under the assumptions (A1)–(A3) and (B1)–(B2).
We first assert the existence and uniqueness of the invariant measure for for each , and the exponential tightness of the family .
Proposition 1.1.
Recall the quasipotential defined in (1.3). We now state the main result of this paper.
Theorem 1.1.
The proof of this result is carried out in Sections 4–7. We begin with the process-level uniform LDP for over compact subsets of ; this uniform LDP gives us the large deviation estimates for the process uniformly over the initial conditions lying in a given compact set (see Definition 2.2 and Theorem 2.1). We prove the LDP for the family by transferring this process-level uniform LDP for over compact subsets of to the stationary regime. The proof of the LDP lower bound (in Section 4) considers specific trajectories and lower bounds the probability of small neighbourhoods of points in under using the probability that the process remains close to these trajectories. For the proof of the upper bound, we require certain regularity properties of the quasipotential. These properties are established in Section 5. We first show a controllability property (this terminology is from Cerrai and Röckner [11]) for : is finite if and only if . Using the lower bound proved in Section 4, we then show that the level sets of are compact subsets of . Since is not locally compact and has compact lower level sets, we do not expect to be continuous on . Indeed, if is such that is continuous at and , given there exists a such that implies that . In particular, . Since is compact in , this shows that has a relatively compact neighbourhood in , which is a contradiction. This shows that, for any such that , is discontinuous at . However, we show the following small cost connection property: whenever in and as , we have . These properties of the quasipotential are then used to transfer the process-level uniform LDP upper bound for (uniform over compact subsets of ) to the LDP upper bound for the family of invariant measures. The proof of the upper bound is carried out in Section 6. Finally, we complete the proof of the theorem in Section 7.
While the proofs of our lower and upper bounds follow the general methodology of Sowers [33], there are significant model-specific difficulties that arise in our context. The main novelty in the proof of Theorem 1.1 is to establish the small cost connection property of the quasipotential under assumptions (A1)–(A3) and (B1)–(B2). That is, we can find trajectories of small cost that start at and end at points in whose -moment is not very far from that of . In the work of Sowers [33], this has been carried out by considering the “straight-line” trajectory that connects the attractor to the nearby point under consideration. Such a trajectory may not have small cost in our case since the mass transfer is restricted to the edges in . We overcome this difficulty by considering a piecewise constant velocity mass transfer via the edges in . We then carefully estimate the cost of this trajectory and prove the necessary small cost connection property. We also simplify the proof of the compactness of the lower level sets of ; while Sowers [34, Proposition 7] studies the minimisation of the costs of trajectories over the infinite-horizon, we arrive at it by using the LDP lower bound and the exponential tightness of the family . We also remark that the methodology of Sowers [33] has been used by Cerrai and Röckner [11] in the context of stochastic reaction diffusion equations and by Cerrai and Paskal [9] in the context of two-dimensional stochastic Navier-Stokes equations.
1.3 Discussion and future directions
The main result and the counterexamples suggest that in order for the family of invariant measures of a Markov process to satisfy the large deviation principle with rate function governed by the Freidlin-Wentzell quasipotential, one must have some good properties on the model under consideration. In the case of our main result, this goodness property was achieved by the -decay of the forward transition rates from assumption (A2). We use this assumption to show the exponential tightness of the invariant measure over compact subsets with bounded -moments. It also enables us to show the necessary regularity properties of the quasipotential required to transfer the process-level large deviation result to the stationary regime. However, a general treatment of the LDP for the family of invariant measures of Markov processes (that encompasses the cases of [33, 11, 9, 7, 17]), especially when the ambient state space is not locally compact, is missing in the literature.
One of the assumptions that plays a significant role in the proof of our main result is the existence of a unique globally asymptotically stable equilibrium for the limiting dynamics (assumption (B1)); in the works of Sowers [33], Cerrai and Röckner [11], and Cerrai and Paskal [9], the model assumptions ensure that (B1) holds. In general, the limiting dynamical system (1.2) could possess multiple -limit sets. In that case the approach of our proofs breaks down. A well-known approach to study large deviations of the invariant measures in such cases is to focus on small neighbourhoods of these -limit sets and then analyse the discrete time Markov chain that evolves on these neighbourhoods. The LDP then follows from the estimates of the invariant measure of this discrete time chain (see Freidlin and Wentzell [18, Chapter 6, Section 4]). However, this approach requires the uniform LDP over open subsets of , which is not yet available for our mean-field model. If this can be established, along with the regularity properties of the quasipotential established in Section 5, one can not only use the above idea to extend our main result to the case when the limiting dynamical system possesses multiple -limit sets but also to study exit problems and metastability phenomena in our mean-field model.
Another definition of the quasipotential appears in the literature. It is given by the minimisation of costs of the form over infinite-horizon trajectories on such that the terminal time condition is fixed and as (see Sowers [33], Cerrai and Röckner [11]). While it is clear that the above definition of the quasipotential is a lower bound for in (1.3), unlike in Sowers [33] and Cerrai and Röckner [11], we are not able to show that the two definitions are the same. A proof of this equality, or otherwise, will add more insight on the general case.
We remark that assumption (A3) does not play a role in the proof of our main result. It is used to invoke the process-level LDP for (see Theorem 2.1) and the well-posedness of the limiting dynamical system (1.2). If these two properties are established through some other means then the proof of Theorem 1.1 holds verbatim without the need for assumption (A3).
Finally, we mention that a time-independent variational formula for the quasipotential is available for some non-reversible models in statistical mechanics, see Bertini et al. [2, 3]. It is not clear if the quasipotential in (1.3) admits a time-independent variational form. This would be an interesting direction to explore.
1.4 Related literature
Process-level large deviations of small-noise diffusion processes have been well studied in the past. For finite-dimensional large deviation problems, see Freidlin and Wentzell [18, Chapter 5], Liptser [23], Veretennikov [35], Puhalskii [29], and the references therein. For infinite-dimensional problems where the state space is not locally compact, see Sowers [34] and Cerrai and Röckner [10]. More recently, the uniform large deviation principle (uniform LDP) for Banach-space valued stochastic differential equations over the class of bounded and open subsets of the Banach space has been studied by Salins et al. [31]. These results have been used to study exit times and metastability in such processes, see Salins and Spiliopoulos [32]. While the above works focus on diffusion processes, our work focuses on the stationary regime large deviations of countable-state mean-field models with jumps. In the spirit of the small-noise problems listed above, our process can be viewed as a small random perturbation of the dynamical system (1.2) on .
In the context of interacting particle systems, Dawson and Gärtner [12] established the process-level LDP for weakly interacting diffusion processes, and Léonard [21] and Borkar and Sundaresan [7] extended this to mean-field interacting particle systems with jumps. In this work, we focus on the stationary regime large deviations of mean-field models with jumps when the state of each particle comes from a countable set. For small-noise diffusion process on Euclidean spaces and finite-state mean-field models, since the state space (on which the empirical measure process evolves) is locally compact, the process-level large deviation results have been extended in a straightforward manner to the uniform LDP over the class of open subsets of the space. Such uniform large deviation estimates have been used to prove the large deviations of the invariant measure and the exit time estimates, see Freidlin and Wentzell [18, Chapter 6] in the context of diffusion processes, Borkar and Sundaresan [7] and [36] in the context of finite-state mean-field models. One of the key ingredients in these proofs is the continuity of the quasipotential. However in our case, the state space is infinite-dimensional and not locally compact. Therefore, since the quasipotential (1.3) is expected to have compact lower level sets, we do not expect it to be continuous on unlike in the finite-dimensional problems mentioned above. Hence the ideas presented in [7] are not directly applicable to our context of the LDP for the family of invariant measures.
Large deviations of the family of invariant measures for small-noise diffusion processes on non-locally compact spaces have also been studied in the past, see Sowers [33] and Cerrai and Röckner [11]. They have a unique attractor for the limiting dynamics, and the proof essentially involves conversion of the uniform LDP over the finite-time horizon to the stationary regime. Martirosyan [24] studied a situation where the limiting dynamical system possesses multiple attractors. For the study of large deviations of the family of invariant measures for simple exclusion processes, see Bodineau and Giacomin [5] and Bertini et al. [3]. More recently, Farfán et al. [17] extended this to a simple exclusion process whose limiting hydrodynamic equation has multiple attractors. Their proof proceeds similarly to the case of finite-dimensional diffusions in Freidlin and Wentzell [18, Chapter 6, Section 4] by first approximating the process near the attractors and then using the Khasminskii reconstruction formula [19, Chapter 4, Section 4]. In particular, it requires the uniform LDP to hold over open subsets of the state space. Since their state space, although infinite-dimensional, is compact, the proof of the uniform LDP over open subsets easily follows from the process-level LDP. Also, the compactness of the state space simplifies the proofs of the small cost connection property from the attractors to nearby points, a property needed in the Khasminskii reconstruction. Although we restrict our attention to the case of a unique globally asymptotically stable equilibrium as in [33, 11], the main novelty of our work is that we establish certain regularity properties of the quasipotential for countable-state mean-field models with jumps, which had not been done in the past. We then use these properties to prove the LDP for the family of invariant measures.
Furthermore, we demonstrate two counterexamples where the rate functions of the stationary regime LDP are not governed by the usual quasipotential. To the best of our knowledge, such examples, where the LDP for the family of invariant measures holds but the rate function is not governed by the usual Freidlin-Wentzell quasipotential, are new. These examples are constructed in such a way that the particle systems do not possess the small cost connection property from the attractor to nearby points with finite first moment but infinite -moment.
Large deviations of the family of invariant measures for a queueing network in a finite-dimensional setting have been studied by Puhalskii [28]. Finally, large deviations of the family of invariant measures for a stochastic process under some general conditions have been studied by Puhalskii [30]. One of the conditions there is the small cost connection property between any two nearby points in the state space, which we do not expect to be true in our countable-state mean-field model since our state space is infinite-dimensional.
1.5 Organisation
This paper is organised as follows. In Section 2, we provide preliminary results on the large deviations over finite time horizons. The proof of the main result is carried out in Sections 3–7. In Section 3, we prove the existence, uniqueness, and exponential tightness of the family of invariant measures. In Section 4, we prove the LDP lower bound for the family of invariant measures. In Section 5, we establish some regularity properties of the quasipotential defined in (1.3). In Section 6, we prove the LDP upper bound for the family of invariant measures. In Section 7, we complete the proof of the main result. Finally in Section 8, we prove that the quasipotential differs from the relative entropy (with respect to the globally asymptotically stable equilibrium) for the two counterexamples discussed in Section 1.1.
2 Preliminaries
2.1 Frequently used notation
We first summarise the frequently used notation in the paper. Let denote the set of nonnegative integers and let denote a directed graph on . Let denote the infinite product of equipped with the topology of pointwise convergence. Let denote the space of functions on with compact support. Recall that denotes the space of probability measures on equipped with the total variation metric (denoted by ). This metric generates the topology of weak convergence on . By Scheffé’s lemma [15, Chapter 3, Section 2], can be identified with the subset of with the subspace topology. For each , recall that denotes the space of probability measures on that can arise as empirical measures of -particle configurations on . Recall defined in (1.6). Given and , let the bracket denote . Similarly, given , let the bracket denote , whenever the limit exists. For , define
(2.1) |
by Prohorov’s theorem, is a compact subset of . Define . Let denote the globally asymptotically stable equilibrium for the McKean-Vlasov equation (1.2) (see assumption (B1)). For each , define
(2.2) |
note that depends on as well (which we do not indicate for ease of readability). Define
\[ \tau(u) := e^{u} - u - 1, \quad u \in \mathbb{R}. \tag{2.3} \]
Note that is the log-moment generating function of the centred unit rate Poisson law, and define its convex dual
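\[ \tau^{*}(u) := \begin{cases} (u+1) \log(u+1) - u, & \text{if } u > -1, \\ 1, & \text{if } u = -1, \\ +\infty, & \text{if } u < -1. \end{cases} \tag{2.7} \]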
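The convex dual of a log-moment generating function can be checked numerically against a closed form. The sketch below takes the log-moment generating function of the centred unit rate Poisson law to be exp(u) - u - 1, which is a standard fact, and compares a brute-force supremum over a grid with the candidate closed form (t + 1) log(t + 1) - t for t > -1. The function names, the grid range, and the grid size are hypothetical choices for illustration.

```python
import math

def tau(u):
    """Log-moment generating function of the centred unit rate Poisson law."""
    return math.exp(u) - u - 1.0

def tau_star_closed_form(t):
    """Candidate closed form of the convex dual sup_u (u * t - tau(u))."""
    if t < -1.0:
        return math.inf
    if t == -1.0:
        return 1.0
    return (t + 1.0) * math.log(t + 1.0) - t

def tau_star_numeric(t, lo=-20.0, hi=20.0, steps=200001):
    """Brute-force approximation of sup_u (u * t - tau(u)) over a uniform grid."""
    best = -math.inf
    for i in range(steps):
        u = lo + (hi - lo) * i / (steps - 1)
        best = max(best, u * t - tau(u))
    return best

checks = {t: (tau_star_numeric(t), tau_star_closed_form(t)) for t in (-0.5, 0.0, 2.0)}
```

The maximiser is u = log(t + 1) for t > -1, so a grid over a bounded interval is adequate for moderate values of t; for t close to -1 the maximiser drifts to minus infinity and a wider grid would be needed.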
For a complete and separable metric space , , and , let denote . For a set let denote the complement of . For two numbers and , let (resp. ) denote maximum (resp. minimum) of and . Also, let . For a metric space , let denote the Borel -field on . Finally, constants are denoted by and their values may be different in each occurrence.
2.1.1 Notation related to the dynamics
Let denote the space of -valued functions on that are right continuous with left limits. It is equipped with the Skorohod topology which makes it a complete and separable metric space (see, for example, Ethier and Kurtz [16, Chapter 3]). Let denote a metric on that generates the Skorohod topology. An element of is called a “trajectory”, and we shall refer to the process-level large deviations rate function evaluated on a trajectory as the “cost” associated with that trajectory. For a trajectory , let both and denote the evaluation of at time . For and , let denote the solution to the -valued martingale problem for with initial condition (whenever the martingale problem for is well-posed). Let denote the random element of whose law is . For each , let denote the generator acting on functions on by
i.e., the generator of the single particle evolving on under the static mean-field .
Let denote the space of real-valued functions on with compact support that are continuously differentiable in the first argument. Given a trajectory such that the mapping is absolutely continuous (see Dawson and Gärtner [12, Section 4.1]), one can define for almost all such that
holds for each and .
Finally, let denote the space of probability measures on equipped with the usual weak topology. Also, let denote the space of probability measures on equipped with the weak topology.
2.2 Process-level large deviations
We first recall the definition of the large deviation principle for a family of random variables indexed by one parameter.
Definition 2.1 (Large deviation principle).
Let be a metric space. We say that a family of -valued random variables defined on a probability space satisfies the large deviation principle with rate function if
• (Compactness of level sets). For any , is a compact subset of ;
• (LDP lower bound). For any , , and , there exists such that
for any ;
• (LDP upper bound). For any , , and , there exists such that
for any .
This definition is also used to study the large deviations of a family of probability measures. For each , let , the law of the random variable on . We say that the family of probability measures satisfies the LDP on with rate function if the sequence of -valued random variables satisfies the LDP with rate function .
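To make the definition concrete, here is a toy illustration that is not part of the paper's setting: N independent particles on a two-point state space, each in state 1 with probability p. The fraction of particles in state 1 satisfies an LDP whose rate function is the relative entropy between Bernoulli laws, so the normalised log-probability of the closed set [a, 1] with a > p should approach the relative entropy at a. All function names and parameter values below are illustrative assumptions.

```python
import math

def log_binom_pmf(n: int, k: int, p: float) -> float:
    """log P(Bin(n, p) = k), computed in log space to avoid underflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def log_upper_tail(n: int, k_min: int, p: float) -> float:
    """log P(Bin(n, p) >= k_min) via a log-sum-exp over the tail."""
    logs = [log_binom_pmf(n, k, p) for k in range(k_min, n + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(x - m) for x in logs))

def kl_bernoulli(a: float, p: float) -> float:
    """Relative entropy of Bernoulli(a) with respect to Bernoulli(p)."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

def empirical_rate(n: int, a: float, p: float) -> float:
    """-(1/n) log P(fraction of particles in state 1 is at least a)."""
    return -log_upper_tail(n, math.ceil(a * n), p) / n

# Hypothetical parameters: p = 0.3 and the rare event {fraction >= 0.5}.
gap_small_n = empirical_rate(200, 0.5, 0.3) - kl_bernoulli(0.5, 0.3)
gap_large_n = empirical_rate(2000, 0.5, 0.3) - kl_bernoulli(0.5, 0.3)
```

The gap between the empirical decay rate and the rate function shrinks as the number of particles grows, which is the content of the lower and upper bounds taken together.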
The LDP lower bound in the above definition is equivalent to the following statement [18, Chapter 3, Section 3]
Similarly, under the compactness of the level sets of the rate function , the LDP upper bound above is equivalent to the following statement:
To study the LDP for the family of invariant measures, we require estimates on the probabilities of the process-level large deviations of . In particular, we consider hitting times of on certain subsets of the state space and apply the process-level large deviation lower and upper bounds for starting at these subsets. Therefore, in addition to the scaling parameter , we must consider the process indexed by the initial condition . To study the process-level large deviations of such stochastic processes indexed by two parameters, we use the following definition of the uniform large deviation principle (see Freidlin and Wentzell [18, Chapter 3, Section 3]).
Definition 2.2 (Uniform large deviation principle).
We say that the family of -valued random variables defined on a probability space satisfies the uniform large deviation principle over the class of subsets of with the family of rate functions , , , if
• (Compactness of level sets). For each compact and , is a compact subset of , where ;
• (Uniform LDP lower bound). For any , , , and , there exists such that
for all , , and ;
• (Uniform LDP upper bound). For any , , , and , there exists such that
for all , , and .
Note that the initial conditions in the upper and lower bounds lie in , unlike in the definition in [18, Chapter 3, Section 3].
We now make some definitions. Recall defined in (2.3). For each and , define the functional by
(2.8) |
whenever and the mapping is absolutely continuous; otherwise. Define the lower level sets of the functional by
The next lemma asserts that these level sets are compact in when the initial conditions belong to a compact subset of . The proof is deferred to Appendix A.
Lemma 2.1.
For each , , and compact,
is a compact subset of .
The starting point of our study of the invariant measure asymptotics is the following uniform large deviation principle for the family over the class of compact subsets of with the family of rate functions . Its proof uses the process-level LDP for studied in Léonard [21] for a fixed initial condition and its extension (when is a finite set) to the case when initial conditions converge to a point in in Borkar and Sundaresan [7]. The proof can be found in Appendix A.
Theorem 2.1.
The rate function admits a non-variational representation in terms of a minimal cost “control” that modulates the transition rates across various edges in so that the desired trajectory is obtained. Recall defined in (2.7).
Theorem 2.2 (Non-variational representation; Léonard [22]).
Let be such that . Then there exists a measurable function such that
(2.9) |
holds for all and all , and admits the non-variational representation
(2.10) |
3 Invariant measure: Existence, uniqueness, and exponential tightness
In this section we prove Proposition 1.1, the existence and uniqueness of the invariant measure for for each , and the exponential tightness of the family of invariant measures . The proof relies on the standard Krylov-Bogolyubov argument and a coupling between the interacting particle system under consideration and a non-interacting system with maximal forward transition rates and minimal backward transition rates.
We first introduce some notation for the non-interacting particle system. Let denote the generator acting on functions on by
(3.1) |
where and . For each , let denote the solution to the -valued martingale problem for with initial condition . Integration with respect to is denoted by . Let denote the unique invariant probability measure for . Let denote the solution to the martingale problem for with initial law . Integration with respect to is denoted by . By solving the detailed balance equations for , we see that
In particular, has superexponential decay in , and for small enough , where is defined in (1.6). Finally, for each , let denote the solution to the -valued martingale problem for with initial condition , replaced by and replaced by in (1.1), respectively, for each . Integration with respect to is denoted by . Also, recall , , from Section 2.1.1. We are now ready to prove Proposition 1.1.
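The detailed balance computation mentioned above can be sketched numerically. The rates used below are an assumption made purely for illustration: a forward rate of the form `lam_bar / (z + 1)` and a constant backward rate `mu_low`, for which detailed balance yields a Poisson law and hence superexponential decay. The actual rates are those appearing in (3.1), and the names `lam_bar`, `mu_low`, and `birth_death_invariant` are hypothetical.

```python
import math

def birth_death_invariant(forward, backward, z_max: int) -> list:
    """Invariant law of a birth-death chain on {0, ..., z_max}.

    Detailed balance reads pi(z) * forward(z) = pi(z + 1) * backward(z + 1),
    so the unnormalised weights follow from a one-step recursion.
    """
    weights = [1.0]
    for z in range(z_max):
        weights.append(weights[-1] * forward(z) / backward(z + 1))
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical rates (assumption, not taken from the text).
lam_bar, mu_low = 2.0, 1.0
pi = birth_death_invariant(lambda z: lam_bar / (z + 1),
                           lambda z: mu_low,
                           z_max=60)

# Under these rates the solution is Poisson with mean lam_bar / mu_low.
poisson = [math.exp(-lam_bar / mu_low) * (lam_bar / mu_low) ** z / math.factorial(z)
           for z in range(61)]

# Superexponential decay: -log(pi(z)) / z keeps growing with z.
decay_at_10 = -math.log(pi[10]) / 10
decay_at_50 = -math.log(pi[50]) / 50
```

With a constant forward rate instead, the same recursion would return a geometric law, whose tail is only exponentially small.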
Proof of Proposition 1.1.
Fix . We first show the existence and uniqueness of the invariant probability measure for . Consider the family of probability measures on defined by
Let denote the state of the th particle at time . Recall the compact sets , , defined in (2.1). We first couple the laws and . For , define . Let denote the -length vector with a in the th position and everywhere else. Consider the Markov process on with the infinitesimal generator acting on functions on by
Such couplings were studied for continuous-time Markov chains, see, e.g., [27]. Note that, under the above Markov process, for any two initial conditions , the empirical measure flow associated with the first (resp. second) marginal has law (resp. ). Therefore, for any , , and , we have
(3.2) |
where the first inequality follows from the above coupling since (i) the th particle under moves from to whenever it does so under , and (ii) the th particle under moves to (i.e., a to transition for some ) whenever it does so under . The second inequality in (3.2) is a consequence of Chebyshev’s inequality. Recall , and the laws and . We couple the laws and . Consider the Markov process on with the infinitesimal generator acting on functions on by
Note that, when the initial condition has law , the first (resp. second) component under the above process has law (resp. ). Also, note that if then for all under the above coupling. Since the first component is at least the second component under the initial law , it follows that . The latter is finite for sufficiently small , thanks to the decay of the probability measure on . Thus we can choose small enough (independent of ) so that . Hence (3.2) implies that
Therefore, for any and , we get
(3.3) |
Since is a compact subset of , this shows that the family is tight. Hence it follows that there exists an invariant probability measure for (see, for example, Ethier and Kurtz [16, Theorem 9.3, page 240]). By Assumption (A1), is an irreducible Markov process; hence is the unique invariant probability measure for .
We now show the exponential tightness of the family . Let be given, and choose . For each , since is a weak limit of the family as , from (3.3) with replaced by , it follows that
(3.4) |
for each . Hence,
which establishes that the family is exponentially tight. This completes the proof of the proposition. ∎
4 The LDP lower bound
In this section we prove the LDP lower bound for the family . To lower bound the probability of a small neighbourhood of a point under , we first produce a trajectory that starts at for a suitable , connects to with a small cost, and then reaches from with cost arbitrarily close to , where is the quasipotential defined in (1.3). The probability of a small neighbourhood of under is then lower bounded by the probability that the process remains in a small neighbourhood of the trajectory constructed above. The latter is then lower bounded using the uniform LDP lower bound for , where the uniformity is over the initial condition lying in a given compact subset of .
Recall defined in (2.2). We begin with a lemma that allows us to connect points in to for small enough with small cost. We omit its proof here, since it follows from a certain continuity property of which will be shown in Lemma 5.3.
Lemma 4.1.
Given there exist and such that for any there exists a trajectory on such that , , and .
We now prove the LDP lower bound for the family .
Lemma 4.2.
For any , , and , there exists such that
(4.1) |
for all .
Proof.
Fix , , and . We may assume that ; if then (4.1) trivially holds for all . Choose some and such that for all ; this is possible from the exponential tightness of the family , see Proposition 1.1. Using Lemma 4.1, choose and such that for any there exists a trajectory on such that , and . Since is the globally asymptotically stable equilibrium for (1.2) and since is compact, for the above , there exists a such that for any we have , where denotes the solution to the McKean-Vlasov equation (1.2) with initial condition (see assumption (B2)). Also, by the definition of , there exists a and a trajectory such that , and . Let . Given , we construct a trajectory on by using the above three trajectories as follows. Let ; for ; for ; and for . Note that .
Recall that is the metric on and is the metric on . Note that we can choose a (depending on and ) such that implies that for any and . Indeed, if such a choice is not possible, then there exists a sequence , and a sequence of trajectories such that and for each , but . By the compactness of the level sets of in Lemma 2.1, it follows that there exists a subsequential limit for (say, ); since , also converges to in as . Furthermore, since , from Theorem 2.2, we have that is continuous. Since is continuous at all such that is continuous (see, e.g, [4, page 124]), it follows that as . This contradicts the assumption . This shows that we can choose a such that implies that for any and . Therefore, for each , we have
(4.2) |
here the first equality follows since is invariant to time shifts. By the uniform LDP lower bound in Theorem 2.1, there exists such that
for all , , and . Noting that for any , and using the above uniform LDP lower bound, (4.2) becomes
for all . Finally, choose so that . Then the above becomes
for all . This completes the proof of LDP lower bound for the family . ∎
5 Properties of the quasipotential
In this section we prove three key properties of the quasipotential defined in (1.3). These three properties are (i) a characterisation of the set of points for which is finite, (ii) a certain continuity property for , and (iii) the compactness of the lower level sets of . These properties play an important role in the proof of the LDP upper bound in Section 6.
5.1 A characterisation of finiteness of the quasipotential
Recall the function defined in (1.6) and the compact sets , , defined in (2.1). We start with a lemma that enables us to connect , the point mass at state , to a point for some . This connection is made using a piecewise constant velocity trajectory wherein for each , we move the mass from state to state in steps; in the th step, we move the mass from state to state with unit velocity. The lemma asserts that the cost of this piecewise constant velocity trajectory is bounded above by a constant that depends only on .
Lemma 5.1.
Given there exists a constant depending on such that for any there exists a and a trajectory on such that , , and .
Proof.
Fix and . Fix and define , for , and . Note that . We shall first construct a trajectory such that , for each , and bounded above by a constant independent of .
Let . For each , starting with , we move the mass from the state to state using a piecewise unit velocity trajectory over the time duration . We define this trajectory on as follows. Let . For each and , when , let
, and define , , .
We now calculate the cost of this trajectory. Fix such that , and let . For each and , note that
Hence,
(5.1) |
where the last two inequalities follow from assumption (A2). Consider the first term above. For , integration of this quantity over the time duration gives
where the first equality follows from the variable change and the facts (i) , (ii) when , (iii) when , and (iv) . For , using the bound , we get
where the last equality follows from the variable change , and the facts (i) , (ii) when so that when , (iii) when so that when , and (iv) . Thus, proceeding as before for the case , we arrive at
Hence, integrating (5.1) over and summing over , we get, for each ,
(5.2) |
where Let . Thus, summing the above display over , we arrive at
Note that
(5.3) |
where the first inequality comes from the fact that the mapping is monotonically increasing for . Hence,
Define . We now extend the trajectory to by defining for . Noting that for all on , this extension suffers an additional cost of at most . Hence, we get
Noting that (i) the right hand side above is upper bounded by , where is a constant depending on and , and (ii) , the above display yields
where is a constant depending on , and . Using the compactness of the level sets of (see Lemma 2.1), it follows that the sequence of trajectories has a convergent subsequence. Re-indexing the original sequence, let in as . By construction, for each , for all ; hence for all . Recall that lower semicontinuity of was proved in the course of the proof of Lemma 2.1. Therefore, it follows that
This completes the proof of the lemma. ∎
We are now ready to characterise the set of points in whose is finite.
Lemma 5.2.
if and only if . Furthermore, for any , there exists a constant such that implies .
Proof.
Let be such that . Then there exists a and a trajectory on such that , and . By Theorem 2.2, there exists a measurable function on such that
(5.4) |
holds for all and , and is given by
For any and , using the convex duality relation , we get the inequality . Hence, from the above non-variational representation for , (5.4) implies
(5.5) |
Recall the function on . For , define
By convexity, note that and , for each . Therefore, using the upper bound for the transition rates from assumption (A2), observe that
for each and . It follows from (5.5) with replaced by that
for each and . Letting and using monotone convergence, we conclude that
(5.6) |
In particular, . It follows that .
Conversely, let . Let be such that . By Lemma 5.1, there exists a and a trajectory on such that , , and for some constant depending on . Let , , , and . We now construct another trajectory on such that , , and . This trajectory is constructed using piecewise constant velocity paths and its cost is computed using arguments similar to those used in the proof of Lemma 5.1; we provide the details here for completeness. When for some , let
, and define , , . Note that, for each , when , we have
so that optimising the left hand side of the above display over yields
where the last inequality follows from the lower bound on the backward transition rates in assumption (A2). Integrating the above over and summing over , we arrive at
(5.7) |
Since , proceeding via the steps in (5.3), we conclude that the right hand side of the above display is finite. We combine and and define a new trajectory on as follows: on ; on . Note that , , and . Hence .
To prove the second statement, we note that given any , for any , the cost of the trajectory constructed in the previous paragraph is bounded above by a constant depending only on (and not on ). This completes the proof of the lemma. ∎
5.2 Continuity
We now establish a certain continuity property of the quasipotential . Since has compact level sets and the space is not locally compact, we cannot expect to be continuous on . In fact, for any point with , one can produce a sequence such that in as , and for all , so that . We prove that is continuous under the convergence of -moments when it is restricted to . That is, when , in and as , then as . Towards this, we produce a trajectory that connects to by first moving the mass from all the large enough states back to the state , then producing a constant velocity trajectory that fills the required mass from state to all the large enough states , and finally adjusting mass within a finite subset of to reach . We show that the cost of the trajectory constructed above can be made arbitrarily small for large enough .
Lemma 5.3.
Let , , and . Suppose that in and as . Then as .
Proof.
We first prove that . Fix . We shall move from to in five steps. The outline of this construction is as follows:
• : This trajectory starts with and moves all the mass for all states , for a suitable large enough , back to state . This backward movement results in a cost of .
• : Next, we move any additional mass, if required, from the states back to state so that there is enough mass at state to fill up all the states beyond . Again, this backward movement results in a cost of .
• : Next, we construct a piecewise constant-velocity trajectory to move the mass from state to state . After this movement, state contains all the mass required to fill up the states beyond it. This forward movement results in a cost of , instead of , because we move the total mass for all the states beyond .
• : Then, for each , we move the required mass (i.e., ) from state to state using a piecewise constant velocity trajectory. At the end of this procedure, for each , the mass at state becomes . This forward movement results in a cost of .
• : Finally, we adjust the mass within the finite set to match with . This also results in a cost at most . Again, this cost is instead of because we move, for each , the sum of the additional mass (under compared to ) in the states from state to state .
Therefore, the total cost of all these trajectories is at most , which vanishes as . We now define these trajectories in detail and evaluate their costs.
Let be such that
Then choose such that holds for all ; this is possible since in and as . Let
Define the trajectory on as follows. When for some , let
, and define
Note that for , for , and . Let . Using ideas similar to those used in the proof of Lemma 5.2, it can be checked that , for some constant depending on , , and . Indeed, the cost is , which, using the argument used to arrive at the bound (5.7) and the choice of , is bounded by
Let . If , then we move the extra mass from the states to state as follows. Let . When is between and for some , let
. Define the trajectory on as follows: when ; , , . Note that depends on , but we suppress this in the notation for ease of readability. Again, since is smaller than , by using calculations similar to those used in the proof of Lemma 5.2, we see that for some constant depending on , , and . On the other hand, if , we set and on . In both cases, we have .
Let . We now construct another trajectory on to transfer the mass from state (in ) to state . Let . When for some , let
, and define , , . Note that whenever , and that . Hence, using calculations similar to those done in the proof of Lemma 5.1, we see that can be bounded above by where is a constant depending on , , and , for each (recall that depends on ). Indeed, the cost is bounded by the order of (see the bound in (5.2))
where the first inequality uses the fact that , and the second inequality uses the fact that so that .
Note that . We now construct a trajectory that distributes this mass from the state to all the states to match with . Let for and . Similar to the construction in the proof of Lemma 5.1, we can now construct a trajectory on such that , for each , and for some constant depending on , , and , for all . Indeed, using the bounds in (5.2) and (5.3), the total cost is bounded by the order of
where the inequality follows from the choice of .
Finally, we construct a trajectory that connects to by adjusting the mass within the states . Note that for each . Let denote the set of all such that . Similar to the construction of , for each , we move the mass from state to state using unit velocity over a time duration . Once these mass transfers are complete, starting with , we move the mass
from state to state with at unit rate. Let
and let denote this piecewise constant velocity trajectory. Let . At each step of , since we move a mass of at most from state to state , the cost of is at most of the order of (see (5.2))
Since as , we may choose so that for all . Therefore, for , the above display is bounded by
which is . Therefore, for all , for some constant depending on , , and .
Let . We now append the four paths , constructed in the previous paragraphs over the time duration to get a path such that , and where is a constant depending on , and . Hence, for each , we have
Therefore, . Letting and noting that , we arrive at .
To prove , we reverse the role of and in the above argument. That is, we construct a trajectory on such that , , and for all , where as . Thus, we get
Letting , we conclude that . This completes the proof of the lemma. ∎
Remark 5.1.
The choice of in the above proof suggests that the inequality can be proved as long as in as and holds. Similarly, the inequality can be proved as long as in and holds. This observation will be later used in the proof of the compactness of the lower level sets of .
5.3 Compactness of the lower level sets of the quasipotential
Define the level sets of by
In this section we establish the compactness of for each .
Lemma 5.4.
For each , is a compact subset of .
Proof.
We first prove an inclusion property of the level sets of , namely, given there exists such that
(5.8) |
On one hand, using Proposition 1.1 on the exponential tightness of the family , choose (see (3.4)) such that
On the other hand, using the LDP lower bound established in Lemma 4.2 and the compactness of , we have
Combining the above two displays, we get
That is, implies . This shows (5.8). By Prohorov’s theorem, is a compact subset of ; hence (5.8) shows that is precompact for each .
We now show that is closed in . Let for each and let in as . By Fatou’s lemma, we have . Hence, by Remark 5.1, we have . Thus, . This completes the proof of the lemma. ∎
6 The LDP upper bound
Recall defined in (2.1) and defined in (2.2). For , define
That is, denotes the set of all trajectories that start at and do not intersect at all integer time points in . We begin with a lemma that asserts that the elements of for large enough must have non-trivial cost. The key idea used in the proof comes from the compactness of level sets of the process-level large deviations rate function , for any compact subset of (see Lemma 2.1).
Lemma 6.1.
For any , , and , there exists such that
(6.1) |
Proof.
Suppose not. Then there exist , , , a sequence of positive numbers such that as , and a sequence of trajectories such that , and for each .
Note that there exists an such that for each and each . Indeed, by Lemma 5.2, there exists such that implies . Thus, for each , there exist a and a trajectory on such that , , and . We extend this trajectory to by defining on . Note that , so that for each and each . Thus, we can find an such that (5.8) holds with replaced by and replaced by . It follows that for each and each .
For the above choice of , using assumption (B2), choose such that for each and each , where is the solution to the McKean-Vlasov equation (1.2) with initial condition . Note that the closure of the set of all trajectories on in with initial condition and does not contain any trajectory of the McKean-Vlasov equation (1.2). It follows from Lemma 2.1 that
Therefore, noting that for each and , we see that
which contradicts our assumption. This completes the proof of the lemma. ∎
With a slight abuse of notation, given , , and , define
We now prove a certain containment property for elements of that can arise as end-points of trajectories in , and , i.e., points such that there exists a trajectory with and . We prove that such points are not far from the lower level sets of in . This connection between trajectories over finite time horizons and the level sets of the quasipotential is the key to transfer the process-level LDP upper bound in Theorem 2.1 to the LDP upper bound for the family of invariant measures .
Lemma 6.2.
For any and there exists and such that for all ,
(6.2) |
Proof.
Suppose not. Then there exist , , sequences , such that and as , and trajectories such that for each . Let , . By Lemma 5.3, there exists a and a sequence , with as , such that for any there exists a trajectory on such that , and . For each , let be the trajectory on defined as follows. Let ; on ; on . In particular, . Clearly, . It follows that . Using the compactness of the lower level sets of (see Lemma 5.4), we can find a convergent subsequence of ; after re-indexing and denoting this convergent subsequence by , let in as . By assumption, for each , and hence . Using the lower semicontinuity of , we see that
Hence . This contradicts , which is a consequence of our assumption. This proves the lemma. ∎
We are now ready to prove the LDP upper bound for the family . The proof relies on the uniform LDP upper bound in Theorem 2.1, the exponential tightness of the family , the containment property established in Lemma 6.2, an estimate on the probability that lies in (which uses the process-level uniform LDP upper bound in Theorem 2.1 and the result of Lemma 6.1), and finally the strong Markov property of .
Lemma 6.3.
For any , , and , there exists such that
for all .
Proof.
Fix , , and . Choose and such that for all ; this is possible from the exponential tightness of the family , see Proposition 1.1. For the given and , from Lemma 6.2, choose and such that (6.2) holds for all . For the above choice of and , by Lemma 6.1, choose such that (6.1) holds. By (6.1) and the compactness of in (which follows from Lemma 2.1), the closure of does not intersect . It follows that there exists a such that implies . Hence by the uniform LDP upper bound in Theorem 2.1, there exists such that
(6.3) |
for all and . Thus, with and , we have
(6.4) |
here the first equality follows since is invariant to time shifts, the first inequality follows from the choice of , and the third inequality follows from (6.3).
To bound the integrand in the third term above, let and . Choose (depending on and , and not on and ) such that implies whenever and ; the existence of such a can be justified via arguments similar to those used in the proof of Lemma 4.2 (see the paragraph before (4.2)). Note that if a trajectory on with initial condition is such that , then there exists a trajectory such that . By the choice of , we have . By Lemma 6.2, we find that . Hence by the triangle inequality . The contrapositive of the above statement is
We therefore conclude that
(6.5) |
for all , , and .
Note that the integrand in the last term of (6.4) can be upper bounded by
(6.6) |
where the first inequality follows from the strong Markov property of and the second inequality follows from (6.5) by the choice of . By the uniform LDP upper bound in Theorem 2.1, for each , there exist such that
for all and . Put . Then (6.6) yields
for all and . Substitution of this back in (6.4) yields
for all . Finally, choose such that for all . Then the above display becomes
for all . This completes the proof of the lemma. ∎
7 Proof of Theorem 1.1
8 Two counterexamples
In this section, for two non-interacting counterexamples described in Section 1.1, we prove that the quasipotential is not equal to the relative entropy with respect to the corresponding globally asymptotically stable equilibrium. These two counterexamples are (i) a system of non-interacting M/M/1 queues, and (ii) a system of non-interacting nodes in a wireless local area network (WLAN) with constant forward transition rates. We detail the proofs in the case of non-interacting M/M/1 queues. Similar arguments carry over to the case of non-interacting WLAN system with constant forward transition rates as well.
8.1 A system of non-interacting M/M/1 queues
Recall the system of non-interacting M/M/1 queues described in Section 1.1.1. Recall the relative entropy from (1.4) and the process-level large deviations rate function from (2.11). Also recall the function defined in (1.6) and the compact sets , , defined in (2.1). Define the quasipotential
where is defined by (2.11) with replaced by and replaced by for each .
We first prove that the quasipotential is not finite outside . The key property used for this is the fact that the attractor has geometric decay. As a consequence . Using this property, we first show that if , then the associated quasipotential evaluated at cannot be finite. This is shown by producing a lower bound for the cost of any trajectory starting at and ending at from the rate function in (2.11).
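The geometric decay of the attractor can be illustrated with a minimal sketch. For non-interacting particles the limiting flow is linear and coincides with the Kolmogorov forward equation of a single queue, so we integrate that equation with an explicit Euler scheme on a truncated state space. The arrival rate `lam`, service rate `mu`, truncation level `K`, and step size are hypothetical values chosen for illustration; they are not taken from Section 1.1.1.

```python
def mm1_forward_step(xi: list, lam: float, mu: float, dt: float) -> list:
    """One explicit Euler step of the forward equation of an M/M/1 queue
    truncated to the states {0, ..., K}, where K = len(xi) - 1."""
    new = list(xi)
    for z in range(len(xi) - 1):
        up = dt * lam * xi[z]        # mass flowing from z to z + 1
        down = dt * mu * xi[z + 1]   # mass flowing from z + 1 to z
        new[z] += down - up
        new[z + 1] += up - down
    return new

def truncated_geometric(rho: float, K: int) -> list:
    """Law proportional to rho^z on {0, ..., K}."""
    weights = [rho ** z for z in range(K + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical parameters (assumption): stable queue with lam < mu.
lam, mu, K = 1.0, 2.0, 40
xi = [1.0] + [0.0] * K             # all mass initially at state 0
for _ in range(20_000):            # time horizon 200 with dt = 0.01
    xi = mm1_forward_step(xi, lam, mu, dt=0.01)

target = truncated_geometric(lam / mu, K)
max_error = max(abs(a - b) for a, b in zip(xi, target))
```

Because the limit decays only geometrically, a measure can have finite relative entropy with respect to it while still having tails too heavy to be reached at finite cost, which is the mechanism behind the counterexample.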
Lemma 8.1.
If is such that , then .
Proof.
Fix . Let and be such that and . For each , define by
and define for each . Note that the use of is to approximate using functions so that we can insert them into (2.11). We first assume that . In particular, . Using the function in place of in the RHS of (2.11), we have
where , , and , . Noting that is either , or for each , we have for each . Hence the above becomes
Note that . Hence, letting and using the monotone convergence theorem, we conclude that .
We now assume that is such that . Let and be such that and . Without loss of generality, we can assume that ; otherwise the argument in the above paragraph shows that . Define
Using in the RHS of (2.11), we get
Noting that can be upper bounded by for each , it follows that for each . Hence the above display becomes
As before, letting , using the monotone convergence theorem, and noting that , we conclude that .
Since , , and such that and are arbitrary, the proof of the lemma is complete. ∎
We now prove the main result of this section, namely, the quasipotential is not equal to the relative entropy .
Proposition 8.1.
Let be such that and . Then and . In particular, .
Proof.
By the Donsker-Varadhan variational formula (see Donsker and Varadhan [14, Lemma 2.1]), for any and any bounded function on , we have
Recall the definition of and from the proof of Lemma 8.1. Let be such that . Replacing by in the above display, letting and using the monotone convergence theorem, we arrive at
It follows that
On the other hand, since , it is easy to check that .
Let be such that and . Then the above yields . By Lemma 8.1, we see that . This completes the proof of the proposition. ∎
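The Donsker-Varadhan variational formula invoked in the proof above can be checked numerically on a finite alphabet, where it states that the relative entropy equals the supremum over functions f of the mean of f under the first measure minus the log-moment generating function of f under the second. The sketch below uses two hypothetical probability vectors `xi` and `nu`; it verifies that random test functions never exceed the relative entropy and that the log-likelihood ratio attains it.

```python
import math
import random

def relative_entropy(xi: list, nu: list) -> float:
    """Relative entropy of xi with respect to nu on a finite alphabet."""
    return sum(p * math.log(p / q) for p, q in zip(xi, nu) if p > 0)

def dv_functional(f: list, xi: list, nu: list) -> float:
    """Donsker-Varadhan functional <xi, f> - log <nu, exp(f)>."""
    mean_f = sum(p * v for p, v in zip(xi, f))
    log_mgf = math.log(sum(q * math.exp(v) for q, v in zip(nu, f)))
    return mean_f - log_mgf

# Hypothetical measures on a four-point alphabet (assumption).
xi = [0.1, 0.2, 0.3, 0.4]
nu = [0.4, 0.3, 0.2, 0.1]
entropy = relative_entropy(xi, nu)

# The supremum is attained at the log-likelihood ratio.
f_star = [math.log(p / q) for p, q in zip(xi, nu)]
value_at_optimum = dv_functional(f_star, xi, nu)

# Random bounded test functions give lower bounds on the relative entropy.
rng = random.Random(0)
trial_values = [dv_functional([rng.uniform(-5.0, 5.0) for _ in xi], xi, nu)
                for _ in range(1000)]
```

In the countable-state proof the test functions are truncations that increase to an unbounded function, and monotone convergence takes the place of the exact maximiser used here.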
8.2 A non-interacting WLAN system with constant forward rates
Recall the model described in Section 1.1.2. Define the quasipotential
where is defined by (2.11) with replaced by and replaced by for each . We now state the main result for this non-interacting wireless local area network.
Proposition 8.2.
Let be such that and . Then and . In particular, .
We start with the following lemma. The proof follows along the lines of the proof of Lemma 8.1 by noting that , and it is left to the reader.
Lemma 8.2.
If is such that , then .
Appendix A Proofs of Section 2
A.1 Proof of Lemma 2.1
Fix , , and compact. Given , and a finite set , choosing for all , (2.9) yields
for all . Note that we may take , else the rate function would be infinite as per (2.10) and the definition of in (2.7). Therefore, we get
(A.1)
Noting that
it follows that the family is uniformly integrable. That is,
as . Hence for any , using the boundedness of the transition rates (from assumption (A2)), (A.1) yields
for all , and . It follows that
Letting first and then , we arrive at
Hence it follows that is precompact in (see, for example, Billingsley [4, Theorem 12.3]).
To show that is closed, let and suppose that in . Note that, for any , the mapping
is continuous on , and hence, the mapping
is lower semicontinuous on (see, for example, Berge [1, Theorem 1, page 115]). Hence, it follows that
and it follows that is closed. Consequently, is a compact subset of . ∎
A.2 Proof of Theorem 2.1
In this section, we prove Theorem 2.1. In the case of a finite state space (i.e., when is a finite set), the LDP for the family , whenever in as , was proved in [7, Theorem 3.1] under suitable assumptions. The main assumption required in the proof of [7, Theorem 3.1] was the boundedness of the “total outgoing jump rate” across all the states, which also holds in our countable state space case under Assumptions (A1)–(A3). So, to prove the LDP for the family , whenever in as , one can go through the steps in [7, Section 5] verbatim; we reproduce the important steps here for the sake of completeness. Once this LDP is proved, we then show the uniform LDP over the class of compact subsets of using [8, Propositions 1.12 and 1.14].
A.2.1 LDP for when in
We first introduce some notation. Let denote the joint evolution of the states of all the particles. This is a Markov process on with the infinitesimal generator acting on functions on given by
where . Define the empirical measure
is a -valued random variable. Let denote the canonical projection map. Note that , . Similarly, let denote the evolution of the independent particles, where each particle executes a Markov process with the infinitesimal generator defined in (3.1). Define the corresponding empirical measure by
Let (resp. ) denote the law of (resp. ) with initial condition (i.e., ). These are probability measures on , i.e., .
Note that . For and , define
where are the non-interacting rates defined by
Also, define
(A.2)
Using Girsanov’s theorem, it is straightforward to check that
We now introduce some notation related to path spaces. Define by
is the number of discontinuities in . Since is a countable set, it follows that for all ([4, Chapter 3, Lemma 1]). Define
and equip with the subspace topology. Since is countable, we have that is continuous on . Define
Then, define
and
is a subset of , the algebraic dual of , and we equip it with the weak* topology. This is the coarsest topology on where we say in as if and only if
Recall , , from Section 3. For each , define by
(A.3)
By [7, Lemma 5.3], we also have
(A.4)
where is the space of bounded and continuous functions on equipped with the supremum norm.
We first state a lemma for the LDP for the family on whenever in as . Its proof follows verbatim from [7, Lemma 5.1].
Lemma A.1 (LDP for the non-interacting system; [7, Lemma 5.1]).
Let in as . Then the family satisfies the LDP on with rate function defined in (A.3).
Next, we provide two necessary conditions for the finiteness of defined in (A.3).
Lemma A.2 (Finiteness of ; [7, Lemma 5.2]).
If , then we have and .
Proof.
Let be such that . The proof of follows verbatim from [7, Lemma 5.2]. For the first assertion, since , from the definition of in (A.3), we have
(A.5)
Note that, for each , under (see (3.1)), the number of jumps on is stochastically dominated by a Poisson random variable with parameter . Therefore,
where is some constant independent of . Therefore,
Hence, from (A.5), using , we conclude that
It follows that . ∎
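The Poisson domination step in the proof above rests on a standard exponential moment bound, sketched here in generic notation. The symbols $J$, $Z$, $\lambda$, and $\theta$ are placeholders introduced for illustration and are not taken from the paper.

```latex
% Assumed notation: Z is a Poisson random variable with parameter \lambda > 0,
% and J is a nonnegative integer-valued random variable stochastically dominated by Z.
\mathbb{E}\left[ e^{\theta Z} \right] \;=\; \exp\left( \lambda \left( e^{\theta} - 1 \right) \right), \qquad \theta \in \mathbb{R},
% Since x \mapsto e^{\theta x} is nondecreasing for \theta \ge 0, domination gives
\mathbb{E}\left[ e^{\theta J} \right] \;\le\; \exp\left( \lambda \left( e^{\theta} - 1 \right) \right), \qquad \theta \ge 0 .
```

In particular, every exponential moment of the number of jumps is finite and bounded by a constant that depends only on $\lambda$ and $\theta$, which is the kind of uniform bound the argument requires.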
The next lemma is required to prove the continuity of on .
Lemma A.3 (see [7, Lemma 5.7] for the finite state space case).
Suppose that is such that . Then,
Proof.
Let denote the mixture distribution defined by . Since , it follows that . Indeed, using Jensen’s inequality, we have,
and hence, from (A.4) and the Donsker-Varadhan variational formula for , we conclude that
(A.6)
Since , the above implies that . This shows . Hence, with , we have
(A.7)
where the last inequality follows from Hölder’s inequality in Orlicz spaces. Here, is the Orlicz norm defined by
Similarly, is defined as above with replaced by .
Consider . Note that, there exists a such that for all . Therefore,
where the second inequality follows from (A.6) and the third inequality follows from the assumption that . Since for (by Jensen’s inequality), this shows
(A.8)
Next, consider . Note that, under , the number of jumps in is stochastically dominated by a Poisson random variable with parameter . Therefore, . Since for any , we have
Therefore, if we choose , the right-hand side of the above display becomes . This shows
for all . Hence, by (A.7), (A.8), and the previous display, we get
This completes the proof of the lemma. ∎
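As a reminder of the tool used in the proof above, Hölder's inequality in Orlicz spaces can be stated generically as follows. The Young pair $(\Phi, \Phi^*)$, the measure $\mu$, and the functions $f$, $g$ are placeholders introduced here; the example pair at the end is one that is often used in this setting, and whether it coincides with the paper's choice is an assumption.

```latex
% Assumed notation: \Phi is a Young function (convex, even, \Phi(0) = 0) and
% \Phi^*(y) = \sup_{x} \{ xy - \Phi(x) \} is its convex conjugate.
% Luxemburg (Orlicz) norm with respect to a measure \mu:
\| f \|_{\Phi} \;=\; \inf\left\{ a > 0 \,:\, \int \Phi\!\left( \frac{|f|}{a} \right) d\mu \;\le\; 1 \right\}.
% Hölder's inequality in Orlicz spaces:
\int |f g| \, d\mu \;\le\; 2 \, \| f \|_{\Phi} \, \| g \|_{\Phi^*} .
% Example of a conjugate pair on [0, \infty):
\Phi(x) = e^{x} - x - 1, \qquad \Phi^*(y) = (1 + y) \log(1 + y) - y .
```

The factor $2$ comes from integrating Young's inequality $xy \le \Phi(x) + \Phi^*(y)$ applied to $|f|/\|f\|_{\Phi}$ and $|g|/\|g\|_{\Phi^*}$, each of whose integrals is at most $1$.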
Next, we argue the continuity of the projection map .
Lemma A.4 (Continuity of ; [6, Lemma 5.8]).
Let be such that . Then is continuous at .
Proof.
Let be such that . By Lemma A.2, it follows that . In [21, Lemma 2.8], for the case when for some , it was shown that is continuous at whenever . (This continuity was shown in [21] when is equipped with the usual weak topology and is equipped with the stronger uniform topology. Since the Skorohod topology on is coarser than the uniform topology, it follows that is continuous.) For general , by using the result of Lemma A.3 and following the proof of [21, Lemma 2.8] verbatim, we arrive at the continuity of . ∎
Finally, we have that is continuous on .
Proof.
The above lemmas give us the LDP for the family on whenever in as .
Proposition A.1.
Proof.
Let in as . By Lemma A.1, we have that satisfies the LDP on with rate function . Since is continuous on the set (by Lemma A.5), from Varadhan’s lemma, one can conclude that (see [7, Proof of Theorem 3.1]) the family satisfies the LDP on with rate function . By Lemma A.4, since is continuous (with the usual weak topology on ) at when , it follows that the restriction of to is also continuous (with respect to the stronger topology on ) at when . Therefore, using the generalized contraction principle (e.g., [13, Theorem 4.2.23]), the LDP for the family on follows. The rate function for this LDP can be shown to admit the form given in (2.8) (see, e.g., [21, Proof of Theorem 3.1]). ∎
A.2.2 Uniform LDP for over the class of compact subsets of
Proposition A.1 establishes the LDP for the family , whenever in as . We now extend this to the uniform LDP on the class of compact subsets of . Towards this, we rely on [8, Propositions 1.12 and 1.14]. Although our definition of the uniform LDP (Definition 2.2) has initial conditions lying in (unlike the definition of uniform LDP in [8, Definition 1.13] where the initial conditions do not depend on the parameter ), we can use straightforward modifications of the arguments in [8, Propositions 1.12 and 1.14] to prove the desired uniform LDP. We provide an outline of these arguments here.
We first provide a definition of the uniform Laplace principle over the class of compact subsets of . Recall the definition of the rate function in (2.8). For and , define
Definition A.1.
We say that the family of -valued random variables defined on a probability space satisfies the uniform Laplace principle over the class of subsets of with the family of rate functions , , , if
- (Compactness of level sets). For each compact and , is a compact subset of , where ;
- (Uniform Laplace asymptotics). For any and , we have
This is a modification of [8, Definition 1.11] to the case when the initial conditions are only allowed to lie in . We have the following result.
Lemma A.6 (see [8, Proposition 1.12]).
Proof.
By Lemma 2.1, we have that for each compact and , is a compact subset of , where .
To show the uniform Laplace asymptotics, let . By Proposition A.1, whenever in as we have that the family satisfies the LDP on with rate function . Therefore, by Varadhan’s lemma (e.g., [13, Theorem 4.3.1]), we have
(A.9)
Define
Using (A.9), we now show that the mapping is continuous. To show this continuity, it suffices to show that, given any there exists such that for all such that and such that as , we have
Indeed, if this is true, sending in the above display and using (A.9), we arrive at , which shows the continuity of . We now prove the above statement by contradiction. Suppose that it is not true. Then there exist and a sequence with and as such that . Using (A.9), we get , which is a contradiction. This establishes the continuity of the mapping .
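For reference, the Laplace asymptotics that Varadhan's lemma provides for a single sequence can be written generically as below. The symbols $X_N$, $I$, $h$, and $\mathcal{X}$ are placeholders introduced here for illustration; the paper's own limit is the one labelled (A.9).

```latex
% Assumed notation: (X_N) satisfies the LDP on a Polish space \mathcal{X} with speed N
% and good rate function I, and h is a bounded continuous function on \mathcal{X}.
\lim_{N \to \infty} \; -\frac{1}{N} \log \mathbb{E}\left[ \exp\left( -N \, h(X_N) \right) \right]
\;=\; \inf_{x \in \mathcal{X}} \left\{ h(x) + I(x) \right\}.
```

A uniform Laplace principle asks that this convergence hold uniformly over initial conditions in a given class, which is why continuity of the right-hand side in the initial condition, as argued above, is the key ingredient.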
Proof of Theorem 2.1.
By Lemma A.6, the family satisfies the uniform Laplace principle over the class of compact subsets of with the family of rate functions . Restricting the initial conditions to and following the proof of [8, Proposition 1.14] verbatim, we conclude that the family satisfies the uniform LDP on over the class of compact subsets of with the family of rate functions . ∎
Acknowledgements
The authors were supported by a grant from the Indo-French Centre for Applied Mathematics on a project titled “Metastability phenomena in algorithms and engineered systems”. The first author was supported in part by a fellowship grant from the Centre for Networked Intelligence (a Cisco CSR initiative), Indian Institute of Science, Bangalore; and in part by Office of Naval Research under the Vannevar Bush Faculty Fellowship N0014-21-1-2887. The authors thank two anonymous referees for carefully reading the manuscript and providing valuable comments that improved the paper.
References
- [1] C. Berge. Topological Spaces: Including a Treatment of Multi-Valued Functions, Vector Spaces, and Convexity. Courier Corporation, 1997.
- [2] L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim. Macroscopic fluctuation theory for stationary non-equilibrium states. Journal of Statistical Physics, 107(3):635–675, 2002.
- [3] L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim. Large deviations for the boundary driven symmetric simple exclusion process. Mathematical Physics, Analysis and Geometry, 6(3):231–267, 2003.
- [4] P. Billingsley. Convergence of Probability Measures. Wiley Series in Probability and Statistics, 2nd edition, 1999.
- [5] T. Bodineau and G. Giacomin. From dynamic to static large deviations in boundary driven exclusion particle systems. Stochastic Processes and their Applications, 110(1):67–81, 2004.
- [6] C. Bordenave, D. McDonald, and A. Proutiere. A particle system in interaction with a rapidly varying environment: Mean field limits and applications. Netw. Heterog. Media, 5(1):31–62, 2010.
- [7] V. S. Borkar and R. Sundaresan. Asymptotics of the invariant measure in mean field models with jumps. Stoch. Syst., 2(2):322–380, 2012.
- [8] A. Budhiraja and P. Dupuis. Analysis and Approximation of Rare Events. Springer, New York, NY, 2019.
- [9] S. Cerrai and N. Paskal. Large deviations principle for the invariant measures of the 2d stochastic Navier-Stokes equations with vanishing noise correlation. arXiv preprint arXiv:2012.14953, 2020.
- [10] S. Cerrai and M. Röckner. Large deviations for stochastic reaction-diffusion systems with multiplicative noise and non-Lipschitz reaction term. The Annals of Probability, 32(1B):1100–1139, 2004.
- [11] S. Cerrai and M. Röckner. Large deviations for invariant measures of stochastic reaction–diffusion systems with multiplicative noise and non-Lipschitz reaction term. Annales de l’Institut Henri Poincare (B) Probability and Statistics, 41(1):69–105, 2005.
- [12] D. A. Dawson and J. Gärtner. Large deviations from the McKean-Vlasov limit for weakly interacting diffusions. Stochastics, 20(4):247–308, 1987.
- [13] A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications. Springer-Verlag Berlin Heidelberg, 2nd edition, 2010.
- [14] M. D. Donsker and S. R. S. Varadhan. Asymptotic evaluation of certain Markov process expectations for large time, I. Communications on Pure and Applied Mathematics, 28(1):1–47, 1975.
- [15] R. Durrett. Probability: Theory and Examples. Cambridge University Press, 5th edition, 2019.
- [16] S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. John Wiley & Sons, 2nd edition, 2005.
- [17] J. Farfán, C. Landim, and K. Tsunoda. Static large deviations for a reaction–diffusion model. Probability Theory and Related Fields, 174(1):49–101, 2019.
- [18] M. I. Freidlin and A. D. Wentzell. Random Perturbations of Dynamical Systems. Grundlehren der mathematischen Wissenschaften. American Mathematical Society, 3rd edition, 2012.
- [19] R. Khasminskii. Stochastic Stability of Differential Equations, volume 66 of Stochastic Modelling and Applied Probability. Springer, Berlin, Heidelberg, 2012.
- [20] A. Kumar, E. Altman, D. Miorandi, and M. Goyal. New insights from a fixed point analysis of single cell IEEE 802.11 WLANs. In IEEE INFOCOM 2006, 2006.
- [21] C. Léonard. Large deviations for long range interacting particle systems with jumps. Ann. Inst. Henri Poincaré Probab. Stat., 31(2):289–323, 1995.
- [22] C. Léonard. On large deviations for particle systems associated with spatially homogeneous Boltzmann type equations. Probability Theory and Related Fields, 101(1):1–44, 1995.
- [23] R. Liptser. Large deviations for two scaled diffusions. Probability Theory and Related Fields, 106(1):71–104, 1996.
- [24] D. Martirosyan. Large deviations for stationary measures of stochastic nonlinear wave equations with smooth white noise. Communications on Pure and Applied Mathematics, 70(9):1754–1797, 2017.
- [25] H. P. McKean. Propagation of chaos for a class of non-linear parabolic equations. In Lecture Series in Differential Equations, Catholic University (Washington D. C.), 1967.
- [26] S. P. Meyn, P. Barooah, A. Bušić, Y. Chen, and J. Ehren. Ancillary service to the grid using intelligent deferrable loads. IEEE Transactions on Automatic Control, 60(11):2847–2862, 2015.
- [27] C. Mufa. Optimal Markovian couplings and applications. Acta Mathematica Sinica, 10(3):260–275, 1994.
- [28] A. Puhalskii. Large deviations of the long term distribution of a non Markov process. Electronic Communications in Probability, 24:1–11, 2019.
- [29] A. A. Puhalskii. On large deviations of coupled diffusions with time scale separation. Ann. Probab., 44(4):3111–3186, 2016.
- [30] A. A. Puhalskii. Large deviation limits of invariant measures. arXiv preprint arXiv:2006.16456, 2020.
- [31] M. Salins, A. Budhiraja, and P. Dupuis. Uniform large deviation principles for Banach space valued stochastic differential equations. Transactions of the American Mathematical Society, pages 8363–8421, 2019.
- [32] M. Salins and K. Spiliopoulos. Metastability and exit problems for systems of stochastic reaction-diffusion equations. The Annals of Probability, 49(5):2317–2370, 2021.
- [33] R. Sowers. Large deviations for the invariant measure of a reaction-diffusion equation with non-Gaussian perturbations. Probability Theory and Related Fields, 92(3):393–421, 1992.
- [34] R. B. Sowers. Large deviations for a reaction-diffusion equation with non-Gaussian perturbations. The Annals of Probability, 20(1):504–537, 1992.
- [35] A. Y. Veretennikov. On large deviations for SDEs with small diffusion and averaging. Stochastic Processes and their Applications, 89(1):69–79, 2000.
- [36] S. Yasodharan and R. Sundaresan. Large time behaviour and the second eigenvalue problem for finite state mean-field interacting particle systems. Advances in Applied Probability, 55(1):85–125, 2023.
S. Yasodharan
Division of Applied Mathematics
Brown University
Providence RI 02912, USA
R. Sundaresan
Department of Electrical Communication Engineering
Indian Institute of Science
Bangalore 560 012, India