A sufficient condition for the quasipotential to be the rate function of the invariant measure of countable-state mean-field interacting particle systems

Sarath Yasodharan and Rajesh Sundaresan
Brown University and Indian Institute of Science
Most of this work was completed when SY was a PhD student at the Indian Institute of Science.
Abstract

This paper considers the family of invariant measures of Markovian mean-field interacting particle systems on a countably infinite state space and studies its large deviation asymptotics. The Freidlin-Wentzell quasipotential is the usual candidate rate function for the sequence of invariant measures indexed by the number of particles. The paper provides two counterexamples where the quasipotential is not the rate function. The quasipotential arises from finite horizon considerations. However there are certain barriers that cannot be surmounted easily in any finite time horizon, but these barriers can be crossed in the stationary regime. Consequently, the quasipotential is infinite at some points where the rate function is finite. After highlighting this phenomenon, the paper studies some sufficient conditions on a class of interacting particle systems under which one can continue to assert that the Freidlin-Wentzell quasipotential is indeed the rate function.

MSC 2020 subject classifications: Primary 60F10; Secondary 60K35, 82C22, 60J74, 90B15
Keywords: Mean-field interaction, invariant measure, large deviations, static large deviation, Freidlin-Wentzell quasipotential, relative entropy

1 Introduction

For a broad class of Markov processes such as small-noise diffusions, finite-state mean-field models, simple exclusion processes, etc., it is well-known that the Freidlin-Wentzell quasipotential is the rate function that governs the large deviation principle (LDP) for the family of invariant measures [18, 33, 7, 17]. The quasipotential is the minimum cost (arising from the rate function for a process-level large deviation principle) associated with trajectories of arbitrary but finite duration, with fixed initial and terminal conditions. We begin this paper with two counterexamples of independently evolving countable-state particle systems for which the quasipotential is not the rate function for the family of invariant measures. In each counterexample, the family of invariant measures satisfies the LDP with a suitable relative entropy as its rate function, and we show that the quasipotential is not the same as this relative entropy. Specifically, we show that there are points in the state space where the rate function is finite, but the quasipotential is infinite. These points cannot be reached easily via trajectories of arbitrary but finite time duration. However, the barriers to reach these points are surmounted in the stationary regime. There are nevertheless some sufficient conditions, at least on a family of such countable-state interacting particle systems, under which the Freidlin-Wentzell quasipotential is indeed the correct rate function; this will be the main result of this paper. Intuitively, the sufficient conditions cut down the speed of outward excursions and ensure that the barriers that are insurmountable for the finite horizon trajectories continue to be insurmountable in the stationary regime.

Before we describe the counterexamples and the main result, let us introduce some notation and describe the model of a countable-state mean-field interacting particle system. Let $\mathcal{Z}$ denote the set of non-negative integers and let $(\mathcal{Z},\mathcal{E})$ denote a directed graph on $\mathcal{Z}$. Let $\mathcal{M}_{1}(\mathcal{Z})$ denote the space of probability measures on $\mathcal{Z}$ equipped with the total variation metric (which we denote by $d$). For each $N\geq 1$, let $\mathcal{M}_{1}^{N}(\mathcal{Z})\subset\mathcal{M}_{1}(\mathcal{Z})$ denote the set of probability measures on $\mathcal{Z}$ that can arise as empirical measures of $N$-particle configurations on $\mathcal{Z}^{N}$. For each $N\geq 1$, we consider a Markov process with the infinitesimal generator acting on functions $f$ on $\mathcal{M}_{1}^{N}(\mathcal{Z})$:

$$\mathscr{L}^{N}f(\xi)\coloneqq\sum_{(z,z^{\prime})\in\mathcal{E}}N\xi(z)\lambda_{z,z^{\prime}}(\xi)\left[f\left(\xi+\frac{\delta_{z^{\prime}}}{N}-\frac{\delta_{z}}{N}\right)-f(\xi)\right],\quad\xi\in\mathcal{M}_{1}^{N}(\mathcal{Z}); \tag{1.1}$$

here $\lambda_{z,z^{\prime}}:\mathcal{M}_{1}(\mathcal{Z})\to\mathbb{R}_{+}$, $(z,z^{\prime})\in\mathcal{E}$, are given functions that describe the transition rates and $\delta$ denotes the Dirac measure. Such processes arise as the empirical measure of weakly interacting Markovian mean-field particle systems where the evolution of the state of a particle depends on the states of the other particles only through the empirical measure of the states of all the particles. Under suitable assumptions on the model, the martingale problem for $\mathscr{L}^{N}$ is well posed and the associated Markov process possesses a unique invariant probability measure $\wp^{N}$. This paper highlights certain nuances associated with the large deviation principle for the sequence $\{\wp^{N},N\geq 1\}$ on $\mathcal{M}_{1}(\mathcal{Z})$.
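To make the generator (1.1) concrete, the following is a minimal simulation sketch, not part of the paper. It assumes a Gillespie-style scheme and tracks particle counts $N\xi(z)$ rather than $\xi$ itself. The names `simulate_empirical_measure` and `wireless_rates` are hypothetical, and the example rate function uses the non-interacting rates of Section 1.1.2 with illustrative values $\lambda_f=1$, $\lambda_b=2$.

```python
import random


def wireless_rates(z, xi, lam_f=1.0, lam_b=2.0):
    """Hypothetical rate function: edges out of z as (z_prime, rate) pairs.

    xi is accepted but unused, so the particles do not interact here.
    """
    edges = [(z + 1, lam_f)]          # forward edge (z, z+1)
    if z > 0:
        edges.append((0, lam_b))      # backward edge (z, 0)
    return edges


def simulate_empirical_measure(counts, rates, t_end, rng):
    """Gillespie-style simulation of the N-particle empirical measure.

    counts maps each occupied state z to the number of particles there,
    so xi(z) = counts[z] / N. Returns the counts at time t_end.
    """
    counts = dict(counts)
    n = sum(counts.values())
    t = 0.0
    while True:
        xi = {z: c / n for z, c in counts.items()}
        # The jump xi -> xi + delta_{z'}/N - delta_z/N occurs at rate
        # N * xi(z) * lambda_{z,z'}(xi) = counts[z] * lambda_{z,z'}(xi).
        moves = [(z, zp, c * lam)
                 for z, c in counts.items()
                 for zp, lam in rates(z, xi)]
        total = sum(w for _, _, w in moves)
        if total <= 0.0:
            return counts
        t += rng.expovariate(total)
        if t > t_end:
            return counts
        # Pick one transition with probability proportional to its rate.
        u = rng.random() * total
        for z, zp, w in moves:
            u -= w
            if u <= 0.0:
                break
        counts[z] -= 1
        if counts[z] == 0:
            del counts[z]
        counts[zp] = counts.get(zp, 0) + 1
```

For instance, `simulate_empirical_measure({0: 50}, wireless_rates, 5.0, random.Random(0))` returns the occupancy counts of 50 particles at time 5; dividing by 50 gives an element of $\mathcal{M}_{1}^{N}(\mathcal{Z})$ with $N=50$. An interacting system would be obtained by letting the rate function actually depend on `xi`.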

Fix $T>0$ and let $\mu^{N}_{\nu_{N}}$ denote the Markov process with initial condition $\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z})$ whose infinitesimal generator is $\mathscr{L}^{N}$. Its sample paths are elements of $D([0,T],\mathcal{M}_{1}^{N}(\mathcal{Z}))$, the space of $\mathcal{M}_{1}^{N}(\mathcal{Z})$-valued functions on $[0,T]$ that are right-continuous with left limits, equipped with the Skorohod topology. Such processes have been well studied in the past. Under mild conditions on the transition rates, when $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$, it is well-known that the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ converges in probability, in $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$, as $N\to\infty$ to the mean-field limit (see McKean [25] in the context of interacting diffusions and Bordenave et al. [6] in the context of countable-state mean-field models):

$$\dot{\mu}(t)=\Lambda_{\mu(t)}^{*}\mu(t),\quad\mu(0)=\nu,\quad t\in[0,T]; \tag{1.2}$$

here $\dot{\mu}(t)$ denotes the derivative of $\mu$ at time $t$, $\Lambda_{\xi}$, $\xi\in\mathcal{M}_{1}(\mathcal{Z})$, denotes the rate matrix when the empirical measure is $\xi$ (i.e., $\Lambda_{\xi}(z,z^{\prime})=\lambda_{z,z^{\prime}}(\xi)$ when $(z,z^{\prime})\in\mathcal{E}$, $\Lambda_{\xi}(z,z^{\prime})=0$ when $(z,z^{\prime})\notin\mathcal{E}$, and $\Lambda_{\xi}(z,z)=-\sum_{z^{\prime}\neq z}\lambda_{z,z^{\prime}}(\xi)$), and $\Lambda^{*}_{\xi}$ denotes the transpose of $\Lambda_{\xi}$. The above dynamical system on $\mathcal{M}_{1}(\mathcal{Z})$ is called the McKean-Vlasov equation. This mean-field convergence allows one to view the process $\mu^{N}_{\nu_{N}}$ as a small random perturbation of the dynamical system (1.2). The starting point of our study of the asymptotics of $\{\wp^{N},N\geq 1\}$ is the process-level LDP for $\{\mu^{N}_{\nu_{N}},\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}),N\geq 1\}$, whenever $\nu_{N}$ converges to $\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$. This LDP was established by Léonard [21] when the initial conditions are fixed, and by Borkar and Sundaresan [7] when the initial conditions converge in $\mathcal{M}_{1}(\mathcal{Z})$. (Often, as done in [7], one lets $\nu_{N}$ be random, and only requires $\nu_{N}\rightarrow\nu$ in distribution, where $\nu$ is deterministic. For simplicity, we restrict $\nu_{N}$ to be deterministic.) The rate function of this LDP is governed by “costs” associated with trajectories on $[0,T]$ with initial condition $\nu$, which we denote by $S_{[0,T]}(\varphi|\nu)$, $\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ (see (2.8) for its definition).
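As an illustration of the McKean-Vlasov equation (1.2), here is a small numerical sketch under stated assumptions: the state space is truncated to $\{0,\ldots,K-1\}$, time is discretised with an explicit Euler step, and the rates are the constant (non-interacting) rates of the wireless-node example of Section 1.1.2 with illustrative values $\lambda_f=1$, $\lambda_b=2$. The helper names `mckean_vlasov_euler` and `truncated_wireless_edges` are hypothetical, and neither the truncation nor the scheme is taken from the paper.

```python
def mckean_vlasov_euler(mu0, edges, t_end, dt):
    """Explicit Euler scheme for mu'(t) = Lambda_{mu(t)}^* mu(t).

    mu0 is a list of masses on states 0..K-1; edges(z, mu) returns the
    outgoing (z_prime, rate) pairs of state z when the measure is mu.
    """
    mu = list(mu0)
    for _ in range(int(round(t_end / dt))):
        flow = [0.0] * len(mu)
        for z, mass in enumerate(mu):
            for zp, rate in edges(z, mu):
                # Mass leaves z and enters z' at rate mu(z) * lambda_{z,z'}(mu).
                flow[z] -= rate * mass
                flow[zp] += rate * mass
        mu = [m + dt * f for m, f in zip(mu, flow)]
    return mu


def truncated_wireless_edges(k, lam_f=1.0, lam_b=2.0):
    """Wireless-node edges truncated to states 0..k-1 (truncation is an assumption)."""
    def edges(z, mu):
        out = []
        if z + 1 < k:
            out.append((z + 1, lam_f))   # forward edge (z, z+1)
        if z > 0:
            out.append((0, lam_b))       # backward edge (z, 0)
        return out
    return edges
```

Starting from the point mass at $0$, for example `mckean_vlasov_euler([1.0] + [0.0] * 29, truncated_wireless_edges(30), 20.0, 0.01)`, the numerical solution settles near the geometric equilibrium of this example, in line with the role played by a globally asymptotically stable equilibrium in the sequel.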

We assume that $\xi^{*}$ is the unique globally asymptotically stable equilibrium of (1.2). Define the Freidlin-Wentzell quasipotential

$$V(\xi)\coloneqq\inf\{S_{[0,T]}(\varphi|\xi^{*}):\varphi(0)=\xi^{*},\varphi(T)=\xi,T>0\},\quad\xi\in\mathcal{M}_{1}(\mathcal{Z}). \tag{1.3}$$

From the theory of large deviations of the invariant measure of Markov processes [18, 33, 11, 7], $V$ is a natural candidate for the rate function of the family $\{\wp^{N},N\geq 1\}$.

1.1 Two counterexamples

We begin with two counterexamples for which $V$ is not the rate function for the family of invariant measures.

1.1.1 Non-interacting M/M/1 queues

[Figure 1: Transition rates of an M/M/1 queue. Each forward edge $(z,z+1)$ carries rate $\lambda_{f}$ and each backward edge $(z,z-1)$ carries rate $\lambda_{b}$.]

Consider the graph $(\mathcal{Z},\mathcal{E}_{Q})$ whose edge set $\mathcal{E}_{Q}$ consists of forward edges $\{(z,z+1),z\in\mathcal{Z}\}$ and backward edges $\{(z,z-1),z\in\mathcal{Z}\setminus\{0\}\}$ (see Figure 1). Let $\lambda_{f}$ and $\lambda_{b}$ be two positive numbers. Consider the generator $L^{Q}$ acting on functions $f$ on $\mathcal{Z}$ by

$$L^{Q}f(z)\coloneqq\sum_{z^{\prime}:(z,z^{\prime})\in\mathcal{E}_{Q}}\lambda_{z,z^{\prime}}(f(z^{\prime})-f(z)),\quad z\in\mathcal{Z},$$

where $\lambda_{z,z+1}=\lambda_{f}$ for each $z\in\mathcal{Z}$ and $\lambda_{z,z-1}=\lambda_{b}$ for each $z\in\mathcal{Z}\setminus\{0\}$. When $\lambda_{f}<\lambda_{b}$, the invariant probability measure associated with this Markov process is

$$\xi^{*}_{Q}(z)\coloneqq\left(1-\frac{\lambda_{f}}{\lambda_{b}}\right)\left(\frac{\lambda_{f}}{\lambda_{b}}\right)^{z},\quad z\in\mathcal{Z}.$$
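The formula for $\xi^{*}_{Q}$ can be sanity-checked numerically. The sketch below assumes the standard fact that a birth-death chain such as the M/M/1 queue is reversible, so that invariance reduces to the detailed balance relation $\xi^{*}_{Q}(z)\lambda_{f}=\xi^{*}_{Q}(z+1)\lambda_{b}$. The function names and the parameter values used in the check are illustrative and not from the paper.

```python
def mm1_invariant(z, lam_f, lam_b):
    """Geometric invariant measure of the M/M/1 queue; requires lam_f < lam_b."""
    rho = lam_f / lam_b
    return (1.0 - rho) * rho ** z


def detailed_balance_gap(z, lam_f, lam_b):
    """Probability flux across edge (z, z+1) minus the flux across (z+1, z)."""
    forward = mm1_invariant(z, lam_f, lam_b) * lam_f
    backward = mm1_invariant(z + 1, lam_f, lam_b) * lam_b
    return forward - backward
```

With, say, $\lambda_f=1$ and $\lambda_b=2$, the gap vanishes (up to rounding) for every $z$ and the masses sum to one.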

For each $N\geq 1$, we consider $N$ particles, each of which evolves independently as a Markov process on $\mathcal{Z}$ with the infinitesimal generator $L^{Q}$. That is, the particles are independent M/M/1 queues. It is easy to check that the empirical measure of the system of particles is also a Markov process on the state space $\mathcal{M}_{1}^{N}(\mathcal{Z})$ and it possesses a unique invariant probability measure, which we denote by $\wp^{N}_{Q}$.

On one hand, it is straightforward to see that the family $\{\wp^{N}_{Q},N\geq 1\}$ satisfies the LDP on $\mathcal{M}_{1}(\mathcal{Z})$. Indeed, under stationarity, the state of each particle is distributed as $\xi^{*}_{Q}$. As a consequence, $\wp^{N}_{Q}$ is the law of the random variable $\frac{1}{N}\sum_{n=1}^{N}\delta_{\zeta_{n}}$ on $\mathcal{M}_{1}(\mathcal{Z})$, where $\zeta_{1},\ldots,\zeta_{N}$ are independent and identically distributed (i.i.d.) as $\xi^{*}_{Q}$. Therefore, by Sanov’s theorem [13, Theorem 6.2.10], $\{\wp^{N}_{Q},N\geq 1\}$ satisfies the LDP with the rate function $I(\cdot\|\xi_{Q}^{*})$, where $I:\mathcal{M}_{1}(\mathcal{Z})\times\mathcal{M}_{1}(\mathcal{Z})\to[0,\infty]$ is the relative entropy defined (with the convention $0\log 0=0$) by

$$I(\zeta\|\nu)\coloneqq\begin{cases}\displaystyle\sum_{z\in\mathcal{Z}}\zeta(z)\log\left(\frac{\zeta(z)}{\nu(z)}\right),&\text{if }\zeta\ll\nu,\\ \infty,&\text{otherwise.}\end{cases} \tag{1.4}$$
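For finitely supported measures, the relative entropy (1.4) can be computed directly. The following is a minimal sketch that assumes measures are represented as dictionaries mapping states to masses; the function name `relative_entropy` is illustrative.

```python
import math


def relative_entropy(zeta, nu):
    """I(zeta || nu) as in (1.4) for finitely supported zeta and nu.

    zeta and nu map states to probabilities. Returns math.inf when zeta
    is not absolutely continuous with respect to nu.
    """
    total = 0.0
    for z, p in zeta.items():
        if p == 0.0:
            continue                  # convention 0 log 0 = 0
        q = nu.get(z, 0.0)
        if q == 0.0:
            return math.inf           # zeta charges a nu-null state
        total += p * math.log(p / q)
    return total
```

For example, with $\nu$ uniform on $\{0,1\}$, the point mass at $0$ has relative entropy $\log 2$, while the point mass at $2$ has infinite relative entropy because it is not absolutely continuous with respect to $\nu$.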

On the other hand, it is natural to conjecture that the rate function for the family $\{\wp^{N}_{Q},N\geq 1\}$ is given by the quasipotential (1.3) with $\xi^{*}$ replaced by $\xi_{Q}^{*}$. However, as discussed in the next paragraph, the quasipotential is not the same as $I(\cdot\|\xi_{Q}^{*})$. Hence, from the uniqueness of the large deviations rate function [13, Lemma 4.1.4], the quasipotential does not govern the rate function for the family $\{\wp^{N}_{Q},N\geq 1\}$.

We now provide some intuition on why the quasipotential is not the rate function in the example under consideration. For a formal proof, see Section 8. We first introduce some notation. Let $\mathbb{R}^{\infty}$ denote the infinite product of $\mathbb{R}$ equipped with the product topology. We view $\mathcal{M}_{1}(\mathcal{Z})$ as the subset $\{x\in\mathbb{R}^{\infty}:x_{i}\geq 0\,\forall i,\sum_{i\geq 0}x_{i}=1\}$ of $\mathbb{R}^{\infty}$ with the subspace topology (e.g., see [15, Chapter 3, Section 2]). If $\xi,f\in\mathbb{R}^{\infty}$, we define

$$\langle\xi,f\rangle\coloneqq\lim_{m\to\infty}\sum_{z=0}^{m}\xi(z)f(z), \tag{1.5}$$

whenever the limit exists. Also, define $\vartheta:\mathcal{Z}\to\mathbb{R}_{+}$ by

$$\vartheta(z)\coloneqq z\log z,\quad z\in\mathcal{Z}, \tag{1.6}$$

with the convention that $0\log 0=0$, and define $\iota(z)\coloneqq z$, $z\in\mathcal{Z}$. Using the fact that $\xi^{*}_{Q}$ has geometric decay, it can be checked that $I(\xi\|\xi^{*}_{Q})$ is finite if and only if the first moment of $\xi$ (i.e., $\langle\xi,\iota\rangle$) is finite. However it turns out that $V(\xi)$ (i.e., the quantity in (1.3) with $\xi^{*}$ replaced by $\xi^{*}_{Q}$) is finite if and only if the $\vartheta$-moment of $\xi$ (i.e., $\langle\xi,\vartheta\rangle$) is finite. In particular, if we consider a $\xi\in\mathcal{M}_{1}(\mathcal{Z})$ whose first moment is finite but whose $\vartheta$-moment is infinite, then $V(\xi)\neq I(\xi\|\xi^{*}_{Q})$. Let $\varepsilon>0$, let $\xi\in\mathcal{M}_{1}(\mathcal{Z})$ be such that $\langle\xi,\iota\rangle<\infty$ but $\langle\xi,\vartheta\rangle=\infty$, and consider the $\varepsilon$-neighbourhood of $\xi$ in $\mathcal{M}_{1}(\mathcal{Z})$. By Sanov’s theorem, the probability of this neighbourhood under $\wp^{N}_{Q}$ is of the form $\exp\{-N(I(\xi\|\xi^{*}_{Q})+o(1))\}$. For a fixed $T>0$, let us now try to estimate the probability of $\mu^{N}_{\nu_{N}}(T)$ being in this neighbourhood when $\nu_{N}$ is in a small neighbourhood of $\xi^{*}_{Q}$. If the process $\mu^{N}$ is initiated at a $\nu_{N}$ near $\xi^{*}_{Q}$, then the probability that the random variable $\mu^{N}_{\nu_{N}}(T)$ is in the $\varepsilon$-neighbourhood of $\xi$ is at most

$$\exp\left\{-N\left(\inf_{\{\xi^{\prime}:d(\xi,\xi^{\prime})\leq\varepsilon\}}V(\xi^{\prime})+o(1)\right)\right\}.$$

Since $V$ is lower semicontinuous (we prove this in Lemma 5.4) and $V(\xi)=\infty$, we must have

$$\inf_{\{\xi^{\prime}:d(\xi,\xi^{\prime})\leq\varepsilon\}}V(\xi^{\prime})\to\infty\text{ as }\varepsilon\to 0.$$

Hence we can choose an $\varepsilon$ small enough so that $\inf_{\{\xi^{\prime}:d(\xi,\xi^{\prime})\leq\varepsilon\}}V(\xi^{\prime})>2I(\xi\|\xi^{*}_{Q})$. For this $\varepsilon$, the probability that $\mu^{N}_{\nu_{N}}(T)$ lies in the $\varepsilon$-neighbourhood of $\xi$ is upper bounded by $\exp\{-N\times(2I(\xi\|\xi^{*}_{Q})+o(1))\}$, which is smaller than $\exp\{-N(I(\xi\|\xi^{*}_{Q})+o(1))\}$, even in the exponential scale, for large enough $N$. That is, for any arbitrary but fixed $T$, we can find a small neighbourhood of $\xi$ such that the probability that $\mu^{N}_{\nu_{N}}(T)$ lies in that neighbourhood is smaller than what we expect to see in the stationary regime. In other words, there are some barriers in $\mathcal{M}_{1}(\mathcal{Z})$ that cannot be surmounted in any finite time, yet these barriers can be crossed in the stationary regime. These barriers indicate that, to obtain the correct stationary regime probability of a small neighbourhood of $\xi$ using the dynamics of $\mu^{N}_{\nu_{N}}$, one should wait longer than any fixed time horizon. That is, one should consider the random variable $\mu^{N}_{\nu_{N}}(T(N))$, where $T(N)$ is a suitable function of $N$, and estimate the probability that $\mu^{N}_{\nu_{N}}(T(N))$ belongs to a small neighbourhood of $\xi$. However it is not straightforward to obtain such estimates from the process-level large deviation estimates of $\mu^{N}_{\nu_{N}}$ since the latter are usually available for a fixed time duration.
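The argument above requires a probability measure with finite first moment but infinite $\vartheta$-moment. One concrete choice is sketched below; it is an illustrative example and is not taken from the paper. Both claims follow from the integral test, since $\int_{2}^{\infty}\frac{dx}{x(\log x)^{2}}=\frac{1}{\log 2}$ while $\int_{2}^{x}\frac{dy}{y\log y}=\log\log x-\log\log 2\to\infty$.

```latex
\begin{align*}
\xi(z) &\coloneqq \frac{c}{z^{2}(\log z)^{2}}, \quad z\geq 2, \qquad \xi(0)=\xi(1)=0, \qquad
c^{-1}\coloneqq\sum_{z\geq 2}\frac{1}{z^{2}(\log z)^{2}}<\infty,\\
\langle\xi,\iota\rangle &= c\sum_{z\geq 2}\frac{1}{z(\log z)^{2}}<\infty, \qquad
\langle\xi,\vartheta\rangle = c\sum_{z\geq 2}\frac{1}{z\log z}=\infty .
\end{align*}
```

For this $\xi$, the characterisations stated above give $I(\xi\|\xi^{*}_{Q})<\infty$ and $V(\xi)=\infty$.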

There are natural barriers in the context of finite-state mean-field models when the limiting dynamical system has multiple (but finitely many) stable equilibria [36]. In such situations, passages from a neighbourhood of one equilibrium to a neighbourhood of another take place over time durations of the form $\exp\{N\times O(1)\}$ where $N$ is the number of particles (here $O(1)$ refers to a bounded sequence, and $\omega(1)$ refers to a sequence that goes to $\infty$). Interestingly, these barriers can be surmounted using trajectories of finite time durations; i.e., for any fixed $T$, the probability that the empirical measure process reaches a neighbourhood of an equilibrium at time $T$ when it is initiated in a small neighbourhood of another equilibrium is of the form $\exp\{-N\times O(1)\}$. In contrast, in the case of the above counterexample, the barriers cannot be surmounted in finite time durations; for any fixed $T$, the probability that $\mu^{N}(T)$ reaches a small neighbourhood of a point in $\mathcal{M}_{1}(\mathcal{Z})$ with finite first moment but infinite $\vartheta$-moment when it is initiated from a neighbourhood of $\xi^{*}_{Q}$ is of the form $\exp\{-N\times\omega(1)\}$. Hence we anticipate that the barriers that we encounter in the above counterexample are somehow more difficult to surmount than those that arise in the case of finite-state mean-field models with multiple stable equilibria.

1.1.2 Non-interacting nodes in a wireless network

[Figure 2: Transition rates of a wireless node. Each forward edge $(z,z+1)$ carries rate $\lambda_{f}$ and each backward edge $(z,0)$ carries rate $\lambda_{b}$.]

We provide another counterexample where the issue is similar. Consider the graph $(\mathcal{Z},\mathcal{E}_{W})$ whose edge set $\mathcal{E}_{W}$ consists of forward edges $\{(z,z+1),z\in\mathcal{Z}\}$ and backward edges $\{(z,0),z\in\mathcal{Z}\setminus\{0\}\}$ (see Figure 2). Let $\lambda_{f}$ and $\lambda_{b}$ be positive numbers. Consider the generator $L^{W}$ acting on functions $f$ on $\mathcal{Z}$ by

$$L^{W}f(z)\coloneqq\sum_{z^{\prime}:(z,z^{\prime})\in\mathcal{E}_{W}}\lambda_{z,z^{\prime}}(f(z^{\prime})-f(z)),\quad z\in\mathcal{Z},$$

where $\lambda_{z,z+1}=\lambda_{f}$ for each $z\in\mathcal{Z}$ and $\lambda_{z,0}=\lambda_{b}$ for each $z\in\mathcal{Z}\setminus\{0\}$. The invariant probability measure associated with this Markov process is

$$\xi^{*}_{W}(z)\coloneqq\frac{\lambda_{b}}{\lambda_{f}+\lambda_{b}}\left(\frac{\lambda_{f}}{\lambda_{f}+\lambda_{b}}\right)^{z},\quad z\in\mathcal{Z}.$$
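Unlike the M/M/1 queue, this chain is not reversible (mass returns to $0$ directly from every state), so a check of $\xi^{*}_{W}$ has to use the global balance equations: inflow equals outflow at each state. The sketch below does this numerically; the function names and the parameter values in the check are illustrative assumptions.

```python
def wireless_invariant(z, lam_f, lam_b):
    """Geometric invariant measure of the wireless-node chain."""
    p = lam_b / (lam_f + lam_b)
    return p * (1.0 - p) ** z


def global_balance_gap(z, lam_f, lam_b):
    """Probability inflow minus outflow at state z under wireless_invariant."""
    def xi(y):
        return wireless_invariant(y, lam_f, lam_b)

    if z == 0:
        # Inflow from every state y >= 1 via edges (y, 0); outflow via edge (0, 1).
        return lam_b * (1.0 - xi(0)) - lam_f * xi(0)
    # Inflow via edge (z-1, z); outflow via edges (z, z+1) and (z, 0).
    return lam_f * xi(z - 1) - (lam_f + lam_b) * xi(z)
```

The gap vanishes (up to rounding) at every state, for instance with $\lambda_f=1$ and $\lambda_b=2$.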

Similar to the previous example, for each $N\geq 1$, we consider $N$ particles, each of which evolves independently as a Markov process on $\mathcal{Z}$ with the infinitesimal generator $L^{W}$. It is easy to check that the empirical measure of the system of particles possesses a unique invariant probability measure, which we denote by $\wp_{W}^{N}$. Under stationarity, the state of each particle is distributed as $\xi^{*}_{W}$. As a consequence, $\wp^{N}_{W}$ is the law of the random variable $\frac{1}{N}\sum_{n=1}^{N}\delta_{\zeta_{n}}$ on $\mathcal{M}_{1}(\mathcal{Z})$, where $\zeta_{1},\ldots,\zeta_{N}$ are i.i.d. $\xi^{*}_{W}$. Hence, by Sanov’s theorem, the family $\{\wp^{N}_{W},N\geq 1\}$ satisfies the LDP with the rate function $I(\cdot\|\xi^{*}_{W})$. As we show in Section 8, in this example too, the quasipotential (1.3) with $\xi^{*}$ replaced by $\xi_{W}^{*}$ is not the same as $I(\cdot\|\xi^{*}_{W})$. As in the previous example, there are points $\xi$ where $V(\xi)=\infty$ but $I(\xi\|\xi_{W}^{*})<\infty$, namely points $\xi$ that have a finite first moment but infinite $\vartheta$-moment. Once again, the quasipotential does not govern the rate function for the family $\{\wp^{N}_{W},N\geq 1\}$.

1.2 Assumptions and main result

We now provide some assumptions on the model of countable-state mean-field interacting particle systems that ensure that the barriers in $\mathcal{M}_{1}(\mathcal{Z})$ that are insurmountable using trajectories of arbitrary but finite time duration remain insurmountable in the stationary regime as well. Under these assumptions, we prove the main result of this paper, i.e., the sequence of invariant measures $\{\wp^{N},N\geq 1\}$ satisfies the LDP with rate function $V$.

1.2.1 Assumptions

Our first set of assumptions is on the mean-field interacting particle system (i.e., on the generator $\mathscr{L}^{N}$ defined in (1.1)).

  1. (A1)

    The edge set is given by $\mathcal{E}=\{(z,z+1),z\in\mathcal{Z}\}\cup\{(z,0),z\in\mathcal{Z}\setminus\{0\}\}$.

  2. (A2)

    There exist positive constants $\overline{\lambda}$ and $\underline{\lambda}$ such that

    $$\frac{\underline{\lambda}}{z+1}\leq\lambda_{z,z+1}(\xi)\leq\frac{\overline{\lambda}}{z+1},\quad\text{and}\quad\underline{\lambda}\leq\lambda_{z,0}(\xi)\leq\overline{\lambda},$$

    for all $\xi\in\mathcal{M}_{1}(\mathcal{Z})$.

  3. (A3)

    The functions $(z+1)\lambda_{z,z+1}(\cdot)$, $z\in\mathcal{Z}$, and $\lambda_{z,0}(\cdot)$, $z\in\mathcal{Z}\setminus\{0\}$, are uniformly Lipschitz continuous on $\mathcal{M}_{1}(\mathcal{Z})$.

Note that assumption (A1) considers a specific transition graph (Figure 2) for each particle. This graph arises in the contexts of random backoff algorithms for medium access in wireless local area networks [20] and decentralised control of loads in a smart grid [26]. Assumption (A2) ensures that the forward transition rate at state $z$ decays as $1/z$. This key assumption cuts down the speed of outward excursions and enables us to overcome the issue described in the counterexamples. To highlight this, consider a modified example of Section 1.1.2 where $\lambda_{z,z+1}=\lambda_{f}/(z+1)$, $z\in\mathcal{Z}$; the rest of the description remains the same. Let $\tilde{\xi}_{W}\in\mathcal{M}_{1}(\mathcal{Z})$ denote the invariant probability measure associated with one particle. It can be checked that $\tilde{\xi}_{W}(z)$ is of the order of $\exp\{-\vartheta(z)\}$, unlike $\xi^{*}_{W}$ which has geometric decay. As a consequence, $I(\xi\|\tilde{\xi}_{W})$ is finite if and only if the $\vartheta$-moment of $\xi$ is finite. Hence, by imposing (A2), we have ensured that the barriers in $\mathcal{M}_{1}(\mathcal{Z})$ that are insurmountable for finite time duration trajectories continue to remain insurmountable in the stationary regime; this is the key property that enables us to prove the main result of this paper. Assumption (A3) is a uniform Lipschitz continuity property for the transition rates which is required for the process-level LDP for $\mu^{N}_{\nu_{N}}$ to hold and for the McKean-Vlasov equation (1.2) to be well-posed.
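The decay of $\tilde{\xi}_{W}$ in the modified example can be explored numerically. The sketch below is an illustration and not a proof: it assumes the global balance recursion for the modified chain, works with unnormalised weights in log-space to avoid floating-point underflow, and uses the illustrative values $\lambda_f=\lambda_b=1$. The function names are hypothetical.

```python
import math


def log_modified_weights(z_max, lam_f, lam_b):
    """Logs of unnormalised invariant weights w(0..z_max), with w(0) = 1.

    Global balance at z >= 1 for the modified chain reads
    (lam_f / z) * w(z-1) = (lam_f / (z+1) + lam_b) * w(z).
    """
    logs = [0.0]
    for z in range(1, z_max + 1):
        ratio = (lam_f / z) / (lam_f / (z + 1) + lam_b)
        logs.append(logs[-1] + math.log(ratio))
    return logs


def decay_ratio(z, lam_f=1.0, lam_b=1.0):
    """-log w(z) / (z log z); a value near 1 indicates exp{-theta(z)}-type decay."""
    return -log_modified_weights(z, lam_f, lam_b)[z] / (z * math.log(z))
```

The ratio returned by `decay_ratio` increases slowly towards $1$ as $z$ grows, consistent with decay of the order of $\exp\{-\vartheta(z)\}$ on the logarithmic scale. For a geometric law such as $\xi^{*}_{W}$ the analogous ratio behaves like a constant divided by $\log z$ and tends to $0$.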

Our second set of assumptions is on the McKean-Vlasov equation (1.2). Let $\mu_{\nu}$ denote the solution to the limiting dynamics (1.2) with initial condition $\nu\in\mathcal{M}_{1}(\mathcal{Z})$. Recall the function $\vartheta$. Define $\mathscr{K}_{M}\coloneqq\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):\langle\xi,\vartheta\rangle\leq M\}$, $M>0$.

  1. (B1)

    There exists a unique globally asymptotically stable equilibrium $\xi^{*}$ for the McKean-Vlasov equation (1.2).

  2. (B2)

    $\langle\xi^{*},\vartheta\rangle<\infty$ and $\lim_{t\to\infty}\sup_{\nu\in\mathscr{K}_{M}}\langle\mu_{\nu}(t),\vartheta\rangle=\langle\xi^{*},\vartheta\rangle$ for each $M>0$.

The first assumption above asserts that all the trajectories of (1.2) converge to $\xi^{*}$ as time becomes large. The proof of the LDP upper and lower bounds for the family $\{\wp^{N},N\geq 1\}$ involves construction of trajectories that start at suitable compact sets, reach the stable equilibrium $\xi^{*}$ using arbitrarily small cost, and then terminate at a desired point in $\mathcal{M}_{1}(\mathcal{Z})$ starting from $\xi^{*}$. All these are enabled by assumption (B1) (see more remarks about this assumption in Section 1.3). The second assumption asserts that the $\vartheta$-moment of the solution to the limiting dynamics converges uniformly over initial conditions lying in sets of bounded $\vartheta$-moment. In the case of a non-interacting system that satisfies (A1) but with constant forward transition rates (for example, see $L^{W}$ in Section 1.1.2), the analogue of this assumption can easily be verified: the first moment of the solution to the limiting dynamics converges uniformly over initial conditions lying in sets of bounded first moment. In fact, one can explicitly write down the first moment of the solution to the limiting dynamics in this case and verify this assumption easily. Assumption (B2) is the analogous statement for our mean-field system that satisfies the $1/z$-decay of the forward transition rates in assumption (A2).
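The explicit first-moment computation mentioned above can be sketched as follows for the generator $L^{W}$ of Section 1.1.2. This is a formal calculation, assuming the initial first moment is finite and that differentiation may be interchanged with the sum over $z$; the notation $m(t)$ is introduced here for illustration.

```latex
\begin{align*}
m(t) &\coloneqq \langle\mu_{\nu}(t),\iota\rangle, \qquad
L^{W}\iota(z) = \lambda_{f}\bigl((z+1)-z\bigr)+\lambda_{b}(0-z) = \lambda_{f}-\lambda_{b}z, \quad z\in\mathcal{Z},\\
\dot{m}(t) &= \langle\mu_{\nu}(t),L^{W}\iota\rangle = \lambda_{f}-\lambda_{b}\,m(t),\\
m(t) &= \frac{\lambda_{f}}{\lambda_{b}}+\Bigl(m(0)-\frac{\lambda_{f}}{\lambda_{b}}\Bigr)e^{-\lambda_{b}t},\\
\sup_{\nu:\,\langle\nu,\iota\rangle\leq M}\Bigl|m(t)-\frac{\lambda_{f}}{\lambda_{b}}\Bigr| &\leq \max\Bigl\{M,\frac{\lambda_{f}}{\lambda_{b}}\Bigr\}e^{-\lambda_{b}t}\to 0 \quad\text{as } t\to\infty .
\end{align*}
```

Since $\langle\xi^{*}_{W},\iota\rangle=\lambda_{f}/\lambda_{b}$ (the mean of the geometric law $\xi^{*}_{W}$), this is the first-moment analogue of (B2) for the non-interacting example.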

1.2.2 Main result

We now state the main result of this paper, namely the LDP for the family of invariant measures $\{\wp^{N},N\geq 1\}$ under the assumptions (A1)–(A3) and (B1)–(B2).

We first assert the existence and uniqueness of the invariant measure $\wp^{N}$ for $\mathscr{L}^{N}$ for each $N\geq 1$, and the exponential tightness of the family $\{\wp^{N},N\geq 1\}$.

Proposition 1.1.

Assume (A1) and (A2). For each $N\geq 1$, $\mathscr{L}^{N}$ admits a unique invariant probability measure $\wp^{N}$. Further, the family $\{\wp^{N},N\geq 1\}$ is exponentially tight in $\mathcal{M}_{1}(\mathcal{Z})$.

Recall the quasipotential $V$ defined in (1.3). We now state the main result of this paper.

Theorem 1.1.

Assume (A1), (A2), (A3), (B1), and (B2). Then the family of probability measures $\{\wp^{N},N\geq 1\}$ satisfies the large deviation principle on $\mathcal{M}_{1}(\mathcal{Z})$ with rate function $V$.

The proof of this result is carried out in Sections 4–7. We begin with the process-level uniform LDP for $\mu^{N}_{\nu_{N}}$ over compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$; this uniform LDP gives us the large deviation estimates for the process $\mu^{N}_{\nu_{N}}$ uniformly over the initial conditions $\nu_{N}$ lying in a given compact set (see Definition 2.2 and Theorem 2.1). We prove the LDP for the family $\{\wp^{N},N\geq 1\}$ by transferring this process-level uniform LDP for $\mu^{N}_{\nu_{N}}$ over compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$ to the stationary regime. The proof of the LDP lower bound (in Section 4) considers specific trajectories and lower bounds the probability of small neighbourhoods of points in $\mathcal{M}_{1}(\mathcal{Z})$ under $\wp^{N}$ using the probability that the process $\mu^{N}_{\nu_{N}}$ remains close to these trajectories. For the proof of the upper bound, we require certain regularity properties of the quasipotential. These properties are established in Section 5. We first show a controllability property for $V$ (this terminology is from Cerrai and Röckner [11]): $V(\xi)$ is finite if and only if $\langle\xi,\vartheta\rangle<\infty$. Using the lower bound proved in Section 4, we then show that the level sets of $V$ are compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$. Since $\mathcal{M}_{1}(\mathcal{Z})$ is not locally compact and $V$ has compact lower level sets, we do not expect $V$ to be continuous on $\mathcal{M}_{1}(\mathcal{Z})$. Indeed, if $\xi\in\mathcal{M}_{1}(\mathcal{Z})$ is such that $V$ is continuous at $\xi$ and $V(\xi)<\infty$, then given $\varepsilon>0$ there exists a $\delta>0$ such that $d(\xi^{\prime},\xi)<\delta$ implies that $|V(\xi^{\prime})-V(\xi)|<\varepsilon$. In particular, $\{\xi^{\prime}\in\mathcal{M}_{1}(\mathcal{Z}):V(\xi^{\prime})\leq V(\xi)+\varepsilon\}\supset B(\xi,\delta)$. Since $\{\xi^{\prime}\in\mathcal{M}_{1}(\mathcal{Z}):V(\xi^{\prime})\leq V(\xi)+\varepsilon\}$ is compact in $\mathcal{M}_{1}(\mathcal{Z})$, this shows that $\xi$ has a relatively compact neighbourhood in $\mathcal{M}_{1}(\mathcal{Z})$, which is a contradiction. This shows that, for any $\xi\in\mathcal{M}_{1}(\mathcal{Z})$ such that $V(\xi)<\infty$, $V$ is discontinuous at $\xi$. However we show the following small cost connection property: whenever $\xi_{n}\to\xi^{*}$ in $\mathcal{M}_{1}(\mathcal{Z})$ and $\langle\xi_{n},\vartheta\rangle\to\langle\xi^{*},\vartheta\rangle$ as $n\to\infty$, we have $\lim_{n\to\infty}V(\xi_{n})=V(\xi^{*})=0$. These properties of the quasipotential are then used to transfer the process-level uniform LDP upper bound for $\mu^{N}_{\nu_{N}}$ (uniform over compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$) to the LDP upper bound for the family of invariant measures. The proof of the upper bound is carried out in Section 6. Finally, we complete the proof of the theorem in Section 7.

While the proofs of our lower and upper bounds follow the general methodology of Sowers [33], there are significant model-specific difficulties that arise in our context. The main novelty in the proof of Theorem 1.1 is to establish the small cost connection property of the quasipotential $V$ under assumptions (A1)–(A3) and (B1)–(B2). That is, we can find trajectories of small cost that start at $\xi^{*}$ and end at points in $\mathcal{M}_{1}(\mathcal{Z})$ whose $\vartheta$-moment is not very far from that of $\xi^{*}$. In the work of Sowers [33], this has been carried out by considering the “straight-line” trajectory that connects the attractor to the nearby point under consideration. Such a trajectory may not have small cost in our case since the mass transfer is restricted to the edges in $\mathcal{E}$. We overcome this difficulty by considering a piecewise constant velocity mass transfer via the edges in $\mathcal{E}$. We then carefully estimate the cost of this trajectory and prove the necessary small cost connection property. We also simplify the proof of the compactness of the lower level sets of $V$; while Sowers [34, Proposition 7] studies the minimisation of the costs of trajectories over the infinite horizon, we arrive at it by using the LDP lower bound and the exponential tightness of the family $\{\wp^{N},N\geq 1\}$. We also remark that the methodology of Sowers [33] has been used by Cerrai and Röckner [11] in the context of stochastic reaction diffusion equations and by Cerrai and Paskal [9] in the context of two-dimensional stochastic Navier-Stokes equations.

1.3 Discussion and future directions

The main result and the counterexamples suggest that in order for the family of invariant measures of a Markov process to satisfy the large deviation principle with rate function governed by the Freidlin-Wentzell quasipotential, one must have some good properties on the model under consideration. In the case of our main result, this goodness property was achieved by the $1/z$-decay of the forward transition rates from assumption (A2). We use this assumption to show the exponential tightness of the invariant measure over compact subsets with bounded $\vartheta$-moments. It also enables us to show the necessary regularity properties of the quasipotential required to transfer the process-level large deviation result to the stationary regime. However a general treatment of the LDP for the family of invariant measures of Markov processes (that encompasses the cases of [33, 11, 9, 7, 17]), especially when the ambient state space is not locally compact, is missing in the literature.

One of the assumptions that plays a significant role in the proof of our main result is the existence of a unique globally asymptotically stable equilibrium for the limiting dynamics (assumption (B1)). (In the works of Sowers [33], Cerrai and Röckner [11], and Cerrai and Paskal [9], the model assumptions ensure that (B1) holds.) In general, the limiting dynamical system (1.2) could possess multiple $\omega$-limit sets. In that case the approach of our proofs breaks down. A well-known approach to study large deviations of the invariant measures in such cases is to focus on small neighbourhoods of these $\omega$-limit sets and then analyse the discrete time Markov chain that evolves on these neighbourhoods. The LDP then follows from the estimates of the invariant measure of this discrete time chain (see Freidlin and Wentzell [18, Chapter 6, Section 4]). However this approach requires the uniform LDP over open subsets of $\mathcal{M}_{1}(\mathcal{Z})$, which is not yet available for our mean-field model. If this can be established, along with the regularity properties of the quasipotential established in Section 5, one can not only use the above idea to extend our main result to the case when the limiting dynamical system possesses multiple $\omega$-limit sets but also to study exit problems and metastability phenomena in our mean-field model.

Another definition of the quasipotential appears in the literature. It is given by the minimisation of costs of the form S(,0](φ)S_{(-\infty,0]}(\varphi) over infinite-horizon trajectories φ\varphi on (,0](-\infty,0] such that the terminal time condition φ(0)\varphi(0) is fixed and φ(t)ξ\varphi(t)\to\xi^{*} as tt\to-\infty (see Sowers [33], Cerrai and Röckner [11]). While it is clear that the above definition of the quasipotential is a lower bound for VV in (1.3), unlike in Sowers [33] and Cerrai and Röckner [11], we are not able to show that the two definitions are the same. A proof of this equality, or otherwise, will add more insight on the general case.

We remark that assumption (A3) does not play a role in the proof of our main result. It is used to invoke the process-level LDP for μνNN\mu^{N}_{\nu_{N}} (see Theorem 2.1) and the well-posedness of the limiting dynamical system (1.2). If these two properties are established through some other means then the proof of Theorem 1.1 holds verbatim without the need for assumption (A3).

Finally, we mention that a time-independent variational formula for the quasipotential is available for some non-reversible models in statistical mechanics, see Bertini et al. [2, 3]. It is not clear if the quasipotential VV in (1.3) admits a time-independent variational form. This would be an interesting direction to explore.

1.4 Related literature

Process-level large deviations of small-noise diffusion processes have been well studied in the past. For finite-dimensional large deviation problems, see Freidlin and Wentzell [18, Chapter 5], Liptser [23], Veretennikov [35], Puhalskii [29], and the references therein. For infinite-dimensional problems where the state space is not locally compact, see Sowers [34] and Cerrai and Röckner [10]. More recently, the uniform large deviation principle (uniform LDP) for Banach-space valued stochastic differential equations over the class of bounded and open subsets of the Banach space has been studied by Salins et al. [31]. These results have been used to study the exit times and metastability in such processes, see Salins and Spiliopoulos [32]. While the above works focus on diffusion processes, our work focuses on the stationary regime large deviations of countable-state mean-field models with jumps. In the spirit of the small-noise problems listed above, our process $\mu^{N}_{\nu_{N}}$ can be viewed as a small random perturbation of the dynamical system (1.2) on $\mathcal{M}_{1}(\mathcal{Z})$.

In the context of interacting particle systems, Dawson and Gärtner [12] established the process-level LDP for weakly interacting diffusion processes, and Léonard [21] and Borkar and Sundaresan [7] extended this to mean-field interacting particle systems with jumps. In this work, we focus on the stationary regime large deviations of mean-field models with jumps when the state of each particle comes from a countable set. For small-noise diffusion processes on Euclidean spaces and finite-state mean-field models, since the state space (on which the empirical measure process evolves) is locally compact, the process-level large deviation results have been extended in a straightforward manner to the uniform LDP over the class of open subsets of the space. Such uniform large deviation estimates have been used to prove the large deviations of the invariant measure and the exit time estimates, see Freidlin and Wentzell [18, Chapter 6] in the context of diffusion processes, and Borkar and Sundaresan [7] and [36] in the context of finite-state mean-field models. One of the key ingredients in these proofs is the continuity of the quasipotential. However in our case, the state space $\mathcal{M}_{1}(\mathcal{Z})$ is infinite-dimensional and not locally compact. Therefore, since the quasipotential (1.3) is expected to have compact lower level sets, we do not expect it to be continuous on $\mathcal{M}_{1}(\mathcal{Z})$ unlike in the finite-dimensional problems mentioned above. Hence the ideas presented in [7] are not directly applicable to our context of the LDP for the family of invariant measures.

Large deviations of the family of invariant measures for small-noise diffusion processes on non-locally compact spaces have also been studied in the past, see Sowers [33] and Cerrai and Röckner [11]. They have a unique attractor for the limiting dynamics, and the proof essentially involves conversion of the uniform LDP over the finite-time horizon to the stationary regime. Martirosyan [24] studied a situation where the limiting dynamical system possesses multiple attractors. For the study of large deviations of the family of invariant measures for simple exclusion processes, see Bodineau and Giacomin [5] and Bertini et al. [3]. More recently, Farfán et al. [17] extended this to a simple exclusion process whose limiting hydrodynamic equation has multiple attractors. Their proof proceeds similar to the case of finite-dimensional diffusions in Freidlin and Wentzell [18, Chapter 6, Section 4] by first approximating the process near the attractors and then using the Khasminskii reconstruction formula [19, Chapter 4, Section 4]. In particular, it requires the uniform LDP to hold over open subsets of the state space. Since their state space, although infinite-dimensional, is compact, the proof of the uniform LDP over open subsets easily follows from the process-level LDP. Also, the compactness of the state space simplifies the proofs of the small cost connection property from the attractors to nearby points, a property needed in the Khasminskii reconstruction. Although we restrict our attention to the case of a unique globally asymptotically stable equilibrium as in [33, 11], the main novelty of our work is that we establish certain regularity properties of the quasipotential for countable-state mean-field models with jumps which were not done in the past. We then use these properties to prove the LDP for the family of invariant measures. 
Furthermore, we demonstrate two counterexamples where the rate functions of the stationary regime LDP are not governed by the usual quasipotential. To the best of our knowledge, such examples, where the LDP for the family of invariant measures holds but the rate function is not governed by the usual Freidlin-Wentzell quasipotential, are new. These examples are constructed in a way that the particle systems do not possess the small cost connection property from the attractor to nearby points with finite first moment but infinite $\vartheta$-moment.

Large deviations of the family of invariant measures for a queueing network in a finite-dimensional setting have been studied by Puhalskii [28]. Finally, large deviations of the family of invariant measures for a stochastic process under some general conditions have been studied by Puhalskii [30]. One of their conditions is the small cost connection property between any two nearby points in the state space, which we do not expect to be true in our countable-state mean-field model since our state space is infinite-dimensional.

1.5 Organisation

This paper is organised as follows. In Section 2, we provide preliminary results on the large deviations over finite time horizons. The proof of the main result is carried out in Sections 3–7. In Section 3, we prove the existence, uniqueness, and exponential tightness of the family of invariant measures. In Section 4, we prove the LDP lower bound for the family of invariant measures. In Section 5, we establish some regularity properties of the quasipotential $V$ defined in (1.3). In Section 6, we prove the LDP upper bound for the family of invariant measures. In Section 7, we complete the proof of the main result. Finally in Section 8, we prove that the quasipotential differs from the relative entropy (with respect to the globally asymptotically stable equilibrium) for the two counterexamples discussed in Section 1.1.

2 Preliminaries

2.1 Frequently used notation

We first summarise the frequently used notation in the paper. Let $\mathcal{Z}$ denote the set of nonnegative integers and let $(\mathcal{Z},\mathcal{E})$ denote a directed graph on $\mathcal{Z}$. Let $\mathbb{R}^{\infty}$ denote the infinite product of $\mathbb{R}$ equipped with the topology of pointwise convergence. Let $C_{0}(\mathcal{Z})$ denote the space of functions on $\mathcal{Z}$ with compact support. Recall that $\mathcal{M}_{1}(\mathcal{Z})$ denotes the space of probability measures on $\mathcal{Z}$ equipped with the total variation metric (denoted by $d$). This metric generates the topology of weak convergence on $\mathcal{M}_{1}(\mathcal{Z})$. By Scheffé’s lemma [15, Chapter 3, Section 2], $\mathcal{M}_{1}(\mathcal{Z})$ can be identified with the subset $\{x\in\mathbb{R}^{\infty}:x_{i}\geq 0\,\forall i,\sum_{i\geq 0}x_{i}=1\}$ of $\mathbb{R}^{\infty}$ with the subspace topology. For each $N\geq 1$, recall that $\mathcal{M}_{1}^{N}(\mathcal{Z})\subset\mathcal{M}_{1}(\mathcal{Z})$ denotes the space of probability measures on $\mathcal{Z}$ that can arise as empirical measures of $N$-particle configurations on $\mathcal{Z}^{N}$. Recall $\vartheta$ defined in (1.6). Given $\alpha\in C_{0}(\mathcal{Z})$ and $g\in\mathbb{R}^{\infty}$, let the bracket $\langle\alpha,g\rangle$ denote $\sum_{z\in\mathcal{Z}}\alpha(z)g(z)$. Similarly, given $f,g\in\mathbb{R}^{\infty}$, let the bracket $\langle f,g\rangle$ denote $\lim_{n\to\infty}\sum_{k=0}^{n}f(k)g(k)$, whenever the limit exists. For $M>0$, define

𝒦M{ξ1(𝒵):ξ,ϑM};\displaystyle\mathscr{K}_{M}\coloneqq\left\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):\langle\xi,\vartheta\rangle\leq M\right\}; (2.1)

by Prohorov’s theorem, 𝒦M\mathscr{K}_{M} is a compact subset of 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}). Define 𝒦M1𝒦M\mathscr{K}\coloneqq\bigcup_{M\geq 1}\mathscr{K}_{M}. Let ξ1(𝒵)\xi^{*}\in\mathcal{M}_{1}(\mathcal{Z}) denote the globally asymptotically stable equilibrium for the McKean-Vlasov equation (1.2) (see assumption (B1)). For each Δ>0\Delta>0, define

K(Δ){ξ𝒦:d(ξ,ξ)Δ and |ξ,ϑξ,ϑ|Δ};\displaystyle K(\Delta)\coloneqq\{\xi\in\mathscr{K}:d(\xi^{*},\xi)\leq\Delta\text{ and }|\langle\xi^{*},\vartheta\rangle-\langle\xi,\vartheta\rangle|\leq\Delta\}; (2.2)

note that K(Δ)K(\Delta) depends on ξ\xi^{*} as well (which we do not indicate for ease of readability). Define

τ(u)euu1,u.\displaystyle\tau(u)\coloneqq e^{u}-u-1,\,u\in\mathbb{R}. (2.3)

Note that τ\tau is the log-moment generating function of the centred unit rate Poisson law, and define its convex dual

\[
\tau^{*}(u)\coloneqq\begin{cases}\infty&\text{if }u<-1,\\ 1&\text{if }u=-1,\\ (u+1)\log(u+1)-u&\text{if }u>-1.\end{cases}\tag{2.7}
\]
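As a quick numerical sanity check (not part of the original paper), the following Python sketch compares the closed form (2.7) with a brute-force evaluation of the Legendre transform $\sup_{v}\{uv-\tau(v)\}$ on a grid. The function names, the grid range, and the grid size are illustrative assumptions.

```python
import math


def tau(u: float) -> float:
    """Log-moment generating function of the centred unit-rate Poisson law, as in (2.3)."""
    return math.exp(u) - u - 1.0


def tau_star(u: float) -> float:
    """Closed form of the convex dual of tau, as in (2.7)."""
    if u < -1.0:
        return math.inf
    if u == -1.0:
        return 1.0
    return (u + 1.0) * math.log(u + 1.0) - u


def tau_star_numeric(u: float, lo: float = -40.0, hi: float = 40.0, n: int = 80_001) -> float:
    """Grid approximation of sup_v { u*v - tau(v) } over v in [lo, hi] (assumed range)."""
    step = (hi - lo) / (n - 1)
    return max(u * (lo + i * step) - tau(lo + i * step) for i in range(n))
```

For $u>-1$ the supremum is attained at $v=\log(u+1)$, which is how the closed form arises; for $u=-1$ the objective reduces to $1-e^{v}$, whose supremum $1$ is approached as $v\to-\infty$.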

For a complete and separable metric space (𝒮,d0)(\mathcal{S},d_{0}), A𝒮A\subset\mathcal{S}, and x𝒮x\in\mathcal{S}, let d0(x,A)d_{0}(x,A) denote infyAd0(x,y)\inf_{y\in A}d_{0}(x,y). For a set AA let A\sim A denote the complement of AA. For two numbers aa and bb, let aba\vee b (resp. aba\wedge b) denote maximum (resp. minimum) of aa and bb. Also, let a+=max{a,0}a^{+}=\max\{a,0\}. For a metric space 𝒮\mathcal{S}, let (𝒮)\mathcal{B}(\mathcal{S}) denote the Borel σ\sigma-field on 𝒮\mathcal{S}. Finally, constants are denoted by CC and their values may be different in each occurrence.

2.1.1 Notation related to the dynamics

Let D([0,T],𝒮)D([0,T],\mathcal{S}) denote the space of 𝒮\mathcal{S}-valued functions on [0,T][0,T] that are right continuous with left limits. It is equipped with the Skorohod topology which makes it a complete and separable metric space (see, for example, Ethier and Kurtz [16, Chapter 3]). Let ρ\rho denote a metric on D([0,T],𝒮)D([0,T],\mathcal{S}) that generates the Skorohod topology. An element of D([0,T],𝒮)D([0,T],\mathcal{S}) is called a “trajectory”, and we shall refer to the process-level large deviations rate function evaluated on a trajectory as the “cost” associated with that trajectory. For a trajectory φ\varphi, let both φt\varphi_{t} and φ(t)\varphi(t) denote the evaluation of φ\varphi at time tt. For N1N\geq 1 and νN1N(𝒵)\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}), let νNN\mathbb{P}^{N}_{\nu_{N}} denote the solution to the D([0,T],1N(𝒵))D([0,T],\mathcal{M}_{1}^{N}(\mathcal{Z}))-valued martingale problem for N\mathscr{L}^{N} with initial condition νN1N(𝒵)\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}) (whenever the martingale problem for N\mathscr{L}^{N} is well-posed). Let μνNN\mu^{N}_{\nu_{N}} denote the random element of D([0,T],1N(𝒵))D([0,T],\mathcal{M}_{1}^{N}(\mathcal{Z})) whose law is νNN\mathbb{P}^{N}_{\nu_{N}}. For each ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}), let LξL_{\xi} denote the generator acting on functions ff on 𝒵\mathcal{Z} by

\[
f\mapsto L_{\xi}f(z)\coloneqq\sum_{z':(z,z')\in\mathcal{E}}\lambda_{z,z'}(\xi)(f(z')-f(z)),\quad z\in\mathcal{Z},
\]

i.e., the generator of the single particle evolving on 𝒵\mathcal{Z} under the static mean-field ξ\xi.

Let C01([0,T]×𝒵)C_{0}^{1}([0,T]\times\mathcal{Z}) denote the space of real-valued functions on [0,T]×𝒵[0,T]\times\mathcal{Z} with compact support that are continuously differentiable in the first argument. Given a trajectory φD([0,T],1(𝒵))\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})) such that the mapping [0,T]tφt1(𝒵)[0,T]\ni t\mapsto\varphi_{t}\in\mathcal{M}_{1}(\mathcal{Z}) is absolutely continuous (see Dawson and Gärtner [12, Section 4.1]), one can define φ˙t\dot{\varphi}_{t}\in\mathbb{R}^{\infty} for almost all t[0,T]t\in[0,T] such that

φt,ft=φ0,f0+[0,t]φ˙u,fu𝑑u+[0,t]φu,ufu𝑑u\displaystyle\langle\varphi_{t},f_{t}\rangle=\langle\varphi_{0},f_{0}\rangle+\int_{[0,t]}\langle\dot{\varphi}_{u},f_{u}\rangle du+\int_{[0,t]}\langle\varphi_{u},\partial_{u}f_{u}\rangle du

holds for each fC01([0,T]×𝒵)f\in C_{0}^{1}([0,T]\times\mathcal{Z}) and t[0,T]t\in[0,T].

Finally, let 1(D([0,T],𝒵))\mathcal{M}_{1}(D([0,T],\mathcal{Z})) denote the space of probability measures on D([0,T],𝒵)D([0,T],\mathcal{Z}) equipped with the usual weak topology. Also, let 1(1(D([0,T],𝒵)))\mathcal{M}_{1}(\mathcal{M}_{1}(D([0,T],\mathcal{Z}))) denote the space of probability measures on 1(D([0,T],𝒵))\mathcal{M}_{1}(D([0,T],\mathcal{Z})) equipped with the weak topology.

2.2 Process-level large deviations

We first recall the definition of the large deviation principle for a family of random variables indexed by one parameter.

Definition 2.1 (Large deviation principle).

Let $(\mathcal{S},d_{0})$ be a metric space. We say that a family $\{X^{N},N\geq 1\}$ of $\mathcal{S}$-valued random variables defined on a probability space $(\Omega,\mathcal{F},P)$ satisfies the large deviation principle with rate function $I:\mathcal{S}\to[0,\infty]$ if

  • (Compactness of level sets). For any s0s\geq 0, Φ(s){x𝒮:I(x)s}\Phi(s)\coloneqq\{x\in\mathcal{S}:I(x)\leq s\} is a compact subset of 𝒮\mathcal{S};

  • (LDP lower bound). For any γ>0\gamma>0, δ>0\delta>0, and x𝒮x\in\mathcal{S}, there exists N01N_{0}\geq 1 such that

    P(d0(XN,x)<δ)exp{N(I(x)+γ)}\displaystyle P(d_{0}(X^{N},x)<\delta)\geq\exp\{-N(I(x)+\gamma)\}

    for any NN0N\geq N_{0};

  • (LDP upper bound). For any γ>0\gamma>0, δ>0\delta>0, and s>0s>0, there exists N01N_{0}\geq 1 such that

    P(d0(XN,Φ(s))δ)exp{N(sγ)}\displaystyle P(d_{0}(X^{N},\Phi(s))\geq\delta)\leq\exp\{-N(s-\gamma)\}

    for any NN0N\geq N_{0}.

This definition is also used to study the large deviations of a family of probability measures. For each $N\geq 1$, let $P^{N}=P\circ(X^{N})^{-1}$, the law of the random variable $X^{N}$ on $(\mathcal{S},d_{0})$. We say that the family of probability measures $\{P^{N},N\geq 1\}$ satisfies the LDP on $(\mathcal{S},d_{0})$ with rate function $I$ if the sequence of $\mathcal{S}$-valued random variables $\{X^{N},N\geq 1\}$ satisfies the LDP with rate function $I$.

The LDP lower bound in the above definition is equivalent to the following statement [18, Chapter 3, Section 3]:

\[
\liminf_{N\to\infty}\frac{1}{N}\log P(X^{N}\in G)\geq-\inf_{x\in G}I(x),\text{ for all }G\subset\mathcal{S}\text{ open}.
\]

Similarly, under the compactness of the level sets of the rate function II, the LDP upper bound above is equivalent to the following statement:

\[
\limsup_{N\to\infty}\frac{1}{N}\log P(X^{N}\in F)\leq-\inf_{x\in F}I(x),\text{ for all }F\subset\mathcal{S}\text{ closed}.
\]
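To see Definition 2.1 at work in the simplest possible setting, the following Python sketch (an illustration added here, not taken from the paper) considers $X^{N}=\mathrm{Poisson}(N)/N-1$, the empirical mean of $N$ independent centred unit-rate Poisson variables. By Cramér's theorem its rate function is the dual $\tau^{*}$ in (2.7), so for $u>0$ the quantity $-\frac{1}{N}\log P(X^{N}\geq u)$ should approach $\tau^{*}(u)$. The helper names and the truncation length of the tail sum are assumptions made for the sketch.

```python
import math


def tau_star(u: float) -> float:
    """Convex dual of tau(u) = e^u - u - 1, as in (2.7)."""
    if u < -1.0:
        return math.inf
    if u == -1.0:
        return 1.0
    return (u + 1.0) * math.log(u + 1.0) - u


def log_poisson_upper_tail(N: int, k0: int) -> float:
    """log P(Poisson(N) >= k0), summed in log space.

    The sum is truncated after 20*N + 200 terms (an assumed cutoff);
    the neglected terms are negligible for the values used here.
    """
    n_terms = 20 * N + 200
    logs = [-N + k * math.log(N) - math.lgamma(k + 1) for k in range(k0, k0 + n_terms)]
    m = max(logs)
    return m + math.log(sum(math.exp(x - m) for x in logs))


def empirical_rate(N: int, u: float) -> float:
    """-(1/N) log P(Poisson(N)/N - 1 >= u), to be compared with tau_star(u)."""
    k0 = math.ceil(N * (1.0 + u))
    return -log_poisson_upper_tail(N, k0) / N
```

Since $\tau^{*}$ is increasing on $[0,\infty)$, the infimum of the rate function over the closed set $[u,\infty)$ equals $\tau^{*}(u)$, so `empirical_rate(N, u)` stays above `tau_star(u)` (Chernoff bound) and the gap shrinks as $N$ grows.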

To study the LDP for the family of invariant measures, we require estimates on the probabilities of the process-level large deviations of μνNN\mu^{N}_{\nu_{N}}. In particular, we consider hitting times of μνNN\mu^{N}_{\nu_{N}} on certain subsets of the state space 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) and apply the process-level large deviation lower and upper bounds for μνNN\mu^{N}_{\nu_{N}} starting at these subsets. Therefore, in addition to the scaling parameter NN, we must consider the process μνNN\mu^{N}_{\nu_{N}} indexed by the initial condition νN1N(𝒵)\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}). To study the process-level large deviations of such stochastic processes indexed by two parameters, we use the following definition of the uniform large deviation principle (see Freidlin and Wentzell [18, Chapter 3, Section 3]).

Definition 2.2 (Uniform large deviation principle).

We say that the family $\{\mu^{N}_{\nu_{N}},\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}),N\geq 1\}$ of $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$-valued random variables defined on a probability space $(\Omega,\mathcal{F},P)$ satisfies the uniform large deviation principle over the class $\mathcal{A}$ of subsets of $\mathcal{M}_{1}(\mathcal{Z})$ with the family of rate functions $\{I_{\nu},\nu\in\mathcal{M}_{1}(\mathcal{Z})\}$, $I_{\nu}:D([0,T],\mathcal{M}_{1}(\mathcal{Z}))\to[0,+\infty]$, $\nu\in\mathcal{M}_{1}(\mathcal{Z})$, if

  • (Compactness of level sets). For each K1(𝒵)K\subset\mathcal{M}_{1}(\mathcal{Z}) compact and s0s\geq 0, νKΦν(s)\bigcup_{\nu\in K}\Phi_{\nu}(s) is a compact subset of D([0,T],1(𝒵))D([0,T],\mathcal{M}_{1}(\mathcal{Z})), where Φν(s){φD([0,T],1(𝒵)):φ(0)=ν,Iν(φ)s}\Phi_{\nu}(s)\coloneqq\{\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})):\varphi(0)=\nu,I_{\nu}(\varphi)\leq s\};

  • (Uniform LDP lower bound). For any γ>0\gamma>0, δ>0\delta>0, s>0s>0, and A𝒜A\in\mathcal{A}, there exists N01N_{0}\geq 1 such that

    P(ρ(μνNN,φ)<δ)exp{N(IνN(φ)+γ)},\displaystyle P(\rho(\mu^{N}_{\nu_{N}},\varphi)<\delta)\geq\exp\{-N(I_{\nu_{N}}(\varphi)+\gamma)\},

    for all νNA1N(𝒵)\nu_{N}\in A\cap\mathcal{M}_{1}^{N}(\mathcal{Z}), φΦνN(s)\varphi\in\Phi_{\nu_{N}}(s), and NN0N\geq N_{0};

  • (Uniform LDP upper bound). For any γ>0\gamma>0, δ>0\delta>0, s0>0s_{0}>0, and A𝒜A\in\mathcal{A}, there exists N01N_{0}\geq 1 such that

    P(ρ(μνNN,ΦνN(s))δ)exp{N(sγ)},\displaystyle P(\rho(\mu^{N}_{\nu_{N}},\Phi_{\nu_{N}}(s))\geq\delta)\leq\exp\{-N(s-\gamma)\},

    for all νNA1N(𝒵)\nu_{N}\in A\cap\mathcal{M}_{1}^{N}(\mathcal{Z}), ss0s\leq s_{0}, and NN0N\geq N_{0}.

Note that the initial conditions in the upper and lower bounds lie in A1N(𝒵)A\cap\mathcal{M}_{1}^{N}(\mathcal{Z}), unlike in the definition in [18, Chapter 3, Section 3].

We now make some definitions. Recall τ\tau defined in (2.3). For each ν1(𝒵)\nu\in\mathcal{M}_{1}(\mathcal{Z}) and T>0T>0, define the functional S[0,T](|ν):D([0,T],1(𝒵))[0,]S_{[0,T]}(\cdot|\nu):D([0,T],\mathcal{M}_{1}(\mathcal{Z}))\to[0,\infty] by

\[
S_{[0,T]}(\varphi|\nu)\coloneqq\int_{[0,T]}\sup_{\alpha\in C_{0}(\mathcal{Z})}\biggl\{\langle\alpha,\dot{\varphi}_{t}-\Lambda_{\varphi_{t}}^{*}\varphi_{t}\rangle-\sum_{(z,z')\in\mathcal{E}}\tau(\alpha(z')-\alpha(z))\lambda_{z,z'}(\varphi_{t})\varphi_{t}(z)\biggr\}\,dt,\tag{2.8}
\]

whenever φ(0)=ν\varphi(0)=\nu and the mapping [0,T]tφ(t)1(𝒵)[0,T]\ni t\mapsto\varphi(t)\in\mathcal{M}_{1}(\mathcal{Z}) is absolutely continuous; S[0,T](φ|ν)=S_{[0,T]}(\varphi|\nu)=\infty otherwise. Define the lower level sets of the functional S[0,T](|ν)S_{[0,T]}(\cdot|\nu) by

Φν[0,T](s){φD([0,T],1(𝒵)):φ(0)=ν,S[0,T](φ|ν)s},s>0,ν1(𝒵).\displaystyle\Phi_{\nu}^{[0,T]}(s)\coloneqq\{\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})):\varphi(0)=\nu,S_{[0,T]}(\varphi|\nu)\leq s\},\,s>0,\,\nu\in\mathcal{M}_{1}(\mathcal{Z}).

The next lemma asserts that these level sets are compact in D([0,T],1(𝒵))D([0,T],\mathcal{M}_{1}(\mathcal{Z})) when the initial conditions belong to a compact subset of 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}). The proof is deferred to Appendix A.

Lemma 2.1.

For each T>0T>0, s>0s>0, and K1(𝒵)K\subset\mathcal{M}_{1}(\mathcal{Z}) compact,

{φD([0,T],1(𝒵)):φ(0)K,S[0,T](φ|φ(0))s}\displaystyle\{\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})):\varphi(0)\in K,S_{[0,T]}(\varphi|\varphi(0))\leq s\}

is a compact subset of D([0,T],1(𝒵))D([0,T],\mathcal{M}_{1}(\mathcal{Z})).

The starting point of our study of the invariant measure asymptotics is the following uniform large deviation principle for the family {μνNN,νN1N(𝒵),N1}\{\mu^{N}_{\nu_{N}},\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}),N\geq 1\} over the class of compact subsets of 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) with the family of rate functions {S[0,T](|ν),ν1(𝒵)}\{S_{[0,T]}(\cdot|\nu),\nu\in\mathcal{M}_{1}(\mathcal{Z})\}. Its proof uses the process-level LDP for μνNN\mu^{N}_{\nu_{N}} studied in Léonard [21] for a fixed initial condition and its extension (when 𝒵\mathcal{Z} is a finite set) to the case when initial conditions converge to a point in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) in Borkar and Sundaresan [7]. The proof can be found in Appendix A.

Theorem 2.1.

Fix $T>0$ and assume (A1), (A2), and (A3). Then the family of $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$-valued random variables $\{\mu^{N}_{\nu_{N}},\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}),N\geq 1\}$ satisfies the uniform large deviation principle over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$ with the family of rate functions $\{S_{[0,T]}(\cdot|\nu),\nu\in\mathcal{M}_{1}(\mathcal{Z})\}$.

The rate function S[0,T](|ν)S_{[0,T]}(\cdot|\nu) admits a non-variational representation in terms of a minimal cost “control” that modulates the transition rates across various edges in \mathcal{E} so that the desired trajectory is obtained. Recall τ\tau^{*} defined in (2.7).

Theorem 2.2 (Non-variational representation; Léonard [22]).

Let φD([0,T],1(𝒵))\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})) be such that S[0,T](φ|φ(0))<S_{[0,T]}(\varphi|\varphi(0))<\infty. Then there exists a measurable function hφ:[0,T]×h_{\varphi}:[0,T]\times\mathcal{E}\to\mathbb{R} such that

φt,ft\displaystyle\langle\varphi_{t},f_{t}\rangle =φ0,f0+[0,t]φu,ufu𝑑u\displaystyle=\langle\varphi_{0},f_{0}\rangle+\int_{[0,t]}\langle\varphi_{u},\partial_{u}f_{u}\rangle du
+[0,t](z,z)(fu(z)fu(z))(1+hφ(u,z,z))λz,z(φu)φu(z)du\displaystyle\qquad+\int_{[0,t]}\sum_{(z,z^{\prime})\in\mathcal{E}}(f_{u}(z^{\prime})-f_{u}(z))(1+h_{\varphi}(u,z,z^{\prime}))\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du (2.9)

holds for all t[0,T]t\in[0,T] and all fC01([0,T]×𝒵)f\in C_{0}^{1}([0,T]\times\mathcal{Z}), and S[0,T](φ|φ(0))S_{[0,T]}(\varphi|\varphi(0)) admits the non-variational representation

S[0,T](φ|φ(0))=[0,T](z,z)τ(hφ(t,z,z))λz,z(φt)φt(z)dt.\displaystyle S_{[0,T]}(\varphi|\varphi(0))=\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}\tau^{*}(h_{\varphi}(t,z,z^{\prime}))\lambda_{z,z^{\prime}}(\varphi_{t})\varphi_{t}(z)dt. (2.10)
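The representation (2.9)–(2.10) can be explored numerically on a truncated state space. The Python sketch below is only an illustration under loud assumptions: the state space is cut to $\{0,\dots,n\}$, the edge set is taken to be $\{(z,z+1)\}\cup\{(z,0)\}$, the rates are hypothetical and independent of the mean field (forward rate $1/(z+1)$, reset rate $1$), and time is discretised by an explicit Euler scheme. Given a control $h$, it integrates the controlled flow in (2.9) and accumulates the cost integrand in (2.10).

```python
import math
from typing import Callable, List, Tuple


def tau_star(u: float) -> float:
    """Convex dual of tau(u) = e^u - u - 1, as in (2.7)."""
    if u < -1.0:
        return math.inf
    if u == -1.0:
        return 1.0
    return (u + 1.0) * math.log(u + 1.0) - u


def rate(z: int, z_next: int) -> float:
    """Hypothetical rates (assumption): 1/(z+1) for z -> z+1, and 1 for z -> 0."""
    return 1.0 / (z + 1) if z_next == z + 1 else 1.0


def controlled_cost(
    phi0: List[float],
    h: Callable[[float, int, int], float],
    T: float = 1.0,
    steps: int = 1000,
) -> Tuple[List[float], float]:
    """Euler-integrate the controlled flow of (2.9) and accumulate the cost (2.10)."""
    n = len(phi0) - 1
    edges = [(z, z + 1) for z in range(n)] + [(z, 0) for z in range(1, n + 1)]
    phi, dt, cost = list(phi0), T / steps, 0.0
    for k in range(steps):
        t = k * dt
        dphi = [0.0] * (n + 1)
        for z, z_next in edges:
            hv = h(t, z, z_next)
            if hv < -1.0:
                raise ValueError("the control must satisfy h >= -1")
            base_flux = rate(z, z_next) * phi[z]      # uncontrolled mass flux
            cost += tau_star(hv) * base_flux * dt     # integrand of (2.10)
            flow = (1.0 + hv) * base_flux             # controlled flux in (2.9)
            dphi[z] -= flow
            dphi[z_next] += flow
        phi = [p + dt * d for p, d in zip(phi, dphi)]
    return phi, cost
```

With $h\equiv 0$ the trajectory follows the uncontrolled dynamics and the accumulated cost is zero, in line with $\tau^{*}(0)=0$; any control that speeds up or slows down an edge carrying mass incurs a strictly positive cost.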
Remark 2.1.

It can be shown that the rate function S[0,T]S_{[0,T]} defined in (2.8) can also be expressed as

\[
\begin{aligned}
S_{[0,T]}(\varphi|\nu)&=\sup_{f\in C_{0}^{1}([0,T]\times\mathcal{Z})}\Biggl\{\langle\varphi_{T},f_{T}\rangle-\langle\varphi_{0},f_{0}\rangle-\int_{[0,T]}\langle\varphi_{u},\partial_{u}f_{u}\rangle\,du\\
&\qquad-\int_{[0,T]}\langle\varphi_{u},L_{\varphi_{u}}f_{u}\rangle\,du-\int_{[0,T]}\sum_{(z,z')\in\mathcal{E}}\tau(f_{u}(z')-f_{u}(z))\lambda_{z,z'}(\varphi_{u})\varphi_{u}(z)\,du\Biggr\},
\end{aligned}\tag{2.11}
\]

φD([0,T],1(𝒵))\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})), see Léonard [22]. This form of the rate function will indeed be used in the proof of the counterexamples in Section 8.

3 Invariant measure: Existence, uniqueness, and exponential tightness

In this section we prove Proposition 1.1, the existence and uniqueness of the invariant measure $\wp^{N}$ for $\mathscr{L}^{N}$ for each $N\geq 1$, and the exponential tightness of the family of invariant measures $\{\wp^{N},N\geq 1\}$. The proof relies on the standard Krylov-Bogolyubov argument and a coupling between the interacting particle system under consideration and a non-interacting system with maximal forward transition rates and minimal backward transition rates.

We first introduce some notation for the non-interacting particle system. Let $\bar{L}$ denote the generator acting on functions $f$ on $\mathcal{Z}$ by

L¯f(z)=z:(z,z)λz,z(f(z)f(z)),z𝒵,\displaystyle\bar{L}f(z)=\sum_{z^{\prime}:(z,z^{\prime})\in\mathcal{E}}\lambda_{z,z^{\prime}}(f(z^{\prime})-f(z)),\,z\in\mathcal{Z}, (3.1)

where λz,z+1=λ¯/(z+1)\lambda_{z,z+1}=\overline{\lambda}/(z+1) and λz,0=λ¯\lambda_{z,0}=\underline{\lambda}. For each z𝒵z\in\mathcal{Z}, let P¯z\bar{P}_{z} denote the solution to the D([0,T],𝒵)D([0,T],\mathcal{Z})-valued martingale problem for L¯\bar{L} with initial condition zz. Integration with respect to P¯z\bar{P}_{z} is denoted by E¯z\bar{E}_{z}. Let π1(𝒵)\pi\in\mathcal{M}_{1}(\mathcal{Z}) denote the unique invariant probability measure for L¯\bar{L}. Let P¯π\bar{P}_{\pi} denote the solution to the martingale problem for L¯\bar{L} with initial law π\pi. Integration with respect to P¯π\bar{P}_{\pi} is denoted by E¯π\bar{E}_{\pi}. By solving the detailed balance equations for L¯\bar{L}, we see that

π(z)π(0)(λ¯λ¯)zk=1z1k,z1.\displaystyle\pi(z)\leq\pi(0)\left(\frac{\overline{\lambda}}{\underline{\lambda}}\right)^{z}\prod_{k=1}^{z}\frac{1}{k},\,\,z\geq 1.

In particular, π(z)\pi(z) has superexponential decay in zz, and E¯π(exp{βϑ(X)})<\bar{E}_{\pi}(\exp\{\beta\vartheta(X)\})<\infty for small enough β>0\beta>0, where ϑ\vartheta is defined in (1.6). Finally, for each N1N\geq 1, let ¯νNN\bar{\mathbb{P}}^{N}_{\nu_{N}} denote the solution to the D([0,T],1N(𝒵))D([0,T],\mathcal{M}_{1}^{N}(\mathcal{Z}))-valued martingale problem for N\mathscr{L}^{N} with initial condition νN1N(𝒵)\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}), λz,z+1(ζ)\lambda_{z,z+1}(\zeta) replaced by λ¯/(z+1)\overline{\lambda}/(z+1) and λz,0(ζ)\lambda_{z,0}(\zeta) replaced by λ¯\underline{\lambda} in (1.1), respectively, for each ζ1(𝒵)\zeta\in\mathcal{M}_{1}(\mathcal{Z}). Integration with respect to ¯νNN\bar{\mathbb{P}}^{N}_{\nu_{N}} is denoted by 𝔼¯νNN\mathbb{\bar{E}}^{N}_{\nu_{N}}. Also, recall νNN\mathbb{P}^{N}_{\nu_{N}}, νN1N(𝒵)\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}), from Section 2.1.1. We are now ready to prove Proposition 1.1.
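The bound on $\pi$ can be checked numerically. The Python sketch below (an added illustration, with hypothetical function names and a truncation level `zmax` that is an assumption) uses the balance equation at each state $z\geq 1$ for the chain with rates $\lambda_{z,z+1}=\overline{\lambda}/(z+1)$ and $\lambda_{z,0}=\underline{\lambda}$, namely $\pi(z)\bigl(\overline{\lambda}/(z+1)+\underline{\lambda}\bigr)=\pi(z-1)\overline{\lambda}/z$, to build $\pi$ recursively, and then compares it with $\pi(0)(\overline{\lambda}/\underline{\lambda})^{z}/z!$.

```python
import math
from typing import List


def invariant_weights(lam_bar: float, lam_under: float, zmax: int) -> List[float]:
    """Invariant law of the non-interacting chain, truncated at zmax (assumption).

    Uses the balance equation at each z >= 1:
        pi(z) * (lam_bar/(z+1) + lam_under) = pi(z-1) * lam_bar / z.
    """
    w = [1.0]
    for z in range(1, zmax + 1):
        w.append(w[-1] * (lam_bar / z) / (lam_bar / (z + 1) + lam_under))
    total = sum(w)
    return [x / total for x in w]


def tail_bound(pi0: float, lam_bar: float, lam_under: float, z: int) -> float:
    """Right-hand side of the bound: pi(0) * (lam_bar/lam_under)^z / z!."""
    return pi0 * (lam_bar / lam_under) ** z / math.factorial(z)
```

Dropping the term $\overline{\lambda}/(z+1)$ from the left-hand side of the balance equation gives $\pi(z)\leq\pi(z-1)\,\overline{\lambda}/(z\underline{\lambda})$, and iterating this inequality yields the factorial decay that the sketch verifies.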

Proof of Proposition 1.1.

Fix N1N\geq 1. We first show the existence and uniqueness of the invariant probability measure for N\mathscr{L}^{N}. Consider the family of probability measures {ηTN,T1}\{\eta^{N}_{T},T\geq 1\} on 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) defined by

ηTN(A)1T0Tδ0N(μN(t)A)𝑑t,A(1(𝒵)),T1.\displaystyle\eta^{N}_{T}(A)\coloneqq\frac{1}{T}\int_{0}^{T}\mathbb{P}^{N}_{\delta_{0}}(\mu^{N}(t)\in A)dt,\,A\in\mathcal{B}(\mathcal{M}_{1}(\mathcal{Z})),\,T\geq 1.

Let XnN(t)X_{n}^{N}(t) denote the state of the nnth particle at time tt. Recall the compact sets 𝒦M\mathscr{K}_{M}, M>0M>0, defined in (2.1). We first couple the laws δ0N\mathbb{P}^{N}_{\delta_{0}} and ¯δ0N\bar{\mathbb{P}}^{N}_{\delta_{0}}. For 𝐳N𝒵N\mathbf{z}^{N}\in\mathcal{Z}^{N}, define emp(𝐳N):=1Nn=1NδznN1N(𝒵)\text{emp}(\mathbf{z}^{N}):=\frac{1}{N}\sum_{n=1}^{N}\delta_{z_{n}^{N}}\in\mathcal{M}_{1}^{N}(\mathcal{Z}). Let 𝐞nN\mathbf{e}_{n}^{N} denote the NN-length vector with a 11 in the nnth position and 0 everywhere else. Consider the Markov process on 𝒵N×𝒵N\mathcal{Z}^{N}\times\mathcal{Z}^{N} with the infinitesimal generator acting on functions ff on 𝒵N×𝒵N\mathcal{Z}^{N}\times\mathcal{Z}^{N} by

\[
\begin{aligned}
(\mathbf{z}^{N},\bar{\mathbf{z}}^{N})\mapsto\sum_{n=1}^{N}\biggl[&\left(f(\mathbf{z}^{N}+\mathbf{e}_{n}^{N},\bar{\mathbf{z}}^{N}+\mathbf{e}_{n}^{N})-f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N})\right)\left(\lambda_{z_{n}^{N},z_{n}^{N}+1}(\text{emp}(\mathbf{z}^{N}))\wedge\frac{\overline{\lambda}}{\bar{z}_{n}^{N}+1}\right)\\
&+\left(f(\mathbf{z}^{N}+\mathbf{e}_{n}^{N},\bar{\mathbf{z}}^{N})-f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N})\right)\left(\lambda_{z_{n}^{N},z_{n}^{N}+1}(\text{emp}(\mathbf{z}^{N}))-\frac{\overline{\lambda}}{\bar{z}_{n}^{N}+1}\right)^{+}\\
&+\left(f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N}+\mathbf{e}_{n}^{N})-f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N})\right)\left(\frac{\overline{\lambda}}{\bar{z}_{n}^{N}+1}-\lambda_{z_{n}^{N},z_{n}^{N}+1}(\text{emp}(\mathbf{z}^{N}))\right)^{+}\\
&+\left(f(\mathbf{z}^{N}-z^{N}_{n}\mathbf{e}_{n}^{N},\bar{\mathbf{z}}^{N}-\bar{z}_{n}^{N}\mathbf{e}_{n}^{N})-f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N})\right)\left(\lambda_{z_{n}^{N},0}(\text{emp}(\mathbf{z}^{N}))\wedge\underline{\lambda}\right)\mathbf{1}_{\{z_{n}^{N}>0,\bar{z}_{n}^{N}>0\}}\\
&+\left(f(\mathbf{z}^{N}-z^{N}_{n}\mathbf{e}_{n}^{N},\bar{\mathbf{z}}^{N})-f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N})\right)\left(\lambda_{z_{n}^{N},0}(\text{emp}(\mathbf{z}^{N}))-\underline{\lambda}\right)^{+}\mathbf{1}_{\{z_{n}^{N}>0\}}\\
&+\left(f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N}-\bar{z}^{N}_{n}\mathbf{e}_{n}^{N})-f(\mathbf{z}^{N},\bar{\mathbf{z}}^{N})\right)\left(\underline{\lambda}-\lambda_{z_{n}^{N},0}(\text{emp}(\mathbf{z}^{N}))\right)^{+}\mathbf{1}_{\{\bar{z}_{n}^{N}>0\}}\biggr].
\end{aligned}
\]

Such couplings were studied for continuous-time Markov chains; see, e.g., [27]. Note that, under the above Markov process, for any two initial conditions $\nu_{N},\bar{\nu}_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z})$, the empirical measure flow associated with the first (resp. second) marginal has law $\mathbb{P}^{N}_{\nu_{N}}$ (resp. $\bar{\mathbb{P}}^{N}_{\bar{\nu}_{N}}$). Therefore, for any $t>1$, $M>1$, and $\beta>0$, we have

\begin{align*}
\mathbb{P}^{N}_{\delta_{0}}(\mu^{N}(t)\notin\mathscr{K}_{M}) &\leq\bar{\mathbb{P}}^{N}_{\delta_{0}}(\mu^{N}(t)\notin\mathscr{K}_{M}) \\
&=\bar{\mathbb{P}}^{N}_{\delta_{0}}\left(\sum_{n=1}^{N}\vartheta(X_{n}^{N}(t))>NM\right) \\
&\leq\exp\{-NM\beta\}\,\bar{\mathbb{E}}^{N}_{\delta_{0}}\left(\exp\left\{\beta\sum_{n=1}^{N}\vartheta(X_{n}^{N}(t))\right\}\right) \\
&=\exp\{-NM\beta\}\left(\bar{E}_{0}(\exp\{\beta\vartheta(X_{1}^{N}(t))\})\right)^{N}, \tag{3.2}
\end{align*}

where the first inequality follows from the above coupling since (i) the $n$th particle under $\bar{\mathbb{P}}^{N}_{\delta_{0}}$ moves from $z$ to $z+1$ whenever it does so under $\mathbb{P}^{N}_{\delta_{0}}$, and (ii) the $n$th particle under $\mathbb{P}^{N}_{\delta_{0}}$ moves to $0$ (i.e., makes a $z$ to $0$ transition for some $z$) whenever it does so under $\bar{\mathbb{P}}^{N}_{\delta_{0}}$. The second inequality in (3.2) is a consequence of Chebyshev's inequality in its exponential form, and the last equality holds because the particles evolve independently and identically under $\bar{\mathbb{P}}^{N}_{\delta_{0}}$. Recall $\pi$ and the laws $\bar{P}_{\pi}$ and $\bar{P}_{0}$. We now couple the laws $\bar{P}_{\pi}$ and $\bar{P}_{0}$. Consider the Markov process on $\mathcal{Z}\times\mathcal{Z}$ with the infinitesimal generator acting on functions $f$ on $\mathcal{Z}\times\mathcal{Z}$ by

\begin{align*}
(\bar{z}_{1},\bar{z}_{2}) &\mapsto\left(f(\bar{z}_{1}+1,\bar{z}_{2}+1)-f(\bar{z}_{1},\bar{z}_{2})\right)\left(\frac{\overline{\lambda}}{\bar{z}_{1}+1}\wedge\frac{\overline{\lambda}}{\bar{z}_{2}+1}\right) \\
&\qquad+\left(f(\bar{z}_{1}+1,\bar{z}_{2})-f(\bar{z}_{1},\bar{z}_{2})\right)\left(\frac{\overline{\lambda}}{\bar{z}_{1}+1}-\frac{\overline{\lambda}}{\bar{z}_{2}+1}\right)^{+} \\
&\qquad+\left(f(\bar{z}_{1},\bar{z}_{2}+1)-f(\bar{z}_{1},\bar{z}_{2})\right)\left(\frac{\overline{\lambda}}{\bar{z}_{2}+1}-\frac{\overline{\lambda}}{\bar{z}_{1}+1}\right)^{+} \\
&\qquad+\left(f(0,0)-f(\bar{z}_{1},\bar{z}_{2})\right)\underline{\lambda}\mathbf{1}_{\{\bar{z}_{1}>0,\bar{z}_{2}>0\}} \\
&\qquad+\left(f(0,\bar{z}_{2})-f(\bar{z}_{1},\bar{z}_{2})\right)\underline{\lambda}\mathbf{1}_{\{\bar{z}_{1}>0,\bar{z}_{2}=0\}} \\
&\qquad+\left(f(\bar{z}_{1},0)-f(\bar{z}_{1},\bar{z}_{2})\right)\underline{\lambda}\mathbf{1}_{\{\bar{z}_{1}=0,\bar{z}_{2}>0\}}.
\end{align*}
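The coupled generator above can be simulated directly. The following Python sketch is illustrative only and is not part of the original argument: the function names `coupled_rates` and `simulate`, the numerical values chosen for $\overline{\lambda}$ and $\underline{\lambda}$, and the random seed are all assumptions made for the example. It runs a Gillespie-type simulation of the pair of components and records the visited states, so that the order between the two components can be inspected along a sample path.

```python
import random

def coupled_rates(z1, z2, lam_up, lam_down):
    """Return (rate, new_state) pairs of the coupled generator at (z1, z2)."""
    r1 = lam_up / (z1 + 1)   # forward rate of the first component
    r2 = lam_up / (z2 + 1)   # forward rate of the second component
    moves = [
        (min(r1, r2), (z1 + 1, z2 + 1)),      # joint forward jump
        (max(r1 - r2, 0.0), (z1 + 1, z2)),    # first component moves alone
        (max(r2 - r1, 0.0), (z1, z2 + 1)),    # second component moves alone
    ]
    if z1 > 0 and z2 > 0:
        moves.append((lam_down, (0, 0)))      # joint reset to state 0
    elif z1 > 0:
        moves.append((lam_down, (0, z2)))     # only the first component resets
    elif z2 > 0:
        moves.append((lam_down, (z1, 0)))     # only the second component resets
    return [(r, s) for r, s in moves if r > 0.0]

def simulate(z1, z2, lam_up, lam_down, horizon, rng):
    """Gillespie simulation up to time `horizon`; returns the visited states."""
    t, path = 0.0, [(z1, z2)]
    while True:
        moves = coupled_rates(z1, z2, lam_up, lam_down)
        total = sum(r for r, _ in moves)
        t += rng.expovariate(total)           # exponential holding time
        if t > horizon:
            return path
        u, acc = rng.random() * total, 0.0
        for r, s in moves:                    # pick a move proportionally to its rate
            acc += r
            if u <= acc:
                z1, z2 = s
                break
        path.append((z1, z2))

# Hypothetical parameters: lam_up = 2.0, lam_down = 1.0, start at (5, 0).
path = simulate(5, 0, 2.0, 1.0, 50.0, random.Random(0))
order_preserved = all(a >= b for a, b in path)
```

In this sketch the lower component jumps alone only when it is strictly below the other one, so it can catch up with the higher component but never overtake it.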

Note that, when the initial condition has law $(\pi,\delta_{0})$, the first (resp. second) component under the above process has law $\bar{P}_{\pi}$ (resp. $\bar{P}_{0}$). Also, note that if $\bar{X}_{1}(0)\geq\bar{X}_{2}(0)$ then $\bar{X}_{1}(s)\geq\bar{X}_{2}(s)$ for all $s$ under the above coupling. Since the first component is at least the second component under the initial law $(\pi,\delta_{0})$, it follows that $\bar{E}_{0}(\exp\{\beta\vartheta(X_{1}^{N}(t))\})\leq\bar{E}_{\pi}(\exp\{\beta\vartheta(X_{1}^{N}(t))\})$. The latter is finite for sufficiently small $\beta>0$, thanks to the $\exp\{-\vartheta(z)\}$ decay of the probability measure $\pi$ on $\mathcal{Z}$. Thus we can choose $\bar{\beta}>0$ small enough (independent of $M$) so that $\log\bar{E}_{\pi}(\exp\{\bar{\beta}\vartheta(X_{1}^{N}(t))\})<1$. Hence (3.2) implies that

\begin{align*}
\mathbb{P}^{N}_{\delta_{0}}(\mu^{N}(t)\notin\mathscr{K}_{M}) &\leq\exp\{-N(M\bar{\beta}-1)\}.
\end{align*}
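The exponential Chebyshev step in (3.2) can be sanity-checked on a toy model. The sketch below is an illustration under stated assumptions, not the setting of the paper: it replaces $\vartheta(X_{n}^{N}(t))$ by i.i.d. Bernoulli variables so that the tail probability is available exactly, and the names `chernoff_bound` and `binomial_tail` as well as all numerical values are hypothetical.

```python
import math

def chernoff_bound(n, m, beta, mgf_one):
    """exp(-n*m*beta) * mgf_one**n, the right-hand side of the Chernoff step."""
    return math.exp(-n * m * beta) * mgf_one ** n

def binomial_tail(n, p, threshold):
    """Exact P(S > threshold) for S ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1) if k > threshold)

# Toy stand-in: Y_1, ..., Y_n i.i.d. Bernoulli(p) play the role of the summands.
n, p, m, beta = 40, 0.2, 0.5, 1.0
mgf_one = 1 - p + p * math.exp(beta)       # E exp(beta * Y_1)
exact = binomial_tail(n, p, n * m)         # P(Y_1 + ... + Y_n > n*m)
bound = chernoff_bound(n, m, beta, mgf_one)

# When log(mgf_one) < 1, the bound is in turn at most exp(-n*(m*beta - 1)),
# which mirrors the passage from (3.2) to the display above.
relaxed = math.exp(-n * (m * beta - 1))
```

Here `exact <= bound <= relaxed` holds, which is the same chain of inequalities used in the text with $\beta=\bar{\beta}$.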

Therefore, for any $M>0$ and $T\geq 1$, we get

\begin{align*}
\eta^{N}_{T}(\sim\mathscr{K}_{M})\leq\exp\{-N(M\bar{\beta}-1)\}. \tag{3.3}
\end{align*}

Since $\mathscr{K}_{M}$ is a compact subset of $\mathcal{M}_{1}(\mathcal{Z})$, this shows that the family $\{\eta^{N}_{T},T\geq 1\}$ is tight. Hence there exists an invariant probability measure $\wp^{N}$ for $\mathscr{L}^{N}$ (see, for example, Ethier and Kurtz [16, Theorem 9.3, page 240]). By Assumption (A1), $\mu^{N}$ is an irreducible Markov process; hence $\wp^{N}$ is the unique invariant probability measure for $\mathscr{L}^{N}$.

We now show the exponential tightness of the family $\{\wp^{N},N\geq 1\}$. Let $M>0$ be given, and choose $M^{\prime}=(M+1)/\bar{\beta}$, so that $M^{\prime}\bar{\beta}-1=M$. For each $N\geq 1$, since $\wp^{N}$ is a weak limit of the family $\{\eta^{N}_{T},T\geq 1\}$ as $T\to\infty$ and $\sim\mathscr{K}_{M^{\prime}}$ is open, from (3.3) with $M$ replaced by $M^{\prime}$, it follows that

\begin{align*}
\wp^{N}(\sim\mathscr{K}_{M^{\prime}})\leq\liminf_{T\to\infty}\eta^{N}_{T}(\sim\mathscr{K}_{M^{\prime}})\leq\exp\{-NM\} \tag{3.4}
\end{align*}

for each $N\geq 1$. Hence,

\begin{align*}
\limsup_{N\to\infty}\frac{1}{N}\log\wp^{N}(\sim\mathscr{K}_{M^{\prime}})\leq-M,
\end{align*}

which establishes that the family $\{\wp^{N},N\geq 1\}$ is exponentially tight. This completes the proof of the proposition. ∎

4 The LDP lower bound

In this section we prove the LDP lower bound for the family $\{\wp^{N},N\geq 1\}$. To lower bound the probability of a small neighbourhood of a point $\xi$ under $\wp^{N}$, we first produce a trajectory that starts in $\mathscr{K}_{M}$ for a suitable $M>0$, connects to $\xi^{*}$ with a small cost, and then reaches $\xi$ from $\xi^{*}$ with cost arbitrarily close to $V(\xi)$, where $V$ is the quasipotential defined in (1.3). The probability of a small neighbourhood of $\xi$ under $\wp^{N}$ is then lower bounded by the probability that the process $\mu^{N}$ remains in a small neighbourhood of the trajectory constructed above. The latter is then lower bounded using the uniform LDP lower bound for $\mu^{N}$, where the uniformity is over the initial condition lying in a given compact subset of $\mathcal{M}_{1}(\mathcal{Z})$.

Recall $K(\Delta)$ defined in (2.2). We begin with a lemma that allows us to connect points in $K(\Delta)$ to $\xi^{*}$ for small enough $\Delta$ with small cost. We omit its proof here, since it follows from a certain continuity property of $V$ which will be shown in Lemma 5.3.

Lemma 4.1.

Given $\gamma>0$ there exist $\Delta>0$ and $T=T(\Delta)>0$ such that for any $\zeta\in K(\Delta)$ there exists a trajectory $\varphi$ on $[0,T]$ such that $\varphi(0)=\zeta$, $\varphi(T)=\xi^{*}$, and $S_{[0,T]}(\varphi|\zeta)\leq\gamma$.

We now prove the LDP lower bound for the family $\{\wp^{N},N\geq 1\}$.

Lemma 4.2.

For any $\gamma>0$, $\delta>0$, and $\xi\in\mathcal{M}_{1}(\mathcal{Z})$, there exists $N_{0}\geq 1$ such that

\begin{align*}
\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}):d(\zeta,\xi)<\delta\}\geq\exp\{-N(V(\xi)+\gamma)\} \tag{4.1}
\end{align*}

for all $N\geq N_{0}$.

Proof.

Fix $\gamma>0$, $\delta>0$, and $\xi\in\mathcal{M}_{1}(\mathcal{Z})$. We may assume that $V(\xi)<\infty$; if $V(\xi)=\infty$ then (4.1) trivially holds for all $N\geq 1$. Choose some $M>0$ and $N_{1}\geq 1$ such that $\wp^{N}(\mathscr{K}_{M})\geq 1/2$ for all $N\geq N_{1}$; this is possible from the exponential tightness of the family $\{\wp^{N},N\geq 1\}$, see Proposition 1.1. Using Lemma 4.1, choose $\varepsilon>0$ and $T_{0}>0$ such that for any $\zeta_{1}\in K(\varepsilon)$ there exists a trajectory $\varphi_{1}$ on $[0,T_{0}]$ such that $\varphi_{1}(0)=\zeta_{1}$, $\varphi_{1}(T_{0})=\xi^{*}$, and $S_{[0,T_{0}]}(\varphi_{1}|\zeta_{1})\leq\gamma/4$. Since $\xi^{*}$ is the globally asymptotically stable equilibrium for (1.2) and since $\mathscr{K}_{M}$ is compact, for the above $\varepsilon>0$, there exists a $T_{1}>0$ such that for any $\zeta\in\mathscr{K}_{M}$ we have $\mu_{\zeta}(T_{1})\in K(\varepsilon)$, where $\mu_{\zeta}$ denotes the solution to the McKean-Vlasov equation (1.2) with initial condition $\zeta$ (see assumption (B2)). Also, by the definition of $V(\xi)$, there exist a $T_{2}>0$ and a trajectory $\varphi_{2}$ such that $\varphi_{2}(0)=\xi^{*}$, $\varphi_{2}(T_{2})=\xi$, and $S_{[0,T_{2}]}(\varphi_{2}|\xi^{*})\leq V(\xi)+\gamma/4$. Let $T=T_{1}+T_{0}+T_{2}$. Given $\zeta\in\mathscr{K}_{M}$, we construct a trajectory $\varphi_{\zeta}$ on $[0,T]$ by using the above three trajectories as follows. Let $\varphi_{\zeta}(0)=\zeta$; $\varphi_{\zeta}(t)=\mu_{\zeta}(t)$ for $t\in[0,T_{1}]$; $\varphi_{\zeta}(t)=\varphi_{1}(t-T_{1})$ for $t\in(T_{1},T_{1}+T_{0}]$, where $\varphi_{1}$ is the trajectory above with $\zeta_{1}=\mu_{\zeta}(T_{1})$; and $\varphi_{\zeta}(t)=\varphi_{2}(t-(T_{1}+T_{0}))$ for $t\in(T_{1}+T_{0},T]$. Since the first segment follows the McKean-Vlasov flow and hence incurs zero cost, while the remaining two segments cost at most $\gamma/4$ and $V(\xi)+\gamma/4$ respectively, we have $S_{[0,T]}(\varphi_{\zeta}|\zeta)\leq V(\xi)+\gamma/2$.
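The gluing of the three segments into $\varphi_{\zeta}$, with time shifts and additive costs, can be sketched in code. The Python fragment below is only a schematic illustration: trajectories are represented by real-valued functions rather than paths in $\mathcal{M}_{1}(\mathcal{Z})$, and the name `concatenate`, the toy segments, and the numerical values of $\gamma$ and $V(\xi)$ are assumptions made for the example.

```python
def concatenate(segments):
    """Glue segments given as (duration, cost, path), each path on [0, duration].

    Returns (total_duration, total_cost, glued_path). The glued path follows
    each segment in turn, shifting time by the durations already elapsed."""
    total = sum(d for d, _, _ in segments)
    total_cost = sum(c for _, c, _ in segments)

    def glued(t):
        if not 0.0 <= t <= total:
            raise ValueError("t outside [0, total]")
        start = 0.0
        for d, _, path in segments:
            if t <= start + d:
                return path(t - start)
            start += d
        # Only reached through floating-point round-off at the right endpoint.
        return segments[-1][2](segments[-1][0])

    return total, total_cost, glued

# Toy stand-ins with matching endpoints (hypothetical values):
gamma, V = 0.4, 1.0
segments = [
    (2.0, 0.0, lambda t: 3.0 - t),             # zero-cost flow segment
    (1.0, gamma / 4, lambda t: 1.0 - t),       # short connection of cost <= gamma/4
    (1.5, V + gamma / 4, lambda t: 2.0 * t),   # near-optimal segment for V
]
T, cost, phi = concatenate(segments)
```

With these toy values the total cost equals $V+\gamma/2$ up to round-off, matching the bound on $S_{[0,T]}(\varphi_{\zeta}|\zeta)$ above.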

Recall that $d$ is the metric on $\mathcal{M}_{1}(\mathcal{Z})$ and $\rho$ is the metric on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$. Note that we can choose a $\delta^{\prime}>0$ (depending on $T$ and $M$) such that $\rho(\varphi,\varphi_{\zeta})<\delta^{\prime}$ implies that $d(\varphi(T),\varphi_{\zeta}(T))<\delta$ for any $\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ and $\zeta\in\mathscr{K}_{M}$. Indeed, if such a choice is not possible, then there exist a sequence $\{\zeta_{n}\}\subset\mathscr{K}_{M}$ and a sequence of trajectories $\{\varphi_{n}\}\subset D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ such that $S_{[0,T]}(\varphi_{\zeta_{n}}|\zeta_{n})\leq V(\xi)+\gamma/2$ and $\rho(\varphi_{n},\varphi_{\zeta_{n}})<1/n$ for each $n\geq 1$, but $d(\varphi_{n}(T),\varphi_{\zeta_{n}}(T))>\delta$. By the compactness of the level sets of $S_{[0,T]}$ in Lemma 2.1, it follows that there exists a subsequential limit for $\{\varphi_{\zeta_{n_{k}}}\}_{k\geq 1}$ (say, $\varphi^{*}$); since $\rho(\varphi_{n},\varphi_{\zeta_{n}})<1/n$, $\varphi_{n_{k}}$ also converges to $\varphi^{*}$ in $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ as $k\to\infty$. Furthermore, since $S_{[0,T]}(\varphi^{*}|\varphi^{*}_{0})<\infty$, from Theorem 2.2, we have that $[0,T]\ni t\mapsto\varphi^{*}(t)$ is continuous. Since $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))\ni\varphi\mapsto\varphi(T)$ is continuous at all $\varphi$ such that $t\mapsto\varphi(t)$ is continuous (see, e.g., [4, page 124]), it follows that $d(\varphi_{n_{k}}(T),\varphi^{*}(T))\to 0$ and $d(\varphi_{\zeta_{n_{k}}}(T),\varphi^{*}(T))\to 0$ as $k\to\infty$. This contradicts the assumption $d(\varphi_{n}(T),\varphi_{\zeta_{n}}(T))>\delta$.
This shows that we can choose a $\delta^{\prime}>0$ such that $\rho(\varphi,\varphi_{\zeta})<\delta^{\prime}$ implies that $d(\varphi(T),\varphi_{\zeta}(T))<\delta$ for any $\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ and $\zeta\in\mathscr{K}_{M}$. Therefore, for each $N\geq N_{1}$, we have

\begin{align*}
\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}):d(\zeta,\xi)<\delta\} &=\int_{\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(d(\mu^{N}(T),\xi)<\delta)\,\wp^{N}(d\zeta) \\
&\geq\int_{\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(d(\mu^{N}(T),\xi)<\delta)\,\wp^{N}(d\zeta) \\
&\geq\int_{\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(\rho(\mu^{N},\varphi_{\zeta})<\delta^{\prime})\,\wp^{N}(d\zeta) \\
&\geq\frac{1}{2}\inf_{\zeta\in\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(\rho(\mu^{N},\varphi_{\zeta})<\delta^{\prime}); \tag{4.2}
\end{align*}

here the first equality follows since $\wp^{N}$ is invariant under time shifts. By the uniform LDP lower bound in Theorem 2.1, there exists $N_{2}\geq N_{1}$ such that

\begin{align*}
\mathbb{P}^{N}_{\zeta}(\rho(\mu^{N},\varphi)<\delta^{\prime})\geq\exp\{-N(S_{[0,T]}(\varphi|\zeta)+\gamma/4)\}
\end{align*}

for all $\zeta\in\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})$, $\varphi\in\Phi_{\zeta}^{[0,T]}(V(\xi)+\gamma/2)$, and $N\geq N_{2}$. Noting that $S_{[0,T]}(\varphi_{\zeta}|\zeta)\leq V(\xi)+\gamma/2$ for any $\zeta\in\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})$, and using the above uniform LDP lower bound, (4.2) becomes

\begin{align*}
\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}):d(\zeta,\xi)<\delta\} &\geq\frac{1}{2}\exp\{-N(V(\xi)+3\gamma/4)\}
\end{align*}

for all $N\geq N_{2}$. Finally, choose $N_{0}\geq N_{2}$ so that $1/2\geq\exp\{-N\gamma/4\}$ for all $N\geq N_{0}$. Then the above becomes

\begin{align*}
\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}):d(\zeta,\xi)<\delta\} &\geq\exp\{-N(V(\xi)+\gamma)\}
\end{align*}

for all $N\geq N_{0}$. This completes the proof of the LDP lower bound for the family $\{\wp^{N},N\geq 1\}$. ∎
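The last step of the proof only requires $N\gamma/4\geq\log 2$. As a small worked check (a sketch, with the function name `smallest_n0` and the sample values being assumptions for illustration), the threshold can be computed explicitly:

```python
import math

def smallest_n0(gamma, n2):
    """Smallest N0 >= n2 such that 1/2 >= exp(-N * gamma / 4) for all N >= N0."""
    # 1/2 >= exp(-N*gamma/4) is equivalent to N >= 4*log(2)/gamma.
    return max(n2, math.ceil(4.0 * math.log(2.0) / gamma))

# Hypothetical values: gamma = 0.1 and N_2 = 1.
n0 = smallest_n0(0.1, 1)
```

For instance, with $\gamma=0.1$ the factor $1/2$ is absorbed into the exponent as soon as $N\geq 28$.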

5 Properties of the quasipotential

In this section we prove three key properties of the quasipotential $V$ defined in (1.3). These three properties are (i) a characterisation of the set of points for which $V$ is finite, (ii) a certain continuity property for $V$, and (iii) the compactness of the lower level sets of $V$. These properties play an important role in the proof of the LDP upper bound in Section 6.

5.1 A characterisation of finiteness of the quasipotential

Recall the function $\vartheta$ defined in (1.6) and the compact sets $\mathscr{K}_{M}$, $M>0$, defined in (2.1). We start with a lemma that enables us to connect $\delta_{0}$, the point mass at state $0$, to a point $\xi\in\mathscr{K}_{M}$ for some $M>0$. This connection is made using a piecewise constant velocity trajectory wherein for each $z\geq 1$, we move the mass $\xi(z)$ from state $0$ to state $z$ in $z$ steps; in the $k$th step, we move the mass $\xi(z)$ from state $k-1$ to state $k$ with unit velocity. The lemma asserts that the cost of this piecewise constant velocity trajectory is bounded above by a constant that depends only on $M$.

Lemma 5.1.

Given $M>0$ there exists a constant $C_{M}$ depending on $M$ such that for any $\xi\in\mathscr{K}_{M}$ there exist a $T>0$ and a trajectory $\varphi$ on $[0,T]$ such that $\varphi(0)=\delta_{0}$, $\varphi(T)=\xi$, and $S_{[0,T]}(\varphi|\delta_{0})\leq C_{M}$.

Proof.

Fix $M>0$ and $\xi\in\mathscr{K}_{M}$. Fix $J\in\mathcal{Z}\setminus\{0\}$ and define $\mathcal{Z}_{J}=\{1,2,\ldots,J\}$, $t_{z}=z\xi(z)$ for $z\in\mathcal{Z}_{J}$, and $T_{z}=\sum_{z^{\prime}\in\mathcal{Z}_{J},z^{\prime}\geq z}t_{z^{\prime}}$. Note that $T_{J}\leq T_{J-1}\leq\cdots\leq T_{1}$. We shall first construct a trajectory $\varphi^{J}$ such that $\varphi^{J}(0)=\delta_{0}$, $\varphi^{J}(T_{1})(z)=\xi(z)$ for each $z\in\mathcal{Z}_{J}$, and $S_{[0,T_{1}]}(\varphi^{J}|\delta_{0})$ is bounded above by a constant independent of $J$.

Let $T_{J+1}=0$. For each $z\in\mathcal{Z}_{J}$, starting with $z=J$, we move the mass $\xi(z)$ from the state $0$ to state $z$ using a piecewise unit velocity trajectory over the time duration $(T_{z+1},T_{z+1}+t_{z}]$. We define this trajectory $\varphi^{J}$ on $[0,T_{1}]$ as follows. Let $\varphi^{J}_{0}=\delta_{0}$. For each $z\in\mathcal{Z}_{J}$ and $1\leq k\leq z$, when $t\in(T_{z+1}+(k-1)\xi(z),T_{z+1}+k\xi(z)]$, let

\begin{align*}
\dot{\varphi}^{J}_{t}(l)=\left\{\begin{array}{ll}1&\text{ if }l=k,\\ -1&\text{ if }l=k-1,\\ 0&\text{ otherwise},\end{array}\right.
\end{align*}

$l\in\mathcal{Z}$, and define $\varphi^{J}_{t}(l)=\delta_{0}(l)+\int_{[0,t]}\dot{\varphi}^{J}_{u}(l)du$, $l\in\mathcal{Z}$, $t\in[0,T_{1}]$.
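The construction of $\varphi^{J}$ can be traced step by step on a small example. The Python sketch below is illustrative and makes simplifying assumptions: $\xi$ has finite support $\{0,\ldots,J\}$, masses are exact rationals, and the name `build_trajectory` and the sample $\xi$ are hypothetical. It records the state of the trajectory at each breakpoint $T_{z+1}+k\xi(z)$.

```python
from fractions import Fraction

def build_trajectory(xi):
    """xi[z] is the target mass at state z for z = 0..J (Fractions summing to 1).

    Returns (breakpoints, states): states[i] is the mass vector at time
    breakpoints[i] along the piecewise unit-velocity path started at delta_0."""
    J = len(xi) - 1
    phi = [Fraction(0)] * (J + 1)
    phi[0] = Fraction(1)                  # start from the point mass at 0
    t = Fraction(0)
    breakpoints, states = [t], [list(phi)]
    for z in range(J, 0, -1):             # fill the farthest state first
        for k in range(1, z + 1):         # k-th step: move xi[z] from k-1 to k
            phi[k - 1] -= xi[z]
            phi[k] += xi[z]
            t += xi[z]                    # unit velocity, so the step lasts xi[z]
            breakpoints.append(t)
            states.append(list(phi))
    return breakpoints, states

# Hypothetical target with J = 3.
xi = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
breakpoints, states = build_trajectory(xi)
total_time = breakpoints[-1]              # equals the sum of z * xi[z]
```

In this example the path ends at $\xi$ after total time $\sum_{z}z\xi(z)=7/8$, and every intermediate state is a probability vector.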

We now calculate the cost of this trajectory. Fix $z\in\mathcal{Z}$ such that $\xi(z)>0$, and let $1\leq k\leq z$. For each $t\in(T_{z+1}+(k-1)\xi(z),T_{z+1}+k\xi(z))$ and $\alpha\in C_{0}(\mathcal{Z})$, note that

\begin{align*}
\langle\alpha,\dot{\varphi}^{J}_{t}-\Lambda_{\varphi^{J}_{t}}^{*}\varphi^{J}_{t}\rangle &-\sum_{(z,z^{\prime})\in\mathcal{E}}\tau(\alpha(z^{\prime})-\alpha(z))\lambda_{z,z^{\prime}}(\varphi^{J}_{t})\varphi^{J}_{t}(z) \\
&=(\alpha(k)-\alpha(k-1))-\sum_{(z,z^{\prime})\in\mathcal{E}}(\exp\{\alpha(z^{\prime})-\alpha(z)\}-1)\lambda_{z,z^{\prime}}(\varphi^{J}_{t})\varphi^{J}_{t}(z).
\end{align*}

Hence,

\begin{align*}
\sup_{\alpha\in C_{0}(\mathcal{Z})}\biggl\{\langle\alpha,\dot{\varphi}^{J}_{t}-\Lambda_{\varphi^{J}_{t}}^{*}\varphi^{J}_{t}\rangle &-\sum_{(z,z^{\prime})\in\mathcal{E}}\tau(\alpha(z^{\prime})-\alpha(z))\lambda_{z,z^{\prime}}(\varphi^{J}_{t})\varphi^{J}_{t}(z)\biggr\} \\
&\leq\sup_{x\in\mathbb{R}}\left(x-(\exp\{x\}-1)\lambda_{k-1,k}(\varphi^{J}_{t})\varphi^{J}_{t}(k-1)\right) \\
&\qquad+\sup_{\alpha\in C_{0}(\mathcal{Z})}\left(-\sum_{(z,z^{\prime})\in\mathcal{E};(z,z^{\prime})\neq(k-1,k)}(\exp\{\alpha(z^{\prime})-\alpha(z)\}-1)\lambda_{z,z^{\prime}}(\varphi^{J}_{t})\varphi^{J}_{t}(z)\right) \\
&\leq\log\left(\frac{1}{\varphi^{J}_{t}(k-1)\lambda_{k-1,k}(\varphi^{J}_{t})}\right)+2\overline{\lambda} \\
&\leq\log\left(\frac{1}{\varphi^{J}_{t}(k-1)}\right)+\log k+\log\left(\frac{1}{\underline{\lambda}}\right)+2\overline{\lambda}, \tag{5.1}
\end{align*}

where the last two inequalities follow from assumption (A2). Consider the first term above. For $k>1$, integration of this quantity over the time duration $t\in(T_{z+1}+(k-1)\xi(z),T_{z+1}+k\xi(z))$ gives

\begin{align*}
\int_{(T_{z+1}+(k-1)\xi(z),T_{z+1}+k\xi(z))}\,\log\left(\frac{1}{\varphi^{J}_{t}(k-1)}\right)dt &=-\int_{\xi(z)}^{0}\log\left(\frac{1}{u}\right)\,du \\
&=(u\log u-u)\biggr|_{\xi(z)}^{0} \\
&=\xi(z)\log\left(\frac{1}{\xi(z)}\right)+\xi(z),
\end{align*}

where the first equality follows from the variable change $u=\varphi^{J}_{t}(k-1)$ and the facts (i) $\dot{\varphi}^{J}_{t}(k-1)=-1$, (ii) $\varphi^{J}_{t}(k-1)=\xi(z)$ when $t=T_{z+1}+(k-1)\xi(z)$, (iii) $\varphi^{J}_{t}(k-1)=0$ when $t=T_{z+1}+k\xi(z)$, and (iv) $du=-dt$. For $k=1$, using the bound $\varphi^{J}_{t}(0)\geq\varphi^{J}_{t}(0)-(1-\sum_{z^{\prime}=z}^{J}\xi(z^{\prime}))$, we get

\begin{align*}
\int_{(T_{z+1},T_{z+1}+\xi(z))}\log\left(\frac{1}{\varphi^{J}_{t}(0)}\right)dt &\leq\int_{(T_{z+1},T_{z+1}+\xi(z))}\log\left(\frac{1}{\varphi^{J}_{t}(0)-(1-\sum_{z^{\prime}=z}^{J}\xi(z^{\prime}))}\right)dt \\
&=-\int_{\xi(z)}^{0}\log\left(\frac{1}{u}\right)\,du,
\end{align*}

where the last equality follows from the variable change $u=\varphi^{J}_{t}(0)-(1-\sum_{z^{\prime}=z}^{J}\xi(z^{\prime}))$, and the facts (i) $\dot{\varphi}^{J}_{t}(0)=-1$, (ii) $\varphi^{J}_{t}(0)=1-\sum_{z^{\prime}=z+1}^{J}\xi(z^{\prime})$ when $t=T_{z+1}$ so that $\varphi^{J}_{t}(0)-(1-\sum_{z^{\prime}=z}^{J}\xi(z^{\prime}))=\xi(z)$ when $t=T_{z+1}$, (iii) $\varphi^{J}_{t}(0)=1-\sum_{z^{\prime}=z}^{J}\xi(z^{\prime})$ when $t=T_{z+1}+\xi(z)$ so that $\varphi^{J}_{t}(0)-(1-\sum_{z^{\prime}=z}^{J}\xi(z^{\prime}))=0$ when $t=T_{z+1}+\xi(z)$, and (iv) $du=-dt$. Thus, proceeding as before for the case $k>1$, we arrive at

\begin{align*}
\int_{(T_{z+1},T_{z+1}+\xi(z))}\log\left(\frac{1}{\varphi^{J}_{t}(0)}\right)dt\leq\xi(z)\log\left(\frac{1}{\xi(z)}\right)+\xi(z).
\end{align*}
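The closed form $\int_{0}^{\xi(z)}\log(1/u)\,du=\xi(z)\log(1/\xi(z))+\xi(z)$ used in both cases can be checked numerically. The following Python sketch is a rough verification only; the midpoint rule, the step count, and the names `log_inverse_integral` and `closed_form` are choices made for this illustration.

```python
import math

def log_inverse_integral(mass, steps=100000):
    """Midpoint-rule approximation of the integral of log(1/u) over (0, mass)."""
    h = mass / steps
    return sum(math.log(1.0 / ((i + 0.5) * h)) for i in range(steps)) * h

def closed_form(mass):
    """mass * log(1/mass) + mass, the value obtained in the text."""
    return mass * math.log(1.0 / mass) + mass

# Hypothetical mass xi(z) = 0.3.
approx = log_inverse_integral(0.3)
exact = closed_form(0.3)
```

The integrand is singular at $u=0$ but integrable, so the midpoint rule converges, albeit only at first order in the step size.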

Hence, integrating (5.1) over $t\in(T_{z+1}+(k-1)\xi(z),T_{z+1}+k\xi(z))$ and summing over $1\leq k\leq z$, we get, for each $z\in\mathcal{Z}_{J}$,

\begin{align*}
\int_{(T_{z+1},T_{z+1}+t_{z})}\,\sup_{\alpha\in C_{0}(\mathcal{Z})}\biggl\{\langle\alpha,\dot{\varphi}^{J}_{t}-\Lambda_{\varphi^{J}_{t}}^{*}\varphi^{J}_{t}\rangle &-\sum_{(z,z^{\prime})\in\mathcal{E}}\tau(\alpha(z^{\prime})-\alpha(z))\lambda_{z,z^{\prime}}(\varphi^{J}_{t})\varphi^{J}_{t}(z)\biggr\}dt \\
&\leq z\xi(z)\log\left(\frac{1}{\xi(z)}\right)+\tilde{C}_{z}, \tag{5.2}
\end{align*}

where $\tilde{C}_{z}=(z\log z+z)\xi(z)+z\xi(z)\left(\log\left(\frac{1}{\underline{\lambda}}\right)+2\overline{\lambda}\right)$. Let $\tilde{C}^{J}=\sum_{z\in\mathcal{Z}_{J}}\tilde{C}_{z}$. Thus, summing the above display over $z\in\mathcal{Z}_{J}$, we arrive at

\begin{align*}
S_{[0,T_{1}]}(\varphi^{J}|\delta_{0})\leq\sum_{z\in\mathcal{Z}_{J}}z\xi(z)\log\left(\frac{1}{\xi(z)}\right)+\tilde{C}^{J}.
\end{align*}

Note that

\begin{align*}
\sum_{z\in\mathcal{Z}_{J}}z\xi(z)\log\left(\frac{1}{\xi(z)}\right) &=\sum_{\substack{z\in\mathcal{Z}_{J}:\\ \xi(z)\leq 1/z^{3}}}z\xi(z)\log\left(\frac{1}{\xi(z)}\right)+\sum_{\substack{z\in\mathcal{Z}_{J}:\\ \xi(z)>1/z^{3}}}z\xi(z)\log\left(\frac{1}{\xi(z)}\right) \\
&\leq\frac{1}{e}+\sum_{\substack{z\in\mathcal{Z}_{J}\setminus\{1\}:\\ \xi(z)\leq 1/z^{3}}}\frac{3\log z}{z^{2}}+3\sum_{\substack{z\in\mathcal{Z}_{J}:\\ \xi(z)>1/z^{3}}}(z\log z)\xi(z) \\
&\leq\frac{1}{e}+3\sum_{z\in\mathcal{Z}_{J}}\left\{\frac{\log z}{z^{2}}+(z\log z)\xi(z)\right\}, \tag{5.3}
\end{align*}

where the first inequality comes from the fact that the mapping $x\mapsto x\log(1/x)$ is monotonically increasing for $x\in[0,1/e]$. Hence,

\begin{align*}
S_{[0,T_{1}]}(\varphi^{J}|\delta_{0})\leq\frac{1}{e}+3\sum_{z\in\mathcal{Z}_{J}}\left\{\frac{\log z}{z^{2}}+(z\log z)\xi(z)\right\}+\tilde{C}^{J},\quad J\geq 1.
\end{align*}
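The inequality (5.3) can be checked on a concrete probability vector. The sketch below is purely illustrative: the geometric choice of $\xi$, the truncation level, and the function names `entropy_like_sum` and `upper_bound` are assumptions made for the example.

```python
import math

def entropy_like_sum(xi):
    """Sum over z >= 1 of z * xi[z] * log(1/xi[z]), with 0 * log(1/0) read as 0."""
    return sum(z * x * math.log(1.0 / x)
               for z, x in enumerate(xi) if z >= 1 and x > 0)

def upper_bound(xi):
    """1/e + 3 * sum over z >= 1 of (log z / z^2 + z * log z * xi[z])."""
    return 1.0 / math.e + 3.0 * sum(
        math.log(z) / z**2 + z * math.log(z) * x
        for z, x in enumerate(xi) if z >= 1)

# Hypothetical example: geometric weights 2^-(z+1) on z = 0..19,
# with the leftover mass added to state 0 so that xi sums to 1.
xi = [2.0 ** -(z + 1) for z in range(20)]
xi[0] += 1.0 - sum(xi)

left, right = entropy_like_sum(xi), upper_bound(xi)
```

For this $\xi$ the left-hand side is close to $4\log 2$, comfortably below the right-hand side of (5.3).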

Define $T=\sum_{z\in\mathcal{Z}}z\xi(z)$. We now extend the trajectory $\varphi^{J}$ to $(T_{1},T]$ by defining $\varphi^{J}_{t}=\varphi^{J}_{T_{1}}$ for $t\in(T_{1},T]$. Noting that $\dot{\varphi}^{J}_{t}(z)=0$ for all $z\in\mathcal{Z}$ on $t\in(T_{1},T]$, this extension incurs an additional cost of at most $2\overline{\lambda}T$. Hence, we get

\begin{align*}
S_{[0,T]}(\varphi^{J}|\delta_{0})\leq\frac{1}{e}+3\sum_{z\in\mathcal{Z}_{J}}\left\{\frac{\log z}{z^{2}}+(z\log z)\xi(z)\right\}+\tilde{C}^{J}+2\overline{\lambda}T.
\end{align*}

Noting that (i) the right hand side above is upper bounded by $\langle\xi,\vartheta\rangle C(\overline{\lambda},\underline{\lambda})$, where $C(\overline{\lambda},\underline{\lambda})$ is a constant depending on $\overline{\lambda}$ and $\underline{\lambda}$, and (ii) $\langle\xi,\vartheta\rangle\leq M$, the above display yields

\begin{align*}
S_{[0,T]}(\varphi^{J}|\delta_{0})\leq C(M,\overline{\lambda},\underline{\lambda}),
\end{align*}

where $C(M,\overline{\lambda},\underline{\lambda})$ is a constant depending on $M$, $\overline{\lambda}$, and $\underline{\lambda}$. Using the compactness of the level sets of $S_{[0,T]}$ (see Lemma 2.1), it follows that the sequence of trajectories $\{\varphi^{J},J\geq 1\}$ has a convergent subsequence. Re-indexing the original sequence, let $\varphi^{J}\to\varphi$ in $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ as $J\to\infty$. By construction, for each $J\in\mathcal{Z}\setminus\{0\}$, $\varphi^{J}_{T}(z)=\xi(z)$ for all $z\in\mathcal{Z}_{J}$; hence $\varphi_{T}(z)=\xi(z)$ for all $z\in\mathcal{Z}$. Recall that lower semicontinuity of $S_{[0,T]}$ was proved in the course of the proof of Lemma 2.1. Therefore, it follows that

\begin{align*}
S_{[0,T]}(\varphi|\delta_{0})\leq\liminf_{J\to\infty}S_{[0,T]}(\varphi^{J}|\delta_{0})\leq C(M,\overline{\lambda},\underline{\lambda}).
\end{align*}

This completes the proof of the lemma. ∎

We are now ready to characterise the set of points $\xi$ in $\mathcal{M}_{1}(\mathcal{Z})$ at which $V(\xi)$ is finite.

Lemma 5.2.

$V(\xi)<\infty$ if and only if $\xi\in\mathscr{K}$. Furthermore, for any $M>0$, there exists a constant $C_{M}>0$ such that $\xi\in\mathscr{K}_{M}$ implies $V(\xi)\leq C_{M}$.

Proof.

Let $\xi\in\mathcal{M}_{1}(\mathcal{Z})$ be such that $V(\xi)<\infty$. Then there exist a $T>0$ and a trajectory $\varphi$ on $[0,T]$ such that $\varphi(0)=\xi^{*}$, $\varphi(T)=\xi$, and $S_{[0,T]}(\varphi|\xi^{*})\leq V(\xi)+1$. By Theorem 2.2, there exists a measurable function $h_{\varphi}$ on $[0,T]\times\mathcal{E}$ such that

\begin{align*}
\langle\varphi_{t},f\rangle &=\langle\varphi_{0},f\rangle+\int_{[0,t]}\sum_{(z,z^{\prime})\in\mathcal{E}}(f(z^{\prime})-f(z))(1+h_{\varphi}(u,z,z^{\prime}))\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du \tag{5.4}
\end{align*}

holds for all $t\in[0,T]$ and $f\in C_{0}(\mathcal{Z})$, and $S_{[0,T]}(\varphi|\varphi(0))$ is given by

\begin{align*}
S_{[0,T]}(\varphi|\varphi(0))=\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}\tau^{*}(h_{\varphi}(t,z,z^{\prime}))\lambda_{z,z^{\prime}}(\varphi_{t})\varphi_{t}(z)dt.
\end{align*}

For any $x\geq 0$ and $y\in\mathbb{R}$, using the convex duality relation $(x-1)y\leq\tau^{*}(x-1)+\tau(y)$, we get the inequality $xy\leq\tau^{*}(x-1)+(\exp\{y\}-1)$. Hence, applying this inequality with $x=1+h_{\varphi}(u,z,z^{\prime})$ and $y=f(z^{\prime})-f(z)$ and using the above non-variational representation for $S_{[0,T]}(\varphi|\varphi(0))$, (5.4) implies

\begin{align*}
\langle\varphi_{t},f\rangle &\leq\langle\xi^{*},f\rangle+\int_{[0,t]}\sum_{(z,z^{\prime})\in\mathcal{E}}\tau^{*}(h_{\varphi}(u,z,z^{\prime}))\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du \\
&\qquad+\int_{[0,t]}\sum_{(z,z^{\prime})\in\mathcal{E}}(\exp\{f(z^{\prime})-f(z)\}-1)\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du \\
&\leq\langle\xi^{*},f\rangle+V(\xi)+1 \\
&\qquad+\int_{[0,t]}\sum_{(z,z^{\prime})\in\mathcal{E}}(\exp\{f(z^{\prime})-f(z)\}-1)\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du. \tag{5.5}
\end{align*}
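The elementary inequality $xy\leq\tau^{*}(x-1)+(\exp\{y\}-1)$ behind (5.5) can be checked on a grid. The Python sketch below assumes the forms $\tau(y)=e^{y}-y-1$ and $\tau^{*}(x)=(x+1)\log(x+1)-x$ for $x\geq-1$; these definitions are not restated in this section, so they should be read as assumptions of the example, as should the function names.

```python
import math

def tau(y):
    """Assumed form of tau: exp(y) - y - 1."""
    return math.exp(y) - y - 1.0

def tau_star(x):
    """Assumed Legendre conjugate of tau, for x >= -1 (value 1 at x = -1)."""
    if x == -1.0:
        return 1.0
    return (x + 1.0) * math.log(x + 1.0) - x

def gap(x, y):
    """tau_star(x - 1) + (exp(y) - 1) - x*y, which should be nonnegative."""
    return tau_star(x - 1.0) + (math.exp(y) - 1.0) - x * y

# Grid check over x in [0, 5] and y in [-3, 3].
xs = [i / 10.0 for i in range(51)]
ys = [j / 10.0 for j in range(-30, 31)]
smallest_gap = min(gap(x, y) for x in xs for y in ys)
```

Under these assumed forms the gap equals $x\log x-x+e^{y}-xy$, which vanishes exactly when $y=\log x$, so the inequality is tight.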

Recall the function $\vartheta$ on $\mathcal{Z}$. For $n\geq 1$, define

\begin{align*}
\vartheta_{n}(z)=\left\{\begin{array}{ll}\vartheta(z),&\text{ if }z\leq n,\\ 0,&\text{ otherwise}.\end{array}\right.
\end{align*}

By convexity, note that $\vartheta_{n}(z+1)-\vartheta_{n}(z)\leq 1+\log(z+1)$ and $\vartheta_{n}(0)-\vartheta_{n}(z)\leq 0$, for each $z\in\mathcal{Z}$. Therefore, using the upper bound for the transition rates from assumption (A2), observe that

\begin{align*}
\int_{[0,t]}\sum_{(z,z^{\prime})\in\mathcal{E}}(\exp\{\vartheta_{n}(z^{\prime})-\vartheta_{n}(z)\}-1)\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du\leq\overline{\lambda}(e-1)t,
\end{align*}

for each $t\in[0,T]$ and $n\geq 1$. It follows from (5.5) with $f$ replaced by $\vartheta_{n}$ that

\begin{align*}
\langle\varphi_{t},\vartheta_{n}\rangle\leq\langle\xi^{*},\vartheta_{n}\rangle+V(\xi)+1+\overline{\lambda}(e-1)T
\end{align*}

for each $t\in[0,T]$ and $n\geq 1$. Letting $n\to\infty$ and using monotone convergence, we conclude that

\begin{align*}
\sup_{t\in[0,T]}\langle\varphi_{t},\vartheta\rangle=\sup_{t\in[0,T]}\lim_{n\to\infty}\langle\varphi_{t},\vartheta_{n}\rangle\leq\langle\xi^{*},\vartheta\rangle+V(\xi)+1+\overline{\lambda}(e-1)T. \tag{5.6}
\end{align*}

In particular, $\langle\xi,\vartheta\rangle\leq\langle\xi^{*},\vartheta\rangle+V(\xi)+1+\overline{\lambda}(e-1)T$. It follows that $\xi\in\mathscr{K}$.

Conversely, let ξ𝒦\xi\in\mathscr{K}. Let M>0M>0 be such that ξ𝒦M\xi\in\mathscr{K}_{M}. By Lemma 5.1, there exists a T>0T>0 and a trajectory φ(2)\varphi^{(2)} on [0,T][0,T] such that φ(2)(0)=δ0\varphi^{(2)}(0)=\delta_{0}, φ(2)(T)=ξ\varphi^{(2)}(T)=\xi, and S[0,T](φ(2)|δ0)CMS_{[0,T]}(\varphi^{(2)}|\delta_{0})\leq C_{M} for some constant CM>0C_{M}>0 depending on MM. Let t0=0t_{0}=0, tz=z=1zξ(z)t_{z}=\sum_{z^{\prime}=1}^{z}\xi^{*}(z^{\prime}), z𝒵{0}z\in\mathcal{Z}\setminus\{0\}, and T1=z0ξ(z)T_{1}=\sum_{z^{\prime}\neq 0}\xi^{*}(z^{\prime}). We now construct another trajectory φ(1)\varphi^{(1)} on [0,T1][0,T_{1}] such that φ(1)(0)=ξ\varphi^{(1)}(0)=\xi^{*}, φ(1)(T1)=δ0\varphi^{(1)}(T_{1})=\delta_{0}, and S[0,T1](φ(1)|ξ)<S_{[0,T_{1}]}(\varphi^{(1)}|\xi^{*})<\infty. This trajectory is constructed using piecewise constant velocity paths and its cost S[0,T1](φ(1)|ξ)S_{[0,T_{1}]}(\varphi^{(1)}|\xi^{*}) is computed using arguments similar to those used in the proof of Lemma 5.1; we provide the details here for completeness. When t(tz1,tz]t\in(t_{z-1},t_{z}] for some z𝒵{0}z\in\mathcal{Z}\setminus\{0\}, let

φ˙t(1)(l)={1, if l=z,1, if l=0,0, otherwise,\displaystyle\dot{\varphi}^{(1)}_{t}(l)=\left\{\begin{array}[]{ll}-1,&\text{ if }l=z,\\ 1,&\text{ if }l=0,\\ 0,&\text{ otherwise},\end{array}\right.

l𝒵l\in\mathcal{Z}, and define φt(1)(l)=φ0(1)(l)+[0,t]φ˙u(1)(l)𝑑u\varphi^{(1)}_{t}(l)=\varphi^{(1)}_{0}(l)+\int_{[0,t]}\dot{\varphi}^{(1)}_{u}(l)du, l𝒵l\in\mathcal{Z}, t[0,T1]t\in[0,T_{1}]. Note that, for each αC0(𝒵)\alpha\in C_{0}(\mathcal{Z}), when t(tz1,tz)t\in(t_{z-1},t_{z}), we have

{α,\displaystyle\biggr{\{}\langle\alpha, φ˙t(1)Λφt(1)φt(1)(z,z)τ(α(z)α(z))λz,z(φt(1))φt(1)(z)}\displaystyle\dot{\varphi}^{(1)}_{t}-\Lambda_{\varphi^{(1)}_{t}}^{*}\varphi^{(1)}_{t}\rangle-\sum_{(z,z^{\prime})\in\mathcal{E}}\tau(\alpha(z^{\prime})-\alpha(z))\lambda_{z,z^{\prime}}(\varphi^{(1)}_{t})\varphi^{(1)}_{t}(z)\biggr{\}}
=(α(0)α(z))(exp{α(0)α(z)}1)λz,0(φt(1))φt(1)(z)\displaystyle=(\alpha(0)-\alpha(z))-(\exp\{\alpha(0)-\alpha(z)\}-1)\lambda_{z,0}(\varphi^{(1)}_{t})\varphi^{(1)}_{t}(z)
(z0,z):(z0,z)(z,0)(exp{α(z)α(z0)}1)λz0,z(φt(1))φt(1)(z0)},\displaystyle\qquad-\sum_{(z_{0},z^{\prime})\in\mathcal{E}:(z_{0},z^{\prime})\neq(z,0)}(\exp\{\alpha(z^{\prime})-\alpha(z_{0})\}-1)\lambda_{z_{0},z^{\prime}}(\varphi^{(1)}_{t})\varphi^{(1)}_{t}(z_{0})\biggr{\}},

so that optimising the left hand side of the above display over αC0(𝒵)\alpha\in C_{0}(\mathcal{Z}) yields

supαC0(𝒵){α,\displaystyle\sup_{\alpha\in C_{0}(\mathcal{Z})}\biggr{\{}\langle\alpha, φ˙t(1)Λφt(1)φt(1)(z,z)τ(α(z)α(z))λz,z(φt(1))φt(1)(z)}\displaystyle\dot{\varphi}^{(1)}_{t}-\Lambda_{\varphi^{(1)}_{t}}^{*}\varphi^{(1)}_{t}\rangle-\sum_{(z,z^{\prime})\in\mathcal{E}}\tau(\alpha(z^{\prime})-\alpha(z))\lambda_{z,z^{\prime}}(\varphi^{(1)}_{t})\varphi^{(1)}_{t}(z)\biggr{\}}
log(1φt(1)(z)λz,0(φt(1)))+2λ¯\displaystyle\leq\log\left(\frac{1}{\varphi^{(1)}_{t}(z)\lambda_{z,0}(\varphi^{(1)}_{t})}\right)+2\bar{\lambda}
log(1φt(1)(z))+log(1λ¯)+2λ¯,\displaystyle\leq\log\left(\frac{1}{\varphi^{(1)}_{t}(z)}\right)+\log\left(\frac{1}{\underline{\lambda}}\right)+2\overline{\lambda},

where the last inequality follows form the lower bound on the backward transition rates in assumption (A2). Integrating the above over (tz1,tz)(t_{z-1},t_{z}) and summing over z𝒵{0}z\in\mathcal{Z}\setminus\{0\}, we arrive at

S[0,T1](φ(1)|ξ)z𝒵{0}{ξ(z)log1ξ(z)+ξ(z)(1+log(1λ¯)+2λ¯)}.\displaystyle S_{[0,T_{1}]}(\varphi^{(1)}|\xi^{*})\leq\sum_{z\in\mathcal{Z}\setminus\{0\}}\left\{\xi^{*}(z)\log\frac{1}{\xi^{*}(z)}+\xi^{*}(z)\left(1+\log\left(\frac{1}{\underline{\lambda}}\right)+2\overline{\lambda}\right)\right\}. (5.7)

Since ξ𝒦\xi^{*}\in\mathscr{K}, proceeding via the steps in (5.3), we conclude that the right hand side of the above display is finite. We combine φ(1)\varphi^{(1)} and φ(2)\varphi^{(2)} and define a new trajectory φ~\tilde{\varphi} on [0,T1+T][0,T_{1}+T] as follows: φ~(t)=φ(1)(t)\tilde{\varphi}(t)=\varphi^{(1)}(t) on t[0,T1]t\in[0,T_{1}]; φ~(t)=φ(2)(tT1)\tilde{\varphi}(t)=\varphi^{(2)}(t-T_{1}) on t(T1,T1+T]t\in(T_{1},T_{1}+T]. Note that φ~(0)=ξ\tilde{\varphi}(0)=\xi^{*}, φ~(T1+T)=ξ\tilde{\varphi}(T_{1}+T)=\xi, and S[0,T1+T](φ~|ξ)<S_{[0,T_{1}+T]}(\tilde{\varphi}|\xi^{*})<\infty. Hence V(ξ)<V(\xi)<\infty.

To prove the second statement, we note that given any M>0M>0, for any ξ𝒦M\xi\in\mathscr{K}_{M}, the cost of the trajectory φ~\tilde{\varphi} constructed in the previous paragraph is bounded above by a constant depending only on MM (and not on ξ\xi). This completes the proof of the lemma. ∎
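The one-dimensional optimisation behind the logarithmic term in the bound leading to (5.7) can be checked numerically. The following Python sketch is only an illustration: writing $a$ for $\alpha(0)-\alpha(z)$ and $c$ for $\lambda_{z,0}(\varphi^{(1)}_{t})\varphi^{(1)}_{t}(z)$, it compares a brute-force maximum of $a\mapsto a-c(e^{a}-1)$ with the closed form $\log(1/c)-1+c$. The function names, the grid, and the sample values of $c$ are assumptions made for the example, and the remaining terms of the display are not modelled.

```python
import math


def objective(a: float, c: float) -> float:
    """Value of a - c * (exp(a) - 1); here a plays the role of alpha(0) - alpha(z)."""
    return a - c * (math.exp(a) - 1.0)


def closed_form_sup(c: float) -> float:
    """Supremum over a of objective(a, c), attained at a = log(1/c)."""
    return math.log(1.0 / c) - 1.0 + c


def grid_sup(c: float, lo: float = -20.0, hi: float = 20.0, n: int = 40001) -> float:
    """Brute-force maximum of objective(., c) over an evenly spaced grid (assumed range)."""
    step = (hi - lo) / (n - 1)
    return max(objective(lo + i * step, c) for i in range(n))


if __name__ == "__main__":
    # Hypothetical sample values of c = lambda_{z,0}(phi_t) * phi_t(z).
    for c in (0.01, 0.1, 0.5, 1.0):
        print(c, grid_sup(c), closed_form_sup(c))
```

The map is concave in $a$, so the grid maximum approaches the closed form from below as the grid is refined.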

5.2 Continuity

We now establish a certain continuity property of the quasipotential VV. Since VV has compact level sets and the space 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) is not locally compact, we cannot expect VV to be continuous on 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}). In fact, for any point ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}) with V(ξ)<V(\xi)<\infty, one can produce a sequence {ξn,n1}\{\xi_{n},n\geq 1\} such that ξnξ\xi_{n}\to\xi in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) as nn\to\infty, and ξn,ϑ=\langle\xi_{n},\vartheta\rangle=\infty for all n1n\geq 1, so that infn1V(ξn)=\inf_{n\geq 1}V(\xi_{n})=\infty. We prove that VV is continuous under the convergence of ϑ\vartheta-moments when it is restricted to 𝒦\mathscr{K}. That is, when ξn,ξ𝒦\xi_{n},\xi\in\mathscr{K}, ξnξ\xi_{n}\to\xi in 1(𝒵),\mathcal{M}_{1}(\mathcal{Z}), and ξn,ϑξ,ϑ\langle\xi_{n},\vartheta\rangle\to\langle\xi,\vartheta\rangle as nn\to\infty, then V(ξn)V(ξ)V(\xi_{n})\to V(\xi) as nn\to\infty. Towards this, we produce a trajectory that connects ξ\xi to ξn\xi_{n} by first moving the mass from all the large enough states zz back to the state 0, then producing a constant velocity trajectory that fills the required mass from state 0 to all the large enough states zz, and finally adjusting mass within a finite subset of 𝒵\mathcal{Z} to reach ξn\xi_{n}. We show that the cost of the trajectory constructed above can be made arbitrarily small for large enough nn.
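The sequence with infinite $\vartheta$-moments mentioned above can be produced by a simple mixture. The construction below is a sketch under the assumption that $\vartheta\geq 0$ is unbounded on $\mathcal{Z}$; the measure $\eta$ and the states $z_{k}$ are names introduced only for this illustration.

```latex
% Assumption: \vartheta \ge 0 is unbounded, so states z_k with \vartheta(z_k) \ge 2^k exist.
\[
\eta = \sum_{k\geq 1} 2^{-k}\,\delta_{z_{k}},
\qquad
\langle\eta,\vartheta\rangle = \sum_{k\geq 1} 2^{-k}\vartheta(z_{k}) \geq \sum_{k\geq 1} 1 = \infty .
\]
% Mix a vanishing amount of \eta into \xi.
\[
\xi_{n} = \Bigl(1-\tfrac{1}{n}\Bigr)\xi + \tfrac{1}{n}\,\eta,
\qquad
\sum_{z\in\mathcal{Z}} |\xi_{n}(z)-\xi(z)| \leq \tfrac{2}{n} \to 0,
\qquad
\langle\xi_{n},\vartheta\rangle \geq \tfrac{1}{n}\langle\eta,\vartheta\rangle = \infty .
\]
```

Each $\xi_{n}$ is a probability measure, $\xi_{n}\to\xi$ in $\mathcal{M}_{1}(\mathcal{Z})$, and every $\xi_{n}$ has infinite $\vartheta$-moment.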

Lemma 5.3.

Let ξn𝒦\xi_{n}\in\mathscr{K}, n1n\geq 1, and ξ𝒦\xi\in\mathscr{K}. Suppose that ξnξ\xi_{n}\to\xi in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) and ξn,ϑξ,ϑ\langle\xi_{n},\vartheta\rangle\to\langle\xi,\vartheta\rangle as nn\to\infty. Then V(ξn)V(ξ)V(\xi_{n})\to V(\xi) as nn\to\infty.

Proof.

We first prove that lim supnV(ξn)V(ξ)\limsup_{n\to\infty}V(\xi_{n})\leq V(\xi). Fix ε>0\varepsilon>0. We shall move from ξ\xi to ξn\xi_{n} in five steps. The outline of this construction is as follows:

  • φ(0)\varphi^{(0)}: This trajectory starts with ξ\xi and moves all the mass for all states z>z0z>z_{0}, for a suitable large enough z0z_{0}, back to state 0. This backward movement results in a cost of O(ε)O(\varepsilon).

  • φ(1)\varphi^{(1)}: Next, we move any additional mass, if required, from the states {1,2,,z0}\{1,2,\ldots,z_{0}\} back to state 0 so that there is enough mass at state 0 to fill up all the states beyond z0z_{0}. Again, this backward movement results in a cost of O(ε)O(\varepsilon).

  • $\varphi^{(2)}$: Next, we construct a piecewise constant-velocity trajectory to move the mass $\sum_{z^{\prime}>z_{0}}\xi_{n}(z^{\prime})$ from state $0$ to state $z_{0}+1$. After this movement, state $z_{0}+1$ contains all the mass required to fill up the states beyond it. This forward movement results in a cost of $O(\varepsilon\log(1/\varepsilon))$, instead of $O(\varepsilon)$, because we move the total mass for all the states beyond $z_{0}$.

  • $\varphi^{(3)}$: Then, for each $z>z_{0}+1$, we move the required mass (i.e., $\xi_{n}(z)$) from state $z_{0}+1$ to state $z$ using a piecewise constant-velocity trajectory. At the end of this procedure, for each $z>z_{0}$, the mass at state $z$ becomes $\xi_{n}(z)$. This forward movement results in a cost of $O(\varepsilon)$.

  • φ(4)\varphi^{(4)}: Finally, we adjust the mass within the finite set {1,2,,z0}\{1,2,\ldots,z_{0}\} to match with ξn\xi_{n}. This also results in a cost at most O(εlog(1/ε))O(\varepsilon\log(1/\varepsilon)). Again, this cost is O(εlog(1/ε))O(\varepsilon\log(1/\varepsilon)) instead of O(ε)O(\varepsilon) because we move, for each z{1,2,,z0}z\in\{1,2,\ldots,z_{0}\}, the sum of the additional mass (under ξn\xi_{n} compared to ξ\xi) in the states {z,z+1,,z0}\{z,z+1,\ldots,z_{0}\} from state 0 to state zz.

Therefore, the total cost of all these trajectories is at most O(εlog(1/ε))O(\varepsilon\log(1/\varepsilon)), which vanishes as ε0\varepsilon\to 0. We now define these trajectories in detail and evaluate their costs.
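Before the detailed construction, the net effect of the five steps in the outline can be sketched as mass transfers on a truncated state space. The following Python sketch is an illustration only: the truncation to $\{0,1,\ldots,L\}$, the helper names `move` and `connect`, and the chain-wise spreading used for step $\varphi^{(3)}$ are assumptions made for the example. It records only the configuration reached at the end of each step, not the time-parametrised trajectories or their costs.

```python
from typing import List


def move(state: List[float], src: int, dst: int, amount: float) -> None:
    """Transfer `amount` of mass from state `src` to state `dst`, in place."""
    state[src] -= amount
    state[dst] += amount


def connect(xi: List[float], xi_n: List[float], z0: int) -> List[List[float]]:
    """Return the configurations at the end of the steps phi^(0), ..., phi^(4).

    Assumes xi and xi_n are probability vectors on {0, ..., L} with z0 + 1 <= L.
    """
    top = len(xi) - 1  # largest state L of the (assumed) truncation
    cur = list(xi)
    snapshots = []

    # phi^(0): send all mass at states z > z0 back to state 0.
    for z in range(z0 + 1, top + 1):
        move(cur, z, 0, cur[z])
    snapshots.append(list(cur))

    # phi^(1): if state 0 holds less than eps_n, top it up from z0, z0 - 1, ...
    eps_n = sum(xi_n[z0 + 1:])
    deficit = max(eps_n - cur[0], 0.0)
    for z in range(z0, 0, -1):
        take = min(deficit, cur[z])
        move(cur, z, 0, take)
        deficit -= take
    snapshots.append(list(cur))

    # phi^(2): push the mass eps_n along the chain 0 -> 1 -> ... -> z0 + 1.
    for z in range(1, z0 + 2):
        move(cur, z - 1, z, eps_n)
    snapshots.append(list(cur))

    # phi^(3): spread the mass at z0 + 1 over the states beyond it (chain-wise).
    for z in range(z0 + 2, top + 1):
        move(cur, z - 1, z, sum(xi_n[z:]))
    snapshots.append(list(cur))

    # phi^(4): return excess mass in {1, ..., z0} to state 0, then refill forward.
    for z in range(1, z0 + 1):
        excess = cur[z] - xi_n[z]
        if excess > 0:
            move(cur, z, 0, excess)
    shortfall = [max(xi_n[z] - cur[z], 0.0) for z in range(z0 + 1)]
    for z in range(1, z0 + 1):
        move(cur, z - 1, z, sum(shortfall[z:]))
    snapshots.append(list(cur))
    return snapshots


if __name__ == "__main__":
    # Hypothetical data on {0, ..., 5} with cutoff z0 = 3.
    xi = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]
    xi_n = [0.45, 0.25, 0.08, 0.12, 0.06, 0.04]
    for step, snap in enumerate(connect(xi, xi_n, 3)):
        print(step, [round(v, 6) for v in snap])
```

The final snapshot coincides with $\xi_{n}$ up to floating-point rounding, which mirrors the bookkeeping used in the proof to verify that the concatenated trajectory ends at $\xi_{n}$.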

Let z02z_{0}\geq 2 be such that

z>z0ϑ(z)ξ(z)<ε/6andz>z0logzz2<ε.\displaystyle\sum_{z>z_{0}}\vartheta(z)\xi(z)<\varepsilon/6\quad\text{and}\quad\sum_{z>z_{0}}\frac{\log z}{z^{2}}<\varepsilon.

Then choose n11n_{1}\geq 1 such that z>z0ϑ(z)ξn(z)<ε/3\sum_{z>z_{0}}\vartheta(z)\xi_{n}(z)<\varepsilon/3 holds for all nn1n\geq n_{1}; this is possible since ξnξ\xi_{n}\to\xi in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) and ξn,ϑξ,ϑ\langle\xi_{n},\vartheta\rangle\to\langle\xi,\vartheta\rangle as nn\to\infty. Let

tz0=0,tz=z=z0+1zξ(z),z>z0, andT0=z>z0ξ(z).\displaystyle t_{z_{0}}=0,\quad t_{z}=\sum_{z^{\prime}=z_{0}+1}^{z}\xi(z^{\prime}),\,z>z_{0},\quad\text{ and}\quad T_{0}=\sum_{z^{\prime}>z_{0}}\xi(z^{\prime}).

Define the trajectory φ(0)\varphi^{(0)} on [0,T0][0,T_{0}] as follows. When t(tz1,tz]t\in(t_{z-1},t_{z}] for some z>z0z>z_{0}, let

φ˙t(0)(l)={1, if l=z,1, if l=0,0, otherwise,\displaystyle\dot{\varphi}^{(0)}_{t}(l)=\left\{\begin{array}[]{ll}-1,&\text{ if }l=z,\\ 1,&\text{ if }l=0,\\ 0,&\text{ otherwise},\end{array}\right.

l𝒵l\in\mathcal{Z}, and define

φt(0)(l)=ξ(l)+[0,t]φ˙u(0)(l)𝑑u,l𝒵,t[0,T0].\displaystyle\varphi^{(0)}_{t}(l)=\xi(l)+\int_{[0,t]}\dot{\varphi}^{(0)}_{u}(l)du,\quad l\in\mathcal{Z},\,t\in[0,T_{0}].

Note that $\varphi^{(0)}_{T_{0}}(z)=\xi(z)$ for $1\leq z\leq z_{0}$, $\varphi^{(0)}_{T_{0}}(z)=0$ for $z>z_{0}$, and $\varphi^{(0)}_{T_{0}}(0)=\xi(0)+\sum_{z>z_{0}}\xi(z)$. Let $M=(\sup_{n\geq n_{1}}\langle\xi_{n},\vartheta\rangle)\vee\langle\xi,\vartheta\rangle+1$. Using ideas similar to those used in the proof of Lemma 5.2, it can be checked that $S_{[0,T_{0}]}(\varphi^{(0)}|\xi)\leq C_{0}(M,\overline{\lambda},\underline{\lambda})\varepsilon$, for some constant $C_{0}(M,\overline{\lambda},\underline{\lambda})$ depending on $M$, $\overline{\lambda}$, and $\underline{\lambda}$. Indeed, the cost is $O(\sum_{z>z_{0}}\xi(z)\log(1/\xi(z)))$, which, using the argument used to arrive at the bound (5.7) and the choice of $z_{0}$, is bounded by

O(z>z0((zlogz)ξ(z)+logzz2))=O(ε).\displaystyle O\left(\sum_{z>z_{0}}\left((z\log z)\xi(z)+\frac{\log z}{z^{2}}\right)\right)=O(\varepsilon).

Let εn=z>z0ξn(z)\varepsilon_{n}=\sum_{z>z_{0}}\xi_{n}(z). If εn>φT0(0)(0)\varepsilon_{n}>\varphi_{T_{0}}^{(0)}(0), then we move the extra mass εnφT0(0)(0)\varepsilon_{n}-\varphi_{T_{0}}^{(0)}(0) from the states {1,2,,z0}\{1,2,\ldots,z_{0}\} to state 0 as follows. Let T1=T0+εnφT0(0)(0)T_{1}=T_{0}+\varepsilon_{n}-\varphi_{T_{0}}^{(0)}(0). When tt is between T0+z=z+1z0φT0(0)(z)T_{0}+\sum_{z^{\prime}=z+1}^{z_{0}}\varphi_{T_{0}}^{(0)}(z^{\prime}) and (T0+z=zz0φT0(0)(z))T1(T_{0}+\sum_{z^{\prime}=z}^{z_{0}}\varphi_{T_{0}}^{(0)}(z^{\prime}))\wedge T_{1} for some zz0z\leq z_{0}, let

φ˙t(1)(l)={1, if l=z,1, if l=0,0, otherwise,\displaystyle\dot{\varphi}^{(1)}_{t}(l)=\left\{\begin{array}[]{ll}-1,&\text{ if }l=z,\\ 1,&\text{ if }l=0,\\ 0,&\text{ otherwise},\end{array}\right.

$l\in\mathcal{Z}$. Define the trajectory $\varphi^{(1)}$ on $[0,T_{1}]$ as follows: $\varphi^{(1)}_{t}=\varphi^{(0)}_{t}$ when $t\in[0,T_{0}]$; $\varphi^{(1)}_{t}(l)=\varphi^{(0)}_{T_{0}}(l)+\int_{[T_{0},t]}\dot{\varphi}^{(1)}_{u}(l)\,du$, $l\in\mathcal{Z}$, $t\in(T_{0},T_{1}]$. Note that $\varphi^{(1)}$ depends on $n$, but we suppress this in the notation for ease of readability. Again, since $\varepsilon_{n}$ is smaller than $\varepsilon/3$, by using calculations similar to those used in the proof of Lemma 5.2, we see that $S_{[T_{0},T_{1}]}(\varphi^{(1)}|\varphi_{T_{0}}^{(0)})\leq C_{1}(M,\overline{\lambda},\underline{\lambda})\varepsilon$ for some constant $C_{1}(M,\overline{\lambda},\underline{\lambda})$ depending on $M$, $\overline{\lambda}$, and $\underline{\lambda}$. On the other hand, if $\varepsilon_{n}\leq\varphi_{T_{0}}^{(0)}(0)$, we set $T_{1}=T_{0}$ and $\varphi^{(1)}_{t}=\varphi^{(0)}_{t}$ on $[0,T_{1}]$. In both cases, we have $\varphi^{(1)}_{T_{1}}(0)\geq\varepsilon_{n}$.

Let T2=(z0+1)εnT_{2}=(z_{0}+1)\varepsilon_{n}. We now construct another trajectory φ(2)\varphi^{(2)} on [0,T2][0,T_{2}] to transfer the mass εn\varepsilon_{n} from state 0 (in φT1(1)\varphi^{(1)}_{T_{1}}) to state z0+1z_{0}+1. Let φ0(2)=φT1(1)\varphi^{(2)}_{0}=\varphi^{(1)}_{T_{1}}. When t((z1)εn,zεn]t\in((z-1)\varepsilon_{n},z\varepsilon_{n}] for some z{1,2,,z0+1}z\in\{1,2,\ldots,z_{0}+1\}, let

φ˙t(2)(l)={1, if l=z1,1, if l=z,0, otherwise,\displaystyle\dot{\varphi}^{(2)}_{t}(l)=\left\{\begin{array}[]{ll}-1,&\text{ if }l=z-1,\\ 1,&\text{ if }l=z,\\ 0,&\text{ otherwise},\end{array}\right.

l𝒵l\in\mathcal{Z}, and define φt(2)(l)=φT1(1)(l)+[0,t]φ˙u(2)(l)𝑑u\varphi^{(2)}_{t}(l)=\varphi^{(1)}_{T_{1}}(l)+\int_{[0,t]}\dot{\varphi}^{(2)}_{u}(l)du, l𝒵l\in\mathcal{Z}, t(0,T2]t\in(0,T_{2}]. Note that |xlog(1x)ylog(1y)|δ+δlog(1/δ)|x\log(\frac{1}{x})-y\log(\frac{1}{y})|\leq\delta+\delta\log(1/\delta) whenever |xy|δ|x-y|\leq\delta, and that εnε/(z0logz0)\varepsilon_{n}\leq\varepsilon/(z_{0}\log z_{0}). Hence, using calculations similar to those done in the proof of Lemma 5.1, we see that S[0,T2](φ(2)|φT1(1))S_{[0,T_{2}]}(\varphi^{(2)}|\varphi^{(1)}_{T_{1}}) can be bounded above by C2(M,λ¯,λ¯)εlog(1/ε)C_{2}(M,\overline{\lambda},\underline{\lambda})\varepsilon\log(1/\varepsilon) where C2(M,λ¯,λ¯)C_{2}(M,\overline{\lambda},\underline{\lambda}) is a constant depending on MM, λ¯\overline{\lambda}, and λ¯\underline{\lambda}, for each nn1n\geq n_{1} (recall that φ(2)\varphi^{(2)} depends on nn). Indeed, the cost is bounded by the order of (see the bound in (5.2))

z0εnlog(1εn)+(z0logz0)εn\displaystyle z_{0}\varepsilon_{n}\log\left(\frac{1}{\varepsilon_{n}}\right)+(z_{0}\log z_{0})\varepsilon_{n} z0εz0logz0log(z0logz0ε)+ε\displaystyle\leq z_{0}\frac{\varepsilon}{z_{0}\log z_{0}}\log\left(\frac{z_{0}\log z_{0}}{\varepsilon}\right)+\varepsilon
εlog(1/ε)+3ε,\displaystyle\leq\varepsilon\log(1/\varepsilon)+3\varepsilon,

where the first inequality uses the fact that εnε/(z0logz0)\varepsilon_{n}\leq\varepsilon/(z_{0}\log z_{0}), and the second inequality uses the fact that z02z_{0}\geq 2 so that z0logz0>1z_{0}\log z_{0}>1.
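The two elementary estimates used for $\varphi^{(2)}$ can be checked numerically. The following Python sketch is an illustration under stated assumptions: the function names and the grids are hypothetical, $x$ and $y$ range over $[0,1]$, and the comparison of the middle expression of the display with $\varepsilon\log(1/\varepsilon)+3\varepsilon$ is run only for $z_{0}\geq 3$, where $\log z_{0}\geq 1$ makes the comparison immediate.

```python
import math


def entropy_term(x: float) -> float:
    """x * log(1/x), extended by 0 at x = 0."""
    return 0.0 if x == 0.0 else x * math.log(1.0 / x)


def modulus(delta: float) -> float:
    """Right-hand side delta + delta * log(1/delta) of the continuity estimate."""
    return delta + entropy_term(delta)


def step_cost(eps: float, z0: int) -> float:
    """Middle expression of the display: eps_n replaced by eps / (z0 * log z0)."""
    scale = z0 * math.log(z0)
    return z0 * (eps / scale) * math.log(scale / eps) + eps


def target(eps: float) -> float:
    """The bound eps * log(1/eps) + 3 * eps."""
    return entropy_term(eps) + 3.0 * eps


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    print(abs(entropy_term(0.30) - entropy_term(0.25)), modulus(0.05))
    print(step_cost(1e-3, 10), target(1e-3))
```

Only the order $O(\varepsilon\log(1/\varepsilon))$ of the bound enters the rest of the proof, so the particular constant $3$ is not essential.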

Note that φT2(2)(z0+1)=εn\varphi^{(2)}_{T_{2}}(z_{0}+1)=\varepsilon_{n}. We now construct a trajectory that distributes this mass εn\varepsilon_{n} from the state z0+1z_{0}+1 to all the states zz0+1z\geq z_{0}+1 to match with ξn(z)\xi_{n}(z). Let tz=zξn(z)t^{\prime}_{z}=z\xi_{n}(z) for zz0+2z\geq z_{0}+2 and T3=zz0+2tzT_{3}=\sum_{z\geq z_{0}+2}t^{\prime}_{z}. Similar to the construction in the proof of Lemma 5.1, we can now construct a trajectory φ(3)\varphi^{(3)} on [0,T3][0,T_{3}] such that φ0(3)=φT2(2)\varphi^{(3)}_{0}=\varphi^{(2)}_{T_{2}}, φT3(3)(z)=ξn(z)\varphi^{(3)}_{T_{3}}(z)=\xi_{n}(z) for each zz0+1z\geq z_{0}+1, and S[0,T3](φ(3)|φT2(2))C3(M,λ¯,λ¯)εS_{[0,T_{3}]}(\varphi^{(3)}|\varphi^{(2)}_{T_{2}})\leq C_{3}(M,\overline{\lambda},\underline{\lambda})\varepsilon for some constant C3(M,λ¯,λ¯)C_{3}(M,\overline{\lambda},\underline{\lambda}) depending on MM, λ¯\overline{\lambda}, and λ¯\underline{\lambda}, for all nn1n\geq n_{1}. Indeed, using the bounds in (5.2) and (5.3), the total cost is bounded by the order of

z>z0+1((zlogz)ξn(z)+logzz2)ε3+ε,\displaystyle\sum_{z>z_{0}+1}\left((z\log z)\xi_{n}(z)+\frac{\log z}{z^{2}}\right)\leq\frac{\varepsilon}{3}+\varepsilon,

where the inequality follows from the choice of z0z_{0}.

Finally, we construct a trajectory that connects φT3(3)\varphi^{(3)}_{T_{3}} to ξn\xi_{n} by adjusting the mass within the states {0,1,,z0}\{0,1,\ldots,z_{0}\}. Note that φT3(3)(z)=ξn(z)\varphi^{(3)}_{T_{3}}(z)=\xi_{n}(z) for each zz0+1z\geq z_{0}+1. Let 𝒵0{1,2,,z0}\mathcal{Z}_{0}\subset\{1,2,\ldots,z_{0}\} denote the set of all z{1,2,,z0}z\in\{1,2,\ldots,z_{0}\} such that φT3(3)(z)>ξn(z)\varphi^{(3)}_{T_{3}}(z)>\xi_{n}(z). Similar to the construction of φ(1)\varphi^{(1)}, for each z𝒵0z\in\mathcal{Z}_{0}, we move the mass φT3(3)(z)ξn(z)\varphi^{(3)}_{T_{3}}(z)-\xi_{n}(z) from state zz to state 0 using unit velocity over a time duration φT3(3)(z)ξn(z)\varphi^{(3)}_{T_{3}}(z)-\xi_{n}(z). Once these mass transfers are complete, starting with z=1z=1, we move the mass

zz,z𝒵0,zz0(ξn(z)φT3(3)(z))\displaystyle\sum_{z^{\prime}\geq z,z^{\prime}\notin\mathcal{Z}_{0},z^{\prime}\leq z_{0}}(\xi_{n}(z^{\prime})-\varphi_{T_{3}}^{(3)}(z^{\prime}))

from state $z-1$ to state $z$ at unit rate. Let

T4=z𝒵0(φT3(3)(z)ξn(z))+z𝒵0,zz0(ξn(z)φT3(3)(z)),\displaystyle T_{4}=\sum_{z\in\mathcal{Z}_{0}}(\varphi^{(3)}_{T_{3}}(z)-\xi_{n}(z))+\sum_{z\notin\mathcal{Z}_{0},z\leq z_{0}}(\xi_{n}(z)-\varphi^{(3)}_{T_{3}}(z)),

and let φ(4)\varphi^{(4)} denote this piecewise constant velocity trajectory. Let ε~n=z𝒵0,zz0(ξn(z)φT3(3)(z))\tilde{\varepsilon}_{n}=\sum_{z\notin\mathcal{Z}_{0},z\leq z_{0}}(\xi_{n}(z)-\varphi_{T_{3}}^{(3)}(z)). At each step of φ(4)\varphi^{(4)}, since we move a mass of at most ε~n\tilde{\varepsilon}_{n} from state z1z-1 to state zz, the cost of φ(4)\varphi^{(4)} is at most of the order of (see (5.2))

z0ε~nlog(1ε~n)+(z0logz0)ε~n.\displaystyle z_{0}\tilde{\varepsilon}_{n}\log\left(\frac{1}{\tilde{\varepsilon}_{n}}\right)+(z_{0}\log z_{0})\tilde{\varepsilon}_{n}.

Since ε~n0\tilde{\varepsilon}_{n}\to 0 as nn\to\infty, we may choose n2n1n_{2}\geq n_{1} so that ε~nε/(z0logz0)\tilde{\varepsilon}_{n}\leq\varepsilon/(z_{0}\log z_{0}) for all nn2n\geq n_{2}. Therefore, for nn2n\geq n_{2}, the above display is bounded by

z0εz0logz0log(z0logz0ε)+εεlog(1/ε)+3ε,\displaystyle z_{0}\frac{\varepsilon}{z_{0}\log z_{0}}\log\left(\frac{z_{0}\log z_{0}}{\varepsilon}\right)+\varepsilon\leq\varepsilon\log(1/\varepsilon)+3\varepsilon,

which is O(εlog(1/ε))O(\varepsilon\log(1/\varepsilon)). Therefore, S[0,T4](φ(4)|φT3(3))C4(M,λ¯,λ¯)εlog(1/ε)S_{[0,T_{4}]}(\varphi^{(4)}|\varphi^{(3)}_{T_{3}})\leq C_{4}(M,\overline{\lambda},\underline{\lambda})\varepsilon\log(1/\varepsilon) for all nn2n\geq n_{2}, for some constant C4(M,λ¯,λ¯)C_{4}(M,\overline{\lambda},\underline{\lambda}) depending on MM, λ¯\overline{\lambda}, and λ¯\underline{\lambda}.

Let T=i=14TiT=\sum_{i=1}^{4}T_{i}. We now append the four paths φ(i),1i4\varphi^{(i)},1\leq i\leq 4, constructed in the previous paragraphs over the time duration [0,T][0,T] to get a path φ\varphi such that φ0=ξ\varphi_{0}=\xi, φT=ξn\varphi_{T}=\xi_{n} and S[0,T](φ|ξ)C(M,λ¯,λ¯)εlog(1/ε)S_{[0,T]}(\varphi|\xi)\leq C(M,\overline{\lambda},\underline{\lambda})\varepsilon\log(1/\varepsilon) where C(M,λ¯,λ¯)C(M,\overline{\lambda},\underline{\lambda}) is a constant depending on MM, λ¯\overline{\lambda} and λ¯\underline{\lambda}. Hence, for each nn2n\geq n_{2}, we have

\[
V(\xi_{n})\leq V(\xi)+S_{[0,T]}(\varphi|\xi)\leq V(\xi)+C(M,\overline{\lambda},\underline{\lambda})\varepsilon\log(1/\varepsilon).
\]

Therefore, lim supnV(ξn)V(ξ)+C(M,λ¯,λ¯)εlog(1/ε)\limsup_{n\to\infty}V(\xi_{n})\leq V(\xi)+C(M,\overline{\lambda},\underline{\lambda})\varepsilon\log(1/\varepsilon). Letting ε0\varepsilon\to 0 and noting that εlog(1/ε)0\varepsilon\log(1/\varepsilon)\to 0, we arrive at lim supnV(ξn)V(ξ)\limsup_{n\to\infty}V(\xi_{n})\leq V(\xi).

To prove lim infnV(ξn)V(ξ)\liminf_{n\to\infty}V(\xi_{n})\geq V(\xi), we reverse the role of ξn\xi_{n} and ξ\xi in the above argument. That is, we construct a trajectory φ\varphi on [0,T][0,T] such that φ0=ξn\varphi_{0}=\xi_{n}, φT=ξ\varphi_{T}=\xi, and S[0,T](φ|ξn)εnS_{[0,T]}(\varphi|\xi_{n})\leq\varepsilon_{n} for all n1n\geq 1, where εn0\varepsilon_{n}\to 0 as nn\to\infty. Thus, we get

V(ξ)V(ξn)+εn.\displaystyle V(\xi)\leq V(\xi_{n})+\varepsilon_{n}.

Letting nn\to\infty, we conclude that lim infnV(ξn)V(ξ)\liminf_{n\to\infty}V(\xi_{n})\geq V(\xi). This completes the proof of the lemma. ∎

Remark 5.1.

The choice of $n_{1}$ in the above proof suggests that the inequality $\limsup_{n\to\infty}V(\xi_{n})\leq V(\xi)$ can be proved as long as $\xi_{n}\to\xi$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $n\to\infty$ and $\limsup_{n\to\infty}\langle\xi_{n},\vartheta\rangle\leq\langle\xi,\vartheta\rangle$ holds. Similarly, the inequality $\liminf_{n\to\infty}V(\xi_{n})\geq V(\xi)$ can be proved as long as $\xi_{n}\to\xi$ in $\mathcal{M}_{1}(\mathcal{Z})$ and $\liminf_{n\to\infty}\langle\xi_{n},\vartheta\rangle\geq\langle\xi,\vartheta\rangle$ holds. This observation will be used later in the proof of the compactness of the lower level sets of $V$.

5.3 Compactness of the lower level sets of the quasipotential

Define the level sets of VV by

Ξ(s){ξ1(𝒵):V(ξ)s},s>0.\displaystyle\Xi(s)\coloneqq\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):V(\xi)\leq s\},\,s>0.

In this section we establish the compactness of Ξ(s)\Xi(s) for each s>0s>0.

Lemma 5.4.

For each s>0s>0, Ξ(s)\Xi(s) is a compact subset of 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}).

Proof.

We first prove an inclusion property of the level sets of VV, namely, given M>0M>0 there exists M>0M^{\prime}>0 such that

{ξ1(𝒵):V(ξ)M}𝒦M.\displaystyle\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):V(\xi)\leq M\}\subset\mathscr{K}_{M^{\prime}}. (5.8)

On one hand, using Proposition 1.1 on the exponential tightness of the family {N,N1}\{\wp^{N},N\geq 1\}, choose M>0M^{\prime}>0 (see (3.4)) such that

lim supN1NlogN(𝒦M)(M+1).\displaystyle\limsup_{N\to\infty}\frac{1}{N}\log\wp^{N}(\sim\mathscr{K}_{M^{\prime}})\leq-(M+1).

On the other hand, using the LDP lower bound established in Lemma 4.2 and the compactness of 𝒦M\mathscr{K}_{M^{\prime}}, we have

lim infN1NlogN(𝒦M)infξ𝒦MV(ξ).\displaystyle\liminf_{N\to\infty}\frac{1}{N}\log\wp^{N}(\sim\mathscr{K}_{M^{\prime}})\geq-\inf_{\xi\notin\mathscr{K}_{M^{\prime}}}V(\xi).

Combining the above two displays, we get

infξ𝒦MV(ξ)lim infN1NlogN(𝒦M)lim supN1NlogN(𝒦M)(M+1).\displaystyle-\inf_{\xi\notin\mathscr{K}_{M^{\prime}}}V(\xi)\leq\liminf_{N\to\infty}\frac{1}{N}\log\wp^{N}(\sim\mathscr{K}_{M^{\prime}})\leq\limsup_{N\to\infty}\frac{1}{N}\log\wp^{N}(\sim\mathscr{K}_{M^{\prime}})\leq-(M+1).

That is, ξ𝒦M\xi\notin\mathscr{K}_{M^{\prime}} implies V(ξ)M+1>MV(\xi)\geq M+1>M. This shows (5.8). By Prohorov’s theorem, 𝒦M\mathscr{K}_{M} is a compact subset of 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}); hence (5.8) shows that Ξ(s)\Xi(s) is precompact for each s>0s>0.

We now show that Ξ(s)\Xi(s) is closed in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}). Let ξnΞ(s)\xi_{n}\in\Xi(s) for each n1n\geq 1 and let ξnξ\xi_{n}\to\xi in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) as nn\to\infty. By Fatou’s lemma, we have lim infnξn,ϑξ,ϑ\liminf_{n\to\infty}\langle\xi_{n},\vartheta\rangle\geq\langle\xi,\vartheta\rangle. Hence, by Remark 5.1, we have lim infnV(ξn)V(ξ)\liminf_{n\to\infty}V(\xi_{n})\geq V(\xi). Thus, ξΞ(s)\xi\in\Xi(s). This completes the proof of the lemma. ∎

6 The LDP upper bound

Recall 𝒦M\mathscr{K}_{M} defined in (2.1) and K(Δ)K(\Delta) defined in (2.2). For mm\in\mathbb{N}, define

𝒮m(Δ,M)={φD([0,m],1(𝒵)):φ(0)𝒦M,φ(n)K(Δ) for all n=1,2,,m}.\displaystyle\mathscr{S}_{m}(\Delta,M)=\{\varphi\in D([0,m],\mathcal{M}_{1}(\mathcal{Z})):\varphi(0)\in\mathscr{K}_{M},\varphi(n)\notin K(\Delta)\text{ for all }n=1,2,\ldots,m\}.

That is, $\mathscr{S}_{m}(\Delta,M)$ denotes the set of all trajectories that start in $\mathscr{K}_{M}$ and avoid $K(\Delta)$ at every integer time point $n=1,2,\ldots,m$. We begin with a lemma that asserts that the elements of $\mathscr{S}_{m}(\Delta,M)$ for large enough $m$ must have non-trivial cost. The key idea used in the proof comes from the compactness of level sets of the process-level large deviations rate function $S_{[0,T]}(\cdot|\nu)$, $\nu\in K$, for any compact subset $K$ of $\mathcal{M}_{1}(\mathcal{Z})$ (see Lemma 2.1).

Lemma 6.1.

For any s>0s>0, M>0M>0, and Δ>0\Delta>0, there exists m0m_{0}\in\mathbb{N} such that

inf{S[0,m0](φ|φ(0)),φ𝒮m0(Δ,M)}>s.\displaystyle\inf\{S_{[0,m_{0}]}(\varphi|\varphi(0)),\varphi\in\mathscr{S}_{m_{0}}(\Delta,M)\}>s. (6.1)
Proof.

Suppose not. Then there exist s>0s>0, M>0M>0, Δ>0\Delta>0, a sequence of positive numbers {εm,m1}\{\varepsilon_{m},m\geq 1\} such that εm0\varepsilon_{m}\to 0 as mm\to\infty, and a sequence of trajectories {φm,m1}\{\varphi_{m},m\geq 1\} such that φm𝒮m(Δ,M)\varphi_{m}\in\mathscr{S}_{m}(\Delta,M), and S[0,m](φm|φm(0))s+εmS_{[0,m]}(\varphi_{m}|\varphi_{m}(0))\leq s+\varepsilon_{m} for each m1m\geq 1.

Note that there exists an $M_{1}>0$ such that $\varphi_{m}(t)\in\mathscr{K}_{M_{1}}$ for each $t\in[0,m]$ and each $m\geq 1$. Indeed, by Lemma 5.2, there exists $C_{M}>0$ such that $\zeta\in\mathscr{K}_{M}$ implies $V(\zeta)\leq C_{M}$. Thus, for each $m\geq 1$, since $\varphi_{m}(0)\in\mathscr{K}_{M}$, there exist a $\bar{T}_{m}>0$ and a trajectory $\bar{\varphi}_{m}$ on $[0,\bar{T}_{m}]$ such that $\bar{\varphi}_{m}(0)=\xi^{*}$, $\bar{\varphi}_{m}(\bar{T}_{m})=\varphi_{m}(0)$, and $S_{[0,\bar{T}_{m}]}(\bar{\varphi}_{m}|\xi^{*})\leq C_{M}+1$. We extend this trajectory $\bar{\varphi}_{m}$ to $(\bar{T}_{m},\bar{T}_{m}+m]$ by defining $\bar{\varphi}_{m}(t)=\varphi_{m}(t-\bar{T}_{m})$ on $t\in(\bar{T}_{m},\bar{T}_{m}+m]$. Note that $S_{[0,\bar{T}_{m}+m]}(\bar{\varphi}_{m}|\xi^{*})\leq C_{M}+1+s+\varepsilon_{m}$, so that $V(\varphi_{m}(t))\leq C_{M}+1+s+\varepsilon_{m}$ for each $t\in[0,m]$ and each $m\geq 1$. Thus, we can find an $M_{1}>0$ such that (5.8) holds with $M$ replaced by $C_{M}+s+\sup_{m\geq 1}\varepsilon_{m}+2$ and $M^{\prime}$ replaced by $M_{1}$. It follows that $\varphi_{m}(t)\in\mathscr{K}_{M_{1}}$ for each $t\in[0,m]$ and each $m\geq 1$.

For the above choice of M1M_{1}, using assumption (B2), choose T1>1T_{1}>1 such that μζ(t)K(Δ/2)\mu_{\zeta}(t)\in K(\Delta/2) for each tT1t\geq T_{1} and each ζ𝒦M1\zeta\in\mathscr{K}_{M_{1}}, where μζ\mu_{\zeta} is the solution to the McKean-Vlasov equation (1.2) with initial condition ζ\zeta. Note that the closure of the set of all trajectories φ\varphi on [0,T1][0,T_{1}] in D([0,T1],1(𝒵))D([0,T_{1}],\mathcal{M}_{1}(\mathcal{Z})) with initial condition φ(0)𝒦M1\varphi(0)\in\mathscr{K}_{M_{1}} and φ(T1)K(Δ)\varphi(T_{1})\notin K(\Delta) does not contain any trajectory of the McKean-Vlasov equation (1.2). It follows from Lemma 2.1 that

βinf{S[0,T1](φ|φ(0)),φ(0)𝒦M1,φ(n)K(Δ) for each n=1,2,,T1}>0.\displaystyle\beta\coloneqq\inf\{S_{[0,T_{1}]}(\varphi|\varphi(0)),\varphi(0)\in\mathscr{K}_{M_{1}},\varphi(n)\notin K(\Delta)\text{ for each }n=1,2,\ldots,\lfloor T_{1}\rfloor\}>0.

Therefore, noting that φm(t)𝒦M1\varphi_{m}(t)\in\mathscr{K}_{M_{1}} for each t[0,m]t\in[0,m] and m1m\geq 1, we see that

S[0,m](φm|φm(0))\displaystyle S_{[0,m]}(\varphi_{m}|\varphi_{m}(0)) n=1m/T1S[(n1)T1,nT1](φm|φm((n1)T1))\displaystyle\geq\sum_{n=1}^{\lfloor m/T_{1}\rfloor}S_{[(n-1)T_{1},nT_{1}]}(\varphi_{m}|\varphi_{m}((n-1)T_{1}))
mT1β\displaystyle\geq\biggr{\lfloor}\frac{m}{T_{1}}\biggr{\rfloor}\beta
 as m,\displaystyle\to\infty\text{ as }m\to\infty,

which contradicts our assumption. This completes the proof of the lemma. ∎
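The divergence in the last display can be made quantitative. The following is a sketch, with $\bar{\varepsilon}$ a name introduced here for $\sup_{k\geq 1}\varepsilon_{k}$, which is finite because $\varepsilon_{k}\to 0$.

```latex
% Assumption: \bar{\varepsilon} := \sup_{k\ge 1}\varepsilon_k < \infty (hypothetical name).
\[
\Bigl\lfloor\frac{m}{T_{1}}\Bigr\rfloor\beta > s+\bar{\varepsilon} \geq s+\varepsilon_{m}
\quad\text{whenever}\quad
m \geq T_{1}\Bigl(\Bigl\lfloor\frac{s+\bar{\varepsilon}}{\beta}\Bigr\rfloor+1\Bigr),
\]
% since then \lfloor m/T_1 \rfloor \ge \lfloor (s+\bar{\varepsilon})/\beta \rfloor + 1 > (s+\bar{\varepsilon})/\beta.
```

For such $m$ the lower bound exceeds the assumed upper bound $s+\varepsilon_{m}$ on the cost of $\varphi_{m}$, which is the contradiction used above.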

With a slight abuse of notation, given A1(𝒵)A\subset\mathcal{M}_{1}(\mathcal{Z}), s>0s>0, and T>0T>0, define

ΦA[0,T](s){φD([0,T],1(𝒵)):φ(0)A,S[0,T](φ|φ(0))s}.\displaystyle\Phi_{A}^{[0,T]}(s)\coloneqq\{\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})):\varphi(0)\in A,S_{[0,T]}(\varphi|\varphi(0))\leq s\}.

We now prove a certain containment property for elements of 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) that can arise as end-points of trajectories in ΦK(Δ)[0,T](s)\Phi_{K(\Delta)}^{[0,T]}(s), s>0s>0 and Δ>0\Delta>0, i.e., points ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}) such that there exists a trajectory φ\varphi with φ0K(Δ)\varphi_{0}\in K(\Delta) and S[0,T](φ|φ0)sS_{[0,T]}(\varphi|\varphi_{0})\leq s. We prove that such points are not far from the lower level sets of VV in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}). This connection between trajectories over finite time horizons and the level sets of the quasipotential VV is the key to transfer the process-level LDP upper bound in Theorem 2.1 to the LDP upper bound for the family of invariant measures {N,N1}\{\wp^{N},N\geq 1\}.

Lemma 6.2.

For any $s>0$ and $\delta>0$ there exist $\Delta>0$ and $T_{1}\geq 1$ such that for all $T\geq T_{1}$,

{φ(T):φΦK(Δ)[0,T](s)}{ξ1(𝒵):d(ξ,Ξ(s))δ}.\displaystyle\{\varphi(T):\varphi\in\Phi_{K(\Delta)}^{[0,T]}(s)\}\subset\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):d(\xi,\Xi(s))\leq\delta\}. (6.2)
Proof.

Suppose not. Then there exist s>0s>0, δ>0\delta>0, sequences {Δn,n1}\{\Delta_{n},n\geq 1\}, {Tn,n1}\{T_{n},n\geq 1\} such that Δn0\Delta_{n}\downarrow 0 and TnT_{n}\uparrow\infty as nn\to\infty, and trajectories φnΦK(Δn)[0,Tn](s)\varphi_{n}\in\Phi_{K(\Delta_{n})}^{[0,T_{n}]}(s) such that d(φn(Tn),Ξ(s))>δd(\varphi_{n}(T_{n}),\Xi(s))>\delta for each n1n\geq 1. Let ξn=φn(Tn)\xi_{n}=\varphi_{n}(T_{n}), n1n\geq 1. By Lemma 5.3, there exists a T>0T^{\prime}>0 and a sequence {εn,n1}\{\varepsilon_{n},n\geq 1\}, with εn0\varepsilon_{n}\to 0 as nn\to\infty, such that for any ζK(Δn)\zeta^{\prime}\in K(\Delta_{n}) there exists a trajectory φ¯ζ\bar{\varphi}^{\zeta^{\prime}} on [0,T][0,T^{\prime}] such that φ¯ζ(0)=ξ,φ¯ζ(T)=ζ\bar{\varphi}^{\zeta^{\prime}}(0)=\xi^{*},\bar{\varphi}^{\zeta^{\prime}}(T^{\prime})=\zeta^{\prime}, and S[0,T](φ¯ζ|ξ)εnS_{[0,T^{\prime}]}(\bar{\varphi}^{\zeta^{\prime}}|\xi^{*})\leq\varepsilon_{n}. For each n1n\geq 1, let φ~n\tilde{\varphi}_{n} be the trajectory on [0,T+Tn][0,T^{\prime}+T_{n}] defined as follows. Let φ~n(0)=ξ\tilde{\varphi}_{n}(0)=\xi^{*}; φ~n(t)=φ¯φn(0)(t)\tilde{\varphi}_{n}(t)=\bar{\varphi}^{\varphi_{n}(0)}(t) on t[0,T]t\in[0,T^{\prime}]; φ~n(t)=φn(tT)\tilde{\varphi}_{n}(t)=\varphi_{n}(t-T^{\prime}) on t(T,T+Tn]t\in(T^{\prime},T^{\prime}+T_{n}]. In particular, φ~n(T+Tn)=ξn\tilde{\varphi}_{n}(T^{\prime}+T_{n})=\xi_{n}. Clearly, S[0,T+Tn](φ~n|ξ)s+εnS_{[0,T^{\prime}+T_{n}]}(\tilde{\varphi}_{n}|\xi^{*})\leq s+\varepsilon_{n}. It follows that V(ξn)s+εnV(\xi_{n})\leq s+\varepsilon_{n}. Using the compactness of the lower level sets of VV (see Lemma 5.4), we can find a convergent subsequence of {ξn,n1}\{\xi_{n},n\geq 1\}; after re-indexing and denoting this convergent subsequence by {ξn,n1}\{\xi_{n},n\geq 1\}, let ξnξ\xi_{n}\to\xi in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) as nn\to\infty. By assumption, d(ξn,Ξ(s))>δd(\xi_{n},\Xi(s))>\delta for each n1n\geq 1, and hence d(ξ,Ξ(s))δd(\xi,\Xi(s))\geq\delta. Using the lower semicontinuity of VV, we see that

V(ξ)lim infnV(ξn)lim infn(s+εn)=s.\displaystyle V(\xi)\leq\liminf_{n\to\infty}V(\xi_{n})\leq\liminf_{n\to\infty}(s+\varepsilon_{n})=s.

Hence ξΞ(s)\xi\in\Xi(s). This contradicts d(ξ,Ξ(s))δd(\xi,\Xi(s))\geq\delta, which is a consequence of our assumption. This proves the lemma. ∎

We are now ready to prove the LDP upper bound for the family $\{\wp^{N},N\geq 1\}$. The proof relies on the uniform LDP upper bound in Theorem 2.1, the exponential tightness of the family $\{\wp^{N},N\geq 1\}$, the containment property established in Lemma 6.2, an estimate on the probability that $\mu^{N}$ lies in $\mathscr{S}_{m}(\Delta,M)$ (which uses the process-level uniform LDP upper bound in Theorem 2.1 and the result of Lemma 6.1), and finally the strong Markov property of $\mu^{N}$.

Lemma 6.3.

For any γ>0\gamma>0, δ>0\delta>0, and s>0s>0, there exists N01N_{0}\geq 1 such that

N{ζ1(𝒵):d(ζ,Ξ(s))δ}exp{N(sγ)}\displaystyle\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}):d(\zeta,\Xi(s))\geq\delta\}\leq\exp\{-N(s-\gamma)\}

for all NN0N\geq N_{0}.

Proof.

Fix $\gamma>0$, $\delta>0$, and $s>0$. Choose $M>0$ and $N_{1}\geq 1$ such that $\wp^{N}(\sim\mathscr{K}_{M})\leq\exp\{-Ns\}$ for all $N\geq N_{1}$; this is possible by the exponential tightness of the family $\{\wp^{N},N\geq 1\}$, see Proposition 1.1. For the given $s>0$ and $\delta>0$, from Lemma 6.2, choose $\Delta>0$ and $T_{1}\geq 1$ such that (6.2) holds for all $T\geq T_{1}$. For the above choice of $\Delta>0$ and $M>0$, by Lemma 6.1, choose $m_{0}\in\mathbb{N}$ such that (6.1) holds. By (6.1) and the compactness of $\Phi^{[0,m_{0}]}_{\mathscr{K}_{M}}(s)$ in $D([0,m_{0}],\mathcal{M}_{1}(\mathcal{Z}))$ (which follows from Lemma 2.1), the closure of $\mathscr{S}_{m_{0}}(\Delta,M)$ does not intersect $\Phi^{[0,m_{0}]}_{\mathscr{K}_{M}}(s)$. It follows that there exists a $\delta_{0}>0$ such that $\varphi\in\mathscr{S}_{m_{0}}(\Delta,M)$ implies $\rho(\varphi,\Phi_{\mathscr{K}_{M}}^{[0,m_{0}]}(s))\geq\delta_{0}$. Hence by the uniform LDP upper bound in Theorem 2.1, there exists $N_{2}\geq N_{1}$ such that

ζN(μN𝒮m0(Δ,M))\displaystyle\mathbb{P}^{N}_{\zeta}(\mu^{N}\in\mathscr{S}_{m_{0}}(\Delta,M)) ζN(ρ(μN,Φ𝒦M[0,m0](s))δ0)\displaystyle\leq\mathbb{P}^{N}_{\zeta}(\rho(\mu^{N},\Phi_{\mathscr{K}_{M}}^{[0,m_{0}]}(s))\geq\delta_{0})
exp{N(sγ/2)}\displaystyle\leq\exp\{-N(s-\gamma/2)\} (6.3)

for all ζ𝒦M1N(𝒵)\zeta\in\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z}) and NN2N\geq N_{2}. Thus, with T=m0+T1T=m_{0}+T_{1} and NN2N\geq N_{2}, we have

N{ζ1(𝒵)\displaystyle\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}) :d(ζ,Ξ(s))δ}\displaystyle:d(\zeta,\Xi(s))\geq\delta\}
=1N(𝒵)ζN(d(μN(T),Ξ(s))δ)N(dζ)\displaystyle=\int_{\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(d(\mu^{N}(T),\Xi(s))\geq\delta)\wp^{N}(d\zeta)
exp{Ns}+𝒦M1N(𝒵)ζN(d(μN(T),Ξ(s))δ)N(dζ)\displaystyle\leq\exp\{-Ns\}+\int_{\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(d(\mu^{N}(T),\Xi(s))\geq\delta)\wp^{N}(d\zeta)
exp{Ns}+supζ𝒦M1N(𝒵)ζN(μN𝒮m0(Δ,M))\displaystyle\leq\exp\{-Ns\}+\sup_{\zeta\in\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(\mu^{N}\in\mathscr{S}_{m_{0}}(\Delta,M))
+𝒦M1N(𝒵)ζN(μN𝒮m0(Δ,M),d(μN(T),Ξ(s))δ)N(dζ)\displaystyle\qquad+\int_{\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(\mu^{N}\notin\mathscr{S}_{m_{0}}(\Delta,M),d(\mu^{N}(T),\Xi(s))\geq\delta)\wp^{N}(d\zeta)
exp{Ns}+exp{N(sγ/2)}\displaystyle\leq\exp\{-Ns\}+\exp\{-N(s-\gamma/2)\}
+𝒦M1N(𝒵)ζN(μN𝒮m0(Δ,M),d(μN(T),Ξ(s))δ)N(dζ);\displaystyle\qquad+\int_{\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta}(\mu^{N}\notin\mathscr{S}_{m_{0}}(\Delta,M),d(\mu^{N}(T),\Xi(s))\geq\delta)\wp^{N}(d\zeta); (6.4)

here the first equality follows since N\wp^{N} is invariant to time shifts, the first inequality follows from the choice of MM, and the third inequality follows from (6.3).

To bound the integrand in the third term above, let $T^{\prime}\geq T_{1}$ and $\zeta^{\prime}\in K(\Delta)$. Choose $0<\delta^{\prime}<\delta$ (depending on $T$ and $s$, and not on $\zeta^{\prime}$ and $T^{\prime}$) such that $\rho(\varphi_{1},\varphi_{2})<\delta^{\prime}/2$ implies $d(\varphi_{1}(T^{\prime}),\varphi_{2}(T^{\prime}))<\delta/2$ whenever $\varphi_{1}\in D([0,T^{\prime}],\mathcal{M}_{1}(\mathcal{Z}))$ and $\varphi_{2}\in\Phi^{[0,T^{\prime}]}_{\zeta^{\prime}}(s)$; the existence of such a $\delta^{\prime}$ can be justified via arguments similar to those used in the proof of Lemma 4.2 (see the paragraph before (4.2)). Note that if a trajectory $\varphi$ on $[0,T^{\prime}]$ with initial condition $\varphi(0)=\zeta^{\prime}$ is such that $\rho(\varphi,\Phi_{\zeta^{\prime}}^{[0,T^{\prime}]}(s))<\delta^{\prime}/2$, then there exists a trajectory $\varphi^{\prime}\in\Phi_{\zeta^{\prime}}^{[0,T^{\prime}]}(s)$ such that $\rho(\varphi,\varphi^{\prime})<\delta^{\prime}/2$. By the choice of $\delta^{\prime}$, we have $d(\varphi(T^{\prime}),\varphi^{\prime}(T^{\prime}))<\delta/2$. By Lemma 6.2, we find that $d(\varphi^{\prime}(T^{\prime}),\Xi(s))\leq\delta^{\prime}/2$. Hence by the triangle inequality $d(\varphi(T^{\prime}),\Xi(s))<\delta/2+\delta^{\prime}/2<\delta$. The contrapositive of the above statement is

d(φ(T),Ξ(s))δρ(φ,Φζ[0,T](s))δ/2.\displaystyle d(\varphi(T^{\prime}),\Xi(s))\geq\delta\Rightarrow\rho(\varphi,\Phi_{\zeta^{\prime}}^{[0,T^{\prime}]}(s))\geq\delta^{\prime}/2.

We therefore conclude that

ζN(d(μN(T),Ξ(s))δ)ζN(ρ(μN,Φζ[0,T](s))δ/2)\displaystyle\mathbb{P}^{N}_{\zeta^{\prime}}(d(\mu^{N}(T^{\prime}),\Xi(s))\geq\delta)\leq\mathbb{P}^{N}_{\zeta^{\prime}}(\rho(\mu^{N},\Phi_{\zeta^{\prime}}^{[0,T^{\prime}]}(s))\geq\delta^{\prime}/2) (6.5)

for all TT1T^{\prime}\geq T_{1}, ζ𝒦(Δ)1N(𝒵)\zeta^{\prime}\in\mathscr{K}(\Delta)\cap\mathcal{M}_{1}^{N}(\mathcal{Z}), and N1N\geq 1.

Note that the integrand in the last term of (6.4) can be upper bounded by

ζN\displaystyle\mathbb{P}^{N}_{\zeta} (μN𝒮m0(Δ,M),d(μN(T),Ξ(s))δ)\displaystyle(\mu^{N}\notin\mathscr{S}_{m_{0}}(\Delta,M),d(\mu^{N}(T),\Xi(s))\geq\delta)
=ζN(μN(m)K(Δ) for some m=1,2,,m0,d(μN(T),Ξ(s))δ)\displaystyle=\mathbb{P}^{N}_{\zeta}(\mu^{N}(m)\in K(\Delta)\text{ for some }m=1,2,\ldots,m_{0},\,d(\mu^{N}(T),\Xi(s))\geq\delta)
m=1m0supζK(Δ)1N(𝒵)ζN(d(μN(Tm),Ξ(s))δ)\displaystyle\leq\sum_{m=1}^{m_{0}}\sup_{\zeta^{\prime}\in K(\Delta)\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta^{\prime}}(d(\mu^{N}(T-m),\Xi(s))\geq\delta)
m=1m0supζK(Δ)1N(𝒵)ζN(ρ((μN(t),0tTm),Φζ[0,Tm](s))δ/2)\displaystyle\leq\sum_{m=1}^{m_{0}}\sup_{\zeta^{\prime}\in K(\Delta)\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\mathbb{P}^{N}_{\zeta^{\prime}}(\rho((\mu^{N}(t),0\leq t\leq T-m),\Phi_{\zeta^{\prime}}^{[0,T-m]}(s))\geq\delta^{\prime}/2) (6.6)

where the first inequality follows from the strong Markov property of $\mu^{N}$ and the second inequality follows from (6.5) by the choice of $T$. By the uniform LDP upper bound in Theorem 2.1, for each $m=1,2,\ldots,m_{0}$, there exists $N(m)\geq N_{2}$ such that

ζN(ρ((μN(t),0tTm),Φζ[0,Tm](s))δ/2)exp{N(sγ/2)}\displaystyle\mathbb{P}^{N}_{\zeta^{\prime}}(\rho((\mu^{N}(t),0\leq t\leq T-m),\Phi_{\zeta^{\prime}}^{[0,T-m]}(s))\geq\delta^{\prime}/2)\leq\exp\{-N(s-\gamma/2)\}

for all ζ𝒦(Δ)1N(𝒵)\zeta^{\prime}\in\mathscr{K}(\Delta)\cap\mathcal{M}_{1}^{N}(\mathcal{Z}) and NN(m)N\geq N(m). Put N3=max{N(m),m=1,2,,m0,N1,N2}N_{3}=\max\{N(m),m=1,2,\ldots,m_{0},N_{1},N_{2}\}. Then (6.6) yields

ζN\displaystyle\mathbb{P}^{N}_{\zeta} (μN𝒮m0(Δ,M),d(μN(T),Ξ(s))δ)m0exp{N(sγ/2)}\displaystyle(\mu^{N}\notin\mathscr{S}_{m_{0}}(\Delta,M),d(\mu^{N}(T),\Xi(s))\geq\delta)\leq m_{0}\exp\{-N(s-\gamma/2)\}

for all ζ𝒦M1N(𝒵)\zeta\in\mathscr{K}_{M}\cap\mathcal{M}_{1}^{N}(\mathcal{Z}) and NN3N\geq N_{3}. Substitution of this back in (6.4) yields

N{ζ1(𝒵)\displaystyle\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}) :d(ζ,Ξ(s))δ}exp{Ns}+(m0+1)exp{N(sγ/2)}\displaystyle:d(\zeta,\Xi(s))\geq\delta\}\leq\exp\{-Ns\}+(m_{0}+1)\exp\{-N(s-\gamma/2)\}

for all NN3N\geq N_{3}. Finally, choose N0N3N_{0}\geq N_{3} such that 1+(m0+1)exp{Nγ/2}exp{Nγ}1+(m_{0}+1)\exp\{N\gamma/2\}\leq\exp\{N\gamma\} for all NN0N\geq N_{0}. Then the above display becomes

N{ζ1(𝒵)\displaystyle\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}) :d(ζ,Ξ(s))δ}exp{N(sγ)}\displaystyle:d(\zeta,\Xi(s))\geq\delta\}\leq\exp\{-N(s-\gamma)\}

for all NN0N\geq N_{0}. This completes the proof of the lemma. ∎
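As a purely numerical illustration of the last step (it plays no role in the proof), the following Python sketch finds the smallest $N_{0}$ from which $1+(m_{0}+1)\exp\{N\gamma/2\}\leq\exp\{N\gamma\}$ holds. The function names and the sample values $m_{0}=3$, $\gamma=0.1$ are hypothetical choices made only for this example.

```python
import math

def absorbs(n, m0, gamma):
    """Check 1 + (m0 + 1) * exp(n * gamma / 2) <= exp(n * gamma)."""
    return 1.0 + (m0 + 1) * math.exp(n * gamma / 2) <= math.exp(n * gamma)

def smallest_threshold(m0, gamma):
    """Smallest integer n >= 1 such that the inequality holds for all N >= n.

    With x = exp(N * gamma / 2) the inequality reads x**2 - (m0 + 1) * x - 1 >= 0,
    which holds exactly when x is at least the positive root of the quadratic.
    """
    root = ((m0 + 1) + math.sqrt((m0 + 1) ** 2 + 4)) / 2
    n = max(1, math.ceil(2 * math.log(root) / gamma))
    # Guard against floating-point rounding at the boundary.
    while not absorbs(n, m0, gamma):
        n += 1
    while n > 1 and absorbs(n - 1, m0, gamma):
        n -= 1
    return n

if __name__ == "__main__":
    # Hypothetical sample values, not taken from the paper.
    print(smallest_threshold(m0=3, gamma=0.1))
```

The threshold grows like $(2/\gamma)\log(m_{0}+1)$, so the polynomial prefactor $m_{0}+1$ costs only an arbitrarily small loss $\gamma/2$ in the exponent.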

7 Proof of Theorem 1.1

We now complete the proof of Theorem 1.1.

  • (Compactness of level sets). For any s>0s>0, by Lemma 5.4, the set Ξ(s)={ξ1(𝒵):V(ξ)s}\Xi(s)=\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):V(\xi)\leq s\} is a compact subset of 1(𝒵)\mathcal{M}_{1}(\mathcal{Z});

  • (LDP lower bound). Given γ>0\gamma>0, δ>0\delta>0, and ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}), by Lemma 4.2, there exists N01N_{0}\geq 1 such that

    N{ζ1(𝒵):d(ζ,ξ)<δ}exp{N(V(ξ)+γ)}\displaystyle\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}):d(\zeta,\xi)<\delta\}\geq\exp\{-N(V(\xi)+\gamma)\}

    for all NN0N\geq N_{0};

  • (LDP upper bound). Given γ>0\gamma>0, δ>0\delta>0, and s>0s>0, by Lemma 6.3, there exists N01N_{0}\geq 1 such that

    N{ζ1(𝒵):d(ζ,Ξ(s))δ}exp{N(sγ)}\displaystyle\wp^{N}\{\zeta\in\mathcal{M}_{1}(\mathcal{Z}):d(\zeta,\Xi(s))\geq\delta\}\leq\exp\{-N(s-\gamma)\}

    for all NN0N\geq N_{0}.

This completes the proof of Theorem 1.1.

8 Two counterexamples

In this section, for the two non-interacting counterexamples described in Section 1.1, we prove that the quasipotential is not equal to the relative entropy with respect to the corresponding globally asymptotically stable equilibrium. These two counterexamples are (i) a system of non-interacting M/M/1 queues, and (ii) a system of non-interacting nodes in a wireless local area network (WLAN) with constant forward transition rates. We detail the proofs in the case of non-interacting M/M/1 queues. Similar arguments carry over to the non-interacting WLAN system with constant forward transition rates.

8.1 A system of non-interacting M/M/1 queues

Recall the system of non-interacting M/M/1 queues described in Section 1.1.1. Recall the relative entropy from (1.4) and the process-level large deviations rate function from (2.11). Also recall the function ϑ\vartheta defined in (1.6) and the compact sets 𝒦M\mathscr{K}_{M}, M>0M>0, defined in (2.1). Define the quasipotential

VQ(ξ)inf{S[0,T]Q(φ|ξQ),φ(0)=ξQ,φ(T)=ξ,T>0},ξ1(𝒵),\displaystyle V_{Q}(\xi)\coloneqq\inf\{S^{Q}_{[0,T]}(\varphi|\xi^{*}_{Q}),\varphi(0)=\xi^{*}_{Q},\varphi(T)=\xi,T>0\},\,\xi\in\mathcal{M}_{1}(\mathcal{Z}),

where SQS^{Q} is defined by (2.11) with \mathcal{E} replaced by Q\mathcal{E}_{Q} and LζL_{\zeta} replaced by LQL^{Q} for each ζ1(𝒵)\zeta\in\mathcal{M}_{1}(\mathcal{Z}).

We first prove that the quasipotential VQV_{Q} is not finite outside 𝒦\mathscr{K}. The key property used for this is the fact that the attractor ξQ\xi^{*}_{Q} has geometric decay. As a consequence ξQ,ϑ<\langle\xi^{*}_{Q},\vartheta\rangle<\infty. Using this property, we first show that if ξ𝒦\xi\notin\mathscr{K}, then the associated quasipotential evaluated at ξ\xi cannot be finite. This is shown by producing a lower bound for the cost of any trajectory starting at ξQ\xi^{*}_{Q} and ending at ξ𝒦\xi\notin\mathscr{K} from the rate function in (2.11).

Lemma 8.1.

If ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}) is such that ξ𝒦\xi\notin\mathscr{K}, then VQ(ξ)=V_{Q}(\xi)=\infty.

Proof.

Fix ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}). Let T>0T>0 and φD([0,T],1(𝒵))\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})) be such that φ0=ξQ\varphi_{0}=\xi^{*}_{Q} and φT=ξ\varphi_{T}=\xi. For each n1n\geq 1, define fnf_{n} by

fn(z)={z, if zn2nz, if n+1z2n,0, if z>2n,\displaystyle f_{n}(z)=\left\{\begin{array}[]{ll}z,\text{ if }z\leq n\\ 2n-z,\text{ if }n+1\leq z\leq 2n,\\ 0,\text{ if }z>2n,\end{array}\right.

and define f(z)=zf_{\infty}(z)=z for each z𝒵z\in\mathcal{Z}. Note that the use of fnf_{n} is to approximate ff_{\infty} using C0(𝒵)C_{0}(\mathcal{Z}) functions so that we can insert them into (2.11). We first assume that ξ,f=\langle\xi,f_{\infty}\rangle=\infty. In particular, ξ𝒦\xi\notin\mathscr{K}. Using the function fnf_{n} in place of ff in the RHS of (2.11), we have

\[
\begin{aligned}
S_{[0,T]}^{Q}(\varphi|\xi^{*}_{Q})&\geq\langle\varphi_{T},f_{n}\rangle-\langle\xi^{*}_{Q},f_{n}\rangle-\int_{[0,T]}\langle\varphi_{u},L^{Q}f_{n}\rangle\,du-\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}_{Q}}\tau(f_{n}(z^{\prime})-f_{n}(z))\lambda_{z,z^{\prime}}\varphi_{u}(z)\,du\\
&=\langle\varphi_{T},f_{n}\rangle-\langle\xi^{*}_{Q},f_{n}\rangle-\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}_{Q}}(\exp\{f_{n}(z^{\prime})-f_{n}(z)\}-1)\lambda_{z,z^{\prime}}\varphi_{u}(z)\,du,
\end{aligned}
\]

where λz,z+1=λf\lambda_{z,z+1}=\lambda_{f}, z𝒵z\in\mathcal{Z}, and λz,z1=λb\lambda_{z,z-1}=\lambda_{b}, z𝒵{0}z\in\mathcal{Z}\setminus\{0\}. Noting that fn(z)fn(z)f_{n}(z^{\prime})-f_{n}(z) is either 11, 0 or 1-1 for each (z,z)Q(z,z^{\prime})\in\mathcal{E}_{Q}, we have (z,z)Q(exp{fn(z)fn(z)}1)λz,zφu(z)2(e1)λb\sum_{(z,z^{\prime})\in\mathcal{E}_{Q}}(\exp\{f_{n}(z^{\prime})-f_{n}(z)\}-1)\lambda_{z,z^{\prime}}\varphi_{u}(z)\leq 2(e-1)\lambda_{b} for each u[0,T]u\in[0,T]. Hence the above becomes

S[0,T]Q(φ|ξQ)\displaystyle S_{[0,T]}^{Q}(\varphi|\xi^{*}_{Q}) φT,fnξQ,fn2(e1)λbT.\displaystyle\geq\langle\varphi_{T},f_{n}\rangle-\langle\xi^{*}_{Q},f_{n}\rangle-2(e-1)\lambda_{b}T.

Note that ξQ,f<\langle\xi^{*}_{Q},f_{\infty}\rangle<\infty. Hence, letting nn\to\infty and using the monotone convergence theorem, we conclude that S[0,T]Q(φ|ξQ)=S_{[0,T]}^{Q}(\varphi|\xi^{*}_{Q})=\infty.
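The two properties of $f_{n}$ used above, namely that its increments along nearest-neighbour edges lie in $\{-1,0,1\}$ and that $f_{n}$ increases pointwise to $f_{\infty}$, can be checked mechanically. The Python sketch below is only a sanity check under the assumption that $\mathcal{E}_{Q}$ consists of the edges $(z,z+1)$ and $(z,z-1)$; the helper names and the finitely supported test measure are hypothetical.

```python
def f(n, z):
    """Tent-shaped truncation f_n of the identity f_infinity(z) = z."""
    if z <= n:
        return z
    if z <= 2 * n:
        return 2 * n - z
    return 0

def pairing(xi, g):
    """<xi, g> for a finitely supported xi given as a dict z -> xi(z)."""
    return sum(mass * g(z) for z, mass in xi.items())

def increments_bounded(n, z_max):
    """True if f_n changes by -1, 0 or 1 along every edge (z, z + 1), z < z_max."""
    return all(f(n, z + 1) - f(n, z) in (-1, 0, 1) for z in range(z_max))

if __name__ == "__main__":
    # Hypothetical test measure with finite support (illustration only).
    xi = {0: 0.5, 3: 0.25, 10: 0.25}
    for n in (1, 2, 5, 10, 20):
        # <xi, f_n> is nondecreasing in n and reaches <xi, f_infinity> once n >= 10.
        print(n, increments_bounded(n, 100), pairing(xi, lambda z: f(n, z)))
```

For a measure with $\langle\xi,f_{\infty}\rangle=\infty$ the same pairings increase without bound, which is the content of the monotone convergence step.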

We now assume that ξ𝒦\xi\notin\mathscr{K} is such that ξ,f<\langle\xi,f_{\infty}\rangle<\infty. Let T>0T>0 and φD([0,T],1(𝒵))\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})) be such that φ0=ξQ\varphi_{0}=\xi^{*}_{Q} and φT=ξ\varphi_{T}=\xi. Without loss of generality, we can assume that supt[0,T]φt,f<\sup_{t\in[0,T]}\langle\varphi_{t},f_{\infty}\rangle<\infty; otherwise the argument in the above paragraph shows that S[0,T]Q(φ|ξQ)=S_{[0,T]}^{Q}(\varphi|\xi^{*}_{Q})=\infty. Define

ϑn(z)={ϑ(z), if zn,ϑ(2nz) if n+1z2n,0, if z>2n.\displaystyle\vartheta_{n}(z)=\left\{\begin{array}[]{ll}\vartheta(z),\text{ if }z\leq n,\\ \vartheta(2n-z)\text{ if }n+1\leq z\leq 2n,\\ 0,\text{ if }z>2n.\end{array}\right.

Using ϑn\vartheta_{n} in the RHS of (2.11), we get

S[0,T]Q(φ|ξQ)ξ,ϑnξQ,ϑn[0,T](z,z)Q(exp{ϑn(z)ϑn(z)}1)λz,zφu(z)du.\displaystyle S_{[0,T]}^{Q}(\varphi|\xi^{*}_{Q})\geq\langle\xi,\vartheta_{n}\rangle-\langle\xi^{*}_{Q},\vartheta_{n}\rangle-\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}_{Q}}(\exp\{\vartheta_{n}(z^{\prime})-\vartheta_{n}(z)\}-1)\lambda_{z,z^{\prime}}\varphi_{u}(z)du.

Noting that ϑn(z)ϑn(z)\vartheta_{n}(z^{\prime})-\vartheta_{n}(z) can be upper bounded by 1+log(z+1)1+\log(z+1) for each (z,z)Q(z,z^{\prime})\in\mathcal{E}_{Q}, it follows that (z,z)Q(exp{ϑn(z)ϑn(z)}1)λz,zφu(z)2λb(e(supt[0,T]φt,f+1)1)\sum_{(z,z^{\prime})\in\mathcal{E}_{Q}}(\exp\{\vartheta_{n}(z^{\prime})-\vartheta_{n}(z)\}-1)\lambda_{z,z^{\prime}}\varphi_{u}(z)\leq 2\lambda_{b}(e(\sup_{t\in[0,T]}\langle\varphi_{t},f_{\infty}\rangle+1)-1) for each u[0,T]u\in[0,T]. Hence the above display becomes

S[0,T]Q(φ|ξQ)ξ,ϑnξQ,ϑn2λb(e(supt[0,T]φt,f+1)1)T.\displaystyle S_{[0,T]}^{Q}(\varphi|\xi^{*}_{Q})\geq\langle\xi,\vartheta_{n}\rangle-\langle\xi^{*}_{Q},\vartheta_{n}\rangle-2\lambda_{b}(e(\sup_{t\in[0,T]}\langle\varphi_{t},f_{\infty}\rangle+1)-1)T.

As before, letting nn\to\infty, using the monotone convergence theorem, and noting that ξQ𝒦\xi^{*}_{Q}\in\mathscr{K}, we conclude that S[0,T]Q(φ|ξQ)=S_{[0,T]}^{Q}(\varphi|\xi^{*}_{Q})=\infty.

Since ξ𝒦\xi\notin\mathscr{K}, T>0T>0, and φD([0,T],1(𝒵))\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})) such that φ0=ξQ\varphi_{0}=\xi^{*}_{Q} and φT=ξ\varphi_{T}=\xi are arbitrary, the proof of the lemma is complete. ∎

We now prove the main result of this section, namely, the quasipotential VQV_{Q} is not equal to the relative entropy I(ξQ)I(\cdot\|\xi^{*}_{Q}).

Proposition 8.1.

Let $\xi\in\mathcal{M}_{1}(\mathcal{Z})$ be such that $\langle\xi,f_{\infty}\rangle<\infty$ and $\xi\notin\mathscr{K}$. Then $I(\xi\|\xi_{Q}^{*})<\infty$ and $V_{Q}(\xi)=\infty$. In particular, $V_{Q}\neq I(\cdot\|\xi_{Q}^{*})$.

Proof.

By the Donsker-Varadhan variational formula (see Donsker and Varadhan [14, Lemma 2.1]), for any ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}) and any bounded function ff on 𝒵\mathcal{Z}, we have

I(ξξQ)ξ,flog(z𝒵exp{f(z)}ξQ(z)).\displaystyle I(\xi\|\xi^{*}_{Q})\geq\langle\xi,f\rangle-\log\left(\sum_{z\in\mathcal{Z}}\exp\{f(z)\}\xi^{*}_{Q}(z)\right).

Recall the definition of fnf_{n} and ff_{\infty} from the proof of Lemma 8.1. Let β¯>0\bar{\beta}>0 be such that z𝒵exp{β¯z}ξQ(z)<\sum_{z\in\mathcal{Z}}\exp\{\bar{\beta}z\}\xi^{*}_{Q}(z)<\infty. Replacing ff by β¯fn\bar{\beta}f_{n} in the above display, letting nn\to\infty and using the monotone convergence theorem, we arrive at

β¯ξ,fI(ξξQ)+log(z𝒵exp{β¯z}ξQ(z)).\displaystyle\bar{\beta}\langle\xi,f_{\infty}\rangle\leq I(\xi\|\xi^{*}_{Q})+\log\left(\sum_{z\in\mathcal{Z}}\exp\{\bar{\beta}z\}\xi^{*}_{Q}(z)\right).

It follows that

{ξ1(𝒵):I(ξξQ)<}{ξ1(𝒵):ξ,f<}.\displaystyle\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):I(\xi\|\xi^{*}_{Q})<\infty\}\subset\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):\langle\xi,f_{\infty}\rangle<\infty\}.

On the other hand, since ξQ,f<\langle\xi_{Q}^{*},f_{\infty}\rangle<\infty, it is easy to check that {ξ1(𝒵):I(ξξQ)<}{ξ1(𝒵):ξ,f<}\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):I(\xi\|\xi^{*}_{Q})<\infty\}\supset\{\xi\in\mathcal{M}_{1}(\mathcal{Z}):\langle\xi,f_{\infty}\rangle<\infty\}.

Let ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}) be such that ξ,ϑ=\langle\xi,\vartheta\rangle=\infty and ξ,f<\langle\xi,f_{\infty}\rangle<\infty. Then the above yields I(ξξQ)<I(\xi\|\xi^{*}_{Q})<\infty. By Lemma 8.1, we see that VQ(ξ)=V_{Q}(\xi)=\infty. This completes the proof of the proposition. ∎
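To see that measures with $\langle\xi,f_{\infty}\rangle<\infty$ and $\langle\xi,\vartheta\rangle=\infty$ exist, one can take masses proportional to $1/(z^{2}\log^{2}z)$ for $z\geq 2$. The Python sketch below compares the contributions of dyadic blocks to the two sums. It assumes that the function $\vartheta$ of (1.6), which is not restated in this section, has the form $\vartheta(z)=z\log z$; the weights and all names are hypothetical and serve only as an illustration.

```python
import math

def weight(z):
    """Unnormalised mass w(z) = 1 / (z**2 * log(z)**2) for z >= 2 (hypothetical example)."""
    return 1.0 / (z * z * math.log(z) ** 2)

def identity(z):
    """f_infinity(z) = z."""
    return float(z)

def theta(z):
    """Assumed form of the function in (1.6): theta(z) = z * log(z)."""
    return z * math.log(z)

def block_sum(g, k):
    """Sum of w(z) * g(z) over the dyadic block 2**k <= z < 2**(k + 1)."""
    return sum(weight(z) * g(z) for z in range(2 ** k, 2 ** (k + 1)))

if __name__ == "__main__":
    for k in range(4, 15):
        mean_block = block_sum(identity, k)   # of order 1 / k**2: summable over k
        theta_block = block_sum(theta, k)     # of order 1 / k: not summable over k
        print(k, k * k * mean_block, k * theta_block)
```

Both rescaled columns stay bounded away from zero and infinity, so the block contributions to the mean decay like $1/k^{2}$ while those to $\langle\xi,\vartheta\rangle$ decay only like $1/k$. After normalisation, such a $\xi$ has finite mean but lies outside $\mathscr{K}$.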

8.2 A non-interacting WLAN system with constant forward rates

Recall the model described in Section 1.1.2. Define the quasipotential

VW(ξ)inf{S[0,T]W(φ|ξW),φ0=ξW,φT=ξ,T>0},ξ1(𝒵),\displaystyle V_{W}(\xi)\coloneqq\inf\{S^{W}_{[0,T]}(\varphi|\xi^{*}_{W}),\varphi_{0}=\xi^{*}_{W},\varphi_{T}=\xi,T>0\},\,\xi\in\mathcal{M}_{1}(\mathcal{Z}),

where SWS^{W} is defined by (2.11) with \mathcal{E} replaced by W\mathcal{E}_{W} and LζL_{\zeta} replaced by LWL^{W} for each ζ1(𝒵)\zeta\in\mathcal{M}_{1}(\mathcal{Z}). We now state the main result for this non-interacting wireless local area network.

Proposition 8.2.

Let $\xi\in\mathcal{M}_{1}(\mathcal{Z})$ be such that $\langle\xi,f_{\infty}\rangle<\infty$ and $\xi\notin\mathscr{K}$. Then $I(\xi\|\xi^{*}_{W})<\infty$ and $V_{W}(\xi)=\infty$. In particular, $V_{W}\neq I(\cdot\|\xi^{*}_{W})$.

We start with the following lemma. Its proof follows the lines of the proof of Lemma 8.1, using the fact that $\langle\xi^{*}_{W},\vartheta\rangle<\infty$, and is left to the reader.

Lemma 8.2.

If ξ1(𝒵)\xi\in\mathcal{M}_{1}(\mathcal{Z}) is such that ξ𝒦\xi\notin\mathscr{K}, then VW(ξ)=V_{W}(\xi)=\infty.

Using the above lemma, we can now prove Proposition 8.2 along the lines of the proof of Proposition 8.1 in the previous section.

Appendix A Proofs of Section 2

A.1 Proof of Lemma 2.1

Fix T>0T>0, s>0s>0, and K1(𝒵)K\subset\mathcal{M}_{1}(\mathcal{Z}) compact. Given νK\nu\in K, φΦν[0,T](s)\varphi\in\Phi_{\nu}^{[0,T]}(s) and a finite set B𝒵B\subset\mathcal{Z}, choosing f(t,z)=𝟏{zB}f(t,z)=\mathbf{1}_{\{z\in B\}} for all t[0,T]t\in[0,T], (2.9) yields

φt(B)φr(B)\displaystyle\varphi_{t}(B)-\varphi_{r}(B) =[r,t](z,z)(f(z)f(z))(1+hφ(u,z,z))λz,z(φu)φu(z)du\displaystyle=\int_{[r,t]}\sum_{(z,z^{\prime})\in\mathcal{E}}(f(z^{\prime})-f(z))(1+h_{\varphi}(u,z,z^{\prime}))\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du

for all 0r<tT0\leq r<t\leq T. Note that we may take hφ1h_{\varphi}\geq-1, else the rate function would be infinite as per (2.10) and the definition of τ\tau^{*} in (2.7). Therefore, we get

|φt(B)φr(B)|\displaystyle|\varphi_{t}(B)-\varphi_{r}(B)| [0,T](z,z)(1+hφ(u,z,z))×𝟏{u[r,t]}λz,z(φu)φu(z)du.\displaystyle\leq\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}(1+h_{\varphi}(u,z,z^{\prime}))\times\mathbf{1}_{\{u\in[r,t]\}}\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du. (A.1)

Noting that

sup{[0,T](z,z)τ(hφ(u,z,z))λz,z(φu)φu(z)du,φΦν[0,T](s),νK}s,\displaystyle\sup\left\{\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}\tau^{*}(h_{\varphi}(u,z,z^{\prime}))\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du,\varphi\in\Phi_{\nu}^{[0,T]}(s),\nu\in K\right\}\leq s,

it follows that the family {1+hφ,φΦν[0,T](s),νK}\{1+h_{\varphi},\varphi\in\Phi_{\nu}^{[0,T]}(s),\nu\in K\} is uniformly integrable. That is,

\[
\sup\left\{\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}(1+h_{\varphi}(u,z,z^{\prime}))\times\mathbf{1}_{\{1+h_{\varphi}\geq M\}}\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)\,du,\ \varphi\in\Phi_{\nu}^{[0,T]}(s),\ \nu\in K\right\}\to 0
\]

as MM\to\infty. Hence for any M>0M>0, using the boundedness of the transition rates (from assumption (A2)), (A.1) yields

\[
|\varphi_{t}(B)-\varphi_{r}(B)|\leq 2M\overline{\lambda}(t-r)+\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}(1+h_{\varphi}(u,z,z^{\prime}))\times\mathbf{1}_{\{1+h_{\varphi}\geq M\}}\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)\,du
\]

for all $0\leq r<t\leq T$ and all finite sets $B\subset\mathcal{Z}$. It follows that

supφνKΦν[0,T](s)supt,r:|tr|δd(φt,φr)\displaystyle\sup_{\varphi\in\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s)}\sup_{t,r:|t-r|\leq\delta}d(\varphi_{t},\varphi_{r})
2Mλ¯δ+supφνKΦν[0,T](s)supt,r:|tr|δ[0,T](z,z)(1+hφ(u,z,z))\displaystyle\qquad\leq 2M\overline{\lambda}\delta+\sup_{\varphi\in\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s)}\sup_{t,r:|t-r|\leq\delta}\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}(1+h_{\varphi}(u,z,z^{\prime}))
×𝟏{1+hφM}λz,z(φu)φu(z)du\displaystyle\qquad\qquad\times\mathbf{1}_{\{1+h_{\varphi}\geq M\}}\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du

Letting δ0\delta\to 0 first and then MM\to\infty, we arrive at

limδ0supφνKΦν[0,T](s)supt,r:|tr|δd(φt,φr)=0.\displaystyle\lim_{\delta\downarrow 0}\sup_{\varphi\in\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s)}\sup_{t,r:|t-r|\leq\delta}d(\varphi_{t},\varphi_{r})=0.

Hence it follows that νKΦν[0,T](s)\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s) is precompact in D([0,T],1(𝒵))D([0,T],\mathcal{M}_{1}(\mathcal{Z})) (see, for example, Billingsley [4, Theorem 12.3]).

To show that $\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s)$ is closed, let $\{\varphi_{n},n\geq 1\}\subset\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s)$ and suppose that $\varphi_{n}\to\bar{\varphi}$ in $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$. Note that, for any $f\in C_{0}^{1}([0,T]\times\mathcal{Z})$, the mapping

φ\displaystyle\varphi {φT,fTφ0,f0[0,T]φu,ufudu\displaystyle\mapsto\Biggr{\{}\langle\varphi_{T},f_{T}\rangle-\langle\varphi_{0},f_{0}\rangle-\int_{[0,T]}\langle\varphi_{u},\partial_{u}f_{u}\rangle du
[0,T]φu,Lφufudu[0,T](z,z)τ(fu(z)fu(z))λz,z(φu)φu(z)du}\displaystyle\qquad-\int_{[0,T]}\langle\varphi_{u},L_{\varphi_{u}}f_{u}\rangle du-\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}\tau(f_{u}(z^{\prime})-f_{u}(z))\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du\Biggr{\}}

is continuous on D([0,T],1(𝒵))D([0,T],\mathcal{M}_{1}(\mathcal{Z})), and hence, the mapping

φ\displaystyle\varphi supfC01([0,T]×𝒵){φT,fTφ0,f0[0,T]φu,ufudu\displaystyle\mapsto\sup_{f\in C_{0}^{1}([0,T]\times\mathcal{Z})}\Biggr{\{}\langle\varphi_{T},f_{T}\rangle-\langle\varphi_{0},f_{0}\rangle-\int_{[0,T]}\langle\varphi_{u},\partial_{u}f_{u}\rangle du
[0,T]φu,Lφufudu[0,T](z,z)τ(fu(z)fu(z))λz,z(φu)φu(z)du}\displaystyle\qquad-\int_{[0,T]}\langle\varphi_{u},L_{\varphi_{u}}f_{u}\rangle du-\int_{[0,T]}\sum_{(z,z^{\prime})\in\mathcal{E}}\tau(f_{u}(z^{\prime})-f_{u}(z))\lambda_{z,z^{\prime}}(\varphi_{u})\varphi_{u}(z)du\Biggr{\}}

is lower semicontinuous on D([0,T],1(𝒵))D([0,T],\mathcal{M}_{1}(\mathcal{Z})) (see, for example, Berge [1, Theorem 1, page 115]). Hence, it follows that

S[0,T](φ¯|φ¯(0))lim infnS[0,T](φn|φn(0))s,\displaystyle S_{[0,T]}(\bar{\varphi}|\bar{\varphi}(0))\leq\liminf_{n\to\infty}S_{[0,T]}(\varphi_{n}|\varphi_{n}(0))\leq s,

and it follows that νKΦν[0,T](s)\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s) is closed. Consequently, νKΦν[0,T](s)\cup_{\nu\in K}\Phi_{\nu}^{[0,T]}(s) is a compact subset of D([0,T],1(𝒵))D([0,T],\mathcal{M}_{1}(\mathcal{Z})). ∎

A.2 Proof of Theorem 2.1

In this section, we prove Theorem 2.1. In the case of a finite state space (i.e., when $\mathcal{Z}$ is a finite set), the LDP for the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$, whenever $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$, was proved in [7, Theorem 3.1] under suitable assumptions. The main assumption required in the proof of [7, Theorem 3.1] was the boundedness of the “total outgoing jump rate” across all the states, which also holds in our countable state space case under Assumptions (A1)–(A3). So, to prove the LDP for the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$, whenever $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$, one can go through the steps in [7, Section 5] verbatim; we reproduce the important steps here for the sake of completeness. Once this LDP is proved, we then show the uniform LDP over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$ using [8, Propositions 1.12 and 1.14].

A.2.1 LDP for {μνNN,N1}\{\mu_{\nu_{N}}^{N},N\geq 1\} when νNν\nu_{N}\to\nu in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z})

We first introduce some notation. Let {(XnN(t),t[0,T]),1nN}\{(X^{N}_{n}(t),t\in[0,T]),1\leq n\leq N\} denote the joint evolution of the states of all the particles. This is a Markov process on 𝒵N\mathcal{Z}^{N} with the infinitesimal generator acting on functions ff on 𝒵N\mathcal{Z}^{N} given by

(z1,,zN)n=1Nzn{zn+1,0}(f(z1,,zn,,zN)f(z1,,zN))λzn,zn(emp(z1,,zN)),\displaystyle(z_{1},\ldots,z_{N})\mapsto\sum_{n=1}^{N}\sum_{z_{n}^{\prime}\in\{z_{n}+1,0\}}(f(z_{1},\ldots,z_{n}^{\prime},\ldots,z_{N})-f(z_{1},\ldots,z_{N}))\lambda_{z_{n},z_{n}^{\prime}}(\text{emp}(z_{1},\ldots,z_{N})),

where emp(z1,,zN):=1Nn=1Nδzn1N(𝒵)\text{emp}(z_{1},\ldots,z_{N}):=\frac{1}{N}\sum_{n=1}^{N}\delta_{z_{n}}\in\mathcal{M}_{1}^{N}(\mathcal{Z}). Define the empirical measure

ΘN:=1Nn=1NδXnN();\displaystyle\Theta^{N}:=\frac{1}{N}\sum_{n=1}^{N}\delta_{X^{N}_{n}(\cdot)};

ΘN\Theta^{N} is a 1(D([0,T],𝒵))\mathcal{M}_{1}(D([0,T],\mathcal{Z}))-valued random variable. Let σ:1(D([0,T],𝒵))D([0,T],1(𝒵))\sigma:\mathcal{M}_{1}(D([0,T],\mathcal{Z}))\to D([0,T],\mathcal{M}_{1}(\mathcal{Z})) denote the canonical projection map. Note that μN(t)=σt(ΘN)\mu^{N}(t)=\sigma_{t}(\Theta^{N}), t[0,T]t\in[0,T]. Similarly, let {(X¯nN(t),t[0,T]),1nN}\{(\bar{X}^{N}_{n}(t),t\in[0,T]),1\leq n\leq N\} denote the evolution of the independent particles, where each particle executes a Markov process with the infinitesimal generator L¯\bar{L} defined in (3.1). Define the corresponding empirical measure Θ¯N\bar{\Theta}^{N} by

Θ¯N:=1Nn=1NδX¯nN().\displaystyle\bar{\Theta}^{N}:=\frac{1}{N}\sum_{n=1}^{N}\delta_{\bar{X}^{N}_{n}(\cdot)}.

Let 𝒫νNN\mathcal{P}^{N}_{\nu_{N}} (resp. 𝒫¯νNN\bar{\mathcal{P}}^{N}_{\nu_{N}}) denote the law of ΘN\Theta^{N} (resp. Θ¯N\bar{\Theta}^{N}) with initial condition νN1N(𝒵)\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z}) (i.e., 1Nn=1NδXnN(0)=νN\frac{1}{N}\sum_{n=1}^{N}\delta_{X^{N}_{n}(0)}=\nu_{N}). These are probability measures on 1(D([0,T],𝒵))\mathcal{M}_{1}(D([0,T],\mathcal{Z})), i.e., 𝒫νNN,𝒫¯νNN1(1(D([0,T],𝒵)))\mathcal{P}^{N}_{\nu_{N}},\bar{\mathcal{P}}^{N}_{\nu_{N}}\in\mathcal{M}_{1}(\mathcal{M}_{1}(D([0,T],\mathcal{Z}))).
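To make the objects $\text{emp}$ and $\sigma_{t}$ concrete, here is a small Python sketch. It assumes a hypothetical representation of a càdlàg path as a pair (jump times, successive states); this representation and all function names are choices made for the example and are not notation from the paper.

```python
from bisect import bisect_right
from collections import Counter

def emp(states):
    """Empirical measure of a tuple of particle states, as a dict z -> mass."""
    n = len(states)
    return {z: count / n for z, count in Counter(states).items()}

def state_at(path, t):
    """Value at time t of a right-continuous step path.

    path = (jump_times, states) with len(states) == len(jump_times) + 1:
    states[0] holds before the first jump and states[k] from jump_times[k - 1] on.
    """
    jump_times, states = path
    return states[bisect_right(jump_times, t)]

def sigma_t(paths, t):
    """Time-t marginal of the empirical measure on path space."""
    return emp([state_at(path, t) for path in paths])

if __name__ == "__main__":
    # Three hypothetical particle paths on [0, 1].
    paths = [([0.5], [0, 1]), ([], [2]), ([0.2, 0.7], [1, 2, 0])]
    for t in (0.0, 0.5, 1.0):
        print(t, sigma_t(paths, t))
```

Using `bisect_right` makes each path take its post-jump value at a jump time, which matches the right-continuity of elements of $D([0,T],\mathcal{Z})$.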

Note that 𝒫νNN𝒫¯νNN\mathcal{P}^{N}_{\nu_{N}}\ll\bar{\mathcal{P}}^{N}_{\nu_{N}}. For xD([0,T],𝒵)x\in D([0,T],\mathcal{Z}) and μD([0,T],1(𝒵))\mu\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})), define

\[
h(x;\mu):=\sum_{t\in[0,T]}\mathbf{1}_{\{x(t)\neq x(t-)\}}\log\left(\frac{\lambda_{x(t-),x(t)}(\mu_{t})}{\widetilde{\lambda}_{x(t-),x(t)}}\right)-\int_{[0,T]}\sum_{z^{\prime}\in\mathcal{Z}:\,(x(t-),z^{\prime})\in\mathcal{E}}\left(\lambda_{x(t-),z^{\prime}}(\mu_{t})-\widetilde{\lambda}_{x(t-),z^{\prime}}\right)dt,
\]

where {λ~z,z,(z,z)}\{\widetilde{\lambda}_{z,z^{\prime}},(z,z^{\prime})\in\mathcal{E}\} are the non-interacting rates defined by

λ~z,z:={λ¯/(z+1) if z=z+1,λ¯ if z=0,z1.\displaystyle\widetilde{\lambda}_{z,z^{\prime}}:=\begin{cases}\overline{\lambda}/(z+1)&\text{ if }z^{\prime}=z+1,\\ \underline{\lambda}&\text{ if }z^{\prime}=0,z\geq 1.\end{cases}

Also, define

h(Q):=D([0,T],𝒵)h(;σ(Q))𝑑Q,Q1(D([0,T],𝒵)).\displaystyle h(Q):=\int_{D([0,T],\mathcal{Z})}h(\cdot;\sigma(Q))\,dQ,\quad Q\in\mathcal{M}_{1}(D([0,T],\mathcal{Z})). (A.2)

Using Girsanov’s theorem, it is straightforward to check that

d𝒫νNNd𝒫¯νNN(Q)=exp{Nh(Q)},Q1(D([0,T],𝒵)).\displaystyle\frac{d\mathcal{P}^{N}_{\nu_{N}}}{d\bar{\mathcal{P}}^{N}_{\nu_{N}}}(Q)=\exp\{Nh(Q)\},\quad Q\in\mathcal{M}_{1}(D([0,T],\mathcal{Z})).

We now introduce some notation related to path spaces. Define ψ:D([0,T],𝒵){0,1,}\psi:D([0,T],\mathcal{Z})\to\{0,1,\ldots\} by

ψ(x)=t[0,T]𝟏{x(t)x(t)};\displaystyle\psi(x)=\sum_{t\in[0,T]}\mathbf{1}_{\{x(t)\neq x(t-)\}};

ψ(x)\psi(x) is the number of discontinuities in xx. Since 𝒵\mathcal{Z} is a countable set, it follows that ψ(x)<\psi(x)<\infty for all xD([0,T],𝒵)x\in D([0,T],\mathcal{Z}) ([4, Chapter 3, Lemma 1]). Define

𝒳:={xD([0,T],𝒵):ψ(x)<,(x(t),x(t)) whenever x(t)x(t),t[0,T]},\displaystyle\mathcal{X}:=\{x\in D([0,T],\mathcal{Z}):\psi(x)<\infty,(x(t-),x(t))\in\mathcal{E}\text{ whenever }x(t)\neq x(t-),\,t\in[0,T]\},

and equip 𝒳\mathcal{X} with the subspace topology. Since 𝒵\mathcal{Z} is countable, we have that ψ\psi is continuous on 𝒳\mathcal{X}. Define

fψ:=supx𝒳|f(x)|1+ψ(x), for f:𝒳.\displaystyle\|f\|_{\psi}:=\sup_{x\in\mathcal{X}}\frac{|f(x)|}{1+\psi(x)},\,\text{ for }f:\mathcal{X}\to\mathbb{R}.

Then, define

𝒞ψ(𝒳):={f:𝒳 such that f is continuous and fψ<},\displaystyle\mathcal{C}_{\psi}(\mathcal{X}):=\{f:\mathcal{X}\to\mathbb{R}\text{ such that }f\text{ is continuous and }\|f\|_{\psi}<\infty\},

and

1,ψ(𝒳):={Q1(𝒳):𝒳ψ𝑑Q<}.\displaystyle\mathcal{M}_{1,\psi}(\mathcal{X}):=\left\{Q\in\mathcal{M}_{1}(\mathcal{X}):\int_{\mathcal{X}}\psi\,dQ<\infty\right\}.

1,ψ(𝒳)\mathcal{M}_{1,\psi}(\mathcal{X}) is a subset of 𝒞ψ(𝒳)\mathcal{C}_{\psi}(\mathcal{X})^{*}, the algebraic dual of 𝒞ψ(𝒳)\mathcal{C}_{\psi}(\mathcal{X}), and we equip it with the weak* topology. This is the coarsest topology on 1,ψ(𝒳)\mathcal{M}_{1,\psi}(\mathcal{X}) where we say QNQQ_{N}\to Q in 1,ψ(𝒳)\mathcal{M}_{1,\psi}(\mathcal{X}) as NN\to\infty if and only if

𝒳f𝑑QN𝒳f𝑑Q as N, for all f𝒞ψ(𝒳).\displaystyle\int_{\mathcal{X}}f\,dQ_{N}\to\int_{\mathcal{X}}f\,dQ\,\,\text{ as }N\to\infty,\quad\text{ for all }f\in\mathcal{C}_{\psi}(\mathcal{X}).

Recall P¯z\bar{P}_{z}, z𝒵z\in\mathcal{Z}, from Section 3. For each ν1(𝒵)\nu\in\mathcal{M}_{1}(\mathcal{Z}), define J:1(𝒳)[0,]J:\mathcal{M}_{1}(\mathcal{X})\to[0,\infty] by

J(Q):=supf𝒞ψ(𝒳)[𝒳f𝑑Qz𝒵ν(z)log𝒳exp{f}𝑑P¯z].\displaystyle J(Q):=\sup_{f\in\mathcal{C}_{\psi}(\mathcal{X})}\left[\int_{\mathcal{X}}f\,dQ-\sum_{z\in\mathcal{Z}}\nu(z)\log\int_{\mathcal{X}}\exp\{f\}\,d\bar{P}_{z}\right]. (A.3)

By [7, Lemma 5.3], we also have

J(Q)=supf𝒞b(𝒳)[𝒳f𝑑Qz𝒵ν(z)log𝒳exp{f}𝑑P¯z],\displaystyle J(Q)=\sup_{f\in\mathcal{C}_{b}(\mathcal{X})}\left[\int_{\mathcal{X}}f\,dQ-\sum_{z\in\mathcal{Z}}\nu(z)\log\int_{\mathcal{X}}\exp\{f\}\,d\bar{P}_{z}\right], (A.4)

where 𝒞b(𝒳)\mathcal{C}_{b}(\mathcal{X}) is the space of bounded and continuous functions on 𝒳\mathcal{X} equipped with the supremum norm.

We first state a lemma for the LDP for the family {𝒫¯νNN,N1}\{\bar{\mathcal{P}}^{N}_{\nu_{N}},N\geq 1\} on 1,ψ(𝒳)\mathcal{M}_{1,\psi}(\mathcal{X}) whenever νNν\nu_{N}\to\nu in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) as NN\to\infty. Its proof follows verbatim from [7, Lemma 5.1].

Lemma A.1 (LDP for the non-interacting system; [7, Lemma 5.1]).

Let νNν\nu_{N}\to\nu in 1(𝒵)\mathcal{M}_{1}(\mathcal{Z}) as NN\to\infty. Then the family {𝒫¯νNN,N1}\{\bar{\mathcal{P}}^{N}_{\nu_{N}},N\geq 1\} satisfies the LDP on 1,ψ(𝒳)\mathcal{M}_{1,\psi}(\mathcal{X}) with rate function JJ defined in (A.3).

Next, we provide two necessary conditions for the finiteness of JJ defined in (A.3).

Lemma A.2 (Finiteness of JJ; [7, Lemma 5.2]).

If J(Q)<J(Q)<\infty, then we have Q1,ψ(𝒳)Q\in\mathcal{M}_{1,\psi}(\mathcal{X}) and Qσ01=νQ\circ\sigma^{-1}_{0}=\nu.

Proof.

Let QQ be such that J(Q)<J(Q)<\infty. The proof of Qσ01=νQ\circ\sigma^{-1}_{0}=\nu follows verbatim from [7, Lemma 5.2]. For the first assertion, since ψ𝒞ψ(𝒳)\psi\in\mathcal{C}_{\psi}(\mathcal{X}), from the definition of JJ in (A.3), we have

J(Q)𝒳ψ𝑑Qz𝒵ν(z)log𝒳exp{ψ}𝑑P¯z.\displaystyle J(Q)\geq\int_{\mathcal{X}}\psi\,dQ-\sum_{z\in\mathcal{Z}}\nu(z)\log\int_{\mathcal{X}}\exp\{\psi\}\,d\bar{P}_{z}. (A.5)

Note that, for each z𝒵z\in\mathcal{Z}, under P¯z\bar{P}_{z} (see (3.1)), the number of jumps on [0,T][0,T] is stochastically dominated by a Poisson random variable with parameter (λ¯+λ¯)T2λ¯T(\underline{\lambda}+\overline{\lambda})T\leq 2\overline{\lambda}T. Therefore,

𝒳exp{ψ}𝑑P¯zk0exp{k}exp{2λ¯T}(2λ¯T)kk!=c1<,\displaystyle\int_{\mathcal{X}}\exp\{\psi\}\,d\bar{P}_{z}\leq\sum_{k\geq 0}\exp\{k\}\frac{\exp\{-2\overline{\lambda}T\}(2\overline{\lambda}T)^{k}}{k!}=c_{1}<\infty,

where c1c_{1} is some constant independent of zz. Therefore,

z𝒵ν(z)log𝒳exp{ψ}𝑑P¯z<.\displaystyle\sum_{z\in\mathcal{Z}}\nu(z)\log\int_{\mathcal{X}}\exp\{\psi\}d\bar{P}_{z}<\infty.

Hence, from (A.5), using J(Q)<J(Q)<\infty, we conclude that

𝒳ψ𝑑Q<.\displaystyle\int_{\mathcal{X}}\psi\,dQ<\infty.

It follows that Q1,ψ(𝒳)Q\in\mathcal{M}_{1,\psi}(\mathcal{X}). ∎
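The constant $c_{1}$ is the exponential moment of a Poisson random variable and has the closed form $\exp\{2\overline{\lambda}T(e-1)\}$. The Python sketch below compares a truncated version of the series with this closed form; the sample values of $\overline{\lambda}$ and $T$ are hypothetical.

```python
import math

def poisson_exp_moment_series(rate, terms=200):
    """Partial sum of sum_k exp(k) * exp(-rate) * rate**k / k!."""
    total = 0.0
    term = math.exp(-rate)                 # k = 0 term
    for k in range(terms):
        total += term
        term *= math.e * rate / (k + 1)    # ratio of consecutive terms
    return total

def poisson_exp_moment_closed_form(rate):
    """E[exp(X)] for X ~ Poisson(rate), namely exp(rate * (e - 1))."""
    return math.exp(rate * (math.e - 1.0))

if __name__ == "__main__":
    lam_bar, horizon = 1.5, 2.0            # hypothetical values of the rate bound and T
    rate = 2.0 * lam_bar * horizon
    print(poisson_exp_moment_series(rate), poisson_exp_moment_closed_form(rate))
```

Since the bound depends on $z$ only through the uniform rate bound, the same constant works for every initial state, which is what the proof uses.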

The next lemma is required to prove the continuity of hh on 1,ψ(𝒳)\mathcal{M}_{1,\psi}(\mathcal{X}).

Lemma A.3 (see [7, Lemma 5.7] for the finite state space case).

Suppose that Q1(𝒳)Q\in\mathcal{M}_{1}(\mathcal{X}) is such that J(Q)<J(Q)<\infty. Then,

limα0supt[0,T]𝒳supu[tα,t+α][0,T]𝟏{X(u)X(u)}dQ(X)=0.\displaystyle\lim_{\alpha\to 0}\sup_{t\in[0,T]}\int_{\mathcal{X}}\sup_{u\in[t-\alpha,t+\alpha]\cap[0,T]}\mathbf{1}_{\{X(u)\neq X(u-)\}}\,dQ(X)=0.
Proof.

Let P1(𝒳)P\in\mathcal{M}_{1}(\mathcal{X}) denote the mixture distribution defined by dP:=z𝒵ν(z)dP¯zdP:=\sum_{z\in\mathcal{Z}}\nu(z)d\bar{P}_{z}. Since J(Q)<J(Q)<\infty, it follows that QPQ\ll P. Indeed, using Jensen’s inequality, we have,

\[
\sum_{z\in\mathcal{Z}}\nu(z)\log\int_{\mathcal{X}}\exp\{f\}\,d\bar{P}_{z}\leq\log\int_{\mathcal{X}}\exp\{f\}\,dP\quad\text{ for any }f\in\mathcal{C}_{b}(\mathcal{X}),
\]

and hence, from (A.4) and the Donsker-Varadhan variational formula for I(QP)I(Q\|P), we conclude that

I(QP)J(Q).\displaystyle I(Q\|P)\leq J(Q). (A.6)

Since J(Q)<J(Q)<\infty, the above implies that I(QP)<I(Q\|P)<\infty. This shows QPQ\ll P. Hence, with Kt,α={x𝒳:x(u)x(u) for some u[tα,t+α][0,T]}K_{t,\alpha}=\{x\in\mathcal{X}:x(u)\neq x(u-)\text{ for some }u\in[t-\alpha,t+\alpha]\cap[0,T]\}, we have

𝒳supu[tα,t+α][0,T]𝟏{X(u)X(u)}dQ(X)\displaystyle\int_{\mathcal{X}}\sup_{u\in[t-\alpha,t+\alpha]\cap[0,T]}\mathbf{1}_{\{X(u)\neq X(u-)\}}\,dQ(X) =Q(Kt,α)\displaystyle=Q(K_{t,\alpha})
=𝒳(dQdP)𝟏{Kt,α}𝑑P\displaystyle=\int_{\mathcal{X}}\left(\frac{dQ}{dP}\right)\mathbf{1}_{\{K_{t,\alpha}\}}\,dP
(dQdP)τ,P𝟏{Kt,α}τ,P,\displaystyle\leq\left\|\left(\frac{dQ}{dP}\right)\right\|_{\tau^{*},P}\|\mathbf{1}_{\{K_{t,\alpha}\}}\|_{\tau,P}, (A.7)

where the last inequality follows from Hölder's inequality in Orlicz spaces. Here, $\|\cdot\|_{\tau,P}$ is the Orlicz norm defined by

fτ,P:=inf{a>0:𝒳τ(|f(x)|a)𝑑P(x)1}.\displaystyle\|f\|_{\tau,P}:=\inf\left\{a>0:\int_{\mathcal{X}}\tau\left(\frac{|f(x)|}{a}\right)\,dP(x)\leq 1\right\}.

Similarly, fτ,P\|f\|_{\tau^{*},P} is defined as above with τ\tau replaced by τ\tau^{*}.

Consider (dQdP)τ,P\left\|\left(\frac{dQ}{dP}\right)\right\|_{\tau^{*},P}. Note that, there exists a u01u_{0}\geq 1 such that τ(u)2ulogu\tau^{*}(u)\leq 2u\log u for all uu0u\geq u_{0}. Therefore,

𝒳τ(dQdP)𝑑P\displaystyle\int_{\mathcal{X}}\tau^{*}\left(\frac{dQ}{dP}\right)\,dP τ(u0)+2𝒳(dQdP)log(dQdP)𝑑P\displaystyle\leq\tau^{*}(u_{0})+2\int_{\mathcal{X}}\left(\frac{dQ}{dP}\right)\log\left(\frac{dQ}{dP}\right)\,dP
=τ(u0)+2I(QP)\displaystyle=\tau^{*}(u_{0})+2I(Q\|P)
τ(u0)+2J(Q)\displaystyle\leq\tau^{*}(u_{0})+2J(Q)
<,\displaystyle<\infty,

where the second inequality follows from (A.6) and the third inequality follows from the assumption that $J(Q)<\infty$. Since $\tau^{*}(u/a)\leq\tau^{*}(u)/a$ for $a\geq 1$ (by Jensen's inequality), this shows

(dQdP)τ,P<c2< for some c2 that does not depend on t.\displaystyle\left\|\left(\frac{dQ}{dP}\right)\right\|_{\tau^{*},P}<c_{2}<\infty\text{ for some }c_{2}\text{ that does not depend on }t. (A.8)

Next, consider $\|\mathbf{1}_{\{K_{t,\alpha}\}}\|_{\tau,P}$. Note that, under $P$, the number of jumps in $[t-\alpha,t+\alpha]\cap[0,T]$ is stochastically dominated by a Poisson random variable with parameter $2\alpha(\overline{\lambda}+\underline{\lambda})\leq 4\alpha\overline{\lambda}$. Therefore, $P(K_{t,\alpha})\leq 1-\exp\{-4\alpha\overline{\lambda}\}\leq 4\alpha\overline{\lambda}$. Since $\tau(\mathbf{1}_{\{K_{t,\alpha}\}}/a)=\tau(1/a)\mathbf{1}_{\{K_{t,\alpha}\}}$ for any $a>0$, we have

𝒳τ(𝟏{Kt,α}/a)𝑑P=τ(1/a)P(Kt,α)τ(1/a)4αλ¯.\displaystyle\int_{\mathcal{X}}\tau\left(\mathbf{1}_{\{K_{t,\alpha}\}}/a\right)\,dP=\tau(1/a)P(K_{t,\alpha})\leq\tau(1/a)4\alpha\overline{\lambda}.

Therefore, if we choose a=1/(τ1(1/4αλ¯))a=1/(\tau^{-1}(1/4\alpha\overline{\lambda})), the right-hand side of the above display becomes 11. This shows

𝟏{Kt,α}τ,P1τ1(1/4αλ¯)\displaystyle\|\mathbf{1}_{\{K_{t,\alpha}\}}\|_{\tau^{*},P}\leq\frac{1}{\tau^{-1}(1/4\alpha\overline{\lambda})}

for all tt. Hence, by (A.7), (A.8), and the previous display, we get

supt[0,T]𝒳supu[tα,t+α][0,T]𝟏{X(u)X(u)}dQ(X)c2τ1(1/4αλ¯)0 as α0.\displaystyle\sup_{t\in[0,T]}\int_{\mathcal{X}}\sup_{u\in[t-\alpha,t+\alpha]\cap[0,T]}\mathbf{1}_{\{X(u)\neq X(u-)\}}\,dQ(X)\leq\frac{c_{2}}{\tau^{-1}(1/4\alpha\overline{\lambda})}\to 0\quad\text{ as }\alpha\to 0.

This completes the proof of the lemma. ∎
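The decay of the indicator-norm bound $1/\tau^{-1}(1/(4\alpha\overline{\lambda}))$ as $\alpha\to 0$ can be checked numerically. The sketch below again assumes $\tau(u)=e^{u}-u-1$ (an assumption, since $\tau$ is not restated in this part of the appendix) and inverts it by bisection; `tau_inverse` and `indicator_norm_bound` are hypothetical helper names.

```python
import math


def tau(u: float) -> float:
    """Assumed Young function tau(u) = e^u - u - 1 (an assumption of this sketch)."""
    try:
        return math.expm1(u) - u
    except OverflowError:
        return math.inf


def tau_inverse(y: float, tol: float = 1e-12) -> float:
    """Inverse of tau on [0, infinity), where tau is continuous and increasing."""
    if y < 0.0:
        raise ValueError("y must be non-negative")
    lo, hi = 0.0, 1.0
    while tau(hi) < y:          # grow the bracket until tau(hi) >= y
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if tau(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def indicator_norm_bound(alpha: float, lam_bar: float) -> float:
    """The bound 1 / tau^{-1}(1 / (4 * alpha * lam_bar)) on the indicator's Orlicz norm."""
    return 1.0 / tau_inverse(1.0 / (4.0 * alpha * lam_bar))


if __name__ == "__main__":
    for alpha in (1e-1, 1e-2, 1e-3, 1e-4):
        print(alpha, indicator_norm_bound(alpha, lam_bar=1.0))
```

Under the assumed $\tau$, the inverse grows like $\log y$ for large $y$, so the bound tends to zero only logarithmically slowly in $1/\alpha$; this is still enough for the limit used in the proof.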

Next, we argue the continuity of the projection map $\sigma$.

Lemma A.4 (Continuity of $\sigma$; [6, Lemma 5.8]).

Let $Q\in\mathcal{M}_{1}(\mathcal{X})$ be such that $J(Q)<\infty$. Then $\sigma:\mathcal{M}_{1}(D([0,T],\mathcal{Z}))\to D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ is continuous at $Q$.

Proof.

Let $Q\in\mathcal{M}_{1}(\mathcal{X})$ be such that $J(Q)<\infty$. By Lemma A.2, it follows that $Q\in\mathcal{M}_{1,\psi}(\mathcal{X})$. In [21, Lemma 2.8], for the case when $\nu=\delta_{z_{0}}$ for some $z_{0}\in\mathcal{Z}$, it was shown that $\sigma:\mathcal{M}_{1}(D([0,T],\mathcal{Z}))\to D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ is continuous at $Q$ whenever $Q\in\mathcal{M}_{1,\psi}(\mathcal{X})$. (This continuity was shown in [21] when $\mathcal{M}_{1}(D([0,T],\mathcal{Z}))$ is equipped with the usual weak topology and $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ is equipped with the stronger uniform topology. Since the Skorohod topology on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ is coarser than the uniform topology, it follows that $\sigma$ is continuous.) For general $\nu\in\mathcal{M}_{1}(\mathcal{Z})$, by using the result of Lemma A.3 and following the proof of [21, Lemma 2.8] verbatim, we arrive at the continuity of $\sigma$. ∎

Finally, we have that $h$ is continuous on $\mathcal{M}_{1,\psi}(\mathcal{X})$.

Lemma A.5.

Assume (A1), (A2), and (A3). Then the mapping $h$ defined in (A.2) is continuous on $\mathcal{M}_{1,\psi}(\mathcal{X})$.

Proof.

Using Lemma A.3, Lemma A.4 and Assumptions (A1)–(A3), the proof of [21, Lemma 2.9] holds verbatim. ∎

The above lemmas give us the LDP for the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ whenever $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$.

Proposition A.1.

Assume (A1), (A2), and (A3). Suppose that $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$. Then, the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ satisfies the LDP on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ with rate function $S_{[0,T]}(\cdot|\nu)$ defined in (2.8).

Proof.

Let $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$. By Lemma A.1, we have that $\{\bar{\mathcal{P}}^{N}_{\nu_{N}},N\geq 1\}$ satisfies the LDP on $\mathcal{M}_{1,\psi}(\mathcal{X})$ with rate function $J$. Since $h$ is continuous on the set $\{Q\in\mathcal{M}_{1,\psi}(\mathcal{X}):J(Q)<\infty\}$ (by Lemma A.5), from Varadhan's lemma one can conclude (see [7, Proof of Theorem 3.1]) that the family $\{\mathcal{P}^{N}_{\nu_{N}}\}$ satisfies the LDP on $\mathcal{M}_{1,\psi}(\mathcal{X})$ with rate function $Q\mapsto J(Q)-h(Q)$. By Lemma A.4, since $\sigma$ is continuous (with the usual weak topology on $\mathcal{M}_{1}(D([0,T],\mathcal{Z}))$) at $Q$ when $J(Q)<\infty$, it follows that the restriction of $\sigma$ to $\mathcal{M}_{1,\psi}(\mathcal{X})$ is also continuous (with respect to the stronger topology on $\mathcal{M}_{1,\psi}(\mathcal{X})$) at $Q$ when $J(Q)<\infty$. Therefore, using the generalized contraction principle (e.g., [13, Theorem 4.2.23]), the LDP for the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ follows. The rate function for this LDP can be shown to admit the form given in (2.8) (see, e.g., [21, Proof of Theorem 3.1]). ∎

A.2.2 Uniform LDP for $\{\mu_{\nu_{N}}^{N},N\geq 1\}$ over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$

Proposition A.1 establishes the LDP for the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ whenever $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$. We now extend this to the uniform LDP over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$. Towards this, we rely on [8, Propositions 1.12 and 1.14]. In our definition of the uniform LDP (Definition 2.2), the initial conditions lie in $A\cap\mathcal{M}_{1}^{N}(\mathcal{Z})$, unlike the definition of the uniform LDP in [8, Definition 1.13], where the initial conditions do not depend on the parameter $N$. Nevertheless, straightforward modifications of the arguments in [8, Propositions 1.12 and 1.14] prove the desired uniform LDP. We provide an outline of these arguments here.

We first provide a definition of the uniform Laplace principle over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$. Recall the definition of the rate function $S_{[0,T]}$ in (2.8). For $\nu\in\mathcal{M}_{1}(\mathcal{Z})$ and $g\in C_{b}(D([0,T],\mathcal{M}_{1}(\mathcal{Z})))$, define

$$F(\nu,g):=-\inf_{\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z}))}\left[g(\varphi)+S_{[0,T]}(\varphi|\nu)\right].$$
Definition A.1.

We say that the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ of $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$-valued random variables defined on a probability space $(\Omega,\mathcal{F},P)$ satisfies the uniform Laplace principle over the class $\mathcal{A}$ of subsets of $\mathcal{M}_{1}(\mathcal{Z})$ with the family of rate functions $\{S_{[0,T]}(\cdot|\nu),\nu\in\mathcal{M}_{1}(\mathcal{Z})\}$, $S_{[0,T]}(\cdot|\nu):D([0,T],\mathcal{M}_{1}(\mathcal{Z}))\to[0,+\infty]$, $\nu\in\mathcal{M}_{1}(\mathcal{Z})$, if

  • (Compactness of level sets) For each $K\subset\mathcal{M}_{1}(\mathcal{Z})$ compact and $s\geq 0$, $\bigcup_{\nu\in K}\Phi_{\nu}(s)$ is a compact subset of $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$, where $\Phi_{\nu}(s)\coloneqq\{\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})):\varphi_{0}=\nu,S_{[0,T]}(\varphi|\nu)\leq s\}$;

  • (Uniform Laplace asymptotics) For any $A\in\mathcal{A}$ and $g\in C_{b}(D([0,T],\mathcal{M}_{1}(\mathcal{Z})))$, we have

    $$\lim_{N\to\infty}\sup_{\nu_{N}\in A\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}\left|\frac{1}{N}\log E_{\nu_{N}}\left[\exp\{-Ng(\mu^{N}_{\nu_{N}})\}\right]-F(\nu_{N},g)\right|=0.$$

This is a modification of [8, Definition 1.11] to the case when the initial conditions are only allowed to lie in $A\cap\mathcal{M}_{1}^{N}(\mathcal{Z})$. We have the following result.

Lemma A.6 (see [8, Proposition 1.12]).

Assume (A1), (A2), and (A3). Then the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ satisfies the uniform Laplace principle over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$ with the family of rate functions $\{S_{[0,T]}(\cdot|\nu),\nu\in\mathcal{M}_{1}(\mathcal{Z})\}$, $S_{[0,T]}(\cdot|\nu):D([0,T],\mathcal{M}_{1}(\mathcal{Z}))\to[0,+\infty]$, $\nu\in\mathcal{M}_{1}(\mathcal{Z})$.

Proof.

By Lemma 2.1, we have that for each $K\subset\mathcal{M}_{1}(\mathcal{Z})$ compact and $s\geq 0$, $\bigcup_{\nu\in K}\Phi_{\nu}(s)$ is a compact subset of $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$, where $\Phi_{\nu}(s)\coloneqq\{\varphi\in D([0,T],\mathcal{M}_{1}(\mathcal{Z})):\varphi_{0}=\nu,S_{[0,T]}(\varphi|\nu)\leq s\}$.

To show the uniform Laplace asymptotics, let $g\in C_{b}(D([0,T],\mathcal{M}_{1}(\mathcal{Z})))$. By Proposition A.1, whenever $\nu_{N}\to\nu$ in $\mathcal{M}_{1}(\mathcal{Z})$ as $N\to\infty$, the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ satisfies the LDP on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ with rate function $S_{[0,T]}(\cdot|\nu)$. Therefore, by Varadhan's lemma (e.g., [13, Theorem 4.3.1]), we have

$$\lim_{N\to\infty}\frac{1}{N}\log E_{\nu_{N}}\left[\exp\{-Ng(\mu^{N}_{\nu_{N}})\}\right]=F(\nu,g). \tag{A.9}$$

Define

$$F^{N}(\nu_{N}^{\prime},g):=\frac{1}{N}\log E_{\nu_{N}^{\prime}}\left[\exp\{-Ng(\mu^{N}_{\nu_{N}^{\prime}})\}\right],\quad\nu_{N}^{\prime}\in\mathcal{M}_{1}^{N}(\mathcal{Z}).$$

Using (A.9), we now show that the mapping $\nu\mapsto F(\nu,g)$ is continuous. To show this continuity, it suffices to show that, given any $\varepsilon>0$, there exists $\delta>0$ such that for all $\nu^{\prime}\in\mathcal{M}_{1}(\mathcal{Z})$ with $d(\nu^{\prime},\nu)<\delta$ and all $\nu^{\prime}_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z})$ with $\nu_{N}^{\prime}\to\nu^{\prime}$ as $N\to\infty$, we have

$$|F^{N}(\nu_{N}^{\prime},g)-F(\nu,g)|<\varepsilon\quad\text{ for all large enough }N.$$

Indeed, if this is true, sending $N\to\infty$ in the above display and using (A.9), we arrive at $|F(\nu^{\prime},g)-F(\nu,g)|\leq\varepsilon$, which shows the continuity of $\nu\mapsto F(\nu,g)$. We now show the above statement by contradiction. Suppose the above statement is not true. Then there exist $\varepsilon>0$ and a sequence $\{\nu_{N}\}$ with $\nu_{N}\in\mathcal{M}_{1}^{N}(\mathcal{Z})$ and $\nu_{N}\to\nu$ as $N\to\infty$ such that $|F^{N}(\nu_{N},g)-F(\nu,g)|\geq\varepsilon$ for infinitely many $N$. Letting $N\to\infty$ along these $N$ and using (A.9), we get $|F(\nu,g)-F(\nu,g)|\geq\varepsilon>0$, which is a contradiction. This establishes the continuity of the mapping $\mathcal{M}_{1}(\mathcal{Z})\ni\nu\mapsto F(\nu,g)$.

Since $\nu\mapsto F(\nu,g)$ is continuous, using (A.9) and the same arguments as in [8, Proposition 1.12], one can show that for any compact subset $K$ of $\mathcal{M}_{1}(\mathcal{Z})$, $\sup_{\nu_{N}\in K\cap\mathcal{M}_{1}^{N}(\mathcal{Z})}|F^{N}(\nu_{N},g)-F(\nu_{N},g)|\to 0$ as $N\to\infty$. This shows that the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ satisfies the uniform Laplace principle over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$ with the family of rate functions $\{S_{[0,T]}(\cdot|\nu),\nu\in\mathcal{M}_{1}(\mathcal{Z})\}$. ∎
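The Laplace-type limit (A.9) used in the proof above can be illustrated numerically in a much simpler toy setting. The sketch below is not the particle system of this paper: it takes the sample mean of $N$ i.i.d. Bernoulli(1/2) variables, whose rate function is the relative entropy with respect to Bernoulli(1/2), and compares $\frac{1}{N}\log E[\exp\{-Ng\}]$ with $-\inf[g+I]$. The test function `g`, the grid size, and all function names are assumptions made for illustration only.

```python
import math


def rate(x: float) -> float:
    """Relative entropy of Bernoulli(x) with respect to Bernoulli(1/2) (toy rate function)."""
    def xlogx(p: float) -> float:
        return 0.0 if p <= 0.0 else p * math.log(p)
    return xlogx(x) + xlogx(1.0 - x) + math.log(2.0)


def g(x: float) -> float:
    """A bounded continuous test function on [0, 1] (hypothetical choice)."""
    return 3.0 * (x - 0.8) ** 2


def prelimit(N: int) -> float:
    """(1/N) log E[exp(-N g(S_N / N))] with S_N ~ Binomial(N, 1/2), via log-sum-exp."""
    terms = [
        math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        - N * math.log(2.0) - N * g(k / N)
        for k in range(N + 1)
    ]
    m = max(terms)
    return (m + math.log(sum(math.exp(t - m) for t in terms))) / N


def laplace_limit(grid: int = 20001) -> float:
    """-inf over x in [0, 1] of g(x) + rate(x), approximated on a uniform grid."""
    return -min(g(i / (grid - 1)) + rate(i / (grid - 1)) for i in range(grid))


if __name__ == "__main__":
    target = laplace_limit()
    for N in (50, 200, 2000):
        print(N, prelimit(N), target)
```

In this toy case the gap between the prelimit and the limit shrinks as $N$ grows, which is the pointwise convergence that the uniform Laplace principle strengthens to hold uniformly over compact sets of initial conditions.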

We can now complete the proof of Theorem 2.1 using the arguments in [8, Proposition 1.14].

Proof of Theorem 2.1.

By Lemma A.6, the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ satisfies the uniform Laplace principle on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$ with the family of rate functions $\{S_{[0,T]}(\cdot|\nu),\nu\in\mathcal{M}_{1}(\mathcal{Z})\}$. Restricting the initial conditions to $\mathcal{M}_{1}^{N}(\mathcal{Z})$ and following the proof of [8, Proposition 1.14] verbatim, we conclude that the family $\{\mu^{N}_{\nu_{N}},N\geq 1\}$ satisfies the uniform LDP on $D([0,T],\mathcal{M}_{1}(\mathcal{Z}))$ over the class of compact subsets of $\mathcal{M}_{1}(\mathcal{Z})$ with the family of rate functions $\{S_{[0,T]}(\cdot|\nu),\nu\in\mathcal{M}_{1}(\mathcal{Z})\}$. ∎

Acknowledgements

The authors were supported by a grant from the Indo-French Centre for Applied Mathematics on a project titled “Metastability phenomena in algorithms and engineered systems”. The first author was supported in part by a fellowship grant from the Centre for Networked Intelligence (a Cisco CSR initiative), Indian Institute of Science, Bangalore; and in part by Office of Naval Research under the Vannevar Bush Faculty Fellowship N0014-21-1-2887. The authors thank two anonymous referees for carefully reading the manuscript and providing valuable comments that improved the paper.

References

  • [1] C. Berge. Topological Spaces: Including a Treatment of Multi-Valued Functions, Vector Spaces, and Convexity. Courier Corporation, 1997.
  • [2] L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim. Macroscopic fluctuation theory for stationary non-equilibrium states. Journal of Statistical Physics, 107(3):635–675, 2002.
  • [3] L. Bertini, A. De Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim. Large deviations for the boundary driven symmetric simple exclusion process. Mathematical Physics, Analysis and Geometry, 6(3):231–267, 2003.
  • [4] P. Billingsley. Convergence of Probability Measures. Wiley Series in Probability and Statistics, 2nd edition, 1999.
  • [5] T. Bodineau and G. Giacomin. From dynamic to static large deviations in boundary driven exclusion particle systems. Stochastic Processes and their Applications, 110(1):67–81, 2004.
  • [6] C. Bordenave, D. McDonald, and A. Proutiere. A particle system in interaction with a rapidly varying environment: Mean field limits and applications. Netw. Heterog. Media, 5(1):31–62, 2010.
  • [7] V. S. Borkar and R. Sundaresan. Asymptotics of the invariant measure in mean field models with jumps. Stoch. Syst., 2(2):322–380, 2012.
  • [8] A. Budhiraja and P. Dupuis. Analysis and Approximation of Rare Events. Springer, New York, NY, 2019.
  • [9] S. Cerrai and N. Paskal. Large deviations principle for the invariant measures of the 2d stochastic Navier-Stokes equations with vanishing noise correlation. arXiv preprint arXiv:2012.14953, 2020.
  • [10] S. Cerrai and M. Röckner. Large deviations for stochastic reaction-diffusion systems with multiplicative noise and non-Lipschitz reaction term. The Annals of Probability, 32(1B):1100–1139, 2004.
  • [11] S. Cerrai and M. Röckner. Large deviations for invariant measures of stochastic reaction–diffusion systems with multiplicative noise and non-Lipschitz reaction term. Annales de l’Institut Henri Poincare (B) Probability and Statistics, 41(1):69–105, 2005.
  • [12] D. A. Dawson and J. Gärtner. Large deviations from the McKean-Vlasov limit for weakly interacting diffusions. Stochastics, 20(4):247–308, 1987.
  • [13] A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications. Springer-Verlag Berlin Heidelberg, 2nd edition, 2010.
  • [14] M. D. Donsker and S. R. S. Varadhan. Asymptotic evaluation of certain Markov process expectations for large time, I. Communications on Pure and Applied Mathematics, 28(1):1–47, 1975.
  • [15] R. Durrett. Probability: Theory and Examples. Cambridge University Press, 5th edition, 2019.
  • [16] S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. John Wiley & Sons, 2nd edition, 2005.
  • [17] J. Farfán, C. Landim, and K. Tsunoda. Static large deviations for a reaction–diffusion model. Probability Theory and Related Fields, 174(1):49–101, 2019.
  • [18] M. I. Freidlin and A. D. Wentzell. Random Perturbations of Dynamical Systems. Grundlehren der mathematischen Wissenschaften. American Mathematical Society, 3rd edition, 2012.
  • [19] R. Khasminskii. Stochastic Stability of Differential Equations, volume 66 of Stochastic Modelling and Applied Probability. Springer, Berlin, Heidelberg, 2012.
  • [20] A. Kumar, E. Altman, D. Miorandi, and M. Goyal. New insights from a fixed point analysis of single cell IEEE 802.11 WLANs. In IEEE INFOCOM 2006, 2006.
  • [21] C. Léonard. Large deviations for long range interacting particle systems with jumps. Ann. Inst. Henri Poincaré Probab. Stat., 31(2):289–323, 1995.
  • [22] C. Léonard. On large deviations for particle systems associated with spatially homogeneous Boltzmann type equations. Probability Theory and Related Fields, 101(1):1–44, 1995.
  • [23] R. Liptser. Large deviations for two scaled diffusions. Probability Theory and Related Fields, 106(1):71–104, 1996.
  • [24] D. Martirosyan. Large deviations for stationary measures of stochastic nonlinear wave equations with smooth white noise. Communications on Pure and Applied Mathematics, 70(9):1754–1797, 2017.
  • [25] H. P. McKean. Propagation of chaos for a class of non-linear parabolic equations. In Lecture Series in Differential Equations, Catholic University (Washington D. C.), 1967.
  • [26] S. P. Meyn, P. Barooah, A. Bušić, Y. Chen, and J. Ehren. Ancillary service to the grid using intelligent deferrable loads. IEEE Transactions on Automatic Control, 60(11):2847–2862, 2015.
  • [27] C. Mufa. Optimal Markovian couplings and applications. Acta Mathematica Sinica, 10(3):260–275, 1994.
  • [28] A. Puhalskii. Large deviations of the long term distribution of a non Markov process. Electronic Communications in Probability, 24:1–11, 2019.
  • [29] A. A. Puhalskii. On large deviations of coupled diffusions with time scale separation. Ann. Probab., 44(4):3111–3186, 2016.
  • [30] A. A. Puhalskii. Large deviation limits of invariant measures. arXiv preprint arXiv:2006.16456, 2020.
  • [31] M. Salins, A. Budhiraja, and P. Dupuis. Uniform large deviation principles for Banach space valued stochastic differential equations. Transactions of the American Mathematical Society, pages 8363–8421, 2019.
  • [32] M. Salins and K. Spiliopoulos. Metastability and exit problems for systems of stochastic reaction-diffusion equations. The Annals of Probability, 49(5):2317–2370, 2021.
  • [33] R. Sowers. Large deviations for the invariant measure of a reaction-diffusion equation with non-Gaussian perturbations. Probability Theory and Related Fields, 92(3):393–421, 1992.
  • [34] R. B. Sowers. Large Deviations for a Reaction-Diffusion Equation with Non-Gaussian Perturbations. The Annals of Probability, 20(1):504–537, 1992.
  • [35] A. Y. Veretennikov. On large deviations for SDEs with small diffusion and averaging. Stochastic Processes and their Applications, 89(1):69–79, 2000.
  • [36] S. Yasodharan and R. Sundaresan. Large time behaviour and the second eigenvalue problem for finite state mean-field interacting particle systems. Advances in Applied Probability, 55(1):85–125, 2023.

S. Yasodharan
Division of Applied Mathematics
Brown University
Providence RI 02912, USA

R. Sundaresan
Department of Electrical Communication Engineering
Indian Institute of Science
Bangalore 560 012, India