
Graphon-valued processes with
vertex-level fluctuations

Peter Braunsteins, Korteweg-de Vries Instituut, Universiteit van Amsterdam, PO Box 94248, 1090 GE Amsterdam, The Netherlands, [email protected]
Frank den Hollander, Mathematisch Instituut, Universiteit Leiden, PO Box 9512, 2300 RA Leiden, The Netherlands, [email protected]
and
Michel Mandjes, Korteweg-de Vries Instituut, Universiteit van Amsterdam, PO Box 94248, 1090 GE Amsterdam, The Netherlands, [email protected]
(Date: March 3, 2025)
Abstract.

We consider a class of graph-valued stochastic processes in which each vertex has a type that fluctuates randomly over time. Collectively, the paths of the vertex types up to a given time determine the probabilities that the edges are active or inactive at that time. Our focus is on the evolution of the associated empirical graphon in the limit as the number of vertices tends to infinity, in the setting where fluctuations in the graph-valued process are more likely to be caused by fluctuations in the vertex types than by fluctuations in the states of the edges given these types. We derive both sample-path large deviation principles and convergence of stochastic processes. We demonstrate the flexibility of our approach by treating a class of stochastic processes where the edge probabilities depend not only on the fluctuations in the vertex types but also on the state of the graph itself.

Key words. Graphs, graphons, dynamics, sample paths, process convergence, large deviations, optimal paths.
MSC2010. 05C80, 60C05, 60F10.
Acknowledgment. The work in this paper was supported by the Netherlands Organisation for Scientific Research (NWO) through Gravitation-grant NETWORKS-024.002.003.

1. Introduction

1.1. Background

Graphons have emerged as a powerful tool for characterising the limit of a sequence of dense graphs, i.e., graphs in which the number of edges scales as the square of the number of vertices. The theory describing these graphons (see e.g. [21], [22], [4], [5], [20]) focuses on the limiting properties of large dense graphs in terms of their subgraph densities. The literature covers both typical and atypical behaviour, a notable result being the large deviation principle (LDP) for homogeneous Erdős-Rényi random graphs and associated graphons [9], and their inhomogeneous counterparts [12].

While most of the existing theory focuses on static random graphons, the attention has gradually shifted to dynamic random graphons (see e.g. [26], [7], [1]). Two notable contributions in this area are [1], which presents a stochastic process limit in the space of graphons for a class of processes where the edges evolve in a dependent manner, and [6], which extends the LDP of [9] to a sample-path LDP for a dynamic random graph in which the edges switch on and off independently in a random fashion. In [6] the authors leave open the question whether a sample-path large deviation principle can be established for processes where the edges switch on and off in a dependent manner, such as in [1].

1.2. Motivation

The goal of the present paper is two-fold: (1) to answer the open question raised in [6] by establishing a sample-path large deviation principle for a class of processes in which edges evolve in a dependent manner; (2) to strengthen the results in [1] while working in a more general framework.

In the class of processes we consider, each vertex is assigned a type that changes randomly over time, and fluctuations in the types of the vertices determine how the edges interact with each other while switching on and off. Specifically, the paths of the types of all the vertices up to time t determine the probability that the edges in the random graph are active at time t. Collectively, these paths are called the driving process.

Our results generalise those of [1] in a number of directions:

  • (i)

    We establish sample-path large deviations (Theorem 3.5), whereas [1] restricts attention to diffusion limits.

  • (ii)

    We consider a general driving process and a general edge-switching dynamics, whereas [1] restricts attention to a specific driving process (the multi-type Moran model) and to a specific edge-switching dynamics (modulated by a fitness function).

  • (iii)

    We establish stochastic process convergence in the space of (𝒲,d)({\mathscr{W}},d_{\square})-valued càdlàg paths (Theorem 3.10), whereas [1] works in the space of (𝒲~,δ)(\tilde{\mathscr{W}},\delta_{\square})-valued Skorokhod paths. (For the definition of these two spaces, see Section 2.1 below.)

  • (iv)

    We allow for processes in which the probabilities that edges are active depend not only on the fluctuations in the types but also on the state of the graph itself, i.e., on which edges are active or not (Section 4.1).

On the way to proving our results for graphon-valued processes, we also prove a new large deviation principle for static random graphs (Theorem 2.6). This result can be viewed as a generalisation of the large deviation principle for inhomogeneous Erdős-Rényi random graphs to the setting in which each vertex is assigned a random type. Our proofs rely on concentration estimates, coupling arguments, and continuous mapping. Along the way, several examples are presented.

The models analysed in this paper have a one-way dependence: the states of the edges depend on the types of the vertices, but the types of the vertices do not depend on the states of the edges. There are many natural models that fall into this framework, coming from statistical physics, population genetics and the social sciences, where the strengths of the interactions between particles, alleles or individuals generally depend on the type they carry. It is much harder to analyse models that exhibit a two-way dependence. Such models capture the evolution of spins, infections or opinions on dynamic random networks with mutual feedback, an area that so far remains largely unexplored.

1.3. Outline

In Section 2 we recall basic LDPs for graphons, and present three LDPs for what we call inhomogeneous random graphs with type dependence (IRGTs), which are static random objects. In Section 3 we look at their dynamic counterparts, which are graph-valued processes, the main result being a sample-path LDP in graphon space. We illustrate our results via a running example, and derive convergence of the graph-valued process to a graphon process. In Section 4 we describe various applications, and discuss possible extensions. Section 5 contains the proofs of our main theorems. Appendix A identifies the rate function in the LDP of the underlying driving process.

2. Large deviations for static random graphs

While the goal of the present paper is to study a specific class of dynamic random graphs, we begin by analysing their static counterparts. The reason is that, at any given time t\geq 0, the marginal distribution of the dynamic random graphs to be considered (introduced in Section 3) corresponds to a random graph with type dependence (introduced in Section 2.3).

In Section 2.1 we recall a few basic definitions related to graphons. In Section 2.2 we introduce inhomogeneous Erdős-Rényi random graphs and recall the large deviation principle for their associated empirical graphons. In Section 2.3 we describe a generalisation of inhomogeneous Erdős-Rényi random graphs, referred to as inhomogeneous random graphs with type dependence (IRGT), which motivate the definition of the class of graph-valued stochastic processes that we will be working with from Section 3 onwards. In Section 2.4 we state a number of key assumptions that are needed along the way. In Section 2.5 we establish the large deviation principle for the associated empirical graphon processes under the assumption that the driving process satisfies the LDP. The latter assumption is investigated in detail in Appendix A.

2.1. Graphs and graphons

Let 𝒲\mathscr{W} be the space of functions h:[0,1]2[0,1]h\colon\,[0,1]^{2}\to[0,1] such that h(x,y)=h(y,x)h(x,y)=h(y,x) for all (x,y)[0,1]2(x,y)\in[0,1]^{2}, formed after taking the quotient with respect to the equivalence relation of almost everywhere equality. A finite simple graph GG on nn vertices can be represented as a graphon hG𝒲h^{G}\in\mathscr{W} by setting

hG(x,y):={1if there is an edge between vertex nx and vertex ny,0otherwise.h^{G}(x,y):=\begin{cases}1&\quad\text{if there is an edge between vertex }\lceil nx\rceil\text{ and vertex }\lceil ny\rceil,\\ 0&\quad\text{otherwise.}\end{cases} (2.1)
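As a concrete illustration of (2.1) (a minimal sketch of ours, not from the paper), the empirical graphon of a small graph can be evaluated directly from its adjacency matrix:

```python
import numpy as np

def empirical_graphon(adj):
    """Return h^G as a function on [0,1]^2 for a simple graph.

    adj: symmetric 0/1 numpy array (n x n adjacency matrix).
    h^G(x, y) = adj[ceil(n x) - 1, ceil(n y) - 1], per (2.1).
    """
    n = adj.shape[0]
    def h(x, y):
        i = min(int(np.ceil(n * x)) - 1, n - 1) if x > 0 else 0
        j = min(int(np.ceil(n * y)) - 1, n - 1) if y > 0 else 0
        return adj[i, j]
    return h

# Triangle on 3 vertices: every pair of distinct vertices connected.
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
h = empirical_graphon(A)
assert h(0.1, 0.5) == 1   # block (1,2): edge present
assert h(0.1, 0.2) == 0   # block (1,1): no self-loop
```

The resulting function is constant on 1/n x 1/n blocks, which is the block structure referred to above.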

This object is referred to as an empirical graphon and has a block structure. The space of graphons 𝒲\mathscr{W} is endowed with the cut distance

d(h1,h2):=supS,T[0,1]|S×Tdxdy[h1(x,y)h2(x,y)]|,h1,h2𝒲.d_{\square}(h_{1},h_{2}):=\sup_{S,T\subseteq[0,1]}\left|\int_{S\times T}\mathrm{d}x\,\mathrm{d}y\,[h_{1}(x,y)-h_{2}(x,y)]\right|,\quad h_{1},h_{2}\in\mathscr{W}. (2.2)

It is noted that the space (𝒲,d)(\mathscr{W},d_{\square}) is not compact [19, Example F.6].

On 𝒲\mathscr{W} there is a natural equivalence relation, referred to as ‘\sim’. Letting \mathscr{M} denote the set of measure-preserving bijections σ:[0,1][0,1]\sigma\colon\,[0,1]\to[0,1], we write h1h2h_{1}\sim h_{2} when there exists a σ\sigma\in\mathscr{M} such that h1(x,y)=h2(σ(x),σ(y))h_{1}(x,y)=h_{2}(\sigma(x),\sigma(y)) for all (x,y)[0,1]2(x,y)\in[0,1]^{2}. This equivalence relation induces the quotient space (𝒲~,δ)(\tilde{\mathscr{W}},\delta_{\square}), where δ\delta_{\square} is the cut metric defined by

δ(h~1,h~2):=infσ1,σ2d(h1σ1,h2σ2),h~1,h~2𝒲~.\delta_{\square}(\tilde{h}_{1},\tilde{h}_{2}):=\inf_{\sigma_{1},\sigma_{2}\in\mathscr{M}}d_{\square}(h_{1}^{\sigma_{1}},h_{2}^{\sigma_{2}}),\quad\tilde{h}_{1},\tilde{h}_{2}\in\tilde{\mathscr{W}}. (2.3)

Notably, the space (𝒲~,δ)(\tilde{\mathscr{W}},\delta_{\square}) is compact [21, Lemma 8].
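For empirical graphons the supremum in (2.2) is attained on unions of the 1/n-blocks (the integral is linear in the indicator of S, so the optimum sits at full blocks), which means the cut distance between two graphs on a common small vertex set can be computed by brute force. The following sketch is our own and is feasible only for small n:

```python
import itertools
import numpy as np

def cut_distance_block(A, B):
    """Brute-force d_cut between two empirical graphons on the same
    n vertices (exponential in n, so only for small n)."""
    n = A.shape[0]
    D = (A - B).astype(float) / n**2   # integral contribution per block
    best = 0.0
    idx = range(n)
    for S in itertools.chain.from_iterable(
            itertools.combinations(idx, k) for k in range(n + 1)):
        rows = D[list(S), :].sum(axis=0)   # column-block contributions
        # For fixed S, the optimal T keeps the columns of one sign.
        best = max(best, rows[rows > 0].sum(), -rows[rows < 0].sum())
    return best

A = np.array([[0, 1], [1, 0]])     # single edge on 2 vertices
B = np.array([[0, 0], [0, 0]])     # empty graph
print(cut_distance_block(A, B))    # 0.5, attained at S = T = {1,2}
```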

2.2. Inhomogeneous Erdős-Rényi random graph

Let r𝒲r\in\mathscr{W} be a reference graphon. Fix nn\in\mathbb{N} and consider a random graph G^n\widehat{G}_{n} with vertex set [n]:={1,,n}[n]:=\{1,\dots,n\}, where the pair of vertices i,j[n]i,j\in[n], iji\neq j, is connected by an edge with probability r(in,jn)r(\frac{i}{n},\frac{j}{n}), independently of other pairs of vertices. Write n\mathbb{P}_{n} to denote the law of G^n\widehat{G}_{n}. Use the same symbol to denote the law on 𝒲\mathscr{W} induced by the map that associates with the graph G^n\widehat{G}_{n} its graphon hG^nh^{\widehat{G}_{n}}. Write ~n\tilde{\mathbb{P}}_{n} to denote the law of h~G^n\tilde{h}^{\widehat{G}_{n}}, the equivalence class associated with hG^nh^{\widehat{G}_{n}}.

The following theorem is an extension of the LDP for homogeneous Erdős-Rényi random graphs established in [9]. It was first stated in [12] under additional assumptions. These assumptions were subsequently relaxed in [24], [3], [13]. The following version of the LDP corresponds to [13, Theorem 4.1].

Theorem 2.1.

Suppose that rlogrr\log r and (1r)log(1r)(1-r)\log(1-r) are integrable. Then the sequence of probability measures (~n)n(\tilde{\mathbb{P}}_{n})_{n\in\mathbb{N}} satisfies the LDP on (𝒲~,δ)(\tilde{\mathscr{W}},\delta_{\square}) with rate (n2){n\choose 2} and with rate function I~r\tilde{I}_{r}, i.e.,

lim supn1(n2)log~n(𝒞)infh~𝒞I~r(h~)𝒞𝒲~ closed,\displaystyle\limsup_{n\to\infty}\frac{1}{{n\choose 2}}\log\tilde{\mathbb{P}}_{n}(\mathcal{C})\leq-\inf_{\tilde{h}\in\mathcal{C}}\tilde{I}_{r}(\tilde{h})\qquad\forall\,\,\mathcal{C}\subseteq\tilde{\mathscr{W}}\text{ closed,} (2.4)
lim infn1(n2)log~n(𝒪)infh~𝒪I~r(h~)𝒪𝒲~ open,\displaystyle\liminf_{n\to\infty}\frac{1}{{n\choose 2}}\log\tilde{\mathbb{P}}_{n}(\mathcal{O})\geq-\inf_{\tilde{h}\in\mathcal{O}}\tilde{I}_{r}(\tilde{h})\qquad\forall\,\,\mathcal{O}\subseteq\tilde{\mathscr{W}}\text{ open,}

where

I~r(h~)=infσIr(hσ),\tilde{I}_{r}(\tilde{h})=\inf_{\sigma\in\mathscr{M}}I_{r}(h^{\sigma}), (2.5)

hh is any representative of h~\tilde{h}, and

I_{r}(h):=\int_{[0,1]^{2}}{\rm d}x\,{\rm d}y\,\,\mathcal{R}(h(x,y)\,|\,r(x,y)), (2.6)

with

(a|b):=alogab+(1a)log1a1b.\mathcal{R}(a\,|\,b):=a\log\frac{a}{b}+(1-a)\log\frac{1-a}{1-b}. (2.7)
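The rate function (2.6) can be evaluated numerically. The sketch below is ours (`rate_I` and the grid size are illustrative choices); it approximates the integral by a Riemann sum and checks it against the constant-graphon case, where I_r(h) reduces to \mathcal{R}(h\,|\,r):

```python
import numpy as np

def rel_entropy(a, b):
    """Bernoulli relative entropy R(a|b) of (2.7), with 0 log 0 = 0.
    Assumes 0 < b < 1."""
    def xlogy(x, y):
        return 0.0 if x == 0 else x * np.log(x / y)
    return xlogy(a, b) + xlogy(1 - a, 1 - b)

def rate_I(h, r, m=200):
    """Riemann-sum approximation of I_r(h) in (2.6) on an m x m grid."""
    xs = (np.arange(m) + 0.5) / m
    return sum(rel_entropy(h(x, y), r(x, y)) for x in xs for y in xs) / m**2

# Constant graphons: the integrand is constant, so the sum is exact.
val = rate_I(lambda x, y: 0.5, lambda x, y: 0.25)
print(round(val, 4))  # R(1/2 | 1/4) = 0.5 log 2 + 0.5 log(2/3) = 0.1438
```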

2.3. Inhomogeneous random graphs with type dependence

Consider the following generalisation of the inhomogeneous Erdős-Rényi random graph defined in Section 2.2. Suppose that each vertex i[n]i\in[n] is assigned a (possibly random) type Xi(n)[0,1]X_{i}^{(n)}\in[0,1]. Denote the empirical type measure by

μn=1ni=1nδXi(n)\mu_{n}=\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}^{(n)}} (2.8)

and the empirical type distribution by

Fn(𝑿(n),x)=1ni=1n𝟙{Xi(n)x},x[0,1],F_{n}({\boldsymbol{X}}^{(n)},x)=\frac{1}{n}\sum_{i=1}^{n}\mathbbm{1}\{X_{i}^{(n)}\leq x\},\qquad x\in[0,1], (2.9)

where 𝟙{A}\mathbbm{1}\{A\} is the indicator function of the event AA and 𝑿(n)(X1(n),,Xn(n)){\boldsymbol{X}}^{(n)}\equiv(X_{1}^{(n)},\ldots,X_{n}^{(n)}). Let ([0,1])\mathcal{M}([0,1]) denote the space of measures on [0,1][0,1] endowed with the topology of weak convergence.

The way the graph is constructed is as follows. Whether or not an edge (i,j)(i,j) is active depends on local properties, namely, the types Xi(n)X_{i}^{(n)} and Xj(n)X_{j}^{(n)}, as well as global properties, namely, the empirical type distribution Fn(𝑿(n),)F_{n}({\boldsymbol{X}}^{(n)},\cdot\,). Concretely, we let edge ijij be active with probability

H(Xi(n),Xj(n),Fn)H(Xi(n),Xj(n),Fn(𝑿(n),)),H\Big{(}X_{i}^{(n)},X_{j}^{(n)},F_{n}\Big{)}\equiv H\Big{(}X_{i}^{(n)},X_{j}^{(n)},F_{n}({\boldsymbol{X}}^{(n)},\cdot\,)\Big{)}, (2.10)

where H:[0,1]2×([0,1])[0,1]H\colon\,[0,1]^{2}\times\mathcal{M}([0,1])\to[0,1] is symmetric in its first two inputs. Given 𝑿(n){\boldsymbol{X}}^{(n)}, the edge placement is independent for all vertex pairs. We label the resulting sequence of random graphs as {Gn}n\{G_{n}\}_{n\in\mathbb{N}}, and refer to them as inhomogeneous random graphs with type dependence (IRGT).
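The two-layer construction can be simulated directly. In the sketch below (ours; the function names and the choice to pass the sorted type array as a stand-in for F_n are illustrative assumptions), types are drawn first and edges are then placed conditionally independently:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_irgt(n, sample_type, H):
    """Sample one IRGT: types first, then conditionally independent edges.

    sample_type(rng, n) -> array of n types in [0,1];
    H(x, y, F) -> edge probability, with F given here as the sorted
    type array (a stand-in for the empirical distribution F_n).
    """
    X = sample_type(rng, n)
    F = np.sort(X)                      # encodes F_n(x) = #{i: X_i <= x}/n
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < H(X[i], X[j], F):
                adj[i, j] = adj[j, i] = 1
    return X, adj

# i.i.d. uniform types; this H(x, y, F) = x*y ignores the global input F.
X, A = sample_irgt(50, lambda r, n: r.uniform(size=n), lambda x, y, F: x * y)
assert (A == A.T).all() and A.diagonal().sum() == 0
```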

Two relations with inhomogeneous Erdős-Rényi random graphs are worth mentioning:

  • \circ

    Observe that if

    Xi(n)=inn,i[n],H(x,y,F)=r(x,y)x,y[0,1],X_{i}^{(n)}=\frac{i}{n}\quad\forall\,n\in\mathbb{N},\,i\in[n],\qquad H(x,y,F)=r(x,y)\quad\forall\,x,y\in[0,1], (2.11)

    then the IRGT is equivalent to the inhomogeneous Erdős-Rényi random graph defined in Section 2.2.

  • \circ

    Let F¯\bar{F} denote the right-continuous generalised inverse of a distribution function FF with support [0,1][0,1], which is defined in the usual way as

    F¯(u):=inf{x[0,1]:F(x)>u},u[0,1).\bar{F}(u):=\inf\{x\in[0,1]\colon\,F(x)>u\},\qquad u\in[0,1). (2.12)

    For F([0,1])F\in\mathcal{M}([0,1]), define the induced reference graphon g[F]𝒲g^{[F]}\in\mathscr{W} by setting

    g[F](x,y)=H(F¯(x),F¯(y),F).g^{[F]}(x,y)=H\big{(}\bar{F}(x),\bar{F}(y),F\big{)}. (2.13)

    Given the type distribution F_{n}, \tilde{h}^{G_{n}} has the same distribution as the empirical graphon of an inhomogeneous Erdős-Rényi random graph with reference graphon g^{[F_{n}]}. In other words, we have

    h~Gn|Fn=dh~G^n,r=g[Fn].\tilde{h}^{G_{n}}\,|\,F_{n}\stackrel{{\scriptstyle\rm d}}{{=}}\tilde{h}^{\widehat{G}_{n}},\,r=g^{[F_{n}]}. (2.14)

    This observation is central to the LDP for IRGTs stated in Theorem 2.6 below.
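The generalised inverse (2.12) of an empirical type distribution takes a simple form in terms of order statistics, as the following sketch (ours) shows:

```python
import numpy as np

def gen_inverse(sorted_types):
    """Right-continuous generalised inverse (2.12) of the empirical
    distribution of the given sorted types."""
    n = len(sorted_types)
    def Fbar(u):
        # F_n(x) = #{i: X_i <= x}/n jumps at the order statistics, so
        # inf{x: F_n(x) > u} is the order statistic of index floor(n*u).
        return sorted_types[min(int(np.floor(n * u)), n - 1)]
    return Fbar

Fbar = gen_inverse(np.array([0.2, 0.5, 0.9]))
print(Fbar(0.0), Fbar(0.4), Fbar(0.99))  # 0.2 0.5 0.9
```

Plugging this Fbar into H as in (2.13) yields the induced reference graphon g^{[F_n]}.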

2.4. Key assumptions

Before stating Theorem 2.6, we make a number of assumptions.

Assumption 2.2.

The sequence of type distributions (Fn(𝑿(n),))n(F_{n}({\boldsymbol{X}}^{(n)},\cdot\,))_{n\in\mathbb{N}} satisfies the LDP on ([0,1])\mathcal{M}([0,1]) with rate (n)\ell(n) and with rate function KK. \diamondsuit

When Xi(n)X^{(n)}_{i}, ii\in\mathbb{N}, are fixed and μnμ\mu_{n}\to\mu in ([0,1])\mathcal{M}([0,1]) as nn\to\infty, as in (2.11), then Assumption 2.2 is satisfied with (n)=\ell(n)=\infty. Assumption 2.2 holds, for instance, when {Xi(n)}n,i[n]\{X^{(n)}_{i}\}_{n\in\mathbb{N},i\in[n]} are i.i.d. random variables with distribution ff, in which case (n)=n\ell(n)=n and K(f)=(f|f)K(f^{\circ})={\mathscr{H}}(f^{\circ}\,|\,f) is the relative entropy (or Kullback-Leibler divergence) of ff^{\circ} with respect to ff. Assumption 2.2 may also hold with (n)\ell(n) not scaling linearly in nn.

To provide an example, we extend the setup in [13, Example 2.5]. Let p>0p>0, ff be a probability distribution on \mathbb{R} with bounded support, and {Yi(n)}i[np]\{Y_{i}^{(n)}\}_{i\in[\lfloor n^{p}\rfloor]} be i.i.d. random variables with distribution ff. Let sns_{n} be such that

sn(x)=Yi(n),x(i1np,inp),s_{n}(x)=Y_{i}^{(n)},\qquad x\in\left(\frac{i-1}{\lfloor n^{p}\rfloor},\frac{i}{\lfloor n^{p}\rfloor}\right), (2.15)

and identify sns_{n} with its periodic extension to \mathbb{R}. Let ρ\rho be a smooth convolution kernel with compact support, and define

Xi(n)=dyρ(inpy)sn(y).X_{i}^{(n)}=\int_{\mathbb{R}}{\rm d}y\,\rho\left(\frac{i}{\lfloor n^{p}\rfloor}-y\right)s_{n}(y). (2.16)

In this case (n)=np\ell(n)=n^{p} and

K(v)=inff{(f|f):v(x)=dyρ(xy)f(y),x}.K(v)=\inf_{f^{\circ}}\left\{{\mathscr{H}}(f^{\circ}\,|\,f):v(x)=\int_{\mathbb{R}}{\rm d}y\,\rho(x-y)f^{\circ}(y),\,x\in\mathbb{R}\right\}. (2.17)

We refer the reader to [13, Example 2.5] for the arguments underlying this result.
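The construction (2.15)-(2.16) can be mimicked numerically. In the sketch below (ours; the triangular kernel, the discretised convolution and all parameter values are illustrative assumptions), the mollified types remain in [0,1] because each is a convex combination of the draws:

```python
import numpy as np

rng = np.random.default_rng(1)

def smoothed_types(n, p, rho, support, f_sample):
    """Types as in (2.16): mollify the periodic step function s_n built
    from m = floor(n^p) i.i.d. draws, via a discretised convolution."""
    m = int(np.floor(n ** p))
    Y = f_sample(rng, m)
    def s(y):                          # periodic step function (2.15)
        return Y[int(np.floor((y % 1.0) * m)) % m]
    grid = np.linspace(-support, support, 401)
    w = rho(grid)
    w = w / w.sum()                    # normalise the discretised kernel
    return np.array([sum(wk * s(i / m - g) for wk, g in zip(w, grid))
                     for i in range(1, n + 1)])

# Triangular kernel on [-0.1, 0.1]; uniform draws on [0, 1].
rho = lambda t: np.maximum(0.0, 1.0 - np.abs(t) / 0.1)
X = smoothed_types(n=30, p=0.5, rho=rho, support=0.1,
                   f_sample=lambda r, m: r.uniform(size=m))
assert X.min() >= 0.0 and X.max() <= 1.0
```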

Assumption 2.3.

The function Fg[F]F\mapsto g^{[F]} defined in (2.13) is a continuous mapping from ([0,1])\mathcal{M}([0,1]) to (𝒲,L1)(\mathscr{W},\lVert\cdot\rVert_{L_{1}}). \diamondsuit

Assumption 2.3 holds, for example, when H(x,y,F)H(x,y)H(x,y,F)\equiv H^{*}(x,y) (i.e., there is no dependence on the type distribution) and H:[0,1]2[0,1]H^{*}\colon\,[0,1]^{2}\to[0,1] is a continuous function. Assumption 2.3 also holds when, in addition, f:([0,1])[0,1]f\colon\,\mathcal{M}([0,1])\to[0,1] and h:[0,1]2[0,1]h\colon\,[0,1]^{2}\to[0,1] are continuous functions, and

H(x,y,F)=h(H^{*}(x,y),f(F))\qquad\forall\,\,(x,y)\in[0,1]^{2},\,F\in\mathcal{M}([0,1]). (2.18)

In certain settings we require two further assumptions that are of a more technical nature.

Assumption 2.4.

For all F([0,1])F\in\mathcal{M}([0,1]), the induced graphon g[F]g^{[F]} is away from the boundary, i.e.,

ηg[F](x,y)1η(x,y)[0,1]2,\eta\leq g^{[F]}(x,y)\leq 1-\eta\qquad\forall\,\,(x,y)\in[0,1]^{2}, (2.19)

for some η>0\eta>0. \diamondsuit

Assumption 2.5.

The rate function KK has a unique zero, labelled FF^{*}. \diamondsuit

2.5. LDP for IRGTs

Let

J(h~)=infF([0,1]):g~[F]=h~K(F)J(\tilde{h})=\inf_{F\in\mathcal{M}([0,1]):\tilde{g}^{[F]}=\tilde{h}}K(F) (2.20)

and recall that (r,h~)Ir(h~)(r,\tilde{h})\mapsto I_{r}(\tilde{h}) is a function from 𝒲×𝒲~\mathscr{W}\times\tilde{\mathscr{W}} to +\mathbb{R}_{+}. We are now ready to state our LDP for IRGTs.

Theorem 2.6.

Subject to Assumptions 2.2 and 2.3 the following hold:

  • (i)

    If \ell(n)=o\left({n\choose 2}\right), then \{\tilde{h}^{G_{n}}\}_{n\in\mathbb{N}} satisfies the LDP with rate \ell(n) and with rate function I^{*}(\tilde{h})=J(\tilde{h}).

  • (ii)

    If \lim_{n\to\infty}\ell(n)/{n\choose 2}=c\in(0,\infty) and Assumption 2.4 holds as well, then \{\tilde{h}^{G_{n}}\}_{n\in\mathbb{N}} satisfies the LDP with rate {n\choose 2} and with rate function I^{*}(\tilde{h})=\inf_{\tilde{g}\in\tilde{\mathscr{W}}}[cJ(\tilde{g})+I_{g}(\tilde{h})], where g is any representative of \tilde{g}.

  • (iii)

    If {n\choose 2}=o(\ell(n)) and Assumptions 2.4 and 2.5 hold as well, then \{\tilde{h}^{G_{n}}\}_{n\in\mathbb{N}} satisfies the LDP with rate {n\choose 2} and with rate function I^{*}(\tilde{h})=I_{g^{[F^{*}]}}(\tilde{h}).

To understand where Theorem 2.6 comes from, it is instructive to realise that two random mechanisms play a role, and that the dominant mechanism determines the rare-event behaviour. Concretely, think of simulating outcomes of \tilde{h}^{G_{n}} in two steps:

  • \circ

    Simulate the types of the vertices, i.e., simulate the type distribution FnF_{n}.

  • \circ

    Simulate the edges given F_{n}, i.e., simulate \tilde{h}^{G_{n}} via (2.14), as the empirical graphon of an inhomogeneous Erdős-Rényi random graph with reference graphon g^{[F_{n}]}.

Due to Assumption 2.2, large fluctuations in Step 1 are governed by the LDP with rate \ell(n) and with rate function K(\cdot), whereas due to (2.14) large fluctuations in Step 2 are governed by the LDP with rate {n\choose 2} and with rate function I_{g^{[F_{n}]}}. In particular, this implies that when \ell(n)=o\left({n\choose 2}\right) large fluctuations in \tilde{h}^{G_{n}} are most likely to be caused by a rare event in Step 1, whereas when {n\choose 2}=o(\ell(n)) large fluctuations in \tilde{h}^{G_{n}} are most likely to be caused by a rare event in Step 2. The regime \ell(n)\asymp{n\choose 2} can be viewed as ‘balanced’, in the sense that large fluctuations in \tilde{h}^{G_{n}} are most likely to be caused by a combination of rare events in both Steps 1 and 2. When \ell(n)=o\left({n\choose 2}\right) we say that the IRGT exhibits vertex-level fluctuations, whereas when {n\choose 2}=o(\ell(n)) we say that it exhibits edge-level fluctuations.
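This two-step picture can be summarised by an informal cost computation (a heuristic of ours, not part of the formal statements): on the exponential scale,

```latex
\mathbb{P}\big(\tilde{h}^{G_{n}}\approx\tilde{h}\big)
\;\approx\;\sup_{F\in\mathcal{M}([0,1])}
\exp\!\big(-\ell(n)\,K(F)\big)\,
\exp\!\Big(-\tbinom{n}{2}\,I_{g^{[F]}}(\tilde{h})\Big),
```

so the slower-growing of the two rates dominates: if \ell(n)=o({n\choose 2}) only the type cost K survives (case (i) of Theorem 2.6), if {n\choose 2}=o(\ell(n)) the type distribution is forced to its typical value F^{*} and only I_{g^{[F^{*}]}} survives (case (iii)), and in the balanced regime both costs contribute (case (ii)).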

The IRGT is of interest in its own right. However, our primary motivation for introducing the IRGT is that it can be generalized in a natural way to a stochastic process. The rough idea behind its dynamic counterpart is that at each point in time the distribution of the process corresponds to an IRGT. We will focus primarily on processes that exhibit vertex-level fluctuations.

3. Graphon-valued processes

In Section 3.1 we first introduce the graph-valued process of interest, which can be viewed as the dynamic counterpart of the model discussed in Section 2. Section 3.2 describes an illustrative example. Section 3.3 states the sample-path LDP for the graph-valued process under the assumption that the driving process satisfies an LDP. Section 3.4 states the stochastic process limit for the graph-valued process under the assumption that the driving process satisfies a stochastic process limit. The latter assumption is investigated in Appendix A.

3.1. The model

For a given time horizon TT, let (Gn(t))t[0,T](G_{n}(t))_{t\in[0,T]} denote our graph-valued process. This process is constructed as follows. Suppose that each vertex i[n]i\in[n] has a type Xi(n)(t)X^{(n)}_{i}(t) that may fluctuate randomly over time. Let (μn(t))t[0,T](\mu_{n}(t))_{t\in[0,T]} denote the process of empirical type measures defined by

μn(t)=1ni=1nδXi(n)(t),\mu_{n}(t)=\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}^{(n)}(t)}, (3.1)

i.e., the dynamic version of (2.8). This process evolves autonomously, i.e., independently of the graph-valued process (Gn(t))t[0,T](G_{n}(t))_{t\in[0,T]}.

In addition, let (Fn(t;))t[0,T](F_{n}(t;\cdot))_{t\in[0,T]} denote the process of empirical type distribution defined by

Fn(t;x)=1ni=1n𝟙{Xi(n)(t)x},x[0,1],F_{n}(t;x)=\frac{1}{n}\sum_{i=1}^{n}\mathbbm{1}\{X_{i}^{(n)}(t)\leq x\},\qquad x\in[0,1], (3.2)

i.e., the dynamic counterpart of (2.9). The process (μn(t))t[0,T](\mu_{n}(t))_{t\in[0,T]} lives on D(([0,1]),[0,T])D(\mathcal{M}([0,1]),[0,T]), the space of ([0,1])\mathcal{M}([0,1])-valued càdlàg paths. We suppose that, at any given time tt, edge ijij is active with probability

H(t;Xi(n)(t),Xj(n)(t),(Fn(t;))t[0,T]),H\big{(}t;X^{(n)}_{i}(t),X^{(n)}_{j}(t),(F_{n}(t;\cdot))_{t\in[0,T]}\big{)}, (3.3)

independently of all other edges at time tt, of Xi(n)(t)X^{(n)}_{i}(t), Xj(n)(t)X^{(n)}_{j}(t), and of (Fn(t;))t[0,T](F_{n}(t;\cdot))_{t\in[0,T]}, where

H:[0,T]×[0,1]2×D(([0,1]),[0,T])[0,1].H\colon\,[0,T]\times[0,1]^{2}\times D(\mathcal{M}([0,1]),[0,T])\mapsto[0,1]. (3.4)

This function HH gives rise to the induced reference graphon process g[F]g^{[F]}, which, for FD(([0,1]),[0,T])F\in D(\mathcal{M}([0,1]),[0,T]), is characterised by

g[F](t;x,y)=H(t;F¯(t;x),F¯(t;y),(F(t;)t[0,T])).g^{[F]}(t;x,y)=H\big{(}t;\bar{F}(t;x),\bar{F}(t;y),(F(t;\cdot)_{t\in[0,T]})\big{)}. (3.5)

Observe that, for any t[0,T]t\in[0,T], given the outcome of the empirical type distribution Fn(t,)F_{n}(t,\cdot\,), the distribution of h~Gn(t)\tilde{h}^{G_{n}(t)} corresponds to that of an inhomogeneous Erdős-Rényi random graph with reference graphon g[Fn](t;,)g^{[F_{n}]}(t;\cdot,\cdot). In other words, for any t[0,T]t\in[0,T],

hGn(t)|Fn=dhG^n,r=g[Fn](t;),h^{G_{n}(t)}|F_{n}\stackrel{{\scriptstyle\rm d}}{{=}}h^{\widehat{G}_{n}},\qquad r=g^{[F_{n}]}(t;\cdot), (3.6)

where G^n\widehat{G}_{n} is the inhomogeneous Erdős-Rényi random graph defined in Section 2.2. We make the following assumption on the function Fg[F]F\mapsto g^{[F]}, which due to (3.5) is an assumption on HH.

Assumption 3.1.

The map F\mapsto g^{[F]} from D(\mathcal{M}([0,1]),[0,T]) to D((\mathscr{W},\lVert\cdot\rVert_{L_{1}}),[0,T]) is continuous. \diamondsuit

3.2. An illustrative example

Suppose that (Gn(t))t[0,T](G_{n}(t))_{t\in[0,T]} is characterised by the following dynamics:

  • Gn(0)G_{n}(0) is the empty graph.

  • Each vertex is assigned an independent Poisson clock with rate \gamma, i.e., the time intervals between two consecutive ring times are exponentially distributed with parameter \gamma. Each time the clock attached to vertex v rings, all the edges adjacent to v become inactive.

  • If the edge ijij is inactive, then it becomes active at rate λ\lambda, independently of anything else.

We first describe the associated driving process. Let {τk(v)}k\{\tau_{k}(v)\}_{k\in\mathbb{N}} denote the sequence of times at which the Poisson clock attached to vertex vv rings, and let

Y_{v}(t):=t-\Big(\max_{k}\{\tau_{k}(v)\colon\,\tau_{k}(v)\leq t\}\vee 0\Big) (3.7)

denote the time since the clock last rung. The value of Yv(t)Y_{v}(t) can be thought of as the age of vertex vv at time tt: each time the clock associated with vv rings, it dies and all its adjacent edges are lost. Recalling that we assumed that types take values in [0,1][0,1], we write

Xv(t):=Fexp(Yv(t))=1eγYv(t)X_{v}(t):=F^{{\rm exp}}(Y_{v}(t))=1-\mathrm{e}^{-\gamma Y_{v}(t)} (3.8)

to denote the type of vertex vv at time tt, where Fexp()F^{{\rm exp}}(\cdot) can be interpreted as the distribution function of an exponential random variable with rate γ\gamma.

The function H(t;u,v,F)H(t;u,v,F) can now also be identified. The probability that there is an active edge between vertices of ages u¯\bar{u} and v¯\bar{v} is 1exp{λ(u¯v¯)}1-\exp\{-\lambda(\bar{u}\wedge\bar{v})\}. Putting u=Fexp(u¯)u=F^{{\rm exp}}(\bar{u}) and v=Fexp(v¯)v=F^{{\rm exp}}(\bar{v}), we obtain, using that u¯=log(1u)/γ\bar{u}=-\log(1-u)/\gamma and v¯=log(1v)/γ\bar{v}=-\log(1-v)/\gamma,

H(t;u,v,F)=1-\exp\left(\frac{\lambda}{\gamma}\log(1-u\wedge v)\right)=1-(1-u\wedge v)^{\lambda/\gamma}. (3.9)

Because H(t;u,v,F)H(t;u,v,F) is a continuous function of uu and vv, and is independent of tt and FF, it is straightforward to verify that Assumption 3.1 holds. A more involved example is given in Section 4.1.
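The identity in (3.9) can be checked numerically: with u = 1 - e^{-\gamma\bar{u}} and v = 1 - e^{-\gamma\bar{v}}, the age-based probability 1 - e^{-\lambda(\bar{u}\wedge\bar{v})} coincides with 1-(1-u\wedge v)^{\lambda/\gamma}. A quick sanity check (ours, with arbitrary ages):

```python
import numpy as np

# Numerical check of (3.9): with u = 1 - exp(-gamma*ubar), the edge
# probability 1 - exp(-lambda*min(ubar, vbar)) must equal
# 1 - (1 - min(u, v))**(lambda/gamma).
gam, lam = 3.0, 6.0
rng = np.random.default_rng(3)
for _ in range(1000):
    ubar, vbar = rng.exponential(1.0, size=2)        # two arbitrary ages
    u, v = 1 - np.exp(-gam * ubar), 1 - np.exp(-gam * vbar)
    lhs = 1 - np.exp(-lam * min(ubar, vbar))
    rhs = 1 - (1 - min(u, v)) ** (lam / gam)
    assert abs(lhs - rhs) < 1e-12
```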

3.3. Sample-path large deviations

As in Section 2.5, we assume that the driving process satisfies the LDP (which for the above illustrative example is established in Lemma A.1).

Assumption 3.2.

{Fn}n\{F_{n}\}_{n\in\mathbb{N}} satisfies the LDP on D(([0,1]),[0,T])D(\mathcal{M}([0,1]),[0,T]) with rate (n)=o((n2))\ell(n)=o({n\choose 2}) and with rate function KK. \diamondsuit

To establish the sample-path LDP for the graphon-valued process \{(\tilde{h}^{G_{n}(t)})_{t\geq 0}\}_{n\in\mathbb{N}}, we need to:

  • (I)

    establish the LDP in the pointwise topology;

  • (II)

    strengthen this topology by establishing exponential tightness.

Step (I) is settled by the following result. For h~D((𝒲~,δ),[0,T])\tilde{h}\in D((\tilde{\mathscr{W}},\delta_{\square}),[0,T]), let

J(h~)=infFD(([0,1]),[0,T]):g~[F]=h~K(F).J(\tilde{h})=\inf_{F\in D(\mathcal{M}([0,1]),[0,T]):\tilde{g}^{[F]}=\tilde{h}}K(F). (3.10)
Proposition 3.3.

If Assumptions 3.1 and 3.2 hold, then the sequence ((h~Gn(t))t0)n((\tilde{h}^{G_{n}(t)})_{t\geq 0})_{n\in\mathbb{N}} satisfies the LDP in the pointwise topology with rate (n)\ell(n) and with rate function J(h~)J(\tilde{h}).

Note that Proposition 3.3 does not refer to any edge-switching dynamics. Specifically, if two processes \{G_{n}\}_{n\in\mathbb{N}} and \{G^{*}_{n}\}_{n\in\mathbb{N}} have a common sequence of types ((X_{i}(t))_{t\geq 0})_{i\in[n]} and a common edge-connection function H, then the marginal distributions are equal, i.e.,

h~Gn(t)=dh~Gn(t),t[0,T].\tilde{h}^{G_{n}(t)}\stackrel{{\scriptstyle\rm d}}{{=}}\tilde{h}^{G^{*}_{n}(t)},\qquad t\in[0,T]. (3.11)

However, this does not necessarily mean that the joint distributions are equivalent, i.e., we may have

(h~Gn(t))t[0,T]d(h~Gn(t))t[0,T],\big{(}\tilde{h}^{G_{n}(t)}\big{)}_{t\in[0,T]}\stackrel{{\scriptstyle\rm d}}{{\neq}}\big{(}\tilde{h}^{G^{*}_{n}(t)}\big{)}_{t\in[0,T]}, (3.12)

because these depend on the specific edge-switching dynamics. Nonetheless, Proposition 3.3 implies that both {Gn}n\{G_{n}\}_{n\in\mathbb{N}} and {Gn}n\{G^{*}_{n}\}_{n\in\mathbb{N}} satisfy equivalent LDPs in the pointwise topology, i.e., the rate function depends only on the marginal distributions of the process and not on the specific edge-switching dynamics. In Sections 4.1 and 4.2 we provide examples of processes with equivalent marginals and different edge-switching dynamics.

The specific edge-switching dynamics do need to be taken into consideration when we want to perform step (II), i.e., strengthen the topology of the LDP in Proposition 3.3 by establishing exponential tightness. We next provide a condition that can be used to verify that {h~Gn}n\{\tilde{h}^{G_{n}}\}_{n\in\mathbb{N}} are exponentially tight. Let

Eij(n)(t)={1,if edge ij is active at time t,0,otherwise,E^{(n)}_{ij}(t)=\begin{cases}1,\quad&\text{if edge $ij$ is active at time $t$,}\\ 0,\quad&\text{otherwise},\end{cases} (3.13)

and define

Cn(t,δ)=1i<jnsuptuvt+δ|Eij(n)(u)Eij(n)(v)|.C_{n}(t,\delta)=\sum_{1\leq i<j\leq n}\sup_{t\leq u\leq v\leq t+\delta}|E^{(n)}_{ij}(u)-E^{(n)}_{ij}(v)|. (3.14)

In other words, Cn(t,δ)C_{n}(t,\delta) is the number of edges that change (i.e., go from active to inactive or vice versa) at some time between tt and t+δt+\delta.
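For piecewise-constant edge paths, C_n(t,\delta) can be computed directly from the jump times. The sketch below is ours (the path encoding is an illustrative assumption); it counts an edge as changed if it attains more than one state on [t, t+\delta]:

```python
def count_changed_edges(paths, t, delta):
    """C_n(t, delta) of (3.14) for piecewise-constant edge paths.

    paths[(i, j)] is a time-sorted list of (time, state) pairs giving
    the state of edge ij from each time onward (state in {0, 1});
    every path starts at time 0.0.
    """
    changed = 0
    for (i, j), path in paths.items():
        states = {s for (u, s) in path if t <= u <= t + delta}
        before = [s for (u, s) in path if u <= t]
        if before:
            states.add(before[-1])   # state entering the window
        changed += len(states) > 1
    return changed

paths = {(1, 2): [(0.0, 0), (0.3, 1)],                # switches on at 0.3
         (1, 3): [(0.0, 1)],                          # never changes
         (2, 3): [(0.0, 0), (0.1, 1), (0.4, 0)]}      # switches off at 0.4
print(count_changed_edges(paths, 0.25, 0.25))  # 2
```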

Proposition 3.4.

If, for all t[0,T]t\in[0,T] and ε>0\varepsilon>0,

limδ0lim supn1(n)log(Cn(t,δ)>ε(n2))=,\displaystyle\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{\ell(n)}\log\mathbb{P}\left(C_{n}(t,\delta)>\varepsilon{n\choose 2}\right)=-\infty, (3.15)

then ((h~Gn(t))t0)n((\tilde{h}^{G_{n}(t)})_{t\geq 0})_{n\in\mathbb{N}} is exponentially tight.

Combining the above two propositions, we obtain the following.

Theorem 3.5.

If the conditions of Propositions 3.3 and 3.4 are satisfied, then the sequence of processes ((\tilde{h}^{G_{n}(t)})_{t\geq 0})_{n\in\mathbb{N}} satisfies the LDP on D(\tilde{\mathscr{W}},[0,T]) with rate \ell(n) and with rate function J(\tilde{h}).

In view of Lemma A.1, the conditions of Theorem 3.5 can be readily verified for the illustrative example. In Theorem 3.10 we establish a sample-path LDP for a class of processes that includes the illustrative example.

3.4. Stochastic process convergence

In the sequel, \Rightarrow denotes convergence in distribution, and fdd\stackrel{{\scriptstyle\rm fdd}}{{\Rightarrow}} convergence of the associated finite-dimensional distributions. We assume that the empirical type distribution has a stochastic process limit.

Assumption 3.6.

Suppose that FnFF_{n}\Rightarrow F as nn\to\infty on D(([0,1]),[0,T])D(\mathcal{M}([0,1]),[0,T]). \diamondsuit

We establish the stochastic process limit of (hGn(t))t[0,T](h^{G_{n}(t)})_{t\in[0,T]} as nn\to\infty on D((𝒲,d),[0,T])D((\mathscr{W},d_{\square}),[0,T]), i.e., we no longer take the quotient with respect to the equivalence relation \sim. To establish a stochastic process limit in this finer topology, as explained below, we need to ensure that the labels of the vertices update dynamically.

Assumption 3.7.

At any time t[0,T]t\in[0,T] the labels of the vertices are such that

X1(t)Xn(t).X_{1}(t)\leq\dots\leq X_{n}(t). (3.16)

\diamondsuit

The importance of Assumption 3.7 is illustrated in Figure 3.4, where we consider the illustrative example from Section 3.2 with t=1t=1, λ=6\lambda=6 and γ=3\gamma=3. The left panel shows an outcome of hG100(1)h^{G_{100}(1)} with a static labelling, where vertices are labelled arbitrarily at time t=0t=0 and their labels do not change over time. We observe that hG100(1)h^{G_{100}(1)} has no discernible structure, and so with probability 1 the sequence (hGn(1))n(h^{G_{n}(1)})_{n\in\mathbb{N}} does not converge to a limit in (𝒲,d)(\mathscr{W},d_{\square}). The center panel of Figure 3.4 illustrates an outcome of hG100(1)h^{G_{100}(1)} under the dynamic labelling given in Assumption 3.7. Under this assumption hG100(1)h^{G_{100}(1)} has a discernible structure and hGn(1)gh^{G_{n}(1)}\to g in (𝒲,d)(\mathscr{W},d_{\square}) as nn\to\infty, where gg is the smooth graphon illustrated in the right panel of Figure 3.4.

Refer to caption
Refer to caption
Refer to caption
Figure 1. An illustration of hG100(1)h^{G_{100}(1)} with static labelling (left panel) and dynamic labelling (center panel), and the corresponding limit for the dynamic labelling in (𝒲,d)(\mathscr{W},d_{\square}) (right panel). In this figure black corresponds to the value 1 and white corresponds to the value 0.

Given the dynamic labelling in Assumption 3.7 and the illustrative example, the motivation behind establishing our stochastic process limits on D((𝒲,d),[0,T])D((\mathscr{W},d_{\square}),[0,T]) rather than on D((𝒲~,δ),[0,T])D((\tilde{\mathscr{W}},\delta_{\square}),[0,T]) is clear: it provides insight into the natural question ‘Are the older vertices more connected than the younger vertices?’ If we establish a limit in D((𝒲,d),[0,T])D((\mathscr{W},d_{\square}),[0,T]), then we have a definitive answer, whereas if we establish a limit in D((𝒲~,δ),[0,T])D((\tilde{\mathscr{W}},\delta_{\square}),[0,T]), then we do not gain any insight.

Proposition 3.8.

If Assumptions 3.1, 3.6 and 3.7 hold, then hGnfddg[F]h^{G_{n}}\stackrel{{\scriptstyle\rm fdd}}{{\Rightarrow}}g^{[F]} as nn\to\infty.

Strengthening the topology to obtain convergence in distribution on D((𝒲,d),[0,T])D((\mathscr{W},d_{\square}),[0,T]) is more difficult than in Section 3.3, because, unlike the space (𝒲~,δ)(\tilde{\mathscr{W}},\delta_{\square}), the space (𝒲,d)(\mathscr{W},d_{\square}) is not Polish, and hence we cannot directly apply established sufficient conditions for tightness, such as those stated in [14, Sections 3.6–3.9]. Nonetheless, we are able to establish convergence directly by using [14, Corollary 3.3].

Assumption 3.9.

For any ε>0\varepsilon>0,

limδ0lim supnsupt[0,T]Tδ(Cn(t,δ)>ε(n2))=0.\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\sup_{t\in[0,T]}\frac{T}{\delta}\,\mathbb{P}\left(C_{n}(t,\delta)>\varepsilon{n\choose 2}\right)=0. (3.17)
Theorem 3.10.

Subject to Assumptions 3.1, 3.6, 3.7 and 3.9, hGng[F]h^{G_{n}}\Rightarrow g^{[F]} as nn\to\infty in D((𝒲,d),[0,T])D((\mathscr{W},d_{\square}),[0,T]).

In view of Lemma A.1, the conditions of Theorem 3.10 can again be readily verified for the illustrative example. We establish a more general result in Proposition 4.2.

4. Applications and extensions

In this section we consider a class of processes that generalise the illustrative example. We use this class of processes to make three points that we believe apply more generally:

  • (I)

    An additional layer of dependence between the edges can be introduced that cannot be captured by the types of the vertices (so that (3.6) no longer holds), but that still allows us to establish limiting results in the spirit of Section 3. Roughly speaking, this is the case when the additional layer of dependence between the edges is of mean-field type (see Section 4.1).

  • (II)

    The specific edge-switching dynamics rarely affects the limiting path of the process (see Section 4.2).

  • (III)

    The dependence between edges in inhomogeneous random graphs with type dependence leads to new behaviour in the corresponding variational problems, even in relatively simple settings (see Section 4.3).

4.1. Beyond conditional independence of edges

4.1.1. Model and LDP

Suppose that (Gn(t))t[0,T](G_{n}(t))_{t\in[0,T]} is characterised by the following dynamics:

  • Gn(0)G_{n}(0) is the empty graph.

  • Each vertex is assigned an independent rate-γ\gamma Poisson clock and, each time the clock associated with vertex vv rings, all the edges that are adjacent to vv become inactive. Let {τk(v)}k\{\tau_{k}(v)\}_{k\in\mathbb{N}} denote the sequence of times at which the Poisson clock attached to vertex vv rings, and let, as before,

    Xv(t):=tmaxk{τk(v):τk(v)t}0X_{v}(t):=t-\max_{k}\{\tau_{k}(v)\colon\,\tau_{k}(v)\leq t\}\vee 0 (4.1)

    denote the time since the clock last rang (i.e., the ‘age’ of vertex vv). To ensure that Xv(t)[0,1]X_{v}(t)\in[0,1] we take T=1T=1. Note that the definition of Xv(t)X_{v}(t) here differs slightly from that in the illustrative example (Section 3.2).

  • If edge ijij is inactive, then it becomes active at rate λ(t,Xi(t),Xj(t),Fn(t;),h~Gn(t))\lambda(t,X_{i}(t),X_{j}(t),F_{n}(t;\cdot),\tilde{h}^{G_{n}(t)}).

  • If edge ijij is active, then it becomes inactive at rate μ(t,Xi(t),Xj(t),Fn(t;),h~Gn(t))\mu(t,X_{i}(t),X_{j}(t),F_{n}(t;\cdot),\tilde{h}^{{G_{n}(t)}}).

We assume that λ\lambda and μ\mu are Lipschitz-continuous functions on [0,T]×[0,1]2×([0,1])×𝒲~[0,T]\times[0,1]^{2}\times\mathcal{M}([0,1])\times\tilde{\mathscr{W}}.
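For readers who wish to experiment, the dynamics above can be sketched in a few lines (our illustration, not the authors' code): a forward-Euler discretisation in which, during each time step of length dt, an event of rate rr fires with probability rdtr\,{\rm d}t. The empirical-distribution and graphon arguments of λ\lambda and μ\mu are suppressed, so the rate functions passed in here are simplified stand-ins.

```python
import random

def simulate(n, lam, mu, gamma, T=1.0, dt=1e-3, seed=0):
    """Discrete-time sketch of the dynamics of Section 4.1: rate-gamma
    clocks reset vertex ages and clear all incident edges; edge ij turns
    on at rate lam(t, X_i, X_j) and off at rate mu(t, X_i, X_j).
    (The F_n and graphon arguments of the rates are omitted for brevity.)"""
    rng = random.Random(seed)
    age = [0.0] * n                       # X_v(t): time since v's clock rang
    active = [[0] * n for _ in range(n)]  # symmetric 0/1 adjacency matrix
    t = 0.0
    while t < T:
        for v in range(n):
            if rng.random() < gamma * dt:      # clock of v rings
                age[v] = 0.0
                for w in range(n):             # all incident edges die
                    active[v][w] = active[w][v] = 0
            else:
                age[v] += dt
        for i in range(n):
            for j in range(i + 1, n):
                r = rng.random()
                if active[i][j] == 0 and r < lam(t, age[i], age[j]) * dt:
                    active[i][j] = active[j][i] = 1
                elif active[i][j] == 1 and r < mu(t, age[i], age[j]) * dt:
                    active[i][j] = active[j][i] = 0
        t += dt
    return age, active
```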

If λ()\lambda(\cdot) and μ()\mu(\cdot) do not depend on the current state of the unlabelled graph h~Gn(t)\tilde{h}^{{G_{n}(t)}}, i.e., if

λ(t,u,v,F,h~)λ(t,u,v,F),μ(t,u,v,F,h~)μ(t,u,v,F),\lambda(t,u,v,F,\tilde{h})\equiv\lambda(t,u,v,F),\qquad\mu(t,u,v,F,\tilde{h})\equiv\mu(t,u,v,F), (4.2)

then the process fits into the framework of Section 3, otherwise it does not. To understand why, we compute the probability that edge ijij is active under (4.2) at time tt given Xi(t)=xiX_{i}(t)=x_{i}, Xj(t)=xjX_{j}(t)=x_{j} and Fn=FF_{n}=F. We have

H(t;xi,xj,F)\displaystyle H(t;x_{i},x_{j},F) =txixjtdsλ(s,xit+s,xjt+s,F(s;))\displaystyle=\int_{t-x_{i}\wedge x_{j}}^{t}{\rm d}s\,\lambda(s,x_{i}-t+s,x_{j}-t+s,F(s;\cdot)) (4.3)
×exp{stda[μ(a,xit+a,xjt+a,F(a;))\displaystyle\qquad\times\exp\bigg{\{}-\int_{s}^{t}{\rm d}a\,[\mu(a,x_{i}-t+a,x_{j}-t+a,F(a;\cdot))
+λ(a,xit+a,xjt+a,F(a;))]};\displaystyle\qquad\qquad+\lambda(a,x_{i}-t+a,x_{j}-t+a,F(a;\cdot))]\bigg{\}};

an explanation of the expression on the right-hand side is given below, when we discuss a related differential equation version. It is also easy to see that two edges ijij and kk\ell are independent given tt, Xi(t)X_{i}(t), Xj(t)X_{j}(t), Xk(t)X_{k}(t), X(t)X_{\ell}(t) and Fn=FF_{n}=F, and hence the process indeed falls into the framework of Section 3. To recover the illustrative example, take λ(t,u,v,F)=λ+\lambda(t,u,v,F)=\lambda\in\mathbb{R}_{+} and μ(t,u,v,F)=0\mu(t,u,v,F)=0. Note that through (3.5) and (4.3), we see that the empirical reference graphon g[F]()g^{[F]}(\cdot) is characterised by

g[F](t;x,y)\displaystyle g^{[F]}(t;x,y) =H(t;F¯(t;x),F¯(t;y),F).\displaystyle=H(t;\bar{F}(t;x),\bar{F}(t;y),F). (4.4)
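As a concrete sanity check (a short computation of ours, not displayed in the paper), take λ\lambda constant and μ0\mu\equiv 0 in (4.3). Then

```latex
H(t;x_i,x_j,F)
  = \int_{t-x_i\wedge x_j}^{t} \lambda\,\mathrm{e}^{-\lambda(t-s)}\,{\rm d}s
  = 1-\mathrm{e}^{-\lambda\,(x_i\wedge x_j)},
```

so the probability that edge ijij is active depends only on the age of the younger endpoint, as it should: the edge can only have appeared after the later of the two clocks last rang.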

An example of a choice for λ()\lambda(\cdot) that does depend on the current state of the unlabelled graph h~Gn(t)\tilde{h}^{{G_{n}(t)}} is

λ(t,u,v,F,h~)=1+s(G,h~),\lambda(t,u,v,F,\tilde{h})=1+s(G,\tilde{h}), (4.5)

where GG is a simple graph (e.g. a triangle) and s(G,h~)s(G,\tilde{h}) denotes the homomorphism density of GG in h~\tilde{h}. Note that, by the counting lemma (see, for instance, [8, Proposition 2.2]), this particular choice of λ()\lambda(\cdot) is Lipschitz-continuous. In this case, the two edges ijij and kk\ell are not independent given tt, Xi(t)X_{i}(t), Xj(t)X_{j}(t), Xk(t)X_{k}(t), X(t)X_{\ell}(t) and Fn=FF_{n}=F. Indeed, if edge ijij is active, then it may participate in additional copies of GG, which means that edge kk\ell is more likely to be active. This dependence is inherent to the model, in that it cannot be removed by changing the definition of the types Xi(t)X_{i}(t).

We continue by giving some background on the next main result, Theorem 4.1, which demonstrates that, despite the above observation, we can still establish a sample-path LDP for the process. In order to express the rate function of this LDP, we need to define a mapping Fg(F)F\mapsto g^{(F)} such that g(F)(t;x,y)g^{(F)}(t;x,y) can be interpreted as the (approximate) probability that there is an edge between vertices nx\lceil nx\rceil and ny\lceil ny\rceil at time tt. We proceed in a similar manner as above. Given Xi(t)=xiX_{i}(t)=x_{i}, Xj(t)=xjX_{j}(t)=x_{j} and Fn=FF_{n}=F, the probability that there is an edge between vertices ii and jj is given by

H(t;xi,xj,F)\displaystyle H(t;x_{i},x_{j},F) =txixjtdsλ(s,xit+s,xjt+s,F(s;),h~G(s))\displaystyle=\int_{t-x_{i}\wedge x_{j}}^{t}{\rm d}s\,\lambda(s,x_{i}-t+s,x_{j}-t+s,F(s;\cdot),\tilde{h}^{G(s)}) (4.6)
×exp{stda[μ(a,xit+a,xjt+a,F(a;),h~G(a))\displaystyle\qquad\times\exp\bigg{\{}-\int_{s}^{t}{\rm d}a\,[\mu(a,x_{i}-t+a,x_{j}-t+a,F(a;\cdot),\tilde{h}^{G(a)})
+λ(a,xit+a,xjt+a,F(a;),h~G(a))]}.\displaystyle\qquad\qquad+\lambda(a,x_{i}-t+a,x_{j}-t+a,F(a;\cdot),\tilde{h}^{G(a)})]\bigg{\}}.

Because this expression depends on h~G()\tilde{h}^{G(\cdot)} (in addition to FF and tt), it cannot be used to define a mapping Fg(F)F\mapsto g^{(F)} through (3.5). However, if we tacitly assume that h~G()\tilde{h}^{G(\cdot)} is well approximated by g~(F)()\tilde{g}^{(F)}(\cdot), then we can use (4.6) with h~G(t)\tilde{h}^{G(t)} replaced by g~(F)\tilde{g}^{(F)} in combination with (3.5), to implicitly define the mapping Fg(F)F\mapsto g^{(F)} by

g(F)(t;x,y)\displaystyle g^{(F)}(t;x,y) =tF¯(t;x)F¯(t;y)tdsλ(s,F¯(s;x)t+s,F¯(s;y)t+s,F(s;),g~(F)(s))\displaystyle=\int_{t-\bar{F}(t;x)\wedge\bar{F}(t;y)}^{t}{\rm d}s\,\lambda(s,\bar{F}(s;x)-t+s,\bar{F}(s;y)-t+s,F(s;\cdot),\tilde{g}^{(F)}(s)) (4.7)
×exp{stda[μ(a,F¯(a;x)t+a,F¯(a;y)t+a,F(a;),g~(F)(a))\displaystyle\quad\times\exp\bigg{\{}-\int_{s}^{t}{\rm d}a\,[\mu(a,\bar{F}(a;x)-t+a,\bar{F}(a;y)-t+a,F(a;\cdot),\tilde{g}^{(F)}(a))
+λ(a,F¯(a;x)t+a,F¯(a;y)t+a,F(a;),g~(F)(a))]}.\displaystyle\quad\quad+\lambda(a,\bar{F}(a;x)-t+a,\bar{F}(a;y)-t+a,F(a;\cdot),\tilde{g}^{(F)}(a))]\bigg{\}}.

The mapping in (4.7) is well-defined for all the relevant paths of FF (being such that F¯XF\in\bar{\mathcal{M}}_{X}, where ¯X\bar{\mathcal{M}}_{X} is defined below), which can be seen by re-expressing it as the solution to a differential equation. For any i[n]i\in[n], Xi()X_{i}(\cdot) is a random variable on

DX={s():s(t)=tmax{τi:τit} for some 0τ1<<τkT}D_{X}=\{s(\cdot):s(t)=t-\max\{\tau_{i}:\tau_{i}\leq t\}\text{ for some }0\leq\tau_{1}<\dots<\tau_{k}\leq T\} (4.8)

and Fn()F_{n}(\cdot) is a random variable on

X={F:F(t;x)=1ki=1k𝟙{si(t)x} for some s1,,skDX,k}.\mathcal{M}_{X}=\left\{F:F(t;x)=\frac{1}{k}\sum_{i=1}^{k}\mathbbm{1}\{s_{i}(t)\leq x\}\text{ for some }s_{1},\dots,s_{k}\in D_{X},\,k\in\mathbb{N}\right\}. (4.9)

Let ¯X\widebar{\mathcal{M}}_{X} denote the closure of X\mathcal{M}_{X}, and observe that ¯X\widebar{\mathcal{M}}_{X} contains all possible paths of FnF_{n} and their limits. If F¯XF\in\widebar{\mathcal{M}}_{X} and u=F(t;F¯(t+dt;u)dt)u^{\prime}=F(t;\bar{F}(t+{\rm d}t;u)-{\rm d}t) for u[0,1]u\in[0,1], then the mapping in (4.7) satisfies the differential equation: g(F)(0;,)=0g^{(F)}(0;\cdot,\cdot)=0,

g(F)(t+dt;x,y)=g(F)(t;x,y)+dt[1g(F)(t;x,y)]λ[t,F¯(t;x),F¯(t;y),F(t;),g~(F)(t;)]dtg(F)(t;x,y)μ[t,F¯(t;x),F¯(t;y),F(t;),g~(F)(t;)],\displaystyle\begin{split}g^{(F)}(t+{\rm d}t;x,y)&=g^{(F)}(t;x^{\prime},y^{\prime})\\ &\qquad+{\rm d}t\,[1-g^{(F)}(t;x^{\prime},y^{\prime})]\,\lambda[t,\bar{F}(t;x^{\prime}),\bar{F}(t;y^{\prime}),F(t;\cdot),\tilde{g}^{(F)}(t;\cdot)]\\ &\qquad-{\rm d}t\,g^{(F)}(t;x^{\prime},y^{\prime})\,\mu[t,\bar{F}(t;x^{\prime}),\bar{F}(t;y^{\prime}),F(t;\cdot),\tilde{g}^{(F)}(t;\cdot)],\end{split} (4.10)

if F¯(t+dt;x)F¯(t+dt;y)dt\bar{F}(t+{\rm d}t;x)\wedge\bar{F}(t+{\rm d}t;y)\geq{\rm d}t and 0 otherwise.

The differential equation in (4.10) can be understood as follows. Consider the process (hGn(t))t[0,T](h^{G_{n}(t)})_{t\in[0,T]} under Assumption 3.7, so that the vertices are labelled in order of increasing age. Informally, given Fn=FF_{n}=F, g(F)(t+dt;x,y)g^{(F)}(t+{\rm d}t;x,y) can be thought of as the probability that edge (nx,ny)(\lceil nx\rceil,\lceil ny\rceil) is active at time t+dtt+{\rm d}t. For F¯XF\in\widebar{\mathcal{M}}_{X}, under Assumption 3.7, at time tt these vertices had the labels (nx,ny)(\lceil nx^{\prime}\rceil,\lceil ny^{\prime}\rceil), respectively (with x,yx^{\prime},y^{\prime} given above (4.10)). The first term in the right-hand side of (4.10), g(F)(t;x,y)g^{(F)}(t;x^{\prime},y^{\prime}), is the probability that the edge was active at time tt, the second term accounts for the event that the edge turned on during the time interval [t,t+dt][t,t+{\rm d}t], while the third term accounts for the event that it turned off during the time interval [t,t+dt][t,t+{\rm d}t].
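When the two ages are held fixed and the rates do not depend on the graph (the setting of (4.2) and (4.3)), the recursion (4.10) collapses to the scalar ODE H=(1H)λHμH'=(1-H)\lambda-H\mu started from 0 at the last reset time. A forward-Euler sketch (ours; `lam` and `mu` are stand-in rate functions of the time variable only):

```python
import math

def edge_probability(lam, mu, x_min, t, steps=10000):
    """Forward-Euler solve of  H'(s) = (1 - H(s)) * lam(s) - H(s) * mu(s)
    on [t - x_min, t] with H(t - x_min) = 0, where x_min = x_i ^ x_j is the
    younger of the two ages.  Its value at time t reproduces the integral
    formula (4.3) for fixed vertex ages (F-dependence suppressed)."""
    s0 = t - x_min
    dt = x_min / steps
    H = 0.0
    for k in range(steps):
        s = s0 + k * dt
        H += dt * ((1.0 - H) * lam(s) - H * mu(s))
    return H
```

For constant lam and zero mu this converges to the closed form 1 - exp(-lam * x_min), matching the direct evaluation of (4.3).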

Theorem 4.1.

The sequence of processes {(h~Gn(t))t0}n\{(\tilde{h}^{G_{n}(t)})_{t\geq 0}\}_{n\in\mathbb{N}} satisfies the LDP with rate nn and with rate function

J(h~)=infFD(([0,1]),[0,T]):g~(F)=h~K(F),J(\tilde{h})=\inf_{F\in D(\mathcal{M}([0,1]),[0,T])\colon\,\tilde{g}^{(F)}=\tilde{h}}K(F), (4.11)

where g()g^{(\cdot)} is defined by (4.10) and the function K()K(\cdot) is given in Proposition A.1.

Proposition 4.2.

Suppose that Assumption 3.7 holds. Then (hGn(t))t0g(F)(h^{G_{n}(t)})_{t\geq 0}\Rightarrow g^{(F^{*})} as nn\to\infty, with FF^{*} characterised by

F(t;x)={1eγx,if x<t,1otherwise,F^{*}(t;x)=\begin{cases}1-\mathrm{e}^{-\gamma x},\quad&\text{if }x<t,\\ 1&\text{otherwise},\end{cases} (4.12)

and g()g^{(\cdot)} defined by (4.10).
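The form of FF^{*} can be verified by a short direct computation (ours, not displayed in the paper): the clock of vertex vv rings according to a rate-γ\gamma Poisson process, so for 0x<t0\leq x<t,

```latex
\mathbb{P}\big(X_v(t) > x\big)
  = \mathbb{P}\big(\text{no ring of the clock of } v \text{ in } (t-x,\,t]\big)
  = \mathrm{e}^{-\gamma x},
```

while Xv(t)tX_{v}(t)\leq t always holds, so the distribution function equals 1 for xtx\geq t. By the law of large numbers the empirical type distribution Fn(t;)F_{n}(t;\cdot) concentrates around this distribution function, which is precisely (4.12).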

Theorem 4.1 and Proposition 4.2 are interesting because they show that we can add a ‘mean-field type’ interaction between the edges and still obtain equivalent results.

4.1.2. Numerical illustration

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 2. An illustration of hG100(t)h^{G_{100}}(t) (top two rows) and g(F)(t)g^{(F^{*})}(t) (bottom two rows) for t=16,13,12,23,56,1t=\frac{1}{6},\frac{1}{3},\frac{1}{2},\frac{2}{3},\frac{5}{6},1.

To illustrate Theorem 4.1 and Proposition 4.2 we suppose that

γ=3,λ(t,u,v,F,h~)=4+60s(Δ,h~),andμ(t,u,v,F,h~)=120(uv)2,\gamma=3,\quad\lambda(t,u,v,F,\tilde{h})=4+60\,s(\Delta,\tilde{h}),\quad\text{and}\quad\mu(t,u,v,F,\tilde{h})=120(u-v)^{2}, (4.13)

where s(Δ,h~)s(\Delta,\tilde{h}) denotes the triangle density in h~\tilde{h}. In Figure 4.1.2 we plot, under Assumption 3.7, hGn()h^{G_{n}}(\cdot) for n=100n=100 and the corresponding fluid limit g(F)g^{(F^{*})} given in Proposition 4.2 for t=16,13,12,23,56,1t=\frac{1}{6},\frac{1}{3},\frac{1}{2},\frac{2}{3},\frac{5}{6},1.

The evolution of the graphon-valued processes illustrated in Figure 4.1.2 can be understood by analysing the rates in Equation (4.13). When tt is small (i.e., t16t\leq\frac{1}{6}), the constant term ‘4’ in λ()\lambda(\cdot) is much larger than the terms 120(uv)2120(u-v)^{2} and 60s(Δ,h~)60s(\Delta,\tilde{h}). This is because all vertices start with the same type 0 (so that 120(Xi(t)Xj(t))2=0120(X_{i}(t)-X_{j}(t))^{2}=0 with high probability) and initially the graph contains no edges (so that s(Δ,h~Gn(t))0s(\Delta,\tilde{h}^{G_{n}(t)})\approx 0). Consequently, at t=16t=\frac{1}{6}, the graph is similar to what it would be if λ()=4\lambda(\cdot)=4 and μ()=0\mu(\cdot)=0, as in the illustrative example. As tt increases, both the triangle density and the squared difference in the ages (Xi(t)Xj(t))2(X_{i}(t)-X_{j}(t))^{2} (on average) increase, which means that the terms 60s(Δ,h~Gn(t))60s(\Delta,\tilde{h}^{G_{n}(t)}) and 120(Xi(t)Xj(t))2120(X_{i}(t)-X_{j}(t))^{2} have an increasing influence on the dynamics of the process. Initially, the term 60s(Δ,h~Gn(t))60s(\Delta,\tilde{h}^{G_{n}(t)}) grows faster (on average) than 120(Xi(t)Xj(t))2120(X_{i}(t)-X_{j}(t))^{2} and, consequently, when t=13,12,23t=\frac{1}{3},\frac{1}{2},\frac{2}{3}, the process hGn(t)h^{G_{n}(t)} has a high edge density. However, as tt increases further (t=56,1t=\frac{5}{6},1) the squared differences in the ages 120(Xi(t)Xj(t))2120(X_{i}(t)-X_{j}(t))^{2} (on average) become increasingly dominant and, consequently, pairs of vertices with significantly different ages are much less likely to be connected by an edge. Due to Assumption 3.7 these vertices lie away from the diagonal connecting (0,0)(0,0) to (1,1)(1,1), and therefore we observe fewer edges in these regions of the plots.

Refer to caption
Refer to caption
Figure 3. Top panel: s(Δ,h~(G100)(t))s(\Delta,\tilde{h}^{(G_{100})}(t)) (solid line), s(Δ,g~(F100)(t))s(\Delta,\tilde{g}^{(F_{100})}(t)) (dashed line), and s(Δ,g~(F¯)(t))s(\Delta,\tilde{g}^{(\bar{F})}(t)) (dotted line) for t[0,1]t\in[0,1]. Bottom panel: s(Δ,g~(F)(t))s(\Delta,\tilde{g}^{(F^{*})}(t)) (dotted line) and s(Δ,g~(Fn)(t))s(\Delta,\tilde{g}^{(F_{n})}(t)) for n=102,103,104,105n=10^{2},10^{3},10^{4},10^{5} (solid lines) and t[0,1]t\in[0,1].

In the top panel of Figure 4.1.2 we plot the triangle density of hG100(t)h^{G_{100}}(t) (solid line), g(F100)(t)g^{(F_{100})}(t) (dashed line), and g(F)g^{(F^{*})} (dotted line). To obtain the outcome of g(F100)(t)g^{(F_{100})}(t), we take the same outcome of the empirical type process F100()F_{100}(\cdot) that was used to generate the outcome of hG100(t)h^{G_{100}}(t) and insert it into the differential equation in (4.10). We can therefore view the path of hG100(t)h^{G_{100}}(t) as having randomness in both the types of the vertices and the outcomes of the edges, and view the path of g(F100)(t)g^{(F_{100})}(t) as the same path but with the randomness in the outcomes of the edges removed. Because there are n=100n=100 vertices but (n2)=4950{n\choose 2}=4950 edges, it is much more computationally efficient to simulate g(F100)(t)g^{(F_{100})}(t) than it is to simulate hG100(t)h^{G_{100}}(t). We note that in the top panel of Figure 4.1.2 the solid and dashed curves follow each other closely. This is because, as Theorem 4.1 suggests, fluctuations in the process are considerably more likely to be caused by fluctuations in the types of the vertices than in the specific outcomes of the edges given these types. This means that, for the purposes of simulation, when nn is sufficiently large (n100n\geq 100) it suffices to simulate g(Fn)(t)g^{(F_{n})}(t). In the bottom panel of Figure 4.1.2 we plot the triangle density of g(Fn)(t)g^{(F_{n})}(t) for n=102,103,104,105n=10^{2},10^{3},10^{4},10^{5} and observe convergence to the fluid limit.

4.2. Different edge-switching dynamics, equivalent sample-path LDP

Next suppose that the process h~Gn()\tilde{h}^{G_{n}(\cdot)} has the same underlying driving process (i.e., the same {(Xi(t))t0}i\{(X_{i}(t))_{t\geq 0}\}_{i\in\mathbb{N}}) as in Section 4.1, but different edge-switching dynamics. In particular, we consider edge-switching dynamics inspired by [1]. Let (Uij)1i<jn(U_{ij})_{1\leq i<j\leq n} be a sequence of independent uniform random variables on [0,1][0,1], and suppose that edge ijij is active if

UijH(t;Xi(t),Xj(t),Fn(t;)),U_{ij}\leq H(t;X_{i}(t),X_{j}(t),F_{n}(t;\cdot)), (4.14)

where HH is given by (4.2) and (4.3). Observe that for any edge ijij the random number UijU_{ij} is sampled just once, rendering this process non-Markov. Nonetheless, by Proposition 3.3 we immediately have that, in the pointwise topology, the sequence of processes satisfies the same LDP as the processes described in Section 4.1. To strengthen the topology it remains to establish exponential tightness. This can be done by using Proposition 3.4, which leads to the following result.
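A small sketch (ours) of the one-shot coupling in (4.14) makes the non-Markov property concrete: the thresholds U_ij are drawn once and reused at every time, so the active edge set at any time t is a monotone function of the current values of H.

```python
import random

def make_edge_process(n, seed=0):
    """One-shot uniform coupling as in (4.14): each U_ij is sampled once
    and reused at every time t, so the state at time t is
    {ij : U_ij <= H(t; X_i(t), X_j(t), F_n(t; .))} and raising H can only
    add edges, never remove them."""
    rng = random.Random(seed)
    U = {(i, j): rng.random() for i in range(n) for j in range(i + 1, n)}

    def state(H_vals):
        """H_vals maps edge (i, j) -> its current threshold H; missing
        edges are treated as having threshold 0 (never active)."""
        return {e for e, u in U.items() if u <= H_vals.get(e, 0.0)}

    return state
```

Calling `state` twice with pointwise-ordered thresholds returns nested edge sets, which is the monotonicity used above.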

Proposition 4.3.

The sequence of processes {(h~Gn(t))t0}n\{(\tilde{h}^{G_{n}(t)})_{t\geq 0}\}_{n\in\mathbb{N}} described in Section 4.2 satisfies the LDP with rate nn and with rate function (4.11).

4.3. The most likely path to an unusually small edge density

Consider the illustrative example described in Section 3.2 with λ=γ\lambda=\gamma. Observe that in this case (3.9) simplifies to

H(u,v,F)=uv.H(u,v,F)=u\wedge v. (4.15)

Suppose that we would like to determine the most likely path the process takes to a prescribed edge density ee^{*} at time TT. The standard method involves two steps: first compute the most likely state of the process at time TT given this edge density, and afterwards use this computation to obtain the most likely trajectory of the process. Below we focus only on the first step and demonstrate that it is far simpler to numerically compute the most likely state of the process at time TT when ee^{*} is below the expected edge density than when it is above the expected edge density. Note that, given (4.15), the results in this section apply to the models introduced in Sections 4.1 and 4.2 (with the above simplification).

4.3.1. Most likely state of the process at time TT

Let QQ denote the distribution of Xv(t)X_{v}(t), where Xv(t)X_{v}(t) is defined in (3.8). Note that

Q(dx)={dx,if x<Fexp(T),1Fexp(T),if x=Fexp(T),0,otherwise.Q({\rm d}x)=\begin{cases}{\rm d}x,\qquad&\text{if }x<F^{{\rm exp}}(T),\\ 1-F^{{\rm exp}}(T),\qquad&\text{if }x=F^{{\rm exp}}(T),\\ 0,\qquad&\text{otherwise}.\end{cases} (4.16)

However, for the moment we will assume that QQ is a general measure on [0,1][0,1]. We first consider the event that GnG_{n} has an unusually small edge density ee^{*} at time TT. By Theorem 2.6, the corresponding variational problem is

minimize01log(dPdQ)dPsubjectto2010yxP(dx)P(dy)eoverP([0,1]).\begin{array}[]{lll}&{\rm minimize}&\,\int_{0}^{1}\log\left(\frac{{\rm d}P}{{\rm d}Q}\right){\rm d}P\\[5.69046pt] &{\rm subject\,to}&\,2\int_{0}^{1}\int_{0}^{y}x\,P({\rm d}x)\,P({\rm d}y)\leq e^{*}\\[5.69046pt] &{\rm over}&\,P\in\mathcal{M}([0,1]).\end{array} (4.17)
Proposition 4.4.

The feasible region of the variational problem described in (4.17) is convex.

Because the objective function in (4.17) is strictly convex, Proposition 4.4 implies that the variational problem has a unique local minimum, which is also the global minimum. Consequently, there are several numerical methods that we can apply to find the global minimum.

If instead we consider the probability that Gn(T)G_{n}(T) has an unusually high edge density, then we must solve the same variational problem as described in (4.17) with ‘\leq’ replaced by ‘\geq’. Now the feasible region is no longer convex. Consequently, as we illustrate with a numerical example, the corresponding variational problem may have multiple local minima.

Let

Q(x)={4511000,if x=0,15,if x=110,11000,if x=1.Q(x)=\begin{cases}\frac{4}{5}-\frac{1}{1000},\quad&\text{if }x=0,\\ \frac{1}{5},\quad&\text{if }x=\frac{1}{10},\\ \frac{1}{1000},\quad&\text{if }x=1.\end{cases} (4.18)
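To explore the variational problem numerically for this atomic QQ, one only needs the relative-entropy objective and the edge-density constraint. A minimal sketch (ours, not from the paper), writing the constraint in the form (xy)P(dx)P(dy)\int\int(x\wedge y)\,P({\rm d}x)P({\rm d}y) (which uses H(u,v,F)=uvH(u,v,F)=u\wedge v from (4.15); for atomic measures this form also accounts for the diagonal):

```python
import math

def kl(P, Q):
    """Relative entropy of P with respect to Q (the objective in (4.17)),
    for atomic measures given as weight lists on common atoms."""
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)

def edge_density(P, xs):
    """Expected edge density  sum_{i,j} min(x_i, x_j) p_i p_j  for atoms
    xs with weights P; for non-atomic P this matches the constraint
    integral in (4.17)."""
    return sum(pi * pj * min(xi, xj)
               for pi, xi in zip(P, xs) for pj, xj in zip(P, xs))
```

With QQ as in (4.18) and atoms at 0, 1/10, 1, the unconstrained minimiser is P=QP=Q, with rate 0 and a small expected edge density; prescribing an ee^{*} away from that density forces a strictly positive rate.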

In Figure 4.3.1 we plot the rate (the value of the objective function evaluated at the minima) against the edge density ee^{*}. When e0.085e^{*}\approx 0.085 there are two distinct optimal solutions corresponding to

P1(x)={0.0782,if x=0,0.9159,if x=110,0.0059,if x=1,P2(x)={0.3728,if x=0,0.4020,if x=110,0.2252,if x=1.P^{*}_{1}(x)=\begin{cases}0.0782,\quad&\text{if }x=0,\\ 0.9159,\quad&\text{if }x=\frac{1}{10},\\ 0.0059,\quad&\text{if }x=1,\end{cases}\qquad\qquad P^{*}_{2}(x)=\begin{cases}0.3728,\quad&\text{if }x=0,\\ 0.4020,\quad&\text{if }x=\frac{1}{10},\\ 0.2252,\quad&\text{if }x=1.\end{cases}

For e0.085e^{*}\approx 0.085, solutions near P1P^{*}_{1} and P2P^{*}_{2} are local minima. These local minima are illustrated by the dotted curve in Figure 4.3.1: values above 0.0850.085 correspond to solutions that are close to P1P^{*}_{1} and values below 0.0850.085 correspond to solutions that are close to P2P^{*}_{2}. Observe that we can restrict our search for an optimal measure PP to measures that are absolutely continuous with respect to QQ. For QQ given by (4.18), these measures live on the 2-dimensional simplex. Consequently, there can be at most two local minima. If QQ has a continuous component (as it does in (4.16)), then there is, in principle, no bound on the number of local minima. This makes it difficult to determine whether a global minimum has been reached using numerical methods.

Refer to caption
Figure 4. The rates of the local minima for different prescribed edge densities ee^{*} when QQ is given by (4.18).

5. Proofs

Sections 5.15.3 contain the proofs of the theorems in Sections 24.

5.1. Proofs of the results in Section 2

We prove Theorem 2.6 via a sequence of lemmas. Recall that we use G^n\widehat{G}_{n} to denote an inhomogeneous Erdős–Rényi random graph (IRG) and GnG_{n} to denote an inhomogeneous random graph with type dependence (IRGT).

5.1.1. Inhomogeneous Erdős–Rényi random graphs

The first lemma is similar to results presented in [12] and to [13, Theorem 4.1], the primary difference being that in [12, 13] there is a single reference graphon rr, i.e., rn=rr_{n}=r for all n0n\geq 0. The generalisation comes at the cost of the addition of Assumption 2.4 in the lower bound (which is not made in [13, Theorem 4.1], but is in [12]).

Lemma 5.1.

Let rnr_{n} denote the reference graphon for G^n\widehat{G}_{n} and suppose that rnrr_{n}\to r in L1L^{1}. Then

lim supn1(n2)log(h~G^n𝒞)infh~𝒞I~r(h~),𝒞 closed,\limsup_{n\to\infty}\frac{1}{{n\choose 2}}\log\mathbb{P}(\tilde{h}^{\widehat{G}_{n}}\in\mathcal{C})\leq-\inf_{\tilde{h}\in\mathcal{C}}\tilde{I}_{r}(\tilde{h}),\quad\forall\,\mathcal{C}\text{ closed}, (5.1)

and, subject to Assumption 2.4,

lim infn1(n2)log(h~G^n𝒪)infh~𝒪I~r(h~),𝒪 open.\liminf_{n\to\infty}\frac{1}{{n\choose 2}}\log\mathbb{P}(\tilde{h}^{\widehat{G}_{n}}\in\mathcal{O})\geq-\inf_{\tilde{h}\in\mathcal{O}}\tilde{I}_{r}(\tilde{h}),\quad\forall\,\mathcal{O}\text{ open}. (5.2)
Proof.

The upper bound follows from the same arguments as used in [13, Theorem 4.1] (by noting that the specific requirement that rn=rr_{n}=r is not used there). The lower bound again follows from similar arguments. However, now instead of applying Jensen’s inequality we apply the dominated convergence theorem, which is possible because of Assumption 2.4. ∎

The previous lemma can be used to obtain the following concentration-type result for inhomogeneous Erdős–Rényi random graphs. For r𝒲r\in\mathscr{W}, denote 𝔹(r~,ε)={h~𝒲~:δ(r~,h~)<ε}\mathbb{B}_{\square}(\tilde{r},\varepsilon)=\{\tilde{h}\in\tilde{\mathscr{W}}:\delta_{\square}(\tilde{r},\tilde{h})<\varepsilon\}.

Lemma 5.2.

Let G^n\widehat{G}_{n} be an IRG with reference graphon rn𝒲nr_{n}\in\mathscr{W}_{n}. If rnrL10\lVert r_{n}-r\rVert_{L_{1}}\to 0 for some r𝒲r\in\mathscr{W}, then

lim supn1(n2)log(h~G^n𝔹(r~,ε))ε2.\limsup_{n\to\infty}\frac{1}{{n\choose 2}}\log\mathbb{P}(\tilde{h}^{\widehat{G}_{n}}\notin\mathbb{B}_{\square}(\tilde{r},\varepsilon))\leq-\varepsilon^{2}. (5.3)
Proof.

Suppose that h~𝔹(r~,ε)\tilde{h}\notin\mathbb{B}_{\square}(\tilde{r},\varepsilon), and let hh be any member of the equivalence class h~\tilde{h}. Using a Taylor expansion in the first inequality and Jensen’s inequality in the second inequality, we have

Ir(h)\displaystyle I_{r}(h) =12[0,1]2dxdy[h(x,y)log(h(x,y)r(x,y))+(1h(x,y))log(1h(x,y)1r(x,y))]\displaystyle=\frac{1}{2}\int_{[0,1]^{2}}{\rm d}x\,{\rm d}y\left[h(x,y)\log\left(\frac{h(x,y)}{r(x,y)}\right)+(1-h(x,y))\log\left(\frac{1-h(x,y)}{1-r(x,y)}\right)\right] (5.4)
[0,1]2dxdy(h(x,y)r(x,y))2hrL12d(h,r)2ε2.\displaystyle\geq\int_{[0,1]^{2}}{\rm d}x\,{\rm d}y\,(h(x,y)-r(x,y))^{2}\geq\lVert h-r\rVert_{L_{1}}^{2}\geq d_{\square}(h,r)^{2}\geq\varepsilon^{2}.

Since 𝔹(r~,ε)\mathbb{B}_{\square}(\tilde{r},\varepsilon) is open, its complement is closed, which implies that we can apply the upper bound in Lemma 5.1, from which the result follows. ∎

5.1.2. Inhomogeneous random graphs with type dependence

We next turn our attention to inhomogeneous random graphs with type dependence GnG_{n}. We first use the previous lemma to show that GnG_{n} is close to the induced reference graphon g[Fn]g^{[F_{n}]} with high probability.

Lemma 5.3.

If Assumption 2.3 holds, then

lim supn1(n2)log(h~Gn𝔹~(g~[Fn],ε))ε2.\limsup_{n\to\infty}\frac{1}{{n\choose 2}}\log\mathbb{P}(\tilde{h}^{G_{n}}\notin\tilde{\mathbb{B}}_{\square}(\tilde{g}^{[F_{n}]},\varepsilon))\leq-\varepsilon^{2}. (5.5)
Proof.

Suppose that Assumption 2.3 holds. We proceed by contradiction. Suppose that (5.5) does not hold. Then there necessarily exist sequences (nk)k(n_{k})_{k\in\mathbb{N}}\subseteq\mathbb{N} and (Fnk)k([0,1])(F^{\star}_{n_{k}})_{k\in\mathbb{N}}\subset\mathcal{M}([0,1]) such that

lim infk1(nk2)log(h~Gnk𝔹~(g~[Fnk],ε)Fnk=Fnk)>ε2,\liminf_{k\to\infty}\frac{1}{{n_{k}\choose 2}}\log\mathbb{P}\left(\tilde{h}^{G_{n_{k}}}\notin\tilde{\mathbb{B}}_{\square}(\tilde{g}^{[F^{\star}_{n_{k}}]},\varepsilon)\mid F_{n_{k}}=F^{\star}_{n_{k}}\right)>-\varepsilon^{2}, (5.6)

where, for each kk\in\mathbb{N}, FnkF^{\star}_{n_{k}} is an empirical distribution function with nkn_{k} data points. Since ([0,1])\mathcal{M}([0,1]) is compact, there exists a convergent subsequence of (Fnk)k(F^{\star}_{n_{k}})_{k\in\mathbb{N}}. Consequently, without loss of generality we may assume that there exists FF^{\star} such that FnkFF^{\star}_{n_{k}}\to F^{\star} in ([0,1])\mathcal{M}([0,1]) as kk\to\infty. Under Assumption 2.3 we therefore have

r[Fnk]r[F]L10,as k.\lVert r^{[F^{\star}_{n_{k}}]}-r^{[F^{\star}]}\rVert_{L_{1}}\to 0,\qquad\text{as }k\to\infty. (5.7)

Since, by (2.14), conditional on the induced reference graphon the graph has the distribution of an inhomogeneous random graph, we can apply Lemma 5.2 to obtain a contradiction. ∎

The next lemma establishes a large deviation principle for the sequence of induced reference graphons g[Fn]g^{[F_{n}]}.

Lemma 5.4.

Subject to Assumptions 2.2 and 2.3, {g~[Fn]}n\{\tilde{g}^{[F_{n}]}\}_{n\in\mathbb{N}} satisfies the LDP on (𝒲~,δ)(\tilde{\mathscr{W}},\delta_{\square}) with rate (n)\ell(n) and with rate function

J(h~)=infF([0,1]):g~[F]=h~K(F).J(\tilde{h})=\inf_{F\in\mathcal{M}([0,1])\,:\,\tilde{g}^{[F]}=\tilde{h}}K(F). (5.8)
Proof.

For any g,f𝒲g,f\in\mathscr{W}, fgL1δ(g~,f~)\lVert f-g\rVert_{L_{1}}\geq\delta_{\square}(\tilde{g},\tilde{f}). Thus, under Assumption 2.3, the map Fg~[F]F\mapsto\tilde{g}^{[F]} is continuous. The result therefore follows by using Assumption 2.2 and the contraction principle [18, Theorem III.20]. ∎

In the remainder of this subsection we make use of the above auxiliary results to prove Theorem 2.6.

Proof of Theorem 2.6: We prove (i), (ii), and (iii) separately. Recall that these correspond to the three cases limn(n)/(n2)=0\lim_{n\to\infty}\ell(n)/{n\choose 2}=0, limn(n)/(n2)=c\lim_{n\to\infty}\ell(n)/{n\choose 2}=c, and limn(n)/(n2)=\lim_{n\to\infty}\ell(n)/{n\choose 2}=\infty respectively.

(i) lower bound: Let 𝒪\mathcal{O} be an open subset of 𝒲~\tilde{\mathscr{W}}, and let 𝒪(ε)\mathcal{O}^{(-\varepsilon)} denote the largest open set whose ε\varepsilon-neighbourhood is contained in 𝒪\mathcal{O}. We have

(h~Gn𝒪)(g~[Fn]𝒪(ε))(1(δ(h~Gn,g~[Fn])>ε)).\mathbb{P}(\tilde{h}^{G_{n}}\in\mathcal{O})\geq\mathbb{P}(\tilde{g}^{[F_{n}]}\in\mathcal{O}^{(-\varepsilon)})\big{(}1-\mathbb{P}(\delta_{\square}(\tilde{h}^{G_{n}},\tilde{g}^{[F_{n}]})>\varepsilon)\big{)}. (5.9)

Applying Lemmas 5.4 and 5.3 to the first and second terms on the right-hand side of (5.9), respectively, we obtain

lim infn1(n)log(h~Gn𝒪)limε0infh~𝒪(ε)J(h~)=infh~𝒪J(h~).\liminf_{n\to\infty}\frac{1}{\ell(n)}\log\mathbb{P}(\tilde{h}^{G_{n}}\in\mathcal{O})\geq-\lim_{\varepsilon\downarrow 0}\inf_{\tilde{h}\in\mathcal{O}^{(-\varepsilon)}}J(\tilde{h})=-\inf_{\tilde{h}\in\mathcal{O}}J(\tilde{h}). (5.10)

(i) upper bound: Let 𝒞\mathcal{C} be a closed subset of 𝒲~\tilde{\mathscr{W}}, and let 𝒞(+ε)\mathcal{C}^{(+\varepsilon)} denote the smallest closed set that contains the ε\varepsilon-neighbourhoods of all the points in 𝒞\mathcal{C}. We have

(h~Gn𝒞)(g~[Fn]𝒞(+ε))+(δ(h~Gn,g~[Fn])>ε).\mathbb{P}(\tilde{h}^{G_{n}}\in\mathcal{C})\leq\mathbb{P}(\tilde{g}^{[F_{n}]}\in\mathcal{C}^{(+\varepsilon)})+\mathbb{P}(\delta_{\square}(\tilde{h}^{G_{n}},\tilde{g}^{[F_{n}]})>\varepsilon). (5.11)

Again applying Lemmas 5.4 and 5.3 to the first and second terms on the right-hand side of (5.11), respectively, we obtain

lim supn1(n)log(h~Gn𝒞)limε0infh~𝒞(+ε)J(h~)=infh~𝒞J(h~),\limsup_{n\to\infty}\frac{1}{\ell(n)}\log\mathbb{P}(\tilde{h}^{G_{n}}\in\mathcal{C})\leq-\lim_{\varepsilon\downarrow 0}\inf_{\tilde{h}\in\mathcal{C}^{(+\varepsilon)}}J(\tilde{h})=-\inf_{\tilde{h}\in\mathcal{C}}J(\tilde{h}), (5.12)

where in the first step we use the fact that (n)=o((n2))\ell(n)=o({n\choose 2}).

(ii) lower bound: For r𝒲r\in\mathscr{W}, let

F(r,\varepsilon)=\{F\in\mathcal{M}([0,1]):\lVert g^{[F]}-r\rVert_{L_{1}}<\varepsilon\}, (5.13)

and observe that, by Assumption 2.3, the set F(r,ε)F(r,\varepsilon) is measurable. Under Assumption 2.4 we therefore have

lim infn1(n2)log(h~Gn𝒪)\displaystyle\liminf_{n\to\infty}\frac{1}{{n\choose 2}}\log\mathbb{P}(\tilde{h}^{G_{n}}\in\mathcal{O}) (5.14)
limε0lim infn1(n2)[log(FnF(r,ε))+log(h~G^n𝒪|FnF(r,ε))]\displaystyle\geq\lim_{\varepsilon\downarrow 0}\liminf_{n\to\infty}\frac{1}{{n\choose 2}}\bigg{[}\log\mathbb{P}(F_{n}\in F(r,\varepsilon))+\log\mathbb{P}(\tilde{h}^{\widehat{G}_{n}}\in\mathcal{O}|F_{n}\in F(r,\varepsilon))\bigg{]}
limε0[cK(F(r,ε))+supFF(r,ε)infh~𝒪Ir[F](h~)]\displaystyle\geq-\lim_{\varepsilon\downarrow 0}\left[cK(F(r,\varepsilon))+\sup_{F\in F(r,\varepsilon)}\inf_{\tilde{h}\in\mathcal{O}}I_{r^{[F]}}(\tilde{h})\right]
=[cJ(r~)+infh~𝒪Ir(h~)].\displaystyle=-[cJ(\tilde{r})+\inf_{\tilde{h}\in\mathcal{O}}I_{r}(\tilde{h})].

In the second inequality we use the fact that, since $\mathcal{M}([0,1])$ is compact, any sequence $(F_{n})_{n\in\mathbb{N}}$ in $F(r,\varepsilon)$ has a convergent subsequence, which allows us to apply Lemma 5.1. In the final step we use the lower semi-continuity of $K$ and the assumption that $F\mapsto r^{[F]}$ is a continuous mapping from $\mathcal{M}([0,1])$ to $(\mathscr{W},L_{1})$, in combination with the fact that, under Assumption 2.4, if $\lVert r_{n}-r\rVert_{L_{1}}\to 0$, then $I_{r_{n}}(\tilde{h})\to I_{r}(\tilde{h})$ uniformly over $\tilde{h}\in\tilde{\mathscr{W}}$ as $n\to\infty$ (see [12, Lemma 2.3]). Because these arguments hold for any $r\in\mathscr{W}$, we can take the sharpest lower bound:

lim infn1(n2)log(h~Gn𝒪)infr~𝒲~[cJ(r~)+infh~𝒪Ir(h~)]=infh~𝒪{infr~𝒲~[cJ(r~)+Ir(h~)]}.\liminf_{n\to\infty}\frac{1}{{n\choose 2}}\log\mathbb{P}(\tilde{h}^{G_{n}}\in\mathcal{O})\geq-\inf_{\tilde{r}\in\tilde{\mathscr{W}}}[cJ(\tilde{r})+\inf_{\tilde{h}\in\mathcal{O}}I_{r}(\tilde{h})]=-\inf_{\tilde{h}\in\mathcal{O}}\big{\{}\inf_{\tilde{r}\in\tilde{\mathscr{W}}}[cJ(\tilde{r})+I_{r}(\tilde{h})]\big{\}}. (5.15)

(ii) upper bound: Let L(,)L(\cdot,\cdot) be the Lévy metric, let BL(F,ε)={H([0,1]):L(H,F)ε}B_{L}(F,\varepsilon)=\{H\in\mathcal{M}([0,1])\colon\,L(H,F)\leq\varepsilon\}, and recall that L(,)L(\cdot,\cdot) metrises the weak topology (see [11, Theorem D.8]). Since ([0,1])\mathcal{M}([0,1]) is a compact space, for any ε>0\varepsilon>0 we can construct a finite set F[ε]F[\varepsilon] with the property that for any H([0,1])H\in\mathcal{M}([0,1]) there exists FF[ε]F\in F[\varepsilon] such that L(F,H)εL(F,H)\leq\varepsilon. We therefore have

\limsup_{n\to\infty}\frac{1}{{n\choose 2}}\log\mathbb{P}(\tilde{h}^{\widehat{G}_{n}}\in\mathcal{C}) (5.16)
\leq\lim_{\varepsilon\downarrow 0}\limsup_{n\to\infty}\frac{1}{{n\choose 2}}\log\bigg[\sum_{F\in F[\varepsilon]}\mathbb{P}(F_{n}\in B_{L}(F,\varepsilon))\,\mathbb{P}(\tilde{h}^{\widehat{G}_{n}}\in\mathcal{C}\,|\,F_{n}\in B_{L}(F,\varepsilon))\bigg]
\leq-\lim_{\varepsilon\downarrow 0}\min_{F\in F[\varepsilon]}\Big[cK(B_{L}(F,\varepsilon))+\inf_{F^{\star}\in B_{L}(F,\varepsilon)}\inf_{\tilde{h}\in\mathcal{C}}I_{g^{[F^{\star}]}}(\tilde{h})\Big]
\leq-\lim_{\varepsilon\downarrow 0}\min_{F\in\mathcal{M}([0,1])}\Big[cK(B_{L}(F,\varepsilon))+\inf_{F^{\star}\in B_{L}(F,\varepsilon)}\inf_{\tilde{h}\in\mathcal{C}}I_{g^{[F^{\star}]}}(\tilde{h})\Big]
=-\min_{F\in\mathcal{M}([0,1])}\Big[cK(F)+\inf_{\tilde{h}\in\mathcal{C}}I_{g^{[F]}}(\tilde{h})\Big]
=-\inf_{\tilde{h}\in\mathcal{C}}\Big\{\min_{F\in\mathcal{M}([0,1])}\big[cK(F)+I_{g^{[F]}}(\tilde{h})\big]\Big\}.

In the second step we apply Assumption 2.2, Lemma 5.1, and Laplace's method, using a similar justification as in the lower bound. In the fourth step we use the lower semi-continuity of $K$ (Assumption 2.2) and apply [12, Lemma 2.3].

(iii): In this case we can apply similar (albeit simpler) arguments as in case (i). ∎

5.2. Proofs of the results in Section 3

5.2.1. Large deviations

The next lemma will be used to prove Proposition 3.3.

Lemma 5.5.

Subject to Assumption 3.1, H~(;,,Fn)\tilde{H}(\cdot;\cdot,\cdot,F_{n}) satisfies the LDP with rate (n)\ell(n) and with rate function

J(h~)=infF×[0,T]:H~(;,,F)=h~K(F).J(\tilde{h})=\inf_{F\in\mathcal{M}\times[0,T]:\tilde{H}(\cdot;\cdot,\cdot,F)=\tilde{h}}K(F). (5.17)
Proof.

The claim follows from the contraction principle (cf. Lemma 5.4). ∎

Proof of Proposition 3.3. To establish a multi-point LDP, we can follow similar arguments as in the proof of Theorem 2.6.(i). For example, to establish the lower bound, pick 0t1t2tkT0\leq t_{1}\leq t_{2}\leq\dots\leq t_{k}\leq T, let 𝒪i\mathcal{O}_{i} be an open subset of 𝒲~\tilde{\mathscr{W}}, and let 𝒪i(ε)\mathcal{O}^{(-\varepsilon)}_{i} be as in the proof of Theorem 2.6. We have

\liminf_{n\to\infty}\frac{1}{\ell(n)}\log\mathbb{P}\left(\tilde{h}^{G_{n}(t_{i})}\in\mathcal{O}_{i},\,\forall i=1,\dots,k\right) (5.18)
\geq\lim_{\varepsilon\downarrow 0}\liminf_{n\to\infty}\frac{1}{\ell(n)}\log\bigg[\mathbb{P}\left(\tilde{g}^{[F_{n}]}(t_{i})\in\mathcal{O}^{(-\varepsilon)}_{i},\,\forall i=1,\dots,k\right)\bigg(1-\sum_{i=1}^{k}\mathbb{P}\big(\delta_{\square}(\tilde{h}^{G_{n}(t_{i})},\tilde{g}^{[F_{n}]}(t_{i}))>\varepsilon\big)\bigg)\bigg]
\geq-\inf_{\tilde{h}:\tilde{h}(t_{i})\in\mathcal{O}_{i},\,\forall i=1,\dots,k}J(\tilde{h}),

where $J(\tilde{h})$ is given by (5.17), and we use a similar justification as in the lower bound of Theorem 2.6.(i) (applying Lemma 5.5 in place of Lemma 5.4). The upper bound is again similar and is therefore omitted. With the multi-point LDP established, we can apply the Dawson-Gärtner projective limit theorem [11, Theorem 4.6.1] to establish an LDP in the pointwise topology, and [11, Lemma 4.6.5] to obtain the specific form of the rate function in Proposition 3.3. ∎

Proof of Proposition 3.4. For δ>0\delta>0 and T>0T>0, define the modulus of continuity in D(𝒲~,[0,T])D(\tilde{\mathscr{W}},[0,T]) by

w(h~Gn,δ,T)=inftimaxisups,t[ti,ti+1)δ(h~Gn(s),h~Gn(t)),w^{\prime}(\tilde{h}^{G_{n}},\delta,T)=\inf_{t_{i}}\max_{i}\sup_{s,t\in[t_{i},t_{i+1})}\delta_{\square}\left(\tilde{h}^{G_{n}(s)},\tilde{h}^{G_{n}(t)}\right), (5.19)

where the infimum is over {ti}\{t_{i}\} satisfying

0=t0<t1<<tm1<Ttm0=t_{0}<t_{1}<\dots<t_{m-1}<T\leq t_{m} (5.20)

and $\min_{1\leq i\leq m}(t_{i}-t_{i-1})\geq\delta$. By [15, Theorem 4.1] (in combination with the compactness of $\tilde{\mathscr{W}}$), the sequence of processes $\{(\tilde{h}^{G_{n}(t)})_{t\in[0,T]}\}_{n\in\mathbb{N}}$ is exponentially tight if

limδ0lim supn1(n)log(w(h~Gn,δ,T)>ε)=\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{\ell(n)}\log\mathbb{P}\left(w^{\prime}(\tilde{h}^{G_{n}},\delta,T)>\varepsilon\right)=-\infty (5.21)

for all ε>0\varepsilon>0. Suppose that (3.15) holds, with Cn(t,δ)C_{n}(t,\delta) given by (3.14). We will show that this entails that (5.21) holds with ti=iδt_{i}=i\delta for i{0,,T/δ}i\in\{0,\dots,\lceil{T}/{\delta}\rceil\}. Indeed, observe that

sups,t[ti,ti+1)δ(h~Gn(s),h~Gn(t))sups,t[ti,ti+1)hGn(s)hGn(t)L1(n2)1Cn(ti,δ).\sup_{s,t\in[t_{i},t_{i+1})}\delta_{\square}\left(\tilde{h}^{G_{n}(s)},\tilde{h}^{G_{n}(t)}\right)\leq\sup_{s,t\in[t_{i},t_{i+1})}\lVert h^{G_{n}(s)}-h^{G_{n}(t)}\rVert_{L_{1}}\leq{n\choose 2}^{-1}C_{n}(t_{i},\delta). (5.22)
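The second inequality in (5.22) holds because each edge whose state at time $s$ differs from its state at time $t$ contributes two blocks of area $1/n^{2}$ to the $L_{1}$ distance, and at most $C_{n}(t_{i},\delta)$ edges change state during $[t_{i},t_{i+1})$:

```latex
\lVert h^{G_{n}(s)}-h^{G_{n}(t)}\rVert_{L_{1}}
\leq\frac{2}{n^{2}}\,C_{n}(t_{i},\delta)
\leq\frac{2}{n(n-1)}\,C_{n}(t_{i},\delta)
={n\choose 2}^{-1}C_{n}(t_{i},\delta).
```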

Consequently,

(w(h~Gn,δ,T)>ε)T/δsupt[0,T](Cn(t,δ)>ε(n2)).\mathbb{P}\left(w^{\prime}(\tilde{h}^{G_{n}},\delta,T)>\varepsilon\right)\leq\lceil T/\delta\rceil\sup_{t\in[0,T]}\mathbb{P}\left(C_{n}(t,\delta)>\varepsilon{n\choose 2}\right). (5.23)

Hence

limδ0lim supn1(n)log(w(h~Gn,δ,T)>ε)limδ0lim supn[1(n)logT/δ+1(n)log(supt[0,T](Cn(t,δ)>ε(n2)))]=,\displaystyle\begin{split}\lim_{\delta\downarrow 0}&\limsup_{n\to\infty}\frac{1}{\ell(n)}\log\mathbb{P}\left(w^{\prime}(\tilde{h}^{G_{n}},\delta,T)>\varepsilon\right)\\ &\leq\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\left[\frac{1}{\ell(n)}\log\lceil T/\delta\rceil+\frac{1}{\ell(n)}\log\left(\sup_{t\in[0,T]}\mathbb{P}\left(C_{n}(t,\delta)>\varepsilon{n\choose 2}\right)\right)\right]\\ &=-\infty,\end{split} (5.24)

where in the final step we apply (3.15) and use the fact that (n)\ell(n)\to\infty. ∎

5.2.2. Weak convergence

The next lemma is needed in the proofs of Proposition 3.8 and Theorem 3.10.

Lemma 5.6.

Subject to Assumptions 3.1 and 3.6, g[Fn]g[F]g^{[F_{n}]}\Rightarrow g^{[F]} on D((𝒲,d),[0,T])D((\mathscr{W},d_{\square}),[0,T]).

Proof.

By Assumptions 3.1 and 3.6, we can apply the continuous mapping theorem to establish that

g[Fn]g[F]on D((𝒲,L1),[0,T]).g^{[F_{n}]}\Rightarrow g^{[F]}\quad\text{on }D((\mathscr{W},\lVert\cdot\rVert_{L_{1}}),[0,T]). (5.25)

Because (𝒲,L1)(\mathscr{W},\lVert\cdot\rVert_{L_{1}}) is a stronger topology than (𝒲,d)(\mathscr{W},d_{\square}), this implies the claim. ∎
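The comparison of topologies used here is the standard chain of inequalities between the cut metric and the $L_{1}$ norm: for any $f,g\in\mathscr{W}$,

```latex
d_{\square}(f,g)
=\sup_{S,T\subseteq[0,1]}\bigg|\int_{S\times T}\big(f(x,y)-g(x,y)\big)\,{\rm d}x\,{\rm d}y\bigg|
\leq\lVert f-g\rVert_{L_{1}},
```

where the supremum runs over measurable sets, so that convergence in $L_{1}$ implies convergence in $d_{\square}$, both pointwise and on path space.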

Proof of Proposition 3.8. By Lemma 5.6, we have $g^{[F_{n}]}\stackrel{\rm fdd}{\Rightarrow}g^{[F]}$ on $(\mathscr{W},d_{\square})$. From (3.6), Assumption 3.7, and the uniform bound in [8, Lemma 5.11] we know that, for any $t\in[0,T]$ and any $\varepsilon>0$,

limn(d(hGn(t),g[Fn](t))ε)=1,\lim_{n\to\infty}\mathbb{P}(d_{\square}(h^{G_{n}(t)},g^{[F_{n}]}(t))\leq\varepsilon)=1, (5.26)

which implies that, for any finite collection of times $t_{1},\dots,t_{k}\in[0,T]$,

limn(hGn(ti)𝔹(g[Fn](ti),ε),i)=1.\lim_{n\to\infty}\mathbb{P}(h^{G_{n}(t_{i})}\in\mathbb{B}_{\square}(g^{[F_{n}]}(t_{i}),\varepsilon),\,\forall i)=1. (5.27)

The claim therefore follows from [14, Corollary 3.3]. ∎

Proof of Theorem 3.10. By Lemma 5.6 and [14, Corollary 3.3], it suffices to prove that

d(g[Fn](),hGn())0,with probability 1,d(g^{[F_{n}]}(\cdot),h^{G_{n}(\cdot)})\to 0,\qquad\text{with probability 1}, (5.28)

where $d(\cdot,\cdot)$ is a metric that generates $D(\mathscr{W},[0,T])$ (see [14, Ch. 3.5] for an expression for $d(\cdot,\cdot)$). Define $g_{\delta}^{[F_{n}]}$ such that $g_{\delta}^{[F_{n}]}(0)=g^{[F_{n}]}(0)$ and

g^{[F_{n}]}_{\delta}(t)=g^{[F_{n}]}(\delta i),\qquad\text{for }t\in(\delta i,\delta(i+1)],\;\;i=0,1,\dots,\lfloor T/\delta\rfloor. (5.29)

Because, for any δ>0\delta>0,

d(g[Fn](),hGn())\displaystyle d(g^{[F_{n}]}(\cdot),h^{G_{n}(\cdot)}) d(g[Fn](),gδ[Fn]())+d(gδ[Fn](),hGn())\displaystyle\leq d\left(g^{[F_{n}]}(\cdot),g_{\delta}^{[F_{n}]}(\cdot)\right)+d\left(g_{\delta}^{[F_{n}]}(\cdot),h^{G_{n}(\cdot)}\right) (5.30)
\leq d\left(g^{[F_{n}]}(\cdot),g_{\delta}^{[F_{n}]}(\cdot)\right)+\sup_{t\in[0,T]}d_{\square}\left(g_{\delta}^{[F_{n}]}(t),h^{G_{n}(t)}\right),

it suffices to show that

limδ0limn[d(g[Fn](),gδ[Fn]())+supt[0,T]d(gδ[Fn](t),hGn(t))]=0\lim_{\delta\downarrow 0}\lim_{n\to\infty}\left[d\left(g^{[F_{n}]}(\cdot),g_{\delta}^{[F_{n}]}(\cdot)\right)+\sup_{t\in[0,T]}d_{\square}\left(g_{\delta}^{[F_{n}]}(t),h^{G_{n}(t)}\right)\right]=0 (5.31)

with probability 1. We first deal with the term limδ0limnsupt[0,T]d(gδ[Fn](t),hGn(t))\lim_{\delta\downarrow 0}\lim_{n\to\infty}\sup_{t\in[0,T]}d_{\square}(g_{\delta}^{[F_{n}]}(t),h^{G_{n}(t)}). By the same arguments as in the proof of Proposition 3.8, for any δ,ε>0\delta,\varepsilon>0,

limn(hGn(δi)𝔹(g[Fn](δi),ε)i=0,1,,T/δ)=1.\lim_{n\to\infty}\mathbb{P}\left(h^{G_{n}(\delta i)}\in\mathbb{B}_{\square}(g^{[F_{n}]}(\delta i),\varepsilon)\,\,\forall\,\ i=0,1,\dots,\lfloor T/\delta\rfloor\right)=1. (5.32)

Thus,

limδ0\displaystyle\lim_{\delta\downarrow 0} limn(supt[0,T]d(gδ[Fn](t),hGn(t))>2ε)\displaystyle\lim_{n\to\infty}\mathbb{P}\left(\sup_{t\in[0,T]}d_{\square}\left(g_{\delta}^{[F_{n}]}(t),h^{G_{n}(t)}\right)>2\varepsilon\right) (5.33)
\leq\lim_{\delta\downarrow 0}\lim_{n\to\infty}\sum^{\lfloor T/\delta\rfloor}_{i=1}\left\{\mathbb{P}\left(d_{\square}\left(g_{\delta}^{[F_{n}]}(\delta i),h^{G_{n}(\delta i)}\right)>\varepsilon\right)+\mathbb{P}\left(C_{n}(\delta i,\delta)>\varepsilon{n\choose 2}\right)\right\}
=0,\displaystyle=0,

where in the final step we apply (5.32) in combination with Assumption 3.9. The fact that, with probability 1,

limδ0limnd(g[Fn](),gδ[Fn]())=0\lim_{\delta\downarrow 0}\lim_{n\to\infty}d\left(g^{[F_{n}]}(\cdot),g_{\delta}^{[F_{n}]}(\cdot)\right)=0

follows because, due to Assumptions 3.6 and 3.1, the limiting object g[F]g^{[F]} of g[Fn]g^{[F_{n}]} is a random variable on D(𝒲,[0,T])D(\mathscr{W},[0,T]) (i.e., its trajectories are càdlàg paths). ∎

5.3. Proofs of the results in Section 4

5.3.1. Proofs of the results in Section 4.1

To prove Theorem 3.10, we construct a graphon-valued process, $(\tilde{h}^{G^{*}_{n}(t)})_{t\geq 0}$, that mimics the behaviour of $(\tilde{h}^{G_{n}(t)})_{t\geq 0}$ while still falling into the framework of Section 3. We couple the two processes and demonstrate that, under the coupling, the probability that the two processes deviate significantly from each other (in a way made precise below) decays faster than $\mathrm{e}^{-cn}$ for every $c>0$.

Constructing a mimicking process: Suppose that the process (Gn(t))t0(G^{*}_{n}(t))_{t\geq 0} is characterised by the following dynamics:

  • Gn(0)G^{*}_{n}(0) is the empty graph.

  • Each vertex vv is assigned an independent rate-γ\gamma Poisson clock. Each time the clock rings, all the edges that are adjacent to vv become inactive.

  • If edge $ij$ is inactive, then it becomes active at rate $\lambda(t,Y_{i}(t),Y_{j}(t),F_{n}(t;\cdot),\tilde{g}^{[F_{n}]}(t;\cdot,\cdot))$.

  • If edge $ij$ is active, then it becomes inactive at rate $\mu(t,Y_{i}(t),Y_{j}(t),F_{n}(t;\cdot),\tilde{g}^{[F_{n}]}(t;\cdot,\cdot))$.

Here, $g^{[\cdot]}(t;\cdot,\cdot)$ is defined in (4.10). We point out that the induced reference graphon process of $(G^{*}_{n}(t))_{t\geq 0}$ is indeed $g^{[F]}$. Note that the only difference between the transition rates of $(G_{n}(t))_{t\geq 0}$ and $(G^{*}_{n}(t))_{t\geq 0}$ is that in the transition rate functions $\lambda(\cdot)$ and $\mu(\cdot)$ we have replaced $\tilde{h}^{G_{n}(t)}$ by $\tilde{g}^{[F_{n}]}(t;\cdot,\cdot)$.

Theorem 3.10 follows by verifying that we can apply Theorem 3.5 to {(Gn(t))t0}n\{(G^{*}_{n}(t))_{t\geq 0}\}_{n\in\mathbb{N}}, and using the following lemma.

Lemma 5.7.

There exists a coupling of {(Gn(t))t0}n\{(G_{n}(t))_{t\geq 0}\}_{n\in\mathbb{N}} and {(Gn(t))t0}n\{(G^{*}_{n}(t))_{t\geq 0}\}_{n\in\mathbb{N}} such that

limn1nlog(h~Gn(t)h~Gn(t)L1>η,for some t[0,T])=.\lim_{n\to\infty}\frac{1}{n}\log\mathbb{P}\left(\lVert\tilde{h}^{G_{n}(t)}-\tilde{h}^{G^{*}_{n}(t)}\rVert_{L_{1}}>\eta,\,\text{for some }t\in[0,T]\right)=-\infty. (5.34)
Proof.

The claim is proved in three steps.

Step 1: description of the coupling. Let CmaxC_{\rm max} be the maximal value that the rates λ()\lambda(\cdot) and μ()\mu(\cdot) can take, i.e.,

Cmax=maxt[0,T],u,v[0,1],F([0,1]),h~𝒲~λ(t,u,v,F,h~)μ(t,u,v,F,h~),C_{\rm max}=\max_{t\in[0,T],u,v\in[0,1],F\in\mathcal{M}([0,1]),\tilde{h}\in\tilde{\mathscr{W}}}\lambda(t,u,v,F,\tilde{h})\vee\mu(t,u,v,F,\tilde{h}), (5.35)

and observe that $C_{\rm max}<\infty$ because $\lambda(\cdot)$ and $\mu(\cdot)$ are Lipschitz continuous functions with a compact domain. Suppose that outcomes of $(G_{n}(t))_{t\geq 0}$ and $(G^{*}_{n}(t))_{t\geq 0}$ are generated in the following manner.

  • For each $i\in[n]$, vertex $i$ is assigned the same (coupled) rate-$\gamma$ Poisson clock in both processes, so that if the clock associated with vertex $i$ rings in $(G_{n}(t))_{t\geq 0}$ at time $s$, then the clock associated with vertex $i$ also rings in $(G^{*}_{n}(t))_{t\geq 0}$ at time $s$ (and vice versa). When the clock associated with vertex $i$ rings, all edges adjacent to vertex $i$ become inactive (in both $(G_{n}(t))_{t\geq 0}$ and $(G^{*}_{n}(t))_{t\geq 0}$ simultaneously).

  • Assign each edge the same (coupled) Poisson rate-CmaxC_{\rm max} clock in both processes. When the Poisson clock associated with edge ijij rings, generate an outcome uu of a Unif(0,1) distribution.

    • if uλ(t,Yi(t),Yj(t),Fn,h~Gn(t))/Cmaxu\leq\lambda(t,Y_{i}(t),Y_{j}(t),F_{n},\tilde{h}^{G_{n}(t)})/C_{\rm max}, then the edge ijij becomes active in (Gn(t))t0(G_{n}(t))_{t\geq 0}.

    • if uλ(t,Yi(t),Yj(t),Fn,g~[Fn](t;,))/Cmaxu\leq\lambda(t,Y_{i}(t),Y_{j}(t),F_{n},\tilde{g}^{[F_{n}]}(t;\cdot,\cdot))/C_{\rm max}, then the edge ijij becomes active in (Gn(t))t0(G^{*}_{n}(t))_{t\geq 0}.

    (If it was already active, then it remains active.)

  • Assign each edge a second (coupled) Poisson rate-CmaxC_{\rm max} clock in both processes. When the Poisson clock associated with edge ijij rings, generate an outcome uu of a Unif(0,1) distribution.

    • if uμ(t,Yi(t),Yj(t),Fn,h~Gn(t))/Cmaxu\leq\mu(t,Y_{i}(t),Y_{j}(t),F_{n},\tilde{h}^{G_{n}(t)})/C_{\rm max}, then the edge ijij becomes inactive in (Gn(t))t0(G_{n}(t))_{t\geq 0}.

    • if uμ(t,Yi(t),Yj(t),Fn,g~[Fn](t;,))/Cmaxu\leq\mu(t,Y_{i}(t),Y_{j}(t),F_{n},\tilde{g}^{[F_{n}]}(t;\cdot,\cdot))/C_{\rm max}, then the edge ijij becomes inactive in (Gn(t))t0(G^{*}_{n}(t))_{t\geq 0}.

    (If it was already inactive, then it remains inactive.)

Step 2: majorization of the L1L_{1} distance. Observe that if edge ijij is inactive in both models and the clock associated with edge ijij rings, then a difference is formed (i.e., edge ijij is active in one process and inactive in the other) with probability

1Cmax|λ(t,Xi(t),Xj(t),Fn(t;),h~Gn(t))λ(t,Xi(t),Xj(t),Fn(t;),g~[Fn](t;,))|.\frac{1}{C_{\rm max}}\,\Big{|}\lambda\left(t,X_{i}(t),X_{j}(t),F_{n}(t;\cdot),\tilde{h}^{G_{n}(t)}\right)-\lambda\left(t,X_{i}(t),X_{j}(t),F_{n}(t;\cdot),\tilde{g}^{[F_{n}]}(t;\cdot,\cdot)\right)\Big{|}. (5.36)

At any time $t$, by the Lipschitz continuity of $\lambda(\cdot)$,

|λ(t,Xi(t),Xj(t),Fn(t;),h~Gn(t))λ(t,Xi(t),Xj(t),Fn(t;),g~[Fn](t;,))|c[δ(g~(Fn)(t;,),h~Gn(t))+δ(h~Gn(t),h~Gn(t))]c[δ(g~(Fn)(t;,),h~Gn(t))+hGn(t)hGn(t)L1],\displaystyle\begin{split}&\left|\lambda\left(t,X_{i}(t),X_{j}(t),F_{n}(t;\cdot),\tilde{h}^{G_{n}(t)}\right)-\lambda\left(t,X_{i}(t),X_{j}(t),F_{n}(t;\cdot),\tilde{g}^{[F_{n}]}(t;\cdot,\cdot)\right)\right|\\ &\qquad\leq c\left[\delta_{\square}(\tilde{g}^{(F_{n})}(t;\cdot,\cdot),\tilde{h}^{G^{*}_{n}(t)})+\delta_{\square}(\tilde{h}^{G^{*}_{n}}(t),\tilde{h}^{G_{n}(t)})\right]\\ &\qquad\leq c\left[\delta_{\square}(\tilde{g}^{(F_{n})}(t;\cdot,\cdot),\tilde{h}^{G^{*}_{n}(t)})+\left\lVert h^{G^{*}_{n}(t)}-h^{G_{n}(t)}\right\rVert_{L_{1}}\right],\end{split} (5.37)

where $c$ is the Lipschitz constant. Observe that an equivalent bound holds when $\lambda(\cdot)$ is replaced by $\mu(\cdot)$.

Let $D_{n}(t)$ denote the number of differences between $G^{*}_{n}(t)$ and $G_{n}(t)$, so that

D_{n}(t)=\frac{n^{2}}{2}\left\lVert h^{G^{*}_{n}(t)}-h^{G_{n}(t)}\right\rVert_{L_{1}}. (5.38)
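The normalisation in (5.38) reflects the fact that each differing edge $ij$ contributes the two blocks $(i,j)$ and $(j,i)$, each of area $1/n^{2}$ and height $1$, to the $L_{1}$ distance between the empirical graphons:

```latex
\lVert h^{G^{*}_{n}(t)}-h^{G_{n}(t)}\rVert_{L_{1}}
=\sum_{1\leq i<j\leq n}\frac{2}{n^{2}}\,
\mathbbm{1}\{ij\text{ active in exactly one of }G^{*}_{n}(t),\,G_{n}(t)\}
=\frac{2}{n^{2}}\,D_{n}(t).
```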

Using a standard property of the superposition of Poisson processes and (5.37), we see that, at any time $t$, $D_{n}(t)$ increases by 1 at a rate that is bounded above by

n2Cmaxc[δ(g~(Fn)(t;,),h~Gn(t))+hGn(t)hGn(t)L1].{n^{2}}C_{\rm max}\,c\left[\delta_{\square}(\tilde{g}^{(F_{n})}(t;\cdot,\cdot),\tilde{h}^{G^{*}_{n}(t)})+\left\lVert h^{G^{*}_{n}(t)}-h^{G_{n}(t)}\right\rVert_{L_{1}}\right]. (5.39)

In addition, observe that if a clock associated to a vertex rings it can only decrease the number of differences because then, in both processes, all the edges adjacent to the vertex are inactive. Combining the above, we see that, for any β>0\beta>0, Dn(t)D_{n}(t) is stochastically dominated by the random quantity

Zn(β)(t)+n2𝟙{st:δ(g~(Fn)(s;,),h~Gn(s))>β},Z_{n}^{(\beta)}(t)+n^{2}\mathbbm{1}\left\{\exists s\leq t:\delta_{\square}\left(\tilde{g}^{(F_{n})}(s;\cdot,\cdot),\tilde{h}^{G^{*}_{n}(s)}\right)>\beta\right\}, (5.40)

where (Zn(β)(t))t0(Z^{(\beta)}_{n}(t))_{t\geq 0} is a pure-birth process, with initial state Zn(β)(0)=0Z^{(\beta)}_{n}(0)=0, characterized by the transition rate

C_{\rm max}c\left[\beta n^{2}+2i\right] (5.41)

from state ii to state i+1i+1. Consequently, for any β>0\beta>0 we have

\displaystyle\mathbb{P} (h~Gn(t)h~Gn(t)L1>η,for some t[0,T])\displaystyle\left(\left\lVert\tilde{h}^{G_{n}(t)}-\tilde{h}^{G^{*}_{n}(t)}\right\rVert_{L_{1}}>\eta,\,\text{for some }t\in[0,T]\right) (5.42)
(Zn(β)(T)ηn2/2)+(δ(g~(Fn)(t;,),h~Gn(t))>β,for some t[0,T]),\displaystyle\leq\mathbb{P}(Z^{(\beta)}_{n}(T)\leq\eta n^{2}/2)+\mathbb{P}\left(\delta_{\square}\left(\tilde{g}^{(F_{n})}(t;\cdot,\cdot),\tilde{h}^{G^{*}_{n}(t)}\right)>\beta,\,\text{for some }t\in[0,T]\right),

where we have used the fact that (Zn(β)(t))t0(Z^{(\beta)}_{n}(t))_{t\geq 0} is an increasing process.

Step 3: bounding the dominating process. To bound the second term on the right-hand side of (5.42), we observe that, for any $\beta>0$,

lim supn1nlog(δ(g~(Fn)(t;,),h~Gn(t))>β,for some t[0,T])=.\limsup_{n\to\infty}\frac{1}{n}\log\mathbb{P}\left(\delta_{\square}\left(\tilde{g}^{(F_{n})}(t;\cdot,\cdot),\tilde{h}^{G^{*}_{n}(t)}\right)>\beta,\,\text{for some }t\in[0,T]\right)=-\infty. (5.43)

We can establish (5.43) following similar arguments as in the proof of Theorem 3.10, and hence these arguments are omitted.

For ease of notation below we write Zn(t)Z_{n}(t) to denote Zn(β)(t)Z_{n}^{(\beta)}(t). It suffices to show that, for any η>0\eta>0, there exists β\beta sufficiently small so that

\limsup_{n\to\infty}\frac{1}{n^{2}/2}\log\mathbb{P}\left(Z_{n}(T)>\eta n^{2}/2\right)\leq-C(\beta,\eta) (5.44)

for some C(β,η)>0C(\beta,\eta)>0. Observe that the Markov chain (Zn(t))t0(Z_{n}(t))_{t\geq 0} described by the transition rates (5.41) is a continuous-time branching process with immigration. The initial population size is 0, immigrants arrive at rate cCmaxβn2c\,C_{\rm max}\,\beta n^{2}, while individuals in the population give birth at rate 2cCmax2c\,C_{\rm max} and die at rate 0. Let X(t)X(t) denote the number of descendants that are alive at time TT of an individual that immigrated to the population at time t<Tt<T. Let C:=2cCmaxC^{*}:=2c\,C_{\rm max}. It was shown by Yule (cf. [17, Chapter V.8]) that

(X(t)=i)=eC(Tt)(1eC(Tt))i1,i,\mathbb{P}(X(t)=i)=\mathrm{e}^{-C^{*}(T-t)}(1-\mathrm{e}^{-C^{*}(T-t)})^{i-1},\quad i\in\mathbb{N}, (5.45)

i.e., $X(t)-1$ has a geometric distribution with success probability $\mathrm{e}^{-C^{*}(T-t)}$. Note that (since the death rate of individuals is zero) $X(0)$ stochastically dominates $X(t)$ for all $t\in[0,T)$. In addition, the total number of immigrants has a Poisson distribution with mean $C^{*}\beta Tn^{2}/2$. Thus, if $\{X^{(k,\ell)}_{0}\}_{k,\ell\in\mathbb{N}}$ are i.i.d. copies of $X(0)$ and $Y=\sum_{k=1}^{n^{2}/2}Y^{(k)}$, where $Y^{(k)}\sim{\rm Poi}(C^{*}\beta T)$ are independent of everything else, then

Zn(T)stZn(T):=i=1YX0(i,1)=di=1n2/2k=1Y(i)X0(i,k).Z_{n}(T)\stackrel{{\scriptstyle\rm st}}{{\leq}}Z_{n}^{*}(T):=\sum_{i=1}^{Y}X^{(i,1)}_{0}\stackrel{{\scriptstyle\rm d}}{{=}}\sum_{i=1}^{n^{2}/2}\sum_{k=1}^{Y^{(i)}}X_{0}^{(i,k)}. (5.46)

With

φ(s)\displaystyle\varphi(s) :=𝔼(esk=1Y(i)X0(i,k))=𝔼(𝔼(esX0(i,k))Y(i))\displaystyle:=\mathbb{E}\left(\mathrm{e}^{s\sum_{k=1}^{Y^{(i)}}X_{0}^{(i,k)}}\right)=\mathbb{E}\left(\mathbb{E}\left(\mathrm{e}^{sX^{(i,k)}_{0}}\right)^{Y^{(i)}}\right) (5.47)
=exp{CβT(eCT+s1(1eCT)es1)}\displaystyle=\exp\left\{C^{*}\beta T\left(\frac{\mathrm{e}^{-C^{*}T+s}}{1-(1-\mathrm{e}^{-C^{*}T})\mathrm{e}^{s}}-1\right)\right\}

and I(z)=sups[zslogφ(s)]I(z)=\sup_{s\in\mathbb{R}}\left[zs-\log\varphi(s)\right], and applying the Chernoff bound, we therefore have

(Zn(T)ηn2/2)en2(I(η)+o(1))/2η𝔼(k=1Y(i)X0(i,k))=CβTeCT.\mathbb{P}\left(Z^{*}_{n}(T)\geq\eta n^{2}/2\right)\leq\mathrm{e}^{-n^{2}(I(\eta)+o(1))/2}\qquad\forall\,\eta\geq\mathbb{E}\left(\sum_{k=1}^{Y^{(i)}}X_{0}^{(i,k)}\right)=C^{*}\beta Te^{C^{*}T}. (5.48)
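The mean on the right-hand side of (5.48) follows from Wald's identity, since $Y^{(i)}\sim{\rm Poi}(C^{*}\beta T)$ is independent of the i.i.d. family sizes and $X_{0}$ is geometric on $\mathbb{N}$ with success probability $\mathrm{e}^{-C^{*}T}$:

```latex
\mathbb{E}\bigg(\sum_{k=1}^{Y^{(i)}}X_{0}^{(i,k)}\bigg)
=\mathbb{E}(Y^{(i)})\,\mathbb{E}(X_{0})
=C^{*}\beta T\cdot\mathrm{e}^{C^{*}T}.
```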

The claim now follows by observing that β\beta can be selected sufficiently small to make the inequality on the right-hand-side of (5.48) strict, in which case I(η)>0I(\eta)>0. ∎

Lemma 5.8.

The process {(Gn(t))t0}n\{(G^{*}_{n}(t))_{t\geq 0}\}_{n\in\mathbb{N}} satisfies the conditions of Theorem 3.5.

Proof.

Assumption 3.2 follows from Proposition A.1. The conditions of Proposition 3.4 can be verified using similar (albeit simpler) arguments as in the proof of Proposition 4.3 below. To verify Assumption 3.1, we need to show that if $F_{n}\to F$ in $D(\mathcal{M}([0,1]),[0,T])$, then $g^{[F_{n}]}\to g^{[F]}$ in $D(\mathscr{W},[0,T])$. Recall that, for any $u\in[0,1]$, $u^{\prime}=F(t;\bar{F}(t+{\rm d}t;u)-{\rm d}t)$, and let

un=Fn(t;F¯n(t+dt;u)dt).u^{\prime}_{n}=F_{n}(t;\bar{F}_{n}(t+{\rm d}t;u)-{\rm d}t). (5.49)

Below we assume that t[0,T]t\in[0,T] is a continuity point of FF. Applying (4.10) and the triangle inequality, we have

Δn\displaystyle\Delta_{n} (t+dt):=g[F](t+dt;,)g[Fn](t+dt;,))L1\displaystyle(t+{\rm d}t):=\left\lVert g^{[F]}(t+{\rm d}t;\cdot,\cdot)-g^{[F_{n}]}(t+{\rm d}t;\cdot,\cdot))\right\rVert_{L_{1}} (5.50)
g[F](t;,)g[Fn](t;,)L1\displaystyle\leq\left\lVert g^{[F]}(t;\cdot,\cdot)-g^{[F_{n}]}(t;\cdot,\cdot)\right\rVert_{L_{1}}
+dt[0,1]2dxdy|λ(t,F¯(t;x),F¯(t;y),F(t;),g~(F)(t;))(1g(F)(t;x,y))\displaystyle+{\rm d}t\int_{[0,1]^{2}}{\rm d}x{\rm d}y\,\bigg{|}\lambda\left(t,\bar{F}(t;x^{\prime}),\bar{F}(t;y^{\prime}),F(t;\cdot),\tilde{g}^{(F)}(t;\cdot)\right)(1-g^{(F)}(t;x^{\prime},y^{\prime}))
λ(t,F¯n(t;xn),F¯n(t;yn),Fn(t;),g~(Fn)(t;))(1g(Fn)(t;xn,yn))|\displaystyle\quad\hskip 22.76219pt-\lambda\left(t,\bar{F}_{n}(t;x^{\prime}_{n}),\bar{F}_{n}(t;y^{\prime}_{n}),F_{n}(t;\cdot),\tilde{g}^{(F_{n})}(t;\cdot)\right)(1-g^{(F_{n})}(t;x^{\prime}_{n},y^{\prime}_{n}))\bigg{|}
+dt[0,1]2dxdy|μ(t,F¯(t;x),F¯(t;y),F(t;),g~(F)(t;))g(F)(t;x,y)\displaystyle+{\rm d}t\int_{[0,1]^{2}}{\rm d}x{\rm d}y\,\bigg{|}\mu\left(t,\bar{F}(t;x^{\prime}),\bar{F}(t;y^{\prime}),F(t;\cdot),\tilde{g}^{(F)}(t;\cdot)\right)g^{(F)}(t;x^{\prime},y^{\prime})
μ(t,F¯n(t;xn),F¯n(t;yn),Fn(t;),g~(Fn)(t;))g(Fn)(t;xn,yn)|.\displaystyle\quad\hskip 22.76219pt-\mu\left(t,\bar{F}_{n}(t;x^{\prime}_{n}),\bar{F}_{n}(t;y^{\prime}_{n}),F_{n}(t;\cdot),\tilde{g}^{(F_{n})}(t;\cdot)\right)g^{(F_{n})}(t;x^{\prime}_{n},y^{\prime}_{n})\bigg{|}.

Using $F_{n}\to F$, the Lipschitz continuity of $\mu(\cdot)$ and $\lambda(\cdot)$, and the fact that, for any $g,f\in\mathscr{W}$, $\delta_{\square}(\tilde{g},\tilde{f})\leq\lVert g-f\rVert_{L_{1}}$, we see that there exists $\bar{K}<\infty$ such that

|λ(t,F¯(t;x),F¯(t;y),F(t;),g~(F)(t;))\displaystyle\bigg{|}\lambda\left(t,\bar{F}(t;x^{\prime}),\bar{F}(t;y^{\prime}),F(t;\cdot),\tilde{g}^{(F)}(t;\cdot)\right) (5.51)
λ(t,F¯n(t;xn),F¯n(t;yn),Fn(t;),g~(Fn)(t;))|K¯(Δn(t)+o(1)),\displaystyle\qquad-\lambda\left(t,\bar{F}_{n}(t;x^{\prime}_{n}),\bar{F}_{n}(t;y^{\prime}_{n}),F_{n}(t;\cdot),\tilde{g}^{(F_{n})}(t;\cdot)\right)\bigg{|}\leq\bar{K}(\Delta_{n}(t)+o(1)),

for almost all $(x,y)\in[0,1]^{2}$, with the same inequality holding for $\mu(\cdot)$. In addition, because $\mu(\cdot)$ and $\lambda(\cdot)$ are Lipschitz continuous functions with a compact domain, they are uniformly bounded by some constant $\widehat{K}<\infty$. Combining this with (5.50) and (5.51), we obtain

Δn(t+dt)\displaystyle\Delta_{n}(t+{\rm d}t)\leq (5.52)
Δn(t)+2dt[K¯(Δn(t)+o(1))+K^[0,1]2dxdy|g(Fn)(t;xn,yn)g(F)(t;x,y)|].\displaystyle\Delta_{n}(t)+2{\rm d}t\bigg{[}\bar{K}(\Delta_{n}(t)+o(1))+\widehat{K}\int_{[0,1]^{2}}{\rm d}x\,{\rm d}y\,\left|g^{(F_{n})}(t;x_{n}^{\prime},y^{\prime}_{n})-g^{(F)}(t;x^{\prime},y^{\prime})\right|\bigg{]}.

Using the fact that $\Delta_{n}(0)=0$ and repeatedly applying (5.52), we obtain

Δn(t+dt)\displaystyle\Delta_{n}(t+{\rm d}t)\leq (5.53)
20t+dtds[K¯(Δn(s)+o(1))+K^[0,1]2dxdy|g(Fn)(s;xn,yn)g(F)(s;x,y)|].\displaystyle 2\int_{0}^{t+{\rm dt}}{\rm d}s\bigg{[}\bar{K}(\Delta_{n}(s)+o(1))+\widehat{K}\int_{[0,1]^{2}}{\rm d}x\,{\rm d}y\,\left|g^{(F_{n})}(s;x_{n}^{\prime},y^{\prime}_{n})-g^{(F)}(s;x^{\prime},y^{\prime})\right|\bigg{]}.

Because $F(\cdot;\cdot)\in D(\mathcal{M}([0,1]),[0,T])$, we can replace $x^{\prime}$ and $y^{\prime}$ by $x$ and $y$ in (5.53) almost everywhere. Since $F_{n}\to F$, we also have $x^{\prime}_{n}\to x$ and $y^{\prime}_{n}\to y$ almost everywhere. Consequently, from (5.53) we obtain

Δn(t+dt)\displaystyle\Delta_{n}(t+{\rm d}t)\leq (5.54)
20t+dtds[K¯(Δn(s)+o(1))+(K^+o(1))[0,1]2dxdy|g(Fn)(s;x,y)g(F)(s;x,y)|]\displaystyle 2\int_{0}^{t+{\rm dt}}{\rm d}s\bigg{[}\bar{K}(\Delta_{n}(s)+o(1))+(\widehat{K}+o(1))\int_{[0,1]^{2}}{\rm d}x\,{\rm d}y\,\left|g^{(F_{n})}(s;x,y)-g^{(F)}(s;x,y)\right|\bigg{]}
=20t+dtdsΔn(s)(K¯+K^+o(1)).\displaystyle=2\int_{0}^{t+{\rm dt}}{\rm d}s\Delta_{n}(s)(\bar{K}+\widehat{K}+o(1)).
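The convergence $\Delta_{n}(t)\to 0$ then follows from Grönwall's lemma: abbreviating the $o(1)$ error terms in the display above by $\varepsilon_{n}$, we obtain

```latex
\Delta_{n}(t)\leq 2(\bar{K}+\widehat{K}+\varepsilon_{n})\int_{0}^{t}\Delta_{n}(s)\,{\rm d}s+2\bar{K}\varepsilon_{n}t
\quad\Longrightarrow\quad
\Delta_{n}(t)\leq 2\bar{K}\varepsilon_{n}t\,\mathrm{e}^{2(\bar{K}+\widehat{K}+\varepsilon_{n})t},
```

which vanishes as $n\to\infty$ for every $t\in[0,T]$.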

Because $\Delta_{n}(0)=0$, (5.54) implies that $\Delta_{n}(t)\to 0$ for all $t\in[0,T]$, which completes the proof. ∎

Proof of Theorem 3.10. Theorem 3.10 now follows from Lemmas 5.7 and 5.8, in combination with Theorem 3.5. ∎

Proof of Proposition 4.2. Once we observe that $F^{*}$ is the unique zero of the rate function in Proposition A.1, we see that Proposition 4.2 follows from Lemmas 5.7 and 5.8, in combination with Theorem 3.10. ∎

5.3.2. Proofs of the results in Section 4.2

We proceed by giving the proof of Proposition 4.3.

Proof of Proposition 4.3. It remains to establish exponential tightness using Proposition 3.4. Recall that $\lambda(\cdot)$ and $\mu(\cdot)$ are bounded functions, and let $K<\infty$ denote a common upper bound. It directly follows that

\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{n}\log\mathbb{P}\left(\sup_{t\in[0,T]}C_{n}(t,\delta)>\varepsilon{n\choose 2}\right) (5.55)
\leq\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{n}\log\mathbb{P}\left(\max_{i\in\{0,\dots,\lfloor T/\delta\rfloor\}}C_{n}(i\delta,2\delta)>\varepsilon{n\choose 2}\right)
\leq\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{n}\log\left\{\Big(\Big\lfloor\frac{T}{\delta}\Big\rfloor+1\Big)\max_{i\in\{0,\dots,\lfloor T/\delta\rfloor\}}\mathbb{P}\left(C_{n}(i\delta,2\delta)>\varepsilon{n\choose 2}\right)\right\}.

Note that there are two ways in which an edge can change state during the interval $[i\delta,(i+2)\delta]$. In the first place, the Poisson clock of one of its endpoints rings during $[i\delta,(i+2)\delta]$. We let $C^{(v)}_{n}(t,\delta)$ denote the number of edges (active or inactive) that are adjacent to such a vertex. Because the probability that the clock of a given vertex rings during an interval of length $2\delta$ is $1-\mathrm{e}^{-2\gamma\delta}$, $C^{(v)}_{n}(i\delta,2\delta)$ is stochastically dominated by a random variable $X^{(v)}_{n}(\delta)\sim n\cdot{\rm Bin}(n,1-\mathrm{e}^{-2\gamma\delta})$. In the second place, the value of $U_{ij}$ falls within the interval

[minu[0,2δ]H(t+u,Xi(t)+u,\displaystyle\bigg{[}\min_{u\in[0,2\delta]}H(t+u,X_{i}(t)+u, Xj(t)+u,Fn(t+u)),\displaystyle X_{j}(t)+u,F_{n}(t+u)), (5.56)
maxu[0,2δ]H(t+u,Xi(t)+u,Xj(t)+u,Fn(t+u)].\displaystyle\qquad\max_{u\in[0,2\delta]}H(t+u,X_{i}(t)+u,X_{j}(t)+u,F_{n}(t+u)\bigg{]}.

We let Cn(e)(iδ,2δ)C_{n}^{(e)}(i\delta,2\delta) denote the number of such edges. Because both λ()\lambda(\cdot) and μ()\mu(\cdot) are bounded by a constant KK, it follows that

maxu[0,2δ]H(t+u,\displaystyle\max_{u\in[0,2\delta]}H(t+u, Xi(t)+u,Xj(t)+u,Fn(t+u))\displaystyle X_{i}(t)+u,X_{j}(t)+u,F_{n}(t+u)) (5.57)
minu[0,2δ]H(t+u,Xi(t)+u,Xj(t)+u,Fn(t+u))4Kδ.\displaystyle-\min_{u\in[0,2\delta]}H(t+u,X_{i}(t)+u,X_{j}(t)+u,F_{n}(t+u))\leq 4K\delta.

Consequently, Cn(e)(iδ,2δ)C_{n}^{(e)}(i\delta,2\delta) is stochastically dominated by a random variable Xn(e)(δ)Bin((n2),4Kδ)X^{(e)}_{n}(\delta)\sim{\rm Bin}({n\choose 2},4K\delta). We thus have

\displaystyle\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{n}\log\left\{\sum_{i=0}^{\lfloor\frac{T}{\delta}\rfloor}\mathbb{P}\left(C_{n}(i\delta,2\delta)>\varepsilon{n\choose 2}\right)\right\} (5.58)
\displaystyle\leq\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{n}\log\left\{\sum_{i=0}^{\lfloor\frac{T}{\delta}\rfloor}\left[\mathbb{P}\left(C^{(v)}_{n}(i\delta,2\delta)>\frac{\varepsilon}{2}{n\choose 2}\right)+\mathbb{P}\left(C^{(e)}_{n}(i\delta,2\delta)>\frac{\varepsilon}{2}{n\choose 2}\right)\right]\right\}
limδ0lim supn1nlogTδ{(Xn(v)(δ)>ε2(n2))+(Xn(e)(δ)>ε2(n2))}\displaystyle\leq\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{n}\log\frac{T}{\delta}\left\{\mathbb{P}\left(X_{n}^{(v)}(\delta)>\frac{\varepsilon}{2}{n\choose 2}\right)+\mathbb{P}\left(X_{n}^{(e)}(\delta)>\frac{\varepsilon}{2}{n\choose 2}\right)\right\}
\displaystyle\leq\lim_{\delta\downarrow 0}\limsup_{n\to\infty}\frac{1}{n}\log\frac{T}{\delta}\left\{\exp\left({-n\frac{\varepsilon}{2}\left(\log\frac{\varepsilon}{1-\mathrm{e}^{-2\gamma\delta}}-1\right)}\right)+\exp\left({-{n\choose 2}\frac{\varepsilon}{2}\left(\log\frac{\varepsilon}{4K\delta}-1\right)}\right)\right\}
=,\displaystyle=-\infty,

where in the last inequality we apply [25, Theorem 2.3] (which is essentially a Chernoff bound). ∎
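The bound from [25, Theorem 2.3] used in the last step, \mathbb{P}({\rm Bin}(N,p)\geq k)\leq(\mathrm{e}Np/k)^{k}=\exp(-k(\log(k/(Np))-1)), can be compared against the exact binomial tail. A minimal sketch in Python (the values of N, p and the range of k are illustrative):

```python
import math

def binom_tail(N, p, k):
    # exact P(X >= k) for X ~ Bin(N, p)
    return sum(math.comb(N, j) * p**j * (1 - p)**(N - j) for j in range(k, N + 1))

def chernoff_bound(N, p, k):
    # P(X >= k) <= (e N p / k)^k = exp(-k (log(k / (N p)) - 1))
    return math.exp(-k * (math.log(k / (N * p)) - 1))

N, p = 200, 0.05            # illustrative values, mean Np = 10
for k in range(15, 60, 5):  # thresholds above the mean
    assert binom_tail(N, p, k) <= chernoff_bound(N, p, k)
```

Since the bound exceeds 1 for k close to the mean Np, the comparison is only informative for k well above the mean, which is the regime relevant in (5.58).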

5.3.3. Proofs of the results in Section 4.3

The proof of Proposition 4.4 is a direct computation.

Proof of Proposition 4.4. Pick P_{1},P_{2}\in\mathcal{M}([0,1]) and suppose that

2010yxPi(dx)Pi(dy)e,i=1,2.2\int_{0}^{1}\int_{0}^{y}x\,P_{i}({\rm d}x)\,P_{i}({\rm d}y)\leq e^{*},\qquad i=1,2. (5.59)

Observe that if Xi(k)X_{i}^{(k)} are independent random variables with distribution PiP_{i}, then

2010yxPi(dx)Pi(dy)=𝔼(Xi(1)Xi(2)).2\int_{0}^{1}\int_{0}^{y}x\,P_{i}({\rm d}x)\,P_{i}({\rm d}y)=\mathbb{E}(X_{i}^{(1)}\wedge X_{i}^{(2)}). (5.60)

Let P3=cP1+(1c)P2P_{3}=cP_{1}+(1-c)P_{2} with c[0,1]c\in[0,1]. We have

2010yxP3(dx)P3(dy)=𝔼(X3(1)X3(2))=c2𝔼(X1(1)X1(2))+(1c)2𝔼(X2(1)X2(2))+2c(1c)𝔼(X1(1)X2(1))e(c2+(1c)2)+2c(1c)𝔼(X1(1)X2(1)).\displaystyle\begin{split}2\int_{0}^{1}\int_{0}^{y}x\,&P_{3}({\rm d}x)\,P_{3}({\rm d}y)=\mathbb{E}(X_{3}^{(1)}\wedge X_{3}^{(2)})\\ &=c^{2}\mathbb{E}(X_{1}^{(1)}\wedge X_{1}^{(2)})+(1-c)^{2}\,\mathbb{E}(X_{2}^{(1)}\wedge X_{2}^{(2)})+2c(1-c)\,\mathbb{E}(X_{1}^{(1)}\wedge X_{2}^{(1)})\\ &\leq e^{*}(c^{2}+(1-c)^{2})+2c(1-c)\,\mathbb{E}(X_{1}^{(1)}\wedge X_{2}^{(1)}).\end{split} (5.61)

Hence it remains to show that 𝔼(X1(1)X2(1))e\mathbb{E}(X_{1}^{(1)}\wedge X_{2}^{(1)})\leq e^{*}. We have

𝔼(X1(1)X2(1))=01dx(X1x)(X2x)(01dx(X1x)201dx(X2x)2)1/2e,\displaystyle\begin{split}\mathbb{E}(X_{1}^{(1)}\wedge X_{2}^{(1)})&=\int^{1}_{0}{\rm d}x\,\mathbb{P}(X_{1}\geq x)\,\mathbb{P}(X_{2}\geq x)\\ &\leq\left(\int^{1}_{0}{\rm d}x\,\mathbb{P}(X_{1}\geq x)^{2}\int^{1}_{0}{\rm d}x\,\mathbb{P}(X_{2}\geq x)^{2}\right)^{1/2}\leq e^{*},\end{split} (5.62)

where in the second step we apply the Cauchy–Schwarz inequality, and in the final step combine (5.59) with (5.60). ∎
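The identity (5.60) and the first step of (5.62) can be sanity-checked numerically. A minimal sketch for P = Uniform[0,1], where both sides equal \mathbb{E}(X^{(1)}\wedge X^{(2)})=1/3 (the grid size is illustrative):

```python
# Check 2 ∫_0^1 ∫_0^y x P(dx) P(dy) = E[X^(1) ∧ X^(2)] = ∫_0^1 P(X >= x)^2 dx
# for P = Uniform[0,1], where all three quantities equal 1/3.
m = 1000
h = 1.0 / m
xs = [(j + 0.5) * h for j in range(m)]  # midpoint grid on [0, 1]

# left-hand side: 2 ∫_0^1 ∫_0^y x dx dy  (density 1 on [0, 1])
lhs = 2 * sum(sum(x * h for x in xs if x < y) * h for y in xs)

# right-hand side: ∫_0^1 P(X >= x)^2 dx with P(X >= x) = 1 - x
rhs = sum((1 - x) ** 2 * h for x in xs)

assert abs(lhs - 1 / 3) < 2e-3
assert abs(rhs - 1 / 3) < 2e-3
```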

Appendix A Rate function for the driving process

To establish an LDP for the illustrative example in Section 3.2 and the examples in Sections 4.1 and 4.2, we need to verify Assumption 2.2, i.e., we need to establish an LDP for the driving process. The driving process for Sections 4.1 and 4.2 is dealt with in Appendix A.1, the one for Section 3.2 in Appendix A.2.

A.1. LDP for driving process in Section 4

Recall that in Sections 4.1 and 4.2 each vertex vv is assigned a Poisson clock that rings at times {τk(v)}k\{\tau_{k}(v)\}_{k\in\mathbb{N}}, and

X_{v}(t):=t-\max_{k}\{\tau_{k}(v):\tau_{k}(v)\leq t\}, (A.1)

with μn(t)=1ni=1nδXi(t)\mu_{n}(t)=\frac{1}{n}\sum^{n}_{i=1}\delta_{X_{i}(t)}. Here we drop the restriction that Xi(t)1X_{i}(t)\leq 1, and suppose that μn(t)(+)\mu_{n}(t)\in\mathcal{M}(\mathbb{R}_{+}). We also suppose that

\mu_{n}(0)=\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}(0)}\to v,\qquad n\to\infty, (A.2)

in (+)\mathcal{M}(\mathbb{R}_{+}). For A+A\subseteq\mathbb{R}_{+} and t>0t>0, we let

D𝟏μt(A)=limh0μt+h(A+h)μt(A)h.D_{\boldsymbol{1}}\mu_{t}(A)=\lim_{h\to 0}\frac{\mu_{t+h}(A+h)-\mu_{t}(A)}{h}. (A.3)
Proposition A.1.

The sequence of processes (\mu_{n})_{n\in\mathbb{N}} satisfies the LDP on D(\mathcal{M}(\mathbb{R}_{+}),[0,T]) with rate n and with rate function

\displaystyle\begin{split}K(\mu)&=\int^{T}_{0}{\rm d}t\int_{0}^{\infty}[\gamma\mu_{t}({\rm d}x)-D_{\boldsymbol{1}}\mu_{t}({\rm d}x)]+\int_{0}^{T}{\rm d}t\int^{\infty}_{0}D_{\boldsymbol{1}}\mu_{t}({\rm d}x)\log\left(\frac{D_{\boldsymbol{1}}\mu_{t}({\rm d}x)}{f_{t}(0)\mu_{t}({\rm d}x)}\right)\\ &\qquad\qquad\qquad+\int^{T}_{0}{\rm d}t\,f_{t}(0)\log(f_{t}(0)/\gamma),\end{split} (A.4)

where f_{t}(0):=\lim_{h\downarrow 0}\mu_{t}([0,h])/h when \mu_{t} is absolutely continuous; we set K(\mu)=\infty otherwise.

Proof.

We prove Proposition A.1 by applying the standard method of proving sample-path LDPs:

  • (I) Establish a finite-dimensional LDP for times 0<t1<<tk<T0<t_{1}<\dots<t_{k}<T.

  • (II) Write down a limiting expression for the rate function when the gaps between the times shrink to zero.

  • (III) Strengthen the topology by establishing exponential tightness.

Step (I): Let

Px(𝒕)(dy(1),,dy(r))=(Xi(t1)dy(1),,Xi(tr)dy(r)|Xi(0)=x),P_{x}^{(\boldsymbol{t})}({\rm d}y^{(1)},\dots,{\rm d}y^{(r)})=\mathbb{P}(X_{i}(t_{1})\in{\rm d}y^{(1)},\dots,X_{i}(t_{r})\in{\rm d}y^{(r)}|X_{i}(0)=x), (A.5)

where \boldsymbol{t}=(t_{1},\dots,t_{r}) with 0<t_{1}<t_{2}<\dots<t_{r}<T. We apply the following result, which is an immediate consequence of [10, Theorem 3.5]: if (A.2) holds, then the sequence of measures (\mathbb{P}((\mu_{n}(t_{1}),\dots,\mu_{n}(t_{r}))\in\cdot))_{n\in\mathbb{N}} satisfies the LDP with rate n and with rate function

Iv(𝒕)(μ1,,μr)\displaystyle I^{(\boldsymbol{t})}_{v}(\mu_{1},\dots,\mu_{r}) (A.6)
=supf1,,frCb(+)r[i=1r+μi(dz)fi(z)\displaystyle=\sup_{f_{1},\dots,f_{r}\in C_{b}(\mathbb{R}_{+})^{r}}\bigg{[}\sum_{i=1}^{r}\int_{\mathbb{R}_{+}}\mu_{i}({\rm d}z)f_{i}(z)
+v(dx)log+rPx(𝒕)(dy(1),,dy(r))exp(i=1rfi(y(i)))],\displaystyle\qquad\qquad-\int_{\mathbb{R}_{+}}v({\rm d}x)\log\int_{\mathbb{R}_{+}^{r}}P^{(\boldsymbol{t})}_{x}({\rm d}y^{(1)},\dots,{\rm d}y^{(r)})\exp\left(\sum_{i=1}^{r}f_{i}(y^{(i)})\right)\bigg{]},

where (μ1,,μr)(+)r(\mu_{1},\dots,\mu_{r})\in\mathcal{M}(\mathbb{R}_{+})^{r}.

To apply (A.6), we need to write down a formula for P_{x}^{(\boldsymbol{t})}({\rm d}y^{(1)},\dots,{\rm d}y^{(r)}). By the Markov property, this is essentially equivalent to writing down an expression for P_{x}^{(t)}({\rm d}y), i.e., for a single time step. If X_{i}(0)=x, then the probability that X_{i}(t)=x+t is \mathrm{e}^{-\gamma t} (i.e., the probability that the Poisson clock associated with vertex i does not ring in the time interval [0,t]). On the other hand, if y<t, then the probability that X_{i}(t)\in{\rm d}y is the probability that the Poisson clock associated with vertex i rings for the last time at time t-y (which occurs with probability \gamma\,{\rm d}y), after which it does not ring again during the remaining time y (which occurs with probability \mathrm{e}^{-\gamma y}). We thus have

Px(t)(dy)={eγtif y=x+t,γdyeγyif yt,0otherwise.P^{(t)}_{x}({\rm d}y)=\begin{cases}\mathrm{e}^{-\gamma t}\qquad&\text{if }y=x+t,\\ \gamma{\rm d}y\,\mathrm{e}^{-\gamma y}\qquad&\text{if }y\leq t,\\ 0&\text{otherwise}.\end{cases} (A.7)

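The two cases in (A.7) can be checked by simulating the age process directly. A minimal sketch with illustrative parameter values (\gamma=1, t=2, a fixed seed):

```python
import math
import random

random.seed(12345)
gamma, t, n = 1.0, 2.0, 100000

def age(t):
    # time since the last ring of a rate-gamma Poisson clock on [0, t],
    # started from X_i(0) = 0; equals t if the clock never rings
    s, last = 0.0, None
    while True:
        s += random.expovariate(gamma)
        if s > t:
            break
        last = s
    return t if last is None else t - last

samples = [age(t) for _ in range(n)]

# atom at y = t: P(no ring on [0, t]) = e^{-gamma t}
frac_atom = sum(1 for y in samples if y == t) / n
assert abs(frac_atom - math.exp(-gamma * t)) < 0.01

# density gamma e^{-gamma y} on [0, t): check P(X(t) <= 0.5)
frac_half = sum(1 for y in samples if y <= 0.5) / n
assert abs(frac_half - (1 - math.exp(-gamma * 0.5))) < 0.01
```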
If we apply (A.6) for a single time step, then we obtain

Iv(t)(μ)\displaystyle I_{v}^{(t)}(\mu) =supfCb([0,))[0μ(dz)f(z)0v(dx)log(0Px(t)(dy)ef(y))]\displaystyle=\sup_{f\in C_{b}([0,\infty))}\left[\int_{0}^{\infty}\mu({\rm d}z)f(z)-\int_{0}^{\infty}v({\rm d}x)\log\left(\int_{0}^{\infty}P^{(t)}_{x}({\rm d}y)\mathrm{e}^{f(y)}\right)\right]
=supfCb[0,)[0μ(dz)f(z)0v(dx)log(eγt+f(x+t)+0tdyγeγy+f(y))].\displaystyle=\sup_{f\in C_{b}[0,\infty)}\left[\int_{0}^{\infty}\mu({\rm d}z)f(z)-\int_{0}^{\infty}v({\rm d}x)\log\left(\mathrm{e}^{-\gamma t+f(x+t)}+\int_{0}^{t}{\rm d}y\,\gamma\mathrm{e}^{-\gamma y+f(y)}\right)\right]. (A.8)

We would like to derive a closed-form expression for I_{v}^{(t)}(\mu). To do this, we consider the term under the supremum, i.e.,

0μ(dz)f(z)0v(dx)log(eγt+f(x+t)+0tdyγeγy+f(y)),\int_{0}^{\infty}\mu({\rm d}z)f(z)-\int_{0}^{\infty}v({\rm d}x)\log\left(\mathrm{e}^{-\gamma t+f(x+t)}+\int_{0}^{t}{\rm d}y\,\gamma\mathrm{e}^{-\gamma y+f(y)}\right), (A.9)

take the derivative with respect to f(x+t)f(x+t) for fixed x0x\geq 0, and set this to zero. This gives

μ(d(x+t))v(dx)eγt+f(x+t)eγt+f(x+t)+0tdyγeγy+f(y)=0,\mu({\rm d}(x+t))-v({\rm d}x)\,\frac{\mathrm{e}^{-\gamma t+f(x+t)}}{\mathrm{e}^{-\gamma t+f(x+t)}+\int_{0}^{t}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}}=0, (A.10)

which leads to

f(x+t)=log(μ(d(x+t))0tdyγeγy+f(y)[v(dx)μ(d(x+t))]eγt).f(x+t)=\log\left(\frac{\mu({\rm d}(x+t))\int_{0}^{t}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}}{[v({\rm d}x)-\mu({\rm d}(x+t))]\,\mathrm{e}^{-\gamma t}}\right). (A.11)

Substituting (A.11) into (A.9), we obtain

\displaystyle\begin{split}\eqref{JEq}&=\int_{0}^{t-}\mu({\rm d}z)\,f(z)+\int_{0}^{\infty}\mu({\rm d}(x+t))\log\left(\frac{\mu({\rm d}(x+t))\int_{0}^{t}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}}{[v({\rm d}x)-\mu({\rm d}(x+t))]\,\mathrm{e}^{-\gamma t}}\right)\\ &\qquad-\int_{0}^{\infty}v({\rm d}x)\,\log\left(\mathrm{e}^{-\gamma t}\frac{\mu({\rm d}(x+t))\int_{0}^{t}{\rm d}y\,\gamma\mathrm{e}^{-\gamma y+f(y)}}{[v({\rm d}x)-\mu({\rm d}(x+t))]\,\mathrm{e}^{-\gamma t}}+\int^{t}_{0}{\rm d}y\,\gamma\mathrm{e}^{-\gamma y+f(y)}\right)\\ &=\int_{0}^{t-}\mu({\rm d}z)f(z)+\int_{0}^{\infty}\mu({\rm d}(x+t))\log\left(\frac{\mu({\rm d}(x+t))\int_{0}^{t}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}}{[v({\rm d}x)-\mu({\rm d}(x+t))]\,\mathrm{e}^{-\gamma t}}\right)\\ &\qquad-\int_{0}^{\infty}v({\rm d}x)\log\left(\frac{v({\rm d}x)}{v({\rm d}x)-\mu({\rm d}(x+t))}\int^{t}_{0}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}\right)\\ &=\int_{0}^{\infty}\mu({\rm d}(x+t))\log\left(\frac{\mu({\rm d}(x+t))}{[v({\rm d}x)-\mu({\rm d}(x+t))]\,\mathrm{e}^{-\gamma t}}\right)\\ &\qquad-\int_{0}^{\infty}v({\rm d}x)\,\log\left(\frac{v({\rm d}x)}{v({\rm d}x)-\mu({\rm d}(x+t))}\right)\\ &\qquad+\int_{0}^{t-}\mu({\rm d}z)\,f(z)-\int_{0}^{\infty}[v({\rm d}x)-\mu({\rm d}(x+t))]\log\left(\int^{t}_{0}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}\right).\end{split} (A.12)

We next optimise over f(z) for 0\leq z<t (note that previously we optimised over f(z) for t\leq z<\infty). To do this, we consider the last line, which reads

0tμ(dz)f(z)0[v(dx)μ(d(x+t))]log(0tdzγeγz+f(z))\displaystyle\int_{0}^{t-}\mu({\rm d}z)\,f(z)-\int_{0}^{\infty}[v({\rm d}x)-\mu({\rm d}(x+t))]\log\left(\int^{t}_{0}{\rm d}z\,\gamma\mathrm{e}^{-\gamma z+f(z)}\right) (A.13)
=0tμ(dz)f(z)(1t+μ(dx))log(0tdzγeγz+f(z)),\displaystyle\qquad=\int_{0}^{t-}\mu({\rm d}z)\,f(z)-\left(1-\int_{t+}^{\infty}\mu({\rm d}x)\right)\log\left(\int^{t}_{0}{\rm d}z\,\gamma\mathrm{e}^{-\gamma z+f(z)}\right), (A.14)

and take the derivative with respect to f(z)f(z) for fixed z[0,t]z\in[0,t], and set this to zero. This gives

\mu({\rm d}z)-\left(1-\int_{t+}^{\infty}\mu({\rm d}x)\right)\frac{{\rm d}z\,\gamma\mathrm{e}^{-\gamma z+f(z)}}{\int_{0}^{t}{\rm d}y\,\gamma\mathrm{e}^{-\gamma y+f(y)}}=0, (A.15)

which implies that

f(z)=log(μ(dz)0tdyγeγy+f(y)(1t+μ(dx))dzγeγz).f(z)=\log\left(\frac{\mu({\rm d}z)\int_{0}^{t}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}}{\left(1-\int_{t+}^{\infty}\mu({\rm d}x)\right){\rm d}z\,\gamma\,\mathrm{e}^{-\gamma z}}\right). (A.16)

Substituting (A.16) into (A.14), we obtain

0tμ(dz)log(μ(dz)0tdyγeγy+f(y)(1t+μ(dx))dzγeγz)(1t+μ(dx))log(0tdzγeγz+f(z))=0tμ(dz)log(μ(dz)dzγeγz)0tμ(dz)log(0tμ(dz)).\displaystyle\begin{split}&\int_{0}^{t-}\mu({\rm d}z)\,\log\left(\frac{\mu({\rm d}z)\int_{0}^{t}{\rm d}y\,\gamma\,\mathrm{e}^{-\gamma y+f(y)}}{\left(1-\int_{t+}^{\infty}\mu({\rm d}x)\right){\rm d}z\,\gamma\mathrm{e}^{-\gamma z}}\right)-\left(1-\int_{t+}^{\infty}\mu({\rm d}x)\right)\log\left(\int^{t}_{0}{\rm d}z\,\gamma\,\mathrm{e}^{-\gamma z+f(z)}\right)\\ &=\int^{t-}_{0}\mu({\rm d}z)\,\log\left(\frac{\mu({\rm d}z)}{{\rm d}z\,\gamma\mathrm{e}^{-\gamma z}}\right)-\int^{t-}_{0}\mu({\rm d}z)\,\log\left(\int^{t-}_{0}\mu({\rm d}z)\right).\end{split} (A.17)

Combining this with (A.12), we obtain

(A.9)\displaystyle\eqref{JEq} =0μ(d(x+t))log(μ(d(x+t))[v(dx)μ(d(x+t))]eγt)\displaystyle=\int_{0}^{\infty}\mu({\rm d}(x+t))\log\left(\frac{\mu({\rm d}(x+t))}{[v({\rm d}x)-\mu({\rm d}(x+t))]\,\mathrm{e}^{-\gamma t}}\right)
0v(dx)log(v(dx)v(dx)μ(d(x+t)))\displaystyle\quad-\int_{0}^{\infty}v({\rm d}x)\log\left(\frac{v({\rm d}x)}{v({\rm d}x)-\mu({\rm d}(x+t))}\right)
+0tμ(dz)log(μ(dz)dzγeγz)0tμ(dz)log(0tμ(dz))\displaystyle\quad+\int^{t-}_{0}\mu({\rm d}z)\log\left(\frac{\mu({\rm d}z)}{{\rm d}z\,\gamma\,\mathrm{e}^{-\gamma z}}\right)-\int^{t-}_{0}\mu({\rm d}z)\,\log\left(\int^{t-}_{0}\mu({\rm d}z)\right)
=0[v(dx)μ(d(x+t))]log(v(dx)μ(d(x+t)))\displaystyle=\int_{0}^{\infty}[v({\rm d}x)-\mu({\rm d}(x+t))]\log\left(v({\rm d}x)-\mu({\rm d}(x+t))\right)
+0μ(d(x+t))log(μ(d(x+t))eγt)\displaystyle\quad+\int^{\infty}_{0}\mu({\rm d}(x+t))\log\left(\frac{\mu({\rm d}(x+t))}{\mathrm{e}^{-\gamma t}}\right)
0v(dx)log(v(dx))0tμ(dz)log(0tμ(dz))\displaystyle\quad-\int^{\infty}_{0}v({\rm d}x)\log(v({\rm d}x))-\int^{t-}_{0}\mu({\rm d}z)\,\log\left(\int^{t-}_{0}\mu({\rm d}z)\right)
\displaystyle\quad+\int^{t-}_{0}\mu({\rm d}z)\,\log\left(\frac{\mu({\rm d}z)}{{\rm d}z\,\gamma\,\mathrm{e}^{-\gamma z}}\right)
=0v(dx)[v(dx)μ(d(x+t))v(dx)log(v(dx)μ(d(x+t))v(dx))\displaystyle=\int^{\infty}_{0}v({\rm d}x)\left[\frac{v({\rm d}x)-\mu({\rm d}(x+t))}{v({\rm d}x)}\log\left(\frac{v({\rm d}x)-\mu({\rm d}(x+t))}{v({\rm d}x)}\right)\right.
+μ(d(x+t))v(dx)log(μ(d(x+t))eγtv(dx))]\displaystyle\quad\quad\left.+\frac{\mu({\rm d}(x+t))}{v({\rm d}x)}\log\left(\frac{\mu({\rm d}(x+t))}{\mathrm{e}^{-\gamma t}v({\rm d}x)}\right)\right]
(0tμ(dz))log(0tμ(dz))+0tμ(dz)log(μ(dz)dzγeγz).\displaystyle\quad-\left(\int^{t-}_{0}\mu({\rm d}z)\right)\log\left(\int^{t-}_{0}\mu({\rm d}z)\right)+\int^{t-}_{0}\mu({\rm d}z)\log\left(\frac{\mu({\rm d}z)}{{\rm d}z\,\gamma\,\mathrm{e}^{-\gamma z}}\right).

Rearranging further, we obtain the following.

Lemma A.2.
Iv(t)(μ)=0μ(d(x+t))log(μ(d(x+t))v(dx)eγt)+0[v(dx)μ(d(x+t))]log(v(dx)μ(d(x+t))v(dx)0tμ(dz))+0tμ(dz)log(μ(dz)dzγeγz).\displaystyle\begin{split}I^{(t)}_{v}(\mu)&=\int^{\infty}_{0}\mu({\rm d}(x+t))\log\left(\frac{\mu({\rm d}(x+t))}{v({\rm d}x)\,\mathrm{e}^{-\gamma t}}\right)\\ &\qquad+\int^{\infty}_{0}[v({\rm d}x)-\mu({\rm d}(x+t))]\log\left(\frac{v({\rm d}x)-\mu({\rm d}(x+t))}{v({\rm d}x)\int^{t-}_{0}\mu({\rm d}z)}\right)\\ &\qquad+\int^{t-}_{0}\mu({\rm d}z)\log\left(\frac{\mu({\rm d}z)}{{\rm d}z\,\gamma\mathrm{e}^{-\gamma z}}\right).\end{split} (A.18)

Note that if

v(dx)={1,if x=0,0,otherwise,v({\rm d}x)=\begin{cases}1,\qquad\text{if }x=0,\\ 0,\qquad\text{otherwise},\end{cases}

then for μ\mu absolutely continuous with respect to Pv(t)P^{(t)}_{v} we have

\displaystyle I_{v}^{(t)}(\mu) =\mu({\rm d}t)\log\left(\frac{\mu({\rm d}t)}{\mathrm{e}^{-\gamma t}}\right)+(1-\mu({\rm d}t))\log\left(\frac{1-\mu({\rm d}t)}{1-\mu({\rm d}t)}\right)+\int^{t-}_{0}\mu({\rm d}z)\log\left(\frac{\mu({\rm d}z)}{{\rm d}z\,\gamma\mathrm{e}^{-\gamma z}}\right)
=μ(dt)log(μ(dt)eγt)+0tμ(dz)log(μ(dz)dzγeγz),\displaystyle=\mu({\rm d}t)\log\left(\frac{\mu({\rm d}t)}{\mathrm{e}^{-\gamma t}}\right)+\int^{t-}_{0}\mu({\rm d}z)\log\left(\frac{\mu({\rm d}z)}{{\rm d}z\,\gamma\mathrm{e}^{-\gamma z}}\right),

which is the relative entropy of \mu with respect to P^{(t)}_{v}.
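The identification with a relative entropy can be illustrated by discretising the law P^{(t)}_{v} for v=\delta_{0} (an atom \mathrm{e}^{-\gamma t} at t plus density \gamma\mathrm{e}^{-\gamma z} on [0,t)) and checking that the divergence vanishes exactly at P^{(t)}_{v} and is strictly positive elsewhere. A minimal sketch (the grid size is illustrative):

```python
import math

gamma, t, m = 1.0, 2.0, 1000
dz = t / m

# discretised reference law P_v^{(t)}: density gamma e^{-gamma z} on [0, t)
# plus an atom e^{-gamma t} at z = t
p = [gamma * math.exp(-gamma * (j + 0.5) * dz) * dz for j in range(m)]
p.append(math.exp(-gamma * t))
assert abs(sum(p) - 1.0) < 1e-5  # P_v^{(t)} is a probability law

def relative_entropy(q):
    # sum q_j log(q_j / p_j), with the convention 0 log 0 = 0
    return sum(qj * math.log(qj / pj) for qj, pj in zip(q, p) if qj > 0)

assert relative_entropy(p) == 0.0  # zero at the typical behaviour

# any other law has strictly positive divergence
q = [(j + 1) * pj for j, pj in enumerate(p)]
s = sum(q)
q = [qj / s for qj in q]
assert relative_entropy(q) > 0.0
```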

The above arguments extend naturally to establish the following finite-dimensional large deviation principle.

Lemma A.3.

If (A.2) holds, then the sequence of measures (\mathbb{P}((\mu_{n}(t_{1}),\dots,\mu_{n}(t_{r}))\in\cdot))_{n\in\mathbb{N}} satisfies the LDP with rate n and with rate function I^{(\boldsymbol{t})}_{v}(\mu_{1},\dots,\mu_{r})=\sum_{i=1}^{r}I^{(t_{i}-t_{i-1})}_{\mu_{i-1}}(\mu_{i}), where t_{0}=0 and \mu_{0}=v.

Step (II): By the Dawson-Gärtner projective limit LDP ([11, Theorem 4.6.1]), the sequence of measures ((μn))n(\mathbb{P}(\mu_{n}\in\cdot))_{n\in\mathbb{N}} satisfies the LDP in the pointwise topology with rate function

Iv(pw)(μ)=sup0=t0<t1<<tk=TIv(t0,,tk)(μ(t0),,μ(tk)).I_{v}^{(pw)}(\mu)=\sup_{0=t_{0}<t_{1}<\dots<t_{k}=T}I^{(t_{0},\dots,t_{k})}_{v}(\mu(t_{0}),\dots,\mu(t_{k})). (A.19)

The next step is therefore to prove the following lemma.

Lemma A.4.

K(\mu)=I_{v}^{(pw)}(\mu) for all \mu\in D(\mathcal{M}(\mathbb{R}_{+}),[0,T]) with \mu(0)=v.

Proof.

The proof comes in two steps.

\bullet We start by demonstrating that I^{(pw)}_{v}(\mu)\geq K(\mu) for any \mu\in D(\mathcal{M}(\mathbb{R}_{+}),[0,T]). Consider the times 0<t_{1}=\Delta<t_{2}=2\Delta<\dots<t_{k}=k\Delta=T, and let \mu_{i}=\mu(t_{i})\in\mathcal{M}(\mathbb{R}_{+}). The rate function of the finite-dimensional LDP is

Iv(𝒕)(𝝁)\displaystyle I^{(\boldsymbol{t})}_{v}(\boldsymbol{\mu}) =i=1kIμi1(titi1)(μi)\displaystyle=\sum^{k}_{i=1}I_{\mu_{i-1}}^{(t_{i}-t_{i-1})}(\mu_{i}) (A.20)
=i=1k[0μi(d(x+Δ))log(μi(d(x+Δ))μi1(dx)eγΔ)\displaystyle=\sum_{i=1}^{k}\bigg{[}\int^{\infty}_{0}\mu_{i}({\rm d}(x+\Delta))\log\left(\frac{\mu_{i}({\rm d}(x+\Delta))}{\mu_{i-1}({\rm d}x)e^{-\gamma\Delta}}\right) (A.21)
+0[μi1(dx)μi(d(x+Δ))]log(μi1(dx)μi(d(x+Δ))μi1(dx)0Δμi(dz))\displaystyle\qquad+\int^{\infty}_{0}[\mu_{i-1}({\rm d}x)-\mu_{i}({\rm d}(x+\Delta))]\log\left(\frac{\mu_{i-1}({\rm d}x)-\mu_{i}({\rm d}(x+\Delta))}{\mu_{i-1}({\rm d}x)\int^{\Delta-}_{0}\mu_{i}({\rm d}z)}\right) (A.22)
+0Δμi(dz)log(μi(dz)dzγeγz)].\displaystyle\qquad+\int^{\Delta-}_{0}\mu_{i}({\rm d}z)\log\left(\frac{\mu_{i}({\rm d}z)}{{\rm d}z\,\gamma e^{-\gamma z}}\right)\bigg{]}. (A.23)

We will deal with (A.21), (A.22), (A.23) separately.

Recall the definition of D𝟏D_{\boldsymbol{1}} from (A.3). For the first term we have

\displaystyle\begin{split}\eqref{LT1}&=\sum_{i=0}^{k-1}\int^{\infty}_{0}\mu_{i+1}({\rm d}(x+\Delta))\log\left(\frac{\mu_{i+1}({\rm d}(x+\Delta))}{\mu_{i}({\rm d}x)\,\mathrm{e}^{-\gamma\Delta}}\right)\\ &=\sum_{i=0}^{k-1}\int^{\infty}_{0}\mu_{i+1}({\rm d}(x+\Delta))\log\left(1+\frac{\mu_{i+1}({\rm d}(x+\Delta))-\mu_{i}({\rm d}x)\mathrm{e}^{-\gamma\Delta}}{\mu_{i}({\rm d}x)\,\mathrm{e}^{-\gamma\Delta}}\right)\\ &=o(1)+\sum_{i=0}^{k-1}\int^{\infty}_{0}\frac{\mu_{i+1}({\rm d}(x+\Delta))}{\mu_{i}({\rm d}x)\,\mathrm{e}^{-\gamma\Delta}}[\mu_{i+1}({\rm d}(x+\Delta))-\mu_{i}({\rm d}x)(1-\gamma\Delta)]\\ &=o(1)+\sum_{i=0}^{k-1}\int^{\infty}_{0}\Delta[\gamma\mu_{i}({\rm d}x)-D_{\boldsymbol{1}}\mu_{i\Delta}({\rm d}x)]\\ &\to\int^{T}_{0}{\rm d}t\int_{0}^{\infty}[\gamma\mu_{t}({\rm d}x)-D_{\boldsymbol{1}}\mu_{t}({\rm d}x)].\end{split} (A.24)

For the second term we have

\displaystyle\begin{split}\eqref{LT2}&=\sum_{i=0}^{k-1}\int^{\infty}_{0}[\mu_{i}({\rm d}x)-\mu_{i+1}({\rm d}(x+\Delta))]\log\left(\frac{\mu_{i}({\rm d}x)-\mu_{i+1}({\rm d}(x+\Delta))}{\mu_{i}({\rm d}x)\int^{\Delta-}_{0}\mu_{i+1}({\rm d}z)}\right)\\ &\to\int_{0}^{T}{\rm d}t\int^{\infty}_{0}D_{\boldsymbol{1}}\mu_{t}({\rm d}x)\log\left(\frac{D_{\boldsymbol{1}}\mu_{t}({\rm d}x)}{\mu_{t}({\rm d}x)f_{t}(0)}\right),\end{split} (A.25)

where ft(0):=limh0μt([0,h])/hf_{t}(0):=\lim_{h\to 0}\mu_{t}([0,h])/h. Finally, for the third term we have

\displaystyle\eqref{LT3}=\sum_{i=1}^{k}\int^{\Delta-}_{0}\mu_{i}({\rm d}z)\,\log\left(\frac{\mu_{i}({\rm d}z)}{{\rm d}z\,\gamma\mathrm{e}^{-\gamma z}}\right)\to\int^{T}_{0}{\rm d}t\,f_{t}(0)\log(f_{t}(0)/\gamma). (A.26)

Combining the three terms, we get the expression in (A.4).

\bullet To prove that I_{v}^{(pw)}(\mu)\leq K(\mu), suppose to the contrary that I_{v}^{(pw)}(\mu)>K(\mu) for some \mu\in D(\mathcal{M}(\mathbb{R}_{+}),[0,T]). In that case there must exist a vector \boldsymbol{t}=(t_{0},\dots,t_{k}) such that

Iv(𝒕)(μ(t0),,μ(tk))>K(μ).I^{(\boldsymbol{t})}_{v}(\mu(t_{0}),\dots,\mu(t_{k}))>K(\mu). (A.27)

For \ell\in\mathbb{N}, let \boldsymbol{s}^{[\ell]}=(s_{0}^{[\ell]},\dots,s_{k_{\ell}}^{[\ell]}) be such that

  • for any i{0,,k}i\in\{0,\dots,k\} there exists j{0,,k}j\in\{0,\dots,k_{\ell}\} with ti=sj[]t_{i}=s^{[\ell]}_{j},

  • \lim_{\ell\to\infty}\max_{j\in\{1,\dots,k_{\ell}\}}|s_{j}^{[\ell]}-s_{j-1}^{[\ell]}|=0.

By the contraction principle, for any 1\ell\geq 1, we have

I^{(\boldsymbol{t})}_{v}(\mu(t_{0}),\dots,\mu(t_{k}))\leq I^{(\boldsymbol{s}^{[\ell]})}_{v}(\mu(s^{[\ell]}_{0}),\dots,\mu(s^{[\ell]}_{k_{\ell}})). (A.28)

On the other hand, following arguments very similar to those above, we get

limIv(𝒔[])(μ(s0[]),,μ(sk[]))=K(μ),\lim_{\ell\to\infty}I^{(\boldsymbol{s}^{[\ell]})}_{v}(\mu(s^{[\ell]}_{0}),\dots,\mu(s^{[\ell]}_{k_{\ell}}))=K(\mu), (A.29)

which in combination with (A.28) contradicts (A.27). The proof that Iv(pw)(μ)=I^{(pw)}_{v}(\mu)=\infty when μ\mu is not absolutely continuous follows from standard arguments and is therefore omitted. ∎

Step (III): We now establish exponential tightness.

Lemma A.5.

The sequence of measures (μn())\mathbb{P}(\mu_{n}(\cdot)\in\cdot) is exponentially tight in D((+),[0,T])D(\mathcal{M}(\mathbb{R}_{+}),[0,T]).

Proof.

By [15] it suffices to show that

limδ0lim supn1nlog(w(μn,δ,T)>ε)=,\lim_{\delta\to 0}\limsup_{n\to\infty}\frac{1}{n}\log\mathbb{P}(w^{\prime}(\mu_{n},\delta,T)>\varepsilon)=-\infty, (A.30)

where

w(x,δ,T)=inf{ti}maxisups,t[ti1,ti)r(x(s),x(t))w^{\prime}(x,\delta,T)=\inf_{\{t_{i}\}}\max_{i}\sup_{s,t\in[t_{i-1},t_{i})}r(x(s),x(t)) (A.31)

and rr is a metric which generates the weak topology. We equip the space (+)\mathcal{M}(\mathbb{R}_{+}) with the Lévy metric, so that for two distribution functions F,GF,G we have

r(F,G)=\inf\{\varepsilon>0:F(x-\varepsilon)-\varepsilon\leq G(x)\leq F(x+\varepsilon)+\varepsilon,\,\forall x\in\mathbb{R}\}. (A.32)

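The Lévy metric r can be approximated by a grid search over \varepsilon. A minimal sketch (the grid resolution and the two point-mass cdfs are illustrative), which recovers the known value r(\delta_{a},\delta_{b})=\min(|a-b|,1):

```python
def levy_distance(F, G, xs, eps_step=0.01):
    # smallest eps (on a grid) with F(x - eps) - eps <= G(x) <= F(x + eps) + eps
    # for all x on the grid xs
    eps = 0.0
    while eps <= 1.5:
        if all(F(x - eps) - eps <= G(x) + 1e-9 and G(x) <= F(x + eps) + eps + 1e-9
               for x in xs):
            return eps
        eps += eps_step
    return float("inf")

# cdfs of point masses at a = 0.2 and b = 0.5
F = lambda x: 1.0 if x >= 0.2 else 0.0
G = lambda x: 1.0 if x >= 0.5 else 0.0
xs = [j * 0.001 for j in range(-1000, 2000)]

d = levy_distance(F, G, xs)
assert abs(d - 0.3) < 0.02  # r(delta_a, delta_b) = min(|a - b|, 1) = 0.3
```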
Let C^{(v)}(s,t) denote the number of vertices whose Poisson clock rings at some point in the interval [s,t]. Observe that

\sup_{s,t\in[t_{i-1},t_{i})}r(F_{n}(s),F_{n}(t))\leq\frac{C^{(v)}(t_{i-1},t_{i})}{n}+t_{i}-t_{i-1}. (A.33)

Given (A.33), the result then follows from arguments very similar to those used in the proof of Proposition 4.3. ∎

A.2. LDP for driving process in Section 3.2

Define μ\mu^{*} by letting

\mu^{*}_{t}(A)=\mu_{t}\left((F^{\exp})^{-1}(A)\right) (A.34)

for all t>0 and A\subseteq[0,1], where F^{\exp}(\cdot) is the cdf of an exponential random variable with rate \gamma. Note that with this transformation we recover the empirical type process in Section 3.2.

Proposition A.6.

The sequence of processes (μn)n(\mu^{*}_{n})_{n\in\mathbb{N}} satisfies the LDP on D(([0,1]),[0,T])D(\mathcal{M}([0,1]),[0,T]) with rate nn and with rate function

K(μ)=infμD((+),[0,T]):Fexp(μ)=μK(μ),K^{*}(\mu^{*})=\inf_{\mu\in D(\mathcal{M}(\mathbb{R}_{+}),[0,T]):F^{\exp}(\mu)=\mu^{*}}K(\mu), (A.35)

where K()K(\cdot) is defined in Proposition A.1.

Proof.

Because the transformation in (A.34) is continuous, the result is a direct application of Proposition A.1 and the contraction principle (see [18]). ∎

References

  • [1] Athreya, S., den Hollander, F., and Röllin, A. (2021). Graphon-valued stochastic processes from population genetics. Annals of Applied Probability 31, 1724–1745.
  • [2] Billingsley, P. (1979). Probability and Measure. Wiley, New York.
  • [3] Borgs, C., Chayes, J., Gaudio, J., Petti, S., and Sen, S. (2020). A large deviation principle for block models. arXiv:2007.14508.
  • [4] Borgs, C., Chayes, J., Lovász, L., Sós, V., and Vesztergombi, K. (2008). Convergent sequences of dense graphs I: Subgraph frequencies, metric properties and testing. Advances in Mathematics 219, 1801–1851.
  • [5] Borgs, C., Chayes, J., Lovász, L., Sós, V., and Vesztergombi, K. (2012). Convergent sequences of dense graphs II: Multiway cuts and statistical physics. Annals of Mathematics 176, 151–219.
  • [6] Braunsteins, P., den Hollander, F., and Mandjes, M. (2022+). A sample-path large deviation principle for dynamic Erdős–Rényi random graphs. arXiv:2009.12848. Annals of Applied Probability, to appear.
  • [7] Černý, J., and Klimovsky, A. (2020). Markovian dynamics of exchangeable arrays. In: Genealogies of Interacting Particle Systems. Lecture Notes Series, Institute for Mathematical Sciences, National University of Singapore, pp. 209–228.
  • [8] Chatterjee, S. (2017). Large Deviations for Random Graphs. Lecture Notes in Mathematics 2197. Springer.
  • [9] Chatterjee, S., and Varadhan, S.R.S. (2011). The large deviation principle for the Erdős-Rényi random graph. European Journal of Combinatorics 32 (7), 1000–1017.
  • [10] Dawson, D A., and Gärtner, J. (1987). Large deviations from the McKean-Vlasov limit for weakly interacting diffusions. Stochastics: An International Journal of Probability and Stochastic Processes 20(4), 247–308.
  • [11] Dembo, A., and Zeitouni, O. (1998). Large Deviations Techniques and Applications. Stochastic Modelling and Applied Probability 38. Springer, New York.
  • [12] Dhara, S., and Sen, S. (2021). Large deviation for uniform graphs with given degrees. Communications in Mathematical Physics 382 (1), 123-171.
  • [13] Dupuis, P., and Medvedev, G. (2020). The large deviation principle for interacting dynamical systems on random graphs. arXiv:2007.13899.
  • [14] Ethier, S. N., and Kurtz, T.G. (2009). Markov Processes: Characterization and Convergence. Wiley Series in Probability and Mathematical Statistics 282. John Wiley & Sons.
  • [15] Feng, J., and Kurtz, T.G. (2006). Large Deviations for Stochastic Processes. Mathematical Surveys and Monographs 131. American Mathematical Society.
  • [16] Grebík, J., and Pikhurko, O. (2021). Large deviation principles for block and step graphon random graph models. arXiv:2101.07025.
  • [17] Harris, T.E. (1963). The Theory of Branching Processes. Grundlehren der mathematischen Wissenschaften 196. Springer, Berlin.
  • [18] den Hollander, F. (2000). Large Deviations. Fields Institute Monograph 14. American Mathematical Society, Providence RI, USA.
  • [19] Janson, S. (2013). Graphons, Cut Norm and Distance, Couplings and Rearrangements. New York Journal of Mathematics Monographs, Vol. 4.
  • [20] Lovász, L. (2012). Large Networks and Graph Limits. American Mathematical Society, Providence RI, USA.
  • [21] Lovász, L., and Szegedy, B. (2006). Limits of dense graph sequences. Journal of Combinatorial Theory, Series B 96, 933–957.
  • [22] Lovász, L., and Szegedy, B. (2007). Szemerédi’s lemma for the analyst. Geometric and Functional Analysis 17, 252–270.
  • [23] Lubetzky, E., and Zhao, Y. (2015). On replica symmetry of large deviations in random graphs. Random Structures & Algorithms 47(1), 109–146.
  • [24] Markering, M. (2020). The large deviation principle for inhomogeneous Erdős–Rényi random graphs. arXiv:2010.03504.
  • [25] McDiarmid, C. (1998). Concentration. In: Probabilistic methods for algorithmic discrete mathematics. Springer, Berlin, Heidelberg, pp. 195-248.
  • [26] Ráth, B. (2012). Time evolution of dense multigraph limits under edge-conservative preferential attachment dynamics. Random Structures & Algorithms 41, 365–390.
  • [27] Röllin, A., and Zhang, Z.S. (2021). Dense multigraphon-valued stochastic processes and edge-changing dynamics in the configuration model. arXiv:2104.13024.