This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Grid entropy in last passage percolation — a superadditive critical exponent approach

Alexandru Gatea

Department of Mathematics, University of Toronto
[email protected]
Abstract

Working in the setting of i.i.d. last-passage percolation on D\mathbb{R}^{D} with no assumptions on the underlying edge-weight distribution, we arrive at the notion of grid entropy — a Subadditive Ergodic Theorem limit of the entropies of paths with empirical measures weakly converging to a given target, or equivalently a deterministic critical exponent of canonical order statistics associated with the Levy-Prokhorov metric. This provides a fresh approach to an entropy first developed by Rassoul-Agha and Seppäläinen as a large deviation rate function of empirical measures along paths. In their 2014 paper [RAS14], variational formulas are developed for the point-to-point/point-to-level Gibbs Free Energies as the convex conjugates of this entropy. We rework these formulas in our new framework and explicitly link our descriptions of grid entropy to theirs. We also improve on a known bound for this entropy by introducing a relative entropy term in the inequality. Furthermore, we show that the set of measures with finite grid entropy coincides with the deterministic set of limit points of empirical measures studied in a recent paper [Bat20] by Bates. In addition, we partially answer a directed polymer version of a question of Hoffman which was previously tackled in the zero temperature case by Bates. Our results cover both the point-to-point and point-to-level scenarios.

1 Introduction

The limiting behaviour of empirical measures is of paramount importance in percolation theory. The normalized passage time along a path, what we ultimately care about, is nothing but the identity function integrated against the empirical measure of that path. In First/Last Passage Percolation we study the minimizing/maximizing paths called geodesics.

A major open problem posed by Hoffman in [Ahl15] is whether empirical measures along geodesics in a fixed direction converge weakly. This is partially answered in [Bat20] for FPP. Bates proves that the sets q,\mathcal{R}^{q},\mathcal{R} of limiting distributions in direction qq, limiting distributions in the direction-free case respectively are deterministic and derives an explicit variational formula for the limit shape of the first passage time as the minimum value of a linear functional over q\mathcal{R}^{q}. When the set of minimizers is a singleton, which Bates argues happens for a dense family of edge-weight distributions, it follows that Hoffman’s question is answered in the affirmative. The same argument applied to the LPP model yields analogous conclusions.

Generally, there is no known way of computing the limiting distributions along geodesics. [MSZ21] showcases some cutting edge developments for the solvable Exponential LPP on 2{\mathbb{Z}}^{2} model, including an explicit formula for weak limits of empirical measures along geodesics. For other recent work on geodesics see [AH16], [JRAS19] and [JLS20].

One way to extend the work in [Bat20] is to study the more general question of what entropy of empirical measures in a fixed direction or direction-free converge in distribution to a certain target measure. Grid entropy gives an exact, deterministic answer.

In [RAS14], Rassoul-Agha and Seppäläinen derive this entropy as a large deviation rate function of empirical measures along paths and they provide variational formulas realizing the point-to-point/point-to-level Gibbs Free Energies as the convex conjugates of this mysterious new entropy. This rate function approach suggests that one might be able to derive formulas for this entropy coming from subadditivity, and furthermore that it is related to a critical exponent of paths.

Our main aim is to prove these intuitions correct, working in an LPP setting on D{\mathbb{Z}}^{D} which can easily be extended to more general frameworks. Rather than starting from the definitions given in [RAS14], we define grid entropy as a certain critical exponent of paths and show that it is equivalently described as a superadditive ergodic limit. Moreover, we arrive at variational formulas which are analogues to Rassoul-Agha and Seppäläinen’s and which are in fact positive-temperature analogues of Bates’ variational formula. Along the way, we establish various properties of this entropy, some of which are already noted in or follow from [RAS14] and [Bat20], but also some of which are not, such as a partial answer to Hoffman’s question in a directed polymer setting and such as an amazing equality between grid entropy, relative entropy and Shannon entropy. We give versions of these results for both direction-qq grid entropy (q,ν)||(q,\nu)|| and direction-free grid entropy ν||\nu|| of target measures ν\nu. By concavity and the lattice symmetries, the direction-free case turns out to simply be the direction-fixed case in the unit direction (1D,,1D)(\frac{1}{D},\ldots,\frac{1}{D}) which maximizes the total number of paths.


Let us now be more precise. We consider north-east nearest-neighbor paths on D{\mathbb{Z}}^{D}, and work on D{\mathbb{R}}^{D} by taking coordinate-wise floors. We follow a similar approach to [Bat20], in that we couple our i.i.d. edge weights τe\tau_{e} to i.i.d. edge labels Unif[0,1] random variables UeU_{e} via a measurable function τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}} satisfying τe=τ(Ue)\tau_{e}=\tau(U_{e}). This lets us work on [0,1] at no additional cost, as we can lift everything back to {\mathbb{R}} via the pushforward of τ\tau. We then interest ourselves with how many empirical measures 1nμπ=1neπδUe\frac{1}{n}\mu_{\pi}=\frac{1}{n}\sum\limits_{e\in\pi}\delta_{U_{e}} for paths 0nq\vec{0}\rightarrow\lfloor nq\rfloor converge to some given target measure ν\nu in +\mathcal{M}_{+}, the set of finite non-negative Borel measures on [0,1]. We may keep the direction qq fixed, or we may vary qq over all points in D{\mathbb{R}}^{D} with the same 1-norm q1||q||_{1}.

To perform the counting, we consider the order statistics of the distance of the paths’ empirical measures 1nμπ\frac{1}{n}\mu_{\pi} to the target ν\nu, where distance is measured via the Levy-Prokhorov metric ρ\rho, which metrizes weak convergence of measures. That is, given a direction qDq\in{\mathbb{R}}^{D} and a target measure ν\nu, for every nn\in{\mathbb{N}} we let

minπ:0nq1ρ(1nμπ,ν)minπ:0nq2ρ(1nμπ,ν)minπ:0nq#π:0nqρ(1nμπ,ν)\min_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\leq\min_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{2}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\leq\ldots\leq\min_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}

denote the order statistics value of ρ(1nμπ,ν)\rho(\frac{1}{n}\mu_{\pi},\nu). It is convenient to define

minπ:0nqjρ(1nμπ,ν):=+forj>#π:0nq\min\limits_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{j}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}:=+\infty\ \mbox{for}\ j>\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor

In the direction-free case, where we count all paths of a certain scaled length from 0\vec{0}, we let
t:=νTVt:=||\nu||_{TV} and similarly define minπs.t.|π|=ntjρ(1nμπ,ν)\min\limits_{\pi\ \mbox{s.t.}\ |\pi|=\lfloor nt\rfloor}^{j}\rho(\frac{1}{n}\mu_{\pi},\nu) over paths of length nt\lfloor nt\rfloor anchored at 0\vec{0}. Of course, these order statistics and the paths corresponding to them are event-dependent.

We then define the grid entropy with respect to the target ν\nu and the direction qq, denoted (q,ν)||(q,\nu)||, and the direction-free grid entropy, denoted ν||\nu||, to be the critical exponent where the corresponding order statistics change from converging to 0 a.s. to diverging a.s.:

(q,ν):=sup{α0:limnminπ:0nqeαnρ(1nμπ,ν)=0a.s.}ν:=sup{α0:limnminπs.t.|π|=nteαnρ(1nμπ,ν)=0a.s.}\begin{split}||(q,\nu)||&:=\sup\bigg{\{}\alpha\geq 0\ :\lim_{n\rightarrow\infty}\min_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{\lfloor e^{\alpha n}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\bigg{\}}\\ ||\nu||&:=\sup\bigg{\{}\alpha\geq 0\ :\lim_{n\rightarrow\infty}\min_{\pi\ \mbox{s.t.}\ |\pi|=\lfloor nt\rfloor}^{\lfloor e^{\alpha n}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\bigg{\}}\end{split} (1)

where these are defined to be -\infty if the set of α\alpha’s is empty. It will turn out that we can replace the limits in this definition with lim inf\liminf’s and get the same quantity. Observe that these grid entropies lie in {}[0,H(q)]\{-\infty\}\cup[0,H(q)], {}[0,logD]\{-\infty\}\cup[0,\log D] respectively where

H(q):=i=1Dqilogqiq1H(q):=\sum_{i=1}^{D}-q_{i}\log\frac{q_{i}}{||q||_{1}}

is the (Shannon) entropy of the total number of paths in direction qq, in the sense that

(#paths0nq)=eH(q)n+o(n)(\#\mbox{paths}\ \vec{0}\rightarrow nq)=e^{H(q)n+o(n)}

We note that the description (1) of grid entropy as the critical exponent of these order statistics has been previously shown to hold only for Bernoulli edge labels (see [Car10]).

Over the course of this paper we establish two other equivalent definitions of grid entropy which avoid the annoyance of dealing with these event-dependent orderings of the paths. One of these alternate descriptions is remarkably simple: grid entropy is the negative convex conjugate of Gibbs Free Energy; as a direct consequence, our notion of grid entropy is equivalent to that appearing in [RAS14] modulo some normalizations. The other, as the entropy of paths with empirical measure Levy-Prokhorov-close to the target, is completely new though not unexpected.

We summarize these characterizations in the following theorem and link them to current literature in subsequent remarks.

Theorem A.
  1. (i)

    Let q0Dq\in{\mathbb{R}}^{D}_{\geq 0} and let ν\nu be a finite non-negative Borel measure on [0,1]. Then the directionq-q grid entropy (q,ν)||(q,\nu)|| as defined in (1) is also given by

    (q,ν)=infϵ>0limn1nlogπ:0nqenϵρ(1nμπ,ν)a.s.||(q,\nu)||=\inf_{\epsilon>0}\lim_{n\rightarrow\infty}\frac{1}{n}\log\sum_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}\ \mbox{a.s.} (2)

    The expressions we take an infimum of are each directed metrics with negative sign on D×+{\mathbb{R}}^{D}\times\mathcal{M}_{+}. That is, they take value in [,)[-\infty,\infty), evaluate to 0 when (q,ν)=(0,0)(q,\nu)=(\vec{0},0), and satisfy the triangle inequality with the sign reversed.

    Moreover, direction-fixed grid entropy is the negative convex conjugate of the point-to-point β\beta-Gibbs Free Energy in direction qq (as a function of the environment-coupling function τ\tau): For β>0\beta>0,

    (q,ν)=(Gqβ)(ν)=supτ[βτ,νGqβ(τ)]||(q,\nu)||=-(G_{q}^{\beta})^{*}(\nu)=-\sup_{\tau}[\beta\langle\tau,\nu\rangle-G_{q}^{\beta}(\tau)]

    where the supremum is over bounded measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}, where ,\langle\cdot,\cdot\rangle is the integration linear functional τ,ν=01τ(u)𝑑ν\langle\tau,\nu\rangle=\int_{0}^{1}\tau(u)d\nu and where the point-to-point β\beta-Gibbs Free Energy is given by

    Gqβ(τ)=limn1nlogπ:0nqeβT(π)G_{q}^{\beta}(\tau)=\lim_{n\rightarrow\infty}\frac{1}{n}\log\sum_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}e^{\beta T(\pi)}
  2. (ii)

    Analogous results hold in the direction-free case. Let ν\nu be a finite non-negative Borel measure on [0,1] and let t:=νTVt:=||\nu||_{TV}. Then the direction-free grid entropy ν||\nu|| as defined in (1) is also given by

    ν\displaystyle||\nu|| =infϵ>0limn1nlogπ𝒫nt(0)enϵρ(1nμπ,ν)a.s.\displaystyle=\inf_{\epsilon>0}\lim_{n\rightarrow\infty}\frac{1}{n}\log\sum_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}\ \mbox{a.s.}
    =supq0D,q1=t(q,ν)\displaystyle=\sup_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=t}||(q,\nu)||
    =(t,ν)where=(1D,,1D)\displaystyle=||(t\ell,\nu)||\ \mbox{where}\ \ell=\bigg{(}\frac{1}{D},\ldots,\frac{1}{D}\bigg{)}

    The expressions we take an infimum of are each directed metrics with negative sign on t\mathcal{M}_{t}, the set of finite non-negative Borel measures with total mass tt. Moreover, direction-free grid entropy is the negative convex conjugate of the point-to-level β\beta-Gibbs Free Energy:

    ν=(Gβ)(ν)=supτ[βτ,νGβ(τ)]a.s.||\nu||=-(G^{\beta})^{*}(\nu)=-\sup_{\tau}[\beta\langle\tau,\nu\rangle-G^{\beta}(\tau)]\ \mbox{a.s.}

    where the supremum is over bounded measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}} and where the point-to-level β\beta-Gibbs Free Energy is given by

    Gβ(τ)=limn1nlogπs.t.|π|=neβT(π)G^{\beta}(\tau)=\lim_{n\rightarrow\infty}\frac{1}{n}\log\sum_{\pi\ \mbox{s.t.}\ |\pi|=n}e^{\beta T(\pi)}
Remark.

We can extend these variational formulas for point-to-point/point-to-level Gibbs Free Energy to general, possibly unbounded, measurable τ\tau by truncating τ\tau in βτ,ν\beta\langle\tau,\nu\rangle at some constant C>0C>0 we take to \infty. For example,

(q,ν)=supC>0supτ[βτC,νGqβ(τ)]||(q,\nu)||=-\sup_{C>0}\sup_{\tau}[\beta\langle\tau\land C,\nu\rangle-G_{q}^{\beta}(\tau)]
Remark.

This theorem is partially proved in [Car10] [Corollary 2] for the case when the edge labels follow a Bernoulli(pp) distribution. In this setting, Carmona shows that the negative convex conjugate of Gibbs Free Energy of measures of the form νs:=sδ1+(1s)δ0\nu_{s}:=s\delta_{1}+(1-s)\delta_{0} is given by

(G1)(νs)={limn1nlog#(length n paths from 0 with ns 1-labels),splimn1nlog#(length n paths from 0 with ns 1-labels),s<p-(G^{1})^{\ast}(\nu_{s})=\begin{cases}\lim\limits_{n\rightarrow\infty}\frac{1}{n}\log\#(\mbox{length $n$ paths from $\vec{0}$ with $\geq ns$ 1-labels}),&s\geq p\\ \lim\limits_{n\rightarrow\infty}\frac{1}{n}\log\#(\mbox{length $n$ paths from $\vec{0}$ with $\leq ns$ 1-labels}),&s<p\\ \end{cases}

The inequalities ,\geq,\leq can be replaced with equality since the number of paths with nsns 1-labels is exponentially decaying in ss. Using the definition of the Levy-Prokhorov metric we can conclude that this formulation is equivalent to (1).

Remark.

As mentioned, grid entropy first appears in literature as the rate function for the Large Deviation Principle of the empirical measures in [RAS14]. The fact that we derive formula (2) using the Subadditive Ergodic Theorem should therefore not be surprising. Our definitions (1) for direction-fixed/direction-free grid entropy align with those of Rassoul-Agha and Seppäläinen as follows. Consider the product space Ω×𝒢\Omega\times\mathcal{G} of the environment
Ω:=[0,1]D×𝒢\Omega:=[0,1]^{{\mathbb{Z}}^{D}\times\mathcal{G}} consisting of the i.i.d. edge labels and of the DD unit NE steps
𝒢={e1,,eD}\mathcal{G}=\{e_{1},\ldots,e_{D}\}, and let ϕ:Ω×𝒢[0,1]\phi:\Omega\times\mathcal{G}\rightarrow[0,1] map an (environment, unit direction) pair to the edge label of the corresponding unit direction anchored at the origin. Then

(q,ν)=logDinfμ:ϕ(μ)=ν,Eμ[Z1]=qH1(μ),ν=logDinfμ:ϕ(μ)=νH1(μ)||(q,\nu)||=\log D-\inf_{\mu:\phi_{\ast}(\mu)=\nu,E^{\mu}[Z_{1}]=q}H_{1}(\mu),\ ||\nu||=\log D-\inf_{\mu:\phi_{\ast}(\mu)=\nu}H_{1}(\mu) (3)

where H1H_{1} is the relative entropy defined in (5.2) of [RAS14] which can be traced back to Varadhan’s paper [Var03], and where the infimums are over the measures μ\mu on Ω×𝒢\Omega\times\mathcal{G} whose ϕ\phi-pushforward is ν\nu and, in the direction-fixed case, for which in addition the μ\mu-mean of the step coordinate Z1Z_{1} is qq. See Section 5.4 for a more detailed derivation and discussion of (3).


Now grid entropy satisfies some rather interesting properties. The following theorem captures the highlights for directionq-q grid entropy; the direction-free analogues hold as well. All but properties (iii) and (v) follow easily from the framework presented previously in [RAS14], yet we will showcase the power of our new approach to grid entropy by proving all of these properties directly.

Theorem B.

Let q0Dq\in{\mathbb{R}}^{D}_{\geq 0}. Then:

  1. (i)

    Grid entropy (q,ν)||(q,\nu)|| is a directed norm with negative sign; it scales with positive-factors and it satisfies a reverse triangle inequality:

    (p,ξ)+(q,ν)(p+q,ξ+ν)||(p,\xi)||+||(q,\nu)||\leq||(p+q,\xi+\nu)||
  2. (ii)

    Grid entropy (q,ν)||(q,\nu)|| is upper semicontinuous.

    Let

    q:={accumulation points of empirical measures along paths in direction q}\mathcal{R}^{q}:=\{\mbox{accumulation points of empirical measures along paths in direction $q$}\}
  3. (iii)

    q\mathcal{R}^{q} is weakly closed, convex and deterministic and coincides almost surely with

    {ν+:(q,ν)>}\{\nu\in\mathcal{M}_{+}:||(q,\nu)||>-\infty\}
  4. (iv)

    q\mathcal{R}^{q} consists only of measures ν\nu with total variation νTV=q1||\nu||_{TV}=||q||_{1} that are absolutely continuous with respect to the Lebesgue measure Λ\Lambda on [0,1].

  5. (v)

    Any νq\nu\in\mathcal{R}^{q} satisfies the following upper bound on the sum of the grid entropy and the relative entropy with respect to Lebesgue measure on [0,1]:

    DKL(ν||Λ)+||(q,ν)||i=1Dqilogqiq1:=H(q)D_{KL}(\nu||\Lambda)+||(q,\nu)||\leq\sum_{i=1}^{D}-q_{i}\log\frac{q_{i}}{||q||_{1}}:=H(q)

    where DKLD_{KL} denotes relative entropy (or Kullback-Leibler divergence), and where,
    again, this upper bound is simply the (Shannon) entropy of the total number of paths in direction qq.

Why do we care? The deterministic set q\mathcal{R}^{q} is nothing more than the LPP analogue of the set Bates takes a infimum over in [Bat20] in his variational formula for the FPP time constant. In our LPP setting, his formula becomes

LPP time constant :=limnlast passage time for paths 0nqn\displaystyle:=\lim_{n\rightarrow\infty}\frac{\mbox{last passage time for paths $\vec{0}\rightarrow\lfloor nq\rfloor$}}{n}
=supνqτ,νa.s.\displaystyle=\sup_{\nu\in\mathcal{R}^{q}}\langle\tau,\nu\rangle\ \mbox{a.s.}

We thus link [Bat20] and [RAS14] by providing a new, more enlightening description of these sets q\mathcal{R}^{q} in terms of these grid entropies. Furthermore, the amazing bound in (iii) is an improvement of a version without the grid entropy term proved by Bates and which also follows from Rassoul-Agha and Seppäläinen’s work.

As in [RAS14], the main application of grid entropy we present is a variational formula for the point-to-point/point-to-level Gibbs Free Energies in direction qq/to the ”level” x1++xD=ntx_{1}+\ldots+x_{D}=nt in the directed polymer model. Our variational formula is simply the zero temperature LPP analogue of Bates’ variational formula for the FPP limit shape, and furthermore can be molded into Rassoul-Agha and Seppäläinen’s variational formula via some normalizations, thus proving that what we call grid entropy really is the same object first developed in [RAS14].

Theorem C.

Fix a direction q0Dq\in{\mathbb{R}}^{D}_{\geq 0} and an inverse temperature β>0\beta>0. For bounded measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}, the point-to-point/point-to-level Gibbs Free Energies are given by

Gqβ(τ)=supνq[βτ,ν+(q,ν)],G_{q}^{\beta}(\tau)=\sup_{\nu\in\mathcal{R}^{q}}[\beta\langle\tau,\nu\rangle+||(q,\nu)||],
Gβ(τ)=supν1[βτ,ν+ν]=Gβ=supq0D,q1=1GqβG^{\beta}(\tau)=\sup_{\nu\in\mathcal{R}^{1}}[\beta\langle\tau,\nu\rangle+||\nu||]=G^{\beta}_{\ell}=\sup_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=1}G^{\beta}_{q}

where =(1D,,1D)\ell=(\frac{1}{D},\ldots,\frac{1}{D}) is the maximizing direction, and

1=q0D,q1=1q\mathcal{R}^{1}=\bigcup\limits_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=1}\mathcal{R}^{q}

is the set of Borel probability measures ν\nu that have finite direction-free grid entropy. Moreover, these supremums are achieved by some ν\nu in q,\mathcal{R}^{q},\mathcal{R}^{\ell} respectively.

Remark.

We may replace the supremums in these formulas to be over all measurable τ\tau by truncating τ\tau at some C>0C>0 and taking a supremum over CC.

As in [Bat20], it follows that the directed polymer analogue of Hoffman’s question is answered in the affirmative when our variational formula has a unique maximizer, which happens for a dense family of measurable functions τ\tau.

Theorem D.

Fix an inverse temperature β>0\beta>0 and a bounded measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}.

  1. (i)

    Fix q0Dq\in{\mathbb{R}}^{D}_{\geq 0} and suppose βτ,ν+(q,ν)\beta\langle\tau,\nu\rangle+||(q,\nu)|| has a unique maximizer νq\nu\in\mathcal{R}^{q}. For every nn pick a path πn:0nq\pi_{n}:\vec{0}\rightarrow\lfloor nq\rfloor independently and at random according to the probabilities prescribed the corresponding point-to-point β\beta-polymer measure

    ρn,qβ(dπ)=eβT(π)π:0nqeβT(π)for pathsπ:0nq\rho_{n,q}^{\beta}(d\pi)=\frac{e^{\beta T(\pi)}}{\sum\limits_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}e^{\beta T(\pi)}}\ \mbox{for paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor

    Then the empirical measures 1nμπn\frac{1}{n}\mu_{\pi_{n}} converge weakly to ν\nu a.s.

  2. (ii)

    Suppose βτ,ν+ν\beta\langle\tau,\nu\rangle+||\nu|| has a unique maximizer ν1\nu\in\mathcal{M}_{1}. For every nn pick a length nn path πn\pi_{n} from 0\vec{0} independently and at random according to the probabilities prescribed the corresponding point-to-level β\beta-polymer measure

    ρnβ(dπ)=eβT(π)πs.t.|π|=neβT(π)for length n paths π from0\rho_{n}^{\beta}(d\pi)=\frac{e^{\beta T(\pi)}}{\sum\limits_{\pi\ \mbox{s.t.}\ |\pi|=n}e^{\beta T(\pi)}}\ \mbox{for length $n$ paths $\pi$ from}\ \vec{0}

    Then the empirical measures 1nμπn\frac{1}{n}\mu_{\pi_{n}} converge weakly to ν\nu a.s.

Remark.

In fact, even if there is no unique maximizer, we can show that all accumulation points of random paths chosen according to the direction-fixed/direction-free β\beta-polymer measure are among the maximizers of the corresponding variational formula. This partial answer to Hoffman’s question in the positive temperature case is immediate from the development of grid entropy by Rassoul-Agha and Seppäläinen [RAS14] as the rate function of the Large Deviation Principle of empirical measures. However, to the best of our knowledge this is the first time it has been formulated explicitly.

The plan is as follows. In Section 2, we describe the model and setup, and outline various facts and notions we will need over the course of this paper. Section 3 focuses on developing the second definition of grid entropy (2) as a directed norm with negative sign and showing that it is equivalent to the original definition with the minπj\min\limits_{\pi}^{j}. Then, in Section 4, we investigate what information we can extract from grid entropy and what properties it satisfies (Theorem B). We devote Section 5 to applying our results to establish our variational formula for point-to-point/point-to-level Gibbs Free Energies and study the consequences of this (namely Theorems C,D) as well as the correspondence to the work in [RAS14]. Last but not least, we make some closing remarks about adapting our results to other models.

2 Preliminaries

2.1 Empirical Measures on the Lattice

We begin by briefly describing the setup we use in this paper.

We restrict ourselves to a directed Last Passage Percolation model. Consider north-east nearest-neighbour paths on the lattice D,D1{\mathbb{Z}}^{D},D\geq 1 with i.i.d. edge weights τeθ\tau_{e}\sim\theta for some probability distribution θ\theta on {\mathbb{R}}. By north-east, we of course mean that the coordinates of points on the path are nondecreasing.

For p,qDp,q\in{\mathbb{Z}}^{D} we denote by 𝒫(p,q)\mathcal{P}(p,q) the set of all NE paths π:pq\pi:p\rightarrow q. Similarly, for qDq\in{\mathbb{Z}}^{D} and t0t\in{\mathbb{Z}}_{\geq 0} we denote by 𝒫t(q)\mathcal{P}_{t}(q) the set of all NE paths from qq of length tt (no restriction on the endpoint).

Observe that either qp0Dq-p\notin{\mathbb{Z}}_{\geq 0}^{D} and 𝒫(p,q)=\mathcal{P}(p,q)=\emptyset , or qp0q-p\in{\mathbb{Z}}_{\geq 0} and

|𝒫(p,q)|=(qp1q1p1,q2p2,,qDpD)|\mathcal{P}(p,q)|=\binom{||q-p||_{1}}{q_{1}-p_{1},q_{2}-p_{2},\cdots,q_{D}-p_{D}}

Here ||||1||\cdot||_{1} is the 1-norm on D{\mathbb{R}}^{D} defined by p1=i=1D|pi|||p||_{1}=\sum\limits_{i=1}^{D}|p_{i}|.

On the other hand, |𝒫t(q)|=Dt|\mathcal{P}_{t}(q)|=D^{t} trivially for any qDq\in{\mathbb{Z}}^{D}.

Note that unlike other recent work (such as [MSZ21]) we do not restrict ourselves to a known solvable model. We do not impose any restrictions on the edge weight distribution θ\theta, so that our results hold with the greatest generality possible.

We scale the grid by nn and look at the behaviour in the limit. As is standard, we extend our initial inputs p,qp,q to lie in D{\mathbb{R}}^{D} by taking coordinatewise floors of the scaled coordinates. That is, we consider paths π𝒫(np,nq)\pi\in\mathcal{P}(\lfloor np\rfloor,\lfloor nq\rfloor) where

(x1,,xD):=(x1,,xD)\lfloor(x_{1},\ldots,x_{D})\rfloor:=(\lfloor x_{1}\rfloor,\ldots,\lfloor x_{D}\rfloor)

Our normalized directed metrics will converge almost surely to a translation-invariant limit so it will suffice to consider the case when p=0p=\vec{0}. But for now we let p\vec{p} be arbitrary.

Various inequalities we derive involve the asymptotics of the number of length nn NE paths from the origin in a fixed or unfixed direction. The following lemma, which is easily proved using Sterling’s approximation, gives us what we want.

Lemma 1.

Let k,n,ai0,ai=ak,n\in{\mathbb{N}},a_{i}\in{\mathbb{R}}_{\geq 0},\sum a_{i}=a. Then

(nana1,na2,,nak)=(aa1ikaiai+o(1))n\binom{\lfloor na\rfloor}{\lfloor na_{1}\rfloor,\lfloor na_{2}\rfloor,\ldots,\lfloor na_{k}\rfloor}=\bigg{(}\frac{a^{a}}{\prod\limits_{1\leq i\leq k}a_{i}^{a_{i}}}+o(1)\bigg{)}^{n}

where we use the convention 00=10^{0}=1.

Remark.

Recall that for q0Dq\in{\mathbb{R}}^{D}_{\geq 0} we denote by 𝒫(0,nq)\mathcal{P}(\vec{0},\lfloor nq\rfloor) the set of paths π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor. Thus

limn1nlog|𝒫(0,nq)|=i=1Dqilogqiq1:=H(q)\lim_{n\rightarrow\infty}\frac{1}{n}\log|\mathcal{P}(\vec{0},\lfloor nq\rfloor)|=\sum_{i=1}^{D}-q_{i}\log\frac{q_{i}}{||q||_{1}}:=H(q)

H(q)H(q) is the (Shannon) entropy of the number of paths in direction qq. Note that H(q)H(q) scales with positive scalars and H(q)H(q) is maximized among q0Dq\in{\mathbb{R}}^{D}_{\geq 0} with the same 1-norm by

q=q1(1D,,1D):=q1q=||q||_{1}\bigg{(}\frac{1}{D},\ldots,\frac{1}{D}\bigg{)}:=||q||_{1}\ell

in which case H(q)=q1logDH(q)=||q||_{1}\log D.

On the other hand, for t0t\geq 0 recall that we denote by 𝒫nt(0)\mathcal{P}_{\lfloor nt\rfloor}(\vec{0}) the set of paths from 0\vec{0} of length nt\lfloor nt\rfloor. Thus

limn1nlog|𝒫nt(0)|=tlogD=H(t)\lim_{n\rightarrow\infty}\frac{1}{n}\log|\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})|=t\log D=H(t\ell)

We study the distribution of weights that we observe along paths #π:npnq\#\pi:\lfloor np\rfloor\rightarrow\lfloor nq\rfloor. For a path π\pi, let the unnormalized empirical measure along π\pi be

σπ=eπδτe\sigma_{\pi}=\sum_{e\in\pi}\delta_{\tau_{e}}

Note that we normalize by 1n\frac{1}{n} rather than 1|π|\frac{1}{|\pi|}. This is simply for convenience in our proofs, as it gives us a certain superadditivity we do not get when we normalize by 1|π|\frac{1}{|\pi|}.

The Glivenko-Cantelli Theorem [Dur19, Thm. 2.4.7] tells us that for any fixed infinite NE path in the grid, empirical measures along the path converge weakly to θ\theta.

Theorem 2 (Glivenko-Cantelli Theorem).

Let FθF_{\theta} be the cumulative distribution function of θ\theta, let XiθX_{i}\sim\theta be i.i.d. random variables and let

Fn(x)=1ni=1n𝟏{Xix}F_{n}(x)=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}_{\{X_{i}\leq x\}}

be the cumulative distribution functions of the empirical measures. Then

supx|Fn(x)Fθ(x)|0a.s. asn\sup_{x}|F_{n}(x)-F_{\theta}(x)|\rightarrow 0\ \mbox{a.s. as}\ n\rightarrow\infty

However, we are interested in the limiting behavior of the empirical measure of not one path, but of all paths from 0\vec{0} or all paths with a given direction, as we scale the length of the paths. This allows us to observe more than just the original measure θ\theta.

2.2 Metrics on Measures

To gauge the distance between two measures we use the Levy-Prokhorov metric. We briefly introduce this metric as well as the total variation metric and we outline the relevant properties.

Consider a metric space (X,d)(X,d) with Borel σ\sigma-algebra \mathcal{B}. We denote by \mathcal{M} the set of finite Borel measures on (X,)(X,\mathcal{B}), by +\mathcal{M}_{+} the set of non-negative finite Borel measures, and by t\mathcal{M}_{t} the set of Borel non-negative finite measures with total mass tt for any t0t\geq 0. In this notation, 1\mathcal{M}_{1} is the set of Borel probability measures.

Definition.

The total variation norm on \mathcal{M} is defined by

μTV=supA|μ(A)|||\mu||_{TV}=\sup_{A\in\mathcal{B}}|\mu(A)|

This of course gives rise to a total variation metric, given by

dTV(μ,ν)=μνTVd_{TV}(\mu,\nu)=||\mu-\nu||_{TV}

For example, the total variation of any measure μ+\mu\in\mathcal{M}_{+} is its total mass μ(X)\mu(X).

Definition.

For any AA\in\mathcal{B} and ϵ>0\epsilon>0, the ϵ\epsilon-neighborhood of AA is defined to be

Aϵ:={xX:d(x,a)<ϵfor someaA}A^{\epsilon}:=\{x\in X:d(x,a)<\epsilon\ \mbox{for some}\ a\in A\}
Definition.

The Levy-Prokhorov metric on +\mathcal{M}_{+} is defined by

ρ(μ,ν)=inf{ϵ>0:μ(A)ν(Aϵ)+ϵandν(A)μ(Aϵ)+ϵA}\rho(\mu,\nu)=\inf\{\epsilon>0:\mu(A)\leq\nu(A^{\epsilon})+\epsilon\ \mbox{and}\ \nu(A)\leq\mu(A^{\epsilon})+\epsilon\ \forall A\in\mathcal{B}\}

It is a standard result that ρ\rho metrizes the weak convergence of measures and total variation metrizes the strong convergence of measures in +\mathcal{M}_{+}. For details, see [Hub04, Sect. 2.3].

We now derive two useful inequalities involving the Levy-Prokhorov metric.

Lemma 3.

For μ,ν+\mu,\nu\in\mathcal{M}_{+},

ρ(μ,ν)μνTV\rho(\mu,\nu)\leq||\mu-\nu||_{TV}
Remark.

This lemma establishes that the Levy-Prokhorov metric is weaker than the total variation metric.

Remark.

It is trivial to see that ρ(μ,0)=μTV\rho(\mu,0)=||\mu||_{TV}.

Proof.

Let ϵ:=μνTV\epsilon:=||\mu-\nu||_{TV}. If ϵ=0\epsilon=0 then μ=ν\mu=\nu so ρ(μ,ν)=0\rho(\mu,\nu)=0. If ϵ>0\epsilon>0, for any AA\in\mathcal{B},

AAϵμ(A)ν(Aϵ)μ(A)ν(A)supA|μ(A)ν(A)|=ϵA\subseteq A^{\epsilon}\Rightarrow\mu(A)-\nu(A^{\epsilon})\leq\mu(A)-\nu(A)\leq\sup_{A^{\prime}\in\mathcal{B}}|\mu(A^{\prime})-\nu(A^{\prime})|=\epsilon

and similarly ν(A)μ(Aϵ)ϵ\nu(A)-\mu(A^{\epsilon})\leq\epsilon so ρ(μ,ν)ϵ\rho(\mu,\nu)\leq\epsilon. ∎

We next show that ρ\rho satisfies a kind of subadditivity.

Lemma 4.

Let μ1,μ2,ν1,ν2+\mu_{1},\mu_{2},\nu_{1},\nu_{2}\in\mathcal{M}_{+}. Then

ρ(μ1+μ2,ν1+ν2)ρ(μ1,ν1)+ρ(μ2,ν2)\rho(\mu_{1}+\mu_{2},\nu_{1}+\nu_{2})\leq\rho(\mu_{1},\nu_{1})+\rho(\mu_{2},\nu_{2})
Remark.

Note that in the case μ2=ν2\mu_{2}=\nu_{2} the inequality becomes

ρ(μ1+μ2,ν1+μ2)ρ(μ1,ν1)\rho(\mu_{1}+\mu_{2},\nu_{1}+\mu_{2})\leq\rho(\mu_{1},\nu_{1})
Proof.

For any ϵ1>ρ(μ1,ν1),ϵ2>ρ(μ2,ν2)\epsilon_{1}>\rho(\mu_{1},\nu_{1}),\epsilon_{2}>\rho(\mu_{2},\nu_{2}) we have for any AA\in\mathcal{B},

μ1(A)ν1(Aϵ1)+ϵ1ν1(Aϵ1+ϵ2)+ϵ1andμ2(A)ν2(Aϵ2)+ϵ2ν2(Aϵ1+ϵ2)+ϵ2\mu_{1}(A)\leq\nu_{1}(A^{\epsilon_{1}})+\epsilon_{1}\leq\nu_{1}(A^{\epsilon_{1}+\epsilon_{2}})+\epsilon_{1}\ \mbox{and}\ \mu_{2}(A)\leq\nu_{2}(A^{\epsilon_{2}})+\epsilon_{2}\leq\nu_{2}(A^{\epsilon_{1}+\epsilon_{2}})+\epsilon_{2}
(μ1+μ2)(A)(ν1+ν2)(Aϵ1+ϵ2)+ϵ1+ϵ2\Rightarrow(\mu_{1}+\mu_{2})(A)\leq(\nu_{1}+\nu_{2})(A^{\epsilon_{1}+\epsilon_{2}})+\epsilon_{1}+\epsilon_{2}

By symmetry, the same inequality holds with μi,νi\mu_{i},\nu_{i} swapped. Thus

ρ(μ1+μ2,ν1+ν2)ρ(μ1,ν1)+ρ(μ2,ν2)\rho(\mu_{1}+\mu_{2},\nu_{1}+\nu_{2})\leq\rho(\mu_{1},\nu_{1})+\rho(\mu_{2},\nu_{2})

2.3 A Convenient Coupling of the Edge Weights

We follow [Bat20, Sect. 2.1] in coupling the environment to uniform random variables in order to work in a compact space of measures and to connect our results with his.

The idea is to write our i.i.d. edge weights τeθ\tau_{e}\sim\theta as

τe=τ(Ue)\tau_{e}=\tau(U_{e})

for some measurable function τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}} and i.i.d. Unif[0,1]-valued random variables (Ue)eE(D)(U_{e})_{e\in E({\mathbb{Z}}^{D})} on the same probability space as (τe)eE(D)(\tau_{e})_{e\in E({\mathbb{Z}}^{D})}. For instance, we could take the quantile function

τ(x)=Fθ(x):=inf{t:Fθ(t)x}\tau(x)=F_{\theta}^{-}(x):=\inf\{t\in{\mathbb{R}}:F_{\theta}(t)\geq x\}

But our results (in particular, our definitions of grid entropy) are independent of the τ\tau chosen so we allow τ\tau to be arbitrary (with the conditions stated above). This comes into play later in Section 5, when we study the Gibbs Free Energy as a function of τ\tau.

To distinguish between the τe\tau_{e} and the UeU_{e}, we call the former edge weights and the latter edge labels.

Let Λ\Lambda denote Lebesgue measure on [0,1][0,1]. We tweak the definition of empirical measures in this new setup: for any NE path π:npnq\pi:\lfloor np\rfloor\rightarrow\lfloor nq\rfloor in D{\mathbb{Z}}^{D}, define

μπ:=eπδUe\mu_{\pi}:=\sum_{e\in\pi}\delta_{U_{e}}

Then we can relate Λ\Lambda and the μπ\mu_{\pi} to θ\theta and the σπ\sigma_{\pi} respectively via the pushforward:

θ=τ(Λ),σπ=τ(μπ)whereτ(ξ)(B)=ξ(τ1(B))B(D)and  measures ξ on [0,1]\theta=\tau_{*}(\Lambda),\sigma_{\pi}=\tau_{*}(\mu_{\pi})\ \mbox{where}\ \tau_{*}(\xi)(B)=\xi(\tau^{-1}(B))\ \forall B\in\mathcal{B}({\mathbb{R}}^{D})\ \mbox{and $\forall$ measures $\xi$ on $[0,1]$}

One advantage is of course that the set of probability measures on [0,1][0,1] is weakly compact, so for any sequence of paths πn:npnq\pi_{n}:\lfloor np\rfloor\rightarrow\lfloor nq\rfloor in the grid we get a subsequence for which 1nkμπnk\frac{1}{n_{k}}\mu_{\pi_{n_{k}}} converges weakly to some measure.

In the case of a continuous cumulative distribution function FθF_{\theta}, we can use a lemma proved in [Bat20] to get a nice duality.

Lemma 5.

Given a measure θ\theta on {\mathbb{R}} with continuous cdf FθF_{\theta}, if we let
τ=Fθ:[0,1]\tau=F_{\theta}^{-}:[0,1]\rightarrow{\mathbb{R}} be its quantile function then there is a probability 1 event on which 1nkμπnkν\frac{1}{n_{k}}\mu_{\pi_{n_{k}}}\Rightarrow\nu for some subsequence nkn_{k} and paths πnk:0nkq\pi_{n_{k}}:\vec{0}\rightarrow\lfloor n_{k}q\rfloor if and only if τ(1nkμπnk)τ(ν)\tau_{*}(\frac{1}{n_{k}}\mu_{\pi_{n_{k}}})\Rightarrow\tau_{*}(\nu).

Proof.

[Bat20, Lemma 6.15] establishes that there is a probability 1 event on which
1nkμπnkν\frac{1}{n_{k}}\mu_{\pi_{n_{k}}}\Rightarrow\nu implies τ(1nkμπnk)τ(ν)\tau_{*}(\frac{1}{n_{k}}\mu_{\pi_{n_{k}}})\Rightarrow\tau_{*}(\nu). But then on the same event, given a subsequence for which τ(1nkμπnk)τ(ν)\tau_{*}(\frac{1}{n_{k}}\mu_{\pi_{n_{k}}})\Rightarrow\tau_{*}(\nu), compactness gives us a convergent subsubsequence 1nkjμπnkjξ\frac{1}{n_{k_{j}}}\mu_{\pi_{n_{k_{j}}}}\Rightarrow\xi hence

τ(1nkjμπnkj)τ(ξ)\tau_{*}\bigg{(}\frac{1}{n_{k_{j}}}\mu_{\pi_{n_{k_{j}}}}\bigg{)}\Rightarrow\tau_{*}(\xi)

so τ(ξ)=τ(ν)\tau_{*}(\xi)=\tau_{*}(\nu). The fact that FθF_{\theta} is continuous and the quantile function τ\tau satisfies

τ1((,x])=[0,Fθ(x)]x\tau^{-1}((-\infty,x])=[0,F_{\theta}(x)]\ \forall x\in{\mathbb{R}}

implies ξ\xi and ν\nu agree on all sets [0,Fθ][0,F_{\theta}] hence ξ=ν\xi=\nu. ∎

Thus, in the case when θ\theta has continuous cdf, we lose no generality by doing this coupling and working with measures on [0,1].

However, even in the most general case where θ\theta may not have continuous cdf or bounded support, our work in developing grid entropy still holds because we only use the compactness of the space of measures in later sections devoted to our variational formula for the Gibbs Free Energy.

In short, we lose nothing by restricting ourselves to the compact space of measures on [0,1][0,1].

A benefit of this coupling is the following amazing result of Bates [Bat20, Lemma 6.3 and Thm 6.4]. Here we denote by +,t\mathcal{M}_{+},\mathcal{M}_{t} the sets of finite non-negative Borel measures on [0,1] and finite non-negative Borel measures on [0,1] with total mass t0t\geq 0.

Theorem 6.
  1. (i)

    Fix q0Dq\in{\mathbb{R}}^{D}_{\geq 0}. Define q\mathcal{R}_{\infty}^{q} to be the (event-dependent) set of measures ν+\nu\in\mathcal{M}_{+} for which there is a subsequence πnk\pi_{n_{k}} of paths 0nkq\vec{0}\rightarrow\lfloor n_{k}q\rfloor with μπnkν\mu_{\pi_{n_{k}}}\Rightarrow\nu. Then there exists a deterministic, weakly closed set qq1\mathcal{R}^{q}\subseteq\mathcal{M}_{||q||_{1}} independent of τ\tau s.t.

    P(q=q)=1P(\mathcal{R}_{\infty}^{q}=\mathcal{R}^{q})=1
  2. (ii)

    Fix t0t\geq 0. Define t\mathcal{R}^{t}_{\infty} to be the (event-dependent) set of measures ν+\nu\in\mathcal{M}_{+} for which there is a subsequence πnk\pi_{n_{k}} of length nt\lfloor nt\rfloor paths from 0\vec{0} with μπnkν\mu_{\pi_{n_{k}}}\Rightarrow\nu. Then there exists a deterministic, weakly closed set tt\mathcal{R}^{t}\subseteq\mathcal{M}_{t} independent of τ\tau s.t.

    P(t=t)=1P(\mathcal{R}^{t}_{\infty}=\mathcal{R}^{t})=1

    Moreover, t=q0D,q1=tq\mathcal{R}^{t}=\bigcup\limits_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=t}\mathcal{R}^{q}.

Remark.

Bates proves this theorem in the setup of First Passage Percolation, but notes that it holds analogously in the Last Passage model.

Instead of looking at all empirical measures that have a weakly convergent subsequence we may look at only certain empirical measures with this property and the same result will hold. The proof is almost identical to Bates’s original proof except for this change, so we omit it.

Corollary 7.
  1. (i)

    Fix qDq\in{\mathbb{R}}^{D} and 0αR(q)0\leq\alpha\leq R(q). Define q,α\mathcal{R}_{\infty}^{q,\alpha} to be the (event-dependent) set of measures ν+\nu\in\mathcal{M}_{+} for which there is a subsequence πnk\pi_{n_{k}} of the paths πn:0nq\pi_{n}:\vec{0}\rightarrow\lfloor nq\rfloor with the enα\lfloor e^{n\alpha}\rfloorth smallest values of ρ(1nμπn,ν)\rho(\frac{1}{n}\mu_{\pi_{n}},\nu) satisfying

    1nμπnkνi.e.lim infnminπ:0nqenαρ(1nμπ,ν)=0\frac{1}{n}\mu_{\pi_{n_{k}}}\Rightarrow\nu\ \mbox{i.e.}\ \liminf_{n\rightarrow\infty}\min_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0

    Then there exists a deterministic weakly closed set q,αq1\mathcal{R}^{q,\alpha}\subseteq\mathcal{M}_{||q||_{1}} s.t.

    P(q,α=q,α)=1P(\mathcal{R}_{\infty}^{q,\alpha}=\mathcal{R}^{q,\alpha})=1
  2. (ii)

    Fix t0t\geq 0 and 0αtlogD0\leq\alpha\leq t\log D. Define t,α\mathcal{R}_{\infty}^{t,\alpha} to be the (event-dependent) set of measures ν+\nu\in\mathcal{M}_{+} for which there is a subsequence πnk\pi_{n_{k}} of the length tn\lfloor tn\rfloor paths πn\pi_{n} from 0\vec{0} with the enα\lfloor e^{n\alpha}\rfloorth smallest values of ρ(1nμπn,ν)\rho(\frac{1}{n}\mu_{\pi_{n}},\nu) satisfying

    1nμπnkνi.e.lim infnminπ:|π|=ntenαρ(1nμπ,ν)=0\frac{1}{n}\mu_{\pi_{n_{k}}}\Rightarrow\nu\ \mbox{i.e.}\ \liminf_{n\rightarrow\infty}\min_{\pi:|\pi|=\lfloor nt\rfloor}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0

    Then there exists a deterministic weakly closed set t,αt\mathcal{R}^{t,\alpha}\subseteq\mathcal{M}_{t} s.t.

    P(t,α=t,α)=1P(\mathcal{R}_{\infty}^{t,\alpha}=\mathcal{R}^{t,\alpha})=1
Remark.

Since the minπ:0nqenα\min\limits_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{\lfloor e^{n\alpha}\rfloor} are increasing in α\alpha then the sets q,α\mathcal{R}^{q,\alpha} are decreasing in α\alpha. The same holds for t,α\mathcal{R}^{t,\alpha}.

Once we develop the concept of grid entropy, we will easily relate these sets q,q,α,t,t,α\mathcal{R}^{q},\mathcal{R}^{q,\alpha},\mathcal{R}^{t},\mathcal{R}^{t,\alpha} to the sets of measures with finite grid entropy in direction qq, grid entropy at least α\alpha in direction qq, finite direction-free length tt grid entropy, direction-free length tt grid entropy at least α\alpha respectively.

2.4 Directed Metric Spaces

Grid entropy will turn out to be a directed metric with negative sign. We recall what that entails.

Definition.

A directed metric space with positive sign is a triple (M,d,+)(M,d,+) where MM is a vector space, d:M2(,+]d:M^{2}\rightarrow(-\infty,+\infty] is a distance function satisfying d(x,x)=0d(x,x)=0 and the usual triangle inequality d(x,y)+d(y,z)d(x,z)d(x,y)+d(y,z)\geq d(x,z). A directed metric space with negative sign is a triple (M,d,)(M,d,-) such that (M,d,+)(M,-d,+) is a directed metric space with positive sign.

Remark.

Standard metric spaces are clearly examples of directed metric spaces with positive sign. However, directed metric spaces with positive sign might not be metric spaces: the distance dd might not be symmetric and might not be positive and finite for non-equal arguments. The ”directed” in the name indicates the possibility for asymmetry.

Certain directed metrics give rise to directed norms in the same way certain metrics give rise to norms.

Definition.

If (M,d,σ)(M,d,\sigma) is a directed metric with positive/negative sign such that it is translation-invariant and homogeneous with respect to positive factors, then it induces a directed norm with positive/negative sign given by

x:=d(0,x)||x||:=d(\vec{0},x)

Of particular interest to us are directed norms defined in terms of the empirical measures we observe along paths between points. In the fixed direction case, our directed norms will be defined on the space of tuples consisting of a point in D{\mathbb{R}}^{D} (the ”direction” we are observing) and a finite Borel measure on {\mathbb{R}} (the target measure we want the empirical measures to be near). In the direction-free case, our directed norms will just be defined on the space +\mathcal{M}_{+} of finite Borel measure on {\mathbb{R}}.

2.5 The Subadditive Ergodic Theorem

The key theorem we use to prove the existence of the scaling limit of these directed metrics is Liggett’s improved version of Kingman’s Subadditive Ergodic Theorem. Before stating this theorem, we recall the definitions of stationary sequences and ergodicity, as presented in [Dur19, Sect. 7].

Definition.

A sequence (Yn)n1(Y_{n})_{n\geq 1} of random variables is called stationary if the joint distributions of the shifted sequences {Yk+n:n1}\{Y_{k+n}:n\geq 1\} is not dependent on k0k\geq 0.

As it turns out, the sequence of random variables we are interested in is a sequence of i.i.d. (Yn)n1(Y_{n})_{n\geq 1}, which clearly is stationary.

Definition.

Let (Ω,,P)(\Omega,\mathcal{F},P) be a probability space and let T:ΩΩT:\Omega\rightarrow\Omega be a map. TT is said to be measure-preserving if P(T1(A))=P(A)AP(T^{-1}(A))=P(A)\ \forall A\in\mathcal{F}. TT is said to be ergodic if it is measure-preserving and if all TT-invariant measurable sets are trivial, i.e. P(A){0,1}P(A)\in\{0,1\} whenever AA\in\mathcal{F} and T1(A)=AT^{-1}(A)=A.

In the context of sequences, we look at the space Ω=\Omega={\mathbb{R}}^{\infty} of infinite sequences of real numbers with the σ\sigma-algebra \mathcal{B}_{\infty} generated by

{{(y1,y2,)Ω:ynB}:n1,B()}\{\{(y_{1},y_{2},\ldots)\in\Omega:y_{n}\in B\}:n\geq 1,B\in\mathcal{B}({\mathbb{R}})\}

(where ()\mathcal{B}({\mathbb{R}}) is the Borel σ\sigma-algebra on {\mathbb{R}}) and with the product probability measure μ\mu_{\infty} on \mathcal{B}_{\infty} determined by

μ(B1×B2××Bn××)=i=1nμ(Bi)\mu_{\infty}(B_{1}\times B_{2}\times\ldots\times B_{n}\times{\mathbb{R}}\times\cdots)=\prod_{i=1}^{n}\mu(B_{i})

where Bi()B_{i}\in\mathcal{B}({\mathbb{R}}) and μ\mu is a Borel probability measure on {\mathbb{R}}. We consider the shift operator T:ΩΩT:\Omega\rightarrow\Omega given by T(y1,y2,y3,)=T(y2,y3,)T(y_{1},y_{2},y_{3},\ldots)=T(y_{2},y_{3},\ldots). TT is easily seen to be measure-preserving with respect to μ\mu_{\infty} since μ(T1(A))=μ(A)\mu_{\infty}(T^{-1}(A))=\mu_{\infty}(A) for the generating sets AA of \mathcal{B}_{\infty}. When we refer to the ergodicity of a sequence of random variables, we mean the ergodicity of this shift operator.

In our case, where we have a sequence of i.i.d. (Yn)n1(Y_{n})_{n\geq 1}, the corresponding shift operator is ergodic. Indeed, if T1(A)=AT^{-1}(A)=A then

(Y1,Y2,)ATn1(Y1,Y2,)=(Yn,Yn+1,)An1(Y_{1},Y_{2},\ldots)\in A\Leftrightarrow T^{n-1}(Y_{1},Y_{2},\ldots)=(Y_{n},Y_{n+1},\ldots)\in A\ \forall n\geq 1

so AA is in the tail σ\sigma-field n1σ(Yn,Yn+1,)\bigcap\limits_{n\geq 1}\sigma(Y_{n},Y_{n+1},\ldots) and thus μ(A){0,1}\mu_{\infty}(A)\in\{0,1\} by Kolmogorov’s 0-1 Law.

We are now ready to state the Subadditive Ergodic Theorem in the form we need.

Theorem 8 (Kingman’s Subadditive Ergodic Theorem, [Lig85]).

Suppose (Ym,n)0m<n(Y_{m,n})_{0\leq m<n} are random variables satisfying

  1. (i)

    \exists constant CC s.t. E|Y0,n|<E|Y_{0,n}|<\infty and EY0,nCnEY_{0,n}\geq Cn for all nn

  2. (ii)

    k1\forall k\geq 1, {Ynk,(n+1)k:n1}\{Y_{nk,(n+1)k}:n\geq 1\} is a stationary process

  3. (iii)

    The joint distributions of {Ym,m+k:k1}\{Y_{m,m+k}:k\geq 1\} are not dependent on mm

  4. (iv)

    Y0,m+nY0,m+Ym,m+nm,n>0Y_{0,m+n}\leq Y_{0,m}+Y_{m,m+n}\ \forall m,n>0

Then

  1. (a)

    limnEY0,nn=infm0EY0,mm:=γ\lim\limits_{n\rightarrow\infty}\frac{EY_{0,n}}{n}=\inf\limits_{m\geq 0}\frac{EY_{0,m}}{m}:=\gamma

  2. (b)

    Y:=limnY0,nnY:=\lim\limits_{n\rightarrow\infty}\frac{Y_{0,n}}{n} exists a.s. and in L1L^{1}, and EY=γEY=\gamma

  3. (c)

    If the stationary sequences in (ii) are ergodic, then Y=γY=\gamma a.s.

Remark.

We may replace Ym,nY_{m,n} with Ym,n-Y_{m,n} in the statement of the theorem to obtain a version for superadditive sequences.

This theorem is the basis for the construction of grid entropy in our paper.

2.6 Relative Entropy and Sanov’s Theorem

In the next preliminary section, we recall the basics of the Kullback-Leibler divergence (introduced in [KL51]) and Sanov’s Theorem for large deviations. We later use this theorem to establish a relationship between our grid entropy and this notion of relative entropy.

Definition.

Let P,QP,Q be distributions on our inherent metric space XX. The Kullback-Leibler divergence or relative entropy of QQ from PP is defined to be

DKL(P||Q)={XlogfdP=XflogfdQ,PQ+,otherwiseD_{KL}(P||Q)=\begin{cases}\int_{X}\log f\ dP=\int_{X}f\log f\ dQ,&P\ll Q\\ +\infty,&\mbox{otherwise}\end{cases}

where f:=dPdQf:=\frac{dP}{dQ} is the Radon-Nikodym derivative and log\log is the natural logarithm.

Remark.

[KL51] also derive several basic properties such as DKLD_{KL} being a pre-metric. [Pos75] shows that DKLD_{KL} is lower semicontinuous, in the sense that, given probability distributions PnPP_{n}\Rightarrow P and QnQQ_{n}\Rightarrow Q, we have

DKL(P||Q)lim infnDKL(Pn||Qn)D_{KL}(P||Q)\leq\liminf_{n\rightarrow\infty}D_{KL}(P_{n}||Q_{n})

Our main interest in relative entropy is that it is the rate function for large deviations of empirical measures. This is captured by Sanov’s Theorem.

Theorem 9 (Sanov’s Theorem, [DS01]).

Consider a sequence of i.i.d. random variables XiθX_{i}\sim\theta taking values in a set XX. Let μn=i=1nδXi\mu_{n}=\sum_{i=1}^{n}\delta_{X_{i}} be their unnormalized empirical measures. Then for any weakly closed set F1F\subset\mathcal{M}_{1} we have

lim supn1nlogP(1nμnF)infξFDKL(ξ||θ)\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{n}\in F\bigg{)}\leq-\inf_{\xi\in F}D_{KL}(\xi||\theta)

and for any weakly open set G1G\subset\mathcal{M}_{1} we have

lim infn1nlogP(1nμnU)infξGDKL(ξ||θ)\liminf_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{n}\in U\bigg{)}\geq-\inf_{\xi\in G}D_{KL}(\xi||\theta)
Remark.

Since FF is closed then the infimum is achieved by some ξF\xi\in F. Furthermore, if θF\theta\in F then the right-hand side of the inequality is 0, which gives us no information; however, if θF\theta\notin F, then the theorem gives an exponential bound on large deviations.

3 Grid Entropy as a Directed Norm

3.1 The Plan for Deriving Direction-Fixed Grid Entropy

For the purposes of Section 3, we temporarily forget our original definition (1) of grid entropy and rederive it as a limit of scaled directed metrics.

To summarize the setting we described in section 2, we consider empirical measures 1nμπ\frac{1}{n}\mu_{\pi} along NE-paths on the lattice D{\mathbb{Z}}^{D}, where the edges have weights τe=τ(Ue)\tau_{e}=\tau(U_{e}) for some measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}} and UeU_{e} are i.i.d. Unif[0,1] random variables. ,+,t\mathcal{M},\mathcal{M}_{+},\mathcal{M}_{t} denote the spaces of finite, finite non-negative, and finite non-negative with total mass t0t\geq 0 Borel measures respectively.

We begin with direction-fixed grid entropy. We wish to count the number of paths with empirical measure very close to the target ν\nu. Let us try to define a distance on D×+\mathbb{{\mathbb{R}}}^{D}\times\mathcal{M}_{+} by

d((p,ξ),(q,ν))=log#{paths π:pq with μπ=νξ}d((p,\xi),(q,\nu))=\log\#\{\mbox{paths $\pi:\lfloor p\rfloor\rightarrow\lfloor q\rfloor$ with $\mu_{\pi}=\nu-\xi$}\}

Note that this is -\infty if there are no such paths and it is 0 if p=q,ξ=νp=q,\xi=\nu (since there is exactly one path π:pp\pi:\lfloor p\rfloor\rightarrow\lfloor p\rfloor, and it has empirical measure 0).

One glaring issue is that we are only counting paths that have exactly the target empirical measure. Since the Lebesgue measure on [0,1] is continuous, then almost surely the UeU_{e} have different values, and thus the unnormalized empirical measures uniquely determine the paths π\pi. It follows that almost surely d((p,ξ),(q,ν))d((p,\xi),(q,\nu)) is always either -\infty or 0. We need to change our definition of dd to count paths with an empirical measure ”close” to νξ\nu-\xi instead.

Another problem is that we wish to apply the Subadditive Ergodic Theorem to learn about the behavior of

d((np,nξ),(nq,nν))n\frac{d((np,n\xi),(nq,n\nu))}{n}

as nn\rightarrow\infty. Thus we must also change our definition of dd so that it is integrable (and in particular finite a.s.) when qp0Dq-p\in{\mathbb{R}}_{\geq 0}^{D}.

The trick is to replace the counting of the paths exhibiting the exact target empirical measures with a ”cost function” that attributes an exponential cost to each path based on how far its empirical measure is from the target. We also introduce a parameter ϵ\epsilon that, as it decreases to 0, takes the cost function towards the counting of the paths with the target empirical measure we tried initially.

Definition.

Fix ϵ>0\epsilon>0. Define a distance on D×+\mathbb{R}^{D}\times\mathcal{M}_{+} by

dϵ((p,ξ),(q,ν))=logπ𝒫(p,q)e1ϵρ(μπ,νξ)d^{\epsilon}((p,\xi),(q,\nu))=\log\sum_{\pi\in\mathcal{P}(\lfloor p\rfloor,\lfloor q\rfloor)}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},\nu-\xi)}

where ρ\rho is the Levy-Prokhorov metric and where we sum over all NE paths π:pq\pi:\lfloor p\rfloor\rightarrow\lfloor q\rfloor.

Remark.

This distance is 0 if p=q,ξ=νp=q,\xi=\nu and -\infty if and only if qp0Dq-p\notin{\mathbb{R}}_{\geq 0}^{D}.

Remark.

For any path π:pq\pi:\lfloor p\rfloor\rightarrow\lfloor q\rfloor, the corresponding empirical measure μπ\mu_{\pi} observed must necessarily be of the form μπ=i=1|π|δai\mu_{\pi}=\sum\limits_{i=1}^{|\pi|}\delta_{a_{i}} where |π|=qp1|\pi|=||\lfloor q\rfloor-\lfloor p\rfloor||_{1}.

Remark.

As ϵ0\epsilon\rightarrow 0 the costs e1ϵρ(μπ,νξ)e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},\nu-\xi)} converge to the indicators

𝟏νξ(μπ)=𝟏νξqp1(1|π|μπ)\mathbf{1}_{\nu-\xi}(\mu_{\pi})=\mathbf{1}_{\frac{\nu-\xi}{||\lfloor q\rfloor-\lfloor p\rfloor||_{1}}}\bigg{(}\frac{1}{|\pi|}\mu_{\pi}\bigg{)}

Hence the sum of the costs approaches the number of paths with empirical measures 1|π|μπ\frac{1}{|\pi|}\mu_{\pi} precisely equal to νξqp1\frac{\nu-\xi}{||\lfloor q\rfloor-\lfloor p\rfloor||_{1}}.

With this definition in mind, let us discuss the plan of attack.

First, we prove the existence of limndϵ((np,nξ),(nq,nν))n\lim\limits_{n\rightarrow\infty}\frac{d^{\epsilon}((np,n\xi),(nq,n\nu))}{n} using the Subadditive Ergodic Theorem and some estimates we derive for the error terms when p,qp,q do not have integer coordinates. Then we take the infimum over ϵ>0\epsilon>0 of these limits and we define the resulting norm to be our grid entropy.

Theorem 10.

Fix ϵ>0,ν,ξ+\epsilon>0,\nu,\xi\in\mathcal{M}_{+} and p,qDp,q\in{\mathbb{R}}^{D}. Then

dϵ((np,nξ),(nq,nν))n\frac{d^{\epsilon}((np,n\xi),(nq,n\nu))}{n}

converges in probability to a constant. When p=0p=\vec{0}, the convergence is pointwise a.s.

Remark.

The theorem holds trivially, with the limit being -\infty, if qp0Dq-p\notin{\mathbb{R}}^{D}_{\geq 0} or if q=pq=p and νξ\nu\neq\xi. It also holds trivially, with the limit being 0, if q=pq=p and ν=ξ\nu=\xi.

The limit given by this theorem is a directed metric with negative sign on D×+{\mathbb{R}}^{D}\times\mathcal{M}_{+}. When we take an infimum over ϵ>0\epsilon>0 we still get a directed metric with negative sign.

Theorem 11.

For ϵ>0\epsilon>0, ν,ξ+\nu,\xi\in\mathcal{M}_{+} and p,qDp,q\in{\mathbb{R}}^{D} define

d~ϵ((p,ξ),(q,ν)):=limndϵ((np,nξ),(nq,nν))nandd~((p,ξ),(q,ν)):=infϵ>0d~ϵ((p,ξ),(q,ν))\widetilde{d}^{\epsilon}((p,\xi),(q,\nu)):=\lim\limits_{n\rightarrow\infty}\frac{d^{\epsilon}((np,n\xi),(nq,n\nu))}{n}\ \mbox{and}\ \widetilde{d}((p,\xi),(q,\nu)):=\inf_{\epsilon>0}\widetilde{d}^{\epsilon}((p,\xi),(q,\nu))

Then each d~ϵ\widetilde{d}^{\epsilon} as well as d~\widetilde{d} are directed metrics with negative sign on D×+{\mathbb{R}}^{D}\times\mathcal{M}_{+}.

We show that this metric d~\widetilde{d} gives rise to a norm on 0D×+{\mathbb{R}}^{D}_{\geq 0}\times\mathcal{M}_{+}. This will finish our discussion of the direction-fixed grid entropy and we will move on to the direction-free case.

3.2 The Limit Shape of dϵd^{\epsilon} Starting at (0,0)(\vec{0},0)

In this and the following subsection, we focus on the direction-fixed grid entropy.

To prove Theorem 10, we first prove a simplified version, which we later generalize easily.

Theorem 12.

Fix ϵ>0,ν+\epsilon>0,\nu\in\mathcal{M}_{+} and q0D{0}q\in{\mathbb{R}}^{D}_{\geq 0}\setminus\{\vec{0}\}. Then

Xnϵ,q,νn:=dϵ((0,0),(nq,nν))nXϵ,q,ν:=supnEXnϵ,q,νn=limnEXnϵ,q,νna.s.\frac{X_{n}^{\epsilon,q,\nu}}{n}:=\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n}\rightarrow X^{\epsilon,q,\nu}:=\sup\limits_{n}\frac{EX_{n}^{\epsilon,q,\nu}}{n}=\lim\limits_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q,\nu}}{n}\ \mbox{a.s.}
Remark.

As noted before, Theorem 10 holds trivially when q0D{0}q\notin{\mathbb{R}}^{D}_{\geq 0}\setminus\{\vec{0}\}, so we need not bother with this case.

We prove this theorem in stages, starting with the case when qq has integer coordinates. But first, we show a useful bound on our random variables Xnϵ,q,νX_{n}^{\epsilon,q,\nu}.

Lemma 13.

Let ϵ>0,ν,ξ+\epsilon>0,\nu,\xi\in\mathcal{M}_{+} and p,qDp,q\in{\mathbb{Z}}^{D} with qp0D{0}q-p\in{\mathbb{Z}}^{D}_{\geq 0}\setminus\{\vec{0}\}. Then

dϵ((p,ξ),(q,ν))[1ϵ(qp1+νξTV),qp1logD]d^{\epsilon}((p,\xi),(q,\nu))\in\bigg{[}-\frac{1}{\epsilon}(||q-p||_{1}+||\nu-\xi||_{TV}),||q-p||_{1}\log D\bigg{]}
Proof.

Recall that any path π:pq\pi:p\rightarrow q has qp1||q-p||_{1} edges, so μπTV=qp1||\mu_{\pi}||_{TV}=||q-p||_{1}, and that the total number of paths π:pq\pi:p\rightarrow q is

(qp1q1p1,,qDpD)[1,Dqp1]\binom{||q-p||_{1}}{q_{1}-p_{1},\cdots,q_{D}-p_{D}}\in[1,D^{||q-p||_{1}}]

Also, for any such π\pi we have by Lemma 3

ρ(μπ,νξ)[0,μπ(νξ)TV][0,qp1+νξTV]\rho(\mu_{\pi},\nu-\xi)\in[0,||\mu_{\pi}-(\nu-\xi)||_{TV}]\subseteq[0,||q-p||_{1}+||\nu-\xi||_{TV}]
e1ϵρ(μπ,νξ)[e1ϵ(qp1+νξTV),1]\Rightarrow e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},\nu-\xi)}\in[e^{-\frac{1}{\epsilon}(||q-p||_{1}+||\nu-\xi||_{TV})},1]

Thus

dϵ((p,ξ),(q,ν))=logπ𝒫(p,q)e1ϵρ(μπ,νξ)[1ϵ(qp1+νξTV),qp1logD]d^{\epsilon}((p,\xi),(q,\nu))=\log\sum_{\pi\in\mathcal{P}(p,q)}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},\nu-\xi)}\in\bigg{[}-\frac{1}{\epsilon}(||q-p||_{1}+||\nu-\xi||_{TV}),||q-p||_{1}\log D\bigg{]}

Lemma 14.

Theorem 12 holds with q0D{0}q\in{\mathbb{Z}}^{D}_{\geq 0}\setminus\{\vec{0}\}.

Proof.

We wish to use Kingman’s Subadditive Ergodic Theorem (Theorem 8) with

Ym,n:=dϵ((mq,mν),(nq,nν))mnY_{m,n}:=-d^{\epsilon}((mq,m\nu),(nq,n\nu))\ \forall m\leq n

Let us now check the conditions (i)-(iv).

By Lemma 13,

Y0,n=dϵ((0,0),(nq,nν))[nq1logD,1ϵ(nq1+nνTV)]Y_{0,n}=-d^{\epsilon}((\vec{0},0),(nq,n\nu))\in\bigg{[}-n||q||_{1}\log D,\frac{1}{\epsilon}(n||q||_{1}+n||\nu||_{TV})\bigg{]}

hence (i) holds.

Next, for every k1k\geq 1, the sequence

Ynk,(n+1)k=dϵ((nkq,nkν),((n+1)kq,(n+1)kν))Y_{nk,(n+1)k}=-d^{\epsilon}((nkq,nk\nu),((n+1)kq,(n+1)k\nu))

is i.i.d. because the distribution of the unnormalized empirical measures of paths π:nkq(n+1)kq\pi:nkq\rightarrow(n+1)kq is not dependent on nn (since the edge labels UeU_{e} are i.i.d.) so the distribution of the cost functions e1ϵρ(μπ,kν)e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},k\nu)} for π:nkq(n+1)kq\pi:nkq\rightarrow(n+1)kq is not dependent on nn. Thus (ii) holds. Furthermore, as discussed in section 2.5, Ynk,(n+1)kY_{nk,(n+1)k} being i.i.d. implies the sequence is ergodic.

Similarly, the joint distributions of

{Ym,m+k:k1}={dϵ((mq,mν),((m+k)q,(m+k)ν)):k1}\{Y_{m,m+k}:k\geq 1\}=\{-d^{\epsilon}((mq,m\nu),((m+k)q,(m+k)\nu)):k\geq 1\}

are not dependent on mm since the edge labels are i.i.d. and the distribution of empirical measures of paths in a rectangle on the lattice with the difference between the top right and bottom left corners being kqkq is independent of the location of the rectangle. Thus we have (iii).

It remains to show (iv), namely to show that given m,n>0,m,n>0,

dϵ((0,0),((m+n)q,(m+n)ν))dϵ((0,0),(mq,mν))+dϵ((mq,mν),((m+n)q,(m+n)ν))d^{\epsilon}((\vec{0},0),((m+n)q,(m+n)\nu))\geq d^{\epsilon}((\vec{0},0),(mq,m\nu))+d^{\epsilon}((mq,m\nu),((m+n)q,(m+n)\nu))

For any paths π:0mq\pi:\vec{0}\rightarrow mq and π:mq(m+n)q\pi^{\prime}:mq\rightarrow(m+n)q, we get a unique concatenation ππ:0mq(m+n)q\pi\cdot\pi^{\prime}:\vec{0}\rightarrow mq\rightarrow(m+n)q. Its empirical measure satisfies μππ=μπ+μπ\mu_{\pi\cdot\pi^{\prime}}=\mu_{\pi}+\mu_{\pi^{\prime}}. But the Levy-Prokhorov metric satisfies subadditivity by Lemma 4. Thus

ρ(μππ,(m+n)ν)ρ(μπ,mν)+ρ(μπ,nν)\rho(\mu_{\pi\cdot\pi^{\prime}},(m+n)\nu)\leq\rho(\mu_{\pi},m\nu)+\rho(\mu_{\pi^{\prime}},n\nu)

But then

Y0,mYm,m+n\displaystyle-Y_{0,m}-Y_{m,m+n} =logπ:0mqe1ϵρ(μπ,mν)+logπ:mq(m+n)qe1ϵρ(μπ,nν)\displaystyle=\log\sum_{\pi:\vec{0}\rightarrow mq}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},m\nu)}+\log\sum_{\pi^{\prime}:mq\rightarrow(m+n)q}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi^{\prime}},n\nu)}
=logπ:0mqπ:mq(m+n)qe1ϵρ(μπ,mν)1ϵρ(μπ,nν)\displaystyle=\log\sum_{\begin{subarray}{c}\pi:\vec{0}\rightarrow mq\\ \pi^{\prime}:mq\rightarrow(m+n)q\end{subarray}}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},m\nu)-\frac{1}{\epsilon}\rho(\mu_{\pi^{\prime}},n\nu)}
logπ:0mqπ:mp(m+n)qe1ϵρ(μππ,(m+n)ν)\displaystyle\leq\log\sum_{\begin{subarray}{c}\pi:\vec{0}\rightarrow mq\\ \pi^{\prime}:mp\rightarrow(m+n)q\end{subarray}}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi\cdot\pi^{\prime}},(m+n)\nu)}

Not all paths π′′′:0(m+n)q\pi^{\prime\prime\prime}:\vec{0}\rightarrow(m+n)q pass through mqmq so we can upper bound the expression above by removing this condition:

Y0,mYm,m+n\displaystyle-Y_{0,m}-Y_{m,m+n} logπ′′′:0(m+n)qe1ϵρ(μπ′′′,(m+n)ν)\displaystyle\leq\log\sum_{\pi^{\prime\prime\prime}:\vec{0}\rightarrow(m+n)q}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi^{\prime\prime\prime}},(m+n)\nu)}
=Y0,m+n\displaystyle=-Y_{0,m+n}

Thus we can apply the Subadditive Ergodic Theorem (Theorem 8) to get that

Y0,nn=Xnϵ,q,νn\frac{-Y_{0,n}}{n}=\frac{X_{n}^{\epsilon,q,\nu}}{n}

converges a.s. to the constant

Xϵ,q,ν:=supnEXnϵ,q,νn=limnEXnϵ,q,νn[1ϵmax(q1,ν),q1logD]X^{\epsilon,q,\nu}:=\sup\limits_{n}\frac{EX_{n}^{\epsilon,q,\nu}}{n}=\lim\limits_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q,\nu}}{n}\in\bigg{[}-\frac{1}{\epsilon}\max(||q||_{1},||\nu||),||q||_{1}\log D\bigg{]}

The next order of business is proving the theorem for qq with rational coordinates. We will find useful the following error estimate on how Xnϵ,q,νX_{n}^{\epsilon,q,\nu} changes when qq is perturbed in the SE direction and ν\nu is perturbed arbitrarily.

Lemma 15.

Fix q0D,ϵ>0q\in{\mathbb{R}}^{D}_{\geq 0},\epsilon>0 and ν,ξ+\nu,\xi\in\mathcal{M}_{+}. Then for any p0Dp\in{\mathbb{R}}^{D}_{\geq 0} with qp0Dq-p\in{\mathbb{R}}^{D}_{\geq 0},

Xnϵ,p,ξ1ϵ(nqp1+nνξTV+D)\displaystyle X_{n}^{\epsilon,p,\xi}-\frac{1}{\epsilon}(n||q-p||_{1}+n||\nu-\xi||_{TV}+D)
Xnϵ,p,ξ1ϵ(nqnp1+nρ(ν,ξ))\displaystyle\leq X_{n}^{\epsilon,p,\xi}-\frac{1}{\epsilon}(||\lfloor nq\rfloor-\lfloor np\rfloor||_{1}+n\rho(\nu,\xi))
Xnϵ,q,ν\displaystyle\leq X_{n}^{\epsilon,q,\nu}
Proof.

Fix any such pp. The inequality holds trivially if p=qp=q so we may assume qp1>0||q-p||_{1}>0. Then there is at least one path π:npnq\pi^{\prime}:\lfloor np\rfloor\rightarrow\lfloor nq\rfloor, and we fix it.

For any path π:0np\pi:\vec{0}\rightarrow\lfloor np\rfloor, we concatenate it with π\pi^{\prime} to get a unique path ππ:0nq\pi\cdot\pi^{\prime}:\vec{0}\rightarrow\lfloor nq\rfloor. Note that π\pi^{\prime} consists of nqnp1nqp1+D||\lfloor nq\rfloor-\lfloor np\rfloor||_{1}\leq n||q-p||_{1}+D edges so its empirical measure μπ\mu_{\pi^{\prime}} has Total Variation at most nqp1+Dn||q-p||_{1}+D. Thus ππ\pi\cdot\pi^{\prime} satisfies

ρ(μππ,nν)\displaystyle\rho(\mu_{\pi\cdot\pi^{\prime}},n\nu) ρ(μπ+μπ,μπ)+ρ(μπ,nξ)+ρ(nξ,nν)\displaystyle\leq\rho(\mu_{\pi}+\mu_{\pi^{\prime}},\mu_{\pi})+\rho(\mu_{\pi},n\xi)+\rho(n\xi,n\nu)
μπTV+ρ(μπ,nξ)+nρ(ν,ξ)\displaystyle\leq||\mu_{\pi^{\prime}}||_{TV}+\rho(\mu_{\pi},n\xi)+n\rho(\nu,\xi)
ρ(μπ,nξ)+(nqnp1+nρ(ν,ξ))\displaystyle\leq\rho(\mu_{\pi},n\xi)+(||\lfloor nq\rfloor-\lfloor np\rfloor||_{1}+n\rho(\nu,\xi))

It follows that

Xnϵ,p,ξ1ϵ(nqp1+nνξTV+D)\displaystyle X_{n}^{\epsilon,p,\xi}-\frac{1}{\epsilon}(n||q-p||_{1}+n||\nu-\xi||_{TV}+D)
Xnϵ,p,ξ1ϵ(||nqnp||1+nρ(ν,ξ))\displaystyle\leq X_{n}^{\epsilon,p,\xi}-\frac{1}{\epsilon}(||\lfloor nq\rfloor-\lfloor np\rfloor||_{1}+n\rho(\nu,\xi))
Xnϵ,q,ν\displaystyle\leq X_{n}^{\epsilon,q,\nu}

Using this lemma to approximate Xnϵ,q,νX_{n}^{\epsilon,q,\nu} for qD0q\in{\mathbb{Q}}^{D}_{\geq 0} in terms of Xnϵ,p,νX_{n}^{\epsilon,p,\nu} where pD0p\in{\mathbb{Z}}^{D}_{\geq 0}, we prove our limit theorem for qq with rational coordinates.

Lemma 16.

Theorem 12 holds for qD0{0}q\in{\mathbb{Q}}^{D}_{\geq 0}\setminus\{\vec{0}\}.

Proof.

Let q=(s1,,sD)tq=\frac{(s_{1},\ldots,s_{D})}{t} for si,t0s_{i},t\in{\mathbb{Z}}_{\geq 0}. First, compare Xtrϵ,q,ν,Xtr+1ϵ,q,ν,,Xtr+(t1)ϵ,q,ν,Xt(r+1)ϵ,q,νX_{tr}^{\epsilon,q,\nu},X_{tr+1}^{\epsilon,q,\nu},\ldots,X_{tr+(t-1)}^{\epsilon,q,\nu},X_{t(r+1)}^{\epsilon,q,\nu} for arbitrary rr using the previous lemma. The idea is that trqtrq has integer coordinates so Xtrϵ,q,νtr,Xt(r+1)ϵ,q,νtr\frac{X_{tr}^{\epsilon,q,\nu}}{tr},\frac{X_{t(r+1)}^{\epsilon,q,\nu}}{tr} have the desired limits by Lemma 14. The rest of the Xtr+jϵ,q,νX_{tr+j}^{\epsilon,q,\nu} are bounded above/below by Xt(r+1)ϵ,q,ν/Xtrϵ,q,νX_{t(r+1)}^{\epsilon,q,\nu}/X_{tr}^{\epsilon,q,\nu} respectively plus some small error terms which go to 0 as rr\rightarrow\infty.

Consider any 1jt1\leq j\leq t. Note that

Xtr+j1ϵ,q,ν=dϵ((0,0),((tr+j1)q,(tr+j1)ν))=Xtr+jϵ,tr+j1tr+jq,tr+j1tr+jνX_{tr+j-1}^{\epsilon,q,\nu}=d^{\epsilon}((\vec{0},0),((tr+j-1)q,(tr+j-1)\nu))=X_{tr+j}^{\epsilon,\frac{tr+j-1}{tr+j}q,\frac{tr+j-1}{tr+j}\nu}

and

||qtr+j1tr+jq||1+||νtr+j1tr+jν||TV=||q||1+||ν||TVtr+j\bigg{|}\bigg{|}q-\frac{tr+j-1}{tr+j}q\bigg{|}\bigg{|}_{1}+\bigg{|}\bigg{|}\nu-\frac{tr+j-1}{tr+j}\nu\bigg{|}\bigg{|}_{TV}=\frac{||q||_{1}+||\nu||_{TV}}{tr+j}

By Lemma 15,

Xtr+j1ϵ,q,ν=Xtr+jϵ,tr+j1tr+jq,tr+j1tr+jνXtr+jϵ,q,ν+1ϵ(||q||1+||ν||TV+D)X_{tr+j-1}^{\epsilon,q,\nu}=X_{tr+j}^{\epsilon,\frac{tr+j-1}{tr+j}q,\frac{tr+j-1}{tr+j}\nu}\leq X_{tr+j}^{\epsilon,q,\nu}+\frac{1}{\epsilon}(||q||_{1}+||\nu||_{TV}+D)

Thus

Xtrϵ,q,νrXtr+1ϵ,q,νr+||q||1+||ν||TV+DϵrXt(r+1)ϵ,q,νr+t(||q||1+||ν||TV+D)ϵr\begin{split}\frac{X_{tr}^{\epsilon,q,\nu}}{r}&\leq\frac{X_{tr+1}^{\epsilon,q,\nu}}{r}+\frac{||q||_{1}+||\nu||_{TV}+D}{\epsilon r}\\ &\leq\ldots\leq\frac{X_{t(r+1)}^{\epsilon,q,\nu}}{r}+\frac{t(||q||_{1}+||\nu||_{TV}+D)}{\epsilon r}\end{split} (4)

But tq0Dtq\in{\mathbb{Z}}_{\geq 0}^{D} so by Lemma 13, our limit theorem holds for Xtrϵ,q,ν=Xrϵ,tq,tνX_{tr}^{\epsilon,q,\nu}=X_{r}^{\epsilon,tq,t\nu}. That is,

Xtrϵ,q,νrsuprEXtrϵ,q,νr=limrEXtrϵ,q,νra.s.\frac{X_{tr}^{\epsilon,q,\nu}}{r}\rightarrow\sup_{r}\frac{EX_{tr}^{\epsilon,q,\nu}}{r}=\lim_{r\rightarrow\infty}\frac{EX_{tr}^{\epsilon,q,\nu}}{r}\ \mbox{a.s.} (5)

Taking expectations in (4), then taking the supremum/limit as rr\rightarrow\infty and using (5) we get

limrEXtr+jϵ,q,νr=suprEXtr+jϵ,q,νr=limrEXtrϵ,q,νr=suprEXtrϵ,q,νr0jt1\lim_{r\rightarrow\infty}\frac{EX_{tr+j}^{\epsilon,q,\nu}}{r}=\sup_{r}\frac{EX_{tr+j}^{\epsilon,q,\nu}}{r}=\lim_{r\rightarrow\infty}\frac{EX_{tr}^{\epsilon,q,\nu}}{r}=\sup_{r}\frac{EX_{tr}^{\epsilon,q,\nu}}{r}\ \forall 0\leq j\leq t-1 (6)

since the error terms in (4) go to 0. Similarly, taking the limit as rr\rightarrow\infty in (5) and using (4), (6) we get

limrXtr+jϵ,q,νr=suprEXtrϵ,q,νr=limrEXtr+jϵ,q,νr=suprEXtr+jϵ,q,νra.s.0jt1\lim_{r\rightarrow\infty}\frac{X_{tr+j}^{\epsilon,q,\nu}}{r}=\sup_{r}\frac{EX_{tr}^{\epsilon,q,\nu}}{r}=\lim_{r\rightarrow\infty}\frac{EX_{tr+j}^{\epsilon,q,\nu}}{r}=\sup_{r}\frac{EX_{tr+j}^{\epsilon,q,\nu}}{r}\ \mbox{a.s.}\ \forall 0\leq j\leq t-1

Multiplying everything by 1t\frac{1}{t} and using the fact that any nn can be written as tr+jtr+j, we get

Xϵ,q,νnnlimnEXnϵ,q,νn=supnEXnϵ,q,νna.s.\frac{X^{\epsilon,q,\nu}_{n}}{n}\rightarrow\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q,\nu}}{n}=\sup_{n}\frac{EX_{n}^{\epsilon,q,\nu}}{n}\ \mbox{a.s.}

Our next objective is to prove the full version of Theorem 12. We will need two short lemmas giving sufficient conditions for a.s. convergence of bounded random variables to the supremum of their expectations. The first of these lemmas looks at the case where we are given a particular lower bound for the liminf of all subsequences of our sequence.

Lemma 17.

Let YnY_{n} be uniformly bounded random variables. If a.s.

supnEYnlim infiYnifor all subsequences (ni)\sup_{n}EY_{n}\leq\liminf_{i\rightarrow\infty}Y_{n_{i}}\ \mbox{for all subsequences $(n_{i})$}

then limnYn=supnEYn=limnEYn\lim\limits_{n\rightarrow\infty}Y_{n}=\sup\limits_{n}EY_{n}=\lim\limits_{n\rightarrow\infty}EY_{n} a.s.

Proof.

First, taking expectations in

supnEYnlim infiYnia.s.\sup_{n}EY_{n}\leq\liminf_{i\rightarrow\infty}Y_{n_{i}}\ \mbox{a.s.}

and using Fatou’s Lemma, we get

supnEYnElim infiYnilim infiEYni\displaystyle\sup_{n}EY_{n}\leq E\liminf_{i\rightarrow\infty}Y_{n_{i}}\leq\liminf_{i\rightarrow\infty}EY_{n_{i}}
lim supiEYnisupiEYnisupnEYn\displaystyle\leq\limsup_{i\rightarrow\infty}EY_{n_{i}}\leq\sup_{i}EY_{n_{i}}\leq\sup_{n}EY_{n}

so all the inequalities are equalities and

supnEYn=limiEYni=lim infiYnia.s.\sup_{n}EY_{n}=\lim\limits_{i\rightarrow\infty}EY_{n_{i}}=\liminf_{i\rightarrow\infty}Y_{n_{i}}\ \mbox{a.s.}

This holds for all subsequences (ni)(n_{i}), including the full sequence. Thus a.s.

L:=supnEYn=limnEYn=lim infiYni for all subsequences (ni)L:=\sup_{n}EY_{n}=\lim\limits_{n\rightarrow\infty}EY_{n}=\liminf_{i\rightarrow\infty}Y_{n_{i}}\ \mbox{ for all subsequences $(n_{i})$}

hence YnLY_{n}\rightarrow L a.s.. ∎

The next lemma is about approximating a sequence of random variables from below by sequences that each converge to the supremum of their expectations.

Lemma 18.

Let YnY_{n} be uniformly bounded random variables s.t.

  1. (i)

    For every fixed kk, Yn,kYnY_{n,k}\leq Y_{n} for large enough nn

  2. (ii)

    For every fixed kk,

    limnYn,k=supnEYn,k=limnEYn,ka.s.\lim_{n\rightarrow\infty}Y_{n,k}=\sup_{n}EY_{n,k}=\lim_{n\rightarrow\infty}EY_{n,k}\ \mbox{a.s.}
  3. (iii)

    supnEYn=supnsupkEYn,k\sup\limits_{n}EY_{n}=\sup\limits_{n}\sup\limits_{k}EY_{n,k} Then limnYn=supnEYn=limnEYn\lim\limits_{n\rightarrow\infty}Y_{n}=\sup\limits_{n}EY_{n}=\lim\limits_{n\rightarrow\infty}EY_{n} a.s.

Proof.

We wish to prove the hypothesis of Lemma 17 for YnY_{n}. Let

={limnYn,ksupnEYn,kfor some k}\mathcal{F}=\bigg{\{}\lim_{n\rightarrow\infty}Y_{n,k}\neq\sup_{n}EY_{n,k}\ \mbox{for some $k$}\bigg{\}}

which has measure 0 by (ii). We claim that in the event C\mathcal{F}^{C} we have

supnEYnlim infiYnifor all subsequences (ni)\sup_{n}EY_{n}\leq\liminf_{i\rightarrow\infty}Y_{n_{i}}\ \mbox{for all subsequences $(n_{i})$}

Consider any subsequence (ni)(n_{i}) and any kk. By (i), Yni,kYnifor large enough iY_{n_{i},k}\leq Y_{n_{i}}\ \mbox{for large enough $i$}. Taking the lim inf\liminf over ii, and using the fact that we are in the event C\mathcal{F}^{C}, we get

supnEYn,k=limnYn,k=limiYni,klim infiYni\sup_{n}EY_{n,k}=\lim_{n\rightarrow\infty}Y_{n,k}=\lim_{i\rightarrow\infty}Y_{n_{i},k}\leq\liminf_{i\rightarrow\infty}Y_{n_{i}}

This holds for all kk. Taking the supremum over kk and using (iii), we get

supnEYn=supnsupkEYn,k=supksupnEYn,klim infiYni\sup_{n}EY_{n}=\sup_{n}\sup\limits_{k}EY_{n,k}=\sup\limits_{k}\sup_{n}EY_{n,k}\leq\liminf_{i\rightarrow\infty}Y_{n_{i}}

as desired. Applying Lemma 17, we get

limnEYn=supnEYn=limnYna.s.\lim_{n\rightarrow\infty}EY_{n}=\sup_{n}EY_{n}=\lim_{n\rightarrow\infty}Y_{n}\ \mbox{a.s.}

We can finally prove Theorem 12.

Proof of Theorem 12.


The case qD0{0}q\in{\mathbb{Q}}^{D}_{\geq 0}\setminus\{\vec{0}\} is handled by Lemma 16. So we may assume qD0q\notin{\mathbb{Q}}^{D}_{\geq 0}.

We construct a sequence pk0Dp_{k}\in{\mathbb{Q}}_{\geq 0}^{D} as follows. For every 1jD1\leq j\leq D, either qjq_{j}\in{\mathbb{Q}} and we pick (pk)j:=qj(p_{k})_{j}:=q_{j}, or qjq_{j}\notin{\mathbb{Q}} and we pick (pk)j0(p_{k})_{j}\in{\mathbb{Q}}_{\geq 0} s.t. (pk)jqj(p_{k})_{j}\uparrow q_{j}. It follows that pk0Dp_{k}\in{\mathbb{Q}}_{\geq 0}^{D} with pk+1pk,qpkD0kp_{k+1}-p_{k},q-p_{k}\in{\mathbb{Q}}^{D}_{\geq 0}\ \forall k. That is, p1,p2,,qp_{1},p_{2},\ldots,q forms a ”staircase” in D{\mathbb{R}}^{D}.

p1p_{1}p2p_{2}p3p_{3}p4p_{4}qq

The intuition is that Xnϵ,pk,νX_{n}^{\epsilon,p_{k},\nu} approximates Xnϵ,q,νX_{n}^{\epsilon,q,\nu} from below. Each Xnϵ,pk,νn\frac{X_{n}^{\epsilon,p_{k},\nu}}{n} converges in nn to the right limit a.s. by Lemma 16, and we use Lemma 18 to prove that Xnϵ,q,νn\frac{X_{n}^{\epsilon,q,\nu}}{n} converges a.s. to the right limit.

Let us now prove the hypothesis of Lemma 18 with

Yn,k:=Xnϵ,pk,νn||nqnpk||1ϵ,Yn:=Xnϵ,q,νnY_{n,k}:=\frac{X_{n}^{\epsilon,p_{k},\nu}}{n}-\frac{||\lfloor nq\rfloor-\lfloor np_{k}\rfloor||_{1}}{\epsilon},Y_{n}:=\frac{X_{n}^{\epsilon,q,\nu}}{n}

Note that these random variables are bounded by Lemma 13:

Yn,k,Yn[1ϵ(||q||1+||ν||TV)||q||1ϵ,||q||1logD]Y_{n,k},Y_{n}\in\bigg{[}-\frac{1}{\epsilon}(||q||_{1}+||\nu||_{TV})-\frac{||q||_{1}}{\epsilon},||q||_{1}\log D\bigg{]}

First, fix nn and consider any kk. By Lemma 15,

Xnϵ,pk,ν||nqnpk||1ϵXnϵ,q,ν=YnX_{n}^{\epsilon,p_{k},\nu}-\frac{||\lfloor nq\rfloor-\lfloor np_{k}\rfloor||_{1}}{\epsilon}\leq X_{n}^{\epsilon,q,\nu}=Y_{n}

so we have (i). Next, (ii) follows from Lemma 16:

limnYn,k=limnXnϵ,pk,ν||nqnpk||1ϵn=limnXnϵ,pk,νn||qpk||1ϵ\displaystyle\lim_{n\rightarrow\infty}Y_{n,k}=\lim_{n\rightarrow\infty}\frac{X_{n}^{\epsilon,p_{k},\nu}-\frac{||\lfloor nq\rfloor-\lfloor np_{k}\rfloor||_{1}}{\epsilon}}{n}=\lim_{n\rightarrow\infty}\frac{X_{n}^{\epsilon,p_{k},\nu}}{n}-\frac{||q-p_{k}||_{1}}{\epsilon}
=supnEXnϵ,pk,νn||qpk||1ϵ=limnEXnϵ,pk,νn||qpk||1ϵ\displaystyle=\sup_{n}\frac{EX_{n}^{\epsilon,p_{k},\nu}}{n}-\frac{||q-p_{k}||_{1}}{\epsilon}=\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,p_{k},\nu}}{n}-\frac{||q-p_{k}||_{1}}{\epsilon}
=limnEYn,k=supnEYn,k\displaystyle=\lim_{n\rightarrow\infty}EY_{n,k}=\sup_{n}EY_{n,k}

It remains to show (iii). Consider any fixed nn. For each 1jD1\leq j\leq D either qjq_{j}\in{\mathbb{Q}} and qj=(pk)jkq_{j}=(p_{k})_{j}\ \forall k, or nqjnq_{j}\notin{\mathbb{Z}} and (pk)jqj(p_{k})_{j}\uparrow q_{j} hence nqj=n(pk)j\lfloor nq_{j}\rfloor=\lfloor n(p_{k})_{j}\rfloor for large enough kk. Thus Xnϵ,pk,ν=Xnϵ,q,νX_{n}^{\epsilon,p_{k},\nu}=X_{n}^{\epsilon,q,\nu} for large enough kk. It follows that

Yn,k=Xnϵ,pk,ν||nqnpk||1ϵXnϵ,q,ν=YnaskY_{n,k}=X_{n}^{\epsilon,p_{k},\nu}-\frac{||\lfloor nq\rfloor-\lfloor np_{k}\rfloor||_{1}}{\epsilon}\uparrow X_{n}^{\epsilon,q,\nu}=Y_{n}\ \mbox{as}\ k\rightarrow\infty

By the Bounded Convergence Theorem, EYn,kEYnEY_{n,k}\uparrow EY_{n}. Thus

limkEYn,k=supkEYn,k=EYnn\lim_{k\rightarrow\infty}EY_{n,k}=\sup_{k}EY_{n,k}=EY_{n}\ \forall n

so taking the supremum over nn we get (iii):

supnsupkEYn,k=supnEYn\sup_{n}\sup_{k}EY_{n,k}=\sup_{n}EY_{n}

Applying Lemma 18, we get

limnYn=limnEYn=supnEYna.s.\lim_{n\rightarrow\infty}Y_{n}=\lim_{n\rightarrow\infty}EY_{n}=\sup_{n}EY_{n}\ \mbox{a.s.}

i.e.

limnXnϵ,q,νn=limnEXnϵ,q,νn=supnEXnϵ,q,νna.s.\lim_{n\rightarrow\infty}\frac{X_{n}^{\epsilon,q,\nu}}{n}=\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q,\nu}}{n}=\sup_{n}\frac{EX_{n}^{\epsilon,q,\nu}}{n}\ \mbox{a.s.}

This completes the proof of the existence of the limit shape of dϵd^{\epsilon} when measuring the distance from (0,0)D×+(\vec{0},0)\in{\mathbb{R}}^{D}\times\mathcal{M}_{+}.

3.3 The Limit Shape of dϵd^{\epsilon}—The General Case

We use Lemma 18 to generalize Theorem 12 to Theorem 10, where dϵd^{\epsilon} measures distance between any two elements of D×+{\mathbb{R}}^{D}\times\mathcal{M}_{+}.

Theorem 10.

Fix ϵ>0,ν,ξ+\epsilon>0,\nu,\xi\in\mathcal{M}_{+} and p,qDp,q\in{\mathbb{R}}^{D}. Then

dϵ((np,nξ),(nq,nν))n\frac{d^{\epsilon}((np,n\xi),(nq,n\nu))}{n}

converges in probability to a constant. When p=0p=\vec{0}, the convergence is pointwise a.s.

Remark.

During the course of the proof, we also show that the limit shape of
dϵ((np,nξ),(nq,nν))n\frac{d^{\epsilon}((np,n\xi),(nq,n\nu))}{n} is translation-invariant. This will help us later.

Proof.

As noted before, the theorem holds trivially if qpD0{0}q-p\notin{\mathbb{R}}^{D}_{\geq 0}\setminus\{\vec{0}\}. Thus we may assume qpD0{0}q-p\in{\mathbb{R}}^{D}_{\geq 0}\setminus\{\vec{0}\}.

Observe that

dϵ((np,nα),(nq,nν))=dϵ((np,0),(nq,nνnξ))=ddϵ((0,0),(nqnp,nνnξ))\begin{split}d^{\epsilon}((np,n\alpha),(nq,n\nu))&=d^{\epsilon}((np,0),(nq,n\nu-n\xi))\\ &=^{d}d^{\epsilon}((\vec{0},0),(\lfloor nq\rfloor-\lfloor np\rfloor,n\nu-n\xi))\end{split} (7)

so it suffices to assume ξ=0\xi=0 and prove

Xnϵ,nqnpn,νn=dϵ((0,0),(nqnp,nν))n\frac{X_{n}^{\epsilon,\frac{\lfloor nq\rfloor-\lfloor np\rfloor}{n},\nu}}{n}=\frac{d^{\epsilon}((\vec{0},0),(\lfloor nq\rfloor-\lfloor np\rfloor,n\nu))}{n}

converge a.s. to a constant.

Our argument mirrors the one used in the proof of Theorem 12: we approximate these YnY_{n} from below by some Yn,kY_{n,k} converging in kk and we apply Lemma 18.

First we prove some inequalities. Fix nn. Observe that for 1jD1\leq j\leq D,

n(qjpj)1<n(qjpj)n(qjpj)n(q_{j}-p_{j})-1<\lfloor n(q_{j}-p_{j})\rfloor\leq n(q_{j}-p_{j})
n(qjpj)1<nqjnpj<n(qjpj)+1n(q_{j}-p_{j})-1<\lfloor nq_{j}\rfloor-\lfloor np_{j}\rfloor<n(q_{j}-p_{j})+1
(nqjnpj)n(qjpj){0,1}\Rightarrow(\lfloor nq_{j}\rfloor-\lfloor np_{j}\rfloor)-\lfloor n(q_{j}-p_{j})\rfloor\in\{0,1\} (8)

and thus

(nqnp)n(qp)D0with||(nqnp)n(qp)||1D(\lfloor nq\rfloor-\lfloor np\rfloor)-\lfloor n(q-p)\rfloor\in{\mathbb{R}}^{D}_{\geq 0}\ \mbox{with}\ ||(\lfloor nq\rfloor-\lfloor np\rfloor)-\lfloor n(q-p)\rfloor||_{1}\leq D (9)

On the other hand, for 1jD1\leq j\leq D s.t. pjqjp_{j}\neq q_{j},

nqjnpjn(qjpj)+1(n+c)(qjpj)\lfloor nq_{j}\rfloor-\lfloor np_{j}\rfloor\leq n(q_{j}-p_{j})+1\leq(n+c)(q_{j}-p_{j})

for c:=max1jDs.t.pjqj(1qjpj)c:=\bigg{\lceil}\max\limits_{1\leq j\leq D\ \mbox{s.t.}\ p_{j}\neq q_{j}}\bigg{(}\frac{1}{q_{j}-p_{j}}\bigg{)}\bigg{\rceil}. Thus

0(n+c)(qjpj)(nqjnpj)c(qjpj)by(8)0\leq\lfloor(n+c)(q_{j}-p_{j})\rfloor-(\lfloor nq_{j}\rfloor-\lfloor np_{j}\rfloor)\leq c(q_{j}-p_{j})\ \mbox{by}\ \eqref{4}

This equation also holds trivially for jj s.t. pj=qjp_{j}=q_{j}. Thus

(n+c)(qp)(nqnp)D0with||(n+c)(qp)(nqnp)||1c||qp||1\lfloor(n+c)(q-p)\rfloor-(\lfloor nq\rfloor-\lfloor np\rfloor)\in{\mathbb{R}}^{D}_{\geq 0}\ \mbox{with}\ ||\lfloor(n+c)(q-p)\rfloor-(\lfloor nq\rfloor-\lfloor np\rfloor)||_{1}\leq c||q-p||_{1} (10)

We prove the hypothesis of Lemma 18 with

Yn,k:=Xnϵ,qp,νnϵkn,Yn:=Xnϵ,nqnpn,νnY_{n,k}:=\frac{X_{n}^{\epsilon,q-p,\nu}-\frac{n}{\epsilon k}}{n},Y_{n}:=\frac{X_{n}^{\epsilon,\frac{\lfloor nq\rfloor-\lfloor np\rfloor}{n},\nu}}{n}

First, by Lemma 15 and (9), for every fixed kk, for large enough nn we have

Xnϵ,qp,νnϵkn\displaystyle\frac{X_{n}^{\epsilon,q-p,\nu}-\frac{n}{\epsilon k}}{n} Xnϵ,qp,νDϵn\displaystyle\leq\frac{X_{n}^{\epsilon,q-p,\nu}-\frac{D}{\epsilon}}{n}
Xnϵ,qp,ν1ϵ||(nqnp)n(qp)||1n\displaystyle\leq\frac{X_{n}^{\epsilon,q-p,\nu}-\frac{1}{\epsilon}||(\lfloor nq\rfloor-\lfloor np\rfloor)-\lfloor n(q-p)\rfloor||_{1}}{n}
Xnϵ,nqnpn,νn\displaystyle\leq\frac{X_{n}^{\epsilon,\frac{\lfloor nq\rfloor-\lfloor np\rfloor}{n},\nu}}{n}

i.e. Yn,kYnY_{n,k}\leq Y_{n} which is precisely (i). Also, Theorem 12 gives (ii): for every fixed kk, a.s.

limnYn,k=limnXnϵ,qp,νnϵkn=supnEXnϵ,qp,νn1ϵk=limnEXnϵ,qp,νn1ϵk\lim_{n\rightarrow\infty}Y_{n,k}=\lim_{n\rightarrow\infty}\frac{X_{n}^{\epsilon,q-p,\nu}-\frac{n}{\epsilon k}}{n}=\sup_{n}\frac{EX_{n}^{\epsilon,q-p,\nu}}{n}-\frac{1}{\epsilon k}=\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q-p,\nu}}{n}-\frac{1}{\epsilon k}
=supnEYn,k=limnEYn,k=\sup_{n}EY_{n,k}=\lim_{n\rightarrow\infty}EY_{n,k}

Note that this implies

supnEXnϵ,qp,νn=supnsupkEYn,k=supksupnEYn,k=supklimnEYn,k=limnEXnϵ,qp,νn\sup_{n}\frac{EX_{n}^{\epsilon,q-p,\nu}}{n}=\sup_{n}\sup_{k}EY_{n,k}=\sup_{k}\sup_{n}EY_{n,k}=\sup_{k}\lim_{n\rightarrow\infty}EY_{n,k}=\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q-p,\nu}}{n} (11)

It remains to show (iii). Note that taking expectations in (i) immediately gives

supnsupkEYn,k=supksupnEYn,ksupEYn\sup_{n}\sup_{k}EY_{n,k}=\sup_{k}\sup_{n}EY_{n,k}\leq\sup EY_{n} (12)

We show this is an equality. By Lemma 15 and (10), for every fixed kk, for large enough nn we have

Xnϵ,nqnpn,ν\displaystyle X_{n}^{\epsilon,\frac{\lfloor nq\rfloor-\lfloor np\rfloor}{n},\nu} Xnϵ,n+cn(qp),n+cnν+1ϵ(||(n+c)(qp)(nqnp)||1+c||ν||TV)\displaystyle\leq X_{n}^{\epsilon,\frac{n+c}{n}(q-p),\frac{n+c}{n}\nu}+\frac{1}{\epsilon}\bigg{(}||\lfloor(n+c)(q-p)\rfloor-(\lfloor nq\rfloor-\lfloor np\rfloor)||_{1}+c||\nu||_{TV}\bigg{)}
Xn+cϵ,(qp),ν+c||qp||1+c||ν||TVϵ\displaystyle\leq X_{n+c}^{\epsilon,(q-p),\nu}+\frac{c||q-p||_{1}+c||\nu||_{TV}}{\epsilon}

Dividing by nn, taking expectations and taking the supremum over nn we get by (11)

supnEYnsupnEXn+cϵ,qp,νn=limnEXnϵ,qp,νn=supnsupkEYn,k\sup_{n}EY_{n}\leq\sup_{n}\frac{EX_{n+c}^{\epsilon,q-p,\nu}}{n}=\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q-p,\nu}}{n}=\sup_{n}\sup_{k}EY_{n,k} (13)

which when combined with (12) gives (iii).

Applying Lemma 18, we get

limnYn=limnEYn=supnEYna.s.\lim_{n\rightarrow\infty}Y_{n}=\lim_{n\rightarrow\infty}EY_{n}=\sup_{n}EY_{n}\ \mbox{a.s.}

i.e.

limnXnϵ,nqnpnn=supnEXnϵ,nqnpnn=limnEXnϵ,nqnpnna.s.\lim\limits_{n\rightarrow\infty}\frac{X_{n}^{\epsilon,\frac{\lfloor nq\rfloor-\lfloor np\rfloor}{n}}}{n}=\sup\limits_{n}\frac{EX_{n}^{\epsilon,\frac{\lfloor nq\rfloor-\lfloor np\rfloor}{n}}}{n}=\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,\frac{\lfloor nq\rfloor-\lfloor np\rfloor}{n}}}{n}\ \mbox{a.s.}

Furthermore, by (11), (12) and (13), we have

supnEYn=supnsupkEYn,k=supnEXnϵ,qp,νn=limnEXnϵ,qp,νn\sup_{n}EY_{n}=\sup_{n}\sup_{k}EY_{n,k}=\sup_{n}\frac{EX_{n}^{\epsilon,q-p,\nu}}{n}=\lim_{n\rightarrow\infty}\frac{EX_{n}^{\epsilon,q-p,\nu}}{n}

Combining this with (7), we get that dϵ((np,nξ),(nq,nν))n\frac{d^{\epsilon}((np,n\xi),(nq,n\nu))}{n} converges in probability to the a.s. limit of dϵ((0,0),(nqnp,nνnξ))n\frac{d^{\epsilon}((\vec{0},0),(nq-np,n\nu-n\xi))}{n}.

This completes the proof of Theorem 10 and of translation-invariance of the limit shape.

3.4 Grid Entropy as a Directed Norm

In the previous sections we showed the existence of the limit shape of dϵd^{\epsilon}. We now take the infimum as ϵ0\epsilon\downarrow 0 and we show that the result is a directed metric with negative sign that gives rise to a norm, which we call grid entropy.

Theorem 12.

For ϵ>0\epsilon>0, ν,ξ+\nu,\xi\in\mathcal{M}_{+} and p,qDp,q\in{\mathbb{R}}^{D} define

d~ϵ((p,ξ),(q,ν)):=limndϵ((np,nξ),(nq,nν))n,and\widetilde{d}^{\epsilon}((p,\xi),(q,\nu)):=\lim\limits_{n\rightarrow\infty}\frac{d^{\epsilon}((np,n\xi),(nq,n\nu))}{n},\ \mbox{and}
d~((p,ξ),(q,ν)):=infϵ>0d~ϵ((p,ξ),(q,ν))[,)\widetilde{d}((p,\xi),(q,\nu)):=\inf_{\epsilon>0}\widetilde{d}^{\epsilon}((p,\xi),(q,\nu))\in[-\infty,\infty)

Then each d~ϵ\widetilde{d}^{\epsilon} as well as d~\widetilde{d} are directed metrics with negative sign on D×+{\mathbb{R}}^{D}\times\mathcal{M}_{+}.

Remark.

For any p,qD,ϵ>0p,q\in{\mathbb{R}}^{D},\epsilon>0 and ν,ξ+\nu,\xi\in\mathcal{M}_{+}, dϵ((np,nξ),(nq,nν))d^{\epsilon}((np,n\xi),(nq,n\nu)) is monotone decreasing as ϵ0\epsilon\downarrow 0 so

d~((p,ξ),(q,ν))=infϵ>0d~ϵ((p,ξ),(q,ν))=limϵ0d~ϵ((p,ξ),(q,ν))\widetilde{d}((p,\xi),(q,\nu))=\inf_{\epsilon>0}\widetilde{d}^{\epsilon}((p,\xi),(q,\nu))=\lim_{\epsilon\downarrow 0}\widetilde{d}^{\epsilon}((p,\xi),(q,\nu))

By Lemma 13, for every nn and ϵ>0\epsilon>0,

dϵ((np,nξ),(nq,nν))[,||nqnp||1logD]d^{\epsilon}((np,n\xi),(nq,n\nu))\in[-\infty,||\lfloor nq\rfloor-\lfloor np\rfloor||_{1}\log D]

so it follows that

d~((p,ξ),(q,ν))[,||qp||1logD]\widetilde{d}((p,\xi),(q,\nu))\in[-\infty,||q-p||_{1}\log D]

Once we prove that our two definitions of grid entropy are equivalent, this bound will be improved.

Remark.

As was the case with Theorem 12, the limit d~((p,ξ),(q,ν))\widetilde{d}((p,\xi),(q,\nu)) is trivially -\infty if qpD0q-p\notin{\mathbb{R}}^{D}_{\geq 0} or if q=pq=p and νξ\nu\neq\xi, and it is trivially 0 if q=pq=p and ν=ξ\nu=\xi.

Proof.

As noted above,

d~ϵ((p,ξ),(p,ξ))=d~((p,ξ),(p,ξ))=0ϵ>0\widetilde{d}^{\epsilon}((p,\xi),(p,\xi))=\widetilde{d}((p,\xi),(p,\xi))=0\ \forall\epsilon>0

It remains to prove the reverse triangle inequality. Let p,q,rDp,q,r\in{\mathbb{R}}^{D} and ν,ξ,η+\nu,\xi,\eta\in\mathcal{M}_{+} and consider any ϵ>0\epsilon>0 and any nn. If qpD0q-p\notin{\mathbb{R}}^{D}_{\geq 0} or rqD0r-q\notin{\mathbb{R}}^{D}_{\geq 0} then the following inequality holds trivially (because the right-hand side is -\infty)

dϵ((np,nξ),(nr,nη))dϵ((np,nξ),(nq,nν))+dϵ((nq,nν),(nr,nη))d^{\epsilon}((np,n\xi),(nr,n\eta))\geq d^{\epsilon}((np,n\xi),(nq,n\nu))+d^{\epsilon}((nq,n\nu),(nr,n\eta)) (14)

Now suppose rq,qpD0r-q,q-p\in{\mathbb{R}}^{D}_{\geq 0}. Given paths π:npnq\pi:\lfloor np\rfloor\rightarrow\lfloor nq\rfloor, π:nqnr\pi^{\prime}:\lfloor nq\rfloor\rightarrow\lfloor nr\rfloor, we concatenate them to obtain a unique path ππ:npnr\pi\cdot\pi^{\prime}:\lfloor np\rfloor\rightarrow\lfloor nr\rfloor with unnormalized empirical measure μππ=μπ+μπ\mu_{\pi\cdot\pi^{\prime}}=\mu_{\pi}+\mu_{\pi^{\prime}}. By the subadditivity of the Levy-Prokhorov metric (Lemma 4),

ρ(μππ,n(ηξ))=ρ(μπ+μπ,n(ην)+n(νξ))ρ(μπ,n(ην))+ρ(μπ,n(νξ))\rho(\mu_{\pi\cdot\pi^{\prime}},n(\eta-\xi))=\rho(\mu_{\pi}+\mu_{\pi^{\prime}},n(\eta-\nu)+n(\nu-\xi))\leq\rho(\mu_{\pi},n(\eta-\nu))+\rho(\mu_{\pi^{\prime}},n(\nu-\xi))

so

(π:npnqe1ϵρ(μπ,n(ην)))(π:nqnre1ϵρ(μπ,n(νξ)))\displaystyle\bigg{(}\sum_{\pi:\lfloor np\rfloor\rightarrow\lfloor nq\rfloor}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi},n(\eta-\nu))}\bigg{)}\bigg{(}\sum_{\pi^{\prime}:\lfloor nq\rfloor\rightarrow\lfloor nr\rfloor}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi^{\prime}},n(\nu-\xi))}\bigg{)}
(π:npnre1ϵρ(μπ,n(ηξ)))\displaystyle\leq\bigg{(}\sum_{\pi^{\prime\prime\prime}:\lfloor np\rfloor\rightarrow\lfloor nr\rfloor}e^{-\frac{1}{\epsilon}\rho(\mu_{\pi^{\prime\prime\prime}},n(\eta-\xi))}\bigg{)}

It follows that (14) holds. Dividing (14) by nn and taking the limit (in probability) as nn\rightarrow\infty we get

d~ϵ((p,ξ),(r,η))d~ϵ((p,ξ),(q,ν))+d~ϵ((q,ν),(r,η))\widetilde{d}^{\epsilon}((p,\xi),(r,\eta))\geq\widetilde{d}^{\epsilon}((p,\xi),(q,\nu))+\widetilde{d}^{\epsilon}((q,\nu),(r,\eta))

so d~ϵ\widetilde{d}^{\epsilon} is a directed metric with negative sign. Taking the limit as ϵ0+\epsilon\rightarrow 0^{+}, we still obtain a directed metric with negative sign. ∎

We proceed to show that d~\widetilde{d} gives rise to a directed norm with negative sign.

Theorem 19.
  1. (i)

    Each d~ϵ\widetilde{d}^{\epsilon} is translation-invariant and positive-homogeneous. So is d~\widetilde{d}.

  2. (ii)

    For qD,ν+q\in{\mathbb{R}}^{D},\nu\in\mathcal{M}_{+} define the grid entropy with respect to (q,ν)(q,\nu) to be

    ||(q,ν)||:=d~((0,0),(q,ν))||(q,\nu)||:=\widetilde{d}((\vec{0},0),(q,\nu))

    Then this is a directed norm with negative sign on D×+{\mathbb{R}}^{D}\times\mathcal{M}_{+}.

Remark.

From before, ||(q,ν)||||(q,\nu)|| is -\infty if qD0q\notin{\mathbb{R}}^{D}_{\geq 0} or if q=0q=\vec{0} and ν0\nu\neq 0, and it is 0 if q=0q=\vec{0} and ν=0\nu=0.

Remark.

A directed metric with negative sign is clearly concave. Thus each d~ϵ\widetilde{d}^{\epsilon} as well as ||(,)||||(\cdot,\cdot)|| are concave functions on their respective domains.

Remark.

These properties of grid entropy also follow from [RAS14].

Proof.

(i) Fix ϵ>0\epsilon>0. We already showed that d~ϵ\widetilde{d}^{\epsilon} is translation-invariant while proving Theorem 7.

By translation-invariance, it suffices to show that d~ϵ((0,0),(q,ν))\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu)) is positive-homogeneous. Consider any c=ab>0c=\frac{a}{b}\in{\mathbb{Q}}_{>0}. Then

d~ϵ((0,0),(cq,cν))=limndϵ((0,0),(cnq,cnν))na.s.\widetilde{d}^{\epsilon}((\vec{0},0),(cq,c\nu))=\lim_{n\rightarrow\infty}\frac{d^{\epsilon}((\vec{0},0),(cnq,cn\nu))}{n}\ \mbox{a.s.}

Looking at the subsequence consisting of multiples n=mbn=mb of bb, we get

d~ϵ((0,0),(cq,cν))=limmdϵ((0,0),(amq,amν))mba.s.\widetilde{d}^{\epsilon}((\vec{0},0),(cq,c\nu))=\lim_{m\rightarrow\infty}\frac{d^{\epsilon}((\vec{0},0),(amq,am\nu))}{mb}\ \mbox{a.s.}

But each amam\in{\mathbb{N}} so

d~ϵ((0,0),(cq,cν))=ablimndϵ((0,0),(nq,nν))n=cd~ϵ((0,0),(q,ν))a.s.\widetilde{d}^{\epsilon}((\vec{0},0),(cq,c\nu))=\frac{a}{b}\lim_{n\rightarrow\infty}\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n}=c\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))\ \mbox{a.s.}

Thus d~ϵ\widetilde{d}^{\epsilon} is positive-homogeneous for rational factors.

Now consider any c>0c\in{\mathbb{R}}_{>0} and take sequences ak,bk>0,akc,bkca_{k},b_{k}\in{\mathbb{Q}}_{>0},a_{k}\uparrow c,b_{k}\downarrow c. Consider any nn, and let us look at

Xnϵ,cq,cν=dϵ((0,0),(cq,cν))X_{n}^{\epsilon,cq,c\nu}=d^{\epsilon}((\vec{0},0),(cq,c\nu))

By Lemma 15,

Xnϵ,akq,akν1ϵ(n|cak|(||q||1+||ν||TV))Xnϵ,cq,cνXnϵ,bkq,bkν+1ϵ(n|bkc|(||q||1+||ν||TV))X_{n}^{\epsilon,a_{k}q,a_{k}\nu}-\frac{1}{\epsilon}(n|c-a_{k}|\cdot(||q||_{1}+||\nu||_{TV}))\leq X_{n}^{\epsilon,cq,c\nu}\leq X_{n}^{\epsilon,b_{k}q,b_{k}\nu}+\frac{1}{\epsilon}(n|b_{k}-c|\cdot(||q||_{1}+||\nu||_{TV}))

Dividing by nn and taking a.s. limits, and applying homogeneity for positive rational factors we get

akd~ϵ((0,0),(q,ν))|cak|(||q||1+||ν||TV)ϵ\displaystyle a_{k}\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))-\frac{|c-a_{k}|\cdot(||q||_{1}+||\nu||_{TV})}{\epsilon}
d~ϵ((0,0),(cq,cν))\displaystyle\leq\widetilde{d}^{\epsilon}((\vec{0},0),(cq,c\nu))
bkd~ϵ((0,0),(q,ν))+|bkc|(||q||1+||ν||TV)ϵ\displaystyle\leq b_{k}\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))+\frac{|b_{k}-c|\cdot(||q||_{1}+||\nu||_{TV})}{\epsilon}

Taking kk\rightarrow\infty gives us

d~ϵ((0,0),(cq,cν))=cd~ϵ((0,0),(q,ν))\widetilde{d}^{\epsilon}((\vec{0},0),(cq,c\nu))=c\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))

so d~ϵ\widetilde{d}^{\epsilon} is homogenous with respect to any positive real factor.

Taking the infimum over ϵ>0\epsilon>0 we get that d~\widetilde{d} is translation-invariant and positive-homogenous.

(ii) Follows directly from (i).

3.5 Direction-free Grid Entropy

We now wish to develop a grid entropy for the case where we no longer restrict ourselves to paths π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor for a given direction qDq\in{\mathbb{R}}^{D}, and instead look at all length nt\lfloor nt\rfloor paths from 0\vec{0} for a given size parameter t0t\geq 0. Another way of putting this is that we look at paths from the origin to the line or ”level” x1+x2++xD=ntx_{1}+x_{2}+\ldots+x_{D}=\lfloor nt\rfloor. Recall that the set of all such paths is denoted 𝒫nt(0)\mathcal{P}_{\lfloor nt\rfloor}(\vec{0}).

If we try to simply repeat our previous argument, we run into a dead end because we are no longer in a superadditive setting. The solution is to observe that the distances d~ϵ((0,0),(q,ν))\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu)) are maximized over qD0q\in{\mathbb{R}}^{D}_{\geq 0} with ||q||1=t||q||_{1}=t by q=t(1D,,1D):=tq=t(\frac{1}{D},\ldots,\frac{1}{D}):=t\ell. This intuitively makes sense, since this direction is the direction which has the most NE paths.

Lemma 20.

Fix t0,νtt\geq 0,\nu\in\mathcal{M}_{t}. Then

supqD0:||q||1=td~ϵ((0,0),(q,ν))=d~ϵ((0,0),(t,ν))ϵ>0andsupqD0:||q||1=t||(q,ν)||=||(t,ν)||\sup_{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t}\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))=\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))\ \forall\epsilon>0\ \mbox{and}\ \sup_{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t}||(q,\nu)||=||(t\ell,\nu)||
Remark.

In Section 4 we show that ||q||1=||ν||TV||q||_{1}=||\nu||_{TV} is a necessary condition for ||(q,ν)||||(q,\nu)|| to be finite, so it makes sense that we only take the supremum over qD0q\in{\mathbb{R}}^{D}_{\geq 0} with ||q||1=t||q||_{1}=t.

Proof.

This is an easy consequence of the symmetries of the grid and the concavity of d~ϵ\widetilde{d}^{\epsilon} and direction-fixed grid entropy. We focus on the proof for d~ϵ\widetilde{d}^{\epsilon}; the argument for grid entropy goes the same way.

Fix ϵ>0\epsilon>0. By positive-homogeneity and since the t=0t=0 is trivial, we may assume t=1t=1. Suppose there exists qD0q\in{\mathbb{R}}^{D}_{\geq 0} s.t. ||q||1=1||q||_{1}=1 and

d~ϵ((0,0),(q,ν))>d~ϵ((0,0),(,ν))\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))>\widetilde{d}^{\epsilon}((\vec{0},0),(\ell,\nu)) (15)

Among such qq pick one that maximizes the number of coordinates which are equal 1D\frac{1}{D}. Thus there are distinct 1i,jD1\leq i,j\leq D s.t. qi<1D<qjq_{i}<\frac{1}{D}<q_{j}, so we can write 1D\frac{1}{D} as a convex combination of qi,qjq_{i},q_{j}:

1D=wqi+(1w)qjfor somew(0,1)\frac{1}{D}=wq_{i}+(1-w)q_{j}\ \mbox{for some}\ w\in(0,1) (16)

Let σij(q)\sigma_{ij}(q) be qq with qi,qjq_{i},q_{j} swapped. By symmetry of the grid,

d~ϵ((0,0),(q,ν))=d~ϵ((0,0),(σij(q),ν))\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))=\widetilde{d}^{\epsilon}((\vec{0},0),(\sigma_{ij}(q),\nu))

hence by concavity of d~ϵ\widetilde{d}^{\epsilon},

d~ϵ((0,0),(q,ν))\displaystyle\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu)) =wd~ϵ((0,0),(q,ν))+(1w)d~ϵ((0,0),(σij(q),ν))\displaystyle=w\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))+(1-w)\widetilde{d}^{\epsilon}((\vec{0},0),(\sigma_{ij}(q),\nu))
d~ϵ((0,0),(wq+(1w)σij(q),ν))\displaystyle\leq\widetilde{d}^{\epsilon}((\vec{0},0),(wq+(1-w)\sigma_{ij}(q),\nu))

But wq+(1w)σij(q)wq+(1-w)\sigma_{ij}(q) only changes the coordinates of qq in positions i,ji,j, with qiq_{i} becoming 1D\frac{1}{D} by (16). Thus we have found a qq satisfying (15) that has at least one more coordinate that is 1D\frac{1}{D} than our previous qq, which we had assumed had the maximal number of such coordinates. Contradiction. Therefore q=q=\ell as desired. ∎

We now use this useful fact along with the compactness of {qD0:||q||1=t}\{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t\} to show that ||(t,ν)||||(t\ell,\nu)|| is the desired direction-free grid entropy of length tt.

Theorem 21.

Fix t0,νtt\geq 0,\nu\in\mathcal{M}_{t}. For any ϵ>0\epsilon>0 we have

d~ϵ((0,0),(t,ν))\displaystyle\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu)) =limnsupqD0:||q||1=t1nlogπ𝒫(0,nq)enϵρ(1nμπ,ν)\displaystyle=\lim_{n\rightarrow\infty}\sup_{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t}\frac{1}{n}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}
=limn1nlogπ𝒫nt(0)enϵρ(1nμπ,ν)\displaystyle=\lim_{n\rightarrow\infty}\frac{1}{n}\log\sum_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}

a.s.

Proof.

The statement is trivial when t=0t=0 so we may assume t>0t>0. We focus on the first equality. By Lemma 20 and the trivial fact that suplim suplim supsup\sup\limsup\leq\limsup\sup in general, we immediately get

d~ϵ((0,0),(t,ν))lim supnsupqD0:||q||1=t1nlogπ𝒫(0,nq)enϵρ(1nμπ,ν)a.s.\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))\leq\limsup_{n\rightarrow\infty}\sup_{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t}\frac{1}{n}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}\ \mbox{a.s.}

Suppose equality does not hold on some event of positive probability. Thus there exists δ>0\delta>0 s.t.

d~ϵ((0,0),(t,ν))+7δ<lim supnsupqD0:||q||1=t1nlogπ𝒫(0,nq)enϵρ(1nμπ,ν)\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))+7\delta<\limsup_{n\rightarrow\infty}\sup_{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t}\frac{1}{n}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)} (17)

with positive probability. Let us intersect this event with the measure 1 event that
d~ϵ((0,0),(p,ν))\widetilde{d}^{\epsilon}((\vec{0},0),(p,\nu)) exists and is equal to the limit in Theorem 10 for every pD0p\in{\mathbb{Q}}^{D}_{\geq 0}. Denote this event by \mathcal{E}.

Unfortunately, the expression inside the supremum is only continuous when approaching qq from the SE, and not necessarily when approaching along {rD0:||r||1=t}\{r\in{\mathbb{R}}^{D}_{\geq 0}:||r||_{1}=t\}, so we need to work with some rather unpleasant approximations.

Let (ni)(n_{i}) be the (event-dependent) subsequence corresponding to the lim sup\limsup. For every ii\in{\mathbb{N}} pick some (event-dependent) qniD0q^{n_{i}}\in{\mathbb{R}}^{D}_{\geq 0} with ||qni||1=t||q^{n_{i}}||_{1}=t s.t.

supqD0:||q||1=t1nilogπ𝒫(0,niq)eniϵρ(1niμπ,ν)δ+1nilogπ𝒫(0,niqni)eniϵρ(1niμπ,ν)\sup_{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t}\frac{1}{n_{i}}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor n_{i}q\rfloor)}e^{-\frac{n_{i}}{\epsilon}\rho(\frac{1}{n_{i}}\mu_{\pi},\nu)}\leq\delta+\frac{1}{n_{i}}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor n_{i}q^{n_{i}}\rfloor)}e^{-\frac{n_{i}}{\epsilon}\rho(\frac{1}{n_{i}}\mu_{\pi},\nu)} (18)

By compactness of {rD0:||r||1=t}\{r\in{\mathbb{R}}^{D}_{\geq 0}:||r||_{1}=t\}, there is a converging subsequence qnijL1qq^{n_{i_{j}}}\rightarrow^{L^{1}}q^{\prime} for some qD0,||q||1=tq^{\prime}\in{\mathbb{R}}^{D}_{\geq 0},||q^{\prime}||_{1}=t. Pick some qD0q^{\prime\prime}\in{\mathbb{Q}}^{D}_{\geq 0} s.t. qqD>0q^{\prime\prime}-q^{\prime}\in{\mathbb{R}}^{D}_{>0} and

max(1ϵ||qq||1,(||q||1t1)d~ϵ((0,0),(t,ν)),1ϵ||(1t||q||1)ν||TV)<δ\max\bigg{(}\frac{1}{\epsilon}||q^{\prime\prime}-q^{\prime}||_{1},\bigg{(}\frac{||q^{\prime\prime}||_{1}}{t}-1\bigg{)}\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu)),\frac{1}{\epsilon}\bigg{|}\bigg{|}\bigg{(}1-\frac{t}{||q^{\prime\prime}||_{1}}\bigg{)}\nu\bigg{|}\bigg{|}_{TV}\bigg{)}<\delta (19)

The idea is that since qnijL1qq^{n_{i_{j}}}\rightarrow^{L^{1}}q^{\prime} and the coordinates of qqq^{\prime\prime}-q^{\prime} are strictly positive, then there exists JJ s.t. qqnijD0jJq^{\prime\prime}-q^{n_{i_{j}}}\in{\mathbb{R}}^{D}_{\geq 0}\ \forall j\geq J. For such jj we can apply Lemma 15 to get

1nijlogπ𝒫(0,nijqnij)enijϵρ(1nijμπ,ν)1ϵ(||qqnij||1+Dnij)+1nijlogπ𝒫(0,nijq)enijϵρ(1nijμπ,ν)\begin{split}&\frac{1}{n_{i_{j}}}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor n_{i_{j}}q^{n_{i_{j}}}\rfloor)}e^{-\frac{n_{i_{j}}}{\epsilon}\rho(\frac{1}{n_{i_{j}}}\mu_{\pi},\nu)}\\ &\leq\frac{1}{\epsilon}\bigg{(}||q^{\prime\prime}-q^{n_{i_{j}}}||_{1}+\frac{D}{n_{i_{j}}}\bigg{)}+\frac{1}{n_{i_{j}}}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor n_{i_{j}}q^{\prime\prime}\rfloor)}e^{-\frac{n_{i_{j}}}{\epsilon}\rho(\frac{1}{n_{i_{j}}}\mu_{\pi},\nu)}\end{split} (20)

Recalling that we are in the event \mathcal{E}, pick some JJJ^{\prime}\geq J large enough so that for any jJ,j\geq J^{\prime},

1nijlogπ𝒫(0,nijq)enijϵρ(1nijμπ,ν)δ+d~ϵ((0,0),(q,ν))\frac{1}{n_{i_{j}}}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor n_{i_{j}}q^{\prime\prime}\rfloor)}e^{-\frac{n_{i_{j}}}{\epsilon}\rho(\frac{1}{n_{i_{j}}}\mu_{\pi},\nu)}\leq\delta+\widetilde{d}^{\epsilon}((\vec{0},0),(q^{\prime\prime},\nu)) (21)

and 1ϵ(||qqnij||1+Dnij)<δhence\frac{1}{\epsilon}\bigg{(}||q^{\prime}-q^{n_{i_{j}}}||_{1}+\frac{D}{n_{i_{j}}}\bigg{)}<\delta\ \mbox{hence}

1ϵ(||qqnij||1+Dnij)1ϵ(||qqnij||1+||qq||1+Dnij)<2δ\frac{1}{\epsilon}\bigg{(}||q^{\prime\prime}-q^{n_{i_{j}}}||_{1}+\frac{D}{n_{i_{j}}}\bigg{)}\leq\frac{1}{\epsilon}\bigg{(}||q^{\prime}-q^{n_{i_{j}}}||_{1}+||q^{\prime\prime}-q^{\prime}||_{1}+\frac{D}{n_{i_{j}}}\bigg{)}<2\delta (22)

by (19). Putting everything together, we get

d~ϵ((0,0),(t,ν))+7δ\displaystyle\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))+7\delta
d~ϵ((0,0),(q,ν))+4δby (17)-(22)\displaystyle\leq\widetilde{d}^{\epsilon}((\vec{0},0),(q^{\prime\prime},\nu))+4\delta\ \mbox{by \eqref{EQ13}-\eqref{EQ18}}
=||q||1td~ϵ((0,0),(t||q||1q,t||q||1ν))+4δby positive-homogeneity\displaystyle=\frac{||q^{\prime\prime}||_{1}}{t}\widetilde{d}^{\epsilon}\bigg{(}(\vec{0},0),\bigg{(}\frac{t}{||q^{\prime\prime}||_{1}}q^{\prime\prime},\frac{t}{||q^{\prime\prime}||_{1}}\nu\bigg{)}\bigg{)}+4\delta\ \mbox{by positive-homogeneity}

By Lemma 20, this is

||q||1td~ϵ((0,0),(t,t||q||1ν))+4δ\displaystyle\leq\frac{||q^{\prime\prime}||_{1}}{t}\widetilde{d}^{\epsilon}\bigg{(}(\vec{0},0),\bigg{(}t\ell,\frac{t}{||q^{\prime\prime}||_{1}}\nu\bigg{)}\bigg{)}+4\delta
||q||1td~ϵ((0,0),(t,ν))+1ϵ||(1t||q||1)ν||TV+4δby Lemma 15\displaystyle\leq\frac{||q^{\prime\prime}||_{1}}{t}\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))+\frac{1}{\epsilon}\bigg{|}\bigg{|}\bigg{(}1-\frac{t}{||q^{\prime\prime}||_{1}}\bigg{)}\nu\bigg{|}\bigg{|}_{TV}+4\delta\ \mbox{by Lemma \ref{lemma12}}
d~ϵ((0,0),(t,ν))+6δby (19)\displaystyle\leq\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))+6\delta\ \mbox{by \eqref{EQ15}}

which is a contradiction. Note that even though our sequence qnijq^{n_{i_{j}}}, the limit qq^{\prime\prime} and the JJ^{\prime} chosen are event-dependent, the upper bound above is not by virtue of Lemma 20. That is what makes this argument work.

We have thus shown

d~ϵ((0,0),(t,ν))=limnsupqD0:||q||1=t1nlogπ𝒫(0,nq)enϵρ(1nμπ,ν)a.s.\displaystyle\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))=\lim_{n\rightarrow\infty}\sup_{q\in{\mathbb{R}}^{D}_{\geq 0}:||q||_{1}=t}\frac{1}{n}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}\ \mbox{a.s.}

We now proceed with the second equality. Observe that for any nn, any length nt\lfloor nt\rfloor NE path π\pi from 0\vec{0} ends at nq\lfloor nq\rfloor for some qD0q\in{\mathbb{R}}^{D}_{\geq 0} with ||q||1=t||q||_{1}=t. Furthermore, the number of possible different endpoints of such paths is precisely (nt+D1D1)=O(nD)\binom{\lfloor nt\rfloor+D-1}{D-1}=O(n^{D}), which is the number of DD-tuples of non-negative integers summing to nt\lfloor nt\rfloor. Therefore

logπ𝒫nt(0)enϵρ(1nμπ,ν)\displaystyle\log\sum_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}
[sup||q||1=tlogπ𝒫(0,nq)enϵρ(1nμπ,ν),log(O(nD)sup||q||1=tπ𝒫(0,nq)enϵρ(1nμπ,ν))]\displaystyle\in\bigg{[}\sup_{||q||_{1}=t}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)},\log\bigg{(}O(n^{D})\sup_{||q||_{1}=t}\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}\bigg{)}\bigg{]}
=[sup||q||1=tdϵ((0,nξ),(nq,nν)),O(Dlogn)+sup||q||1=tdnϵ((0,nξ),(nq,nν))]\displaystyle=\bigg{[}\sup_{||q||_{1}=t}d^{\epsilon}((\vec{0},n\xi),(nq,n\nu)),O(D\log n)+\sup_{||q||_{1}=t}d^{n\epsilon}((\vec{0},n\xi),(nq,n\nu))\bigg{]}

When we divide by nn and take the limit, the O(Dlogn)O(D\log n) term goes to 0 and we get that

limnlogπ𝒫nt(0)enϵρ(1nμπ,ν)=limnsupqD0,||q||1=tdnϵ((0,nξ),(nq,nν))na.s.\lim_{n\rightarrow\infty}\log\sum_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}=\lim_{n\rightarrow\infty}\sup_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=t}\frac{d^{n\epsilon}((\vec{0},n\xi),(nq,n\nu))}{n}\ \mbox{a.s.}

as desired. ∎

Of course, we already established that d~ϵ((0,0),(t,ν))\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu)) is a directed metric with negative sign. This means that if we now take the infimum over ϵ>0\epsilon>0 we get that the length tt direction-free grid entropy is precisely ||(t,ν)||||(t\ell,\nu)||. Since this is -\infty unless t=||ν||TVt=||\nu||_{TV} (see Theorem 24 in a later section), then we can simply let t=||ν||TVt=||\nu||_{TV} to begin with.

Definition.

For ν+\nu\in\mathcal{M}_{+} let t:=||ν||TVt:=||\nu||_{TV} and define the direction-free grid entropy to be

||ν||:=infϵd~ϵ((0,0),(t,ν))=||(t,ν)||=supqD||(q,ν)||||\nu||:=\inf_{\epsilon}\widetilde{d}^{\epsilon}((\vec{0},0),(t\ell,\nu))=||(t\ell,\nu)||=\sup_{q\in{\mathbb{R}}^{D}}||(q,\nu)||
Remark.

The final equality holds by Lemma 20 combined with the fact that ||(q,ν)||=||(q,\nu)||=-\infty unless ||q||1=||ν||TV||q||_{1}=||\nu||_{TV}.

3.6 Equivalence of Approaches

The next order of business is to show that these definitions of direction-fixed/direction-free grid entropy are equivalent to our original, more intuitive, definitions (1). The novel results in this subsection show the benefit of the fresh approach to grid entropy presented in this paper.

The key idea is that the sum

π𝒫(0,nq)enϵρ(1nμπ,ν)\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}

appearing in the definition of dϵd^{\epsilon} is approximately the number of paths 0nq\vec{0}\rightarrow\lfloor nq\rfloor with very small ρ(1nμπ,ν)\rho(\frac{1}{n}\mu_{\pi},\nu) and the error should disappear in the limit.

Let us recall how the minπk\min\limits_{\pi}^{k} are defined. Fix qD,ν+q\in{\mathbb{R}}^{D},\nu\in\mathcal{M}_{+}. For any n,kn,k\in{\mathbb{N}} with k(#π:0nq)k\leq(\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor) define

minπ𝒫(0,nq)kρ(1nμπ,ν)\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{k}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}

to be the order statistics of ρ(1nμπ,ν)\rho(\frac{1}{n}\mu_{\pi},\nu) as we range over all π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor, and let

minπ:0nqkρ(1nμπ,ν):=+fork>#π:0nq\min\limits_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{k}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}:=+\infty\ \mbox{for}\ k>\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor

In this way minπ𝒫(0,nq)1ρ(1nμπ,ν),minπ𝒫(0,nq)#πρ(1nμπ,ν)\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho(\frac{1}{n}\mu_{\pi},\nu),\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\#\pi}\rho(\frac{1}{n}\mu_{\pi},\nu) are just the minimum/maximum values of ρ(1nμπ,ν)\rho(\frac{1}{n}\mu_{\pi},\nu) respectively.

Similarly, in the direction-free case, for ν+\nu\in\mathcal{M}_{+} we let t:=||ν||TVt:=||\nu||_{TV} and for any n,kn,k\in{\mathbb{N}} with k(#length nt paths π from 0)k\leq(\#\ \mbox{length $\lfloor nt\rfloor$ paths $\pi$ from $\vec{0}$}) define

minπ𝒫nt(0)kρ(1nμπ,ν)\min\limits_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{k}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}

to be the kk-th smallest value of ρ(1nμπ,ν)\rho(\frac{1}{n}\mu_{\pi},\nu) when we range over all length nt\lfloor nt\rfloor paths from 0\vec{0}.

We usually omit the ρ(1nμπ,ν)\rho(\frac{1}{n}\mu_{\pi},\nu) in minπjρ(1nμπ,ν)\min\limits_{\pi}^{j}\rho(\frac{1}{n}\mu_{\pi},\nu) for the sake of space.

We are now ready to prove that our definitions of grid entropy are equivalent, and moreover that the pertinent limits either converge to 0 a.s. or the corresponding lim inf\liminf’s are >0>0 a.s.. In terms of notation, we denote by ||(q,ν)||,||ν||||(q,\nu)||,||\nu|| the direction-fixed/direction-free grid entropy as defined in Theorem 19/Theorem 21.

Theorem 22.

Fix qD,ν+q\in{\mathbb{R}}^{D},\nu\in\mathcal{M}_{+} and let t:=||ν||TVt:=||\nu||_{TV}.

  1. (i)

    The grid entropy being finite is characterized by the existence of paths with empirical measure arbitrarily close to the target. More specifically,

    limnminπ𝒫(0,nq)1ρ(1nμπ,ν),limnminπ𝒫nt(0)1ρ(1nμπ,ν)exist a.s. and\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)},\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\ \mbox{exist a.s. and}
    ||(q,ν)||limnminπ𝒫(0,nq)1ρ(1nμπ,ν)=0a.s.||(q,\nu)||\neq-\infty\Leftrightarrow\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}
    ||ν||limnminπ𝒫nt(0)1ρ(1nμπ,ν)=0a.s.\displaystyle||\nu||\neq-\infty\Leftrightarrow\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}

    Furthermore, if ||(q,ν)||||(q,\nu)||\neq-\infty then

    ||(q,ν)||[0,H(q)]whereH(q)=i=1Dqilogqi||q||1||(q,\nu)||\in[0,H(q)]\ \mbox{where}\ H(q)=\sum_{i=1}^{D}-q_{i}\log\frac{q_{i}}{||q||_{1}}

    and if ||ν||||\nu||\neq-\infty then ||ν||=||(t,ν)||[0,tlogD]||\nu||=||(t\ell,\nu)||\in[0,t\log D]. In particular,

    ||||q||1Λ||||(q,||q||1Λ)||0||\ ||q||_{1}\Lambda\ ||\geq||(q,||q||_{1}\Lambda)||\geq 0

    i.e. Λ\Lambda is always among the distributions observed.

  2. (ii)

    We have

    limnminπ𝒫(0,nq)enαρ(1nμπ,ν)=0a.s.0α<||(q,ν)||\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\ \forall 0\leq\alpha<||(q,\nu)||
    lim infnminπ𝒫(0,nq)enαρ(1nμπ,ν)>0a.s.||(q,ν)||<αH(q)\liminf_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}>0\ \mbox{a.s.}\ \forall||(q,\nu)||<\alpha\leq H(q)

    In other words,

    ||(q,ν)||=sup{α0:limnminπ𝒫(0,nq)enαρ(1nμπ,ν)=0a.s.}=sup{α0:lim infnminπ𝒫(0,nq)enαρ(1nμπ,ν)=0a.s.}\begin{split}||(q,\nu)||&=\sup\bigg{\{}\alpha\geq 0:\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\bigg{\}}\\ &=\sup\bigg{\{}\alpha\geq 0:\liminf_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\bigg{\}}\end{split} (23)

    where this is -\infty if the set of α\alpha’s is empty. Similarly,

    limnminπ𝒫nt(0)enαρ(1nμπ,ν)=0a.s.0α<||ν||\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\ \forall 0\leq\alpha<||\nu||
    lim infnminπ𝒫nt(0)enαρ(1nμπ,ν)>0a.s.||ν||<αtlogD\liminf_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}>0\ \mbox{a.s.}\ \forall||\nu||<\alpha\leq t\log D

    and the direction-free grid entropy is given by

    ||ν||\displaystyle||\nu|| =sup{α0:limnminπ𝒫nt(0)enαρ(1nμπ,ν)=0a.s.}\displaystyle=\sup\bigg{\{}\alpha\geq 0:\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\bigg{\}}
    =sup{α0:lim infnminπ𝒫nt(0)enαρ(1nμπ,ν)=0a.s.}\displaystyle=\sup\bigg{\{}\alpha\geq 0:\liminf_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\bigg{\}}
Remark.

Since the minπj\min\limits_{\pi}^{j} increase in jj, then the sets we are taking the supremum over in the definition in (ii) are either \emptyset or of the form [0,C)[0,C) or [0,C][0,C] where CC is the corresponding grid entropy.

Remark.

The limits

limnminπ𝒫(0,nq)enαρ(1niμπ,ν)\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{n\alpha}\rfloor}\rho\bigg{(}\frac{1}{n_{i}}\mu_{\pi},\nu\bigg{)}

need not exist for all α\alpha. What we are saying is that they do exist and are equal to 0 a.s. for α<||(q,ν)||\alpha<||(q,\nu)|| and that the corresponding lim inf\liminf’s are >0>0 a.s. for α>||(q,ν)||\alpha>||(q,\nu)||. This along with the existence of the limits in (i) are completely non-trivial facts which will be a direct consequence of the a.s. convergence of dϵ((0,0),(nq,nν))n\frac{d^{\epsilon}((0,0),(nq,n\nu))}{n}.

Proof.

We focus on proving the direction-fixed grid entropy statements. The direction-free arguments are analogous.

(i) The strategy is to provide upper and lower bounds on dϵd^{\epsilon} in terms of

exp[nϵminπ𝒫(0,nq)1ρ(1nμπ,ν)],\mbox{exp}\bigg{[}-\frac{n}{\epsilon}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\bigg{]},

the leading term in the sum over π\pi. Since dϵ((0,0),(nq,nν))n\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n} converges a.s. then our analysis gives that d~ϵ\widetilde{d}^{\epsilon} must be in the intersection of two intervals, one in terms of
lim supminπ𝒫(0,nq)1ρ(1nμπ,ν)\limsup\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho(\frac{1}{n}\mu_{\pi},\nu) and the other in terms of the lim inf\liminf. The only way these intervals can have non-empty intersection for small enough ϵ>0\epsilon>0 is if the lim sup\limsup’s equal the lim inf\liminf’s.

Fix ϵ>0\epsilon>0. We consider the minπ𝒫(0,nq)j\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{j} over π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor. Since minπ𝒫(0,nq)1minπ𝒫(0,nq)j\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\leq\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{j} for all valid jj, then

dϵ((0,0),(nq,nν))n\displaystyle\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n}
=1nlogπ𝒫(0,nq)enϵρ(1nμπ,ν)\displaystyle=\frac{1}{n}\log\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}
[1nlog(enϵminπ𝒫(0,nq)1),1nlog((#π:0nq)enϵminπ𝒫(0,nq)1)]\displaystyle\in\bigg{[}\frac{1}{n}\log\bigg{(}e^{-\frac{n}{\epsilon}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}}\bigg{)},\frac{1}{n}\log\bigg{(}(\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor)\ e^{-\frac{n}{\epsilon}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}}\bigg{)}\bigg{]}
=[1ϵminπ𝒫(0,nq)1ρ(1nμπ,ν),1nlog(#π:0nq)1ϵminπ𝒫(0,nq)1ρ(1nμπ,ν)]\displaystyle=\bigg{[}-\frac{1}{\epsilon}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)},\frac{1}{n}\log(\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor)-\frac{1}{\epsilon}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\bigg{]}

where each ρ(1nμπ,ν)0\rho(\frac{1}{n}\mu_{\pi},\nu)\geq 0. Taking the a.s. lim sup\limsup or the a.s. lim inf\liminf and using Lemma 1 we get that a.s.,

d~ϵ((0,0),(q,ν))[1ϵlim infnminπ𝒫(0,nq)1,H(q)1ϵlim infnminπ𝒫(0,nq)1][1ϵlim supnminπ𝒫(0,nq)1,H(q)1ϵlim supnminπ𝒫(0,nq)1]\begin{split}\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))\in&\bigg{[}-\frac{1}{\epsilon}\liminf_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1},H(q)-\frac{1}{\epsilon}\liminf_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\bigg{]}\\ &\bigcap\bigg{[}-\frac{1}{\epsilon}\limsup_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1},H(q)-\frac{1}{\epsilon}\limsup_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\bigg{]}\end{split} (24)

Since this holds for arbitrarily small ϵ>0\epsilon>0 then we must have

lim infnminπ𝒫(0,nq)1ρ(1nμπ,ν)=lim supnminπ𝒫(0,nq)1ρ(1nμπ,ν)a.s.\liminf_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=\limsup_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\ \mbox{a.s.}

Furthermore, taking ϵ0+\epsilon\rightarrow 0^{+} in (24), we see that

||(q,ν)||=infϵ>0d~ϵ((0,0),(q,ν))limnminπ:0nq1ρ(1nμπ,ν)=0a.s.\displaystyle||(q,\nu)||=\inf_{\epsilon>0}\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))\neq-\infty\Leftrightarrow\lim_{n\rightarrow\infty}\min_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}

Moreover, if indeed ||(q,ν)||||(q,\nu)||\neq-\infty then (24) gives ||(q,ν)||[0,H(q)]||(q,\nu)||\in[0,H(q)].

Finally, if we fix an infinite NE path π\pi passing through all nq\lfloor nq\rfloor then by the Glivenko-Cantelli Theorem (Theorem 2),

ρ(1nμπ|0nq,||q||1Λ)=||q||1ρ(1|π|μπ|0nq,Λ)0\rho\bigg{(}\frac{1}{n}\mu_{\pi|_{\vec{0}\rightarrow\lfloor nq\rfloor}},||q||_{1}\Lambda\bigg{)}=||q||_{1}\rho\bigg{(}\frac{1}{|\pi|}\mu_{\pi|_{\vec{0}\rightarrow\lfloor nq\rfloor}},\Lambda\bigg{)}\ \rightarrow 0

where μπ|0nq\mu_{\pi|_{\vec{0}\rightarrow\lfloor nq\rfloor}} denotes the restriction of the path to 0nq\vec{0}\rightarrow\lfloor nq\rfloor, so from above it follows that ||(q,||q||1Λ)||0||(q,||q||_{1}\Lambda)||\geq 0.

(ii) If ||(q,ν)||=||(q,\nu)||=-\infty then by (i),

limnminπ𝒫(0,nq)1ρ(1niμπ,ν)exists a.s. and is >0\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n_{i}}\mu_{\pi},\nu\bigg{)}\ \mbox{exists a.s. and is $>0$}

so the statement holds trivially because the minπ𝒫(0,nq)j\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{j} are nondecreasing in jj. Thus it suffices to assume ||(q,ν)||0||(q,\nu)||\geq 0 and hence

minπ𝒫(0,nq)1ρ(1nμπ,ν)0a.s.\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\rightarrow 0\ \mbox{a.s.} (25)

Fix ϵ>0\epsilon>0 and consider any nn\in{\mathbb{N}}. By (25), the leading term of

π𝒫(0,nq)enϵρ(1nμπ,ν)\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{-\frac{n}{\epsilon}\rho(\frac{1}{n}\mu_{\pi},\nu)}

is 1\approx 1. So we must look at the secondary terms by writing

dϵ((0,0),(nq,nν))n=1ϵminπ𝒫(0,nq)1+1nlog[1+j=2#πenϵ[minπ𝒫(0,nq)jminπ𝒫(0,nq)1]]\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n}=-\frac{1}{\epsilon}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}+\frac{1}{n}\log\bigg{[}1+\sum_{j=2}^{\#\pi}e^{-\frac{n}{\epsilon}[\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{j}-\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}]}\bigg{]} (26)

where the summands are nonincreasing in jj.

The argument goes similar to the one in (i), in that we compute some upper and lower bounds for this expression.

Fix an arbitrary 0CH(q)0\leq C\leq H(q). We lower bound (26) by truncating the sum at j=eCnj=\lfloor e^{Cn}\rfloor and lower bounding all the summands by the j=eCnj=\lfloor e^{Cn}\rfloor summand:

dϵ((0,0),(nq,nν))n1ϵminπ𝒫(0,nq)1+1nlog[1+(eCn1)exp[nϵ(minπ𝒫(0,nq)eCnminπ𝒫(0,nq)1)]]\begin{split}&\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n}\geq\\ &-\frac{1}{\epsilon}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}+\frac{1}{n}\log\bigg{[}1+(\lfloor e^{Cn}\rfloor-1)\mbox{exp}\bigg{[}-\frac{n}{\epsilon}\bigg{(}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{Cn}\rfloor}-\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\bigg{)}\bigg{]}\bigg{]}\end{split} (27)

We upper bound (26) by upper bounding summands for j=2j=2 to eCn\lfloor e^{Cn}\rfloor to 1 and upper bounding the rest of the summands to the j=eCnj=\lfloor e^{Cn}\rfloor summand:

dϵ((0,0),(nq,nν))n1ϵminπ𝒫(0,nq)1+1nlog[1+eCn1+(#π)exp[nϵ(minπ𝒫(0,nq)eCnminπ𝒫(0,nq)1)]]\begin{split}&\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n}\leq\\ &-\frac{1}{\epsilon}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}+\frac{1}{n}\log\bigg{[}1+\lfloor e^{Cn}\rfloor\cdot 1+(\#\pi)\mbox{exp}\bigg{[}-\frac{n}{\epsilon}\bigg{(}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{Cn}\rfloor}-\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\bigg{)}\bigg{]}\bigg{]}\end{split} (28)

Consider the event-dependent sequence

an(C):=minπ𝒫(0,nq)eCnρ(1nμπ,ν)minπ𝒫(0,nq)1ρ(1nμπ,ν)0a_{n}(C):=\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{Cn}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}-\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\geq 0

Taking the lim inf/lim sup\liminf/\limsup in (27),(28) and using (25) and Lemma 1 we get

d~ϵ((0,0),(q,ν))\displaystyle\widetilde{d}^{\epsilon}((0,0),(q,\nu)) loglim supn(1+(eCn1)ean(C)ϵn)1n\displaystyle\geq\log\limsup_{n\rightarrow\infty}\bigg{(}1+(\lfloor e^{Cn}\rfloor-1)e^{-\frac{a_{n}(C)}{\epsilon}n}\bigg{)}^{\frac{1}{n}}
=C1ϵlim infnan(C),\displaystyle=C-\frac{1}{\epsilon}\liminf\limits_{n\rightarrow\infty}a_{n}(C),
d~ϵ((0,0),(q,ν))\displaystyle\widetilde{d}^{\epsilon}((0,0),(q,\nu)) loglim infn(1+eCn+eH(q)nan(C)ϵn)1n\displaystyle\leq\log\liminf_{n\rightarrow\infty}\bigg{(}1+e^{Cn}+e^{H(q)n-\frac{a_{n}(C)}{\epsilon}n}\bigg{)}^{\frac{1}{n}}
=max(C,H(q)1ϵlim supnan(C))\displaystyle=\max(C,H(q)-\frac{1}{\epsilon}\limsup\limits_{n\rightarrow\infty}a_{n}(C))

a.s. for this arbitrary 0CH(q)0\leq C\leq H(q).

In particular, for any 0C<||(q,ν)||0\leq C<||(q,\nu)|| we get

0||(q,ν)||d~ϵ((0,0),(q,ν))max(C,H(q)1ϵlim supnan(C))a.s.ϵ>00\leq||(q,\nu)||\leq\widetilde{d}^{\epsilon}((0,0),(q,\nu))\leq\max(C,H(q)-\frac{1}{\epsilon}\limsup\limits_{n\rightarrow\infty}a_{n}(C))\ \mbox{a.s.}\ \forall\epsilon>0

hence

limnminπ𝒫(0,nq)eCnρ(1nμπ,ν)=limnan(C)=0a.s.\lim_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{Cn}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=\lim_{n\rightarrow\infty}a_{n}(C)=0\ \mbox{a.s.}

On the other hand, for any ||(q,ν)||<CH(q)||(q,\nu)||<C\leq H(q),

||(q,ν)||=infϵ>0d~ϵ((0,0),(q,ν))Cinfϵ>01ϵlim infnan(C)a.s.||(q,\nu)||=\inf_{\epsilon>0}\widetilde{d}^{\epsilon}((0,0),(q,\nu))\geq C-\inf_{\epsilon>0}\frac{1}{\epsilon}\liminf\limits_{n\rightarrow\infty}a_{n}(C)\ \mbox{a.s.}

hence

lim infnminπ𝒫(0,nq)eCnρ(1nμπ,ν)=limnan(C)>0a.s.\liminf_{n\rightarrow\infty}\min\limits_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{Cn}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=\lim_{n\rightarrow\infty}a_{n}(C)>0\ \mbox{a.s.}

Equation (23) follows. ∎

Thus our approaches to grid entropy are equivalent.

Note that so far we have not made use of the coupling τe=τ(Ue)\tau_{e}=\tau(U_{e}) or of the compactness of the space of measures on [0,1][0,1]. We could have developed grid entropy in the original environment, with unnormalized empirical measures σπ\sigma_{\pi} and edge weight distribution θ\theta and target measures ν\nu on {\mathbb{R}} in the same way. Furthermore, if θ\theta has a continuous cdf then the value of the grid entropy would be the same either way, because of the duality between the environments, established in Lemma 5; that is, for any qDq\in{\mathbb{R}}^{D} and ν+\nu\in\mathcal{M}_{+} we would have

||(q,ν)||=||(q,τ(ν))||,||ν||=||τ(ν)||||(q,\nu)||=||(q,\tau_{*}(\nu))||,||\nu||=||\tau_{*}(\nu)||

where the grid entropies are developed on the environments ([0,1],UeΛ),(,τeθ)([0,1],U_{e}\sim\Lambda),({\mathbb{R}},\tau_{e}\sim\theta) respectively. Even in the general case where FθF_{\theta} may not be continuous, [Bat20, Lemma 6.15] implies that

||(q,ν)||||(q,τ(ν))||,||ν||||τ(ν)||||(q,\nu)||\leq||(q,\tau_{*}(\nu))||,||\nu||\leq||\tau_{*}(\nu)||

Again though, this is just a nice fact; for practical purposes we may simply define grid entropy on the space of measures on {\mathbb{R}} and our arguments thus far hold.

We now reap the benefits of this coupling, by describing the sets q,q,α,t,t,α\mathcal{R}^{q},\mathcal{R}^{q,\alpha},\mathcal{R}^{t},\mathcal{R}^{t,\alpha} defined in Theorem 6 and Corollary 7 in terms of direction-fixed/direction-free grid entropy.

Corollary 23.

Fix qDq\in{\mathbb{R}}^{D} and t0t\geq 0.

  1. (i)

    The deterministic set q\mathcal{R}^{q} of limits of empirical measures in direction qq whose existence is established by Theorem 6 (i) is precisely {ν+:||(q,ν)||0}\{\nu\in\mathcal{M}_{+}:||(q,\nu)||\geq 0\}. Thus

    ||(q,ν)||=\displaystyle||(q,\nu)||=-\infty limnminπ𝒫(0,nq)1ρ(1nμπ,ν)>0a.s.\displaystyle\Leftrightarrow\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}>0\ \mbox{a.s.}
    and||(q,ν)||0\displaystyle\mbox{and}\ ||(q,\nu)||\geq 0 P(π:0nqwithρ(1nμπ,ν)<ϵi.o.)=1ϵ>0\displaystyle\Leftrightarrow P\bigg{(}\exists\pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{with}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{i.o.}\bigg{)}=1\ \forall\epsilon>0

    Likewise, the deterministic set t\mathcal{R}^{t} of limits of length tt empirical measures from direction-free paths, whose existence is established by Theorem 6(ii) is precisely

    {νt:||ν||0}={νt:||(t,ν)||0}=t\{\nu\in\mathcal{M}_{t}:||\nu||\geq 0\}=\{\nu\in\mathcal{M}_{t}:||(t\ell,\nu)||\geq 0\}=\mathcal{R}^{t\ell}

    Thus

    ||ν||=\displaystyle||\nu||=-\infty limnminπ𝒫nt(0)1ρ(1nμπ,ν)>0a.s.\displaystyle\Leftrightarrow\lim_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{1}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}>0\ \mbox{a.s.}
    and||ν||0\displaystyle\mbox{and}\ ||\nu||\geq 0 P(πs.t.|π|=ntwithρ(1nμπ,ν)<ϵi.o.)=1ϵ>0\displaystyle\Leftrightarrow P\bigg{(}\exists\ \pi\ \mbox{s.t.}\ |\pi|=\lfloor nt\rfloor\ \mbox{with}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{i.o.}\bigg{)}=1\ \forall\epsilon>0
  2. (ii)

    Fix 0<CH(q)0<C\leq H(q). The set of measures with grid entropy in direction qq at least/most CC can be characterized in terms of the deterministic sets q,α\mathcal{R}^{q,\alpha} defined in Corollary 7 (i):

    q,C=0αCq,α{ν+:||(q,ν)||C}0α<Cq,α\mathcal{R}^{q,C}=\bigcap_{0\leq\alpha\leq C}\mathcal{R}^{q,\alpha}\subseteq\{\nu\in\mathcal{M}_{+}:||(q,\nu)||\geq C\}\subseteq\bigcap_{0\leq\alpha<C}\mathcal{R}^{q,\alpha}

    Thus

    ||(q,ν)||C>0\displaystyle||(q,\nu)||\geq C>0\Leftrightarrow
    P(enαpathsπ:0nqwithρ(1nμπ,ν)<ϵi.o.)=1\displaystyle\qquad P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{with}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{i.o.}\bigg{)}=1

    ϵ>0α(0,C)\forall\epsilon>0\ \forall\alpha\in(0,C), and

    ||(q,ν)||<Clim infnminπ𝒫(0,nq)enCρ(1nμπ,ν)>0a.s.||(q,\nu)||<C\Leftrightarrow\liminf_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}^{\lfloor e^{nC}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}>0\ \mbox{a.s.}

    Now fix 0<CH(t)=tlogD0<C\leq H(t\ell)=t\log D. The set of measures with length tt direction-free grid entropy at least/most CC can be characterized in terms of the deterministic sets t,α\mathcal{R}^{t,\alpha} defined in Corollary 7 (ii):

    t,C=t,C=0αCt,α{νt:||ν||C}0α<Ct,α\mathcal{R}^{t\ell,C}=\mathcal{R}^{t,C}=\bigcap_{0\leq\alpha\leq C}\mathcal{R}^{t,\alpha}\subseteq\{\nu\in\mathcal{M}_{t}:||\nu||\geq C\}\subseteq\bigcap_{0\leq\alpha<C}\mathcal{R}^{t,\alpha}

    Thus

    ||ν||C>0\displaystyle||\nu||\geq C>0\Leftrightarrow
    P(enαπ𝒫nt(0)withρ(1nμπ,ν)<ϵi.o.)=1ϵ>0\displaystyle\qquad P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})\ \mbox{with}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{i.o.}\bigg{)}=1\ \forall\epsilon>0

    α(0,C)\forall\alpha\in(0,C), and

    ||ν||<Clim infnminπ𝒫nt(0)enCρ(1nμπ,ν)>0a.s.||\nu||<C\Leftrightarrow\liminf_{n\rightarrow\infty}\min_{\pi\in\mathcal{P}_{\lfloor nt\rfloor}(\vec{0})}^{\lfloor e^{nC}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}>0\ \mbox{a.s.}
Proof.

This follows immediately from Theorem 22. ∎

Now that we have a firm grasp of what direction-fixed and direction-free grid entropies actually measure, we can move on examining what information they can give us.

4 Properties of Grid Entropy

Since direction-free entropy ||ν||||\nu|| is just ||(t,ν)||||(t\ell,\nu)|| for t:=||ν||TVt:=||\nu||_{TV} then it suffices to study the properties of direction-fixed grid entropy. We are particularly interested in what the grid entropy can tell us about qq and ν\nu when ||(q,ν)||||(q,\nu)|| is finite.

We have already established that it is necessary for qq to be in D0{\mathbb{R}}^{D}_{\geq 0} if we want ||(q,ν)||||(q,\nu)|| not to be -\infty, and that if q=0q=\vec{0} then ||(q,ν)||>||(q,\nu)||>-\infty if and only if ν=0\nu=0. The question now is what sorts of measures ν\nu are observed by the empirical measures along the direction qD0{0}q\in{\mathbb{R}}^{D}_{\geq 0}\setminus\{\vec{0}\}.

By positive-homogeneity of the norm, it suffices to consider qD0q\in{\mathbb{R}}^{D}_{\geq 0} with ||q||1=1||q||_{1}=1 and study which ν+\nu\in\mathcal{M}_{+} give rise to finite ||(q,ν)||||(q,\nu)||. In this section, it will be more convenient to work with the description of grid entropy given by Corollary 23 (ii).

Theorem 24.

Recall that we couple the edge weights with i.i.d. edge labels UeΛU_{e}\sim\Lambda (the Lebesgue measure on [0,1][0,1]) and that

H(q):=limn1nlog(#π:0nq)H(q):=\lim\limits_{n\rightarrow\infty}\frac{1}{n}\log(\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor)

Consider any qD0q\in{\mathbb{R}}^{D}_{\geq 0} with ||q||1=1||q||_{1}=1 and any ν+\nu\in\mathcal{M}_{+}. Suppose ||(q,ν)||||(q,\nu)||\neq-\infty. Then:

  1. (i)

    ν\nu is a probability measure on [0,1][0,1].

  2. (ii)

    For any GδG_{\delta} or FσF_{\sigma} set A[0,1]A\subseteq[0,1], we have

    Λ(A)ν(A)Λ(AC)ν(AC)ν(A)ν(A)ν(AC)ν(AC)e||(q,ν)||H(q)\Lambda(A)^{\nu(A)}\Lambda(A^{C})^{\nu(A^{C})}\geq\nu(A)^{\nu(A)}\nu(A^{C})^{\nu(A^{C})}e^{||(q,\nu)||-H(q)}
  3. (iii)

    From Section 2.6 the Kullback-Leibler divergence is defined as

    DKL(ν||Λ)={flogfdxwithf:=dνdx,νΛ+,otherwiseD_{KL}(\nu||\Lambda)=\begin{cases}\int f\log f\ dx\ \mbox{with}\ f:=\frac{d\nu}{dx},&\nu\ll\Lambda\\ +\infty,&\mbox{otherwise}\end{cases}

    Then

    ||(q,ν)||+DKL(ν||Λ)H(q)||(q,\nu)||+D_{KL}(\nu||\Lambda)\leq H(q)

    so in particular, ν\nu is absolutely continuous with respect to the Lebesgue measure on [0,1][0,1].

Remark.

If we take q=q=\ell and recall that H()=logDH(\ell)=\log D we get the analogous statements for length 1 direction-free grid entropy.

Remark.

(i) is exactly what we expect because the empirical measures 1nμπ\frac{1}{n}\mu_{\pi} for paths
π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor are non-negative Borel measures with total mass ||nq||n1\frac{||\lfloor nq\rfloor||}{n}\approx 1 so in the limit we expect to only observe ν\nu which are probability measures.

Remark.

Recalling Theorem 6, (i) implies that q={ν+:||(q,ν)||0}\mathcal{R}^{q}=\{\nu\in\mathcal{M}_{+}:||(q,\nu)||\geq 0\} (and hence 1=\mathcal{R}^{1}=\mathcal{R}^{\ell}) is weakly compact.

Remark.

Weaker versions of (iii) previously appeared in [Bat20] and [RAS14]. To be specific, Bates shows that DKL(ν||Λ)H(q)D_{KL}(\nu||\Lambda)\leq H(q) for νq\nu\in\mathcal{R}^{q} and it is immediate from the work of Rassoul-Agha and Seppäläinen that grid entropy ||(q,ν)||||(q,\nu)|| is upper bounded by H(q)H(q). Here (iii) combines both of these into an amazing inequality.

Remark.

If we do not make use of our coupling and define grid entropy directly on the space of measures on {\mathbb{R}}, (i)-(iii) hold analogously. This is trivial since we do not use weak compactness in the proof of this theorem.

Proof.

(i) The intuition is that if ||ν||TV1||\nu||_{TV}\neq 1 then each ρ(1nμπ,ν)\rho(\frac{1}{n}\mu_{\pi},\nu) is bounded below away from 0 hence it does not converge to 0.

Suppose ν+\nu\in\mathcal{M}_{+} is not a probability measure on [0,1][0,1]. By the reverse triangle inequality for ρ\rho, for any nn and any path π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor,

ρ(1nμπ,ν)|ρ(μπ,0)ρ(nν,0)|=|||μπ||TVn||ν||TV|=|||nq||1n||ν||TV|\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\geq|\rho(\mu_{\pi},0)-\rho(n\nu,0)|=|\ ||\mu_{\pi}||_{TV}-n||\nu||_{TV}\ |=|\ ||\lfloor nq\rfloor||_{1}-n||\nu||_{TV}\ |

Since ||nq||1n||q||1=1\frac{||\lfloor nq\rfloor||_{1}}{n}\rightarrow||q||_{1}=1 and ||ν||TV1||\nu||_{TV}\neq 1 then there is δ>0\delta>0 s.t. for large enough nn,

ρ(1nμπ,ν)1n|||nq||1n||ν||TV|>δ\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\geq\frac{1}{n}|\ ||\lfloor nq\rfloor||_{1}-n||\nu||_{TV}\ |>\delta

It follows that minπ1ρ(1nμπ,ν)↛0\min\limits_{\pi}^{1}\rho(\frac{1}{n}\mu_{\pi},\nu)\not\rightarrow 0 a.s. so ||(q,ν)||=||(q,\nu)||=-\infty. Contradiction. Therefore ν\nu must be a probability measure on [0,1][0,1].

(ii) Before proceeding with (ii) we prove a quick claim that will also help with the proof of (iii).

Claim: Fix α[0,H(q)]\alpha\in[0,H(q)] and a nonempty set of measures T+T\subseteq\mathcal{M}_{+}. Then

lim supn1nlogP(enαpathsπ:0nqs.t.1nμπT)\displaystyle\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in T\bigg{)}
H(q)α+lim supn1nlogP(1nμπT)\displaystyle\leq H(q)-\alpha+\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{\pi}\in T\bigg{)}

In particular, if α0\alpha\geq 0 and

α>H(q)+lim supn1nlogP(1nμπT)\alpha>H(q)+\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{\pi}\in T\bigg{)}

then

P(enαpathsπ:0nqs.t.1nμπTi.o.)=0P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in T\ \mbox{i.o.}\bigg{)}=0

Proof. Indeed, for any nn\in{\mathbb{N}} we have by Markov’s Inequality

P(π𝒫(0,nq)𝟏1nμπTenα)\displaystyle P\bigg{(}\sum_{\pi\in\mathcal{P}(0,\lfloor nq\rfloor)}\mathbf{1}_{\frac{1}{n}\mu_{\pi}\in T}\geq\lfloor e^{n\alpha}\rfloor\bigg{)} 1enαE[π𝒫(0,nq)𝟏1nμπT]\displaystyle\leq\frac{1}{\lfloor e^{n\alpha}\rfloor}E\bigg{[}\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}\mathbf{1}_{\frac{1}{n}\mu_{\pi}\in T}\bigg{]}
=|𝒫(0,nq)|enαP(1nμπT)\displaystyle=\frac{|\mathcal{P}(\vec{0},\lfloor nq\rfloor)|}{\lfloor e^{n\alpha}\rfloor}P\bigg{(}\frac{1}{n}\mu_{\pi}\in T\bigg{)}

Taking lim supn1nlog\limsup\limits_{n\rightarrow\infty}\frac{1}{n}\log of both sides and using the definition of H(q)H(q) we get the desired inequality.

Now suppose

α>αδ>H(q)+lim supn1nlogP(1nμπT)\alpha>\alpha-\delta>H(q)+\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{\pi}\in T\bigg{)}

Then for large enough nn,

P(π𝒫(0,nq)𝟏1nμπTenα)exp[n(H(q)α+δ+lim supn1nlogP(1nμπT))]P\bigg{(}\sum_{\pi\in\mathcal{P}(0,\lfloor nq\rfloor)}\mathbf{1}_{\frac{1}{n}\mu_{\pi}\in T}\geq\lfloor e^{n\alpha}\rfloor\bigg{)}\leq\mbox{exp}\bigg{[}n\bigg{(}H(q)-\alpha+\delta+\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{\pi}\in T\bigg{)}\bigg{)}\bigg{]}

where the exponent is a strictly negative constant multiple of nn. Hence, by the BorelCantelli Lemma,

P(enαpathsπ:0nqs.t.1nμπTi.o.)=0P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in T\ \mbox{i.o.}\bigg{)}=0

\square (Claim)


Now we begin the proof of (ii). The idea is to use the definition of the Levy-Prokhorov metric to unpack what ρ(1nμπ,ν)<ϵ\rho(\frac{1}{n}\mu_{\pi},\nu)<\epsilon means in terms of the values of the labels along π\pi.

First, consider any closed A[0,1]A\subseteq[0,1]. Observe that A=A¯=ϵ>0AϵA=\overline{A}=\bigcap\limits_{\epsilon>0}A^{\epsilon} hence by continuity from above, ν(Aϵ)ν(A)\nu(A^{\epsilon})\downarrow\nu(A) and Λ(Aϵ)Λ(A)\Lambda(A^{\epsilon})\downarrow\Lambda(A) as ϵ0+\epsilon\rightarrow 0^{+}.

We compute the probability that ρ(1nμπ,ν)<ϵ\rho(\frac{1}{n}\mu_{\pi},\nu)<\epsilon for a path π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor directly, using the definitions of ρ\rho and μπ\mu_{\pi}, and then we will apply the Claim.

Consider nn\in{\mathbb{N}}. For any path π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor, by definition of the Levy-Prokhorov metric and the empirical measure μπ\mu_{\pi},

ρ(1nμπ,ν)<ϵν(A)1nμπ(Aϵ)+ϵand1nμπ(Aϵ)ν((Aϵ)ϵ)+ϵ\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\Rightarrow\nu(A)\leq\frac{1}{n}\mu_{\pi}(A^{\epsilon})+\epsilon\ \mbox{and}\ \frac{1}{n}\mu_{\pi}(A^{\epsilon})\leq\nu((A^{\epsilon})^{\epsilon})+\epsilon
π has n(ν(A)ϵ) edge labels in Aϵ\displaystyle\Rightarrow\mbox{$\pi$ has $\geq\lfloor n(\nu(A)-\epsilon)\rfloor$ edge labels in $A^{\epsilon}$}
and n(ν(A2ϵ)+ϵ)\leq\lfloor n(\nu(A^{2\epsilon})+\epsilon)\rfloor edge labels in AϵA^{\epsilon}
π has n(ν(A)ϵ) edge labels in Aϵ\displaystyle\Rightarrow\mbox{$\pi$ has $\geq\lfloor n(\nu(A)-\epsilon)\rfloor$ edge labels in $A^{\epsilon}$}
and ||nq||1n(ν(A2ϵ)ϵ)\geq||\lfloor nq\rfloor||_{1}-\lfloor n(\nu(A^{2\epsilon})-\epsilon)\rfloor edge labels in (Aϵ)C(A^{\epsilon})^{C}

If ν(A)=0\nu(A)=0 or ν(A)=1\nu(A)=1 then we may completely omit the first/second half of the statement above as it is trivial. Otherwise, we take ϵ>0\epsilon>0 small enough so that ν(A2ϵ)+ϵ<1\nu(A^{2\epsilon})+\epsilon<1 and nn large enough so that n(ν(A2ϵ)+ϵ)<||nq||1n(\nu(A^{2\epsilon})+\epsilon)<||\lfloor nq\rfloor||_{1}. For the sake of convenience we only show the computation in the latter case; the ν(A)=0\nu(A)=0 or ν(A)=1\nu(A)=1 case is merely a simplified version of it.

Since UeU_{e} are i.i.d. Unif[0,1][0,1], then

P(ρ(1nμπ,ν)<ϵ)P(π has n(ν(A)ϵ) edge labels in Aϵ,||nq||1n(ν(A2ϵ)ϵ) labels in (Aϵ)C)(||nq||1n(ν(A)ϵ),||nq||1n(ν(A2ϵ)ϵ))Λ(Aϵ)n(ν(A)ϵ)Λ((Aϵ)C)||nq||1n(ν(A2ϵ)ϵ)\begin{split}P\bigg{(}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\bigg{)}&\leq P\bigg{(}\begin{array}[]{l}\mbox{$\pi$ has $\geq\lfloor n(\nu(A)-\epsilon)\rfloor$ edge labels in $A^{\epsilon}$,}\\ \mbox{$\geq||\lfloor nq\rfloor||_{1}-\lfloor n(\nu(A^{2\epsilon})-\epsilon)\rfloor$ labels in $(A^{\epsilon})^{C}$}\end{array}\bigg{)}\\ &\leq\binom{||\lfloor nq\rfloor||_{1}}{\lfloor n(\nu(A)-\epsilon)\rfloor,||\lfloor nq\rfloor||_{1}-\lfloor n(\nu(A^{2\epsilon})-\epsilon)\rfloor}\\ &\qquad\qquad\cdot\Lambda(A^{\epsilon})^{n(\nu(A)-\epsilon)}\Lambda((A^{\epsilon})^{C})^{||\lfloor nq\rfloor||_{1}-n(\nu(A^{2\epsilon})-\epsilon)}\end{split} (29)

by the union bound. We get the asymptotics of this binomial coefficient from Lemma 1:

(||nq||1n(ν(A)ϵ),||nq||1n(ν(A2ϵ)ϵ))=(1(ν(A)ϵ)ν(A)ϵ(1ν(A2ϵ)+ϵ)1ν(A2ϵ)+ϵ(ν(A2ϵ)ν(A))ν(A2ϵ)ν(A)+o(1))n\begin{split}&\binom{||\lfloor nq\rfloor||_{1}}{\lfloor n(\nu(A)-\epsilon)\rfloor,||\lfloor nq\rfloor||_{1}-\lfloor n(\nu(A^{2\epsilon})-\epsilon)\rfloor}\\ &=\bigg{(}\frac{1}{(\nu(A)-\epsilon)^{\nu(A)-\epsilon}(1-\nu(A^{2\epsilon})+\epsilon)^{1-\nu(A^{2\epsilon})+\epsilon}(\nu(A^{2\epsilon})-\nu(A))^{\nu(A^{2\epsilon})-\nu(A)}}+o(1)\bigg{)}^{n}\end{split} (30)

Therefore

lim supn1nlogP(ρ(1nμπ,ν)<ϵ)\displaystyle\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\bigg{)}\leq
logΛ(Aϵ)ν(A)ϵΛ((Aϵ)C)1ν(A2ϵ)+ϵ(ν(A)ϵ)ν(A)ϵ(1ν(A2ϵ)+ϵ)1ν(A2ϵ)+ϵ(ν(A2ϵ)ν(A))ν(A2ϵ)ν(A)\displaystyle\log\frac{\Lambda(A^{\epsilon})^{\nu(A)-\epsilon}\Lambda((A^{\epsilon})^{C})^{1-\nu(A^{2\epsilon})+\epsilon}}{(\nu(A)-\epsilon)^{\nu(A)-\epsilon}(1-\nu(A^{2\epsilon})+\epsilon)^{1-\nu(A^{2\epsilon})+\epsilon}(\nu(A^{2\epsilon})-\nu(A))^{\nu(A^{2\epsilon})-\nu(A)}}

Combining this with the Claim with T=Bϵ(ν)T=B_{\epsilon}(\nu), we get that

P(enαpathsπ:0nqwithρ(1nμπ,ν)<ϵi.o.)=0\displaystyle P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{with}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{i.o.}\bigg{)}=0
α>H(q)+logΛ(Aϵ)ν(A)ϵΛ((Aϵ)C)1ν(A2ϵ)+ϵ(ν(A)ϵ)ν(A)ϵ(1ν(A2ϵ)+ϵ)1ν(A2ϵ)+ϵ(ν(A2ϵ)ν(A))ν(A2ϵ)ν(A)\displaystyle\forall\alpha>H(q)+\log\frac{\Lambda(A^{\epsilon})^{\nu(A)-\epsilon}\Lambda((A^{\epsilon})^{C})^{1-\nu(A^{2\epsilon})+\epsilon}}{(\nu(A)-\epsilon)^{\nu(A)-\epsilon}(1-\nu(A^{2\epsilon})+\epsilon)^{1-\nu(A^{2\epsilon})+\epsilon}(\nu(A^{2\epsilon})-\nu(A))^{\nu(A^{2\epsilon})-\nu(A)}}

By Corollary 23 (ii), it follows that

||(q,ν)||\displaystyle||(q,\nu)||\leq
H(q)+logΛ(Aϵ)ν(A)ϵ(1Λ(Aϵ))1ν(A2ϵ)+ϵ(ν(A)ϵ)ν(A)ϵ(1ν(A2ϵ)+ϵ)1ν(A2ϵ)+ϵ(ν(A2ϵ)ν(A))ν(A2ϵ)ν(A)\displaystyle H(q)+\log\frac{\Lambda(A^{\epsilon})^{\nu(A)-\epsilon}(1-\Lambda(A^{\epsilon}))^{1-\nu(A^{2\epsilon})+\epsilon}}{(\nu(A)-\epsilon)^{\nu(A)-\epsilon}(1-\nu(A^{2\epsilon})+\epsilon)^{1-\nu(A^{2\epsilon})+\epsilon}(\nu(A^{2\epsilon})-\nu(A))^{\nu(A^{2\epsilon})-\nu(A)}}

This holds for arbitrary ϵ>0\epsilon>0. Taking ϵ0+\epsilon\rightarrow 0^{+} and using ν(Aϵ)ν(A)\nu(A^{\epsilon})\downarrow\nu(A) and Λ(Aϵ)Λ(A)\Lambda(A^{\epsilon})\downarrow\Lambda(A),

1eH(q)||(q,ν)||Λ(A)ν(A)Λ(AC)ν(AC)ν(A)ν(A)ν(AC)ν(AC)1\leq e^{H(q)-||(q,\nu)||}\frac{\Lambda(A)^{\nu(A)}\Lambda(A^{C})^{\nu(A^{C})}}{\nu(A)^{\nu(A)}\nu(A^{C})^{\nu(A^{C})}}

Therefore

Λ(A)ν(A)Λ(AC)ν(AC)ν(A)ν(A)ν(AC)ν(AC)e||(q,ν)||H(q)\Lambda(A)^{\nu(A)}\Lambda(A^{C})^{\nu(A^{C})}\geq\nu(A)^{\nu(A)}\nu(A^{C})^{\nu(A^{C})}e^{||(q,\nu)||-H(q)} (31)

This equation is symmetric in A,ACA,A^{C} hence it holds for all AA open or closed. By continuity from below/above it holds for all GδG_{\delta} and FσF_{\sigma} subsets of [0,1][0,1].

(iii) The idea is to again use the Claim, this time to compute lim supn1nlogP(ρ(1nμπ,ν)<ϵ)\limsup\limits_{n\rightarrow\infty}\frac{1}{n}\log P(\rho(\frac{1}{n}\mu_{\pi},\nu)<\epsilon) using Sanov’s Theorem.

Fix ϵ>0\epsilon>0. By Sanov’s Theorem (Theorem 9), since B2ϵ¯(ν)\overline{B_{2\epsilon}}(\nu) is weakly closed then

lim supn1nlogP(1|π|μπB2ϵ¯(ν))infξB2ϵ¯(ν)DKL(ξ||Λ)\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{|\pi|}\mu_{\pi}\in\overline{B_{2\epsilon}}(\nu)\bigg{)}\leq-\inf_{\xi\in\overline{B_{2\epsilon}}(\nu)}D_{KL}(\xi||\Lambda)

where the right hand side might a priori be -\infty.

Since ρ(1|π|μπ,1nμπ)|1||nq||1n|0\rho(\frac{1}{|\pi|}\mu_{\pi},\frac{1}{n}\mu_{\pi})\leq|1-\frac{||\lfloor nq\rfloor||_{1}}{n}|\rightarrow 0 as nn\rightarrow\infty, it follows that

lim supn1nlogP(1nμπBϵ(ν))lim supn1nlogP(1|π|μπB2ϵ¯(ν))infξB2ϵ¯(ν)DKL(ξ||Λ)\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{\pi}\in B_{\epsilon}(\nu)\bigg{)}\leq\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{|\pi|}\mu_{\pi}\in\overline{B_{2\epsilon}}(\nu)\bigg{)}\leq-\inf_{\xi\in\overline{B_{2\epsilon}}(\nu)}D_{KL}(\xi||\Lambda)

By the Claim from (ii) with T=Bϵ(ν)T=B_{\epsilon}(\nu),

P(enαpathsπ:0nqwithρ(1nμπ,ν)<ϵi.o.)=0α>H(q)infξB2ϵ¯(ν)DKL(ξ||Λ)P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{with}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{i.o.}\bigg{)}=0\ \forall\alpha>H(q)-\inf_{\xi\in\overline{B_{2\epsilon}}(\nu)}D_{KL}(\xi||\Lambda)

Hence, by Corollary 23 (ii),

||(q,ν)||H(q)infξB2ϵ¯(ν)DKL(ξ||Λ)||(q,\nu)||\leq H(q)-\inf_{\xi\in\overline{B_{2\epsilon}}(\nu)}D_{KL}(\xi||\Lambda)

Take ϵ=1k0\epsilon=\frac{1}{k}\downarrow 0 and let ξk=argminξB2k¯(ν)DKL(ξ||Λ)\xi_{k}=\operatorname*{arg\,min}\limits_{\xi\in\overline{B_{\frac{2}{k}}}(\nu)}D_{KL}(\xi||\Lambda) (the infimum over the weakly closed B2k¯(ν)\overline{B_{\frac{2}{k}}}(\nu) is achieved since relative entropy is lower semicontinuous). Then ξkν\xi_{k}\Rightarrow\nu hence

DKL(ν||Λ)+||(q,ν)||lim infkDKL(ξk||Λ)+||(q,ν)||H(q)D_{KL}(\nu||\Lambda)+||(q,\nu)||\leq\liminf_{k\rightarrow\infty}D_{KL}(\xi_{k}||\Lambda)+||(q,\nu)||\leq H(q)

as desired.

Finally, recall that νq\nu\in\mathcal{R}^{q} hence ||(q,ν)||0||(q,\nu)||\geq 0 so DKL(ν||Λ)D_{KL}(\nu||\Lambda) is finite and νΛ\nu\ll\Lambda. ∎

It is not clear in general whether any probability measure ν\nu satisfying DKL(ν||Λ)H(q)D_{KL}(\nu||\Lambda)\leq H(q) results in finite grid entropies ||(q,ν)||||(q,\nu)|| for qD0,||q||1=1q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=1.

In one specific case, however, this does hold trivially. When qq is a permutation of
(1,0,,0)(1,0,\ldots,0), so that there is exactly one NE path 0nq\vec{0}\rightarrow nq, we get

||(q,ν)||>DKL(ν||Λ)=0ν=Λ||(q,\nu)||>-\infty\Leftrightarrow D_{KL}(\nu||\Lambda)=0\Leftrightarrow\nu=\Lambda

with the value of ||(q,Λ)||||(q,\Lambda)|| being 0. Note that this completely covers the D=1D=1 case.


Also, in general we cannot explicitly compute grid entropy. But when the target measure ν\nu is Λ\Lambda, we can easily show that grid entropy is maximal. This makes sense, since the Glivenko-Cantelli Theorem says we expect that most empirical measures observed are very close to the edge label distribution.

Corollary 25.

Let qD0{0}q\in{\mathbb{R}}^{D}_{\geq 0}\setminus\{\vec{0}\}. Then ||(q,qΛ)||=H(q)||(q,q\Lambda)||=H(q).

Proof.

Since H(q)H(q) and ||(q,ν)||||(q,\nu)|| are positive-homogeneous (in the sense that H(cq)=cH(q)H(cq)=cH(q) and ||(cq,cν)||=c||(q,ν)||c>0||(cq,c\nu)||=c||(q,\nu)||\ \forall c>0), we may without loss of generality assume ||q||1=1||q||_{1}=1.

Suppose ||(q,Λ)||<H(q)||(q,\Lambda)||<H(q). Consider any ϵ>0\epsilon>0. By Sanov’s Theorem (Theorem 9),

lim supn1nlogP(1nμπBϵ(Λ)C)infξBϵ(Λ)CDKL(ξ||Λ)\limsup_{n\rightarrow\infty}\frac{1}{n}\log P\bigg{(}\frac{1}{n}\mu_{\pi}\in B_{\epsilon}(\Lambda)^{C}\bigg{)}\leq-\inf_{\xi\in B_{\epsilon}(\Lambda)^{C}}D_{KL}(\xi||\Lambda) (32)

where Bϵ(Λ)CB_{\epsilon}(\Lambda)^{C} is weakly closed and relative entropy is lower semicontinuous hence the infimum is achieved by some ξBϵ(Λ)C\xi\in B_{\epsilon}(\Lambda)^{C}. But ξΛ\xi\neq\Lambda implies DKL(ξ||Λ)<0-D_{KL}(\xi||\Lambda)<0. Fix any

max(||(q,Λ)||,H(q)DKL(ξ||Λ))<α<H(q)\max(||(q,\Lambda)||,H(q)-D_{KL}(\xi||\Lambda))<\alpha<H(q)

By the Claim from the proof of (ii) in the previous theorem with T=Bϵ(Λ)CT=B_{\epsilon}(\Lambda)^{C}, we have

P(enαpathsπ:0nqs.t.ρ(1nμπ,ν)ϵi.o.)=0P\bigg{(}\exists\lceil e^{n\alpha}\rceil\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\geq\epsilon\ \mbox{i.o.}\bigg{)}=0

hence

P((#π:0nq)enαpathsπ:0nqs.t.ρ(1nμπ,ν)<ϵeventually)=1P\bigg{(}\exists(\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor)-\lceil e^{n\alpha}\rceil\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{eventually}\bigg{)}=1

But (#π:0nq)enαenα(\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor)-\lceil e^{n\alpha}\rceil\geq\lfloor e^{n\alpha}\rfloor for large enough nn since α<H(q)\alpha<H(q), so

P(enαpathsπ:0nqs.t.ρ(1nμπ,ν)<ϵeventually)=1P\bigg{(}\exists\lfloor e^{n\alpha}\rfloor\ \mbox{paths}\ \pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}<\epsilon\ \mbox{eventually}\bigg{)}=1

for this arbitrary ϵ>0\epsilon>0, hence α||(q,Λ)||\alpha\geq||(q,\Lambda)||. Contradiction. ∎

Continuity is another property of interest. It turns out that grid entropy is upper semicontinuous in its argument(s); this follows immediately from (3) and the lower semicontinuity of relative entropy. However, we have not yet proved (3) so we prove a slightly weaker statement using the tools developed in this paper. We also note that grid entropy is lower semicontinuous (hence fully continuous) in some cases.

Note that here we no longer restrict ourselves to the case when ||q||1=1||q||_{1}=1.

Theorem 26.

Let (q,ν),(qk,νk)D×+(q,\nu),(q^{k},\nu^{k})\in{\mathbb{R}}^{D}\times\mathcal{M}_{+} s.t. qkL1q,νkνq^{k}\rightarrow^{L^{1}}q,\nu^{k}\Rightarrow\nu.

  1. (i)

    If either q=0q=\vec{0} or for large enough kk we have qqkD0q-q^{k}\in{\mathbb{R}}^{D}_{\geq 0} or for large enough kk we have (qk)i=0(q^{k})_{i}=0 for all ii s.t. qi=0q_{i}=0 then

    lim supk||(qk,νk)||||(q,ν)||\limsup_{k\rightarrow\infty}||(q^{k},\nu^{k})||\leq||(q,\nu)||
  2. (ii)

    If ||(qkq,νkν)||||(q^{k}-q,\nu^{k}-\nu)||\neq-\infty, for large enough kk then

    ||(q,ν)||lim infk||(qk,νk)||||(q,\nu)||\leq\liminf_{k\rightarrow\infty}||(q^{k},\nu^{k})||
  3. (iii)

    If there exist constants Ck>0C_{k}>0 s.t. Ck0C_{k}\rightarrow 0 and s.t. for large enough kk,

    ||(qqqkCk,νννkCk)||\bigg{|}\bigg{|}\bigg{(}q-\frac{q-q^{k}}{C_{k}},\nu-\frac{\nu-\nu^{k}}{C_{k}}\bigg{)}\bigg{|}\bigg{|}\neq-\infty

    then

    ||(q,ν)||lim infk||(qk,νk)||||(q,\nu)||\leq\liminf_{k\rightarrow\infty}||(q^{k},\nu^{k})||
Remark.

In the case when qk=qq^{k}=q for large enough kk, (i) shows that direction-fixed grid entropy ||(q,ν)||||(q,\nu)|| is always upper semicontinuous in ν\nu. Taking qk=||νk||TV||ν||TVq^{k}=||\nu^{k}||_{TV}\ell\rightarrow||\nu||_{TV}\ell we get that

lim supk||νk||=lim supk||(||νk||TV,νk)||||(||ν||TV,ν)||=||ν||\limsup_{k\rightarrow\infty}||\nu^{k}||=\limsup_{k\rightarrow\infty}||(||\nu^{k}||_{TV}\ell,\nu^{k})||\leq||(||\nu||_{TV}\ell,\nu)||=||\nu||

so direction-free grid entropy is upper semicontinuous.

Remark.

In the assumptions in (ii) and (iii) it is implicit that the parameters lie in
D0×+{\mathbb{R}}^{D}_{\geq 0}\times\mathcal{M}_{+} because otherwise the grid entropy would be -\infty.

Remark.

The conditions in (i), (ii), (iii) are not mutually exclusive so for some sequences we may combine the statements to obtain

||(q,ν)||=limk||(qk,νk)||||(q,\nu)||=\lim_{k\rightarrow\infty}||(q^{k},\nu^{k})||
Proof.

For all three parts, we may assume the conditions hold for all kk\in{\mathbb{N}}.
(i) Case 1: q=0q=\vec{0}
First suppose ν=0\nu=0 so that ||(q,ν)||=0||(q,\nu)||=0. By Theorem 22,

||(qk,νk)||{}[0,H(qk)]k||(q^{k},\nu^{k})||\in\{-\infty\}\cup[0,H(q^{k})]\ \forall k

hence

lim supk||(qk,νk)||\displaystyle\limsup_{k\rightarrow\infty}||(q^{k},\nu^{k})|| lim supkH(qk)\displaystyle\leq\limsup_{k\rightarrow\infty}H(q^{k})
=lim supk||qk||1log||qk||1i=1D(qk)ilog(qk)i\displaystyle=\limsup_{k\rightarrow\infty}||q^{k}||_{1}\log||q^{k}||_{1}-\sum_{i=1}^{D}(q^{k})_{i}\log(q^{k})_{i}
=0\displaystyle=0

since ||qkq||1=||qk||10||q^{k}-q||_{1}=||q^{k}||_{1}\rightarrow 0.

On the other hand, if ν0\nu\neq 0 then for large enough kk we have ||qk||1||νk||TV||q^{k}||_{1}\neq||\nu^{k}||_{TV} (because otherwise ||νk||TV0||\nu^{k}||_{TV}\rightarrow 0 and hence ν\nu, the weak limit of the νk\nu^{k}, would have to be 0). But then

lim supk||(qk,νk)||==||(q,ν)||\limsup_{k\rightarrow\infty}||(q^{k},\nu^{k})||=-\infty=||(q,\nu)||

by Theorem 24 (i).

Case 2: qqkD0kq-q^{k}\in{\mathbb{R}}^{D}_{\geq 0}\ \forall k
Recall that the infimum of a family of upper semicontinuous functions is upper semicontinuous. Thus it suffices to fix ϵ>0\epsilon>0 and show that

lim supkd~ϵ((0,0),(qk,νk))d~ϵ((0,0),(q,ν))a.s.\limsup_{k\rightarrow\infty}\widetilde{d}^{\epsilon}((\vec{0},0),(q^{k},\nu^{k}))\leq\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))\ \mbox{a.s.} (33)

But for every k,nk,n\in{\mathbb{N}} we have by Lemma 15 and the fact that qqkD0q-q^{k}\in{\mathbb{R}}^{D}_{\geq 0},

dϵ((0,0),(nqk,nνk))n1ϵ(1n||nqnqk||1+ρ(ν,νk))dϵ((0,0),(nqk,nνk))n\frac{d^{\epsilon}((\vec{0},0),(nq^{k},n\nu^{k}))}{n}-\frac{1}{\epsilon}\bigg{(}\frac{1}{n}||\lfloor nq\rfloor-\lfloor nq^{k}\rfloor||_{1}+\rho(\nu,\nu^{k})\bigg{)}\leq\frac{d^{\epsilon}((\vec{0},0),(nq^{k},n\nu^{k}))}{n}

For a fixed kk, taking nn\rightarrow\infty we get

d~ϵ((0,0),(qk,νk))1ϵ(||qqk||1+ρ(ν,νk))d~ϵ((0,0),(q,ν))a.s.\widetilde{d}^{\epsilon}((\vec{0},0),(q^{k},\nu^{k}))-\frac{1}{\epsilon}(||q-q^{k}||_{1}+\rho(\nu,\nu^{k}))\leq\widetilde{d}^{\epsilon}((\vec{0},0),(q,\nu))\ \mbox{a.s.}

(33) follows immediately.

Case 3: (qk)i=0(q^{k})_{i}=0 whenever qi=0q_{i}=0 for all kk\in{\mathbb{N}}
We may assume q0q\neq\vec{0} (because this scenario was covered by Case 1). In particular, we may assume without loss of generality that qk0kq^{k}\neq\vec{0}\ \forall k.

To show

lim supk||(qk,νk)||||(q,ν)||\limsup_{k\rightarrow\infty}||(q^{k},\nu^{k})||\leq||(q,\nu)||

it suffices to prove this when ||(qk,νk)||0k||(q^{k},\nu^{k})||\geq 0\ \forall k. In particular, this implies that
(qk)i0k,1iD(q^{k})_{i}\geq 0\ \forall k\in{\mathbb{N}},1\leq i\leq D and hence qi01iDq_{i}\geq 0\ \forall 1\leq i\leq D.

Now, the trick is to rescale the qkq^{k} by a factor Bk1B_{k}\geq 1 converging to 1 in kk s.t. we land back in Case 2. More precisely, define

Bk:=max(max1iDs.t.(qk)i0qi(qk)i,max1iDs.t.qi0(qk)iqi)B_{k}:=\max\bigg{(}\max_{1\leq i\leq D\ \mbox{s.t.}\ (q^{k})_{i}\neq 0}\frac{q_{i}}{(q^{k})_{i}},\max_{1\leq i\leq D\ \mbox{s.t.}\ q_{i}\neq 0}\frac{(q^{k})_{i}}{q_{i}}\bigg{)}

Note that Bk0B_{k}\geq 0 because qi,(qk)i0q_{i},(q^{k})_{i}\geq 0. Since qk0q^{k}\neq\vec{0} then there is some ii s.t. (qk)i>0(q^{k})_{i}>0 so by the assumption of Case 3, qi>0q_{i}>0. Thus Bk1B_{k}\geq 1.

Furthermore, ||qqk||10||q-q^{k}||_{1}\rightarrow 0 implies Bk1B_{k}\rightarrow 1. In particular, 1BkqkL1q,1Bkνkν\frac{1}{B_{k}}q_{k}\rightarrow^{L^{1}}q,\frac{1}{B_{k}}\nu^{k}\Rightarrow\nu. By the choice of BkB_{k} we have for all 1iD1\leq i\leq D, either qi=0q_{i}=0 hence Bkqi=0=(qk)iB_{k}q_{i}=0=(q^{k})_{i} or qi0q_{i}\neq 0 so Bkqi(qk)iB_{k}q_{i}\geq(q^{k})_{i}. Thus

BkqqkD0q1BkqkD0B_{k}q-q^{k}\in{\mathbb{R}}^{D}_{\geq 0}\Rightarrow q-\frac{1}{B_{k}}q^{k}\in{\mathbb{R}}^{D}_{\geq 0}

By Case 2 and positive-homogeneity of grid entropy,

lim supk||(qk,νk)||=lim supk||(1Bkqk,1Bkνk)||||(q,ν)||\limsup_{k\rightarrow\infty}||(q^{k},\nu^{k})||=\limsup_{k\rightarrow\infty}\bigg{|}\bigg{|}\bigg{(}\frac{1}{B_{k}}q^{k},\frac{1}{B_{k}}\nu^{k}\bigg{)}\bigg{|}\bigg{|}\leq||(q,\nu)||

as desired.

(ii) By the triangle inequality for a directed norm with negative sign,

||(qk,νk)||||(q,ν)||+||(qkq,νkν)||||(q,ν)||k||(q^{k},\nu^{k})||\geq||(q,\nu)||+||(q^{k}-q,\nu^{k}-\nu)||\geq||(q,\nu)||\ \forall k

since ||(qkq,νkν)||||(q^{k}-q,\nu^{k}-\nu)||\neq-\infty. Therefore

lim infk||(qk,νk)||||(q,ν)||\liminf_{k\rightarrow\infty}||(q^{k},\nu^{k})||\geq||(q,\nu)||

(iii) Now suppose

||(qqqkCk,νννkCk)||0andCk0k,Ck0\bigg{|}\bigg{|}\bigg{(}q-\frac{q-q^{k}}{C_{k}},\nu-\frac{\nu-\nu^{k}}{C_{k}}\bigg{)}\bigg{|}\bigg{|}\geq 0\ \mbox{and}\ C_{k}\geq 0\ \forall k,C_{k}\rightarrow 0

By the reverse triangle inequality and positive-homogeneity,

(1Ck)||(q,ν)||\displaystyle(1-C_{k})||(q,\nu)|| =||(qk(Ckq(qqk)),νk(Ckν(ννk)))||\displaystyle=||(q^{k}-(C_{k}q-(q-q^{k})),\nu^{k}-(C_{k}\nu-(\nu-\nu^{k})))||
||(qk,νk)||Ck||(qqqkCk,νννkCk)||\displaystyle\leq||(q^{k},\nu^{k})||-C_{k}\bigg{|}\bigg{|}\bigg{(}q-\frac{q-q^{k}}{C_{k}},\nu-\frac{\nu-\nu^{k}}{C_{k}}\bigg{)}\bigg{|}\bigg{|}
||(qk,νk)||\displaystyle\leq||(q^{k},\nu^{k})||

But Ck0C_{k}\rightarrow 0 hence

||(q,ν)||lim infk||(qk,νk)||||(q,\nu)||\leq\liminf_{k\rightarrow\infty}||(q^{k},\nu^{k})||

In Lemma 20 we made a simple observation that is worth mentioning explicitly: ||(q,ν)||||(q,\nu)|| is invariant under permutations of the coordinates of qq because of the symmetry of the grid lattice D{\mathbb{Z}}^{D}.

A last property we are interested in is the convexity of q,t\mathcal{R}^{q},\mathcal{R}^{t}.

Lemma 27.

Let qD0,t0q\in{\mathbb{R}}^{D}_{\geq 0},t\geq 0. The sets q={(q,ν)D0×+:||(q,ν)||0}\mathcal{R}^{q}=\{(q,\nu)\in{\mathbb{R}}^{D}_{\geq 0}\times\mathcal{M}_{+}:||(q,\nu)||\geq 0\}, t={νt:||ν||0}\mathcal{R}^{t}=\{\nu\in\mathcal{M}_{t}:||\nu||\geq 0\} are convex.

Proof.

Grid entropy is concave in by positive-homogeneity and by the reverse triangle-inequality it satisfies. ∎

5 Application: Directed Polymers

We now apply our results to give variational formulas for the point-to-point/point-to-level Gibbs Free Energy in terms of direction-fixed/direction-free grid entropy and last passage time. As mentioned, these variational formulas have previously appeared in [RAS14], [GRAS16]. Our aim is to arrive at these from the framework presented in this paper with the hope of providing fresh insights.

5.1 Setup and Known Results

Recall that the measurable function τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}} determines the edge weights
τe=τ(Ue)\tau_{e}=\tau(U_{e}). We will be dealing with the Gibbs Free Energy so to make our life easy we restrict ourselves in Section 5 to τ\tau bounded and measurable. This could be extended to measurable but possibly unbounded τ\tau by truncating τ\tau at constants C>0C>0 and taking a supremum over C>0C>0.

Define the last passage time between points p,qDp,q\in{\mathbb{R}}^{D} to be

T(p,q)=supπ:pqT(π)=supπ:pqeπτ(Ue)T(p,q)=\sup_{\pi:\lfloor p\rfloor\rightarrow\lfloor q\rfloor}T(\pi)=\sup_{\pi:\lfloor p\rfloor\rightarrow\lfloor q\rfloor}\sum_{e\in\pi}\tau(U_{e})

where this is -\infty if qpD0q-p\notin{\mathbb{R}}^{D}_{\geq 0}. A standard result (see [Mar04] for example) is the existence of a last passage time constant, or in other words a deterministic limiting value for the last passage time in a fixed direction qD0q\in{\mathbb{R}}^{D}_{\geq 0}.

Theorem 28.

Let qD0q\in{\mathbb{R}}^{D}_{\geq 0}. Then there is a τ\tau-dependent time constant λq(τ)\lambda_{q}(\tau)\in{\mathbb{R}} s.t.

T(0,nq)nλq(τ)\frac{T(\vec{0},nq)}{n}\rightarrow\lambda_{q}(\tau)

λq\lambda_{q} is homogeneous and concave in qq.

Remark.

As was the case with grid entropy, the symmetries of the grid and concavity of passage time imply that the time constant is maximized along the direction =(1D,,1D)\ell=(\frac{1}{D},\ldots,\frac{1}{D}):

supqD,||q||1=1λq=λ\sup\limits_{q\in{\mathbb{R}}^{D},||q||_{1}=1}\lambda_{q}=\lambda_{\ell}
Remark.

Thus far the time constant is only known to be explicitly computable when the underlying distribution θ\theta is exponential.

It is useful to view last passage time as a linear functional. For any NE path π:npnq\pi:\lfloor np\rfloor\rightarrow\lfloor nq\rfloor, note that

T(π)=eπτ(Ue)=01τ(u)d(μπ):=τ,μπsoT(π)n=τ,1nμπT(\pi)=\sum_{e\in\pi}\tau(U_{e})=\int_{0}^{1}\tau(u)d(\mu_{\pi}):=\langle\tau,\mu_{\pi}\rangle\ \mbox{so}\ \frac{T(\pi)}{n}=\bigg{\langle}\tau,\frac{1}{n}\mu_{\pi}\bigg{\rangle}

where f,ν\langle f,\nu\rangle denotes the linear functional that is integration of a measurable function
f:[0,1]f:[0,1]\rightarrow{\mathbb{R}} against a measure ν+\nu\in\mathcal{M}_{+}. In this language, weak convergence νkν\nu_{k}\Rightarrow\nu is equivalent to

limkf,νk=f,νbounded, continuous functions f\lim_{k\rightarrow\infty}\langle f,\nu_{k}\rangle=\langle f,\nu\rangle\ \forall\ \mbox{bounded, continuous functions $f$}

Of course, τ\tau is likely not continuous. However, recalling [Bat20, Lemma 6.15] , we see that there is an a.s. event on which 1nμπn\frac{1}{n}\mu_{\pi_{n}} converging weakly to ν\nu implies (τ)(1nμπn)(\tau)_{\ast}(\frac{1}{n}\mu_{\pi_{n}}) converges weakly to (τ)(ν)(\tau)_{\ast}(\nu) hence

τ,1nμπn=Id,(τ)(1nμπn)Id,(τ)(ν)=τ,ν\bigg{\langle}\tau,\frac{1}{n}\mu_{\pi_{n}}\bigg{\rangle}=\bigg{\langle}Id,(\tau)_{\ast}\bigg{(}\frac{1}{n}\mu_{\pi_{n}}\bigg{)}\bigg{\rangle}\rightarrow\bigg{\langle}Id,(\tau)_{\ast}(\nu)\bigg{\rangle}=\langle\tau,\nu\rangle

where Id(x)=xId(x)=x is the identity function on [0,1].

By using the compactness of q\mathcal{R}^{q} and Theorem 6, Bates establishes a variational formula for the direction-fixed limit shape. See [Bat20, Theorem 2.1] for details. We restate this theorem here in our LPP setting, which can be proved by tweaking Bates’ original argument. We give our own proof in the next section.

Theorem 29.

Let qD0q\in{\mathbb{R}}^{D}_{\geq 0}. Recall that Theorem 6 gives a deterministic weakly closed set q\mathcal{R}^{q} which a.s. consists of all weak limits of subsequences of μπ\mu_{\pi} for paths π\pi of increasing length. Then for any measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}, the corresponding limit shape is given by

λq(τ)=supνqτ,ν\lambda_{q}(\tau)=\sup_{\nu\in\mathcal{R}^{q}}\langle\tau,\nu\rangle

The direction-free analogue holds as well.

Remark.

Since q\mathcal{R}^{q} is weakly closed and nonempty (because ||q||1Λq||q||_{1}\Lambda\in\mathcal{R}^{q} from before), then the set of maximizers is nonempty:

qτ:={νq:τ,ν=λq(τ)}\mathcal{R}^{q}_{\tau}:=\{\nu\in\mathcal{R}^{q}:\langle\tau,\nu\rangle=\lambda_{q}(\tau)\}\neq\emptyset

As Bates remarks, when qτ\mathcal{R}^{q}_{\tau} is a singleton, empirical measures along geodesics converge weakly to this unique maximizer (or analogously unique minimizer in FPP), answering in the affirmative Hoffman’s question. It is further pointed out that qτ\mathcal{R}^{q}_{\tau} is a singleton for a dense family of functions τ\tau. See [Bat20] for more details.

Remark.

Noting that the limit shape is the zero temperature analogue of β\beta-Gibbs Free Energy, these direction-fixed/direction-free variational formulas for the limit shape are also given as (7.6), (7.5) in [GRAS16].

Finally, let us recall the directed polymer model, still restricting ourselves to the LPP on D{\mathbb{R}}^{D} setting.

Definition.

Fix a direction qD0q\in{\mathbb{R}}^{D}_{\geq 0}, an inverse temperature β>0\beta>0 and a bounded measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}. The point-to-point β\beta-partition function in direction qq is defined to be

Zn,qβ=π𝒫(0,nq)eβT(π)Z_{n,q}^{\beta}=\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}e^{\beta T(\pi)}

The corresponding point-to-point β\beta-polymer measure on the set of paths π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor is

ρn,qβ(dπ)=1Zn,qβeβT(π)\rho_{n,q}^{\beta}(d\pi)=\frac{1}{Z_{n,q}^{\beta}}e^{\beta T(\pi)}

and the corresponding point-to-point β\beta- Gibbs Free Energy is defined to be

Gqβ=limn1nlogZn,qβG_{q}^{\beta}=\lim_{n\rightarrow\infty}\frac{1}{n}\log Z_{n,q}^{\beta}

Suppressing the direction qq we get point-to-level analogues (to the ”level” x1++xD=nx_{1}+\ldots+x_{D}=n):

Znβ=π𝒫n(0)eβT(π),ρnβ(dπ)=1ZnβeβT(π),Gβ=limn1nlogZnβZ_{n}^{\beta}=\sum_{\pi\in\mathcal{P}_{n}(\vec{0})}e^{\beta T(\pi)},\ \rho_{n}^{\beta}(d\pi)=\frac{1}{Z_{n}^{\beta}}e^{\beta T(\pi)},\ G^{\beta}=\lim_{n\rightarrow\infty}\frac{1}{n}\log Z_{n}^{\beta}

5.2 Variational Formulas for PTP/PTL Gibbs Free Energies

We derive a formula for the point-to-point Gibbs Free Energy as the supremum over measures ν\nu of the sum of the integral βτ,ν\beta\langle\tau,\nu\rangle and grid entropy ||(q,ν)||||(q,\nu)||. Since direction-free grid entropy is just grid entropy in the \ell direction, then we arrive at a similar formula for the point-to-level Gibbs Free Energy.

Theorem 30.

Fix an inverse temperature β>0\beta>0 and a bounded measurable
τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}.

  1. (i)

    Fix qD0q\in{\mathbb{R}}^{D}_{\geq 0}. The point-to-point Gibbs Free Energy is given by

    Gqβ(τ)=supν+[βτ,ν+||(q,ν)||]a.s.G_{q}^{\beta}(\tau)=\sup_{\nu\in\mathcal{M}_{+}}[\beta\langle\tau,\nu\rangle+||(q,\nu)||]\ \mbox{a.s.}

    Moreover, this supremum is achieved by some νq\nu\in\mathcal{R}^{q}.

  2. (ii)

    The point-to-level Gibbs Free Energy is given by

    Gβ(τ)=supν1[βτ,ν+||ν||]=Gβ=supqD0,||q||1=1Gβqa.s.G^{\beta}(\tau)=\sup_{\nu\in\mathcal{M}_{1}}[\beta\langle\tau,\nu\rangle+||\nu||]=G_{\ell}^{\beta}=\sup_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=1}G^{\beta}_{q}\ \mbox{a.s.}
Remark.

These entropy variational formulas appear as (7.12), (7.10) in [GRAS16] respectively, with a different normalization. The point-to-point formula originally appears as (5.4) in [RAS14].

Remark.

If τ\tau is non-negative then the variational formulas are trivially positive-homogeneous.

Remark.

Again, we may extend these formulas to a supremum over all measurable τ\tau by truncating τ\tau in τ,ν\langle\tau,\nu\rangle at some C>0C>0 and then taking a supremum over C>0C>0 of the variational formula.

We require a short lemma in order to prove our variational formula. The main technical difficulty is that each direction-qq grid entropy is an almost sure limit:

||(q,ν)||=infϵ>0limndϵ((0,0),(nq,nν))na.s.||(q,\nu)||=\inf_{\epsilon>0}\lim_{n\rightarrow\infty}\frac{d^{\epsilon}((\vec{0},0),(nq,n\nu))}{n}\ \mbox{a.s.}

but we are now dealing with a whole space of measures, which is uncountable. Thus we need to assume these limits exist for a countable dense family of ν\nu’s and approximate the sums of eβT(π)e^{\beta T(\pi)} over paths with empirical measure in ϵ\epsilon-balls.

Lemma 31.

Fix qD0,χ+,γ>0q\in{\mathbb{R}}^{D}_{\geq 0},\chi\in\mathcal{M}_{+},\gamma>0 and mm\in{\mathbb{N}}. For any nn\in{\mathbb{N}},

log(π:0nqs.t.1nμπB1m(χ)eβT(π))\displaystyle\log\bigg{(}\sum_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in B_{\frac{1}{m}}(\chi)}e^{\beta T(\pi)}\bigg{)}
βn(supξB1m(χ)τ,ξ)+n1/mγ+dγ((0,0),(nq,nχ))\displaystyle\leq\beta n\bigg{(}\sup_{\xi\in B_{\frac{1}{m}}(\chi)}\langle\tau,\xi\rangle\bigg{)}+n\frac{1/m}{\gamma}+d^{\gamma}((\vec{0},0),(nq,n\chi))
Proof.

Let us look closer at this sum. Observe that we can upper bound the indicator of the set we are summing over by a smooth function:

𝟙{1nμπB1m(χ)}exp(n1/mγ)exp(nγρ(1nμπ,χ))\mathds{1}\bigg{\{}\frac{1}{n}\mu_{\pi}\in B_{\frac{1}{m}}(\chi)\bigg{\}}\leq\mbox{exp}\bigg{(}n\frac{1/m}{\gamma}\bigg{)}\mbox{exp}\bigg{(}-\frac{n}{\gamma}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\chi\bigg{)}\bigg{)}

Thus

log(πs.t.1nμπB1m(χ)eβT(π))\displaystyle\log\bigg{(}\sum_{\pi\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in B_{\frac{1}{m}}(\chi)}e^{\beta T(\pi)}\bigg{)}
β(supπs.t.1nμπB1m(χ)T(π))+log(πs.t.1nμπB1m(χ)1)\displaystyle\leq\beta\bigg{(}\sup_{\pi\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in B_{\frac{1}{m}}(\chi)}T(\pi)\bigg{)}+\log\bigg{(}\sum_{\pi\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in B_{\frac{1}{m}}(\chi)}1\bigg{)}
βn(supπs.t.1nμπB1m(χ)τ,1nμπ)+log[exp(n1/mγ)π𝒫(0,nq)exp(nγρ(1nμπ,χ))]\displaystyle\leq\beta n\bigg{(}\sup_{\pi\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\in B_{\frac{1}{m}}(\chi)}\langle\tau,\frac{1}{n}\mu_{\pi}\rangle\bigg{)}+\log\bigg{[}\mbox{exp}\bigg{(}n\frac{1/m}{\gamma}\bigg{)}\sum_{\pi\in\mathcal{P}(\vec{0},\lfloor nq\rfloor)}\mbox{exp}\bigg{(}-\frac{n}{\gamma}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\chi\bigg{)}\bigg{)}\bigg{]}
βn(supξB1m(χ)τ,ξ)+n1/mγ+dγ((0,0),(nq,nχ))\displaystyle\leq\beta n\bigg{(}\sup_{\xi\in B_{\frac{1}{m}}(\chi)}\langle\tau,\xi\rangle\bigg{)}+n\frac{1/m}{\gamma}+d^{\gamma}((\vec{0},0),(nq,n\chi))

Proof of Theorem 30.

We focus on showing (i). The proof of the direction-free case (ii) is analogous.

(i) Note that if q=0q=\vec{0} then Zn,qβ=1Z_{n,q}^{\beta}=1 hence the variational formula holds trivially. Thus we may assume ||q||1>0||q||_{1}>0.

Recall that the Levy-Prokhorov metric metrizes weak convergence of measures and that {ξ+:12||q|1||ξ||TV32||q||1}:=S\{\xi\in\mathcal{M}_{+}:\frac{1}{2}||q|_{1}\leq||\xi||_{TV}\leq\frac{3}{2}||q||_{1}\}:=S is weakly compact. It follows that βτ,\beta\langle\tau,\cdot\rangle is uniformly continuous on SS. More precisely,

ϵ>0δ>0s.t.ξ,ξS,ρ(ξ,ξ)<δ|βτ,ξβτ,ξ|<ϵ\forall\epsilon>0\ \exists\delta>0\ \mbox{s.t.}\ \forall\xi,\xi^{\prime}\in S,\ \rho(\xi,\xi^{\prime})<\delta\Rightarrow|\beta\langle\tau,\xi\rangle-\beta\langle\tau,\xi^{\prime}\rangle|<\epsilon (34)

We split the argument into two claims.

Claim 1:

limn1nlogZn,qβsupν+[βτ,ν+||(q,ν)||]a.s.\lim_{n\rightarrow\infty}\frac{1}{n}\log Z_{n,q}^{\beta}\geq\sup_{\nu\in\mathcal{M}_{+}}[\beta\langle\tau,\nu\rangle+||(q,\nu)||]\ \mbox{a.s.}

Proof of Claim 1
Note that ||(q,ν)||=||(q,\nu)||=-\infty unless νq\nu\in\mathcal{R}^{q}. Also q\mathcal{R}^{q} is weakly closed and ||(q,ν)||||(q,\nu)|| is upper semicontinuous in ν\nu so there exists some νq\nu\in\mathcal{R}^{q} that achieves the supremum

supν+[βτ,ν+||(q,ν)||]\sup_{\nu\in\mathcal{M}_{+}}[\beta\langle\tau,\nu\rangle+||(q,\nu)||]

We prove the claim when ||(q,ν)||>0||(q,\nu)||>0; the ||(q,ν)||=0||(q,\nu)||=0 case is very similar.

For Claim 1, we restrict ourselves to the measure 1 event

{limnminπen(||(q,ν)||ϵ)ρ(1nμπ,ν)=0ϵ+withϵ<||(q,ν)||}\bigg{\{}\lim_{n\rightarrow\infty}\min_{\pi}^{\lfloor e^{n(||(q,\nu)||-\epsilon)}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \forall\epsilon\in{\mathbb{Q}}_{+}\ \mbox{with}\ \epsilon<||(q,\nu)||\bigg{\}} (35)

(this has measure 1 by Theorem 22). We wish to show that in this event,

lim infn1nlogZn,qββτ,ν+||(q,ν)||2ϵfor ϵ>0 arbitrarily small\liminf_{n\rightarrow\infty}\frac{1}{n}\log Z_{n,q}^{\beta}\geq\beta\langle\tau,\nu\rangle+||(q,\nu)||-2\epsilon\ \mbox{for $\epsilon>0$ arbitrarily small}

Fix 0<ϵ<||(q,ν)||0<\epsilon<||(q,\nu)|| in +{\mathbb{Q}}_{+} and let δ>0\delta>0 satisfy (34). We are in the event (35) so

limnminπen(||(q,ν)||ϵ)ρ(1nμπ,ν)=0\lim_{n\rightarrow\infty}\min_{\pi}^{\lfloor e^{n(||(q,\nu)||-\epsilon)}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0

Denote by πn\pi_{n} the (event-depedent) paths corresponding to these order statistics. For large enough nn we have ρ(1nμπn,ν)<δ\rho(\frac{1}{n}\mu_{\pi_{n}},\nu)<\delta hence there are en(||(q,ν)||ϵ)\lfloor e^{n(||(q,\nu)||-\epsilon)}\rfloor paths π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor satisfying

ρ(1nμπ,ν)ρ(1nμπn,ν)<δso|βτ,1nμπβτ,ν|<ϵ\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}\leq\rho\bigg{(}\frac{1}{n}\mu_{\pi_{n}},\nu\bigg{)}<\delta\ \mbox{so}\ \bigg{|}\beta\langle\tau,\frac{1}{n}\mu_{\pi}\rangle-\beta\langle\tau,\nu\rangle\bigg{|}<\epsilon (36)

by (34). But then

Zn,qβthe en(||(q,ν)||ϵ) paths π from aboveeβnτ,1nμπen(||(q,ν)||ϵ)en(βτ,νϵ)Z_{n,q}^{\beta}\geq\sum_{\mbox{the $\lfloor e^{n(||(q,\nu)||-\epsilon)}\rfloor$ paths $\pi$ from above}}e^{\beta n\langle\tau,\frac{1}{n}\mu_{\pi}\rangle}\geq\lfloor e^{n(||(q,\nu)||-\epsilon)}\rfloor e^{n(\beta\langle\tau,\nu\rangle-\epsilon)}

by (36). It follows that

lim infn1nlogZn,qββτ,νϵ+||(q,ν)||ϵ\liminf_{n\rightarrow\infty}\frac{1}{n}\log Z_{n,q}^{\beta}\geq\beta\langle\tau,\nu\rangle-\epsilon+||(q,\nu)||-\epsilon

Since ϵ>0\epsilon>0 was arbitrarily small, then taking ϵ0+\epsilon\rightarrow 0^{+} we get

lim infn1nlogZn,qβsupνq[βτ,ν+||(q,ν)||]a.s.\liminf_{n\rightarrow\infty}\frac{1}{n}\log Z_{n,q}^{\beta}\geq\sup_{\nu\in\mathcal{R}^{q}}[\beta\langle\tau,\nu\rangle+||(q,\nu)||]\ \mbox{a.s.}\ (37)

as desired. \square[Claim 1]

For the second half, we actually show a slightly stronger claim, which we use in a later proof.

Claim 2: Let W+W\subseteq\mathcal{M}_{+} be a weakly open, possibly empty set s.t. qW\mathcal{R}^{q}\setminus W\neq\emptyset. Define

Yn,qβ:=π:0nqs.t.1nμπWeβT(π)=π:0nqs.t.1nμπWeβnτ,1nμπY_{n,q}^{\beta}:=\sum_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\notin W}e^{\beta T(\pi)}=\sum_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\notin W}e^{\beta n\langle\tau,\frac{1}{n}\mu_{\pi}\rangle}

Then

lim supn1nlogYn,qβsupν+W[βτ,ν+||(q,ν)||]a.s.\limsup_{n\rightarrow\infty}\frac{1}{n}\log Y_{n,q}^{\beta}\leq\sup_{\nu\in\mathcal{M}_{+}\setminus W}[\beta\langle\tau,\nu\rangle+||(q,\nu)||]\ \mbox{a.s.} (38)

Of course, in our case W=W=\emptyset and Yn,qβ=Zn,qβY_{n,q}^{\beta}=Z_{n,q}^{\beta}.

Proof of Claim 2
We must be extra careful to only specify countably many measure zero events we exclude. We do this by considering a countable dense family of target measures.

SWS\setminus W is weakly compact. So for every mm\in{\mathbb{N}} there is some M=M(m)M=M(m)\in{\mathbb{N}} and a finite 1m\frac{1}{m}-net B1m(ν1m),,B1m(νMm)B_{\frac{1}{m}}(\nu_{1}^{m}),\ldots,B_{\frac{1}{m}}(\nu_{M}^{m}) covering SWS\setminus W with the property that νimW\nu_{i}^{m}\notin W (but the balls themselves may intersect WW). From now on we restrict ourselves to the event

{limnd1m((0,0),(nq,nνim))n=d~1m((0,0),(q,νmi))m,1iM(m)}\bigg{\{}\lim_{n\rightarrow\infty}\frac{d^{\frac{1}{\sqrt{m}}}((\vec{0},0),(nq,n\nu_{i}^{m}))}{n}=\widetilde{d}^{\frac{1}{\sqrt{m}}}((\vec{0},0),(q,\nu^{m}_{i}))\ \forall m\in{\mathbb{N}},1\leq i\leq M(m)\bigg{\}} (39)

which is a countable intersection of measure 1 events by Theorem 12 hence has measure 1. It is important to note that for any arbitrary measure ξ+\xi\in\mathcal{M}_{+} we still have

||(q,ξ)||=infϵ>0d~ϵ((0,0),(q,ξ))||(q,\xi)||=\inf_{\epsilon>0}\widetilde{d}^{\epsilon}((\vec{0},0),(q,\xi))

because this is a non-probabilistic statement about constants; we do not however assume that these directed metrics d~ϵ((0,0),(q,ξ))\widetilde{d}^{\epsilon}((\vec{0},0),(q,\xi)) are limits of the dϵ((0,0),(nq,nξ))n\frac{d^{\epsilon}((\vec{0},0),(nq,n\xi))}{n}.

Now we wish to fix an arbitrary ϵ>0\epsilon>0 and prove that in the measure 1 event (39),

L:=lim supn1nlogYn,qβsupν+W[βτ,ν+||(q,ν)||]+5ϵL:=\limsup_{n\rightarrow\infty}\frac{1}{n}\log Y_{n,q}^{\beta}\leq\sup_{\nu\in\mathcal{M}_{+}\setminus W}[\beta\langle\tau,\nu\rangle+||(q,\nu)||]+5\epsilon\

It suffices to consider a convergent subsequence 1nklogYnk,qβL\frac{1}{n_{k}}\log Y_{n_{k},q}^{\beta}\rightarrow L and show LL\leq the right hand side.

Fix any mm\in{\mathbb{N}}. For large kk,

||nkq||1nk[12||q||1,32||q||1]\frac{||\lfloor n_{k}q\rfloor||_{1}}{n_{k}}\in\bigg{[}\frac{1}{2}||q||_{1},\frac{3}{2}||q||_{1}\bigg{]}

so any path π:0nkq\pi:\vec{0}\rightarrow\lfloor n_{k}q\rfloor we are summing over in Ynk,qβY_{n_{k},q}^{\beta} has 1nkμπSW\frac{1}{n_{k}}\mu_{\pi}\in S\setminus W hence 1nkμπ\frac{1}{n_{k}}\mu_{\pi} is in one of the B1m(νim)B_{\frac{1}{m}}(\nu_{i}^{m}). Therefore

Ynk,qβi=1M(m)πs.t.1nkμπB1m(νim)eβT(π)M(m)maxi=1M(m)(πs.t.1nkμπB1m(νim)eβT(π))Y_{n_{k},q}^{\beta}\leq\sum_{i=1}^{M(m)}\sum_{\pi\ \mbox{s.t.}\ \frac{1}{n_{k}}\mu_{\pi}\in B_{\frac{1}{m}}(\nu_{i}^{m})}e^{\beta T(\pi)}\leq M(m)\max_{i=1}^{M(m)}\bigg{(}\sum_{\pi\ \mbox{s.t.}\ \frac{1}{n_{k}}\mu_{\pi}\in B_{\frac{1}{m}}(\nu_{i}^{m})}e^{\beta T(\pi)}\bigg{)}

Note that M(m)M(m) is a constant with respect to kk, whereas the max above is both event-dependent and kk-dependent. Thus there is some event-dependent 1I(m)M(m)1\leq I(m)\leq M(m) and some subsequence nkjn_{k_{j}} s.t. the above maximum is achieved by the same i=I(m)i=I(m) for every nk=nkjn_{k}=n_{k_{j}}. (Finite pigeonholes, infinite pigeons if you will.) Therefore

L=lim supj1nkjlogYnkj,qβlim supj1nkjlog(πs.t.1nkjμπB1m(νI(m)m)eβT(π))L=\limsup_{j\rightarrow\infty}\frac{1}{n_{k_{j}}}\log Y_{n_{k_{j}},q}^{\beta}\leq\limsup_{j\rightarrow\infty}\frac{1}{n_{k_{j}}}\log\bigg{(}\sum_{\pi\ \mbox{s.t.}\ \frac{1}{n_{k_{j}}}\mu_{\pi}\in B_{\frac{1}{m}}(\nu_{I(m)}^{m})}e^{\beta T(\pi)}\bigg{)} (40)

This holds for any mm. Our strategy will be to choose a very large, event-dependent mm that gives a nice event-independent upper bound.

Let us upper bound the expression in the limit on the right-hand side of (40). Lemma 31 with χ=νI(m)m,γ=1m\chi=\nu_{I(m)}^{m},\gamma=\frac{1}{\sqrt{m}} establishes that for any jj\in{\mathbb{N}},

log(π:0nkjqs.t.1nkjμπB1m(νI(m)m)eβT(π))βnkj(supξB1m(νI(m)m)τ,ξ)+nkj1/m1/m+d1m((0,0),(nkjq,nkjνI(m)m)))\begin{split}&\log\bigg{(}\sum_{\pi:\vec{0}\rightarrow\lfloor n_{k_{j}}q\rfloor\ \mbox{s.t.}\ \frac{1}{n_{k_{j}}}\mu_{\pi}\in B_{\frac{1}{m}}(\nu_{I(m)}^{m})}e^{\beta T(\pi)}\bigg{)}\\ &\leq\beta n_{k_{j}}\bigg{(}\sup_{\xi\in B_{\frac{1}{m}}(\nu_{I(m)}^{m})}\langle\tau,\xi\rangle\bigg{)}+n_{k_{j}}\frac{1/m}{1/\sqrt{m}}+d^{\frac{1}{\sqrt{m}}}((\vec{0},0),(n_{k_{j}}q,n_{k_{j}}\nu_{I(m)}^{m})))\end{split} (41)

This holds for any arbitrary mm\in{\mathbb{N}}. The next step is to specify a sufficiently large mm.

By compactness of SWS\setminus W, the measures νI(m)m\nu_{I(m)}^{m} have a weakly convergent subsequence
νI(ml)mlνSW\nu_{I(m_{l})}^{m_{l}}\Rightarrow\nu^{\prime}\in S\setminus W.

Fix some l0l_{0}\in{\mathbb{N}} large enough so that

1ml0+ϵml0<δ,1ml0<ϵ,d~1ml0((0,0),(q,ν))<||(q,ν)||+ϵ\frac{1}{m_{l_{0}}}+\frac{\epsilon}{\sqrt{m_{l_{0}}}}<\delta,\ \frac{1}{\sqrt{m_{l_{0}}}}<\epsilon,\ \widetilde{d}^{\ \frac{1}{\sqrt{m_{l_{0}}}}}((\vec{0},0),(q,\nu^{\prime}))<||(q,\nu^{\prime})||+\epsilon (42)

Consider some event-dependent ll0l\geq l_{0} large enough s.t.

ρ(νI(ml)ml,ν)<ϵml0\rho(\nu_{I(m_{l})}^{m_{l}},\nu^{\prime})<\frac{\epsilon}{\sqrt{m_{l_{0}}}} (43)

Since

ξB1ml(νI(ml)ml)¯ρ(ξ,ν)ρ(ξ,νI(ml)ml)+ρ(νI(ml)ml,ν)1ml+ϵml01ml0+ϵml0<δ\xi\in\overline{B_{\frac{1}{m_{l}}}(\nu_{I(m_{l})}^{m_{l}})}\Rightarrow\rho(\xi,\nu^{\prime})\leq\rho(\xi,\nu_{I(m_{l})}^{m_{l}})+\rho(\nu_{I(m_{l})}^{m_{l}},\nu^{\prime})\leq\frac{1}{m_{l}}+\frac{\epsilon}{\sqrt{m_{l_{0}}}}\leq\frac{1}{m_{l_{0}}}+\frac{\epsilon}{\sqrt{m_{l_{0}}}}<\delta

and δ\delta satisfies (34) then

supξB1ml(νI(ml)ml)βτ,ξsupξB1ml(νI(ml)ml)¯βτ,ξ<βτ,ν+ϵ\sup_{\xi\in B_{\frac{1}{m_{l}}}(\nu_{I(m_{l})}^{m_{l}})}\beta\langle\tau,\xi\rangle\leq\sup_{\xi\in\overline{B_{\frac{1}{m_{l}}}(\nu_{I(m_{l})}^{m_{l}})}}\beta\langle\tau,\xi\rangle<\beta\langle\tau,\nu^{\prime}\rangle+\epsilon (44)

Next, we have

1ml1ml0<ϵ\frac{1}{\sqrt{m_{l}}}\leq\frac{1}{\sqrt{m_{l_{0}}}}<\epsilon (45)

Finally, we are in the event (39) so for large enough jj we have

d1ml((0,0),(nkjq,nkjνmlI(ml)))<nkjd~1ml((0,0),(q,νmlI(ml)))+nkjϵnkjd~1ml0((0,0),(q,νmlI(ml)))+nkjϵnkjd~1ml0((0,0),(q,ν))+nkj1/ml0ρ(νmlI(ml),ν)+nkjϵ\begin{split}&d^{\frac{1}{\sqrt{m_{l}}}}((\vec{0},0),(n_{k_{j}}q,n_{k_{j}}\nu^{m_{l}}_{I(m_{l})}))\\ &<n_{k_{j}}\widetilde{d}^{\ \frac{1}{\sqrt{m_{l}}}}((\vec{0},0),(q,\nu^{m_{l}}_{I(m_{l})}))+n_{k_{j}}\epsilon\\ &\leq n_{k_{j}}\widetilde{d}^{\ \frac{1}{\sqrt{m_{l_{0}}}}}((\vec{0},0),(q,\nu^{m_{l}}_{I(m_{l})}))+n_{k_{j}}\epsilon\\ &\leq n_{k_{j}}\widetilde{d}^{\ \frac{1}{\sqrt{m_{l_{0}}}}}((\vec{0},0),(q,\nu^{\prime}))+\frac{n_{k_{j}}}{1/\sqrt{m_{l_{0}}}}\rho(\nu^{m_{l}}_{I(m_{l})},\nu^{\prime})+n_{k_{j}}\epsilon\end{split} (46)

by Lemma 15. By (42),(43) this last expression is

<nkj||(q,ν)||+3nkjϵ<n_{k_{j}}||(q,\nu^{\prime})||+3n_{k_{j}}\epsilon

Combining the inequalities (44),(45),(46) with (41) we get that for large enough jj,

log(π:0nkjqs.t.1nkjμπB1ml(νI(ml)ml)eβT(π))nkj(βτ,ν+||(q,ν)||+5ϵ)\log\bigg{(}\sum_{\pi:\vec{0}\rightarrow\lfloor n_{k_{j}}q\rfloor\ \mbox{s.t.}\ \frac{1}{n_{k_{j}}}\mu_{\pi}\in B_{\frac{1}{m_{l}}}(\nu_{I(m_{l})}^{m_{l}})}e^{\beta T(\pi)}\bigg{)}\leq n_{k_{j}}\bigg{(}\beta\langle\tau,\nu^{\prime}\rangle+||(q,\nu^{\prime})||+5\epsilon\bigg{)} (47)

Substituting this in (40), we get

Lβτ,ν+||(q,ν)||+5ϵsupν+Wβτ,ν+||(q,ν)||+5ϵL\leq\beta\langle\tau,\nu^{\prime}\rangle+||(q,\nu^{\prime})||+5\epsilon\leq\sup_{\nu\in\mathcal{M}_{+}\setminus W}\beta\langle\tau,\nu\rangle+||(q,\nu)||+5\epsilon

as desired. Note that even though mlm_{l} and ν\nu^{\prime} in (47) are event-dependent, the upper bound we get for LL is not event-dependent, makes our argument work.  \square[Claim 2]

This finishes the proof of the variational formula in the fixed-direction case.

(ii) An analogous argument gives

Gβ=supν1[βτ,ν+||ν||]G^{\beta}=\sup_{\nu\in\mathcal{M}_{1}}[\beta\langle\tau,\nu\rangle+||\nu||]

But direction-free grid entropy is just grid entropy in the direction \ell. Furthermore,
||(,ν)||=||(\ell,\nu)||=-\infty if ν1\nu\notin\mathcal{M}_{1}. Thus

Gβ\displaystyle G^{\beta} =supν1[βτ,ν+||ν||]\displaystyle=\sup_{\nu\in\mathcal{M}_{1}}[\beta\langle\tau,\nu\rangle+||\nu||]
=supν+[βτ,ν+||(,ν)||]\displaystyle=\sup_{\nu\in\mathcal{M}_{+}}[\beta\langle\tau,\nu\rangle+||(\ell,\nu)||]
=Gβ\displaystyle=G^{\beta}_{\ell}
=supqD0,||q||1=1supν+[βτ,ν+||(q,ν)||]by Lemma 20\displaystyle=\sup_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=1}\sup_{\nu\in\mathcal{M}_{+}}[\beta\langle\tau,\nu\rangle+||(q,\nu)||]\ \mbox{by Lemma \ref{maximalLemma}}
=supqD0,||q||1=1Gβq\displaystyle=\sup_{q\in{\mathbb{R}}^{D}_{\geq 0},||q||_{1}=1}G^{\beta}_{q}

As an immediate corollary, we get another proof of Bates’ Variational Formula (Theorem 29), as well as its direction-free analogue. We stress that our proof as written works only under the assumption that τ\tau is bounded, but can easily be extended to the general case of measurable τ\tau.

Proof of Theorem 29. Fix qD0q\in{\mathbb{R}}^{D}_{\geq 0} and bounded measurable τ\tau. It is standard (e.g. see [GRAS16]) that for every nn,

limββ1logZn,qβ=Zn,q=supβ>0β1logZn,qβ=T(0,nq):=logZn,qa.s.\lim_{\beta\rightarrow\infty}\beta^{-1}\log Z_{n,q}^{\beta}=Z_{n,q}=\sup_{\beta>0}\beta^{-1}\log Z_{n,q}^{\beta}=T(\vec{0},nq):=\log Z_{n,q}^{\infty}\ \mbox{a.s.} (48)

which we recall is the last passage time between 0\vec{0} and nq\lfloor nq\rfloor. That is, the last passage time is the zero temperature analogue of logZn,qβ\log Z_{n,q}^{\beta}. Thus the the zero temperature point-to-point Gibbs Free Energy is the last passage time constant λq\lambda_{q}:

Gq:=limnn1logZn,q=limnn1T(0,nq)=λqa.s.G^{\infty}_{q}:=\lim_{n\rightarrow\infty}n^{-1}\log Z_{n,q}^{\infty}=\lim_{n\rightarrow\infty}n^{-1}T(\vec{0},nq)=\lambda_{q}\ \mbox{a.s.}

On the other hand, another standard result (see [GRAS16]) says that the zero temperature point-to-point Gibbs Free Energy is a scaled limit of the positive temperature free energies:

limββ1Gβq=supβ>0β1Gβq=Gq=λqa.s\lim_{\beta\rightarrow\infty}\beta^{-1}G^{\beta}_{q}=\sup_{\beta>0}\beta^{-1}G^{\beta}_{q}=G^{\infty}_{q}=\lambda_{q}\ \mbox{a.s} (49)

Fix NN\in{\mathbb{N}}. Dividing our variational formula by β\beta and taking the supremum over βN\beta\geq N we get by (49)

λq\displaystyle\lambda_{q} =supβNβ1Gβq\displaystyle=\sup_{\beta\geq N}\beta^{-1}G^{\beta}_{q}
=supβNsupνq(τ,ν+β1||(q,ν)||)\displaystyle=\sup_{\beta\geq N}\sup_{\nu\in\mathcal{R}^{q}}\bigg{(}\langle\tau,\nu\rangle+\beta^{-1}||(q,\nu)||\bigg{)}
=supνqsupβN(τ,ν+β1||(q,ν)||)\displaystyle=\sup_{\nu\in\mathcal{R}^{q}}\sup_{\beta\geq N}\bigg{(}\langle\tau,\nu\rangle+\beta^{-1}||(q,\nu)||\bigg{)}
=supνq(τ,ν+N1||(q,ν)||)\displaystyle=\sup_{\nu\in\mathcal{R}^{q}}\bigg{(}\langle\tau,\nu\rangle+N^{-1}||(q,\nu)||\bigg{)}

Taking NN\rightarrow\infty and noting that νq\nu\in\mathcal{R}^{q} implies |(q,ν)||[0,H(q)]|(q,\nu)||\in[0,H(q)] is bounded, we get Bates’ variational formula:

λq=supνqτ,ν\lambda_{q}=\sup_{\nu\in\mathcal{R}^{q}}\langle\tau,\nu\rangle

a.s..  \square

In other words, Bates’ variational formula is nothing more than the zero temperature analogue of the entropy variational formula for Gibbs Free Energy.

The other corollary of note is the answer to Hoffman’s question in the directed polymer model when our variational formula has a unique maximizer. As Bates observes, this happens for a dense family of measurable functions τ\tau (recall Remark Remark).

Corollary 32.

Fix an inverse temperature β>0\beta>0 and a bounded measurable
τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}.

  1. (i)

    Fix qD0q\in{\mathbb{R}}^{D}_{\geq 0} and suppose βτ,ν+||(q,ν)||\beta\langle\tau,\nu\rangle+||(q,\nu)|| has a unique maximizer νq\nu\in\mathcal{R}^{q}. For every nn pick a path πn:0nq\pi_{n}:\vec{0}\rightarrow\lfloor nq\rfloor independently and at random according to the probabilities prescribed the corresponding point-to-point β\beta-polymer measure ρn,qβ\rho_{n,q}^{\beta}. Then the empirical measures 1nμπn\frac{1}{n}\mu_{\pi_{n}} converge weakly to ν\nu a.s.

  2. (ii)

    Suppose βτ,ν+||ν||\beta\langle\tau,\nu\rangle+||\nu|| has a unique maximizer ν1\nu\in\mathcal{M}_{1}. For every nn pick a length nn path πn\pi_{n} from 0\vec{0} independently and at random according to the probabilities prescribed the corresponding point-to-level β\beta-polymer measure ρnβ\rho_{n}^{\beta}. Then the empirical measures 1nμπn\frac{1}{n}\mu_{\pi_{n}} converge weakly to ν\nu a.s.

Remark.

In fact, it follows from our argument below combined with the compactness of 1\mathcal{M}_{1} that even in the general case when the variational formula might not have a unique maximizer, all of the accumulation points of the empirical measures along random paths chosen according to the β\beta-polymer measure are maximizers of the variational formula. Thus, one interesting open question for the future is whether we can more precisely describe these maximizers. This information could help distinguish between weak and strong disorder regimes; see [RASY17] for a recent paper on the role of minimizers of a ”cocycle” variational formula for the Gibbs Free Energy in determining disorder regimes.

Remark.

This partial answer to Hoffman’s question in the positive temperature case does also follow from the development of grid entropy by Rassoul-Agha and Seppäläinen [RAS14] as the rate function of the LDP of empirical measures. However, it seems it has not yet been formulated explicitly as we have done here.

Proof.

We focus on proving (i). The direction-free argument (ii) is analogous.

(i) If q=0q=\vec{0} then the paths πn\pi_{n} are the trivial path 00\vec{0}\rightarrow\vec{0} and 0={0}\mathcal{R}^{\vec{0}}=\{0\} so ν=0\nu=0, hence the result holds trivially. Thus we may assume ||q||1>0||q||_{1}>0.

Since the empirical measures 1nμπn\frac{1}{n}\mu_{\pi^{n}} eventually lie in the weakly compact set
S={ξ+:12||q||1||ξ||TV32||q||1}S=\{\xi\in\mathcal{M}_{+}:\frac{1}{2}||q||_{1}\leq||\xi||_{TV}\leq\frac{3}{2}||q||_{1}\} then by compactness the existence of an accumulation point is guaranteed so it suffices to show that the sequence 1nμπn\frac{1}{n}\mu_{\pi_{n}} cannot have any accumulation point other than ν\nu.

If q\mathcal{R}^{q} is a singleton, i.e. q={ν}={||q||1Λ}\mathcal{R}^{q}=\{\nu\}=\{||q||_{1}\Lambda\}, then by the characterization of q\mathcal{R}^{q} as the set of accumulation points of empirical measures (see Theorem 6), ν\nu is the only possible accumulation point of 1nμπn\frac{1}{n}\mu_{\pi_{n}} so we would be done. We can thus assume q\mathcal{R}^{q} is not a singleton. Thus there exists some η0>0\eta_{0}>0 s.t. Bη0(ν)CqB_{\eta_{0}}(\nu)^{C}\cap\mathcal{R}^{q}\neq\emptyset.

Suppose we are in the measure 1 event

{lim supn1nlogπ:0nq1nμπBη(ν)eβT(π)supξ+Bη(ν)[βτ,ξ+||(q,ξ)||]η+(0,η0)}{limn1nlogZn,qβ=βτ,ν+||(q,ν)||}\begin{split}&\Bigg{\{}\begin{array}[]{l}\limsup\limits_{n\rightarrow\infty}\frac{1}{n}\log\ \sum\limits_{\begin{subarray}{c}\pi:\vec{0}\rightarrow\lfloor nq\rfloor\\ \frac{1}{n}\mu_{\pi}\notin B_{\eta}(\nu)\end{subarray}}e^{\beta T(\pi)}\leq\sup\limits_{\xi\in\mathcal{M}_{+}\setminus B_{\eta}(\nu)}[\beta\langle\tau,\xi\rangle+||(q,\xi)||]\ \forall\eta\in{\mathbb{Q}}_{+}\cap(0,\eta_{0})\end{array}\Bigg{\}}\\ &\cap\bigg{\{}\lim_{n\rightarrow\infty}\frac{1}{n}\log Z_{n,q}^{\beta}=\beta\langle\tau,\nu\rangle+||(q,\nu)||\bigg{\}}\end{split} (50)

This is the intersection of countably many measure 1 events by Claim 2 in the proof of Theorem 30. Claim 2 applies because

Bη(ν)CqBη0(ν)Cq0<η<η0B_{\eta}(\nu)^{C}\cap\mathcal{R}^{q}\supseteq B_{\eta_{0}}(\nu)^{C}\cap\mathcal{R}^{q}\neq\emptyset\ \forall 0<\eta<\eta_{0}

We need to prove that in the measure 1 event (50), the probability that the empirical measures 1nμπn\frac{1}{n}\mu_{\pi^{n}} of the paths chosen independently at random according to the polymer measures do not have an accumulation point other than ν\nu is 0.

Fix any 0<η<η00<\eta<\eta_{0} in +{\mathbb{Q}}_{+}. We wish to show that the probability that 1nμπn\frac{1}{n}\mu_{\pi_{n}} is not in Bη(ν)CB_{\eta}(\nu)^{C} infinitely often is 0.

Let ξη+Bη(ν)\xi_{\eta}\in\mathcal{M}_{+}\setminus B_{\eta}(\nu) achieve the supremum

supξ+Bη(ν)[βτ,ξ+||(q,ξ)||]\sup_{\xi\in\mathcal{M}_{+}\setminus B_{\eta}(\nu)}[\beta\langle\tau,\xi\rangle+||(q,\xi)||]

By the maximality of ν\nu, there is some ϵ>0\epsilon>0 s.t.

βτ,ξη+||(q,ξη)||<βτ,ν+||(q,ν)||3ϵ\beta\langle\tau,\xi_{\eta}\rangle+||(q,\xi_{\eta})||<\beta\langle\tau,\nu\rangle+||(q,\nu)||-3\epsilon (51)

Consider any nn\in{\mathbb{N}}. Recall the definition of the point-to-point β\beta-polymer measure ρn,qβ\rho_{n,q}^{\beta} on the set of paths π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor:

ρn,qβ(dπ)=1Zn,qβeβT(π)\rho_{n,q}^{\beta}(d\pi)=\frac{1}{Z_{n,q}^{\beta}}e^{\beta T(\pi)}

Thus the probability that 1nμπnBη(ν)\frac{1}{n}\mu_{\pi_{n}}\notin B_{\eta}(\nu) (with respect to the β\beta-polymer measure ρn,qβ\rho_{n,q}^{\beta}) is

1Zn,qβπ:0nqs.t.1nμπBη(ν)eβT(π)\frac{1}{Z_{n,q}^{\beta}}\sum_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor\ \mbox{s.t.}\ \frac{1}{n}\mu_{\pi}\notin B_{\eta}(\nu)}e^{\beta T(\pi)}

We are in the event (50), so for large enough nn, this is

exp(nβτ,νn||(q,ν)||+nϵ)exp(nβτ,ξη+n||(q,ξη)||+nϵ)enϵ\leq\mbox{exp}\bigg{(}-n\beta\langle\tau,\nu\rangle-n||(q,\nu)||+n\epsilon\bigg{)}\mbox{exp}\bigg{(}n\beta\langle\tau,\xi_{\eta}\rangle+n||(q,\xi_{\eta})||+n\epsilon\bigg{)}\leq e^{-n\epsilon}

by (51). n=1enϵ<\sum\limits_{n=1}^{\infty}e^{-n\epsilon}<\infty so by the Borel-Cantelli Lemma,

P(1nμπnhas an accumulation point in B2η(ν)C)\displaystyle P\bigg{(}\frac{1}{n}\mu_{\pi_{n}}\ \mbox{has an accumulation point in $B_{2\eta}(\nu)^{C}$}\bigg{)}
P(1nμπnBη(ν)Cfor infinitely many n)=0\displaystyle\leq P\bigg{(}\frac{1}{n}\mu_{\pi_{n}}\in B_{\eta}(\nu)^{C}\ \mbox{for infinitely many $n$}\bigg{)}=0

By continuity from below,

P(1nμπnhas an acc. point other than ν)\displaystyle P\bigg{(}\frac{1}{n}\mu_{\pi_{n}}\ \mbox{has an acc. point other than $\nu$}\bigg{)}
=limη0+P(1nμπnhas an acc. point in B2η(ν)C)=0\displaystyle=\lim_{\eta\rightarrow 0^{+}}P\bigg{(}\frac{1}{n}\mu_{\pi_{n}}\ \mbox{has an acc. point in $B_{2\eta}(\nu)^{C}$}\bigg{)}=0

Therefore the empirical measures 1nμπn\frac{1}{n}\mu_{\pi_{n}} converge weakly to ν\nu a.s.. ∎

5.3 Grid Entropy as The Negative Convex Conjugate of Gibbs Free Energy

Another way of viewing the variational formulas in Theorem 30 is that the point-to-point/point-to-level β\beta-Gibbs Free Energies as functions of the bounded measurable function τ\tau are the convex conjugates of the functions ν||(q,ν)||,ν||ν||\nu\mapsto-||(q,\nu)||,\nu\mapsto-||\nu|| on +,1\mathcal{M}_{+},\mathcal{M}_{1} respectively.

Let us briefly recall what that means (see [Zal02] for more details).

Definition.

Let XX be a locally convex Hausdorff space and let XX^{\ast} be its dual with respect to an inner product ,\langle\cdot,\cdot\rangle. A convex function f:X[,]f:X\rightarrow[-\infty,\infty] is said to be proper if f>f>-\infty and ff\not\equiv\infty. For any proper convex function f:X[,]f:X\rightarrow[-\infty,\infty], its convex conjugate is the function f:X[,]f^{\ast}:X^{\ast}\rightarrow[-\infty,\infty] given by

f(x)=supxX[x,xf(x)]f^{\ast}(x^{\ast})=\sup_{x\in X}[\langle x,x^{\ast}\rangle-f(x)]

In our case, we have X=+X=\mathcal{M}_{+} (finite Borel measures on [0,1]) and XX^{\ast} is the set of bounded measurable functions τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}} with inner product given by integration.

Also, recall that direction-fixed/direction-free grid entropy is concave hence the maps ν||(q,ν)||,ν||ν||\nu\mapsto-||(q,\nu)||,\nu\mapsto-||\nu|| are convex and trivially proper. Furthermore, by our continuity theorem (Thm 26), ||(q,)||-||(q,\cdot)|| and ||ν||-||\nu|| are lower semicontinuous in ν\nu.

With this setup in mind, it is evident that Theorem 30 establishes that the point-to-point β\beta-Gibbs Free Energy is precisely the convex conjugate of ||(q,)||-||(q,\cdot)|| and the point-to-level β\beta-Gibbs Free Energy is the convex conjugate of ||ν||-||\nu||.

The convex conjugate has various important properties. We focus on those that are the most relevant and interesting in regards to grid entropy and Gibbs Free Energy. See [Zal02, Sect 2.3] for the full list of properties.

Corollary 33.

Fix qD0q\in{\mathbb{R}}^{D}_{\geq 0}, an inverse temperature β>0\beta>0. Then viewing Gibbs Free Energies as functions of bounded measurable τ\tau we have the following.

  1. (i)

    Gibbs Free Energies Gqβ(τ),Gβ(τ)G_{q}^{\beta}(\tau),G^{\beta}(\tau) are convex and lower semicontinuous in τ\tau

  2. (ii)

    (Order-preservation) If qD0q^{\prime}\in{\mathbb{R}}^{D}_{\geq 0} with ||(q,)||||(q,)||||(q,\cdot)||\leq||(q^{\prime},\cdot)|| then
    Gqβ()Gqβ()G_{q}^{\beta}(\cdot)\leq G_{q^{\prime}}^{\beta}(\cdot)

  3. (iii)

    (Biconjugate Duality) The convex conjugate of β\beta-Gibbs Free Energy (as a function of τ\tau) is minus grid entropy. That is,

    (Gqβ)=||(q,)||and(Gβ)=||||(G_{q}^{\beta})^{\ast}=-||(q,\cdot)||\ \mbox{and}\ (G^{\beta})^{\ast}=-||\cdot||

    In other words, grid entropy is equal to its biconjugate.

  4. (iv)

    (Fenchel’s Duality Theorem) Let g:+[,],h:1[,]g:\mathcal{M}_{+}\rightarrow[-\infty,\infty],h:\mathcal{M}_{1}\rightarrow[-\infty,\infty] be proper convex functions s.t. νq,ν1\exists\nu\in\mathcal{R}^{q},\nu^{\prime}\in\mathcal{R}^{1} with g(ν)<,h(ν)<g(\nu)<\infty,h(\nu^{\prime})<\infty and one of g,||(q,)||g,||(q,\cdot)|| is continuous at ν\nu and one of h,||||h,||\cdot|| is continuous at ν\nu^{\prime}. Then

    infν+[g(ν)||(q,ν)||]=supτ[Gqβ(τ)g(τ)]\inf_{\nu\in\mathcal{M}_{+}}[g(\nu)-||(q,\nu)||]=\sup_{\tau}[G_{q}^{\beta}(\tau)-g^{\ast}(\tau)]
    infν1[h(ν)||ν||]=supτ[Gβ(τ)h(τ)]\inf_{\nu\in\mathcal{M}_{1}}[h(\nu)-||\nu||]=\sup_{\tau}[G^{\beta}(\tau)-h^{\ast}(\tau)]

    where gβ,hβg^{\beta\ast},h^{\beta\ast} denote the β\beta-convex conjugates of g,hg,h respectively:

    gβ(τ)=supν+[βτ,νg(ν)],hβ(τ)=supν+[βτ,νh(ν)]g^{\beta\ast}(\tau)=\sup_{\nu\in\mathcal{M}_{+}}[\beta\langle\tau,\nu\rangle-g(\nu)],\ h^{\beta\ast}(\tau)=\sup_{\nu\in\mathcal{M}_{+}}[\beta\langle\tau,\nu\rangle-h(\nu)]
Remark.

(iii) gives yet another definition of direction-fixed/direction-free grid entropy. This one is perhaps the most remarkable of the three, as it connects it to the seemingly unrelated quantity that is Gibbs Free Energy. This more than justifies the importance and canonical nature of grid entropy.

Proof.

(iii) is an immediate consequence of the Fenchel-Moreau Theorem. The rest follow from the properties of convex conjugates of proper convex, lower semicontinuous functions. See [Zal02, Sect 2.3] for more details on (i), (ii) and see [VT84, Sect 7.15] for more details on (iv). ∎

The key takeaway is this: grid entropy and Gibbs Free Energy are intricately connected via convex duality.

5.4 Connection to Previous Literature on Grid Entropy

We end our discussion of these variational formulas by explicitly deriving the formulas (3) which link our notion of grid entropy to that presented previously in [RAS14]. For the sake of conciseness we focus on the direction-fixed (a.k.a. point-to-point) case.

Let us first recalibrate our setting to align with the framework of Rassoul-Agha and
Seppäläinen. We view each edge label as a value ωx,z\omega_{x,z} associated to the ”anchor” vertex xDx\in{\mathbb{Z}}^{D} and the unit NENE direction zz of the step (with respect to xx). Thus the environment becomes Ω:=[0,1]D×𝒢\Omega:=[0,1]^{{\mathbb{Z}}^{D}\times\mathcal{G}} where 𝒢={e1,,eD}\mathcal{G}=\{e_{1},\ldots,e_{D}\} is the set of DD unit NE steps. For each yDy\in{\mathbb{Z}}^{D}, we define a shift Ty:ΩΩT_{y}:\Omega\rightarrow\Omega in the natural way, by shifting the anchor: (Tyω)x,z=ωx+y,z(T_{y}\omega)_{x,z}=\omega_{x+y,z}. We let f:Ω×𝒢f:\Omega\times\mathcal{G}\rightarrow{\mathbb{R}} be a potential.

We initially work on the space Ω×𝒢\Omega\times\mathcal{G} of pairs of the environment and one unit step; that is we consider the empirical measures

1nχπn=1ni=0n1δ(TXiω,Xi+1Xi)\frac{1}{n}\chi_{\pi_{n}}=\frac{1}{n}\sum_{i=0}^{n-1}\delta_{(T_{X_{i}}\omega,X_{i+1}-X_{i})}

of the polymer paths πn:(X0X1Xn1)\pi_{n}:(X_{0}\rightarrow X_{1}\rightarrow\cdots\rightarrow X_{n-1}). It is established in [RASY13, Theorem 3.1] that the rate function I1(μ,q)I_{1}(\mu,q) of these empirical measures in the direction-qq case is precisely the lower semicontinuous regularization of

H1(μ)βf,μ+Λqβ(f)H_{1}(\mu)-\beta\langle f,\mu\rangle+\Lambda_{q}^{\beta}(f) (52)

where H1H_{1} is the infimum of certain relative entropies (see (5.2) of [RASY13]), and

Λqβ(f):=limn1nlogE[exp(βi=0n1f(TXiω,Xi+1Xi))1Xn1=nq]\Lambda_{q}^{\beta}(f):=\lim_{n\rightarrow\infty}\frac{1}{n}\log E\bigg{[}\mbox{exp}\bigg{(}\beta\sum\limits_{i=0}^{n-1}f(T_{X_{i}}\omega,X_{i+1}-X_{i})\bigg{)}\textbf{1}_{X_{n-1}=\lfloor nq\rfloor}\bigg{]}

is the (normalized) β\beta-Gibbs Free Energy in direction qq with potential ff defined in (2.3) of the same paper. Here μ\mu is a probability measure on Ω×𝒢\Omega\times\mathcal{G} s.t. the 1st step X1X0𝒢X_{1}-X_{0}\in\mathcal{G} has μ\mu-mean Eμ[X1X0]=qE^{\mu}[X_{1}-X_{0}]=q. This last condition is imposed because these paths go in the direction qq hence any limit points of their empirical measures must average to qq in the step coordinate; any μ\mu not satisfying this condition will trivially have an infinite rate function.

We consider the ”edge label potential” ϕ:Ω×𝒢[0,1]\phi:\Omega\times\mathcal{G}\rightarrow[0,1] given by ϕ(ω,z)=ω0,z\phi(\omega,z)=\omega_{0,z} which maps an (environment, unit direction) pair to the edge label of the corresponding unit direction anchored at the origin. If πn=(X0Xn1)\pi_{n}=(X_{0}\rightarrow\cdots\rightarrow X_{n-1}) is a polymer path then our empirical measures from this paper are just the ϕ\phi-pushforwards of 1nχπn\frac{1}{n}\chi_{\pi_{n}}:

1nμπn=1ni=0n1δωXi,Xi+1Xi=(ϕ)(1nχπn)\frac{1}{n}\mu_{\pi_{n}}=\frac{1}{n}\sum_{i=0}^{n-1}\delta_{\omega_{X_{i},X_{i+1}-X_{i}}}=(\phi)_{\ast}\bigg{(}\frac{1}{n}\chi_{\pi_{n}}\bigg{)}

We obtain the rate function I1(ν,q)I_{1}^{\prime}(\nu,q) of the empirical measures 1nμπn\frac{1}{n}\mu_{\pi_{n}} in the direction-qq case by contracting the higher level LDP rate function I1(μ,q)I_{1}(\mu,q):

I1(ν,q)=infμ:(ϕ)μ=νI1(μ,q)I_{1}^{\prime}(\nu,q)=\inf_{\mu:(\phi)_{\ast}\mu=\nu}I_{1}(\mu,q) (53)

for probability measures ν1\nu\in\mathcal{M}_{1}, where the infimum is taken over probability measures μ\mu on Ω×𝒢\Omega\times\mathcal{G} whose ϕ\phi-pushforward is ν\nu.

We now make some convenient observations. First, for any measurable and bounded
τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}}, our (unnormalized) β\beta-point-to-point Gibbs Free Energy Gqβ(τ)G_{q}^{\beta}(\tau) is precisely Λqβ(τϕ)+logD\Lambda_{q}^{\beta}(\tau\circ\phi)+\log D. Second, for any μ\mu s.t. ν=(ϕ)μ\nu=(\phi)_{\ast}\mu we have

ϕ,μ=ϕ(u)dμ=udν=ι,ν\langle\phi,\mu\rangle=\int\phi(u)d\mu=\int ud\nu=\langle\iota,\nu\rangle

where ι:[0,1][0,1],ι(u)=u\iota:[0,1]\rightarrow[0,1],\iota(u)=u is the identity function. Thus the latter two terms in (52) for I1(μ,q)I_{1}(\mu,q) are ignored by the infimum in (53) so we get that I1(ν,q)I_{1}^{\prime}(\nu,q) is the lower semicontinuous regularization of

H1(ν,q)βι,ν+Gqβ(ι)logDH_{1}^{\prime}(\nu,q)-\beta\langle\iota,\nu\rangle+G_{q}^{\beta}(\iota)-\log D

where

H1(ν,q)=inf{H1(μ):μis a prob. meas. onΩ×𝒢s.t.(ϕ)μ=ν,Eμ[X1X0]=q}H_{1}^{\prime}(\nu,q)=\inf\big{\{}H_{1}(\mu):\mu\ \mbox{is a prob. meas. on}\ \Omega\times\mathcal{G}\ \mbox{s.t.}\ (\phi)_{\ast}\mu=\nu,E^{\mu}[X_{1}-X_{0}]=q\big{\}} (54)

(5.9) from [RAS14] gives the following variational formula for the point-to-point Gibbs Free Energy:

Λqβ(f)=supμ:Eμ[X1X0]=q[βf,μH1(μ)]\Lambda_{q}^{\beta}(f)=\sup_{\mu:E^{\mu}[X_{1}-X_{0}]=q}[\beta\langle f,\mu\rangle-H_{1}(\mu)] (55)

for bounded measurable potentials f:Ω×𝒢f:\Omega\times\mathcal{G}\rightarrow{\mathbb{R}} where the supremum is over probability measures μ\mu on Ω×𝒢\Omega\times\mathcal{G} with mean step qq.

Taking f=τϕf=\tau\circ\phi for bounded measurable τ:[0,1]\tau:[0,1]\rightarrow{\mathbb{R}} and splitting the supremum over μ\mu into an outer supremum over ν1\nu\in\mathcal{M}_{1} and an inner supremum over μ\mu whose ϕ\phi-pushforward is ν\nu, we get

Λqβ(τϕ)=supμ:Eμ[X1X0]=q[βτϕ,μH1(μ)]=supν1supμ:(ϕ)μ=ν,Eμ[X1X0]=q[βτ,νH1(μ)]=supν1[βτ,νH1(ν,q)]\begin{split}\Lambda_{q}^{\beta}(\tau\circ\phi)&=\sup_{\mu:E^{\mu}[X_{1}-X_{0}]=q}[\beta\langle\tau\circ\phi,\mu\rangle-H_{1}(\mu)]\\ &=\sup_{\nu\in\mathcal{M}_{1}}\sup_{\mu:(\phi)_{\ast}\mu=\nu,E^{\mu}[X_{1}-X_{0}]=q}[\beta\langle\tau,\nu\rangle-H_{1}(\mu)]\\ &=\sup_{\nu\in\mathcal{M}_{1}}[\beta\langle\tau,\nu\rangle-H_{1}^{\prime}(\nu,q)]\end{split} (56)

where the last equality holds by (54). Comparing this to our variational formula for the point-to-point Gibbs Free Energy from Theorem 30 (i), recalling that Gibbs Free Energy is biconjugate from Corollary 33, and recalling that Gqβ(τ)=Λqβ(τϕ)+logDG_{q}^{\beta}(\tau)=\Lambda_{q}^{\beta}(\tau\circ\phi)+\log D, we get

H1(ν,q)=supτ:[0,1]bdd, meas.[βτ,νGqβ(τ)+logD]=||(q,ν)||+logDH_{1}^{\prime}(\nu,q)=\sup_{\tau:[0,1]\rightarrow{\mathbb{R}}\ \mbox{bdd, meas.}}[\beta\langle\tau,\nu\rangle-G_{q}^{\beta}(\tau)+\log D]=-||(q,\nu)||+\log D

The direction-free (a.k.a. point-to-level) argument is nearly identical, except we drop the qq indices and the condition that the first step X1X0X_{1}-X_{0} has μ\mu-mean Eμ[X1X0]=qE^{\mu}[X_{1}-X_{0}]=q. In the end we get

H1(ν)=||ν||+logDH_{1}^{\prime}(\nu)=-||\nu||+\log D

where

H1(ν)=infμ:(ϕ)μ=νH1(μ)H_{1}^{\prime}(\nu)=\inf_{\mu:(\phi)_{\ast}\mu=\nu}H_{1}(\mu)

As a final remark, we mention that this entropy H1H_{1} originally appears in Varadhan’s paper [Var03] on large deviations for empirical measures of Markov chains.

6 Extensions to Other Models and Open Questions

Looking back at our development of grid entropy in D{\mathbb{R}}^{D}, we never required some specific property of the lattice model other than that the number of paths π:0nq\pi:\vec{0}\rightarrow\lfloor nq\rfloor we consider is O(eH(q)n)O(e^{H(q)n}) for some model-dependent constant

H(q)=lim supn1nlog(#π:0nq)H(q)=\limsup_{n\rightarrow\infty}\frac{1}{n}\log(\#\pi:\vec{0}\rightarrow\lfloor nq\rfloor)

that all such π\pi have the same length O(n)O(n), and that any pair of paths 0mq\vec{0}\rightarrow\lfloor mq\rfloor and mq(m+n)q\lfloor mq\rfloor\rightarrow\lfloor(m+n)q\rfloor can be concatenated (thus giving us the superadditivity required to be able to apply the Subadditive Ergodic Theorem).

Therefore the notion of grid entropy and all our results, including the variational formula for the Gibbs Free Energy, apply in any edge model with these properties. For example, we might consider random walks on +×D{\mathbb{Z}}_{+}\times{\mathbb{Z}}^{D} where the first coordinate, ”time,” is always increasing along a path.

In addition, if we loosen some of these conditions, then a part of our work in this paper still holds. For example, instead of NE paths on D{\mathbb{Z}}^{D} consider self-avoiding walks (SAWs) on D{\mathbb{Z}}^{D}, as done in [Bat20]. Then we cannot always concatenate a pair of SAWs 0mq\vec{0}\rightarrow\lfloor mq\rfloor and mq(m+n)q\lfloor mq\rfloor\rightarrow\lfloor(m+n)q\rfloor so our arguments that rely on superadditivity no longer apply. However, we may still define grid entropy by

||(q,ν)||:=sup{α0:limnminπ:0nqeαnρ(1nμπ,ν)=0a.s.}||(q,\nu)||:=\sup\bigg{\{}\alpha\geq 0:\lim_{n\rightarrow\infty}\min_{\pi:\vec{0}\rightarrow\lfloor nq\rfloor}^{\lfloor e^{\alpha n}\rfloor}\rho\bigg{(}\frac{1}{n}\mu_{\pi},\nu\bigg{)}=0\ \mbox{a.s.}\bigg{\}}

and it will be true that only target measures νΛ\nu\ll\Lambda with total mass ||q||1||q||_{1} are observed and that we have an upper bound on the sum of relative entropy and grid entropy:

DKL(ν||Λ)+||(q,ν)||χDwhen ||q||1=1D_{KL}(\nu||\Lambda)+||(q,\nu)||\leq\chi_{D}\ \mbox{when $||q||_{1}=1$}

where χD\chi_{D} is the connectivity constant for D{\mathbb{Z}}^{D} satisfying

#(length n SAWs from 0)=enχD+o(n)\#(\mbox{length $n$ SAWs from $\vec{0}$})=e^{n\chi_{D}+o(n)}

(see [Law13, Section 6.2] for details). It is not clear, however, whether any of the other properties (including this being a directed norm with negative sign) still hold.

Note however that with a small modification we can make this work. All our results hold for SAWs on D+1{\mathbb{Z}}^{D+1} where the first coordinate, ”time,” is necessarily non-decreasing because we obtain a superadditivity akin to the one for NE paths.


We end by mentioning two open questions that could be the scope of future work. Can we describe the set of maximizers of our variational formula (Theorem 30)? As discussed before, this would provide insights into the limit points of empirical measures and also into the problem of distinguishing weak and strong order regimes. And secondly, is grid entropy strictly concave? From our results in this chapter, answering this would immediately imply that Gibbs Free Energy is strictly convex, which is currently a major open research problem in this field.

7 Acknowledgments

I would like to thank my friends and family for their continual support during my graduate studies. Special thanks goes to my advisor Bálint Virág, for his patience, guidance, and optimism in the face of setbacks. I also wish to thank Duncan Dauvergne for very helpful discussions and peer review on a preliminary draft. In addition, I am grateful to my thesis external reviewer Firas Rassoul-Agha for his comments and for drawing some connections to previous literature. Finally, this work would not have been possible without the funding from my NSERC Canadian Graduate Scholarship-Doctoral.

References

  • [AH16] Daniel Ahlberg and Christopher Hoffman, Random coalescing geodesics in first-passage percolation, arXiv preprint:1609.02447 (2016).
  • [Ahl15] Daniel Ahlberg (ed.), Problem lists: first passage percolation, American Institute of Mathematics, Aug 2015.
  • [Bat20] Erik Bates, Empirical distributions, geodesic lengths, and a variational formula in first-passage percolation, arXiv preprint:2006.12580 (2020).
  • [Car10] Philippe Carmona, Directed polymer in random environment and last passage percolation, ESAIM: Probability and Statistics 14 (2010), 263–270.
  • [DS01] Jean-Dominique Deuschel and Daniel W Stroock, Large deviations, vol. 342, American Mathematical Soc., 2001.
  • [Dur19] Rick Durrett, Probability: theory and examples, vol. 49, Cambridge university press, 2019.
  • [GRAS16] Nicos Georgiou, Firas Rassoul-Agha, and Timo Seppäläinen, Variational formulas and cocycle solutions for directed polymer and percolation models, Communications in Mathematical Physics 346 (2016), no. 2, 741–779.
  • [Hub04] Peter J Huber, Robust statistics, vol. 523, John Wiley & Sons, 2004.
  • [JLS20] Christopher Janjigian, Wai-Kit Lam, and Xiao Shen, Tail bounds for the averaged empirical distribution on a geodesic in first-passage percolation, arXiv preprint:2010.08072 (2020).
  • [JRAS19] Christopher Janjigian, Firas Rassoul-Agha, and Timo Seppäläinen, Geometry of geodesics through busemann measures in directed last-passage percolation, arXiv preprint:1908.09040 (2019).
  • [KL51] Solomon Kullback and Richard A Leibler, On information and sufficiency, The annals of mathematical statistics 22 (1951), no. 1, 79–86.
  • [Law13] Gregory F Lawler, Intersections of random walks, Springer Science & Business Media, 2013.
  • [Lig85] Thomas M Liggett, An improved subadditive ergodic theorem, The Annals of Probability 13 (1985), no. 4, 1279–1285.
  • [Mar04] James B Martin, Limiting shape for directed percolation models, The Annals of Probability 32 (2004), no. 4, 2908–2937.
  • [MSZ21] James B Martin, Allan Sly, and Lingfu Zhang, Convergence of the environment seen from geodesics in exponential last-passage percolation, arXiv preprint:2106.05242 (2021).
  • [Pos75] Edward Posner, Random coding strategies for minimum entropy, IEEE Transactions on Information Theory 21 (1975), no. 4, 388–391.
  • [RAS14] Firas Rassoul-Agha and Timo Seppäläinen, Quenched point-to-point free energy for random walks in random potentials, Probability Theory and Related Fields 158 (2014), no. 3, 711–750.
  • [RASY13] Firas Rassoul-Agha, Timo Seppäläinen, and Atilla Yilmaz, Quenched free energy and large deviations for random walks in random potentials, Communications on Pure and Applied Mathematics 66 (2013), no. 2, 202–244.
  • [RASY17]  , Variational formulas and disorder regimes of random walks in random potentials, Bernoulli 23 (2017), no. 1, 405–431.
  • [Var03] SRS1989232 Varadhan, Large deviations for random walks in a random environment, Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences 56 (2003), no. 8, 1222–1245.
  • [VT84] Jan Van Tiel, Convex analysis: an introductory text, Wiley, 1984.
  • [Zal02] Constantin Zalinescu, Convex analysis in general vector spaces, World scientific, 2002.