Grid entropy in last passage percolation — a superadditive critical exponent approach
Abstract
Working in the setting of i.i.d. last-passage percolation with no assumptions on the underlying edge-weight distribution, we arrive at the notion of grid entropy — a Subadditive Ergodic Theorem limit of the entropies of paths with empirical measures weakly converging to a given target, or equivalently a deterministic critical exponent of canonical order statistics associated with the Levy-Prokhorov metric. This provides a fresh approach to an entropy first developed by Rassoul-Agha and Seppäläinen as a large deviation rate function of empirical measures along paths. In their 2014 paper [RAS14], variational formulas are developed for the point-to-point/point-to-level Gibbs Free Energies as the convex conjugates of this entropy. We rework these formulas in our new framework and explicitly link our descriptions of grid entropy to theirs. We also improve on a known bound for this entropy by introducing a relative entropy term in the inequality. Furthermore, we show that the set of measures with finite grid entropy coincides with the deterministic set of limit points of empirical measures studied in a recent paper [Bat20] by Bates. In addition, we partially answer a directed polymer version of a question of Hoffman which was previously tackled in the zero temperature case by Bates. Our results cover both the point-to-point and point-to-level scenarios.
1 Introduction
The limiting behaviour of empirical measures is of paramount importance in percolation theory. The normalized passage time along a path, the quantity we ultimately care about, is nothing but the identity function integrated against the empirical measure of that path. In First/Last Passage Percolation we study the minimizing/maximizing paths, called geodesics.
A major open problem posed by Hoffman in [Ahl15] is whether empirical measures along geodesics in a fixed direction converge weakly. This is partially answered in [Bat20] for FPP. Bates proves that the set of limiting distributions in a fixed direction, respectively the set of limiting distributions in the direction-free case, is deterministic, and derives an explicit variational formula for the limit shape of the first passage time as the minimum value of a linear functional over this set. When the set of minimizers is a singleton, which Bates argues happens for a dense family of edge-weight distributions, it follows that Hoffman’s question is answered in the affirmative. The same argument applied to the LPP model yields analogous conclusions.
Generally, there is no known way of computing the limiting distributions along geodesics. [MSZ21] showcases some cutting-edge developments for the solvable Exponential LPP model, including an explicit formula for weak limits of empirical measures along geodesics. For other recent work on geodesics see [AH16], [JRAS19] and [JLS20].
One way to extend the work in [Bat20] is to study the more general question of the entropy of paths whose empirical measures, in a fixed direction or direction-free, converge weakly to a given target measure. Grid entropy gives an exact, deterministic answer.
In [RAS14], Rassoul-Agha and Seppäläinen derive this entropy as a large deviation rate function of empirical measures along paths and they provide variational formulas realizing the point-to-point/point-to-level Gibbs Free Energies as the convex conjugates of this mysterious new entropy. This rate function approach suggests that one might be able to derive formulas for this entropy coming from subadditivity, and furthermore that it is related to a critical exponent of paths.
Our main aim is to prove these intuitions correct, working in an LPP setting which can easily be extended to more general frameworks. Rather than starting from the definitions given in [RAS14], we define grid entropy as a certain critical exponent of paths and show that it is equivalently described as a superadditive ergodic limit. Moreover, we arrive at variational formulas which are analogues of Rassoul-Agha and Seppäläinen’s and which are in fact positive-temperature analogues of Bates’ variational formula. Along the way, we establish various properties of this entropy, some of which are already noted in or follow from [RAS14] and [Bat20], but also some of which are not, such as a partial answer to Hoffman’s question in a directed polymer setting and an amazing inequality relating grid entropy, relative entropy and Shannon entropy. We give versions of these results for both direction-fixed grid entropy and direction-free grid entropy of target measures. By concavity and the lattice symmetries, the direction-free case turns out to simply be the direction-fixed case in the unit direction which maximizes the total number of paths.
Let us now be more precise. We consider north-east nearest-neighbor paths on the lattice, and handle real coordinates by taking coordinate-wise floors. We follow a similar approach to [Bat20], in that we couple our i.i.d. edge weights to i.i.d. Unif[0,1] edge labels via a measurable function. This lets us work on [0,1] at no additional cost, as we can lift everything back via the pushforward of this function. We are then interested in how many empirical measures of paths converge to some given target measure in the set of finite non-negative Borel measures on [0,1]. We may keep the direction fixed, or we may vary over all points with the same 1-norm.
To perform the counting, we consider the order statistics of the distance of the paths’ empirical measures to the target , where distance is measured via the Levy-Prokhorov metric , which metrizes weak convergence of measures. That is, given a direction and a target measure , for every we let
denote the order statistics value of . It is convenient to define
In the direction-free case, where we count all paths of a certain scaled length from , we let
and similarly define over paths of length anchored at .
Of course, these order statistics and the paths corresponding to them are event-dependent.
We then define the grid entropy with respect to the target and the direction , denoted , and the direction-free grid entropy, denoted , to be the critical exponent where the corresponding order statistics change from converging to 0 a.s. to diverging a.s.:
(1)
where these are defined to be if the set of ’s is empty. It will turn out that we can replace the limits in this definition with ’s and get the same quantity. Observe that these grid entropies lie in , respectively where
is the (Shannon) entropy of the total number of paths in direction , in the sense that
We note that the description (1) of grid entropy as the critical exponent of these order statistics has been previously shown to hold only for Bernoulli edge labels (see [Car10]).
Over the course of this paper we establish two other equivalent definitions of grid entropy which avoid the annoyance of dealing with these event-dependent orderings of the paths. One of these alternate descriptions is remarkably simple: grid entropy is the negative convex conjugate of Gibbs Free Energy; as a direct consequence, our notion of grid entropy is equivalent to that appearing in [RAS14] modulo some normalizations. The other, as the entropy of paths with empirical measure Levy-Prokhorov-close to the target, is completely new though not unexpected.
We summarize these characterizations in the following theorem and link them to current literature in subsequent remarks.
Theorem A.
-
(i)
Let and let be a finite non-negative Borel measure on [0,1]. Then the direction grid entropy as defined in (1) is also given by
(2)

The expressions we take an infimum of are each directed metrics with negative sign on . That is, they take value in , evaluate to 0 when , and satisfy the triangle inequality with the sign reversed.
Moreover, direction-fixed grid entropy is the negative convex conjugate of the point-to-point -Gibbs Free Energy in direction (as a function of the environment-coupling function ): For ,
where the supremum is over bounded measurable , where is the integration linear functional and where the point-to-point -Gibbs Free Energy is given by
-
(ii)
Analogous results hold in the direction-free case. Let be a finite non-negative Borel measure on [0,1] and let . Then the direction-free grid entropy as defined in (1) is also given by
The expressions we take an infimum of are each directed metrics with negative sign on , the set of finite non-negative Borel measures with total mass . Moreover, direction-free grid entropy is the negative convex conjugate of the point-to-level -Gibbs Free Energy:
where the supremum is over bounded measurable and where the point-to-level -Gibbs Free Energy is given by
Remark.
We can extend these variational formulas for point-to-point/point-to-level Gibbs Free Energy to general, possibly unbounded, measurable functions by truncating at some constant which we then take to infinity. For example,
Remark.
This theorem is partially proved in [Car10, Corollary 2] for the case when the edge labels follow a Bernoulli distribution. In this setting, Carmona shows that the negative convex conjugate of Gibbs Free Energy of measures of the form is given by
The inequalities can be replaced with equality since the number of paths with 1-labels is exponentially decaying in . Using the definition of the Levy-Prokhorov metric we can conclude that this formulation is equivalent to (1).
Remark.
As mentioned, grid entropy first appears in literature as the rate function for the Large Deviation Principle of the empirical measures in [RAS14]. The fact that we derive formula (2) using the Subadditive Ergodic Theorem should therefore not be surprising. Our definitions (1) for direction-fixed/direction-free grid entropy align with those of Rassoul-Agha and Seppäläinen as follows. Consider the product space of the environment
consisting of the i.i.d. edge labels and of the unit NE steps
, and let map an (environment, unit direction) pair to the edge label of the corresponding unit direction anchored at the origin. Then
(3)
where is the relative entropy defined in (5.2) of [RAS14] which can be traced back to Varadhan’s paper [Var03], and where the infimums are over the measures on whose -pushforward is and, in the direction-fixed case, for which in addition the -mean of the step coordinate is . See Section 5.4 for a more detailed derivation and discussion of (3).
Now grid entropy satisfies some rather interesting properties. The following theorem captures the highlights for direction grid entropy; the direction-free analogues hold as well. All but properties (iii) and (v) follow easily from the framework presented previously in [RAS14], yet we will showcase the power of our new approach to grid entropy by proving all of these properties directly.
Theorem B.
Let . Then:
-
(i)
Grid entropy is a directed norm with negative sign; it scales with positive factors and satisfies a reverse triangle inequality:
-
(ii)
Grid entropy is upper semicontinuous.
Let
-
(iii)
is weakly closed, convex and deterministic and coincides almost surely with
-
(iv)
consists only of measures with total variation that are absolutely continuous with respect to the Lebesgue measure on [0,1].
-
(v)
Any satisfies the following upper bound on the sum of the grid entropy and the relative entropy with respect to Lebesgue measure on [0,1]:
where denotes relative entropy (or Kullback-Leibler divergence), and where,
again, this upper bound is simply the (Shannon) entropy of the total number of paths in direction .
Why do we care? The deterministic set is nothing more than the LPP analogue of the set Bates takes an infimum over in [Bat20] in his variational formula for the FPP time constant. In our LPP setting, his formula becomes
LPP time constant
We thus link [Bat20] and [RAS14] by providing a new, more enlightening description of these sets in terms of these grid entropies. Furthermore, the amazing bound in (v) improves on a version without the grid entropy term, proved by Bates, which also follows from Rassoul-Agha and Seppäläinen’s work.
As in [RAS14], the main application of grid entropy we present is a variational formula for the point-to-point/point-to-level Gibbs Free Energies in a fixed direction/to the “level” in the directed polymer model. Our variational formula is simply the positive temperature LPP analogue of Bates’ variational formula for the FPP limit shape, and furthermore can be molded into Rassoul-Agha and Seppäläinen’s variational formula via some normalizations, thus proving that what we call grid entropy really is the same object first developed in [RAS14].
Theorem C.
Fix a direction and an inverse temperature . For bounded measurable , the point-to-point/point-to-level Gibbs Free Energies are given by
where is the maximizing direction, and
is the set of Borel probability measures that have finite direction-free grid entropy. Moreover, these supremums are achieved by some in respectively.
Remark.
We may replace the supremums in these formulas with supremums over all measurable functions, by truncating at some constant and taking a supremum over it.
As in [Bat20], it follows that the directed polymer analogue of Hoffman’s question is answered in the affirmative when our variational formula has a unique maximizer, which happens for a dense family of measurable functions .
Theorem D.
Fix an inverse temperature and a bounded measurable .
-
(i)
Fix and suppose has a unique maximizer . For every pick a path independently and at random according to the probabilities prescribed by the corresponding point-to-point -polymer measure
Then the empirical measures converge weakly to a.s.
-
(ii)
Suppose has a unique maximizer . For every pick a length path from independently and at random according to the probabilities prescribed by the corresponding point-to-level -polymer measure
Then the empirical measures converge weakly to a.s.
Remark.
In fact, even if there is no unique maximizer, we can show that all accumulation points of random paths chosen according to the direction-fixed/direction-free -polymer measure are among the maximizers of the corresponding variational formula. This partial answer to Hoffman’s question in the positive temperature case is immediate from the development of grid entropy by Rassoul-Agha and Seppäläinen [RAS14] as the rate function of the Large Deviation Principle of empirical measures. However, to the best of our knowledge this is the first time it has been formulated explicitly.
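On small grids, polymer measures of the kind appearing above can be explored directly by enumerating paths. The following sketch is purely illustrative (all names are ours): it takes the Gibbs weight of a path proportional to the exponential of β times the sum of its edge labels, a simplified stand-in for the polymer measures of Theorem D, and samples a point-to-point path accordingly.

```python
import random
from itertools import permutations
from math import exp

def polymer_sample(a, b, beta, seed=0):
    """Sample a NE path from (0,0) to (a,b) under a Gibbs measure with
    P(path) proportional to exp(beta * total edge label along the path)."""
    rng = random.Random(seed)
    # i.i.d. Unif[0,1] labels on horizontal ('E') and vertical ('N') edges.
    lab = {(x, y, s): rng.random() for x in range(a + 1)
           for y in range(b + 1) for s in 'EN'}

    def total(path):
        x = y = 0
        t = 0.0
        for s in path:
            t += lab[(x, y, s)]
            x, y = (x + 1, y) if s == 'E' else (x, y + 1)
        return t

    paths = sorted(set(permutations('E' * a + 'N' * b)))
    gibbs = [exp(beta * total(p)) for p in paths]
    z = sum(gibbs)  # partition function
    return rng.choices(paths, weights=gibbs)[0], [g / z for g in gibbs]

path, probs = polymer_sample(3, 2, beta=2.0)
assert len(path) == 5 and abs(sum(probs) - 1.0) < 1e-9
```

As β grows, the sampled paths concentrate on the maximizing (geodesic) paths, which is the zero temperature limit discussed throughout.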
The plan is as follows. In Section 2, we describe the model and setup, and outline various facts and notions we will need over the course of this paper. Section 3 focuses on developing the second definition (2) of grid entropy as a directed norm with negative sign and showing that it is equivalent to the original definition (1). Then, in Section 4, we investigate what information we can extract from grid entropy and what properties it satisfies (Theorem B). We devote Section 5 to applying our results to establish our variational formula for point-to-point/point-to-level Gibbs Free Energies, to study its consequences (namely Theorems C and D), and to detail the correspondence with the work in [RAS14]. Last but not least, we make some closing remarks about adapting our results to other models.
2 Preliminaries
2.1 Empirical Measures on the Lattice
We begin by briefly describing the setup we use in this paper.
We restrict ourselves to a directed Last Passage Percolation model. Consider north-east nearest-neighbour paths on the lattice with i.i.d. edge weights for some probability distribution on . By north-east, we of course mean that the coordinates of points on the path are nondecreasing.
For we denote by the set of all NE paths . Similarly, for and we denote by the set of all NE paths from of length (no restriction on the endpoint).
Observe that either and , or and
Here is the 1-norm on defined by .
On the other hand, trivially for any .
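In two dimensions, for instance, the NE paths between two lattice points can be enumerated explicitly in small cases, and their number is a binomial coefficient. A quick sketch (illustrative names, not the paper's notation):

```python
from itertools import permutations
from math import comb

def ne_paths(a, b):
    """Enumerate NE lattice paths from (0,0) to (a,b) as step sequences.

    Each path is a sequence of unit steps 'E' (east) and 'N' (north), so
    coordinates along the path are nondecreasing, as in the text.
    """
    steps = 'E' * a + 'N' * b
    return set(permutations(steps))  # distinct orderings of the steps

# The number of NE paths to (a,b) is the binomial coefficient C(a+b, a).
assert len(ne_paths(3, 2)) == comb(5, 3) == 10
```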
Note that unlike other recent work (such as [MSZ21]) we do not restrict ourselves to a known solvable model. We do not impose any restrictions on the edge weight distribution , so that our results hold with the greatest generality possible.
We scale the grid by and look at the behaviour in the limit. As is standard, we extend our initial inputs to lie in by taking coordinatewise floors of the scaled coordinates. That is, we consider paths where
Our normalized directed metrics will converge almost surely to a translation-invariant limit so it will suffice to consider the case when . But for now we let be arbitrary.
Various inequalities we derive involve the asymptotics of the number of NE paths of a given length from the origin in a fixed or unfixed direction. The following lemma, which is easily proved using Stirling’s approximation, gives us what we want.
Lemma 1.
Let . Then
where we use the convention 0 log 0 = 0.
Remark.
Recall that for we denote by the set of paths . Thus
is the (Shannon) entropy of the number of paths in direction . Note that scales with positive scalars and is maximized among with the same 1-norm by
in which case .
On the other hand, for recall that we denote by the set of paths from of length . Thus
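In two dimensions this entropy can be checked numerically: with n total steps of which a fraction p head east, the number of NE paths is the binomial coefficient C(n, pn), and (1/n) log C(n, pn) converges to the binary Shannon entropy H(p) by Stirling's approximation. A brief sketch (illustrative):

```python
from math import comb, log

def shannon(p):
    """Binary Shannon entropy H(p), with the convention 0 * log 0 = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log(p) - (1 - p) * log(1 - p)

# (1/n) log #paths vs H(p): by Stirling, the gap is O(log(n)/n).
n, p = 2000, 0.3
rate = log(comb(n, int(p * n))) / n
assert abs(rate - shannon(p)) < 0.01
```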
We study the distribution of weights that we observe along paths . For a path , let the unnormalized empirical measure along be
Note that we normalize by rather than . This is simply for convenience in our proofs, as it gives us a certain superadditivity we do not get when we normalize by .
The Glivenko-Cantelli Theorem [Dur19, Thm. 2.4.7] tells us that for any fixed infinite NE path in the grid, empirical measures along the path converge weakly to .
Theorem 2 (Glivenko-Cantelli Theorem).
Let be the cumulative distribution function of , let be i.i.d. random variables and let
be the cumulative distribution functions of the empirical measures. Then
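In our coupled setting the edge labels are Unif[0,1], so the theorem says the empirical cdfs converge uniformly to the identity on [0,1]. A quick numerical sketch (illustrative names):

```python
import random

def empirical_cdf_gap(n, seed=0):
    """Sup-norm distance between the empirical cdf of n i.i.d. Unif[0,1]
    samples and the true cdf F(x) = x, evaluated at the sample points."""
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(n))
    # At the i-th order statistic the empirical cdf jumps from i/n to (i+1)/n.
    return max(max(abs((i + 1) / n - x), abs(i / n - x))
               for i, x in enumerate(xs))

# Glivenko-Cantelli: the sup gap shrinks (the DKW inequality gives rate ~ n^{-1/2}).
assert empirical_cdf_gap(20000) < 0.02
```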
However, we are interested in the limiting behaviour of the empirical measures not of one path, but of all paths from a given point or all paths in a given direction, as we scale the length of the paths. This allows us to observe more than just the original measure.
2.2 Metrics on Measures
To gauge the distance between two measures we use the Levy-Prokhorov metric. We briefly introduce this metric as well as the total variation metric and we outline the relevant properties.
Consider a metric space with Borel σ-algebra . We denote by the set of finite Borel measures on , by the set of non-negative finite Borel measures, and by the set of non-negative finite Borel measures with total mass for any . In this notation, is the set of Borel probability measures.
Definition.
The total variation norm on is defined by
This of course gives rise to a total variation metric, given by
For example, the total variation of any measure is its total mass .
Definition.
For any and , the -neighborhood of is defined to be
Definition.
The Levy-Prokhorov metric on is defined by
It is a standard result that metrizes the weak convergence of measures and total variation metrizes the strong convergence of measures in . For details, see [Hub04, Sect. 2.3].
We now derive two useful inequalities involving the Levy-Prokhorov metric.
Lemma 3.
For ,
Remark.
This lemma establishes that the Levy-Prokhorov metric is weaker than the total variation metric.
Remark.
It is trivial to see that .
Proof.
Let . If then so . If , for any ,
and similarly so . ∎
We next show that satisfies a kind of subadditivity.
Lemma 4.
Let . Then
Remark.
Note that in the case the inequality becomes
Proof.
For any we have for any ,
By symmetry, the same inequality holds with swapped. Thus
∎
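Both Lemma 3 (the Levy-Prokhorov distance is dominated by total variation) and Lemma 4 (subadditivity under sums of measures) can be checked numerically for measures with small finite support, where the Levy-Prokhorov distance can be brute-forced from its definition. A sketch (illustrative; `lp_dist` searches a grid of candidate ε values over all subsets of the support, so it is approximate and only suitable for tiny examples):

```python
from itertools import combinations

def lp_dist(mu, nu, grid=1000):
    """Brute-force Levy-Prokhorov distance between two finite non-negative
    measures given as {atom: mass} dicts with atoms in [0,1]."""
    support = sorted(set(mu) | set(nu))
    subsets = [list(c) for r in range(len(support) + 1)
               for c in combinations(support, r)]

    def mass(m, pts):
        return sum(m.get(p, 0.0) for p in pts)

    def ok(eps):
        for a in subsets:
            # Closed eps-neighborhood of A, restricted to the support.
            blow = [p for p in support if any(abs(p - q) <= eps for q in a)]
            if mass(mu, a) > mass(nu, blow) + eps + 1e-12:
                return False
            if mass(nu, a) > mass(mu, blow) + eps + 1e-12:
                return False
        return True

    return next(k / grid for k in range(grid + 1) if ok(k / grid))

# Lemma 3: the Levy-Prokhorov distance is bounded by the total variation distance.
mu, nu = {0.0: 0.5, 0.5: 0.5}, {0.0: 1.0}
tv = sum(abs(mu.get(p, 0.0) - nu.get(p, 0.0)) for p in set(mu) | set(nu))
assert lp_dist(mu, nu) <= tv

# Lemma 4 (subadditivity): d(mu1+mu2, nu1+nu2) <= d(mu1,nu1) + d(mu2,nu2).
m1, n1 = {0.0: 1.0}, {0.2: 1.0}
m2, n2 = {0.5: 1.0}, {0.6: 1.0}
msum, nsum = {0.0: 1.0, 0.5: 1.0}, {0.2: 1.0, 0.6: 1.0}
assert lp_dist(msum, nsum) <= lp_dist(m1, n1) + lp_dist(m2, n2) + 1e-9
```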
2.3 A Convenient Coupling of the Edge Weights
We follow [Bat20, Sect. 2.1] in coupling the environment to uniform random variables in order to work in a compact space of measures and to connect our results with his.
The idea is to write our i.i.d. edge weights as
for some measurable function and i.i.d. Unif[0,1]-valued random variables on the same probability space as . For instance, we could take the quantile function
But our results (in particular, our definitions of grid entropy) are independent of the chosen coupling function, so we allow it to be arbitrary (with the conditions stated above). This comes into play later in Section 5, when we study the Gibbs Free Energy as a function of this coupling.
To distinguish between the and the , we call the former edge weights and the latter edge labels.
Let denote Lebesgue measure on . We tweak the definition of empirical measures in this new setup: for any NE path in , define
Then we can relate and the to and the respectively via the pushforward:
One advantage is of course that the set of probability measures on is weakly compact, so for any sequence of paths in the grid we get a subsequence for which converges weakly to some measure.
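As a concrete instance of the coupling: if the edge-weight distribution is Exp(1), one valid choice of coupling function is the quantile function u ↦ -log(1-u), and the pushforward of Lebesgue measure on [0,1] under it is Exp(1). A sketch (the name `tau` is illustrative, not necessarily the paper's notation):

```python
import random
from math import log

def tau(u):
    """Quantile function of Exp(1): pushes a Unif[0,1] label to an edge weight."""
    return -log(1.0 - u)

rng = random.Random(1)
labels = [rng.random() for _ in range(200000)]   # i.i.d. Unif[0,1] edge labels
weights = [tau(u) for u in labels]               # coupled Exp(1) edge weights

# The pushforward of Lebesgue measure under tau is Exp(1): mean 1,
# up to sampling error.
mean = sum(weights) / len(weights)
assert abs(mean - 1.0) < 0.02
```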
In the case of a continuous cumulative distribution function , we can use a lemma proved in [Bat20] to get a nice duality.
Lemma 5.
Given a measure on with continuous cdf , if we let
be its quantile function then there is a probability 1 event on which for some subsequence and paths if and only if .
Proof.
[Bat20, Lemma 6.15] establishes that there is a probability 1 event on which
implies . But then on the same event, given a subsequence for which , compactness gives us a convergent subsubsequence hence
so . The fact that is continuous and the quantile function satisfies
implies and agree on all sets hence . ∎
Thus, in the case when has continuous cdf, we lose no generality by doing this coupling and working with measures on [0,1].
However, even in the most general case where may not have continuous cdf or bounded support, our work in developing grid entropy still holds because we only use the compactness of the space of measures in later sections devoted to our variational formula for the Gibbs Free Energy.
In short, we lose nothing by restricting ourselves to the compact space of measures on .
A benefit of this coupling is the following amazing result of Bates [Bat20, Lemma 6.3 and Thm 6.4]. Here we denote by the sets of finite non-negative Borel measures on [0,1] and finite non-negative Borel measures on [0,1] with total mass .
Theorem 6.
-
(i)
Fix . Define to be the (event-dependent) set of measures for which there is a subsequence of paths with . Then there exists a deterministic, weakly closed set independent of s.t.
-
(ii)
Fix . Define to be the (event-dependent) set of measures for which there is a subsequence of length paths from with . Then there exists a deterministic, weakly closed set independent of s.t.
Moreover, .
Remark.
Bates proves this theorem in the setup of First Passage Percolation, but notes that it holds analogously in the Last Passage model.
Instead of looking at all empirical measures that have a weakly convergent subsequence we may look at only certain empirical measures with this property and the same result will hold. The proof is almost identical to Bates’s original proof except for this change, so we omit it.
Corollary 7.
-
(i)
Fix and . Define to be the (event-dependent) set of measures for which there is a subsequence of the paths with the th smallest values of satisfying
Then there exists a deterministic weakly closed set s.t.
-
(ii)
Fix and . Define to be the (event-dependent) set of measures for which there is a subsequence of the length paths from with the th smallest values of satisfying
Then there exists a deterministic weakly closed set s.t.
Remark.
Since the are increasing in , the sets are decreasing in . The same holds for .
Once we develop the concept of grid entropy, we will easily relate these sets to the sets of measures with finite grid entropy in direction , grid entropy at least in direction , finite direction-free length grid entropy, direction-free length grid entropy at least respectively.
2.4 Directed Metric Spaces
Grid entropy will turn out to be a directed metric with negative sign. We recall what that entails.
Definition.
A directed metric space with positive sign is a triple where is a vector space, is a distance function satisfying and the usual triangle inequality . A directed metric space with negative sign is a triple such that is a directed metric space with positive sign.
Remark.
Standard metric spaces are clearly examples of directed metric spaces with positive sign. However, directed metric spaces with positive sign need not be metric spaces: the distance might not be symmetric, and might not be positive and finite for non-equal arguments. The “directed” in the name indicates the possibility of asymmetry.
Certain directed metrics give rise to directed norms in the same way certain metrics give rise to norms.
Definition.
If is a directed metric with positive/negative sign such that it is translation-invariant and homogeneous with respect to positive factors, then it induces a directed norm with positive/negative sign given by
Of particular interest to us are directed norms defined in terms of the empirical measures we observe along paths between points. In the fixed-direction case, our directed norms will be defined on the space of tuples consisting of a point (the “direction” we are observing) and a finite Borel measure (the target measure we want the empirical measures to be near). In the direction-free case, our directed norms will just be defined on the space of finite Borel measures.
2.5 The Subadditive Ergodic Theorem
The key theorem we use to prove the existence of the scaling limit of these directed metrics is Liggett’s improved version of Kingman’s Subadditive Ergodic Theorem. Before stating this theorem, we recall the definitions of stationary sequences and ergodicity, as presented in [Dur19, Sect. 7].
Definition.
A sequence of random variables is called stationary if the joint distributions of the shifted sequences do not depend on the shift.
As it turns out, the sequence of random variables we are interested in is a sequence of i.i.d. random variables, which is clearly stationary.
Definition.
Let be a probability space and let be a map. is said to be measure-preserving if . is said to be ergodic if it is measure-preserving and if all -invariant measurable sets are trivial, i.e. whenever and .
In the context of sequences, we look at the space of infinite sequences of real numbers with the σ-algebra generated by
(where is the Borel -algebra on ) and with the product probability measure on determined by
where and is a Borel probability measure on . We consider the shift operator given by . is easily seen to be measure-preserving with respect to since for the generating sets of . When we refer to the ergodicity of a sequence of random variables, we mean the ergodicity of this shift operator.
In our case, where we have a sequence of i.i.d. , the corresponding shift operator is ergodic. Indeed, if then
so is in the tail σ-field and thus by Kolmogorov’s 0-1 Law.
We are now ready to state the Subadditive Ergodic Theorem in the form we need.
Theorem 8 (Kingman’s Subadditive Ergodic Theorem, [Lig85]).
Suppose are random variables satisfying
-
(i)
constant s.t. and for all
-
(ii)
, is a stationary process
-
(iii)
The joint distributions of are not dependent on
-
(iv)
Then
-
(a)
-
(b)
exists a.s. and in , and
-
(c)
If the stationary sequences in (ii) are ergodic, then a.s.
Remark.
We may replace with in the statement of the theorem to obtain a version for superadditive sequences.
This theorem is the basis for the construction of grid entropy in our paper.
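To illustrate the superadditive version in our context: last passage times along the diagonal are superadditive, so the theorem yields an a.s. limit of the normalized passage time. For the exactly solvable case of i.i.d. Exp(1) site weights the diagonal limit is known to be 4, which a small dynamic-programming simulation reproduces (illustrative only; the paper makes no solvability assumptions):

```python
import random
from math import log

def last_passage_time(n, seed=0):
    """Last passage time to (n, n) for i.i.d. Exp(1) site weights, computed by
    dynamic programming: T[i][j] = w[i][j] + max(T[i-1][j], T[i][j-1])."""
    rng = random.Random(seed)
    T = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            w = -log(1.0 - rng.random())  # Exp(1) site weight
            best = max(T[i - 1][j] if i else 0.0, T[i][j - 1] if j else 0.0)
            T[i][j] = w + best
    return T[n][n]

# Superadditive ergodic limit: T(0,(n,n))/n -> 4 a.s. for Exp(1) weights.
ratio = last_passage_time(400) / 400
assert 3.5 < ratio < 4.1
```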
2.6 Relative Entropy and Sanov’s Theorem
In the next preliminary section, we recall the basics of the Kullback-Leibler divergence (introduced in [KL51]) and Sanov’s Theorem for large deviations. We later use this theorem to establish a relationship between our grid entropy and this notion of relative entropy.
Definition.
Let be distributions on our underlying metric space . The Kullback-Leibler divergence or relative entropy of from is defined to be
where is the Radon-Nikodym derivative and is the natural logarithm.
Remark.
Our main interest in relative entropy is that it is the rate function for large deviations of empirical measures. This is captured by Sanov’s Theorem.
Theorem 9 (Sanov’s Theorem, [DS01]).
Consider a sequence of i.i.d. random variables taking values in a set . Let be their unnormalized empirical measures. Then for any weakly closed set we have
and for any weakly open set we have
Remark.
Since is closed, the infimum is achieved by some . Furthermore, if then the right-hand side of the inequality is 0, which gives us no information; however, if , then the theorem gives an exponential bound on large deviations.
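As a scalar sanity check of this rate function (illustrative, using Bernoulli variables rather than the paper's setting): the probability that the empirical mean of n i.i.d. Bernoulli(p) samples exceeds a > p decays at exponential rate given by the relative entropy of Bernoulli(a) from Bernoulli(p).

```python
from math import comb, log

def kl_bernoulli(a, p):
    """Relative entropy D(Ber(a) || Ber(p))."""
    return a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))

def tail(n, p, a):
    """Exact P(S_n >= a*n) for S_n ~ Binomial(n, p)."""
    k0 = int(a * n)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k0, n + 1))

n, p, a = 500, 0.3, 0.5
rate = -log(tail(n, p, a)) / n
# Sanov/Cramer: the empirical rate matches the relative entropy up to O(log n / n).
assert abs(rate - kl_bernoulli(a, p)) < 0.02
```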
3 Grid Entropy as a Directed Norm
3.1 The Plan for Deriving Direction-Fixed Grid Entropy
For the purposes of Section 3, we temporarily forget our original definition (1) of grid entropy and rederive it as a limit of scaled directed metrics.
To summarize the setting described in Section 2: we consider empirical measures along NE paths on the lattice, where the edges carry i.i.d. weights obtained by applying a measurable function to i.i.d. Unif[0,1] edge labels. The spaces of finite, finite non-negative, and fixed-total-mass finite non-negative Borel measures are denoted as in Section 2.2.
We begin with direction-fixed grid entropy. We wish to count the number of paths with empirical measure very close to the target . Let us try to define a distance on by
Note that this is if there are no such paths and it is 0 if (since there is exactly one path , and it has empirical measure 0).
One glaring issue is that we are only counting paths that exhibit exactly the target empirical measure. Since the Lebesgue measure on [0,1] is continuous, almost surely the edge labels take distinct values, and thus the unnormalized empirical measures uniquely determine the paths. It follows that almost surely this distance is always either infinite or 0. We need to change our definition to count paths with an empirical measure “close” to the target instead.
Another problem is that we wish to apply the Subadditive Ergodic Theorem to learn about the behavior of
as . Thus we must also change our definition of so that it is integrable (and in particular finite a.s.) when .
The trick is to replace the counting of the paths exhibiting the exact target empirical measure with a “cost function” that attributes an exponential cost to each path based on how far its empirical measure is from the target. We also introduce a parameter that, as it decreases to 0, takes the cost function towards the counting of the paths with the target empirical measure that we tried initially.
Definition.
Fix . Define a distance on by
where is the Levy-Prokhorov metric and where we sum over all NE paths .
Remark.
This distance is 0 if and only if .
Remark.
For any path , the corresponding empirical measure observed must necessarily be of the form where .
Remark.
As the costs converge to the indicators
Hence the sum of the costs approaches the number of paths with empirical measures precisely equal to .
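This convergence is a scalar fact about exponential costs: as the parameter tends to 0, each cost exp(-d/δ) tends to the indicator that d = 0, so the summed costs tend to the number of exact matches. A toy illustration (hypothetical distances; `delta` plays the role of the paper's parameter):

```python
from math import exp

def soft_count(distances, delta):
    """Sum of exponential costs exp(-d/delta) over a list of distances d >= 0."""
    return sum(exp(-d / delta) for d in distances)

# Distances of several paths' empirical measures to a target; two exact hits.
ds = [0.0, 0.0, 0.05, 0.2, 0.7]
counts = [soft_count(ds, d) for d in (1.0, 0.1, 0.001)]
# As delta -> 0 the soft count decreases to the number of exact matches (= 2).
assert counts[0] > counts[1] > counts[2]
assert abs(counts[2] - 2.0) < 1e-9
```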
With this definition in mind, let us discuss the plan of attack.
First, we prove the existence of using the Subadditive Ergodic Theorem and some estimates we derive for the error terms when do not have integer coordinates. Then we take the infimum over of these limits and we define the resulting norm to be our grid entropy.
Theorem 10.
Fix and . Then
converges in probability to a constant. When , the convergence is pointwise a.s.
Remark.
The theorem holds trivially, with the limit being , if or if and . It also holds trivially, with the limit being 0, if and .
The limit given by this theorem is a directed metric with negative sign on . When we take an infimum over we still get a directed metric with negative sign.
Theorem 11.
For , and define
Then each as well as are directed metrics with negative sign on .
We show that this metric gives rise to a norm on . This will finish our discussion of the direction-fixed grid entropy and we will move on to the direction-free case.
3.2 The Limit Shape of Starting at
In this and the following subsection, we focus on the direction-fixed grid entropy.
To prove Theorem 10, we first prove a simplified version, which we later generalize easily.
Theorem 12.
Fix and . Then
Remark.
As noted before, Theorem 10 holds trivially when , so we need not bother with this case.
We prove this theorem in stages, starting with the case when has integer coordinates. But first, we show a useful bound on our random variables .
Lemma 13.
Let and with . Then
Proof.
Recall that any path has edges, so , and that the total number of paths is
Also, for any such we have by Lemma 3
Thus
∎
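The counting bound in the proof can be sanity-checked in the planar case. The sketch below (ours, with d = 2 for simplicity) enumerates NE paths to (a, b) and confirms the count is the binomial coefficient C(a+b, a), the two-dimensional instance of the multinomial formula used above.

```python
from math import comb

def count_ne_paths(a, b):
    """Count NE lattice paths from (0,0) to (a,b) by enumeration:
    each step moves one unit north or east."""
    if a == 0 or b == 0:
        return 1
    return count_ne_paths(a - 1, b) + count_ne_paths(a, b - 1)

# Each path has a + b edges, and choosing which of them go east
# determines the path, so the count is C(a+b, a):
assert all(count_ne_paths(a, b) == comb(a + b, a)
           for a in range(6) for b in range(6))
```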
Lemma 14.
Theorem 12 holds with .
Proof.
We wish to use Kingman’s Subadditive Ergodic Theorem (Theorem 8) with
Let us now check the conditions (i)-(iv).
Next, for every , the sequence
is i.i.d.: the distribution of the unnormalized empirical measures of paths does not depend on (since the edge labels are i.i.d.), so the distribution of the cost functions for does not depend on . Thus (ii) holds. Furthermore, as discussed in Section 2.5, being i.i.d. implies the sequence is ergodic.
Similarly, the joint distributions of
do not depend on , since the edge labels are i.i.d. and the distribution of the empirical measures of paths in a lattice rectangle whose top-right and bottom-left corners differ by is independent of the rectangle's location. Thus we have (iii).
It remains to show (iv), namely to show that given
For any paths and , we get a unique concatenation . Its empirical measure satisfies . But the Levy-Prokhorov metric satisfies subadditivity by Lemma 4. Thus
But then
Not all paths pass through so we can upper bound the expression above by removing this condition:
Thus we can apply the Subadditive Ergodic Theorem (Theorem 8) to get that
converges a.s. to the constant
∎
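The superadditivity driving this proof is already visible in the deterministic toy case of pure path counting. The following sketch is our own illustration, not from the paper: with a_n the log-number of NE paths to (n, n), concatenation at the midpoint gives Fekete-style superadditivity, and a_n / n converges to its supremum, here log 4.

```python
from math import comb, log

# a_n = log #paths from (0,0) to (n,n); concatenating a path to (m,m)
# with a path from (m,m) to (m+n, m+n) shows a_{m+n} >= a_m + a_n,
# so by Fekete's lemma a_n / n converges to sup_n a_n / n.
a = lambda n: log(comb(2 * n, n))

assert all(a(m + n) >= a(m) + a(n)
           for m in range(1, 10) for n in range(1, 10))
# The limit of a_n / n is log 4, by Stirling's approximation:
assert abs(a(300) / 300 - log(4)) < 0.02
```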
The next order of business is proving the theorem for with rational coordinates. The following error estimate, on how changes when is perturbed in the SE direction and is perturbed arbitrarily, will prove useful.
Lemma 15.
Fix and . Then for any with ,
Proof.
Fix any such . The inequality holds trivially if so we may assume . Then there is at least one path , and we fix it.
For any path , we concatenate it with to get a unique path . Note that consists of edges so its empirical measure has Total Variation at most . Thus satisfies
It follows that
∎
Using this lemma to approximate for in terms of where , we prove our limit theorem for with rational coordinates.
Lemma 16.
Theorem 12 holds for .
Proof.
Let for . First, compare for arbitrary using the previous lemma. The idea is that has integer coordinates so have the desired limits by Lemma 14. The rest of the are bounded above/below by respectively plus some small error terms which go to 0 as .
Consider any . Note that
and
By Lemma 15,
Thus
(4)
But so by Lemma 13, our limit theorem holds for . That is,
(5)
Taking expectations in (4), then taking the supremum/limit as and using (5) we get
(6)
since the error terms in (4) go to 0. Similarly, taking the limit as in (5) and using (4), (6) we get
Multiplying everything by and using the fact that any can be written as , we get
∎
Our next objective is to prove the full version of Theorem 12. We will need two short lemmas giving sufficient conditions for a.s. convergence of bounded random variables to the supremum of their expectations. The first of these lemmas looks at the case where we are given a particular lower bound for the liminf of all subsequences of our sequence.
Lemma 17.
Let be uniformly bounded random variables. If a.s.
then a.s.
Proof.
First, taking expectations in
and using Fatou’s Lemma, we get
so all the inequalities are equalities and
This holds for all subsequences , including the full sequence. Thus a.s.
hence a.s. ∎
The next lemma is about approximating a sequence of random variables from below by sequences that each converge to the supremum of their expectations.
Lemma 18.
Let be uniformly bounded random variables s.t.
- (i) For every fixed , for large enough
- (ii) For every fixed ,
- (iii)
Then a.s.
Proof.
We wish to prove the hypothesis of Lemma 17 for . Let
which has measure 0 by (ii). We claim that in the event we have
Consider any subsequence and any . By (i), . Taking the over , and using the fact that we are in the event , we get
This holds for all . Taking the supremum over and using (iii), we get
as desired. Applying Lemma 17, we get
∎
We can finally prove Theorem 12.
Proof of Theorem 12.
The case is handled by Lemma 16. So we may assume .
We construct a sequence as follows. For every , either and we pick , or and we pick s.t. . It follows that with . That is, forms a "staircase" in .
The intuition is that approximates from below. Each converges in to the right limit a.s. by Lemma 16, and we use Lemma 18 to prove that converges a.s. to the right limit.
Let us now prove the hypothesis of Lemma 18 with
Note that these random variables are bounded by Lemma 13:
First, fix and consider any . By Lemma 15,
so we have (i). Next, (ii) follows from Lemma 16:
It remains to show (iii). Consider any fixed . For each either and , or and hence for large enough . Thus for large enough . It follows that
By the Bounded Convergence Theorem, . Thus
so taking the supremum over we get (iii):
This completes the proof of the existence of the limit shape of when measuring the distance from .
3.3 The Limit Shape of —The General Case
We use Lemma 18 to generalize Theorem 12 to Theorem 10, where measures distance between any two elements of .
Theorem 10.
Fix and . Then
converges in probability to a constant. When , the convergence is pointwise a.s.
Remark.
During the course of the proof, we also show that the limit shape of
is translation-invariant. This will help us later.
Proof.
As noted before, the theorem holds trivially if . Thus we may assume .
Observe that
(7)
so it suffices to assume and prove
converge a.s. to a constant.
Our argument mirrors the one used in the proof of Theorem 12: we approximate these from below by some converging in and we apply Lemma 18.
First we prove some inequalities. Fix . Observe that for ,
(8)
and thus
(9)
On the other hand, for s.t. ,
for . Thus
This equation also holds trivially for s.t. . Thus
(10)
We prove the hypothesis of Lemma 18 with
First, by Lemma 15 and (9), for every fixed , for large enough we have
i.e. which is precisely (i). Also, Theorem 12 gives (ii): for every fixed , a.s.
Note that this implies
(11)
It remains to show (iii). Note that taking expectations in (i) immediately gives
(12)
We show this is an equality. By Lemma 15 and (10), for every fixed , for large enough we have
Dividing by , taking expectations and taking the supremum over we get by (11)
(13)
which when combined with (12) gives (iii).
Applying Lemma 18, we get
i.e.
Furthermore, by (11), (12) and (13), we have
Combining this with (7), we get that converges in probability to the a.s. limit of .
This completes the proof of Theorem 10 and of translation-invariance of the limit shape.
∎
3.4 Grid Entropy as a Directed Norm
In the previous sections we showed the existence of the limit shape of . We now take the infimum as and we show that the result is a directed metric with negative sign that gives rise to a norm, which we call grid entropy.
Theorem 11.
For , and define
Then each as well as are directed metrics with negative sign on .
Remark.
For any and , is monotone decreasing as so
By Lemma 13, for every and ,
so it follows that
Once we prove that our two definitions of grid entropy are equivalent, this bound will be improved.
Remark.
As was the case with Theorem 12, the limit is trivially if or if and , and it is trivially 0 if and .
Proof.
As noted above,
It remains to prove the reverse triangle inequality. Let and and consider any and any . If or then the following inequality holds trivially (because the right-hand side is )
(14)
Now suppose . Given paths , , we concatenate them to obtain a unique path with unnormalized empirical measure . By the subadditivity of the Levy-Prokhorov metric (Lemma 4),
so
It follows that (14) holds. Dividing (14) by and taking the limit (in probability) as we get
so is a directed metric with negative sign. Taking the limit as , we still obtain a directed metric with negative sign. ∎
We proceed to show that gives rise to a directed norm with negative sign.
Theorem 19.
- (i) Each is translation-invariant and positive-homogeneous. So is .
- (ii) For define the grid entropy with respect to to be
Then this is a directed norm with negative sign on .
Remark.
From before, is if or if and , and it is 0 if and .
Remark.
A directed metric with negative sign is clearly concave. Thus each as well as are concave functions on their respective domains.
Remark.
These properties of grid entropy also follow from [RAS14].
Proof.
(i) Fix . We already showed that is translation-invariant while proving Theorem 10.
By translation-invariance, it suffices to show that is positive-homogeneous. Consider any . Then
Looking at the subsequence consisting of multiples of , we get
But each so
Thus is positive-homogeneous for rational factors.
Now consider any and take sequences . Consider any , and let us look at
By Lemma 15,
Dividing by and taking a.s. limits, and applying homogeneity for positive rational factors we get
Taking gives us
so is homogeneous with respect to any positive real factor.
Taking the infimum over we get that is translation-invariant and positive-homogeneous.
(ii) Follows directly from (i).
∎
3.5 Direction-free Grid Entropy
We now wish to develop a grid entropy for the case where we no longer restrict ourselves to paths in a given direction , and instead look at all length paths from for a given size parameter . Another way of putting this is that we look at paths from the origin to the line, or "level", . Recall that the set of all such paths is denoted .
If we try to simply repeat our previous argument, we run into a dead end because we are no longer in a superadditive setting. The solution is to observe that the distances are maximized, over with , by . This intuitively makes sense, since this is the direction with the most NE paths.
Lemma 20.
Fix . Then
Remark.
In Section 4 we show that is a necessary condition for to be finite, so it makes sense that we only take the supremum over with .
Proof.
This is an easy consequence of the symmetries of the grid and the concavity of and direction-fixed grid entropy. We focus on the proof for ; the argument for grid entropy goes the same way.
Fix . By positive-homogeneity and since the is trivial, we may assume . Suppose there exists s.t. and
(15)
Among such , pick one that maximizes the number of coordinates equal to . Thus there are distinct s.t. , so we can write as a convex combination of :
(16)
Let be with swapped. By symmetry of the grid,
hence by concavity of ,
But only changes the coordinates of in positions , with becoming by (16). Thus we have found a satisfying (15) that has at least one more coordinate that is than our previous , which we had assumed had the maximal number of such coordinates. Contradiction. Therefore as desired. ∎
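The exchange argument in this proof reflects a general fact: a symmetric concave function on the simplex is maximized at the uniform point. A quick numerical sanity check follows (our own toy example, with Shannon entropy standing in for the symmetric concave function).

```python
import random
from math import log

def shannon(p):
    """Shannon entropy: a symmetric, concave function on the simplex."""
    return -sum(x * log(x) for x in p if x > 0)

random.seed(0)
d = 4
uniform = [1.0 / d] * d
for _ in range(1000):
    # A random point of the simplex, via sorted uniform cuts of [0,1].
    cuts = sorted(random.random() for _ in range(d - 1))
    p = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]
    # The uniform point maximizes any symmetric concave function:
    assert shannon(p) <= shannon(uniform) + 1e-12
```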
We now use this useful fact along with the compactness of to show that is the desired direction-free grid entropy of length .
Theorem 21.
Fix . For any we have
a.s.
Proof.
The statement is trivial when so we may assume . We focus on the first equality. By Lemma 20 and the trivial fact that in general, we immediately get
Suppose equality does not hold on some event of positive probability. Thus there exists s.t.
(17)
with positive probability. Let us intersect this event with the measure 1 event that
exists and is equal to the limit in Theorem 10 for every . Denote this event by .
Unfortunately, the expression inside the supremum is only continuous when approaching from the SE, and not necessarily when approaching along , so we need to work with some rather unpleasant approximations.
Let be the (event-dependent) subsequence corresponding to the . For every pick some (event-dependent) with s.t.
(18)
By compactness of , there is a converging subsequence for some . Pick some s.t. and
(19)
The idea is that since and the coordinates of are strictly positive, then there exists s.t. . For such we can apply Lemma 15 to get
(20)
Recalling that we are in the event , pick some large enough so that for any
(21)
and
(22)
by (19). Putting everything together, we get
By Lemma 20, this is
which is a contradiction. Note that even though our sequence , the limit and the chosen are event-dependent, the upper bound above is not, by virtue of Lemma 20. That is what makes this argument work.
We have thus shown
We now proceed with the second equality. Observe that for any , any length NE path from ends at for some with . Furthermore, the number of possible different endpoints of such paths is precisely , which is the number of -tuples of non-negative integers summing to . Therefore
When we divide by and take the limit, the term goes to 0 and we get that
as desired. ∎
Of course, we already established that is a directed metric with negative sign. This means that if we now take the infimum over we get that the length direction-free grid entropy is precisely . Since this is unless (see Theorem 24 in a later section), then we can simply let to begin with.
Definition.
For let and define the direction-free grid entropy to be
Remark.
The final equality holds by Lemma 20 combined with the fact that unless .
3.6 Equivalence of Approaches
The next order of business is to show that these definitions of direction-fixed/direction-free grid entropy are equivalent to our original, more intuitive, definitions (1). The novel results in this subsection show the benefit of the fresh approach to grid entropy presented in this paper.
The key idea is that the sum
appearing in the definition of is approximately the number of paths with very small and the error should disappear in the limit.
Let us recall how the are defined. Fix . For any with define
to be the order statistics of as we range over all , and let
In this way are just the minimum/maximum values of respectively.
Similarly, in the direction-free case, for we let and for any with define
to be the -th smallest value of when we range over all length paths from .
We usually omit the in for the sake of space.
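To make the order-statistics notation concrete, here is a toy computation of ours: we draw placeholder per-path distances (a real computation would range over all NE paths and use the Levy-Prokhorov distance to the target measure) and sort them.

```python
import random

random.seed(1)
# Pretend each of N paths pi has a distance rho(mu_pi, nu) to the
# target measure; here we simply draw placeholder values.
N = 20
dists = [random.random() for _ in range(N)]

# The order statistics are the sorted values: the k-th order
# statistic is the k-th smallest distance over the N paths, so the
# first/last order statistics are the min/max respectively.
order_stats = sorted(dists)
assert order_stats[0] == min(dists) and order_stats[-1] == max(dists)
assert all(order_stats[k] <= order_stats[k + 1] for k in range(N - 1))
```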
We are now ready to prove that our definitions of grid entropy are equivalent, and moreover that the pertinent limits either converge to 0 a.s. or the corresponding ’s are a.s. In terms of notation, we denote by the direction-fixed/direction-free grid entropy as defined in Theorem 19/Theorem 21.
Theorem 22.
Fix and let .
- (i) The grid entropy being finite is characterized by the existence of paths with empirical measure arbitrarily close to the target. More specifically,
Furthermore, if then
and if then . In particular,
i.e. is always among the distributions observed.
- (ii) We have
In other words,
(23)
where this is if the set of ’s is empty. Similarly,
and the direction-free grid entropy is given by
Remark.
Since the increase in , then the sets we are taking the supremum over in the definition in (ii) are either or of the form or where is the corresponding grid entropy.
Remark.
The limits
need not exist for all . What we are saying is that they do exist and are equal to 0 a.s. for , and that the corresponding ’s are a.s. for . This, along with the existence of the limits in (i), is a completely non-trivial fact that is a direct consequence of the a.s. convergence of .
Proof.
We focus on proving the direction-fixed grid entropy statements. The direction-free arguments are analogous.
(i) The strategy is to provide upper and lower bounds on in terms of
the leading term in the sum over . Since converges a.s. then our analysis gives that must be in the intersection of two intervals, one in terms of
and the other in terms of the . The only way these intervals can have non-empty intersection for small enough is if the ’s equal the ’s.
Fix . We consider the over . Since for all valid , then
where each . Taking the a.s. or the a.s. and using Lemma 1 we get that a.s.,
(24)
Since this holds for arbitrarily small then we must have
Furthermore, taking in (24), we see that
Moreover, if indeed then (24) gives .
Finally, if we fix an infinite NE path passing through all then by the Glivenko-Cantelli Theorem (Theorem 2),
where denotes the restriction of the path to , so from above it follows that .
(ii) If then by (i),
so the statement holds trivially because the are nondecreasing in . Thus it suffices to assume and hence
(25)
Fix and consider any . By (25), the leading term of
is . So we must look at the secondary terms by writing
(26)
where the summands are nonincreasing in .
The argument is similar to the one in (i), in that we compute some upper and lower bounds for this expression.
Fix an arbitrary . We lower bound (26) by truncating the sum at and lower bounding all the summands by the summand:
(27)
We upper bound (26) by bounding the summands for up to by 1 and bounding the remaining summands by the summand:
(28)
Consider the event-dependent sequence
Taking the in (27),(28) and using (25) and Lemma 1 we get
a.s. for this arbitrary .
In particular, for any we get
hence
Thus our approaches to grid entropy are equivalent.
Note that so far we have not made use of the coupling or of the compactness of the space of measures on . We could have developed grid entropy in the original environment, with unnormalized empirical measures and edge weight distribution and target measures on in the same way. Furthermore, if has a continuous cdf then the value of the grid entropy would be the same either way, because of the duality between the environments, established in Lemma 5; that is, for any and we would have
where the grid entropies are developed on the environments respectively. Even in the general case where may not be continuous, [Bat20, Lemma 6.15] implies that
Again though, this is just a nice fact; for practical purposes we may simply define grid entropy on the space of measures on and our arguments thus far hold.
We now reap the benefits of this coupling, by describing the sets defined in Theorem 6 and Corollary 7 in terms of direction-fixed/direction-free grid entropy.
Corollary 23.
Fix and .
- (i)
- (ii)
Fix . The set of measures with grid entropy in direction at least/most can be characterized in terms of the deterministic sets defined in Corollary 7 (i):
Thus
, and
Now fix . The set of measures with length direction-free grid entropy at least/most can be characterized in terms of the deterministic sets defined in Corollary 7 (ii):
Thus
, and
Proof.
This follows immediately from Theorem 22. ∎
Now that we have a firm grasp of what direction-fixed and direction-free grid entropies actually measure, we can move on to examining what information they can give us.
4 Properties of Grid Entropy
Since direction-free entropy is just for then it suffices to study the properties of direction-fixed grid entropy. We are particularly interested in what the grid entropy can tell us about and when is finite.
We have already established that it is necessary for to be in if we want not to be , and that if then if and only if . The question now is what sorts of measures are observed by the empirical measures along the direction .
By positive-homogeneity of the norm, it suffices to consider with and study which give rise to finite . In this section, it will be more convenient to work with the description of grid entropy given by Corollary 23 (ii).
Theorem 24.
Recall that we couple the edge weights with i.i.d. edge labels (the Lebesgue measure on ) and that
Consider any with and any . Suppose . Then:
- (i) is a probability measure on .
- (ii) For any or set , we have
- (iii) From Section 2.6 the Kullback-Leibler divergence is defined as
Then
so in particular, is absolutely continuous with respect to the Lebesgue measure on .
Remark.
If we take and recall that we get the analogous statements for length 1 direction-free grid entropy.
Remark.
(i) is exactly what we expect because the empirical measures for paths
are non-negative Borel measures with total mass so in the limit we expect to only observe which are probability measures.
Remark.
Recalling Theorem 6, (i) implies that (and hence ) is weakly compact.
Remark.
If we do not make use of our coupling and define grid entropy directly on the space of measures on , (i)-(iii) hold analogously. This is trivial since we do not use weak compactness in the proof of this theorem.
Proof.
(i) The intuition is that if then each is bounded below away from 0 hence it does not converge to 0.
Suppose is not a probability measure on . By the reverse triangle inequality for , for any and any path ,
Since and then there is s.t. for large enough ,
It follows that a.s. so . Contradiction. Therefore must be a probability measure on .
(ii) Before proceeding with (ii) we prove a quick claim that will also help with the proof of (iii).
Claim: Fix and a nonempty set of measures . Then
In particular, if and
then
Proof. Indeed, for any we have by Markov’s Inequality
Taking of both sides and using the definition of we get the desired inequality.
Now suppose
Then for large enough ,
where the exponent is a strictly negative constant multiple of . Hence, by the Borel-Cantelli Lemma,
(Claim)
Now we begin the proof of (ii). The idea is to use the definition of the Levy-Prokhorov metric to unpack what means in terms of the values of the labels along .
First, consider any closed . Observe that hence by continuity from above, and as .
We compute the probability that for a path directly, using the definitions of and , and then we will apply the Claim.
Consider . For any path , by definition of the Levy-Prokhorov metric and the empirical measure ,
and edge labels in
and edge labels in
If or then we may completely omit the first/second half of the statement above as it is trivial. Otherwise, we take small enough so that and large enough so that . For the sake of convenience we only show the computation in the latter case; the or case is merely a simplified version of it.
Since are i.i.d. Unif, then
(29)
by the union bound. We get the asymptotics of this binomial coefficient from Lemma 1:
(30)
Therefore
Combining this with the Claim with , we get that
By Corollary 23 (ii), it follows that
This holds for arbitrary . Taking and using and ,
Therefore
(31)
This equation is symmetric in hence it holds for all open or closed. By continuity from below/above it holds for all and subsets of .
(iii) The idea is to again use the Claim, this time to compute using Sanov’s Theorem.
Fix . By Sanov’s Theorem (Theorem 9), since is weakly closed then
where the right hand side might a priori be .
Take and let (the infimum over the weakly closed is achieved since relative entropy is lower semicontinuous). Then hence
as desired.
Finally, recall that hence so is finite and . ∎
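Sanov's Theorem, invoked in the proof of (iii), can be illustrated numerically in the simplest binary setting. This example is ours, not from the paper: the probability that n fair coin flips show frequency at least 3/4 decays at exponential rate D(Ber(3/4) || Ber(1/2)), up to a polynomial correction.

```python
from math import comb, log

def kl_bernoulli(p, q):
    """Kullback-Leibler divergence D(Ber(p) || Ber(q))."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

n = 200
# Exact probability that a Bin(n, 1/2) count is at least 3n/4:
tail = sum(comb(n, k) for k in range(3 * n // 4, n + 1)) * 0.5 ** n
rate = -log(tail) / n

# Sanov: rate -> D(Ber(3/4) || Ber(1/2)) as n grows, the gap being
# a positive O(log n / n) correction from the polynomial prefactor.
assert 0 < rate - kl_bernoulli(0.75, 0.5) < 0.02
```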
It is not clear in general whether any probability measure satisfying results in finite grid entropies for .
In one specific case, however, this does hold trivially. When is a permutation of , so that there is exactly one NE path , we get with the value of being 0. Note that this completely covers the case.
Also, in general we cannot explicitly compute grid entropy. But when the target measure is , we can easily show that grid entropy is maximal. This makes sense since, by the Glivenko-Cantelli Theorem, we expect most observed empirical measures to be very close to the edge label distribution.
Corollary 25.
Let . Then .
Proof.
Since and are positive-homogeneous (in the sense that and ), we may without loss of generality assume .
Suppose . Consider any . By Sanov’s Theorem (Theorem 9),
(32)
where is weakly closed and relative entropy is lower semicontinuous hence the infimum is achieved by some . But implies . Fix any
By the Claim from the proof of (ii) in the previous theorem with , we have
hence
But for large enough since , so
for this arbitrary , hence . Contradiction. ∎
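The Glivenko-Cantelli heuristic behind this corollary is easy to see numerically. In this sketch of ours, the sup-distance between the empirical cdf of n i.i.d. Uniform[0,1] labels and the Uniform cdf F(x) = x shrinks as n grows.

```python
import random

random.seed(0)

def ks_to_uniform(sample):
    """Sup distance between the empirical cdf of the sample and the
    Uniform[0,1] cdf F(x) = x, evaluated at the jump points."""
    xs = sorted(sample)
    n = len(xs)
    return max(max(abs((i + 1) / n - x), abs(i / n - x))
               for i, x in enumerate(xs))

# Longer "paths" (larger samples) hug the uniform cdf more tightly:
d_small = ks_to_uniform([random.random() for _ in range(50)])
d_large = ks_to_uniform([random.random() for _ in range(5000)])
assert d_large < 0.05
```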
Continuity is another property of interest. It turns out that grid entropy is upper semicontinuous in its argument(s); this follows immediately from (3) and the lower semicontinuity of relative entropy. However, we have not yet proved (3) so we prove a slightly weaker statement using the tools developed in this paper. We also note that grid entropy is lower semicontinuous (hence fully continuous) in some cases.
Note that here we no longer restrict ourselves to the case when .
Theorem 26.
Let s.t. .
- (i) If either or for large enough we have or for large enough we have for all s.t. then
- (ii) If , for large enough then
- (iii) If there exist constants s.t. and s.t. for large enough ,
then
Remark.
In the case when for large enough , (i) shows that direction-fixed grid entropy is always upper semicontinuous in . Taking we get that
so direction-free grid entropy is upper semicontinuous.
Remark.
In the assumptions in (ii) and (iii) it is implicit that the parameters lie in
because otherwise the grid entropy would be .
Remark.
The conditions in (i), (ii), (iii) are not mutually exclusive so for some sequences we may combine the statements to obtain
Proof.
For all three parts, we may assume the conditions hold for all .
(i) Case 1:
First suppose so that . By Theorem 22,
hence
since .
On the other hand, if then for large enough we have (because otherwise and hence , the weak limit of the , would have to be 0). But then
by Theorem 24 (i).
Case 2:
Recall that the infimum of a family of upper semicontinuous functions is upper semicontinuous. Thus it suffices to fix and show that
(33)
But for every we have by Lemma 15 and the fact that ,
For a fixed , taking we get
(33) follows immediately.
Case 3: whenever for all
We may assume (because this scenario was covered by Case 1). In particular, we may assume without loss of generality that .
To show
it suffices to prove this when . In particular, this implies that
and hence .
Now, the trick is to rescale the by a factor converging to 1 in s.t. we land back in Case 2. More precisely, define
Note that because . Since then there is some s.t. so by the assumption of Case 3, . Thus .
Furthermore, implies . In particular, . By the choice of we have for all , either hence or so . Thus
By Case 2 and positive-homogeneity of grid entropy,
as desired.
(ii) By the triangle inequality for a directed norm with negative sign,
since . Therefore
(iii) Now suppose
By the reverse triangle inequality and positive-homogeneity,
But hence
∎
In Lemma 20 we made a simple observation that is worth mentioning explicitly: is invariant under permutations of the coordinates of because of the symmetry of the grid lattice .
A last property we are interested in is the convexity of .
Lemma 27.
Let . The sets , are convex.
Proof.
Grid entropy is concave in by positive-homogeneity and the reverse triangle inequality it satisfies. ∎
5 Application: Directed Polymers
We now apply our results to give variational formulas for the point-to-point/point-to-level Gibbs Free Energy in terms of direction-fixed/direction-free grid entropy and last passage time. As mentioned, these variational formulas have previously appeared in [RAS14], [GRAS16]. Our aim is to arrive at these from the framework presented in this paper with the hope of providing fresh insights.
5.1 Setup and Known Results
Recall that the measurable function determines the edge weights
. We will be dealing with the Gibbs Free Energy, so to simplify matters we restrict ourselves in Section 5 to bounded measurable . This could be extended to measurable but possibly unbounded by truncating at constants and taking a supremum over .
Define the last passage time between points to be
where this is if . A standard result (see [Mar04] for example) is the existence of a last passage time constant, or in other words a deterministic limiting value for the last passage time in a fixed direction .
Theorem 28.
Let . Then there is a -dependent time constant s.t.
is homogeneous and concave in .
Remark.
As was the case with grid entropy, the symmetries of the grid and concavity of passage time imply that the time constant is maximized along the direction :
Remark.
Thus far, the time constant is known to be explicitly computable only when the underlying distribution is exponential.
It is useful to view last passage time as a linear functional. For any NE path , note that
where denotes the linear functional that is integration of a measurable function
against a measure . In this language, weak convergence is equivalent to
Of course, is likely not continuous. However, recalling [Bat20, Lemma 6.15], we see that there is an a.s. event on which converging weakly to implies converges weakly to hence
where is the identity function on [0,1].
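The identity above, expressing normalized passage time as the integral of the identity function against the empirical measure, can be checked mechanically; the sketch below is our own illustration with placeholder weights along a single path.

```python
import random

random.seed(2)
n = 10
weights = [random.random() for _ in range(n)]  # labels along one NE path

# The normalized empirical measure puts mass 1/n at each observed
# label, so integrating f(x) = x against it gives the average weight,
# and the passage time is n times that integral.
integral_id = sum(w * (1.0 / n) for w in weights)
assert abs(sum(weights) - n * integral_id) < 1e-12
```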
By using the compactness of and Theorem 6, Bates establishes a variational formula for the direction-fixed limit shape. See [Bat20, Theorem 2.1] for details. We restate this theorem here in our LPP setting; it can be proved by tweaking Bates' original argument. We give our own proof in the next section.
Theorem 29.
Let . Recall that Theorem 6 gives a deterministic weakly closed set which a.s. consists of all weak limits of subsequences of for paths of increasing length. Then for any measurable , the corresponding limit shape is given by
The direction-free analogue holds as well.
Remark.
Since is weakly closed and nonempty (because from before), then the set of maximizers is nonempty:
As Bates remarks, when is a singleton, empirical measures along geodesics converge weakly to this unique maximizer (or analogously, the unique minimizer in FPP), answering Hoffman's question in the affirmative. It is further pointed out that is a singleton for a dense family of functions . See [Bat20] for more details.
Remark.
Noting that the limit shape is the zero temperature analogue of -Gibbs Free Energy, these direction-fixed/direction-free variational formulas for the limit shape are also given as (7.6), (7.5) in [GRAS16].
Finally, let us recall the directed polymer model, still restricting ourselves to the LPP on setting.
Definition.
Fix a direction , an inverse temperature and a bounded measurable . The point-to-point -partition function in direction is defined to be
The corresponding point-to-point -polymer measure on the set of paths is
and the corresponding point-to-point - Gibbs Free Energy is defined to be
Suppressing the direction we get point-to-level analogues (to the "level" ):
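A minimal sketch of these objects on a small grid may be helpful; it is our own illustration, with arbitrary inverse temperature and grid size. We enumerate the NE paths to (a, b), form the partition function, and check that the polymer measure is a probability measure.

```python
import random
from math import comb, exp, log

random.seed(3)
a = b = 3
# i.i.d. edge weights: weight[(x, y, d)] is the edge at (x, y) going
# east ("E") or north ("N").
weight = {(x, y, d): random.random()
          for x in range(a + 1) for y in range(b + 1) for d in "NE"}

def paths(x=0, y=0):
    """Enumerate NE paths from (x, y) to (a, b) as lists of edges."""
    if (x, y) == (a, b):
        yield []
    if x < a:
        for rest in paths(x + 1, y):
            yield [(x, y, "E")] + rest
    if y < b:
        for rest in paths(x, y + 1):
            yield [(x, y, "N")] + rest

beta = 1.5
energies = [sum(weight[e] for e in p) for p in paths()]
Z = sum(exp(beta * E) for E in energies)          # partition function
polymer = [exp(beta * E) / Z for E in energies]   # polymer measure
free_energy = log(Z) / (a + b)                    # finite-n free energy

assert abs(sum(polymer) - 1.0) < 1e-12            # probability measure
assert len(energies) == comb(a + b, a)            # all paths counted
```

At beta = 0 the partition function reduces to the path count, so the finite-n free energy becomes (1/n) log of the number of paths, the zero-temperature link to the entropy discussed above.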
5.2 Variational Formulas for PTP/PTL Gibbs Free Energies
We derive a formula for the point-to-point Gibbs Free Energy as the supremum over measures of the sum of the integral and grid entropy . Since direction-free grid entropy is just grid entropy in the direction, then we arrive at a similar formula for the point-to-level Gibbs Free Energy.
Theorem 30.
Fix an inverse temperature and a bounded measurable .
- (i) Fix . The point-to-point Gibbs Free Energy is given by
Moreover, this supremum is achieved by some .
- (ii) The point-to-level Gibbs Free Energy is given by
Remark.
If is non-negative then the variational formulas are trivially positive-homogeneous.
Remark.
Again, we may extend these formulas to a supremum over all measurable by truncating in at some and then taking a supremum over of the variational formula.
We require a short lemma in order to prove our variational formula. The main technical difficulty is that each direction- grid entropy is an almost sure limit:
but we are now dealing with a whole space of measures, which is uncountable. Thus we need to assume these limits exist for a countable dense family of ’s and approximate the sums of over paths with empirical measure in -balls.
Lemma 31.
Fix and . For any ,
Proof.
Let us look closer at this sum. Observe that we can upper bound the indicator of the set we are summing over by a smooth function:
Thus
∎
Proof of Theorem 30.
We focus on showing (i). The proof of the direction-free case (ii) is analogous.
(i) Note that if then hence the variational formula holds trivially. Thus we may assume .
Recall that the Levy-Prokhorov metric metrizes weak convergence of measures and that is weakly compact. It follows that is uniformly continuous on . More precisely,
(34)
We split the argument into two claims.
Claim 1:
Proof of Claim 1
Note that unless . Also is weakly closed and is upper semicontinuous in so there exists some that achieves the supremum
We prove the claim when ; the case is very similar.
For Claim 1, we restrict ourselves to the measure 1 event
(35)
(this has measure 1 by Theorem 22). We wish to show that in this event,
Fix in and let satisfy (34). We are in the event (35) so
Denote by the (event-dependent) paths corresponding to these order statistics. For large enough we have hence there are paths satisfying
(36)
by (34). But then
by (36). It follows that
Since was arbitrarily small, taking we get
(37)
as desired. [Claim 1]
For the second half, we actually show a slightly stronger claim, which we use in a later proof.
Claim 2: Let be a weakly open, possibly empty set s.t. . Define
Then
(38)
Of course, in our case and .
Proof of Claim 2
We must be careful to exclude only countably many measure zero events. We do this by considering a countable dense family of target measures.
is weakly compact. So for every there is some and a finite -net covering with the property that (but the balls themselves may intersect ). From now on we restrict ourselves to the event
(39)
which is a countable intersection of measure 1 events by Theorem 12 hence has measure 1. It is important to note that for any arbitrary measure we still have
because this is a non-probabilistic statement about constants; we do not however assume that these directed metrics are limits of the .
Now we wish to fix an arbitrary and prove that in the measure 1 event (39),
It suffices to consider a convergent subsequence and show the right hand side.
Fix any . For large ,
so any path we are summing over in has hence is in one of the . Therefore
Note that is a constant with respect to , whereas the max above is both event-dependent and -dependent. Thus there is some event-dependent and some subsequence s.t. the above maximum is achieved by the same for every . (Finite pigeonholes, infinite pigeons if you will.) Therefore
(40)
This holds for any . Our strategy will be to choose a very large, event-dependent that gives a nice event-independent upper bound.
Let us upper bound the expression in the limit on the right-hand side of (40). Lemma 31 with establishes that for any ,
(41)
This holds for arbitrary . The next step is to specify a sufficiently large .
By compactness of , the measures have a weakly convergent subsequence
.
Fix some large enough so that
(42)
Consider some event-dependent large enough s.t.
(43)
Since
and satisfies (34),
(44)
Next, we have
(45)
Finally, we are in the event (39) so for large enough we have
(46)
by Lemma 15. By (42),(43) this last expression is
Combining the inequalities (44),(45),(46) with (41) we get that for large enough ,
(47)
Substituting this in (40), we get
as desired. Note that even though and in (47) are event-dependent, the upper bound we get for is not event-dependent, which is what makes our argument work. [Claim 2]
This finishes the proof of the variational formula in the fixed-direction case.
(ii) An analogous argument gives
But direction-free grid entropy is just grid entropy in the direction . Furthermore,
if . Thus
∎
As an immediate corollary, we get another proof of Bates’ Variational Formula (Theorem 29), as well as its direction-free analogue. We stress that our proof as written works only under the assumption that is bounded, but can easily be extended to the general case of measurable .
Proof of Theorem 29. Fix and bounded measurable . It is standard (e.g. see [GRAS16]) that for every ,
(48)
which we recall is the last passage time between and . That is, the last passage time is the zero temperature analogue of . Thus the zero temperature point-to-point Gibbs Free Energy is the last passage time constant :
On the other hand, another standard result (see [GRAS16]) says that the zero temperature point-to-point Gibbs Free Energy is a scaled limit of the positive temperature free energies:
(49)
Fix . Dividing our variational formula by and taking the supremum over we get by (49)
Taking and noting that implies is bounded, we get Bates’ variational formula:
a.s.
In other words, Bates’ variational formula is nothing more than the zero temperature analogue of the entropy variational formula for Gibbs Free Energy.
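The zero temperature limit in (49) is, at the level of finite sums, just the log-sum-exp bound $\max_i s_i \le \frac{1}{\beta}\log\sum_i e^{\beta s_i} \le \max_i s_i + \frac{\log N}{\beta}$. A numerical sketch (our own, with hypothetical path weights):

```python
import math
import random

def free_energy(beta, path_weights):
    """(1/beta) * log(sum over paths of exp(beta * weight)), computed with
    the log-sum-exp trick for numerical stability."""
    m = max(path_weights)
    return m + math.log(sum(math.exp(beta * (s - m)) for s in path_weights)) / beta

random.seed(1)
S = [random.random() for _ in range(20)]  # hypothetical total path weights
# free_energy(beta, S) decreases to max(S) as beta grows
```

As β grows, the free energy converges down to the maximal path weight, mirroring how the positive temperature Gibbs Free Energies scale to the last passage time constant.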
The other corollary of note is the answer to Hoffman’s question in the directed polymer model when our variational formula has a unique maximizer. As Bates observes, this happens for a dense family of measurable functions (recall the earlier Remark).
Corollary 32.
Fix an inverse temperature and a bounded measurable
.
-
(i)
Fix and suppose has a unique maximizer . For every pick a path independently and at random according to the probabilities prescribed by the corresponding point-to-point -polymer measure . Then the empirical measures converge weakly to a.s.
-
(ii)
Suppose has a unique maximizer . For every pick a length path from independently and at random according to the probabilities prescribed by the corresponding point-to-level -polymer measure . Then the empirical measures converge weakly to a.s.
Remark.
In fact, it follows from our argument below combined with the compactness of that even in the general case when the variational formula might not have a unique maximizer, all of the accumulation points of the empirical measures along random paths chosen according to the -polymer measure are maximizers of the variational formula. Thus, one interesting open question for the future is whether we can more precisely describe these maximizers. This information could help distinguish between weak and strong disorder regimes; see [RASY17] for a recent paper on the role of minimizers of a "cocycle" variational formula for the Gibbs Free Energy in determining disorder regimes.
Remark.
This partial answer to Hoffman’s question in the positive temperature case does also follow from the development of grid entropy by Rassoul-Agha and Seppäläinen [RAS14] as the rate function of the LDP of empirical measures. However, it seems it has not yet been formulated explicitly as we have done here.
Proof.
We focus on proving (i). The direction-free argument (ii) is analogous.
(i) If then the paths are the trivial path and so , hence the result holds trivially. Thus we may assume .
Since the empirical measures eventually lie in the weakly compact set
the existence of an accumulation point is guaranteed by compactness, so it suffices to show that the sequence cannot have any accumulation point other than .
If is a singleton, i.e. , then by the characterization of as the set of accumulation points of empirical measures (see Theorem 6), is the only possible accumulation point of and we would be done. We may therefore assume is not a singleton, so there exists some s.t. .
Suppose we are in the measure 1 event
(50)
This is the intersection of countably many measure 1 events by Claim 2 in the proof of Theorem 30. Claim 2 applies because
We need to prove that in the measure 1 event (50), the probability that the empirical measures of the paths chosen independently at random according to the polymer measures do not have an accumulation point other than is 0.
Fix any in . We wish to show that the probability that is not in infinitely often is 0.
Let achieve the supremum
By the maximality of , there is some s.t.
(51)
Consider any . Recall the definition of the point-to-point -polymer measure on the set of paths :
Thus the probability that (with respect to the -polymer measure ) is
We are in the event (50), so for large enough , this is
by (51). So by the Borel-Cantelli Lemma,
By continuity from below,
Therefore the empirical measures converge weakly to a.s. ∎
5.3 Grid Entropy as The Negative Convex Conjugate of Gibbs Free Energy
Another way to interpret the variational formulas in Theorem 30 is that the point-to-point/point-to-level -Gibbs Free Energies, as functions of the bounded measurable function , are the convex conjugates of the functions on respectively.
Let us briefly recall what that means (see [Zal02] for more details).
Definition.
Let be a locally convex Hausdorff space and let be its dual with respect to an inner product . A convex function is said to be proper if and . For any proper convex function , its convex conjugate is the function given by
In our case, we have (finite Borel measures on [0,1]) and is the set of bounded measurable functions with inner product given by integration.
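In standard notation (a generic statement matching e.g. [Zal02]; the display above is elided), the convex conjugate of a proper convex $f : X \to \mathbb{R}\cup\{+\infty\}$ reads

```latex
f^*(x^*) \;=\; \sup_{x \in X}\big( \langle x, x^* \rangle - f(x) \big), \qquad x^* \in X^*.
```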
Also, recall that direction-fixed/direction-free grid entropy is concave, hence the maps are convex and trivially proper. Furthermore, by our continuity theorem (Thm 26), and are lower semicontinuous in .
With this setup in mind, it is evident that Theorem 30 establishes that the point-to-point -Gibbs Free Energy is precisely the convex conjugate of and the point-to-level -Gibbs Free Energy is the convex conjugate of .
The convex conjugate has various important properties. We focus on those that are the most relevant and interesting in regards to grid entropy and Gibbs Free Energy. See [Zal02, Sect 2.3] for the full list of properties.
Corollary 33.
Fix , an inverse temperature . Then viewing Gibbs Free Energies as functions of bounded measurable we have the following.
-
(i)
Gibbs Free Energies are convex and lower semicontinuous in
-
(ii)
(Order-preservation) If with then
-
(iii)
(Biconjugate Duality) The convex conjugate of -Gibbs Free Energy (as a function of ) is minus grid entropy. That is,
In other words, grid entropy is equal to its biconjugate.
-
(iv)
(Fenchel’s Duality Theorem) Let be proper convex functions s.t. with and one of is continuous at and one of is continuous at . Then
where denote the -convex conjugates of respectively:
Remark.
(iii) gives yet another definition of direction-fixed/direction-free grid entropy. This one is perhaps the most remarkable of the three, as it connects grid entropy to the seemingly unrelated quantity that is Gibbs Free Energy. This more than justifies the importance and canonical nature of grid entropy.
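The biconjugation in (iii) is an instance of the Fenchel-Moreau theorem, and it can be checked numerically on a toy example (our own sketch; the grids and the choice $f(x)=x^2$ are assumptions): discretizing the conjugate $f^*(p)=\sup_x(px-f(x))$ and conjugating again recovers $f$:

```python
def conjugate(f, xs, ps):
    """Discrete Fenchel conjugate: f*(p) = max over the grid xs of p*x - f(x),
    where f is the list of values of the function on the grid xs."""
    return [max(p * x - fx for x, fx in zip(xs, f)) for p in ps]

xs = [-2.0 + 0.01 * k for k in range(401)]  # grid for x in [-2, 2]
ps = [-4.0 + 0.01 * k for k in range(801)]  # slope grid covering f'(x) = 2x
f = [x * x for x in xs]                     # a proper convex l.s.c. function
f_star = conjugate(f, xs, ps)               # approximates p^2 / 4
f_bic = conjugate(f_star, ps, xs)           # biconjugate, back on the x-grid

# Fenchel-Moreau: f** = f (up to discretization error on these grids)
err = max(abs(a - b) for a, b in zip(f_bic, f))
```

On the grid, the biconjugate never exceeds f (it is a supremum of affine minorants of f), and here it matches f to within the grid resolution.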
Proof.
The key takeaway is this: grid entropy and Gibbs Free Energy are intricately connected via convex duality.
5.4 Connection to Previous Literature on Grid Entropy
We end our discussion of these variational formulas by explicitly deriving the formulas (3) which link our notion of grid entropy to that presented previously in [RAS14]. For the sake of conciseness we focus on the direction-fixed (a.k.a. point-to-point) case.
Let us first recalibrate our setting to align with the framework of Rassoul-Agha and
Seppäläinen. We view each edge label as a value associated to the "anchor" vertex and the unit direction of the step (with respect to ). Thus the environment becomes where is the set of unit NE steps. For each , we define a shift in the natural way, by shifting the anchor: . We let be a potential.
We initially work on the space of pairs of the environment and one unit step; that is, we consider the empirical measures
of the polymer paths . It is established in [RASY13, Theorem 3.1] that the rate function of these empirical measures in the direction- case is precisely the lower semicontinuous regularization of
(52)
where is the infimum of certain relative entropies (see (5.2) of [RASY13]), and
is the (normalized) -Gibbs Free Energy in direction with potential defined in (2.3) of the same paper. Here is a probability measure on s.t. the first step has -mean . This last condition is imposed because these paths go in the direction, hence any limit points of their empirical measures must average to in the step coordinate; any not satisfying this condition will trivially have an infinite rate function.
We consider the "edge label potential" given by which maps an (environment, unit direction) pair to the edge label of the corresponding unit direction anchored at the origin. If is a polymer path, then our empirical measures from this paper are just the -pushforwards of :
We obtain the rate function of the empirical measures in the direction- case by contracting the higher level LDP rate function :
(53)
for probability measures , where the infimum is taken over probability measures on whose -pushforward is .
We now make some convenient observations. First, for any measurable and bounded
, our (unnormalized) -point-to-point Gibbs Free Energy is precisely . Second, for any s.t. we have
where is the identity function. Thus the latter two terms in (52) for are ignored by the infimum in (53) so we get that is the lower semicontinuous regularization of
where
(54)
(5.9) from [RAS14] gives the following variational formula for the point-to-point Gibbs Free Energy:
(55)
for bounded measurable potentials where the supremum is over probability measures on with mean step .
Taking for bounded measurable and splitting the supremum over into an outer supremum over and an inner supremum over whose -pushforward is , we get
(56)
where the last equality holds by (54). Comparing this to our variational formula for the point-to-point Gibbs Free Energy from Theorem 30 (i), recalling that Gibbs Free Energy is biconjugate from Corollary 33, and recalling that , we get
The direction-free (a.k.a. point-to-level) argument is nearly identical, except we drop the indices and the condition that the first step has -mean . In the end we get
where
As a final remark, we mention that this entropy originally appears in Varadhan’s paper [Var03] on large deviations for empirical measures of Markov chains.
6 Extensions to Other Models and Open Questions
Looking back at our development of grid entropy in , we never required any specific property of the lattice model other than that the number of paths we consider is for some model-dependent constant , that all such paths have the same length , and that any pair of paths and can be concatenated (thus giving us the superadditivity required to apply the Subadditive Ergodic Theorem).
Therefore the notion of grid entropy and all our results, including the variational formula for the Gibbs Free Energy, apply in any edge model with these properties. For example, we might consider random walks on where the first coordinate, "time," is always increasing along a path.
In addition, if we loosen some of these conditions, then a part of our work in this paper still holds. For example, instead of NE paths on consider self-avoiding walks (SAWs) on , as done in [Bat20]. Then we cannot always concatenate a pair of SAWs, and so our arguments that rely on superadditivity no longer apply. However, we may still define grid entropy by
and it will be true that only target measures with total mass are observed and that we have an upper bound on the sum of relative entropy and grid entropy:
where is the connectivity constant for satisfying
(see [Law13, Section 6.2] for details). It is not clear, however, whether any of the other properties (including this being a directed norm with negative sign) still hold.
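A small illustration of the connectivity constant (our own sketch; exhaustive enumeration is only feasible for short walks): counting self-avoiding walks on $\mathbb{Z}^2$ by depth-first search produces the submultiplicative sequence $c_n$, with $c_n^{1/n}$ tending to $\mu \approx 2.638$:

```python
def count_saws(n):
    """Count self-avoiding walks of length n on the Z^2 lattice starting
    at the origin, by depth-first search over non-repeating vertices."""
    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        x, y = pos
        total = 0
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total
    return extend((0, 0), {(0, 0)}, n)

# c_1, ..., c_6 = 4, 12, 36, 100, 284, 780
counts = [count_saws(n) for n in range(7)]
```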
Note, however, that with a small modification we can make this work. All our results hold for SAWs on where the first coordinate, "time," is necessarily non-decreasing, because we obtain a superadditivity akin to the one for NE paths.
We end by mentioning two open questions that could be the subject of future work. Can we describe the set of maximizers of our variational formula (Theorem 30)? As discussed before, this would provide insights into the limit points of empirical measures and also into the problem of distinguishing weak and strong disorder regimes. And secondly, is grid entropy strictly concave? By our results in this chapter, an affirmative answer would immediately imply that Gibbs Free Energy is strictly convex, which is currently a major open research problem in this field.
7 Acknowledgments
I would like to thank my friends and family for their continual support during my graduate studies. Special thanks goes to my advisor Bálint Virág, for his patience, guidance, and optimism in the face of setbacks. I also wish to thank Duncan Dauvergne for very helpful discussions and peer review on a preliminary draft. In addition, I am grateful to my thesis external reviewer Firas Rassoul-Agha for his comments and for drawing some connections to previous literature. Finally, this work would not have been possible without the funding from my NSERC Canadian Graduate Scholarship-Doctoral.
References
- [AH16] Daniel Ahlberg and Christopher Hoffman, Random coalescing geodesics in first-passage percolation, arXiv preprint:1609.02447 (2016).
- [Ahl15] Daniel Ahlberg (ed.), Problem lists: first passage percolation, American Institute of Mathematics, Aug 2015.
- [Bat20] Erik Bates, Empirical distributions, geodesic lengths, and a variational formula in first-passage percolation, arXiv preprint:2006.12580 (2020).
- [Car10] Philippe Carmona, Directed polymer in random environment and last passage percolation, ESAIM: Probability and Statistics 14 (2010), 263–270.
- [DS01] Jean-Dominique Deuschel and Daniel W. Stroock, Large deviations, vol. 342, American Mathematical Society, 2001.
- [Dur19] Rick Durrett, Probability: theory and examples, vol. 49, Cambridge University Press, 2019.
- [GRAS16] Nicos Georgiou, Firas Rassoul-Agha, and Timo Seppäläinen, Variational formulas and cocycle solutions for directed polymer and percolation models, Communications in Mathematical Physics 346 (2016), no. 2, 741–779.
- [Hub04] Peter J. Huber, Robust statistics, vol. 523, John Wiley & Sons, 2004.
- [JLS20] Christopher Janjigian, Wai-Kit Lam, and Xiao Shen, Tail bounds for the averaged empirical distribution on a geodesic in first-passage percolation, arXiv preprint:2010.08072 (2020).
- [JRAS19] Christopher Janjigian, Firas Rassoul-Agha, and Timo Seppäläinen, Geometry of geodesics through Busemann measures in directed last-passage percolation, arXiv preprint:1908.09040 (2019).
- [KL51] Solomon Kullback and Richard A. Leibler, On information and sufficiency, The Annals of Mathematical Statistics 22 (1951), no. 1, 79–86.
- [Law13] Gregory F. Lawler, Intersections of random walks, Springer Science & Business Media, 2013.
- [Lig85] Thomas M. Liggett, An improved subadditive ergodic theorem, The Annals of Probability 13 (1985), no. 4, 1279–1285.
- [Mar04] James B. Martin, Limiting shape for directed percolation models, The Annals of Probability 32 (2004), no. 4, 2908–2937.
- [MSZ21] James B. Martin, Allan Sly, and Lingfu Zhang, Convergence of the environment seen from geodesics in exponential last-passage percolation, arXiv preprint:2106.05242 (2021).
- [Pos75] Edward Posner, Random coding strategies for minimum entropy, IEEE Transactions on Information Theory 21 (1975), no. 4, 388–391.
- [RAS14] Firas Rassoul-Agha and Timo Seppäläinen, Quenched point-to-point free energy for random walks in random potentials, Probability Theory and Related Fields 158 (2014), no. 3, 711–750.
- [RASY13] Firas Rassoul-Agha, Timo Seppäläinen, and Atilla Yilmaz, Quenched free energy and large deviations for random walks in random potentials, Communications on Pure and Applied Mathematics 66 (2013), no. 2, 202–244.
- [RASY17] Firas Rassoul-Agha, Timo Seppäläinen, and Atilla Yilmaz, Variational formulas and disorder regimes of random walks in random potentials, Bernoulli 23 (2017), no. 1, 405–431.
- [Var03] S. R. S. Varadhan, Large deviations for random walks in a random environment, Communications on Pure and Applied Mathematics 56 (2003), no. 8, 1222–1245.
- [VT84] Jan Van Tiel, Convex analysis: an introductory text, Wiley, 1984.
- [Zal02] Constantin Zalinescu, Convex analysis in general vector spaces, World scientific, 2002.