Grid entropy in last passage percolation — a superadditive critical exponent approach
Abstract
Working in the setting of i.i.d. last-passage percolation with no assumptions on the underlying edge-weight distribution, we arrive at the notion of grid entropy — a Subadditive Ergodic Theorem limit of the entropies of paths with empirical measures weakly converging to a given target, or equivalently a deterministic critical exponent of canonical order statistics associated with the Levy-Prokhorov metric. This provides a fresh approach to an entropy first developed by Rassoul-Agha and Seppäläinen as a large deviation rate function of empirical measures along paths. In their 2014 paper [RAS14], variational formulas are developed for the point-to-point/point-to-level Gibbs Free Energies as the convex conjugates of this entropy. We rework these formulas in our new framework and explicitly link our descriptions of grid entropy to theirs. We also improve on a known bound for this entropy by introducing a relative entropy term in the inequality. Furthermore, we show that the set of measures with finite grid entropy coincides with the deterministic set of limit points of empirical measures studied in a recent paper [Bat20] by Bates. In addition, we partially answer a directed polymer version of a question of Hoffman which was previously tackled in the zero temperature case by Bates. Our results cover both the point-to-point and point-to-level scenarios.
1 Introduction
The limiting behaviour of empirical measures is of paramount importance in percolation theory. The normalized passage time along a path, the quantity we ultimately care about, is nothing but the identity function integrated against the empirical measure of that path. In First/Last Passage Percolation we study the minimizing/maximizing paths, called geodesics.
A major open problem posed by Hoffman in [Ahl15] is whether empirical measures along geodesics in a fixed direction converge weakly. This is partially answered in [Bat20] for FPP. Bates proves that the set of limiting distributions in a fixed direction, respectively the set of limiting distributions in the direction-free case, is deterministic, and derives an explicit variational formula for the limit shape of the first passage time as the minimum value of a linear functional over this set. When the set of minimizers is a singleton, which Bates argues happens for a dense family of edge-weight distributions, it follows that Hoffman’s question is answered in the affirmative. The same argument applied to the LPP model yields analogous conclusions.
Generally, there is no known way of computing the limiting distributions along geodesics. [MSZ21] showcases some cutting-edge developments for the solvable Exponential LPP model, including an explicit formula for weak limits of empirical measures along geodesics. For other recent work on geodesics see [AH16], [JRAS19] and [JLS20].
One way to extend the work in [Bat20] is to study the more general question of the entropy of paths whose empirical measures, in a fixed direction or direction-free, converge weakly to a given target measure. Grid entropy gives an exact, deterministic answer.
In [RAS14], Rassoul-Agha and Seppäläinen derive this entropy as a large deviation rate function of empirical measures along paths and they provide variational formulas realizing the point-to-point/point-to-level Gibbs Free Energies as the convex conjugates of this mysterious new entropy. This rate function approach suggests that one might be able to derive formulas for this entropy coming from subadditivity, and furthermore that it is related to a critical exponent of paths.
Our main aim is to prove these intuitions correct, working in an LPP setting which can easily be extended to more general frameworks. Rather than starting from the definitions given in [RAS14], we define grid entropy as a certain critical exponent of paths and show that it is equivalently described as a superadditive ergodic limit. Moreover, we arrive at variational formulas which are analogues of Rassoul-Agha and Seppäläinen’s and which are in fact positive-temperature analogues of Bates’ variational formula. Along the way, we establish various properties of this entropy, some of which are already noted in or follow from [RAS14] and [Bat20], but also some of which are not, such as a partial answer to Hoffman’s question in a directed polymer setting and an amazing inequality relating grid entropy, relative entropy and Shannon entropy. We give versions of these results for both direction-fixed grid entropy and direction-free grid entropy of target measures. By concavity and the lattice symmetries, the direction-free case turns out to simply be the direction-fixed case in the unit direction which maximizes the total number of paths.
Let us now be more precise. We consider north-east nearest-neighbor paths on the lattice, and handle real coordinates by taking coordinate-wise floors. We follow a similar approach to [Bat20], in that we couple our i.i.d. edge weights to i.i.d. Unif[0,1] edge labels via a measurable function. This lets us work on [0,1] at no additional cost, as we can lift everything back via the pushforward of this function. We are then interested in how many empirical measures of paths converge to some given target measure in the set of finite non-negative Borel measures on [0,1]. We may keep the direction fixed, or we may vary over all points with the same 1-norm.
To perform the counting, we consider the order statistics of the distance of the paths’ empirical measures to the target , where distance is measured via the Levy-Prokhorov metric , which metrizes weak convergence of measures. That is, given a direction and a target measure , for every we let
denote the order statistics value of . It is convenient to define
In the direction-free case, where we count all paths of a certain scaled length from , we let
and similarly define over paths of length anchored at .
Of course, these order statistics and the paths corresponding to them are event-dependent.
We then define the grid entropy with respect to the target and the direction , denoted , and the direction-free grid entropy, denoted , to be the critical exponent where the corresponding order statistics change from converging to 0 a.s. to diverging a.s.:
(1)
where these are defined to be if the set of ’s is empty. It will turn out that we can replace the limits in this definition with ’s and get the same quantity. Observe that these grid entropies lie in , respectively where
is the (Shannon) entropy of the total number of paths in direction , in the sense that
We note that the description (1) of grid entropy as the critical exponent of these order statistics has been previously shown to hold only for Bernoulli edge labels (see [Car10]).
Over the course of this paper we establish two other equivalent definitions of grid entropy which avoid the annoyance of dealing with these event-dependent orderings of the paths. One of these alternate descriptions is remarkably simple: grid entropy is the negative convex conjugate of Gibbs Free Energy; as a direct consequence, our notion of grid entropy is equivalent to that appearing in [RAS14] modulo some normalizations. The other, as the entropy of paths with empirical measure Levy-Prokhorov-close to the target, is completely new though not unexpected.
We summarize these characterizations in the following theorem and link them to current literature in subsequent remarks.
Theorem A.
-
(i)
Let and let be a finite non-negative Borel measure on [0,1]. Then the direction grid entropy as defined in (1) is also given by
(2)

The expressions we take an infimum of are each directed metrics with negative sign on . That is, they take value in , evaluate to 0 when , and satisfy the triangle inequality with the sign reversed.
Moreover, direction-fixed grid entropy is the negative convex conjugate of the point-to-point -Gibbs Free Energy in direction (as a function of the environment-coupling function ): For ,
where the supremum is over bounded measurable , where is the integration linear functional and where the point-to-point -Gibbs Free Energy is given by
-
(ii)
Analogous results hold in the direction-free case. Let be a finite non-negative Borel measure on [0,1] and let . Then the direction-free grid entropy as defined in (1) is also given by
The expressions we take an infimum of are each directed metrics with negative sign on , the set of finite non-negative Borel measures with total mass . Moreover, direction-free grid entropy is the negative convex conjugate of the point-to-level -Gibbs Free Energy:
where the supremum is over bounded measurable and where the point-to-level -Gibbs Free Energy is given by
Remark.
We can extend these variational formulas for point-to-point/point-to-level Gibbs Free Energy to general, possibly unbounded, measurable functions by truncating at some constant which we then take to infinity. For example,
Remark.
This theorem is partially proved in [Car10, Corollary 2] for the case when the edge labels follow a Bernoulli distribution. In this setting, Carmona shows that the negative convex conjugate of Gibbs Free Energy of measures of the form is given by
The inequalities can be replaced with equality since the number of paths with 1-labels is exponentially decaying in . Using the definition of the Levy-Prokhorov metric we can conclude that this formulation is equivalent to (1).
Remark.
As mentioned, grid entropy first appears in literature as the rate function for the Large Deviation Principle of the empirical measures in [RAS14]. The fact that we derive formula (2) using the Subadditive Ergodic Theorem should therefore not be surprising. Our definitions (1) for direction-fixed/direction-free grid entropy align with those of Rassoul-Agha and Seppäläinen as follows. Consider the product space of the environment
consisting of the i.i.d. edge labels and of the unit NE steps
, and let map an (environment, unit direction) pair to the edge label of the corresponding unit direction anchored at the origin. Then
(3)
where is the relative entropy defined in (5.2) of [RAS14] which can be traced back to Varadhan’s paper [Var03], and where the infimums are over the measures on whose -pushforward is and, in the direction-fixed case, for which in addition the -mean of the step coordinate is . See Section 5.4 for a more detailed derivation and discussion of (3).
Now grid entropy satisfies some rather interesting properties. The following theorem captures the highlights for direction grid entropy; the direction-free analogues hold as well. All but properties (iii) and (v) follow easily from the framework presented previously in [RAS14], yet we will showcase the power of our new approach to grid entropy by proving all of these properties directly.
Theorem B.
Let . Then:
-
(i)
Grid entropy is a directed norm with negative sign; it scales with positive factors and satisfies a reverse triangle inequality:
-
(ii)
Grid entropy is upper semicontinuous.
Let
-
(iii)
is weakly closed, convex and deterministic and coincides almost surely with
-
(iv)
consists only of measures with total variation that are absolutely continuous with respect to the Lebesgue measure on [0,1].
-
(v)
Any satisfies the following upper bound on the sum of the grid entropy and the relative entropy with respect to Lebesgue measure on [0,1]:
where denotes relative entropy (or Kullback-Leibler divergence), and where,
again, this upper bound is simply the (Shannon) entropy of the total number of paths in direction .
Why do we care? The deterministic set is nothing more than the LPP analogue of the set Bates takes an infimum over in [Bat20] in his variational formula for the FPP time constant. In our LPP setting, his formula becomes
LPP time constant
We thus link [Bat20] and [RAS14] by providing a new, more enlightening description of these sets in terms of these grid entropies. Furthermore, the amazing bound in (v) improves on a version without the grid entropy term, proved by Bates, which also follows from Rassoul-Agha and Seppäläinen’s work.
As in [RAS14], the main application of grid entropy we present is a variational formula for the point-to-point/point-to-level Gibbs Free Energies in a fixed direction/to the “level” in the directed polymer model. Our variational formula is simply the positive temperature LPP analogue of Bates’ variational formula for the FPP limit shape, and furthermore can be molded into Rassoul-Agha and Seppäläinen’s variational formula via some normalizations, thus proving that what we call grid entropy really is the same object first developed in [RAS14].
Theorem C.
Fix a direction and an inverse temperature . For bounded measurable , the point-to-point/point-to-level Gibbs Free Energies are given by
where is the maximizing direction, and
is the set of Borel probability measures that have finite direction-free grid entropy. Moreover, these supremums are achieved by some in respectively.
Remark.
We may replace the supremums in these formulas with supremums over all measurable functions, by truncating at some constant and taking a supremum over it.
As in [Bat20], it follows that the directed polymer analogue of Hoffman’s question is answered in the affirmative when our variational formula has a unique maximizer, which happens for a dense family of measurable functions .
Theorem D.
Fix an inverse temperature and a bounded measurable .
-
(i)
Fix and suppose has a unique maximizer . For every pick a path independently and at random according to the probabilities prescribed by the corresponding point-to-point -polymer measure
Then the empirical measures converge weakly to a.s.
-
(ii)
Suppose has a unique maximizer . For every pick a length path from independently and at random according to the probabilities prescribed by the corresponding point-to-level -polymer measure
Then the empirical measures converge weakly to a.s.
Remark.
In fact, even if there is no unique maximizer, we can show that all accumulation points of random paths chosen according to the direction-fixed/direction-free -polymer measure are among the maximizers of the corresponding variational formula. This partial answer to Hoffman’s question in the positive temperature case is immediate from the development of grid entropy by Rassoul-Agha and Seppäläinen [RAS14] as the rate function of the Large Deviation Principle of empirical measures. However, to the best of our knowledge this is the first time it has been formulated explicitly.
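On small grids, polymer measures of the kind appearing above can be explored directly by enumerating paths. The following sketch is purely illustrative (all names are ours): it takes the Gibbs weight of a path proportional to the exponential of β times the sum of its edge labels, a simplified stand-in for the polymer measures of Theorem D, and samples a point-to-point path accordingly.

```python
import random
from itertools import permutations
from math import exp

def polymer_sample(a, b, beta, seed=0):
    """Sample a NE path from (0,0) to (a,b) under a Gibbs measure with
    P(path) proportional to exp(beta * total edge label along the path)."""
    rng = random.Random(seed)
    # i.i.d. Unif[0,1] labels on horizontal ('E') and vertical ('N') edges.
    lab = {(x, y, s): rng.random() for x in range(a + 1)
           for y in range(b + 1) for s in 'EN'}

    def total(path):
        x = y = 0
        t = 0.0
        for s in path:
            t += lab[(x, y, s)]
            x, y = (x + 1, y) if s == 'E' else (x, y + 1)
        return t

    paths = sorted(set(permutations('E' * a + 'N' * b)))
    gibbs = [exp(beta * total(p)) for p in paths]
    z = sum(gibbs)  # partition function
    return rng.choices(paths, weights=gibbs)[0], [g / z for g in gibbs]

path, probs = polymer_sample(3, 2, beta=2.0)
assert len(path) == 5 and abs(sum(probs) - 1.0) < 1e-9
```

As β grows, the sampled paths concentrate on the maximizing (geodesic) paths, which is the zero temperature limit discussed throughout.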
The plan is as follows. In Section 2, we describe the model and setup, and outline various facts and notions we will need over the course of this paper. Section 3 focuses on developing the second definition (2) of grid entropy as a directed norm with negative sign and showing that it is equivalent to the original definition (1). Then, in Section 4, we investigate what information we can extract from grid entropy and what properties it satisfies (Theorem B). We devote Section 5 to applying our results to establish our variational formula for point-to-point/point-to-level Gibbs Free Energies, to study its consequences (namely Theorems C and D), and to detail the correspondence with the work in [RAS14]. Last but not least, we make some closing remarks about adapting our results to other models.
2 Preliminaries
2.1 Empirical Measures on the Lattice
We begin by briefly describing the setup we use in this paper.
We restrict ourselves to a directed Last Passage Percolation model. Consider north-east nearest-neighbour paths on the lattice with i.i.d. edge weights for some probability distribution on . By north-east, we of course mean that the coordinates of points on the path are nondecreasing.
For we denote by the set of all NE paths . Similarly, for and we denote by the set of all NE paths from of length (no restriction on the endpoint).
Observe that either and , or and
Here is the 1-norm on defined by .
On the other hand, trivially for any .
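In two dimensions, for instance, the NE paths between two lattice points can be enumerated explicitly in small cases, and their number is a binomial coefficient. A quick sketch (illustrative names, not the paper's notation):

```python
from itertools import permutations
from math import comb

def ne_paths(a, b):
    """Enumerate NE lattice paths from (0,0) to (a,b) as step sequences.

    Each path is a sequence of unit steps 'E' (east) and 'N' (north), so
    coordinates along the path are nondecreasing, as in the text.
    """
    steps = 'E' * a + 'N' * b
    return set(permutations(steps))  # distinct orderings of the steps

# The number of NE paths to (a,b) is the binomial coefficient C(a+b, a).
assert len(ne_paths(3, 2)) == comb(5, 3) == 10
```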
Note that unlike other recent work (such as [MSZ21]) we do not restrict ourselves to a known solvable model. We do not impose any restrictions on the edge weight distribution , so that our results hold with the greatest generality possible.
We scale the grid by and look at the behaviour in the limit. As is standard, we extend our initial inputs to lie in by taking coordinatewise floors of the scaled coordinates. That is, we consider paths where
Our normalized directed metrics will converge almost surely to a translation-invariant limit so it will suffice to consider the case when . But for now we let be arbitrary.
Various inequalities we derive involve the asymptotics of the number of NE paths of a given length from the origin in a fixed or unfixed direction. The following lemma, which is easily proved using Stirling’s approximation, gives us what we want.
Lemma 1.
Let . Then
where we use the convention 0 log 0 = 0.
Remark.
Recall that for we denote by the set of paths . Thus
is the (Shannon) entropy of the number of paths in direction . Note that scales with positive scalars and is maximized among with the same 1-norm by
in which case .
On the other hand, for recall that we denote by the set of paths from of length . Thus
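In two dimensions this entropy can be checked numerically: with n total steps of which a fraction p head east, the number of NE paths is the binomial coefficient C(n, pn), and (1/n) log C(n, pn) converges to the binary Shannon entropy H(p) by Stirling's approximation. A brief sketch (illustrative):

```python
from math import comb, log

def shannon(p):
    """Binary Shannon entropy H(p), with the convention 0 * log 0 = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log(p) - (1 - p) * log(1 - p)

# (1/n) log #paths vs H(p): by Stirling, the gap is O(log(n)/n).
n, p = 2000, 0.3
rate = log(comb(n, int(p * n))) / n
assert abs(rate - shannon(p)) < 0.01
```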
We study the distribution of weights that we observe along paths . For a path , let the unnormalized empirical measure along be
Note that we normalize by rather than . This is simply for convenience in our proofs, as it gives us a certain superadditivity we do not get when we normalize by .
The Glivenko-Cantelli Theorem [Dur19, Thm. 2.4.7] tells us that for any fixed infinite NE path in the grid, empirical measures along the path converge weakly to .
Theorem 2 (Glivenko-Cantelli Theorem).
Let be the cumulative distribution function of , let be i.i.d. random variables and let
be the cumulative distribution functions of the empirical measures. Then
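In our coupled setting the edge labels are Unif[0,1], so the theorem says the empirical cdfs converge uniformly to the identity on [0,1]. A quick numerical sketch (illustrative names):

```python
import random

def empirical_cdf_gap(n, seed=0):
    """Sup-norm distance between the empirical cdf of n i.i.d. Unif[0,1]
    samples and the true cdf F(x) = x, evaluated at the sample points."""
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(n))
    # At the i-th order statistic the empirical cdf jumps from i/n to (i+1)/n.
    return max(max(abs((i + 1) / n - x), abs(i / n - x))
               for i, x in enumerate(xs))

# Glivenko-Cantelli: the sup gap shrinks (the DKW inequality gives rate ~ n^{-1/2}).
assert empirical_cdf_gap(20000) < 0.02
```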
However, we are interested in the limiting behaviour of the empirical measures not of one path, but of all paths from a given point or all paths in a given direction, as we scale the length of the paths. This allows us to observe more than just the original measure.
2.2 Metrics on Measures
To gauge the distance between two measures we use the Levy-Prokhorov metric. We briefly introduce this metric as well as the total variation metric and we outline the relevant properties.
Consider a metric space with Borel σ-algebra . We denote by the set of finite Borel measures on , by the set of non-negative finite Borel measures, and by the set of non-negative finite Borel measures with total mass for any . In this notation, is the set of Borel probability measures.
Definition.
The total variation norm on is defined by
This of course gives rise to a total variation metric, given by
For example, the total variation of any measure is its total mass .
Definition.
For any and , the -neighborhood of is defined to be
Definition.
The Levy-Prokhorov metric on is defined by
It is a standard result that metrizes the weak convergence of measures and total variation metrizes the strong convergence of measures in . For details, see [Hub04, Sect. 2.3].
We now derive two useful inequalities involving the Levy-Prokhorov metric.
Lemma 3.
For ,
Remark.
This lemma establishes that the Levy-Prokhorov metric is weaker than the total variation metric.
Remark.
It is trivial to see that .
Proof.
Let . If then so . If , for any ,
and similarly so . ∎
We next show that satisfies a kind of subadditivity.
Lemma 4.
Let . Then
Remark.
Note that in the case the inequality becomes
Proof.
For any we have for any ,
By symmetry, the same inequality holds with swapped. Thus
∎
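Both Lemma 3 (the Levy-Prokhorov distance is dominated by total variation) and Lemma 4 (subadditivity under sums of measures) can be checked numerically for measures with small finite support, where the Levy-Prokhorov distance can be brute-forced from its definition. A sketch (illustrative; `lp_dist` searches a grid of candidate ε values over all subsets of the support, so it is approximate and only suitable for tiny examples):

```python
from itertools import combinations

def lp_dist(mu, nu, grid=1000):
    """Brute-force Levy-Prokhorov distance between two finite non-negative
    measures given as {atom: mass} dicts with atoms in [0,1]."""
    support = sorted(set(mu) | set(nu))
    subsets = [list(c) for r in range(len(support) + 1)
               for c in combinations(support, r)]

    def mass(m, pts):
        return sum(m.get(p, 0.0) for p in pts)

    def ok(eps):
        for a in subsets:
            # Closed eps-neighborhood of A, restricted to the support.
            blow = [p for p in support if any(abs(p - q) <= eps for q in a)]
            if mass(mu, a) > mass(nu, blow) + eps + 1e-12:
                return False
            if mass(nu, a) > mass(mu, blow) + eps + 1e-12:
                return False
        return True

    return next(k / grid for k in range(grid + 1) if ok(k / grid))

# Lemma 3: the Levy-Prokhorov distance is bounded by the total variation distance.
mu, nu = {0.0: 0.5, 0.5: 0.5}, {0.0: 1.0}
tv = sum(abs(mu.get(p, 0.0) - nu.get(p, 0.0)) for p in set(mu) | set(nu))
assert lp_dist(mu, nu) <= tv

# Lemma 4 (subadditivity): d(mu1+mu2, nu1+nu2) <= d(mu1,nu1) + d(mu2,nu2).
m1, n1 = {0.0: 1.0}, {0.2: 1.0}
m2, n2 = {0.5: 1.0}, {0.6: 1.0}
msum, nsum = {0.0: 1.0, 0.5: 1.0}, {0.2: 1.0, 0.6: 1.0}
assert lp_dist(msum, nsum) <= lp_dist(m1, n1) + lp_dist(m2, n2) + 1e-9
```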
2.3 A Convenient Coupling of the Edge Weights
We follow [Bat20, Sect. 2.1] in coupling the environment to uniform random variables in order to work in a compact space of measures and to connect our results with his.
The idea is to write our i.i.d. edge weights as
for some measurable function and i.i.d. Unif[0,1]-valued random variables on the same probability space as . For instance, we could take the quantile function
But our results (in particular, our definitions of grid entropy) are independent of the chosen coupling function, so we allow it to be arbitrary (with the conditions stated above). This comes into play later in Section 5, when we study the Gibbs Free Energy as a function of this coupling.
To distinguish between the and the , we call the former edge weights and the latter edge labels.
Let denote Lebesgue measure on . We tweak the definition of empirical measures in this new setup: for any NE path in , define
Then we can relate and the to and the respectively via the pushforward:
One advantage is of course that the set of probability measures on is weakly compact, so for any sequence of paths in the grid we get a subsequence for which converges weakly to some measure.
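As a concrete instance of the coupling: if the edge-weight distribution is Exp(1), one valid choice of coupling function is the quantile function u ↦ -log(1-u), and the pushforward of Lebesgue measure on [0,1] under it is Exp(1). A sketch (the name `tau` is illustrative, not necessarily the paper's notation):

```python
import random
from math import log

def tau(u):
    """Quantile function of Exp(1): pushes a Unif[0,1] label to an edge weight."""
    return -log(1.0 - u)

rng = random.Random(1)
labels = [rng.random() for _ in range(200000)]   # i.i.d. Unif[0,1] edge labels
weights = [tau(u) for u in labels]               # coupled Exp(1) edge weights

# The pushforward of Lebesgue measure under tau is Exp(1): mean 1,
# up to sampling error.
mean = sum(weights) / len(weights)
assert abs(mean - 1.0) < 0.02
```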
In the case of a continuous cumulative distribution function , we can use a lemma proved in [Bat20] to get a nice duality.
Lemma 5.
Given a measure on with continuous cdf , if we let
be its quantile function then there is a probability 1 event on which for some subsequence and paths if and only if .
Proof.
[Bat20, Lemma 6.15] establishes that there is a probability 1 event on which
implies . But then on the same event, given a subsequence for which , compactness gives us a convergent subsubsequence hence
so . The fact that is continuous and the quantile function satisfies
implies and agree on all sets hence . ∎
Thus, in the case when has continuous cdf, we lose no generality by doing this coupling and working with measures on [0,1].
However, even in the most general case where may not have continuous cdf or bounded support, our work in developing grid entropy still holds because we only use the compactness of the space of measures in later sections devoted to our variational formula for the Gibbs Free Energy.
In short, we lose nothing by restricting ourselves to the compact space of measures on .
A benefit of this coupling is the following amazing result of Bates [Bat20, Lemma 6.3 and Thm 6.4]. Here we denote by the sets of finite non-negative Borel measures on [0,1] and finite non-negative Borel measures on [0,1] with total mass .
Theorem 6.
-
(i)
Fix . Define to be the (event-dependent) set of measures for which there is a subsequence of paths with . Then there exists a deterministic, weakly closed set independent of s.t.
-
(ii)
Fix . Define to be the (event-dependent) set of measures for which there is a subsequence of length paths from with . Then there exists a deterministic, weakly closed set independent of s.t.
Moreover, .
Remark.
Bates proves this theorem in the setup of First Passage Percolation, but notes that it holds analogously in the Last Passage model.
Instead of looking at all empirical measures that have a weakly convergent subsequence we may look at only certain empirical measures with this property and the same result will hold. The proof is almost identical to Bates’s original proof except for this change, so we omit it.
Corollary 7.
-
(i)
Fix and . Define to be the (event-dependent) set of measures for which there is a subsequence of the paths with the th smallest values of satisfying
Then there exists a deterministic weakly closed set s.t.
-
(ii)
Fix and . Define to be the (event-dependent) set of measures for which there is a subsequence of the length paths from with the th smallest values of satisfying
Then there exists a deterministic weakly closed set s.t.
Remark.
Since the are increasing in , the sets are decreasing in . The same holds for .
Once we develop the concept of grid entropy, we will easily relate these sets to the sets of measures with finite grid entropy in direction , grid entropy at least in direction , finite direction-free length grid entropy, direction-free length grid entropy at least respectively.
2.4 Directed Metric Spaces
Grid entropy will turn out to be a directed metric with negative sign. We recall what that entails.
Definition.
A directed metric space with positive sign is a triple where is a vector space, is a distance function satisfying and the usual triangle inequality . A directed metric space with negative sign is a triple such that is a directed metric space with positive sign.
Remark.
Standard metric spaces are clearly examples of directed metric spaces with positive sign. However, directed metric spaces with positive sign need not be metric spaces: the distance might not be symmetric, and might not be positive and finite for non-equal arguments. The “directed” in the name indicates the possibility of asymmetry.
Certain directed metrics give rise to directed norms in the same way certain metrics give rise to norms.
Definition.
If is a directed metric with positive/negative sign such that it is translation-invariant and homogeneous with respect to positive factors, then it induces a directed norm with positive/negative sign given by
Of particular interest to us are directed norms defined in terms of the empirical measures we observe along paths between points. In the fixed-direction case, our directed norms will be defined on the space of tuples consisting of a point (the “direction” we are observing) and a finite Borel measure (the target measure we want the empirical measures to be near). In the direction-free case, our directed norms will just be defined on the space of finite Borel measures.
2.5 The Subadditive Ergodic Theorem
The key theorem we use to prove the existence of the scaling limit of these directed metrics is Liggett’s improved version of Kingman’s Subadditive Ergodic Theorem. Before stating this theorem, we recall the definitions of stationary sequences and ergodicity, as presented in [Dur19, Sect. 7].
Definition.
A sequence of random variables is called stationary if the joint distributions of the shifted sequences do not depend on the shift.
As it turns out, the sequence of random variables we are interested in is a sequence of i.i.d. random variables, which is clearly stationary.
Definition.
Let be a probability space and let be a map. is said to be measure-preserving if . is said to be ergodic if it is measure-preserving and if all -invariant measurable sets are trivial, i.e. whenever and .
In the context of sequences, we look at the space of infinite sequences of real numbers with the σ-algebra generated by
(where is the Borel -algebra on ) and with the product probability measure on determined by
where and is a Borel probability measure on . We consider the shift operator given by . is easily seen to be measure-preserving with respect to since for the generating sets of . When we refer to the ergodicity of a sequence of random variables, we mean the ergodicity of this shift operator.
In our case, where we have a sequence of i.i.d. , the corresponding shift operator is ergodic. Indeed, if then
so is in the tail σ-field and thus by Kolmogorov’s 0-1 Law.
We are now ready to state the Subadditive Ergodic Theorem in the form we need.
Theorem 8 (Kingman’s Subadditive Ergodic Theorem, [Lig85]).
Suppose are random variables satisfying
-
(i)
constant s.t. and for all
-
(ii)
, is a stationary process
-
(iii)
The joint distributions of are not dependent on
-
(iv)
Then
-
(a)
-
(b)
exists a.s. and in , and
-
(c)
If the stationary sequences in (ii) are ergodic, then a.s.
Remark.
We may replace with in the statement of the theorem to obtain a version for superadditive sequences.
This theorem is the basis for the construction of grid entropy in our paper.
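To illustrate the superadditive version in our context: last passage times along the diagonal are superadditive, so the theorem yields an a.s. limit of the normalized passage time. For the exactly solvable case of i.i.d. Exp(1) site weights the diagonal limit is known to be 4, which a small dynamic-programming simulation reproduces (illustrative only; the paper makes no solvability assumptions):

```python
import random
from math import log

def last_passage_time(n, seed=0):
    """Last passage time to (n, n) for i.i.d. Exp(1) site weights, computed by
    dynamic programming: T[i][j] = w[i][j] + max(T[i-1][j], T[i][j-1])."""
    rng = random.Random(seed)
    T = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            w = -log(1.0 - rng.random())  # Exp(1) site weight
            best = max(T[i - 1][j] if i else 0.0, T[i][j - 1] if j else 0.0)
            T[i][j] = w + best
    return T[n][n]

# Superadditive ergodic limit: T(0,(n,n))/n -> 4 a.s. for Exp(1) weights.
ratio = last_passage_time(400) / 400
assert 3.5 < ratio < 4.1
```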
2.6 Relative Entropy and Sanov’s Theorem
In the next preliminary section, we recall the basics of the Kullback-Leibler divergence (introduced in [KL51]) and Sanov’s Theorem for large deviations. We later use this theorem to establish a relationship between our grid entropy and this notion of relative entropy.
Definition.
Let be distributions on our underlying metric space . The Kullback-Leibler divergence or relative entropy of from is defined to be
where is the Radon-Nikodym derivative and is the natural logarithm.
Remark.
Our main interest in relative entropy is that it is the rate function for large deviations of empirical measures. This is captured by Sanov’s Theorem.
Theorem 9 (Sanov’s Theorem, [DS01]).
Consider a sequence of i.i.d. random variables taking values in a set . Let be their unnormalized empirical measures. Then for any weakly closed set we have
and for any weakly open set we have
Remark.
Since is closed, the infimum is achieved by some . Furthermore, if then the right-hand side of the inequality is 0, which gives us no information; however, if , then the theorem gives an exponential bound on large deviations.
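As a scalar sanity check of this rate function (illustrative, using Bernoulli variables rather than the paper's setting): the probability that the empirical mean of n i.i.d. Bernoulli(p) samples exceeds a > p decays at exponential rate given by the relative entropy of Bernoulli(a) from Bernoulli(p).

```python
from math import comb, log

def kl_bernoulli(a, p):
    """Relative entropy D(Ber(a) || Ber(p))."""
    return a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))

def tail(n, p, a):
    """Exact P(S_n >= a*n) for S_n ~ Binomial(n, p)."""
    k0 = int(a * n)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k0, n + 1))

n, p, a = 500, 0.3, 0.5
rate = -log(tail(n, p, a)) / n
# Sanov/Cramer: the empirical rate matches the relative entropy up to O(log n / n).
assert abs(rate - kl_bernoulli(a, p)) < 0.02
```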
3 Grid Entropy as a Directed Norm
3.1 The Plan for Deriving Direction-Fixed Grid Entropy
For the purposes of Section 3, we temporarily forget our original definition (1) of grid entropy and rederive it as a limit of scaled directed metrics.
To summarize the setting described in Section 2: we consider empirical measures along NE paths on the lattice, where the edges carry i.i.d. weights obtained by applying a measurable function to i.i.d. Unif[0,1] edge labels. The spaces of finite, finite non-negative, and fixed-total-mass finite non-negative Borel measures are denoted as in Section 2.2.
We begin with direction-fixed grid entropy. We wish to count the number of paths with empirical measure very close to the target . Let us try to define a distance on by
Note that this is if there are no such paths and it is 0 if (since there is exactly one path , and it has empirical measure 0).
One glaring issue is that we are only counting paths that exhibit exactly the target empirical measure. Since the Lebesgue measure on [0,1] is continuous, almost surely the edge labels take distinct values, and thus the unnormalized empirical measures uniquely determine the paths. It follows that almost surely this distance is always either infinite or 0. We need to change our definition to count paths with an empirical measure “close” to the target instead.
Another problem is that we wish to apply the Subadditive Ergodic Theorem to learn about the behavior of
as . Thus we must also change our definition of so that it is integrable (and in particular finite a.s.) when .
The trick is to replace the counting of the paths exhibiting the exact target empirical measure with a “cost function” that attributes an exponential cost to each path based on how far its empirical measure is from the target. We also introduce a parameter that, as it decreases to 0, takes the cost function towards the counting of the paths with the target empirical measure that we tried initially.
Definition.
Fix . Define a distance on by
where is the Levy-Prokhorov metric and where we sum over all NE paths .
Remark.
This distance is 0 if and only if .
Remark.
For any path , the corresponding empirical measure observed must necessarily be of the form where .
Remark.
As the costs converge to the indicators
Hence the sum of the costs approaches the number of paths with empirical measures precisely equal to .
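This convergence is a scalar fact about exponential costs: as the parameter tends to 0, each cost exp(-d/δ) tends to the indicator that d = 0, so the summed costs tend to the number of exact matches. A toy illustration (hypothetical distances; `delta` plays the role of the paper's parameter):

```python
from math import exp

def soft_count(distances, delta):
    """Sum of exponential costs exp(-d/delta) over a list of distances d >= 0."""
    return sum(exp(-d / delta) for d in distances)

# Distances of several paths' empirical measures to a target; two exact hits.
ds = [0.0, 0.0, 0.05, 0.2, 0.7]
counts = [soft_count(ds, d) for d in (1.0, 0.1, 0.001)]
# As delta -> 0 the soft count decreases to the number of exact matches (= 2).
assert counts[0] > counts[1] > counts[2]
assert abs(counts[2] - 2.0) < 1e-9
```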
With this definition in mind, let us discuss the plan of attack.
First, we prove the existence of using the Subadditive Ergodic Theorem and some estimates we derive for the error terms when do not have integer coordinates. Then we take the infimum over of these limits and we define the resulting norm to be our grid entropy.
Theorem 10.
Fix and . Then
converges in probability to a constant. When , the convergence is pointwise a.s.
Remark.
The theorem holds trivially, with the limit being , if or if and . It also holds trivially, with the limit being 0, if and .
The limit given by this theorem is a directed metric with negative sign on . When we take an infimum over we still get a directed metric with negative sign.
Theorem 11.
For , and define
Then each as well as are directed metrics with negative sign on .
We show that this metric gives rise to a norm on . This will finish our discussion of the direction-fixed grid entropy and we will move on to the direction-free case.
3.2 The Limit Shape of Starting at
In this and the following subsection, we focus on the direction-fixed grid entropy.
To prove Theorem 10, we first prove a simplified version, which we later generalize easily.
Theorem 12.
Fix and . Then
Remark.
As noted before, Theorem 10 holds trivially when , so we need not bother with this case.
We prove this theorem in stages, starting with the case when has integer coordinates. But first, we show a useful bound on our random variables .
Lemma 13.
Let and with . Then
Proof.
Recall that any path has edges, so , and that the total number of paths is
Also, for any such we have by Lemma 3
Thus
∎
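The counting bound in the proof can be sanity-checked in the planar case. The sketch below (ours, with d = 2 for simplicity) enumerates NE paths to (a, b) and confirms the count is the binomial coefficient C(a+b, a), the two-dimensional instance of the multinomial formula used above.

```python
from math import comb

def count_ne_paths(a, b):
    """Count NE lattice paths from (0,0) to (a,b) by enumeration:
    each step moves one unit north or east."""
    if a == 0 or b == 0:
        return 1
    return count_ne_paths(a - 1, b) + count_ne_paths(a, b - 1)

# Each path has a + b edges, and choosing which of them go east
# determines the path, so the count is C(a+b, a):
assert all(count_ne_paths(a, b) == comb(a + b, a)
           for a in range(6) for b in range(6))
```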
Lemma 14.
Theorem 12 holds with .
Proof.
We wish to use Kingman’s Subadditive Ergodic Theorem (Theorem 8) with
Let us now check the conditions (i)-(iv).
Next, for every , the sequence
is i.i.d.: the distribution of the unnormalized empirical measures of paths does not depend on (since the edge labels are i.i.d.), so the distribution of the cost functions for does not depend on . Thus (ii) holds. Furthermore, as discussed in Section 2.5, being i.i.d. implies the sequence is ergodic.
Similarly, the joint distributions of
do not depend on , since the edge labels are i.i.d. and the distribution of the empirical measures of paths in a lattice rectangle whose top-right and bottom-left corners differ by is independent of the rectangle's location. Thus we have (iii).
It remains to show (iv), namely to show that given
For any paths and , we get a unique concatenation . Its empirical measure satisfies . But the Levy-Prokhorov metric satisfies subadditivity by Lemma 4. Thus
But then
Not all paths pass through so we can upper bound the expression above by removing this condition:
Thus we can apply the Subadditive Ergodic Theorem (Theorem 8) to get that
converges a.s. to the constant
∎
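The superadditivity driving this proof is already visible in the deterministic toy case of pure path counting. The following sketch is our own illustration, not from the paper: with a_n the log-number of NE paths to (n, n), concatenation at the midpoint gives Fekete-style superadditivity, and a_n / n converges to its supremum, here log 4.

```python
from math import comb, log

# a_n = log #paths from (0,0) to (n,n); concatenating a path to (m,m)
# with a path from (m,m) to (m+n, m+n) shows a_{m+n} >= a_m + a_n,
# so by Fekete's lemma a_n / n converges to sup_n a_n / n.
a = lambda n: log(comb(2 * n, n))

assert all(a(m + n) >= a(m) + a(n)
           for m in range(1, 10) for n in range(1, 10))
# The limit of a_n / n is log 4, by Stirling's approximation:
assert abs(a(300) / 300 - log(4)) < 0.02
```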
The next order of business is proving the theorem for with rational coordinates. The following error estimate, on how changes when is perturbed in the SE direction and is perturbed arbitrarily, will prove useful.
Lemma 15.
Fix and . Then for any with ,
Proof.
Fix any such . The inequality holds trivially if so we may assume . Then there is at least one path , and we fix it.
For any path , we concatenate it with to get a unique path . Note that consists of edges so its empirical measure has Total Variation at most . Thus satisfies
It follows that
∎
Using this lemma to approximate for in terms of where , we prove our limit theorem for with rational coordinates.
Lemma 16.
Theorem 12 holds for .
Proof.
Let for . First, compare for arbitrary using the previous lemma. The idea is that has integer coordinates so have the desired limits by Lemma 14. The rest of the are bounded above/below by respectively plus some small error terms which go to 0 as .
Consider any . Note that
and
By Lemma 15,
Thus
(4)
But so by Lemma 13, our limit theorem holds for . That is,
(5)
Taking expectations in (4), then taking the supremum/limit as and using (5) we get
(6)
since the error terms in (4) go to 0. Similarly, taking the limit as in (5) and using (4), (6) we get
Multiplying everything by and using the fact that any can be written as , we get
∎
Our next objective is to prove the full version of Theorem 12. We will need two short lemmas giving sufficient conditions for a.s. convergence of bounded random variables to the supremum of their expectations. The first of these lemmas looks at the case where we are given a particular lower bound for the liminf of all subsequences of our sequence.
Lemma 17.
Let be uniformly bounded random variables. If a.s.
then a.s.
Proof.
First, taking expectations in
and using Fatou’s Lemma, we get
so all the inequalities are equalities and
This holds for all subsequences , including the full sequence. Thus a.s.
hence a.s. ∎
The next lemma is about approximating a sequence of random variables from below by sequences that each converge to the supremum of their expectations.
Lemma 18.
Let be uniformly bounded random variables s.t.
- (i) For every fixed , for large enough
- (ii) For every fixed ,
- (iii)
Then a.s.
Proof.
We wish to prove the hypothesis of Lemma 17 for . Let
which has measure 0 by (ii). We claim that in the event we have
Consider any subsequence and any . By (i), . Taking the over , and using the fact that we are in the event , we get
This holds for all . Taking the supremum over and using (iii), we get
as desired. Applying Lemma 17, we get
∎
We can finally prove Theorem 12.
Proof of Theorem 12.
The case is handled by Lemma 16. So we may assume .
We construct a sequence as follows. For every , either and we pick , or and we pick s.t. . It follows that with . That is, forms a "staircase" in .
The intuition is that approximates from below. Each converges in to the right limit a.s. by Lemma 16, and we use Lemma 18 to prove that converges a.s. to the right limit.
Let us now prove the hypothesis of Lemma 18 with
Note that these random variables are bounded by Lemma 13:
First, fix and consider any . By Lemma 15,
so we have (i). Next, (ii) follows from Lemma 16:
It remains to show (iii). Consider any fixed . For each either and , or and hence for large enough . Thus for large enough . It follows that
By the Bounded Convergence Theorem, . Thus
so taking the supremum over we get (iii):
This completes the proof of the existence of the limit shape of when measuring the distance from .
3.3 The Limit Shape of —The General Case
We use Lemma 18 to generalize Theorem 12 to Theorem 10, where measures distance between any two elements of .
Theorem 10.
Fix and . Then
converges in probability to a constant. When , the convergence is pointwise a.s.
Remark.
During the course of the proof, we also show that the limit shape of
is translation-invariant. This will help us later.
Proof.
As noted before, the theorem holds trivially if . Thus we may assume .
Observe that
(7)
so it suffices to assume and prove
converge a.s. to a constant.
Our argument mirrors the one used in the proof of Theorem 12: we approximate these from below by some converging in and we apply Lemma 18.
First we prove some inequalities. Fix . Observe that for ,
(8)
and thus
(9)
On the other hand, for s.t. ,
for . Thus
This equation also holds trivially for s.t. . Thus
(10)
We prove the hypothesis of Lemma 18 with
First, by Lemma 15 and (9), for every fixed , for large enough we have
i.e. which is precisely (i). Also, Theorem 12 gives (ii): for every fixed , a.s.
Note that this implies
(11)
It remains to show (iii). Note that taking expectations in (i) immediately gives
(12)
We show this is an equality. By Lemma 15 and (10), for every fixed , for large enough we have
Dividing by , taking expectations and taking the supremum over we get by (11)
(13)
which when combined with (12) gives (iii).
Applying Lemma 18, we get
i.e.
Furthermore, by (11), (12) and (13), we have
Combining this with (7), we get that converges in probability to the a.s. limit of .
This completes the proof of Theorem 10 and of translation-invariance of the limit shape.
∎
3.4 Grid Entropy as a Directed Norm
In the previous sections we showed the existence of the limit shape of . We now take the infimum as and we show that the result is a directed metric with negative sign that gives rise to a norm, which we call grid entropy.
Theorem 11.
For , and define
Then each as well as are directed metrics with negative sign on .
Remark.
For any and , is monotone decreasing as so
By Lemma 13, for every and ,
so it follows that
Once we prove that our two definitions of grid entropy are equivalent, this bound will be improved.
Remark.
As was the case with Theorem 12, the limit is trivially if or if and , and it is trivially 0 if and .
Proof.
As noted above,
It remains to prove the reverse triangle inequality. Let and and consider any and any . If or then the following inequality holds trivially (because the right-hand side is )
(14)
Now suppose . Given paths , , we concatenate them to obtain a unique path with unnormalized empirical measure . By the subadditivity of the Levy-Prokhorov metric (Lemma 4),
so
It follows that (14) holds. Dividing (14) by and taking the limit (in probability) as we get
so is a directed metric with negative sign. Taking the limit as , we still obtain a directed metric with negative sign. ∎
We proceed to show that gives rise to a directed norm with negative sign.
Theorem 19.
- (i) Each is translation-invariant and positive-homogeneous. So is .
- (ii) For define the grid entropy with respect to to be
Then this is a directed norm with negative sign on .
Remark.
From before, is if or if and , and it is 0 if and .
Remark.
A directed metric with negative sign is clearly concave. Thus each as well as are concave functions on their respective domains.
Remark.
These properties of grid entropy also follow from [RAS14].
Proof.
(i) Fix . We already showed that is translation-invariant while proving Theorem 10.
By translation-invariance, it suffices to show that is positive-homogeneous. Consider any . Then
Looking at the subsequence consisting of multiples of , we get
But each so
Thus is positive-homogeneous for rational factors.
Now consider any and take sequences . Consider any , and let us look at
By Lemma 15,
Dividing by and taking a.s. limits, and applying homogeneity for positive rational factors we get
Taking gives us
so is homogeneous with respect to any positive real factor.
Taking the infimum over we get that is translation-invariant and positive-homogeneous.
(ii) Follows directly from (i).
∎
3.5 Direction-free Grid Entropy
We now wish to develop a grid entropy for the case where we no longer restrict ourselves to paths in a given direction , and instead look at all length paths from for a given size parameter . Another way of putting this is that we look at paths from the origin to the line, or "level", . Recall that the set of all such paths is denoted .
If we try to simply repeat our previous argument, we run into a dead end because we are no longer in a superadditive setting. The solution is to observe that the distances are maximized, over with , by . This intuitively makes sense, since this is the direction with the most NE paths.
Lemma 20.
Fix . Then
Remark.
In Section 4 we show that is a necessary condition for to be finite, so it makes sense that we only take the supremum over with .
Proof.
This is an easy consequence of the symmetries of the grid and the concavity of and direction-fixed grid entropy. We focus on the proof for ; the argument for grid entropy goes the same way.
Fix . By positive-homogeneity and since the is trivial, we may assume . Suppose there exists s.t. and
(15)
Among such , pick one that maximizes the number of coordinates equal to . Thus there are distinct s.t. , so we can write as a convex combination of :
(16)
Let be with swapped. By symmetry of the grid,
hence by concavity of ,
But only changes the coordinates of in positions , with becoming by (16). Thus we have found a satisfying (15) that has at least one more coordinate that is than our previous , which we had assumed had the maximal number of such coordinates. Contradiction. Therefore as desired. ∎
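The exchange argument in this proof reflects a general fact: a symmetric concave function on the simplex is maximized at the uniform point. A quick numerical sanity check follows (our own toy example, with Shannon entropy standing in for the symmetric concave function).

```python
import random
from math import log

def shannon(p):
    """Shannon entropy: a symmetric, concave function on the simplex."""
    return -sum(x * log(x) for x in p if x > 0)

random.seed(0)
d = 4
uniform = [1.0 / d] * d
for _ in range(1000):
    # A random point of the simplex, via sorted uniform cuts of [0,1].
    cuts = sorted(random.random() for _ in range(d - 1))
    p = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]
    # The uniform point maximizes any symmetric concave function:
    assert shannon(p) <= shannon(uniform) + 1e-12
```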
We now use this useful fact along with the compactness of to show that is the desired direction-free grid entropy of length .
Theorem 21.
Fix . For any we have
a.s.
Proof.
The statement is trivial when so we may assume . We focus on the first equality. By Lemma 20 and the trivial fact that in general, we immediately get
Suppose equality does not hold on some event of positive probability. Thus there exists s.t.
(17)
with positive probability. Let us intersect this event with the measure 1 event that
exists and is equal to the limit in Theorem 10 for every . Denote this event by .
Unfortunately, the expression inside the supremum is only continuous when approaching from the SE, and not necessarily when approaching along , so we need to work with some rather unpleasant approximations.
Let be the (event-dependent) subsequence corresponding to the . For every pick some (event-dependent) with s.t.
(18)
By compactness of , there is a converging subsequence for some . Pick some s.t. and
(19)
The idea is that since and the coordinates of are strictly positive, then there exists s.t. . For such we can apply Lemma 15 to get
(20)
Recalling that we are in the event , pick some large enough so that for any
(21)
and
(22)
by (19). Putting everything together, we get
By Lemma 20, this is
which is a contradiction. Note that even though our sequence , the limit and the chosen are event-dependent, the upper bound above is not, by virtue of Lemma 20. That is what makes this argument work.
We have thus shown
We now proceed with the second equality. Observe that for any , any length NE path from ends at for some with . Furthermore, the number of possible different endpoints of such paths is precisely , which is the number of -tuples of non-negative integers summing to . Therefore
When we divide by and take the limit, the term goes to 0 and we get that
as desired. ∎
Of course, we already established that is a directed metric with negative sign. This means that if we now take the infimum over we get that the length direction-free grid entropy is precisely . Since this is unless (see Theorem 24 in a later section), then we can simply let to begin with.
Definition.
For let and define the direction-free grid entropy to be
Remark.
The final equality holds by Lemma 20 combined with the fact that unless .
3.6 Equivalence of Approaches
The next order of business is to show that these definitions of direction-fixed/direction-free grid entropy are equivalent to our original, more intuitive, definitions (1). The novel results in this subsection show the benefit of the fresh approach to grid entropy presented in this paper.
The key idea is that the sum
appearing in the definition of is approximately the number of paths with very small and the error should disappear in the limit.
Let us recall how the are defined. Fix . For any with define
to be the order statistics of as we range over all , and let
In this way are just the minimum/maximum values of respectively.
Similarly, in the direction-free case, for we let and for any with define
to be the -th smallest value of when we range over all length paths from .
We usually omit the in for the sake of space.
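To make the order-statistics notation concrete, here is a toy computation of ours: we draw placeholder per-path distances (a real computation would range over all NE paths and use the Levy-Prokhorov distance to the target measure) and sort them.

```python
import random

random.seed(1)
# Pretend each of N paths pi has a distance rho(mu_pi, nu) to the
# target measure; here we simply draw placeholder values.
N = 20
dists = [random.random() for _ in range(N)]

# The order statistics are the sorted values: the k-th order
# statistic is the k-th smallest distance over the N paths, so the
# first/last order statistics are the min/max respectively.
order_stats = sorted(dists)
assert order_stats[0] == min(dists) and order_stats[-1] == max(dists)
assert all(order_stats[k] <= order_stats[k + 1] for k in range(N - 1))
```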
We are now ready to prove that our definitions of grid entropy are equivalent, and moreover that the pertinent limits either converge to 0 a.s. or the corresponding ’s are a.s. In terms of notation, we denote by the direction-fixed/direction-free grid entropy as defined in Theorem 19/Theorem 21.
Theorem 22.
Fix and let .
- (i) The grid entropy being finite is characterized by the existence of paths with empirical measure arbitrarily close to the target. More specifically,
Furthermore, if then
and if then . In particular,
i.e. is always among the distributions observed.
- (ii) We have
In other words,
(23)
where this is if the set of ’s is empty. Similarly,
and the direction-free grid entropy is given by
Remark.
Since the increase in , then the sets we are taking the supremum over in the definition in (ii) are either or of the form or where is the corresponding grid entropy.
Remark.
The limits
need not exist for all . What we are saying is that they do exist and are equal to 0 a.s. for , and that the corresponding ’s are a.s. for . This, along with the existence of the limits in (i), is a completely non-trivial fact that is a direct consequence of the a.s. convergence of .
Proof.
We focus on proving the direction-fixed grid entropy statements. The direction-free arguments are analogous.
(i) The strategy is to provide upper and lower bounds on in terms of
the leading term in the sum over . Since converges a.s. then our analysis gives that must be in the intersection of two intervals, one in terms of
and the other in terms of the . The only way these intervals can have non-empty intersection for small enough is if the ’s equal the ’s.
Fix . We consider the over . Since for all valid , then
where each . Taking the a.s. or the a.s. and using Lemma 1 we get that a.s.,
(24)
Since this holds for arbitrarily small then we must have
Furthermore, taking in (24), we see that
Moreover, if indeed then (24) gives .
Finally, if we fix an infinite NE path passing through all then by the Glivenko-Cantelli Theorem (Theorem 2),
where denotes the restriction of the path to , so from above it follows that .
(ii) If then by (i),
so the statement holds trivially because the are nondecreasing in . Thus it suffices to assume and hence
(25)
Fix and consider any . By (25), the leading term of
is . So we must look at the secondary terms by writing
(26)
where the summands are nonincreasing in .
The argument is similar to the one in (i), in that we compute some upper and lower bounds for this expression.
Fix an arbitrary . We lower bound (26) by truncating the sum at and lower bounding all the summands by the summand:
(27)
We upper bound (26) by bounding the summands for up to by 1 and bounding the remaining summands by the summand:
(28)
Consider the event-dependent sequence
Taking the in (27),(28) and using (25) and Lemma 1 we get
a.s. for this arbitrary .
In particular, for any we get
hence
Thus our approaches to grid entropy are equivalent.
Note that so far we have not made use of the coupling or of the compactness of the space of measures on . We could have developed grid entropy in the original environment, with unnormalized empirical measures and edge weight distribution and target measures on in the same way. Furthermore, if has a continuous cdf then the value of the grid entropy would be the same either way, because of the duality between the environments, established in Lemma 5; that is, for any and we would have
where the grid entropies are developed on the environments respectively. Even in the general case where may not be continuous, [Bat20, Lemma 6.15] implies that
Again though, this is just a nice fact; for practical purposes we may simply define grid entropy on the space of measures on and our arguments thus far hold.
We now reap the benefits of this coupling, by describing the sets defined in Theorem 6 and Corollary 7 in terms of direction-fixed/direction-free grid entropy.
Corollary 23.
Fix and .
- (i)
- (ii)
Fix . The set of measures with grid entropy in direction at least/most can be characterized in terms of the deterministic sets defined in Corollary 7 (i):
Thus
, and
Now fix . The set of measures with length direction-free grid entropy at least/most can be characterized in terms of the deterministic sets defined in Corollary 7 (ii):
Thus
, and
Proof.
This follows immediately from Theorem 22. ∎
Now that we have a firm grasp of what direction-fixed and direction-free grid entropies actually measure, we can move on to examining what information they can give us.
4 Properties of Grid Entropy
Since direction-free entropy is just for then it suffices to study the properties of direction-fixed grid entropy. We are particularly interested in what the grid entropy can tell us about and when is finite.
We have already established that it is necessary for to be in if we want not to be , and that if then if and only if . The question now is what sorts of measures are observed by the empirical measures along the direction .
By positive-homogeneity of the norm, it suffices to consider with and study which give rise to finite . In this section, it will be more convenient to work with the description of grid entropy given by Corollary 23 (ii).
Theorem 24.
Recall that we couple the edge weights with i.i.d. edge labels (the Lebesgue measure on ) and that
Consider any with and any . Suppose . Then:
- (i) is a probability measure on .
- (ii) For any or set , we have
- (iii) From Section 2.6 the Kullback-Leibler divergence is defined as
Then
so in particular, is absolutely continuous with respect to the Lebesgue measure on .
Remark.
If we take and recall that we get the analogous statements for length 1 direction-free grid entropy.
Remark.
(i) is exactly what we expect because the empirical measures for paths
are non-negative Borel measures with total mass so in the limit we expect to only observe which are probability measures.
Remark.
Recalling Theorem 6, (i) implies that (and hence ) is weakly compact.
Remark.
If we do not make use of our coupling and define grid entropy directly on the space of measures on , (i)-(iii) hold analogously. This is trivial since we do not use weak compactness in the proof of this theorem.
Proof.
(i) The intuition is that if then each is bounded below away from 0 hence it does not converge to 0.
Suppose is not a probability measure on . By the reverse triangle inequality for , for any and any path ,
Since and then there is s.t. for large enough ,
It follows that a.s. so . Contradiction. Therefore must be a probability measure on .
(ii) Before proceeding with (ii) we prove a quick claim that will also help with the proof of (iii).
Claim: Fix and a nonempty set of measures . Then
In particular, if and
then
Proof. Indeed, for any we have by Markov’s Inequality
Taking of both sides and using the definition of we get the desired inequality.
Now suppose
Then for large enough ,
where the exponent is a strictly negative constant multiple of . Hence, by the Borel-Cantelli Lemma,
(Claim)
Now we begin the proof of (ii). The idea is to use the definition of the Levy-Prokhorov metric to unpack what means in terms of the values of the labels along .
First, consider any closed . Observe that hence by continuity from above, and as .
We compute the probability that for a path directly, using the definitions of and , and then we will apply the Claim.
Consider . For any path , by definition of the Levy-Prokhorov metric and the empirical measure ,
and edge labels in
and edge labels in
If or then we may completely omit the first/second half of the statement above as it is trivial. Otherwise, we take small enough so that and large enough so that . For the sake of convenience we only show the computation in the latter case; the or case is merely a simplified version of it.
Since are i.i.d. Unif, then
(29)
by the union bound. We get the asymptotics of this binomial coefficient from Lemma 1:
(30)
Therefore
Combining this with the Claim with , we get that
By Corollary 23 (ii), it follows that
This holds for arbitrary . Taking and using and ,
Therefore
(31)
This equation is symmetric in hence it holds for all open or closed. By continuity from below/above it holds for all and subsets of .
(iii) The idea is to again use the Claim, this time to compute using Sanov’s Theorem.
Fix . By Sanov’s Theorem (Theorem 9), since is weakly closed then
where the right hand side might a priori be .
Take and let (the infimum over the weakly closed is achieved since relative entropy is lower semicontinuous). Then hence
as desired.
Finally, recall that hence so is finite and . ∎
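Sanov's Theorem, invoked in the proof of (iii), can be illustrated numerically in the simplest binary setting. This example is ours, not from the paper: the probability that n fair coin flips show frequency at least 3/4 decays at exponential rate D(Ber(3/4) || Ber(1/2)), up to a polynomial correction.

```python
from math import comb, log

def kl_bernoulli(p, q):
    """Kullback-Leibler divergence D(Ber(p) || Ber(q))."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

n = 200
# Exact probability that a Bin(n, 1/2) count is at least 3n/4:
tail = sum(comb(n, k) for k in range(3 * n // 4, n + 1)) * 0.5 ** n
rate = -log(tail) / n

# Sanov: rate -> D(Ber(3/4) || Ber(1/2)) as n grows, the gap being
# a positive O(log n / n) correction from the polynomial prefactor.
assert 0 < rate - kl_bernoulli(0.75, 0.5) < 0.02
```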
It is not clear in general whether any probability measure satisfying results in finite grid entropies for .
In one specific case, however, this does hold trivially. When is a permutation of , so that there is exactly one NE path , we get with the value of being 0. Note that this completely covers the case.
Also, in general we cannot explicitly compute grid entropy. But when the target measure is , we can easily show that grid entropy is maximal. This makes sense since, by the Glivenko-Cantelli Theorem, we expect most observed empirical measures to be very close to the edge label distribution.
Corollary 25.
Let . Then .
Proof.
Since and are positive-homogeneous (in the sense that and ), we may without loss of generality assume .
Suppose . Consider any . By Sanov’s Theorem (Theorem 9),
(32)
where is weakly closed and relative entropy is lower semicontinuous hence the infimum is achieved by some . But implies . Fix any
By the Claim from the proof of (ii) in the previous theorem with , we have
hence
But for large enough since , so
for this arbitrary , hence . Contradiction. ∎
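The Glivenko-Cantelli heuristic behind this corollary is easy to see numerically. In this sketch of ours, the sup-distance between the empirical cdf of n i.i.d. Uniform[0,1] labels and the Uniform cdf F(x) = x shrinks as n grows.

```python
import random

random.seed(0)

def ks_to_uniform(sample):
    """Sup distance between the empirical cdf of the sample and the
    Uniform[0,1] cdf F(x) = x, evaluated at the jump points."""
    xs = sorted(sample)
    n = len(xs)
    return max(max(abs((i + 1) / n - x), abs(i / n - x))
               for i, x in enumerate(xs))

# Longer "paths" (larger samples) hug the uniform cdf more tightly:
d_small = ks_to_uniform([random.random() for _ in range(50)])
d_large = ks_to_uniform([random.random() for _ in range(5000)])
assert d_large < 0.05
```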
Continuity is another property of interest. It turns out that grid entropy is upper semicontinuous in its argument(s); this follows immediately from (3) and the lower semicontinuity of relative entropy. However, we have not yet proved (3) so we prove a slightly weaker statement using the tools developed in this paper. We also note that grid entropy is lower semicontinuous (hence fully continuous) in some cases.
Note that here we no longer restrict ourselves to the case when .
Theorem 26.
Let s.t. .
- (i) If either or for large enough we have or for large enough we have for all s.t. then
- (ii) If , for large enough then
- (iii) If there exist constants s.t. and s.t. for large enough ,
then
Remark.
In the case when for large enough , (i) shows that direction-fixed grid entropy is always upper semicontinuous in . Taking we get that
so direction-free grid entropy is upper semicontinuous.
Remark.
In the assumptions in (ii) and (iii) it is implicit that the parameters lie in
because otherwise the grid entropy would be .
Remark.
The conditions in (i), (ii), (iii) are not mutually exclusive so for some sequences we may combine the statements to obtain
Proof.
For all three parts, we may assume the conditions hold for all .
(i) Case 1:
First suppose so that . By Theorem 22,
hence
since .
On the other hand, if then for large enough we have (because otherwise and hence , the weak limit of the , would have to be 0). But then
by Theorem 24 (i).
Case 2:
Recall that the infimum of a family of upper semicontinuous functions is upper semicontinuous. Thus it suffices to fix and show that
(33)
But for every we have by Lemma 15 and the fact that ,
For a fixed , taking we get
(33) follows immediately.
Case 3: whenever for all
We may assume (because this scenario was covered by Case 1). In particular, we may assume without loss of generality that .
To show
it suffices to prove this when . In particular, this implies that
and hence .
Now, the trick is to rescale the by a factor converging to 1 in s.t. we land back in Case 2. More precisely, define
Note that because . Since then there is some s.t. so by the assumption of Case 3, . Thus .
Furthermore, implies . In particular, . By the choice of we have for all , either hence or so . Thus
By Case 2 and positive-homogeneity of grid entropy,
as desired.
(ii) By the triangle inequality for a directed norm with negative sign,
since . Therefore
(iii) Now suppose
By the reverse triangle inequality and positive-homogeneity,
But hence
∎
In Lemma 20 we made a simple observation that is worth mentioning explicitly: is invariant under permutations of the coordinates of because of the symmetry of the grid lattice .
A last property we are interested in is the convexity of .
Lemma 27.
Let . The sets , are convex.
Proof.
Grid entropy is concave in by positive-homogeneity and the reverse triangle inequality it satisfies. ∎
5 Application: Directed Polymers
We now apply our results to give variational formulas for the point-to-point/point-to-level Gibbs Free Energy in terms of direction-fixed/direction-free grid entropy and last passage time. As mentioned, these variational formulas have previously appeared in [RAS14], [GRAS16]. Our aim is to arrive at these from the framework presented in this paper with the hope of providing fresh insights.
5.1 Setup and Known Results
Recall that the measurable function determines the edge weights
. We will be dealing with the Gibbs Free Energy, so to simplify matters we restrict ourselves in Section 5 to bounded measurable . This could be extended to measurable but possibly unbounded by truncating at constants and taking a supremum over .
Define the last passage time between points to be
where this is if . A standard result (see [Mar04] for example) is the existence of a last passage time constant, or in other words a deterministic limiting value for the last passage time in a fixed direction .
Theorem 28.
Let . Then there is a -dependent time constant s.t.
is homogeneous and concave in .
Remark.
As was the case with grid entropy, the symmetries of the grid and concavity of passage time imply that the time constant is maximized along the direction :
Remark.
Thus far, the time constant is known to be explicitly computable only when the underlying distribution is exponential.
It is useful to view last passage time as a linear functional. For any NE path , note that
where denotes the linear functional that is integration of a measurable function
against a measure . In this language, weak convergence is equivalent to
Of course, is likely not continuous. However, recalling [Bat20, Lemma 6.15], we see that there is an a.s. event on which converging weakly to implies converges weakly to hence
where is the identity function on [0,1].
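The identity above, expressing normalized passage time as the integral of the identity function against the empirical measure, can be checked mechanically; the sketch below is our own illustration with placeholder weights along a single path.

```python
import random

random.seed(2)
n = 10
weights = [random.random() for _ in range(n)]  # labels along one NE path

# The normalized empirical measure puts mass 1/n at each observed
# label, so integrating f(x) = x against it gives the average weight,
# and the passage time is n times that integral.
integral_id = sum(w * (1.0 / n) for w in weights)
assert abs(sum(weights) - n * integral_id) < 1e-12
```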
By using the compactness of and Theorem 6, Bates establishes a variational formula for the direction-fixed limit shape. See [Bat20, Theorem 2.1] for details. We restate this theorem here in our LPP setting; it can be proved by tweaking Bates' original argument. We give our own proof in the next section.
Theorem 29.
Let . Recall that Theorem 6 gives a deterministic weakly closed set which a.s. consists of all weak limits of subsequences of for paths of increasing length. Then for any measurable , the corresponding limit shape is given by
The direction-free analogue holds as well.
Remark.
Since is weakly closed and nonempty (because from before), then the set of maximizers is nonempty:
As Bates remarks, when is a singleton, empirical measures along geodesics converge weakly to this unique maximizer (or analogously, the unique minimizer in FPP), answering Hoffman's question in the affirmative. It is further pointed out that is a singleton for a dense family of functions . See [Bat20] for more details.
Remark.
Noting that the limit shape is the zero temperature analogue of -Gibbs Free Energy, these direction-fixed/direction-free variational formulas for the limit shape are also given as (7.6), (7.5) in [GRAS16].
Finally, let us recall the directed polymer model, still restricting ourselves to the LPP on setting.
Definition.
Fix a direction , an inverse temperature and a bounded measurable . The point-to-point -partition function in direction is defined to be
The corresponding point-to-point -polymer measure on the set of paths is
and the corresponding point-to-point - Gibbs Free Energy is defined to be
Suppressing the direction we get point-to-level analogues (to the "level" ):
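A minimal sketch of these objects on a small grid may be helpful; it is our own illustration, with arbitrary inverse temperature and grid size. We enumerate the NE paths to (a, b), form the partition function, and check that the polymer measure is a probability measure.

```python
import random
from math import comb, exp, log

random.seed(3)
a = b = 3
# i.i.d. edge weights: weight[(x, y, d)] is the edge at (x, y) going
# east ("E") or north ("N").
weight = {(x, y, d): random.random()
          for x in range(a + 1) for y in range(b + 1) for d in "NE"}

def paths(x=0, y=0):
    """Enumerate NE paths from (x, y) to (a, b) as lists of edges."""
    if (x, y) == (a, b):
        yield []
    if x < a:
        for rest in paths(x + 1, y):
            yield [(x, y, "E")] + rest
    if y < b:
        for rest in paths(x, y + 1):
            yield [(x, y, "N")] + rest

beta = 1.5
energies = [sum(weight[e] for e in p) for p in paths()]
Z = sum(exp(beta * E) for E in energies)          # partition function
polymer = [exp(beta * E) / Z for E in energies]   # polymer measure
free_energy = log(Z) / (a + b)                    # finite-n free energy

assert abs(sum(polymer) - 1.0) < 1e-12            # probability measure
assert len(energies) == comb(a + b, a)            # all paths counted
```

At beta = 0 the partition function reduces to the path count, so the finite-n free energy becomes (1/n) log of the number of paths, the zero-temperature link to the entropy discussed above.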
5.2 Variational Formulas for PTP/PTL Gibbs Free Energies
We derive a formula for the point-to-point Gibbs Free Energy as the supremum over measures of the sum of the integral and grid entropy . Since direction-free grid entropy is just grid entropy in the direction, then we arrive at a similar formula for the point-to-level Gibbs Free Energy.
Theorem 30.
Fix an inverse temperature and a bounded measurable .
- (i) Fix . The point-to-point Gibbs Free Energy is given by
Moreover, this supremum is achieved by some .
- (ii) The point-to-level Gibbs Free Energy is given by
Remark.
If is non-negative then the variational formulas are trivially positive-homogeneous.
Remark.
Again, we may extend these formulas to a supremum over all measurable by truncating in at some and then taking a supremum over of the variational formula.
We require a short lemma in order to prove our variational formula. The main technical difficulty is that each direction- grid entropy is an almost sure limit:
but we are now dealing with a whole space of measures, which is uncountable. Thus we need to assume these limits exist for a countable dense family of ’s and approximate the sums of over paths with empirical measure in -balls.
Lemma 31.
Fix and . For any ,
Proof.
Let us look closer at this sum. Observe that we can upper bound the indicator of the set we are summing over by a smooth function:
Thus
∎
Proof of Theorem 30.
We focus on showing (i). The proof of the direction-free case (ii) is analogous.
(i) Note that if then hence the variational formula holds trivially. Thus we may assume .
Recall that the Levy-Prokhorov metric metrizes weak convergence of measures and that is weakly compact. It follows that is uniformly continuous on . More precisely,
(34)
We split the argument into two claims.
Claim 1:
Proof of Claim 1
Note that unless . Also is weakly closed and is upper semicontinuous in so there exists some that achieves the supremum
We prove the claim when ; the case is very similar.
For Claim 1, we restrict ourselves to the measure 1 event
(35)
(this has measure 1 by Theorem 22). We wish to show that in this event,
Fix in and let satisfy (34). We are in the event (35) so
Denote by the (event-dependent) paths corresponding to these order statistics. For large enough we have hence there are paths satisfying
(36)
by (34). But then
by (36). It follows that
Since was arbitrarily small, taking we get
(37)
as desired. [Claim 1]
For the second half, we actually show a slightly stronger claim, which we use in a later proof.
Claim 2: Let be a weakly open, possibly empty set s.t. . Define
Then
(38)
Of course, in our case and .
Proof of Claim 2
We must be careful to exclude only countably many measure zero events. We do this by considering a countable dense family of target measures.
is weakly compact. So for every there is some and a finite -net covering with the property that (but the balls themselves may intersect ). From now on we restrict ourselves to the event
(39)
which is a countable intersection of measure 1 events by Theorem 12 hence has measure 1. It is important to note that for any arbitrary measure we still have
because this is a non-probabilistic statement about constants; we do not however assume that these directed metrics are limits of the .
Now we wish to fix an arbitrary and prove that in the measure 1 event (39),
It suffices to consider a convergent subsequence and show the right hand side.
Fix any . For large ,
so any path we are summing over in has hence is in one of the . Therefore
Note that is a constant with respect to , whereas the max above is both event-dependent and -dependent. Thus there is some event-dependent and some subsequence s.t. the above maximum is achieved by the same for every . (Finite pigeonholes, infinite pigeons if you will.) Therefore
(40)
This holds for any . Our strategy will be to choose a very large, event-dependent that gives a nice event-independent upper bound.
Let us upper bound the expression in the limit on the right-hand side of (40). Lemma 31 with establishes that for any ,
(41)
This holds for arbitrary . The next step is to specify a sufficiently large .
By compactness of , the measures have a weakly convergent subsequence
.
Fix some large enough so that
(42)
Consider some event-dependent large enough s.t.
(43)
Since
and satisfies (34),
(44)
Next, we have
(45)
Finally, we are in the event (39) so for large enough we have
(46)
by Lemma 15. By (42),(43) this last expression is
Combining the inequalities (44),(45),(46) with (41) we get that for large enough ,
(47)
Substituting this in (40), we get
as desired. Note that even though and in (47) are event-dependent, the upper bound we get for is not event-dependent, which is what makes our argument work. [Claim 2]
This finishes the proof of the variational formula in the fixed-direction case.
(ii) An analogous argument gives
But direction-free grid entropy is just grid entropy in the direction . Furthermore,
if . Thus
∎
As an immediate corollary, we get another proof of Bates’ Variational Formula (Theorem 29), as well as its direction-free analogue. We stress that our proof as written works only under the assumption that is bounded, but can easily be extended to the general case of measurable .
Proof of Theorem 29. Fix and bounded measurable . It is standard (e.g. see [GRAS16]) that for every ,
(48)
which we recall is the last passage time between and . That is, the last passage time is the zero temperature analogue of . Thus the zero temperature point-to-point Gibbs Free Energy is the last passage time constant :
On the other hand, another standard result (see [GRAS16]) says that the zero temperature point-to-point Gibbs Free Energy is a scaled limit of the positive temperature free energies:
(49)
Fix . Dividing our variational formula by and taking the supremum over we get by (49)
Taking and noting that implies is bounded, we get Bates’ variational formula:
a.s.
In other words, Bates’ variational formula is nothing more than the zero temperature analogue of the entropy variational formula for Gibbs Free Energy.
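The zero temperature limit in (49) is, at the level of finite sums, just the log-sum-exp bound $\max_i s_i \le \frac{1}{\beta}\log\sum_i e^{\beta s_i} \le \max_i s_i + \frac{\log N}{\beta}$. A numerical sketch (our own, with hypothetical path weights):

```python
import math
import random

def free_energy(beta, path_weights):
    """(1/beta) * log(sum over paths of exp(beta * weight)), computed with
    the log-sum-exp trick for numerical stability."""
    m = max(path_weights)
    return m + math.log(sum(math.exp(beta * (s - m)) for s in path_weights)) / beta

random.seed(1)
S = [random.random() for _ in range(20)]  # hypothetical total path weights
# free_energy(beta, S) decreases to max(S) as beta grows
```

As β grows, the free energy converges down to the maximal path weight, mirroring how the positive temperature Gibbs Free Energies scale to the last passage time constant.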
The other corollary of note is the answer to Hoffman’s question in the directed polymer model when our variational formula has a unique maximizer. As Bates observes, this happens for a dense family of measurable functions (recall the earlier Remark).
Corollary 32.
Fix an inverse temperature and a bounded measurable
.
-
(i)
Fix and suppose has a unique maximizer . For every pick a path independently and at random according to the probabilities prescribed by the corresponding point-to-point -polymer measure . Then the empirical measures converge weakly to a.s.
-
(ii)
Suppose has a unique maximizer . For every pick a length path from independently and at random according to the probabilities prescribed by the corresponding point-to-level -polymer measure . Then the empirical measures converge weakly to a.s.
Remark.
In fact, it follows from our argument below combined with the compactness of that even in the general case when the variational formula might not have a unique maximizer, all of the accumulation points of the empirical measures along random paths chosen according to the -polymer measure are maximizers of the variational formula. Thus, one interesting open question for the future is whether we can more precisely describe these maximizers. This information could help distinguish between weak and strong disorder regimes; see [RASY17] for a recent paper on the role of minimizers of a "cocycle" variational formula for the Gibbs Free Energy in determining disorder regimes.
Remark.
This partial answer to Hoffman’s question in the positive temperature case does also follow from the development of grid entropy by Rassoul-Agha and Seppäläinen [RAS14] as the rate function of the LDP of empirical measures. However, it seems it has not yet been formulated explicitly as we have done here.
Proof.
We focus on proving (i). The direction-free argument (ii) is analogous.
(i) If then the paths are the trivial path and so , hence the result holds trivially. Thus we may assume .
Since the empirical measures eventually lie in the weakly compact set
the existence of an accumulation point is guaranteed by compactness, so it suffices to show that the sequence cannot have any accumulation point other than .
If is a singleton, i.e. , then by the characterization of as the set of accumulation points of empirical measures (see Theorem 6), is the only possible accumulation point of and we would be done. We may therefore assume is not a singleton, so there exists some s.t. .
Suppose we are in the measure 1 event
(50)
This is the intersection of countably many measure 1 events by Claim 2 in the proof of Theorem 30. Claim 2 applies because
We need to prove that in the measure 1 event (50), the probability that the empirical measures of the paths chosen independently at random according to the polymer measures do not have an accumulation point other than is 0.
Fix any in . We wish to show that the probability that is not in infinitely often is 0.
Let achieve the supremum
By the maximality of , there is some s.t.
(51)
Consider any . Recall the definition of the point-to-point -polymer measure on the set of paths :
Thus the probability that (with respect to the -polymer measure ) is
We are in the event (50), so for large enough , this is
by (51). So by the Borel-Cantelli Lemma,
By continuity from below,
Therefore the empirical measures converge weakly to a.s. ∎
5.3 Grid Entropy as The Negative Convex Conjugate of Gibbs Free Energy
Another way to interpret the variational formulas in Theorem 30 is that the point-to-point/point-to-level -Gibbs Free Energies, as functions of the bounded measurable function , are the convex conjugates of the functions on respectively.
Let us briefly recall what that means (see [Zal02] for more details).
Definition.
Let be a locally convex Hausdorff space and let be its dual with respect to an inner product . A convex function is said to be proper if and . For any proper convex function , its convex conjugate is the function given by
In our case, we have (finite Borel measures on [0,1]) and is the set of bounded measurable functions with inner product given by integration.
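In standard notation (a generic statement matching e.g. [Zal02]; the display above is elided), the convex conjugate of a proper convex $f : X \to \mathbb{R}\cup\{+\infty\}$ reads

```latex
f^*(x^*) \;=\; \sup_{x \in X}\big( \langle x, x^* \rangle - f(x) \big), \qquad x^* \in X^*.
```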
Also, recall that direction-fixed/direction-free grid entropy is concave, hence the maps are convex and trivially proper. Furthermore, by our continuity theorem (Thm 26), and are lower semicontinuous in .
With this setup in mind, it is evident that Theorem 30 establishes that the point-to-point -Gibbs Free Energy is precisely the convex conjugate of and the point-to-level -Gibbs Free Energy is the convex conjugate of .
The convex conjugate has various important properties. We focus on those that are the most relevant and interesting in regards to grid entropy and Gibbs Free Energy. See [Zal02, Sect 2.3] for the full list of properties.
Corollary 33.
Fix , an inverse temperature . Then viewing Gibbs Free Energies as functions of bounded measurable we have the following.
-
(i)
Gibbs Free Energies are convex and lower semicontinuous in
-
(ii)
(Order-preservation) If with then
-
(iii)
(Biconjugate Duality) The convex conjugate of -Gibbs Free Energy (as a function of ) is minus grid entropy. That is,
In other words, grid entropy is equal to its biconjugate.
-
(iv)
(Fenchel’s Duality Theorem) Let be proper convex functions s.t. with and one of is continuous at and one of is continuous at . Then
where denote the -convex conjugates of respectively:
Remark.
(iii) gives yet another definition of direction-fixed/direction-free grid entropy. This one is perhaps the most remarkable of the three, as it connects grid entropy to the seemingly unrelated quantity that is Gibbs Free Energy. This more than justifies the importance and canonical nature of grid entropy.
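The biconjugation in (iii) is an instance of the Fenchel-Moreau theorem, and it can be checked numerically on a toy example (our own sketch; the grids and the choice $f(x)=x^2$ are assumptions): discretizing the conjugate $f^*(p)=\sup_x(px-f(x))$ and conjugating again recovers $f$:

```python
def conjugate(f, xs, ps):
    """Discrete Fenchel conjugate: f*(p) = max over the grid xs of p*x - f(x),
    where f is the list of values of the function on the grid xs."""
    return [max(p * x - fx for x, fx in zip(xs, f)) for p in ps]

xs = [-2.0 + 0.01 * k for k in range(401)]  # grid for x in [-2, 2]
ps = [-4.0 + 0.01 * k for k in range(801)]  # slope grid covering f'(x) = 2x
f = [x * x for x in xs]                     # a proper convex l.s.c. function
f_star = conjugate(f, xs, ps)               # approximates p^2 / 4
f_bic = conjugate(f_star, ps, xs)           # biconjugate, back on the x-grid

# Fenchel-Moreau: f** = f (up to discretization error on these grids)
err = max(abs(a - b) for a, b in zip(f_bic, f))
```

On the grid, the biconjugate never exceeds f (it is a supremum of affine minorants of f), and here it matches f to within the grid resolution.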
Proof.
The key takeaway is this: grid entropy and Gibbs Free Energy are intricately connected via convex duality.
5.4 Connection to Previous Literature on Grid Entropy
We end our discussion of these variational formulas by explicitly deriving the formulas (3) which link our notion of grid entropy to that presented previously in [RAS14]. For the sake of conciseness we focus on the direction-fixed (a.k.a. point-to-point) case.
Let us first recalibrate our setting to align with the framework of Rassoul-Agha and
Seppäläinen. We view each edge label as a value associated to the "anchor" vertex and the unit direction of the step (with respect to ). Thus the environment becomes where is the set of unit NE steps. For each , we define a shift in the natural way, by shifting the anchor: . We let be a potential.
We initially work on the space of pairs of the environment and one unit step; that is, we consider the empirical measures
of the polymer paths . It is established in [RASY13, Theorem 3.1] that the rate function of these empirical measures in the direction- case is precisely the lower semicontinuous regularization of
(52)
where is the infimum of certain relative entropies (see (5.2) of [RASY13]), and
is the (normalized) -Gibbs Free Energy in direction with potential defined in (2.3) of the same paper. Here is a probability measure on s.t. the first step has -mean . This last condition is imposed because these paths go in the direction, hence any limit points of their empirical measures must average to in the step coordinate; any not satisfying this condition will trivially have an infinite rate function.
We consider the "edge label potential" given by which maps an (environment, unit direction) pair to the edge label of the corresponding unit direction anchored at the origin. If is a polymer path, then our empirical measures from this paper are just the -pushforwards of :
We obtain the rate function of the empirical measures in the direction- case by contracting the higher level LDP rate function :
(53)
for probability measures , where the infimum is taken over probability measures on whose -pushforward is .
We now make some convenient observations. First, for any measurable and bounded
, our (unnormalized) -point-to-point Gibbs Free Energy is precisely . Second, for any s.t. we have
where is the identity function. Thus the latter two terms in (52) for are ignored by the infimum in (53) so we get that is the lower semicontinuous regularization of
where
(54)
(5.9) from [RAS14] gives the following variational formula for the point-to-point Gibbs Free Energy:
(55)
for bounded measurable potentials where the supremum is over probability measures on with mean step .
Taking for bounded measurable and splitting the supremum over into an outer supremum over and an inner supremum over whose -pushforward is , we get
(56)
where the last equality holds by (54). Comparing this to our variational formula for the point-to-point Gibbs Free Energy from Theorem 30 (i), recalling that Gibbs Free Energy is biconjugate from Corollary 33, and recalling that , we get
The direction-free (a.k.a. point-to-level) argument is nearly identical, except we drop the indices and the condition that the first step has -mean . In the end we get
where
As a final remark, we mention that this entropy originally appears in Varadhan’s paper [Var03] on large deviations for empirical measures of Markov chains.
6 Extensions to Other Models and Open Questions
Looking back at our development of grid entropy in , we never required any specific property of the lattice model other than that the number of paths we consider is for some model-dependent constant , that all such paths have the same length , and that any pair of paths and can be concatenated (thus giving us the superadditivity required to apply the Subadditive Ergodic Theorem).
Therefore the notion of grid entropy and all our results, including the variational formula for the Gibbs Free Energy, apply in any edge model with these properties. For example, we might consider random walks on where the first coordinate, "time," is always increasing along a path.
In addition, if we loosen some of these conditions, then a part of our work in this paper still holds. For example, instead of NE paths on consider self-avoiding walks (SAWs) on , as done in [Bat20]. Then we cannot always concatenate a pair of SAWs, and so our arguments that rely on superadditivity no longer apply. However, we may still define grid entropy by
and it will be true that only target measures with total mass are observed and that we have an upper bound on the sum of relative entropy and grid entropy:
where is the connectivity constant for satisfying
(see [Law13, Section 6.2] for details). It is not clear, however, whether any of the other properties (including this being a directed norm with negative sign) still hold.
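A small illustration of the connectivity constant (our own sketch; exhaustive enumeration is only feasible for short walks): counting self-avoiding walks on $\mathbb{Z}^2$ by depth-first search produces the submultiplicative sequence $c_n$, with $c_n^{1/n}$ tending to $\mu \approx 2.638$:

```python
def count_saws(n):
    """Count self-avoiding walks of length n on the Z^2 lattice starting
    at the origin, by depth-first search over non-repeating vertices."""
    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        x, y = pos
        total = 0
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total
    return extend((0, 0), {(0, 0)}, n)

# c_1, ..., c_6 = 4, 12, 36, 100, 284, 780
counts = [count_saws(n) for n in range(7)]
```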
Note, however, that with a small modification we can make this work. All our results hold for SAWs on where the first coordinate, "time," is necessarily non-decreasing, because we obtain a superadditivity akin to the one for NE paths.
We end by mentioning two open questions that could be the subject of future work. Can we describe the set of maximizers of our variational formula (Theorem 30)? As discussed before, this would provide insights into the limit points of empirical measures and also into the problem of distinguishing weak and strong disorder regimes. And secondly, is grid entropy strictly concave? By our results in this chapter, an affirmative answer would immediately imply that Gibbs Free Energy is strictly convex, which is currently a major open research problem in this field.
7 Acknowledgments
I would like to thank my friends and family for their continual support during my graduate studies. Special thanks goes to my advisor Bálint Virág, for his patience, guidance, and optimism in the face of setbacks. I also wish to thank Duncan Dauvergne for very helpful discussions and peer review on a preliminary draft. In addition, I am grateful to my thesis external reviewer Firas Rassoul-Agha for his comments and for drawing some connections to previous literature. Finally, this work would not have been possible without the funding from my NSERC Canadian Graduate Scholarship-Doctoral.
References
- [AH16] Daniel Ahlberg and Christopher Hoffman, Random coalescing geodesics in first-passage percolation, arXiv preprint:1609.02447 (2016).
- [Ahl15] Daniel Ahlberg (ed.), Problem lists: first passage percolation, American Institute of Mathematics, Aug 2015.
- [Bat20] Erik Bates, Empirical distributions, geodesic lengths, and a variational formula in first-passage percolation, arXiv preprint:2006.12580 (2020).
- [Car10] Philippe Carmona, Directed polymer in random environment and last passage percolation, ESAIM: Probability and Statistics 14 (2010), 263–270.
- [DS01] Jean-Dominique Deuschel and Daniel W. Stroock, Large deviations, vol. 342, American Mathematical Society, 2001.
- [Dur19] Rick Durrett, Probability: theory and examples, vol. 49, Cambridge University Press, 2019.
- [GRAS16] Nicos Georgiou, Firas Rassoul-Agha, and Timo Seppäläinen, Variational formulas and cocycle solutions for directed polymer and percolation models, Communications in Mathematical Physics 346 (2016), no. 2, 741–779.
- [Hub04] Peter J. Huber, Robust statistics, vol. 523, John Wiley & Sons, 2004.
- [JLS20] Christopher Janjigian, Wai-Kit Lam, and Xiao Shen, Tail bounds for the averaged empirical distribution on a geodesic in first-passage percolation, arXiv preprint:2010.08072 (2020).
- [JRAS19] Christopher Janjigian, Firas Rassoul-Agha, and Timo Seppäläinen, Geometry of geodesics through Busemann measures in directed last-passage percolation, arXiv preprint:1908.09040 (2019).
- [KL51] Solomon Kullback and Richard A. Leibler, On information and sufficiency, The Annals of Mathematical Statistics 22 (1951), no. 1, 79–86.
- [Law13] Gregory F. Lawler, Intersections of random walks, Springer Science & Business Media, 2013.
- [Lig85] Thomas M. Liggett, An improved subadditive ergodic theorem, The Annals of Probability 13 (1985), no. 4, 1279–1285.
- [Mar04] James B. Martin, Limiting shape for directed percolation models, The Annals of Probability 32 (2004), no. 4, 2908–2937.
- [MSZ21] James B. Martin, Allan Sly, and Lingfu Zhang, Convergence of the environment seen from geodesics in exponential last-passage percolation, arXiv preprint:2106.05242 (2021).
- [Pos75] Edward Posner, Random coding strategies for minimum entropy, IEEE Transactions on Information Theory 21 (1975), no. 4, 388–391.
- [RAS14] Firas Rassoul-Agha and Timo Seppäläinen, Quenched point-to-point free energy for random walks in random potentials, Probability Theory and Related Fields 158 (2014), no. 3, 711–750.
- [RASY13] Firas Rassoul-Agha, Timo Seppäläinen, and Atilla Yilmaz, Quenched free energy and large deviations for random walks in random potentials, Communications on Pure and Applied Mathematics 66 (2013), no. 2, 202–244.
- [RASY17] Firas Rassoul-Agha, Timo Seppäläinen, and Atilla Yilmaz, Variational formulas and disorder regimes of random walks in random potentials, Bernoulli 23 (2017), no. 1, 405–431.
- [Var03] S. R. S. Varadhan, Large deviations for random walks in a random environment, Communications on Pure and Applied Mathematics 56 (2003), no. 8, 1222–1245.
- [VT84] Jan Van Tiel, Convex analysis: an introductory text, Wiley, 1984.
- [Zal02] Constantin Zalinescu, Convex analysis in general vector spaces, World scientific, 2002.