Bound on the running maximum of a random walk with small drift
Abstract.
We derive a lower bound for the probability that a random walk with i.i.d. increments and small negative drift exceeds the value by time . When the moment generating functions are bounded in an interval around the origin, this probability can be bounded below by . The approach is elementary and does not use strong approximation theorems.
2000 Mathematics Subject Classification:
60K35, 60K37

1. Introduction
1.1. Background
This paper arose from the need for a random walk estimate in the authors’ article [2] on directed polymers. This estimate is a positive lower bound on the running maximum of a random walk with a small negative drift. Importantly, the bound had to come with sufficient control over its constants so that it would apply to an infinite sequence of random walks whose drift scales to zero as the maximum is taken over expanding time intervals. The natural approach via a Brownian motion embedding appeared not to give either the desired precision or the uniformity. Hence we resorted to a proof from scratch. For possible wider use we derive the result here under general hypotheses on the distribution of the step of the walk.
The polymer application of the result pertains to the exactly solvable log-gamma polymer on the plane. The objective of [2] is to prove that there are no bi-infinite polymer paths on the planar lattice . The technical result is that there do not exist any nontrivial Gibbs measures on bi-infinite paths that satisfy the Dobrushin-Lanford-Ruelle (DLR) equations under the Gibbsian specification defined by the quenched polymer measures. In terms of limits of finite polymer distributions, this means that as the southwest and northeast endpoints of a random polymer path are taken to opposite infinities, the middle portion of the path escapes. This is proved by showing that in the limit the probability that the path crosses the -axis along a given edge decays to zero. This probability in turn is controlled by considering stationary polymer processes from the two endpoints to an interval along the -axis. The crossing probability can then be bounded in terms of a maximum of a random walk. In the case of the log-gamma polymer, the steps of this random walk are distributed as the difference of two independent log-gamma variables. The case needed for [2] is treated in Example 2.7 below.
1.2. The question considered
We seek a lower bound on the probability that the running maximum of a random walk with negative drift reaches a level . To set the stage, we discuss the matter through Brownian motion. Let be a random walk with drift , and such that the random walks converge weakly to a Brownian motion with drift . The probability of the event
should be approximately the same as that of
(1.1) |
The latter can be computed (see (3.7)) to be
(1.2) |
This suggests that we should aim for an estimate of the type
(1.3) |
To reach this precision, weak convergence is not powerful enough: a weak approximation of random walk by Brownian motion reaches only a precision of [12, 7]. Our estimate (2.4) below almost captures (1.3): we have to allow an additional factor inside the and consider of order at least .
The by-now classical Komlós-Major-Tusnády (KMT) coupling [9] gives a strong approximation of a random walk by a Brownian motion with a discrepancy that grows logarithmically in time. This precision is sufficient for us, as we illustrate in Section 3. The problem is then the control of the constants in the approximation. Uniformity of the constants is necessary for our application in [2], but verifying this uniformity from the original work [9] appeared to be a highly nontrivial task. In the end it was more efficient to derive the estimate (Theorem 2.2 below) from the ground up.
The difficulty of the original KMT proof has motivated several recent attempts at simplification and better understanding of the result, such as Bhattacharjee and Goldstein [1], Chatterjee [3], and Krishnapur [10]. There is another strong approximation result due to Sakhanenko [11] which, according to p. 232 of [3], “is so complex that some researchers are hesitant to use it”.
1.3. Sketch of the proof
Our proof is elementary. The most sophisticated result used is the Berry-Esseen theorem. Given a random walk of small drift , our approach can be summarized in two main steps:
(1) Up to the time the random walk hits the level it behaves like an unbiased random walk.

(2) By the time the random walk hits the level it will have had about independent opportunities to hit the level . By the previous step this implies that the probability on the left-hand side of (1.3) is of order .
As we take in the proof, we obtain the right order in (1.3) up to a logarithmic factor (Theorem 2.2). After the statement of the theorem we illustrate it with examples. Then we demonstrate that even if we knew that the constants in the KMT approximation could be taken uniform, the result would not be essentially stronger in the regime in which we apply our result.
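The two-step heuristic can be checked in its simplest unbiased instance. The sketch below (a hypothetical illustration, not part of the paper's proof) estimates by Monte Carlo the probability that a symmetric ±1 walk reaches a level b before a level −a; the classical gambler's-ruin computation gives the exact answer a/(a+b), matching step (1)'s claim that the walk behaves like an unbiased one.

```python
import random

def hit_up_before_down(a, b, trials=20000, seed=0):
    """Monte Carlo: P(symmetric +-1 walk started at 0 hits +b before -a)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0
        while -a < s < b:
            s += 1 if rng.random() < 0.5 else -1
        hits += (s == b)
    return hits / trials

# Gambler's ruin for the unbiased walk: the exact answer is a / (a + b).
est = hit_up_before_down(a=15, b=5)
print(est)  # should be close to 15/20 = 0.75
```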
2. Main result
For each , let be a sequence of non-degenerate i.i.d. random variables. Denote their moment generating function by
Write and .
Assumption 2.1.
We assume that the random variables satisfy the following:

(i) There exists an open interval around the origin on which each moment generating function is finite. Furthermore, there exists a finite constant and such that we have the uniform bounds
(2.1) for the compact interval .

(ii) There exists a finite constant such that
(2.2)

(iii) There exists a finite constant such that the expectations satisfy
(2.3)
The conditions in Assumption 2.1 are fairly natural. Note that (2.1) has to be checked only for at the expense of shrinking the interval and increasing . To make a positive maximum possible, condition (2.2) ensures enough diffusivity and condition (2.3) limits the strength of the negative drift. The bound (2.4) below shows that has to be vanishingly small in order for the result to be nontrivial.
For let be the random walk associated with the steps . Here is the main theorem.
Theorem 2.2.
There exist finite constants and that depend on and such that, for every and ,
(2.4) |
Remark 2.3 (The constants in the theorem).
The constant in the upper bound (2.4) is given by
(2.5) |
where
(2.6) |
(2.7) |
and is the mean zero Gaussian distribution with variance .
Throughout the proof we state explicitly the various conditions required along the way. Let us assume that so that does not vanish. Then all the conditions can be combined into a single condition of the form
(2.8) |
where the function is a strictly positive continuous function on , nondecreasing in , nonincreasing in and , but depends on in both directions. When is restricted to a compact subset of , there exists a finite index such that is a nondecreasing function of for any fixed , and
Corollary 2.4.
We illustrate the result with some examples.
Example 2.5 (Gaussian random walk).
Let be a Brownian motion, , and define the random walk . We can verify that the bound (2.4) is off by a logarithmic factor in this case, by comparison with the running maximum of the Brownian motion. For and large enough
(2.9) | ||||
where the middle inequality follows from (3.7) with and . (2.9) shows that the optimal error is at most , and that Theorem 2.2, if not optimal, is only away from being so.
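The comparison in this example can be tested numerically. The sketch below (illustrative parameters of our choosing) simulates the Gaussian walk with a small negative drift and compares its running-maximum tail with the standard closed-form expression for the running maximum of drifted Brownian motion, the identity behind (3.7); since the walk is the Brownian motion sampled at integer times, the discrete probability should sit slightly below the continuous one.

```python
import math
import random

def phi_bar(x):
    """Standard normal upper tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bm_max_tail(u, mu, T):
    """P(max_{t<=T} (B_t + mu t) >= u) for standard Brownian motion with drift mu."""
    rt = math.sqrt(T)
    return phi_bar((u - mu * T) / rt) + math.exp(2.0 * mu * u) * phi_bar((u + mu * T) / rt)

def walk_max_tail(u, mu, N, trials=4000, seed=1):
    """Monte Carlo for the walk with i.i.d. N(mu, 1) steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0.0
        for _ in range(N):
            s += rng.gauss(mu, 1.0)
            if s >= u:
                hits += 1
                break
    return hits / trials

N, mu, u = 400, -0.02, 6.0
mc, exact = walk_max_tail(u, mu, N), bm_max_tail(u, mu, N)
print(mc, exact)  # discrete-time estimate slightly below the continuous-time value
```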
A natural way to produce examples is to take as the difference of two independent random variables whose means come closer as grows and whose variances stay bounded and bounded away from zero.
Example 2.6 (Exponential walk).
Consider a random walk with step distribution where and are two independent exponential random variables with rates and , respectively, so we assume that . The distribution of the supremum of is well known and feasible to compute (Example (b) in Section XII.5 of Feller [6]): for ,
where we assume small and expand . We obtain a lower bound:
Thus for and small , the upper bound (2.4) loses only a logarithmic factor.
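The supremum distribution quoted from Feller can be checked by simulation. In its queueing (M/M/1 waiting-time) form, if the step is A − B with A ~ Exp(a), B ~ Exp(b) and b < a, then P(sup_n S_n ≥ x) = (b/a) e^{−(a−b)x} for x ≥ 0. The sketch below uses illustrative rates of our choosing and truncates the infinite-horizon supremum once the negatively drifting walk is far below the target level.

```python
import math
import random

def sup_tail_mc(a, b, x, trials=5000, max_steps=400, floor=-40.0, seed=2):
    """Monte Carlo for P(sup_n S_n >= x) with steps Exp(a) - Exp(b), b < a.
    The negative drift lets us stop once the walk falls below 'floor'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0.0
        for _ in range(max_steps):
            s += rng.expovariate(a) - rng.expovariate(b)
            if s >= x:
                hits += 1
                break
            if s < floor:
                break
    return hits / trials

a, b, x = 1.2, 1.0, 1.0
est = sup_tail_mc(a, b, x)
exact = (b / a) * math.exp(-(a - b) * x)   # M/M/1 waiting-time tail
print(est, exact)
```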
That is close to the overall maximum in the case is a consequence of the fact that the overall maximum is taken at a random time of order . This claim is seen conveniently through ladder intervals . These are the intervals between successive ladder epochs defined by and
The distribution of is given by
where are the Catalan numbers. (This calculation can be found in Lemma B.3 in the appendix of [5].) Set and let be the number of finite ladder intervals. The maximum is taken at time . One calculates and . Thus for large enough , .
Example 2.7 (Log-gamma walk).
Let denote generically a parameter gamma random variable, that is, has density function on . For let denote the random walk where the distribution of the i.i.d. steps is specified by
with two independent gamma variables and on the right.
Let be the digamma function and the trigamma function on . Their key properties are that is strictly increasing with and , while is strictly decreasing and strictly convex with and .
Fix a compact interval . Fix a positive constant and let be a sequence of nonnegative reals such that . Define a set of admissible pairs
For , the mean step satisfies
(2.10) | ||||
where we used the mean value theorem with some . We take .
The MGF of is
(2.11) |
for . For the interval in assumption (2.1) we can take . Now (2.1) holds with a single constant for all choices of .
The variance satisfies
The constants have been fixed and they work simultaneously for all and all . Define through (2.5)–(2.7). Choose so that (2.8) holds for all . Now and are entirely determined by . We state the result as a corollary of Theorem 2.2.
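The drift computation (2.10) can be sanity-checked numerically: the mean of log G for G ~ Gamma(θ) is the digamma value ψ₀(θ). The sketch below uses illustrative values of θ and λ of our choosing, and obtains ψ₀ by numerically differentiating math.lgamma rather than from a special-function library; it compares the simulated mean step with ψ₀(θ) − ψ₀(θ+λ), which is negative since ψ₀ is strictly increasing.

```python
import math
import random

def digamma(x, h=1e-5):
    """psi_0(x) = (log Gamma)'(x) via a central difference of math.lgamma."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

rng = random.Random(3)
theta, lam = 1.5, 0.1   # hypothetical parameter values for illustration
n = 100_000
# One natural log-gamma step: X = log G - log G', G ~ Gamma(theta), G' ~ Gamma(theta+lam),
# so that E[X] = psi_0(theta) - psi_0(theta + lam) < 0.
mean_x = sum(
    math.log(rng.gammavariate(theta, 1.0)) - math.log(rng.gammavariate(theta + lam, 1.0))
    for _ in range(n)
) / n
drift = digamma(theta) - digamma(theta + lam)
print(mean_x, drift)  # both are small negative numbers, roughly -lam * psi_1(theta)
```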
Corollary 2.8.
In the setting described above the bound below holds for all , , and :
3. Comparison with the KMT coupling
As a counterpoint to our Theorem 2.2, we derive here an estimate for a single random walk via the Komlós-Major-Tusnády (KMT) coupling [9] with Brownian motion. We emphasize, though, that Theorem 3.1 below is not an alternative to our Theorem 2.2, because we do not know how the constants below depend on the distribution of the walk. Hence without further work we cannot apply the resulting estimate (3.2) to an infinite family of random walks.
However, this section does illustrate that in a certain regime of vanishing drift the estimates (2.4) and (3.2) are essentially equivalent, as explained below in Remark 3.2. So even if one were to conclude that the constants below can be taken uniform, the result remains the same.
Let be a mean-zero random walk with i.i.d. steps and unit variance . The KMT coupling (Theorem 1 in [9]) constructs this walk together with a standard Brownian motion on a probability space such that the following bound holds:
(3.1) |
where are finite positive constants determined by the distribution of .
We apply this to the running maximum of a random walk with a negative drift.
Theorem 3.1.
Let be a random walk with i.i.d. steps that satisfy for for some . Assume the drift is negative: , and the variance . Then there exists a constant determined by the distribution of the normalized variable such that, for all real and integers ,
(3.2) | ||||
Remark 3.2.
To compare this estimate with Theorem 2.2, imagine that we can let vary as a function of while preserving the constant in (3.2). Consider the regime where is constant, and vanishes fast enough so that stays bounded. Then the first parenthetical expression on the right of (3.2) is dominated by a constant multiple of . To the last part apply for . The bound (3.2) becomes
(3.3) |
The bound (2.4) is worse than the one above by at most a factor, and not at all if vanishes fast enough. In particular, for the application in [2], the KMT bound cannot give anything substantially better than Theorem 2.2.
Proof of Theorem 3.1.
Apply (3.1) to the mean-zero unit-variance normalized walk . To simplify some steps below we can assume that . Let and .
(3.4) |
Let . Since ,
With this we continue from above.
(3.5) | ||||
We bound the two probabilities above separately. Recall that . For the running maximum of standard Brownian motion, by (2.8.4) on page 96 of [8],
(3.6) | ||||
For the running maximum of Brownian motion with drift, use first Brownian scaling, and then the density of the hitting time of the point with drift from (3.5.12) on page 197 of [8].
(3.7) | ||||
The second-to-last inequality dropped the denominator and the term from the exponent and then integrated. The last inequality substituted in to bound
The conclusion (3.2) follows from substituting into (3.5) the bounds from above. ∎
4. Auxiliary facts
Before starting the proof proper, we record some simple facts. First, assumptions (2.1) and (2.2) give these bounds:
(4.1) | ||||
Lemma 4.1.
Let be i.i.d. random variables with common marginal distribution . Assume that, for two constants ,
(4.2) |
Then
Proof.
For ,
and the claim follows by taking . ∎
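Lemma 4.1 is a Chernoff-Doob bound: under the MGF hypothesis, P(max_{k≤n} S_k ≥ u) ≤ e^{−su + Cns²}, optimized over admissible s. A minimal numerical sketch, with hypothetical Rademacher ±1 steps for which E e^{sX} = cosh s ≤ e^{s²/2}, so C = 1/2 and every s is admissible:

```python
import math
import random

def chernoff_maximal_bound(u, n, C):
    """exp(-s u + C n s^2) at the optimizing s = u/(2 C n), i.e. exp(-u^2/(4 C n))."""
    s = u / (2.0 * C * n)
    return math.exp(-s * u + C * n * s * s)

def max_tail_mc(u, n, trials=20000, seed=4):
    """Monte Carlo for P(max_{k<=n} S_k >= u) with +-1 steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0
        for _ in range(n):
            s += 1 if rng.random() < 0.5 else -1
            if s >= u:
                hits += 1
                break
    return hits / trials

n, u = 100, 25
est, bound = max_tail_mc(u, n), chernoff_maximal_bound(u, n, C=0.5)
print(est, bound)  # the Monte Carlo estimate falls below the bound
```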
Since there is a unique minimizer
(4.3) |
Lemma 4.2.
Let be such that for and set . Then for ,
Proof.
If then the minimum is taken at .
So suppose . Expansion for gives, with some ,
Since is strictly increasing and , by the choice of we have for
It follows that there exists a unique such that ∎
Define a tilted measure in terms of the Radon-Nikodym derivative
Denote the expectation under by . Increase further so that implies and . Then for and , the MGF under satisfies
(4.4) | ||||
where the first inequality used Jensen’s inequality and (2.1). From this we get moment bounds under : for ,
(4.5) |
For , there exists
Increase further if necessary so that for and we can write
Then from and the third equation in (4.4),
(4.6) |
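The tilting construction of this section can be illustrated on a toy finite-support step distribution (entirely hypothetical numbers, not the paper's walk): reweighting by e^{λx}/M(λ) moves the mean continuously, and since the tilted mean is increasing in λ (its derivative is the tilted variance), the λ that makes the tilted walk mean-zero can be found by bisection.

```python
import math

# A toy step distribution with slight negative drift (illustrative values only).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ps = [0.15, 0.25, 0.25, 0.25, 0.10]

def mgf(lmb):
    return sum(p * math.exp(lmb * x) for x, p in zip(xs, ps))

def tilt(lmb):
    """Tilted probabilities dP_lmb/dP = e^{lmb x} / M(lmb)."""
    M = mgf(lmb)
    return [p * math.exp(lmb * x) / M for x, p in zip(xs, ps)]

def tilted_mean(lmb):
    return sum(x * q for x, q in zip(xs, tilt(lmb)))

mean0 = sum(x * p for x, p in zip(xs, ps))   # original (negative) drift
# Bisection for the mean-zero tilt.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if tilted_mean(mid) < 0.0 else (lo, mid)
lam_star = 0.5 * (lo + hi)
print(mean0, lam_star, tilted_mean(lam_star))
```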
5. Proof of the main theorem
To lighten the notation we omit the label from and , and from other notation that obviously depends on . For let
denote the first hitting time of the cylinder of width . Let denote the centered Gaussian distribution with variance .
Lemma 5.1.
For real and we have where
(5.1) |
Proof.
Let be the centered walk. Consider an integer and a real . Look at the process along time increments of size :
The penultimate inequality is the Berry-Esseen theorem. We use the version from [4, Section 3.4.4] where the constant is 3. The last inequality is a simple property of a centered Gaussian. Now for and as above,
We have proved for . Extend this to real :
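The Berry-Esseen theorem in the form used here (Durrett [4], constant 3) states that the Kolmogorov distance between the law of the normalized sum and the standard Gaussian is at most 3ρ/(σ³√n). For an illustrative ±1 walk (ρ = σ = 1, our hypothetical example) this distance can be computed exactly from the binomial distribution and checked against the bound 3/√n.

```python
import math

def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def kolmogorov_dist_pm1(n):
    """sup_x |P(S_n/sqrt(n) <= x) - Phi(x)| for the +-1 walk, computed exactly.
    S_n = n - 2k with probability C(n,k)/2^n; the sup is attained at the atoms."""
    cdf, worst = 0.0, 0.0
    for k in range(n, -1, -1):               # atom values ascend as k descends
        v = (n - 2 * k) / math.sqrt(n)
        worst = max(worst, abs(cdf - normal_cdf(v)))   # left limit at the atom
        cdf += math.comb(n, k) / 2.0 ** n
        worst = max(worst, abs(cdf - normal_cdf(v)))   # value at the atom
    return worst

n = 400
dist, bound = kolmogorov_dist_pm1(n), 3.0 / math.sqrt(n)   # rho = sigma = 1
print(dist, bound)
```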
Let . By (2.3)
(5.2) |
Define the truncated version of
The following result shows that although the random walk has negative drift, up to times of order it behaves similarly to an unbiased random walk in the following sense: if is not too small, but small compared to , the probability that the random walk reaches level before level is close to . Our choice of can be justified by decomposing the random walk into
For small and (so that ),
(5.3) |
As
we see that the left hand side of (5.3) is dominated by the first term on the right hand side. That is, up to time the random walk behaves approximately like an unbiased random walk.
Lemma 5.2.
Let be as in (5.1). There exist finite constants and such that, for and ,
depends on and while depends on , and .
Proof.
The constant is given as follows in terms of the constants introduced above and new constants introduced in the course of the proof:
(5.4) | ||||
Under the measure , is a mean-zero random walk and hence a martingale. Furthermore, is a bounded stopping time. From this,
On the event , we have and and so
where we applied Lemma 4.1 under the distribution with from (4.4). Combine the displays above to obtain
Use
to rewrite the above as
(5.5) |
It remains to bound the probability on the right. The event forces , and then another application of the Berry-Esseen theorem, together with (4.5), (4.6) and , gives
Rewrite (5.5) as
(5.6) |
It remains to switch from back to the original distribution . Recall the Radon-Nikodym derivative . Introduce a temporary quantity to be chosen precisely below. Decompose according to the value of and use Cauchy-Schwarz:
(5.7) |
Let us first bound the second term on line (5.7). Note that is a -martingale and is a stopping time bounded by . Hence is a submartingale and we have
To bound the -factor on the right, expand and use (2.1), (4.1) and . In the numerator, for some ,
and similarly in the denominator:
(5.8) | ||||
Above we used and increased once more so that guarantees , and . Then we applied the bounds
where the second line also used for . Put (5.8) back up, set , and apply Lemma 5.1 (for which we use the assumption ):
(5.9) |
Next we bound the first term on line (5.7). Use . Let .
(5.10) | ||||
Let us first bound the second term. Using Cauchy-Schwarz, the bound
the bound (5.8), and the tail bound in (4.1), it follows that
The first term on the last line of (5.10) is bounded as follows, with .
Above we used Jensen’s inequality in the form , the definition of in the form , and then . Furthermore, by (5.2) and our assumption we have
where we choose large enough so that the last inequality holds for . Then we applied the inequality for .
By adjusting a constant we can replace with in the previous estimate.
Corollary 5.3.
Under the assumptions of Lemma 5.2, with ,
(5.12) |
Proof.
For truncate:
Define
We transfer bound (5.12) to the truncated walk . The reason is that the proof of the forthcoming Lemma 5.5 is easier for the truncated random walk.
Corollary 5.4.
Under the assumptions of Lemma 5.2, with ,
We turn to the main argument of the proof of Theorem 2.2, that is, to showing that the probability that the random walk hits the level before hitting the level is close to . This gives rise to the error term in (2.4). We sketch the reasoning.
Let us try to hit the level starting from the origin. By Corollary 5.4 there is a probability to hit before hitting . Suppose we failed and hit first. We have another chance to hit by going upward from the level . By Corollary 5.4 the probability of going up to the level before going down to the level is . We continue this way until we either hit the level or the level . How many trials to hit do we have before we hit ? Approximately . The trials are independent and so the probability of hitting the level before hitting the level is , which is what we seek.
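The independent-trials heuristic can be made exact for a biased ±1 walk, a hypothetical stand-in for the walk in the proof: with up-step probability p = 1/2 − μ and r = (1−p)/p, the probability of hitting +b before −a is (r^a − 1)/(r^{a+b} − 1), which tends to r^{−b} = e^{−b log r} as a → ∞, a bound of exactly the exponential form sought in (2.4). A simulation sketch:

```python
import math
import random

def hit_prob_exact(a, b, p):
    """P(biased +-1 walk from 0 hits +b before -a); up-step probability p < 1/2."""
    r = (1.0 - p) / p
    return (r ** a - 1.0) / (r ** (a + b) - 1.0)

def hit_prob_mc(a, b, p, trials=3000, seed=5):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0
        while -a < s < b:
            s += 1 if rng.random() < p else -1
        hits += (s == b)
    return hits / trials

mu = 0.02                 # small negative drift (illustrative value)
p = 0.5 - mu
a, b = 60, 8
mc, exact = hit_prob_mc(a, b, p), hit_prob_exact(a, b, p)
limit = math.exp(-b * math.log((1.0 - p) / p))   # the a -> infinity limit r^{-b}
print(mc, exact, limit)
```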
We introduce the notation to make the sketch precise. See Figure 5.1 for an illustration.
Define . For set . Inductively these satisfy and . Furthermore,
Define the stopping times
Note that is possible.
Lemma 5.5.
There exist finite constants and such that for and ,
where
and in the expression above is from (5.4).
Proof.
Since , we have . Then we can assume that , for otherwise the bound on the probability is . This guarantees that . It also implies that unless , the result is trivial.
Since ,
(5.16) |
and
Note that
(5.17) |
Due to (5.16)
(5.18) | ||||
The last equality used the definition of the stopping time , the definition of , and the Markov property. For define the events
Applying (5.18) to (5.17) repeatedly,
Let . Recall that by (5.2), and . Apply Corollary 5.4 with and for and to get this estimate:
where we set
and if necessary we increase further so that for .
Continue with the above estimate,
where we used . ∎
We are ready to prove Theorem 2.2. By Lemma 5.5, by the time hits the level , with high probability it has hit level as well. It remains to verify the two points below.
(i) is close to on the time interval . This follows from a union bound and the exponential tail of .

(ii) With high probability by time we hit the boundary of the cylinder of width . This follows from Lemma 5.1.
References
- [1] Chinmoy Bhattacharjee and Larry Goldstein. On strong embeddings by Stein’s method. Electron. J. Probab., 21:Paper No. 15, 30, 2016.
- [2] Ofer Busani and Timo Seppäläinen. Non-existence of bi-infinite polymer Gibbs measures in the inverse-gamma polymer model. 2020. arXiv:2010.11279.
- [3] Sourav Chatterjee. A new approach to strong embeddings. Probab. Theory Related Fields, 152(1-2):231–264, 2012.
- [4] Rick Durrett. Probability: theory and examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, fourth edition, 2010.
- [5] Wai-Tong (Louis) Fan and Timo Seppäläinen. Joint distribution of Busemann functions in the exactly solvable corner growth model. 2018. arXiv:1808.09069.
- [6] William Feller. An introduction to probability theory and its applications. Vol. II. Second edition. John Wiley & Sons Inc., New York, 1971.
- [7] David F. Fraser. The rate of convergence of a random walk to Brownian motion. Ann. Probability, 1:699–701, 1973.
- [8] Ioannis Karatzas and Steven E. Shreve. Brownian motion and stochastic calculus, volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1991.
- [9] J. Komlós, P. Major, and G. Tusnády. An approximation of partial sums of independent RV’s, and the sample DF. II. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 34(1):33–58, 1976.
- [10] Manjunath Krishnapur. One idea and two proofs of the KMT theorems. 2020. arXiv:2008.03287.
- [11] A. I. Sakhanenko. Rate of convergence in the invariance principle for variables with exponential moments that are not identically distributed. In Limit theorems for sums of random variables, volume 3 of Trudy Inst. Mat., pages 4–49. “Nauka” Sibirsk. Otdel., Novosibirsk, 1984.
- [12] Stanley Sawyer. Rates of convergence for some functionals in probability. Ann. Math. Statist., 43:273–284, 1972.